Information dynamics in decohered quantum memory with repeated syndrome measurements: a dual approach

Authors: Jacob Hauser, Yimu Bao, Shengqi Sang, Ali Lavasani, Utkarsh Agrawal, Matthew P. A. Fisher

Abstract: Measurements can detect errors in a decohered quantum memory allowing active error correction to increase the memory time. Previous understanding of this mechanism has focused on evaluating the performance of error correction algorithms based on measurement results. In this work, we instead intrinsically characterize the information dynamics in a quantum memory under repeated measurements, using coherent information and relative entropy. We consider the dynamics of a $d$-dimensional stabilizer code subject to Pauli errors and noisy stabilizer measurements and develop a $(d+1)$-dimensional statistical mechanics model for the information-theoretic diagnostics. Our model is dual to the model previously obtained for the optimal decoding algorithm, and the potential decoding transition in the quantum memory again manifests as a thermal phase transition in the statistical mechanics model. We explicitly derive the model and study the phase transition in information encoding in three examples: surface codes, repetition codes, and the XZZX code.

Submitted 10 July, 2024; originally announced July 2024.

Comments: 27 pages, 9 figures

arXiv:2407.02829 [pdf, other]

Mirage Sources and Large TeV Halo-Pulsar Offsets: Exploring the Parameter Space

Authors: Yiwei Bao, Ruo-Yu Liu, Gwenael Giacinti, Hai-Ming Zhang, Yang Chen

Abstract: We investigate the asymmetric propagation of 100 TeV electrons (whose radiation mainly concentrates on 20--30 TeV) in turbulent magnetic fields around pulsars, using GPU-accelerated simulations to explore their trajectories and interactions within pulsar wind nebulae and the interstellar medium. Key results include the identification of ``mirage'' sources indicating significant offsets in high-energy emissions from their originating pulsars, challenging the results of traditional symmetric diffusion models. By varying parameters like source distance, magnetic field strength, and electron injection spectral index, the study delineates their effects on observable phenomena such as the probability that a source has at least one mirage around it, as well as the source separation. Our results offer insights into some puzzling sources observed recently by the Large High Altitude Air Shower Observatory (LHAASO), and shed light on the cosmic-ray transport mechanism in the interstellar medium.

Submitted 3 July, 2024; originally announced July 2024.

arXiv:2407.02727 [pdf, other]

Long-lived magnetization in an atomic spin chain tuned to a diabolic point

Authors: R. J. G. Elbertse, D. Borodin, J. Oh, T. Ahn, J. Hwang, J. C. Rietveld, A. J. Heinrich, F. Delgado, S. Otte, Y. Bae

Abstract: Scaling magnets down to where quantum size effects become prominent triggers quantum tunneling of magnetization (QTM), profoundly influencing magnetization dynamics. Measuring magnetization switching in an Fe atomic chain under a carefully tuned transverse magnetic field, we observe a non-monotonic variation of magnetization lifetimes around a level crossing, known as the diabolic point (DP). Near DPs, local environment effects causing QTM are efficiently suppressed, enhancing lifetimes by three orders of magnitude. Adjusting interatomic interactions further facilitates multiple DPs. Our study provides a deeper understanding of quantum dynamics near DPs and enhances our ability to engineer a quantum magnet.

Submitted 2 July, 2024; originally announced July 2024.

Comments: Main text and Supplementary

arXiv:2407.02478 [pdf, other]

Mirages and Large TeV Halo-Pulsar Offsets from Cosmic Ray Propagation

Authors: Yiwei Bao, Gwenael Giacinti, Ruo-Yu Liu, Hai-Ming Zhang, Yang Chen

Abstract: The study of extended $γ$-ray sources usually assumes symmetric diffusion of cosmic rays. However, recent observations of multiple sources near single pulsars and significant offsets between TeV halo centroids and their parent pulsars suggest that this assumption is overly simplistic. In this Letter, we demonstrate that asymmetric propagation of cosmic rays near their accelerators may create multiple TeV sources instead of a single symmetric source. This mechanism also explains the large offsets between TeV halo centroids and their pulsars. We demonstrate that several perplexing detected sources can be naturally explained without invoking additional invisible accelerators.

Submitted 2 July, 2024; originally announced July 2024.

arXiv:2406.17565 [pdf, other]

MemServe: Context Caching for Disaggregated LLM Serving with Elastic Memory Pool

Authors: Cunchen Hu, Heyang Huang, Junhao Hu, Jiang Xu, Xusheng Chen, Tao Xie, Chenxi Wang, Sa Wang, Yungang Bao, Ninghui Sun, Yizhou Shan

Abstract: Large language model (LLM) serving has transformed from stateless to stateful systems, utilizing techniques like context caching and disaggregated inference. These optimizations extend the lifespan and domain of the KV cache, necessitating a new architectural approach. We present MemServe, a unified system that integrates both inter-request and intra-request optimizations. MemServe introduces MemPool, an elastic memory pool managing distributed memory and KV caches across serving instances. Using MemPool APIs, MemServe combines context caching with disaggregated inference for the first time, supported by a global scheduler that enhances cache reuse through a global prompt tree-based locality-aware policy. Tests show that MemServe significantly improves job completion time and time-to-first-time.

Submitted 26 June, 2024; v1 submitted 25 June, 2024; originally announced June 2024.

arXiv:2406.17267 [pdf, other]

doi 10.1364/OE.527862

Efficient source-independent quantum conference key agreement

Authors: Yu Bao, Yi-Ran Xiao, Yu-Chen Song, Yao Fu, Xiao-Yu Cao, Hua-Lei Yin, Zeng-Bing Chen

Abstract: Quantum conference key agreement (QCKA) enables the unconditional secure distribution of conference keys among multiple participants. Due to challenges in high-fidelity preparation and long-distance distribution of multi-photon entanglement, entanglement-based QCKA is facing severe limitations in both key rate and scalability. Here, we propose a source-independent QCKA scheme utilizing the post-matching method, feasible within the entangled photon pair distribution network. We introduce an equivalent distributing virtual multi-photon entanglement protocol for providing the unconditional security proof even in the case of coherent attacks. For the symmetry star-network, comparing with previous $n$-photon entanglement protocol, the conference key rate is improved from $O(η^{n})$ to $O(η^{2})$, where $η$ is the transmittance from the entanglement source to one participant. Simulation results show that the performance of our protocol has multiple orders of magnitude advantages in the intercity distance. We anticipate that our approach will demonstrate its potential in the implementation of quantum networks.

Submitted 25 June, 2024; originally announced June 2024.

Comments: 10 pages, 6 figures

Journal ref: Optics Express 32, 24629 (2024)

arXiv:2406.12588 [pdf, other]

UIFV: Data Reconstruction Attack in Vertical Federated Learning

Authors: Jirui Yang, Peng Chen, Zhihui Lu, Qiang Duan, Yubing Bao

Abstract: Vertical Federated Learning (VFL) facilitates collaborative machine learning without the need for participants to share raw private data. However, recent studies have revealed privacy risks where adversaries might reconstruct sensitive features through data leakage during the learning process. Although data reconstruction methods based on gradient or model information are somewhat effective, they reveal limitations in VFL application scenarios. This is because these traditional methods heavily rely on specific model structures and/or have strict limitations on application scenarios. To address this, our study introduces the Unified InverNet Framework into VFL, which yields a novel and flexible approach (dubbed UIFV) that leverages intermediate feature data to reconstruct original data, instead of relying on gradients or model details. The intermediate feature data is the feature exchanged by different participants during the inference phase of VFL. Experiments on four datasets demonstrate that our methods significantly outperform state-of-the-art techniques in attack precision. Our work exposes severe privacy vulnerabilities within VFL systems that pose real threats to practical VFL applications and thus confirms the necessity of further enhancing privacy protection in the VFL architecture.

Submitted 18 June, 2024; originally announced June 2024.

arXiv:2406.09643 [pdf, other]

Reinforced Decoder: Towards Training Recurrent Neural Networks for Time Series Forecasting

Authors: Qi Sima, Xinze Zhang, Yukun Bao, Siyue Yang, Liang Shen

Abstract: Recurrent neural network-based sequence-to-sequence models have been extensively applied for multi-step-ahead time series forecasting. These models typically involve a decoder trained using either its previous forecasts or the actual observed values as the decoder inputs. However, relying on self-generated predictions can lead to the rapid accumulation of errors over multiple steps, while using the actual observations introduces exposure bias as these values are unavailable during the extrapolation stage. In this regard, this study proposes a novel training approach called reinforced decoder, which introduces auxiliary models to generate alternative decoder inputs that remain accessible when extrapolating. Additionally, a reinforcement learning algorithm is utilized to dynamically select the optimal inputs to improve accuracy. Comprehensive experiments demonstrate that our approach outperforms representative training methods over several datasets. Furthermore, the proposed approach also exhibits promising performance when generalized to self-attention-based sequence-to-sequence forecasting models.

Submitted 13 June, 2024; originally announced June 2024.

Comments: 12 pages, 8 figures

arXiv:2406.08698 [pdf, other]

Constraints on Ultra Heavy Dark Matter Properties from Dwarf Spheroidal Galaxies with LHAASO Observations

Authors: Zhen Cao, F. Aharonian, Q. An, Axikegu, Y. X. Bai, Y. W. Bao, D. Bastieri, X. J. Bi, Y. J. Bi, J. T. Cai, Q. Cao, W. Y. Cao, Zhe Cao, J. Chang, J. F. Chang, A. M. Chen, E. S. Chen, Liang Chen, Lin Chen, Long Chen, M. J. Chen, M. L. Chen, Q. H. Chen, S. H. Chen, S. Z. Chen, et al. (255 additional authors not shown)

Abstract: In this work we try to search for signals generated by ultra-heavy dark matter at the Large High Altitude Air Shower Observatory (LHAASO) data. We look for possible gamma-ray by dark matter annihilation or decay from 16 dwarf spheroidal galaxies in the field of view of LHAASO. Dwarf spheroidal galaxies are among the most promising targets for indirect detection of dark matter which have low fluxes of astrophysical $γ$-ray background while large amount of dark matter. By analyzing more than 700 days observational data at LHAASO, no significant dark matter signal from 1 TeV to 1 EeV is detected. Accordingly we derive the most stringent constraints on the ultra-heavy dark matter annihilation cross-section up to EeV. The constraints on the lifetime of dark matter in decay mode are also derived.

Submitted 12 June, 2024; originally announced June 2024.

Comments: 17 pages, 12 figures, accepted by PRL

arXiv:2406.06559 [pdf, other]

Harnessing Business and Media Insights with Large Language Models

Authors: Yujia Bao, Ankit Parag Shah, Neeru Narang, Jonathan Rivers, Rajeev Maksey, Lan Guan, Louise N. Barrere, Shelley Evenson, Rahul Basole, Connie Miao, Ankit Mehta, Fabien Boulay, Su Min Park, Natalie E. Pearson, Eldhose Joy, Tiger He, Sumiran Thakur, Koustav Ghosal, Josh On, Phoebe Morrison, Tim Major, Eva Siqi Wang, Gina Escobar, Jiaheng Wei, Tharindu Cyril Weerasooriya, et al. (8 additional authors not shown)

Abstract: This paper introduces Fortune Analytics Language Model (FALM). FALM empowers users with direct access to comprehensive business analysis, including market trends, company performance metrics, and expert insights. Unlike generic LLMs, FALM leverages a curated knowledge base built from professional journalism, enabling it to deliver precise and in-depth answers to intricate business questions. Users can further leverage natural language queries to directly visualize financial data, generating insightful charts and graphs to understand trends across diverse business sectors clearly. FALM fosters user trust and ensures output accuracy through three novel methods: 1) Time-aware reasoning guarantees accurate event registration and prioritizes recent updates. 2) Thematic trend analysis explicitly examines topic evolution over time, providing insights into emerging business landscapes. 3) Content referencing and task decomposition enhance answer fidelity and data visualization accuracy. We conduct both automated and human evaluations, demonstrating FALM's significant performance improvements over baseline methods while prioritizing responsible AI practices. These benchmarks establish FALM as a cutting-edge LLM in the business and media domains, with exceptional accuracy and trustworthiness.

Submitted 2 June, 2024; originally announced June 2024.

arXiv:2406.04888 [pdf, other]

Zero-Shot Video Editing through Adaptive Sliding Score Distillation

Authors: Lianghan Zhu, Yanqi Bao, Jing Huo, Jing Wu, Yu-Kun Lai, Wenbin Li, Yang Gao

Abstract: The burgeoning field of text-based video generation (T2V) has reignited significant interest in the research of controllable video editing. Although pre-trained T2V-based editing models have achieved efficient editing capabilities, current works are still plagued by two major challenges. Firstly, the inherent limitations of T2V models lead to content inconsistencies and motion discontinuities between frames. Secondly, the notorious issue of over-editing significantly disrupts areas that are intended to remain unaltered. To address these challenges, our work aims to explore a robust video-based editing paradigm based on score distillation. Specifically, we propose an Adaptive Sliding Score Distillation strategy, which not only enhances the stability of T2V supervision but also incorporates both global and local video guidance to mitigate the impact of generation errors. Additionally, we modify the self-attention layers during the editing process to further preserve the key features of the original video. Extensive experiments demonstrate that these strategies enable us to effectively address the aforementioned challenges, achieving superior editing performance compared to existing state-of-the-art methods.

Submitted 7 June, 2024; originally announced June 2024.

arXiv:2406.00396 [pdf, other]

Stochastic Restarting to Overcome Overfitting in Neural Networks with Noisy Labels

Authors: Youngkyoung Bae, Yeongwoo Song, Hawoong Jeong

Abstract: Despite its prevalence, giving up and starting over may seem wasteful in many situations such as searching for a target or training deep neural networks (DNNs). Our study, though, demonstrates that restarting from a checkpoint can significantly improve generalization performance when training DNNs with noisy labels. In the presence of noisy labels, DNNs initially learn the general patterns of the data but then gradually overfit to the noisy labels. To combat this overfitting phenomenon, we developed a method based on stochastic restarting, which has been actively explored in the statistical physics field for finding targets efficiently. By approximating the dynamics of stochastic gradient descent into Langevin dynamics, we theoretically show that restarting can provide great improvements as the batch size and the proportion of corrupted data increase. We then empirically validate our theory, confirming the significant improvements achieved by restarting. An important aspect of our method is its ease of implementation and compatibility with other methods, while still yielding notably improved performance. We envision it as a valuable tool that can complement existing methods for handling noisy labels.

Submitted 1 June, 2024; originally announced June 2024.

Comments: 21 pages, 10 figures

arXiv:2405.17315 [pdf, other]

All-day Depth Completion

Authors: Vadim Ezhov, Hyoungseob Park, Zhaoyang Zhang, Rishi Upadhyay, Howard Zhang, Chethan Chinder Chandrappa, Achuta Kadambi, Yunhao Ba, Julie Dorsey, Alex Wong

Abstract: We propose a method for depth estimation under different illumination conditions, i.e., day and night time. As photometry is uninformative in regions under low-illumination, we tackle the problem through a multi-sensor fusion approach, where we take as input an additional synchronized sparse point cloud (i.e., from a LiDAR) projected onto the image plane as a sparse depth map, along with a camera image. The crux of our method lies in the use of the abundantly available synthetic data to first approximate the 3D scene structure by learning a mapping from sparse to (coarse) dense depth maps along with their predictive uncertainty - we term this, SpaDe. In poorly illuminated regions where photometric intensities do not afford the inference of local shape, the coarse approximation of scene depth serves as a prior; the uncertainty map is then used with the image to guide refinement through an uncertainty-driven residual learning (URL) scheme. The resulting depth completion network leverages complementary strengths from both modalities - depth is sparse but insensitive to illumination and in metric scale, and image is dense but sensitive with scale ambiguity. SpaDe can be used in a plug-and-play fashion, which allows for 25% improvement when augmented onto existing methods to preprocess sparse depth. We demonstrate URL on the nuScenes dataset where we improve over all baselines by an average 11.65% in all-day scenarios, 11.23% when tested specifically for daytime, and 13.12% for nighttime scenes.

Submitted 27 May, 2024; originally announced May 2024.

Comments: 8 pages, 4 figures

arXiv:2405.11826 [pdf, other]

Data quality control system and long-term performance monitor of the LHAASO-KM2A

Authors: Zhen Cao, F. Aharonian, Axikegu, Y. X. Bai, Y. W. Bao, D. Bastieri, X. J. Bi, Y. J. Bi, W. Bian, A. V. Bukevich, Q. Cao, W. Y. Cao, Zhe Cao, J. Chang, J. F. Chang, A. M. Chen, E. S. Chen, H. X. Chen, Liang Chen, Lin Chen, Long Chen, M. J. Chen, M. L. Chen, Q. H. Chen, S. Chen, et al. (263 additional authors not shown)

Abstract: The KM2A is the largest sub-array of the Large High Altitude Air Shower Observatory (LHAASO). It consists of 5216 electromagnetic particle detectors (EDs) and 1188 muon detectors (MDs). The data recorded by the EDs and MDs are used to reconstruct primary information of cosmic ray and gamma-ray showers. This information is used for physical analysis in gamma-ray astronomy and cosmic ray physics. To ensure the reliability of the LHAASO-KM2A data, a three-level quality control system has been established. It is used to monitor the status of detector units, stability of reconstructed parameters and the performance of the array based on observations of the Crab Nebula and Moon shadow. This paper will introduce the control system and its application on the LHAASO-KM2A data collected from August 2021 to July 2023. During this period, the pointing and angular resolution of the array were stable. From the observations of the Moon shadow and Crab Nebula, the results achieved using the two methods are consistent with each other. According to the observation of the Crab Nebula at energies from 25 TeV to 100 TeV, the time averaged pointing errors are estimated to be $-0.003^{\circ} \pm 0.005^{\circ}$ and $0.001^{\circ} \pm 0.006^{\circ}$ in the R.A. and Dec directions, respectively.

Submitted 13 June, 2024; v1 submitted 20 May, 2024; originally announced May 2024.

Comments: 15 pages, 9 figures

arXiv:2405.09193 [pdf, other]

Autonomous Cooperative Levels of Multiple-Heterogeneous Unmanned Vehicle Systems

Authors: Yoo-Bin Bae, Yeong-Ung Kim, Jun-Oh Park, Hyo-Sung Ahn

Abstract: As multiple and heterogenous unmanned vehicle systems continue to play an increasingly important role in addressing complex missions in the real world, the need for effective cooperation among unmanned vehicles becomes paramount. The concept of autonomous cooperation, wherein unmanned vehicles cooperate without human intervention or human control, offers promising avenues for enhancing the efficiency and adaptability of intelligence of multiple-heterogeneous unmanned vehicle systems. Despite the growing interests in this domain, as far as the authors are concerned, there exists a notable lack of comprehensive literature on defining explicit concept and classifying levels of autonomous cooperation of multiple-heterogeneous unmanned vehicle systems. In this aspect, this article aims to define the explicit concept of autonomous cooperation of multiple-heterogeneous unmanned vehicle systems. Furthermore, we provide a novel criterion to assess the technical maturity of the developed unmanned vehicle systems by classifying the autonomous cooperative levels of multiple-heterogeneous unmanned vehicle systems.

Submitted 15 May, 2024; originally announced May 2024.

arXiv:2405.08385 [pdf, ps, other]

doi 10.3847/1538-4357/ad3939

Regions of suppressed diffusion around supernova remnants?

Authors: Yiwei Bao, Pasquale Blasi, Yang Chen

Abstract: The recent discovery of the so-called TeV halos has attracted much attention. The morphology of the emission requires that the region is characterized by severe suppression of the diffusion coefficient. This finding raises many questions as to its origin: 1) is the suppressed diffusion to be attributed to instabilities induced by the same radiating particles? 2) or does it actually show that the diffusion coefficient is small throughout the disc of the Galaxy? In both cases, one would expect that the surroundings of supernova remnants (SNRs) should also show evidence of reduced diffusion coefficient, since most remnants are located in the disc and are expected to be sites of effective particle acceleration. Should we expect the existence of regions of extended $γ$-ray emission from these regions as well? Here we investigate the transport of cosmic rays (CRs) escaped from SNRs in order to assess the viability of the idea of having a cocoon of suppressed diffusion around them. A comparison of our results with the $γ$-ray emission from the regions around HB9 and W28 does not provide solid evidence of reduced diffusivity. However, if indeed the phenomenon of reduced diffusivity occurs around SNRs surrounded by molecular clouds, our calculations show that the effects on the grammage of Galactic CRs can be significant.

Submitted 18 May, 2024; v1 submitted 14 May, 2024; originally announced May 2024.

Comments: published in ApJ

Journal ref: 2024 ApJ 966 224

arXiv:2405.07964 [pdf, other]

Early phase simultaneous multi-band observations of Type II supernova SN 2024ggi with Mephisto

Authors: Xinlei Chen, Brajesh Kumar, Xinzhong Er, Helong Guo, Yuan-Pei Yang, Weikang Lin, Yuan Fang, Guowang Du, Chenxu Liu, Jiewei Zhao, Tianyu Zhang, Yuxi Bao, Xingzhu Zou, Yu Pan, Yu Wang, Xufeng Zhu, Kaushik Chatterjee, Xiangkun Liu, Dezi Liu, Edoardo P. Lagioia, Geeta Rangwal, Shiyan Zhong, Jinghua Zhang, Jianhui Lian, Yongzhi Cai, et al. (2 additional authors not shown)

Abstract: We present early-phase good cadence simultaneous multi-band ($ugi$, $vrz$--bands) imaging of nearby supernova SN 2024ggi, which exploded in the nearby galaxy, NGC~3621. A quick follow-up was conducted within less than a day after the explosion and continued $\sim$23 days. The $uvg$-band light curves display a rapid rise ($\sim$1.4 mag day$^{-1}$) to maximum in $\sim$4 days and absolute magnitude $M_{g}\sim$--17.75 mag. The post-peak decay rate in redder bands is $\sim$0.01 mag day$^{-1}$. Different colors (e.g., $u-g$ and $v-r$) of SN~2024ggi are slightly redder than SN~2023ixf. A significant rise ($\sim$12.5 kK) in black-body temperature (optical) was noticed within $\sim$2 days after the explosion, which successively decreased, indicating shock break out inside a dense circumstellar medium (CSM) surrounding the progenitor. Using semi-analytical modeling, the ejecta mass and progenitor radius were estimated as 1.2 M$_{\odot}$ and $\sim$550 R$_{\odot}$, respectively. The archival deep images ($g,r,i,z$-bands) from the Dark Energy Camera Legacy Survey (DECaLS) were examined, and a possible progenitor was detected in each band ($\sim$22--22.5 mag) and had a mass range of 14--17 M$_{\odot}$.

Submitted 13 May, 2024; originally announced May 2024.

Comments: Pages 9, Table 1, Figures 7

arXiv:2405.07691 [pdf, other]

Discovery of Very-high-energy Gamma-ray Emissions from the Low Luminosity AGN NGC 4278 by LHAASO

Authors: Zhen Cao, F. Aharonian, Q. An, Axikegu, Y. X. Bai, Y. W. Bao, D. Bastieri, X. J. Bi, Y. J. Bi, J. T. Cai, Q. Cao, W. Y. Cao, Zhe Cao, J. Chang, J. F. Chang, A. M. Chen, E. S. Chen, Liang Chen, Lin Chen, Long Chen, M. J. Chen, M. L. Chen, Q. H. Chen, S. H. Chen, S. Z. Chen, et al. (255 additional authors not shown)

Abstract: The first source catalog of the Large High Altitude Air Shower Observatory reported the detection of a very-high-energy gamma-ray source, 1LHAASO J1219+2915. In this paper a further detailed study of the spectral and temporal behavior of this point-like source has been carried out. The best-fit position of the TeV source ($\rm{RA}=185.05^{\circ}\pm0.04^{\circ}$, $\rm{Dec}=29.25^{\circ}\pm0.03^{\circ}$) is compatible with NGC 4278 within $\sim0.03$ degree. Variation analysis shows an indication of variability at a few months level in the TeV band, which is consistent with low frequency observations. Based on these observations, we report the detection of TeV $γ$-ray emissions from this low-luminosity AGN NGC 4278. The observations by LHAASO-WCDA during the active period have a significance level of 8.8\,$σ$ with best-fit photon spectral index $\varGamma=2.56\pm0.14$ and a flux $f_{1-10\,\rm{TeV}}=(7.0\pm1.1_{\rm{sta}}\pm0.35_{\rm{syst}})\times10^{-13}\,\rm{photons\,cm^{-2}\,s^{-1}}$, or approximately $5\%$ of the Crab Nebula. The discovery of VHE emission from NGC 4278 indicates that the compact, weak radio jet can efficiently accelerate particles and emit TeV photons.

Submitted 13 May, 2024; originally announced May 2024.

Comments: 11 pages, 5 figures

arXiv:2405.06598 [pdf, other]

A Lightweight Transformer for Remote Sensing Image Change Captioning

Authors: Dongwei Sun, Yajie Bao, Xiangyong Cao

Abstract: Remote sensing image change captioning (RSICC) aims to automatically generate sentences that describe content differences in remote sensing bitemporal images. Recently, attention-based transformers have become a prevalent idea for capturing the features of global change. However, existing transformer-based RSICC methods face challenges, e.g., high parameters and high computational complexity caused by the self-attention operation in the transformer encoder component. To alleviate these issues, this paper proposes a Sparse Focus Transformer (SFT) for the RSICC task. Specifically, the SFT network consists of three main components, i.e. a high-level features extractor based on a convolutional neural network (CNN), a sparse focus attention mechanism-based transformer encoder network designed to locate and capture changing regions in dual-temporal images, and a description decoder that embeds images and words to generate sentences for captioning differences. The proposed SFT network can reduce the parameter number and computational complexity by incorporating a sparse attention mechanism within the transformer encoder network. Experimental results on various datasets demonstrate that even with a reduction of over 90\% in parameters and computational complexity for the transformer encoder, our proposed network can still obtain competitive performance compared to other state-of-the-art RSICC methods. The code can be available at

Submitted 10 May, 2024; originally announced May 2024.

arXiv:2405.05557 [pdf, ps, other]

Composition Rules for Strong Structural Controllability and Minimum Input Problem in Diffusively-Coupled Networks

Authors: Nam-Jin Park, Seong-Ho Kwon, Yoo-Bin Bae, Byeong-Yeon Kim, Kevin L. Moore, Hyo-Sung Ahn

Abstract: This paper presents new results and reinterpretation of existing conditions for strong structural controllability in a structured network determined by the zero/non-zero patterns of edges. For diffusively-coupled networks with self-loops, we first establish a necessary and sufficient condition for strong structural controllability, based on the concepts of dedicated and sharing nodes. Subsequently, we define several conditions for strong structural controllability across various graph types by decomposing them into disjoint path graphs. We further extend our findings by introducing a composition rule, facilitating the analysis of strong structural controllability in larger networks. This rule allows us to determine the strong structural controllability of connected graphs called pactus graphs (a generalization of the well-known cactus graph) by consideration of the strong structural controllability of its disjoint component graphs. In this process, we introduce the notion of a component input node, which is a state node that functions identically to an external input node. Based on this concept, we present an algorithm with approximate polynomial complexity to determine the minimum number of external input nodes required to maintain strong structural controllability in a diffusively-coupled network with self-loops.

Submitted 9 May, 2024; originally announced May 2024.

Comments: 11 pages, 5 figures. arXiv admin note: substantial text overlap with arXiv:2205.05275

arXiv:2404.17837 [pdf, other]

Hybrid 3D Human Pose Estimation with Monocular Video and Sparse IMUs

Authors: Yiming Bao, Xu Zhao, Dahong Qian

Abstract: Temporal 3D human pose estimation from monocular videos is a challenging task in human-centered computer vision due to the depth ambiguity of 2D-to-3D lifting. To improve accuracy and address occlusion issues, inertial sensors have been introduced to provide a complementary source of information. However, it remains challenging to integrate heterogeneous sensor data for producing physically rational 3D human poses. In this paper, we propose a novel framework, Real-time Optimization and Fusion (RTOF), to address this issue. We first incorporate sparse inertial orientations into a parametric human skeleton to refine 3D poses in kinematics. The poses are then optimized by energy functions built on both visual and inertial observations to reduce the temporal jitters. Our framework outputs smooth and biomechanically plausible human motion. Comprehensive experiments with ablation studies demonstrate its rationality and efficiency. On the Total Capture dataset, the pose estimation error is significantly decreased compared to the baseline method.

Submitted 27 April, 2024; originally announced April 2024.

Comments: 10 pages, 5 figures, Under Review

arXiv:2404.17582 [pdf, other]

Data Quality in Crowdsourcing and Spamming Behavior Detection

Authors: Yang Ba, Michelle V. Mancenido, Erin K. Chiou, Rong Pan

Abstract: As crowdsourcing emerges as an efficient and cost-effective method for obtaining labels for machine learning datasets, it is important to assess the quality of crowd-provided data, so as to improve analysis performance and reduce biases in subsequent machine learning tasks. Given the lack of ground truth in most cases of crowdsourcing, we refer to data quality as annotators' consistency and credibility. Unlike the simple scenarios where Kappa coefficient and intraclass correlation coefficient usually can apply, online crowdsourcing requires dealing with more complex situations. We introduce a systematic method for evaluating data quality and detecting spamming threats via variance decomposition, and we classify spammers into three categories based on their different behavioral patterns. A spammer index is proposed to assess entire data consistency and two metrics are developed to measure crowd worker's credibility by utilizing the Markov chain and generalized random effects models. Furthermore, we showcase the practicality of our techniques and their advantages by applying them on a face verification task with both simulation and real-world data collected from two crowdsourcing platforms.

Submitted 3 April, 2024; originally announced April 2024.

Comments: Preprint paper, under review on Behavior Research Methods. 45 pages, 10 figures

arXiv:2404.16831 [pdf, other]

The Third Monocular Depth Estimation Challenge

Authors: Jaime Spencer, Fabio Tosi, Matteo Poggi, Ripudaman Singh Arora, Chris Russell, Simon Hadfield, Richard Bowden, GuangYuan Zhou, ZhengXin Li, Qiang Rao, YiPing Bao, Xiao Liu, Dohyeong Kim, Jinseong Kim, Myunghyun Kim, Mykola Lavreniuk, Rui Li, Qing Mao, Jiang Wu, Yu Zhu, Jinqiu Sun, Yanning Zhang, Suraj Patni, Aradhye Agarwal, Chetan Arora, et al. (16 additional authors not shown)

Abstract: This paper discusses the results of the third edition of the Monocular Depth Estimation Challenge (MDEC). The challenge focuses on zero-shot generalization to the challenging SYNS-Patches dataset, featuring complex scenes in natural and indoor settings. As with the previous edition, methods can use any form of supervision, i.e. supervised or self-supervised. The challenge received a total of 19 submissions outperforming the baseline on the test set: 10 among them submitted a report describing their approach, highlighting a diffused use of foundational models such as Depth Anything at the core of their method. The challenge winners drastically improved 3D F-Score performance, from 17.51% to 23.72%.

Submitted 27 April, 2024; v1 submitted 25 April, 2024; originally announced April 2024.

Comments: To appear in CVPRW2024

arXiv:2404.11929 [pdf, other]

A Symmetric Regressor for MRI-Based Assessment of Striatal Dopamine Transporter Uptake in Parkinson's Disease

Authors: Walid Abdullah Al, Il Dong Yun, Yun Jung Bae

Abstract: Dopamine transporter (DAT) imaging is commonly used for monitoring Parkinson's disease (PD), where striatal DAT uptake amount is computed to assess PD severity. However, DAT imaging has a high cost and the risk of radiance exposure and is not available in general clinics. Recently, MRI patch of the nigral region has been proposed as a safer and easier alternative. This paper proposes a symmetric regressor for predicting the DAT uptake amount from the nigral MRI patch. Acknowledging the symmetry between the right and left nigrae, the proposed regressor incorporates a paired input-output model that simultaneously predicts the DAT uptake amounts for both the right and left striata. Moreover, it employs a symmetric loss that imposes a constraint on the difference between right-to-left predictions, resembling the high correlation in DAT uptake amounts in the two lateral sides. Additionally, we propose a symmetric Monte-Carlo (MC) dropout method for providing a fruitful uncertainty estimate of the DAT uptake prediction, which utilizes the above symmetry. We evaluated the proposed approach on 734 nigral patches, which demonstrated significantly improved performance of the symmetric regressor compared with the standard regressors while giving better explainability and feature representation. The symmetric MC dropout also gave precise uncertainty ranges with a high probability of including the true DAT uptake amounts within the range.

Submitted 18 April, 2024; originally announced April 2024.

arXiv:2404.09482 [pdf, other]

Binary microlensing by high eccentric stellar-mass black hole binaries

Authors: Kyungmin Kim, Yeong-Bok Bae, Yoon-Hyun Ryu

Abstract: Microlensing is one of the most promising tools for discovering stellar-mass black holes (BHs) in the Milky Way because it allows us to probe dark or faint celestial compact objects. While the existence of stellar-mass BHs has been confirmed through observation of X-ray binaries within our galaxy and gravitational waves from extragalactic BH binaries, a conclusive observation of microlensing events caused by Galactic BH binaries has yet to be achieved. In this study, we focus on those with high eccentricity, including unbound orbits, which can dynamically form in star clusters and could potentially increase the observation rate. We demonstrate parameter estimation for simulated light curves supposing various orbital configurations of BH binary lenses. We employ a model-based fitting using the Nelder-Mead method and Bayesian inference based on the Markov chain Monte Carlo method for the demonstration. The results show that we can retrieve true values of the parameters of high eccentric BH binary lenses within the 1$σ$ uncertainty of inferred values. We conclude it is feasible to find high eccentric Galactic BH binaries from the observation of binary microlensing events.

Submitted 15 April, 2024; originally announced April 2024.

Comments: 12 pages, 9 figures, 4 tables

arXiv:2404.08217 [pdf, other]

Avoid Arguments and Escape with Your Self: Expressive Subtyping and Decidable Bidirectional Checking for Reachability Types

Authors: Songlin Jia, Guannan Wei, Siyuan He, Yuyan Bao, Tiark Rompf

Abstract: Despite Rust's success in systems programming, its ``shared XOR mutable'' principle significantly restricts how mutable values can be used, precluding many useful functional programming idioms. Reachability types are a recent proposal to address the key limitations of Rust-style approaches by tracking, rather than prohibiting, shared, escaping, and mutable data, even in the presence of higher-order functions and polymorphic types. The key to enabling such expressiveness is the notion of self-references in reachability qualifiers. However, self-references present major challenges in designing expressive subtyping and decidable type checking algorithms, since self-references are neither fully covariant nor fully contravariant, yet still need to vary in certain circumstances. This lack of an effective type checking algorithm is a key impediment toward making reachability types truly practical, and leveraging them to bring the benefits of programming with lifetimes and sharing to practical higher-level languages. In this paper, we investigate the issues of subtyping and type checking of self-references for reachability types. We address key gaps in previous work by proposing a refined notion of subtyping, which more smoothly supports features such as Church-encoded datatypes, making the overall system more expressive. We also develop a sound and decidable bidirectional type checking algorithm, implemented and verified in Coq.

Submitted 15 July, 2024; v1 submitted 11 April, 2024; originally announced April 2024.

arXiv:2404.04801 [pdf, ps, other]

doi 10.1007/s41605-024-00467-8

LHAASO-KM2A detector simulation using Geant4

Authors: Zhen Cao, F. Aharonian, Q. An, Axikegu, Y. X. Bai, Y. W. Bao, D. Bastieri, X. J. Bi, Y. J. Bi, J. T. Cai, Q. Cao, W. Y. Cao, Zhe Cao, J. Chang, J. F. Chang, A. M. Chen, E. S. Chen, Liang Chen, Lin Chen, Long Chen, M. J. Chen, M. L. Chen, Q. H. Chen, S. H. Chen, S. Z. Chen, et al. (254 additional authors not shown)

Abstract: KM2A is one of the main sub-arrays of LHAASO, working on gamma ray astronomy and cosmic ray physics at energies above 10 TeV. Detector simulation is the important foundation for estimating detector performance and data analysis. It is a big challenge to simulate the KM2A detector in the framework of Geant4 due to the need to track numerous photons from a large number of detector units (>6000) with large altitude difference (30 m) and huge coverage (1.3 km^2). In this paper, the design of the KM2A simulation code G4KM2A based on Geant4 is introduced. The process of G4KM2A is optimized mainly in memory consumption to avoid memory overflow. Some simplifications are used to significantly speed up the execution of G4KM2A. The running time is reduced by at least 30 times compared to full detector simulation. The particle distributions and the core/angle resolution comparison between simulation and experimental data of the full KM2A array are also presented, which show good agreement.

Submitted 7 April, 2024; originally announced April 2024.

arXiv:2403.20134 [pdf, other]

User Modeling Challenges in Interactive AI Assistant Systems

Authors: Megan Su, Yuwei Bao

Abstract: Interactive Artificial Intelligence (AI) assistant systems are designed to offer timely guidance to help human users complete a variety of tasks. One of the remaining challenges is to understand users' mental states during the task for more personalized guidance. In this work, we analyze users' mental states during task executions and investigate the capabilities and challenges for large language models to interpret user profiles for more personalized user guidance.

Submitted 29 March, 2024; originally announced March 2024.

arXiv:2403.19821 [pdf]

Formation of Oriented Bilayer Motif -- Vanadyl Phthalocyanine on Ag(100)

Authors: William Koll, Corina Urdaniz, Kyungju Noh, Yujeong Bae, Christoph Wolf, Jay Gupta

Abstract: The adsorption and self-assembly of vanadyl phthalocyanine molecules on Ag(100) has been investigated using a combination of scanning tunneling microscopy and density functional theory. At sub-monolayer coverage, we observe two distinct adsorption configurations of isolated molecules, corresponding to the central O atom pointing toward (O-down) or away (O-up) from the substrate. Upon adsorption in the O-up orientation, the otherwise achiral molecules take on a windmill-like chiral appearance due to their interaction with the substrate. At monolayer coverage, we observe a self-assembled square lattice with a mixture of O-up and O-down molecules. At higher coverage we find a strong preference for bilayer formation with O-up and O-down molecules in alternating layers, suggesting stabilization by dipolar interactions. Close inspection of the multi-layer surface reveals grain boundaries separating domains of opposite organizational chirality, and long-range ordering.

Submitted 28 March, 2024; originally announced March 2024.

Comments: 15 pages, 5 figures

arXiv:2403.17069 [pdf, other]

Tensor network formulation of symmetry protected topological phases in mixed states

Authors: Hanyu Xue, Jong Yeon Lee, Yimu Bao

Abstract: We define and classify symmetry-protected topological (SPT) phases in mixed states based on the tensor network formulation of the density matrix. In one dimension, we introduce strong injective matrix product density operators (MPDO), which describe a broad class of short-range correlated mixed states, including the locally decohered SPT states. We map strong injective MPDO to a pure state in the doubled Hilbert space and define the SPT phases according to the cohomology class of the symmetry group in the doubled state. Although the doubled state exhibits an enlarged symmetry, the possible SPT phases are also constrained by the Hermiticity and the semi-positivity of the density matrix. We here obtain a complete classification of SPT phases with a direct product of strong $G$ and weak $K$ unitary symmetry given by the cohomology group $H^2(G, \text{U}(1))\oplus H^1(K, H^1(G, \text{U}(1)))$. The SPT phases in our definition are preserved under symmetric local circuits consisting of non-degenerate channels. This motivates an alternative definition of SPT phases according to the equivalence class of mixed states under a ``one-way" connection using symmetric non-degenerate channels. In locally purifiable MPDO with strong symmetry, we prove that this alternative definition reproduces the cohomology classification. We further extend our results to two-dimensional mixed states described by strong semi-injective tensor network density operators and classify the possible SPT phases.

Submitted 15 May, 2024; v1 submitted 25 March, 2024; originally announced March 2024.

Comments: Appendix D is fixed

arXiv:2403.16541 [pdf, ps, other]

Effects of tensor spin polarization on the chiral restoration and deconfinement phase transitions

Authors: Yan-Ru Bao, Sheng-Qin Feng

Abstract: Effects of tensor spin polarization (TSP) on the chiral restoration and deconfinement phase transitions are studied in Polyakov loop extended Nambu-Jona-Lasinio (PNJL) model. For chiral phase transition, the higher the polarized degree of quark-antiquark pairs under the strong magnetic field, the higher the phase transition temperature. The TSP corrects the position of the critical end point. The small impact of TSP on the phase transition temperature is found for the deconfinement phase transition. On the other hand, we divide the phase space into three ranges based on the phase diagram obtained from the PNJL model: the confinement phase with chiral symmetry broken, the deconfinement phase with restored chiral symmetry, and the confinement phase with restored chiral symmetry (quarkyonic phase). It is found that TSP has only a very small effect on the anisotropic pressure in the deconfined phase with chiral symmetry restored and the quarkyonic phase, but it has a very strong effect on the anisotropic pressure in the confined phase with chiral symmetry broken. This is because TSP is closely related to chiral symmetry. The restoration of chiral symmetry means the dissociation of spin polarization condensate.

Submitted 23 May, 2024; v1 submitted 25 March, 2024; originally announced March 2024.

Comments: 20 pages, 7 figures

Journal ref: Physical Review D 109, 096033 (2024)

arXiv:2403.14874 [pdf, other]

WeatherProof: Leveraging Language Guidance for Semantic Segmentation in Adverse Weather

Authors: Blake Gella, Howard Zhang, Rishi Upadhyay, Tiffany Chang, Nathan Wei, Matthew Waliman, Yunhao Ba, Celso de Melo, Alex Wong, Achuta Kadambi

Abstract: We propose a method to infer semantic segmentation maps from images captured under adverse weather conditions. We begin by examining existing models on images degraded by weather conditions such as rain, fog, or snow, and found that they exhibit a large performance drop as compared to those captured under clear weather. To control for changes in scene structures, we propose WeatherProof, the first semantic segmentation dataset with accurate clear and adverse weather image pairs that share an underlying scene. Through this dataset, we analyze the error modes in existing models and found that they were sensitive to the highly complex combination of different weather effects induced on the image during capture. To improve robustness, we propose a way to use language as guidance by identifying contributions of adverse weather conditions and injecting that as "side information". Models trained using our language guidance exhibit performance gains by up to 10.2% in mIoU on WeatherProof, up to 8.44% in mIoU on the widely used ACDC dataset compared to standard training techniques, and up to 6.21% in mIoU on the ACDC dataset as compared to previous SOTA methods.

Submitted 7 May, 2024; v1 submitted 21 March, 2024; originally announced March 2024.

Comments: arXiv admin note: substantial text overlap with arXiv:2312.09534

arXiv:2403.14541 [pdf, other]

EDT: Improving Large Language Models' Generation by Entropy-based Dynamic Temperature Sampling

Authors: Shimao Zhang, Yu Bao, Shujian Huang

Abstract: Recently, Large Language Models (LLMs) have demonstrated outstanding performance across a wide range of downstream language tasks. Temperature sampling is a commonly used decoding strategy for LLMs' generation process. However, a fixed temperature parameter is used in most cases, which may not always be an optimal choice for balancing generation quality and diversity. In this paper, we propose an effective Entropy-based Dynamic Temperature (EDT) Sampling method, to achieve a more balanced performance in terms of both generation quality and diversity by dynamically selecting the temperature parameter. Additionally, we also show model performance and comprehensive analyses for 4 different generation benchmarks. Our experiments show that EDT significantly outperforms the existing strategies across different tasks.

Submitted 3 April, 2024; v1 submitted 21 March, 2024; originally announced March 2024.

arXiv:2403.13829 [pdf, other]

DecompOpt: Controllable and Decomposed Diffusion Models for Structure-based Molecular Optimization

Authors: Xiangxin Zhou, Xiwei Cheng, Yuwei Yang, Yu Bao, Liang Wang, Quanquan Gu

Abstract: Recently, 3D generative models have shown promising performances in structure-based drug design by learning to generate ligands given target binding sites. However, only modeling the target-ligand distribution can hardly fulfill one of the main goals in drug discovery -- designing novel ligands with desired properties, e.g., high binding affinity, easily synthesizable, etc. This challenge becomes particularly pronounced when the target-ligand pairs used for training do not align with these desired properties. Moreover, most existing methods aim at solving \textit{de novo} design task, while many generative scenarios requiring flexible controllability, such as R-group optimization and scaffold hopping, have received little attention. In this work, we propose DecompOpt, a structure-based molecular optimization method based on a controllable and decomposed diffusion model. DecompOpt presents a new generation paradigm which combines optimization with conditional diffusion models to achieve desired properties while adhering to the molecular grammar. Additionally, DecompOpt offers a unified framework covering both \textit{de novo} design and controllable generation. To achieve so, ligands are decomposed into substructures which allows fine-grained control and local optimization. Experiments show that DecompOpt can efficiently generate molecules with improved properties than strong de novo baselines, and demonstrate great potential in controllable generation tasks.

Submitted 6 March, 2024; originally announced March 2024.

Comments: Accepted to ICLR 2024

arXiv:2403.12327 [pdf, other]

GT-Rain Single Image Deraining Challenge Report

Authors: Howard Zhang, Yunhao Ba, Ethan Yang, Rishi Upadhyay, Alex Wong, Achuta Kadambi, Yun Guo, Xueyao Xiao, Xiaoxiong Wang, Yi Li, Yi Chang, Luxin Yan, Chaochao Zheng, Luping Wang, Bin Liu, Sunder Ali Khowaja, Jiseok Yoon, Ik-Hyun Lee, Zhao Zhang, Yanyan Wei, Jiahuan Ren, Suiyi Zhao, Huan Zheng

Abstract: This report reviews the results of the GT-Rain challenge on single image deraining at the UG2+ workshop at CVPR 2023. The aim of this competition is to study the rainy weather phenomenon in real world scenarios, provide a novel real world rainy image dataset, and spark innovative ideas that will further the development of single image deraining methods on real images. Submissions were trained on the GT-Rain dataset and evaluated on an extension of the dataset consisting of 15 additional scenes. Each scene in GT-Rain comprises a real rainy image and a ground truth image captured moments after the rain had stopped. 275 participants were registered in the challenge and 55 competed in the final testing phase.

Submitted 18 March, 2024; originally announced March 2024.

arXiv:2403.10026 [pdf]

High sensitivity and large scanning range optical antennas enabled by multi-casting ridge-waveguide subwavelength structure arrays

Authors: Weijie Xu, Xianxian Jiang, Yelong Bao, Junjia Wang

Abstract: With the rapid development of large-scale integrated photonics, the optical phased array (OPA) is an effective way to realize highly integrated, stable, and low-cost beam control systems. Achieving a large field of view (FOV) in the longitudinal direction without increasing fabrication cost and system complexity is still a significant challenge in OPA antennas. Here, a high sensitivity and large scanning range antenna based on a subwavelength structure array is proposed to enhance the longitudinal scanning and free-space radiating efficiency by using the ridge-waveguide structure and backward-emitting. A millimeter-long grating antenna with a far-field beam divergence of 0.13° and a wavelength sensitivity of 0.237°/nm is experimentally demonstrated. Furthermore, by using different sideband periods, we introduce a multi-casting grating antenna with a large scanning range up to 42.6°. The proposed devices show significant improvement in longitudinal wavelength sensitivity compared with typical waveguide grating antennas.

Submitted 15 March, 2024; originally announced March 2024.

arXiv:2403.10010 [pdf, other]

doi 10.1103/PhysRevLett.132.131002

Measurements of All-Particle Energy Spectrum and Mean Logarithmic Mass of Cosmic Rays from 0.3 to 30 PeV with LHAASO-KM2A

Authors: The LHAASO Collaboration, Zhen Cao, F. Aharonian, Q. An, A. Axikegu, Y. X. Bai, Y. W. Bao, D. Bastieri, X. J. Bi, Y. J. Bi, J. T. Cai, Q. Cao, W. Y. Cao, Zhe Cao, J. Chang, J. F. Chang, A. M. Chen, E. S. Chen, Liang Chen, Lin Chen, Long Chen, M. J. Chen, M. L. Chen, Q. H. Chen, S. H. Chen , et al. (256 additional authors not shown)

Abstract: We present measurements of the all-particle energy spectrum and mean logarithmic mass of cosmic rays in the energy range of 0.3-30 PeV using data collected from LHAASO-KM2A between September 2021 and December 2022, which is based on a nearly composition-independent energy reconstruction method, achieving unprecedented accuracy. Our analysis reveals the position of the knee at $3.67 \pm 0.05 \pm 0.15$ PeV. Below the knee, the spectral index is found to be -$2.7413 \pm 0.0004 \pm 0.0050$, while above the knee, it is -$3.128 \pm 0.005 \pm 0.027$, with the sharpness of the transition measured with a statistical error of 2%. The mean logarithmic mass of cosmic rays is almost heavier than helium in the whole measured energy range. It decreases from 1.7 at 0.3 PeV to 1.3 at 3 PeV, representing a 24% decline following a power law with an index of -$0.1200 \pm 0.0003 \pm 0.0341$. This is equivalent to an increase in abundance of light components. Above the knee, the mean logarithmic mass exhibits a power law trend towards heavier components, which is the reverse of the behavior observed in the all-particle energy spectrum. Additionally, the knee position and the change in power-law index are approximately the same. These findings suggest that the knee observed in the all-particle spectrum corresponds to the knee of the light component, rather than the medium-heavy components.

Submitted 26 March, 2024; v1 submitted 15 March, 2024; originally announced March 2024.

Comments: 8 pages, 3 figures

Journal ref: Physical Review Letters 132, 131002 (2024)
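
The figures quoted in the abstract above can be cross-checked with a few lines of arithmetic. The sketch below assumes the mean logarithmic mass follows a pure power law in energy anchored at 0.3 PeV, uses only the central value of the quoted index (uncertainties are ignored), and the function name `power_law` is a hypothetical helper, not something taken from the paper.

```python
def power_law(value_at_ref, energy, ref_energy, index):
    """Scale a reference value by (energy / ref_energy) ** index."""
    return value_at_ref * (energy / ref_energy) ** index

# Values quoted in the abstract (central values only; an assumption of this sketch).
ln_a_start = 1.7   # mean logarithmic mass at 0.3 PeV
index = -0.12      # quoted power-law index below the knee

# Extrapolate from 0.3 PeV to 3 PeV, i.e. one decade in energy.
ln_a_end = power_law(ln_a_start, energy=3.0, ref_energy=0.3, index=index)

# Fractional decline over that decade.
decline = 1.0 - ln_a_end / ln_a_start

print(f"mean ln A at 3 PeV: {ln_a_end:.2f}")
print(f"decline: {decline:.1%}")
```

Under these assumptions the extrapolated value rounds to the 1.3 stated in the abstract, and the decline over one decade in energy is close to the stated 24%.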

arXiv:2403.09199 [pdf, other]

Customizing Segmentation Foundation Model via Prompt Learning for Instance Segmentation

Authors: Hyung-Il Kim, Kimin Yun, Jun-Seok Yun, Yuseok Bae

Abstract: Recently, foundation models trained on massive datasets to adapt to a wide range of domains have attracted considerable attention and are actively being explored within the computer vision community. Among these, the Segment Anything Model (SAM) stands out for its remarkable progress in generalizability and flexibility for image segmentation tasks, achieved through prompt-based object mask generation. However, despite its strength, SAM faces two key limitations when applied to customized instance segmentation that segments specific objects or those in unique environments not typically present in the training data: 1) the ambiguity inherent in input prompts and 2) the necessity for extensive additional training to achieve optimal segmentation. To address these challenges, we propose a novel method, customized instance segmentation via prompt learning tailored to SAM. Our method involves a prompt learning module (PLM), which adjusts input prompts into the embedding space to better align with user intentions, thereby enabling more efficient training. Furthermore, we introduce a point matching module (PMM) to enhance the feature representation for finer segmentation by ensuring detailed alignment with ground truth boundaries. Experimental results on various customized instance segmentation scenarios demonstrate the effectiveness of the proposed method.

Submitted 14 March, 2024; originally announced March 2024.

Comments: 11 pages, 10 figures

arXiv:2403.09192 [pdf, other]

PYRA: Parallel Yielding Re-Activation for Training-Inference Efficient Task Adaptation

Authors: Yizhe Xiong, Hui Chen, Tianxiang Hao, Zijia Lin, Jungong Han, Yuesong Zhang, Guoxin Wang, Yongjun Bao, Guiguang Ding

Abstract: Recently, the scale of transformers has grown rapidly, which introduces considerable challenges in terms of training overhead and inference efficiency in the scope of task adaptation. Existing works, namely Parameter-Efficient Fine-Tuning (PEFT) and model compression, have separately investigated the challenges. However, PEFT cannot guarantee the inference efficiency of the original backbone, especially for large-scale models. Model compression requires significant training costs for structure searching and re-training. Consequently, a simple combination of them cannot guarantee accomplishing both training efficiency and inference efficiency with minimal costs. In this paper, we propose a novel Parallel Yielding Re-Activation (PYRA) method for such a challenge of training-inference efficient task adaptation. PYRA first utilizes parallel yielding adaptive weights to comprehensively perceive the data distribution in downstream tasks. A re-activation strategy for token modulation is then applied for tokens to be merged, leading to calibrated token features. Extensive experiments demonstrate that PYRA outperforms all competing methods under both low compression rate and high compression rate, demonstrating its effectiveness and superiority in maintaining both training efficiency and inference efficiency for large-scale foundation models. Our code is available at https://github.com/THU-MIG/PYRA.

Submitted 16 July, 2024; v1 submitted 14 March, 2024; originally announced March 2024.

Comments: 14 pages, 4 figures, Accepted by ECCV 2024

arXiv:2403.07902 [pdf, other]

DecompDiff: Diffusion Models with Decomposed Priors for Structure-Based Drug Design

Authors: Jiaqi Guan, Xiangxin Zhou, Yuwei Yang, Yu Bao, Jian Peng, Jianzhu Ma, Qiang Liu, Liang Wang, Quanquan Gu

Abstract: Designing 3D ligands within a target binding site is a fundamental task in drug discovery. Existing structure-based drug design methods treat all ligand atoms equally, which ignores the different roles of atoms in the ligand for drug design and can be less efficient for exploring the large drug-like molecule space. In this paper, inspired by the convention in pharmaceutical practice, we decompose the ligand molecule into two parts, namely arms and scaffold, and propose a new diffusion model, DecompDiff, with decomposed priors over arms and scaffold. In order to facilitate the decomposed generation and improve the properties of the generated molecules, we incorporate both bond diffusion in the model and additional validity guidance in the sampling phase. Extensive experiments on CrossDocked2020 show that our approach achieves state-of-the-art performance in generating high-affinity molecules while maintaining proper molecular properties and conformational stability, with up to -8.39 Avg. Vina Dock score and 24.5 Success Rate. The code is provided at https://github.com/bytedance/DecompDiff

Submitted 26 February, 2024; originally announced March 2024.

Comments: Accepted to ICML 2023

arXiv:2403.07728 [pdf, other]

CAP: A General Algorithm for Online Selective Conformal Prediction with FCR Control

Authors: Yajie Bao, Yuyang Huo, Haojie Ren, Changliang Zou

Abstract: We study the problem of post-selection predictive inference in an online fashion. To avoid devoting resources to unimportant units, a preliminary selection of the current individual before reporting its prediction interval is common and meaningful in online predictive tasks. Since the online selection causes a temporal multiplicity in the selected prediction intervals, it is important to control the real-time false coverage-statement rate (FCR), which measures the overall miscoverage level. We develop a general framework named CAP (Calibration after Adaptive Pick) that performs an adaptive pick rule on historical data to construct a calibration set if the current individual is selected and then outputs a conformal prediction interval for the unobserved label. We provide tractable procedures for constructing the calibration set for popular online selection rules. We prove that CAP can achieve an exact selection-conditional coverage guarantee in the finite-sample and distribution-free regimes. To account for the distribution shift in online data, we also embed CAP into some recent dynamic conformal prediction algorithms and show that the proposed method can deliver long-run FCR control. Numerical results on both synthetic and real data corroborate that CAP can effectively control FCR around the target level and yield narrower prediction intervals than existing baselines across various settings.

Submitted 28 March, 2024; v1 submitted 12 March, 2024; originally announced March 2024.

arXiv:2403.07521 [pdf, ps, other]

Cohomologies and deformations of differential algebra morphisms

Authors: Lei Du, Yanhong Bao

Abstract: This paper studies the formal deformations of differential algebra morphisms. As a consequence, we develop a cohomology theory of differential algebra morphisms to interpret the lower degree cohomology groups as formal deformations. Then, we prove the Cohomology Comparison Theorem of differential algebra morphisms, i.e., the cohomology of a morphism of differential algebras is isomorphic to the cohomology of an auxiliary differential algebra. Finally, we give a minimal model for morphisms of differential algebras with weight=0.

Submitted 12 March, 2024; originally announced March 2024.

arXiv:2403.06700 [pdf, other]

Enhancing Adversarial Training with Prior Knowledge Distillation for Robust Image Compression

Authors: Zhi Cao, Youneng Bao, Fanyang Meng, Chao Li, Wen Tan, Genhong Wang, Yongsheng Liang

Abstract: Deep neural network-based image compression (NIC) has achieved excellent performance, but NIC models have been shown to be susceptible to backdoor attacks. Adversarial training has been validated in image compression models as a common method to enhance model robustness. However, the improvement effect of adversarial training on model robustness is limited. In this paper, we propose a prior knowledge-guided adversarial training framework for image compression models. Specifically, first, we propose a gradient regularization constraint for training robust teacher models. Subsequently, we design a knowledge distillation based strategy to generate a priori knowledge from the teacher model to the student model for guiding adversarial training. Experimental results show that our method improves the reconstruction quality by about 9dB when the Kodak dataset is selected as the backdoor attack object for the PSNR attack. Compared with Ma2023, our method has a 5dB higher PSNR output at high bitrate points.

Submitted 15 March, 2024; v1 submitted 11 March, 2024; originally announced March 2024.

arXiv:2403.06443 [pdf, other]

Temporal-Mapping Photography for Event Cameras

Authors: Yuhan Bao, Lei Sun, Yuqin Ma, Kaiwei Wang

Abstract: Event cameras, or Dynamic Vision Sensors (DVS), are novel neuromorphic sensors that capture brightness changes as a continuous stream of "events" rather than traditional intensity frames. Converting sparse events to dense intensity frames faithfully has long been an ill-posed problem. Previous methods have primarily focused on converting events to video in dynamic scenes or with a moving camera. In this paper, for the first time, we realize events to dense intensity image conversion using a stationary event camera in static scenes. Different from traditional methods that mainly rely on event integration, the proposed Event-Based Temporal Mapping Photography (EvTemMap) measures the time of event emitting for each pixel. Then, the resulting Temporal Matrix is converted to an intensity frame with a temporal mapping neural network. At the hardware level, the proposed EvTemMap is implemented by combining a transmittance adjustment device with a DVS, named Adjustable Transmittance Dynamic Vision Sensor. Additionally, we collected the TemMat dataset under various conditions including low-light and high dynamic range scenes. The experimental results showcase the high dynamic range, fine-grained details, and high-grayscale-resolution of the proposed EvTemMap, as well as the enhanced performance on downstream computer vision tasks compared to other methods. The code and TemMat dataset will be made publicly available.

Submitted 11 March, 2024; originally announced March 2024.

Comments: 17 pages, 10 figures

arXiv:2403.03698 [pdf, other]

Towards Controllable Time Series Generation

Authors: Yifan Bao, Yihao Ang, Qiang Huang, Anthony K. H. Tung, Zhiyong Huang

Abstract: Time Series Generation (TSG) has emerged as a pivotal technique in synthesizing data that accurately mirrors real-world time series, becoming indispensable in numerous applications. Despite significant advancements in TSG, its efficacy frequently hinges on having large training datasets. This dependency presents a substantial challenge in data-scarce scenarios, especially when dealing with rare or unique conditions. To confront these challenges, we explore a new problem of Controllable Time Series Generation (CTSG), aiming to produce synthetic time series that can adapt to various external conditions, thereby tackling the data scarcity issue. In this paper, we propose Controllable Time Series (CTS), an innovative VAE-agnostic framework tailored for CTSG. A key feature of CTS is that it decouples the mapping process from standard VAE training, enabling precise learning of a complex interplay between latent features and external conditions. Moreover, we develop a comprehensive evaluation scheme for CTSG. Extensive experiments across three real-world time series datasets showcase CTS's exceptional capabilities in generating high-quality, controllable outputs. This underscores its adeptness in seamlessly integrating latent features with external conditions. Extending CTS to the image domain highlights its remarkable potential for explainability and further reinforces its versatility across different modalities.

Submitted 6 March, 2024; originally announced March 2024.

Comments: 14 pages, 13 figures, and 5 tables

arXiv:2403.01549 [pdf, other]

Self-Supervised Representation Learning with Meta Comprehensive Regularization

Authors: Huijie Guo, Ying Ba, Jie Hu, Lingyu Si, Wenwen Qiang, Lei Shi

Abstract: Self-Supervised Learning (SSL) methods harness the concept of semantic invariance by utilizing data augmentation strategies to produce similar representations for different deformations of the same input. Essentially, the model captures the shared information among multiple augmented views of samples, while disregarding the non-shared information that may be beneficial for downstream tasks. To address this issue, we introduce a module called CompMod with Meta Comprehensive Regularization (MCR), embedded into existing self-supervised frameworks, to make the learned representations more comprehensive. Specifically, we update our proposed model through a bi-level optimization mechanism, enabling it to capture comprehensive features. Additionally, guided by the constrained extraction of features using maximum entropy coding, the self-supervised learning model learns more comprehensive features on top of learning consistent features. We also provide theoretical support for our proposed method from information-theoretic and causal counterfactual perspectives. Experimental results show that our method achieves significant improvement in classification, object detection and instance segmentation tasks on multiple benchmark datasets.

Submitted 3 March, 2024; originally announced March 2024.

arXiv:2402.18583 [pdf, other]

Binding-Adaptive Diffusion Models for Structure-Based Drug Design

Authors: Zhilin Huang, Ling Yang, Zaixi Zhang, Xiangxin Zhou, Yu Bao, Xiawu Zheng, Yuwei Yang, Yu Wang, Wenming Yang

Abstract: Structure-based drug design (SBDD) aims to generate 3D ligand molecules that bind to specific protein targets. Existing 3D deep generative models including diffusion models have shown great promise for SBDD. However, it is complex to capture the essential protein-ligand interactions exactly in 3D space for molecular generation. To address this problem, we propose a novel framework, namely Binding-Adaptive Diffusion Models (BindDM). In BindDM, we adaptively extract the subcomplex, the essential part of binding sites responsible for protein-ligand interactions. Then the selected protein-ligand subcomplex is processed with SE(3)-equivariant neural networks, and transmitted back to each atom of the complex for augmenting the target-aware 3D molecule diffusion generation with binding interaction information. We iterate this hierarchical complex-subcomplex process with a cross-hierarchy interaction node for adequately fusing global binding context between the complex and its corresponding subcomplex. Empirical studies on the CrossDocked2020 dataset show BindDM can generate molecules with more realistic 3D structures and higher binding affinities towards the protein targets, with up to -5.92 Avg. Vina Score, while maintaining proper molecular properties. Our code is available at https://github.com/YangLing0818/BindDM

Submitted 14 January, 2024; originally announced February 2024.

Comments: Accepted by AAAI 2024. Project: https://github.com/YangLing0818/BindDM

arXiv:2402.15678 [pdf, other]

Minions: Accelerating Large Language Model Inference with Adaptive and Collective Speculative Decoding

Authors: Siqi Wang, Hailong Yang, Xuezhu Wang, Tongxuan Liu, Pengbo Wang, Xuning Liang, Kejie Ma, Tianyu Feng, Xin You, Yongjun Bao, Yi Liu, Zhongzhi Luan, Depei Qian

Abstract: Large language models (LLMs) have recently attracted surging interest due to their outstanding capabilities across various domains. However, enabling efficient LLM inference is challenging due to its autoregressive decoding that generates tokens only one at a time. Although research works apply pruning or quantization to speed up LLM inference, they typically require fine-tuning the LLM, incurring significant time and economic costs. Meanwhile, speculative decoding has been proposed to use small speculative models (SSMs) to accelerate the inference of LLMs. However, the low acceptance rate of SSMs and the high verification cost of LLMs prohibit further performance improvement of inference. In this paper, we propose Minions, an LLM inference system that accelerates LLM inference with a collective and adaptive speculative generation. Specifically, Minions proposes a majority-voted mechanism to leverage multiple SSMs to jointly speculate the outputs of the LLM, which improves the inference performance without introducing prohibitive computation costs for the LLM. To better trade off the number of tokens speculated from the SSM and the verification cost of the LLM, Minions proposes an adaptive mechanism to dynamically determine the optimal speculation length of the SSM, which can achieve better inference performance across different models, datasets, and hyper-parameters. In addition, Minions decouples the SSM decoding and LLM verification efficiently and adopts a pipelined execution mechanism to further improve the inference performance of the LLM. By comparing with the state-of-the-art LLM inference systems, we demonstrate that Minions can achieve higher inference throughput and lower inference time.

Submitted 23 February, 2024; originally announced February 2024.

arXiv:2402.08861 [pdf, ps, other]

On generalized Beauville decompositions

Authors: Younghan Bae, Davesh Maulik, Junliang Shen, Qizheng Yin

Abstract: Motivated by the Beauville decomposition of an abelian scheme and the "Perverse = Chern" phenomenon for a compactified Jacobian fibration, we study in this paper splittings of the perverse filtration for compactified Jacobian fibrations. On the one hand, we prove for the Beauville-Mukai system associated with an irreducible curve class on a K3 surface the existence of a Fourier-stable multiplicative splitting of the perverse filtration, which extends the Beauville decomposition for the nonsingular fibers. Our approach is to construct a Lefschetz decomposition associated with a Fourier-conjugate $\mathfrak{sl}_2$-triple, which relies heavily on recent work concerning the interaction between derived equivalences and LLV algebras for hyper-Kähler varieties. Motivic lifting and connections to the Beauville-Voisin conjectures are also discussed. On the other hand, we construct for any $g\geq 2$ a compactified Jacobian fibration of genus $g$ curves such that each curve is integral with at worst simple nodes and the (multiplicative) perverse filtration does not admit a multiplicative splitting. This shows that in general an extension of the Beauville decomposition cannot exist for compactified Jacobian fibrations even when the simplest singular point appears.

Submitted 13 February, 2024; originally announced February 2024.

Comments: 38 pages. Comments are welcome!

arXiv:2402.06526 [pdf, ps, other]

Counting surfaces on Calabi-Yau 4-folds II: $\mathrm{DT}$-$\mathrm{PT}_0$ correspondence

Authors: Younghan Bae, Martijn Kool, Hyeonjun Park

Abstract: This is the second part in a series of papers on counting surfaces on Calabi-Yau 4-folds. In this paper, we introduce $K$-theoretic $\mathrm{DT}, \mathrm{PT}_0, \mathrm{PT}_1$ invariants and conjecture a $\mathrm{DT}$-$\mathrm{PT}_0$ correspondence. For certain tautological insertions, we derive Lefschetz principles in both the compact and toric case allowing reductions to 3-dimensional $\mathrm{DT}, \mathrm{PT}$ invariants. We also develop a topological vertex and conjecture a $\mathrm{DT}$-$\mathrm{PT}_0$ vertex correspondence. These methods enable us to verify our conjectures in several examples.

Submitted 9 February, 2024; originally announced February 2024.

Comments: 97 pages

MSC Class: 14N35 (Primary) 14D20; 14J35; 14J60; 14M25; 14N10 (Secondary)

Showing 1–50 of 516 results for author: Bao, Y