Delayed luminescence and thermoluminescence in laboratory-grown diamonds

Authors: Jiahui Zhao, Ben L. Green, Ben G. Breeze, Hengxin Yuan, Troy Ardon, Wuyi Wang, Mark E. Newton

Abstract: The blue-green phosphorescence/thermoluminescence is most commonly observed in diamonds following excitation at or above the indirect band gap and has been explained by a substitutional nitrogen-boron donor-acceptor pair recombination model. Orange and red phosphorescence have also been frequently observed in lab-grown near-colourless high-pressure high-temperature diamonds following optical excitation, and their luminescence mechanisms are shown to be different from that of the blue-green phosphorescence. The physics of the orange and red luminescence and phosphorescence bands, including the optical-excitation dependency (UV-NIR), temperature dependency (20-573 K), and related charge transfer process, is investigated by a combination of self-built time-resolved imaging/spectroscopic techniques. In this paper, an alternative model for long-lived phosphorescence based on charge trapping is proposed to explain the orange phosphorescence/thermoluminescence band. Additionally, the red phosphorescence band is attributed to a point defect that possibly has a three-level phosphorescence system.

Submitted 16 July, 2024; originally announced July 2024.

Comments: 11 pages, 9 figures

arXiv:2407.10727 [pdf]

Edwards thermodynamic framework controls density segregation in cyclically sheared granular materials

Authors: Haiyang Lu, Houfei Yuan, Shuyang Zhang, Zhikun Zeng, Yi Xing, Jiazhao Xu, Xin Wang, Yujie Wang

Abstract: Using X-ray tomography, we experimentally investigate granular segregation phenomena in a mixture of particles with different densities under quasi-static cyclic shear. We quantitatively characterize their height distributions at steady states by minimizing effective free energy based on a segregation temperature that captures the competition between the mixing entropy and gravitational potential energy. We find this temperature coincides with Edwards' compactivity within error under various pressures and cyclic shear amplitudes. Therefore, we find that granular segregation in quasi-static conditions can be fundamentally explained by an effective granular thermodynamic framework including real energy terms based on the Edwards statistical ensemble.

Submitted 15 July, 2024; originally announced July 2024.

Comments: 18 pages, 5 figures

arXiv:2407.09860 [pdf, other]

Quantum Vicsek Model for Active Matter

Authors: Hong Yuan, L. X. Cui, L. T. Chen, C. P. Sun

Abstract: We propose a quantum analog of the Vicsek model, consisting of an ensemble of overdamped spin-$1/2$ particles with ferromagnetic couplings, driven by a uniformly polarized magnetic field. The spontaneous magnetization of the spin components breaks the $SO(3)$ (or $SO(2)$) symmetry, inducing an ordered phase of flocking. We derive the hydrodynamic equations, similar to those formulated by Toner and Tu, by applying a mean-field approximation to the quantum analog model up to the next leading order. Our investigation not only establishes a microscopic connection between the Vicsek model and the Toner-Tu hydrodynamics for active matter, but also aims to inspire further studies of active matter in the quantum regime.

Submitted 13 July, 2024; originally announced July 2024.

arXiv:2407.08545 [pdf, other]

doi 10.1109/LSP.2024.3411917

OMR-NET: a two-stage octave multi-scale residual network for screen content image compression

Authors: Shiqi Jiang, Ting Ren, Congrui Fu, Shuai Li, Hui Yuan

Abstract: Screen content (SC) differs from natural scene (NS) with unique characteristics such as noise-free, repetitive patterns, and high contrast. Aiming at addressing the inadequacies of current learned image compression (LIC) methods for SC, we propose an improved two-stage octave convolutional residual blocks (IToRB) for high and low-frequency feature extraction and a cascaded two-stage multi-scale residual blocks (CTMSRB) for improved multi-scale learning and nonlinearity in SC. Additionally, we employ a window-based attention module (WAM) to capture pixel correlations, especially for high contrast regions in the image. We also construct a diverse SC image compression dataset (SDU-SCICD2K) for training, including text, charts, graphics, animation, movie, game and mixture of SC images and NS images. Experimental results show our method, more suited for SC than NS data, outperforms existing LIC methods in rate-distortion performance on SC images. The code is publicly available at https://github.com/SunshineSki/OMR Net.git.

Submitted 11 July, 2024; originally announced July 2024.

Comments: 7 figures, 2 tables

Journal ref: IEEE Signal Processing Letters, 2024

arXiv:2407.08528 [pdf, other]

doi 10.1109/LSP.2024.3426918

Enhancing octree-based context models for point cloud geometry compression with attention-based child node number prediction

Authors: Chang Sun, Hui Yuan, Xiaolong Mao, Xin Lu, Raouf Hamzaoui

Abstract: In point cloud geometry compression, most octree-based context models use the cross-entropy between the one-hot encoding of node occupancy and the probability distribution predicted by the context model as the loss. This approach converts the problem of predicting the number (a regression problem) and the position (a classification problem) of occupied child nodes into a 255-dimensional classification problem. As a result, it fails to accurately measure the difference between the one-hot encoding and the predicted probability distribution. We first analyze why the cross-entropy loss function fails to accurately measure the difference between the one-hot encoding and the predicted probability distribution. Then, we propose an attention-based child node number prediction (ACNP) module to enhance the context models. The proposed module can predict the number of occupied child nodes and map it into an 8-dimensional vector to assist the context model in predicting the probability distribution of the occupancy of the current node for efficient entropy coding. Experimental results demonstrate that the proposed module enhances the coding efficiency of octree-based context models.
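As a rough illustration of the labels this abstract discusses (not the authors' code): an octree node has 8 children, so a non-empty node's occupancy can be written as an 8-bit code from 1 to 255, which is where the 255-way classification comes from. The sketch below assumes that encoding. The helper names, and the choice to represent the occupied-child count as a one-hot 8-dimensional vector, are hypothetical; the abstract does not specify how ACNP performs the mapping.

```python
def child_count(occupancy):
    """Return how many of the 8 children are occupied for an 8-bit code."""
    if not 1 <= occupancy <= 255:
        raise ValueError("occupancy code must be in 1..255 for a non-empty node")
    return bin(occupancy).count("1")


def one_hot_occupancy(occupancy):
    """255-dimensional one-hot label used as the cross-entropy target."""
    label = [0] * 255
    label[occupancy - 1] = 1  # codes 1..255 map to indices 0..254
    return label


def count_vector(occupancy):
    """Hypothetical 8-dimensional one-hot encoding of the child count (1..8)."""
    vec = [0] * 8
    vec[child_count(occupancy) - 1] = 1
    return vec


if __name__ == "__main__":
    code = 0b10110000  # three occupied children
    print(child_count(code), count_vector(code))
```

Two codes such as 0b00000001 and 0b00000010 have the same child count but different one-hot labels, which is the kind of structure a plain 255-way label does not expose.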

Submitted 11 July, 2024; originally announced July 2024.

Comments: 2 figures and 2 tables

Journal ref: IEEE Signal Processing Letters, 2024

arXiv:2407.08520 [pdf, other]

doi 10.1109/JETCAS.2024.3367729

Enhancing context models for point cloud geometry compression with context feature residuals and multi-loss

Authors: Chang Sun, Hui Yuan, Shuai Li, Xin Lu, Raouf Hamzaoui

Abstract: In point cloud geometry compression, context models usually use the one-hot encoding of node occupancy as the label, and the cross-entropy between the one-hot encoding and the probability distribution predicted by the context model as the loss function. However, this approach has two main weaknesses. First, the differences between contexts of different nodes are not significant, making it difficult for the context model to accurately predict the probability distribution of node occupancy. Second, as the one-hot encoding is not the actual probability distribution of node occupancy, the cross-entropy loss function is inaccurate. To address these problems, we propose a general structure that can enhance existing context models. We introduce the context feature residuals into the context model to amplify the differences between contexts. We also add a multi-layer perceptron branch that uses the mean squared error between its output and node occupancy as a loss function to provide accurate gradients in backpropagation. We validate our method by showing that it can improve the performance of an octree-based model (OctAttention) and a voxel-based model (VoxelDNN) on the object point cloud datasets MPEG 8i and MVUB, as well as the LiDAR point cloud dataset SemanticKITTI.

Submitted 11 July, 2024; originally announced July 2024.

Comments: 11 pages, 8 figures

Journal ref: IEEE Journal on Emerging and Selected Topics in Circuits and Systems, vol. 14, no. 2, pp. 224-234, Jun. 2024

arXiv:2407.08466 [pdf, other]

Global Spatial-Temporal Information-based Residual ConvLSTM for Video Space-Time Super-Resolution

Authors: Congrui Fu, Hui Yuan, Shiqi Jiang, Guanghui Zhang, Liquan Shen, Raouf Hamzaoui

Abstract: By converting low-frame-rate, low-resolution videos into high-frame-rate, high-resolution ones, space-time video super-resolution techniques can enhance visual experiences and facilitate more efficient information dissemination. We propose a convolutional neural network (CNN) for space-time video super-resolution, namely GIRNet. To generate highly accurate features and thus improve performance, the proposed network integrates a feature-level temporal interpolation module with deformable convolutions and a global spatial-temporal information-based residual convolutional long short-term memory (convLSTM) module. In the feature-level temporal interpolation module, we leverage deformable convolution, which adapts to deformations and scale variations of objects across different scene locations. This presents a more efficient solution than conventional convolution for extracting features from moving objects. Our network effectively uses forward and backward feature information to determine inter-frame offsets, leading to the direct generation of interpolated frame features. In the global spatial-temporal information-based residual convLSTM module, the first convLSTM is used to derive global spatial-temporal information from the input features, and the second convLSTM uses the previously computed global spatial-temporal information feature as its initial cell state. This second convLSTM adopts residual connections to preserve spatial information, thereby enhancing the output features. Experiments on the Vimeo90K dataset show that the proposed method outperforms state-of-the-art techniques in peak signal-to-noise ratio (by 1.45 dB, 1.14 dB, and 0.02 dB over STARnet, TMNet, and 3DAttGAN, respectively), structural similarity index (by 0.027, 0.023, and 0.006 over STARnet, TMNet, and 3DAttGAN, respectively), and visually.

Submitted 11 July, 2024; originally announced July 2024.

arXiv:2407.08183 [pdf, other]

The white-light superflares from cool stars in GWAC triggers

Authors: Guang-Wei Li, Liang Wang, Hai-Long Yuan, Li-Ping Xin, Jing Wang, Chao Wu, Hua-Li Li, Hasitieer Haerken, Wei-Hua Wang, Hong-Bo Cai, Xu-Hui Han, Yang Xu, Lei Huang, Xiao-Meng Lu, Jian-Ying Bai, Xiang-Yu Wang, Zi-Gao Dai, En-Wei Liang, Jian-Yan Wei

Abstract: M-type stars are the ones that flare most frequently, but how large their maximum flare energy can be is still unknown. We present 163 flares from 162 individual M2 through L1-type stars that triggered the GWAC, with flare energies ranging from $10^{32.2}$ to $10^{36.4}$ erg. The flare amplitudes range from $\triangle G = 0.84$ to $\sim 10$ mag. Flare energy increases with stellar surface temperature ($T_{\rm eff}$), but both $\triangle G$ and equivalent duration $\log_{10}(ED)$ seem to be independent of $T_{\rm eff}$. Combining periods detected from light curves of TESS and K2, spectra from LAMOST, SDSS and the 2.16 m Telescope, and the Gaia DR3 data, we found that these GWAC flare stars are young. For the stars that have spectra, we found that these stars are in or very near to the saturation region, and $\log_{10}(L_{\rm Hα}/L_{\rm bol})$ is lower for M7-L1 stars than for M2-M6 stars. We also studied the relation between GWAC flare bolometric energy $E_{\rm bol}$ and stellar hemispherical area $S$, and found that $\log_{10}E_{\rm bol}$ (in erg) increases with increasing $S$ (in cm$^2$), and the maximum flare energy $\log_{10}E_{\rm bol, max} \geqslant \log_{10}S + 14.25$. For M7-L1 stars, there seem to be other factors limiting their maximum flare energies in addition to stellar hemispherical area.

Submitted 11 July, 2024; originally announced July 2024.

Comments: 18 pages, 11 figures, 4 tables

arXiv:2407.05677 [pdf, other]

PCAC-GAN: A Sparse-Tensor-Based Generative Adversarial Network for 3D Point Cloud Attribute Compression

Authors: Xiaolong Mao, Hui Yuan, Xin Lu, Raouf Hamzaoui, Wei Gao

Abstract: Learning-based methods have proven successful in compressing geometric information for point clouds. For attribute compression, however, they still lag behind non-learning-based methods such as the MPEG G-PCC standard. To bridge this gap, we propose a novel deep learning-based point cloud attribute compression method that uses a generative adversarial network (GAN) with sparse convolution layers. Our method also includes a module that adaptively selects the resolution of the voxels used to voxelize the input point cloud. Sparse vectors are used to represent the voxelized point cloud, and sparse convolutions process the sparse tensors, ensuring computational efficiency. To the best of our knowledge, this is the first application of GANs to compress point cloud attributes. Our experimental results show that our method outperforms existing learning-based techniques and rivals the latest G-PCC test model (TMC13v23) in terms of visual quality.

Submitted 9 July, 2024; v1 submitted 8 July, 2024; originally announced July 2024.

Comments: 14 pages, 5 figures

MSC Class: 94J20 ACM Class: I.4.2

arXiv:2407.05232 [pdf, other]

PAPM: A Physics-aware Proxy Model for Process Systems

Authors: Pengwei Liu, Zhongkai Hao, Xingyu Ren, Hangjie Yuan, Jiayang Ren, Dong Ni

Abstract: In the context of proxy modeling for process systems, traditional data-driven deep learning approaches frequently encounter significant challenges, such as substantial training costs induced by large amounts of data, and limited generalization capabilities. As a promising alternative, physics-aware models incorporate partial physics knowledge to ameliorate these challenges. Although demonstrating efficacy, they fall short in terms of exploration depth and universality. To address these shortcomings, we introduce a physics-aware proxy model (PAPM) that fully incorporates partial prior physics of process systems, which includes multiple input conditions and the general form of conservation relations, resulting in better out-of-sample generalization. Additionally, PAPM contains a holistic temporal-spatial stepping module for flexible adaptation across various process systems. Through systematic comparisons with state-of-the-art pure data-driven and physics-aware models across five two-dimensional benchmarks in nine generalization tasks, PAPM notably achieves an average performance improvement of 6.7%, while requiring fewer FLOPs, and just 1% of the parameters compared to the prior leading method. The code is available at https://github.com/pengwei07/PAPM.

Submitted 6 July, 2024; originally announced July 2024.

Comments: ICML 2024

arXiv:2407.00319 [pdf, other]

doi 10.1038/s41550-024-02309-5

A slightly oblate dark matter halo revealed by a retrograde precessing Galactic disk warp

Authors: Yang Huang, Qikang Feng, Tigran Khachaturyants, Huawei Zhang, Jifeng Liu, Juntai Shen, Timothy C. Beers, Youjun Lu, Song Wang, Haibo Yuan

Abstract: The shape of the dark matter (DM) halo is key to understanding the hierarchical formation of the Galaxy. Despite extensive efforts in recent decades, however, its shape remains a matter of debate, with suggestions ranging from strongly oblate to prolate. Here, we present a new constraint on its present shape by directly measuring the evolution of the Galactic disk warp with time, as traced by accurate distance estimates and precise age determinations for about 2,600 classical Cepheids. We show that the Galactic warp is mildly precessing in a retrograde direction at a rate of $ω = -2.1 \pm 0.5 ({\rm statistical}) \pm 0.6 ({\rm systematic})$ km s$^{-1}$ kpc$^{-1}$ for the outer disk over the Galactocentric radius [$7.5, 25$] kpc, decreasing with radius. This constrains the shape of the DM halo to be slightly oblate with a flattening (minor axis to major axis ratio) in the range $0.84 \le q_Φ \le 0.96$. Given the young nature of the disk warp traced by Cepheids (less than 200 Myr), our approach directly measures the shape of the present-day DM halo. This measurement, combined with other measurements from older tracers, could provide vital constraints on the evolution of the DM halo and the assembly history of the Galaxy.

Submitted 29 June, 2024; originally announced July 2024.

Comments: Published in Nature Astronomy on June 27th, 2024. Final published version here: https://www.nature.com/articles/s41550-024-02309-5

arXiv:2406.19389 [pdf, other]

OMG-LLaVA: Bridging Image-level, Object-level, Pixel-level Reasoning and Understanding

Authors: Tao Zhang, Xiangtai Li, Hao Fei, Haobo Yuan, Shengqiong Wu, Shunping Ji, Chen Change Loy, Shuicheng Yan

Abstract: Current universal segmentation methods demonstrate strong capabilities in pixel-level image and video understanding. However, they lack reasoning abilities and cannot be controlled via text instructions. In contrast, large vision-language multimodal models exhibit powerful vision-based conversation and reasoning capabilities but lack pixel-level understanding and have difficulty accepting visual prompts for flexible user interaction. This paper proposes OMG-LLaVA, a new and elegant framework combining powerful pixel-level vision understanding with reasoning abilities. It can accept various visual and text prompts for flexible user interaction. Specifically, we use a universal segmentation method as the visual encoder, integrating image information, perception priors, and visual prompts into visual tokens provided to the LLM. The LLM is responsible for understanding the user's text instructions and providing text responses and pixel-level segmentation results based on the visual information. We propose perception prior embedding to better integrate perception priors with image features. OMG-LLaVA achieves image-level, object-level, and pixel-level reasoning and understanding in a single model, matching or surpassing the performance of specialized methods on multiple benchmarks. Rather than using LLM to connect each specialist, our work aims at end-to-end training on one encoder, one decoder, and one LLM. The code and model have been released for further research.

Submitted 27 June, 2024; originally announced June 2024.

arXiv:2406.19369 [pdf, other]

Mamba or RWKV: Exploring High-Quality and High-Efficiency Segment Anything Model

Authors: Haobo Yuan, Xiangtai Li, Lu Qi, Tao Zhang, Ming-Hsuan Yang, Shuicheng Yan, Chen Change Loy

Abstract: Transformer-based segmentation methods face the challenge of efficient inference when dealing with high-resolution images. Recently, several linear attention architectures, such as Mamba and RWKV, have attracted much attention as they can process long sequences efficiently. In this work, we focus on designing an efficient segment-anything model by exploring these different architectures. Specifically, we design a mixed backbone that contains convolution and RWKV operation, which achieves the best for both accuracy and efficiency. In addition, we design an efficient decoder to utilize the multiscale tokens to obtain high-quality masks. We denote our method as RWKV-SAM, a simple, effective, fast baseline for SAM-like models. Moreover, we build a benchmark containing various high-quality segmentation datasets and jointly train one efficient yet high-quality segmentation model using this benchmark. Based on the benchmark results, our RWKV-SAM achieves outstanding performance in efficiency and segmentation quality compared to transformers and other linear attention models. For example, compared with the same-scale transformer model, RWKV-SAM achieves more than 2x speedup and can achieve better segmentation performance on various datasets. In addition, RWKV-SAM outperforms recent vision Mamba models with better classification and semantic segmentation results. Code and models will be publicly available.

Submitted 27 June, 2024; originally announced June 2024.

Comments: 16 pages; 8 figures

arXiv:2406.14064 [pdf, other]

PAPR Reduction with Pre-chirp Selection for Affine Frequency Division Multiplexing

Authors: Haozhi Yuan, Yin Xu, Xinghao Guo, Tianyao Ma, Haoyang Li, Dazhi He, Wenjun Zhang

Abstract: Affine frequency division multiplexing (AFDM) is a promising new multicarrier technique based on the discrete affine Fourier transform (DAFT). By properly tuning the pre-chirp and post-chirp parameters in the DAFT, the effective channel in the DAFT domain can completely avoid overlap of different paths and thus constitutes a full representation of the delay-Doppler profile, which significantly improves the system performance in high-mobility scenarios. However, AFDM has the crucial problem of a high peak-to-average power ratio (PAPR) caused by the phase randomness of modulated symbols. In this letter, an algorithm named grouped pre-chirp selection (GPS) is proposed to reduce the PAPR by changing the value of the pre-chirp parameter on sub-carriers group by group. Specifically, it is first demonstrated that the important properties of the AFDM system are maintained when implementing GPS. Secondly, we elaborate on the operation steps of the GPS algorithm, illustrating its effect on PAPR reduction and its advantage in terms of computational complexity compared with the ungrouped approach. Finally, simulation results of PAPR reduction in the form of the complementary cumulative distribution function (CCDF) show the effectiveness of the proposed GPS algorithm.
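For readers unfamiliar with the two metrics named in this abstract, the following is a minimal sketch of the textbook definitions only: PAPR as the ratio of peak to mean instantaneous power (in dB), and the empirical CCDF as the fraction of frames whose PAPR exceeds a threshold. It is not the GPS algorithm and does not implement AFDM or the DAFT; the function names are hypothetical.

```python
import math


def papr_db(samples):
    """Peak-to-average power ratio of one frame of (complex) samples, in dB."""
    powers = [abs(s) ** 2 for s in samples]
    mean_power = sum(powers) / len(powers)
    return 10 * math.log10(max(powers) / mean_power)


def empirical_ccdf(papr_values_db, threshold_db):
    """Fraction of frames whose PAPR exceeds threshold_db."""
    exceed = sum(1 for p in papr_values_db if p > threshold_db)
    return exceed / len(papr_values_db)


if __name__ == "__main__":
    flat = [1 + 0j, 1j, -1 + 0j, -1j]   # constant envelope: peak equals mean
    peaky = [2 + 0j, 0j, 0j, 0j]        # all energy in one sample
    values = [papr_db(flat), papr_db(peaky)]
    print(values, empirical_ccdf(values, 3.0))
```

A PAPR-reduction scheme is usually judged by how far it shifts this CCDF curve to the left, which is how the abstract says the simulation results are reported.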

Submitted 20 June, 2024; originally announced June 2024.

arXiv:2406.12769 [pdf, other]

Latent Intuitive Physics: Learning to Transfer Hidden Physics from A 3D Video

Authors: Xiangming Zhu, Huayu Deng, Haochen Yuan, Yunbo Wang, Xiaokang Yang

Abstract: We introduce latent intuitive physics, a transfer learning framework for physics simulation that can infer hidden properties of fluids from a single 3D video and simulate the observed fluid in novel scenes. Our key insight is to use latent features drawn from a learnable prior distribution conditioned on the underlying particle states to capture the invisible and complex physical properties. To achieve this, we train a parametrized prior learner given visual observations to approximate the visual posterior of inverse graphics, and both the particle states and the visual posterior are obtained from a learned neural renderer. The converged prior learner is embedded in our probabilistic physics engine, allowing us to perform novel simulations on unseen geometries, boundaries, and dynamics without knowledge of the true physical parameters. We validate our model in three ways: (i) novel scene simulation with the learned visual-world physics, (ii) future prediction of the observed fluid dynamics, and (iii) supervised particle simulation. Our model demonstrates strong performance in all three tasks.

Submitted 18 June, 2024; originally announced June 2024.

Comments: Published as a conference paper at ICLR 2024

Journal ref: ICLR 2024

arXiv:2406.12416 [pdf, other]

Beyond Under-Alignment: Atomic Preference Enhanced Factuality Tuning for Large Language Models

Authors: Hongbang Yuan, Yubo Chen, Pengfei Cao, Zhuoran Jin, Kang Liu, Jun Zhao

Abstract: Large language models (LLMs) have achieved remarkable success but still tend to generate factually erroneous responses, a phenomenon known as hallucination. A recent trend is to use preference learning to fine-tune models to align with factuality. However, existing work primarily evaluates fine-tuned models on in-domain (ID) datasets and the factuality on out-of-domain (OOD) datasets remains underexplored. In this paper, we conduct a comprehensive evaluation of the factuality of different models tuned by various preference learning algorithms and demonstrate that their performance on OOD datasets either increases minimally or decreases. Subsequently, we reveal that the main cause of a model's failure to uphold factuality under a distribution shift is under-alignment, rather than over-alignment, by analyzing the token distribution shift of the models before and after tuning. Finally, we propose APEFT (Atomic Preference Enhanced Factuality Tuning), a framework that enhances a model's awareness of factuality at the granularity of individual facts. Extensive experiments demonstrate that APEFT improves model performance by an average of 3.45% on both ID and OOD datasets, which is highly effective.

Submitted 27 June, 2024; v1 submitted 18 June, 2024; originally announced June 2024.

arXiv:2406.11879 [pdf, other]

Experimental verification of the optimal fingerprint method for detecting climate change

Authors: Jinbo Hu, Hong Yuan, Letian Chen, Nan Zhao, C. P. Sun

Abstract: The optimal fingerprint method serves as a potent approach for detecting and attributing climate change. However, its experimental validation encounters challenges due to the intricate nature of climate systems. Here, we experimentally examine the optimal fingerprint method simulated by a precisely controlled magnetic resonance system of spins. The spin dynamic under an applied deterministic driving field and a noise field is utilized to emulate the complex climate system with external forcing and internal variability. Our experimental results affirm the theoretical prediction regarding the existence of an optimal detection direction which maximizes the signal-to-noise ratio, thereby validating the optimal fingerprint method. This work offers direct empirical verification of the optimal fingerprint method, crucial for comprehending climate change and its societal impacts.

Submitted 11 June, 2024; originally announced June 2024.

arXiv:2406.10890 [pdf, other]

RWKU: Benchmarking Real-World Knowledge Unlearning for Large Language Models

Authors: Zhuoran Jin, Pengfei Cao, Chenhao Wang, Zhitao He, Hongbang Yuan, Jiachun Li, Yubo Chen, Kang Liu, Jun Zhao

Abstract: Large language models (LLMs) inevitably memorize sensitive, copyrighted, and harmful knowledge from the training corpus; therefore, it is crucial to erase this knowledge from the models. Machine unlearning is a promising solution for efficiently removing specific knowledge by post hoc modifying models. In this paper, we propose a Real-World Knowledge Unlearning benchmark (RWKU) for LLM unlearning. RWKU is designed based on the following three key factors: (1) For the task setting, we consider a more practical and challenging unlearning setting, where neither the forget corpus nor the retain corpus is accessible. (2) For the knowledge source, we choose 200 real-world famous people as the unlearning targets and show that such popular knowledge is widely present in various LLMs. (3) For the evaluation framework, we design the forget set and the retain set to evaluate the model's capabilities across various real-world applications. Regarding the forget set, we provide four membership inference attack (MIA) methods and nine kinds of adversarial attack probes to rigorously test unlearning efficacy. Regarding the retain set, we assess locality and utility in terms of neighbor perturbation, general ability, reasoning ability, truthfulness, factuality, and fluency. We conduct extensive experiments across two unlearning scenarios, two models and six baseline methods and obtain some meaningful findings. We release our benchmark and code publicly at http://rwku-bench.github.io for future work.

Submitted 16 June, 2024; originally announced June 2024.

Comments: 48 pages, 7 figures, 12 tables

arXiv:2406.09870 [pdf, other]

IGL-Bench: Establishing the Comprehensive Benchmark for Imbalanced Graph Learning

Authors: Jiawen Qin, Haonan Yuan, Qingyun Sun, Lyujin Xu, Jiaqi Yuan, Pengfeng Huang, Zhaonan Wang, Xingcheng Fu, Hao Peng, Jianxin Li, Philip S. Yu

Abstract: Deep graph learning has gained great popularity over the past years due to its versatility and success in representing graph data across a wide range of domains. However, the pervasive issue of imbalanced graph data distributions, where certain parts exhibit disproportionally abundant data while others remain sparse, undermines the efficacy of conventional graph learning algorithms, leading to biased outcomes. To address this challenge, Imbalanced Graph Learning (IGL) has garnered substantial attention, enabling more balanced data distributions and better task performance. Despite the proliferation of IGL algorithms, the absence of consistent experimental protocols and fair performance comparisons poses a significant barrier to comprehending advancements in this field. To bridge this gap, we introduce IGL-Bench, a foundational comprehensive benchmark for imbalanced graph learning, covering 16 diverse graph datasets and 24 distinct IGL algorithms with uniform data processing and splitting strategies. Specifically, IGL-Bench systematically investigates state-of-the-art IGL algorithms in terms of effectiveness, robustness, and efficiency on node-level and graph-level tasks, with the scope of class-imbalance and topology-imbalance. Extensive experiments demonstrate the potential benefits of IGL algorithms on various imbalanced conditions, offering insights and opportunities in the IGL field. Further, we have developed an open-sourced and unified package to facilitate reproducible evaluation and inspire further innovative research, which is available at https://github.com/RingBDStack/IGL-Bench.

Submitted 19 June, 2024; v1 submitted 14 June, 2024; originally announced June 2024.

Comments: The Thirty-eighth Conference on Neural Information Processing Systems Datasets and Benchmarks Track (Preprint, under review)

arXiv:2405.20212 [pdf, other]

Filter Design for Estimation of Stellar Metallicity: Insights from Experiments with Gaia XP Spectra

Authors: Kai Xiao, Bowen Huang, Yang Huang, Haibo Yuan, Timothy C. Beers, Jifeng Liu, Maosheng Xiang, Xue Lu, Shuai Xu, Lin Yang, Chuanjie Zheng, Zhirui Li, Bowen Zhang, Ruifeng Shi

Abstract: We search for an optimal filter design for the estimation of stellar metallicity, based on synthetic photometry from Gaia XP spectra convolved with a series of filter-transmission curves defined by different central wavelengths and bandwidths. Unlike previous designs based solely on maximizing metallicity sensitivity, we find that the optimal solution provides a balance between the sensitivity and uncertainty of the spectra. With this optimal filter design, the best precision of metallicity estimates for relatively bright ($G \sim 11.5$) stars is excellent, $\sigma_{\rm [Fe/H]} = 0.034$ dex for FGK dwarf stars, superior to that obtained utilizing custom sensitivity-optimized filters (e.g., SkyMapper $v$). By selecting hundreds of high-probability member stars of the open cluster M67, our analysis reveals that the intrinsic photometric-metallicity scatter of these cluster members is only 0.036 dex, consistent with this level of precision. Our results clearly demonstrate that the internal precision of photometric-metallicity estimates can be extremely high, even providing the opportunity to perform chemical tagging for very large numbers of field stars in the Milky Way. This experiment shows that it is crucial to take into account uncertainty alongside the sensitivity when designing filters for measuring the stellar metallicity and other parameters.

Submitted 30 May, 2024; originally announced May 2024.

Comments: 9 pages, 5 figures; ApJL accepted, see main result in Figure 5

arXiv:2405.18317 [pdf, other]

Hybrid Multi-Head Physics-informed Neural Network for Depth Estimation in Terahertz Imaging

Authors: Mingjun Xiang, Hui Yuan, Kai Zhou, Hartmut G. Roskos

Abstract: Terahertz (THz) imaging is one of the hotspots in the field of optics, where depth information retrieval is a key factor in restoring the three-dimensional appearance of objects. Impressive results for depth extraction in the visible and infrared wave range have been demonstrated through deep learning (DL). Most of these DL methods are merely data-driven and lack relevant physical priors, and thus require a large amount of experimental data to train the DL models. However, acquiring large training datasets in the THz domain is challenging due to the requirements of environmental and system stability, as well as the time-consuming data acquisition process. To overcome this limitation, this paper incorporates a complete physical model representing the THz image formation process into traditional DL networks to retrieve the depth information of objects. The most significant advantage is the ability to use it without pre-training, thereby eliminating the need for tens of thousands of labeled samples. Through experimental validation, we demonstrate that by providing diffraction patterns of planar objects with their upper and lower halves individually masked, the proposed physics-informed neural network (NN) can automatically optimize and, ultimately, reconstruct the depth of the object through interaction between the NN and a physical model. The obtained results represent the initial steps towards achieving fast holographic THz imaging using reference-free beams and low-cost power detection.

Submitted 7 July, 2024; v1 submitted 28 May, 2024; originally announced May 2024.

arXiv:2405.15705 [pdf, other]

Sums: Sniffing Unknown Multiband Signals under Low Sampling Rates

Authors: Jinbo Peng, Zhe Chen, Zheng Lin, Haoxuan Yuan, Zihan Fang, Lingzhong Bao, Zihang Song, Ying Li, Jing Ren, Yue Gao

Abstract: Due to sophisticated deployments of all kinds of wireless networks (e.g., 5G, Wi-Fi, Bluetooth, LEO satellite, etc.), multiband signals are distributed over a large bandwidth (e.g., from 70 MHz to 8 GHz). Consequently, for network monitoring and spectrum sharing applications, a sniffer that extracts physical layer information, such as packet structure, at a low sampling rate (especially sub-Nyquist sampling) can significantly improve their cost- and energy-efficiency. However, building a multiband signal sniffer is challenging. To this end, we propose Sums, a system that can sniff and analyze multiband signals in a blind manner. Sums takes advantage of hardware and algorithm co-design, multi-coset sub-Nyquist sampling hardware, and a multi-task deep learning framework. The hardware component breaks the Nyquist rule to sample GHz bandwidth, but only pays for a 50 MSPS sampling rate. Our multi-task learning framework directly tackles the sampling data to perform spectrum sensing, physical layer protocol recognition, and demodulation for deep inspection from multiband signals. Extensive experiments demonstrate that Sums achieves higher accuracy than the state-of-the-art baselines in spectrum sensing, modulation classification, and demodulation. As a result, Sums can help researchers and end-users to diagnose or troubleshoot problems with wireless infrastructure deployments in practice.

Submitted 24 May, 2024; originally announced May 2024.

Comments: 12 pages, 9 figures

arXiv:2405.15542 [pdf, other]

SATSense: Multi-Satellite Collaborative Framework for Spectrum Sensing

Authors: Haoxuan Yuan, Zhe Chen, Zheng Lin, Jinbo Peng, Zihan Fang, Yuhang Zhong, Zihang Song, Yue Gao

Abstract: Low Earth Orbit satellite Internet has recently been deployed, providing worldwide service with non-terrestrial networks. With the large-scale deployment of both non-terrestrial and terrestrial networks, the limited spectrum resources will not be sufficient for allocation. Consequently, dynamic spectrum sharing is crucial for their coexistence in the same spectrum, where accurate spectrum sensing is essential. However, spectrum sensing in space is more challenging than in terrestrial networks due to variable channel conditions, making single-satellite sensing unstable. Therefore, we first attempt to design a collaborative sensing scheme utilizing diverse data from multiple satellites. However, it is non-trivial to achieve this collaboration due to heterogeneous channel quality, considerable raw sampling data, and packet loss. To address the above challenges, we first establish connections between the satellites by modeling their sensing data as a graph and devising a graph neural network-based algorithm to achieve effective spectrum sensing. Meanwhile, we establish a joint sub-Nyquist sampling and autoencoder data compression framework to reduce the amount of transmitted sensing data. Finally, we propose a contrastive learning-based mechanism that compensates for missing packets. Extensive experiments demonstrate that our proposed strategy can achieve efficient spectrum sensing performance and outperform the conventional deep learning algorithm in spectrum sensing accuracy.

Submitted 24 May, 2024; originally announced May 2024.

Comments: 13 pages, 16 figures

arXiv:2405.15165 [pdf, other]

A Solution-based LLM API-using Methodology for Academic Information Seeking

Authors: Yuanchun Wang, Jifan Yu, Zijun Yao, Jing Zhang, Yuyang Xie, Shangqing Tu, Yiyang Fu, Youhe Feng, Jinkai Zhang, Jingyao Zhang, Bowen Huang, Yuanyao Li, Huihui Yuan, Lei Hou, Juanzi Li, Jie Tang

Abstract: Applying large language models (LLMs) for academic API usage shows promise in reducing researchers' academic information seeking efforts. However, current LLM API-using methods struggle with complex API coupling commonly encountered in academic queries. To address this, we introduce SoAy, a solution-based LLM API-using methodology for academic information seeking. It uses code with a solution as the reasoning method, where a solution is a pre-constructed API calling sequence. The addition of the solution reduces the difficulty for the model to understand the complex relationships between APIs. Code improves the efficiency of reasoning. To evaluate SoAy, we introduce SoAyBench, an evaluation benchmark accompanied by SoAyEval, built upon a cloned environment of APIs from AMiner. Experimental results demonstrate a 34.58-75.99% performance improvement compared to state-of-the-art LLM API-based baselines. All datasets, codes, tuned models, and deployed online services are publicly accessible at https://github.com/RUCKBReasoning/SoAy.

Submitted 23 May, 2024; originally announced May 2024.

Comments: 22 pages, 13 figures

arXiv:2405.12575 [pdf, other]

Three-dimensional mapping and electronic origin of large altermagnetic splitting near Fermi level in CrSb

Authors: Guowei Yang, Zhanghuan Li, Sai Yang, Jiyuan Li, Hao Zheng, Weifan Zhu, Saizheng Cao, Wenxuan Zhao, Jiawen Zhang, Mao Ye, Yu Song, Lun-Hui Hu, Lexian Yang, Ming Shi, Huiqiu Yuan, Yongjun Zhang, Yuanfeng Xu, Yang Liu

Abstract: Recently, a new kind of collinear magnetism, dubbed altermagnetism, has attracted considerable interest. A key characteristic of altermagnets is the momentum-dependent band and spin splitting without net magnetization. However, finding altermagnetic materials with large splitting near the Fermi level, which necessarily requires three-dimensional k-space mapping and is crucial for spintronic applications and emergent phenomena, remains challenging. Here, by employing synchrotron-based angle-resolved photoemission spectroscopy (ARPES) and model calculations, we uncover a large altermagnetic splitting, up to ~1.0 eV, near the Fermi level in CrSb. We verify its bulk-type g-wave altermagnetism through systematic three-dimensional k-space mapping, which unambiguously reveals the altermagnetic symmetry and associated nodal planes. The ARPES results are well captured by density functional theory calculations. In addition, tight-binding model analysis indicates that the large altermagnetic splitting arises from strong third-nearest-neighbor hopping mediated by Sb ions, which breaks both the space-time reversal symmetry and the translational spin-rotation symmetry. The large band/spin splitting near the Fermi level in metallic CrSb, together with its high TN (up to 705 K) and simple spin configuration, paves the way for exploring emergent phenomena and spintronic applications based on altermagnets.

Submitted 21 May, 2024; originally announced May 2024.

Comments: 16 pages, 4 figures and 1 table

arXiv:2405.10851 [pdf]

Bottom-up approach to assess carbon emissions of battery electric vehicle operations in China

Authors: Hong Yuan, Minda Ma

Abstract: The transportation sector is the third-largest global energy consumer and emitter, making it a focal point in the transition toward the net-zero future. To accelerate the decarbonization of passenger cars, this work is the first to propose a bottom-up charging demand model to estimate the operational electricity use and associated carbon emissions of best-selling battery electric vehicles (BEVs) in various climate zones in China during the 2020s. The findings reveal that (1) the operational energy demand of the top-20 selling BEV models in China, such as Tesla, Wuling Hongguang, and BYD, increased from 601 to 3054 gigawatt-hours (GWh) during 2020-2022, with BEVs in South China contributing more than half of the total electricity demand; (2) from 2020 to 2022, the energy and carbon intensities of the best-selling models decreased from 1364 to 1095 kilowatt-hours per vehicle and from 797 to 621 kilograms of carbon dioxide (CO2) per vehicle, respectively, with North China experiencing the highest intensity decline compared to that in other regions; and (3) the operational energy demand of BEV stocks in China increased from 4774 to 12,048 GWh during 2020-2022, while the carbon emissions of BEV stocks rose to 6.8 megatons of CO2 in 2022, reflecting an annual growth rate of ~50%. In summary, this work examines and contrasts benchmark data on a nation-regional scale, as well as performance metrics related to BEV charging. The primary aim is to support nationwide efforts in decarbonization, aiming for carbon mitigation and facilitating the swift evolution of passenger cars toward a carbon-neutral future.

Submitted 17 May, 2024; originally announced May 2024.

Comments: 6 pages, 6 figures

arXiv:2405.06823 [pdf, other]

PLeak: Prompt Leaking Attacks against Large Language Model Applications

Authors: Bo Hui, Haolin Yuan, Neil Gong, Philippe Burlina, Yinzhi Cao

Abstract: Large Language Models (LLMs) enable a new ecosystem with many downstream applications, called LLM applications, with different natural language processing tasks. The functionality and performance of an LLM application highly depend on its system prompt, which instructs the backend LLM on what task to perform. Therefore, an LLM application developer often keeps a system prompt confidential to protect its intellectual property. As a result, a natural attack, called prompt leaking, is to steal the system prompt from an LLM application, which compromises the developer's intellectual property. Existing prompt leaking attacks primarily rely on manually crafted queries, and thus achieve limited effectiveness. In this paper, we design a novel, closed-box prompt leaking attack framework, called PLeak, to optimize an adversarial query such that when the attacker sends it to a target LLM application, its response reveals its own system prompt. We formulate finding such an adversarial query as an optimization problem and solve it approximately with a gradient-based method. Our key idea is to break down the optimization goal by optimizing adversarial queries for system prompts incrementally, i.e., starting from the first few tokens of each system prompt and extending step by step until the entire length of the system prompt is covered. We evaluate PLeak in both offline settings and for real-world LLM applications, e.g., those on Poe, a popular platform hosting such applications. Our results show that PLeak can effectively leak system prompts and significantly outperforms not only baselines that manually curate queries but also baselines with optimized queries that are modified and adapted from existing jailbreaking attacks. We responsibly reported the issues to Poe and are still waiting for their response. Our implementation is available at this repository: https://github.com/BHui97/PLeak.

Submitted 14 May, 2024; v1 submitted 10 May, 2024; originally announced May 2024.

Comments: To appear in the Proceedings of The ACM Conference on Computer and Communications Security (CCS), 2024

arXiv:2405.03387 [pdf, ps, other]

The high dimensional psychological profile and cultural bias of ChatGPT

Authors: Hang Yuan, Zhongyue Che, Shao Li, Yue Zhang, Xiaomeng Hu, Siyang Luo

Abstract: Given the rapid advancement of large-scale language models, artificial intelligence (AI) models, like ChatGPT, are playing an increasingly prominent role in human society. However, to ensure that artificial intelligence models benefit human society, we must first fully understand the similarities and differences between the human-like characteristics exhibited by artificial intelligence models and real humans, as well as the cultural stereotypes and biases that artificial intelligence models may exhibit in the process of interacting with humans. This study first measured ChatGPT in 84 dimensions of psychological characteristics, revealing differences between ChatGPT and human norms in most dimensions as well as in high-dimensional psychological representations. Additionally, through the measurement of ChatGPT in 13 dimensions of cultural values, it was revealed that ChatGPT's cultural value patterns are dissimilar to those of various countries/regions worldwide. Finally, an analysis of ChatGPT's performance in eight decision-making tasks involving interactions with humans from different countries/regions revealed that ChatGPT exhibits clear cultural stereotypes in most decision-making tasks and shows significant cultural bias in third-party punishment and ultimatum games. The findings indicate that, compared to humans, ChatGPT exhibits a distinct psychological profile and cultural value orientation, and it also shows cultural biases and stereotypes in interpersonal decision-making. Future research endeavors should emphasize enhanced technical oversight and augmented transparency in the database and algorithmic training procedures to foster more efficient cross-cultural communication and mitigate social disparities.

Submitted 6 May, 2024; originally announced May 2024.

arXiv:2405.00675 [pdf, other]

Self-Play Preference Optimization for Language Model Alignment

Authors: Yue Wu, Zhiqing Sun, Huizhuo Yuan, Kaixuan Ji, Yiming Yang, Quanquan Gu

Abstract: Traditional reinforcement learning from human feedback (RLHF) approaches relying on parametric models like the Bradley-Terry model fall short in capturing the intransitivity and irrationality in human preferences. Recent advancements suggest that directly working with preference probabilities can yield a more accurate reflection of human preferences, enabling more flexible and accurate language model alignment. In this paper, we propose a self-play-based method for language model alignment, which treats the problem as a constant-sum two-player game aimed at identifying the Nash equilibrium policy. Our approach, dubbed Self-Play Preference Optimization (SPPO), approximates the Nash equilibrium through iterative policy updates and enjoys a theoretical convergence guarantee. Our method can effectively increase the log-likelihood of the chosen response and decrease that of the rejected response, which cannot be trivially achieved by symmetric pairwise loss such as Direct Preference Optimization (DPO) and Identity Preference Optimization (IPO). In our experiments, using only 60k prompts (without responses) from the UltraFeedback dataset and without any prompt augmentation, by leveraging a pre-trained preference model PairRM with only 0.4B parameters, SPPO can obtain a model from fine-tuning Mistral-7B-Instruct-v0.2 that achieves the state-of-the-art length-controlled win rate of 28.53% against GPT-4-Turbo on AlpacaEval 2.0. It also outperforms the (iterative) DPO and IPO on MT-Bench and the Open LLM Leaderboard. Starting from a stronger base model Llama-3-8B-Instruct, we are able to achieve a length-controlled win rate of 38.77%. Notably, the strong performance of SPPO is achieved without additional external supervision (e.g., responses, preferences, etc.) from GPT-4 or other stronger language models. Code is available at https://github.com/uclaml/SPPO.

Submitted 14 June, 2024; v1 submitted 1 May, 2024; originally announced May 2024.

Comments: 27 pages, 4 figures, 5 tables

arXiv:2404.14829 [pdf, other]

Revisiting Neural Networks for Continual Learning: An Architectural Perspective

Authors: Aojun Lu, Tao Feng, Hangjie Yuan, Xiaotian Song, Yanan Sun

Abstract: Efforts to overcome catastrophic forgetting have primarily centered around developing more effective Continual Learning (CL) methods. In contrast, less attention was devoted to analyzing the role of network architecture design (e.g., network depth, width, and components) in contributing to CL. This paper seeks to bridge this gap between network architecture design and CL, and to present a holistic study on the impact of network architectures on CL. This work considers architecture design at the network scaling level, i.e., width and depth, and also at the network components, i.e., skip connections, global pooling layers, and down-sampling. In both cases, we first derive insights through systematically exploring how architectural designs affect CL. Then, grounded in these insights, we craft a specialized search space for CL and further propose a simple yet effective ArchCraft method to steer a CL-friendly architecture, namely, this method recrafts AlexNet/ResNet into AlexAC/ResAC. Experimental validation across various CL settings and scenarios demonstrates that improved architectures are parameter-efficient, achieving state-of-the-art performance of CL while being 86%, 61%, and 97% more compact in terms of parameters than the naive CL architecture in Task IL and Class IL. Code is available at https://github.com/byyx666/ArchCraft.

Submitted 28 April, 2024; v1 submitted 23 April, 2024; originally announced April 2024.

arXiv:2404.14743 [pdf, other]

Gradient Guidance for Diffusion Models: An Optimization Perspective

Authors: Yingqing Guo, Hui Yuan, Yukang Yang, Minshuo Chen, Mengdi Wang

Abstract: Diffusion models have demonstrated empirical successes in various applications and can be adapted to task-specific needs via guidance. This paper introduces a form of gradient guidance for adapting or fine-tuning diffusion models towards user-specified optimization objectives. We study the theoretical aspects of a guided score-based sampling process, linking the gradient-guided diffusion model to first-order optimization. We show that adding gradient guidance to the sampling process of a pre-trained diffusion model is essentially equivalent to solving a regularized optimization problem, where the regularization term acts as a prior determined by the pre-training data. Diffusion models are able to learn the data's latent subspace; however, explicitly adding the gradient of an external objective function to the sampling process would jeopardize the structure in generated samples. To remedy this issue, we consider a modified form of gradient guidance based on a forward prediction loss, which leverages the pre-trained score function to preserve the latent structure in generated samples. We further consider an iteratively fine-tuned version of gradient-guided diffusion where one can query gradients at newly generated data points and update the score network using new samples. This process mimics a first-order optimization iteration in expectation, for which we prove an O(1/K) convergence rate to the global optimum when the objective function is concave.

Submitted 23 April, 2024; originally announced April 2024.

arXiv:2404.14146 [pdf]

Physics-based reward driven image analysis in microscopy

Authors: Kamyar Barakati, Hui Yuan, Amit Goyal, Sergei V. Kalinin

Abstract: The rise of electron microscopy has expanded our ability to acquire nanometer and atomically resolved images of complex materials. The resulting vast datasets are typically analyzed by human operators, an intrinsically challenging process due to the multiple possible analysis steps and the corresponding need to build and optimize complex analysis workflows. We present a methodology based on the concept of a Reward Function coupled with Bayesian Optimization, to optimize image analysis workflows dynamically. The Reward Function is engineered to closely align with the experimental objectives and broader context and is quantifiable upon completion of the analysis. Here, cross-section, high-angle annular dark field (HAADF) images of ion-irradiated $(Y, Dy)Ba_2Cu_3O_{7-δ}$ thin-films were used as a model system. The reward functions were formed based on the expected materials density and atomic spacings and used to drive multi-objective optimization of the classical Laplacian-of-Gaussian (LoG) method. These results can be benchmarked against the DCNN segmentation. This optimized LoG* compares favorably against DCNN in the presence of additional noise. We further extend the reward function approach towards the identification of partially-disordered regions, creating a physics-driven reward function and action space of high-dimensional clustering. We posit that, with correct definition, the reward function approach allows real-time optimization of complex analysis workflows at much higher speeds and lower computational costs than classical DCNN-based inference, ensuring the attainment of results that are both precise and aligned with the human-defined objectives.

Submitted 5 May, 2024; v1 submitted 22 April, 2024; originally announced April 2024.

Comments: 12 pages, 4 figures

arXiv:2404.13817 [pdf, other]

Photometric Re-calibration of VPHAS+ $u$-band Photometry with the Stellar Colour Regression Method and Gaia DR3

Authors: Bing-Qiu Chen, Hai-Bo Yuan, Bo-Wen Huang

Abstract: The u band magnitude is vital for determining stellar parameters and investigating specific astronomical objects. However, flux calibration in the u band for stars in the Galactic disk presents significant challenges. In this study, we introduce a comprehensive re-calibration of $u$-band photometric magnitudes of the VPHAS+ Data Release 4 (DR4), employing the Stellar Colour Regression (SCR) technique. By leveraging the expansive set of XP spectra and $G_{\rm BP}$ photometry from Gaia Data Release 3 (DR3), as well as the individual stellar extinction values provided by the literature, we have obtained precise model magnitudes of nearly 3 million stars. Our analysis identifies systematic magnitude offsets that exhibit a standard deviation of 0.063 mag across different observational visits, 0.022 mag between various CCDs, and 0.009 mag within pixel bins. We have implemented precise corrections for these observational visits, CCD chips, and pixel bins-dependent magnitude offsets. These corrections have led to a reduction in the standard deviation between the observed magnitudes and the model magnitudes from 0.088 mag to 0.065 mag, ensuring that the calibrated magnitudes are independent of stellar magnitude, colour, and extinction. The enhanced precision of these magnitudes substantially improves the quality of astrophysical research and offers substantial potential for furthering our understanding of stellar astrophysics.

Submitted 21 April, 2024; originally announced April 2024.

Comments: 12 pages, 15 figures, accepted for publication in MNRAS

arXiv:2404.12872 [pdf, other]

LLM-R2: A Large Language Model Enhanced Rule-based Rewrite System for Boosting Query Efficiency

Authors: Zhaodonghui Li, Haitao Yuan, Huiming Wang, Gao Cong, Lidong Bing

Abstract: Query rewrite, which aims to generate more efficient queries by altering a SQL query's structure without changing the query result, has been an important research problem. In order to maintain equivalence between the rewritten query and the original one during rewriting, traditional query rewrite methods always rewrite the queries following certain rewrite rules. However, some problems still remain. Firstly, existing methods of finding the optimal choice or sequence of rewrite rules are still limited and the process always costs a lot of resources. Methods involving discovering new rewrite rules typically require complicated proofs of structural logic or extensive user interactions. Secondly, current query rewrite methods usually rely heavily on DBMS cost estimators, which are often not accurate. In this paper, we address these problems by proposing a novel method of query rewrite named LLM-R2, adopting a large language model (LLM) to propose possible rewrite rules for a database rewrite system. To further improve the inference ability of the LLM in recommending rewrite rules, we train a contrastive model by curriculum to learn query representations and select effective query demonstrations for the LLM. Experimental results have shown that our method can significantly improve the query execution efficiency and outperform the baseline methods. In addition, our method enjoys high robustness across different datasets.

Submitted 19 April, 2024; originally announced April 2024.

Comments: 12 pages

arXiv:2404.10219 [pdf, ps, other]

Hypersonic limit for steady compressible Euler flows passing straight cones

Authors: Qianfeng Li, Aifang Qu, Xueying Su, Hairong Yuan

Abstract: We investigate the hypersonic limit for steady, uniform, and compressible polytropic gas passing a symmetric straight cone. By considering Radon measure solutions, we show that as the Mach number of the upstream flow tends to infinity, the measures associated with the weak entropy solution containing an attached shock ahead of the cone converge vaguely to the measures associated with a Radon measure solution to the conical hypersonic-limit flow. This justifies the Newtonian sine-squared pressure law for cones in hypersonic aerodynamics. For Chaplygin gas, assuming that the Mach number of the incoming flow is less than a finite critical value, we demonstrate that the vertex angle of the leading shock is independent of the conical body's vertex angle and is totally determined by the incoming flow's Mach number. If the Mach number exceeds the critical value, we explicitly construct a Radon measure solution with a concentration boundary layer.

Submitted 15 April, 2024; originally announced April 2024.

Comments: 30 pages, 0 figure

MSC Class: 35L50; 35L65; 35Q31; 35R06; 76K05

arXiv:2404.08234 [pdf, other]

Searching for Hyper-compact star clusters in the Milky Way using LAMOST and Gaia

Authors: Hao Wu, Haibo Yuan, Yilun Wang, Zexi Niu, Huawei Zhang

Abstract: During the early merger of the Milky Way, intermediate-mass black holes in merged dwarf galaxies may have been ejected from the center of their host galaxies due to gravitational waves, carrying some central stars along. This process can lead to the formation of hyper-compact star clusters, potentially hosting black holes in the mass range of $10^4$ to $10^5$ solar masses. These clusters are crucial targets for identifying and investigating intermediate-mass black holes. However, no hyper-compact star clusters in the Milky Way have been identified so far. In this paper, taking advantage of the high spatial resolution power of Gaia, we used data from Gaia EDR3 and LAMOST DR7, along with additional data from Pan-STARRS and SDSS, to conduct an initial screening of 6,138,049 sources using various parameters of Gaia EDR3. A total of 4,786 sources were selected for in-depth analysis. Each of these sources was meticulously scrutinized by examining their images, spectra, and nearby celestial objects to exclude various false positives, such as contaminations, galaxies, wide binaries, or wrong matches. We finally identified one likely hyper-compact star cluster candidate in the Milky Way, laying the foundation for further high-resolution imaging and spectral verification.

Submitted 12 April, 2024; originally announced April 2024.

Comments: 17 pages, 16 figures; Accepted by The Astronomical Journal

arXiv:2404.07127 [pdf, other]

Searching for short-period variables in M31: method and catalogs

Authors: Hongrui Gu, Haibo Yuan, Subo Dong, Chenfa Zheng, Shenzhe Cui, Yi Ren, Haozhu Fu, Yang Huang, Zhou Fan

Abstract: Utilizing high-cadence and continuous g- and r-band data over three nights acquired from the 3.6-meter Canada France Hawaii Telescope (CFHT) aimed to find short-duration microlensing events, we conduct a systematic search for variables, transients, and asteroids across a $\sim1^\circ$ field of view of the Andromeda Galaxy (M 31). We present a catalog of 5859 variable stars, yielding the most extensive compilation of short-period variable sources of M 31. We also detected 19 flares, predominantly associated with foreground M dwarfs in the Milky Way. In addition, we discovered 17 previously unknown asteroid candidates, and we subsequently reported them to the Minor Planet Center. Lastly, we report a microlensing event candidate C-ML-1 and present a preliminary analysis.

Submitted 10 April, 2024; originally announced April 2024.

arXiv:2404.06211 [pdf, other]

Unified Physical-Digital Attack Detection Challenge

Authors: Haocheng Yuan, Ajian Liu, Junze Zheng, Jun Wan, Jiankang Deng, Sergio Escalera, Hugo Jair Escalante, Isabelle Guyon, Zhen Lei

Abstract: Face Anti-Spoofing (FAS) is crucial to safeguard Face Recognition (FR) Systems. In real-world scenarios, FRs are confronted with both physical and digital attacks. However, existing algorithms often address only one type of attack at a time, which poses significant limitations in real-world scenarios where FR systems face hybrid physical-digital threats. To facilitate the research of Unified Attack Detection (UAD) algorithms, a large-scale UniAttackData dataset has been collected. UniAttackData is the largest public dataset for Unified Attack Detection, with a total of 28,706 videos, where each unique identity encompasses all advanced attack types. Based on this dataset, we organized a Unified Physical-Digital Face Attack Detection Challenge to boost the research in Unified Attack Detections. It attracted 136 teams for the development phase, with 13 qualifying for the final round. The results re-verified by the organizing team were used for the final ranking. This paper comprehensively reviews the challenge, detailing the dataset introduction, protocol definition, evaluation criteria, and a summary of published results. Finally, we focus on the detailed analysis of the highest-performing algorithms and offer potential directions for unified physical-digital attack detection inspired by this competition. Challenge Website: https://sites.google.com/view/face-anti-spoofing-challenge/welcome/challengecvpr2024.

Submitted 18 April, 2024; v1 submitted 9 April, 2024; originally announced April 2024.

Comments: 11 pages, 10 figures

arXiv:2404.06032 [pdf, ps, other]

doi 10.1103/PhysRevB.109.104414

Inverse melting and intertwined orders in PrCuSb$_2$

Authors: H. Q. Ye, Y. N. Zhang, T. Le, H. Q. Yuan, M. Smidman

Abstract: Much of the rich physics of correlated systems is manifested in the diverse range of intertwined ordered phases and other quantum states that are associated with different electronic and structural degrees of freedom. Here we find that PrCuSb$_2$ exhibits such phenomena, which at ambient pressure exhibits a fragile antiferromagnetic order, where cooling in a small $c$ axis magnetic field leads to an additional transition to a field-induced ferromagnetic state. This corresponds to an 'inverse melting' effect, whereby further cooling the system restores symmetries of the paramagnetic state broken at the antiferromagnetic transition. Moreover, hydrostatic pressure induces an additional first-order transition at low temperatures, which despite being not likely associated with solely magnetic degrees of freedom, is closely entwined with the magnetic order, disappearing once antiferromagnetism is destroyed by pressure or magnetic fields. Consequently, PrCuSb$_2$ presents a distinct scenario for interplay between different orders, underscoring the breadth of such behaviors within one family of correlated materials.

Submitted 9 April, 2024; originally announced April 2024.

Comments: 10 pages, 11 figures

Journal ref: Phys. Rev. B 109, 104414 (2024)

arXiv:2404.05774 [pdf, other]

STMGF: An Effective Spatial-Temporal Multi-Granularity Framework for Traffic Forecasting

Authors: Zhengyang Zhao, Haitao Yuan, Nan Jiang, Minxiao Chen, Ning Liu, Zengxiang Li

Abstract: Accurate Traffic Prediction is a challenging task in intelligent transportation due to the spatial-temporal aspects of road networks. The traffic of a road network can be affected by long-distance or long-term dependencies where existing methods fall short in modeling them. In this paper, we introduce a novel framework known as Spatial-Temporal Multi-Granularity Framework (STMGF) to enhance the capture of long-distance and long-term information of the road networks. STMGF makes full use of different granularity information of road networks and models the long-distance and long-term information by gathering information in a hierarchical interactive way. Further, it leverages the inherent periodicity in traffic sequences to refine prediction results by matching with recent traffic data. We conduct experiments on two real-world datasets, and the results demonstrate that STMGF outperforms all baseline models and achieves state-of-the-art performance.

Submitted 7 April, 2024; originally announced April 2024.

arXiv:2404.04271 [pdf, other]

Towards Effective Next POI Prediction: Spatial and Semantic Augmentation with Remote Sensing Data

Authors: Nan Jiang, Haitao Yuan, Jianing Si, Minxiao Chen, Shangguang Wang

Abstract: The next point-of-interest (POI) prediction is a significant task in location-based services, yet its complexity arises from the consolidation of spatial and semantic intent. This fusion is subject to the influences of historical preferences, prevailing location, and environmental factors, thereby posing significant challenges. In addition, the uneven POI distribution further complicates the next POI prediction procedure. To address these challenges, we enrich input features and propose an effective deep-learning method within a two-step prediction framework. Our method first incorporates remote sensing data, capturing pivotal environmental context to enhance input features regarding both location and semantics. Subsequently, we employ a region quad-tree structure to integrate urban remote sensing, road network, and POI distribution spaces, aiming to devise a more coherent graph representation method for urban spatial. Leveraging this method, we construct the QR-P graph for the user's historical trajectories to encapsulate historical travel knowledge, thereby augmenting input features with comprehensive spatial and semantic insights. We devise distinct embedding modules to encode these features and employ an attention mechanism to fuse diverse encodings. In the two-step prediction procedure, we initially identify potential spatial zones by predicting user-preferred tiles, followed by pinpointing specific POIs of a designated type within the projected tiles. Empirical findings from four real-world location-based social network datasets underscore the remarkable superiority of our proposed approach over competitive baseline methods.

Submitted 22 March, 2024; originally announced April 2024.

Comments: 12 pages, 11 figures, Accepted by ICDE 2024

arXiv:2404.03236 [pdf, other]

Bright Heralded Source Reaching Theoretical Single-Photon Purity

Authors: Haoyang Wang, Huihong Yuan, Qiang Zeng, Lai Zhou, Haiqiang Ma, Zhiliang Yuan

Abstract: We derive the theoretical limit of single-photon purity of heralded single-photon sources, and accordingly demonstrate a bright, gigahertz-pulsed heralded source with the purity saturating the limit. Based on spontaneous four-wave mixing in a silicon spiral waveguide, this on-chip source is measured to have a coincidence rate exceeding 1.5 MHz at a coincidence-to-accidental ratio (CAR) of 16.77. The single-photon purity, quantified by the auto-correlation function $g^{(2)}_h(0)$, reaches the theoretical limit with the lowest value of $0.00094 \pm 0.00002$ obtained at a coincidence rate of 0.8 kHz. We attribute our results to effective spectral filtering as well as the coherent pump condition helped by optical injection locking.

Submitted 12 June, 2024; v1 submitted 4 April, 2024; originally announced April 2024.

Comments: 7 Pages, 7 figures, comments are welcome!

arXiv:2404.03228 [pdf, other]

Steering nonlocality in high-speed telecommunication system without detection loophole

Authors: Qiang Zeng, Huihong Yuan, Haoyang Wang, Lai Zhou, Zhiliang Yuan

Abstract: Nonlocal correlation represents the key feature of quantum mechanics, which is exploited as a resource in quantum information processing. However, the loophole issues hamper the practical applications. We report the first demonstration of steering nonlocality with detection loophole closed at telecommunication wavelengths. In this endeavour, we design and fabricate a low-loss silicon chip for efficient entanglement generation, and further apply the direct modulation technique to its optical pump to eliminate phase-encoding loss at the steering side. The newly proposed phase-encoding measurement setting adapts to an ultra-fast modulation rate (GHz). Consequently, we build a fiber-optic setup that can overcome the detection efficiency that is required by quantum steering with multiple measurement settings. Our setup provides an immediate platform for exploring applications based on steering nonlocality, especially for quantum communication.

Submitted 4 April, 2024; originally announced April 2024.

Comments: Comments are welcome. A potential new QKD protocol is expected to be developed, looking for collaborations

arXiv:2404.02476 [pdf, other]

Deep Reinforcement Learning for Traveling Purchaser Problems

Authors: Haofeng Yuan, Rongping Zhu, Wanlu Yang, Shiji Song, Keyou You, Yuli Zhang

Abstract: The traveling purchaser problem (TPP) is an important combinatorial optimization problem with broad applications. Due to the coupling between routing and purchasing, existing works on TPPs commonly address route construction and purchase planning simultaneously, which, however, leads to exact methods with high computational cost and heuristics with sophisticated design but limited performance. In sharp contrast, we propose a novel approach based on deep reinforcement learning (DRL), which addresses route construction and purchase planning separately, while evaluating and optimizing the solution from a global perspective. The key components of our approach include a bipartite graph representation for TPPs to capture the market-product relations, and a policy network that extracts information from the bipartite graph and uses it to sequentially construct the route. One significant benefit of our framework is that we can efficiently construct the route using the policy network, and once the route is determined, the associated purchasing plan can be easily derived through linear programming, while, leveraging DRL, we can train the policy network to optimize the global solution objective. Furthermore, by introducing a meta-learning strategy, the policy network can be trained stably on large-sized TPP instances, and generalize well across instances of varying sizes and distributions, even to much larger instances that are never seen during training. Experiments on various synthetic TPP instances and the TPPLIB benchmark demonstrate that our DRL-based approach can significantly outperform well-established TPP heuristics, reducing the optimality gap by 40%-90%, and also showing an advantage in runtime, especially on large-sized instances.

Submitted 11 April, 2024; v1 submitted 3 April, 2024; originally announced April 2024.

arXiv:2404.02011 [pdf]

Superionic Fluoride Gate Dielectrics with Low Diffusion Barrier for Advanced Electronics

Authors: Kui Meng, Zeya Li, Peng Chen, Xingyue Ma, Junwei Huang, Jiayi Li, Feng Qin, Caiyu Qiu, Yilin Zhang, Ding Zhang, Yu Deng, Yurong Yang, Genda Gu, Harold Y. Hwang, Qi-Kun Xue, Yi Cui, Hongtao Yuan

Abstract: Exploration of new dielectrics with large capacitive coupling is an essential topic in modern electronics when conventional dielectrics suffer from the leakage issue near breakdown limit. To address this looming challenge, we demonstrate that rare-earth-metal fluorides with extremely-low ion migration barriers can generally exhibit an excellent capacitive coupling over 20 $μ$F cm$^{-2}$ (with an equivalent oxide thickness of ~0.15 nm and a large effective dielectric constant near 30) and great compatibility with scalable device manufacturing processes. Such static dielectric capability of superionic fluorides is exemplified by MoS$_2$ transistors exhibiting high on/off current ratios over 10$^8$, ultralow subthreshold swing of 65 mV dec$^{-1}$, and ultralow leakage current density of ~10$^{-6}$ A cm$^{-2}$. Therefore, the fluoride-gated logic inverters can achieve significantly higher static voltage gain values, surpassing ~167, compared to conventional dielectrics. Furthermore, the application of fluoride gating enables the demonstration of NAND, NOR, AND, and OR logic circuits with low static energy consumption. Notably, the superconductor-to-insulator transition at the clean-limit Bi$_2$Sr$_2$CaCu$_2$O$_{8+δ}$ can also be realized through fluoride gating. Our findings highlight fluoride dielectrics as a pioneering platform for advanced electronics applications and for tailoring emergent electronic states in condensed matter.

Submitted 2 April, 2024; originally announced April 2024.

Comments: 33 pages, 5 figures

arXiv:2404.00986 [pdf, other]

Make Continual Learning Stronger via C-Flat

Authors: Ang Bian, Wei Li, Hangjie Yuan, Chengrong Yu, Zixiang Zhao, Mang Wang, Aojun Lu, Tao Feng

Abstract: Model generalization ability upon incrementally acquiring dynamically updating knowledge from sequentially arriving tasks is crucial to tackle the sensitivity-stability dilemma in Continual Learning (CL). Weight loss landscape sharpness minimization seeking for flat minima lying in neighborhoods with uniform low loss or smooth gradient is proven to be a strong training regime improving model generalization compared with loss minimization based optimizer like SGD. Yet only a few works have discussed this training regime for CL, proving that dedicated designed zeroth-order sharpness optimizer can improve CL performance. In this work, we propose a Continual Flatness (C-Flat) method featuring a flatter loss landscape tailored for CL. C-Flat could be easily called with only one line of code and is plug-and-play to any CL methods. A general framework of C-Flat applied to all CL categories and a thorough comparison with loss minima optimizer and flat minima based CL approaches is presented in this paper, showing that our method can boost CL performance in almost all cases. Code will be publicly available upon publication.

Submitted 1 April, 2024; originally announced April 2024.

arXiv:2404.00695 [pdf]

Even-integer Quantum Hall Effect in an Oxide Caused by Hidden Rashba Effect

Authors: Jingyue Wang, Junwei Huang, Daniel Kaplan, Xuehan Zhou, Congwei Tan, Jing Zhang, Gangjian Jin, Xuzhong Cong, Yongchao Zhu, Xiaoyin Gao, Yan Liang, Huakun Zuo, Zengwei Zhu, Ruixue Zhu, Ady Stern, Hongtao Liu, Peng Gao, Binghai Yan, Hongtao Yuan, Hailin Peng

Abstract: In the presence of a high magnetic field, quantum Hall systems usually host both even- and odd-integer quantized states because of lifted band degeneracies. Selective control of these quantized states is challenging but essential to understand the exotic ground states and manipulate the spin textures. Here, we study the quantum Hall effect in Bi2O2Se thin films. In magnetic fields as high as 50 T, we observe only even-integer quantum Hall states, but no sign of odd-integer states. However, when reducing the thickness of the epitaxial Bi2O2Se film to one unit cell, we observe both odd- and even-integer states in this Janus (asymmetric) film grown on SrTiO3. By means of a Rashba bilayer model based on ab initio band structures of Bi2O2Se thin films, we can ascribe the absence of odd-integer states in thicker films to the hidden Rashba effect, where the local inversion symmetry breaking in two sectors of the [Bi2O2]2+ layer yields opposite Rashba spin polarizations, which compensate each other. In the one unit cell Bi2O2Se film grown on SrTiO3, the asymmetry introduced by top surface and bottom interface induces a net polar field. The resulting global Rashba effect lifts the band degeneracies present in the symmetric case of thicker films.

Submitted 28 June, 2024; v1 submitted 31 March, 2024; originally announced April 2024.

Comments: 6 Figures, 23 pages

arXiv:2403.18871 [pdf]

doi 10.1016/j.jbi.2024.104673

Clinical Domain Knowledge-Derived Template Improves Post Hoc AI Explanations in Pneumothorax Classification

Authors: Han Yuan, Chuan Hong, Pengtao Jiang, Gangming Zhao, Nguyen Tuan Anh Tran, Xinxing Xu, Yet Yen Yan, Nan Liu

Abstract: Background: Pneumothorax is an acute thoracic disease caused by abnormal air collection between the lungs and chest wall. To address the opaqueness often associated with deep learning (DL) models, explainable artificial intelligence (XAI) methods have been introduced to outline regions related to pneumothorax diagnoses made by DL models. However, these explanations sometimes diverge from actual lesion areas, highlighting the need for further improvement. Method: We propose a template-guided approach to incorporate the clinical knowledge of pneumothorax into model explanations generated by XAI methods, thereby enhancing the quality of these explanations. Utilizing one lesion delineation created by radiologists, our approach first generates a template that represents potential areas of pneumothorax occurrence. This template is then superimposed on model explanations to filter out extraneous explanations that fall outside the template's boundaries. To validate its efficacy, we carried out a comparative analysis of three XAI methods with and without our template guidance when explaining two DL models in two real-world datasets. Results: The proposed approach consistently improved baseline XAI methods across twelve benchmark scenarios built on three XAI methods, two DL models, and two datasets. The average incremental percentages, calculated by the performance improvements over the baseline performance, were 97.8% in Intersection over Union (IoU) and 94.1% in Dice Similarity Coefficient (DSC) when comparing model explanations and ground-truth lesion areas.
Conclusions: In the context of pneumothorax diagnoses, we proposed a template-guided approach for improving AI explanations. We anticipate that our template guidance will forge a fresh approach to elucidating AI models by integrating clinical domain expertise.

Submitted 26 March, 2024; originally announced March 2024.

arXiv:2403.17610 [pdf, other]

MMVP: A Multimodal MoCap Dataset with Vision and Pressure Sensors

Authors: He Zhang, Shenghao Ren, Haolei Yuan, Jianhui Zhao, Fan Li, Shuangpeng Sun, Zhenghao Liang, Tao Yu, Qiu Shen, Xun Cao

Abstract: Foot contact is an important cue for human motion capture, understanding, and generation. Existing datasets tend to annotate dense foot contact using visual matching with thresholding or incorporating pressure signals. However, these approaches either suffer from low accuracy or are only designed for small-range and slow motion. There is still a lack of a vision-pressure multimodal dataset with large-range and fast human motion, as well as accurate and dense foot-contact annotation. To fill this gap, we propose a Multimodal MoCap Dataset with Vision and Pressure sensors, named MMVP. MMVP provides accurate and dense plantar pressure signals synchronized with RGBD observations, which is especially useful for plausible shape estimation, robust pose fitting without foot drifting, and accurate global translation tracking. To validate the dataset, we propose an RGBD-P SMPL fitting method and also a monocular-video-based baseline framework, VP-MoCap, for human motion capture. Experiments demonstrate that our RGBD-P SMPL fitting results significantly outperform pure visual motion capture. Moreover, VP-MoCap outperforms SOTA methods in foot-contact and global translation estimation accuracy. We believe the configuration of the dataset and the baseline frameworks will stimulate the research in this direction and also provide a good reference for MoCap applications in various domains. Project page: https://metaverse-ai-lab-thu.github.io/MMVP-Dataset/.

Submitted 29 March, 2024; v1 submitted 26 March, 2024; originally announced March 2024.

Comments: CVPR2024

arXiv:2403.16374 [pdf, other]

ProIn: Learning to Predict Trajectory Based on Progressive Interactions for Autonomous Driving

Authors: Yinke Dong, Haifeng Yuan, Hongkun Liu, Wei Jing, Fangzhen Li, Hongmin Liu, Bin Fan

Abstract: Accurate motion prediction of pedestrians, cyclists, and other surrounding vehicles (all called agents) is very important for autonomous driving. Most existing works capture map information through a one-stage interaction with the map by vector-based attention, to provide map constraints for social interaction and multi-modal differentiation. However, these methods have to encode all required map rules into the focal agent's feature, so as to retain all possible intentions' paths while at the same time adapting to potential social interaction. In this work, a progressive interaction network is proposed to enable the agent's feature to progressively focus on relevant maps, in order to better learn agents' feature representation capturing the relevant map constraints. The network progressively encodes the complex influence of map constraints into the agent's feature through graph convolutions at the following three stages: after the historical trajectory encoder, after social interaction, and after multi-modal differentiation. In addition, a weight allocation mechanism is proposed for multi-modal training, so that each mode can obtain learning opportunities from a single-mode ground truth. Experiments have validated the superiority of progressive interactions over the existing one-stage interaction, and demonstrated the effectiveness of each component. Encouraging results were obtained in the challenging benchmarks.

Submitted 24 March, 2024; originally announced March 2024.

Showing 1–50 of 955 results for author: Yuan, H