Efficient Stochastic Routing in Path-Centric Uncertain Road Networks -- Extended Version

Authors: Chenjuan Guo, Ronghui Xu, Bin Yang, Ye Yuan, Tung Kieu, Yan Zhao, Christian S. Jensen

Abstract: The availability of massive vehicle trajectory data enables the modeling of road-network constrained movement as travel-cost distributions rather than just single-valued costs, thereby capturing the inherent uncertainty of movement and enabling improved routing quality. Thus, stochastic routing has been studied extensively in the edge-centric model, where such costs are assigned to the edges in a graph representation of a road network. However, as this model still disregards important information in trajectories and fails to capture dependencies among cost distributions, a path-centric model, where costs are assigned to paths, has been proposed that captures dependencies better and provides an improved foundation for routing. Unfortunately, when applied in this model, existing routing algorithms are inefficient due to two shortcomings that we eliminate. First, when exploring candidate paths, existing algorithms only consider the costs of candidate paths from the source to intermediate vertices, while disregarding the costs of travel from the intermediate vertices to the destination, causing many non-competitive paths to be explored. We propose two heuristics for estimating the cost from an intermediate vertex to the destination, thus improving routing efficiency. Second, the edge-centric model relies on stochastic dominance-based pruning to improve efficiency. This pruning assumes that costs are independent and is therefore inapplicable in the path-centric model that takes dependencies into account. We introduce a notion of virtual path that effectively enables stochastic dominance-based pruning in the path-based model, thus further improving efficiency. Empirical studies using two real-world trajectory sets offer insight into the properties of the proposed solution, indicating that it enables efficient stochastic routing in the path-centric model.

Submitted 9 July, 2024; originally announced July 2024.
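
The stochastic dominance-based pruning mentioned in the abstract above can be illustrated with a minimal sketch. It is not the paper's algorithm and does not implement its virtual paths. It only shows a first-order dominance test between discrete travel-cost distributions, assuming each candidate path is given as a hypothetical `{cost: probability}` dictionary and that lower cost is better. The path names and numbers are made up for illustration.

```python
def cdf(dist, support):
    """Return P(cost <= t) for every t in support, for a {cost: prob} dict."""
    return [sum(p for c, p in dist.items() if c <= t) for t in support]


def dominates(a, b, eps=1e-12):
    """True if distribution a first-order stochastically dominates b.

    With lower cost being better, a dominates b when a's CDF is never
    below b's CDF and is strictly above it at some cost value.
    """
    support = sorted(set(a) | set(b))
    cdf_a, cdf_b = cdf(a, support), cdf(b, support)
    never_worse = all(fa >= fb - eps for fa, fb in zip(cdf_a, cdf_b))
    somewhere_better = any(fa > fb + eps for fa, fb in zip(cdf_a, cdf_b))
    return never_worse and somewhere_better


def prune(candidates):
    """Keep only the candidates that no other candidate dominates."""
    return {
        name: dist
        for name, dist in candidates.items()
        if not any(
            dominates(other, dist)
            for other_name, other in candidates.items()
            if other_name != name
        )
    }


# Hypothetical travel-time distributions (minutes -> probability).
paths = {
    "P1": {10: 0.5, 15: 0.5},
    "P2": {12: 0.5, 20: 0.5},
    "P3": {8: 0.2, 25: 0.8},
}
survivors = prune(paths)
print(sorted(survivors))
```

Here `P2` is pruned because `P1` is at least as likely to finish within any time budget. `P3` survives because it is the only path that can finish in 8 minutes.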

arXiv:2407.05813 [pdf, other]

DarkSide-20k sensitivity to light dark matter particles

Authors: DarkSide-20k Collaboration, F. Acerbi, P. Adhikari, P. Agnes, I. Ahmad, S. Albergo, I. F. M. Albuquerque, T. Alexander, A. K. Alton, P. Amaudruz, M. Angiolilli, E. Aprile, R. Ardito, M. Atzori Corona, D. J. Auty, M. Ave, I. C. Avetisov, O. Azzolini, H. O. Back, Z. Balmforth, A. Barrado Olmedo, P. Barrillon, G. Batignani, P. Bhowmick, et al. (289 additional authors not shown)

Abstract: The dual-phase liquid argon time projection chamber is presently one of the leading technologies to search for dark matter particles with masses below 10 GeV/c$^2$. This was demonstrated by the DarkSide-50 experiment with approximately 50 kg of low-radioactivity liquid argon as target material. The next generation experiment DarkSide-20k, currently under construction, will use 1,000 times more argon and is expected to start operation in 2027. Based on the DarkSide-50 experience, here we assess the DarkSide-20k sensitivity to models predicting light dark matter particles, including Weakly Interacting Massive Particles (WIMPs) and sub-GeV/c$^2$ particles interacting with electrons in argon atoms. With one year of data, a sensitivity improvement to dark matter interaction cross-sections by at least one order of magnitude with respect to DarkSide-50 is expected for all these models. A sensitivity to WIMP--nucleon interaction cross-sections below $1\times10^{-42}$ cm$^2$ is achievable for WIMP masses above 800 MeV/c$^2$. With 10 years exposure, the neutrino fog can be reached for WIMP masses around 5 GeV/c$^2$.

Submitted 8 July, 2024; originally announced July 2024.

Comments: submitted to Nature Communications

arXiv:2407.04237 [pdf, other]

GSD: View-Guided Gaussian Splatting Diffusion for 3D Reconstruction

Authors: Yuxuan Mu, Xinxin Zuo, Chuan Guo, Yilin Wang, Juwei Lu, Xiaofeng Wu, Songcen Xu, Peng Dai, Youliang Yan, Li Cheng

Abstract: We present GSD, a diffusion model approach based on Gaussian Splatting (GS) representation for 3D object reconstruction from a single view. Prior works suffer from inconsistent 3D geometry or mediocre rendering quality due to improper representations. We take a step towards resolving these shortcomings by utilizing the recent state-of-the-art 3D explicit representation, Gaussian Splatting, and an unconditional diffusion model. This model learns to generate 3D objects represented by sets of GS ellipsoids. With these strong generative 3D priors, though learning unconditionally, the diffusion model is ready for view-guided reconstruction without further model fine-tuning. This is achieved by propagating fine-grained 2D features through the efficient yet flexible splatting function and the guided denoising sampling process. In addition, a 2D diffusion model is further employed to enhance rendering fidelity, and improve reconstructed GS quality by polishing and re-using the rendered images. The final reconstructed objects explicitly come with high-quality 3D structure and texture, and can be efficiently rendered in arbitrary views. Experiments on the challenging real-world CO3D dataset demonstrate the superiority of our approach. Project page: https://yxmu.foo/GSD/

Submitted 10 July, 2024; v1 submitted 4 July, 2024; originally announced July 2024.

Comments: Accepted for ECCV 2024

arXiv:2407.03757 [pdf, other]

DiffRetouch: Using Diffusion to Retouch on the Shoulder of Experts

Authors: Zheng-Peng Duan, Jiawei Zhang, Zheng Lin, Xin Jin, Dongqing Zou, Chunle Guo, Chongyi Li

Abstract: Image retouching aims to enhance the visual quality of photos. Considering the different aesthetic preferences of users, the target of retouching is subjective. However, current retouching methods mostly adopt deterministic models, which not only neglect the style diversity in the expert-retouched results and tend to learn an average style during training, but also lack sample diversity during inference. In this paper, we propose a diffusion-based method, named DiffRetouch. Thanks to the excellent distribution modeling ability of diffusion, our method can capture the complex fine-retouched distribution covering various visually pleasing styles in the training data. Moreover, four image attributes are made adjustable to provide a user-friendly editing mechanism. By adjusting these attributes in specified ranges, users are allowed to customize preferred styles within the learned fine-retouched distribution. Additionally, the affine bilateral grid and contrastive learning scheme are introduced to handle the problems of texture distortion and control insensitivity, respectively. Extensive experiments have demonstrated the superior performance of our method in visual appeal and sample diversity. The code will be made available to the community.

Submitted 4 July, 2024; originally announced July 2024.

arXiv:2406.19212 [pdf, other]

JuliVQC: an Efficient Variational Quantum Circuit Simulator for Near-Term Quantum Algorithms

Authors: Wei-You Liao, Xiang Wang, Xiao-Yue Xu, Chen Ding, Shuo Zhang, He-Liang Huang, Chu Guo

Abstract: We introduce JuliVQC: a light-weight, yet extremely efficient variational quantum circuit simulator. JuliVQC is part of an effort for classical simulation of the Zuchongzhi quantum processors, where it is extensively used to characterize the circuit noises, as a building block in the Schrödinger-Feynman algorithm for classical verification and performance benchmarking, and for variational optimization of the Fsim gate parameters. The design principle of JuliVQC is three-fold: (1) Transparent implementation of its core algorithms, realized by using the high-performance script language Julia; (2) Efficiency is the focus, with a cache-friendly implementation of each elementary operation and support for shared-memory parallelization; (3) Native support of automatic differentiation for both the noiseless and noisy quantum circuits. We perform extensive numerical experiments on JuliVQC in different application scenarios, including quantum circuits, variational quantum circuits and their noisy counterparts, which show that its performance is among the top of the popular alternatives.

Submitted 27 June, 2024; originally announced June 2024.

Comments: 12 pages, 8 figures

arXiv:2406.18824 [pdf, other]

Topological winding guaranteed coherent orthogonal scattering

Authors: Cheng Guo, Shanhui Fan

Abstract: Coherent control has enabled various novel phenomena in wave scattering. We introduce an effect called coherent orthogonal scattering, where the output wave becomes orthogonal to the reference output state without scatterers. This effect leads to a unity extinction coefficient and complete mode conversion. We examine the conditions for this effect and reveal its topological nature by relating it to the indivisibility between the dimension and the winding number of scattering submatrices. These findings deepen our understanding of topological scattering phenomena.

Submitted 26 June, 2024; originally announced June 2024.

Comments: 13 pages, 7 figures. In press

arXiv:2406.08160 [pdf, other]

Chemistry3D: Robotic Interaction Benchmark for Chemistry Experiments

Authors: Shoujie Li, Yan Huang, Changqing Guo, Tong Wu, Jiawei Zhang, Linrui Zhang, Wenbo Ding

Abstract: The advent of simulation engines has revolutionized learning and operational efficiency for robots, offering cost-effective and swift pipelines. However, the lack of a universal simulation platform tailored for chemical scenarios impedes progress in robotic manipulation and visualization of reaction processes. Addressing this void, we present Chemistry3D, an innovative toolkit that integrates extensive chemical and robotic knowledge. Chemistry3D not only enables robots to perform chemical experiments but also provides real-time visualization of temperature, color, and pH changes during reactions. Built on the NVIDIA Omniverse platform, Chemistry3D offers interfaces for robot operation, visual inspection, and liquid flow control, facilitating the simulation of special objects such as liquids and transparent entities. Leveraging this toolkit, we have devised RL tasks, object detection, and robot operation scenarios. Additionally, to discern disparities between the rendering engine and the real world, we conducted transparent object detection experiments using Sim2Real, validating the toolkit's exceptional simulation performance. The source code is available at https://github.com/huangyan28/Chemistry3D, and a related tutorial can be found at https://www.omni-chemistry.com.

Submitted 12 June, 2024; originally announced June 2024.

arXiv:2406.07006 [pdf, other]

MIPI 2024 Challenge on Few-shot RAW Image Denoising: Methods and Results

Authors: Xin Jin, Chunle Guo, Xiaoming Li, Zongsheng Yue, Chongyi Li, Shangchen Zhou, Ruicheng Feng, Yuekun Dai, Peiqing Yang, Chen Change Loy, Ruoqi Li, Chang Liu, Ziyi Wang, Yao Du, Jingjing Yang, Long Bao, Heng Sun, Xiangyu Kong, Xiaoxia Xing, Jinlong Wu, Yuanyang Xue, Hyunhee Park, Sejun Song, Changho Kim, Jingfan Tan, et al. (17 additional authors not shown)

Abstract: The increasing demand for computational photography and imaging on mobile platforms has led to the widespread development and integration of advanced image sensors with novel algorithms in camera systems. However, the scarcity of high-quality data for research and the rare opportunity for in-depth exchange of views from industry and academia constrain the development of mobile intelligent photography and imaging (MIPI). Building on the achievements of the previous MIPI Workshops held at ECCV 2022 and CVPR 2023, we introduce our third MIPI challenge including three tracks focusing on novel image sensors and imaging algorithms. In this paper, we summarize and review the Few-shot RAW Image Denoising track on MIPI 2024. In total, 165 participants were successfully registered, and 7 teams submitted results in the final testing phase. The developed solutions in this challenge achieved state-of-the-art performance on Few-shot RAW Image Denoising. More details of this challenge and the link to the dataset can be found at https://mipichallenge.org/MIPI2024.

Submitted 11 June, 2024; originally announced June 2024.

Comments: CVPR 2024 Mobile Intelligent Photography and Imaging (MIPI) Workshop--Few-shot RAW Image Denoising Challenge Report. Website: https://mipi-challenge.org/MIPI2024/

arXiv:2406.06216 [pdf, other]

Lighting Every Darkness with 3DGS: Fast Training and Real-Time Rendering for HDR View Synthesis

Authors: Xin Jin, Pengyi Jiao, Zheng-Peng Duan, Xingchao Yang, Chun-Le Guo, Bo Ren, Chongyi Li

Abstract: Volumetric rendering based methods, like NeRF, excel in HDR view synthesis from RAW images, especially for nighttime scenes. Yet they suffer from long training times and cannot perform real-time rendering due to dense sampling requirements. The advent of 3D Gaussian Splatting (3DGS) enables real-time rendering and faster training. However, implementing RAW image-based view synthesis directly using 3DGS is challenging due to its inherent drawbacks: 1) in nighttime scenes, extremely low SNR leads to poor structure-from-motion (SfM) estimation in distant views; 2) the limited representation capacity of the spherical harmonics (SH) function is unsuitable for RAW linear color space; and 3) inaccurate scene structure hampers downstream tasks such as refocusing. To address these issues, we propose LE3D (Lighting Every darkness with 3DGS). Our method proposes Cone Scatter Initialization to enrich the estimation of SfM, and replaces SH with a Color MLP to represent the RAW linear color space. Additionally, we introduce depth distortion and near-far regularizations to improve the accuracy of scene structure for downstream tasks. These designs enable LE3D to perform real-time novel view synthesis, HDR rendering, refocusing, and tone-mapping changes. Compared to previous volumetric rendering based methods, LE3D reduces training time to 1% and improves rendering speed by up to 4,000 times for 2K resolution images in terms of FPS. Code and viewer can be found at https://github.com/Srameo/LE3D.

Submitted 10 June, 2024; originally announced June 2024.

arXiv:2406.03721 [pdf, other]

Attribute-Aware Implicit Modality Alignment for Text Attribute Person Search

Authors: Xin Wang, Fangfang Liu, Zheng Li, Caili Guo

Abstract: Text attribute person search aims to find specific pedestrians through given textual attributes, which is valuable when searching for designated pedestrians based on witness descriptions. The key challenge is the significant modality gap between textual attributes and images. Previous methods focused on achieving explicit representation and alignment through unimodal pre-trained models. Nevertheless, the absence of inter-modality correspondence in these models may lead to distortions in the local information of intra-modality. Moreover, these methods only considered the alignment of inter-modality and ignored the differences between different attribute categories. To mitigate the above problems, we propose an Attribute-Aware Implicit Modality Alignment (AIMA) framework to learn the correspondence of local representations between textual attributes and images and combine global representation matching to narrow the modality gap. Firstly, we introduce the CLIP model as the backbone and design prompt templates to transform attribute combinations into structured sentences. This facilitates the model's ability to better understand and match image details. Next, we design a Masked Attribute Prediction (MAP) module that predicts the masked attributes after the interaction of image and masked textual attribute features through multi-modal interaction, thereby achieving implicit local relationship alignment. Finally, we propose an Attribute-IoU Guided Intra-Modal Contrastive (A-IoU IMC) loss, aligning the distribution of different textual attributes in the embedding space with their IoU distribution, achieving better semantic arrangement. Extensive experiments on the Market-1501 Attribute, PETA, and PA100K datasets show that the performance of our proposed method significantly surpasses the current state-of-the-art methods.

Submitted 5 June, 2024; originally announced June 2024.
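
As a rough illustration of the "Attribute-IoU" idea named in the abstract above, the sketch below computes intersection-over-union between two attribute sets. How the paper defines and uses the IoU inside its A-IoU IMC loss is not stated in the listing. Treating each description as a plain set of attribute strings is an assumption, and the function name `attribute_iou` and the sample attributes are hypothetical.

```python
def attribute_iou(attrs_a, attrs_b):
    """Intersection-over-union of two attribute collections.

    Assumption: two empty descriptions are treated as identical (IoU 1.0).
    """
    a, b = set(attrs_a), set(attrs_b)
    union = a | b
    if not union:
        return 1.0
    return len(a & b) / len(union)


# Hypothetical attribute descriptions of two pedestrians.
person_1 = {"male", "backpack", "short hair"}
person_2 = {"male", "backpack", "long hair"}
print(attribute_iou(person_1, person_2))
```

Two descriptions that share two of four distinct attributes get an IoU of 0.5. A loss guided by such a score could pull the corresponding text embeddings closer together than those of descriptions with no overlap.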

arXiv:2406.03108 [pdf, ps, other]

Lepton flavor violating decays $Z\rightarrow l^{\pm}_{i}l^{\mp}_{j}$ in the B-L Supersymmetric Standard Model

Authors: Jia-Peng Huo, Xing-Xing Dong, Jiao Ma, Shu-Min Zhao, Cai Guo, Hai-Bin Zhang, Jin-Lei Yang, Tai-Fu Feng

Abstract: Lepton flavor violation (LFV) represents a clear new physics (NP) signal beyond the standard model (SM). In this paper, we study LFV decays $Z\rightarrow l^{\pm}_{i}l^{\mp}_{j}$ in the B-L Supersymmetric Standard Model (B-LSSM). We calculate these processes separately in the mass eigenstate basis and the electroweak interaction basis, and the latter adopts the mass insertion approximation (MIA) method. The MIA clearly shows the effect of parameters on the LFV decays $Z\rightarrow l^{\pm}_{i}l^{\mp}_{j}$ at the analytic level, which provides a new way for us to analyze the LFV processes. At the same time, the corresponding constraints from the LFV decays $l^{-}_{j} \rightarrow l^{-}_{i} γ$ and $(g-2)_μ$ are considered to analyze the numerical results.

Submitted 5 June, 2024; originally announced June 2024.

arXiv:2406.02249 [pdf, other]

A novel measurement method for SiPM external crosstalk probability at low temperature

Authors: Guanda Li, Lei Wang, Xilei Sun, Fang Liu, Cong Guo, Kangkang Zhao, Lei Tian, Zeyuan Yu, Zhilong Hou, Chi Li, Yu Lei, Bin Wang, Rongbin Zhou

Abstract: Silicon photomultipliers (SiPMs) are being considered as potential replacements for conventional photomultiplier tubes (PMTs). However, a significant disadvantage of SiPMs is crosstalk (CT), wherein photons propagate through other pixels, resulting in secondary avalanches. CT can be categorized into internal crosstalk and external crosstalk based on whether the secondary avalanche occurs within the same SiPM or a different one. Numerous methods exist for quantitatively estimating the percentage of internal crosstalk (iCT). However, external crosstalk (eCT) has not been extensively studied. This article presents a novel measurement method for the probability of emitting an external crosstalk photon during a single pixel avalanche, using a setup involving two identical SiPMs facing each other, and without the need for complex optical designs. The entire apparatus is enclosed within a stainless steel chamber, functioning as a light-tight enclosure, and maintained at liquid nitrogen temperature. The experimental setup incorporates two Sensl J-60035 SiPM chips along with two 0.5-inch Hamamatsu Photonics (HPK) VUV4 S13370-6050CN SiPM arrays. The findings show a linear relationship between the probability of emitting an external crosstalk photon and the SiPM overvoltage for both SiPM samples. Surprisingly, this novel measurement method also provides measurements of the SiPM photon detection efficiency (PDE) for eCT photons at low temperature.

Submitted 4 June, 2024; originally announced June 2024.

arXiv:2406.01595 [pdf, other]

MultiPly: Reconstruction of Multiple People from Monocular Video in the Wild

Authors: Zeren Jiang, Chen Guo, Manuel Kaufmann, Tianjian Jiang, Julien Valentin, Otmar Hilliges, Jie Song

Abstract: We present MultiPly, a novel framework to reconstruct multiple people in 3D from monocular in-the-wild videos. Reconstructing multiple individuals moving and interacting naturally from monocular in-the-wild videos poses a challenging task. Addressing it necessitates precise pixel-level disentanglement of individuals without any prior knowledge about the subjects. Moreover, it requires recovering intricate and complete 3D human shapes from short video sequences, intensifying the level of difficulty. To tackle these challenges, we first define a layered neural representation for the entire scene, composited by individual human and background models. We learn the layered neural representation from videos via our layer-wise differentiable volume rendering. This learning process is further enhanced by our hybrid instance segmentation approach which combines the self-supervised 3D segmentation and the promptable 2D segmentation module, yielding reliable instance segmentation supervision even under close human interaction. A confidence-guided optimization formulation is introduced to optimize the human poses and shape/appearance alternately. We incorporate effective objectives to refine human poses via photometric information and impose physically plausible constraints on human dynamics, leading to temporally consistent 3D reconstructions with high fidelity. The evaluation of our method shows the superiority over prior art on publicly available datasets and in-the-wild videos.

Submitted 3 June, 2024; originally announced June 2024.

Comments: Project page: https://eth-ait.github.io/MultiPly/

arXiv:2406.00516 [pdf, other]

Deep Learning based Performance Testing for Analog Integrated Circuits

Authors: Jiawei Cao, Chongtao Guo, Hao Li, Zhigang Wang, Houjun Wang, Geoffrey Ye Li

Abstract: In this paper, we propose a deep learning based performance testing framework to minimize the number of required test modules while guaranteeing the accuracy requirement, where a test module corresponds to a combination of one circuit and one stimulus. First, we apply a deep neural network (DNN) to establish the mapping from the response of the circuit under test (CUT) in each module to all specifications to be tested. Then, the required test modules are selected by solving a 0-1 integer programming problem. Finally, the predictions from the selected test modules are combined by a DNN to form the specification estimations. The simulation results validate the proposed approach in terms of testing accuracy and cost.

Submitted 1 June, 2024; originally announced June 2024.
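
The module-selection step in the abstract above is described only as "a 0-1 integer programming problem". The sketch below shows one simplified way such a problem could look. It is an assumption, not the paper's formulation: each test module either is or is not accurate enough for a given specification, and the goal is the smallest set of modules that covers every specification. Module names, specification names, and the brute-force search are all hypothetical and suitable only for tiny instances.

```python
from itertools import combinations


def select_modules(covers, specs):
    """Return a smallest list of modules whose covered specs include all specs.

    covers maps a module name to the set of specifications it predicts
    within the accuracy requirement (an assumed, precomputed input).
    Each module is a 0-1 decision: selected or not selected.
    """
    modules = sorted(covers)
    for size in range(1, len(modules) + 1):
        for subset in combinations(modules, size):
            covered = set().union(*(covers[m] for m in subset))
            if specs <= covered:
                return list(subset)
    return None  # no selection satisfies every specification


# Hypothetical accuracy table for four test modules.
covers = {
    "m1": {"gain", "bandwidth"},
    "m2": {"bandwidth", "offset"},
    "m3": {"offset"},
    "m4": {"gain", "offset", "noise"},
}
specs = {"gain", "bandwidth", "offset", "noise"}
chosen = select_modules(covers, specs)
print(chosen)
```

No single module covers all four specifications, so the search moves on to pairs and returns the first pair that does. A real instance would use an integer programming solver instead of enumeration.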

arXiv:2405.17792 [pdf, other]

JUNO Sensitivity to Invisible Decay Modes of Neutrons

Authors: JUNO Collaboration, Angel Abusleme, Thomas Adam, Kai Adamowicz, Shakeel Ahmad, Rizwan Ahmed, Sebastiano Aiello, Fengpeng An, Qi An, Giuseppe Andronico, Nikolay Anfimov, Vito Antonelli, Tatiana Antoshkina, João Pedro Athayde Marcondes de André, Didier Auguste, Weidong Bai, Nikita Balashov, Wander Baldini, Andrea Barresi, Davide Basilico, Eric Baussan, Marco Bellato, Marco Beretta, Antonio Bergnoli, Daniel Bick, et al. (635 additional authors not shown)

Abstract: We explore the decay of bound neutrons into invisible particles (e.g., $n\rightarrow 3 ν$ or $nn \rightarrow 2 ν$) in the JUNO liquid scintillator detector. The invisible decay includes two decay modes: $ n \rightarrow { inv} $ and $ nn \rightarrow { inv} $. The invisible decays of $s$-shell neutrons in $^{12}{\rm C}$ will leave a highly excited residual nucleus. Subsequently, some de-excitation modes of the excited residual nuclei can produce a time- and space-correlated triple coincidence signal in the JUNO detector. Based on a full Monte Carlo simulation informed with the latest available data, we estimate all backgrounds, including inverse beta decay events of the reactor antineutrino $\bar{ν}_e$, natural radioactivity, cosmogenic isotopes and neutral current interactions of atmospheric neutrinos. Pulse shape discrimination and multivariate analysis techniques are employed to further suppress backgrounds. With two years of exposure, JUNO is expected to give an order of magnitude improvement compared to the current best limits. After 10 years of data taking, the JUNO expected sensitivities at a 90% confidence level are $τ/B( n \rightarrow { inv} ) > 5.0 \times 10^{31} \, {\rm yr}$ and $τ/B( nn \rightarrow { inv} ) > 1.4 \times 10^{32} \, {\rm yr}$.

Submitted 27 May, 2024; originally announced May 2024.

Comments: 28 pages, 7 figures, 4 tables

arXiv:2405.17478 [pdf, other]

ROSE: Register Assisted General Time Series Forecasting with Decomposed Frequency Learning

Authors: Yihang Wang, Yuying Qiu, Peng Chen, Kai Zhao, Yang Shu, Zhongwen Rao, Lujia Pan, Bin Yang, Chenjuan Guo

Abstract: With the increasing collection of time series data from various domains, there arises a strong demand for general time series forecasting models pre-trained on a large number of time-series datasets to support a variety of downstream prediction tasks. Enabling general time series forecasting faces two challenges: how to obtain unified representations from multi-domain time series data, and how to capture domain-specific features from time series data across various domains for adaptive transfer in downstream tasks. To address these challenges, we propose a Register Assisted General Time Series Forecasting Model with Decomposed Frequency Learning (ROSE), a novel pre-trained model for time series forecasting. ROSE employs Decomposed Frequency Learning for the pre-training task, which decomposes coupled semantic and periodic information in time series with frequency-based masking and reconstruction to obtain unified representations across domains. We also equip ROSE with a Time Series Register, which learns to generate a register codebook to capture domain-specific representations during pre-training and enhances domain-adaptive transfer by selecting related register tokens on downstream tasks. After pre-training on large-scale time series data, ROSE achieves state-of-the-art forecasting performance on 8 real-world benchmarks. Remarkably, even in few-shot scenarios, it demonstrates competitive or superior performance compared to existing methods trained with full data.

Submitted 24 May, 2024; originally announced May 2024.

arXiv:2405.17420 [pdf, other]

Survival of the Fittest Representation: A Case Study with Modular Addition

Authors: Xiaoman Delores Ding, Zifan Carl Guo, Eric J. Michaud, Ziming Liu, Max Tegmark

Abstract: When a neural network can learn multiple distinct algorithms to solve a task, how does it "choose" between them during training? To approach this question, we take inspiration from ecology: when multiple species coexist, they eventually reach an equilibrium where some survive while others die out. Analogously, we suggest that a neural network at initialization contains many solutions (representations and algorithms), which compete with each other under pressure from resource constraints, with the "fittest" ultimately prevailing. To investigate this Survival of the Fittest hypothesis, we conduct a case study on neural networks performing modular addition, and find that these networks' multiple circular representations at different Fourier frequencies undergo such competitive dynamics, with only a few circles surviving at the end. We find that the frequencies with high initial signals and gradients, the "fittest," are more likely to survive. By increasing the embedding dimension, we also observe more surviving frequencies. Inspired by the Lotka-Volterra equations describing the dynamics between species, we find that the dynamics of the circles can be nicely characterized by a set of linear differential equations. Our results with modular addition show that it is possible to decompose complicated representations into simpler components, along with their basic interactions, to offer insight on the training dynamics of representations.

Submitted 27 May, 2024; originally announced May 2024.

arXiv:2405.17247 [pdf, other]

An Introduction to Vision-Language Modeling

Authors: Florian Bordes, Richard Yuanzhe Pang, Anurag Ajay, Alexander C. Li, Adrien Bardes, Suzanne Petryk, Oscar Mañas, Zhiqiu Lin, Anas Mahmoud, Bargav Jayaraman, Mark Ibrahim, Melissa Hall, Yunyang Xiong, Jonathan Lebensold, Candace Ross, Srihari Jayakumar, Chuan Guo, Diane Bouchacourt, Haider Al-Tahan, Karthik Padthe, Vasu Sharma, Hu Xu, Xiaoqing Ellen Tan, Megan Richards, Samuel Lavoie, et al. (16 additional authors not shown)

Abstract: Following the recent popularity of Large Language Models (LLMs), several attempts have been made to extend them to the visual domain. From having a visual assistant that could guide us through unfamiliar environments to generative models that produce images using only a high-level text description, the vision-language model (VLM) applications will significantly impact our relationship with technology. However, there are many challenges that need to be addressed to improve the reliability of those models. While language is discrete, vision evolves in a much higher dimensional space in which concepts cannot always be easily discretized. To better understand the mechanics behind mapping vision to language, we present this introduction to VLMs which we hope will help anyone who would like to enter the field. First, we introduce what VLMs are, how they work, and how to train them. Then, we present and discuss approaches to evaluate VLMs. Although this work primarily focuses on mapping images to language, we also discuss extending VLMs to videos.

Submitted 27 May, 2024; originally announced May 2024.

arXiv:2405.15273 [pdf, other]

Towards a General Time Series Anomaly Detector with Adaptive Bottlenecks and Dual Adversarial Decoders

Authors: Qichao Shentu, Beibu Li, Kai Zhao, Yang Shu, Zhongwen Rao, Lujia Pan, Bin Yang, Chenjuan Guo

Abstract: Time series anomaly detection plays a vital role in a wide range of applications. Existing methods require training one specific model for each dataset, which exhibits limited generalization capability across different target datasets, hindering anomaly detection performance in various scenarios with scarce training data. Aiming at this problem, we propose constructing a general time series anomaly detection model, which is pre-trained on extensive multi-domain datasets and can subsequently be applied to a multitude of downstream scenarios. The significant divergence of time series data across different domains presents two primary challenges in building such a general model: (1) meeting the diverse requirements of appropriate information bottlenecks tailored to different datasets in one unified model, and (2) enabling distinction between multiple normal and abnormal patterns, both of which are crucial for effective anomaly detection in various target scenarios. To tackle these two challenges, we propose a General time series anomaly Detector with Adaptive Bottlenecks and Dual Adversarial Decoders (DADA), which enables flexible selection of bottlenecks based on different data and explicitly enhances clear differentiation between normal and abnormal series. We conduct extensive experiments on nine target datasets from different domains. After pre-training on multi-domain data, DADA, serving as a zero-shot anomaly detector for these datasets, still achieves competitive or even superior results compared to those models tailored to each specific dataset.

Submitted 2 June, 2024; v1 submitted 24 May, 2024; originally announced May 2024.

arXiv:2405.14767 [pdf, other]

FinRobot: An Open-Source AI Agent Platform for Financial Applications using Large Language Models

Authors: Hongyang Yang, Boyu Zhang, Neng Wang, Cheng Guo, Xiaoli Zhang, Likun Lin, Junlin Wang, Tianyu Zhou, Mao Guan, Runjia Zhang, Christina Dan Wang

Abstract: As financial institutions and professionals increasingly incorporate Large Language Models (LLMs) into their workflows, substantial barriers, including proprietary data and specialized knowledge, persist between the finance sector and the AI community. These challenges impede the AI community's ability to enhance financial tasks effectively. Acknowledging financial analysis's critical role, we aim to devise financial-specialized LLM-based toolchains and democratize access to them through open-source initiatives, promoting wider AI adoption in financial decision-making. In this paper, we introduce FinRobot, a novel open-source AI agent platform supporting multiple financially specialized AI agents, each powered by an LLM. Specifically, the platform consists of four major layers: 1) the Financial AI Agents layer that formulates Financial Chain-of-Thought (CoT) by breaking sophisticated financial problems down into logical sequences; 2) the Financial LLM Algorithms layer that dynamically configures appropriate model application strategies for specific tasks; 3) the LLMOps and DataOps layer that produces accurate models by applying training/fine-tuning techniques and using task-relevant data; 4) the Multi-source LLM Foundation Models layer that integrates various LLMs and enables the above layers to access them directly. Finally, FinRobot provides hands-on for both professional-grade analysts and laypersons to utilize powerful AI techniques for advanced financial analysis. We open-source FinRobot at https://github.com/AI4Finance-Foundation/FinRobot.

Submitted 27 May, 2024; v1 submitted 23 May, 2024; originally announced May 2024.

Comments: FinRobot Whitepaper V1.0

arXiv:2405.14520 [pdf, other]

Ghost-Stereo: GhostNet-based Cost Volume Enhancement and Aggregation for Stereo Matching Networks

Authors: Xingguang Jiang, Xiaofeng Bian, Chenggang Guo

Abstract: Depth estimation based on stereo matching is a classic but popular computer vision problem, which has a wide range of real-world applications. Current stereo matching methods generally adopt the deep Siamese neural network architecture, and have achieved impressive performance by constructing feature matching cost volumes and using 3D convolutions for cost aggregation. However, most existing methods suffer from a large number of parameters and slow running time due to the sequential use of 3D convolutions. In this paper, we propose Ghost-Stereo, a novel end-to-end stereo matching network. The feature extraction part of the network uses the GhostNet to form a U-shaped structure. The core of Ghost-Stereo is a GhostNet feature-based cost volume enhancement (Ghost-CVE) module and a GhostNet-inspired lightweight cost volume aggregation (Ghost-CVA) module. For the Ghost-CVE part, cost volumes are constructed and fused by the GhostNet-based features to enhance the spatial context awareness. For the Ghost-CVA part, a lightweight 3D convolution bottleneck block based on the GhostNet is proposed to reduce the computational complexity in this module. By combining with the context and geometry fusion module, a classical hourglass-shaped cost volume aggregate structure is constructed. Ghost-Stereo achieves performance comparable to state-of-the-art real-time methods on several public benchmarks, and shows a better generalization ability.

Submitted 23 May, 2024; originally announced May 2024.

arXiv:2405.11971 [pdf, other]

Data Augmentation for Text-based Person Retrieval Using Large Language Models

Authors: Zheng Li, Lijia Si, Caili Guo, Yang Yang, Qiushi Cao

Abstract: Text-based Person Retrieval (TPR) aims to retrieve person images that match the description given a text query. The performance improvement of the TPR model relies on high-quality data for supervised training. However, it is difficult to construct a large-scale, high-quality TPR dataset due to expensive annotation and privacy protection. Recently, Large Language Models (LLMs) have approached or even surpassed human performance on many NLP tasks, creating the possibility to expand high-quality TPR datasets. This paper proposes an LLM-based Data Augmentation (LLM-DA) method for TPR. LLM-DA uses LLMs to rewrite the text in the current TPR dataset, achieving high-quality expansion of the dataset concisely and efficiently. These rewritten texts are able to increase the diversity of vocabulary and sentence structure while retaining the original key concepts and semantic information. In order to alleviate the hallucinations of LLMs, LLM-DA introduces a Text Faithfulness Filter (TFF) to filter out unfaithful rewritten text. To balance the contributions of original text and augmented text, a Balanced Sampling Strategy (BSS) is proposed to control the proportion of original text and augmented text used for training. LLM-DA is a plug-and-play method that can be easily integrated into various TPR models. Comprehensive experiments on three TPR benchmarks show that LLM-DA can improve the retrieval performance of current TPR models.

Submitted 20 May, 2024; originally announced May 2024.

arXiv:2405.11859 [pdf, ps, other]

Highly versatile, two-color setup for high-order harmonic generation using spatial light modulators

Authors: Ann-Kathrin Raab, Marvin Schmoll, Emma R. Simpson, Melvin Redon, Yuman Fang, Chen Guo, Anne-Lise Viotti, Cord L. Arnold, Anne L'Huillier, Johan Mauritsson

Abstract: We present a novel, interferometric, two-color, high-order harmonic generation setup, based on a turn-key Ytterbium-doped femtosecond laser source and its second harmonic. Each interferometer arm contains a spatial light modulator, with individual capabilities to manipulate the spatial beam profiles and to stabilize the relative delay between the fundamental and the second harmonic. Additionally, separate control of the relative power and focusing geometries of the two color beams is implemented to conveniently perform automatized scans of multiple parameters. A live diagnostics system gives continuous information during ongoing measurements.

Submitted 20 May, 2024; originally announced May 2024.

Comments: 8 pages, 8 figures

arXiv:2405.10845 [pdf, other]

Natural Language Processing for Requirements Traceability

Authors: Jin L. C. Guo, Jan-Philipp Steghöfer, Andreas Vogelsang, Jane Cleland-Huang

Abstract: Traceability, the ability to trace relevant software artifacts to support reasoning about the quality of the software and its development process, plays a crucial role in requirements and software engineering, particularly for safety-critical systems. In this chapter, we provide a comprehensive overview of the representative tasks in requirement traceability for which natural language processing (NLP) and related techniques have made considerable progress in the past decade. We first present the definition of traceability in the context of requirements and the overall engineering process, as well as other important concepts related to traceability tasks. Then, we discuss two tasks in detail, including trace link recovery and trace link maintenance. We also introduce two other related tasks concerning when trace links are used in practical contexts. For each task, we explain the characteristics of the task, how it can be approached through NLP techniques, and how to design and conduct the experiment to demonstrate the performance of the NLP techniques. We further discuss practical considerations on how to effectively apply NLP techniques and assess their effectiveness regarding the data set collection, the metrics selection, and the role of humans when evaluating the NLP approaches. Overall, this chapter prepares the readers with the fundamental knowledge of designing automated traceability solutions enabled by NLP in practice.

Submitted 17 May, 2024; originally announced May 2024.

Comments: Book Chapter in the Handbook of Natural Language Processing for Requirements Engineering

arXiv:2405.03915 [pdf, other]

doi 10.1145/3643834.3661544

Motivating Users to Attend to Privacy: A Theory-Driven Design Study

Authors: Varun Shiri, Maggie Xiong, Jinghui Cheng, Jin L. C. Guo

Abstract: In modern technology environments, raising users' privacy awareness is crucial. Existing efforts largely focused on privacy policy presentation and failed to systematically address a radical challenge of user motivation for initiating privacy awareness. Leveraging the Protection Motivation Theory (PMT), we proposed design ideas and categories dedicated to motivating users to engage with privacy-related information. Using these design ideas, we created a conceptual prototype, enhancing the current App Store product page. Results from an online experiment and follow-up interviews showed that our design effectively motivated participants to attend to privacy issues, raising both the threat appraisal and coping appraisal, two main factors in PMT. Our work indicated that effective design should consider combining PMT components, calibrating information content, and integrating other design elements, such as visual cues and user familiarity. Overall, our study contributes valuable design considerations driven by the PMT to amplify the motivational aspect of privacy communication.

Submitted 6 May, 2024; originally announced May 2024.

Comments: 18 pages, 2 figures, DIS 2024

arXiv:2405.00739 [pdf, other]

Why does Knowledge Distillation Work? Rethink its Attention and Fidelity Mechanism

Authors: Chenqi Guo, Shiwei Zhong, Xiaofeng Liu, Qianli Feng, Yinglong Ma

Abstract: Does Knowledge Distillation (KD) really work? Conventional wisdom viewed it as a knowledge transfer procedure where a perfect mimicry of the student to its teacher is desired. However, paradoxical studies indicate that closely replicating the teacher's behavior does not consistently improve student generalization, posing questions on its possible causes. Confronted with this gap, we hypothesize that diverse attentions in teachers contribute to better student generalization at the expense of reduced fidelity in ensemble KD setups. By increasing data augmentation strengths, our key findings reveal a decrease in the Intersection over Union (IoU) of attentions between teacher models, leading to reduced student overfitting and decreased fidelity. We propose this low-fidelity phenomenon as an underlying characteristic rather than a pathology when training KD. This suggests that stronger data augmentation fosters a broader perspective provided by the divergent teacher ensemble and lower student-teacher mutual information, benefiting generalization performance. These insights clarify the mechanism of the low-fidelity phenomenon in KD. Thus, we offer new perspectives on optimizing student model performance, by emphasizing increased diversity in teacher attentions and reduced mimicry behavior between teachers and student.

Submitted 29 April, 2024; originally announced May 2024.

arXiv:2404.18630 [pdf, other]

4D-DRESS: A 4D Dataset of Real-world Human Clothing with Semantic Annotations

Authors: Wenbo Wang, Hsuan-I Ho, Chen Guo, Boxiang Rong, Artur Grigorev, Jie Song, Juan Jose Zarate, Otmar Hilliges

Abstract: The studies of human clothing for digital avatars have predominantly relied on synthetic datasets. While easy to collect, synthetic data often fall short in realism and fail to capture authentic clothing dynamics. Addressing this gap, we introduce 4D-DRESS, the first real-world 4D dataset advancing human clothing research with its high-quality 4D textured scans and garment meshes. 4D-DRESS captures 64 outfits in 520 human motion sequences, amounting to 78k textured scans. Creating a real-world clothing dataset is challenging, particularly in annotating and segmenting the extensive and complex 4D human scans. To address this, we develop a semi-automatic 4D human parsing pipeline. We efficiently combine a human-in-the-loop process with automation to accurately label 4D scans in diverse garments and body movements. Leveraging precise annotations and high-quality garment meshes, we establish several benchmarks for clothing simulation and reconstruction. 4D-DRESS offers realistic and challenging data that complements synthetic sources, paving the way for advancements in research of lifelike human clothing. Website: https://ait.ethz.ch/4d-dress.

Submitted 29 April, 2024; originally announced April 2024.

Comments: CVPR 2024 paper, 21 figures, 9 tables

arXiv:2404.18492 [pdf, other]

A new hybrid gadolinium nanoparticles-loaded polymeric material for neutron detection in rare event searches

Authors: DarkSide-20k Collaboration: F. Acerbi, P. Adhikari, P. Agnes, I. Ahmad, S. Albergo, I. F. Albuquerque, T. Alexander, A. K. Alton, P. Amaudruz, M. Angiolilli, E. Aprile, R. Ardito, M. Atzori Corona, D. J. Auty, M. Ave, I. C. Avetisov, O. Azzolini, H. O. Back, Z. Balmforth, A. Barrado Olmedo, P. Barrillon, G. Batignani, P. Bhowmick, et al. (290 additional authors not shown)

Abstract: Experiments aimed at direct searches for WIMP dark matter require highly effective reduction of backgrounds and control of any residual radioactive contamination. In particular, neutrons interacting with atomic nuclei represent an important class of backgrounds due to the expected similarity of a WIMP-nucleon interaction, so that such experiments often feature a dedicated neutron detector surrounding the active target volume. In the context of the development of the DarkSide-20k detector at INFN Gran Sasso National Laboratory (LNGS), several R&D projects were conceived and developed for the creation of a new hybrid material rich in both hydrogen and gadolinium nuclei to be employed as an essential element of the neutron detector. Thanks to its very high cross-section for neutron capture, gadolinium is one of the most widely used elements in neutron detectors, while the hydrogen-rich material is instrumental in efficiently moderating the neutrons. In this paper, results from one of the R&D projects are presented. In this effort, the new hybrid material was obtained as a poly(methyl methacrylate) (PMMA) matrix, loaded with gadolinium oxide in the form of nanoparticles. We describe its realization, including all phases of design, purification, construction, characterization, and determination of mechanical properties of the new material.

Submitted 29 April, 2024; originally announced April 2024.

arXiv:2404.16873 [pdf, other]

AdvPrompter: Fast Adaptive Adversarial Prompting for LLMs

Authors: Anselm Paulus, Arman Zharmagambetov, Chuan Guo, Brandon Amos, Yuandong Tian

Abstract: While recently Large Language Models (LLMs) have achieved remarkable successes, they are vulnerable to certain jailbreaking attacks that lead to generation of inappropriate or harmful content. Manual red-teaming requires finding adversarial prompts that cause such jailbreaking, e.g. by appending a suffix to a given instruction, which is inefficient and time-consuming. On the other hand, automatic adversarial prompt generation often leads to semantically meaningless attacks that can easily be detected by perplexity-based filters, may require gradient information from the TargetLLM, or do not scale well due to time-consuming discrete optimization processes over the token space. In this paper, we present a novel method that uses another LLM, called the AdvPrompter, to generate human-readable adversarial prompts in seconds, $\sim800\times$ faster than existing optimization-based approaches. We train the AdvPrompter using a novel algorithm that does not require access to the gradients of the TargetLLM. This process alternates between two steps: (1) generating high-quality target adversarial suffixes by optimizing the AdvPrompter predictions, and (2) low-rank fine-tuning of the AdvPrompter with the generated adversarial suffixes. The trained AdvPrompter generates suffixes that veil the input instruction without changing its meaning, such that the TargetLLM is lured to give a harmful response. Experimental results on popular open source TargetLLMs show state-of-the-art results on the AdvBench dataset, that also transfer to closed-source black-box LLM APIs. Further, we demonstrate that by fine-tuning on a synthetic dataset generated by AdvPrompter, LLMs can be made more robust against jailbreaking attacks while maintaining performance, i.e. high MMLU scores.

Submitted 21 April, 2024; originally announced April 2024.

Comments: 32 pages, 9 figures, 7 tables

arXiv:2404.16284 [pdf, ps, other]

$L^p$-regularity of a geometrically nonlinear flat Cosserat micropolar model in supercritical dimensions

Authors: Chang-Yu Guo, Chang-Lin Xiang, Ming-Lun Liu

Abstract: In a recent work [Ann. Inst. H. Poincaré C Anal. Non Linéaire 2024], Gastel and Neff introduced an interesting system from a geometrically nonlinear flat Cosserat micropolar model and established interior regularity in the critical dimension. Motivated by this work, in this article, we establish both interior regularity and sharp $L^p$ regularity for their system in supercritical dimensions.

Submitted 24 April, 2024; originally announced April 2024.

Comments: 21 pages

MSC Class: 35B65; 35J47; 35G50

arXiv:2404.14999 [pdf, other]

A Unified Replay-based Continuous Learning Framework for Spatio-Temporal Prediction on Streaming Data

Authors: Hao Miao, Yan Zhao, Chenjuan Guo, Bin Yang, Kai Zheng, Feiteng Huang, Jiandong Xie, Christian S. Jensen

Abstract: The widespread deployment of wireless and mobile devices results in a proliferation of spatio-temporal data that is used in applications, e.g., traffic prediction, human mobility mining, and air quality prediction, where spatio-temporal prediction is often essential to enable safety, predictability, or reliability. Many recent proposals that target deep learning for spatio-temporal prediction suffer from so-called catastrophic forgetting, where previously learned knowledge is entirely forgotten when new data arrives. Such proposals may experience deteriorating prediction performance when applied in settings where data streams into the system. To enable spatio-temporal prediction on streaming data, we propose a unified replay-based continuous learning framework. The framework includes a replay buffer of previously learned samples that are fused with training data using a spatio-temporal mixup mechanism in order to preserve historical knowledge effectively, thus avoiding catastrophic forgetting. To enable holistic representation preservation, the framework also integrates a general spatio-temporal autoencoder with a carefully designed spatio-temporal simple siamese (STSimSiam) network that aims to ensure prediction accuracy and avoid holistic feature loss by means of mutual information maximization. The framework further encompasses five spatio-temporal data augmentation methods to enhance the performance of STSimSiam. Extensive experiments on real data offer insight into the effectiveness of the proposed framework.

Submitted 23 April, 2024; originally announced April 2024.

Comments: Accepted by ICDE 2024

arXiv:2404.14847 [pdf]

High-order harmonic generation from laser induced plasma comprising CdSe/V2O5 Core/Shell quantum dots embedded on MoS2 nanosheets

Authors: Srinivasa Rao Konda, Puspendu Barik, Subshash Singh, Venkatesh Mottamchetty, Amit Srivasthava, Vyacheslav V. Kim, Rashid A. Ganeev, Chunlei Guo, Wei Li

Abstract: Research on the nonlinear optical characteristics of transition metal dichalcogenides in the presence of photoactive particles, plasmonic nanocavities, waveguides, and metamaterials is still in its early stages. This investigation delves into the high-order harmonic generation (HHG) from laser-induced plasma of MoS2 nanosheets in the presence of semiconductor photoactive media such as CdSe and CdSe/V2O5 core/shell quantum dots. Our comprehensive findings shed light on the counteractive coupling impact of both bare and passivated quantum dots on MoS2 nanosheets, as evidenced by the emission of higher-order harmonics. Significantly, the intensity of harmonics and their cut-off were notably enhanced in the MoS2-CdSe and MoS2-V-CdSe configurations compared to pristine MoS2 nanosheets. These advancements hold promise for applications requiring the emission of coherent short-wavelength radiation.

Submitted 23 April, 2024; originally announced April 2024.

Comments: 8 pages, 4 figures

arXiv:2404.14642 [pdf, other]

Uncertainty Quantification on Graph Learning: A Survey

Authors: Chao Chen, Chenghua Guo, Rui Xu, Xiangwen Liao, Xi Zhang, Sihong Xie, Hui Xiong, Philip Yu

Abstract: Graphical models, including Graph Neural Networks (GNNs) and Probabilistic Graphical Models (PGMs), have demonstrated their exceptional capabilities across numerous fields. These models necessitate effective uncertainty quantification to ensure reliable decision-making amid the challenges posed by model training discrepancies and unpredictable testing scenarios. This survey examines recent works that address uncertainty quantification within the model architectures, training, and inference of GNNs and PGMs. We aim to provide an overview of the current landscape of uncertainty in graphical models by organizing the recent methods into uncertainty representation and handling. By summarizing state-of-the-art methods, this survey seeks to deepen the understanding of uncertainty quantification in graphical models, thereby increasing their effectiveness and safety in critical applications.

Submitted 22 April, 2024; originally announced April 2024.

arXiv:2404.13990 [pdf, other]

QCore: Data-Efficient, On-Device Continual Calibration for Quantized Models -- Extended Version

Authors: David Campos, Bin Yang, Tung Kieu, Miao Zhang, Chenjuan Guo, Christian S. Jensen

Abstract: We are witnessing an increasing availability of streaming data that may contain valuable information on the underlying processes. It is thus attractive to be able to deploy machine learning models on edge devices near sensors such that decisions can be made instantaneously, rather than first having to transmit incoming data to servers. To enable deployment on edge devices with limited storage and computational capabilities, the full-precision parameters in standard models can be quantized to use fewer bits. The resulting quantized models are then calibrated using back-propagation and full training data to ensure accuracy. This one-time calibration works for deployments in static environments. However, model deployment in dynamic edge environments calls for continual calibration to adaptively adjust quantized models to fit new incoming data, which may have different distributions. The first difficulty in enabling continual calibration on the edge is that the full training data may be too large and thus not always available on edge devices. The second difficulty is that the use of back-propagation on the edge for repeated calibration is too expensive. We propose QCore to enable continual calibration on the edge. First, it compresses the full training data into a small subset to enable effective calibration of quantized models with different bit-widths. We also propose means of updating the subset when new streaming data arrives to reflect changes in the environment, while not forgetting earlier training data. Second, we propose a small bit-flipping network that works with the subset to update quantized model parameters, thus enabling efficient continual calibration without back-propagation. An experimental study, conducted with real-world data in a continual learning setting, offers insight into the properties of QCore and shows that it is capable of outperforming strong baseline methods.

Submitted 22 April, 2024; originally announced April 2024.

Comments: 15 pages. An extended version of "QCore: Data-Efficient, On-Device Continual Calibration for Quantized Models" accepted at PVLDB 2024

arXiv:2404.12561 [pdf]

Charge transfer mechanism on MoS$_2$ nanosheets in the presence of a semiconductor photoactive media

Authors: Srinivasa Rao Konda, Puspendu Barik, Subshash Singh, Venkatesh Mottamchetty, Amit Srivasthava, Rashid A. Ganeev, Soma Venugopal Rao, Chunlei Guo, Wei Li

Abstract: The studies of the nonlinear optical (NLO) properties of the transition metal dichalcogenides (TMDs) coupled with photoactive particles, plasmonic nanocavities, waveguides, and metamaterials remain in their infancy. This study investigates the third-order NLO properties of MoS$_2$ nanosheets in the presence of a semiconductor photoactive medium. Our extensive studies and the obtained results reveal the counteractive coupling effect of bare and passivated quantum dots on the MoS$_2$ nanosheet, as made evident by the analysis of the NLO processes. The enhanced NLO properties of MoS$_2$ nanosheets functionalized with CdSe and CdSe-V2O5 quantum dots are helpful for applications as saturable absorbers in laser applications and the emission of coherent short-wavelength radiation. The multiphoton-excitation resonance energy transfer mechanism exploiting remote dipole dipole coupling, and ultrafast charge transfer pathways emerges as another plausible way to alter the NLO properties in TMDs.

Submitted 18 April, 2024; originally announced April 2024.

Comments: 16 pages, 4 figures

arXiv:2404.11768 [pdf, ps, other]

Tensor-Networks-based Learning of Probabilistic Cellular Automata Dynamics

Authors: Heitor P. Casagrande, Bo Xing, William J. Munro, Chu Guo, Dario Poletti

Abstract: Algorithms developed to solve many-body quantum problems, like tensor networks, can turn into powerful quantum-inspired tools to tackle problems in the classical domain. In this work, we focus on matrix product operators, a prominent numerical technique to study many-body quantum systems, especially in one dimension. It has been previously shown that such a tool can be used for classification, learning of deterministic sequence-to-sequence processes and of generic quantum processes. We further develop a matrix product operator algorithm to learn probabilistic sequence-to-sequence processes and apply this algorithm to probabilistic cellular automata. This new approach can accurately learn probabilistic cellular automata processes in different conditions, even when the process is a probabilistic mixture of different chaotic rules. In addition, we find that the ability to learn these dynamics is a function of the bit-wise difference between the rules and whether one is much more likely than the other.

Submitted 17 April, 2024; originally announced April 2024.

Comments: 9 pages, 7 figures

arXiv:2404.10969 [pdf, other]

Integrated Communication, Navigation, and Remote Sensing in LEO Networks with Vehicular Applications

Authors: Min Sheng, Chongtao Guo, Lei Huang

Abstract: Traditionally, communication, navigation, and remote sensing (CNR) satellites are separately performed, leading to resource waste, information isolation, and independent optimization for each functionality. Taking future automated driving as an example, it faces great challenges in providing high-reliable and low-latency lane-level positioning, decimeter-level transportation observation, and huge traffic sensing information downloading. To this end, this article proposes an integrated CNR (ICNR) framework based on low earth orbit (LEO) satellite mega-constellations from the perspective of vehicular applications. After introducing the main working principles of the CNR functionalities to serve as the technological basis, we characterize the potentials of the integration gain in vehicular use cases. Then, we investigate the ICNR framework in different integration levels, which sheds strong light on qualitative performance improvement by sophisticatedly sharing orbit constellation, wireless resource, and data information towards meeting the requirements of vehicular applications. We also instantiate a fundamental numerical case study to demonstrate the integration gain and highlight the main tradeoffs in managing the ICNR networks from the perspective of vehicular applications.

Submitted 16 April, 2024; originally announced April 2024.

Comments: This article has been submitted to IEEE Wireless Communications Magazine. It has 8 pages and 5 figures

arXiv:2404.10343 [pdf, other]

The Ninth NTIRE 2024 Efficient Super-Resolution Challenge Report

Authors: Bin Ren, Yawei Li, Nancy Mehta, Radu Timofte, Hongyuan Yu, Cheng Wan, Yuxin Hong, Bingnan Han, Zhuoyuan Wu, Yajun Zou, Yuqing Liu, Jizhe Li, Keji He, Chao Fan, Heng Zhang, Xiaolin Zhang, Xuanwu Yin, Kunlong Zuo, Bohao Liao, Peizhe Xia, Long Peng, Zhibo Du, Xin Di, Wangkai Li, Yang Wang , et al. (109 additional authors not shown)

Abstract: This paper provides a comprehensive review of the NTIRE 2024 challenge, focusing on efficient single-image super-resolution (ESR) solutions and their outcomes. The task of this challenge is to super-resolve an input image with a magnification factor of x4 based on pairs of low and corresponding high-resolution images. The primary objective is to develop networks that optimize various aspects such as runtime, parameters, and FLOPs, while still maintaining a peak signal-to-noise ratio (PSNR) of approximately 26.90 dB on the DIV2K_LSDIR_valid dataset and 26.99 dB on the DIV2K_LSDIR_test dataset. In addition, this challenge has 4 tracks including the main track (overall performance), sub-track 1 (runtime), sub-track 2 (FLOPs), and sub-track 3 (parameters). In the main track, all three metrics (i.e., runtime, FLOPs, and parameter count) were considered. The ranking of the main track is calculated based on a weighted sum-up of the scores of all other sub-tracks. In sub-track 1, the practical runtime performance of the submissions was evaluated, and the corresponding score was used to determine the ranking. In sub-track 2, the number of FLOPs was considered. The score calculated based on the corresponding FLOPs was used to determine the ranking. In sub-track 3, the number of parameters was considered. The score calculated based on the corresponding parameters was used to determine the ranking. RLFN is set as the baseline for efficiency measurement. The challenge had 262 registered participants, and 34 teams made valid submissions. They gauge the state-of-the-art in efficient single-image super-resolution. To facilitate the reproducibility of the challenge and enable other researchers to build upon these findings, the code and the pre-trained model of validated solutions are made publicly available at https://github.com/Amazingren/NTIRE2024_ESR/.

Submitted 25 June, 2024; v1 submitted 16 April, 2024; originally announced April 2024.

Comments: The report paper of NTIRE2024 Efficient Super-resolution, accepted by CVPRW2024

arXiv:2404.05410 [pdf, other]

Solving quantum impurity problems on the L-shaped Kadanoff-Baym contour

Authors: Ruofan Chen, Chu Guo

Abstract: The path integral formalism is the building block of many powerful numerical methods for quantum impurity problems. However, existing fermionic path integral based numerical calculations have only been performed in either the imaginary-time or the real-time axis, while the most generic scenario formulated on the L-shaped Kadanoff-Baym contour is left unexplored. In this work, we extended the recently developed Grassmann time-evolving matrix product operator (GTEMPO) method to solve quantum impurity problems directly on the Kadanoff-Baym contour. The resulting method is numerically exact, with only two sources of numerical errors, e.g., the time discretization error and the matrix product state bond truncation error, which can both be well controlled. The accuracy of this method is numerically demonstrated against exact solutions in the noninteracting case, and against existing calculations on the real- and imaginary-time axes for the single-orbital Anderson impurity model. Our method is a perfect benchmarking baseline for its alternatives which often employ less-controlled approximations, and can also be used as a real-time impurity solver in dynamical mean field theory and its non-equilibrium extension.

Submitted 25 April, 2024; v1 submitted 8 April, 2024; originally announced April 2024.

Comments: 9 pages, 4 figures

arXiv:2404.04757 [pdf, other]

Infinite Grassmann time-evolving matrix product operator method for zero-temperature equilibrium quantum impurity problems

Authors: Chu Guo, Ruofan Chen

Abstract: The Grassmann time-evolving matrix product operator (GTEMPO) method has proven to be an accurate and efficient numerical method for the real-time dynamics of quantum impurity problems. Whereas its application for imaginary-time calculations is much less competitive compared to well-established methods such as the continuous-time quantum Monte Carlo (CTQMC). In this work, we unleash the full power of GTEMPO for zero-temperature imaginary-time calculations: the multi-time impurity state is time-translationally invariant with infinite boundary condition, therefore it can be represented as an infinite Grassmann matrix product state (GMPS) with nontrivial unit cell in a single time step, instead of an open boundary GMPS spanning the whole imaginary-time axis. We devise a very efficient infinite GTEMPO algorithm targeted at zero-temperature equilibrium quantum impurity problems, which is known to be a hard regime for quantum Monte Carlo methods. To demonstrate the performance of our method, we benchmark it against exact solutions in the noninteracting limit, and against CTQMC calculations in the Anderson impurity models with up to two orbitals, where we show that the required bond dimension of the infinite GMPS is much smaller than its finite-temperature counterpart.

Submitted 6 April, 2024; originally announced April 2024.

Comments: 9 pages, 7 figures

arXiv:2404.02866 [pdf, other]

Guarantees of confidentiality via Hammersley-Chapman-Robbins bounds

Authors: Kamalika Chaudhuri, Chuan Guo, Laurens van der Maaten, Saeed Mahloujifar, Mark Tygert

Abstract: Protecting privacy during inference with deep neural networks is possible by adding noise to the activations in the last layers prior to the final classifiers or other task-specific layers. The activations in such layers are known as "features" (or, less commonly, as "embeddings" or "feature embeddings"). The added noise helps prevent reconstruction of the inputs from the noisy features. Lower bounding the variance of every possible unbiased estimator of the inputs quantifies the confidentiality arising from such added noise. Convenient, computationally tractable bounds are available from classic inequalities of Hammersley and of Chapman and Robbins -- the HCR bounds. Numerical experiments indicate that the HCR bounds are on the precipice of being effectual for small neural nets with the data sets, "MNIST" and "CIFAR-10," which contain 10 classes each for image classification. The HCR bounds appear to be insufficient on their own to guarantee confidentiality of the inputs to inference with standard deep neural nets, "ResNet-18" and "Swin-T," pre-trained on the data set, "ImageNet-1000," which contains 1000 classes. Supplementing the addition of noise to features with other methods for providing confidentiality may be warranted in the case of ImageNet. In all cases, the results reported here limit consideration to amounts of added noise that incur little degradation in the accuracy of classification from the noisy features. Thus, the added noise enhances confidentiality without much reduction in the accuracy on the task of image classification.

Submitted 17 June, 2024; v1 submitted 3 April, 2024; originally announced April 2024.

Comments: 18 pages, 6 figures

arXiv:2403.20150 [pdf, other]

TFB: Towards Comprehensive and Fair Benchmarking of Time Series Forecasting Methods

Authors: Xiangfei Qiu, Jilin Hu, Lekui Zhou, Xingjian Wu, Junyang Du, Buang Zhang, Chenjuan Guo, Aoying Zhou, Christian S. Jensen, Zhenli Sheng, Bin Yang

Abstract: Time series are generated in diverse domains such as economic, traffic, health, and energy, where forecasting of future values has numerous important applications. Not surprisingly, many forecasting methods are being proposed. To ensure progress, it is essential to be able to study and compare such methods empirically in a comprehensive and reliable manner. To achieve this, we propose TFB, an automated benchmark for Time Series Forecasting (TSF) methods. TFB advances the state-of-the-art by addressing shortcomings related to datasets, comparison methods, and evaluation pipelines: 1) insufficient coverage of data domains, 2) stereotype bias against traditional methods, and 3) inconsistent and inflexible pipelines. To achieve better domain coverage, we include datasets from 10 different domains: traffic, electricity, energy, the environment, nature, economic, stock markets, banking, health, and the web. We also provide a time series characterization to ensure that the selected datasets are comprehensive. To remove biases against some methods, we include a diverse range of methods, including statistical learning, machine learning, and deep learning methods, and we also support a variety of evaluation strategies and metrics to ensure a more comprehensive evaluation of different methods. To support the integration of different methods into the benchmark and enable fair comparisons, TFB features a flexible and scalable pipeline that eliminates biases. Next, we employ TFB to perform a thorough evaluation of 21 Univariate Time Series Forecasting (UTSF) methods on 8,068 univariate time series and 14 Multivariate Time Series Forecasting (MTSF) methods on 25 datasets. The benchmark code and data are available at https://github.com/decisionintelligence/TFB.

Submitted 18 June, 2024; v1 submitted 29 March, 2024; originally announced March 2024.

Comments: Directly accepted by PVLDB 2024

arXiv:2403.17425 [pdf, other]

doi 10.1145/3583780.3614697

Masked Multi-Domain Network: Multi-Type and Multi-Scenario Conversion Rate Prediction with a Single Model

Authors: Wentao Ouyang, Xiuwu Zhang, Chaofeng Guo, Shukui Ren, Yupei Sui, Kun Zhang, Jinmei Luo, Yunfeng Chen, Dongbo Xu, Xiangzheng Liu, Yanlong Du

Abstract: In real-world advertising systems, conversions have different types in nature and ads can be shown in different display scenarios, both of which highly impact the actual conversion rate (CVR). This results in the multi-type and multi-scenario CVR prediction problem. A desired model for this problem should satisfy the following requirements: 1) Accuracy: the model should achieve fine-grained accuracy with respect to any conversion type in any display scenario. 2) Scalability: the model parameter size should be affordable. 3) Convenience: the model should not require a large amount of effort in data partitioning, subset processing and separate storage. Existing approaches cannot simultaneously satisfy these requirements. For example, building a separate model for each (conversion type, display scenario) pair is neither scalable nor convenient. Building a unified model trained on all the data with conversion type and display scenario included as two features is not accurate enough. In this paper, we propose the Masked Multi-domain Network (MMN) to solve this problem. To achieve the accuracy requirement, we model domain-specific parameters and propose a dynamically weighted loss to account for the loss scale imbalance issue within each mini-batch. To achieve the scalability requirement, we propose a parameter sharing and composition strategy to reduce model parameters from a product space to a sum space. To achieve the convenience requirement, we propose an auto-masking strategy which can take mixed data from all the domains as input. It avoids the overhead caused by data partitioning, individual processing and separate storage. Both offline and online experimental results validate the superiority of MMN for multi-type and multi-scenario CVR prediction. MMN is now the serving model for real-time CVR prediction in UC Toutiao.

Submitted 26 March, 2024; originally announced March 2024.

Comments: CIKM 2023 (larger figures)

arXiv:2403.16700 [pdf, other]

Infinite Grassmann Time-Evolving Matrix Product Operator Method in the Steady State

Authors: Chu Guo, Ruofan Chen

Abstract: We present an infinite Grassmann time-evolving matrix product operator method for quantum impurity problems, which directly works in the steady state. The method embraces the well-established infinite matrix product state algorithms with the recently developed GTEMPO method, and benefits from both sides: it obtains numerically exact real-time Green's functions without sampling noises and bath discretization error, it is applicable for any temperature without the sign problem, its computational cost is independent of the transient dynamics and does not scale with the number of baths. We benchmark the method on the finite-temperature equilibrium Green's function in the noninteracting limit against exact solutions and in the single-orbital Anderson impurity model against GTEMPO calculations. We also study the zero-temperature non-equilibrium steady state of an impurity coupled to two baths with a voltage bias, obtaining consistent particle currents with existing calculations. The method is ideal for studying steady-state quantum transport, and can be readily used as an efficient real-time impurity solver in the dynamical mean field theory and its non-equilibrium extension.

Submitted 31 March, 2024; v1 submitted 25 March, 2024; originally announced March 2024.

Comments: 6 pages, 4 figures

arXiv:2403.16538 [pdf]

Broadband and fabrication-tolerant 3-dB couplers with topological valley edge modes

Authors: Guo-Jing Tang, Xiao-Dong Chen, Lu Sun, Chao-Heng Guo, Meng-Yu Li, Zhong-Tao Tian, Hou-Hong Chen, Hong-Wei Wang, Qi-Yao Sun, Ying-Di Pan, Xin-Tao He, Yi-Kai Su, Jian-Wen Dong

Abstract: 3-dB couplers, which are commonly used in photonic integrated circuits for on-chip information processing, precision measurement, and quantum computing, face challenges in achieving robust performance due to their limited 3-dB bandwidths and sensitivity to fabrication errors. To address this, we introduce topological physics to nanophotonics, developing a framework for topological 3-dB couplers. These couplers exhibit broad working wavelength range and robustness against fabrication dimensional errors. By leveraging valley-Hall topology and mirror symmetry, the photonic-crystal-slab couplers achieve ideal 3-dB splitting characterized by a wavelength-insensitive scattering matrix. Tolerance analysis confirms the superiority on broad bandwidth of 48 nm and robust splitting against dimensional errors of 20 nm. We further propose a topological interferometer for on-chip distance measurement, which also exhibits robustness against dimensional errors. This extension of topological principles to the fields of interferometers, may open up new possibilities for constructing robust wavelength division multiplexing, temperature-drift-insensitive sensing, and optical coherence tomography applications.

Submitted 25 March, 2024; originally announced March 2024.

Comments: 20 pages, 4 figures

arXiv:2403.14421 [pdf, other]

DP-RDM: Adapting Diffusion Models to Private Domains Without Fine-Tuning

Authors: Jonathan Lebensold, Maziar Sanjabi, Pietro Astolfi, Adriana Romero-Soriano, Kamalika Chaudhuri, Mike Rabbat, Chuan Guo

Abstract: Text-to-image diffusion models have been shown to suffer from sample-level memorization, possibly reproducing near-perfect replica of images that they are trained on, which may be undesirable. To remedy this issue, we develop the first differentially private (DP) retrieval-augmented generation algorithm that is capable of generating high-quality image samples while providing provable privacy guarantees. Specifically, we assume access to a text-to-image diffusion model trained on a small amount of public data, and design a DP retrieval mechanism to augment the text prompt with samples retrieved from a private retrieval dataset. Our \emph{differentially private retrieval-augmented diffusion model} (DP-RDM) requires no fine-tuning on the retrieval dataset to adapt to another domain, and can use state-of-the-art generative models to generate high-quality image samples while satisfying rigorous DP guarantees. For instance, when evaluated on MS-COCO, our DP-RDM can generate samples with a privacy budget of $ε=10$, while providing a $3.5$ point improvement in FID compared to public-only retrieval for up to $10,000$ queries.

Submitted 13 May, 2024; v1 submitted 21 March, 2024; originally announced March 2024.

arXiv:2403.14117 [pdf, other]

doi 10.1145/3613904.3642697

A Design Space for Intelligent and Interactive Writing Assistants

Authors: Mina Lee, Katy Ilonka Gero, John Joon Young Chung, Simon Buckingham Shum, Vipul Raheja, Hua Shen, Subhashini Venugopalan, Thiemo Wambsganss, David Zhou, Emad A. Alghamdi, Tal August, Avinash Bhat, Madiha Zahrah Choksi, Senjuti Dutta, Jin L. C. Guo, Md Naimul Hoque, Yewon Kim, Simon Knight, Seyed Parsa Neshaei, Agnia Sergeyuk, Antonette Shibani, Disha Shrivastava, Lila Shroff, Jessi Stark, Sarah Sterman , et al. (11 additional authors not shown)

Abstract: In our era of rapid technological advancement, the research landscape for writing assistants has become increasingly fragmented across various research communities. We seek to address this challenge by proposing a design space as a structured way to examine and explore the multidimensional space of intelligent and interactive writing assistants. Through a large community collaboration, we explore five aspects of writing assistants: task, user, technology, interaction, and ecosystem. Within each aspect, we define dimensions (i.e., fundamental components of an aspect) and codes (i.e., potential options for each dimension) by systematically reviewing 115 papers. Our design space aims to offer researchers and designers a practical tool to navigate, comprehend, and compare the various possibilities of writing assistants, and aid in the envisioning and design of new writing assistants.

Submitted 26 March, 2024; v1 submitted 21 March, 2024; originally announced March 2024.

Comments: Published as a conference paper at CHI 2024

arXiv:2403.12414 [pdf, other]

Development of low-radon ultra-pure water for the Jiangmen Underground Neutrino Observatory

Authors: T. Y. Guan, Y. P. Zhang, B. Wang, C. Guo, J. C. Liu, Q. Tang, C. G. Yang, C. Li

Abstract: The Jiangmen Underground Neutrino Observatory (JUNO) is a state-of-the-art liquid scintillator-based neutrino physics experiment under construction in South China. To reduce the background from external radioactivity, a water Cherenkov detector composed of 35 kton of ultra-pure water and 2,400 20-inch photomultiplier tubes is being developed. Even after specialized treatment, ultra-pure water still contains trace levels of radioactive elements that can contribute to the detector background, among which $^{222}$Rn is particularly significant. To address this, an online radon removal system based on the JUNO prototype has been developed. By integrating micro-bubble generators to enhance the degasser's radon removal efficiency, the radon concentration in water can be reduced to the 1 mBq/m$^{3}$ level, meeting the stringent requirements of JUNO. Additionally, a highly sensitive online radon concentration measurement system capable of detecting concentrations of $\sim$1 mBq/m$^3$ has been developed to monitor the radon concentration in water. In this paper, the details of both systems are presented.

Submitted 18 March, 2024; originally announced March 2024.

Comments: 20 pages, 13 figures

arXiv:2403.09996 [pdf, other]

MEDPNet: Achieving High-Precision Adaptive Registration for Complex Die Castings

Authors: Yu Du, Yu Song, Ce Guo, Xiaojing Tian, Dong Liu, Ming Cong

Abstract: Due to their complex spatial structure and diverse geometric features, achieving high-precision and robust point cloud registration for complex die castings has been a significant challenge in the die-casting industry. Existing point cloud registration methods primarily optimize network models using well-established high-quality datasets, often neglecting practical application in real scenarios. To address this gap, this paper proposes a high-precision adaptive registration method called Multiscale Efficient Deep Closest Point (MEDPNet) and introduces a die-casting point cloud dataset, DieCastCloud, specifically designed to tackle the challenges of point cloud registration in the die-casting industry. The MEDPNet method performs coarse die-casting point cloud data registration using the Efficient-DCP method, followed by precision registration using the Multiscale feature fusion dual-channel registration (MDR) method. We enhance the modeling capability and computational efficiency of the model by replacing the attention mechanism of the Transformer in DCP with Efficient Attention and implementing a collaborative scale mechanism through the combination of serial and parallel blocks. Additionally, we propose the MDR method, which utilizes multilayer perceptrons (MLP), Normal Distributions Transform (NDT), and Iterative Closest Point (ICP) to achieve learnable adaptive fusion, enabling high-precision, scalable, and noise-resistant global point cloud registration. Our proposed method demonstrates excellent performance compared to state-of-the-art geometric and learning-based registration methods when applied to complex die-casting point cloud data.

Submitted 14 March, 2024; originally announced March 2024.

arXiv:2403.09464 [pdf, other]

doi 10.1051/0004-6361/202348460

New constraints on Triton's atmosphere from the 6 October 2022 stellar occultation

Authors: Ye Yuan, Chen Zhang, Fan Li, Jian Chen, Yanning Fu, Chunhai Bai, Xing Gao, Yong Wang, Tuhong Zhong, Yixing Gao, Liang Wang, Donghua Chen, Yixing Zhang, Yang Zhang, Wenpeng Xie, Shupi Zhang, Ding Liu, Jun Cao, Xiangdong Yin, Xiaojun Mo, Jing Liu, Xinru Han, Tong Liu, Yuqiang Chen, Zhendong Gao , et al. (25 additional authors not shown)

Abstract: The atmosphere of Triton was probed directly by observing a ground-based stellar occultation on 6 October 2022. This rare event yielded 23 positive light curves collected from 13 separate observation stations contributing to our campaign. The significance of this event lies in its potential to directly validate the modest pressure fluctuation on Triton, a phenomenon not definitively verified by previous observations, including only five stellar occultations, and the Voyager 2 radio occultation in 1989. Using an approach consistent with a comparable study, we precisely determined a surface pressure of $14.07_{-0.13}^{+0.21}~\mathrm{μbar}$ in 2022. This new pressure rules out any significant monotonic variation in pressure between 2017 and 2022 through direct observations, as it is in alignment with the 2017 value. Additionally, both the pressures in 2017 and 2022 align with the 1989 value. This provides further support for the conclusion drawn from the previous volatile transport model simulation, which is consistent with the observed alignment between the pressures in 1989 and 2017; that is to say, the pressure fluctuation is modest. Moreover, this conclusion suggests the existence of a northern polar cap extended down to at least $45^\circ$N$-60^\circ$N and the presence of nitrogen between $30^\circ$S and $0^\circ$.

Submitted 24 March, 2024; v1 submitted 14 March, 2024; originally announced March 2024.

Comments: Astronomy & Astrophysics, in press. 9 pages, 2 figures, 3 tables

Journal ref: A&A 684, L13 (2024)

Showing 1–50 of 908 results for author: Guo, C