subscribe to arXiv mailings

Instruct-ReID++: Towards Universal Purpose Instruction-Guided Person Re-identification

Authors: Weizhen He, Yiheng Deng, Yunfeng Yan, Feng Zhu, Yizhou Wang, Lei Bai, Qingsong Xie, Donglian Qi, Wanli Ouyang, Shixiang Tang

Abstract: Human intelligence can retrieve any person according to both visual and language descriptions. However, the current computer vision community studies specific person re-identification (ReID) tasks in different scenarios separately, which limits the applications in the real world. This paper strives to resolve this problem by proposing a novel instruct-ReID task that requires the model to retrieve… ▽ More Human intelligence can retrieve any person according to both visual and language descriptions. However, the current computer vision community studies specific person re-identification (ReID) tasks in different scenarios separately, which limits the applications in the real world. This paper strives to resolve this problem by proposing a novel instruct-ReID task that requires the model to retrieve images according to the given image or language instructions. Instruct-ReID is the first exploration of a general ReID setting, where existing 6 ReID tasks can be viewed as special cases by assigning different instructions. To facilitate research in this new instruct-ReID task, we propose a large-scale OmniReID++ benchmark equipped with diverse data and comprehensive evaluation methods e.g., task specific and task-free evaluation settings. In the task-specific evaluation setting, gallery sets are categorized according to specific ReID tasks. We propose a novel baseline model, IRM, with an adaptive triplet loss to handle various retrieval tasks within a unified framework. For task-free evaluation setting, where target person images are retrieved from task-agnostic gallery sets, we further propose a new method called IRM++ with novel memory bank-assisted learning. Extensive evaluations of IRM and IRM++ on OmniReID++ benchmark demonstrate the superiority of our proposed methods, achieving state-of-the-art performance on 10 test sets. The datasets, the model, and the code will be available at https://github.com/hwz-zju/Instruct-ReID △ Less

Submitted 27 May, 2024; originally announced May 2024.

Comments: arXiv admin note: substantial text overlap with arXiv:2306.07520

arXiv:2405.17474 [pdf, other]

Federated Offline Policy Optimization with Dual Regularization

Authors: Sheng Yue, Zerui Qin, Xingyuan Hua, Yongheng Deng, Ju Ren

Abstract: Federated Reinforcement Learning (FRL) has been deemed as a promising solution for intelligent decision-making in the era of Artificial Internet of Things. However, existing FRL approaches often entail repeated interactions with the environment during local updating, which can be prohibitively expensive or even infeasible in many real-world domains. To overcome this challenge, this paper proposes… ▽ More Federated Reinforcement Learning (FRL) has been deemed as a promising solution for intelligent decision-making in the era of Artificial Internet of Things. However, existing FRL approaches often entail repeated interactions with the environment during local updating, which can be prohibitively expensive or even infeasible in many real-world domains. To overcome this challenge, this paper proposes a novel offline federated policy optimization algorithm, named $\texttt{DRPO}$, which enables distributed agents to collaboratively learn a decision policy only from private and static data without further environmental interactions. $\texttt{DRPO}$ leverages dual regularization, incorporating both the local behavioral policy and the global aggregated policy, to judiciously cope with the intrinsic two-tier distributional shifts in offline FRL. Theoretical analysis characterizes the impact of the dual regularization on performance, demonstrating that by achieving the right balance thereof, $\texttt{DRPO}$ can effectively counteract distributional shifts and ensure strict policy improvement in each federative learning round. Extensive experiments validate the significant performance gains of $\texttt{DRPO}$ over baseline methods. △ Less

Submitted 28 May, 2024; v1 submitted 24 May, 2024; originally announced May 2024.

Comments: IEEE International Conference on Computer Communications (INFOCOM)

arXiv:2405.17457 [pdf, other]

Data-Free Federated Class Incremental Learning with Diffusion-Based Generative Memory

Authors: Naibo Wang, Yuchen Deng, Wenjie Feng, Jianwei Yin, See-Kiong Ng

Abstract: Federated Class Incremental Learning (FCIL) is a critical yet largely underexplored issue that deals with the dynamic incorporation of new classes within federated learning (FL). Existing methods often employ generative adversarial networks (GANs) to produce synthetic images to address privacy concerns in FL. However, GANs exhibit inherent instability and high sensitivity, compromising the effecti… ▽ More Federated Class Incremental Learning (FCIL) is a critical yet largely underexplored issue that deals with the dynamic incorporation of new classes within federated learning (FL). Existing methods often employ generative adversarial networks (GANs) to produce synthetic images to address privacy concerns in FL. However, GANs exhibit inherent instability and high sensitivity, compromising the effectiveness of these methods. In this paper, we introduce a novel data-free federated class incremental learning framework with diffusion-based generative memory (DFedDGM) to mitigate catastrophic forgetting by generating stable, high-quality images through diffusion models. We design a new balanced sampler to help train the diffusion models to alleviate the common non-IID problem in FL, and introduce an entropy-based sample filtering technique from an information theory perspective to enhance the quality of generative samples. Finally, we integrate knowledge distillation with a feature-based regularization term for better knowledge transfer. Our framework does not incur additional communication costs compared to the baseline FedAvg method. Extensive experiments across multiple datasets demonstrate that our method significantly outperforms existing baselines, e.g., over a 4% improvement in average accuracy on the Tiny-ImageNet dataset. △ Less

Submitted 22 May, 2024; originally announced May 2024.

arXiv:2405.16780 [pdf, other]

Analysis of Broken Randomized Experiments by Principal Stratification

Authors: Qinqing Liu, Xiang Peng, Tao Zhang, Yuhao Deng

Abstract: Although randomized controlled trials have long been regarded as the ``gold standard'' for evaluating treatment effects, there is no natural prevention from post-treatment events. For example, non-compliance makes the actual treatment different from the assigned treatment, truncation-by-death renders the outcome undefined or ill-defined, and missingness prevents the outcomes from being measured. I… ▽ More Although randomized controlled trials have long been regarded as the ``gold standard'' for evaluating treatment effects, there is no natural prevention from post-treatment events. For example, non-compliance makes the actual treatment different from the assigned treatment, truncation-by-death renders the outcome undefined or ill-defined, and missingness prevents the outcomes from being measured. In this paper, we develop a statistical analysis framework using principal stratification to investigate the treatment effect in broken randomized experiments. The average treatment effect in compliers and always-survivors is adopted as the target causal estimand. We establish the asymptotic property for the estimator. We apply the framework to study the effect of training on earnings in the Job Corps Study and find that the training program does not have an effect on employment but possibly have an effect on improving the earnings after employment. △ Less

Submitted 26 May, 2024; originally announced May 2024.

arXiv:2405.16741 [pdf, other]

doi 10.1007/s11207-024-02311-0

A Study on Magnetic-sensitivity Wavelength Position of the Working Line Used by the Full-Disk Magnetograph onboard the Advanced Space based Solar Observatory (ASO-S/FMG)

Authors: S. Liu, J. T. Su, X. Y. Bai, Y. Y. Deng, J. Chen, Y. L. Song, X. F. Wang, H. Q. Xu, X. Yang, Shahid Idrees

Abstract: Utilizing data from the $Solar$ $Magnetism$ and $Activity$ $Telescope$ (SMAT), analytical solutions of polarized radiative transfer equations, and in-orbit test data from the Full-disk Magnetograph (FMG) onboard the Advanced Space based Solar Observatory (ASO-S), this study reveals the magnetic-sensitivity spectral positions for the Fe {\sc i} $λ$5234.19 A, working line used by FMG. From the exper… ▽ More Utilizing data from the $Solar$ $Magnetism$ and $Activity$ $Telescope$ (SMAT), analytical solutions of polarized radiative transfer equations, and in-orbit test data from the Full-disk Magnetograph (FMG) onboard the Advanced Space based Solar Observatory (ASO-S), this study reveals the magnetic-sensitivity spectral positions for the Fe {\sc i} $λ$5234.19 A, working line used by FMG. From the experimental data of SMAT, it is found that the most sensitivity position is located at the line center for linear polarization (Stokes-Q/U), while it is about -0.07 A away from the line center for circular polarization (Stokes-V). Moreover, both the theoretical analysis and the in-orbit test data analysis of FMG prove again the above results. Additionally, the theoretical analysis suggests the presence of distinct spectral pockets (centered at 0.08-0.15 A) from the line, harboring intense magnetic sensitivity across all three Stokes parameters. Striking a balance between high sensitivity for both linear and circular polarization while capturing additional valuable information, a spectral position of -0.08 A emerges as the champion for routine FMG magnetic-field observations. △ Less

Submitted 26 May, 2024; originally announced May 2024.

Comments: 12pages,8figures

Journal ref: Solar Physics, 2024,May

arXiv:2405.16432 [pdf, other]

Revealing the hidden Dirac gap in a topological antiferromagnet using Floquet-Bloch manipulation

Authors: Nina Bielinski, Rajas Chari, Julian May-Mann, Soyeun Kim, Jack Zwettler, Yujun Deng, Anuva Aishwarya, Subhajit Roychowdhury, Chandra Shekhar, Makoto Hashimoto, Donghui Lu, Jiaqiang Yan, Claudia Felser, Vidya Madhavan, Zhi-Xun Shen, Taylor L. Hughes, Fahad Mahmood

Abstract: Manipulating solids using the time-periodic drive of a laser pulse is a promising route to generate new phases of matter. Whether such `Floquet-Bloch' manipulation can be achieved in topological magnetic systems with disorder has so far been unclear. In this work, we realize Floquet-Bloch manipulation of the Dirac surface-state mass of the topological antiferromagnet (AFM) MnBi$_2$Te$_4$. Using ti… ▽ More Manipulating solids using the time-periodic drive of a laser pulse is a promising route to generate new phases of matter. Whether such `Floquet-Bloch' manipulation can be achieved in topological magnetic systems with disorder has so far been unclear. In this work, we realize Floquet-Bloch manipulation of the Dirac surface-state mass of the topological antiferromagnet (AFM) MnBi$_2$Te$_4$. Using time- and angle-resolved photoemission spectroscopy (tr-ARPES), we show that opposite helicities of mid-infrared circularly polarized light result in substantially different Dirac mass gaps in the AFM phase, despite the equilibrium Dirac cone being massless. We explain our findings in terms of a Dirac fermion with a random mass. Our results underscore Floquet-Bloch manipulation as a powerful tool for controlling topology even in the presence of disorder, and for uncovering properties of materials that may elude conventional probes. △ Less

Submitted 26 May, 2024; originally announced May 2024.

arXiv:2405.16425 [pdf, other]

Dipolar bosons in a twisted bilayer geometry

Authors: Chao Zhang, Zhijie Fan, Barbara Capogrosso-Sansone, Youjin Deng

Abstract: In recent years, twisted bilayer systems such as bilayer graphene have attracted a great deal of attention as the twist angle introduces a degree of freedom which can be used to non-trivially modify system properties. This idea has been picked up in the cold atom community, first with a theoretical proposal to simulate twisted bilayers in state-dependent optical lattices, and, more recently, with… ▽ More In recent years, twisted bilayer systems such as bilayer graphene have attracted a great deal of attention as the twist angle introduces a degree of freedom which can be used to non-trivially modify system properties. This idea has been picked up in the cold atom community, first with a theoretical proposal to simulate twisted bilayers in state-dependent optical lattices, and, more recently, with an experimental realization of twisted bilayers with bosonic atoms in two different spin states. In this manuscript, we theoretically investigate dipolar bosons in a twisted bilayer geometry. The interplay between dipolar interaction and the twist between the layers results in the emergence of quantum states not observed in the absence of twist. We study how system properties vary as we change the twist angle at fixed distance between the layers and fixed dipolar interaction. We find that at a twist angle $θ=0.1^{\circ}$, the observed quantum phases are consistent with those seen in the absence of twist angle, i.e. paired superfluid, paired supersolid, and paired solid phases. However, a slight increase in the twist angle to $θ=0.2^{\circ}$ disrupts these paired phases in favor of a phase separation between checkerboard solid and superfluid regions. Notably, at a twist angle of $θ=5.21^{\circ}$, the local occupation number follows the moiré pattern of the underlying moiré bilayers so that a periodic structure of insulating islands is formed. These insulating islands are surrounded by a superfluid. △ Less

Submitted 26 May, 2024; originally announced May 2024.

Comments: 6 pages, 4 figures

arXiv:2405.14838 [pdf, other]

From Explicit CoT to Implicit CoT: Learning to Internalize CoT Step by Step

Authors: Yuntian Deng, Yejin Choi, Stuart Shieber

Abstract: When leveraging language models for reasoning tasks, generating explicit chain-of-thought (CoT) steps often proves essential for achieving high accuracy in final outputs. In this paper, we investigate if models can be taught to internalize these CoT steps. To this end, we propose a simple yet effective method for internalizing CoT steps: starting with a model trained for explicit CoT reasoning, we… ▽ More When leveraging language models for reasoning tasks, generating explicit chain-of-thought (CoT) steps often proves essential for achieving high accuracy in final outputs. In this paper, we investigate if models can be taught to internalize these CoT steps. To this end, we propose a simple yet effective method for internalizing CoT steps: starting with a model trained for explicit CoT reasoning, we gradually remove the intermediate steps and finetune the model. This process allows the model to internalize the intermediate reasoning steps, thus simplifying the reasoning process while maintaining high performance. Our approach enables a GPT-2 Small model to solve 9-by-9 multiplication with up to 99% accuracy, whereas standard training cannot solve beyond 4-by-4 multiplication. Furthermore, our method proves effective on larger language models, such as Mistral 7B, achieving over 50% accuracy on GSM8K without producing any intermediate steps. △ Less

Submitted 23 May, 2024; originally announced May 2024.

arXiv:2405.14620 [pdf, other]

Closed-form Symbolic Solutions: A New Perspective on Solving Partial Differential Equations

Authors: Shu Wei, Yanjie Li, Lina Yu, Min Wu, Weijun Li, Meilan Hao, Wenqiang Li, Jingyi Liu, Yusong Deng

Abstract: Solving partial differential equations (PDEs) in Euclidean space with closed-form symbolic solutions has long been a dream for mathematicians. Inspired by deep learning, Physics-Informed Neural Networks (PINNs) have shown great promise in numerically solving PDEs. However, since PINNs essentially approximate solutions within the continuous function space, their numerical solutions fall short in bo… ▽ More Solving partial differential equations (PDEs) in Euclidean space with closed-form symbolic solutions has long been a dream for mathematicians. Inspired by deep learning, Physics-Informed Neural Networks (PINNs) have shown great promise in numerically solving PDEs. However, since PINNs essentially approximate solutions within the continuous function space, their numerical solutions fall short in both precision and interpretability compared to symbolic solutions. This paper proposes a novel framework: a closed-form \textbf{Sym}bolic framework for \textbf{PDE}s (SymPDE), exploring the use of deep reinforcement learning to directly obtain symbolic solutions for PDEs. SymPDE alleviates the challenges PINNs face in fitting high-frequency and steeply changing functions. To our knowledge, no prior work has implemented this approach. Experiments on solving the Poisson's equation and heat equation in time-independent and spatiotemporal dynamical systems respectively demonstrate that SymPDE can provide accurate closed-form symbolic solutions for various types of PDEs. △ Less

Submitted 23 May, 2024; originally announced May 2024.

arXiv:2405.13820 [pdf, other]

Towards Comprehensive and Efficient Post Safety Alignment of Large Language Models via Safety Patching

Authors: Weixiang Zhao, Yulin Hu, Zhuojun Li, Yang Deng, Yanyan Zhao, Bing Qin, Tat-Seng Chua

Abstract: Safety alignment of large language models (LLMs) has been gaining increasing attention. However, current safety-aligned LLMs suffer from the fragile and imbalanced safety mechanisms, which can still be induced to generate unsafe responses, exhibit over-safety by rejecting safe user inputs, and fail to preserve general utility after safety alignment. To this end, we propose a novel post safety alig… ▽ More Safety alignment of large language models (LLMs) has been gaining increasing attention. However, current safety-aligned LLMs suffer from the fragile and imbalanced safety mechanisms, which can still be induced to generate unsafe responses, exhibit over-safety by rejecting safe user inputs, and fail to preserve general utility after safety alignment. To this end, we propose a novel post safety alignment (PSA) method to address these inherent and emerging safety challenges, including safety enhancement, over-safety mitigation, and utility preservation. In specific, we introduce \textsc{SafePatching}, a novel framework for comprehensive and efficient PSA, where two distinct safety patches are developed on the harmful data to enhance safety and mitigate over-safety concerns, and then seamlessly integrated into the target LLM backbone without compromising its utility. Extensive experiments show that \textsc{SafePatching} achieves a more comprehensive and efficient PSA than baseline methods. It even enhances the utility of the backbone, further optimizing the balance between being helpful and harmless in current aligned LLMs. Also, \textsc{SafePatching} demonstrates its superiority in continual PSA scenarios. △ Less

Submitted 22 May, 2024; originally announced May 2024.

Comments: 24 pages, 8 figures and 12 tables

arXiv:2405.13315 [pdf, other]

Study of the decays $χ_{cJ}\toΛ\barΛω$

Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere , et al. (638 additional authors not shown)

Abstract: Using $(27.12\pm 0.14)\times10^{8}$ $ψ(3686)$ events collected with the BESIII detector, we present the first observation of the decays $χ_{cJ}\toΛ\barΛω$, where $J=0, 1, 2$, with statistical significances of $11.7 σ, 11.2 σ$, and $11.8 σ$. The branching fractions of these decays are determined to be $\mathcal{B}(χ_{c0}\toΛ\barΛω)=({2.37 \pm 0.22 \pm 0.23}) \times 10^{-4}$,… ▽ More Using $(27.12\pm 0.14)\times10^{8}$ $ψ(3686)$ events collected with the BESIII detector, we present the first observation of the decays $χ_{cJ}\toΛ\barΛω$, where $J=0, 1, 2$, with statistical significances of $11.7 σ, 11.2 σ$, and $11.8 σ$. The branching fractions of these decays are determined to be $\mathcal{B}(χ_{c0}\toΛ\barΛω)=({2.37 \pm 0.22 \pm 0.23}) \times 10^{-4}$, $\mathcal{B}(χ_{c1}\toΛ\barΛω)=({1.01 \pm 0.10 \pm 0.11}) \times 10^{-4}$, and $\mathcal{B}(χ_{c2}\toΛ\barΛω)=({1.40 \pm 0.13 \pm 0.17}) \times 10^{-4}$, where the first uncertainties are statistical and the second are systematic. We observe no clear intermediate structures. △ Less

Submitted 21 May, 2024; originally announced May 2024.

Comments: 11 pages, 10 figures

arXiv:2405.13311 [pdf, other]

Observation of a large-scale filament eruption initiated by two small-scale erupting filaments pushing out from below

Authors: Yongliang Song, Jiangtao Su, Qingmin Zhang, Mei Zhang, Yuanyong Deng, Xianyong Bai, Suo Liu, Xiao Yang, Jie Chen, Haiqing Xu, Kaifan Ji, Ziyao Hu

Abstract: Filament eruptions often result in flares and coronal mass ejections (CMEs). Most studies attribute the filament eruptions to their instabilities or magnetic reconnection. In this study, we report a unique observation of a filament eruption whose initiation process has not been reported before. This large-scale filament, with a length of about 360 Mm crossing an active region, is forced to erupted… ▽ More Filament eruptions often result in flares and coronal mass ejections (CMEs). Most studies attribute the filament eruptions to their instabilities or magnetic reconnection. In this study, we report a unique observation of a filament eruption whose initiation process has not been reported before. This large-scale filament, with a length of about 360 Mm crossing an active region, is forced to erupted by two small-scale erupting filaments pushing out from below. This process of multi-filament eruption results in an M6.4 flare in the active region NOAA 13229 on 25th February 2023. The whole process can be divided into three stages: the eruptions of two active-region filaments F1 and F2; the interactions between the erupting F1, F2, and the large-scale filament F3; and the eruption of F3. Though this multi-filament eruption occurs near the northwest limb of the solar disk, it produces a strong halo CME that causes a significant geomagnetic disturbance. Our observations present a new filament eruption mechanism, in which the initial kinetic energy of the eruption is obtained from and transported to by other erupting structures. This event provides us a unique insight into the dynamics of multi-filament eruptions and their corresponding effects on the interplanetary space. △ Less

Submitted 21 May, 2024; originally announced May 2024.

Comments: 16 pages, 10 figures. Accepted for publication in Solar Physics

arXiv:2405.12809 [pdf, other]

Precision measurement of the branching fraction of \boldmath $J/ψ\rightarrow K^+K^-$ via $ψ(2S)\rightarrow π^+π^-J/ψ$

Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, X. C. Ai, R. Aliberti, A. Amoroso, M. R. An, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere , et al. (604 additional authors not shown)

Abstract: Using a sample of $448.1 \times 10^6$ $ψ(2S)$ events collected with the BESIII detector, we perform a study of the decay $J/ψ\rightarrow K^+K^-$ via $ψ(2S)\rightarrow π^+π^-J/ψ$. The branching fraction of $J/ψ\rightarrow K^+K^-$ is determined to be $\mathcal{B}_{K^+K^-}=(3.072\pm 0.023({\rm stat.})\pm 0.050({\rm syst.}))\times 10^{-4}$, which is consistent with previous measurements but with sig… ▽ More Using a sample of $448.1 \times 10^6$ $ψ(2S)$ events collected with the BESIII detector, we perform a study of the decay $J/ψ\rightarrow K^+K^-$ via $ψ(2S)\rightarrow π^+π^-J/ψ$. The branching fraction of $J/ψ\rightarrow K^+K^-$ is determined to be $\mathcal{B}_{K^+K^-}=(3.072\pm 0.023({\rm stat.})\pm 0.050({\rm syst.}))\times 10^{-4}$, which is consistent with previous measurements but with significantly improved precision. △ Less

Submitted 21 May, 2024; originally announced May 2024.

Comments: to be submitted to PRD

arXiv:2405.12376 [pdf]

A silicon photonics waveguide-coupled colloidal quantum dot photodiode sensitive beyond 1.6 um

Authors: Chao Pang, Yu-Hao Deng, Ezat Kheradmand, Luis Moreno Hagelsieb, Yujie Guo, David Cheyns, Pieter Geiregat, Zeger Hens, Dries Van Thourhout

Abstract: Silicon photonics faces a persistent challenge in extending photodetection capabilities beyond the 1.6 um wavelength range, primarily due to the lack of appropriate epitaxial materials. Colloidal quantum dots (QDs) present a promising solution here, offering distinct advantages such as infrared wavelength tunability, cost-effectiveness, and facile deposition. Their unique properties position them… ▽ More Silicon photonics faces a persistent challenge in extending photodetection capabilities beyond the 1.6 um wavelength range, primarily due to the lack of appropriate epitaxial materials. Colloidal quantum dots (QDs) present a promising solution here, offering distinct advantages such as infrared wavelength tunability, cost-effectiveness, and facile deposition. Their unique properties position them as a potential candidate for enabling photodetection in silicon photonics beyond the conventional telecom wavelength, thereby expanding the potential applications and capabilities within this domain. In this study, we have successfully integrated lead sulfide (PbS) colloidal quantum dot photodiodes (QDPDs) onto silicon waveguides using standard process techniques. The integrated photodiodes exhibit a remarkable responsivity of 1.3 A/W (with an external quantum efficiency of 74.8%) at a wavelength of 2.1 um, a low dark current of only 106 nA and a bandwidth of 1.1 MHz under a -3 V bias. To demonstrate the scalability of our integration approach, we have developed a compact 8-channel spectrometer incorporating an array of QDPDs. This achievement marks a significant step toward realizing a cost-effective photodetector solution for silicon photonics, particularly tailored for a wide range of sensing applications around the 2 um wavelength range. △ Less

Submitted 28 May, 2024; v1 submitted 20 May, 2024; originally announced May 2024.

arXiv:2405.12081 [pdf, other]

Selective Annotation via Data Allocation: These Data Should Be Triaged to Experts for Annotation Rather Than the Model

Authors: Chen Huang, Yang Deng, Wenqiang Lei, Jiancheng Lv, Ido Dagan

Abstract: To obtain high-quality annotations under limited budget, semi-automatic annotation methods are commonly used, where a portion of the data is annotated by experts and a model is then trained to complete the annotations for the remaining data. However, these methods mainly focus on selecting informative data for expert annotations to improve the model predictive ability (i.e., triage-to-human data),… ▽ More To obtain high-quality annotations under limited budget, semi-automatic annotation methods are commonly used, where a portion of the data is annotated by experts and a model is then trained to complete the annotations for the remaining data. However, these methods mainly focus on selecting informative data for expert annotations to improve the model predictive ability (i.e., triage-to-human data), while the rest of the data is indiscriminately assigned to model annotation (i.e., triage-to-model data). This may lead to inefficiencies in budget allocation for annotations, as easy data that the model could accurately annotate may be unnecessarily assigned to the expert, and hard data may be misclassified by the model. As a result, the overall annotation quality may be compromised. To address this issue, we propose a selective annotation framework called SANT. It effectively takes advantage of both the triage-to-human and triage-to-model data through the proposed error-aware triage and bi-weighting mechanisms. As such, informative or hard data is assigned to the expert for annotation, while easy data is handled by the model. Experimental results show that SANT consistently outperforms other baselines, leading to higher-quality annotation through its proper allocation of data to both expert and model workers. We provide pioneering work on data annotation within budget constraints, establishing a landmark for future triage-based annotation studies. △ Less

Submitted 20 May, 2024; originally announced May 2024.

Comments: 18 pages, 4 figures

arXiv:2405.12063 [pdf, other]

CLAMBER: A Benchmark of Identifying and Clarifying Ambiguous Information Needs in Large Language Models

Authors: Tong Zhang, Peixin Qin, Yang Deng, Chen Huang, Wenqiang Lei, Junhong Liu, Dingnan Jin, Hongru Liang, Tat-Seng Chua

Abstract: Large language models (LLMs) are increasingly used to meet user information needs, but their effectiveness in dealing with user queries that contain various types of ambiguity remains unknown, ultimately risking user trust and satisfaction. To this end, we introduce CLAMBER, a benchmark for evaluating LLMs using a well-organized taxonomy. Building upon the taxonomy, we construct ~12K high-quality… ▽ More Large language models (LLMs) are increasingly used to meet user information needs, but their effectiveness in dealing with user queries that contain various types of ambiguity remains unknown, ultimately risking user trust and satisfaction. To this end, we introduce CLAMBER, a benchmark for evaluating LLMs using a well-organized taxonomy. Building upon the taxonomy, we construct ~12K high-quality data to assess the strengths, weaknesses, and potential risks of various off-the-shelf LLMs. Our findings indicate the limited practical utility of current LLMs in identifying and clarifying ambiguous user queries, even enhanced by chain-of-thought (CoT) and few-shot prompting. These techniques may result in overconfidence in LLMs and yield only marginal enhancements in identifying ambiguity. Furthermore, current LLMs fall short in generating high-quality clarifying questions due to a lack of conflict resolution and inaccurate utilization of inherent knowledge. In this paper, CLAMBER presents a guidance and promotes further research on proactive and trustworthy LLMs. Our dataset is available at https://github.com/zt991211/CLAMBER △ Less

Submitted 1 June, 2024; v1 submitted 20 May, 2024; originally announced May 2024.

Comments: Accepted to ACL 2024. Camera Ready. Our dataset is available at https://github.com/zt991211/CLAMBER

arXiv:2405.12059 [pdf, other]

STYLE: Improving Domain Transferability of Asking Clarification Questions in Large Language Model Powered Conversational Agents

Authors: Yue Chen, Chen Huang, Yang Deng, Wenqiang Lei, Dingnan Jin, Jia Liu, Tat-Seng Chua

Abstract: Equipping a conversational search engine with strategies regarding when to ask clarification questions is becoming increasingly important across various domains. Attributing to the context understanding capability of LLMs and their access to domain-specific sources of knowledge, LLM-based clarification strategies feature rapid transfer to various domains in a post-hoc manner. However, they still s… ▽ More Equipping a conversational search engine with strategies regarding when to ask clarification questions is becoming increasingly important across various domains. Attributing to the context understanding capability of LLMs and their access to domain-specific sources of knowledge, LLM-based clarification strategies feature rapid transfer to various domains in a post-hoc manner. However, they still struggle to deliver promising performance on unseen domains, struggling to achieve effective domain transferability. We take the first step to investigate this issue and existing methods tend to produce one-size-fits-all strategies across diverse domains, limiting their search effectiveness. In response, we introduce a novel method, called Style, to achieve effective domain transferability. Our experimental results indicate that Style bears strong domain transferability, resulting in an average search performance improvement of ~10% on four unseen domains. △ Less

Submitted 1 June, 2024; v1 submitted 20 May, 2024; originally announced May 2024.

Comments: Accepted to Findings of ACL 2024. Camera Ready

arXiv:2405.12012 [pdf, ps, other]

Linear Singer-Hopf conjecture

Authors: Ya Deng, Botong Wang

Abstract: If $X$ is a closed $2n$-dimensional aspherical manifold, i.e., the universal cover of $X$ is contractible, then the Singer-Hopf conjecture predicts that $(-1)^nχ(X)\geq 0$. We prove this conjecture when $X$ is a complex projective manifold whose fundamental group admits an almost faithful linear representation over any field. In fact, we prove a much stronger statement that if $X$ is a complex pro… ▽ More If $X$ is a closed $2n$-dimensional aspherical manifold, i.e., the universal cover of $X$ is contractible, then the Singer-Hopf conjecture predicts that $(-1)^nχ(X)\geq 0$. We prove this conjecture when $X$ is a complex projective manifold whose fundamental group admits an almost faithful linear representation over any field. In fact, we prove a much stronger statement that if $X$ is a complex projective manifold with large fundamental group and $π_1(X)$ admits an almost faithful linear representation, then $χ(X, {P})\geq 0$ for any perverse sheaf ${P}$ on $X$. To prove the main result, we introduce a vanishing cycle functor of multivalued one-forms. Then using techniques from non-abelian Hodge theories in both archimedean and non-archimedean settings, we deduce the desired positivity from the geometry of pure and mixed period maps. △ Less

Submitted 20 May, 2024; originally announced May 2024.

arXiv:2405.11864 [pdf, other]

Unraveling the Spin Coulomb Drag Effect and Its Impact on Spin Transport in the Three-Dimensional Electron Gas

Authors: Zhiyi Li, Pengcheng Hou, Youjin Deng, Kun Chen

Abstract: The spin Coulomb drag effect, arising from the exchange of momentum between electrons of opposite spins, plays a crucial role in the spin transport of interacting electron systems. This effect leads to the emergence of a spin mass and the finite lifetime of spin currents, posing challenges for the accurate description of spin dynamics. Using the state-of-the-art Variational Diagrammatic Monte Carl… ▽ More The spin Coulomb drag effect, arising from the exchange of momentum between electrons of opposite spins, plays a crucial role in the spin transport of interacting electron systems. This effect leads to the emergence of a spin mass and the finite lifetime of spin currents, posing challenges for the accurate description of spin dynamics. Using the state-of-the-art Variational Diagrammatic Monte Carlo approach, we investigate the spin-resolved exchange-correlation (XC) kernel in the three-dimensional uniform electron gas. Our analysis reveals a distinct nature in the spin response, characterized by a $1/q^2$ divergence in the spin XC kernel at finite frequencies. This so-called ultranonlocal behavior, stemming from the spin Coulomb drag effect, is absent in the charge channel. By extracting precise values for the spin mass enhancement factor, we observe a trend of increasing enhancement with decreasing electron density. Furthermore, we find compelling evidence for the suppression of spin diffusion at low temperatures, characterized by vanishing inverse relaxation times. These findings deepen our understanding of the intricate relation between the Coulomb interaction and the spin transport, providing valuable insights for the development of accurate density functional approximations and the advancement of spintronics and quantum technologies. △ Less

Submitted 20 May, 2024; originally announced May 2024.

Comments: 8 pages, 5 figures

arXiv:2405.11585 [pdf, other]

Improved measurement of the branching fraction of $h_{c}\rightarrowγη^\prime/η$ and search for $h_{c}\rightarrowγπ^0$

Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere , et al. (645 additional authors not shown)

Abstract: The processes $h_c\rightarrowγP(P = η^\prime,~η,~π^{0}))$ are studied with a sample of $(27.12\pm0.14)\times10^{8}$ $ψ(3686)$ events collected by the BESIII detector at the BEPCII collider. The branching fractions of $h_c\rightarrowγη^\prime$ and $h_c\rightarrowγη$ are measured to be $(1.40\pm0.11\pm0.04\pm0.10)\times10^{-3}$ and $(3.77\pm0.55\pm0.13\pm0.26)\times10^{-4}$, respectively, where the… ▽ More The processes $h_c\rightarrowγP(P = η^\prime,~η,~π^{0}))$ are studied with a sample of $(27.12\pm0.14)\times10^{8}$ $ψ(3686)$ events collected by the BESIII detector at the BEPCII collider. The branching fractions of $h_c\rightarrowγη^\prime$ and $h_c\rightarrowγη$ are measured to be $(1.40\pm0.11\pm0.04\pm0.10)\times10^{-3}$ and $(3.77\pm0.55\pm0.13\pm0.26)\times10^{-4}$, respectively, where the first uncertainties are statistical, the second systematic, and the third from the branching fraction of $ψ(3686)\rightarrowπ^{0}h_c$. The ratio $R_{h_c}=\frac{\mathscr{B}(h_c\rightarrowγη)}{\mathscr{B}(h_c\rightarrowγη^\prime)}$ is calculated to be $(27.0\pm4.4\pm1.0)\%$. The measurements are consistent with the previous results with improved precision by a factor of 2. The results are valuable for gaining a deeper understanding of $η-η^\prime$ mixing, and its manifestation within quantum chromodynamics. No significant signal is found for the decay $h_c\rightarrowγπ^{0}$, and an upper limit is placed on its branching fraction of $\mathscr{B}(h_c\rightarrowγπ^{0})<5.0\times10^{-5}$, at the 90\% confidence level. △ Less

Submitted 19 May, 2024; originally announced May 2024.

arXiv:2405.10248 [pdf, other]

Co-Matching: Towards Human-Machine Collaborative Legal Case Matching

Authors: Chen Huang, Xinwei Yang, Yang Deng, Wenqiang Lei, JianCheng Lv, Tat-Seng Chua

Abstract: Recent efforts have aimed to improve AI machines in legal case matching by integrating legal domain knowledge. However, successful legal case matching requires the tacit knowledge of legal practitioners, which is difficult to verbalize and encode into machines. This emphasizes the crucial role of involving legal practitioners in high-stakes legal case matching. To address this, we propose a collab… ▽ More Recent efforts have aimed to improve AI machines in legal case matching by integrating legal domain knowledge. However, successful legal case matching requires the tacit knowledge of legal practitioners, which is difficult to verbalize and encode into machines. This emphasizes the crucial role of involving legal practitioners in high-stakes legal case matching. To address this, we propose a collaborative matching framework called Co-Matching, which encourages both the machine and the legal practitioner to participate in the matching process, integrating tacit knowledge. Unlike existing methods that rely solely on the machine, Co-Matching allows both the legal practitioner and the machine to determine key sentences and then combine them probabilistically. Co-Matching introduces a method called ProtoEM to estimate human decision uncertainty, facilitating the probabilistic combination. Experimental results demonstrate that Co-Matching consistently outperforms existing legal case matching methods, delivering significant performance improvements over human- and machine-based matching in isolation (on average, +5.51% and +8.71%, respectively). Further analysis shows that Co-Matching also ensures better human-machine collaboration effectiveness. Our study represents a pioneering effort in human-machine collaboration for the matching task, marking a milestone for future collaborative matching studies. △ Less

Submitted 16 May, 2024; originally announced May 2024.

Comments: Draft V1: 23 pages, 7 figures

arXiv:2405.09672 [pdf, other]

doi 10.1145/3658180

Eulerian-Lagrangian Fluid Simulation on Particle Flow Maps

Authors: Junwei Zhou, Duowen Chen, Molin Deng, Yitong Deng, Yuchen Sun, Sinan Wang, Shiying Xiong, Bo Zhu

Abstract: We propose a novel Particle Flow Map (PFM) method to enable accurate long-range advection for incompressible fluid simulation. The foundation of our method is the observation that a particle trajectory generated in a forward simulation naturally embodies a perfect flow map. Centered on this concept, we have developed an Eulerian-Lagrangian framework comprising four essential components: Lagrangian… ▽ More We propose a novel Particle Flow Map (PFM) method to enable accurate long-range advection for incompressible fluid simulation. The foundation of our method is the observation that a particle trajectory generated in a forward simulation naturally embodies a perfect flow map. Centered on this concept, we have developed an Eulerian-Lagrangian framework comprising four essential components: Lagrangian particles for a natural and precise representation of bidirectional flow maps; a dual-scale map representation to accommodate the mapping of various flow quantities; a particle-to-grid interpolation scheme for accurate quantity transfer from particles to grid nodes; and a hybrid impulse-based solver to enforce incompressibility on the grid. The efficacy of PFM has been demonstrated through various simulation scenarios, highlighting the evolution of complex vortical structures and the details of turbulent flows. Notably, compared to NFM, PFM reduces computing time by up to 49 times and memory consumption by up to 41%, while enhancing vorticity preservation as evidenced in various tests like leapfrog, vortex tube, and turbulent flow. △ Less

Submitted 15 May, 2024; originally announced May 2024.

arXiv:2405.09066 [pdf, other]

Search for the leptonic decays $D^{*+}\to e^+ν_e$ and $D^{*+}\to μ^+ν_μ$

Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, M. Albrecht, R. Aliberti, A. Amoroso, M. R. An, Q. An, Y. Bai, O. Bakina, R. Baldini Ferroli, I. Balossino, Y. Ban, V. Batozskaya, D. Becker, K. Begzsuren, N. Berger, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, J. Bloms, A. Bortone, I. Boyko , et al. (559 additional authors not shown)

Abstract: We present the first search for the leptonic decays $D^{*+}\to e^+ν_e$ and $D^{*+}\to μ^+ν_μ$ by analyzing a data sample of electron-positron collisions recorded with the BESIII detector at center-of-mass energies between 4.178 and 4.226 GeV, corresponding to an integrated luminosity of 6.32~fb$^{-1}$. No significant signal is observed. The upper limits on the branching fractions for… ▽ More We present the first search for the leptonic decays $D^{*+}\to e^+ν_e$ and $D^{*+}\to μ^+ν_μ$ by analyzing a data sample of electron-positron collisions recorded with the BESIII detector at center-of-mass energies between 4.178 and 4.226 GeV, corresponding to an integrated luminosity of 6.32~fb$^{-1}$. No significant signal is observed. The upper limits on the branching fractions for $D^{*+}\to e^+ν_e$ and $D^{*+}\to μ^+ν_μ$ are set to be $1.1 \times 10^{-5}$ and $4.3 \times 10^{-6}$ at 90\% confidence level, respectively. △ Less

Submitted 14 May, 2024; originally announced May 2024.

Comments: 14 pages, 7 figures

arXiv:2405.08869 [pdf, other]

RIGEL: Simulating dwarf galaxies at solar mass resolution with radiative transfer and feedback from individual massive stars

Authors: Yunwei Deng, Hui Li, Boyuan Liu, Rahul Kannan, Aaron Smith, Greg L. Bryan

Abstract: We introduce the RIGEL model, a novel framework to self-consistently model the effects of stellar feedback in the multiphase ISM of dwarf galaxies with radiative transfer (RT) on a star-by-star basis. The RIGEL model integrates detailed implementations of feedback from individual massive stars into the RHD code, AREPO-RT. It forms individual massive stars from the resolved multiphase ISM by sampli… ▽ More We introduce the RIGEL model, a novel framework to self-consistently model the effects of stellar feedback in the multiphase ISM of dwarf galaxies with radiative transfer (RT) on a star-by-star basis. The RIGEL model integrates detailed implementations of feedback from individual massive stars into the RHD code, AREPO-RT. It forms individual massive stars from the resolved multiphase ISM by sampling the IMF and tracks their evolution individually. The lifetimes, photon production rates, mass-loss rates, and wind velocities of these stars are determined by their initial masses and metallicities based on a library that incorporates a variety of stellar models. The RT equations are solved in seven spectral bins accounting for the IR to HeII ionizing bands, using an M1 RT scheme. The thermochemistry model tracks the non-equilibrium H, He chemistry and the equilibrium abundance of CI, CII, OI, OII, and CO to capture the thermodynamics of all ISM phases. We evaluate the performance of the RIGEL model using $1\,{\rm M}_\odot$ resolution simulations of isolated dwarf galaxies. We found that the SFR and ISRF show strong positive correlations to the metallicity of the galaxy. Photoionization and photoheating can reduce the SFR by an order of magnitude by removing the available cold-dense gas fuel for star formation. The ISRF also changes the thermal structure of the ISM. Radiative feedback occurs immediately after the birth of massive stars and rapidly disperses the molecular clouds within 1 Myr. As a consequence, radiative feedback reduces the age spread of star clusters to less than 2 Myr, prohibits the formation of massive star clusters, and shapes the cluster initial mass function to a steep power-law form with a slope of $\sim-2$. The mass-loading factor of the fiducial galaxy has a median of $\sim50$, while turning off radiative feedback reduces this factor by an order of magnitude. △ Less

Submitted 16 May, 2024; v1 submitted 14 May, 2024; originally announced May 2024.

Comments: 27 pages, 18 figures; submitted to A&A; abstract slightly abridged; comments welcome

arXiv:2405.07741 [pdf, other]

Search for the radiative transition $χ_{c1}(3872)\toγψ_2(3823)$

Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, M. R. An, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko , et al. (635 additional authors not shown)

Abstract: Using 9.0 $\rm fb^{-1}$ of $e^+e^-$ collision data collected at center-of-mass energies from 4.178 to 4.278 GeV with the BESIII detector at the BEPCII collider, we perform the first search for the radiative transition $χ_{c1}(3872)\toγψ_2(3823)$. No $χ_{c1}(3872)\toγψ_2(3823)$ signal is observed. The upper limit on the ratio of branching fractions… ▽ More Using 9.0 $\rm fb^{-1}$ of $e^+e^-$ collision data collected at center-of-mass energies from 4.178 to 4.278 GeV with the BESIII detector at the BEPCII collider, we perform the first search for the radiative transition $χ_{c1}(3872)\toγψ_2(3823)$. No $χ_{c1}(3872)\toγψ_2(3823)$ signal is observed. The upper limit on the ratio of branching fractions $\mathcal{B}(χ_{c1}(3872)\toγψ_2(3823), ψ_2(3823)\toγχ_{c1})/\mathcal{B}(χ_{c1}(3872)\toπ^+π^- J/ψ)$ is set as 0.075 at the 90\% confidence level. Our result contradicts theoretical predictions under the assumption that the $χ_{c1}(3872)$ is the pure charmonium state $χ_{c1}(2P)$. △ Less

Submitted 13 May, 2024; originally announced May 2024.

Comments: 8 pages, 2 figures

arXiv:2405.07459 [pdf, other]

DualFocus: A Unified Framework for Integrating Positive and Negative Descriptors in Text-based Person Retrieval

Authors: Yuchuan Deng, Zhanpeng Hu, Jiakun Han, Chuang Deng, Qijun Zhao

Abstract: Text-based person retrieval (TPR) aims to retrieve images of a person from an extensive array of candidates based on a given textual description. The core challenge lies in mapping visual and textual data into a unified latent space. While existing TPR methods concentrate on recognizing explicit and positive characteristics, they often neglect the critical influence of negative descriptors, result… ▽ More Text-based person retrieval (TPR) aims to retrieve images of a person from an extensive array of candidates based on a given textual description. The core challenge lies in mapping visual and textual data into a unified latent space. While existing TPR methods concentrate on recognizing explicit and positive characteristics, they often neglect the critical influence of negative descriptors, resulting in potential false positives that fulfill positive criteria but could be excluded by negative descriptors. To alleviate these issues, we introduce DualFocus, a unified framework for integrating positive and negative descriptors to enhance the interpretative accuracy of vision-language foundational models regarding textual queries. DualFocus employs Dual (Positive/Negative) Attribute Prompt Learning (DAPL), which integrates Dual Image-Attribute Contrastive (DIAC) Learning and Sensitive Image-Attributes Matching (SIAM) Learning. This way DualFocus enhances the detection of unseen attributes, thereby boosting retrieval precision. To further achieve a balance between coarse and fine-grained alignment of visual and textual embeddings, we propose the Dynamic Tokenwise Similarity (DTS) loss, which refines the representation of both matching and non-matching descriptions, thereby enhancing the matching process through a detailed and adaptable similarity assessment. By focusing on token-level comparisons, DualFocus significantly outperforms existing techniques in both precision and robustness. The experiment results highlight DualFocus's superior performance on CUHK-PEDES, ICFG-PEDES, and RSTPReid. △ Less

Submitted 13 May, 2024; originally announced May 2024.

arXiv:2405.07308 [pdf]

China's plug-in hybrid electric vehicles transition: an operational carbon perspective

Authors: Yanqiao Deng, Minda Ma

Abstract: Assessing the energy and emissions of representative plug-in hybrid electric vehicle (PHEV) model operations is crucial for accelerating carbon neutrality transitions in China's passenger car sector. This study makes the first attempt to create a bottom-up model to measure the real-world energy use and carbon dioxide (CO2) emissions of China's top twenty selling PHEV model operations across differ… ▽ More Assessing the energy and emissions of representative plug-in hybrid electric vehicle (PHEV) model operations is crucial for accelerating carbon neutrality transitions in China's passenger car sector. This study makes the first attempt to create a bottom-up model to measure the real-world energy use and carbon dioxide (CO2) emissions of China's top twenty selling PHEV model operations across different geographical regions during 2020-2022. The results indicate that (1) the actual electricity intensity for the best-selling PEHV models (20.2-38.2 kilowatt-hour [kWh]/100 kilometers [km]) was 30-40% higher than the New European Driving Cycle (NEDC) values, and the actual gasoline intensity (4.7 to 23.5 liters [L]/100 km) was 3-6 times greater than the NEDC values. (2) The overall energy consumption of the best-selling models exhibited variations among various geographical regions, and the total gasoline equivalent was twice as high in southern China (1283 mega-liters, 2020-2022) than in northern China and the Yangtze River Middle Reach. (3) The top-selling models emitted 4.9 mega-tons (Mt) of CO2 nationwide from 2020-2022, 1.9 Mt from electricity and 3 Mt from gasoline. In northern China, carbon emissions per vehicle were more than 1.2 times greater than those in other regions. Furthermore, targeted policy implications for expediting the carbon-neutral transition within the passenger vehicles are proposed. Overall, this study reviews and compares national and regional benchmark data and performance data for PHEV operations. Its objective is to bolster national decarbonization initiatives, ensuring low emissions and expediting the transportation sector's transition toward a net-zero era. △ Less

Submitted 12 May, 2024; originally announced May 2024.

Comments: 43 pages, 7 figures

arXiv:2405.06419 [pdf, other]

Time Evidence Fusion Network: Multi-source View in Long-Term Time Series Forecasting

Authors: Tianxiang Zhan, Yuanpeng He, Zhen Li, Yong Deng

Abstract: In real-world scenarios, time series forecasting often demands timeliness, making research on model backbones a perennially hot topic. To meet these performance demands, we propose a novel backbone from the perspective of information fusion. Introducing the Basic Probability Assignment (BPA) Module and the Time Evidence Fusion Network (TEFN), based on evidence theory, allows us to achieve superior… ▽ More In real-world scenarios, time series forecasting often demands timeliness, making research on model backbones a perennially hot topic. To meet these performance demands, we propose a novel backbone from the perspective of information fusion. Introducing the Basic Probability Assignment (BPA) Module and the Time Evidence Fusion Network (TEFN), based on evidence theory, allows us to achieve superior performance. On the other hand, the perspective of multi-source information fusion effectively improves the accuracy of forecasting. Due to the fact that BPA is generated by fuzzy theory, TEFN also has considerable interpretability. In real data experiments, the TEFN partially achieved state-of-the-art, with low errors comparable to PatchTST, and operating efficiency surpass performance models such as Dlinear. Meanwhile, TEFN has high robustness and small error fluctuations in the random hyperparameter selection. TEFN is not a model that achieves the ultimate in single aspect, but a model that balances performance, accuracy, stability, and interpretability. △ Less

Submitted 10 May, 2024; originally announced May 2024.

arXiv:2405.06393 [pdf, other]

Measurement of the ${e}^{+}{e}^{-}\to p \bar{p}π^{0}$ cross section at $\sqrt{s}=2.1000-3.0800$ GeV

Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere , et al. (639 additional authors not shown)

Abstract: The process $e^{+}e^{-}\to p\bar{p}π^{0}$ is studied at 20 center-of-mass energies ranging from 2.1000 to 3.0800 GeV using 636.8 pb$^{-1}$ of data collected with the BESIII detector operating at the BEPCII collider. The Born cross sections for $e^{+}e^{-}\to p\bar{p}π^{0}$ are measured with high precision. Since the lowest center-of-mass energy, 2.1000 GeV, is less than 90 MeV above the… ▽ More The process $e^{+}e^{-}\to p\bar{p}π^{0}$ is studied at 20 center-of-mass energies ranging from 2.1000 to 3.0800 GeV using 636.8 pb$^{-1}$ of data collected with the BESIII detector operating at the BEPCII collider. The Born cross sections for $e^{+}e^{-}\to p\bar{p}π^{0}$ are measured with high precision. Since the lowest center-of-mass energy, 2.1000 GeV, is less than 90 MeV above the $p\bar{p}π^0$ energy threshold, we can probe the threshold behavior for this reaction. However, no anomalous threshold enhancement is found in the cross sections for $e^{+}e^{-}\to p\bar{p}π^{0}$. △ Less

Submitted 10 May, 2024; originally announced May 2024.

arXiv:2405.04788 [pdf, other]

DiffMatch: Visual-Language Guidance Makes Better Semi-supervised Change Detector

Authors: Kaiyu Li, Xiangyong Cao, Yupeng Deng, Junmin Liu, Deyu Meng, Zhi Wang

Abstract: Change Detection (CD) aims to identify pixels with semantic changes between images. However, annotating massive numbers of pixel-level images is labor-intensive and costly, especially for multi-temporal images, which require pixel-wise comparisons by human experts. Considering the excellent performance of visual language models (VLMs) for zero-shot, open-vocabulary, etc. with prompt-based reasonin… ▽ More Change Detection (CD) aims to identify pixels with semantic changes between images. However, annotating massive numbers of pixel-level images is labor-intensive and costly, especially for multi-temporal images, which require pixel-wise comparisons by human experts. Considering the excellent performance of visual language models (VLMs) for zero-shot, open-vocabulary, etc. with prompt-based reasoning, it is promising to utilize VLMs to make better CD under limited labeled data. In this paper, we propose a VLM guidance-based semi-supervised CD method, namely DiffMatch. The insight of DiffMatch is to synthesize free change labels using VLMs to provide additional supervision signals for unlabeled data. However, almost all current VLMs are designed for single-temporal images and cannot be directly applied to bi- or multi-temporal images. Motivated by this, we first propose a VLM-based mixed change event generation (CEG) strategy to yield pseudo labels for unlabeled CD data. Since the additional supervised signals provided by these VLM-driven pseudo labels may conflict with the pseudo labels from the consistency regularization paradigm (e.g. FixMatch), we propose the dual projection head for de-entangling different signal sources. Further, we explicitly decouple the bi-temporal images semantic representation through two auxiliary segmentation decoders, which are also guided by VLM. Finally, to make the model more adequately capture change representations, we introduce metric-aware supervision by feature-level contrastive loss in auxiliary branches. Extensive experiments show the advantage of DiffMatch. For instance, DiffMatch improves the FixMatch baseline by +5.3 IoU on WHU-CD and by +2.4 IoU on LEVIR-CD with 5% labels. In addition, our CEG strategy, in an un-supervised manner, can achieve performance far superior to state-of-the-art un-supervised CD methods. △ Less

Submitted 22 June, 2024; v1 submitted 7 May, 2024; originally announced May 2024.

Comments: 13 pages, 5 figures

arXiv:2405.04739 [pdf]

Pressure induced metallization and loss of surface magnetism in FeSi

Authors: Yuhang Deng, Farhad Taraporevala, Haozhe Wang, Eric Lee-Wong, Camilla M. Moir, Jinhyuk Lim, Shubham Sinha, Weiwei Xie, James Hamlin, Yogesh Vohra, M. Brian Maple

Abstract: Single crystalline FeSi samples with a conducting surface state (CSS) were studied under high pressure ($\textit{P}$) and magnetic field ($\textit{B}$) by means of electrical resistance ($\textit{R}$) measurements to explore how the bulk semiconducting state and the surface state are tuned by the application of pressure. We found that the energy gap ($Δ$) associated with the semiconducting bulk ph… ▽ More Single crystalline FeSi samples with a conducting surface state (CSS) were studied under high pressure ($\textit{P}$) and magnetic field ($\textit{B}$) by means of electrical resistance ($\textit{R}$) measurements to explore how the bulk semiconducting state and the surface state are tuned by the application of pressure. We found that the energy gap ($Δ$) associated with the semiconducting bulk phase begins to close abruptly at a critical pressure ($P_{cr}$) of ~10 GPa and the bulk material becomes metallic with no obvious sign of any emergent phases or non-Fermi liquid behavior in $\textit{R}$($\textit{T}$) in the neighborhood of $P_{cr}$ above 3 K. Moreover, the metallic phase appears to remain at near-ambient pressure upon release of the pressure. Interestingly, the hysteresis in the $\textit{R}$($\textit{T}$) curve associated with the magnetically ordered CSS decreases with pressure and vanishes at $P_{cr}$, while the slope of the $\textit{R}$($\textit{B}$) curve, d$\textit{R}$/d$\textit{B}$, which has a negative value for $\textit{P}$ < $P_{cr}$, decreases in magnitude with $\textit{P}$ and changes sign at $P_{cr}$. Thus, the CSS and the corresponding two-dimensional magnetic order collapse at $P_{cr}$ where the energy gap $Δ$ of the bulk material starts to close abruptly, revealing the connection between the CSS and the semiconducting bulk state in FeSi. △ Less

Submitted 7 May, 2024; originally announced May 2024.

arXiv:2405.02653 [pdf, other]

Isopignistic Canonical Decomposition via Belief Evolution Network

Authors: Qianli Zhou, Tianxiang Zhan, Yong Deng

Abstract: Developing a general information processing model in uncertain environments is fundamental for the advancement of explainable artificial intelligence. Dempster-Shafer theory of evidence is a well-known and effective reasoning method for representing epistemic uncertainty, which is closely related to subjective probability theory and possibility theory. Although they can be transformed to each othe… ▽ More Developing a general information processing model in uncertain environments is fundamental for the advancement of explainable artificial intelligence. Dempster-Shafer theory of evidence is a well-known and effective reasoning method for representing epistemic uncertainty, which is closely related to subjective probability theory and possibility theory. Although they can be transformed to each other under some particular belief structures, there remains a lack of a clear and interpretable transformation process, as well as a unified approach for information processing. In this paper, we aim to address these issues from the perspectives of isopignistic belief functions and the hyper-cautious transferable belief model. Firstly, we propose an isopignistic transformation based on the belief evolution network. This transformation allows for the adjustment of the information granule while retaining the potential decision outcome. The isopignistic transformation is integrated with a hyper-cautious transferable belief model to establish a new canonical decomposition. This decomposition offers a reverse path between the possibility distribution and its isopignistic mass functions. The result of the canonical decomposition, called isopignistic function, is an identical information content distribution to reflect the propensity and relative commitment degree of the BPA. Furthermore, this paper introduces a method to reconstruct the basic belief assignment by adjusting the isopignistic function. It explores the advantages of this approach in modeling and handling uncertainty within the hyper-cautious transferable belief model. More general, this paper establishes a theoretical basis for building general models of artificial intelligence based on probability theory, Dempster-Shafer theory, and possibility theory. △ Less

Submitted 4 May, 2024; originally announced May 2024.

arXiv:2405.01868 [pdf, other]

Incorporating External Knowledge and Goal Guidance for LLM-based Conversational Recommender Systems

Authors: Chuang Li, Yang Deng, Hengchang Hu, Min-Yen Kan, Haizhou Li

Abstract: This paper aims to efficiently enable large language models (LLMs) to use external knowledge and goal guidance in conversational recommender system (CRS) tasks. Advanced LLMs (e.g., ChatGPT) are limited in domain-specific CRS tasks for 1) generating grounded responses with recommendation-oriented knowledge, or 2) proactively leading the conversations through different dialogue goals. In this work,… ▽ More This paper aims to efficiently enable large language models (LLMs) to use external knowledge and goal guidance in conversational recommender system (CRS) tasks. Advanced LLMs (e.g., ChatGPT) are limited in domain-specific CRS tasks for 1) generating grounded responses with recommendation-oriented knowledge, or 2) proactively leading the conversations through different dialogue goals. In this work, we first analyze those limitations through a comprehensive evaluation, showing the necessity of external knowledge and goal guidance which contribute significantly to the recommendation accuracy and language quality. In light of this finding, we propose a novel ChatCRS framework to decompose the complex CRS task into several sub-tasks through the implementation of 1) a knowledge retrieval agent using a tool-augmented approach to reason over external Knowledge Bases and 2) a goal-planning agent for dialogue goal prediction. Experimental results on two multi-goal CRS datasets reveal that ChatCRS sets new state-of-the-art benchmarks, improving language quality of informativeness by 17% and proactivity by 27%, and achieving a tenfold enhancement in recommendation accuracy. △ Less

Submitted 3 May, 2024; originally announced May 2024.

Comments: Main paper 8 pages; References and Appendix 9 pages; 7 figures and 14 tables

arXiv:2405.01470 [pdf, other]

WildChat: 1M ChatGPT Interaction Logs in the Wild

Authors: Wenting Zhao, Xiang Ren, Jack Hessel, Claire Cardie, Yejin Choi, Yuntian Deng

Abstract: Chatbots such as GPT-4 and ChatGPT are now serving millions of users. Despite their widespread use, there remains a lack of public datasets showcasing how these tools are used by a population of users in practice. To bridge this gap, we offered free access to ChatGPT for online users in exchange for their affirmative, consensual opt-in to anonymously collect their chat transcripts and request head… ▽ More Chatbots such as GPT-4 and ChatGPT are now serving millions of users. Despite their widespread use, there remains a lack of public datasets showcasing how these tools are used by a population of users in practice. To bridge this gap, we offered free access to ChatGPT for online users in exchange for their affirmative, consensual opt-in to anonymously collect their chat transcripts and request headers. From this, we compiled WildChat, a corpus of 1 million user-ChatGPT conversations, which consists of over 2.5 million interaction turns. We compare WildChat with other popular user-chatbot interaction datasets, and find that our dataset offers the most diverse user prompts, contains the largest number of languages, and presents the richest variety of potentially toxic use-cases for researchers to study. In addition to timestamped chat transcripts, we enrich the dataset with demographic data, including state, country, and hashed IP addresses, alongside request headers. This augmentation allows for more detailed analysis of user behaviors across different geographical regions and temporal dimensions. Finally, because it captures a broad range of use cases, we demonstrate the dataset's potential utility in fine-tuning instruction-following models. WildChat is released at https://wildchat.allen.ai under AI2 ImpACT Licenses. △ Less

Submitted 2 May, 2024; originally announced May 2024.

Comments: accepted by ICLR 2024

arXiv:2405.00603 [pdf, other]

Learning Expressive Disentangled Speech Representations with Soft Speech Units and Adversarial Style Augmentation

Authors: Yimin Deng, Jianzong Wang, Xulong Zhang, Ning Cheng, Jing Xiao

Abstract: Voice conversion is the task to transform voice characteristics of source speech while preserving content information. Nowadays, self-supervised representation learning models are increasingly utilized in content extraction. However, in these representations, a lot of hidden speaker information leads to timbre leakage while the prosodic information of hidden units lacks use. To address these issue… ▽ More Voice conversion is the task to transform voice characteristics of source speech while preserving content information. Nowadays, self-supervised representation learning models are increasingly utilized in content extraction. However, in these representations, a lot of hidden speaker information leads to timbre leakage while the prosodic information of hidden units lacks use. To address these issues, we propose a novel framework for expressive voice conversion called "SAVC" based on soft speech units from HuBert-soft. Taking soft speech units as input, we design an attribute encoder to extract content and prosody features respectively. Specifically, we first introduce statistic perturbation imposed by adversarial style augmentation to eliminate speaker information. Then the prosody is implicitly modeled on soft speech units with knowledge distillation. Experiment results show that the intelligibility and naturalness of converted speech outperform previous work. △ Less

Submitted 1 May, 2024; originally announced May 2024.

Comments: Accepted by the 2024 International Joint Conference on Neural Networks (IJCNN 2024)

arXiv:2404.19368 [pdf, other]

Exploring Multi-Lingual Bias of Large Code Models in Code Generation

Authors: Chaozheng Wang, Zongjie Li, Cuiyun Gao, Wenxuan Wang, Ting Peng, Hailiang Huang, Yuetang Deng, Shuai Wang, Michael R. Lyu

Abstract: Code generation aims to synthesize code and fulfill functional requirements based on natural language (NL) specifications, which can greatly improve development efficiency. In the era of large language models (LLMs), large code models (LCMs) have been recently proposed to generate source code. LCMs can generate highly feasible solutions for programming problems described in natural language. Despi… ▽ More Code generation aims to synthesize code and fulfill functional requirements based on natural language (NL) specifications, which can greatly improve development efficiency. In the era of large language models (LLMs), large code models (LCMs) have been recently proposed to generate source code. LCMs can generate highly feasible solutions for programming problems described in natural language. Despite the effectiveness, we observe a noticeable multilingual bias in the generation performance of LCMs. Specifically, LCMs demonstrate proficiency in generating solutions when provided with instructions in English, yet may falter when faced with semantically equivalent instructions in other NLs such as Chinese. Moreover, the ability of LCMs to generate code exhibits variety across different programming languages (PLs), such as Python and C++. The observed phenomenon indicates the presence of multi-lingual bias within the generative capabilities of LCMs, which has remained unexplored. In this paper, we aim to investigate the multi-lingual bias that exists in current LCMs. First, we initiate our investigation by constructing the first multi-lingual evaluation benchmark X-HumanEval-X, enabling us to systematically evaluate the extent of multi-lingual bias that exists in current LCMs. In our large-scale experiments on nine popular LCMs, we observe a pronounced multi-lingual bias of LCMs in code generation, including multi-NL and multi-PL bias. Specifically, when using Chinese instructions, the code generation capabilities of LCMs decrease by at least 13% in terms of the Pass@1 metric. Furthermore, LCMs perform variously across different programming languages, e.g., the performance gap between Python and C++ reaches as high as 20.9%. ... △ Less

Submitted 30 April, 2024; originally announced April 2024.

Comments: 12 pages

arXiv:2404.18156 [pdf, other]

Event-based Video Frame Interpolation with Edge Guided Motion Refinement

Authors: Yuhan Liu, Yongjian Deng, Hao Chen, Bochen Xie, Youfu Li, Zhen Yang

Abstract: Video frame interpolation, the process of synthesizing intermediate frames between sequential video frames, has made remarkable progress with the use of event cameras. These sensors, with microsecond-level temporal resolution, fill information gaps between frames by providing precise motion cues. However, contemporary Event-Based Video Frame Interpolation (E-VFI) techniques often neglect the fact… ▽ More Video frame interpolation, the process of synthesizing intermediate frames between sequential video frames, has made remarkable progress with the use of event cameras. These sensors, with microsecond-level temporal resolution, fill information gaps between frames by providing precise motion cues. However, contemporary Event-Based Video Frame Interpolation (E-VFI) techniques often neglect the fact that event data primarily supply high-confidence features at scene edges during multi-modal feature fusion, thereby diminishing the role of event signals in optical flow (OF) estimation and warping refinement. To address this overlooked aspect, we introduce an end-to-end E-VFI learning method (referred to as EGMR) to efficiently utilize edge features from event signals for motion flow and warping enhancement. Our method incorporates an Edge Guided Attentive (EGA) module, which rectifies estimated video motion through attentive aggregation based on the local correlation of multi-modal features in a coarse-to-fine strategy. Moreover, given that event data can provide accurate visual references at scene edges between consecutive frames, we introduce a learned visibility map derived from event data to adaptively mitigate the occlusion problem in the warping refinement process. Extensive experiments on both synthetic and real datasets show the effectiveness of the proposed approach, demonstrating its potential for higher quality video frame interpolation. △ Less

Submitted 28 April, 2024; originally announced April 2024.

arXiv:2404.18092 [pdf, ps, other]

Numerous Bidirectionally Propagating Plasma Blobs near the Reconnection Site of a Solar Eruption

Authors: Zhenyong Hou, Hui Tian, Maria S. Madjarska, Hechao Chen, Tanmoy Samanta, Xianyong Bai, Zhentong Li, Yang Su, Wei Chen, Yuanyong Deng

Abstract: Current sheet is a common structure involved in solar eruptions. However, it is observed in minority of the events and the physical properties of its fine structures during a solar eruption are rarely investigated. Here, we report an on-disk observation that displays 108 compact, circular or elliptic bright structures, presumably plasma blobs, propagating bidirectionally along a flare current shee… ▽ More Current sheet is a common structure involved in solar eruptions. However, it is observed in minority of the events and the physical properties of its fine structures during a solar eruption are rarely investigated. Here, we report an on-disk observation that displays 108 compact, circular or elliptic bright structures, presumably plasma blobs, propagating bidirectionally along a flare current sheet during a period of $\sim$24 minutes. From extreme ultraviolet images, we have investigated the temporal variation of the blob number around the flare peak time. The current sheet connects the flare loops and the erupting filament. The width, duration, projected velocity, temperature, and density of these blobs are $\sim$1.7$\pm$0.5\,Mm, $\sim$79$\pm$57\,s, $\sim$191$\pm$81\,\kms, $\sim$10$^{6.4\pm0.1}$ K, and $\sim$10$^{10.1\pm0.3}$ cm$^{-3}$, respectively. The reconnection site rises with a velocity of $\leqslant$69\,\kms. The observational results suggest that plasmoid instability plays an important role in the energy release process of solar eruptions. △ Less

Submitted 28 April, 2024; originally announced April 2024.

Comments: Accepted by A&A, 9 pages, and 5 figures

arXiv:2404.17900 [pdf, other]

Unsupervised Anomaly Detection via Masked Diffusion Posterior Sampling

Authors: Di Wu, Shicai Fan, Xue Zhou, Li Yu, Yuzhong Deng, Jianxiao Zou, Baihong Lin

Abstract: Reconstruction-based methods have been commonly used for unsupervised anomaly detection, in which a normal image is reconstructed and compared with the given test image to detect and locate anomalies. Recently, diffusion models have shown promising applications for anomaly detection due to their powerful generative ability. However, these models lack strict mathematical support for normal image re… ▽ More Reconstruction-based methods have been commonly used for unsupervised anomaly detection, in which a normal image is reconstructed and compared with the given test image to detect and locate anomalies. Recently, diffusion models have shown promising applications for anomaly detection due to their powerful generative ability. However, these models lack strict mathematical support for normal image reconstruction and unexpectedly suffer from low reconstruction quality. To address these issues, this paper proposes a novel and highly-interpretable method named Masked Diffusion Posterior Sampling (MDPS). In MDPS, the problem of normal image reconstruction is mathematically modeled as multiple diffusion posterior sampling for normal images based on the devised masked noisy observation model and the diffusion-based normal image prior under Bayesian framework. Using a metric designed from pixel-level and perceptual-level perspectives, MDPS can effectively compute the difference map between each normal posterior sample and the given test image. Anomaly scores are obtained by averaging all difference maps for multiple posterior samples. Exhaustive experiments on MVTec and BTAD datasets demonstrate that MDPS can achieve state-of-the-art performance in normal image reconstruction quality as well as anomaly detection and localization. △ Less

Submitted 27 April, 2024; originally announced April 2024.

Journal ref: International Joint Conference on Artificial Intelligence 2024

arXiv:2404.16687 [pdf, other]

NTIRE 2024 Quality Assessment of AI-Generated Content Challenge

Authors: Xiaohong Liu, Xiongkuo Min, Guangtao Zhai, Chunyi Li, Tengchuan Kou, Wei Sun, Haoning Wu, Yixuan Gao, Yuqin Cao, Zicheng Zhang, Xiele Wu, Radu Timofte, Fei Peng, Huiyuan Fu, Anlong Ming, Chuanming Wang, Huadong Ma, Shuai He, Zifei Dou, Shu Chen, Huacong Zhang, Haiyi Xie, Chengwei Wang, Baoying Chen, Jishen Zeng , et al. (89 additional authors not shown)

Abstract: This paper reports on the NTIRE 2024 Quality Assessment of AI-Generated Content Challenge, which will be held in conjunction with the New Trends in Image Restoration and Enhancement Workshop (NTIRE) at CVPR 2024. This challenge is to address a major challenge in the field of image and video processing, namely, Image Quality Assessment (IQA) and Video Quality Assessment (VQA) for AI-Generated Conte… ▽ More This paper reports on the NTIRE 2024 Quality Assessment of AI-Generated Content Challenge, which will be held in conjunction with the New Trends in Image Restoration and Enhancement Workshop (NTIRE) at CVPR 2024. This challenge is to address a major challenge in the field of image and video processing, namely, Image Quality Assessment (IQA) and Video Quality Assessment (VQA) for AI-Generated Content (AIGC). The challenge is divided into the image track and the video track. The image track uses the AIGIQA-20K, which contains 20,000 AI-Generated Images (AIGIs) generated by 15 popular generative models. The image track has a total of 318 registered participants. A total of 1,646 submissions are received in the development phase, and 221 submissions are received in the test phase. Finally, 16 participating teams submitted their models and fact sheets. The video track uses the T2VQA-DB, which contains 10,000 AI-Generated Videos (AIGVs) generated by 9 popular Text-to-Video (T2V) models. A total of 196 participants have registered in the video track. A total of 991 submissions are received in the development phase, and 185 submissions are received in the test phase. Finally, 12 participating teams submitted their models and fact sheets. Some methods have achieved better results than baseline methods, and the winning methods in both tracks have demonstrated superior prediction performance on AIGC. △ Less

Submitted 7 May, 2024; v1 submitted 25 April, 2024; originally announced April 2024.

arXiv:2404.15458 [pdf, other]

Can Large Language Models Learn the Physics of Metamaterials? An Empirical Study with ChatGPT

Authors: Darui Lu, Yang Deng, Jordan M. Malof, Willie J. Padilla

Abstract: Large language models (LLMs) such as ChatGPT, Gemini, LlaMa, and Claude are trained on massive quantities of text parsed from the internet and have shown a remarkable ability to respond to complex prompts in a manner often indistinguishable from humans. We present a LLM fine-tuned on up to 40,000 data that can predict electromagnetic spectra over a range of frequencies given a text prompt that onl… ▽ More Large language models (LLMs) such as ChatGPT, Gemini, LlaMa, and Claude are trained on massive quantities of text parsed from the internet and have shown a remarkable ability to respond to complex prompts in a manner often indistinguishable from humans. We present a LLM fine-tuned on up to 40,000 data that can predict electromagnetic spectra over a range of frequencies given a text prompt that only specifies the metasurface geometry. Results are compared to conventional machine learning approaches including feed-forward neural networks, random forest, linear regression, and K-nearest neighbor (KNN). Remarkably, the fine-tuned LLM (FT-LLM) achieves a lower error across all dataset sizes explored compared to all machine learning approaches including a deep neural network. We also demonstrate the LLM's ability to solve inverse problems by providing the geometry necessary to achieve a desired spectrum. LLMs possess some advantages over humans that may give them benefits for research, including the ability to process enormous amounts of data, find hidden patterns in data, and operate in higher-dimensional spaces. We propose that fine-tuning LLMs on large datasets specific to a field allows them to grasp the nuances of that domain, making them valuable tools for research and analysis. △ Less

Submitted 23 April, 2024; originally announced April 2024.

arXiv:2404.15451 [pdf, other]

CFPFormer: Feature-pyramid like Transformer Decoder for Segmentation and Detection

Authors: Hongyi Cai, Mohammad Mahdinur Rahman, Jingyu Wu, Yulun Deng

Abstract: Feature pyramids have been widely adopted in convolutional neural networks (CNNs) and transformers for tasks like medical image segmentation and object detection. However, the currently existing models generally focus on the Encoder-side Transformer to extract features, from which decoder improvement can bring further potential with well-designed architecture. We propose CFPFormer, a novel decoder… ▽ More Feature pyramids have been widely adopted in convolutional neural networks (CNNs) and transformers for tasks like medical image segmentation and object detection. However, the currently existing models generally focus on the Encoder-side Transformer to extract features, from which decoder improvement can bring further potential with well-designed architecture. We propose CFPFormer, a novel decoder block that integrates feature pyramids and transformers. Specifically, by leveraging patch embedding, cross-layer feature concatenation, and Gaussian attention mechanisms, CFPFormer enhances feature extraction capabilities while promoting generalization across diverse tasks. Benefiting from Transformer structure and U-shaped Connections, our introduced model gains the ability to capture long-range dependencies and effectively up-sample feature maps. Our model achieves superior performance in detecting small objects compared to existing methods. We evaluate CFPFormer on medical image segmentation datasets and object detection benchmarks (VOC 2007, VOC2012, MS-COCO), demonstrating its effectiveness and versatility. On the ACDC Post-2017-MICCAI-Challenge online test set, our model reaches exceptionally impressive accuracy, and performed well compared with the original decoder setting in Synapse multi-organ segmentation dataset. △ Less

Submitted 23 April, 2024; originally announced April 2024.

arXiv:2404.14662 [pdf, other]

NExT: Teaching Large Language Models to Reason about Code Execution

Authors: Ansong Ni, Miltiadis Allamanis, Arman Cohan, Yinlin Deng, Kensen Shi, Charles Sutton, Pengcheng Yin

Abstract: A fundamental skill among human developers is the ability to understand and reason about program execution. As an example, a programmer can mentally simulate code execution in natural language to debug and repair code (aka. rubber duck debugging). However, large language models (LLMs) of code are typically trained on the surface textual form of programs, thus may lack a semantic understanding of h… ▽ More A fundamental skill among human developers is the ability to understand and reason about program execution. As an example, a programmer can mentally simulate code execution in natural language to debug and repair code (aka. rubber duck debugging). However, large language models (LLMs) of code are typically trained on the surface textual form of programs, thus may lack a semantic understanding of how programs execute at run-time. To address this issue, we propose NExT, a method to teach LLMs to inspect the execution traces of programs (variable states of executed lines) and reason about their run-time behavior through chain-of-thought (CoT) rationales. Specifically, NExT uses self-training to bootstrap a synthetic training set of execution-aware rationales that lead to correct task solutions (e.g., fixed programs) without laborious manual annotation. Experiments on program repair tasks based on MBPP and HumanEval demonstrate that NExT improves the fix rate of a PaLM 2 model, by 26.1% and 14.3% absolute, respectively, with significantly improved rationale quality as verified by automated metrics and human raters. Our model can also generalize to scenarios where program traces are absent at test-time. △ Less

Submitted 22 April, 2024; originally announced April 2024.

Comments: 35 pages

arXiv:2404.13840 [pdf, other]

Study of $e^+e^-\toωX(3872)$ and $γX(3872)$ from 4.66 to 4.95 GeV

Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere , et al. (634 additional authors not shown)

Abstract: Using data samples with an integrated luminosity of $4.5~\text{fb}^{-1}$ collected by the BESIII detector at center-of-mass energies ranging from 4.66 to 4.95 GeV, we study the processes of $e^+e^-\toωX(3872)$ and $e^+e^-\toγX(3872)$. With the $e^+e^-\toωX(3872)$ process, the branching fraction ratio $R\equiv\frac{\mathcal{B}(X(3872)\toγJ/ψ)}{\mathcal{B}(X(3872)\toπ^+π^- J/ψ)}$ is measured to be… ▽ More Using data samples with an integrated luminosity of $4.5~\text{fb}^{-1}$ collected by the BESIII detector at center-of-mass energies ranging from 4.66 to 4.95 GeV, we study the processes of $e^+e^-\toωX(3872)$ and $e^+e^-\toγX(3872)$. With the $e^+e^-\toωX(3872)$ process, the branching fraction ratio $R\equiv\frac{\mathcal{B}(X(3872)\toγJ/ψ)}{\mathcal{B}(X(3872)\toπ^+π^- J/ψ)}$ is measured to be $0.38\pm0.20_\text{stat.}\pm0.01_\text{syst.}$ ($R< 0.83$ at 90\% confidence level). In addition, we measure the ratio of the average cross section of $e^+e^-\toωX(3872)$ to $e^+e^-\toωχ_{c1}(ωχ_{c2})$ to be $σ_{ωX(3872)}/σ_{ωχ_{c1}}~(σ_{ωX(3872)}/σ_{ωχ_{c2}})=5.2\pm1.0_\text{stat.}\pm1.9_\text{syst.}~ (5.5\pm1.1_\text{stat.}\pm2.4_\text{syst.})$. Finally, we search for the process of $e^+e^-\toγX(3872)$, and no obvious signal is observed. The upper limit on the ratio of the average cross section of $e^+e^-\toγX(3872)$ to $e^+e^-\toωX(3872)$ is set as $σ_{γX(3872)}/σ_{ωX(3872)}<0.23$ at 90\% confidence level. △ Less

Submitted 21 April, 2024; originally announced April 2024.

Comments: 19 pages, 10 figures

arXiv:2404.12670 [pdf, other]

Towards Human-centered Proactive Conversational Agents

Authors: Yang Deng, Lizi Liao, Zhonghua Zheng, Grace Hui Yang, Tat-Seng Chua

Abstract: Recent research on proactive conversational agents (PCAs) mainly focuses on improving the system's capabilities in anticipating and planning action sequences to accomplish tasks and achieve goals before users articulate their requests. This perspectives paper highlights the importance of moving towards building human-centered PCAs that emphasize human needs and expectations, and that considers eth… ▽ More Recent research on proactive conversational agents (PCAs) mainly focuses on improving the system's capabilities in anticipating and planning action sequences to accomplish tasks and achieve goals before users articulate their requests. This perspectives paper highlights the importance of moving towards building human-centered PCAs that emphasize human needs and expectations, and that considers ethical and social implications of these agents, rather than solely focusing on technological capabilities. The distinction between a proactive and a reactive system lies in the proactive system's initiative-taking nature. Without thoughtful design, proactive systems risk being perceived as intrusive by human users. We address the issue by establishing a new taxonomy concerning three key dimensions of human-centered PCAs, namely Intelligence, Adaptivity, and Civility. We discuss potential research opportunities and challenges based on this new taxonomy upon the five stages of PCA system construction. This perspectives paper lays a foundation for the emerging area of conversational information retrieval research and paves the way towards advancing human-centered proactive conversational systems. △ Less

Submitted 19 April, 2024; originally announced April 2024.

Comments: Accepted by SIGIR 2024 (Perspectives Track)

arXiv:2404.12130 [pdf, other]

One-Shot Sequential Federated Learning for Non-IID Data by Enhancing Local Model Diversity

Authors: Naibo Wang, Yuchen Deng, Wenjie Feng, Shichen Fan, Jianwei Yin, See-Kiong Ng

Abstract: Traditional federated learning mainly focuses on parallel settings (PFL), which can suffer significant communication and computation costs. In contrast, one-shot and sequential federated learning (SFL) have emerged as innovative paradigms to alleviate these costs. However, the issue of non-IID (Independent and Identically Distributed) data persists as a significant challenge in one-shot and SFL se… ▽ More Traditional federated learning mainly focuses on parallel settings (PFL), which can suffer significant communication and computation costs. In contrast, one-shot and sequential federated learning (SFL) have emerged as innovative paradigms to alleviate these costs. However, the issue of non-IID (Independent and Identically Distributed) data persists as a significant challenge in one-shot and SFL settings, exacerbated by the restricted communication between clients. In this paper, we improve the one-shot sequential federated learning for non-IID data by proposing a local model diversity-enhancing strategy. Specifically, to leverage the potential of local model diversity for improving model performance, we introduce a local model pool for each client that comprises diverse models generated during local training, and propose two distance measurements to further enhance the model diversity and mitigate the effect of non-IID data. Consequently, our proposed framework can improve the global model performance while maintaining low communication costs. Extensive experiments demonstrate that our method exhibits superior performance to existing one-shot PFL methods and achieves better accuracy compared with state-of-the-art one-shot SFL methods on both label-skew and domain-shift tasks (e.g., 6%+ accuracy improvement on the CIFAR-10 dataset). △ Less

Submitted 18 April, 2024; originally announced April 2024.

arXiv:2404.11759 [pdf, other]

Modelling infectious disease transmission dynamics in conference environments: An individual-based approach

Authors: Xue Liu, Yue Deng, Jingying Huang, Yuhong Zhang, Jinzhi Lei

Abstract: The global public health landscape is perpetually challenged by the looming threat of infectious diseases. Central to addressing this concern is the imperative to prevent and manage disease transmission during pandemics, particularly in unique settings. This study addresses the transmission dynamics of infectious diseases within conference venues, presenting a computational model designed to simul… ▽ More The global public health landscape is perpetually challenged by the looming threat of infectious diseases. Central to addressing this concern is the imperative to prevent and manage disease transmission during pandemics, particularly in unique settings. This study addresses the transmission dynamics of infectious diseases within conference venues, presenting a computational model designed to simulate transmission processes within a condensed timeframe (one day), beginning with sporadic cases. Our model intricately captures the activities of individual attendees within the conference venue, encompassing meetings, rest intervals, and meal breaks. While meetings entail proximity seating, rest and lunch periods allow attendees to interact with diverse individuals. Moreover, the restroom environment poses an additional avenue for potential infection transmission. Employing an individual-based model, we meticulously replicated the transmission dynamics of infectious diseases, with a specific emphasis on close-contact interactions between infected and susceptible individuals. Through comprehensive analysis of model simulations, we elucidated the intricacies of disease transmission dynamics within conference settings and assessed the efficacy of control strategies to curb disease dissemination. Ultimately, our study proffers a numerical framework for assessing the risk of infectious disease transmission during short-duration conferences, furnishing conference organizers with valuable insights to inform the implementation of targeted prevention and control measures. △ Less

Submitted 17 April, 2024; originally announced April 2024.

Comments: 25 pages; 8 figures

arXiv:2404.11157 [pdf, ps, other]

Self-Ordered Supersolid in Spinor Condensates with Cavity-Mediated Spin-Momentum-Mixing Interactions

Authors: Jingjun You, Su Yi, Yuangang Deng

Abstract: Ultracold atoms with cavity-mediated long-range interactions offer a promising platform for investing novel quantum phenomena. Exploiting recent experimental advancements, we propose an experimental scheme to create self-ordered supersolid in spin-$1/2$ condensates confined within an optical cavity. The interplay of cavity and pump fields gives rise to supersolid square and plane wave phases, comp… ▽ More Ultracold atoms with cavity-mediated long-range interactions offer a promising platform for investing novel quantum phenomena. Exploiting recent experimental advancements, we propose an experimental scheme to create self-ordered supersolid in spin-$1/2$ condensates confined within an optical cavity. The interplay of cavity and pump fields gives rise to supersolid square and plane wave phases, comprehensively described by the two-component Tavis-Cummings model. We show that the self-ordered supersolid phase exhibits an undamped gapless Goldstone mode over a wide parameter range. This proposal, achievable with current experimental setups utilizing identical laser configurations, is in contrast to the realization of checkerboard supersolidity, which hinges on constructing a $U(1)$ symmetry by utilizing two ${\cal Z}_2$ symmetries with precisely matched atom-cavity coupling in multimode resonators. By employing the superradiant photon-exchange process, we realize for the first time cavity-mediated spin-momentum-mixing interactions between highly correlated spin and momentum modes, analogous to that observed spin-mixing in spin-1 condensates. Our scheme provides a unique platform for realizing spin-momentum squeezing and spatially distributed multipartite entanglement. △ Less

Submitted 17 April, 2024; originally announced April 2024.

Comments: 5+9 pages, 4+3 figures

arXiv:2404.10573 [pdf, other]

AAVDiff: Experimental Validation of Enhanced Viability and Diversity in Recombinant Adeno-Associated Virus (AAV) Capsids through Diffusion Generation

Authors: Lijun Liu, Jiali Yang, Jianfei Song, Xinglin Yang, Lele Niu, Zeqi Cai, Hui Shi, Tingjun Hou, Chang-yu Hsieh, Weiran Shen, Yafeng Deng

Abstract: Recombinant adeno-associated virus (rAAV) vectors have revolutionized gene therapy, but their broad tropism and suboptimal transduction efficiency limit their clinical applications. To overcome these limitations, researchers have focused on designing and screening capsid libraries to identify improved vectors. However, the large sequence space and limited resources present challenges in identifyin… ▽ More Recombinant adeno-associated virus (rAAV) vectors have revolutionized gene therapy, but their broad tropism and suboptimal transduction efficiency limit their clinical applications. To overcome these limitations, researchers have focused on designing and screening capsid libraries to identify improved vectors. However, the large sequence space and limited resources present challenges in identifying viable capsid variants. In this study, we propose an end-to-end diffusion model to generate capsid sequences with enhanced viability. Using publicly available AAV2 data, we generated 38,000 diverse AAV2 viral protein (VP) sequences, and evaluated 8,000 for viral selection. The results attested the superiority of our model compared to traditional methods. Additionally, in the absence of AAV9 capsid data, apart from one wild-type sequence, we used the same model to directly generate a number of viable sequences with up to 9 mutations. we transferred the remaining 30,000 samples to the AAV9 domain. Furthermore, we conducted mutagenesis on AAV9 VP hypervariable regions VI and V, contributing to the continuous improvement of the AAV9 VP sequence. This research represents a significant advancement in the design and functional validation of rAAV vectors, offering innovative solutions to enhance specificity and transduction efficiency in gene therapy applications. △ Less

Submitted 17 April, 2024; v1 submitted 16 April, 2024; originally announced April 2024.

arXiv:2404.09219 [pdf, ps, other]

Observation of $D \to a_{0}(980)π$ in the decays $D^{0} \rightarrow π^{+}π^{-}η$ and $D^{+} \rightarrow π^{+}π^{0}η$

Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere , et al. (634 additional authors not shown)

Abstract: We report the first amplitude analysis of the decays $D^{0} \to π^{+} π^{-} η$ and $D^{+} \rightarrow π^{+}π^{0}η$ using a data sample taken with the BESIII detector at the center-of-mass energy of 3.773 GeV, corresponding to an integrated luminosity of 7.9 ${\rm fb}^{-1}$. The contribution from the process $D^{0(+)} \to a_{0}(980)^{+} π^{-(0)}$ is significantly larger than the… ▽ More We report the first amplitude analysis of the decays $D^{0} \to π^{+} π^{-} η$ and $D^{+} \rightarrow π^{+}π^{0}η$ using a data sample taken with the BESIII detector at the center-of-mass energy of 3.773 GeV, corresponding to an integrated luminosity of 7.9 ${\rm fb}^{-1}$. The contribution from the process $D^{0(+)} \to a_{0}(980)^{+} π^{-(0)}$ is significantly larger than the $D^{0(+)} \to a_{0}(980)^{-(0)} π^{+}$ contribution. The ratios $\mathcal{B}(D^{0} \rightarrow a_{0}(980)^{+}π^{-})/\mathcal{B}(D^{0} \rightarrow a_{0}(980)^{-}π^{+})$ and $\mathcal{B}(D^{+} \rightarrow a_{0}(980)^{+}π^{0})/\mathcal{B}(D^{+} \rightarrow a_{0}(980)^{0}π^{+})$ are measured to be $7.5^{+2.5}_{-0.8\,\mathrm{stat.}}\pm1.7_{\mathrm{syst.}}$ and $2.6\pm0.6_{\mathrm{stat.}}\pm0.3_{\mathrm{syst.}}$, respectively. The measured $D^{0}$ ratio disagrees with the theoretical predictions by orders of magnitudes, thus implying a substantial contribution from final-state interactions. △ Less

Submitted 14 April, 2024; originally announced April 2024.

Showing 51–100 of 1,875 results for author: Deng, Y