subscribe to arXiv mailings

Measurement of the branching fraction of $D^+_s\to \ell^+ν_\ell$ via $e^+e^-\to D^{*+}_{s} D^{*-}_{s}$

Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere , et al. (634 additional authors not shown)

Abstract: Based on $10.64~\mathrm{fb}^{-1}$ of $e^+e^-$ collision data taken at center-of-mass energies between 4.237 and 4.699 GeV with the BESIII detector, we study the leptonic $D^+_s$ decays using the $e^+e^-\to D^{*+}_{s} D^{*-}_{s}$ process. The branching fractions of $D_s^+\to\ell^+ν_{\ell}\,(\ell=μ,τ)$ are measured to be $\mathcal{B}(D_s^+\toμ^+ν_μ)=(\bfmuv)\%$ and… ▽ More Based on $10.64~\mathrm{fb}^{-1}$ of $e^+e^-$ collision data taken at center-of-mass energies between 4.237 and 4.699 GeV with the BESIII detector, we study the leptonic $D^+_s$ decays using the $e^+e^-\to D^{*+}_{s} D^{*-}_{s}$ process. The branching fractions of $D_s^+\to\ell^+ν_{\ell}\,(\ell=μ,τ)$ are measured to be $\mathcal{B}(D_s^+\toμ^+ν_μ)=(\bfmuv)\%$ and $\mathcal{B}(D_s^+\toτ^+ν_τ)=(\bftauv)\%$, respectively. The product of the decay constant and Cabibbo-Kobayashi-Maskawa matrix element $|V_{cs}|$ is determined to be $f_{D_s^+}|V_{cs}|=(\mufdsxvcsresult)_{μν}~\mathrm{MeV}$ and $f_{D_s^+}|V_{cs}|=(\taufdsxvcsresult))_{τν}~\mathrm{MeV}$, respectively. Taking the value of $|V_{cs}|$ from a global fit in the Standard Model, we obtain ${f_{D^+_s}}=(\mufdsresult)_{μν}$\,MeV and ${f_{D^+_s}}=(\taufdsresult)_{τν}$\,MeV, respectively. Conversely, taking the value for $f_{D_s^+}$ from the latest lattice quantum chromodynamics calculation, we obtain $|V_{cs}| =(\muvcsresult)_{μν}$ and $|V_{cs}| = (\tauvcsresult)_{τν}$, respectively. △ Less

Submitted 16 July, 2024; originally announced July 2024.

Comments: 27 pages, 13 figures

arXiv:2407.11595 [pdf, other]

Machine Learning in Communications: A Road to Intelligent Transmission and Processing

Authors: Shixiong Wang, Geoffrey Ye Li

Abstract: Prior to the era of artificial intelligence and big data, wireless communications primarily followed a conventional research route involving problem analysis, model building and calibration, algorithm design and tuning, and holistic and empirical verification. However, this methodology often encountered limitations when dealing with large-scale and complex problems and managing dynamic and massive… ▽ More Prior to the era of artificial intelligence and big data, wireless communications primarily followed a conventional research route involving problem analysis, model building and calibration, algorithm design and tuning, and holistic and empirical verification. However, this methodology often encountered limitations when dealing with large-scale and complex problems and managing dynamic and massive data, resulting in inefficiencies and limited performance of traditional communication systems and methods. As such, wireless communications have embraced the revolutionary impact of artificial intelligence and machine learning, giving birth to more adaptive, efficient, and intelligent systems and algorithms. This technological shift opens a road to intelligent information transmission and processing. This overview article discusses the typical roles of machine learning in intelligent wireless communications, as well as its features, challenges, and practical considerations. △ Less

Submitted 16 July, 2024; originally announced July 2024.

Comments: Invited Article

arXiv:2407.11434 [pdf, ps, other]

Chain projection ordered categories and DRC-restriction semigroups

Authors: Yin Die, Shoufeng Wang

Abstract: In this paper we provide a theory of chain projection ordered categories and generalize that of chain projection ordered groupoids developed by East and Azeef Muhammed recently. By using chain projection ordered categories, we obtain a structure theorem for DRC-restriction semigroups. More specifically, we prove that the category of DRC-restriction semigroups together with (2,1,1)-homomorphisms is… ▽ More In this paper we provide a theory of chain projection ordered categories and generalize that of chain projection ordered groupoids developed by East and Azeef Muhammed recently. By using chain projection ordered categories, we obtain a structure theorem for DRC-restriction semigroups. More specifically, we prove that the category of DRC-restriction semigroups together with (2,1,1)-homomorphisms is isomorphic to the category of chain projection ordered categories together with chain projection ordered functors. Moreover, some special cases are also considered. As applications of the main theorem, we demonstrate that the existence of free projection-generated DRC-restriction semigroups associated to any strong two-sided projection algebra and reobtain the structures of projection-fundamental DRC-restriction semigroups. Our work may be regarded as an answer for the fourth problem proposed by East and Azeef Muhammed in [Advances in Mathematics, 437 (2024) 109447]. △ Less

Submitted 16 July, 2024; originally announced July 2024.

Comments: 60pages

MSC Class: 20M10; 20M50; 18B40; 20M05; 20M20

arXiv:2407.11424 [pdf, other]

Model Inversion Attacks Through Target-Specific Conditional Diffusion Models

Authors: Ouxiang Li, Yanbin Hao, Zhicai Wang, Bin Zhu, Shuo Wang, Zaixi Zhang, Fuli Feng

Abstract: Model inversion attacks (MIAs) aim to reconstruct private images from a target classifier's training set, thereby raising privacy concerns in AI applications. Previous GAN-based MIAs tend to suffer from inferior generative fidelity due to GAN's inherent flaws and biased optimization within latent space. To alleviate these issues, leveraging on diffusion models' remarkable synthesis capabilities, w… ▽ More Model inversion attacks (MIAs) aim to reconstruct private images from a target classifier's training set, thereby raising privacy concerns in AI applications. Previous GAN-based MIAs tend to suffer from inferior generative fidelity due to GAN's inherent flaws and biased optimization within latent space. To alleviate these issues, leveraging on diffusion models' remarkable synthesis capabilities, we propose Diffusion-based Model Inversion (Diff-MI) attacks. Specifically, we introduce a novel target-specific conditional diffusion model (CDM) to purposely approximate target classifier's private distribution and achieve superior accuracy-fidelity balance. Our method involves a two-step learning paradigm. Step-1 incorporates the target classifier into the entire CDM learning under a pretrain-then-finetune fashion, with creating pseudo-labels as model conditions in pretraining and adjusting specified layers with image predictions in fine-tuning. Step-2 presents an iterative image reconstruction method, further enhancing the attack performance through a combination of diffusion priors and target knowledge. Additionally, we propose an improved max-margin loss that replaces the hard max with top-k maxes, fully leveraging feature information and soft labels from the target classifier. Extensive experiments demonstrate that Diff-MI significantly improves generative fidelity with an average decrease of 20% in FID while maintaining competitive attack accuracy compared to state-of-the-art methods across various datasets and models. We will release our code and models. △ Less

Submitted 16 July, 2024; originally announced July 2024.

Comments: Preprint. Under review

arXiv:2407.11401 [pdf, other]

EndoFinder: Online Image Retrieval for Explainable Colorectal Polyp Diagnosis

Authors: Ruijie Yang, Yan Zhu, Peiyao Fu, Yizhe Zhang, Zhihua Wang, Quanlin Li, Pinghong Zhou, Xian Yang, Shuo Wang

Abstract: Determining the necessity of resecting malignant polyps during colonoscopy screen is crucial for patient outcomes, yet challenging due to the time-consuming and costly nature of histopathology examination. While deep learning-based classification models have shown promise in achieving optical biopsy with endoscopic images, they often suffer from a lack of explainability. To overcome this limitatio… ▽ More Determining the necessity of resecting malignant polyps during colonoscopy screen is crucial for patient outcomes, yet challenging due to the time-consuming and costly nature of histopathology examination. While deep learning-based classification models have shown promise in achieving optical biopsy with endoscopic images, they often suffer from a lack of explainability. To overcome this limitation, we introduce EndoFinder, a content-based image retrieval framework to find the 'digital twin' polyp in the reference database given a newly detected polyp. The clinical semantics of the new polyp can be inferred referring to the matched ones. EndoFinder pioneers a polyp-aware image encoder that is pre-trained on a large polyp dataset in a self-supervised way, merging masked image modeling with contrastive learning. This results in a generic embedding space ready for different downstream clinical tasks based on image retrieval. We validate the framework on polyp re-identification and optical biopsy tasks, with extensive experiments demonstrating that EndoFinder not only achieves explainable diagnostics but also matches the performance of supervised classification models. EndoFinder's reliance on image retrieval has the potential to support diverse downstream decision-making tasks during real-time colonoscopy procedures. △ Less

Submitted 16 July, 2024; originally announced July 2024.

Comments: MICCAI 2024

arXiv:2407.11361 [pdf, other]

Graph Structure Prompt Learning: A Novel Methodology to Improve Performance of Graph Neural Networks

Authors: Zhenhua Huang, Kunhao Li, Shaojie Wang, Zhaohong Jia, Wentao Zhu, Sharad Mehrotra

Abstract: Graph neural networks (GNNs) are widely applied in graph data modeling. However, existing GNNs are often trained in a task-driven manner that fails to fully capture the intrinsic nature of the graph structure, resulting in sub-optimal node and graph representations. To address this limitation, we propose a novel Graph structure Prompt Learning method (GPL) to enhance the training of GNNs, which is… ▽ More Graph neural networks (GNNs) are widely applied in graph data modeling. However, existing GNNs are often trained in a task-driven manner that fails to fully capture the intrinsic nature of the graph structure, resulting in sub-optimal node and graph representations. To address this limitation, we propose a novel Graph structure Prompt Learning method (GPL) to enhance the training of GNNs, which is inspired by prompt mechanisms in natural language processing. GPL employs task-independent graph structure losses to encourage GNNs to learn intrinsic graph characteristics while simultaneously solving downstream tasks, producing higher-quality node and graph representations. In extensive experiments on eleven real-world datasets, after being trained by GPL, GNNs significantly outperform their original performance on node classification, graph classification, and edge prediction tasks (up to 10.28%, 16.5%, and 24.15%, respectively). By allowing GNNs to capture the inherent structural prompts of graphs in GPL, they can alleviate the issue of over-smooth and achieve new state-of-the-art performances, which introduces a novel and effective direction for GNN research with potential applications in various domains. △ Less

Submitted 15 July, 2024; originally announced July 2024.

arXiv:2407.11358 [pdf, other]

SES: Bridging the Gap Between Explainability and Prediction of Graph Neural Networks

Authors: Zhenhua Huang, Kunhao Li, Shaojie Wang, Zhaohong Jia, Wentao Zhu, Sharad Mehrotra

Abstract: Despite the Graph Neural Networks' (GNNs) proficiency in analyzing graph data, achieving high-accuracy and interpretable predictions remains challenging. Existing GNN interpreters typically provide post-hoc explanations disjointed from GNNs' predictions, resulting in misrepresentations. Self-explainable GNNs offer built-in explanations during the training process. However, they cannot exploit the… ▽ More Despite the Graph Neural Networks' (GNNs) proficiency in analyzing graph data, achieving high-accuracy and interpretable predictions remains challenging. Existing GNN interpreters typically provide post-hoc explanations disjointed from GNNs' predictions, resulting in misrepresentations. Self-explainable GNNs offer built-in explanations during the training process. However, they cannot exploit the explanatory outcomes to augment prediction performance, and they fail to provide high-quality explanations of node features and require additional processes to generate explainable subgraphs, which is costly. To address the aforementioned limitations, we propose a self-explained and self-supervised graph neural network (SES) to bridge the gap between explainability and prediction. SES comprises two processes: explainable training and enhanced predictive learning. During explainable training, SES employs a global mask generator co-trained with a graph encoder and directly produces crucial structure and feature masks, reducing time consumption and providing node feature and subgraph explanations. In the enhanced predictive learning phase, mask-based positive-negative pairs are constructed utilizing the explanations to compute a triplet loss and enhance the node representations by contrastive learning. △ Less

Submitted 15 July, 2024; originally announced July 2024.

Comments: 20pages,8pages

arXiv:2407.11007 [pdf, other]

Panacea: A foundation model for clinical trial search, summarization, design, and recruitment

Authors: Jiacheng Lin, Hanwen Xu, Zifeng Wang, Sheng Wang, Jimeng Sun

Abstract: Clinical trials are fundamental in developing new drugs, medical devices, and treatments. However, they are often time-consuming and have low success rates. Although there have been initial attempts to create large language models (LLMs) for clinical trial design and patient-trial matching, these models remain task-specific and not adaptable to diverse clinical trial tasks. To address this challen… ▽ More Clinical trials are fundamental in developing new drugs, medical devices, and treatments. However, they are often time-consuming and have low success rates. Although there have been initial attempts to create large language models (LLMs) for clinical trial design and patient-trial matching, these models remain task-specific and not adaptable to diverse clinical trial tasks. To address this challenge, we propose a clinical trial foundation model named Panacea, designed to handle multiple tasks, including trial search, trial summarization, trial design, and patient-trial matching. We also assemble a large-scale dataset, named TrialAlign, of 793,279 trial documents and 1,113,207 trial-related scientific papers, to infuse clinical knowledge into the model by pre-training. We further curate TrialInstruct, which has 200,866 of instruction data for fine-tuning. These resources enable Panacea to be widely applicable for a range of clinical trial tasks based on user requirements. We evaluated Panacea on a new benchmark, named TrialPanorama, which covers eight clinical trial tasks. Our method performed the best on seven of the eight tasks compared to six cutting-edge generic or medicine-specific LLMs. Specifically, Panacea showed great potential to collaborate with human experts in crafting the design of eligibility criteria, study arms, and outcome measures, in multi-round conversations. In addition, Panacea achieved 14.42% improvement in patient-trial matching, 41.78% to 52.02% improvement in trial search, and consistently ranked at the top for five aspects of trial summarization. Our approach demonstrates the effectiveness of Panacea in clinical trials and establishes a comprehensive resource, including training data, model, and benchmark, for developing clinical trial foundation models, paving the path for AI-based clinical trial development. △ Less

Submitted 25 June, 2024; originally announced July 2024.

arXiv:2407.10990 [pdf]

MedBench: A Comprehensive, Standardized, and Reliable Benchmarking System for Evaluating Chinese Medical Large Language Models

Authors: Mianxin Liu, Jinru Ding, Jie Xu, Weiguo Hu, Xiaoyang Li, Lifeng Zhu, Zhian Bai, Xiaoming Shi, Benyou Wang, Haitao Song, Pengfei Liu, Xiaofan Zhang, Shanshan Wang, Kang Li, Haofen Wang, Tong Ruan, Xuanjing Huang, Xin Sun, Shaoting Zhang

Abstract: Ensuring the general efficacy and goodness for human beings from medical large language models (LLM) before real-world deployment is crucial. However, a widely accepted and accessible evaluation process for medical LLM, especially in the Chinese context, remains to be established. In this work, we introduce "MedBench", a comprehensive, standardized, and reliable benchmarking system for Chinese med… ▽ More Ensuring the general efficacy and goodness for human beings from medical large language models (LLM) before real-world deployment is crucial. However, a widely accepted and accessible evaluation process for medical LLM, especially in the Chinese context, remains to be established. In this work, we introduce "MedBench", a comprehensive, standardized, and reliable benchmarking system for Chinese medical LLM. First, MedBench assembles the currently largest evaluation dataset (300,901 questions) to cover 43 clinical specialties and performs multi-facet evaluation on medical LLM. Second, MedBench provides a standardized and fully automatic cloud-based evaluation infrastructure, with physical separations for question and ground truth. Third, MedBench implements dynamic evaluation mechanisms to prevent shortcut learning and answer remembering. Applying MedBench to popular general and medical LLMs, we observe unbiased, reproducible evaluation results largely aligning with medical professionals' perspectives. This study establishes a significant foundation for preparing the practical applications of Chinese medical LLMs. MedBench is publicly accessible at https://medbench.opencompass.org.cn. △ Less

Submitted 23 June, 2024; originally announced July 2024.

Comments: 25 pages.4 figures

arXiv:2407.10984 [pdf, other]

On the Combination of AI and Wireless Technologies: 3GPP Standardization Progress

Authors: Chen Sun, Tao Cui, Wenqi Zhang, Yingshuang Bai, Shuo Wang, Haojin Li

Abstract: Combing Artificial Intelligence (AI) and wireless communication technologies has become one of the major technologies trends towards 2030. This includes using AI to improve the efficiency of the wireless transmission and supporting AI deployment with wireless networks. In this article, the latest progress of the Third Generation Partnership Project (3GPP) standards development is introduced. Conce… ▽ More Combing Artificial Intelligence (AI) and wireless communication technologies has become one of the major technologies trends towards 2030. This includes using AI to improve the efficiency of the wireless transmission and supporting AI deployment with wireless networks. In this article, the latest progress of the Third Generation Partnership Project (3GPP) standards development is introduced. Concentrating on AI model distributed transfer and AI for Beam Management (BM) with wireless network, we introduce the latest studies and explain how the existing standards should be modified to incorporate the results from academia. △ Less

Submitted 16 June, 2024; originally announced July 2024.

arXiv:2407.10956 [pdf, other]

Spider2-V: How Far Are Multimodal Agents From Automating Data Science and Engineering Workflows?

Authors: Ruisheng Cao, Fangyu Lei, Haoyuan Wu, Jixuan Chen, Yeqiao Fu, Hongcheng Gao, Xinzhuang Xiong, Hanchong Zhang, Yuchen Mao, Wenjing Hu, Tianbao Xie, Hongshen Xu, Danyang Zhang, Sida Wang, Ruoxi Sun, Pengcheng Yin, Caiming Xiong, Ansong Ni, Qian Liu, Victor Zhong, Lu Chen, Kai Yu, Tao Yu

Abstract: Data science and engineering workflows often span multiple stages, from warehousing to orchestration, using tools like BigQuery, dbt, and Airbyte. As vision language models (VLMs) advance in multimodal understanding and code generation, VLM-based agents could potentially automate these workflows by generating SQL queries, Python code, and GUI operations. This automation can improve the productivit… ▽ More Data science and engineering workflows often span multiple stages, from warehousing to orchestration, using tools like BigQuery, dbt, and Airbyte. As vision language models (VLMs) advance in multimodal understanding and code generation, VLM-based agents could potentially automate these workflows by generating SQL queries, Python code, and GUI operations. This automation can improve the productivity of experts while democratizing access to large-scale data analysis. In this paper, we introduce Spider2-V, the first multimodal agent benchmark focusing on professional data science and engineering workflows, featuring 494 real-world tasks in authentic computer environments and incorporating 20 enterprise-level professional applications. These tasks, derived from real-world use cases, evaluate the ability of a multimodal agent to perform data-related tasks by writing code and managing the GUI in enterprise data software systems. To balance realistic simulation with evaluation simplicity, we devote significant effort to developing automatic configurations for task setup and carefully crafting evaluation metrics for each task. Furthermore, we supplement multimodal agents with comprehensive documents of these enterprise data software systems. Our empirical evaluation reveals that existing state-of-the-art LLM/VLM-based agents do not reliably automate full data workflows (14.0% success). Even with step-by-step guidance, these agents still underperform in tasks that require fine-grained, knowledge-intensive GUI actions (16.2%) and involve remote cloud-hosted workspaces (10.6%). We hope that Spider2-V paves the way for autonomous multimodal agents to transform the automation of data science and engineering workflow. Our code and data are available at https://spider2-v.github.io. △ Less

Submitted 15 July, 2024; originally announced July 2024.

Comments: 34 pages, 14 figures, 10 tables

arXiv:2407.10953 [pdf, other]

MMM: Multilingual Mutual Reinforcement Effect Mix Datasets & Test with Open-domain Information Extraction Large Language Models

Authors: Chengguang Gan, Qingyu Yin, Xinyang He, Hanjun Wei, Yunhao Liang, Younghun Lim, Shijian Wang, Hexiang Huang, Qinghao Zhang, Shiwen Ni, Tatsunori Mori

Abstract: The Mutual Reinforcement Effect (MRE) represents a promising avenue in information extraction and multitasking research. Nevertheless, its applicability has been constrained due to the exclusive availability of MRE mix datasets in Japanese, thereby limiting comprehensive exploration by the global research community. To address this limitation, we introduce a Multilingual MRE mix dataset (MMM) that… ▽ More The Mutual Reinforcement Effect (MRE) represents a promising avenue in information extraction and multitasking research. Nevertheless, its applicability has been constrained due to the exclusive availability of MRE mix datasets in Japanese, thereby limiting comprehensive exploration by the global research community. To address this limitation, we introduce a Multilingual MRE mix dataset (MMM) that encompasses 21 sub-datasets in English, Japanese, and Chinese. In this paper, we also propose a method for dataset translation assisted by Large Language Models (LLMs), which significantly reduces the manual annotation time required for dataset construction by leveraging LLMs to translate the original Japanese datasets. Additionally, we have enriched the dataset by incorporating open-domain Named Entity Recognition (NER) and sentence classification tasks. Utilizing this expanded dataset, we developed a unified input-output framework to train an Open-domain Information Extraction Large Language Model (OIELLM). The OIELLM model demonstrates the capability to effectively process novel MMM datasets, exhibiting significant improvements in performance. △ Less

Submitted 15 July, 2024; originally announced July 2024.

Comments: Under Review. 11 pages, 5 Figure

arXiv:2407.10923 [pdf, other]

OPa-Ma: Text Guided Mamba for 360-degree Image Out-painting

Authors: Penglei Gao, Kai Yao, Tiandi Ye, Steven Wang, Yuan Yao, Xiaofeng Wang

Abstract: In this paper, we tackle the recently popular topic of generating 360-degree images given the conventional narrow field of view (NFoV) images that could be taken from a single camera or cellphone. This task aims to predict the reasonable and consistent surroundings from the NFoV images. Existing methods for feature extraction and fusion, often built with transformer-based architectures, incur subs… ▽ More In this paper, we tackle the recently popular topic of generating 360-degree images given the conventional narrow field of view (NFoV) images that could be taken from a single camera or cellphone. This task aims to predict the reasonable and consistent surroundings from the NFoV images. Existing methods for feature extraction and fusion, often built with transformer-based architectures, incur substantial memory usage and computational expense. They also have limitations in maintaining visual continuity across the entire 360-degree images, which could cause inconsistent texture and style generation. To solve the aforementioned issues, we propose a novel text-guided out-painting framework equipped with a State-Space Model called Mamba to utilize its long-sequence modelling and spatial continuity. Furthermore, incorporating textual information is an effective strategy for guiding image generation, enriching the process with detailed context and increasing diversity. Efficiently extracting textual features and integrating them with image attributes presents a significant challenge for 360-degree image out-painting. To address this, we develop two modules, Visual-textual Consistency Refiner (VCR) and Global-local Mamba Adapter (GMA). VCR enhances contextual richness by fusing the modified text features with the image features, while GMA provides adaptive state-selective conditions by capturing the information flow from global to local representations. Our proposed method achieves state-of-the-art performance with extensive experiments on two broadly used 360-degree image datasets, including indoor and outdoor settings. △ Less

Submitted 15 July, 2024; originally announced July 2024.

arXiv:2407.10892 [pdf, other]

First Measurement of Solar $^8$B Neutrino Flux through Coherent Elastic Neutrino-Nucleus Scattering in PandaX-4T

Authors: PandaX Collaboration, Zihao Bo, Wei Chen, Xun Chen, Yunhua Chen, Zhaokan Cheng, Xiangyi Cui, Yingjie Fan, Deqing Fang, Zhixing Gao, Lisheng Geng, Karl Giboni, Xunan Guo, Xuyuan Guo, Zichao Guo, Chencheng Han, Ke Han, Changda He, Jinrong He, Di Huang, Houqi Huang, Junting Huang, Ruquan Hou, Yu Hou, Xiangdong Ji , et al. (77 additional authors not shown)

Abstract: The PandaX-4T liquid xenon detector at the China Jinping Underground Laboratory is used to measure the solar $^8$B neutrino flux by detecting neutrinos through coherent scattering with xenon nuclei. Data samples requiring the coincidence of scintillation and ionization signals (paired), as well as unpaired ionization-only signals (US2), are selected with energy threshold of approximately 1.1 keV (… ▽ More The PandaX-4T liquid xenon detector at the China Jinping Underground Laboratory is used to measure the solar $^8$B neutrino flux by detecting neutrinos through coherent scattering with xenon nuclei. Data samples requiring the coincidence of scintillation and ionization signals (paired), as well as unpaired ionization-only signals (US2), are selected with energy threshold of approximately 1.1 keV (0.33 keV) nuclear recoil energy. Combining the commissioning run and the first science run of PandaX-4T, a total exposure of 1.25 and 1.04 tonne$\cdot$year are collected for the paired and US2, respectively. After unblinding, 3 and 332 events are observed with an expectation of 2.8$\pm$0.5 and 251$\pm$32 background events, for the paired and US2 data, respectively. A combined analysis yields a best-fit $^8$B neutrino signal of 3.5 (75) events from the paired (US2) data sample, with $\sim$37\% uncertainty, and the background-only hypothesis is disfavored at 2.64$σ$ significance. This gives a solar $^8$B neutrino flux of ($8.4\pm3.1$)$\times$10$^6$ cm$^{-2}$s$^{-1}$, consistent with the standard solar model prediction. This is the first indication of solar $^8$B neutrino ``fog'' in a dark matter direct detection experiment. △ Less

Submitted 15 July, 2024; originally announced July 2024.

arXiv:2407.10695 [pdf, other]

IE-NeRF: Inpainting Enhanced Neural Radiance Fields in the Wild

Authors: Shuaixian Wang, Haoran Xu, Yaokun Li, Jiwei Chen, Guang Tan

Abstract: We present a novel approach for synthesizing realistic novel views using Neural Radiance Fields (NeRF) with uncontrolled photos in the wild. While NeRF has shown impressive results in controlled settings, it struggles with transient objects commonly found in dynamic and time-varying scenes. Our framework called \textit{Inpainting Enhanced NeRF}, or \ours, enhances the conventional NeRF by drawing… ▽ More We present a novel approach for synthesizing realistic novel views using Neural Radiance Fields (NeRF) with uncontrolled photos in the wild. While NeRF has shown impressive results in controlled settings, it struggles with transient objects commonly found in dynamic and time-varying scenes. Our framework called \textit{Inpainting Enhanced NeRF}, or \ours, enhances the conventional NeRF by drawing inspiration from the technique of image inpainting. Specifically, our approach extends the Multi-Layer Perceptrons (MLP) of NeRF, enabling it to simultaneously generate intrinsic properties (static color, density) and extrinsic transient masks. We introduce an inpainting module that leverages the transient masks to effectively exclude occlusions, resulting in improved volume rendering quality. Additionally, we propose a new training strategy with frequency regularization to address the sparsity issue of low-frequency transient components. We evaluate our approach on internet photo collections of landmarks, demonstrating its ability to generate high-quality novel views and achieve state-of-the-art performance. △ Less

Submitted 15 July, 2024; originally announced July 2024.

arXiv:2407.10671 [pdf, other]

Qwen2 Technical Report

Authors: An Yang, Baosong Yang, Binyuan Hui, Bo Zheng, Bowen Yu, Chang Zhou, Chengpeng Li, Chengyuan Li, Dayiheng Liu, Fei Huang, Guanting Dong, Haoran Wei, Huan Lin, Jialong Tang, Jialin Wang, Jian Yang, Jianhong Tu, Jianwei Zhang, Jianxin Ma, Jin Xu, Jingren Zhou, Jinze Bai, Jinzheng He, Junyang Lin, Kai Dang , et al. (34 additional authors not shown)

Abstract: This report introduces the Qwen2 series, the latest addition to our large language models and large multimodal models. We release a comprehensive suite of foundational and instruction-tuned language models, encompassing a parameter range from 0.5 to 72 billion, featuring dense models and a Mixture-of-Experts model. Qwen2 surpasses most prior open-weight models, including its predecessor Qwen1.5, a… ▽ More This report introduces the Qwen2 series, the latest addition to our large language models and large multimodal models. We release a comprehensive suite of foundational and instruction-tuned language models, encompassing a parameter range from 0.5 to 72 billion, featuring dense models and a Mixture-of-Experts model. Qwen2 surpasses most prior open-weight models, including its predecessor Qwen1.5, and exhibits competitive performance relative to proprietary models across diverse benchmarks on language understanding, generation, multilingual proficiency, coding, mathematics, and reasoning. The flagship model, Qwen2-72B, showcases remarkable performance: 84.2 on MMLU, 37.9 on GPQA, 64.6 on HumanEval, 89.5 on GSM8K, and 82.4 on BBH as a base language model. The instruction-tuned variant, Qwen2-72B-Instruct, attains 9.1 on MT-Bench, 48.1 on Arena-Hard, and 35.7 on LiveCodeBench. Moreover, Qwen2 demonstrates robust multilingual capabilities, proficient in approximately 30 languages, spanning English, Chinese, Spanish, French, German, Arabic, Russian, Korean, Japanese, Thai, Vietnamese, and more, underscoring its versatility and global reach. To foster community innovation and accessibility, we have made the Qwen2 model weights openly available on Hugging Face and ModelScope, and the supplementary materials including example code on GitHub. These platforms also include resources for quantization, fine-tuning, and deployment, facilitating a wide range of applications and research endeavors. △ Less

Submitted 16 July, 2024; v1 submitted 15 July, 2024; originally announced July 2024.

Comments: 25 pages, 1 figure

arXiv:2407.10629 [pdf, other]

Balancing the Scales: Reinforcement Learning for Fair Classification

Authors: Leon Eshuijs, Shihan Wang, Antske Fokkens

Abstract: Fairness in classification tasks has traditionally focused on bias removal from neural representations, but recent trends favor algorithmic methods that embed fairness into the training process. These methods steer models towards fair performance, preventing potential elimination of valuable information that arises from representation manipulation. Reinforcement Learning (RL), with its capacity fo… ▽ More Fairness in classification tasks has traditionally focused on bias removal from neural representations, but recent trends favor algorithmic methods that embed fairness into the training process. These methods steer models towards fair performance, preventing potential elimination of valuable information that arises from representation manipulation. Reinforcement Learning (RL), with its capacity for learning through interaction and adjusting reward functions to encourage desired behaviors, emerges as a promising tool in this domain. In this paper, we explore the usage of RL to address bias in imbalanced classification by scaling the reward function to mitigate bias. We employ the contextual multi-armed bandit framework and adapt three popular RL algorithms to suit our objectives, demonstrating a novel approach to mitigating bias. △ Less

Submitted 15 July, 2024; originally announced July 2024.

arXiv:2407.10623 [pdf]

Roadmap for Animate Matter

Authors: Giorgio Volpe, Nuno A. M. Araújo, Maria Guix, Mark Miodownik, Ayusman Sen, Samuel Sanchez, Nicolas Martin, Laura Alvarez, Juliane Simmchen, Roberto Di Leonardo, Nicola Pellicciotta, Quentin Martinet, Jérémie Palacci, Wai Kit Ng, Dhruv Saxena, Riccardo Sapienza, Sara Nadine, João F. Mano, Reza Mahdavi, Caroline Beck Adiels, Joe Forth, Christian Santangelo, Stefano Palagi, Ji Min Seok, Victoria A. Webster-Wood , et al. (17 additional authors not shown)

Abstract: Humanity has long sought inspiration from nature to innovate materials and devices. As science advances, nature-inspired materials are becoming part of our lives. Animate materials, characterized by their activity, adaptability, and autonomy, emulate properties of living systems. While only biological materials fully embody these principles, artificial versions are advancing rapidly, promising tra… ▽ More Humanity has long sought inspiration from nature to innovate materials and devices. As science advances, nature-inspired materials are becoming part of our lives. Animate materials, characterized by their activity, adaptability, and autonomy, emulate properties of living systems. While only biological materials fully embody these principles, artificial versions are advancing rapidly, promising transformative impacts across various sectors. This roadmap presents authoritative perspectives on animate materials across different disciplines and scales, highlighting their interdisciplinary nature and potential applications in diverse fields including nanotechnology, robotics and the built environment. It underscores the need for concerted efforts to address shared challenges such as complexity management, scalability, evolvability, interdisciplinary collaboration, and ethical and environmental considerations. The framework defined by classifying materials based on their level of animacy can guide this emerging field encouraging cooperation and responsible development. By unravelling the mysteries of living matter and leveraging its principles, we can design materials and systems that will transform our world in a more sustainable manner. △ Less

Submitted 15 July, 2024; originally announced July 2024.

arXiv:2407.10460 [pdf]

An electromagnetic-thermal-mechanical coupling model of dry-wound HTS coil based on T-A formulation with Neumann boundary condition

Authors: Yunkai Tang, Sijian Wang, Donghui Liu, Huadong Yong, Youhe Zhou

Abstract: The multi-physics coupling behaviours of HTS coils have now received much attention. In particular, the electromagnetic field, temperature field and mechanical deformation interact with each other during quench of high-field magnets. Accurate analysis of coupling behaviours becomes the key to designing magnets and quench protection. In this paper, a multi-physics coupling model is proposed based o… ▽ More The multi-physics coupling behaviours of HTS coils have now received much attention. In particular, the electromagnetic field, temperature field and mechanical deformation interact with each other during quench of high-field magnets. Accurate analysis of coupling behaviours becomes the key to designing magnets and quench protection. In this paper, a multi-physics coupling model is proposed based on T-A formulation with Neumann boundary conditions. It is convenient to analyse the effects of deformation and temperature on the electromagnetic field, as well as the redistribution of the current between the different layers during quench. △ Less

Submitted 15 July, 2024; originally announced July 2024.

arXiv:2407.10374 [pdf, other]

An Empirical Study of Mamba-based Pedestrian Attribute Recognition

Authors: Xiao Wang, Weizhe Kong, Jiandong Jin, Shiao Wang, Ruichong Gao, Qingchuan Ma, Chenglong Li, Jin Tang

Abstract: Current strong pedestrian attribute recognition models are developed based on Transformer networks, which are computationally heavy. Recently proposed models with linear complexity (e.g., Mamba) have garnered significant attention and have achieved a good balance between accuracy and computational cost across a variety of visual tasks. Relevant review articles also suggest that while these models… ▽ More Current strong pedestrian attribute recognition models are developed based on Transformer networks, which are computationally heavy. Recently proposed models with linear complexity (e.g., Mamba) have garnered significant attention and have achieved a good balance between accuracy and computational cost across a variety of visual tasks. Relevant review articles also suggest that while these models can perform well on some pedestrian attribute recognition datasets, they are generally weaker than the corresponding Transformer models. To further tap into the potential of the novel Mamba architecture for PAR tasks, this paper designs and adapts Mamba into two typical PAR frameworks, i.e., the text-image fusion approach and pure vision Mamba multi-label recognition framework. It is found that interacting with attribute tags as additional input does not always lead to an improvement, specifically, Vim can be enhanced, but VMamba cannot. This paper further designs various hybrid Mamba-Transformer variants and conducts thorough experimental validations. These experimental results indicate that simply enhancing Mamba with a Transformer does not always lead to performance improvements but yields better results under certain settings. We hope this empirical study can further inspire research in Mamba for PAR, and even extend into the domain of multi-label recognition, through the design of these network structures and comprehensive experimentation. The source code of this work will be released at \url{https://github.com/Event-AHU/OpenPAR} △ Less

Submitted 14 July, 2024; originally announced July 2024.

Comments: In Peer Review

arXiv:2407.10339 [pdf, other]

Supernova Pointing Capabilities of DUNE

Authors: DUNE Collaboration, A. Abed Abud, B. Abi, R. Acciarri, M. A. Acero, M. R. Adames, G. Adamov, M. Adamowski, D. Adams, M. Adinolfi, C. Adriano, A. Aduszkiewicz, J. Aguilar, B. Aimard, F. Akbar, K. Allison, S. Alonso Monsalve, M. Alrashed, A. Alton, R. Alvarez, T. Alves, H. Amar, P. Amedo, J. Anderson, D. A. Andrade , et al. (1340 additional authors not shown)

Abstract: The determination of the direction of a stellar core collapse via its neutrino emission is crucial for the identification of the progenitor for a multimessenger follow-up. A highly effective method of reconstructing supernova directions within the Deep Underground Neutrino Experiment (DUNE) is introduced. The supernova neutrino pointing resolution is studied by simulating and reconstructing electr… ▽ More The determination of the direction of a stellar core collapse via its neutrino emission is crucial for the identification of the progenitor for a multimessenger follow-up. A highly effective method of reconstructing supernova directions within the Deep Underground Neutrino Experiment (DUNE) is introduced. The supernova neutrino pointing resolution is studied by simulating and reconstructing electron-neutrino charged-current absorption on $^{40}$Ar and elastic scattering of neutrinos on electrons. Procedures to reconstruct individual interactions, including a newly developed technique called ``brems flipping'', as well as the burst direction from an ensemble of interactions are described. Performance of the burst direction reconstruction is evaluated for supernovae happening at a distance of 10 kpc for a specific supernova burst flux model. The pointing resolution is found to be 3.4 degrees at 68% coverage for a perfect interaction-channel classification and a fiducial mass of 40 kton, and 6.6 degrees for a 10 kton fiducial mass respectively. Assuming a 4% rate of charged-current interactions being misidentified as elastic scattering, DUNE's burst pointing resolution is found to be 4.3 degrees (8.7 degrees) at 68% coverage. △ Less

Submitted 14 July, 2024; originally announced July 2024.

Comments: 25 pages, 16 figures

Report number: FERMILAB-PUB-24-0319-LBNF

arXiv:2407.10199 [pdf, other]

Charge radii of $^{11-16}$C, $^{13-17}$N and $^{15-18}$O determined from their charge-changing cross-sections and the mirror-difference charge radii

Authors: J. W. Zhao, B. -H. Sun, I. Tanihata, J. Y. Xu, K. Y. Zhang, A. Prochazka, L. H. Zhu, S. Terashima, J. Meng, L. C. He, C. Y. Liu, G. S. Li, C. G. Lu, W. J. Lin, W. P. Lin, Z. Liu, P. P Ren, Z. Y. Sun, F. Wang, J. Wang, M. Wang, S. T. Wang, X. L. Wei, X. D. Xu, J. C. Zhang , et al. (2 additional authors not shown)

Abstract: Charge-changing cross-sections of $^{11-16}$C, $^{13-17}$N and $^{15-18}$O on a carbon target have been determined at energies around 300 MeV/nucleon. A nucleon separation energy dependent correction factor has been introduced to the Glauber model calculation for extracting the nuclear charge radii from the experimental CCCSs. The charge radii of $^{11}$C, $^{13,16}$N and $^{15}$O thus were determ… ▽ More Charge-changing cross-sections of $^{11-16}$C, $^{13-17}$N and $^{15-18}$O on a carbon target have been determined at energies around 300 MeV/nucleon. A nucleon separation energy dependent correction factor has been introduced to the Glauber model calculation for extracting the nuclear charge radii from the experimental CCCSs. The charge radii of $^{11}$C, $^{13,16}$N and $^{15}$O thus were determined for the first time. With the new radii, we studied the experimental mirror-difference charge radii ($ΔR_{\text {ch}}^{\text {mirror}}$) of $^{11}$B-$^{11}$C, $^{13}$C-$^{13}$N, $^{15}$N-$^{15}$O, $^{17}$N-$^{17}$Ne pairs for the first time. We find that the $ΔR_{\text {ch}}^{\text {mirror}}$, including both bound and weakly bound proton-rich mirror partners, are reproduced by the empirical relation to the isospin asymmetry predicted by the $ab$ $initio$ calculations. △ Less

Submitted 14 July, 2024; originally announced July 2024.

Comments: 3 figures, submitted to Physics Letters B

arXiv:2407.10166 [pdf, other]

A general theory for infernal points in non-Hermitian systems

Authors: Shu-Xuan Wang, Zhongbo Yan

Abstract: The coalescence of eigenstates is a unique phenomena in non-Hermitian systems. Remarkably, it has been noticed in some non-Hermitian systems under open boundary conditions that the whole set of eigenstates can coalesce to only a few eigenstates. In the parameter space, the point at which such a coalescence of macroscopic eigenstates occurs is dubbed as an infernal point. In this paper, based on th… ▽ More The coalescence of eigenstates is a unique phenomena in non-Hermitian systems. Remarkably, it has been noticed in some non-Hermitian systems under open boundary conditions that the whole set of eigenstates can coalesce to only a few eigenstates. In the parameter space, the point at which such a coalescence of macroscopic eigenstates occurs is dubbed as an infernal point. In this paper, based on the non-Bloch band theory and amoeba formulation, we establish the criteria for the presence of infernal points in one-dimensional and higher dimensional open-boundary non-Hermitian systems. In addition, we find an explanation of the extreme localization of the wave functions and unveil the mechanism for the coalescence of enormous eigenstates at the infernal points. Our work provides a general theory for infernal points in open-boundary non-Hermitian systems in arbitrary dimensions, and hence paves the way to study the intriguing infernal points systematically. △ Less

Submitted 14 July, 2024; originally announced July 2024.

Comments: 7+9 pages, 2+3 figures

arXiv:2407.10065 [pdf, other]

An Efficient High-dimensional Gradient Estimator for Stochastic Differential Equations

Authors: Shengbo Wang, Jose Blanchet, Peter Glynn

Abstract: Overparameterized stochastic differential equation (SDE) models have achieved remarkable success in various complex environments, such as PDE-constrained optimization, stochastic control and reinforcement learning, financial engineering, and neural SDEs. These models often feature system evolution coefficients that are parameterized by a high-dimensional vector $θ\in \mathbb{R}^n$, aiming to optim… ▽ More Overparameterized stochastic differential equation (SDE) models have achieved remarkable success in various complex environments, such as PDE-constrained optimization, stochastic control and reinforcement learning, financial engineering, and neural SDEs. These models often feature system evolution coefficients that are parameterized by a high-dimensional vector $θ\in \mathbb{R}^n$, aiming to optimize expectations of the SDE, such as a value function, through stochastic gradient ascent. Consequently, designing efficient gradient estimators for which the computational complexity scales well with $n$ is of significant interest. This paper introduces a novel unbiased stochastic gradient estimator--the generator gradient estimator--for which the computation time remains stable in $n$. In addition to establishing the validity of our methodology for general SDEs with jumps, we also perform numerical experiments that test our estimator in linear-quadratic control problems parameterized by high-dimensional neural networks. The results show a significant improvement in efficiency compared to the widely used pathwise differentiation method: Our estimator achieves near-constant computation times, increasingly outperforms its counterpart as $n$ increases, and does so without compromising estimation variance. These empirical findings highlight the potential of our proposed methodology for optimizing SDEs in contemporary applications. △ Less

Submitted 13 July, 2024; originally announced July 2024.

arXiv:2407.09893 [pdf, other]

Synergistic Multi-Agent Framework with Trajectory Learning for Knowledge-Intensive Tasks

Authors: Shengbin Yue, Siyuan Wang, Wei Chen, Xuanjing Huang, Zhongyu Wei

Abstract: Recent advancements in Large Language Models (LLMs) have led to significant breakthroughs in various natural language processing tasks. However, generating factually consistent responses in knowledge-intensive scenarios remains a challenge due to issues such as hallucination, difficulty in acquiring long-tailed knowledge, and limited memory expansion. This paper introduces SMART, a novel multi-age… ▽ More Recent advancements in Large Language Models (LLMs) have led to significant breakthroughs in various natural language processing tasks. However, generating factually consistent responses in knowledge-intensive scenarios remains a challenge due to issues such as hallucination, difficulty in acquiring long-tailed knowledge, and limited memory expansion. This paper introduces SMART, a novel multi-agent framework that leverages external knowledge to enhance the interpretability and factual consistency of LLM-generated responses. SMART comprises four specialized agents, each performing a specific sub-trajectory action to navigate complex knowledge-intensive tasks. We propose a multi-agent co-training paradigm, Long- and Short-Trajectory Learning, which ensures synergistic collaboration among agents while maintaining fine-grained execution by each agent. Extensive experiments on 5 tasks demonstrate SMART's superior performance compared to previous widely adopted methods. △ Less

Submitted 13 July, 2024; originally announced July 2024.

arXiv:2407.09857 [pdf, other]

IFTR: An Instance-Level Fusion Transformer for Visual Collaborative Perception

Authors: Shaohong Wang, Lu Bin, Xinyu Xiao, Zhiyu Xiang, Hangguan Shan, Eryun Liu

Abstract: Multi-agent collaborative perception has emerged as a widely recognized technology in the field of autonomous driving in recent years. However, current collaborative perception predominantly relies on LiDAR point clouds, with significantly less attention given to methods using camera images. This severely impedes the development of budget-constrained collaborative systems and the exploitation of t… ▽ More Multi-agent collaborative perception has emerged as a widely recognized technology in the field of autonomous driving in recent years. However, current collaborative perception predominantly relies on LiDAR point clouds, with significantly less attention given to methods using camera images. This severely impedes the development of budget-constrained collaborative systems and the exploitation of the advantages offered by the camera modality. This work proposes an instance-level fusion transformer for visual collaborative perception (IFTR), which enhances the detection performance of camera-only collaborative perception systems through the communication and sharing of visual features. To capture the visual information from multiple agents, we design an instance feature aggregation that interacts with the visual features of individual agents using predefined grid-shaped bird eye view (BEV) queries, generating more comprehensive and accurate BEV features. Additionally, we devise a cross-domain query adaptation as a heuristic to fuse 2D priors, implicitly encoding the candidate positions of targets. Furthermore, IFTR optimizes communication efficiency by sending instance-level features, achieving an optimal performance-bandwidth trade-off. We evaluate the proposed IFTR on a real dataset, DAIR-V2X, and two simulated datasets, OPV2V and V2XSet, achieving performance improvements of 57.96%, 9.23% and 12.99% in AP@70 metrics compared to the previous SOTAs, respectively. Extensive experiments demonstrate the superiority of IFTR and the effectiveness of its key components. The code is available at https://github.com/wangsh0111/IFTR. △ Less

Submitted 13 July, 2024; originally announced July 2024.

arXiv:2407.09802 [pdf, ps, other]

Chaos, entanglement and Husimi Q function in quantum Rabi model

Authors: Shangyun Wang, Songbai Chen, Jiliang Jing

Abstract: As one of the famous effects in quantum Rabi model (QRM), Rabi oscillation may lead to the occurrence of quantum dynamics behaviors without classical dynamic counterparts, such as quantum collapse and revival effects. In this paper, we focus on studying whether the entanglement entropy and Husimi Q function, as diagnostic tools for quantum chaos in quantum systems, are invalidated by quantum colla… ▽ More As one of the famous effects in quantum Rabi model (QRM), Rabi oscillation may lead to the occurrence of quantum dynamics behaviors without classical dynamic counterparts, such as quantum collapse and revival effects. In this paper, we focus on studying whether the entanglement entropy and Husimi Q function, as diagnostic tools for quantum chaos in quantum systems, are invalidated by quantum collapse and revival. It is shown that the saturation values of entanglement entropy for initial states located in the chaotic sea of QRM are higher than that in the regular regions. When the system reaches dynamic equilibrium, the Husimi Q function which initial states located in the chaotic sea are more dispersed than that in the regular regions. Moreover, we observe a good correspondence between the the time-average entanglement entropy and classical phase space structures. Our results imply that entanglement entropy and Husimi Q function maintain the function for diagnosing chaos in the QRM and the corresponding principle does not be invalidated by quantum collapse and revival effects in this system. △ Less

Submitted 13 July, 2024; originally announced July 2024.

Comments: 6 pages, 5 figures

arXiv:2407.09793 [pdf, other]

Uncovering Weaknesses in Neural Code Generation

Authors: Xiaoli Lian, Shuaisong Wang, Jieping Ma, Fang Liu, Xin Tan, Lin Shi, Li Zhang

Abstract: Code generation, the task of producing source code from prompts, has seen significant advancements with the advent of pre-trained large language models (PLMs). Despite these achievements, there lacks a comprehensive taxonomy of weaknesses about the benchmark and the generated code, which risks the community's focus on known issues at the cost of under-explored areas. Our systematic study aims to… ▽ More Code generation, the task of producing source code from prompts, has seen significant advancements with the advent of pre-trained large language models (PLMs). Despite these achievements, there lacks a comprehensive taxonomy of weaknesses about the benchmark and the generated code, which risks the community's focus on known issues at the cost of under-explored areas. Our systematic study aims to fill this gap by evaluating five state-of-the-art PLMs: three larger models, CodeGen2.5 with 7 billion parameters, CodeGeeX2 with 6 billion parameters, GPT-4 Turbo, and two smaller ones, UnixCoder with 110 million parameters and CodeT5 base with 220 million parameters, across three popular datasets, CoNaLa, HumanEval Plus, and DS-1000. We assess the quality of generated code using match-based and execution-based metrics, then conduct thematic analysis to develop a taxonomy of nine types of weaknesses. We dissected weakness distributions in both larger and smaller models, applying an extensive methodology that encompasses model-specific as well as collective analysis (union and intersection) across models. Our research uncovers three salient findings: 1. In the CoNaLa dataset, inaccurate prompts are a notable problem, causing all large models to fail in 26.84% of cases, with even higher failure rates of 40% for smaller models; 2. Missing pivotal semantics is a pervasive issue across benchmarks, with one or more large models omitting key semantics in 65.78% of CoNaLa tasks, and similarly high occurrences in HumanEval Plus (66.09%) and DS-1000 (80.51%); 3. All models struggle with proper API usage, a challenge amplified by vague or complex prompts. Our findings aim to steer researchers towards addressing specific weaknesses and challenges in code generation. Furthermore, our annotations can offer a targeted benchmark subset for detailed analysis. △ Less

Submitted 13 July, 2024; originally announced July 2024.

arXiv:2407.09553 [pdf, other]

RESVMUNetX: A Low-Light Enhancement Network Based on VMamba

Authors: Shuang Wang, Qingchuan Tao, Zhenming Tang

Abstract: This study presents ResVMUNetX, a novel image enhancement network for low-light conditions, addressing the limitations of existing deep learning methods in capturing long-range image information. Leveraging error regression and an efficient VMamba architecture, ResVMUNetX enhances brightness, recovers structural details, and removes noise through a two-step process involving direct pixel addition… ▽ More This study presents ResVMUNetX, a novel image enhancement network for low-light conditions, addressing the limitations of existing deep learning methods in capturing long-range image information. Leveraging error regression and an efficient VMamba architecture, ResVMUNetX enhances brightness, recovers structural details, and removes noise through a two-step process involving direct pixel addition and a specialized Denoise CNN module. Demonstrating superior performance on the LOL dataset, ResVMUNetX significantly improves image clarity and quality with reduced computational demands, achieving real-time processing speeds of up to 70 frames per second. This confirms its effectiveness in enhancing low-light images and its potential for practical, real-time applications. △ Less

Submitted 28 June, 2024; originally announced July 2024.

arXiv:2407.09380 [pdf, other]

Measuring the anisotropies in astrophysical and cosmological gravitational-wave backgrounds with Taiji and LISA networks

Authors: Zhi-Chao Zhao, Sai Wang

Abstract: We investigate the capabilities of space-based gravitational-wave detector networks, specifically Taiji and LISA, to measure the anisotropies in stochastic gravitational-wave backgrounds (SGWBs), which are characterized by the angular power spectrum. We find that a detector network can improve the measurement precision of anisotropies by at most fourteen orders of magnitude, depending on the angul… ▽ More We investigate the capabilities of space-based gravitational-wave detector networks, specifically Taiji and LISA, to measure the anisotropies in stochastic gravitational-wave backgrounds (SGWBs), which are characterized by the angular power spectrum. We find that a detector network can improve the measurement precision of anisotropies by at most fourteen orders of magnitude, depending on the angular multipoles. By doing so, we can enhance our understanding of the physical origins of SGWB, both in astrophysical and cosmological contexts. We assess the prospects of the detector networks for measuring the parameters of angular power spectrum. We further find an inevitable effect of cosmic variance, which strengthens the importance of configuring detector networks. Our findings also suggest a potential detection of the kinematic dipole due to Doppler boosting of SGWB. △ Less

Submitted 12 July, 2024; originally announced July 2024.

Comments: 10 pages, 8 figures

arXiv:2407.09252 [pdf, other]

Context Embeddings for Efficient Answer Generation in RAG

Authors: David Rau, Shuai Wang, Hervé Déjean, Stéphane Clinchant

Abstract: Retrieval-Augmented Generation (RAG) allows overcoming the limited knowledge of LLMs by extending the input with external information. As a consequence, the contextual inputs to the model become much longer which slows down decoding time directly translating to the time a user has to wait for an answer. We address this challenge by presenting COCOM, an effective context compression method, reducin… ▽ More Retrieval-Augmented Generation (RAG) allows overcoming the limited knowledge of LLMs by extending the input with external information. As a consequence, the contextual inputs to the model become much longer which slows down decoding time directly translating to the time a user has to wait for an answer. We address this challenge by presenting COCOM, an effective context compression method, reducing long contexts to only a handful of Context Embeddings speeding up the generation time by a large margin. Our method allows for different compression rates trading off decoding time for answer quality. Compared to earlier methods, COCOM allows for handling multiple contexts more effectively, significantly reducing decoding time for long inputs. Our method demonstrates a speed-up of up to 5.69 $\times$ while achieving higher performance compared to existing efficient context compression methods. △ Less

Submitted 12 July, 2024; originally announced July 2024.

Comments: 10 pages

arXiv:2407.09053 [pdf, other]

Navi2Gaze: Leveraging Foundation Models for Navigation and Target Gazing

Authors: Jun Zhu, Zihao Du, Haotian Xu, Fengbo Lan, Zilong Zheng, Bo Ma, Shengjie Wang, Tao Zhang

Abstract: Task-aware navigation continues to be a challenging area of research, especially in scenarios involving open vocabulary. Previous studies primarily focus on finding suitable locations for task completion, often overlooking the importance of the robot's pose. However, the robot's orientation is crucial for successfully completing tasks because of how objects are arranged (e.g., to open a refrigerat… ▽ More Task-aware navigation continues to be a challenging area of research, especially in scenarios involving open vocabulary. Previous studies primarily focus on finding suitable locations for task completion, often overlooking the importance of the robot's pose. However, the robot's orientation is crucial for successfully completing tasks because of how objects are arranged (e.g., to open a refrigerator door). Humans intuitively navigate to objects with the right orientation using semantics and common sense. For instance, when opening a refrigerator, we naturally stand in front of it rather than to the side. Recent advances suggest that Vision-Language Models (VLMs) can provide robots with similar common sense. Therefore, we develop a VLM-driven method called Navigation-to-Gaze (Navi2Gaze) for efficient navigation and object gazing based on task descriptions. This method uses the VLM to score and select the best pose from numerous candidates automatically. In evaluations on multiple photorealistic simulation benchmarks, Navi2Gaze significantly outperforms existing approaches and precisely determines the optimal orientation relative to target objects. △ Less

Submitted 12 July, 2024; originally announced July 2024.

arXiv:2407.08990 [pdf, other]

Dynamic neural network with memristive CIM and CAM for 2D and 3D vision

Authors: Yue Zhang, Woyu Zhang, Shaocong Wang, Ning Lin, Yifei Yu, Yangu He, Bo Wang, Hao Jiang, Peng Lin, Xiaoxin Xu, Xiaojuan Qi, Zhongrui Wang, Xumeng Zhang, Dashan Shang, Qi Liu, Kwang-Ting Cheng, Ming Liu

Abstract: The brain is dynamic, associative and efficient. It reconfigures by associating the inputs with past experiences, with fused memory and processing. In contrast, AI models are static, unable to associate inputs with past experiences, and run on digital computers with physically separated memory and processing. We propose a hardware-software co-design, a semantic memory-based dynamic neural network… ▽ More The brain is dynamic, associative and efficient. It reconfigures by associating the inputs with past experiences, with fused memory and processing. In contrast, AI models are static, unable to associate inputs with past experiences, and run on digital computers with physically separated memory and processing. We propose a hardware-software co-design, a semantic memory-based dynamic neural network (DNN) using memristor. The network associates incoming data with the past experience stored as semantic vectors. The network and the semantic memory are physically implemented on noise-robust ternary memristor-based Computing-In-Memory (CIM) and Content-Addressable Memory (CAM) circuits, respectively. We validate our co-designs, using a 40nm memristor macro, on ResNet and PointNet++ for classifying images and 3D points from the MNIST and ModelNet datasets, which not only achieves accuracy on par with software but also a 48.1% and 15.9% reduction in computational budget. Moreover, it delivers a 77.6% and 93.3% reduction in energy consumption. △ Less

Submitted 12 July, 2024; originally announced July 2024.

Comments: In press

arXiv:2407.08936 [pdf, ps, other]

HHLPar: Automated Theorem Prover for Parallel Hybrid Communicating Sequential Processes

Authors: Xiangyu Jin, Bohua Zhan, Shuling Wang, Naijun Zhan

Abstract: We present a tool called HHLPar for verifying hybrid systems modelled in Hybrid Communicating Sequential Processes (HCSP). HHLPar is built upon a Hybrid Hoare Logic for HCSP, which is able to reason about continuous-time properties of differential equations, as well as communication and parallel composition of parallel HCSP processes with the help of parameterised trace assertions and their synchr… ▽ More We present a tool called HHLPar for verifying hybrid systems modelled in Hybrid Communicating Sequential Processes (HCSP). HHLPar is built upon a Hybrid Hoare Logic for HCSP, which is able to reason about continuous-time properties of differential equations, as well as communication and parallel composition of parallel HCSP processes with the help of parameterised trace assertions and their synchronization. The logic was formalised and proved to be sound in Isabelle/HOL, which constitutes a trustworthy foundation for the verification conducted by HHLPar. HHLPar implements the Hybrid Hoare Logic in Python and supports automated verification: On one hand, it provides functions for symbolically decomposing HCSP processes, generating specifications for separate sequential processes and then composing them via synchronization to obtain the final specification for the whole parallel HCSP processes; On the other hand, it is integrated with external solvers for handling differential equations and real arithmetic properties. We have conducted experiments on a simplified cruise control system to validate the performance of the tool. △ Less

Submitted 11 July, 2024; originally announced July 2024.

arXiv:2407.08924 [pdf, other]

Disassembling Obfuscated Executables with LLM

Authors: Huanyao Rong, Yue Duan, Hang Zhang, XiaoFeng Wang, Hongbo Chen, Shengchen Duan, Shen Wang

Abstract: Disassembly is a challenging task, particularly for obfuscated executables containing junk bytes, which is designed to induce disassembly errors. Existing solutions rely on heuristics or leverage machine learning techniques, but only achieve limited successes. Fundamentally, such obfuscation cannot be defeated without in-depth understanding of the binary executable's semantics, which is made possi… ▽ More Disassembly is a challenging task, particularly for obfuscated executables containing junk bytes, which is designed to induce disassembly errors. Existing solutions rely on heuristics or leverage machine learning techniques, but only achieve limited successes. Fundamentally, such obfuscation cannot be defeated without in-depth understanding of the binary executable's semantics, which is made possible by the emergence of large language models (LLMs). In this paper, we present DisasLLM, a novel LLM-driven dissembler to overcome the challenge in analyzing obfuscated executables. DisasLLM consists of two components: an LLM-based classifier that determines whether an instruction in an assembly code snippet is correctly decoded, and a disassembly strategy that leverages this model to disassemble obfuscated executables end-to-end. We evaluated DisasLLM on a set of heavily obfuscated executables, which is shown to significantly outperform other state-of-the-art disassembly solutions. △ Less

Submitted 11 July, 2024; originally announced July 2024.

arXiv:2407.08771 [pdf, other]

On $3$-graphs with vanishing codegree Turán density

Authors: Laihao Ding, Ander Lamaison, Hong Liu, Shuaichao Wang, Haotian Yang

Abstract: For a $k$-uniform hypergraph (or simply $k$-graph) $F$, the codegree Turán density $π_{\mathrm{co}}(F)$ is the supremum over all $α$ such that there exist arbitrarily large $n$-vertex $F$-free $k$-graphs $H$ in which every $(k-1)$-subset of $V(H)$ is contained in at least $αn$ edges. Recently, it was proved that for every $3$-graph $F$, $π_{\mathrm{co}}(F)=0$ implies $π_{\therefore}(F)=0$, where… ▽ More For a $k$-uniform hypergraph (or simply $k$-graph) $F$, the codegree Turán density $π_{\mathrm{co}}(F)$ is the supremum over all $α$ such that there exist arbitrarily large $n$-vertex $F$-free $k$-graphs $H$ in which every $(k-1)$-subset of $V(H)$ is contained in at least $αn$ edges. Recently, it was proved that for every $3$-graph $F$, $π_{\mathrm{co}}(F)=0$ implies $π_{\therefore}(F)=0$, where $π_{\therefore}(F)$ is the uniform Turán density of $F$ and is defined as the supremum over all $d$ such that there are infinitely many $F$-free $k$-graphs $H$ satisfying that any induced linear-size subhypergraph of $H$ has edge density at least $d$. In this paper, we introduce a layered structure for $3$-graphs which allows us to obtain the reverse implication: every layered $3$-graph $F$ with $π_{\therefore}(F)=0$ satisfies $π_{\mathrm{co}}(F)=0$. Along the way, we answer in the negative a question of Falgas-Ravry, Pikhurko, Vaughan and Volec [J. London Math. Soc., 2023] about whether $π_{\therefore}(F)\leqπ_{\mathrm{co}}(F)$ always holds. In particular, we construct counterexamples $F$ with positive but arbitrarily small $π_{\mathrm{co}}(F)$ while having $π_{\therefore}(F)\ge 4/27$. △ Less

Submitted 11 July, 2024; originally announced July 2024.

Comments: 17 pages. This work will be merged with arXiv:2312.02879

MSC Class: 05C35; 05C65

arXiv:2407.08770 [pdf, other]

Model Surgery: Modulating LLM's Behavior Via Simple Parameter Editing

Authors: Huanqian Wang, Yang Yue, Rui Lu, Jingxin Shi, Andrew Zhao, Shenzhi Wang, Shiji Song, Gao Huang

Abstract: Large Language Models (LLMs) have demonstrated great potential as generalist assistants, showcasing powerful task understanding and problem-solving capabilities. To deploy LLMs as AI assistants, it is crucial that these models exhibit desirable behavioral traits, such as non-toxicity and resilience against jailbreak attempts. Current methods for detoxification or preventing jailbreaking usually in… ▽ More Large Language Models (LLMs) have demonstrated great potential as generalist assistants, showcasing powerful task understanding and problem-solving capabilities. To deploy LLMs as AI assistants, it is crucial that these models exhibit desirable behavioral traits, such as non-toxicity and resilience against jailbreak attempts. Current methods for detoxification or preventing jailbreaking usually involve Supervised Fine-Tuning (SFT) or Reinforcement Learning from Human Feedback (RLHF), which requires finetuning billions of parameters through gradient descent with substantial computation cost. Furthermore, models modified through SFT and RLHF may deviate from the pretrained models, potentially leading to a degradation in foundational LLM capabilities. In this paper, we observe that surprisingly, directly editing a small subset of parameters can effectively modulate specific behaviors of LLMs, such as detoxification and resistance to jailbreaking. Specifically, for a behavior that we aim to avoid, we employ a linear classifier, which we term the behavior probe, to classify binary behavior labels within the hidden state space of the LLM. Using this probe, we introduce an algorithm to identify a critical subset of LLM parameters that significantly influence this targeted behavior. Then we directly edit these selected parameters by shifting them towards the behavior probe. Such a direct parameter editing method necessitates only inference-level computational resources. Experiments demonstrate that in the representative detoxification task, our approach achieves reductions of up to 90.0\% in toxicity on the RealToxicityPrompts dataset and 49.2\% on ToxiGen, while maintaining the LLM's general capabilities in areas such as common sense, question answering, and mathematics. Our code is available at https://github.com/lucywang720/model-surgery. △ Less

Submitted 11 July, 2024; originally announced July 2024.

Comments: 23 pages, 14 figures

MSC Class: 68T50 (Primary) 68T07; 62M45 (Secondary) ACM Class: I.2.7

arXiv:2407.08664 [pdf, other]

MBD-NODE: Physics-informed data-driven modeling and simulation of constrained multibody systems

Authors: Jingquan Wang, Shu Wang, Huzaifa Mustafa Unjhawala, Jinlong Wu, Dan Negrut

Abstract: We describe a framework that can integrate prior physical information, e.g., the presence of kinematic constraints, to support data-driven simulation in multi-body dynamics. Unlike other approaches, e.g., Fully-connected Neural Network (FCNN) or Recurrent Neural Network (RNN)-based methods that are used to model the system states directly, the proposed approach embraces a Neural Ordinary Different… ▽ More We describe a framework that can integrate prior physical information, e.g., the presence of kinematic constraints, to support data-driven simulation in multi-body dynamics. Unlike other approaches, e.g., Fully-connected Neural Network (FCNN) or Recurrent Neural Network (RNN)-based methods that are used to model the system states directly, the proposed approach embraces a Neural Ordinary Differential Equation (NODE) paradigm that models the derivatives of the system states. A central part of the proposed methodology is its capacity to learn the multibody system dynamics from prior physical knowledge and constraints combined with data inputs. This learning process is facilitated by a constrained optimization approach, which ensures that physical laws and system constraints are accounted for in the simulation process. The models, data, and code for this work are publicly available as open source at https://github.com/uwsbel/sbel-reproducibility/tree/master/2024/MNODE-code. △ Less

Submitted 11 July, 2024; originally announced July 2024.

arXiv:2407.08570 [pdf, ps, other]

Determination of the QCD running coupling in the entire perturbative regime from a single experiment using the Principle of Maximum Conformality

Authors: Leonardo Di Giustino, Stanley J. Brodsky, Philip G. Ratcliffe, Sheng-Quan Wang, Xing-Gang Wu

Abstract: We present a new approach for determining the strong coupling $α_s(Q)$ over the entire perturbative range of validity, for scales from $Λ_{\mathrm{QCD}}$ up to the Planck scale ${\sim}10^{19}$\,GeV, with the highest precision and using the data of just a single experiment. The results obtained with this method are consistent with world averages and exhibit improved precision with respect to previo… ▽ More We present a new approach for determining the strong coupling $α_s(Q)$ over the entire perturbative range of validity, for scales from $Λ_{\mathrm{QCD}}$ up to the Planck scale ${\sim}10^{19}$\,GeV, with the highest precision and using the data of just a single experiment. The results obtained with this method are consistent with world averages and exhibit improved precision with respect to previous reports. \pacs{12.38.Bx, 13.66.Bc, 13.66.Jn, 13.87.-a,11.10.Gh △ Less

Submitted 11 July, 2024; originally announced July 2024.

Comments: 7 pages, 4 figures

Report number: SLAC-PUB-17782

arXiv:2407.08474 [pdf, other]

DIDUP: Dynamic Iterative Development for UI Prototyping

Authors: Jenny Ma, Karthik Sreedhar, Vivian Liu, Sitong Wang, Pedro Alejandro Perez, Lydia B. Chilton

Abstract: Large language models (LLMs) are remarkably good at writing code. A particularly valuable case of human-LLM collaboration is code-based UI prototyping, a method for creating interactive prototypes that allows users to view and fully engage with a user interface. We conduct a formative study of GPT Pilot, a leading LLM-generated code-prototyping system, and find that its inflexibility towards chang… ▽ More Large language models (LLMs) are remarkably good at writing code. A particularly valuable case of human-LLM collaboration is code-based UI prototyping, a method for creating interactive prototypes that allows users to view and fully engage with a user interface. We conduct a formative study of GPT Pilot, a leading LLM-generated code-prototyping system, and find that its inflexibility towards change once development has started leads to weaknesses in failure prevention and dynamic planning; it closely resembles the linear workflow of the waterfall model. We introduce DIDUP, a system for code-based UI prototyping that follows an iterative spiral model, which takes changes and iterations that come up during the development process into account. We propose three novel mechanisms for LLM-generated code-prototyping systems: (1) adaptive planning, where plans should be dynamic and reflect changes during implementation, (2) code injection, where the system should write a minimal amount of code and inject it instead of rewriting code so users have a better mental model of the code evolution, and (3) lightweight state management, a simplified version of source control so users can quickly revert to different working states. Together, this enables users to rapidly develop and iterate on prototypes. △ Less

Submitted 11 July, 2024; originally announced July 2024.

Comments: 5 pages, 3 figures

arXiv:2407.08164 [pdf, other]

Hierarchical Consensus-Based Multi-Agent Reinforcement Learning for Multi-Robot Cooperation Tasks

Authors: Pu Feng, Junkang Liang, Size Wang, Xin Yu, Rongye Shi, Wenjun Wu

Abstract: In multi-agent reinforcement learning (MARL), the Centralized Training with Decentralized Execution (CTDE) framework is pivotal but struggles due to a gap: global state guidance in training versus reliance on local observations in execution, lacking global signals. Inspired by human societal consensus mechanisms, we introduce the Hierarchical Consensus-based Multi-Agent Reinforcement Learning (HC-… ▽ More In multi-agent reinforcement learning (MARL), the Centralized Training with Decentralized Execution (CTDE) framework is pivotal but struggles due to a gap: global state guidance in training versus reliance on local observations in execution, lacking global signals. Inspired by human societal consensus mechanisms, we introduce the Hierarchical Consensus-based Multi-Agent Reinforcement Learning (HC-MARL) framework to address this limitation. HC-MARL employs contrastive learning to foster a global consensus among agents, enabling cooperative behavior without direct communication. This approach enables agents to form a global consensus from local observations, using it as an additional piece of information to guide collaborative actions during execution. To cater to the dynamic requirements of various tasks, consensus is divided into multiple layers, encompassing both short-term and long-term considerations. Short-term observations prompt the creation of an immediate, low-layer consensus, while long-term observations contribute to the formation of a strategic, high-layer consensus. This process is further refined through an adaptive attention mechanism that dynamically adjusts the influence of each consensus layer. This mechanism optimizes the balance between immediate reactions and strategic planning, tailoring it to the specific demands of the task at hand. Extensive experiments and real-world applications in multi-robot systems showcase our framework's superior performance, marking significant advancements over baselines. △ Less

Submitted 10 July, 2024; originally announced July 2024.

Comments: 8 pages, 10 figures. Accepted for presentation at the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2024)

arXiv:2407.07651 [pdf, other]

Study of the decay and production properties of $D_{s1}(2536)$ and $D_{s2}^*(2573)$

Authors: M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere, A. Brueggemann , et al. (645 additional authors not shown)

Abstract: The $e^+e^-\rightarrow D_s^+D_{s1}(2536)^-$ and $e^+e^-\rightarrow D_s^+D^*_{s2}(2573)^-$ processes are studied using data samples collected with the BESIII detector at center-of-mass energies from 4.530 to 4.946~GeV. The absolute branching fractions of $D_{s1}(2536)^- \rightarrow \bar{D}^{*0}K^-$ and $D_{s2}^*(2573)^- \rightarrow \bar{D}^0K^-$ are measured for the first time to be… ▽ More The $e^+e^-\rightarrow D_s^+D_{s1}(2536)^-$ and $e^+e^-\rightarrow D_s^+D^*_{s2}(2573)^-$ processes are studied using data samples collected with the BESIII detector at center-of-mass energies from 4.530 to 4.946~GeV. The absolute branching fractions of $D_{s1}(2536)^- \rightarrow \bar{D}^{*0}K^-$ and $D_{s2}^*(2573)^- \rightarrow \bar{D}^0K^-$ are measured for the first time to be $(35.9\pm 4.8\pm 3.5)\%$ and $(37.4\pm 3.1\pm 4.6)\%$, respectively. The measurements are in tension with predictions based on the assumption that the $D_{s1}(2536)$ and $D_{s2}^*(2573)$ are dominated by a bare $c\bar{s}$ component. The $e^+e^-\rightarrow D_s^+D_{s1}(2536)^-$ and $e^+e^-\rightarrow D_s^+D^*_{s2}(2573)^-$ cross sections are measured, and a resonant structure at around 4.6~GeV with a width of 50~MeV is observed for the first time with a statistical significance of $15σ$ in the $e^+e^-\rightarrow D_s^+D^*_{s2}(2573)^-$ process. It could be the $Y(4626)$ found by the Belle collaboration in the $D_s^+D_{s1}(2536)^{-}$ final state, since they have similar masses and widths. There is also evidence for a structure at around 4.75~GeV in both processes. △ Less

Submitted 10 July, 2024; originally announced July 2024.

arXiv:2407.07472 [pdf, other]

Rectifier: Code Translation with Corrector via LLMs

Authors: Xin Yin, Chao Ni, Tien N. Nguyen, Shaohua Wang, Xiaohu Yang

Abstract: Software migration is garnering increasing attention with the evolution of software and society. Early studies mainly relied on handcrafted translation rules to translate between two languages, the translation process is error-prone and time-consuming. In recent years, researchers have begun to explore the use of pre-trained large language models (LLMs) in code translation. However, code translati… ▽ More Software migration is garnering increasing attention with the evolution of software and society. Early studies mainly relied on handcrafted translation rules to translate between two languages, the translation process is error-prone and time-consuming. In recent years, researchers have begun to explore the use of pre-trained large language models (LLMs) in code translation. However, code translation is a complex task that LLMs would generate mistakes during code translation, they all produce certain types of errors when performing code translation tasks, which include (1) compilation error, (2) runtime error, (3) functional error, and (4) non-terminating execution. We found that the root causes of these errors are very similar (e.g. failure to import packages, errors in loop boundaries, operator errors, and more). In this paper, we propose a general corrector, namely Rectifier, which is a micro and universal model for repairing translation errors. It learns from errors generated by existing LLMs and can be widely applied to correct errors generated by any LLM. The experimental results on translation tasks between C++, Java, and Python show that our model has effective repair ability, and cross experiments also demonstrate the robustness of our method. △ Less

Submitted 10 July, 2024; originally announced July 2024.

Comments: arXiv admin note: text overlap with arXiv:2308.03109, arXiv:2302.03908 by other authors

arXiv:2407.06631 [pdf, other]

A Systematic Review of Echo Chamber Research: Comparative Analysis of Conceptualizations, Operationalizations, and Varying Outcomes

Authors: David Hartmann, Lena Pohlmann, Sonja Mei Wang, Bettina Berendt

Abstract: This systematic review synthesizes current research on echo chambers and filter bubbles to highlight the reasons for the dissent in echo chamber research on the existence, antecedents, and effects of the phenomenon. The review of 112 studies reveals that the lack of consensus in echo chamber research is based on different conceptualizations and operationalizations of echo chambers. While studies t… ▽ More This systematic review synthesizes current research on echo chambers and filter bubbles to highlight the reasons for the dissent in echo chamber research on the existence, antecedents, and effects of the phenomenon. The review of 112 studies reveals that the lack of consensus in echo chamber research is based on different conceptualizations and operationalizations of echo chambers. While studies that have conceptualized echo chambers with homophily and utilized data-driven computational social science (CSS) methods have confirmed the echo chamber hypothesis and polarization effects in social media, content exposure studies and surveys that have explored the full spectrum of media exposure have rejected it. Most of these studies have been conducted in the United States, and the review emphasizes the need for a more comprehensive understanding of how echo chambers work in systems with more than two parties and outside the Global North. To advance our understanding of this phenomenon, future research should prioritize conducting more cross-platform studies, considering algorithmic filtering changes through continuous auditing, and examining the causal direction of the association between polarization, fragmentation, and the establishment of online echo chambers. The review also provides the advantages and disadvantages of different operationalizations and makes recommendations for studies in the European Union (EU), which will become possible with the upcoming Digital Services Act (DSA). Overall, this systematic review contributes to the ongoing scholarly discussion on the existence, antecedents, and effects of echo chambers and filter bubbles. △ Less

Submitted 9 July, 2024; originally announced July 2024.

arXiv:2407.06573 [pdf, other]

LLM for Mobile: An Initial Roadmap

Authors: Daihang Chen, Yonghui Liu, Mingyi Zhou, Yanjie Zhao, Haoyu Wang, Shuai Wang, Xiao Chen, Tegawendé F. Bissyandé, Jacques Klein, Li Li

Abstract: When mobile meets LLMs, mobile app users deserve to have more intelligent usage experiences. For this to happen, we argue that there is a strong need to appl LLMs for the mobile ecosystem. We therefore provide a research roadmap for guiding our fellow researchers to achieve that as a whole. In this roadmap, we sum up six directions that we believe are urgently required for research to enable nativ… ▽ More When mobile meets LLMs, mobile app users deserve to have more intelligent usage experiences. For this to happen, we argue that there is a strong need to appl LLMs for the mobile ecosystem. We therefore provide a research roadmap for guiding our fellow researchers to achieve that as a whole. In this roadmap, we sum up six directions that we believe are urgently required for research to enable native intelligence in mobile devices. In each direction, we further summarize the current research progress and the gaps that still need to be filled by our fellow researchers. △ Less

Submitted 9 July, 2024; originally announced July 2024.

arXiv:2407.05763 [pdf, other]

Homogeneous Distributed Observers for Quasilinear Systems

Authors: Min Li, Andrey Polyakov, Siyuan Wang, Gang Zheng

Abstract: The problem of finite/fixed-time cooperative state estimation is considered for a class of quasilinear systems with nonlinearities satisfying a Hölder condition. A strongly connected nonlinear distributed observer is designed under the assumption of global observability. By proper parameter tuning with linear matrix inequalities, the observer error equation possesses finite/fixed-time stability in… ▽ More The problem of finite/fixed-time cooperative state estimation is considered for a class of quasilinear systems with nonlinearities satisfying a Hölder condition. A strongly connected nonlinear distributed observer is designed under the assumption of global observability. By proper parameter tuning with linear matrix inequalities, the observer error equation possesses finite/fixed-time stability in the perturbation-free case and input-to-state stability with respect to bounded perturbations. Numerical simulations are performed to validate this design. △ Less

Submitted 8 July, 2024; originally announced July 2024.

Comments: This manuscript has been submitted for a possible journal publication

arXiv:2407.05639 [pdf]

Deep Learning-based Anomaly Detection and Log Analysis for Computer Networks

Authors: Shuzhan Wang, Ruxue Jiang, Zhaoqi Wang, Yan Zhou

Abstract: Computer network anomaly detection and log analysis, as an important topic in the field of network security, has been a key task to ensure network security and system reliability. First, existing network anomaly detection and log analysis methods are often challenged by high-dimensional data and complex network topologies, resulting in unstable performance and high false-positive rates. In additio… ▽ More Computer network anomaly detection and log analysis, as an important topic in the field of network security, has been a key task to ensure network security and system reliability. First, existing network anomaly detection and log analysis methods are often challenged by high-dimensional data and complex network topologies, resulting in unstable performance and high false-positive rates. In addition, traditional methods are usually difficult to handle time-series data, which is crucial for anomaly detection and log analysis. Therefore, we need a more efficient and accurate method to cope with these problems. To compensate for the shortcomings of current methods, we propose an innovative fusion model that integrates Isolation Forest, GAN (Generative Adversarial Network), and Transformer with each other, and each of them plays a unique role. Isolation Forest is used to quickly identify anomalous data points, and GAN is used to generate synthetic data with the real data distribution characteristics to augment the training dataset, while the Transformer is used for modeling and context extraction on time series data. The synergy of these three components makes our model more accurate and robust in anomaly detection and log analysis tasks. We validate the effectiveness of this fusion model in an extensive experimental evaluation. Experimental results show that our model significantly improves the accuracy of anomaly detection while reducing the false alarm rate, which helps to detect potential network problems in advance. The model also performs well in the log analysis task and is able to quickly identify anomalous behaviors, which helps to improve the stability of the system. The significance of this study is that it introduces advanced deep learning techniques, which work anomaly detection and log analysis. △ Less

Submitted 8 July, 2024; originally announced July 2024.

Comments: 38 pages

arXiv:2407.05589 [pdf, other]

Improving the trainability of VQE on NISQ computers for solving portfolio optimization using convex interpolation

Authors: Shengbin Wang, Guihui Li, Zhaoyun Chen, Peng Wang, Menghan Dou, Haiyong Zheng, Zhimin Wang, Yongjian Gu, Yu-Chun Wu, Guo-Ping Guo

Abstract: Solving combinatorial optimization problems using variational quantum algorithms (VQAs) represents one of the most promising applications in the NISQ era. However, the limited trainability of VQAs could hinder their scalability to large problem sizes. In this paper, we improve the trainability of variational quantum eigensolver (VQE) by utilizing convex interpolation to solve portfolio optimizatio… ▽ More Solving combinatorial optimization problems using variational quantum algorithms (VQAs) represents one of the most promising applications in the NISQ era. However, the limited trainability of VQAs could hinder their scalability to large problem sizes. In this paper, we improve the trainability of variational quantum eigensolver (VQE) by utilizing convex interpolation to solve portfolio optimization. The idea is inspired by the observation that the Dicke state possesses an inherent clustering property. Consequently, the energy of a state with a larger Hamming distance from the ground state intuitively results in a greater energy gap away from the ground state energy in the overall distribution trend. Based on convex interpolation, the location of the ground state can be evaluated by learning the property of a small subset of basis states in the Hilbert space. This enlightens naturally the proposals of the strategies of close-to-solution initialization, regular cost function landscape, and recursive ansatz equilibrium partition. The successfully implementation of a $40$-qubit experiment using only $10$ superconducting qubits demonstrates the effectiveness of our proposals. Furthermore, the quantum inspiration has also spurred the development of a prototype greedy algorithm. Extensive numerical simulations indicate that the hybridization of VQE and greedy algorithms achieves a mutual complementarity, combining the advantages of both global and local optimization methods. Our proposals can be extended to improve the trainability for solving other large-scale combinatorial optimization problems that are widely used in real applications, paving the way to unleash quantum advantages of NISQ computers in the near future. △ Less

Submitted 7 July, 2024; originally announced July 2024.

arXiv:2407.05458 [pdf, other]

A Survey of Models for Cognitive Diagnosis: New Developments and Future Directions

Authors: Fei Wang, Weibo Gao, Qi Liu, Jiatong Li, Guanhao Zhao, Zheng Zhang, Zhenya Huang, Mengxiao Zhu, Shijin Wang, Wei Tong, Enhong Chen

Abstract: Cognitive diagnosis has been developed for decades as an effective measurement tool to evaluate human cognitive status such as ability level and knowledge mastery. It has been applied to a wide range of fields including education, sport, psychological diagnosis, etc. By providing better awareness of cognitive status, it can serve as the basis for personalized services such as well-designed medical… ▽ More Cognitive diagnosis has been developed for decades as an effective measurement tool to evaluate human cognitive status such as ability level and knowledge mastery. It has been applied to a wide range of fields including education, sport, psychological diagnosis, etc. By providing better awareness of cognitive status, it can serve as the basis for personalized services such as well-designed medical treatment, teaching strategy and vocational training. This paper aims to provide a survey of current models for cognitive diagnosis, with more attention on new developments using machine learning-based methods. By comparing the model structures, parameter estimation algorithms, model evaluation methods and applications, we provide a relatively comprehensive review of the recent trends in cognitive diagnosis models. Further, we discuss future directions that are worthy of exploration. In addition, we release two Python libraries: EduData for easy access to some relevant public datasets we have collected, and EduCDM that implements popular CDMs to facilitate both applications and research purposes. △ Less

Submitted 7 July, 2024; originally announced July 2024.

arXiv:2407.05310 [pdf, other]

Ternary Spike-based Neuromorphic Signal Processing System

Authors: Shuai Wang, Dehao Zhang, Ammar Belatreche, Yichen Xiao, Hongyu Qing, Wenjie We, Malu Zhang, Yang Yang

Abstract: Deep Neural Networks (DNNs) have been successfully implemented across various signal processing fields, resulting in significant enhancements in performance. However, DNNs generally require substantial computational resources, leading to significant economic costs and posing challenges for their deployment on resource-constrained edge devices. In this study, we take advantage of spiking neural net… ▽ More Deep Neural Networks (DNNs) have been successfully implemented across various signal processing fields, resulting in significant enhancements in performance. However, DNNs generally require substantial computational resources, leading to significant economic costs and posing challenges for their deployment on resource-constrained edge devices. In this study, we take advantage of spiking neural networks (SNNs) and quantization technologies to develop an energy-efficient and lightweight neuromorphic signal processing system. Our system is characterized by two principal innovations: a threshold-adaptive encoding (TAE) method and a quantized ternary SNN (QT-SNN). The TAE method can efficiently encode time-varying analog signals into sparse ternary spike trains, thereby reducing energy and memory demands for signal processing. QT-SNN, compatible with ternary spike trains from the TAE method, quantifies both membrane potentials and synaptic weights to reduce memory requirements while maintaining performance. Extensive experiments are conducted on two typical signal-processing tasks: speech and electroencephalogram recognition. The results demonstrate that our neuromorphic signal processing system achieves state-of-the-art (SOTA) performance with a 94% reduced memory requirement. Furthermore, through theoretical energy consumption analysis, our system shows 7.5x energy saving compared to other SNN works. The efficiency and efficacy of the proposed system highlight its potential as a promising avenue for energy-efficient signal processing. △ Less

Submitted 7 July, 2024; originally announced July 2024.

Showing 1–50 of 8,662 results for author: Wang, S