Measurement of the branching fraction of $D^+_s\to \ell^+\nu_\ell$ via $e^+e^-\to D^{*+}_{s} D^{*-}_{s}$

Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere, et al. (634 additional authors not shown)

Abstract: Based on $10.64~\mathrm{fb}^{-1}$ of $e^+e^-$ collision data taken at center-of-mass energies between 4.237 and 4.699 GeV with the BESIII detector, we study the leptonic $D^+_s$ decays using the $e^+e^-\to D^{*+}_{s} D^{*-}_{s}$ process. The branching fractions of $D_s^+\to\ell^+\nu_{\ell}\,(\ell=\mu,\tau)$ are measured to be $\mathcal{B}(D_s^+\to\mu^+\nu_\mu)=(\bfmuv)\%$ and $\mathcal{B}(D_s^+\to\tau^+\nu_\tau)=(\bftauv)\%$, respectively. The product of the decay constant and Cabibbo-Kobayashi-Maskawa matrix element $|V_{cs}|$ is determined to be $f_{D_s^+}|V_{cs}|=(\mufdsxvcsresult)_{\mu\nu}~\mathrm{MeV}$ and $f_{D_s^+}|V_{cs}|=(\taufdsxvcsresult)_{\tau\nu}~\mathrm{MeV}$, respectively. Taking the value of $|V_{cs}|$ from a global fit in the Standard Model, we obtain ${f_{D^+_s}}=(\mufdsresult)_{\mu\nu}$\,MeV and ${f_{D^+_s}}=(\taufdsresult)_{\tau\nu}$\,MeV, respectively. Conversely, taking the value for $f_{D_s^+}$ from the latest lattice quantum chromodynamics calculation, we obtain $|V_{cs}| =(\muvcsresult)_{\mu\nu}$ and $|V_{cs}| = (\tauvcsresult)_{\tau\nu}$, respectively.
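
For orientation, the link between a leptonic branching fraction and the product $f_{D_s^+}|V_{cs}|$ can be sketched with the standard tree-level Standard Model expression below. This is the textbook formula, not necessarily the exact form used in the paper; radiative corrections are neglected, and the symbols $G_F$ (Fermi constant), $m_\ell$, $m_{D_s^+}$ and $\tau_{D_s^+}$ (the $D_s^+$ lifetime) are notation introduced here rather than taken from the abstract.

```latex
\Gamma(D_s^+\to\ell^+\nu_\ell)
  = \frac{G_F^2}{8\pi}\, f_{D_s^+}^2\, |V_{cs}|^2\, m_\ell^2\, m_{D_s^+}
    \left(1-\frac{m_\ell^2}{m_{D_s^+}^2}\right)^{2},
\qquad
\mathcal{B}(D_s^+\to\ell^+\nu_\ell)
  = \frac{\tau_{D_s^+}}{\hbar}\,\Gamma(D_s^+\to\ell^+\nu_\ell)
```

Under this expression a measured branching fraction fixes only the product $f_{D_s^+}|V_{cs}|$, so extracting either factor requires an external input for the other.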

Submitted 16 July, 2024; originally announced July 2024.

Comments: 27 pages, 13 figures

arXiv:2407.11643 [pdf, other]

Batch SLAM with PMBM Data Association Sampling and Graph-Based Optimization

Authors: Yu Ge, Ossi Kaltiokallio, Yuxuan Xia, Ángel F. García-Fernández, Hyowon Kim, Jukka Talvitie, Mikko Valkama, Henk Wymeersch, Lennart Svensson

Abstract: Simultaneous localization and mapping (SLAM) methods need to both solve the data association (DA) problem and the joint estimation of the sensor trajectory and the map, conditioned on a DA. In this paper, we propose a novel integrated approach to solve both the DA problem and the batch SLAM problem simultaneously, combining random finite set (RFS) theory and the graph-based SLAM approach. A sampling method based on the Poisson multi-Bernoulli mixture (PMBM) density is designed for dealing with the DA uncertainty, and a graph-based SLAM solver is applied for the conditional SLAM problem. In the end, a post-processing approach is applied to merge SLAM results from different iterations. Using synthetic data, it is demonstrated that the proposed SLAM approach achieves performance close to the posterior Cramér-Rao bound, and outperforms state-of-the-art RFS-based SLAM filters in high clutter and high process noise scenarios.

Submitted 16 July, 2024; originally announced July 2024.

arXiv:2407.11033 [pdf, other]

Hadamard Adapter: An Extreme Parameter-Efficient Adapter Tuning Method for Pre-trained Language Models

Authors: Yuyan Chen, Qiang Fu, Ge Fan, Lun Du, Jian-Guang Lou, Shi Han, Dongmei Zhang, Zhixu Li, Yanghua Xiao

Abstract: In recent years, pre-trained language models (PLMs) have swept into various fields of artificial intelligence and achieved great success. However, most PLMs, such as T5 and GPT3, have a huge number of parameters; fine-tuning them is often expensive and time consuming, and storing them takes up a lot of space. Therefore, it is necessary to adopt a parameter-efficient approach to reduce parameters of PLMs in fine-tuning without compromising their performance in downstream tasks. In this paper, we design a novel adapter which only acts on self-attention outputs in PLMs. This adapter adopts an element-wise linear transformation using the Hadamard product, hence the name Hadamard adapter, and requires the fewest parameters compared to previous parameter-efficient adapters. In addition, we also summarize some tuning patterns for the Hadamard adapter shared by various downstream tasks, expecting to provide some guidance for further parameter reduction with shared adapters in future studies. The experiments conducted on the widely-used GLUE benchmark with several SOTA PLMs show that the Hadamard adapter achieves competitive performance with only 0.033\% of the parameters compared with full fine-tuning, and it has the fewest parameters compared with other adapters. Moreover, we further find that there are also some redundant layers in the Hadamard adapter which can be removed to achieve more parameter efficiency with only 0.022\% of the parameters.
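
The element-wise linear transformation described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the identity initialization, and the assumption of one weight vector and one bias vector per layer are all hypothetical choices made here for illustration.

```python
from typing import List, Tuple


def hadamard_adapter(hidden: List[float],
                     weight: List[float],
                     bias: List[float]) -> List[float]:
    """Element-wise linear map: out[i] = weight[i] * hidden[i] + bias[i]."""
    if not (len(hidden) == len(weight) == len(bias)):
        raise ValueError("hidden, weight and bias must have the same length")
    return [w * h + b for h, w, b in zip(hidden, weight, bias)]


def init_identity(dim: int) -> Tuple[List[float], List[float]]:
    """Hypothetical initialization: the adapter starts as the identity map."""
    return [1.0] * dim, [0.0] * dim


def adapter_parameter_count(dim: int, num_layers: int) -> int:
    """Assumes one weight vector and one bias vector of size dim per layer."""
    return 2 * dim * num_layers


# A self-attention output vector passes through unchanged at initialization.
weight, bias = init_identity(3)
print(hadamard_adapter([1.0, -2.0, 0.5], weight, bias))
```

Because the map touches each hidden dimension independently, the parameter count grows linearly with the hidden size, whereas a dense adapter layer would grow quadratically.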

Submitted 4 July, 2024; originally announced July 2024.

Comments: Accepted to CIKM 2023 (Long Paper)

arXiv:2407.10482 [pdf, other]

NGP-RT: Fusing Multi-Level Hash Features with Lightweight Attention for Real-Time Novel View Synthesis

Authors: Yubin Hu, Xiaoyang Guo, Yang Xiao, Jingwei Huang, Yong-Jin Liu

Abstract: This paper presents NGP-RT, a novel approach for enhancing the rendering speed of Instant-NGP to achieve real-time novel view synthesis. As a classic NeRF-based method, Instant-NGP stores implicit features in multi-level grids or hash tables and applies a shallow MLP to convert the implicit features into explicit colors and densities. Although it achieves fast training speed, there is still a lot of room for improvement in its rendering speed due to the per-point MLP executions for implicit multi-level feature aggregation, especially for real-time applications. To address this challenge, our proposed NGP-RT explicitly stores colors and densities as hash features, and leverages a lightweight attention mechanism to disambiguate the hash collisions instead of using computationally intensive MLP. At the rendering stage, NGP-RT incorporates a pre-computed occupancy distance grid into the ray marching strategy to inform the distance to the nearest occupied voxel, thereby reducing the number of marching points and global memory access. Experimental results show that on the challenging Mip-NeRF360 dataset, NGP-RT achieves better rendering quality than previous NeRF-based methods, achieving 108 fps at 1080p resolution on a single Nvidia RTX 3090 GPU. Our approach is promising for NeRF-based real-time applications that require efficient and high-quality rendering.
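
The idea of using a pre-computed distance grid to skip empty space during ray marching can be sketched in one dimension. This is a toy illustration under strong assumptions (a single ray through a 1D row of voxels, integer steps, hypothetical function names), not the 3D GPU procedure used by NGP-RT.

```python
from typing import List, Tuple


def distance_to_nearest_occupied(occupancy: List[int]) -> List[int]:
    """Two-pass distance transform: cells to the nearest occupied voxel."""
    n = len(occupancy)
    dist = [0 if occ else n for occ in occupancy]  # n acts as "infinity"
    for i in range(1, n):                # propagate distances left to right
        dist[i] = min(dist[i], dist[i - 1] + 1)
    for i in range(n - 2, -1, -1):       # propagate distances right to left
        dist[i] = min(dist[i], dist[i + 1] + 1)
    return dist


def march(dist: List[int]) -> Tuple[List[int], int]:
    """Walk along the ray, jumping over cells known to be empty.

    Returns the occupied cells that get sampled and the number of grid lookups.
    """
    sampled: List[int] = []
    lookups = 0
    i = 0
    while i < len(dist):
        lookups += 1
        if dist[i] == 0:
            sampled.append(i)  # occupied: evaluate features here
            i += 1
        else:
            i += dist[i]       # every cell closer than dist[i] is empty
    return sampled, lookups


occupancy = [0, 0, 0, 0, 1, 1, 0, 0, 0, 0]
dist = distance_to_nearest_occupied(occupancy)
print(dist)         # [4, 3, 2, 1, 0, 0, 1, 2, 3, 4]
print(march(dist))  # ([4, 5], 6)
```

In this toy case the ray needs 6 grid lookups instead of 10 and still samples both occupied voxels, because a jump of `dist[i]` cells can never pass over an occupied one.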

Submitted 15 July, 2024; originally announced July 2024.

Comments: ECCV 2024

arXiv:2407.10339 [pdf, other]

Supernova Pointing Capabilities of DUNE

Authors: DUNE Collaboration, A. Abed Abud, B. Abi, R. Acciarri, M. A. Acero, M. R. Adames, G. Adamov, M. Adamowski, D. Adams, M. Adinolfi, C. Adriano, A. Aduszkiewicz, J. Aguilar, B. Aimard, F. Akbar, K. Allison, S. Alonso Monsalve, M. Alrashed, A. Alton, R. Alvarez, T. Alves, H. Amar, P. Amedo, J. Anderson, D. A. Andrade, et al. (1340 additional authors not shown)

Abstract: The determination of the direction of a stellar core collapse via its neutrino emission is crucial for the identification of the progenitor for a multimessenger follow-up. A highly effective method of reconstructing supernova directions within the Deep Underground Neutrino Experiment (DUNE) is introduced. The supernova neutrino pointing resolution is studied by simulating and reconstructing electron-neutrino charged-current absorption on $^{40}$Ar and elastic scattering of neutrinos on electrons. Procedures to reconstruct individual interactions, including a newly developed technique called ``brems flipping'', as well as the burst direction from an ensemble of interactions are described. Performance of the burst direction reconstruction is evaluated for supernovae happening at a distance of 10 kpc for a specific supernova burst flux model. The pointing resolution is found to be 3.4 degrees at 68% coverage for a perfect interaction-channel classification and a fiducial mass of 40 kton, and 6.6 degrees for a 10 kton fiducial mass respectively. Assuming a 4% rate of charged-current interactions being misidentified as elastic scattering, DUNE's burst pointing resolution is found to be 4.3 degrees (8.7 degrees) at 68% coverage.

Submitted 14 July, 2024; originally announced July 2024.

Comments: 25 pages, 16 figures

Report number: FERMILAB-PUB-24-0319-LBNF

arXiv:2407.09811 [pdf, other]

CellAgent: An LLM-driven Multi-Agent Framework for Automated Single-cell Data Analysis

Authors: Yihang Xiao, Jinyi Liu, Yan Zheng, Xiaohan Xie, Jianye Hao, Mingzhi Li, Ruitao Wang, Fei Ni, Yuxiao Li, Jintian Luo, Shaoqing Jiao, Jiajie Peng

Abstract: Single-cell RNA sequencing (scRNA-seq) data analysis is crucial for biological research, as it enables the precise characterization of cellular heterogeneity. However, manual manipulation of various tools to achieve desired outcomes can be labor-intensive for researchers. To address this, we introduce CellAgent (http://cell.agent4science.cn/), an LLM-driven multi-agent framework, specifically designed for the automatic processing and execution of scRNA-seq data analysis tasks, providing high-quality results with no human intervention. Firstly, to adapt general LLMs to the biological field, CellAgent constructs LLM-driven biological expert roles - planner, executor, and evaluator - each with specific responsibilities. Then, CellAgent introduces a hierarchical decision-making mechanism to coordinate these biological experts, effectively driving the planning and step-by-step execution of complex data analysis tasks. Furthermore, we propose a self-iterative optimization mechanism, enabling CellAgent to autonomously evaluate and optimize solutions, thereby guaranteeing output quality. We evaluate CellAgent on a comprehensive benchmark dataset encompassing dozens of tissues and hundreds of distinct cell types. Evaluation results consistently show that CellAgent effectively identifies the most suitable tools and hyperparameters for single-cell analysis tasks, achieving optimal performance. This automated framework dramatically reduces the workload for science data analyses, bringing us into the "Agent for Science" era.

Submitted 13 July, 2024; originally announced July 2024.

arXiv:2407.09674 [pdf, other]

Accelerating High-Throughput Phonon Calculations via Machine Learning Universal Potentials

Authors: Huiju Lee, Vinay I. Hegde, Chris Wolverton, Yi Xia

Abstract: Phonons play a critical role in determining various material properties, but conventional methods for phonon calculations are computationally intensive, limiting their broad applicability. In this study, we present an approach to accelerate high-throughput harmonic phonon calculations using machine learning universal potentials. We train a state-of-the-art machine learning interatomic potential, based on multi-atomic cluster expansion (MACE), on a comprehensive dataset of 2,738 crystal structures with 77 elements, totaling 15,670 supercell structures, computed using high-fidelity density functional theory (DFT) calculations. Our approach significantly reduces the number of required supercells for phonon calculations while maintaining high accuracy in predicting harmonic phonon properties across diverse materials. The trained model is validated against phonon calculations for a held-out subset of 384 materials, achieving a mean absolute error (MAE) of 0.18 THz for vibrational frequencies from full phonon dispersions and 2.19 meV/atom for Helmholtz vibrational free energies at 300 K, as well as a classification accuracy of 86.2% for dynamical stability of materials. A thermodynamic analysis of polymorphic stability in 126 systems demonstrates good agreement with DFT results at 300 K and 1000 K. In addition, the diverse and extensive high-quality DFT dataset curated in this study serves as a valuable resource for researchers to train and improve other machine learning interatomic potential models.

Submitted 12 July, 2024; originally announced July 2024.

Comments: 16 pages, 7 figures

arXiv:2407.09658 [pdf, other]

BoBa: Boosting Backdoor Detection through Data Distribution Inference in Federated Learning

Authors: Ning Wang, Shanghao Shi, Yang Xiao, Yimin Chen, Y. Thomas Hou, Wenjing Lou

Abstract: Federated learning, while being a promising approach for collaborative model training, is susceptible to poisoning attacks due to its decentralized nature. Backdoor attacks, in particular, have shown remarkable stealthiness, as they selectively compromise predictions for inputs containing triggers. Previous endeavors to detect and mitigate such attacks are based on the Independent and Identically Distributed (IID) data assumption where benign model updates exhibit high-level similarity in multiple feature spaces due to IID data. Thus, outliers are detected as backdoor attacks. Nevertheless, non-IID data presents substantial challenges in backdoor attack detection, as the data variety introduces variance among benign models, making outlier detection-based mechanisms less effective. We propose a novel distribution-aware anomaly detection mechanism, BoBa, to address this problem. In order to differentiate outliers arising from data variety versus backdoor attack, we propose to break down the problem into two steps: clustering clients utilizing their data distribution followed by a voting-based detection. Based on the intuition that clustering and subsequent backdoor detection can drastically benefit from knowing client data distributions, we propose a novel data distribution inference mechanism. To improve detection robustness, we introduce an overlapping clustering method, where each client is associated with multiple clusters, ensuring that the trustworthiness of a model update is assessed collectively by multiple clusters rather than a single cluster. Through extensive evaluations, we demonstrate that BoBa can reduce the attack success rate to lower than 0.001 while maintaining high main task accuracy across various attack strategies and experimental settings.

Submitted 12 July, 2024; originally announced July 2024.

arXiv:2407.09331 [pdf, other]

Suppression of quantum dissipation: A cooperative effect of quantum squeezing and quantum measurement

Authors: Yi-Ming Xia, Yi-Fei Wang, Xiao-Yun Zhang, Hai-Chao Li, Wei Xiong

Abstract: The ability to isolate a quantum system from its environment is of fundamental interest and importance in optical quantum science and technology. Here we propose an experimentally feasible scheme for beating environment-induced dissipation in an open two-level system coupled to a parametrically driven cavity. The mechanism relies on a novel cooperation between light-matter coupling enhancement and frequent measurements. We demonstrate that, in the presence of the cooperation, the system dynamics can be completely dominated by the effective system-cavity interaction and the dissipative effects from the system-environment coupling can be surprisingly ignored. This work provides a generic method of dissipation suppression in a variety of quantum mechanical platforms, including natural atoms and superconducting circuits.

Submitted 12 July, 2024; originally announced July 2024.

arXiv:2407.08559 [pdf]

Study of a Novel Capacitive Pressure Sensor Using Spiral Comb Electrodes

Authors: Wenjie Chen, Qi Yang, Qi Liu, Yiqun Zhang, Liang He, Yuanlin Xia, Zhuqing Wang, Yubo Huang, Jianfeng Chen, Cao Xia

Abstract: For traditional capacitive pressure sensors, high nonlinearity and poor sensitivity greatly limit their sensing applications. Hence, an innovative design of capacitors based on spiral comb electrodes is proposed for high-sensitivity pressure detection in this work. Compared to traditional capacitive pressure sensors with straight plate electrodes, the proposed sensor with spiral electrodes sufficiently increases the overlap area of the electrodes, so the pressure sensitivity can be greatly improved. Moreover, because the capacitance variation of the proposed sensor is dominated by the change of the overlap area of the electrodes rather than the electrode distance, the linearity can also be improved to higher than 0.99. Theoretical analysis and COMSOL-based finite element simulation have been implemented for principle verification and performance optimization. Simulation results show that the proposed design has a mechanical sensitivity of $1.5\times10^{-4}$ m/Pa, a capacitive sensitivity of 1.10 aF/Pa, and a nonlinear error of 3.63% over the pressure range from 0 to 30 kPa. An equivalent experiment has been further carried out for verification. Experimental results also show that both the sensitivity and linearity of capacitive pressure sensors with spiral electrodes are higher than those with straight electrodes. This work not only provides a new avenue for capacitor design, but also can be applied to high-sensitivity pressure detection.
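
The linearity argument in this abstract can be made concrete with the ideal parallel-plate relation. This is only a first-order idealization, since the spiral comb geometry in the paper is more complex; the symbols $\varepsilon_0$, $\varepsilon_r$, $A$ (overlap area) and $d$ (electrode distance) are notation assumed here.

```latex
C = \frac{\varepsilon_0 \varepsilon_r A}{d},
\qquad
\frac{\partial C}{\partial A} = \frac{\varepsilon_0 \varepsilon_r}{d},
\qquad
\frac{\partial C}{\partial d} = -\frac{\varepsilon_0 \varepsilon_r A}{d^{2}}
```

At fixed $d$ the capacitance changes in direct proportion to the overlap area, whereas its response to a change in $d$ itself depends on $d$, which is why an area-dominated sensing mechanism tends to be more linear than a gap-dominated one.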

Submitted 11 July, 2024; originally announced July 2024.

Comments: 20 pages, 14 figures

arXiv:2407.08550 [pdf]

doi 10.51202/9783181024379

Incorporating Large Language Models into Production Systems for Enhanced Task Automation and Flexibility

Authors: Yuchen Xia, Jize Zhang, Nasser Jazdi, Michael Weyrich

Abstract: This paper introduces a novel approach to integrating large language model (LLM) agents into automated production systems, aimed at enhancing task automation and flexibility. We organize production operations within a hierarchical framework based on the automation pyramid. Atomic operation functionalities are modeled as microservices, which are executed through interface invocation within a dedicated digital twin system. This allows for a scalable and flexible foundation for orchestrating production processes. In this digital twin system, low-level, hardware-specific data is semantically enriched and made interpretable for LLMs for production planning and control tasks. Large language model agents are systematically prompted to interpret these production-specific data and knowledge. Upon receiving a user request or identifying a triggering event, the LLM agents generate a process plan. This plan is then decomposed into a series of atomic operations, executed as microservices within the real-world automation system. We implement this overall approach on an automated modular production facility at our laboratory, demonstrating how the LLMs can handle production planning and control tasks through a concrete case study. This results in an intuitive production facility with higher levels of task automation and flexibility. Finally, we reveal several limitations in realizing the full potential of large language models in autonomous systems and point out promising benefits. Demos from this ongoing research series can be accessed at: https://github.com/YuchenXia/GPT4IndustrialAutomation

Submitted 11 July, 2024; originally announced July 2024.

Report number: VDI-Berichte Nr. 2437, 2024

arXiv:2407.08290 [pdf, other]

Gap Completion in Point Cloud Scene occluded by Vehicles using SGC-Net

Authors: Yu Feng, Yiming Xu, Yan Xia, Claus Brenner, Monika Sester

Abstract: Recent advances in mobile mapping systems have greatly enhanced the efficiency and convenience of acquiring urban 3D data. These systems utilize LiDAR sensors mounted on vehicles to capture vast cityscapes. However, a significant challenge arises due to occlusions caused by roadside parked vehicles, leading to the loss of scene information, particularly on the roads, sidewalks, curbs, and the lower sections of buildings. In this study, we present a novel approach that leverages deep neural networks to learn a model capable of filling gaps in urban scenes that are obscured by vehicle occlusion. We have developed an innovative technique where we place virtual vehicle models along road boundaries in the gap-free scene and utilize a ray-casting algorithm to create a new scene with occluded gaps. This allows us to generate diverse and realistic urban point cloud scenes with and without vehicle occlusion, surpassing the limitations of real-world training data collection and annotation. Furthermore, we introduce the Scene Gap Completion Network (SGC-Net), an end-to-end model that can generate well-defined shape boundaries and smooth surfaces within occluded gaps. The experiment results reveal that 97.66% of the filled points fall within a range of 5 centimeters relative to the high-density ground truth point cloud scene. These findings underscore the efficacy of our proposed model in gap completion and reconstructing urban scenes affected by vehicle occlusions.

Submitted 11 July, 2024; originally announced July 2024.

arXiv:2407.07651 [pdf, other]

Study of the decay and production properties of $D_{s1}(2536)$ and $D_{s2}^*(2573)$

Authors: M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere, A. Brueggemann, et al. (645 additional authors not shown)

Abstract: The $e^+e^-\rightarrow D_s^+D_{s1}(2536)^-$ and $e^+e^-\rightarrow D_s^+D^*_{s2}(2573)^-$ processes are studied using data samples collected with the BESIII detector at center-of-mass energies from 4.530 to 4.946~GeV. The absolute branching fractions of $D_{s1}(2536)^- \rightarrow \bar{D}^{*0}K^-$ and $D_{s2}^*(2573)^- \rightarrow \bar{D}^0K^-$ are measured for the first time to be $(35.9\pm 4.8\pm 3.5)\%$ and $(37.4\pm 3.1\pm 4.6)\%$, respectively. The measurements are in tension with predictions based on the assumption that the $D_{s1}(2536)$ and $D_{s2}^*(2573)$ are dominated by a bare $c\bar{s}$ component. The $e^+e^-\rightarrow D_s^+D_{s1}(2536)^-$ and $e^+e^-\rightarrow D_s^+D^*_{s2}(2573)^-$ cross sections are measured, and a resonant structure at around 4.6~GeV with a width of 50~MeV is observed for the first time with a statistical significance of $15\sigma$ in the $e^+e^-\rightarrow D_s^+D^*_{s2}(2573)^-$ process. It could be the $Y(4626)$ found by the Belle collaboration in the $D_s^+D_{s1}(2536)^{-}$ final state, since they have similar masses and widths. There is also evidence for a structure at around 4.75~GeV in both processes.

Submitted 10 July, 2024; originally announced July 2024.

arXiv:2407.06492 [pdf, other]

Using Graph Neural Networks and Frequency Domain Data for Automated Operational Modal Analysis of Populations of Structures

Authors: Xudong Jian, Yutong Xia, Gregory Duthé, Kiran Bacsa, Wei Liu, Eleni Chatzi

Abstract: The Population-Based Structural Health Monitoring (PBSHM) paradigm has recently emerged as a promising approach to enhance data-driven assessment of engineering structures by facilitating transfer learning between structures with some degree of similarity. In this work, we apply this concept to the automated modal identification of structural systems. We introduce a Graph Neural Network (GNN)-based deep learning scheme to identify modal properties, including natural frequencies, damping ratios, and mode shapes of engineering structures based on the Power Spectral Density (PSD) of spatially-sparse vibration measurements. Systematic numerical experiments are conducted to evaluate the proposed model, employing two distinct truss populations that possess similar topological characteristics but varying geometric (size, shape) and material (stiffness) properties. The results demonstrate that, once trained, the proposed GNN-based model can identify modal properties of unseen structures within the same structural population with good efficiency and acceptable accuracy, even in the presence of measurement noise and sparse measurement locations. The GNN-based model exhibits advantages over the classic Frequency Domain Decomposition (FDD) method in terms of identification speed, as well as against an alternate Multi-Layer Perceptron (MLP) architecture in terms of identification accuracy, rendering this a promising tool for PBSHM purposes.

Submitted 8 July, 2024; originally announced July 2024.

arXiv:2407.06445 [pdf, other]

Unconventional Field-Like Spin-Torques in CrPt$_3$

Authors: Robin Klause, Yuxuan Xiao, Jonathan Gibbons, Eric Fullerton, Axel Hoffmann

Abstract: The topological semimetal CrPt$_3$ has potential for generating unconventional spin torques due to its ferrimagnetic ordering, topological band structure, and high anomalous Hall effect. CrPt$_3$ exhibits ferrimagnetic behavior only in its chemically ordered phase and is paramagnetic in its chemically disordered phase. By controlling the growth and annealing temperatures, epitaxial films of both chemically ordered and disordered phases of CrPt$_3$ are prepared, allowing us to investigate the role of magnetic ordering on unconventional torque generation. We use angle-dependent spin-torque ferromagnetic resonance and second harmonic Hall measurements to probe the spin torques generated from epitaxial CrPt$_3$ in CrPt$_3$/Cu/Ni$_{81}$Fe$_{19}$ heterostructures. With current applied along specific directions with respect to the crystal order, we reveal unconventional spin torques in both ordered and disordered films. When current flows parallel to the $[1\overline{1}1]$ and $[\overline{1}11]$ directions, we observe an unconventional field-like torque that is opposite in sign for the two directions, which lack a mirror plane, thus allowing unconventional torques to be generated.

Submitted 8 July, 2024; originally announced July 2024.

arXiv:2407.06168 [pdf, other]

TARGO: Benchmarking Target-driven Object Grasping under Occlusions

Authors: Yan Xia, Ran Ding, Ziyuan Qin, Guanqi Zhan, Kaichen Zhou, Long Yang, Hao Dong, Daniel Cremers

Abstract: Recent advances in predicting 6D grasp poses from a single depth image have led to promising performance in robotic grasping. However, previous grasping models face challenges in cluttered environments where nearby objects impact the target object's grasp. In this paper, we first establish a new benchmark dataset for TARget-driven Grasping under Occlusions, named TARGO. We make the following contributions: 1) We are the first to study the occlusion level of grasping. 2) We set up an evaluation benchmark consisting of large-scale synthetic data and part of real-world data, and we evaluated five grasp models and found that even the current SOTA model suffers when the occlusion level increases, leaving grasping under occlusion still a challenge. 3) We also generate a large-scale training dataset via a scalable pipeline, which can be used to boost the performance of grasping under occlusion and generalized to the real world. 4) We further propose a transformer-based grasping model involving a shape completion module, termed TARGO-Net, which performs most robustly as occlusion increases. Our benchmark dataset can be found at https://TARGO-benchmark.github.io/.

Submitted 8 July, 2024; originally announced July 2024.

Comments: 19 pages, 17 figures

arXiv:2407.06112 [pdf, other]

Enhancing Language Model Rationality with Bi-Directional Deliberation Reasoning

Authors: Yadong Zhang, Shaoguang Mao, Wenshan Wu, Yan Xia, Tao Ge, Man Lan, Furu Wei

Abstract: This paper introduces BI-Directional DEliberation Reasoning (BIDDER), a novel reasoning approach to enhance the decision rationality of language models. Traditional reasoning methods typically rely on historical information and employ uni-directional (left-to-right) reasoning strategy. This lack of bi-directional deliberation reasoning results in limited awareness of potential future outcomes and insufficient integration of historical context, leading to suboptimal decisions. BIDDER addresses this gap by incorporating principles of rational decision-making, specifically managing uncertainty and predicting expected utility. Our approach involves three key processes: Inferring hidden states to represent uncertain information in the decision-making process from historical data; Using these hidden states to predict future potential states and potential outcomes; Integrating historical information (past contexts) and long-term outcomes (future contexts) to inform reasoning. By leveraging bi-directional reasoning, BIDDER ensures thorough exploration of both past and future contexts, leading to more informed and rational decisions. We tested BIDDER's effectiveness in two well-defined scenarios: Poker (Limit Texas Hold'em) and Negotiation. Our experiments demonstrate that BIDDER significantly improves the decision-making capabilities of LLMs and LLM agents.

Submitted 8 July, 2024; originally announced July 2024.

arXiv:2407.05928 [pdf, other]

CA-FedRC: Codebook Adaptation via Federated Reservoir Computing in 5G NR

Authors: Ziqiang Ye, Sikai Liao, Yulan Gao, Shu Fang, Yue Xiao, Ming Xiao, Saviour Zammit

Abstract: With the burgeoning deployment of the fifth-generation new radio (5G NR) networks, the codebook plays a crucial role in enabling the base station (BS) to acquire the channel state information (CSI). Different 5G NR codebooks incur varying overheads and exhibit performance disparities under diverse channel conditions, necessitating codebook adaptation based on channel conditions to reduce feedback overhead while enhancing performance. However, existing methods of 5G NR codebook adaptation require significant overhead for model training and feedback or fall short in performance. To address these limitations, this letter introduces a federated reservoir computing framework designed for efficient codebook adaptation in computationally and feedback resource-constrained mobile devices. This framework utilizes a novel series of indicators as input training data, striking an effective balance between performance and feedback overhead. Compared to conventional models, the proposed codebook adaptation via federated reservoir computing (CA-FedRC) achieves rapid convergence and significant loss reduction in both speed and accuracy. Extensive simulations under various channel conditions demonstrate that our algorithm not only reduces resource consumption of users but also accurately identifies channel types, thereby optimizing the trade-off between spectrum efficiency, computational complexity, and feedback overhead.

Submitted 8 July, 2024; originally announced July 2024.

arXiv:2407.05739 [pdf, other]

Multi-Bit Mechanism: A Novel Information Transmission Paradigm for Spiking Neural Networks

Authors: Yongjun Xiao, Xianlong Tian, Yongqi Ding, Pei He, Mengmeng Jing, Lin Zuo

Abstract: Since they were proposed, spiking neural networks (SNNs) have gained recognition for their high performance, low power consumption and enhanced biological interpretability. However, while bringing these advantages, the binary nature of spikes also leads to considerable information loss in SNNs, ultimately causing performance degradation. We claim that the limited expressiveness of current binary spikes, resulting in substantial information loss, is the fundamental issue behind these challenges. To alleviate this, our research introduces a multi-bit information transmission mechanism for SNNs. This mechanism expands the output of spiking neurons from the original single bit to multiple bits, enhancing the expressiveness of the spikes and reducing information loss during the forward process, while still maintaining the low energy consumption advantage of SNNs. For SNNs, this represents a new paradigm of information transmission. Moreover, to further utilize the limited spikes, we extract effective signals from the previous layer to re-stimulate the neurons, thus encouraging full spike emission across various bit levels. We conducted extensive experiments with our proposed method using both the direct training method and the ANN-SNN conversion method, and the results show consistent performance improvements.

Submitted 8 July, 2024; originally announced July 2024.

Comments: Under review

arXiv:2407.05585 [pdf, other]

Unmasking Bias: A Framework for Evaluating Treatment Benefit Predictors Using Observational Studies

Authors: Yuan Xia, Mohsen Sadatsafavi, Paul Gustafson

Abstract: Treatment benefit predictors (TBPs) map patient characteristics into an estimate of the treatment benefit tailored to individual patients, which can support optimizing treatment decisions. However, the assessment of their performance might be challenging with the non-random treatment assignment. This study conducts a conceptual analysis, which can be applied to finite-sample studies. We present a framework for evaluating TBPs using observational data from a target population of interest. We then explore the impact of confounding bias on TBP evaluation using measures of discrimination and calibration, which are the moderate calibration and the concentration of the benefit index ($C_b$), respectively. We illustrate that failure to control for confounding can lead to misleading values of performance metrics and establish how the confounding bias propagates to an evaluation bias to quantify the explicit bias for the performance metrics. These findings underscore the necessity of accounting for confounding factors when evaluating TBPs, ensuring more reliable and contextually appropriate treatment decisions.

Submitted 7 July, 2024; originally announced July 2024.

Comments: 31 pages, 5 figures

arXiv:2407.05310 [pdf, other]

Ternary Spike-based Neuromorphic Signal Processing System

Authors: Shuai Wang, Dehao Zhang, Ammar Belatreche, Yichen Xiao, Hongyu Qing, Wenjie We, Malu Zhang, Yang Yang

Abstract: Deep Neural Networks (DNNs) have been successfully implemented across various signal processing fields, resulting in significant enhancements in performance. However, DNNs generally require substantial computational resources, leading to significant economic costs and posing challenges for their deployment on resource-constrained edge devices. In this study, we take advantage of spiking neural networks (SNNs) and quantization technologies to develop an energy-efficient and lightweight neuromorphic signal processing system. Our system is characterized by two principal innovations: a threshold-adaptive encoding (TAE) method and a quantized ternary SNN (QT-SNN). The TAE method can efficiently encode time-varying analog signals into sparse ternary spike trains, thereby reducing energy and memory demands for signal processing. QT-SNN, compatible with ternary spike trains from the TAE method, quantifies both membrane potentials and synaptic weights to reduce memory requirements while maintaining performance. Extensive experiments are conducted on two typical signal-processing tasks: speech and electroencephalogram recognition. The results demonstrate that our neuromorphic signal processing system achieves state-of-the-art (SOTA) performance with a 94% reduced memory requirement. Furthermore, through theoretical energy consumption analysis, our system shows 7.5x energy saving compared to other SNN works. The efficiency and efficacy of the proposed system highlight its potential as a promising avenue for energy-efficient signal processing.

Submitted 7 July, 2024; originally announced July 2024.

arXiv:2407.04984 [pdf]

Prolonged Phase Segregation of Mixed-Halide Perovskite Nanocrystals in the Dark

Authors: Xueying Ma, Yuhui Ye, Yang Xiao, Shengnan Feng, Chunfeng Zhang, Keyu Xia, Fengrui Hu, Min Xiao, Xiaoyong Wang

Abstract: A critical issue hindering the potential applications of semiconductor mixed-halide perovskites is the phase segregation effect, wherein localized regions enriched with one type of halide anions would be formed upon continuous photogeneration of the excited-state charge carriers. These unexpected phases are capable of remixing again in the dark under the entropic driving force, the process of which are now being exclusively studied after mixed-halide perovskites have arrived at the final stage of complete phase segregation. Here we show that after the removal of laser excitation from a solid film of mixed-halide perovskite nanocrystals with partial phase segregation, the iodide- and bromide-rich regions can continuously grow in the dark for a prolonged time period of several minutes. We propose that this dark phase segregation is sustained by the local electric fields associated with the surface-trapped charge carriers, whose slow dissipation out of mixed-halide perovskite nanocrystals causes a delayed occurrence of the reversal phase remixing process.

Submitted 6 July, 2024; originally announced July 2024.

arXiv:2407.04973 [pdf, other]

LogicVista: Multimodal LLM Logical Reasoning Benchmark in Visual Contexts

Authors: Yijia Xiao, Edward Sun, Tianyu Liu, Wei Wang

Abstract: We propose LogicVista, an evaluation benchmark that assesses the integrated logical reasoning capabilities of multimodal large language models (MLLMs) in Visual contexts. Recent advancements in MLLMs have demonstrated various fascinating abilities, from crafting poetry based on an image to performing mathematical reasoning. However, there is still a lack of systematic evaluation of MLLMs' proficiency in logical reasoning tasks, which are essential for activities like navigation and puzzle-solving. Thus we evaluate general logical cognition abilities across 5 logical reasoning tasks encompassing 9 different capabilities, using a sample of 448 multiple-choice questions. Each question is annotated with the correct answer and the human-written reasoning behind the selection, enabling both open-ended and multiple-choice evaluation. A total of 8 MLLMs are comprehensively evaluated using LogicVista. Code and Data Available at https://github.com/Yijia-Xiao/LogicVista.

Submitted 6 July, 2024; originally announced July 2024.

Comments: LogicVista benchmarks the logical reasoning of multimodal large language models in visual tasks

arXiv:2407.04121 [pdf, other]

Hallucination Detection: Robustly Discerning Reliable Answers in Large Language Models

Authors: Yuyan Chen, Qiang Fu, Yichen Yuan, Zhihao Wen, Ge Fan, Dayiheng Liu, Dongmei Zhang, Zhixu Li, Yanghua Xiao

Abstract: Large Language Models (LLMs) have gained widespread adoption in various natural language processing tasks, including question answering and dialogue systems. However, a major drawback of LLMs is the issue of hallucination, where they generate unfaithful or inconsistent content that deviates from the input source, leading to severe consequences. In this paper, we propose a robust discriminator named RelD to effectively detect hallucination in LLMs' generated answers. RelD is trained on the constructed RelQA, a bilingual question-answering dialogue dataset along with answers generated by LLMs and a comprehensive set of metrics. Our experimental results demonstrate that the proposed RelD successfully detects hallucination in the answers generated by diverse LLMs. Moreover, it performs well in distinguishing hallucination in LLMs' generated answers from both in-distribution and out-of-distribution datasets. Additionally, we also conduct a thorough analysis of the types of hallucinations that occur and present valuable insights. This research significantly contributes to the detection of reliable answers generated by LLMs and holds noteworthy implications for mitigating hallucination in the future work.

Submitted 4 July, 2024; originally announced July 2024.

Comments: Accepted to CIKM 2023 (Long Paper)

arXiv:2407.04118 [pdf, other]

MAPO: Boosting Large Language Model Performance with Model-Adaptive Prompt Optimization

Authors: Yuyan Chen, Zhihao Wen, Ge Fan, Zhengyu Chen, Wei Wu, Dayiheng Liu, Zhixu Li, Bang Liu, Yanghua Xiao

Abstract: Prompt engineering, as an efficient and effective way to leverage Large Language Models (LLM), has drawn a lot of attention from the research community. The existing research primarily emphasizes the importance of adapting prompts to specific tasks, rather than specific LLMs. However, a good prompt is not solely defined by its wording, but also binds to the nature of the LLM in question. In this work, we first quantitatively demonstrate that different prompts should be adapted to different LLMs to enhance their capabilities across various downstream tasks in NLP. Then we novelly propose a model-adaptive prompt optimizer (MAPO) method that optimizes the original prompts for each specific LLM in downstream tasks. Extensive experiments indicate that the proposed method can effectively refine prompts for an LLM, leading to significant improvements over various downstream tasks.

Submitted 4 July, 2024; originally announced July 2024.

Comments: Accepted to EMNLP 2023 (Findings)

arXiv:2407.04105 [pdf, other]

Can Pre-trained Language Models Understand Chinese Humor?

Authors: Yuyan Chen, Zhixu Li, Jiaqing Liang, Yanghua Xiao, Bang Liu, Yunwen Chen

Abstract: Humor understanding is an important and challenging research in natural language processing. As the popularity of pre-trained language models (PLMs), some recent work makes preliminary attempts to adopt PLMs for humor recognition and generation. However, these simple attempts do not substantially answer the question: {\em whether PLMs are capable of humor understanding?} This paper is the first work that systematically investigates the humor understanding ability of PLMs. For this purpose, a comprehensive framework with three evaluation steps and four evaluation tasks is designed. We also construct a comprehensive Chinese humor dataset, which can fully meet all the data requirements of the proposed evaluation framework. Our empirical study on the Chinese humor dataset yields some valuable observations, which are of great guiding value for future optimization of PLMs in humor understanding and generation.

Submitted 4 July, 2024; originally announced July 2024.

Comments: Accepted to WSDM 2022

arXiv:2407.03663 [pdf, other]

Limited-View Photoacoustic Imaging Reconstruction Via High-quality Self-supervised Neural Representation

Authors: Youshen Xiao, Yuting Shen, Bowei Yao, Xiran Cai, Yuyao Zhang, Fei Gao

Abstract: In practical applications within the human body, it is often challenging to fully encompass the target tissue or organ, necessitating the use of limited-view arrays, which can lead to the loss of crucial information. Addressing the reconstruction of photoacoustic sensor signals in limited-view detection spaces has become a focal point of current research. In this study, we introduce a self-supervised network termed HIgh-quality Self-supervised neural representation (HIS), which tackles the inverse problem of photoacoustic imaging to reconstruct high-quality photoacoustic images from sensor data acquired under limited viewpoints. We regard the desired reconstructed photoacoustic image as an implicit continuous function in 2D image space, viewing the pixels of the image as sparse discrete samples. The HIS's objective is to learn the continuous function from limited observations by utilizing a fully connected neural network combined with Fourier feature position encoding. By simply minimizing the error between the network's predicted sensor data and the actual sensor data, HIS is trained to represent the observed continuous model. The results indicate that the proposed HIS model offers superior image reconstruction quality compared to three commonly used methods for photoacoustic image reconstruction.

Submitted 4 July, 2024; originally announced July 2024.

arXiv:2407.03661 [pdf, other]

Configurable DOA Estimation using Incremental Learning

Authors: Yang Xiao, Rohan Kumar Das

Abstract: This study introduces a progressive neural network (PNN) model for direction of arrival (DOA) estimation, DOA-PNN, addressing the challenge due to catastrophic forgetting in adapting dynamic acoustic environments. While traditional methods such as GCC, MUSIC, and SRP-PHAT are effective in static settings, they perform worse in noisy, reverberant conditions. Deep learning models, particularly CNNs, offer improvements but struggle with a mismatch configuration between the training and inference phases. The proposed DOA-PNN overcomes these limitations by incorporating task incremental learning of continual learning, allowing for adaptation across varying acoustic scenarios with less forgetting of previously learned knowledge. Featuring task-specific sub-networks and a scaling mechanism, DOA-PNN efficiently manages parameter growth, ensuring high performance across incremental microphone configurations. We study DOA-PNN on a simulated data under various mic distance based microphone settings. The studies reveal its capability to maintain performance with minimal parameter increase, presenting an efficient solution for DOA estimation.

Submitted 4 July, 2024; originally announced July 2024.

Comments: Submitted to DCASE WS 2024

arXiv:2407.03657 [pdf, other]

UCIL: An Unsupervised Class Incremental Learning Approach for Sound Event Detection

Authors: Yang Xiao, Rohan Kumar Das

Abstract: This work explores class-incremental learning (CIL) for sound event detection (SED), advancing adaptability towards real-world scenarios. CIL's success in domains like computer vision inspired our SED-tailored method, addressing the unique challenges of diverse and complex audio environments. Our approach employs an independent unsupervised learning framework with a distillation loss function to integrate new sound classes while preserving the SED model consistency across incremental tasks. We further enhance this framework with a sample selection strategy for unlabeled data and a balanced exemplar update mechanism, ensuring varied and illustrative sound representations. Evaluating various continual learning methods on the DCASE 2023 Task 4 dataset, we find that our research offers insights into each method's applicability for real-world SED systems that can have newly added sound classes. The findings also delineate future directions of CIL in dynamic audio settings.

Submitted 4 July, 2024; originally announced July 2024.

Comments: Submitted to DCASE WS 2024

arXiv:2407.03656 [pdf, other]

WildDESED: An LLM-Powered Dataset for Wild Domestic Environment Sound Event Detection System

Authors: Yang Xiao, Rohan Kumar Das

Abstract: This work aims to advance sound event detection (SED) research by presenting a new large language model (LLM)-powered dataset namely wild domestic environment sound event detection (WildDESED). It is crafted as an extension to the original DESED dataset to reflect diverse acoustic variability and complex noises in home settings. We leveraged LLMs to generate eight different domestic scenarios based on target sound categories of the DESED dataset. Then we enriched the scenarios with a carefully tailored mixture of noises selected from AudioSet and ensured no overlap with target sound. We consider widely popular convolutional neural recurrent network to study WildDESED dataset, which depicts its challenging nature. We then apply curriculum learning by gradually increasing noise complexity to enhance the model's generalization capabilities across various noise levels. Our results with this approach show improvements within the noisy environment, validating the effectiveness on the WildDESED dataset promoting noise-robust SED advancements.

Submitted 4 July, 2024; originally announced July 2024.

Comments: Submitted to DCASE WS 2024

arXiv:2407.03654 [pdf, other]

Mixstyle based Domain Generalization for Sound Event Detection with Heterogeneous Training Data

Authors: Yang Xiao, Han Yin, Jisheng Bai, Rohan Kumar Das

Abstract: This work explores domain generalization (DG) for sound event detection (SED), advancing adaptability towards real-world scenarios. Our approach employs a mean-teacher framework with domain generalization to integrate heterogeneous training data, while preserving the SED model performance across the datasets. Specifically, we first apply mixstyle to the frequency dimension to adapt the mel-spectrograms from different domains. Next, we use the adaptive residual normalization method to generalize features across multiple domains by applying instance normalization in the frequency dimension. Lastly, we use the sound event bounding boxes method for post-processing. Our approach integrates features from bidirectional encoder representations from audio transformers and a convolutional recurrent neural network. We evaluate the proposed approach on DCASE 2024 Challenge Task 4 dataset, measuring polyphonic SED score (PSDS) on the DESED dataset and macro-average pAUC on the MAESTRO dataset. The results indicate that the proposed DG-based method improves both PSDS and macro-average pAUC compared to the challenge baseline.

Submitted 4 July, 2024; originally announced July 2024.

Comments: Submitted to DCASE WS 2024. 5 pages. arXiv admin note: text overlap with arXiv:2407.00291

arXiv:2407.02899 [pdf, other]

Measurement of the branching fraction of the decay $J/ψ\to p \bar{p} η$

Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere , et al. (639 additional authors not shown)

Abstract: A high precision measurement of the branching fraction of the decay $J/ψ\to p \bar{p} η$ is performed using $(10 087 \pm 44) \times 10^6$ $J/ψ$ events recorded by the {BESIII} detector at the {BEPCII} storage ring. The branching fractions of the two decays $J/ψ\to p \bar{p} η(η\to γγ)$ and $J/ψ\to p \bar{p} η(η\to π^+ π^- π^0)$ are measured individually to be $\mathcal{B}(J/ψ\to p \bar{p} η(η\to γγ)) = (1.480 \pm 0.001 \pm 0.024)\times\,10^{-3}$ and $\mathcal{B}(J/ψ\to p \bar{p} η(η\to π^+ π^- π^0)) = (1.557 \pm 0.003 \pm 0.038)\times\,10^{-3}$, where the first uncertainties are statistical and the second systematic. Both results are compatible within their uncorrelated systematic uncertainties. The combined result is $\mathcal{B}(J/ψ\to p \bar{p} η)=(1.495 \pm 0.001 \pm 0.023)\times\,10^{-3}$ where the first uncertainty is the combined statistical uncertainty and the second one the combined systematic uncertainty of both analyses, incorporating correlations between them. In addition, the $p \bar{p}$ threshold region is investigated for a potential threshold enhancement, and no evidence for one is observed.

Submitted 3 July, 2024; originally announced July 2024.

arXiv:2407.02580 [pdf, other]

A Precise Fitting Formula for Gravitational Wave Spectra from Phase Transitions

Authors: Huai-ke Guo, Fazlollah Hajkarim, Kuver Sinha, Graham White, Yang Xiao

Abstract: Obtaining a precise form for the predicted gravitational wave (GW) spectrum from a phase transition is a topic of great relevance for beyond Standard Model (BSM) physicists. Currently, the most sophisticated semi-analytic framework for estimating the dominant contribution to the spectrum is the sound shell model; however, full calculations within this framework can be computationally expensive, especially for large-scale scans. The community therefore generally manages with fit functions to the GW spectrum, the most widely used of which is a single broken power law. We provide a more precise fit function based on the sound shell model: our fit function features a double broken power law with two frequency breaks corresponding to the two characteristic length scales of the problem -- inter-bubble spacing and thickness of sound shells, the second of which is neglected in the single broken power law fit. Compared to previously proposed fits, we demonstrate that our fit function more faithfully captures the GW spectrum coming from a full calculation of the sound shell model, over most of the space of the thermodynamic parameters governing the phase transition. The physical origins of the fit parameters and their dependence on the thermodynamic parameters are studied in the underlying sound shell model: in particular, we perform a series of detailed scans for these quantities over the plane of the strength of the phase transition ($α$) and the bubble wall velocity ($v_w$). Wherever possible, we comment on the physical interpretations of these scans. The result of our study can be used to generate accurate GW spectra with our fit function, given initial inputs of $α$, $v_w$, $β/H$ (nucleation rate parameter) and $T_n$ (nucleation temperature) for the relevant BSM scenario.

Submitted 2 July, 2024; originally announced July 2024.

Comments: 31 pages, 9 figures

arXiv:2407.01796 [pdf, other]

Ground Every Sentence: Improving Retrieval-Augmented LLMs with Interleaved Reference-Claim Generation

Authors: Sirui Xia, Xintao Wang, Jiaqing Liang, Yifei Zhang, Weikang Zhou, Jiaji Deng, Fei Yu, Yanghua Xiao

Abstract: Retrieval-Augmented Generation (RAG) has been widely adopted to enhance Large Language Models (LLMs) in knowledge-intensive tasks. Recently, Attributed Text Generation (ATG) has attracted growing attention, which provides citations to support the model's responses in RAG, so as to enhance the credibility of LLM-generated content and facilitate verification. Prior methods mainly adopt coarse-grained attributions, linking to passage-level references or providing paragraph-level citations. However, these methods still fall short in verifiability and require certain time costs for fact checking. This paper proposes a fine-grained ATG method called ReClaim (Refer & Claim), which alternates the generation of references and answers step by step. Unlike traditional coarse-grained attribution, ReClaim allows the model to add sentence-level fine-grained citations to each answer sentence in long-form question-answering tasks. Our experiments encompass various training and inference methods and multiple LLMs, verifying the effectiveness of our approach.

Submitted 1 July, 2024; originally announced July 2024.

Comments: 15 pages, 2 figures

arXiv:2407.00904 [pdf, other]

Background-aware Multi-source Fusion Financial Trend Forecasting Mechanism

Authors: Fengting Mo, Shanshan Yan, Yinhao Xiao

Abstract: Stock prices, as an economic indicator, reflect changes in economic development and market conditions. Traditional stock price prediction models often only consider time-series data and are limited by the mechanisms of the models themselves. Some deep learning models have high computational costs, depend on a large amount of high-quality data, and have poor interpretability, making it difficult to intuitively understand the driving factors behind the predictions. Some studies have used deep learning models to extract text features and combine them with price data to make joint predictions, but there are issues with dealing with information noise, accurate extraction of text sentiment, and how to efficiently fuse text and numerical data. To address these issues, we propose a background-aware multi-source fusion financial trend forecasting mechanism. The system leverages a large language model to extract key information from policy and stock review texts, utilizing the MacBERT model to generate feature vectors. These vectors are then integrated with stock price data to form comprehensive feature representations. These integrated features are input into a neural network comprising various deep learning architectures. By integrating multiple data sources, the system offers a holistic view of market dynamics. It harnesses the comprehensive analytical and interpretative capabilities of large language models, retaining deep semantic and sentiment information from policy texts to provide richer input features for stock trend prediction.
Additionally, we compare the accuracy of six models (LSTM, BiLSTM, MogrifierLSTM, GRU, ST-LSTM, SwinLSTM). The results demonstrate that our system achieves generally better accuracy in predicting stock movements, attributed to the incorporation of large language model processing, policy information, and other influential features.

Submitted 30 June, 2024; originally announced July 2024.

arXiv:2407.00653 [pdf, other]

Chain-of-Knowledge: Integrating Knowledge Reasoning into Large Language Models by Learning from Knowledge Graphs

Authors: Yifei Zhang, Xintao Wang, Jiaqing Liang, Sirui Xia, Lida Chen, Yanghua Xiao

Abstract: Large Language Models (LLMs) have exhibited impressive proficiency in various natural language processing (NLP) tasks, which involve increasingly complex reasoning. Knowledge reasoning, a primary type of reasoning, aims at deriving new knowledge from existing knowledge. While it has been widely studied in the context of knowledge graphs (KGs), knowledge reasoning in LLMs remains underexplored. In this paper, we introduce Chain-of-Knowledge (CoK), a comprehensive framework for knowledge reasoning, including methodologies for both dataset construction and model learning. For dataset construction, we create KnowReason via rule mining on KGs. For model learning, we observe rule overfitting induced by naive training. Hence, we enhance CoK with a trial-and-error mechanism that simulates the human process of internal knowledge exploration. We conduct extensive experiments with KnowReason. Our results show the effectiveness of CoK in refining LLMs in not only knowledge reasoning, but also general reasoning benchmarks.

Submitted 30 June, 2024; originally announced July 2024.

arXiv:2407.00600 [pdf, other]

GenderBias-VL: Benchmarking Gender Bias in Vision Language Models via Counterfactual Probing

Authors: Yisong Xiao, Aishan Liu, QianJia Cheng, Zhenfei Yin, Siyuan Liang, Jiapeng Li, Jing Shao, Xianglong Liu, Dacheng Tao

Abstract: Large Vision-Language Models (LVLMs) have been widely adopted in various applications; however, they exhibit significant gender biases. Existing benchmarks primarily evaluate gender bias at the demographic group level, neglecting individual fairness, which emphasizes equal treatment of similar individuals. This research gap limits the detection of discriminatory behaviors, as individual fairness offers a more granular examination of biases that group fairness may overlook. For the first time, this paper introduces the GenderBias-VL benchmark to evaluate occupation-related gender bias in LVLMs using counterfactual visual questions under individual fairness criteria. To construct this benchmark, we first utilize text-to-image diffusion models to generate occupation images and their gender counterfactuals. Subsequently, we generate corresponding textual occupation options by identifying stereotyped occupation pairs with high semantic similarity but opposite gender proportions in real-world statistics. This method enables the creation of large-scale visual question counterfactuals to expose biases in LVLMs, applicable in both multimodal and unimodal contexts through modifying gender attributes in specific modalities. Overall, our GenderBias-VL benchmark comprises 34,581 visual question counterfactual pairs, covering 177 occupations. Using our benchmark, we extensively evaluate 15 commonly used open-source LVLMs (e.g., LLaVA) and state-of-the-art commercial APIs, including GPT-4o and Gemini-Pro.
Our findings reveal widespread gender biases in existing LVLMs. Our benchmark offers: (1) a comprehensive dataset for occupation-related gender bias evaluation; (2) an up-to-date leaderboard on LVLM biases; and (3) a nuanced understanding of the biases presented by these models. The dataset and code are available at https://genderbiasvl.github.io/.

Submitted 30 June, 2024; originally announced July 2024.

Comments: 9 pages, 4 figures

arXiv:2407.00291 [pdf, other]

FMSG-JLESS Submission for DCASE 2024 Task4 on Sound Event Detection with Heterogeneous Training Dataset and Potentially Missing Labels

Authors: Yang Xiao, Han Yin, Jisheng Bai, Rohan Kumar Das

Abstract: This report presents the systems developed and submitted by Fortemedia Singapore (FMSG) and Joint Laboratory of Environmental Sound Sensing (JLESS) for DCASE 2024 Task 4. The task focuses on recognizing event classes and their time boundaries, given that multiple events can be present and may overlap in an audio recording. The novelty this year is a dataset with two sources, making it challenging to achieve good performance without knowing the source of the audio clips during evaluation. To address this, we propose a sound event detection method using domain generalization. Our approach integrates features from bidirectional encoder representations from audio transformers and a convolutional recurrent neural network. We focus on three main strategies to improve our method. First, we apply mixstyle to the frequency dimension to adapt the mel-spectrograms from different domains. Second, we compute the training loss of our model separately for each dataset, restricted to its corresponding classes. This independent learning framework helps the model extract domain-specific features effectively. Lastly, we use the sound event bounding boxes method for post-processing. Our proposed method shows superior macro-average pAUC and polyphonic SED score performance on the DCASE 2024 Challenge Task 4 validation dataset and public evaluation dataset.

Submitted 28 June, 2024; originally announced July 2024.

Comments: Technical report for DCASE 2024 Challenge Task 4

arXiv:2407.00141 [pdf, other]

Towards Secure and Efficient Data Scheduling for Vehicular Social Networks

Authors: Youhua Xia, Tiehua Zhang, Jiong Jin, Ying He, Fei Yu

Abstract: Efficient data transmission scheduling within vehicular environments poses a significant challenge due to the high mobility of such networks. Contemporary research predominantly centers on crafting cooperative scheduling algorithms tailored for vehicular networks. Notwithstanding, the intricacies of orchestrating scheduling in vehicular social networks both effectively and efficiently remain formidable. This paper introduces an innovative learning-based algorithm for scheduling data transmission that prioritizes efficiency and security within vehicular social networks. The algorithm first uses a specifically constructed neural network to enhance data processing capabilities. After this, it incorporates a Q-learning paradigm during the data transmission phase to optimize the information exchange, the privacy of which is safeguarded by differential privacy through the communication process. Comparative experiments demonstrate the superior performance of the proposed Q-learning enhanced scheduling algorithm relative to existing state-of-the-art scheduling algorithms in the context of vehicular social networks.

Submitted 28 June, 2024; originally announced July 2024.

arXiv:2407.00136 [pdf, other]

Observation of the Electromagnetic Dalitz Transition $h_c \rightarrow e^+e^-η_c$

Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, S. Ahmed, M. Albrecht, R. Aliberti, A. Amoroso, M. R. An, Q. An, X. H. Bai, Y. Bai, O. Bakina, R. Baldini Ferroli, I. Balossino, Y. Ban, K. Begzsuren, N. Berger, M. Bertani, D. Bettoni, F. Bianchi, J. Bloms, A. Bortone, I. Boyko, R. A. Briere, et al. (495 additional authors not shown)

Abstract: Using $(27.12\pm 0.14)\times10^8$ $ψ(3686)$ decays and data samples of $e^+e^-$ collisions with $\sqrt{s}$ from 4.130 to 4.780 GeV collected with the BESIII detector, we report the first observation of the electromagnetic Dalitz transition $h_c\to e^+e^-η_c$ with a statistical significance of $5.4σ$. We measure the ratio of the branching fractions $\frac{\mathcal{B}(h_c\rightarrow e^+e^-η_c)}{\mathcal{B}(h_c\rightarrow γη_c)}$ separately for the $h_c$ samples produced via $ψ(3686)\toπ^0h_c$ and $e^+e^-\toπ^+π^-h_c$. The average ratio is determined to be $(0.59\pm0.10(\text{stat.})\pm0.04(\text{syst.}))\%$, where the uncertainty includes both statistical and systematic components.

Submitted 2 July, 2024; v1 submitted 28 June, 2024; originally announced July 2024.

arXiv:2406.19804 [pdf, ps, other]

Rateless Stochastic Coding for Delay-constrained Semantic Communication

Authors: Cheng Peng, Rulong Wang, Yong Xiao

Abstract: We consider the problem of joint source-channel coding (JSCC) with distortion and perception constraints from a rateless perspective, the purpose of which is to settle the balance between reliability (distortion/perception) and effectiveness (rate) of transmission over uncertain channels. We find a new finite-blocklength bound for the achievable joint source-channel code rate with the above two constraints. To achieve a superior rateless characteristic of JSCC, we perform multi-level optimization on various finite-blocklength codes. Based on these two, we then propose a new JSCC scheme called rateless stochastic coding (RSC). We experimentally demonstrate that the proposed RSC can achieve variable rates of transmission while maintaining an excellent trade-off between distortion and perception.

Submitted 28 June, 2024; originally announced June 2024.

arXiv:2406.19621 [pdf, other]

Machine-Learning-Driven Runtime Optimization of BLAS Level 3 on Modern Multi-Core Systems

Authors: Yufan Xia, Giuseppe Maria Junior Barca

Abstract: BLAS Level 3 operations are essential for scientific computing, but finding the optimal number of threads for multi-threaded implementations on modern multi-core systems is challenging. We present an extension to the Architecture and Data-Structure Aware Linear Algebra (ADSALA) library that uses machine learning to optimize the runtime of all BLAS Level 3 operations. Our method predicts the best number of threads for each operation based on the matrix dimensions and the system architecture. We test our method on two HPC platforms with Intel and AMD processors, using MKL and BLIS as baseline BLAS implementations. We achieve speedups of 1.5 to 3.0 for all operations, compared to using the maximum number of threads. We also analyze the runtime patterns of different BLAS operations and explain the sources of speedup. Our work shows the effectiveness and generality of the ADSALA approach for optimizing BLAS routines on modern multi-core systems.

Submitted 27 June, 2024; originally announced June 2024.

Comments: Multi-Thread, Matrix Multiplication, Optimization, BLAS, Machine Learning

Journal ref: 2024 International Parallel and Distributed Processing Symposium (IPDPS)

arXiv:2406.19190 [pdf, ps, other]

Improved measurement of the semileptonic decay $D^+_{s}\to K^0 e^+ν_e$

Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere, et al. (643 additional authors not shown)

Abstract: Analyzing $e^+e^-$ collision data corresponding to an integrated luminosity of $7.33~\mathrm{fb}^{-1}$ collected at center-of-mass energies between 4.128 and 4.226 GeV with the BESIII detector, we measure the branching fraction of the semileptonic decay $D^+_{s}\to K^0 e^+ν_e$ to be $(2.98\pm0.23\pm0.12)\times10^{-3}$. The $D_s^+\to K^0$ hadronic form factor is determined from the differential decay rate of $D^+_s\to K^0 e^+ν_e$ to be $f^{K^0}_+(0)=0.636\pm0.049\pm0.013$. For both measurements, the first uncertainty is statistical and the second systematic. The branching fraction and form factor measurements are factors of 1.6 and 1.7 more precise than the previous world averages, respectively.

Submitted 27 June, 2024; originally announced June 2024.

Comments: 13 pages, 6 figures

arXiv:2406.18921 [pdf, other]

Capturing Minds, Not Just Words: Enhancing Role-Playing Language Models with Personality-Indicative Data

Authors: Yiting Ran, Xintao Wang, Rui Xu, Xinfeng Yuan, Jiaqing Liang, Yanghua Xiao, Deqing Yang

Abstract: Role-playing agents (RPAs) have been a popular application area for large language models (LLMs), attracting significant interest from both industry and academia. While existing RPAs portray the characters' knowledge and tones well, they face challenges in capturing their minds, especially for small role-playing language models (RPLMs). In this paper, we propose to enhance RPLMs via personality-indicative data. Specifically, we leverage questions from psychological scales and distill advanced RPAs to generate dialogues that grasp the minds of characters. Experimental results validate that RPLMs trained with our dataset exhibit advanced role-playing capabilities for both general and personality-related evaluations. Code and data are available at https://github.com/alienet1109/RolePersonality.

Submitted 29 June, 2024; v1 submitted 27 June, 2024; originally announced June 2024.

Comments: 10 pages

arXiv:2406.18426 [pdf]

Fast 3D 31P B1+ mapping with a weighted stack of spiral trajectory at 7 Tesla

Authors: Mark Widmaier, Antonia Kaiser, Salome Baup, Daniel Wenz, Katarzyna Pierzchala, Ying Xiao, Zhiwei Huang, Yun Jiang, Lijing Xin

Abstract: Purpose: Phosphorus Magnetic Resonance Spectroscopy (31P MRS) enables non-invasive assessment of energy metabolism, yet its application is hindered by sensitivity limitations. To overcome this, high magnetic fields are often used, leading to challenges such as spatial B_1^+ inhomogeneity and therefore the need for accurate flip angle determination in accelerated acquisitions with short repetition times (T_R). In response to these challenges, we propose a novel short T_R and look-up table-based Double-Angle Method for fast 3D 31P B_1^+ mapping (fDAM). Methods: Our method incorporates 3D weighted stack of spiral gradient echo acquisitions and a frequency-selective pulse to enable efficient B_1^+ mapping based on the phosphocreatine signal at 7T. Protocols were optimised using simulations and validated through phantom experiments. The method was validated in phantom experiments and skeletal muscle applications using a birdcage 1H/31P volume coil. Results: The results of fDAM were compared to the classical DAM (cDAM). A good correlation (r=0.94) was obtained between the two B_1^+ maps. A 3D 31P B_1^+ mapping in the human calf muscle was achieved in about 10 min using a birdcage volume coil, with a 20% extended coverage relative to that of the cDAM (24 min). fDAM also enabled the first full brain coverage 31P 3D B_1^+ mapping in approx. 10 min using a 1 Tx/32 Rx coil. Conclusion: fDAM is an efficient method for 31P 3D B_1^+ mapping, showing promise for future applications in rapid 31P MRSI.

Submitted 26 June, 2024; originally announced June 2024.

arXiv:2406.18313 [pdf, other]

Advancing Airport Tower Command Recognition: Integrating Squeeze-and-Excitation and Broadcasted Residual Learning

Authors: Yuanxi Lin, Tonglin Zhou, Yang Xiao

Abstract: Accurate recognition of aviation commands is vital for flight safety and efficiency, as pilots must follow air traffic control instructions precisely. This paper addresses challenges in speech command recognition, such as noisy environments and limited computational resources, by advancing keyword spotting technology. We create a dataset of standardized airport tower commands, including routine and emergency instructions. We enhance broadcasted residual learning with squeeze-and-excitation and time-frame frequency-wise squeeze-and-excitation techniques, resulting in our BC-SENet model. This model focuses on crucial information with fewer parameters. Our tests on five keyword spotting models, including BC-SENet, demonstrate superior accuracy and efficiency. These findings highlight the effectiveness of our model advancements in improving speech command recognition for aviation safety and efficiency in noisy, high-stakes environments. Additionally, BC-SENet shows comparable performance on the common Google Speech Command dataset.

Submitted 28 June, 2024; v1 submitted 26 June, 2024; originally announced June 2024.

Comments: Accepted by IALP 2024

arXiv:2406.18183 [pdf, other]

Measurement of the cross sections of $e^+e^-\to K^{-}\barΞ^{+}Λ/Σ^{0}$ at center-of-mass energies between 3.510 and 4.914 GeV

Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere, et al. (638 additional authors not shown)

Abstract: Using $e^+e^-$ collision data collected with the BESIII detector at the BEPCII collider at center-of-mass energies between 3.510 and 4.914 GeV, corresponding to an integrated luminosity of 25 fb$^{-1}$, we measure the Born cross sections for the process $e^+e^-\to K^-\barΞ^+Λ/Σ^{0}$ at thirty-five energy points with a partial-reconstruction strategy. By fitting the dressed cross sections of $e^+e^-\to K^-\barΞ^+Λ/Σ^{0}$, evidence for $ψ(4160) \to K^{-}\barΞ^{+}Λ$ is found for the first time with a significance of 4.4$σ$, including systematic uncertainties. No evidence for other possible resonances is found. In addition, the products of electronic partial width and branching fraction for all assumed resonances decaying into $K^{-}\barΞ^{+}Λ/Σ^{0}$ are determined.

Submitted 26 June, 2024; originally announced June 2024.

Comments: 26 pages, 5 tables, 4 figures

arXiv:2406.18083 [pdf, other]

Measurements of $K_S^0$-$K_L^0$ asymmetries in the decays $Λ_c^+ \to pK_{L,S}^0$, $pK_{L,S}^0π^+π^-$ and $pK_{L,S}^0π^0$

Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere, et al. (643 additional authors not shown)

Abstract: Using $e^+e^-$ annihilation data sets corresponding to an integrated luminosity of 4.5 $\text{fb}^{-1}$, collected with the BESIII detector at center-of-mass energies between 4.600 and 4.699 GeV, we report the first measurements of the absolute branching fractions $\mathcal{B}(Λ_c^+\to pK_{L}^{0})=(1.67 \pm 0.06 \pm 0.04)\%$, $\mathcal{B}(Λ_c^+\to pK_{L}^{0}π^+π^-)=(1.69 \pm 0.10 \pm 0.05)\%$, and $\mathcal{B}(Λ_c^+\to pK_{L}^{0}π^0)=(2.02 \pm 0.13 \pm 0.05)\%$, where the first uncertainties are statistical and the second systematic. Combining with the known branching fractions of $Λ_c^+ \to pK_{S}^{0}$, $Λ_c^+ \to pK_{S}^{0}π^+π^-$, and $Λ_c^+ \to pK_{S}^{0}π^0$, we present the first measurements of the $K_{S}^{0}$-$K_{L}^{0}$ asymmetries $R(Λ_c^+, K_{S,L}^0X) = \frac{\mathcal{B}(Λ_c^+ \to K_{S}^{0} X) - \mathcal{B}(Λ_c^+ \to K_{L}^{0} X)}{\mathcal{B}(Λ_c^+ \to K_{S}^{0} X) + \mathcal{B}(Λ_c^+ \to K_{L}^{0} X)}$ in charmed baryon decays: $R(Λ_c^+, pK_{S,L}^0) = -0.025 \pm 0.031$, $R(Λ_c^+, pK_{S,L}^0π^+π^-) = -0.027 \pm 0.048$, and $R(Λ_c^+, pK_{S,L}^0π^0) =-0.015 \pm 0.046$. No significant asymmetries within the uncertainties are observed.

Submitted 26 June, 2024; originally announced June 2024.

Comments: 19 pages, 2 figures

arXiv:2406.18045 [pdf, other]

PharmaGPT: Domain-Specific Large Language Models for Bio-Pharmaceutical and Chemistry

Authors: Linqing Chen, Weilei Wang, Zilong Bai, Peng Xu, Yan Fang, Jie Fang, Wentao Wu, Lizhi Zhou, Ruiji Zhang, Yubin Xia, Chaobo Xu, Ran Hu, Licong Xu, Qijun Cai, Haoran Hua, Jing Sun, Jin Liu, Tian Qiu, Haowen Liu, Meng Hu, Xiuwen Li, Fei Gao, Yufu Wang, Lin Tie, Chaochao Wang, et al. (11 additional authors not shown)

Abstract: Large language models (LLMs) have revolutionized Natural Language Processing (NLP) by minimizing the need for complex feature engineering. However, the application of LLMs in specialized domains like biopharmaceuticals and chemistry remains largely unexplored. These fields are characterized by intricate terminologies, specialized knowledge, and a high demand for precision, areas where general-purpose LLMs often fall short. In this study, we introduce PharmaGPT, a suite of domain-specialized LLMs with 13 billion and 70 billion parameters, specifically trained on a comprehensive corpus tailored to the Bio-Pharmaceutical and Chemical domains. Our evaluation shows that PharmaGPT surpasses existing general models on specific-domain benchmarks such as NAPLEX, demonstrating its exceptional capability in domain-specific tasks. Remarkably, this performance is achieved with a model that has only a fraction, sometimes just one-tenth, of the parameters of general-purpose large models. This advancement establishes a new benchmark for LLMs in the bio-pharmaceutical and chemical fields, addressing the existing gap in specialized language modeling. It also suggests a promising path for enhanced research and development, paving the way for more precise and effective NLP applications in these areas.

Submitted 9 July, 2024; v1 submitted 25 June, 2024; originally announced June 2024.

arXiv:2406.17578 [pdf, other]

Sparse-view Signal-domain Photoacoustic Tomography Reconstruction Method Based on Neural Representation

Authors: Bowei Yao, Yi Zeng, Haizhao Dai, Qing Wu, Youshen Xiao, Fei Gao, Yuyao Zhang, Jingyi Yu, Xiran Cai

Abstract: Photoacoustic tomography is a hybrid biomedical technology, which combines the advantages of acoustic and optical imaging. However, for the conventional image reconstruction method, the image quality is visibly degraded by artifacts under the condition of sparse sampling. In this paper, a novel model-based sparse reconstruction method via implicit neural representation was proposed for improving the image quality reconstructed from sparse data. Specifically, the initial acoustic pressure distribution was modeled as a continuous function of spatial coordinates, and parameterized by a multi-layer perceptron. The weights of the multi-layer perceptron were determined by training the network in a self-supervised manner, and a total variation regularization term was used to supply prior knowledge. We compared our result with some ablation studies, and the results show that our method outperforms existing methods on simulation and experimental data. Under the sparse sampling condition, our method can suppress the artifacts and avoid the ill-posed problem effectively, reconstructing images with higher signal-to-noise ratio and contrast-to-noise ratio than traditional methods. The high-quality results for sparse data give the proposed method the potential to further decrease the hardware cost of photoacoustic tomography systems.

Submitted 25 June, 2024; originally announced June 2024.

Showing 1–50 of 3,002 results for author: Xiao, Y