-
Measurement of the branching fraction of $D^+_s\to \ell^+ν_\ell$ via $e^+e^-\to D^{*+}_{s} D^{*-}_{s}$
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere,
et al. (634 additional authors not shown)
Abstract:
Based on $10.64~\mathrm{fb}^{-1}$ of $e^+e^-$ collision data taken at center-of-mass energies between 4.237 and 4.699 GeV with the BESIII detector, we study the leptonic $D^+_s$ decays using the $e^+e^-\to D^{*+}_{s} D^{*-}_{s}$ process. The branching fractions of $D_s^+\to\ell^+ν_{\ell}\,(\ell=μ,τ)$ are measured to be $\mathcal{B}(D_s^+\toμ^+ν_μ)=(\bfmuv)\%$ and $\mathcal{B}(D_s^+\toτ^+ν_τ)=(\bftauv)\%$, respectively. The product of the decay constant and Cabibbo-Kobayashi-Maskawa matrix element $|V_{cs}|$ is determined to be $f_{D_s^+}|V_{cs}|=(\mufdsxvcsresult)_{μν}~\mathrm{MeV}$ and $f_{D_s^+}|V_{cs}|=(\taufdsxvcsresult)_{τν}~\mathrm{MeV}$, respectively. Taking the value of $|V_{cs}|$ from a global fit in the Standard Model, we obtain ${f_{D^+_s}}=(\mufdsresult)_{μν}$\,MeV and ${f_{D^+_s}}=(\taufdsresult)_{τν}$\,MeV, respectively. Conversely, taking the value for $f_{D_s^+}$ from the latest lattice quantum chromodynamics calculation, we obtain $|V_{cs}| =(\muvcsresult)_{μν}$ and $|V_{cs}| = (\tauvcsresult)_{τν}$, respectively.
Submitted 16 July, 2024;
originally announced July 2024.
-
Uncertainty is Fragile: Manipulating Uncertainty in Large Language Models
Authors:
Qingcheng Zeng,
Mingyu Jin,
Qinkai Yu,
Zhenting Wang,
Wenyue Hua,
Zihao Zhou,
Guangyan Sun,
Yanda Meng,
Shiqing Ma,
Qifan Wang,
Felix Juefei-Xu,
Kaize Ding,
Fan Yang,
Ruixiang Tang,
Yongfeng Zhang
Abstract:
Large Language Models (LLMs) are employed across various high-stakes domains, where the reliability of their outputs is crucial. One commonly used method to assess the reliability of LLMs' responses is uncertainty estimation, which gauges the likelihood of their answers being correct. While many studies focus on improving the accuracy of uncertainty estimations for LLMs, our research investigates the fragility of uncertainty estimation and explores potential attacks. We demonstrate that an attacker can embed a backdoor in LLMs, which, when activated by a specific trigger in the input, manipulates the model's uncertainty without affecting the final output. Specifically, the proposed backdoor attack method can alter an LLM's output probability distribution, causing the probability distribution to converge towards an attacker-predefined distribution while ensuring that the top-1 prediction remains unchanged. Our experimental results demonstrate that this attack effectively undermines the model's self-evaluation reliability in multiple-choice questions. For instance, we achieved a 100% attack success rate (ASR) across three different triggering strategies in four models. Further, we investigate whether this manipulation generalizes across different prompts and domains. This work highlights a significant threat to the reliability of LLMs and underscores the need for future defenses against such attacks. The code is available at https://github.com/qcznlp/uncertainty_attack.
Submitted 15 July, 2024;
originally announced July 2024.
-
Optical Diffusion Models for Image Generation
Authors:
Ilker Oguz,
Niyazi Ulas Dinc,
Mustafa Yildirim,
Junjie Ke,
Innfarn Yoo,
Qifei Wang,
Feng Yang,
Christophe Moser,
Demetri Psaltis
Abstract:
Diffusion models generate new samples by progressively decreasing the noise from the initially provided random distribution. This inference procedure generally utilizes a trained neural network numerous times to obtain the final output, creating significant latency and energy consumption on digital electronic hardware such as GPUs. In this study, we demonstrate that the propagation of a light beam through a semi-transparent medium can be programmed to implement a denoising diffusion model on image samples. This framework projects noisy image patterns through passive diffractive optical layers, which collectively only transmit the predicted noise term in the image. The optical transparent layers, which are trained with an online training approach, backpropagating the error to the analytical model of the system, are passive and kept the same across different steps of denoising. Hence this method enables high-speed image generation with minimal power consumption, benefiting from the bandwidth and energy efficiency of optical information processing.
Submitted 15 July, 2024;
originally announced July 2024.
-
Signature of Orbital Driven Finite Momentum Pairing in a 3D Ising Superconductor
Authors:
F. Z. Yang,
H. D. Zhang,
Saswata Mandal,
F. Y. Meng,
G. Fabbris,
A. Said,
P. Mercado Lozano,
A. Rajapitamahuni,
E. Vescovo,
C. Nelson,
S. Lin,
Y. Park,
E. M. Clements,
T. Z. Ward,
H. -N. Lee,
H. C. Lei,
C. X. Liu,
H. Miao
Abstract:
The finite momentum superconducting pairing states (FMPs), where Cooper pairs carry non-zero momentum, are believed to give rise to exotic physical phenomena including the pseudogap phase of cuprate high-Tc superconductors and Majorana fermions in topological superconductivity. FMPs can emerge in intertwined electronic liquids with strong spin-spin interactions or be induced by lifting the spin degeneracy under a magnetic field, as originally proposed by Fulde-Ferrell and Larkin-Ovchinnikov. In quantum materials with strong Ising-type spin-orbit coupling, such as the 2D transition metal dichalcogenides (TMDs), the spin degree of freedom is frozen, enabling novel orbital driven FMPs via the magnetoelectric effect. While evidence of orbital driven FMPs has been revealed in bilayer TMDs, their realization in 3D bulk materials remains an unresolved challenge. Here we report experimental signatures of FMP in a locally noncentrosymmetric bulk superconductor 4Hb-TaS2. Using hard X-ray diffraction and angle-resolved photoemission spectroscopy, we reveal an unusual 2D chiral charge density wave (CDW) and weak interlayer hopping in 4Hb-TaS2. Below the superconducting transition temperature, the upper critical field, Hc2, increases linearly with decreasing temperature and well exceeds the Pauli limit, thus establishing the dominant orbital pair-breaking mechanism. Remarkably, we discover a field-induced superconductivity-to-superconductivity transition that breaks the continuous rotational symmetry of the s-wave uniform pairing in the Bardeen-Cooper-Schrieffer theory down to six-fold rotation symmetry. Combined with a Ginzburg-Landau free energy analysis that incorporates the magnetoelectric effect, our observations provide strong evidence of orbital driven FMP in the 3D quantum heterostructure 4Hb-TaS2.
Submitted 14 July, 2024;
originally announced July 2024.
-
Symmetric Second-Harmonic Generation in Sub-wavelength Periodically Poled Thin Film Lithium Niobate
Authors:
Fengyan Yang,
Juanjuan Lu,
Mohan Shen,
Guangcanlan Yang,
Hong X. Tang
Abstract:
Second harmonic generation (SHG) extensively employs periodically poled nonlinear crystals through forward quasi-phase-matching to achieve efficient frequency conversion. As poling periods approach sub-micrometers, backward quasi-phase-matching has also been demonstrated, albeit by utilizing pulsed laser drives. The realization of symmetric second harmonic generation, characterized by counterpropagating pumps, however, has remained elusive despite theoretical predictions. The main challenge lies in achieving strong nonlinear coupling with a poling period below half the wavelength of the second-harmonic light. The recent emergence of high-quality ferroelectric lithium niobate thin films provides an opportunity for achieving precise domain control at submicron dimensions. In this article, we demonstrate reliable control of ferroelectric domains in a thin film lithium niobate waveguide with a poling period down to 370 nm, thereby realizing highly efficient continuous-wave pumped symmetric SHG. This demonstration not only validates the feasibility of achieving subwavelength periodic poling on waveguides but also opens new avenues for leveraging submicron ferroelectric domain structures in integrated photonics and nonlinear optics research.
Submitted 12 July, 2024;
originally announced July 2024.
-
Large spin-orbit torque in a-plane $α$-Fe$_{2}$O$_{3}$/Pt bilayers
Authors:
Igor Lyalin,
Hantao Zhang,
Justin Michel,
Daniel Russell,
Fengyuan Yang,
Ran Cheng,
Roland K. Kawakami
Abstract:
Realization of efficient spin-orbit torque switching of the Néel vector in insulating antiferromagnets is a challenge, often complicated by spurious effects. Quantifying the spin-orbit torques in antiferromagnet/heavy metal heterostructures is an important first step towards this goal. Here, we employ magneto-optic techniques to study damping-like spin-orbit torque (DL-SOT) in a-plane $α$-Fe$_2$O$_3$ (hematite) with a Pt spin-orbit overlayer. We find that the DL-SOT efficiency is two orders of magnitude larger than reported in c- and r-plane hematite/Pt using harmonic Hall techniques. The large magnitude of DL-SOT is supported by direct imaging of current-induced motion of antiferromagnetic domains that happens at moderate current densities. Our study introduces a new method for quantifying spin-orbit torque in antiferromagnets with a small canted moment and identifies a-plane $α$-Fe$_2$O$_3$ as a promising candidate to realize efficient SOT switching.
Submitted 10 July, 2024;
originally announced July 2024.
-
Study of the decay and production properties of $D_{s1}(2536)$ and $D_{s2}^*(2573)$
Authors:
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere,
A. Brueggemann,
et al. (645 additional authors not shown)
Abstract:
The $e^+e^-\rightarrow D_s^+D_{s1}(2536)^-$ and $e^+e^-\rightarrow D_s^+D^*_{s2}(2573)^-$ processes are studied using data samples collected with the BESIII detector at center-of-mass energies from 4.530 to 4.946~GeV. The absolute branching fractions of $D_{s1}(2536)^- \rightarrow \bar{D}^{*0}K^-$ and $D_{s2}^*(2573)^- \rightarrow \bar{D}^0K^-$ are measured for the first time to be $(35.9\pm 4.8\pm 3.5)\%$ and $(37.4\pm 3.1\pm 4.6)\%$, respectively. The measurements are in tension with predictions based on the assumption that the $D_{s1}(2536)$ and $D_{s2}^*(2573)$ are dominated by a bare $c\bar{s}$ component. The $e^+e^-\rightarrow D_s^+D_{s1}(2536)^-$ and $e^+e^-\rightarrow D_s^+D^*_{s2}(2573)^-$ cross sections are measured, and a resonant structure at around 4.6~GeV with a width of 50~MeV is observed for the first time with a statistical significance of $15σ$ in the $e^+e^-\rightarrow D_s^+D^*_{s2}(2573)^-$ process. It could be the $Y(4626)$ found by the Belle collaboration in the $D_s^+D_{s1}(2536)^{-}$ final state, since they have similar masses and widths. There is also evidence for a structure at around 4.75~GeV in both processes.
Submitted 10 July, 2024;
originally announced July 2024.
-
PAS: Data-Efficient Plug-and-Play Prompt Augmentation System
Authors:
Miao Zheng,
Hao Liang,
Fan Yang,
Haoze Sun,
Tianpeng Li,
Lingchu Xiong,
Yan Zhang,
Youzhen Wu,
Kun Li,
Yanjun Shen,
Mingan Lin,
Tao Zhang,
Guosheng Dong,
Yujing Qiao,
Kun Fang,
Weipeng Chen,
Bin Cui,
Wentao Zhang,
Zenan Zhou
Abstract:
In recent years, the rise of Large Language Models (LLMs) has spurred a growing demand for plug-and-play AI systems. Among the various AI techniques, prompt engineering stands out as particularly significant. However, users often face challenges in writing prompts due to the steep learning curve and significant time investment, and existing automatic prompt engineering (APE) models can be difficult to use. To address this issue, we propose PAS, an LLM-based plug-and-play APE system. PAS utilizes LLMs trained on high-quality, automatically generated prompt complementary datasets, resulting in exceptional performance. In comprehensive benchmarks, PAS achieves state-of-the-art (SoTA) results compared to previous APE models, with an average improvement of 6.09 points. Moreover, PAS is highly efficient, achieving SoTA performance with only 9000 data points. Additionally, PAS can autonomously generate prompt augmentation data without requiring additional human labor. Its flexibility also allows it to be compatible with all existing LLMs and applicable to a wide range of tasks. PAS excels in human evaluations, underscoring its suitability as a plug-in for users. This combination of high performance, efficiency, and flexibility makes PAS a valuable system for enhancing the usability and effectiveness of LLMs through improved prompt engineering.
Submitted 12 July, 2024; v1 submitted 8 July, 2024;
originally announced July 2024.
-
KidSat: satellite imagery to map childhood poverty dataset and benchmark
Authors:
Makkunda Sharma,
Fan Yang,
Duy-Nhat Vo,
Esra Suel,
Swapnil Mishra,
Samir Bhatt,
Oliver Fiala,
William Rudgard,
Seth Flaxman
Abstract:
Satellite imagery has emerged as an important tool to analyse demographic, health, and development indicators. While various deep learning models have been built for these tasks, each is specific to a particular problem, with few standard benchmarks available. We propose a new dataset pairing satellite imagery and high-quality survey data on child poverty to benchmark satellite feature representations. Our dataset consists of 33,608 images, each 10 km $\times$ 10 km, from 19 countries in Eastern and Southern Africa in the time period 1997-2022. As defined by UNICEF, multidimensional child poverty covers six dimensions and it can be calculated from the face-to-face Demographic and Health Surveys (DHS) Program. As part of the benchmark, we test spatial as well as temporal generalization, by testing on unseen locations, and on data after the training years. Using our dataset we benchmark multiple models, from low-level satellite imagery models such as MOSAIKS, to deep learning foundation models, which include both generic vision models such as Self-Distillation with no Labels (DINOv2) models and specific satellite imagery models such as SatMAE. We provide open source code for building the satellite dataset, obtaining ground truth data from DHS and running various models assessed in our work.
Submitted 8 July, 2024;
originally announced July 2024.
-
Unveiling nonmagnetic phase and many-body entanglement in two-dimensional random quantum magnets Sr$_2$CuTe$_{1-x}$W$_x$O$_6$
Authors:
Dian Wu,
Fan Yang,
Giuseppe Carleo
Abstract:
We apply a random-plaquette $J_1$-$J_2$ model on the square lattice to capture the physics of a series of spin-$1/2$ Heisenberg antiferromagnet compounds Sr$_2$CuTe$_{1-x}$W$_x$O$_6$. With the input of experimentally relevant coupling strengths, our exact diagonalization (ED) study probes the ground state properties beyond the previous linear spin-wave approach. An intermediate range of $x \in [0.08, 0.55]$ is identified for a nonmagnetic phase without the long-range Néel or stripe order. The absence of both valence-bond-glass order and spin-glass non-ergodic dynamics renders its nature intriguing. Deep inside this phase around $x = 0.3$, we observe signatures potentially linked to randomness-induced short-range spin-liquid-like (SLL) states, including a close-to-zero spin-freezing parameter, vanishing spin-spin correlation beyond nearest neighbors, an almost uniform static spin structure factor, as well as a broad tail in the dynamical spin structure factor. The nonmagnetic phase also features multipartite entanglement in the ground state witnessed by quantum Fisher information (QFI), which exhibits universal scaling behaviors at quantum critical points.
Submitted 8 July, 2024;
originally announced July 2024.
-
Testing the cosmic distance duality relation using Type Ia supernovae and radio quasars through model-independent methods
Authors:
Fan Yang,
Xiangyun Fu,
Bing Xu,
Kaituo Zhang,
Yang Huang,
Ying Yang
Abstract:
In this work, we perform a cosmological-model-independent test on the cosmic distance duality relation (CDDR) by comparing the angular diameter distance (ADD) obtained from the compact radio quasars (QSOs) with the luminosity distance (LD) from the Pantheon Type Ia supernovae (SNIa) sample. The binning method and Artificial Neural Network (ANN) are employed to match ADD data with LD data at the same redshift, and three different parameterizations are adopted to quantify the possible deviations from the CDDR. We initially investigate the impacts of the specific prior values for the absolute magnitude $M_{\rm B}$ from SNIa and the linear size scaling factor $l$ from QSOs on the CDDR test, demonstrating that these prior values introduce significant biases in the CDDR test. To avoid the biases, we propose a method independent of $M_{\rm B}$ and $l$ to test CDDR, which treats the fiducial value of a new variable $κ\equiv10^{M_{\rm B} \over 5}\,l$ as a nuisance parameter and then marginalizes its impact with a flat prior in the statistical analysis. The results show that the CDDR is consistent with the observational data, and QSOs can serve as a powerful tool for testing the CDDR independent of cosmological models.
Submitted 10 July, 2024; v1 submitted 7 July, 2024;
originally announced July 2024.
-
Fluid-Antenna Enhanced Integrated Sensing and Communication: Joint Antenna Positioning and Beamforming Design
Authors:
Tian Hao,
Changxin Shi,
Yinghong Guo,
Bin Xia,
Feng Yang
Abstract:
This paper investigates a fluid antenna (FA) enhanced integrated sensing and communication (ISAC) system consisting of a base station (BS), multiple single-antenna communication users, and one point target, where the BS is equipped with FAs to enhance both the communication and sensing performance. First, we formulate a problem that maximizes the radar signal-to-noise ratio (SNR) by jointly optimizing the FAs' positions and transmit beamforming matrix. Then, to tackle this highly non-convex problem, we present efficient algorithms by using alternating optimization (AO), successive convex approximation (SCA), and semi-definite relaxation (SDR). Numerical results demonstrate the convergence behavior and effectiveness of the proposed algorithm.
Submitted 7 July, 2024;
originally announced July 2024.
-
A Preconditioned Discontinuous Galerkin Method for Biharmonic Equation with $C^0$-Reconstructed Approximation
Authors:
Ruo Li,
Qicheng Liu,
Fanyi Yang
Abstract:
In this paper, we present a high-order finite element method based on a reconstructed approximation to the biharmonic equation. In our construction, the space is reconstructed from nodal values by solving a local least squares fitting problem per element. It is shown that the space can achieve an arbitrarily high-order accuracy and share the same nodal degrees of freedom with the $C^0$ linear space. The interior penalty discontinuous Galerkin scheme can be directly applied to the reconstructed space for solving the biharmonic equation. We prove that the numerical solution converges with optimal orders under error measurements. More importantly, we establish a norm equivalence between the reconstructed space and the continuous linear space. This property allows us to precondition the linear system arising from the high-order space by the linear space on the same mesh. This preconditioner is shown to be optimal in the sense that the condition number of the preconditioned system admits a uniform upper bound independent of the mesh size. Numerical examples in two and three dimensions are provided to illustrate the accuracy of the scheme and the efficiency of the preconditioning method.
Submitted 4 July, 2024;
originally announced July 2024.
-
Measurement of the branching fraction of the decay $J/ψ\to p \bar{p} η$
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere,
et al. (639 additional authors not shown)
Abstract:
A high precision measurement of the branching fraction of the decay $J/ψ\to p \bar{p} η$ is performed using $(10 087 \pm 44) \times 10^6$ $J/ψ$ events recorded by the {BESIII} detector at the {BEPCII} storage ring. The branching fractions of the two decays $J/ψ\to p \bar{p} η(η\to γγ)$ and $J/ψ\to p \bar{p} η(η\to π^+ π^- π^0)$ are measured individually to be $\mathcal{B}(J/ψ\to p \bar{p} η(η\to γγ)) = (1.480 \pm 0.001 \pm 0.024)\times\,10^{-3}$ and $\mathcal{B}(J/ψ\to p \bar{p} η(η\to π^+ π^- π^0)) = (1.557 \pm 0.003 \pm 0.038)\times\,10^{-3}$, where the first uncertainties are statistical and the second systematic. Both results are compatible within their uncorrelated systematic uncertainties. The combined result is $\mathcal{B}(J/ψ\to p \bar{p} η)=(1.495 \pm 0.001 \pm 0.023)\times\,10^{-3}$ where the first uncertainty is the combined statistical uncertainty and the second one the combined systematic uncertainty of both analyses, incorporating correlations between them. In addition, the $p \bar{p}$ threshold region is investigated for a potential threshold enhancement, and no evidence for one is observed.
Submitted 3 July, 2024;
originally announced July 2024.
-
ADFQ-ViT: Activation-Distribution-Friendly Post-Training Quantization for Vision Transformers
Authors:
Yanfeng Jiang,
Ning Sun,
Xueshuo Xie,
Fei Yang,
Tao Li
Abstract:
Vision Transformers (ViTs) have exhibited exceptional performance across diverse computer vision tasks, while their substantial parameter size incurs significantly increased memory and computational demands, impeding effective inference on resource-constrained devices. Quantization has emerged as a promising solution to mitigate these challenges, yet existing methods still suffer from significant accuracy loss at low-bit. We attribute this issue to the distinctive distributions of post-LayerNorm and post-GELU activations within ViTs, rendering conventional hardware-friendly quantizers ineffective, particularly in low-bit scenarios. To address this issue, we propose a novel framework called Activation-Distribution-Friendly post-training Quantization for Vision Transformers, ADFQ-ViT. Concretely, we introduce the Per-Patch Outlier-aware Quantizer to tackle irregular outliers in post-LayerNorm activations. This quantizer refines the granularity of the uniform quantizer to a per-patch level while retaining a minimal subset of values exceeding a threshold at full-precision. To handle the non-uniform distributions of post-GELU activations between positive and negative regions, we design the Shift-Log2 Quantizer, which shifts all elements to the positive region and then applies log2 quantization. Moreover, we present the Attention-score enhanced Module-wise Optimization which adjusts the parameters of each quantizer by reconstructing errors to further mitigate quantization error. Extensive experiments demonstrate ADFQ-ViT provides significant improvements over various baselines in image classification, object detection, and instance segmentation tasks at 4-bit. Specifically, when quantizing the ViT-B model to 4-bit, we achieve a 10.23% improvement in Top-1 accuracy on the ImageNet dataset.
Submitted 2 July, 2024;
originally announced July 2024.
-
On the Performance and Memory Footprint of Distributed Training: An Empirical Study on Transformers
Authors:
Zhengxian Lu,
Fangyu Wang,
Zhiwei Xu,
Fei Yang,
Tao Li
Abstract:
Transformer models have emerged as potent solutions to a wide array of multidisciplinary challenges. The deployment of Transformer architectures is significantly hindered by their extensive computational and memory requirements, necessitating the reliance on advanced efficient distributed training methodologies. Prior research has delved into the performance bottlenecks associated with distributed training, aiming to unravel these bottlenecks and suggest optimization directions. However, such analyses often overlook three aspects unique to Transformer models: the specialized architecture, the dependency on various distributed strategies, and the requirement to balance computational and memory overhead.
This paper aims to bridge this gap by offering a comprehensive examination of the performance bottlenecks inherent in distributed training of Transformer models, leveraging both theoretical analysis and empirical investigation. We propose an analytical framework tailored to these unique aspects of Transformers, facilitating a holistic evaluation of model architectures, distributed strategies, and resource consumption. Based on this analytical framework, we conduct a comparative analysis of theoretical performances and further systematically explore how various distributed training strategies fare in real-world scenarios. Most of the experimental results can be well explained by the analytical outcomes derived from the analytical framework. Notably, our findings suggest an advantage of pipeline parallelism over data parallelism for Transformer models. Moreover, we shed light on some unexpected outcomes, such as the potential for increased total memory overhead due to suboptimal model partitioning within pipeline parallelism. Additionally, we underscore the significance of communication block size and waiting time to further enhance performance.
Submitted 2 July, 2024;
originally announced July 2024.
-
Forecast Linear Augmented Projection (FLAP): A free lunch to reduce forecast error variance
Authors:
Yangzhuoran Fin Yang,
George Athanasopoulos,
Rob J. Hyndman,
Anastasios Panagiotelis
Abstract:
A novel forecast linear augmented projection (FLAP) method is introduced, which reduces the forecast error variance of any unbiased multivariate forecast without introducing bias. The method first constructs new component series which are linear combinations of the original series. Forecasts are then generated for both the original and component series. Finally, the full vector of forecasts is projected onto a linear subspace where the constraints implied by the combination weights hold. It is proven that the trace of the forecast error variance is non-increasing with the number of components, and mild conditions are established for which it is strictly decreasing. It is also shown that the proposed method achieves maximum forecast error variance reduction among linear projection methods. The theoretical results are validated through simulations and two empirical applications based on Australian tourism and FRED-MD data. Notably, using FLAP with Principal Component Analysis (PCA) to construct the new series leads to substantial forecast error variance reduction.
Submitted 1 July, 2024;
originally announced July 2024.
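
The projection step described in the FLAP abstract above can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes the components are `c = Phi @ y`, so the constraint on the stacked vector `z = [y; c]` is `C z = 0` with `C = [-Phi, I]`, and it uses the familiar covariance-weighted projection `M = I - W C'(C W C')^{-1} C`. The weight matrix `w` (here the identity) stands in for a forecast error covariance and is an assumption, as are all variable names and the toy data.

```python
import numpy as np


def flap_project(z_hat, phi, w):
    """Project stacked forecasts z_hat = [y_hat; c_hat] onto {z : C z = 0}.

    phi : (p, m) combination weights defining the components c = phi @ y.
    w   : (m + p, m + p) assumed forecast error covariance (hypothetical input).
    """
    p, m = phi.shape
    c = np.hstack([-phi, np.eye(p)])  # constraint matrix C = [-Phi, I]
    # z_tilde = z_hat - W C' (C W C')^{-1} C z_hat
    correction = w @ c.T @ np.linalg.solve(c @ w @ c.T, c @ z_hat)
    z_tilde = z_hat - correction
    return z_tilde[:m], z_tilde


# Toy data: 4 original series, 2 components built as linear combinations.
rng = np.random.default_rng(0)
m, p = 4, 2
phi = rng.normal(size=(p, m))
y_hat = rng.normal(size=m)                           # base forecasts of the series
c_hat = phi @ y_hat + rng.normal(scale=0.1, size=p)  # separately forecast components
z_hat = np.concatenate([y_hat, c_hat])

y_tilde, z_tilde = flap_project(z_hat, phi, np.eye(m + p))
```

After projection, the component part of `z_tilde` equals `phi @ y_tilde` up to floating-point error, which is the coherence constraint the abstract refers to.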
-
Balanced clique subdivisions and cycles lengths in $K_{s, t}$-free graphs
Authors:
Jianfeng Hou,
Yindong Jin,
Donglei Yang,
Fan Yang
Abstract:
Let $ t\ge s\ge2$ be integers. Confirming a conjecture of Mader, Liu and Montgomery [J. Lond. Math. Soc., 2017] showed that every $K_{s, t}$-free graph with average degree $d$ contains a subdivision of a clique with at least $Ω(d^{\frac{s}{2(s-1)}})$ vertices. We give an improvement by showing that such a graph contains a balanced subdivision of a clique with the same order, where a balanced subdivision is a subdivision in which each edge is subdivided the same number of times.
In 1975, Erdős asked whether the sum of the reciprocals of the cycle lengths in a graph with infinite average degree $d$ is necessarily infinite. Recently, Liu and Montgomery [J. Amer. Math. Soc., 2023] confirmed the asymptotically correct lower bound on the reciprocals of the cycle lengths, and provided a lower bound of $(\frac{1}{2} -o_d(1)) \log d$. In this paper, we improve this lower bound to $\left(\frac{s}{2(s-1)} -o_d(1)\right) \log d$ for $K_{s, t}$-free graphs.
Both proofs of our results use the graph sublinear expansion property as well as some novel structural techniques.
Submitted 29 June, 2024;
originally announced July 2024.
-
Actuation system of the inertial sensor for high-precision space missions using torsion pendulum
Authors:
Fangchao Yang,
Yan Zhu,
Xiaofei Jin,
Yujie Zhao,
Shixun Pei,
Wei Hong
Abstract:
Precision space inertial sensors are imperative to Earth geodesy missions, gravitational wave observations and several fundamental physics experiments in space. In these missions, the residual acceleration noise of the test mass (TM) caused by forces from the inertial sensor components and the environment must be kept below a certain level. As a number of forces contributing to residual acceleration are related to the actuation system, developing a precise actuation system that excludes any erroneous force and achieves an ultra-sensitive value for TM acceleration noise is essential. However, it is difficult to test the actuation system on the ground. In this paper, a torsion pendulum is established to test the influence of the actuation system on TM torque noise, and a closed-loop control system combining the torsion pendulum and parts of the actuation modules is designed to assess the performance of the actuation control algorithm. The experimental results show that the parameters in an actuation system introduce additional torque noise, and the maximum noise can reach as much as $10^{-13}~\mathrm{N\,m}/\mathrm{Hz}^{1/2}$ at 1 mHz. The stable tracking error for the closed-loop system is about $10^{-7}$, indicating that the combined system achieves good tracking performance and robustness for TM rotation control in different conditions of inertial sensors.
Submitted 10 July, 2024; v1 submitted 1 July, 2024;
originally announced July 2024.
-
Chiral Quantum-Optical Elements for Waveguide-QED with Sub-wavelength Rydberg-Atom Arrays
Authors:
Lida Zhang,
Fan Yang,
Klaus Mølmer,
Thomas Pohl
Abstract:
We describe an approach to achieve near-perfect unidirectional light-matter coupling to an effective quantum emitter that is formed by a subwavelength array of atoms in the Rydberg-blockade regime. The nonlinear reflection and transmission of such two-dimensional superatoms are exploited in different interferometric setups for the deterministic generation of tunable single photons and entangling two-photon operations with high fidelities, $\mathcal{F}\gtrsim0.999$. The described setup can function as a versatile nonlinear optical element in a free-space photonic quantum network with simple linear elements and without the need of additional mode confinement, optical resonators, or optical isolators.
Submitted 1 July, 2024;
originally announced July 2024.
-
Active Healing of Microtubule-Motor Networks
Authors:
Fan Yang,
Shichen Liu,
Heun Jin Lee,
Rob Phillips,
Matt Thomson
Abstract:
Cytoskeletal networks have a self-healing property where networks can repair defects to maintain structural integrity. However, both the mechanisms and dynamics of healing remain largely unknown. Here we report an unexplored healing mechanism in microtubule-motor networks by active crosslinking. We directly generate network cracks using a light-controlled microtubule-motor system, and observe that the cracks can self-heal. Combining theory and experiment, we find that the networks must overcome internal elastic resistance in order to heal cracks, giving rise to a bifurcation of dynamics dependent on the initial opening angle of the crack: the crack heals below a critical angle and opens up at larger angles. Simulation of a continuum model reproduces the bifurcation dynamics, revealing the importance of a boundary layer where free motors and microtubules can actively crosslink and thereby heal the crack. We also formulate a simple elastic-rod model that can qualitatively predict the critical angle, which is found to be tunable by two dimensionless geometric parameters, the ratio of the boundary layer and network width, and the aspect ratio of the network. Our results provide a new framework for understanding healing in cytoskeletal networks and designing self-healable biomaterials.
Submitted 30 June, 2024;
originally announced July 2024.
-
Learning Granularity-Aware Affordances from Human-Object Interaction for Tool-Based Functional Grasping in Dexterous Robotics
Authors:
Fan Yang,
Wenrui Chen,
Kailun Yang,
Haoran Lin,
DongSheng Luo,
Conghui Tang,
Zhiyong Li,
Yaonan Wang
Abstract:
To enable robots to use tools, the initial step is teaching robots to employ dexterous gestures for touching specific areas precisely where tasks are performed. Affordance features of objects serve as a bridge in the functional interaction between agents and objects. However, leveraging these affordance cues to help robots achieve functional tool grasping remains unresolved. To address this, we propose a granularity-aware affordance feature extraction method for locating functional affordance areas and predicting dexterous coarse gestures. We study the intrinsic mechanisms of human tool use. On one hand, we use fine-grained affordance features of object-functional finger contact areas to locate functional affordance regions. On the other hand, we use highly activated coarse-grained affordance features in hand-object interaction regions to predict grasp gestures. Additionally, we introduce a model-based post-processing module that includes functional finger coordinate localization, finger-to-end coordinate transformation, and force feedback-based coarse-to-fine grasping. This forms a complete dexterous robotic functional grasping framework, GAAF-Dex, which learns Granularity-Aware Affordances from human-object interaction for tool-based Functional grasping in Dexterous Robotics. Unlike fully-supervised methods that require extensive data annotation, we employ a weakly supervised approach to extract relevant cues from exocentric (Exo) images of hand-object interactions to supervise feature extraction in egocentric (Ego) images. We have constructed a small-scale dataset, FAH, which includes nearly 6K functional hand-object interaction Exo and Ego images of 18 commonly used tools performing 6 tasks. Extensive experiments on the dataset demonstrate that our method outperforms state-of-the-art methods. The code will be made publicly available at https://github.com/yangfan293/GAAF-DEX.
Submitted 30 June, 2024;
originally announced July 2024.
-
OfCaM: Global Human Mesh Recovery via Optimization-free Camera Motion Scale Calibration
Authors:
Fengyuan Yang,
Kerui Gu,
Ha Linh Nguyen,
Angela Yao
Abstract:
Accurate camera motion estimation is critical to estimate human motion in the global space. A standard and widely used method for estimating camera motion is Simultaneous Localization and Mapping (SLAM). However, SLAM only provides a trajectory up to an unknown scale factor. Different from previous attempts that optimize the scale factor, this paper presents Optimization-free Camera Motion Scale Calibration (OfCaM), a novel framework that utilizes prior knowledge from human mesh recovery (HMR) models to directly calibrate the unknown scale factor. Specifically, OfCaM leverages the absolute depth of human-background contact joints from HMR predictions as a calibration reference, enabling the precise recovery of SLAM camera trajectory scale in global space. With this correctly scaled camera motion and HMR's local motion predictions, we achieve more accurate global human motion estimation. To compensate for scenes where we detect SLAM failure, we adopt a local-to-global motion mapping to fuse with previously derived motion to enhance robustness. Simple yet powerful, our method sets a new standard for global human mesh estimation tasks, reducing global human motion error by 60% over the prior SOTA while also demanding orders of magnitude less inference time compared with optimization-based methods.
Submitted 29 June, 2024;
originally announced July 2024.
-
Solving Token Gradient Conflict in Mixture-of-Experts for Large Vision-Language Model
Authors:
Longrong Yang,
Dong Sheng,
Chaoxiang Cai,
Fan Yang,
Size Li,
Di Zhang,
Xi Li
Abstract:
The Mixture-of-Experts (MoE) has gained increasing attention in the study of Large Vision-Language Models (LVLMs). It uses a sparse model to replace the dense model, achieving comparable performance while activating fewer parameters during inference, thus significantly reducing the inference cost. Existing MoE methods in LVLMs encourage different experts to handle different tokens, and thus they employ a router to predict the routing for each token. However, the predictions are based solely on sample features and do not truly reveal the optimization direction of tokens. This can lead to severe optimization conflicts between different tokens within an expert. To address this problem, this paper proposes a novel method based on token-level gradient analysis. Specifically, we first use token-level gradients to identify conflicting tokens in experts. Then, we add a specialized loss tailored to eliminate conflicts among tokens within each expert. Our method can serve as a plug-in for diverse Large Vision-Language Models, and extensive experimental results demonstrate the effectiveness of our method. The code will be publicly available at https://github.com/longrongyang/STGC.
Submitted 28 June, 2024;
originally announced June 2024.
-
Efficient Event Stream Super-Resolution with Recursive Multi-Branch Fusion
Authors:
Quanmin Liang,
Zhilin Huang,
Xiawu Zheng,
Feidiao Yang,
Jun Peng,
Kai Huang,
Yonghong Tian
Abstract:
Current Event Stream Super-Resolution (ESR) methods overlook the redundant and complementary information present in positive and negative events within the event stream, employing a direct mixing approach for super-resolution, which may lead to detail loss and inefficiency. To address these issues, we propose an efficient Recursive Multi-Branch Information Fusion Network (RMFNet) that separates positive and negative events for complementary information extraction, followed by mutual supplementation and refinement. In particular, we introduce Feature Fusion Modules (FFM) and Feature Exchange Modules (FEM). FFM is designed for the fusion of contextual information within neighboring event streams, leveraging the coupling relationship between positive and negative events to alleviate the misleading effect of noise in the respective branches. FEM efficiently promotes the fusion and exchange of information between positive and negative branches, enabling superior local information enhancement and global information complementation. Experimental results demonstrate that our approach achieves over 17% and 31% improvement on synthetic and real datasets, accompanied by a 2.3X acceleration. Furthermore, we evaluate our method on two downstream event-driven applications, i.e., object recognition and video reconstruction, achieving remarkable results that outperform existing methods. Our code and Supplementary Material are available at https://github.com/Lqm26/RMFNet.
Submitted 28 June, 2024;
originally announced June 2024.
-
AutoRAG-HP: Automatic Online Hyper-Parameter Tuning for Retrieval-Augmented Generation
Authors:
Jia Fu,
Xiaoting Qin,
Fangkai Yang,
Lu Wang,
Jue Zhang,
Qingwei Lin,
Yubo Chen,
Dongmei Zhang,
Saravan Rajmohan,
Qi Zhang
Abstract:
Recent advancements in Large Language Models have transformed ML/AI development, necessitating a reevaluation of AutoML principles for Retrieval-Augmented Generation (RAG) systems. To address the challenges of hyper-parameter optimization and online adaptation in RAG, we propose the AutoRAG-HP framework, which formulates hyper-parameter tuning as an online multi-armed bandit (MAB) problem and introduces a novel two-level Hierarchical MAB (Hier-MAB) method for efficient exploration of large search spaces. We conduct extensive experiments on tuning hyper-parameters, such as top-k retrieved documents, prompt compression ratio, and embedding methods, using the ALCE-ASQA and Natural Questions datasets. Our evaluation from jointly optimizing all three hyper-parameters demonstrates that MAB-based online learning methods can achieve Recall@5 $\approx 0.8$ for scenarios with prominent gradients in search space, using only $\sim20\%$ of the LLM API calls required by the Grid Search approach. Additionally, the proposed Hier-MAB approach outperforms other baselines in more challenging optimization scenarios. The code will be made available at https://aka.ms/autorag.
Submitted 27 June, 2024;
originally announced June 2024.
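
The bandit formulation of hyper-parameter tuning in the AutoRAG-HP abstract above can be illustrated with a small sketch. It is not the paper's Hier-MAB method: it swaps in plain UCB1 over a single hyper-parameter, and the arm values, the `toy_reward` function, and its reward numbers are all hypothetical stand-ins for evaluating one RAG query at a given top-k.

```python
import math


def ucb1_tune(arms, reward_fn, rounds):
    """Run UCB1 over candidate settings; requires rounds >= len(arms)."""
    pulls = [0] * len(arms)
    totals = [0.0] * len(arms)
    for t in range(1, rounds + 1):
        if t <= len(arms):
            i = t - 1  # play every arm once before using the index
        else:
            # Pick the arm with the largest mean reward plus exploration bonus.
            i = max(
                range(len(arms)),
                key=lambda j: totals[j] / pulls[j]
                + math.sqrt(2.0 * math.log(t) / pulls[j]),
            )
        totals[i] += reward_fn(arms[i])
        pulls[i] += 1
    means = [totals[j] / pulls[j] for j in range(len(arms))]
    return pulls, means


def toy_reward(top_k):
    """Hypothetical, noise-free reward for a top-k setting (made-up values)."""
    return {1: 0.2, 3: 0.5, 5: 0.8}[top_k]


arms = [1, 3, 5]  # candidate top-k values (illustrative only)
pulls, means = ucb1_tune(arms, toy_reward, rounds=300)
best_top_k = arms[pulls.index(max(pulls))]
```

Each round costs one evaluation, so the pull counts show how the budget concentrates on the better setting instead of being spread evenly as in a grid search.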
-
Improved measurement of the semileptonic decay $D^+_{s}\to K^0 e^+ν_e$
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (643 additional authors not shown)
Abstract:
Analyzing $e^+e^-$ collision data corresponding to an integrated luminosity of $7.33~\mathrm{fb}^{-1}$ collected at center-of-mass energies between 4.128 and 4.226~GeV with the BESIII detector, we measure the branching fraction of the semileptonic decay $D^+_{s}\to K^0 e^+ν_e$ to be $(2.98\pm0.23\pm0.12)\times10^{-3}$. The $D_s^+\to K^0$ hadronic form factor is determined from the differential decay rate of $D^+_s\to K^0 e^+ν_e$ to be $f^{K^0}_+(0)=0.636\pm0.049\pm0.013$. For both measurements, the first uncertainty is statistical and the second systematic. The branching fraction and form factor measurements are factors of 1.6 and 1.7 more precise than the previous world averages, respectively.
Submitted 27 June, 2024;
originally announced June 2024.
-
Confident Natural Policy Gradient for Local Planning in $q_π$-realizable Constrained MDPs
Authors:
Tian Tian,
Lin F. Yang,
Csaba Szepesvári
Abstract:
The constrained Markov decision process (CMDP) framework emerges as an important reinforcement learning approach for imposing safety or other critical objectives while maximizing cumulative reward. However, the current understanding of how to learn efficiently in a CMDP environment with a potentially infinite number of states remains under investigation, particularly when function approximation is applied to the value functions. In this paper, we address the learning problem given linear function approximation with $q_π$-realizability, where the value functions of all policies are linearly representable with a known feature map, a setting known to be more general and challenging than other linear settings. Utilizing a local-access model, we propose a novel primal-dual algorithm that, after $\tilde{O}(\text{poly}(d) ε^{-3})$ queries, outputs with high probability a policy that strictly satisfies the constraints while nearly optimizing the value with respect to a reward function. Here, $d$ is the feature dimension and $ε> 0$ is a given error. The algorithm relies on a carefully crafted off-policy evaluation procedure to evaluate the policy using historical data, which informs policy updates through policy gradients and conserves samples. To our knowledge, this is the first result achieving polynomial sample complexity for CMDP in the $q_π$-realizable setting.
Submitted 26 June, 2024;
originally announced June 2024.
-
Measurement of the cross sections of $e^+e^-\to K^{-}\barΞ^{+}Λ/Σ^{0}$ at center-of-mass energies between 3.510 and 4.914 GeV
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (638 additional authors not shown)
Abstract:
Using $e^+e^-$ collision data collected with the BESIII detector at the BEPCII collider at center-of-mass energies between 3.510 and 4.914 GeV, corresponding to an integrated luminosity of 25 fb$^{-1}$, we measure the Born cross sections for the process $e^+e^-\to K^-\barΞ^+Λ/Σ^{0}$ at thirty-five energy points with a partial-reconstruction strategy. By fitting the dressed cross sections of $e^+e^-\to K^-\barΞ^+Λ/Σ^{0}$, evidence for $ψ(4160) \to K^{-}\barΞ^{+}Λ$ is found for the first time with a significance of 4.4$σ$, including systematic uncertainties. No evidence for other possible resonances is found. In addition, the products of electronic partial width and branching fraction for all assumed resonances decaying into $K^{-}\barΞ^{+}Λ/Σ^{0}$ are determined.
Submitted 26 June, 2024;
originally announced June 2024.
-
Measurements of $K_S^0$-$K_L^0$ asymmetries in the decays $Λ_c^+ \to pK_{L,S}^0$, $pK_{L,S}^0π^+π^-$ and $pK_{L,S}^0π^0$
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (643 additional authors not shown)
Abstract:
Using $e^+e^-$ annihilation data sets corresponding to an integrated luminosity of 4.5 $\text{fb}^{-1}$, collected with the BESIII detector at center-of-mass energies between 4.600 and 4.699 GeV, we report the first measurements of the absolute branching fractions $\mathcal{B}(Λ_c^+\to pK_{L}^{0})=(1.67 \pm 0.06 \pm 0.04)\%$, $\mathcal{B}(Λ_c^+\to pK_{L}^{0}π^+π^-)=(1.69 \pm 0.10 \pm 0.05)\%$, and $\mathcal{B}(Λ_c^+\to pK_{L}^{0}π^0)=(2.02 \pm 0.13 \pm 0.05)\%$, where the first uncertainties are statistical and the second systematic. Combining with the known branching fractions of $Λ_c^+ \to pK_{S}^{0}$, $Λ_c^+ \to pK_{S}^{0}π^+π^-$, and $Λ_c^+ \to pK_{S}^{0}π^0$, we present the first measurements of the $K_{S}^{0}$-$K_{L}^{0}$ asymmetries $R(Λ_c^+, K_{S,L}^0X) = \frac{\mathcal{B}(Λ_c^+ \to K_{S}^{0} X) - \mathcal{B}(Λ_c^+ \to K_{L}^{0} X)}{\mathcal{B}(Λ_c^+ \to K_{S}^{0} X) + \mathcal{B}(Λ_c^+ \to K_{L}^{0} X)}$ in charmed baryon decays: $R(Λ_c^+, pK_{S,L}^0) = -0.025 \pm 0.031$, $R(Λ_c^+, pK_{S,L}^0π^+π^-) = -0.027 \pm 0.048$, and $R(Λ_c^+, pK_{S,L}^0π^0) =-0.015 \pm 0.046$. No significant asymmetries within the uncertainties are observed.
Submitted 26 June, 2024;
originally announced June 2024.
-
Learning for Bandits under Action Erasures
Authors:
Osama Hanna,
Merve Karakas,
Lin F. Yang,
Christina Fragouli
Abstract:
We consider a novel multi-arm bandit (MAB) setup, where a learner needs to communicate the actions to distributed agents over erasure channels, while the rewards for the actions are directly available to the learner through external sensors. In our model, while the distributed agents know if an action is erased, the central learner does not (there is no feedback), and thus does not know whether the observed reward resulted from the desired action or not. We propose a scheme that can work on top of any (existing or future) MAB algorithm and make it robust to action erasures. Our scheme results in a worst-case regret over action-erasure channels that is at most a factor of $O(1/\sqrt{1-ε})$ away from the no-erasure worst-case regret of the underlying MAB algorithm, where $ε$ is the erasure probability. We also propose a modification of the successive arm elimination algorithm and prove that its worst-case regret is $\tilde{O}(\sqrt{KT}+K/(1-ε))$, which we prove is optimal by providing a matching lower bound.
Submitted 26 June, 2024;
originally announced June 2024.
-
Study of the $f_{0}(980)$ through the decay $D_{s}^{+}\rightarrow π^{+}π^{+}π^{-}π^{0}$
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (649 additional authors not shown)
Abstract:
We perform the first amplitude analysis of $D^+_s \to π^+π^+π^-π^0$ decays, based on data samples of electron-positron collisions recorded with the BESIII detector at center-of-mass energies between 4.128 and 4.226 GeV, corresponding to an integrated luminosity of 7.33~fb$^{-1}$. We report the observation of $D_{s}^{+} \to f_0(980)ρ(770)^{+}$ with a statistical significance greater than 10$σ$ and determine the branching fractions $\mathcal{B}(D_s^+\toπ^+π^+π^-π^0|_{{\rm non}-η})=(2.04\pm0.08_{\rm stat.}\pm0.05_{\rm syst.})\%$ and $\mathcal{B}(D_s^+\toηπ^+)=(1.56\pm0.09_{\rm stat.}\pm0.04_{\rm syst.})\%$. Moreover, we measure the relative branching fraction between $φ\toπ^+π^-π^0$ and $φ\to K^+K^-$ to be $\frac{\mathcal{B}(φ(1020) \to π^+π^-π^0)}{\mathcal{B}(φ(1020) \to K^+K^-)}=0.230 \pm 0.014_{\rm stat.} \pm 0.010_{\rm syst.}$, which deviates from the world average value by more than $4σ$.
Submitted 25 June, 2024;
originally announced June 2024.
-
Displaced Heavy Neutral Lepton from New Higgs Doublet
Authors:
Fa-Xin Yang,
Feng-Lan Shao,
Zhi-Long Han,
Yi Jin,
Honglei Li
Abstract:
Heavy neutral leptons $N$ are introduced to explain the tiny neutrino masses via the seesaw mechanism. For a suitably small mixing parameter $V_{\ell N}$, the heavy neutral leptons $N$ become long-lived, which leads to a displaced vertex (DV) signature at colliders. In this paper, we consider the displaced heavy neutral lepton from the decay of the neutrinophilic Higgs doublet $Φ_ν$. The new Higgs doublet with an MeV-scale VEV can naturally explain the tiny neutrino masses with TeV-scale $N$. Different from current experimental searches via the $W^\pm\to \ell^\pm N$ decay, the new decays $H^\pm\to \ell^\pm N$ are not suppressed by the small mixing parameter $V_{\ell N}$. Therefore, a larger parameter space is expected to be accessible at colliders. We then investigate the promising region at the 14 TeV HL-LHC and the 3 TeV CLIC. According to our simulation, the DV signature could probe $|V_{\ell N}|^2\gtrsim10^{-19}$ with $m_N<m_{H^+}$, which covers the seesaw predicted value $|V_{\ell N}|^2\sim m_ν/m_N$. We could probe $m_{H^+}\lesssim1200$ GeV at the 14 TeV HL-LHC and $m_{H^+}\lesssim1490$ GeV at the 3 TeV CLIC.
Submitted 23 June, 2024;
originally announced June 2024.
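
To give a sense of scale for the seesaw relation quoted in the abstract above, here is a rough worked estimate. The input values $m_ν \approx 0.1$ eV and $m_N \approx 1$ TeV are illustrative assumptions, not numbers taken from the paper.

```latex
|V_{\ell N}|^2 \sim \frac{m_\nu}{m_N}
  \approx \frac{0.1~\mathrm{eV}}{10^{12}~\mathrm{eV}}
  = 10^{-13}
```

Under these assumed inputs, the seesaw value lies about six orders of magnitude above the quoted reach of $|V_{\ell N}|^2\gtrsim10^{-19}$, consistent with the statement that the reach covers the seesaw prediction.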
-
Imperative Learning: A Self-supervised Neural-Symbolic Learning Framework for Robot Autonomy
Authors:
Chen Wang,
Kaiyi Ji,
Junyi Geng,
Zhongqiang Ren,
Taimeng Fu,
Fan Yang,
Yifan Guo,
Haonan He,
Xiangyu Chen,
Zitong Zhan,
Qiwei Du,
Shaoshu Su,
Bowen Li,
Yuheng Qiu,
Yi Du,
Qihang Li,
Yifan Yang,
Xiao Lin,
Zhipeng Zhao
Abstract:
Data-driven methods such as reinforcement and imitation learning have achieved remarkable success in robot autonomy. However, their data-centric nature still hinders them from generalizing well to ever-changing environments. Moreover, collecting large datasets for robotic tasks is often impractical and expensive. To overcome these challenges, we introduce a new self-supervised neural-symbolic (NeSy) computational framework, imperative learning (IL), for robot autonomy, leveraging the generalization abilities of symbolic reasoning. The framework of IL consists of three primary components: a neural module, a reasoning engine, and a memory system. We formulate IL as a special bilevel optimization (BLO), which enables reciprocal learning over the three modules. This overcomes the label-intensive obstacles associated with data-driven approaches and takes advantage of symbolic reasoning concerning logical reasoning, physical principles, geometric analysis, etc. We discuss several optimization techniques for IL and verify their effectiveness in five distinct robot autonomy tasks including path planning, rule induction, optimal control, visual odometry, and multi-robot routing. Through various experiments, we show that IL can significantly enhance robot autonomy capabilities and we anticipate that it will catalyze further research across diverse domains.
Submitted 6 July, 2024; v1 submitted 23 June, 2024;
originally announced June 2024.
-
Search for the $e^+e^- \to φχ_{c1}(3872)$ process at BESIII
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (639 additional authors not shown)
Abstract:
Based on 368.5 pb$^{-1}$ of $e^+e^-$ collision data collected at center-of-mass energies of 4.914 and 4.946 GeV by the BESIII detector, a first search for the $e^+e^- \to φχ_{c1}(3872)$ process is performed. No significant signal is observed, and the upper limits at the 90\% confidence level on the product of the Born cross section $σ(e^+e^- \to φχ_{c1}(3872))$ and the branching fraction $\mathcal{B}[χ_{c1}(3872)\toπ^+π^- J/ψ]$ at 4.914 and 4.946 GeV are set to be 0.85 and 0.96 pb, respectively. These measurements provide useful information on the production of the $χ_{c1}(3872)$ at $e^+e^-$ colliders and deepen our understanding of the nature of this particle.
Submitted 21 June, 2024;
originally announced June 2024.
-
Wide Field of View Large Aperture Meta-Doublet Eyepiece
Authors:
Anna Wirth-Singh,
Johannes E. Fröch,
Fan Yang,
Louis Martin,
Hualiang Zhang,
Quentin T. Tanguy,
Zhihao Zhou,
Luocheng Huang,
Demis D. John,
Biljana Stamenic,
Juejun Hu,
Tian Gu,
Arka Majumdar
Abstract:
Wide field of view and light weight optics are critical for advanced eyewear, with applications in augmented/virtual reality and night vision. Conventional refractive lenses are often stacked to correct aberrations at wide field of view, leading to limited performance and increased size and weight. In particular, simultaneously achieving wide field of view and large aperture for light collection is desirable but challenging to realize in a compact form-factor. Here, we demonstrate a wide field of view (greater than 60$^\circ$) meta-optic doublet eyepiece with an entrance aperture of 2.1 cm. At the design wavelength of 633 nm, the meta-optic doublet achieves comparable performance to a refractive lens-based eyepiece system. This meta-doublet eyepiece illustrates the potential for meta-optics to play an important role in the development of high-quality monochrome near-eye display and night vision systems.
Submitted 20 June, 2024;
originally announced June 2024.
-
A microwave photonic prototype for concurrent radar detection and spectrum sensing over an 8 to 40 GHz bandwidth
Authors:
Taixia Shi,
Dingding Liang,
Lu Wang,
Lin Li,
Shaogang Guo,
Jiawei Gao,
Xiaowei Li,
Chulun Lin,
Lei Shi,
Baogang Ding,
Shiyang Liu,
Fangyi Yang,
Chi Jiang,
Yang Chen
Abstract:
In this work, a microwave photonic prototype for concurrent radar detection and spectrum sensing is proposed, designed, built, and investigated. A direct digital synthesizer and an analog electronic circuit are integrated to generate an intermediate frequency (IF) linearly frequency-modulated (LFM) signal with a tunable center frequency from 2.5 to 9.5 GHz and an instantaneous bandwidth of 1 GHz. The IF LFM signal is converted to the optical domain via an intensity modulator and then filtered by a fiber Bragg grating (FBG) to generate only two 2nd-order optical LFM sidebands. In radar detection, the two optical LFM sidebands beat with each other to generate a frequency-and-bandwidth-quadrupled LFM signal, which is used for ranging, radial velocity measurement, and imaging. By changing the center frequency of the IF LFM signal, the radar function can be operated within 8 to 40 GHz. In spectrum sensing, one 2nd-order optical LFM sideband is selected by another FBG, which then works in conjunction with the stimulated Brillouin scattering gain spectrum to map the frequency of the signal under test to time with an instantaneous measurement bandwidth of 2 GHz. By using a frequency shift module to adjust the pump frequency, the frequency measurement range can be adjusted from 0 to 40 GHz. The prototype is comprehensively studied and tested; it achieves a range resolution of 3.75 cm, a range error of less than $\pm$ 2 cm, and a radial velocity error within $\pm$ 1 cm/s, delivers clear imaging of multiple small targets, and maintains a frequency measurement error of less than $\pm$ 7 MHz and a frequency resolution of better than 20 MHz.
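The numbers in this abstract can be cross-checked with a short calculation. The sketch below assumes the standard LFM range-resolution relation $ΔR = c/(2B)$ and treats frequency quadrupling as multiplying both the center frequency and the bandwidth by four; the abstract itself does not state how the 3.75 cm figure was obtained, and the function names are hypothetical.

```python
# Consistency check of the quoted radar figures, assuming delta_R = c / (2 * B)
# and an ideal x4 multiplication of both center frequency and bandwidth.

C_M_PER_S = 299_792_458.0  # speed of light in vacuum

def quadrupled_band(if_center_hz: float, if_bandwidth_hz: float, factor: int = 4):
    """Return (low edge, high edge, bandwidth) in Hz of the frequency-multiplied LFM signal."""
    center = factor * if_center_hz
    bandwidth = factor * if_bandwidth_hz
    return center - bandwidth / 2, center + bandwidth / 2, bandwidth

def range_resolution_m(bandwidth_hz: float) -> float:
    """Theoretical range resolution of an LFM waveform with the given bandwidth."""
    return C_M_PER_S / (2 * bandwidth_hz)

# IF tuning limits and IF bandwidth quoted in the abstract.
low_edge, _, radar_bw = quadrupled_band(2.5e9, 1e9)
_, high_edge, _ = quadrupled_band(9.5e9, 1e9)

print(f"radar band: {low_edge / 1e9:.1f} to {high_edge / 1e9:.1f} GHz")
print(f"radar bandwidth: {radar_bw / 1e9:.1f} GHz")
print(f"range resolution: {range_resolution_m(radar_bw) * 100:.2f} cm")
```

With these assumptions, the 2.5 to 9.5 GHz IF tuning range and 1 GHz IF bandwidth map onto the 8 to 40 GHz operating band, and a 4 GHz radar bandwidth gives a theoretical resolution that rounds to the quoted 3.75 cm.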
Submitted 20 June, 2024;
originally announced June 2024.
-
Thread: A Logic-Based Data Organization Paradigm for How-To Question Answering with Retrieval Augmented Generation
Authors:
Kaikai An,
Fangkai Yang,
Liqun Li,
Junting Lu,
Sitao Cheng,
Lu Wang,
Pu Zhao,
Lele Cao,
Qingwei Lin,
Saravan Rajmohan,
Dongmei Zhang,
Qi Zhang
Abstract:
Current question answering systems leveraging retrieval augmented generation perform well in answering factoid questions but face challenges with non-factoid questions, particularly how-to queries requiring detailed step-by-step instructions and explanations. In this paper, we introduce Thread, a novel data organization paradigm that transforms documents into logic units based on their inter-connectivity. Extensive experiments across open-domain and industrial scenarios demonstrate that Thread outperforms existing data organization paradigms in RAG-based QA systems, significantly improving the handling of how-to questions.
Submitted 19 June, 2024;
originally announced June 2024.
-
A hybrid graphene-siliconnitride nanomembrane as a versatile and ultra-widely tunable mechanical device
Authors:
Mengqi Fu,
Bojan Bošnjak,
Zhan Shi,
Jannik Dornseiff,
Robert H. Blick,
Elke Scheer,
Fan Yang
Abstract:
Integration of 2D materials in nanoelectromechanical systems (NEMS) marries the robustness of silicon-based materials with exceptional electrical controllability in 2D materials, drastically enhancing system performance which now is the key for many advanced applications in nanotechnology. Here, we experimentally demonstrate and theoretically analyze a powerful on-chip graphene integrated NEMS device consisting of a hybrid graphene/silicon-nitride membrane with metallic leads that enables an extremely large static and dynamic parameter regulation. When a static voltage is applied to the leads, the force induced by the thermal expansion difference between the leads and the membrane results in ultra-wide frequency tuning, deformation (post-buckling transition) and regulation of mechanical properties. Moreover, by injecting an alternating voltage to the leads, we can excite the resonator vibrating even far beyond its linear regime without a complex and space consuming actuation system. Our results prove that the device is a compact integrated system possessing mechanical robustness, high controllability, and fast response. It not only expands the limit of the application range of NEMS devices but also pushes multidimensional nanomechanical resonators into working in the nonlinear regime.
Submitted 23 June, 2024; v1 submitted 17 June, 2024;
originally announced June 2024.
-
Learning Traffic Crashes as Language: Datasets, Benchmarks, and What-if Causal Analyses
Authors:
Zhiwen Fan,
Pu Wang,
Yang Zhao,
Yibo Zhao,
Boris Ivanovic,
Zhangyang Wang,
Marco Pavone,
Hao Frank Yang
Abstract:
The increasing rate of road accidents worldwide not only results in significant loss of life but also imposes billions in financial burdens on societies. Current research in traffic crash frequency modeling and analysis has predominantly approached the problem as a classification task, focusing mainly on learning-based classification or ensemble learning methods. These approaches often overlook the intricate relationships among the complex infrastructure, environmental, human and contextual factors related to traffic crashes and risky situations. In contrast, we first propose a large-scale traffic crash language dataset, named CrashEvent, summarizing 19,340 real-world crash reports and incorporating infrastructure data, environmental and traffic textual and visual information in Washington State. Leveraging this rich dataset, we formulate crash event feature learning as a novel text reasoning problem and fine-tune various large language models (LLMs) to predict detailed accident outcomes, such as crash types, severity and number of injuries, based on contextual and environmental factors. The proposed model, CrashLLM, distinguishes itself from existing solutions by leveraging the inherent text reasoning capabilities of LLMs to parse and learn from complex, unstructured data, thereby enabling a more nuanced analysis of contributing factors. Our experimental results show that our LLM-based approach not only predicts the severity of accidents but also classifies different types of accidents and predicts injury outcomes, all with the averaged F1 score boosted from 34.9% to 53.8%. Furthermore, CrashLLM can provide valuable insights for numerous open-world what-if situational-awareness traffic safety analyses with learned reasoning features, which existing models cannot offer. We make our benchmark, datasets, and model publicly available for further exploration.
Submitted 15 June, 2024;
originally announced June 2024.
-
CoMM: A Coherent Interleaved Image-Text Dataset for Multimodal Understanding and Generation
Authors:
Wei Chen,
Lin Li,
Yongqi Yang,
Bin Wen,
Fan Yang,
Tingting Gao,
Yu Wu,
Long Chen
Abstract:
Interleaved image-text generation has emerged as a crucial multimodal task, aiming at creating sequences of interleaved visual and textual content given a query. Despite notable advancements in recent multimodal large language models (MLLMs), generating integrated image-text sequences that exhibit narrative coherence and entity and style consistency remains challenging due to poor training data quality. To address this gap, we introduce CoMM, a high-quality Coherent interleaved image-text MultiModal dataset designed to enhance the coherence, consistency, and alignment of generated multimodal content. Initially, CoMM harnesses raw data from diverse sources, focusing on instructional content and visual storytelling, establishing a foundation for coherent and consistent content. To further refine the data quality, we devise a multi-perspective filter strategy that leverages advanced pre-trained models to ensure the development of sentences, consistency of inserted images, and semantic alignment between them. Various quality evaluation metrics are designed to prove the high quality of the filtered dataset. Meanwhile, extensive few-shot experiments on various downstream tasks demonstrate CoMM's effectiveness in significantly enhancing the in-context learning capabilities of MLLMs. Moreover, we propose four new tasks to evaluate MLLMs' interleaved generation abilities, supported by a comprehensive evaluation framework. We believe CoMM opens a new avenue for advanced MLLMs with superior multimodal in-context learning and understanding ability.
Submitted 14 June, 2024;
originally announced June 2024.
-
Integrated Modeling, Verification, and Code Generation for Unmanned Aerial Systems
Authors:
Jianyu Zhang,
Long Zhang,
Yixuan Wu,
Linru Ma,
Feng Yang
Abstract:
Unmanned Aerial Systems (UAS) are currently widely used in safety-critical fields such as industrial production, military operations, and disaster relief. Due to the diversity and complexity of application scenarios, UAS have become increasingly intricate. The challenge of designing and implementing highly reliable UAS while effectively controlling development costs and enhancing efficiency is a pressing issue faced by both academia and industry. Addressing this challenge, this paper aims to investigate an integrated approach to modeling, verification, and code generation for UAS. The paper begins by utilizing Architecture Analysis and Design Language (AADL) to model the UAS, proposing a set of generic UAS models. Based on these models, formal specifications are written to describe the system's safety properties and functions. Finally, the paper introduces a method for generating flight controller code for UAS based on the verified models. Experiments conducted with the proposed method demonstrate its effectiveness in identifying potential vulnerabilities in the UAS during the early design phase and in generating viable flight controller code from the verified models. This approach can enhance the efficiency of designing and verifying high-reliability UAS.
Submitted 13 June, 2024;
originally announced June 2024.
-
Search for $X(1870)$ via the decay $J/ψ\to ωK^+ K^-η$
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (644 additional authors not shown)
Abstract:
Using a sample of $(10087\pm 44)\times10^{6}$ $J/ψ$ events collected by the BESIII detector at the BEPCII collider, we search for the decay $X(1870)\to K^+ K^-η$ via the $J/ψ\to ωK^+ K^- η$ process for the first time. No significant $X(1870)$ signal is observed. The upper limit on the branching fraction of the decay $ J/ψ\to ωX(1870) \toωK^+ K^- η$ is determined to be $9.55\times 10^{-7}$ at the $90\%$ confidence level. In addition, the branching fraction $B(J/ψ\toωK^+ K^- η)$ is measured to be $(3.33\pm0.02(\rm{stat.})\pm 0.12(\rm{syst.}))\times 10^{-4}$.
Submitted 13 June, 2024;
originally announced June 2024.
-
Constraints on Ultra Heavy Dark Matter Properties from Dwarf Spheroidal Galaxies with LHAASO Observations
Authors:
Zhen Cao,
F. Aharonian,
Q. An,
Axikegu,
Y. X. Bai,
Y. W. Bao,
D. Bastieri,
X. J. Bi,
Y. J. Bi,
J. T. Cai,
Q. Cao,
W. Y. Cao,
Zhe Cao,
J. Chang,
J. F. Chang,
A. M. Chen,
E. S. Chen,
Liang Chen,
Lin Chen,
Long Chen,
M. J. Chen,
M. L. Chen,
Q. H. Chen,
S. H. Chen,
S. Z. Chen
, et al. (255 additional authors not shown)
Abstract:
In this work, we search for signals generated by ultra-heavy dark matter in data from the Large High Altitude Air Shower Observatory (LHAASO). We look for possible gamma rays from dark matter annihilation or decay in 16 dwarf spheroidal galaxies in the field of view of LHAASO. Dwarf spheroidal galaxies are among the most promising targets for indirect detection of dark matter, as they have low fluxes of astrophysical $γ$-ray background and large amounts of dark matter. By analyzing more than 700 days of observational data from LHAASO, we detect no significant dark matter signal from 1 TeV to 1 EeV. Accordingly, we derive the most stringent constraints on the ultra-heavy dark matter annihilation cross-section up to EeV. The constraints on the lifetime of dark matter in decay mode are also derived.
Submitted 12 June, 2024;
originally announced June 2024.
-
A green solvent system for precursor phase-engineered sequential deposition of stable formamidinium lead triiodide for perovskite solar cells
Authors:
Benjamin M. Gallant,
Philippe Holzhey,
Joel A. Smith,
Saqlain Choudhary,
Karim A. Elmestekawy,
Pietro Caprioglio,
Igal Levine,
Alex Sheader,
Fengning Yang,
Daniel T. W. Toolan,
Rachel C. Kilbride,
Augustin K. A. Zaininger,
James M. Ball,
M. Greyson Christoforo,
Nakita Noel,
Laura M. Herz,
Dominik J. Kubicki,
Henry J. Snaith
Abstract:
Perovskite solar cells (PSCs) offer an efficient, inexpensive alternative to current photovoltaic technologies, with the potential for manufacture via high-throughput coating methods. However, challenges for commercial-scale solution-processing of metal-halide perovskites include the use of harmful solvents, the expense of maintaining controlled atmospheric conditions, and the inherent instabilities of PSCs under operation. Here, we address these challenges by introducing a high volatility, low toxicity, biorenewable solvent system to fabricate a range of 2D perovskites, which are highly effective precursor phases for subsequent transformation to alpha-formamidinium lead triiodide (FAPbI3), fully processed under ambient conditions. PSCs utilising our FAPbI3 reproducibly show remarkable stability under illumination and elevated temperature (ISOS-L-2) and "damp heat" (ISOS-D-3) stressing, surpassing other state-of-the-art perovskite compositions. We determine that this enhancement is a consequence of the 2D precursor phase crystallisation route, which simultaneously avoids retention of residual low-volatility solvents (such as DMF and DMSO) and reduces the rate of degradation of FA+ in the material. Our findings highlight both the critical role of the initial crystallisation process in determining the operational stability of perovskite materials, and that neat FA+-based perovskites can be competitively stable despite the inherent metastability of the alpha-phase.
Submitted 14 June, 2024; v1 submitted 12 June, 2024;
originally announced June 2024.
-
Observation of $η_{c}$(1S, 2S) and $χ_{cJ}$ decays to 2$(π^{+}π^{-})η$ via $ψ$(3686) radiative transitions
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (636 additional authors not shown)
Abstract:
Based on $2.7 \times 10^9~ψ(3686)$ decays collected with the BESIII detector, the radiative decay $ψ(3686)\to γ 2(π^{+}π^{-})η$ is investigated to measure properties of S- and P-wave charmonium states. The branching fraction of the decay $η_{c}(1S) \to 2(π^{+}π^{-})η$, which is found to have a strong dependence on the interference pattern between $η_c(1S)$ and non-$η_c(1S)$ processes, is measured in both destructive and constructive interference scenarios for the first time. The mass and width of the $η_{c}(1S)$ are measured to be $M=(2984.14 \pm 0.13 \pm 0.38)$ MeV/$c^{2}$ and $Γ=(28.82 \pm 0.11 \pm 0.82)$ MeV, respectively. Clear signals for the decays of the $χ_{cJ}(J=0,1,2)$ and the $η_{c}(2S)$ to $2(π^{+}π^{-})η$ are also observed for the first time, and the corresponding branching fractions are measured. The ratio of the branching fractions between the $η_{c}(2S)$ and $η_{c}(1S)$ decays is significantly lower than the theoretical prediction, which might suggest different dynamics in their decays.
Submitted 12 June, 2024;
originally announced June 2024.
-
Making 'syscall' a Privilege not a Right
Authors:
Fangfei Yang,
Anjo Vahldiek-Oberwagner,
Chia-Che Tsai,
Kelly Kaoudis,
Nathan Dautenhahn
Abstract:
Browsers, Library OSes, and system emulators rely on sandboxes and in-process isolation to emulate system resources and securely isolate untrusted components. All access to system resources, such as system calls (syscalls), needs to be securely mediated by the application. Otherwise, system calls may allow untrusted components to evade the emulator or sandbox monitor and hence escape and attack the entire application or system. Existing approaches, such as ptrace, require additional context switches between kernel and userspace, which introduce high performance overhead. Moreover, seccomp-bpf supports only limited policies, which restricts its functionality, or it still requires assistance from ptrace.
In this paper, we present nexpoline, a secure syscall interception mechanism combining Memory Protection Keys (MPK) and Seccomp or Syscall User Dispatch (SUD). Our approach transforms an application's syscall instruction into a privilege reserved for the trusted monitor within the address space, allowing flexible user-defined policies. To execute a syscall, the application must switch contexts via nexpoline. It offers better efficiency than secure interception techniques like ptrace, as nexpoline can intercept syscalls securely through binary rewriting. Consequently, nexpoline ensures the safety, flexibility and efficiency of syscall interception. Notably, it operates without kernel modifications, making it viable on current Linux systems without needing root privileges. Our benchmarks demonstrate improved performance over ptrace in interception overhead while achieving the same security guarantees. When compared to the similarly performing firejail, nexpoline supports more complex policies and makes it possible to emulate system resources.
Submitted 11 June, 2024;
originally announced June 2024.
-
Strong and weak $CP$ tests in sequential decays of polarized $Σ^0$ hyperons
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (644 additional authors not shown)
Abstract:
The $J/ψ, ψ(3686) \to Σ^0 \barΣ^{0}$ processes and subsequent decays are studied using the world's largest $J/ψ$ and $ψ(3686)$ data samples collected with the BESIII detector. The strong-$CP$ symmetry is tested in the decays of the $Σ^0$ hyperons for the first time by measuring the decay parameters, $α_{Σ^0} = -0.0017 \pm 0.0021 \pm 0.0018$ and $\barα_{Σ^0} = 0.0021 \pm 0.0020 \pm 0.0022$. The weak-$CP$ test is performed in the subsequent decays of their daughter particles $Λ$ and $\barΛ$. Also for the first time, the transverse polarizations of the $Σ^0$ hyperons in $J/ψ$ and $ψ(3686)$ decays are observed with opposite directions, and the ratios between the S-wave and D-wave contributions of the $J/ψ, ψ(3686) \to Σ^0 \barΣ^{0}$ decays are obtained. These results are crucial to understand the decay dynamics of the charmonium states and the production mechanism of the $Σ^0-\barΣ^0$ pairs.
Submitted 16 July, 2024; v1 submitted 10 June, 2024;
originally announced June 2024.
-
Measurement of the integrated luminosity of the data collected at 3.773 GeV by BESIII from 2021 to 2024
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (634 additional authors not shown)
Abstract:
We present a measurement of the integrated luminosity of $e^+e^-$ collision data collected with the BESIII detector at the BEPCII collider at a center-of-mass energy of $E_{\rm cm} = 3.773$~GeV. The integrated luminosities of the data sets taken from December 2021 to June 2022, from November 2022 to June 2023, and from October 2023 to February 2024 are determined to be $4.995 \pm 0.019$~fb$^{-1}$, $8.157 \pm 0.031$~fb$^{-1}$, and $4.191 \pm 0.016$~fb$^{-1}$, respectively, by analyzing large angle Bhabha scattering events. The uncertainties are dominated by systematic effects and the statistical uncertainties are negligible. Our results provide essential input for future analyses and precision measurements.
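As a rough illustration of how the three quoted luminosities add up, the sketch below sums the central values and brackets the combined uncertainty between two assumed extremes. The abstract does not quote a combined value or state how the uncertainties are correlated across data sets, so both combinations here are assumptions rather than results from the paper.

```python
# Sum of the three quoted integrated luminosities at 3.773 GeV.
# The two uncertainty combinations are illustrative assumptions only:
# the abstract does not give the correlation between the data sets.
import math

# (value, uncertainty) in fb^-1, as quoted in the abstract.
samples = [
    (4.995, 0.019),  # December 2021 to June 2022
    (8.157, 0.031),  # November 2022 to June 2023
    (4.191, 0.016),  # October 2023 to February 2024
]

total = sum(value for value, _ in samples)

# Assumption A: uncertainties fully correlated, so they add linearly.
unc_correlated = sum(unc for _, unc in samples)

# Assumption B: uncertainties fully uncorrelated, so they add in quadrature.
unc_uncorrelated = math.sqrt(sum(unc ** 2 for _, unc in samples))

print(f"total luminosity: {total:.3f} fb^-1")
print(f"uncertainty if fully correlated:   {unc_correlated:.3f} fb^-1")
print(f"uncertainty if fully uncorrelated: {unc_uncorrelated:.3f} fb^-1")
```

Because the abstract says the uncertainties are dominated by systematic effects, the fully correlated case is probably the closer of the two bounds, but that reading is itself an assumption.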
Submitted 9 June, 2024;
originally announced June 2024.
-
Towards Expressive Zero-Shot Speech Synthesis with Hierarchical Prosody Modeling
Authors:
Yuepeng Jiang,
Tao Li,
Fengyu Yang,
Lei Xie,
Meng Meng,
Yujun Wang
Abstract:
Recent research in zero-shot speech synthesis has made significant progress in speaker similarity. However, current efforts focus on timbre generalization rather than prosody modeling, which results in limited naturalness and expressiveness. To address this, we introduce a novel speech synthesis model trained on large-scale datasets, including both timbre and hierarchical prosody modeling. As timbre is a global attribute closely linked to expressiveness, we adopt a global vector to model speaker timbre while guiding prosody modeling. Besides, given that prosody contains both global consistency and local variations, we introduce a diffusion model as the pitch predictor and employ a prosody adaptor to model prosody hierarchically, further enhancing the prosody quality of the synthesized speech. Experimental results show that our model not only maintains comparable timbre quality to the baseline but also exhibits better naturalness and expressiveness.
Submitted 11 June, 2024; v1 submitted 9 June, 2024;
originally announced June 2024.