-
Measurement of the branching fraction of $D^+_s\to \ell^+ν_\ell$ via $e^+e^-\to D^{*+}_{s} D^{*-}_{s}$
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (634 additional authors not shown)
Abstract:
Based on $10.64~\mathrm{fb}^{-1}$ of $e^+e^-$ collision data taken at center-of-mass energies between 4.237 and 4.699 GeV with the BESIII detector, we study the leptonic $D^+_s$ decays using the $e^+e^-\to D^{*+}_{s} D^{*-}_{s}$ process. The branching fractions of $D_s^+\to\ell^+ν_{\ell}\,(\ell=μ,τ)$ are measured to be $\mathcal{B}(D_s^+\toμ^+ν_μ)=(\bfmuv)\%$ and…
▽ More
Based on $10.64~\mathrm{fb}^{-1}$ of $e^+e^-$ collision data taken at center-of-mass energies between 4.237 and 4.699 GeV with the BESIII detector, we study the leptonic $D^+_s$ decays using the $e^+e^-\to D^{*+}_{s} D^{*-}_{s}$ process. The branching fractions of $D_s^+\to\ell^+ν_{\ell}\,(\ell=μ,τ)$ are measured to be $\mathcal{B}(D_s^+\toμ^+ν_μ)=(\bfmuv)\%$ and $\mathcal{B}(D_s^+\toτ^+ν_τ)=(\bftauv)\%$, respectively. The product of the decay constant and Cabibbo-Kobayashi-Maskawa matrix element $|V_{cs}|$ is determined to be $f_{D_s^+}|V_{cs}|=(\mufdsxvcsresult)_{μν}~\mathrm{MeV}$ and $f_{D_s^+}|V_{cs}|=(\taufdsxvcsresult))_{τν}~\mathrm{MeV}$, respectively. Taking the value of $|V_{cs}|$ from a global fit in the Standard Model, we obtain ${f_{D^+_s}}=(\mufdsresult)_{μν}$\,MeV and ${f_{D^+_s}}=(\taufdsresult)_{τν}$\,MeV, respectively. Conversely, taking the value for $f_{D_s^+}$ from the latest lattice quantum chromodynamics calculation, we obtain $|V_{cs}| =(\muvcsresult)_{μν}$ and $|V_{cs}| = (\tauvcsresult)_{τν}$, respectively.
△ Less
Submitted 16 July, 2024;
originally announced July 2024.
-
Search for the rare $Λ_c^+ \to p μ^+ μ^-$ decay
Authors:
LHCb collaboration,
R. Aaij,
A. S. W. Abdelmotteleb,
C. Abellan Beteta,
F. Abudinén,
T. Ackernley,
A. A. Adefisoye,
B. Adeva,
M. Adinolfi,
P. Adlarson,
C. Agapopoulou,
C. A. Aidala,
Z. Ajaltouni,
S. Akar,
K. Akiba,
P. Albicocco,
J. Albrecht,
F. Alessio,
M. Alexander,
Z. Aliouche,
P. Alvarez Cartelle,
R. Amalric,
S. Amato,
J. L. Amey,
Y. Amhis
, et al. (1062 additional authors not shown)
Abstract:
A search for the nonresonant $Λ_c^+ \to p μ^+ μ^-$ decay is performed using proton-proton collision data recorded at a centre-of-mass energy of 13 TeV by the LHCb experiment, corresponding to an integrated luminosity of 5.4 fb$^{-1}$. No evidence for the decay is found in the dimuon invariant-mass regions where the expected contributions of resonances is subdominant. The upper limit on the branchi…
▽ More
A search for the nonresonant $Λ_c^+ \to p μ^+ μ^-$ decay is performed using proton-proton collision data recorded at a centre-of-mass energy of 13 TeV by the LHCb experiment, corresponding to an integrated luminosity of 5.4 fb$^{-1}$. No evidence for the decay is found in the dimuon invariant-mass regions where the expected contributions of resonances is subdominant. The upper limit on the branching fraction of the $Λ_c^+ \to p μ^+ μ^-$ decay is determined to be $2.9~(3.2) \times 10^{-8}$ at 90% (95%) confidence level. The branching fractions in the dimuon invariant-mass regions dominated by the $η$, $ρ$ and $ω$ resonances are also determined.
△ Less
Submitted 16 July, 2024;
originally announced July 2024.
-
COMET: "Cone of experience" enhanced large multimodal model for mathematical problem generation
Authors:
Sannyuya Liu,
Jintian Feng,
Zongkai Yang,
Yawei Luo,
Qian Wan,
Xiaoxuan Shen,
Jianwen Sun
Abstract:
The automatic generation of high-quality mathematical problems is practically valuable in many educational scenarios. Large multimodal model provides a novel technical approach for the mathematical problem generation because of its wide success in cross-modal data scenarios. However, the traditional method of separating problem solving from problem generation and the mainstream fine-tuning framewo…
▽ More
The automatic generation of high-quality mathematical problems is practically valuable in many educational scenarios. Large multimodal model provides a novel technical approach for the mathematical problem generation because of its wide success in cross-modal data scenarios. However, the traditional method of separating problem solving from problem generation and the mainstream fine-tuning framework of monotonous data structure with homogeneous training objectives limit the application of large multimodal model in mathematical problem generation. Addressing these challenges, this paper proposes COMET, a "Cone of Experience" enhanced large multimodal model for mathematical problem generation. Firstly, from the perspective of mutual ability promotion and application logic, we unify stem generation and problem solving into mathematical problem generation. Secondly, a three-stage fine-turning framework guided by the "Cone of Experience" is proposed. The framework divides the fine-tuning data into symbolic experience, iconic experience, and direct experience to draw parallels with experiences in the career growth of teachers. Several fine-grained data construction and injection methods are designed in this framework. Finally, we construct a Chinese multimodal mathematical problem dataset to fill the vacancy of Chinese multimodal data in this field. Combined with objective and subjective indicators, experiments on multiple datasets fully verify the effectiveness of the proposed framework and model.
△ Less
Submitted 15 July, 2024;
originally announced July 2024.
-
Panacea: A foundation model for clinical trial search, summarization, design, and recruitment
Authors:
Jiacheng Lin,
Hanwen Xu,
Zifeng Wang,
Sheng Wang,
Jimeng Sun
Abstract:
Clinical trials are fundamental in developing new drugs, medical devices, and treatments. However, they are often time-consuming and have low success rates. Although there have been initial attempts to create large language models (LLMs) for clinical trial design and patient-trial matching, these models remain task-specific and not adaptable to diverse clinical trial tasks. To address this challen…
▽ More
Clinical trials are fundamental in developing new drugs, medical devices, and treatments. However, they are often time-consuming and have low success rates. Although there have been initial attempts to create large language models (LLMs) for clinical trial design and patient-trial matching, these models remain task-specific and not adaptable to diverse clinical trial tasks. To address this challenge, we propose a clinical trial foundation model named Panacea, designed to handle multiple tasks, including trial search, trial summarization, trial design, and patient-trial matching. We also assemble a large-scale dataset, named TrialAlign, of 793,279 trial documents and 1,113,207 trial-related scientific papers, to infuse clinical knowledge into the model by pre-training. We further curate TrialInstruct, which has 200,866 of instruction data for fine-tuning. These resources enable Panacea to be widely applicable for a range of clinical trial tasks based on user requirements.
We evaluated Panacea on a new benchmark, named TrialPanorama, which covers eight clinical trial tasks. Our method performed the best on seven of the eight tasks compared to six cutting-edge generic or medicine-specific LLMs. Specifically, Panacea showed great potential to collaborate with human experts in crafting the design of eligibility criteria, study arms, and outcome measures, in multi-round conversations. In addition, Panacea achieved 14.42% improvement in patient-trial matching, 41.78% to 52.02% improvement in trial search, and consistently ranked at the top for five aspects of trial summarization. Our approach demonstrates the effectiveness of Panacea in clinical trials and establishes a comprehensive resource, including training data, model, and benchmark, for developing clinical trial foundation models, paving the path for AI-based clinical trial development.
△ Less
Submitted 25 June, 2024;
originally announced July 2024.
-
Open (Clinical) LLMs are Sensitive to Instruction Phrasings
Authors:
Alberto Mario Ceballos Arroyo,
Monica Munnangi,
Jiuding Sun,
Karen Y. C. Zhang,
Denis Jered McInerney,
Byron C. Wallace,
Silvio Amir
Abstract:
Instruction-tuned Large Language Models (LLMs) can perform a wide range of tasks given natural language instructions to do so, but they are sensitive to how such instructions are phrased. This issue is especially concerning in healthcare, as clinicians are unlikely to be experienced prompt engineers and the potential consequences of inaccurate outputs are heightened in this domain.
This raises a…
▽ More
Instruction-tuned Large Language Models (LLMs) can perform a wide range of tasks given natural language instructions to do so, but they are sensitive to how such instructions are phrased. This issue is especially concerning in healthcare, as clinicians are unlikely to be experienced prompt engineers and the potential consequences of inaccurate outputs are heightened in this domain.
This raises a practical question: How robust are instruction-tuned LLMs to natural variations in the instructions provided for clinical NLP tasks? We collect prompts from medical doctors across a range of tasks and quantify the sensitivity of seven LLMs -- some general, others specialized -- to natural (i.e., non-adversarial) instruction phrasings. We find that performance varies substantially across all models, and that -- perhaps surprisingly -- domain-specific models explicitly trained on clinical data are especially brittle, compared to their general domain counterparts. Further, arbitrary phrasing differences can affect fairness, e.g., valid but distinct instructions for mortality prediction yield a range both in overall performance, and in terms of differences between demographic groups.
△ Less
Submitted 12 July, 2024;
originally announced July 2024.
-
SU(3) symmetry analysis in charmed baryon two body decays with penguin diagram contribution
Authors:
Zhi-Peng Xing,
Yu-Ji Shi,
Jin Sun,
Ye Xing
Abstract:
An increasing number of experimental measurements from the BESIII, Belle, and Belle-II collaborations encourage investigations into charmed baryon two-body decay processes. By including contributions from the penguin diagrams that are ignored in previous studies, we perform a global analysis with SU(3) flavor symmetry. Assuming all form factors are real, we achieve a remarkable minimal…
▽ More
An increasing number of experimental measurements from the BESIII, Belle, and Belle-II collaborations encourage investigations into charmed baryon two-body decay processes. By including contributions from the penguin diagrams that are ignored in previous studies, we perform a global analysis with SU(3) flavor symmetry. Assuming all form factors are real, we achieve a remarkable minimal $χ^{2}/d.o.f = 0.788$ and find that the contribution of the amplitude proportional to $V_{cb}^*V_{ub}$ is of the order $\sim 0.01$, comparable with the contribution of the tree-level diagram. Additionally, by using the KPW theorem to reduce the number of amplitudes from 13 to 7 in the leading contribution, it becomes possible to consider the complex form factor case for the leading IRA amplitude in the global analysis. However, the analysis of complex form factors significantly conflicts with the experimental data $Br(Ξ_c^0\toΞ^-π^+)$, and by excluding this data, $χ^2/d.o.f$ is reduced from 5.95 to 1.19. Although the analysis of complex form factors shows a significant central value of the penguin diagram contribution, the large errors from the corresponding form factors make it a challenge to precisely determine its true contribution. Consequently, the direct CP violation in decay processes is predicted to be approximately zero. With more data in future experiments, the penguin diagram contribution with the amplitude proportional to $V_{cb}^*V_{ub}$ will be precisely determined, allowing for a more accurate prediction of CP violation. Our analysis necessitates further theoretical investigations and experimental measurements in the future.
△ Less
Submitted 12 July, 2024;
originally announced July 2024.
-
Molecule Language Model with Augmented Pairs and Expertise Transfer
Authors:
Namkyeong Lee,
Siddhartha Laghuvarapu,
Chanyoung Park,
Jimeng Sun
Abstract:
Understanding the molecules and their textual descriptions via molecule language models (MoLM) recently got a surge of interest among researchers. However, unique challenges exist in the field of MoLM due to 1) a limited amount of molecule-text paired data and 2) missing expertise that occurred due to the specialized areas of focus among the experts. To this end, we propose AMOLE, which 1) augment…
▽ More
Understanding the molecules and their textual descriptions via molecule language models (MoLM) recently got a surge of interest among researchers. However, unique challenges exist in the field of MoLM due to 1) a limited amount of molecule-text paired data and 2) missing expertise that occurred due to the specialized areas of focus among the experts. To this end, we propose AMOLE, which 1) augments molecule-text pairs with structural similarity preserving loss, and 2) transfers the expertise between the molecules. Extensive experiments on various downstream tasks demonstrate the superiority of AMOLE in comprehending molecules and their descriptions, highlighting its potential for application in real-world drug discovery.
△ Less
Submitted 12 July, 2024;
originally announced July 2024.
-
Centrality dependence of Lévy-stable two-pion Bose-Einstein correlations in $\sqrt{s_{_{NN}}}=200$ GeV Au$+$Au collisions
Authors:
PHENIX Collaboration,
N. J. Abdulameer,
U. Acharya,
A. Adare,
C. Aidala,
N. N. Ajitanand,
Y. Akiba,
R. Akimoto,
H. Al-Ta'ani,
J. Alexander,
A. Angerami,
K. Aoki,
N. Apadula,
Y. Aramaki,
H. Asano,
E. C. Aschenauer,
E. T. Atomssa,
T. C. Awes,
B. Azmoun,
V. Babintsev,
M. Bai,
B. Bannier,
K. N. Barish,
B. Bassalleck,
S. Bathe
, et al. (377 additional authors not shown)
Abstract:
The PHENIX experiment measured the centrality dependence of two-pion Bose-Einstein correlation functions in $\sqrt{s_{_{NN}}}=200$~GeV Au$+$Au collisions at the Relativistic Heavy Ion Collider at Brookhaven National Laboratory. The data are well represented by Lévy-stable source distributions. The extracted source parameters are the correlation-strength parameter $λ$, the Lévy index of stability…
▽ More
The PHENIX experiment measured the centrality dependence of two-pion Bose-Einstein correlation functions in $\sqrt{s_{_{NN}}}=200$~GeV Au$+$Au collisions at the Relativistic Heavy Ion Collider at Brookhaven National Laboratory. The data are well represented by Lévy-stable source distributions. The extracted source parameters are the correlation-strength parameter $λ$, the Lévy index of stability $α$, and the Lévy-scale parameter $R$ as a function of transverse mass $m_T$ and centrality. The $λ(m_T)$ parameter is constant at larger values of $m_T$, but decreases as $m_T$ decreases. The Lévy scale parameter $R(m_T)$ decreases with $m_T$ and exhibits proportionality to the length scale of the nuclear overlap region. The Lévy exponent $α(m_T)$ is independent of $m_T$ within uncertainties in each investigated centrality bin, but shows a clear centrality dependence. At all centralities, the Lévy exponent $α$ is significantly different from that of Gaussian ($α=2$) or Cauchy ($α=1$) source distributions. Comparisons to the predictions of Monte-Carlo simulations of resonance-decay chains show that in all but the most peripheral centrality class (50%-60%), the obtained results are inconsistent with the measurements, unless a significant reduction of the in-medium mass of the $η'$ meson is included. In each centrality class, the best value of the in-medium $η'$ mass is compared to the mass of the $η$ meson, as well as to several theoretical predictions that consider restoration of $U_A(1)$ symmetry in hot hadronic matter.
△ Less
Submitted 11 July, 2024;
originally announced July 2024.
-
Dynamical Measure Transport and Neural PDE Solvers for Sampling
Authors:
Jingtong Sun,
Julius Berner,
Lorenz Richter,
Marius Zeinhofer,
Johannes Müller,
Kamyar Azizzadenesheli,
Anima Anandkumar
Abstract:
The task of sampling from a probability density can be approached as transporting a tractable density function to the target, known as dynamical measure transport. In this work, we tackle it through a principled unified framework using deterministic or stochastic evolutions described by partial differential equations (PDEs). This framework incorporates prior trajectory-based sampling methods, such…
▽ More
The task of sampling from a probability density can be approached as transporting a tractable density function to the target, known as dynamical measure transport. In this work, we tackle it through a principled unified framework using deterministic or stochastic evolutions described by partial differential equations (PDEs). This framework incorporates prior trajectory-based sampling methods, such as diffusion models or Schrödinger bridges, without relying on the concept of time-reversals. Moreover, it allows us to propose novel numerical methods for solving the transport task and thus sampling from complicated targets without the need for the normalization constant or data samples. We employ physics-informed neural networks (PINNs) to approximate the respective PDE solutions, implying both conceptional and computational advantages. In particular, PINNs allow for simulation- and discretization-free optimization and can be trained very efficiently, leading to significantly better mode coverage in the sampling task compared to alternative methods. Moreover, they can readily be fine-tuned with Gauss-Newton methods to achieve high accuracy in sampling.
△ Less
Submitted 10 July, 2024;
originally announced July 2024.
-
Study of the decay and production properties of $D_{s1}(2536)$ and $D_{s2}^*(2573)$
Authors:
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere,
A. Brueggemann
, et al. (645 additional authors not shown)
Abstract:
The $e^+e^-\rightarrow D_s^+D_{s1}(2536)^-$ and $e^+e^-\rightarrow D_s^+D^*_{s2}(2573)^-$ processes are studied using data samples collected with the BESIII detector at center-of-mass energies from 4.530 to 4.946~GeV. The absolute branching fractions of $D_{s1}(2536)^- \rightarrow \bar{D}^{*0}K^-$ and $D_{s2}^*(2573)^- \rightarrow \bar{D}^0K^-$ are measured for the first time to be…
▽ More
The $e^+e^-\rightarrow D_s^+D_{s1}(2536)^-$ and $e^+e^-\rightarrow D_s^+D^*_{s2}(2573)^-$ processes are studied using data samples collected with the BESIII detector at center-of-mass energies from 4.530 to 4.946~GeV. The absolute branching fractions of $D_{s1}(2536)^- \rightarrow \bar{D}^{*0}K^-$ and $D_{s2}^*(2573)^- \rightarrow \bar{D}^0K^-$ are measured for the first time to be $(35.9\pm 4.8\pm 3.5)\%$ and $(37.4\pm 3.1\pm 4.6)\%$, respectively. The measurements are in tension with predictions based on the assumption that the $D_{s1}(2536)$ and $D_{s2}^*(2573)$ are dominated by a bare $c\bar{s}$ component. The $e^+e^-\rightarrow D_s^+D_{s1}(2536)^-$ and $e^+e^-\rightarrow D_s^+D^*_{s2}(2573)^-$ cross sections are measured, and a resonant structure at around 4.6~GeV with a width of 50~MeV is observed for the first time with a statistical significance of $15σ$ in the $e^+e^-\rightarrow D_s^+D^*_{s2}(2573)^-$ process. It could be the $Y(4626)$ found by the Belle collaboration in the $D_s^+D_{s1}(2536)^{-}$ final state, since they have similar masses and widths. There is also evidence for a structure at around 4.75~GeV in both processes.
△ Less
Submitted 10 July, 2024;
originally announced July 2024.
-
Bergman Kernels of the Cheng-Yau metrics on quasi-projective manifolds
Authors:
Jingzhoun Sun
Abstract:
We show the asymptotics of the Bergman kernel function near the smooth divisor at infinity of the Cheng-Yau metric on quasi-projective manifolds. In particular, we show that there is a quantum phenomenon for the points very close to the divisor at infinity.
We show the asymptotics of the Bergman kernel function near the smooth divisor at infinity of the Cheng-Yau metric on quasi-projective manifolds. In particular, we show that there is a quantum phenomenon for the points very close to the divisor at infinity.
△ Less
Submitted 10 July, 2024;
originally announced July 2024.
-
Flow to Rare Events: An Application of Normalizing Flow in Temporal Importance Sampling for Automated Vehicle Validation
Authors:
Yichun Ye,
He Zhang,
Ye Tian,
Jian Sun
Abstract:
Automated Vehicle (AV) validation based on simulated testing requires unbiased evaluation and high efficiency. One effective solution is to increase the exposure to risky rare events while reweighting the probability measure. However, characterizing the distribution of risky events is particularly challenging due to the paucity of samples and the temporality of continuous scenario variables. To so…
▽ More
Automated Vehicle (AV) validation based on simulated testing requires unbiased evaluation and high efficiency. One effective solution is to increase the exposure to risky rare events while reweighting the probability measure. However, characterizing the distribution of risky events is particularly challenging due to the paucity of samples and the temporality of continuous scenario variables. To solve it, we devise a method to represent, generate, and reweight the distribution of risky rare events. We decompose the temporal evolution of continuous variables into distribution components based on conditional probability. By introducing the Risk Indicator Function, the distribution of risky rare events is theoretically precipitated out of naturalistic driving distribution. This targeted distribution is practically generated via Normalizing Flow, which achieves exact and tractable probability evaluation of intricate distribution. The rare event distribution is then demonstrated as the advantageous Importance Sampling distribution. We also promote the technique of temporal Importance Sampling. The combined method, named as TrimFlow, is executed to estimate the collision rate of Car-following scenarios as a tentative practice. The results showed that sampling background vehicle maneuvers from rare event distribution could evolve testing scenarios to hazardous states. TrimFlow reduced 86.1% of tests compared to generating testing scenarios according to their exposure in the naturalistic driving environment. In addition, the TrimFlow method is not limited to one specific type of functional scenario.
△ Less
Submitted 9 July, 2024;
originally announced July 2024.
-
Certified Continual Learning for Neural Network Regression
Authors:
Long H. Pham,
Jun Sun
Abstract:
On the one hand, there has been considerable progress on neural network verification in recent years, which makes certifying neural networks a possibility. On the other hand, neural networks in practice are often re-trained over time to cope with new data distribution or for solving different tasks (a.k.a. continual learning). Once re-trained, the verified correctness of the neural network is like…
▽ More
On the one hand, there has been considerable progress on neural network verification in recent years, which makes certifying neural networks a possibility. On the other hand, neural networks in practice are often re-trained over time to cope with new data distribution or for solving different tasks (a.k.a. continual learning). Once re-trained, the verified correctness of the neural network is likely broken, particularly in the presence of the phenomenon known as catastrophic forgetting. In this work, we propose an approach called certified continual learning which improves existing continual learning methods by preserving, as long as possible, the established correctness properties of a verified network. Our approach is evaluated with multiple neural networks and on two different continual learning methods. The results show that our approach is efficient and the trained models preserve their certified correctness and often maintain high utility.
△ Less
Submitted 9 July, 2024;
originally announced July 2024.
-
Superconductivity up to 14.2 K in MnB$_4$ under pressure
Authors:
Zhe-Ning Xiang,
Ying-Jie Zhang,
Qing Lu,
Qing Li,
Yiwen Li,
Tianheng Huang,
Yijie Zhu,
Yongze Ye,
Jian Sun,
Hai-Hu Wen
Abstract:
The discovery of superconductivity in 3$d$-transition metal compounds with strong magnetism is interesting but rare. Especially for Mn-based compounds, there exist only very limited materials that show superconductivity. Here, we report the discovery of superconductivity up to 14.2 K in a Mn-based material MnB$_4$. By applying high pressures, we found the continuous suppression of a weak insulatin…
▽ More
The discovery of superconductivity in 3$d$-transition metal compounds with strong magnetism is interesting but rare. Especially for Mn-based compounds, there exist only very limited materials that show superconductivity. Here, we report the discovery of superconductivity up to 14.2 K in a Mn-based material MnB$_4$. By applying high pressures, we found the continuous suppression of a weak insulating behavior and the occurrence of superconductivity after about 30 GPa. With further increasing pressure, $T_\text{c}$ is gradually enhanced and reaches the maximum value of about 14.2 K at 150 GPa with a Fermi-Liquid behavior in the normal states. The synchrotron X-ray diffraction data reveal the unchanged monoclinic (S.G: $P2_1/c$) symmetry but an unusual crossover of the lattice parameters $b$ and $c$. Theoretical calculations based on the electron-phonon coupling picture reveal a very low $T_\text{c}$ (less than 1 K), manifesting an exotic pairing mechanism beyond the Bardeen-Cooper-Schrieffer (BCS) theory. Our findings show a promising way to explore high $T_\text{c}$ superconductivity by combining the 3d-transition metal magnetic elements and light elements.
△ Less
Submitted 8 July, 2024;
originally announced July 2024.
-
Academic Article Recommendation Using Multiple Perspectives
Authors:
Kenneth Church,
Omar Alonso,
Peter Vickers,
Jiameng Sun,
Abteen Ebrahimi,
Raman Chandrasekar
Abstract:
We argue that Content-based filtering (CBF) and Graph-based methods (GB) complement one another in Academic Search recommendations. The scientific literature can be viewed as a conversation between authors and the audience. CBF uses abstracts to infer authors' positions, and GB uses citations to infer responses from the audience. In this paper, we describe nine differences between CBF and GB, as w…
▽ More
We argue that Content-based filtering (CBF) and Graph-based methods (GB) complement one another in Academic Search recommendations. The scientific literature can be viewed as a conversation between authors and the audience. CBF uses abstracts to infer authors' positions, and GB uses citations to infer responses from the audience. In this paper, we describe nine differences between CBF and GB, as well as synergistic opportunities for hybrid combinations. Two embeddings will be used to illustrate these opportunities: (1) Specter, a CBF method based on BERT-like deepnet encodings of abstracts, and (2) ProNE, a GB method based on spectral clustering of more than 200M papers and 2B citations from Semantic Scholar.
△ Less
Submitted 8 July, 2024;
originally announced July 2024.
-
Bulk high-temperature superconductivity in the high-pressure tetragonal phase of bilayer La2PrNi2O7
Authors:
Ningning Wang,
Gang Wang,
Xiaoling Shen,
Jun Hou,
Jun Luo,
Xiaoping Ma,
Huaixin Yang,
Lifen Shi,
Jie Dou,
Jie Feng,
Jie Yang,
Yunqing Shi,
Zhian Ren,
Hanming Ma,
Pengtao Yang,
Ziyi Liu,
Yue Liu,
Hua Zhang,
Xiaoli Dong,
Yuxin Wang,
Kun Jiang,
Jiangping Hu,
Stuart Calder,
Jiaqiang Yan,
Jianping Sun
, et al. (4 additional authors not shown)
Abstract:
The Ruddlesden-Popper (R-P) bilayer nickelate, La3Ni2O7, was recently found to show signatures of high-temperature superconductivity (HTSC) at pressures above 14 GPa. Subsequent investigations achieved zero resistance in single- and poly-crystalline samples under hydrostatic pressure conditions. Yet, obvious diamagnetic signals, the other hallmark of superconductors, are still lacking owing to the…
▽ More
The Ruddlesden-Popper (R-P) bilayer nickelate, La3Ni2O7, was recently found to show signatures of high-temperature superconductivity (HTSC) at pressures above 14 GPa. Subsequent investigations achieved zero resistance in single- and poly-crystalline samples under hydrostatic pressure conditions. Yet, obvious diamagnetic signals, the other hallmark of superconductors, are still lacking owing to the filamentary nature with low superconducting volume fraction. The presence of a novel "1313" polymorph and competing R-P phases obscured proper identification of the phase for HTSC. Thus, achieving bulk HTSC and identifying the phase at play are the most prominent tasks at present. Here, we address these issues in the praseodymium (Pr)-doped La2PrNi2O7 polycrystalline samples. We find that the substitutions of Pr for La effectively inhibits the intergrowth of different R-P phases, resulting in nearly pure bilayer structure. For La2PrNi2O7, pressure-induced orthorhombic-to-tetragonal structural transition takes place at Pc ~ 11 GPa, above which HTSC emerges gradually upon further compression. The superconducting transition temperatures at 18-20 GPa reach Tconset = 82.5 K and Tczero = 60 K, which are the highest values among known nickelate superconductors. More importantly, bulk HTSC was testified by detecting clear diamagnetic signals below ~75 K corresponding to an estimated superconducting volume fraction ~ 57(5)% at 20 GPa. Our results not only resolve the existing controversies but also illuminate directions for exploring bulk HTSC in the bilayer nickelates.
△ Less
Submitted 8 July, 2024;
originally announced July 2024.
-
DMSD-CDFSAR: Distillation from Mixed-Source Domain for Cross-Domain Few-shot Action Recognition
Authors:
Fei Guo,
YiKang Wang,
Han Qi,
Li Zhu,
Jing Sun
Abstract:
Few-shot action recognition is an emerging field in computer vision, primarily focused on meta-learning within the same domain. However, challenges arise in real-world scenario deployment, as gathering extensive labeled data within a specific domain is laborious and time-intensive. Thus, attention shifts towards cross-domain few-shot action recognition, requiring the model to generalize across dom…
▽ More
Few-shot action recognition is an emerging field in computer vision, primarily focused on meta-learning within the same domain. However, challenges arise in real-world scenario deployment, as gathering extensive labeled data within a specific domain is laborious and time-intensive. Thus, attention shifts towards cross-domain few-shot action recognition, requiring the model to generalize across domains with significant deviations. Therefore, we propose a novel approach, ``Distillation from Mixed-Source Domain", tailored to address this conundrum. Our method strategically integrates insights from both labeled data of the source domain and unlabeled data of the target domain during the training. The ResNet18 is used as the backbone to extract spatial features from the source and target domains. We design two branches for meta-training: the original-source and the mixed-source branches. In the first branch, a Domain Temporal Encoder is employed to capture temporal features for both the source and target domains. Additionally, a Domain Temporal Decoder is employed to reconstruct all extracted features. In the other branch, a Domain Mixed Encoder is used to handle labeled source domain data and unlabeled target domain data, generating mixed-source domain features. We incorporate a pre-training stage before meta-training, featuring a network architecture similar to that of the first branch. Lastly, we introduce a dual distillation mechanism to refine the classification probabilities of source domain features, aligning them with those of mixed-source domain features. This iterative process enriches the insights of the original-source branch with knowledge from the mixed-source branch, thereby enhancing the model's generalization capabilities. Our code is available at URL: \url{https://xxxx/xxxx/xxxx.git}
△ Less
Submitted 8 July, 2024;
originally announced July 2024.
-
Volume effect on the extraction of sound velocity in high-energy nucleus-nucleus collisions
Authors:
Jing-An Sun,
Li Yan
Abstract:
The determination of the speed of sound in quark-gluon plasma is a crucial aspect of understanding the properties of strongly interacting matter created in relativistic heavy-ion collisions. In this study, we investigate the impact of initial-state fluctuations on the extraction of the speed of sound in a quark-gluon plasma in the ultra-central collisions. By employing the TRENTo model for simulat…
▽ More
The determination of the speed of sound in quark-gluon plasma is a crucial aspect of understanding the properties of strongly interacting matter created in relativistic heavy-ion collisions. In this study, we investigate the impact of initial-state fluctuations on the extraction of the speed of sound in a quark-gluon plasma in the ultra-central collisions. By employing the TRENTo model for simulating initial conditions, we demonstrate that these fluctuations lead to sizable volume effect, which in turn corrects the measured values of the sound velocity. With respect to realistic conditions of high-energy heavy-ion experiments, we provide quantitative estimate of these corrections.
△ Less
Submitted 7 July, 2024;
originally announced July 2024.
-
Ferromagnetic inter-layer coupling in FeSe$_{1-x}$S$_{x}$ superconductors revealed by inelastic neutron scattering
Authors:
Mingwei Ma,
Philippe Bourges,
Yvan Sidis,
Jinzhao Sun,
Guoqing Wang,
Kazuki Iida,
Kazuya Kamazawa,
Jitae T. Park,
Frederic Bourdarot,
Zhian Ren,
Yuan Li
Abstract:
FeSe$_{1-x}$S$_{x}$ superconductors are commonly considered layered van der Waals materials with negligible inter-layer coupling. Here, using inelastic neutron scattering to study spin excitations in single-crystal samples, we reveal that the magnetic coupling between adjacent Fe layers is not only significant, as it affects excitations up to \textcolor{black}{15} meV, but also ferromagnetic in na…
▽ More
FeSe$_{1-x}$S$_{x}$ superconductors are commonly considered layered van der Waals materials with negligible inter-layer coupling. Here, using inelastic neutron scattering to study spin excitations in single-crystal samples, we reveal that the magnetic coupling between adjacent Fe layers is not only significant, as it affects excitations up to \textcolor{black}{15} meV, but also ferromagnetic in nature, making the system different from most unconventional superconductors including iron pnictides. Our observation provides a new standpoint to understand the absence of magnetic order in FeSe$_{1-x}$S$_{x}$. Since intercalating between the Fe layers is known to enhance superconductivity and suppress the inter-layer coupling, superconductivity appears to be a more robust phenomenon in the two-dimensional limit than antiferromagnetic order.
△ Less
Submitted 7 July, 2024;
originally announced July 2024.
-
Multi-scale Conditional Generative Modeling for Microscopic Image Restoration
Authors:
Luzhe Huang,
Xiongye Xiao,
Shixuan Li,
Jiawen Sun,
Yi Huang,
Aydogan Ozcan,
Paul Bogdan
Abstract:
The advance of diffusion-based generative models in recent years has revolutionized state-of-the-art (SOTA) techniques in a wide variety of image analysis and synthesis tasks, whereas their adaptation on image restoration, particularly within computational microscopy remains theoretically and empirically underexplored. In this research, we introduce a multi-scale generative model that enhances con…
▽ More
The advance of diffusion-based generative models in recent years has revolutionized state-of-the-art (SOTA) techniques in a wide variety of image analysis and synthesis tasks, whereas their adaptation on image restoration, particularly within computational microscopy remains theoretically and empirically underexplored. In this research, we introduce a multi-scale generative model that enhances conditional image restoration through a novel exploitation of the Brownian Bridge process within wavelet domain. By initiating the Brownian Bridge diffusion process specifically at the lowest-frequency subband and applying generative adversarial networks at subsequent multi-scale high-frequency subbands in the wavelet domain, our method provides significant acceleration during training and sampling while sustaining a high image generation quality and diversity on par with SOTA diffusion models. Experimental results on various computational microscopy and imaging tasks confirm our method's robust performance and its considerable reduction in its sampling steps and time. This pioneering technique offers an efficient image restoration framework that harmonizes efficiency with quality, signifying a major stride in incorporating cutting-edge generative models into computational microscopy workflows.
△ Less
Submitted 7 July, 2024;
originally announced July 2024.
-
A Reference-free Metric for Language-Queried Audio Source Separation using Contrastive Language-Audio Pretraining
Authors:
Feiyang Xiao,
Jian Guan,
Qiaoxi Zhu,
Xubo Liu,
Wenbo Wang,
Shuhan Qi,
Kejia Zhang,
Jianyuan Sun,
Wenwu Wang
Abstract:
Language-queried audio source separation (LASS) aims to separate an audio source guided by a text query, with the signal-to-distortion ratio (SDR)-based metrics being commonly used to objectively measure the quality of the separated audio. However, the SDR-based metrics require a reference signal, which is often difficult to obtain in real-world scenarios. In addition, with the SDR-based metrics,…
▽ More
Language-queried audio source separation (LASS) aims to separate an audio source guided by a text query, with the signal-to-distortion ratio (SDR)-based metrics being commonly used to objectively measure the quality of the separated audio. However, the SDR-based metrics require a reference signal, which is often difficult to obtain in real-world scenarios. In addition, with the SDR-based metrics, the content information of the text query is not considered effectively in LASS. This paper introduces a reference-free evaluation metric using a contrastive language-audio pretraining (CLAP) module, termed CLAPScore, which measures the semantic similarity between the separated audio and the text query. Unlike SDR, the proposed CLAPScore metric evaluates the quality of the separated audio based on the content information of the text query, without needing a reference signal. Experimental results show that the CLAPScore metric provides an effective evaluation of the semantic relevance of the separated audio to the text query, as compared to the SDR metric, offering an alternative for the performance evaluation of LASS systems.
△ Less
Submitted 5 July, 2024;
originally announced July 2024.
-
TieBot: Learning to Knot a Tie from Visual Demonstration through a Real-to-Sim-to-Real Approach
Authors:
Weikun Peng,
Jun Lv,
Yuwei Zeng,
Haonan Chen,
Siheng Zhao,
Jichen Sun,
Cewu Lu,
Lin Shao
Abstract:
The tie-knotting task is highly challenging due to the tie's high deformation and long-horizon manipulation actions. This work presents TieBot, a Real-to-Sim-to-Real learning from visual demonstration system for the robots to learn to knot a tie. We introduce the Hierarchical Feature Matching approach to estimate a sequence of tie's meshes from the demonstration video. With these estimated meshes…
▽ More
The tie-knotting task is highly challenging due to the tie's high deformation and long-horizon manipulation actions. This work presents TieBot, a Real-to-Sim-to-Real learning from visual demonstration system for the robots to learn to knot a tie. We introduce the Hierarchical Feature Matching approach to estimate a sequence of tie's meshes from the demonstration video. With these estimated meshes used as subgoals, we first learn a teacher policy using privileged information. Then, we learn a student policy with point cloud observation by imitating teacher policy. Lastly, our pipeline learns a residual policy when the learned policy is applied to real-world execution, mitigating the Sim2Real gap. We demonstrate the effectiveness of TieBot in simulation and the real world. In the real-world experiment, a dual-arm robot successfully knots a tie, achieving 50% success rate among 10 trials. Videos can be found https://tiebots.github.io/.
△ Less
Submitted 3 July, 2024; v1 submitted 3 July, 2024;
originally announced July 2024.
-
Measurement of the branching fraction of the decay $J/ψ\to p \bar{p} η$
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (639 additional authors not shown)
Abstract:
A high precision measurement of the branching fraction of the decay $J/ψ\to p \bar{p} η$ is performed using $(10 087 \pm 44) \times 10^6$ $J/ψ$ events recorded by the {BESIII} detector at the {BEPCII} storage ring. The branching fractions of the two decays $J/ψ\to p \bar{p} η(η\to γγ)$ and $J/ψ\to p \bar{p} η(η\to π^+ π^- π^0)$ are measured individually to be…
▽ More
A high precision measurement of the branching fraction of the decay $J/ψ\to p \bar{p} η$ is performed using $(10 087 \pm 44) \times 10^6$ $J/ψ$ events recorded by the {BESIII} detector at the {BEPCII} storage ring. The branching fractions of the two decays $J/ψ\to p \bar{p} η(η\to γγ)$ and $J/ψ\to p \bar{p} η(η\to π^+ π^- π^0)$ are measured individually to be $\mathcal{B}(J/ψ\to p \bar{p} η(η\to γγ)) = (1.480 \pm 0.001 \pm 0.024)\times\,10^{-3}$ and $\mathcal{B}(J/ψ\to p \bar{p} η(η\to π^+ π^- π^0)) = (1.557 \pm 0.003 \pm 0.038)\times\,10^{-3}$, where the first uncertainties are statistical and the second systematic. Both results are compatible within their uncorrelated systematic uncertainties. The combined result is $\mathcal{B}(J/ψ\to p \bar{p} η)=(1.495 \pm 0.001 \pm 0.023)\times\,10^{-3}$ where the first uncertainty is the combined statistical uncertainty and the second one the combined systematic uncertainty of both analyses, incorporating correlations between them. In addition, the $p \bar{p}$ threshold region is investigated for a potential threshold enhancement, and no evidence for one is observed.
△ Less
Submitted 3 July, 2024;
originally announced July 2024.
-
PHANGS-MeerKAT and MHONGOOSE HI observations of nearby spiral galaxies: physical drivers of the molecular gas fraction, $R_{\mathrm{mol}}$
Authors:
Cosima Eibensteiner,
Jiayi Sun,
Frank Bigiel,
Adam K. Leroy,
Eva Schinnerer,
Erik Rosolowsky,
Sushma Kurapati,
D. J. Pisano,
W. J. G de Blok,
Ashley T. Barnes,
Mallory Thorp,
Dario Colombo,
Eric W. Koch,
I-Da Chiang,
Eve C. Ostriker,
Eric J. Murphy,
Nikki Zabel,
Sebstian Laudage,
Filippo M. Maccagni,
Julia Healy,
Srikrishna Sekhar,
Dyas Utomo,
Jakob den Brok,
Yixian Cao,
Mélanie Chevance
, et al. (14 additional authors not shown)
Abstract:
The molecular-to-atomic gas ratio is crucial to the evolution of the interstellar medium in galaxies. We investigate the balance between the atomic ($Σ_{\rm HI}$) and molecular gas ($Σ_{\rm H2}$) surface densities in eight nearby star-forming galaxies using new high-quality observations from MeerKAT and ALMA (for HI and CO, respectively). We define the molecular gas ratio as…
▽ More
The molecular-to-atomic gas ratio is crucial to the evolution of the interstellar medium in galaxies. We investigate the balance between the atomic ($Σ_{\rm HI}$) and molecular gas ($Σ_{\rm H2}$) surface densities in eight nearby star-forming galaxies using new high-quality observations from MeerKAT and ALMA (for HI and CO, respectively). We define the molecular gas ratio as $R_{\rm mol} = Σ_{\rm H2} / Σ_{\rm HI}$ and measure how it depends on local conditions in the galaxy disks using multi-wavelength observations. We find that, depending on the galaxy, HI is detected at $>3σ$ out to 20-120 kpc in galactocentric radius ($r_{\rm gal}$). The typical radius at which $Σ_{\rm HI}$ reaches 1~$\rm M_\odot~pc^{-2}$ is $r_{\rm HI}\approx22$~kpc, which corresponds to 1-3 times the optical radius ($r_{25}$). $R_{\rm mol}$ correlates best with the dynamical equilibrium pressure, P$_{\rm DE}$, among potential drivers studied, with a median correlation coefficient of $<ρ>=0.89$. Correlations between $R_{\rm mol}$ and star formation rate, total gas and stellar surface density, metallicity, and $Σ_{\rm SFR}$/P$_{\rm DE}$ are present but somewhat weaker. Our results also show a direct correlation between P$_{\rm DE}$ and $Σ_{\rm SFR}$, supporting self-regulation models. Quantitatively, we measure similar scalings as previous works and attribute the modest differences that we find to the effect of varying resolution and sensitivity. At $r_{\rm gal} {\gtrsim}0.4~r_{25}$, atomic gas dominates over molecular gas, and at the balance of these two gas phases, we find that the baryon mass is dominated by stars, with $Σ_{*} > 5~Σ_{\rm gas}$. Our study constitutes an important step in the statistical investigation of how local galaxy properties impact the conversion from atomic to molecular gas in nearby galaxies.
△ Less
Submitted 1 July, 2024;
originally announced July 2024.
-
PointViG: A Lightweight GNN-based Model for Efficient Point Cloud Analysis
Authors:
Qiang Zheng,
Yafei Qi,
Chen Wang,
Chao Zhang,
Jian Sun
Abstract:
In the domain of point cloud analysis, despite the significant capabilities of Graph Neural Networks (GNNs) in managing complex 3D datasets, existing approaches encounter challenges like high computational costs and scalability issues with extensive scenarios. These limitations restrict the practical deployment of GNNs, notably in resource-constrained environments. To address these issues, this st…
▽ More
In the domain of point cloud analysis, despite the significant capabilities of Graph Neural Networks (GNNs) in managing complex 3D datasets, existing approaches encounter challenges like high computational costs and scalability issues with extensive scenarios. These limitations restrict the practical deployment of GNNs, notably in resource-constrained environments. To address these issues, this study introduce <b>Point<\b> <b>Vi<\b>sion <b>G<\b>NN (PointViG), an efficient framework for point cloud analysis. PointViG incorporates a lightweight graph convolutional module to efficiently aggregate local features and mitigate over-smoothing. For large-scale point cloud scenes, we propose an adaptive dilated graph convolution technique that searches for sparse neighboring nodes within a dilated neighborhood based on semantic correlation, thereby expanding the receptive field and ensuring computational efficiency. Experiments demonstrate that PointViG achieves performance comparable to state-of-the-art models while balancing performance and complexity. On the ModelNet40 classification task, PointViG achieved 94.3% accuracy with 1.5M parameters. For the S3DIS segmentation task, it achieved an mIoU of 71.7% with 5.3M parameters. These results underscore the potential and efficiency of PointViG in point cloud analysis.
△ Less
Submitted 30 June, 2024;
originally announced July 2024.
-
TrialBench: Multi-Modal Artificial Intelligence-Ready Clinical Trial Datasets
Authors:
Jintai Chen,
Yaojun Hu,
Yue Wang,
Yingzhou Lu,
Xu Cao,
Miao Lin,
Hongxia Xu,
Jian Wu,
Cao Xiao,
Jimeng Sun,
Lucas Glass,
Kexin Huang,
Marinka Zitnik,
Tianfan Fu
Abstract:
Clinical trials are pivotal for developing new medical treatments, yet they typically pose some risks such as patient mortality, adverse events, and enrollment failure that waste immense efforts spanning over a decade. Applying artificial intelligence (AI) to forecast or simulate key events in clinical trials holds great potential for providing insights to guide trial designs. However, complex dat…
▽ More
Clinical trials are pivotal for developing new medical treatments, yet they typically pose some risks such as patient mortality, adverse events, and enrollment failure that waste immense efforts spanning over a decade. Applying artificial intelligence (AI) to forecast or simulate key events in clinical trials holds great potential for providing insights to guide trial designs. However, complex data collection and question definition requiring medical expertise and a deep understanding of trial designs have hindered the involvement of AI thus far. This paper tackles these challenges by presenting a comprehensive suite of meticulously curated AIready datasets covering multi-modal data (e.g., drug molecule, disease code, text, categorical/numerical features) and 8 crucial prediction challenges in clinical trial design, encompassing prediction of trial duration, patient dropout rate, serious adverse event, mortality rate, trial approval outcome, trial failure reason, drug dose finding, design of eligibility criteria. Furthermore, we provide basic validation methods for each task to ensure the datasets' usability and reliability. We anticipate that the availability of such open-access datasets will catalyze the development of advanced AI approaches for clinical trial design, ultimately advancing clinical trial research and accelerating medical solution development. The curated dataset, metrics, and basic models are publicly available at https://github.com/ML2Health/ML2ClinicalTrials/tree/main/AI4Trial.
△ Less
Submitted 30 June, 2024;
originally announced July 2024.
-
Observable CP-violation in charmed baryons decays with SU(3) symmetry analysis
Authors:
Jin Sun,
Ruilin Zhu,
Zhi-Peng Xing
Abstract:
CP violation in baryon decays has recently garnered significant attention. Traditionally, CP violation in charmed baryon decays is predicted to be at the order of $O(10^{-4})$. Utilizing recent measurements from BESIII, Belle and Belle~II experiments, we conduct a comprehensive global analysis to estimate the contribution of topological diagrams in these processes. Our study reveals that topologic…
▽ More
CP violation in baryon decays has recently garnered significant attention. Traditionally, CP violation in charmed baryon decays is predicted to be at the order of $O(10^{-4})$. Utilizing recent measurements from BESIII, Belle and Belle~II experiments, we conduct a comprehensive global analysis to estimate the contribution of topological diagrams in these processes. Our study reveals that topological diagrams alone cannot fully account for the experimental data. By incorporating new effects that are not included in traditional topological diagrams and exploring the possible source of the new effect, our analysis reveals the potential for significant CP violation in $Λ_c^+\to n π^+$ and $Ξ_c^+\to Ξ^0 K^+$. This suggests a promising opportunity to observe CP violation for the first time in charmed baryon decays.
△ Less
Submitted 11 July, 2024; v1 submitted 29 June, 2024;
originally announced July 2024.
-
Evaluating Human Alignment and Model Faithfulness of LLM Rationale
Authors:
Mohsen Fayyaz,
Fan Yin,
Jiao Sun,
Nanyun Peng
Abstract:
We study how well large language models (LLMs) explain their generations with rationales -- a set of tokens extracted from the input texts that reflect the decision process of LLMs. We examine LLM rationales extracted with two methods: 1) attribution-based methods that use attention or gradients to locate important tokens, and 2) prompting-based methods that guide LLMs to extract rationales using…
▽ More
We study how well large language models (LLMs) explain their generations with rationales -- a set of tokens extracted from the input texts that reflect the decision process of LLMs. We examine LLM rationales extracted with two methods: 1) attribution-based methods that use attention or gradients to locate important tokens, and 2) prompting-based methods that guide LLMs to extract rationales using prompts. Through extensive experiments, we show that prompting-based rationales align better with human-annotated rationales than attribution-based rationales, and demonstrate reasonable alignment with humans even when model performance is poor. We additionally find that the faithfulness limitations of prompting-based methods, which are identified in previous work, may be linked to their collapsed predictions. By fine-tuning these models on the corresponding datasets, both prompting and attribution methods demonstrate improved faithfulness. Our study sheds light on more rigorous and fair evaluations of LLM rationales, especially for prompting-based ones.
△ Less
Submitted 28 June, 2024;
originally announced July 2024.
-
Observation of the Electromagnetic Dalitz Transition $h_c \rightarrow e^+e^-η_c$
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
S. Ahmed,
M. Albrecht,
R. Aliberti,
A. Amoroso,
M. R. An,
Q. An,
X. H. Bai,
Y. Bai,
O. Bakina,
R. Baldini Ferroli,
I. Balossino,
Y. Ban,
K. Begzsuren,
N. Berger,
M. Bertani,
D. Bettoni,
F. Bianchi,
J. Bloms,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (495 additional authors not shown)
Abstract:
Using $(27.12\pm 0.14)\times10^8$ $ψ(3686)$ decays and data samples of $e^+e^-$ collisions with $\sqrt{s}$ from 4.130 to 4.780~GeV collected with the BESIII detector, we report the first observation of the electromagnetic Dalitz transition $h_c\to e^+e^-η_c$ with a statistical significance of $5.4σ$. We measure the ratio of the branching fractions…
▽ More
Using $(27.12\pm 0.14)\times10^8$ $ψ(3686)$ decays and data samples of $e^+e^-$ collisions with $\sqrt{s}$ from 4.130 to 4.780~GeV collected with the BESIII detector, we report the first observation of the electromagnetic Dalitz transition $h_c\to e^+e^-η_c$ with a statistical significance of $5.4σ$. We measure the ratio of the branching fractions $\frac{\mathcal{B}(h_c\rightarrow e^+e^-η_c)}{\mathcal{B}(h_c\rightarrow γη_c)}$ separately for the $h_c$ samples produced via $ψ(3686)\toπ^0h_c$ and $e^+e^-\toπ^+π^-h_c$. The average ratio is determined to be $(0.59\pm0.10(\text{stat.})\pm0.04(\text{syst.}))\%$, where the uncertainty includes both statistical and systematic components.
△ Less
Submitted 2 July, 2024; v1 submitted 28 June, 2024;
originally announced July 2024.
-
MetaDesigner: Advancing Artistic Typography through AI-Driven, User-Centric, and Multilingual WordArt Synthesis
Authors:
Jun-Yan He,
Zhi-Qi Cheng,
Chenyang Li,
Jingdong Sun,
Qi He,
Wangmeng Xiang,
Hanyuan Chen,
Jin-Peng Lan,
Xianhui Lin,
Kang Zhu,
Bin Luo,
Yifeng Geng,
Xuansong Xie,
Alexander G. Hauptmann
Abstract:
MetaDesigner revolutionizes artistic typography synthesis by leveraging the strengths of Large Language Models (LLMs) to drive a design paradigm centered around user engagement. At the core of this framework lies a multi-agent system comprising the Pipeline, Glyph, and Texture agents, which collectively enable the creation of customized WordArt, ranging from semantic enhancements to the imposition…
▽ More
MetaDesigner revolutionizes artistic typography synthesis by leveraging the strengths of Large Language Models (LLMs) to drive a design paradigm centered around user engagement. At the core of this framework lies a multi-agent system comprising the Pipeline, Glyph, and Texture agents, which collectively enable the creation of customized WordArt, ranging from semantic enhancements to the imposition of complex textures. MetaDesigner incorporates a comprehensive feedback mechanism that harnesses insights from multimodal models and user evaluations to refine and enhance the design process iteratively. Through this feedback loop, the system adeptly tunes hyperparameters to align with user-defined stylistic and thematic preferences, generating WordArt that not only meets but exceeds user expectations of visual appeal and contextual relevance. Empirical validations highlight MetaDesigner's capability to effectively serve diverse WordArt applications, consistently producing aesthetically appealing and context-sensitive results.
△ Less
Submitted 4 July, 2024; v1 submitted 28 June, 2024;
originally announced June 2024.
-
MuGSI: Distilling GNNs with Multi-Granularity Structural Information for Graph Classification
Authors:
Tianjun Yao,
Jiaqi Sun,
Defu Cao,
Kun Zhang,
Guangyi Chen
Abstract:
Recent works have introduced GNN-to-MLP knowledge distillation (KD) frameworks to combine both GNN's superior performance and MLP's fast inference speed. However, existing KD frameworks are primarily designed for node classification within single graphs, leaving their applicability to graph classification largely unexplored. Two main challenges arise when extending KD for node classification to gr…
▽ More
Recent works have introduced GNN-to-MLP knowledge distillation (KD) frameworks to combine both GNN's superior performance and MLP's fast inference speed. However, existing KD frameworks are primarily designed for node classification within single graphs, leaving their applicability to graph classification largely unexplored. Two main challenges arise when extending KD for node classification to graph classification: (1) The inherent sparsity of learning signals due to soft labels being generated at the graph level; (2) The limited expressiveness of student MLPs, especially in datasets with limited input feature spaces. To overcome these challenges, we introduce MuGSI, a novel KD framework that employs Multi-granularity Structural Information for graph classification. Specifically, we propose multi-granularity distillation loss in MuGSI to tackle the first challenge. This loss function is composed of three distinct components: graph-level distillation, subgraph-level distillation, and node-level distillation. Each component targets a specific granularity of the graph structure, ensuring a comprehensive transfer of structural knowledge from the teacher model to the student model. To tackle the second challenge, MuGSI proposes to incorporate a node feature augmentation component, thereby enhancing the expressiveness of the student MLPs and making them more capable learners. We perform extensive experiments across a variety of datasets and different teacher/student model architectures. The experiment results demonstrate the effectiveness, efficiency, and robustness of MuGSI. Codes are publicly available at: \textbf{\url{https://github.com/tianyao-aka/MuGSI}.}
△ Less
Submitted 28 June, 2024;
originally announced June 2024.
-
Time-optimal Flight in Cluttered Environments via Safe Reinforcement Learning
Authors:
Wei Xiao,
Zhaohan Feng,
Ziyu Zhou,
Jian Sun,
Gang Wang,
Jie Chen
Abstract:
This paper addresses the problem of guiding a quadrotor through a predefined sequence of waypoints in cluttered environments, aiming to minimize the flight time while avoiding collisions. Previous approaches either suffer from prolonged computational time caused by solving complex non-convex optimization problems or are limited by the inherent smoothness of polynomial trajectory representations, t…
▽ More
This paper addresses the problem of guiding a quadrotor through a predefined sequence of waypoints in cluttered environments, aiming to minimize the flight time while avoiding collisions. Previous approaches either suffer from prolonged computational time caused by solving complex non-convex optimization problems or are limited by the inherent smoothness of polynomial trajectory representations, thereby restricting the flexibility of movement. In this work, we present a safe reinforcement learning approach for autonomous drone racing with time-optimal flight in cluttered environments. The reinforcement learning policy, trained using safety and terminal rewards specifically designed to enforce near time-optimal and collision-free flight, outperforms current state-of-the-art algorithms. Additionally, experimental results demonstrate the efficacy of the proposed approach in achieving both minimum flight time and obstacle avoidance objectives in complex environments, with a commendable $66.7\%$ success rate in unseen, challenging settings.
△ Less
Submitted 28 June, 2024;
originally announced June 2024.
-
Tools Fail: Detecting Silent Errors in Faulty Tools
Authors:
Jimin Sun,
So Yeon Min,
Yingshan Chang,
Yonatan Bisk
Abstract:
Tools have become a mainstay of LLMs, allowing them to retrieve knowledge not in their weights, to perform tasks on the web, and even to control robots. However, most ontologies and surveys of tool-use have assumed the core challenge for LLMs is choosing the tool. Instead, we introduce a framework for tools more broadly which guides us to explore a model's ability to detect "silent" tool errors, a…
▽ More
Tools have become a mainstay of LLMs, allowing them to retrieve knowledge not in their weights, to perform tasks on the web, and even to control robots. However, most ontologies and surveys of tool-use have assumed the core challenge for LLMs is choosing the tool. Instead, we introduce a framework for tools more broadly which guides us to explore a model's ability to detect "silent" tool errors, and reflect on how to plan. This more directly aligns with the increasingly popular use of models as tools. We provide an initial approach to failure recovery with promising results both on a controlled calculator setting and embodied agent planning.
△ Less
Submitted 27 June, 2024;
originally announced June 2024.
-
Improved measurement of the semileptonic decay $D^+_{s}\to K^0 e^+ν_e$
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (643 additional authors not shown)
Abstract:
Analyzing $e^+e^-$ collision data corresponding to an integrated luminosity of $7.33~\mathrm{fb}^{-1}$ collected at center-of-mass energies between 4.128 and 4.226~GeV with the BESIII detector, we measure the branching fraction of the semileptonic decay $D^+_{s}\to K^0 e^+ν_e$ to be $(2.98\pm0.23\pm0.12)\times10^{-3}$. The $D_s^+\to K^0$ hadronic form factor is determined from the differential dec…
▽ More
Analyzing $e^+e^-$ collision data corresponding to an integrated luminosity of $7.33~\mathrm{fb}^{-1}$ collected at center-of-mass energies between 4.128 and 4.226~GeV with the BESIII detector, we measure the branching fraction of the semileptonic decay $D^+_{s}\to K^0 e^+ν_e$ to be $(2.98\pm0.23\pm0.12)\times10^{-3}$. The $D_s^+\to K^0$ hadronic form factor is determined from the differential decay rate of $D^+_s\to K^0 e^+ν_e$ to be $f^{K^0}_+(0)=0.636\pm0.049\pm0.013$. For both measurements, the first uncertainty is statistical and the second systematic. The branching fraction and form factor measurements are factors of 1.6 and 1.7 more precise than the previous world averages, respectively.
△ Less
Submitted 27 June, 2024;
originally announced June 2024.
-
Towards Large Language Model Aided Program Refinement
Authors:
Yufan Cai,
Zhe Hou,
Xiaokun Luan,
David Miguel Sanan Baena,
Yun Lin,
Jun Sun,
Jin Song Dong
Abstract:
Program refinement involves correctness-preserving transformations from formal high-level specification statements into executable programs. Traditional verification tool support for program refinement is highly interactive and lacks automation. On the other hand, the emergence of large language models (LLMs) enables automatic code generations from informal natural language specifications. However…
▽ More
Program refinement involves correctness-preserving transformations from formal high-level specification statements into executable programs. Traditional verification tool support for program refinement is highly interactive and lacks automation. On the other hand, the emergence of large language models (LLMs) enables automatic code generations from informal natural language specifications. However, code generated by LLMs is often unreliable. Moreover, the opaque procedure from specification to code provided by LLM is an uncontrolled black box. We propose LLM4PR, a tool that combines formal program refinement techniques with informal LLM-based methods to (1) transform the specification to preconditions and postconditions, (2) automatically build prompts based on refinement calculus, (3) interact with LLM to generate code, and finally, (4) verify that the generated code satisfies the conditions of refinement calculus, thus guaranteeing the correctness of the code. We have implemented our tool using GPT4, Coq, and Coqhammer, and evaluated it on the HumanEval and EvalPlus datasets.
△ Less
Submitted 26 June, 2024;
originally announced June 2024.
-
Measurement of the cross sections of $e^+e^-\to K^{-}\barΞ^{+}Λ/Σ^{0}$ at center-of-mass energies between 3.510 and 4.914 GeV
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (638 additional authors not shown)
Abstract:
Using $e^+e^-$ collision data collected with the BESIII detector at the BEPCII collider at center-of-mass energies between 3.510 and 4.914GeV, corresponding to an integrated luminosity of 25 fb$^{-1}$, we measure the Born cross sections for the process $e^+e^-\to K^-\barΞ^+Λ/Σ^{0}$ at thirty-five energy points with a partial-reconstruction strategy. By fitting the dressed cross sections of…
▽ More
Using $e^+e^-$ collision data collected with the BESIII detector at the BEPCII collider at center-of-mass energies between 3.510 and 4.914GeV, corresponding to an integrated luminosity of 25 fb$^{-1}$, we measure the Born cross sections for the process $e^+e^-\to K^-\barΞ^+Λ/Σ^{0}$ at thirty-five energy points with a partial-reconstruction strategy. By fitting the dressed cross sections of $e^+e^-\to K^-\barΞ^+Λ/Σ^{0}$, evidence for $ψ(4160) \to K^{-}\barΞ^{+}Λ$ is found for the first time with a significance of 4.4$σ$, including systematic uncertainties. No evidence for other possible resonances is found. In addition, the products of electronic partial width and branching fraction for all assumed resonances decaying into $K^{-}\barΞ^{+}Λ/Σ^{0}$ are determined.
△ Less
Submitted 26 June, 2024;
originally announced June 2024.
-
Measurements of $K_S^0$-$K_L^0$ asymmetries in the decays $Λ_c^+ \to pK_{L,S}^0$, $pK_{L,S}^0π^+π^-$ and $pK_{L,S}^0π^0$
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (643 additional authors not shown)
Abstract:
Using $e^+e^-$ annihilation data sets corresponding to an integrated luminosity of 4.5 $\text{fb}^{-1}$, collected with the BESIII detector at center-of-mass energies between 4.600 and 4.699 GeV, we report the first measurements of the absolute branching fractions $\mathcal{B}(Λ_c^+\to pK_{L}^{0})=(1.67 \pm 0.06 \pm 0. 04)\%$, $\mathcal{B}(Λ_c^+\to pK_{L}^{0}π^+π^-)=(1.69 \pm 0.10 \pm 0.05)\%$, an…
▽ More
Using $e^+e^-$ annihilation data sets corresponding to an integrated luminosity of 4.5 $\text{fb}^{-1}$, collected with the BESIII detector at center-of-mass energies between 4.600 and 4.699 GeV, we report the first measurements of the absolute branching fractions $\mathcal{B}(Λ_c^+\to pK_{L}^{0})=(1.67 \pm 0.06 \pm 0. 04)\%$, $\mathcal{B}(Λ_c^+\to pK_{L}^{0}π^+π^-)=(1.69 \pm 0.10 \pm 0.05)\%$, and $\mathcal{B}(Λ_c^+\to pK_{L}^{0}π^0)=(2.02 \pm 0.13 \pm 0.05)\%$, where the first uncertainties are statistical and the second systematic. Combining with the known branching fractions of $Λ_c^+ \to pK_{S}^{0}$, $Λ_c^+ \to pK_{S}^{0}π^+π^-$, and $Λ_c^+ \to pK_{S}^{0}π^0$, we present the first measurements of the $K_{S}^{0}$-$K_{L}^{0}$ asymmetries $R(Λ_c^+, K_{S,L}^0X) = \frac{\mathcal{B}(Λ_c^+ \to K_{S}^{0} X) - \mathcal{B}(Λ_c^+ \to K_{L}^{0} X)}{\mathcal{B}(Λ_c^+ \to K_{S}^{0} X) + \mathcal{B}(Λ_c^+ \to K_{L}^{0} X)}$ in charmed baryon decays: $R(Λ_c^+, pK_{S,L}^0) = -0.025 \pm 0.031$, $R(Λ_c^+, pK_{S,L}^0π^+π^-) = -0.027 \pm 0.048$, and $R(Λ_c^+, pK_{S,L}^0π^0) =-0.015 \pm 0.046$. No significant asymmetries within the uncertainties are observed.
△ Less
Submitted 26 June, 2024;
originally announced June 2024.
-
PharmaGPT: Domain-Specific Large Language Models for Bio-Pharmaceutical and Chemistry
Authors:
Linqing Chen,
Weilei Wang,
Zilong Bai,
Peng Xu,
Yan Fang,
Jie Fang,
Wentao Wu,
Lizhi Zhou,
Ruiji Zhang,
Yubin Xia,
Chaobo Xu,
Ran Hu,
Licong Xu,
Qijun Cai,
Haoran Hua,
Jing Sun,
Jin Liu,
Tian Qiu,
Haowen Liu,
Meng Hu,
Xiuwen Li,
Fei Gao,
Yufu Wang,
Lin Tie,
Chaochao Wang
, et al. (11 additional authors not shown)
Abstract:
Large language models (LLMs) have revolutionized Natural Language Processing (NLP) by minimizing the need for complex feature engineering. However, the application of LLMs in specialized domains like biopharmaceuticals and chemistry remains largely unexplored. These fields are characterized by intricate terminologies, specialized knowledge, and a high demand for precision areas where general purpo…
▽ More
Large language models (LLMs) have revolutionized Natural Language Processing (NLP) by minimizing the need for complex feature engineering. However, the application of LLMs in specialized domains like biopharmaceuticals and chemistry remains largely unexplored. These fields are characterized by intricate terminologies, specialized knowledge, and a high demand for precision areas where general purpose LLMs often fall short. In this study, we introduce PharmaGPT, a suite of domain specilized LLMs with 13 billion and 70 billion parameters, specifically trained on a comprehensive corpus tailored to the Bio-Pharmaceutical and Chemical domains. Our evaluation shows that PharmaGPT surpasses existing general models on specific-domain benchmarks such as NAPLEX, demonstrating its exceptional capability in domain-specific tasks. Remarkably, this performance is achieved with a model that has only a fraction, sometimes just one-tenth-of the parameters of general-purpose large models. This advancement establishes a new benchmark for LLMs in the bio-pharmaceutical and chemical fields, addressing the existing gap in specialized language modeling. It also suggests a promising path for enhanced research and development, paving the way for more precise and effective NLP applications in these areas.
△ Less
Submitted 9 July, 2024; v1 submitted 25 June, 2024;
originally announced June 2024.
-
Tele-Correlation: Calibrating Shear-Shear Correlation with Real Data
Authors:
Zhi Shen,
Jun Zhang,
Cong Liu,
Hekun Li,
Haoran Wang,
Zhenjie Liu,
Jiarui Sun
Abstract:
Tele-correlation refers to the correlation of galaxy shapes with large angular separations (e.g., $>100$ degrees). Since there are no astrophysical reasons causing such a correlation on cosmological scales, any detected tele-correlation could disclose systematic effects in shear-shear correlation measurement. If the shear estimators are measured on single exposures, we show that the field distorti…
▽ More
Tele-correlation refers to the correlation of galaxy shapes with large angular separations (e.g., $>100$ degrees). Since there are no astrophysical reasons causing such a correlation on cosmological scales, any detected tele-correlation could disclose systematic effects in shear-shear correlation measurement. If the shear estimators are measured on single exposures, we show that the field distortion (FD) signal associated with the galaxy position on the CCD can be retained and used in tele-correlation to help us directly calibrate the multiplicative and additive biases in shear-shear correlations. We use the DECaLS shear catalog produced by the Fourier\_Quad pipeline to demonstrate this idea. To our surprise, we find that significant multiplicative biases can arise (up to more than 10\%) due to redshift binning of the galaxies. Correction for this bias leads to about 1$σ$ increase of the best-fit value of $S_8$ from $0.760^{+0.015}_{-0.017}$ to $0.777^{+0.016}_{-0.019}$ in our tomography study.
△ Less
Submitted 27 June, 2024; v1 submitted 25 June, 2024;
originally announced June 2024.
-
ET tu, CLIP? Addressing Common Object Errors for Unseen Environments
Authors:
Ye Won Byun,
Cathy Jiao,
Shahriar Noroozizadeh,
Jimin Sun,
Rosa Vitiello
Abstract:
We introduce a simple method that employs pre-trained CLIP encoders to enhance model generalization in the ALFRED task. In contrast to previous literature where CLIP replaces the visual encoder, we suggest using CLIP as an additional module through an auxiliary object detection objective. We validate our method on the recently proposed Episodic Transformer architecture and demonstrate that incorpo…
▽ More
We introduce a simple method that employs pre-trained CLIP encoders to enhance model generalization in the ALFRED task. In contrast to previous literature where CLIP replaces the visual encoder, we suggest using CLIP as an additional module through an auxiliary object detection objective. We validate our method on the recently proposed Episodic Transformer architecture and demonstrate that incorporating CLIP improves task performance on the unseen validation set. Additionally, our analysis results support that CLIP especially helps with leveraging object descriptions, detecting small objects, and interpreting rare words.
△ Less
Submitted 25 June, 2024;
originally announced June 2024.
-
Accelerating Clinical Evidence Synthesis with Large Language Models
Authors:
Zifeng Wang,
Lang Cao,
Benjamin Danek,
Yichi Zhang,
Qiao Jin,
Zhiyong Lu,
Jimeng Sun
Abstract:
Automatic medical discovery by AI is a dream of many. One step toward that goal is to create an AI model to understand clinical studies and synthesize clinical evidence from the literature. Clinical evidence synthesis currently relies on systematic reviews of clinical trials and retrospective analyses from medical literature. However, the rapid expansion of publications presents challenges in effi…
▽ More
Automatic medical discovery by AI is a dream of many. One step toward that goal is to create an AI model to understand clinical studies and synthesize clinical evidence from the literature. Clinical evidence synthesis currently relies on systematic reviews of clinical trials and retrospective analyses from medical literature. However, the rapid expansion of publications presents challenges in efficiently identifying, summarizing, and updating evidence. We introduce TrialMind, a generative AI-based pipeline for conducting medical systematic reviews, encompassing study search, screening, and data extraction phases. We utilize large language models (LLMs) to drive each pipeline component while incorporating human expert oversight to minimize errors. To facilitate evaluation, we also create a benchmark dataset TrialReviewBench, a custom dataset with 870 annotated clinical studies from 25 meta-analysis papers across various medical treatments. Our results demonstrate that TrialMind significantly improves the literature review process, achieving high recall rates (0.897-1.000) in study searching from over 20 million PubMed studies and outperforming traditional language model embeddings-based methods in screening (Recall@20 of 0.227-0.246 vs. 0.000-0.102). Furthermore, our approach surpasses direct GPT-4 performance in result extraction, with accuracy ranging from 0.65 to 0.84. We also support clinical evidence synthesis in forest plots, as validated by eight human annotators who preferred TrialMind over the GPT-4 baseline with a winning rate of 62.5%-100% across the involved reviews. Our findings suggest that an LLM-based clinical evidence synthesis approach, such as TrialMind, can enable reliable and high-quality clinical evidence synthesis to improve clinical research efficiency.
△ Less
Submitted 25 June, 2024;
originally announced June 2024.
-
Study of the $f_{0}(980)$ through the decay $D_{s}^{+}\rightarrow π^{+}π^{+}π^{-}π^{0}$
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (649 additional authors not shown)
Abstract:
We perform the first amplitude analysis of $D^+_s \to π^+π^+π^-π^0$ decays, based on data samples of electron-positron collisions recorded with the BESIII detector at center-of-mass energies between 4.128 and 4.226 GeV, corresponding to an integrated luminosity of 7.33~fb$^{-1}$. We report the observation of $D_{s}^{+} \to f_0(980)ρ(770)^{+}$ with a statistical significance greater than 10$σ$ and…
▽ More
We perform the first amplitude analysis of $D^+_s \to π^+π^+π^-π^0$ decays, based on data samples of electron-positron collisions recorded with the BESIII detector at center-of-mass energies between 4.128 and 4.226 GeV, corresponding to an integrated luminosity of 7.33~fb$^{-1}$. We report the observation of $D_{s}^{+} \to f_0(980)ρ(770)^{+}$ with a statistical significance greater than 10$σ$ and determine the branching fractions $\mathcal{B}(D_s^+\toπ^+π^+π^-π^0|_{{\rm non}-η})=(2.04\pm0.08_{\rm stat.}\pm0.05_{\rm syst.})\%$ and $\mathcal{B}(D_s^+\toηπ^+)=(1.56\pm0.09_{\rm stat.}\pm0.04_{\rm syst.})\%$. Moreover, we measure the relative branching fraction between $φ\toπ^+π^-π^0$ and $φ\to K^+K^-$ to be $\frac{\mathcal{B}(φ(1020) \to π^+π^-π^0)}{\mathcal{B}(φ(1020) \to K^+K^-)}=0.230 \pm 0.014_{\rm stat.} \pm 0.010_{\rm syst.}$, which deviates from the world average value by more than $4σ$.
△ Less
Submitted 25 June, 2024;
originally announced June 2024.
-
Probing the nature of the $χ_{c1}(3872)$ state using radiative decays
Authors:
LHCb collaboration,
R. Aaij,
A. S. W. Abdelmotteleb,
C. Abellan Beteta,
F. Abudinén,
T. Ackernley,
A. A. Adefisoye,
B. Adeva,
M. Adinolfi,
P. Adlarson,
C. Agapopoulou,
C. A. Aidala,
Z. Ajaltouni,
S. Akar,
K. Akiba,
P. Albicocco,
J. Albrecht,
F. Alessio,
M. Alexander,
Z. Aliouche,
P. Alvarez Cartelle,
R. Amalric,
S. Amato,
J. L. Amey,
Y. Amhis
, et al. (1094 additional authors not shown)
Abstract:
The radiative decays $χ_{c1}(3872)\rightarrowψ(2S)γ$ and $χ_{c1}(3872)\rightarrow J/ψγ$ are used to probe the~nature of the~$χ_{c1}(3872)$ state using proton-proton collision data collected with the LHCb detector, corresponding to an~integrated luminosity of~9fb$^{-1}$. Using the~$B^+\rightarrow χ_{c1}(3872)K^+$decay, the $χ_{c1}(3872)\rightarrow ψ(2S)γ$ process is observed for the first time and…
▽ More
The radiative decays $χ_{c1}(3872)\rightarrowψ(2S)γ$ and $χ_{c1}(3872)\rightarrow J/ψγ$ are used to probe the~nature of the~$χ_{c1}(3872)$ state using proton-proton collision data collected with the LHCb detector, corresponding to an~integrated luminosity of~9fb$^{-1}$. Using the~$B^+\rightarrow χ_{c1}(3872)K^+$decay, the $χ_{c1}(3872)\rightarrow ψ(2S)γ$ process is observed for the first time and the ratio of its partial width to that of the $χ_{c1}(3872)\rightarrow J/ψγ$ decay is measured to be $$ \frac{Γ_{χ_{c1}(3872)\rightarrow ψ(2S)γ}}
{Γ_{χ_{c1}(3872)\rightarrow J/ψγ}} = 1.67 \pm 0.21 \pm 0.12 \pm0.04 , $$ where the first uncertainty is statistical, the second systematic and the third is due to the uncertainties on the branching fractions of the $ψ(2S)$ and $J/ψ$ mesons. The measured ratio makes the interpretation of the $χ_{c1}(3872)$ state as a~pure $D^0\bar{D}^{*0}+\bar{D}^0D^{*0}$ molecule questionable and strongly indicates a sizeable compact charmonium or tetraquark component within the $χ_{c1}(3872)$ state.
△ Less
Submitted 24 June, 2024;
originally announced June 2024.
-
Privacy Preserved Blood Glucose Level Cross-Prediction: An Asynchronous Decentralized Federated Learning Approach
Authors:
Chengzhe Piao,
Taiyu Zhu,
Yu Wang,
Stephanie E Baldeweg,
Paul Taylor,
Pantelis Georgiou,
Jiahao Sun,
Jun Wang,
Kezhi Li
Abstract:
Newly diagnosed Type 1 Diabetes (T1D) patients often struggle to obtain effective Blood Glucose (BG) prediction models due to the lack of sufficient BG data from Continuous Glucose Monitoring (CGM), presenting a significant "cold start" problem in patient care. Utilizing population models to address this challenge is a potential solution, but collecting patient data for training population models…
▽ More
Newly diagnosed Type 1 Diabetes (T1D) patients often struggle to obtain effective Blood Glucose (BG) prediction models due to the lack of sufficient BG data from Continuous Glucose Monitoring (CGM), presenting a significant "cold start" problem in patient care. Utilizing population models to address this challenge is a potential solution, but collecting patient data for training population models in a privacy-conscious manner is challenging, especially given that such data is often stored on personal devices. Considering the privacy protection and addressing the "cold start" problem in diabetes care, we propose "GluADFL", blood Glucose prediction by Asynchronous Decentralized Federated Learning. We compared GluADFL with eight baseline methods using four distinct T1D datasets, comprising 298 participants, which demonstrated its superior performance in accurately predicting BG levels for cross-patient analysis. Furthermore, patients' data might be stored and shared across various communication networks in GluADFL, ranging from highly interconnected (e.g., random, performs the best among others) to more structured topologies (e.g., cluster and ring), suitable for various social networks. The asynchronous training framework supports flexible participation. By adjusting the ratios of inactive participants, we found it remains stable if less than 70% are inactive. Our results confirm that GluADFL offers a practical, privacy-preserving solution for BG prediction in T1D, significantly enhancing the quality of diabetes management.
△ Less
Submitted 21 June, 2024;
originally announced June 2024.
-
A Unified Framework for Input Feature Attribution Analysis
Authors:
Jingyi Sun,
Pepa Atanasova,
Isabelle Augenstein
Abstract:
Explaining the decision-making process of machine learning models is crucial for ensuring their reliability and fairness. One popular explanation form highlights key input features, such as i) tokens (e.g., Shapley Values and Integrated Gradients), ii) interactions between tokens (e.g., Bivariate Shapley and Attention-based methods), or iii) interactions between spans of the input (e.g., Louvain S…
▽ More
Explaining the decision-making process of machine learning models is crucial for ensuring their reliability and fairness. One popular explanation form highlights key input features, such as i) tokens (e.g., Shapley Values and Integrated Gradients), ii) interactions between tokens (e.g., Bivariate Shapley and Attention-based methods), or iii) interactions between spans of the input (e.g., Louvain Span Interactions). However, these explanation types have only been studied in isolation, making it difficult to judge their respective applicability. To bridge this gap, we propose a unified framework that facilitates a direct comparison between highlight and interactive explanations comprised of four diagnostic properties. Through extensive analysis across these three types of input feature explanations--each utilizing three different explanation techniques--across two datasets and two models, we reveal that each explanation type excels in terms of different diagnostic properties. In our experiments, highlight explanations are the most faithful to a model's prediction, and interactive explanations provide better utility for learning to simulate a model's predictions. These insights further highlight the need for future research to develop combined methods that enhance all diagnostic properties.
△ Less
Submitted 21 June, 2024;
originally announced June 2024.
-
Search for the $e^+e^- \to φχ_{c1}(3872)$ process at BESIII
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (639 additional authors not shown)
Abstract:
Based on 368.5 pb$^{-1}$ of $e^+e^-$ collision data collected at center-of-mass energies 4.914 and 4.946 GeV by the BESIII detector, the $e^+e^- \to φχ_{c1}(3872)$ process is searched for the first time. No significant signal is observed and the upper limits at the 90\% confidence level on the product of the Born cross section $σ(e^+e^- \to φχ_{c1}(3872))$ and the branching fraction…
▽ More
Based on 368.5 pb$^{-1}$ of $e^+e^-$ collision data collected at center-of-mass energies 4.914 and 4.946 GeV by the BESIII detector, the $e^+e^- \to φχ_{c1}(3872)$ process is searched for the first time. No significant signal is observed and the upper limits at the 90\% confidence level on the product of the Born cross section $σ(e^+e^- \to φχ_{c1}(3872))$ and the branching fraction $\mathcal{B}[χ_{c1}(3872)\toπ^+π^- J/ψ]$ at 4.914 and 4.946 GeV are set to be 0.85 and 0.96 pb, respectively. These measurements provide useful information for the production of the $χ_{c1}(3872)$ at $e^+e^-$ collider and deepen our understanding about the nature of this particle.
△ Less
Submitted 21 June, 2024;
originally announced June 2024.
-
A Decision-Making GPT Model Augmented with Entropy Regularization for Autonomous Vehicles
Authors:
Jiaqi Liu,
Shiyu Fang,
Xuekai Liu,
Lulu Guo,
Peng Hang,
Jian Sun
Abstract:
In the domain of autonomous vehicles (AVs), decision-making is a critical factor that significantly influences the efficacy of autonomous navigation. As the field progresses, the enhancement of decision-making capabilities in complex environments has become a central area of research within data-driven methodologies. Despite notable advances, existing learning-based decision-making strategies in a…
▽ More
In the domain of autonomous vehicles (AVs), decision-making is a critical factor that significantly influences the efficacy of autonomous navigation. As the field progresses, the enhancement of decision-making capabilities in complex environments has become a central area of research within data-driven methodologies. Despite notable advances, existing learning-based decision-making strategies in autonomous vehicles continue to reveal opportunities for further refinement, particularly in the articulation of policies and the assurance of safety. In this study, the decision-making challenges associated with autonomous vehicles are conceptualized through the framework of the Constrained Markov Decision Process (CMDP) and approached as a sequence modeling problem. Utilizing the Generative Pre-trained Transformer (GPT), we introduce a novel decision-making model tailored for AVs, which incorporates entropy regularization techniques to bolster exploration and enhance safety performance. Comprehensive experiments conducted across various scenarios affirm that our approach surpasses several established baseline methods, particularly in terms of safety and overall efficacy.
△ Less
Submitted 19 June, 2024;
originally announced June 2024.
-
ChatGLM: A Family of Large Language Models from GLM-130B to GLM-4 All Tools
Authors:
Team GLM,
:,
Aohan Zeng,
Bin Xu,
Bowen Wang,
Chenhui Zhang,
Da Yin,
Diego Rojas,
Guanyu Feng,
Hanlin Zhao,
Hanyu Lai,
Hao Yu,
Hongning Wang,
Jiadai Sun,
Jiajie Zhang,
Jiale Cheng,
Jiayi Gui,
Jie Tang,
Jing Zhang,
Juanzi Li,
Lei Zhao,
Lindong Wu,
Lucen Zhong,
Mingdao Liu,
Minlie Huang
, et al. (32 additional authors not shown)
Abstract:
We introduce ChatGLM, an evolving family of large language models that we have been developing over time. This report primarily focuses on the GLM-4 language series, which includes GLM-4, GLM-4-Air, and GLM-4-9B. They represent our most capable models that are trained with all the insights and lessons gained from the preceding three generations of ChatGLM. To date, the GLM-4 models are pre-trained…
▽ More
We introduce ChatGLM, an evolving family of large language models that we have been developing over time. This report primarily focuses on the GLM-4 language series, which includes GLM-4, GLM-4-Air, and GLM-4-9B. They represent our most capable models that are trained with all the insights and lessons gained from the preceding three generations of ChatGLM. To date, the GLM-4 models are pre-trained on ten trillions of tokens mostly in Chinese and English, along with a small set of corpus from 24 languages, and aligned primarily for Chinese and English usage. The high-quality alignment is achieved via a multi-stage post-training process, which involves supervised fine-tuning and learning from human feedback. Evaluations show that GLM-4 1) closely rivals or outperforms GPT-4 in terms of general metrics such as MMLU, GSM8K, MATH, BBH, GPQA, and HumanEval, 2) gets close to GPT-4-Turbo in instruction following as measured by IFEval, 3) matches GPT-4 Turbo (128K) and Claude 3 for long context tasks, and 4) outperforms GPT-4 in Chinese alignments as measured by AlignBench. The GLM-4 All Tools model is further aligned to understand user intent and autonomously decide when and which tool(s) touse -- including web browser, Python interpreter, text-to-image model, and user-defined functions -- to effectively complete complex tasks. In practical applications, it matches and even surpasses GPT-4 All Tools in tasks like accessing online information via web browsing and solving math problems using Python interpreter. Over the course, we have open-sourced a series of models, including ChatGLM-6B (three generations), GLM-4-9B (128K, 1M), GLM-4V-9B, WebGLM, and CodeGeeX, attracting over 10 million downloads on Hugging face in the year 2023 alone. The open models can be accessed through https://github.com/THUDM and https://huggingface.co/THUDM.
△ Less
Submitted 18 June, 2024;
originally announced June 2024.
-
Diverse Responses in Lattice Thermal Conductivity of $n$-type/$p$-type Semiconductors Driven by Asymmetric Electron-Phonon Interactions
Authors:
Jianshi Sun,
Shouhang Li,
Zhen Tong,
Cheng Shao,
Han Xie,
Meng An,
Chuang Zhang,
Xiongfei Zhu,
Chen Huang,
Yucheng Xiong,
Xiangjun Liu
Abstract:
Accurately assessing the impact of electron-phonon interaction (EPI) on the lattice thermal conductivity of semiconductors is crucial for the thermal management of electronic devices and a unified physical understanding of this issue is highly desired. In this work, we predict the lattice thermal conductivities of typical direct and indirect bandgap semiconductors accounting for EPI based on mode-…
▽ More
Accurately assessing the impact of electron-phonon interaction (EPI) on the lattice thermal conductivity of semiconductors is crucial for the thermal management of electronic devices and a unified physical understanding of this issue is highly desired. In this work, we predict the lattice thermal conductivities of typical direct and indirect bandgap semiconductors accounting for EPI based on mode-level first-principles calculations. It is found that EPI has a larger effect on the lattice thermal conductivity of $p$-type doping compared to $n$-type doping in the same semiconductor at high charge carrier concentrations. The stronger EPI in $p$-type doping is attributed to the relatively higher electron density of states caused by the relatively larger $p$-orbital component. Furthermore, EPI has a stronger influence on the lattice thermal conductivity of $n$-type indirect bandgap semiconductors than $n$-type direct bandgap semiconductors. This is attributed to the relatively lower electron density of states in direct bandgap semiconductors stemming from the $s$-orbital component. This work reveals that there exist diverse responses in lattice thermal conductivity of $n$-type/$p$-type semiconductors, which can be attributed to asymmetric EPIs.
△ Less
Submitted 17 June, 2024;
originally announced June 2024.
-
Precision measurement of the $Ξ^-_b$ baryon lifetime
Authors:
LHCb collaboration,
R. Aaij,
A. S. W. Abdelmotteleb,
C. Abellan Beteta,
F. Abudinén,
T. Ackernley,
A. A. Adefisoye,
B. Adeva,
M. Adinolfi,
P. Adlarson,
C. Agapopoulou,
C. A. Aidala,
Z. Ajaltouni,
S. Akar,
K. Akiba,
P. Albicocco,
J. Albrecht,
F. Alessio,
M. Alexander,
Z. Aliouche,
P. Alvarez Cartelle,
R. Amalric,
S. Amato,
J. L. Amey,
Y. Amhis
, et al. (1064 additional authors not shown)
Abstract:
A sample of $pp$ collision data, corresponding to an integrated luminosity of 5.5 fb$^{-1}$ and collected by the LHCb experiment during Run 2, is used to measure the ratio of the lifetime of the $Ξ^-_b$ baryon to that of the $Λ^0_b$ baryon, $r_τ\equivτ_{Ξ^-_b}/τ_{Λ^0_b}$. The value ${r_τ^{\rm Run\,2}=1.076\pm0.013\pm0.006}$ is obtained, where the first uncertainty is statistical and the second sys…
▽ More
A sample of $pp$ collision data, corresponding to an integrated luminosity of 5.5 fb$^{-1}$ and collected by the LHCb experiment during Run 2, is used to measure the ratio of the lifetime of the $Ξ^-_b$ baryon to that of the $Λ^0_b$ baryon, $r_τ\equivτ_{Ξ^-_b}/τ_{Λ^0_b}$. The value ${r_τ^{\rm Run\,2}=1.076\pm0.013\pm0.006}$ is obtained, where the first uncertainty is statistical and the second systematic. This value is averaged with the corresponding value from Run 1 to obtain ${r_τ^{\rm Run\,1,2} = 1.078\pm0.012\pm0.007}$. Multiplying by the world-average value of the $Λ^0_b$ lifetime yields $τ_{Ξ^-_b}^{\rm Run~1,2} = 1.578\pm0.018\pm0.010\pm0.011$ ps, where the uncertainties are statistical, systematic, and due to the limited knowledge of the $Λ^0_b$ lifetime. This measurement improves the precision of the current world average of the $Ξ^-_b$ lifetime by about a factor of two, and is in good agreement with the most recent theoretical predictions.
△ Less
Submitted 17 June, 2024;
originally announced June 2024.