-
Dismantling Gender Blindness in Online Discussion of a Crime/Gender Dichotomy
Authors:
Yigang Qin,
Weilun Duan,
Qunfang Wu,
Zhicong Lu
Abstract:
Contemporary feminists use social media for activism, but backlash follows. Gender-related discourse is often diminished when public events involving sexism and gender inequality are discussed on social media platforms. The dichotomized debate around the Tangshan beating incident in China epitomized how criminal interpretations of gender-related violence became a backlash against feminist expressions. By analyzing posts on Weibo using mixed methods, we describe the emerging discursive patterns around crime and gender, uncovering the inherent gender-blind sexism that refutes feminist discourses on the social platform. We also highlight the critical restrictions facing grassroots feminist activism in Chinese cyberspace and propose implications for design and research related to digital feminist activism.
Submitted 3 March, 2024;
originally announced March 2024.
-
Asymmetric Feature Fusion for Image Retrieval
Authors:
Hui Wu,
Min Wang,
Wengang Zhou,
Zhenbo Lu,
Houqiang Li
Abstract:
In asymmetric retrieval systems, models with different capacities are deployed on platforms with different computational and storage resources. Despite great progress, existing approaches still face a trade-off between retrieval efficiency and asymmetric accuracy due to the limited capacity of the lightweight query model. In this work, we propose an Asymmetric Feature Fusion (AFF) paradigm, which advances existing asymmetric retrieval systems by exploiting the complementarity among different features on the gallery side only. Specifically, it first embeds each gallery image into various features, e.g., local features and global features. Then, a dynamic mixer is introduced to aggregate these features into a compact embedding for efficient search. On the query side, only a single lightweight model is deployed for feature extraction. The query model and dynamic mixer are jointly trained by sharing a momentum-updated classifier. Notably, the proposed paradigm boosts the accuracy of asymmetric retrieval without introducing any extra overhead on the query side. Extensive experiments on various landmark retrieval datasets demonstrate the superiority of our paradigm.
Submitted 1 March, 2024;
originally announced March 2024.
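The abstract above mentions a "momentum-updated classifier" without giving its update rule. As a minimal sketch only, the snippet below shows the exponential-moving-average form that momentum updates commonly take; the function name `momentum_update`, the flat list-of-floats parameter layout, and the momentum values are illustrative assumptions, not the authors' implementation.

```python
def momentum_update(shadow, live, momentum=0.999):
    """Move the slowly changing `shadow` weights toward the `live` weights.

    Assumed rule (a common form, not taken from the paper):
        shadow <- momentum * shadow + (1 - momentum) * live
    """
    if len(shadow) != len(live):
        raise ValueError("parameter lists must have the same length")
    return [momentum * s + (1.0 - momentum) * w for s, w in zip(shadow, live)]


# Toy usage: with momentum 0.5 the first weight closes half of the gap each step.
shadow = [0.0, 1.0]
live = [1.0, 1.0]
for _ in range(3):
    shadow = momentum_update(shadow, live, momentum=0.5)
print(shadow)  # [0.875, 1.0]
```

A momentum close to 1 keeps the shared classifier nearly fixed between steps, which is the usual motivation for this kind of update; whether the paper uses this exact rule is not stated in the abstract.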
-
Metamorpheus: Interactive, Affective, and Creative Dream Narration Through Metaphorical Visual Storytelling
Authors:
Qian Wan,
Xin Feng,
Yining Bei,
Zhiqi Gao,
Zhicong Lu
Abstract:
Human emotions are essentially molded by lived experiences, from which we construct personalised meaning. Engagement in such a meaning-making process has been practiced as an intervention in various psychotherapies to promote wellness. Nevertheless, support for recollecting and recounting lived experiences in everyday life remains underexplored in HCI. It also remains unknown how technologies such as generative AI models can facilitate the meaning-making process and ultimately support affective mindfulness. In this paper we present Metamorpheus, an affective interface that engages users in creative visual storytelling of emotional experiences during dreams. Metamorpheus arranges the storyline based on a dream's emotional arc, and provokes self-reflection through the creation of metaphorical images and text depictions. The system provides metaphor suggestions, and generates visual metaphors and text depictions using generative AI models, while users can apply generations to recolour and re-arrange the interface to be visually affective. Our experience-centred evaluation shows that, by interacting with Metamorpheus, users can recall their dreams in vivid detail, through which they relive and reflect upon their experiences in a meaningful way.
Submitted 1 March, 2024;
originally announced March 2024.
-
"There is a Job Prepared for Me Here": Understanding How Short Video and Live-streaming Platforms Empower Ageing Job Seekers in China
Authors:
PiaoHong Wang,
Siying Hu,
Bo Wen,
Zhicong Lu
Abstract:
In recent years, the global unemployment rate has remained persistently high. Compounding this issue, the ageing population in China often encounters additional challenges in finding employment due to prevalent age discrimination in daily life. However, with the advent of social media, there has been a rise in the popularity of short videos and live-streams for recruiting ageing workers. To better understand the motivations of ageing job seekers to engage with these video-based recruitment methods and to explore the extent to which such platforms can empower them, we conducted an interview-based study with ageing job seekers who have had exposure to these short recruitment videos and live-streaming channels. Our findings reveal that these platforms can provide a job-seeking choice that is particularly friendly to ageing job seekers, effectively improving their disadvantaged situation.
Submitted 1 March, 2024;
originally announced March 2024.
-
Cooperatively Modulating Magnetic Anisotropy and Colossal Magnetoresistance via Atomic-Scale Buffer Layers in Highly Strained La0.7Sr0.3MnO3 Films
Authors:
Sheng Li,
Zengxing Lu,
Bin Lao,
Xuan Zheng,
Guoxin Chen,
Run-Wei Li,
Zhiming Wang
Abstract:
Simultaneous control of magnetic anisotropy and magnetoresistance, especially with atomic scale precision, remains a pivotal challenge for realizing advanced spintronic functionalities. Here we demonstrate cooperative continuous control over both magnetoresistance and magnetic anisotropy in highly strained La0.7Sr0.3MnO3 (LSMO) thin films. By inserting varying perovskite buffer layers, compressively strained LSMO films transition from a ferromagnetic insulator with out-of-plane magnetic anisotropy to a metallic state with in-plane anisotropy. Atomic-scale buffer layer insertion enables remarkably acute, precise control to sharply modulate this magnetic phase transformation. A gigantic 10,000% modulation of the colossal magnetoresistance (CMR) and an exceptionally sharp transition from out-of-plane to in-plane magnetic anisotropy are attained in just a few contiguous layers. These atomic-scale correlations among electronic, magnetic, and structural order parameters yield flexible multifunctional control promising for next-generation oxide spintronics.
Submitted 1 March, 2024;
originally announced March 2024.
-
#PoetsOfInstagram: Navigating The Practices And Challenges Of Novice Poets On Instagram
Authors:
Ankolika De,
Zhicong Lu
Abstract:
Commencing as a photo-sharing platform, Instagram has since become multifaceted, accommodating diverse art forms, with poetry emerging as a prominent one. However, the academic understanding of Instagram's poetry community is limited, yet its significance emerges from its distinctive utilization of a primarily visual social media platform guided by recommendation algorithms for disseminating poetry, further characterized by a predominantly novice creative population. We employ qualitative analysis to explore motivations, experiences, and algorithmic influence within Instagram's poetry community. We demonstrate that participants prioritize conforming to algorithmic constraints for visibility, yet maintain their community's values of integrity and originality, illustrating the tension between algorithmic growth and participant authenticity. We introduce the concept of Algorithmically Mediated Creative Labor, a phenomenon specific to non-monetizing creative users who are impacted by the prioritization of professional creators and continually adapt their creative endeavors to align with platform logic, thereby affecting their motivation and creative outputs.
Submitted 29 February, 2024;
originally announced February 2024.
-
Seeking Soulmate via Voice: Understanding Promises and Challenges of Online Synchronized Voice-Based Mobile Dating
Authors:
Chenxinran Shen,
Yan Xu,
Ray LC,
Zhicong Lu
Abstract:
Online dating has become a popular way for individuals to connect with potential romantic partners. Many dating apps use personal profiles that include a headshot and self-description, allowing users to present themselves and search for compatible matches. However, this traditional model often has limitations. In this study, we explore a non-traditional voice-based dating app called "Soul". Unlike traditional platforms that rely heavily on profile information, Soul facilitates user interactions through voice-based communication. We conducted semi-structured interviews with 18 dedicated Soul users to investigate how they engage with the platform and perceive themselves and others in this unique dating environment. Our findings indicate that the role of voice as a moderator influences impression management and shapes perceptions between the sender and the receiver of the voice. Additionally, the synchronous voice-based and community-based dating model offers benefits to users in the Chinese cultural context. Our study contributes to understanding the affordances introduced by voice-based interactions in online dating in China.
Submitted 29 February, 2024;
originally announced February 2024.
-
RL-GPT: Integrating Reinforcement Learning and Code-as-policy
Authors:
Shaoteng Liu,
Haoqi Yuan,
Minda Hu,
Yanwei Li,
Yukang Chen,
Shu Liu,
Zongqing Lu,
Jiaya Jia
Abstract:
Large Language Models (LLMs) have demonstrated proficiency in utilizing various tools by coding, yet they face limitations in handling intricate logic and precise control. In embodied tasks, high-level planning is amenable to direct coding, while low-level actions often necessitate task-specific refinement, such as Reinforcement Learning (RL). To seamlessly integrate both modalities, we introduce a two-level hierarchical framework, RL-GPT, comprising a slow agent and a fast agent. The slow agent analyzes actions suitable for coding, while the fast agent executes coding tasks. This decomposition effectively focuses each agent on specific tasks, proving highly efficient within our pipeline. Our approach outperforms traditional RL methods and existing GPT agents, demonstrating superior efficiency. In the Minecraft game, it rapidly obtains diamonds within a single day on an RTX3090. Additionally, it achieves SOTA performance across all designated MineDojo tasks.
Submitted 29 February, 2024;
originally announced February 2024.
-
Miniaturized on-chip spectrometer enabled by electrochromic modulation
Authors:
Menghan Tian,
Baolei Liu,
Zelin Lu,
Yao Wang,
Ze Zheng,
Jiaqi Song,
Xiaolan Zhong,
Fan Wang
Abstract:
Miniaturized on-chip spectrometers with small footprints, light weight, and low cost are in great demand for portable optical sensing, lab-on-chip systems, and so on. Such miniaturized spectrometers are usually based on engineered spectral response units, with unknown spectra then reconstructed by algorithms. However, due to the limited footprints of computational on-chip spectrometers, the recovered spectral resolution is limited by the number of integrated spectral response units/filters. Thus, it is challenging to improve the spectral resolution without increasing the number of filters used. Here we present a computational on-chip spectrometer using electrochromic filters that can be electrochemically modulated to increase the effective sampling number for higher spectral resolution. These filters are directly integrated on top of the photodetector pixels, and the spectral modulation of the filters results from redox reactions during the dual injection of ions and electrons into the electrochromic material. We experimentally demonstrate that the spectral resolution of the proposed spectrometer can be effectively improved as the number of applied voltages increases. The average difference of the peak wavelengths between the reconstructed and the reference spectra decreases from 14.48 nm to 2.57 nm. We also demonstrate that the proposed spectrometer can work with only four or two filter units, assisted by electrochromic modulation. This strategy suggests a new way to enhance the performance of miniaturized spectrometers with tunable spectral filters for high-resolution, low-cost, and portable spectral sensing, and may also inspire the exploration of other stimulus responses, such as photochromic and force-chromic responses, in computational spectrometers.
Submitted 29 February, 2024;
originally announced February 2024.
-
Gluon GTMDs at nonzero skewness and impact parameter dependent parton distributions
Authors:
Chentao Tan,
Zhun Lu
Abstract:
We investigate the leading twist generalized transverse momentum dependent parton distributions (GTMDs) of the unpolarized and longitudinally polarized gluons in the nucleon. We adopt a light-front gluon-triquark model for the nucleon motivated by soft-wall AdS/QCD. The gluon GTMDs are defined through the off-forward gluon-gluon generalized correlator and are expressed as the overlap of light-cone wave functions. The GTMDs can be employed to provide the generalized parton distributions (GPDs) by integrating out the transverse momentum. The Fourier transform of the GPDs encodes the parton distributions in the transverse position space, namely, the impact parameter dependent parton distributions (IPDs). We also calculate the three gluon IPDs corresponding to the GPDs $H^g$, $E^g$ and $\widetilde{H}^g$, and present their dependence on $x$ and $b_\perp$, respectively.
Submitted 26 February, 2024;
originally announced February 2024.
-
Spatial Distribution of Inertial Particles in Turbulent Taylor-Couette Flow
Authors:
Hao Jiang,
Zhi-ming Lu,
Bo-fu Wang,
Xiao-hui Meng,
Jie Shen,
Kai Leong Chong
Abstract:
This study investigates the spatial distribution of inertial particles in turbulent Taylor-Couette flow. Direct numerical simulations are performed using a one-way coupled Eulerian-Lagrangian approach, with a fixed inner wall Reynolds number of 2500 for the carrier flow, while the particle Stokes number $St$ varies from 0.034 to 1 for the dispersed phase. We first examine the preferential concentration of particles near the outer wall region. Employing two-dimensional (2D) Voronoi analysis, we observe pronounced particle clustering with increasing $St$, particularly evident in regions of low fluid velocity. Additionally, we investigate the concentration balance equation, inspired by the work of Johnson et al. (2020), to examine the particle radial distribution. We discern the predominant sources of influence, namely biased sampling, turbophoresis, and centrifugal effects. Across all cases, centrifugal force emerges as the primary driver, causing particle migration towards the outer wall. Biased sampling predominantly affects smaller inertial particles, driving them towards the inner wall due to sampling within Taylor rolls with inward radial velocity. Conversely, turbophoresis primarily impacts larger inertial particles, inducing migration towards both walls, where turbulent intensity is weaker than in the bulk. With the revealed physics, our work provides a basis for predicting and controlling particle movement and distribution in industrial applications.
Submitted 26 February, 2024;
originally announced February 2024.
-
Direct Detection of Dark Photon Dark Matter with the James Webb Space Telescope
Authors:
Haipeng An,
Shuailiang Ge,
Jia Liu,
Zhiyao Lu
Abstract:
In this study, we propose an investigation into dark photon dark matter (DPDM) within the infrared frequency band, utilizing highly sensitive infrared light detectors commonly integrated into space telescopes, such as the James Webb Space Telescope (JWST). The presence of DPDM induces electron oscillations in the reflector of these detectors. Consequently, these oscillating electrons can emit monochromatic electromagnetic waves with a frequency almost equivalent to the mass of DPDM. By employing the stationary phase approximation, we can demonstrate that when the size of the reflector significantly exceeds the wavelength of the electromagnetic wave, the contribution to the electromagnetic wave field at a given position primarily stems from the surface unit perpendicular to the relative position vector. This simplification results in the reduction of electromagnetic wave calculations to ray optics. By applying this concept to JWST, our analysis of observational data demonstrates the potential to establish constraints on the kinetic mixing between the photon and dark photon within the range [10, 500] THz. Despite JWST not being optimized for DPDM searches, our findings reveal constraints comparable to those obtained from the XENON1T experiment in the laboratory, as well as astrophysical constraints from solar emission. Additionally, we explore strategies to optimize future experiments specifically designed for DPDM searches.
Submitted 26 February, 2024;
originally announced February 2024.
-
MathGenie: Generating Synthetic Data with Question Back-translation for Enhancing Mathematical Reasoning of LLMs
Authors:
Zimu Lu,
Aojun Zhou,
Houxing Ren,
Ke Wang,
Weikang Shi,
Junting Pan,
Mingjie Zhan,
Hongsheng Li
Abstract:
Large language models (LLMs) have exhibited great potential in mathematical reasoning. However, there remains a performance gap in this area between existing open-source models and closed-source models such as GPT-4. In this paper, we introduce MathGenie, a novel method for generating diverse and reliable math problems from a small-scale problem-solution dataset (denoted as seed data). We augment the ground-truth solutions of our seed data and train a back-translation model to translate the augmented solutions back into new questions. Subsequently, we generate code-integrated solutions for the new questions. To ensure the correctness of the code-integrated solutions, we employ a rationale-based strategy for solution verification. Various pretrained models, ranging from 7B to 70B, are trained on the newly curated data to test the effectiveness of the proposed augmentation technique, resulting in a family of models known as MathGenieLM. These models consistently outperform previous open-source models across five representative mathematical reasoning datasets, achieving state-of-the-art performance. In particular, MathGenieLM-InternLM2 achieves an accuracy of 87.7% on GSM8K and 55.7% on MATH, securing the best overall score among open-source language models.
Submitted 26 February, 2024;
originally announced February 2024.
-
Ultrafast and precise distance measurement via real-time chirped pulse interferometry
Authors:
Mingyang Xu,
Hanzhong Wu,
Jiawen Zhi,
Yang Liu,
Jie Zhang,
Zehuang Lu,
Chenggang Shao
Abstract:
Laser frequency combs, which are composed of a series of equally spaced coherent frequency components, have triggered revolutionary progress in precision spectroscopy and optical metrology. Length/distance is of fundamental importance in both science and technology. In this work, we describe a ranging scheme based on chirped pulse interferometry. In contrast to traditional spectral interferometry, the local oscillator is strongly chirped so that it can meet the measurement pulses at arbitrary distances, and therefore the dead zones can be removed. The distances can be precisely determined via two measurement steps based on the time-of-flight method and synthetic wavelength interferometry, respectively. To overcome the speed limitation of the optical spectrum analyzer, the spectrograms are stretched and detected by a fast photodetector and oscilloscope, and consequently mapped into the time domain in real time. The experimental results indicate that the measurement uncertainty can be well within 2 $\mu$m, compared with the reference distance meter. The Allan deviation can reach 0.4 $\mu$m at an averaging time of 4 ns, 25 nm at 1 $\mu$s, and 2 nm at an averaging time of 100 $\mu$s. We also measure a spinning disk with grooves of different depths to verify the measurement speed, and the results show that grooves moving at a line speed of about 150 m/s can be clearly captured. Our method provides a unique combination of no dead zones, ultrafast measurement speed, high precision and accuracy, and a large ambiguity range, using only a single comb source. This system could offer a powerful solution for field measurements in practical applications in the future.
Submitted 25 February, 2024;
originally announced February 2024.
-
Detailed Report on the Measurement of the Positive Muon Anomalous Magnetic Moment to 0.20 ppm
Authors:
D. P. Aguillard,
T. Albahri,
D. Allspach,
A. Anisenkov,
K. Badgley,
S. Baeßler,
I. Bailey,
L. Bailey,
V. A. Baranov,
E. Barlas-Yucel,
T. Barrett,
E. Barzi,
F. Bedeschi,
M. Berz,
M. Bhattacharya,
H. P. Binney,
P. Bloom,
J. Bono,
E. Bottalico,
T. Bowcock,
S. Braun,
M. Bressler,
G. Cantatore,
R. M. Carey,
B. C. K. Casey
, et al. (168 additional authors not shown)
Abstract:
We present details on a new measurement of the muon magnetic anomaly, $a_\mu = (g_\mu-2)/2$. The result is based on positive muon data taken at Fermilab's Muon Campus during the 2019 and 2020 accelerator runs. The measurement uses $3.1$ GeV$/c$ polarized muons stored in a $7.1$-m-radius storage ring with a $1.45$ T uniform magnetic field. The value of $a_\mu$ is determined from the measured difference between the muon spin precession frequency and its cyclotron frequency. This difference is normalized to the strength of the magnetic field, measured using Nuclear Magnetic Resonance (NMR). The ratio is then corrected for small contributions from beam motion, beam dispersion, and transient magnetic fields. We measure $a_\mu = 116 592 057 (25) \times 10^{-11}$ (0.21 ppm). This is the world's most precise measurement of this quantity and represents a factor of $2.2$ improvement over our previous result based on the 2018 dataset. In combination, the two datasets yield $a_\mu(\text{FNAL}) = 116 592 055 (24) \times 10^{-11}$ (0.20 ppm). Combining this with the measurements from Brookhaven National Laboratory for both positive and negative muons, the new world average is $a_\mu(\text{exp}) = 116 592 059 (22) \times 10^{-11}$ (0.19 ppm).
Submitted 22 May, 2024; v1 submitted 23 February, 2024;
originally announced February 2024.
-
Updated kinematics of the Radcliffe Wave: non-synchronous, dipole-like vertical oscillations
Authors:
Zhi-Kai Zhu,
Min Fang,
Zu-Jia Lu,
Junzhi Wang,
Guang-Xing Li,
Shiyu Zhang,
Veli-Matti Pelkonen,
Paolo Padoan,
En-Wei Liang
Abstract:
The kinematic structure of the Radcliffe Wave (RW) is crucial for understanding its origin and evolution. In this work, we present an accurate measurement of the vertical velocity $V_Z$ in which radial velocity (RV) measurements are taken into account. This is achieved in two ways. First, the velocities are measured towards Young Stellar Objects (YSOs), using their RV and proper motion measurements from APOGEE-2 and Gaia DR3. Second, we combine RV measurements toward clouds with proper motion measurements of associated YSOs to determine the vertical velocities of the clouds. The results reveal that the oscillations in $V_Z$ are not synchronous with the vertical coordinate. The difference is caused by a combination of the effect of the radial velocity, which we include in this paper, and the difference in models. By supplementing our analysis with additional young star samples, we find a consistent dipole pattern in $V_Z$. The fact that no significant amplitude differences are found among the analyzed samples indicates that there is no apparent age gradient within the dipole. We propose that the RW evolves at a relatively slow rate. The fact that the RW will take much longer to complete a full period than the cloud lifetimes challenges its classification as a traditional "wave". This age discrepancy should explain the phase difference and the non-synchronous oscillation found in kinematic studies.
Submitted 23 February, 2024;
originally announced February 2024.
-
Measuring Multimodal Mathematical Reasoning with MATH-Vision Dataset
Authors:
Ke Wang,
Junting Pan,
Weikang Shi,
Zimu Lu,
Mingjie Zhan,
Hongsheng Li
Abstract:
Recent advancements in Large Multimodal Models (LMMs) have shown promising results in mathematical reasoning within visual contexts, with models approaching human-level performance on existing benchmarks such as MathVista. However, we observe significant limitations in the diversity of questions and breadth of subjects covered by these benchmarks. To address this issue, we present the MATH-Vision (MATH-V) dataset, a meticulously curated collection of 3,040 high-quality mathematical problems with visual contexts sourced from real math competitions. Spanning 16 distinct mathematical disciplines and graded across 5 levels of difficulty, our dataset provides a comprehensive and diverse set of challenges for evaluating the mathematical reasoning abilities of LMMs. Through extensive experimentation, we unveil a notable performance gap between current LMMs and human performance on MATH-V, underscoring the imperative for further advancements in LMMs. Moreover, our detailed categorization allows for a thorough error analysis of LMMs, offering valuable insights to guide future research and development. The project is available at https://mathvision-cuhk.github.io
Submitted 22 February, 2024;
originally announced February 2024.
-
Rethinking Scientific Summarization Evaluation: Grounding Explainable Metrics on Facet-aware Benchmark
Authors:
Xiuying Chen,
Tairan Wang,
Qingqing Zhu,
Taicheng Guo,
Shen Gao,
Zhiyong Lu,
Xin Gao,
Xiangliang Zhang
Abstract:
The summarization capabilities of pretrained and large language models (LLMs) have been widely validated in general areas, but their use on scientific corpora, which involve complex sentences and specialized knowledge, has been less assessed. This paper presents conceptual and experimental analyses of scientific summarization, highlighting the inadequacies of traditional evaluation methods, such as $n$-gram, embedding comparison, and QA, particularly in providing explanations, grasping scientific concepts, or identifying key content. Subsequently, we introduce the Facet-aware Metric (FM), employing LLMs for advanced semantic matching to evaluate summaries based on different aspects. This facet-aware approach offers a thorough evaluation of abstracts by decomposing the evaluation task into simpler subtasks. Recognizing the absence of an evaluation benchmark in this domain, we curate a Facet-based scientific summarization Dataset (FD) with facet-level annotations. Our findings confirm that FM offers a more logical approach to evaluating scientific summaries. In addition, fine-tuned smaller models can compete with LLMs in scientific contexts, while LLMs have limitations in learning from in-context information in scientific domains. This suggests an area for future enhancement of LLMs.
Submitted 22 February, 2024;
originally announced February 2024.
-
AgentMD: Empowering Language Agents for Risk Prediction with Large-Scale Clinical Tool Learning
Authors:
Qiao Jin,
Zhizheng Wang,
Yifan Yang,
Qingqing Zhu,
Donald Wright,
Thomas Huang,
W John Wilbur,
Zhe He,
Andrew Taylor,
Qingyu Chen,
Zhiyong Lu
Abstract:
Clinical calculators play a vital role in healthcare by offering accurate evidence-based predictions for various purposes such as prognosis. Nevertheless, their widespread utilization is frequently hindered by usability challenges, poor dissemination, and restricted functionality. Augmenting large language models with extensive collections of clinical calculators presents an opportunity to overcome these obstacles and improve workflow efficiency, but the scalability of the manual curation process poses a significant challenge. In response, we introduce AgentMD, a novel language agent capable of curating and applying clinical calculators across various clinical contexts. Using the published literature, AgentMD has automatically curated a collection of 2,164 diverse clinical calculators with executable functions and structured documentation, collectively named RiskCalcs. Manual evaluations show that RiskCalcs tools achieve an accuracy of over 80% on three quality metrics. At inference time, AgentMD can automatically select and apply the relevant RiskCalcs tools given any patient description. On the newly established RiskQA benchmark, AgentMD significantly outperforms chain-of-thought prompting with GPT-4 (87.7% vs. 40.9% in accuracy). Additionally, we also applied AgentMD to real-world clinical notes for analyzing both population-level and risk-level patient characteristics. In summary, our study illustrates the utility of language agents augmented with clinical calculators for healthcare analytics and patient care.
Submitted 20 February, 2024;
originally announced February 2024.
-
Benchmarking Retrieval-Augmented Generation for Medicine
Authors:
Guangzhi Xiong,
Qiao Jin,
Zhiyong Lu,
Aidong Zhang
Abstract:
While large language models (LLMs) have achieved state-of-the-art performance on a wide range of medical question answering (QA) tasks, they still face challenges with hallucinations and outdated knowledge. Retrieval-augmented generation (RAG) is a promising solution and has been widely adopted. However, a RAG system can involve multiple flexible components, and there is a lack of best practices regarding the optimal RAG setting for various medical purposes. To systematically evaluate such systems, we propose the Medical Information Retrieval-Augmented Generation Evaluation (MIRAGE), a first-of-its-kind benchmark including 7,663 questions from five medical QA datasets. Using MIRAGE, we conducted large-scale experiments with over 1.8 trillion prompt tokens on 41 combinations of different corpora, retrievers, and backbone LLMs through the MedRAG toolkit introduced in this work. Overall, MedRAG improves the accuracy of six different LLMs by up to 18% over chain-of-thought prompting, elevating the performance of GPT-3.5 and Mixtral to GPT-4-level. Our results show that the combination of various medical corpora and retrievers achieves the best performance. In addition, we discovered a log-linear scaling property and the "lost-in-the-middle" effects in medical RAG. We believe our comprehensive evaluations can serve as practical guidelines for implementing RAG systems for medicine.
Submitted 23 February, 2024; v1 submitted 20 February, 2024;
originally announced February 2024.
-
FGAD: Self-boosted Knowledge Distillation for An Effective Federated Graph Anomaly Detection Framework
Authors:
Jinyu Cai,
Yunhe Zhang,
Zhoumin Lu,
Wenzhong Guo,
See-kiong Ng
Abstract:
Graph anomaly detection (GAD) aims to identify anomalous graphs that significantly deviate from other ones, which has raised growing attention due to the broad existence and complexity of graph-structured data in many real-world scenarios. However, existing GAD methods usually execute with centralized training, which may lead to privacy leakage risk in some sensitive cases, thereby impeding collaboration among organizations seeking to collectively develop robust GAD models. Although federated learning offers a promising solution, the prevalent non-IID problems and high communication costs present significant challenges, particularly pronounced in collaborations with graph data distributed among different participants. To tackle these challenges, we propose an effective federated graph anomaly detection framework (FGAD). We first introduce an anomaly generator to perturb the normal graphs to be anomalous, and train a powerful anomaly detector by distinguishing generated anomalous graphs from normal ones. Then, we leverage a student model to distill knowledge from the trained anomaly detector (teacher model), which aims to maintain the personality of local models and alleviate the adverse impact of non-IID problems. Moreover, we design an effective collaborative learning mechanism that facilitates the personalization preservation of local models and significantly reduces communication costs among clients. Empirical results of the GAD tasks on non-IID graphs compared with state-of-the-art baselines demonstrate the superiority and efficiency of the proposed FGAD method.
Submitted 20 February, 2024;
originally announced February 2024.
-
Locality-Sensitive Hashing-Based Efficient Point Transformer with Applications in High-Energy Physics
Authors:
Siqi Miao,
Zhiyuan Lu,
Mia Liu,
Javier Duarte,
Pan Li
Abstract:
This study introduces a novel transformer model optimized for large-scale point cloud processing in scientific domains such as high-energy physics (HEP) and astrophysics. Addressing the limitations of graph neural networks and standard transformers, our model integrates local inductive bias and achieves near-linear complexity with hardware-friendly regular operations. One contribution of this work is the quantitative analysis of the error-complexity tradeoff of various sparsification techniques for building efficient transformers. Our findings highlight the superiority of using locality-sensitive hashing (LSH), especially OR & AND-construction LSH, in kernel approximation for large-scale point cloud data with local inductive bias. Based on this finding, we propose LSH-based Efficient Point Transformer (HEPT), which combines E$^2$LSH with OR & AND constructions and is built upon regular computations. HEPT demonstrates remarkable performance on two critical yet time-consuming HEP tasks, significantly outperforming existing GNNs and transformers in accuracy and computational speed, marking a significant advancement in geometric deep learning and large-scale scientific data processing. Our code is available at https://github.com/Graph-COM/HEPT.
Submitted 5 June, 2024; v1 submitted 19 February, 2024;
originally announced February 2024.
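The abstract above names E$^2$LSH with OR & AND constructions but gives no implementation details. As a rough illustration only, the following sketch shows the textbook form of these ideas: an E2LSH hash floor((a.x + b) / r), an AND-construction that concatenates several hashes into one bucket key, and an OR-construction that treats two points as neighbors if they share a bucket in any table. All function names and parameters (make_tables, n_and, n_or, r) are hypothetical, and this is not the HEPT code from the linked repository.

```python
import math
import random


def make_e2lsh(dim, r, rng):
    """Return one E2LSH hash h(x) = floor((a.x + b) / r).

    a is drawn from a standard Gaussian and b uniformly from [0, r].
    """
    a = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    b = rng.uniform(0.0, r)

    def h(x):
        projection = sum(ai * xi for ai, xi in zip(a, x))
        return math.floor((projection + b) / r)

    return h


def make_tables(dim, r, n_and, n_or, seed=0):
    """Build n_or hash tables, each made of n_and independent hashes."""
    rng = random.Random(seed)
    return [[make_e2lsh(dim, r, rng) for _ in range(n_and)] for _ in range(n_or)]


def bucket_keys(tables, x):
    """AND-construction: within a table, the key is the tuple of all hashes."""
    return [tuple(h(x) for h in table) for table in tables]


def collide(tables, x, y):
    """OR-construction: x and y are candidates if any table puts them together."""
    keys_x = bucket_keys(tables, x)
    keys_y = bucket_keys(tables, y)
    return any(kx == ky for kx, ky in zip(keys_x, keys_y))
```

In this form, raising n_and makes each table more selective (fewer false neighbors), while raising n_or recovers neighbors that a single table would miss; the trade-off between the two is what an error-complexity analysis would tune.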
-
FiT: Flexible Vision Transformer for Diffusion Model
Authors:
Zeyu Lu,
Zidong Wang,
Di Huang,
Chengyue Wu,
Xihui Liu,
Wanli Ouyang,
Lei Bai
Abstract:
Nature is infinitely resolution-free. In the context of this reality, existing diffusion models, such as Diffusion Transformers, often face challenges when processing image resolutions outside of their trained domain. To overcome this limitation, we present the Flexible Vision Transformer (FiT), a transformer architecture specifically designed for generating images with unrestricted resolutions and aspect ratios. Unlike traditional methods that perceive images as static-resolution grids, FiT conceptualizes images as sequences of dynamically-sized tokens. This perspective enables a flexible training strategy that effortlessly adapts to diverse aspect ratios during both training and inference phases, thus promoting resolution generalization and eliminating biases induced by image cropping. Enhanced by a meticulously adjusted network structure and the integration of training-free extrapolation techniques, FiT exhibits remarkable flexibility in resolution extrapolation generation. Comprehensive experiments demonstrate the exceptional performance of FiT across a broad range of resolutions, showcasing its effectiveness both within and beyond its training resolution distribution. Repository available at https://github.com/whlzy/FiT.
Submitted 19 February, 2024;
originally announced February 2024.
-
Thermal Stress Analysis of the LNG Corrugated Cryogenic Hose During Gas Pre-Cooling Process
Authors:
Miaoer Liu,
Fangqiu Li,
Hao Cheng,
Endao Li,
Jun Yan,
Hailong Lu,
Yufeng Bu,
Tingting Tang,
Zhaokuan Lu
Abstract:
In this study, thermal-fluid-solid coupled simulations on the gas-phase pre-cooling operation of the corrugated cryogenic hoses were performed. Attention was focused on the temporal evolution and spatial distribution of transient thermal stress in the hose structure caused by convective heat transfer of the cooling medium, Liquefied Natural Gas Boil-Off Gas (BOG). The effects of different corrugated hose parameters, i.e., boundary conditions, hose lengths, BOG inlet flow rates, and corrugation shapes (C-type and U-type), on the transient thermal stress behavior were thoroughly assessed. The thermal stress developed at different locations of the corrugated hoses with these parameters is found to be governed by two major factors: the boundary constraint and local temperature gradient. The objective of this study is to offer practical insights for the structural strength design of corrugated cryogenic hoses and effective pre-cooling strategies, aiming to mitigate structural safety risks caused by excessive thermal stress.
Submitted 19 February, 2024;
originally announced February 2024.
-
Direct Large Language Model Alignment Through Self-Rewarding Contrastive Prompt Distillation
Authors:
Aiwei Liu,
Haoping Bai,
Zhiyun Lu,
Xiang Kong,
Simon Wang,
Jiulong Shan,
Meng Cao,
Lijie Wen
Abstract:
Aligning large language models (LLMs) with human expectations without human-annotated preference data is an important problem. In this paper, we propose a method to evaluate the response preference by using the output probabilities of response pairs under contrastive prompt pairs, which could achieve better performance on LLaMA2-7B and LLaMA2-13B compared to RLAIF. Based on this, we propose an automatic alignment method, Direct Large Model Alignment (DLMA). First, we use contrastive prompt pairs to automatically generate preference data. Then, we continue to evaluate the generated preference data using contrastive prompt pairs and calculate a self-rewarding score. Finally, we use the DPO algorithm to effectively align LLMs by combining this self-rewarding score. In the experimental stage, our DLMA method could surpass the RLHF method without relying on human-annotated preference data.
Submitted 19 February, 2024;
originally announced February 2024.
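The abstract above describes scoring a response pair by its output probabilities under a contrastive prompt pair, but it does not state the formula. One plausible reading, offered here purely as an assumption, is a difference of log-probability ratios: how much more the positive prompt favors the first response than the second. The function name and argument names below are hypothetical, and the log-probabilities are assumed to have been computed elsewhere by the language model.

```python
def self_rewarding_score(logp_a1_pos, logp_a1_neg, logp_a2_pos, logp_a2_neg):
    """Assumed contrastive score for preferring response a1 over a2.

    Each argument is the model's log-probability of a response given the
    question plus either the positive or the negative contrastive prompt.
    A positive return value means a1 is relatively more likely than a2
    under the positive prompt.
    """
    shift_a1 = logp_a1_pos - logp_a1_neg
    shift_a2 = logp_a2_pos - logp_a2_neg
    return shift_a1 - shift_a2


# Example with made-up log-probabilities (not results from the paper).
score = self_rewarding_score(-10.0, -14.0, -12.0, -11.0)
preferred = "a1" if score > 0 else "a2"
```

A score of this kind could then be used to label or weight preference pairs before a DPO-style update; how the score enters the DPO objective is not specified in the abstract.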
-
Search for the production of deuterons and antideuterons in $e^+e^-$ annihilation at center-of-mass energies between 4.13 and 4.70 GeV
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
R. Aliberti,
A. Amoroso,
M. R. An,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere,
A. Brueggemann
, et al. (593 additional authors not shown)
Abstract:
Using a data sample of $e^+e^-$ collision data corresponding to an integrated luminosity of 19 fb$^{-1}$ collected with the BESIII detector at the BEPCII collider, we search for the production of deuterons and antideuterons via $e^+e^-\to pp\pi^-\bar{d}+c.c.$ for the first time at center-of-mass energies between 4.13 and 4.70 GeV. No significant signal is observed and the upper limit of the $e^+e^-\to pp\pi^-\bar{d}+c.c.$ cross section is determined to be from 9.0 to 145 fb depending on the center-of-mass energy at the $90\%$ confidence level.
Submitted 17 February, 2024;
originally announced February 2024.
-
Enabling data-driven and bidirectional model development in Verilog-A for photonic devices
Authors:
Dias Azhigulov,
Zeqin Lu,
James Pond,
Lukas Chrostowski,
Sudip Shekhar
Abstract:
We present a method to model photonic components in Verilog-A by introducing bidirectional signaling through a single port. To achieve this, the concept of power waves and scattering parameters from electromagnetism are employed. As a consequence, one can simultaneously transmit forward and backward propagating waves on a single wire while also capturing realistic, measurement-backed response of photonic components in Verilog-A. We demonstrate examples to show the efficacy of the proposed technique in accounting for critical effects in photonic integrated circuits such as Fabry-Perot cavity resonance, reflections to lasers, etc. Our solution makes electronic-photonic co-simulation more intuitive and accurate.
Submitted 3 July, 2024; v1 submitted 15 February, 2024;
originally announced February 2024.
-
Roadmap on Data-Centric Materials Science
Authors:
Stefan Bauer,
Peter Benner,
Tristan Bereau,
Volker Blum,
Mario Boley,
Christian Carbogno,
C. Richard A. Catlow,
Gerhard Dehm,
Sebastian Eibl,
Ralph Ernstorfer,
Ádám Fekete,
Lucas Foppa,
Peter Fratzl,
Christoph Freysoldt,
Baptiste Gault,
Luca M. Ghiringhelli,
Sajal K. Giri,
Anton Gladyshev,
Pawan Goyal,
Jason Hattrick-Simpers,
Lara Kabalan,
Petr Karpov,
Mohammad S. Khorrami,
Christoph Koch,
Sebastian Kokott
, et al. (36 additional authors not shown)
Abstract:
Science is and always has been based on data, but the terms "data-centric" and the "4th paradigm of" materials research indicate a radical change in how information is retrieved, handled and research is performed. It signifies a transformative shift towards managing vast data collections, digital repositories, and innovative data analytics methods. The integration of Artificial Intelligence (AI) and its subset Machine Learning (ML), has become pivotal in addressing all these challenges. This Roadmap on Data-Centric Materials Science explores fundamental concepts and methodologies, illustrating diverse applications in electronic-structure theory, soft matter theory, microstructure research, and experimental techniques like photoemission, atom probe tomography, and electron microscopy. While the roadmap delves into specific areas within the broad interdisciplinary field of materials science, the provided examples elucidate key concepts applicable to a wider range of topics. The discussed instances offer insights into addressing the multifaceted challenges encountered in contemporary materials research.
Submitted 1 May, 2024; v1 submitted 1 February, 2024;
originally announced February 2024.
-
Design of 2D Skyrmionic Metamaterial Through Controlled Assembly
Authors:
Qichen Xu,
Zhuanglin Shen,
Alexander Edström,
I. P. Miranda,
Zhiwei Lu,
Anders Bergman,
Danny Thonig,
Wanjian Yin,
Olle Eriksson,
Anna Delin
Abstract:
Despite extensive research on magnetic skyrmions and antiskyrmions, a significant challenge remains in crafting nontrivial high-order skyrmionic textures with varying, or even tailor-made, topologies. We address this challenge by focusing on a construction pathway for skyrmionic metamaterials within a monolayer thin film, and suggest several promising lattice-like, flake-like, and cell-like skyrmionic metamaterials that are surprisingly stable. Central to our approach is the concept of 'simulated controlled assembly', in short, a protocol inspired by 'click chemistry' that allows for positioning topological magnetic structures where one likes, and then allowing for energy minimization to elucidate the stability. Utilizing high-throughput atomistic-spin-dynamic (ASD) simulations alongside state-of-the-art AI-driven tools, we have isolated skyrmions (topological charge Q=1), antiskyrmions (Q=-1), and skyrmionium (Q=0). These entities serve as foundational 'skyrmionic building blocks' for forming the reported intricate textures. In this work, two key contributions are introduced to the field of skyrmionic systems. First, we present a novel method for integrating controlled assembly protocols for the stabilization and investigation of topological magnets, which marks a significant advancement in the ability to explore new skyrmionic textures. Second, we report on the discovery of skyrmionic metamaterials, which show a plethora of complex topologies that are possible to investigate theoretically and experimentally.
Submitted 16 February, 2024;
originally announced February 2024.
-
Chiral Interaction Induced Near-Perfect Photon Blockade
Authors:
Zhi-Guang Lu,
Ying Wu,
Xin-You Lü
Abstract:
Based on the scattering matrix method, we theoretically demonstrate that the chiral interaction can induce almost perfect photon blockade (PB) in the waveguide-cavity quantum electrodynamics (QED) system. The mechanism relies on multi-photon-path interference within the waveguide, which is clearly shown by the analytical parameter regime for $g^{(2)}(0)\approx0$. When $N$ cavities are introduced into the system, there are $N$ optimal parameter points accordingly for the almost perfect PB, and the required chirality decreases exponentially with increasing $N$. Under the conditions of resonant driving and specific chirality, the output light only relies on the parity of $N$ ($N\ge2$), where the coherent state and single-photon state correspond to systems with odd and even numbers of cavities, respectively. Our work offers an alternative route for achieving almost perfect PB effects by employing the chirality of the system, with potential application in integrable on-chip single-photon sources.
Submitted 14 February, 2024;
originally announced February 2024.
-
Detecting Adversarial Spectrum Attacks via Distance to Decision Boundary Statistics
Authors:
Wenwei Zhao,
Xiaowen Li,
Shangqing Zhao,
Jie Xu,
Yao Liu,
Zhuo Lu
Abstract:
Machine learning has been adopted for efficient cooperative spectrum sensing. However, it incurs an additional security risk due to attacks leveraging adversarial machine learning to create malicious spectrum sensing values to deceive the fusion center, called adversarial spectrum attacks. In this paper, we propose an efficient framework for detecting adversarial spectrum attacks. Our design leverages the concept of the distance to the decision boundary (DDB) observed at the fusion center and compares the training and testing DDB distributions to identify adversarial spectrum attacks. We create a computationally efficient way to compute the DDB for machine learning based spectrum sensing systems. Experimental results based on realistic spectrum data show that our method, under typical settings, achieves a high detection rate of up to 99% and maintains a low false alarm rate of less than 1%. In addition, our method to compute the DDB based on spectrum data achieves 54% to 64% improvements in computational efficiency over existing distance calculation methods. The proposed DDB-based detection framework offers a practical and efficient solution for identifying malicious sensing values created by adversarial spectrum attacks.
Submitted 14 February, 2024;
originally announced February 2024.
-
A survey of recent methods for addressing AI fairness and bias in biomedicine
Authors:
Yifan Yang,
Mingquan Lin,
Han Zhao,
Yifan Peng,
Furong Huang,
Zhiyong Lu
Abstract:
Artificial intelligence (AI) systems have the potential to revolutionize clinical practices, including improving diagnostic accuracy and surgical decision-making, while also reducing costs and manpower. However, it is important to recognize that these systems may perpetuate social inequities or demonstrate biases, such as those based on race or gender. Such biases can occur before, during, or after the development of AI models, making it critical to understand and address potential biases to enable the accurate and reliable application of AI models in clinical settings. To mitigate bias concerns during model development, we surveyed recent publications on different debiasing methods in the fields of biomedical natural language processing (NLP) or computer vision (CV). Then we discussed the methods that have been applied in the biomedical domain to address bias. We performed our literature search on PubMed, ACM Digital Library, and IEEE Xplore for relevant articles published between January 2018 and December 2023 using multiple combinations of keywords. We then filtered the resulting 10,041 articles automatically with loose constraints, and manually inspected the abstracts of the remaining 890 articles to identify the 55 articles included in this review. Additional articles in the references are also included in this review. We discuss each method and compare its strengths and weaknesses. Finally, we review other potential methods from the general domain that could be applied to biomedicine to address bias and improve fairness. The bias of AIs in biomedicine can originate from multiple sources. Existing debiasing methods that focus on algorithms can be categorized into distributional or algorithmic.
Submitted 13 February, 2024;
originally announced February 2024.
-
Diffusion Model-based Probabilistic Downscaling for 180-year East Asian Climate Reconstruction
Authors:
Fenghua Ling,
Zeyu Lu,
Jing-Jia Luo,
Lei Bai,
Swadhin K. Behera,
Dachao Jin,
Baoxiang Pan,
Huidong Jiang,
Toshio Yamagata
Abstract:
As our planet is entering into the "global boiling" era, understanding regional climate change becomes imperative. Effective downscaling methods that provide localized insights are crucial for this target. Traditional approaches, including computationally-demanding regional dynamical models or statistical downscaling frameworks, are often susceptible to the influence of downscaling uncertainty. Here, we address these limitations by introducing a diffusion probabilistic downscaling model (DPDM) into the meteorological field. This model can efficiently transform data from 1° to 0.1° resolution. Compared with deterministic downscaling schemes, it not only has more accurate local details, but also can generate a large number of ensemble members based on probability distribution sampling to evaluate the uncertainty of downscaling. Additionally, we apply the model to generate a 180-year dataset of monthly surface variables in East Asia, offering a more detailed perspective for understanding local scale climate change over the past centuries.
Submitted 5 April, 2024; v1 submitted 1 February, 2024;
originally announced February 2024.
-
QGFN: Controllable Greediness with Action Values
Authors:
Elaine Lau,
Stephen Zhewen Lu,
Ling Pan,
Doina Precup,
Emmanuel Bengio
Abstract:
Generative Flow Networks (GFlowNets; GFNs) are a family of reward/energy-based generative methods for combinatorial objects, capable of generating diverse and high-utility samples. However, biasing GFNs towards producing high-utility samples is non-trivial. In this work, we leverage connections between GFNs and reinforcement learning (RL) and propose to combine the GFN policy with an action-value estimate, $Q$, to create greedier sampling policies which can be controlled by a mixing parameter. We show that several variants of the proposed method, QGFN, are able to improve on the number of high-reward samples generated in a variety of tasks without sacrificing diversity.
Submitted 23 May, 2024; v1 submitted 7 February, 2024;
originally announced February 2024.
-
Research on Mobile Network High-precision Absolute Time Synchronization based on TAP
Authors:
Chenyu Zhang,
Xiangming Wen,
Wei Zheng,
Longdan Yu,
Zhaoming Lu,
Zhengying Wang
Abstract:
With the development of mobile communication and industrial internet technologies, the demand for robust network-based absolute time synchronization in diverse scenarios is growing significantly. TAP is a novel network timing method that aims to achieve sub-microsecond synchronization over the air interface. This paper investigates the improvement and end-to-end realization of TAP. This paper first analyzes the effectiveness and deficiencies of TAP by establishing an equivalent clock model which evaluates TAP from timing error composition and Allan variance. Second, this paper proposes a detailed base station and terminal design and the corresponding improvement of TAP. Both hardware compensation and protocol software design are taken into account so as to minimize timing error and system cost while maximizing compatibility with 3GPP. Finally, this paper presents a TAP end-to-end 5G prototype system developed based on software defined radio base station and COTS baseband module. The field test results show that the proposed scheme effectively solves the problems of TAP in application and robustly achieves 200ns level timing accuracy in various situations. The average accuracy with long observations can reach 1 nanosecond. It is 2$\sim$3 orders of magnitude better than common network timing methods, including NTP, PTP and the original TAP.
Submitted 7 February, 2024;
originally announced February 2024.
-
Prioritizing Safeguarding Over Autonomy: Risks of LLM Agents for Science
Authors:
Xiangru Tang,
Qiao Jin,
Kunlun Zhu,
Tongxin Yuan,
Yichi Zhang,
Wangchunshu Zhou,
Meng Qu,
Yilun Zhao,
Jian Tang,
Zhuosheng Zhang,
Arman Cohan,
Zhiyong Lu,
Mark Gerstein
Abstract:
Intelligent agents powered by large language models (LLMs) have demonstrated substantial promise in autonomously conducting experiments and facilitating scientific discoveries across various disciplines. While their capabilities are promising, these agents, called scientific LLM agents, also introduce novel vulnerabilities that demand careful consideration for safety. However, there exists a notable gap in the literature, as there has been no comprehensive exploration of these vulnerabilities. This perspective paper fills this gap by conducting a thorough examination of vulnerabilities in LLM-based agents within scientific domains, shedding light on potential risks associated with their misuse and emphasizing the need for safety measures. We begin by providing a comprehensive overview of the potential risks inherent to scientific LLM agents, taking into account user intent, the specific scientific domain, and their potential impact on the external environment. Then, we delve into the origins of these vulnerabilities and provide a scoping review of the limited existing works. Based on our analysis, we propose a triadic framework involving human regulation, agent alignment, and an understanding of environmental feedback (agent regulation) to mitigate these identified risks. Furthermore, we highlight the limitations and challenges associated with safeguarding scientific agents and advocate for the development of improved models, robust benchmarks, and comprehensive regulations to address these issues effectively.
Submitted 5 June, 2024; v1 submitted 6 February, 2024;
originally announced February 2024.
-
High-order stochastic integration schemes for the Rosenbluth-Trubnikov collision operator in particle simulations
Authors:
Zhixin Lu,
Guo Meng,
Tomasz Tyranowski,
Alex Chankin
Abstract:
In this study, we consider a numerical implementation of the nonlinear Rosenbluth-Trubnikov collision operator for particle simulations in plasma physics in the framework of the finite element method (FEM). The relevant particle evolution equations are formulated as stochastic differential equations, both in the Stratonovich and Itô forms, and are then solved with advanced high-order stochastic numerical schemes. Due to its formulation as a stochastic differential equation, both the drift and diffusion components of the collision operator are treated on an equal footing. Our investigation focuses on assessing the accuracy of these schemes. Previous studies on this subject have used the Euler-Maruyama scheme, which, although popular, is of low order, and requires small time steps to achieve satisfactory accuracy. In this work, we compare the performance of the Euler-Maruyama method to other high-order stochastic methods known in the stochastic differential equations literature. Our study reveals advantageous features of these high-order schemes, such as better accuracy and improved conservation properties of the numerical solution. The main test case used in the numerical experiments is the thermalization of isotropic and anisotropic particle distributions.
Submitted 6 February, 2024;
originally announced February 2024.
-
Precise Measurement of Born Cross Sections for $e^+e^-\to D\bar{D}$ and Observation of One Structure between $\sqrt{s} = 3.80-4.95$ GeV
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
X. C. Ai,
R. Aliberti,
A. Amoroso,
M. R. An,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere,
A. Brueggemann
, et al. (604 additional authors not shown)
Abstract:
Using data samples collected with the BESIII detector at the BEPCII collider at center-of-mass energies ranging from 3.80 to 4.95 GeV, corresponding to an integrated luminosity of 20 fb$^{-1}$, a measurement of Born cross sections for the $e^+e^-\to D^{0}\bar{D}^{0}$ and $D^{+}D^{-}$ processes is presented with unprecedented precision. By performing a simultaneous fit to the dressed cross sections for both processes, one possible new structure around 3.9 GeV/$c^2$ is observed for the first time, in addition to seven known resonances $ψ(3770)$, $ψ(4040)$, $ψ(4160)$, $Y(4230)$, $Y(4360)$, $ψ(4415)$, and $Y(4660)$. These results offer crucial experimental insights into the nature of hadron production in the open charm region.
Submitted 6 February, 2024;
originally announced February 2024.
-
SEABO: A Simple Search-Based Method for Offline Imitation Learning
Authors:
Jiafei Lyu,
Xiaoteng Ma,
Le Wan,
Runze Liu,
Xiu Li,
Zongqing Lu
Abstract:
Offline reinforcement learning (RL) has attracted much attention due to its ability in learning from static offline datasets and eliminating the need of interacting with the environment. Nevertheless, the success of offline RL relies heavily on the offline transitions annotated with reward labels. In practice, we often need to hand-craft the reward function, which is sometimes difficult, labor-intensive, or inefficient. To tackle this challenge, we set our focus on the offline imitation learning (IL) setting, and aim at getting a reward function based on the expert data and unlabeled data. To that end, we propose a simple yet effective search-based offline IL method, tagged SEABO. SEABO allocates a larger reward to the transition that is close to its closest neighbor in the expert demonstration, and a smaller reward otherwise, all in an unsupervised learning manner. Experimental results on a variety of D4RL datasets indicate that SEABO can achieve competitive performance to offline RL algorithms with ground-truth rewards, given only a single expert trajectory, and can outperform prior reward learning and offline IL methods across many tasks. Moreover, we demonstrate that SEABO also works well if the expert demonstrations contain only observations. Our code is publicly available at https://github.com/dmksjfl/SEABO.
Submitted 21 February, 2024; v1 submitted 6 February, 2024;
originally announced February 2024.
-
Harnessing PubMed User Query Logs for Post Hoc Explanations of Recommended Similar Articles
Authors:
Ashley Shin,
Qiao Jin,
James Anibal,
Zhiyong Lu
Abstract:
Searching for a related article based on a reference article is an integral part of scientific research. PubMed, like many academic search engines, has a "similar articles" feature that recommends articles relevant to the current article viewed by a user. Explaining recommended items can be of great utility to users, particularly in the literature search process. With more than a million biomedical papers being published each year, explaining the recommended similar articles would facilitate researchers and clinicians in searching for related articles. Nonetheless, the majority of current literature recommendation systems lack explanations for their suggestions. We employ a post hoc approach to explaining recommendations by identifying relevant tokens in the titles of similar articles. Our major contribution is building PubCLogs by repurposing 5.6 million pairs of coclicked articles from PubMed's user query logs. Using our PubCLogs dataset, we train the Highlight Similar Article Title (HSAT), a transformer-based model designed to select the most relevant parts of the title of a similar article, based on the title and abstract of a seed article. HSAT demonstrates strong performance in our empirical evaluations, achieving an F1 score of 91.72 percent on the PubCLogs test set, considerably outperforming several baselines including BM25 (70.62), MPNet (67.11), MedCPT (62.22), GPT-3.5 (46.00), and GPT-4 (64.89). Additional evaluations on a separate, manually annotated test set further verify HSAT's performance. Moreover, participants of our user study indicate a preference for HSAT, due to its superior balance between conciseness and comprehensiveness. Our study suggests that repurposing user query logs of academic search engines can be a promising way to train state-of-the-art models for explaining literature recommendation.
Submitted 5 February, 2024;
originally announced February 2024.
-
Understanding What Affects Generalization Gap in Visual Reinforcement Learning: Theory and Empirical Evidence
Authors:
Jiafei Lyu,
Le Wan,
Xiu Li,
Zongqing Lu
Abstract:
Recently, there are many efforts attempting to learn useful policies for continuous control in visual reinforcement learning (RL). In this scenario, it is important to learn a generalizable policy, as the testing environment may differ from the training environment, e.g., there exist distractors during deployment. Many practical algorithms are proposed to handle this problem. However, to the best of our knowledge, none of them provide a theoretical understanding of what affects the generalization gap and why their proposed methods work. In this paper, we bridge this issue by theoretically answering the key factors that contribute to the generalization gap when the testing environment has distractors. Our theories indicate that minimizing the representation distance between training and testing environments, which aligns with human intuition, is the most critical for the benefit of reducing the generalization gap. Our theoretical results are supported by the empirical evidence in the DMControl Generalization Benchmark (DMC-GB).
Submitted 4 February, 2024;
originally announced February 2024.
-
Non-Fermi Liquid and Hund Correlation in La$_4$Ni$_3$O$_{10}$ under High Pressure
Authors:
Jing-Xuan Wang,
Zhenfeng Ouyang,
Rong-Qiang He,
Zhong-Yi Lu
Abstract:
High temperature superconductivity was recently found in the bilayer nickelate $\rm{La}_3 \rm{Ni}_2 \rm{O}_7$ (La327), followed by the discovery of superconductivity in the trilayer $\rm{La}_4 \rm{Ni}_3 \rm{O}_{10}$ (La4310), under high pressure. Through studying the electronic correlation of La4310 with DFT+DMFT, and further comparing it with that of La327, we find that the $e_g$ orbitals of the outer-layer Ni cations in La4310 have a similar (but slightly weaker) electronic correlation to those in La327, in which the electrons behave as a non-Fermi liquid with Hund correlation and linear-in-temperature scattering rate. Our results suggest that the experimentally observed ``strange metal'' behavior may be explained by the Hund spin correlation featuring high spin states and spin-orbital separation. In contrast, the electrons in the inner-layer Ni cations in La4310 behave as a Fermi liquid. The weaker electronic correlation in La4310 is attributed to more hole-doping, which may explain its lower superconducting transition temperature.
Submitted 4 February, 2024;
originally announced February 2024.
-
Settling Decentralized Multi-Agent Coordinated Exploration by Novelty Sharing
Authors:
Haobin Jiang,
Ziluo Ding,
Zongqing Lu
Abstract:
Exploration in decentralized cooperative multi-agent reinforcement learning faces two challenges. One is that the novelty of global states is unavailable, while the novelty of local observations is biased. The other is how agents can explore in a coordinated way. To address these challenges, we propose MACE, a simple yet effective multi-agent coordinated exploration method. By communicating only local novelty, agents can take into account other agents' local novelty to approximate the global novelty. Further, we newly introduce weighted mutual information to measure the influence of one agent's action on other agents' accumulated novelty. We convert it as an intrinsic reward in hindsight to encourage agents to exert more influence on other agents' exploration and boost coordinated exploration. Empirically, we show that MACE achieves superior performance in three multi-agent environments with sparse rewards.
Submitted 3 February, 2024;
originally announced February 2024.
-
Measurement of the Electromagnetic Transition Form-factors in the decays $η'\rightarrowπ^+π^-l^+l^-$
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (618 additional authors not shown)
Abstract:
With a sample of $(10087\pm44)\times10^{6}$ $J/ψ$ events accumulated with the BESIII detector, we analyze the decays $η'\rightarrowπ^+π^-l^+l^-(l=e,$ $μ)$ via the process $J/ψ\rightarrowγη'$. The branching fractions are measured to be $\mathcal{B}(η'\rightarrowπ^+π^-e^+e^-)=(2.45\pm0.02(\rm{stat.})\pm0.08(\rm{syst.})) \times10^{-3}$ and $\mathcal{B}(η'\rightarrowπ^+π^-μ^+μ^-)=(2.16\pm0.12(\rm{stat.})\pm0.06(\rm{syst.}))\times10^{-5}$, and the ratio is $\frac{\mathcal{B}(η'\rightarrowπ^{+}π^{-}e^{+}e^{-})}{\mathcal{B}(η'\rightarrowπ^{+}π^{-}μ^{+}μ^{-})} = 113.4\pm0.9(\rm{stat.})\pm3.7(\rm{syst.})$. In addition, by combining the $η'\rightarrowπ^+π^-e^+e^-$ and $η'\rightarrowπ^+π^-μ^+μ^-$ decays, the slope parameter of the electromagnetic transition form factor is measured to be $b_{η'}=1.30\pm0.19\ (\mathrm{GeV}/c^{2})^{-2}$, which is consistent with previous measurements from BESIII and theoretical predictions from the VMD model. The asymmetry in the angle between the $π^+π^-$ and $l^+l^-$ decay planes, which has the potential to reveal the $CP$-violation originating from an unconventional electric dipole transition, is also investigated. The asymmetry parameters are determined to be $\mathcal{A}_{CP}(η'\rightarrowπ^+π^-e^+e^-)=(-0.21\pm0.73(\rm{stat.})\pm0.01(\rm{syst.}))\%$ and $\mathcal{A}_{CP}(η'\rightarrowπ^+π^-μ^+μ^-)=(0.62\pm4.71(\rm{stat.})\pm0.08(\rm{syst.}))\%$, implying that no evidence of $CP$-violation is observed at the present statistics. Finally, an axion-like particle is searched for via the decay $η'\rightarrowπ^+π^-a, a\rightarrow e^+e^-$, and upper limits of the branching fractions are presented for the mass assumptions of the axion-like particle in the range of $0-500\ \mathrm{MeV}/c^{2}$.
Submitted 2 February, 2024;
originally announced February 2024.
-
Quality of Answers of Generative Large Language Models vs Peer Patients for Interpreting Lab Test Results for Lay Patients: Evaluation Study
Authors:
Zhe He,
Balu Bhasuran,
Qiao Jin,
Shubo Tian,
Karim Hanna,
Cindy Shavor,
Lisbeth Garcia Arguello,
Patrick Murray,
Zhiyong Lu
Abstract:
Lab results are often confusing and hard to understand. Large language models (LLMs) such as ChatGPT have opened a promising avenue for patients to get their questions answered. We aim to assess the feasibility of using LLMs to generate relevant, accurate, helpful, and unharmful responses to lab test-related questions asked by patients and to identify potential issues that can be mitigated with augmentation approaches. We first collected question and answer data related to lab test results from Yahoo! Answers and selected 53 QA pairs for this study. Using the LangChain framework and ChatGPT web portal, we generated responses to the 53 questions from four LLMs including GPT-4, Meta LLaMA 2, MedAlpaca, and ORCA_mini. We first assessed the similarity of their answers using standard QA similarity-based evaluation metrics including ROUGE, BLEU, METEOR, and BERTScore. We also utilized an LLM-based evaluator to judge whether a target model has higher quality in terms of relevance, correctness, helpfulness, and safety than the baseline model. Finally, we performed a manual evaluation with medical experts for all the responses to seven selected questions on the same four aspects. The results of Win Rate and medical expert evaluation both showed that GPT-4's responses achieved better scores than all the other LLM responses and human responses on all four aspects (relevance, correctness, helpfulness, and safety). However, LLM responses occasionally also suffer from a lack of interpretation in one's medical context, incorrect statements, and lack of references. We find that, compared to the other three LLMs and the human answers from the Q&A website, GPT-4's responses are more accurate, helpful, relevant, and safer. However, there are cases in which GPT-4 responses are inaccurate and not individualized. We identified a number of ways to improve the quality of LLM responses.
Submitted 23 January, 2024;
originally announced February 2024.
-
Measurements of Normalized Differential Cross Sections of Inclusive $η$ Production in $e^{+}e^{-}$ Annihilation at Energy from 2.0000 to 3.6710 GeV
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
D. Anderle,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko
, et al. (641 additional authors not shown)
Abstract:
Using data samples collected with the BESIII detector operating at the BEPCII storage ring, the cross section of the inclusive process $e^{+}e^{-} \to η+ X$, normalized by the total cross section of $e^{+}e^{-} \to \text{hadrons}$, is measured at eight center-of-mass energy points from 2.0000 GeV to 3.6710 GeV. These are the first measurements with momentum dependence in this energy region. Our measurement shows a significant discrepancy from calculations with the existing fragmentation functions. To address this discrepancy, a new QCD analysis is performed at the next-to-next-to-leading order with hadron mass corrections and higher twist effects, which can explain both the established high-energy data and our measurements reasonably well.
Submitted 31 January, 2024;
originally announced January 2024.
-
Camouflage Adversarial Attacks on Multiple Agent Systems
Authors:
Ziqing Lu,
Guanlin Liu,
Lifeng Lai,
Weiyu Xu
Abstract:
The multi-agent reinforcement learning systems (MARL) based on the Markov decision process (MDP) have emerged in many critical applications. To improve the robustness/defense of MARL systems against adversarial attacks, the study of various adversarial attacks on reinforcement learning systems is very important. Previous works on adversarial attacks considered some possible features to attack in MDP, such as the action poisoning attacks, the reward poisoning attacks, and the state perception attacks. In this paper, we propose a brand-new form of attack called the camouflage attack in the MARL systems. In the camouflage attack, the attackers change the appearances of some objects without changing the actual objects themselves; and the camouflaged appearances may look the same to all the targeted recipient (victim) agents. The camouflaged appearances can mislead the recipient agents to misguided actions. We design algorithms that give the optimal camouflage attacks minimizing the rewards of recipient agents. Our numerical and theoretical results show that camouflage attacks can rival the more conventional, but likely more difficult state perception attacks. We also investigate cost-constrained camouflage attacks and show numerically how cost budgets affect the attack performance.
Submitted 30 January, 2024;
originally announced January 2024.
-
Achieving More Human Brain-Like Vision via Human EEG Representational Alignment
Authors:
Zitong Lu,
Yile Wang,
Julie D. Golomb
Abstract:
Despite advancements in artificial intelligence, object recognition models still lag behind in emulating visual information processing in human brains. Recent studies have highlighted the potential of using neural data to mimic brain processing; however, these often rely on invasive neural recordings from non-human subjects, leaving a critical gap in understanding human visual perception. Addressing this gap, we present, for the first time, 'Re(presentational)Al(ignment)net', a vision model aligned with human brain activity based on non-invasive EEG, demonstrating a significantly higher similarity to human brain representations. Our innovative image-to-brain multi-layer encoding framework advances human neural alignment by optimizing multiple model layers and enabling the model to efficiently learn and mimic human brain's visual representational patterns across object categories and different modalities. Our findings suggest that ReAlnet represents a breakthrough in bridging the gap between artificial and human vision, and paving the way for more brain-like artificial intelligence systems.
Submitted 24 April, 2024; v1 submitted 30 January, 2024;
originally announced January 2024.
-
Layered and Staged Monte Carlo Tree Search for SMT Strategy Synthesis
Authors:
Zhengyang Lu,
Stefan Siemer,
Piyush Jha,
Joel Day,
Florin Manea,
Vijay Ganesh
Abstract:
Modern SMT solvers, such as Z3, offer user-controllable strategies, enabling users to tailor solving strategies for their unique set of instances, thus dramatically enhancing solver performance for their use case. However, this approach of strategy customization presents a significant challenge: handcrafting an optimized strategy for a class of SMT instances remains a complex and demanding task for both solver developers and users alike.
In this paper, we address this problem of automatic SMT strategy synthesis via a novel Monte Carlo Tree Search (MCTS) based method. Our method treats strategy synthesis as a sequential decision-making process, whose search tree corresponds to the strategy space, and employs MCTS to navigate this vast search space. The key innovations that enable our method to identify effective strategies, while keeping costs low, are the ideas of layered and staged MCTS search. These novel heuristics allow for a deeper and more efficient exploration of the strategy space, enabling us to synthesize more effective strategies than the default ones in state-of-the-art (SOTA) SMT solvers. We implement our method, dubbed Z3alpha, as part of the Z3 SMT solver. Through extensive evaluations across six important SMT logics, Z3alpha demonstrates superior performance compared to the SOTA synthesis tool FastSMT, the default Z3 solver, and the CVC5 solver on most benchmarks. Remarkably, on a challenging QF_BV benchmark set, Z3alpha solves 42.7% more instances than the default strategy in the Z3 SMT solver.
Submitted 30 April, 2024; v1 submitted 30 January, 2024;
originally announced January 2024.
-
CRUD-RAG: A Comprehensive Chinese Benchmark for Retrieval-Augmented Generation of Large Language Models
Authors:
Yuanjie Lyu,
Zhiyu Li,
Simin Niu,
Feiyu Xiong,
Bo Tang,
Wenjin Wang,
Hao Wu,
Huanyong Liu,
Tong Xu,
Enhong Chen,
Yi Luo,
Peng Cheng,
Haiying Deng,
Zhonghao Wang,
Zijia Lu
Abstract:
Retrieval-Augmented Generation (RAG) is a technique that enhances the capabilities of large language models (LLMs) by incorporating external knowledge sources. This method addresses common LLM limitations, including outdated information and the tendency to produce inaccurate "hallucinated" content. However, the evaluation of RAG systems is challenging, as existing benchmarks are limited in scope and diversity. Most of the current benchmarks predominantly assess question-answering applications, overlooking the broader spectrum of situations where RAG could prove advantageous. Moreover, they only evaluate the performance of the LLM component of the RAG pipeline in the experiments, and neglect the influence of the retrieval component and the external knowledge database. To address these issues, this paper constructs a large-scale and more comprehensive benchmark, and evaluates all the components of RAG systems in various RAG application scenarios. Specifically, we have categorized the range of RAG applications into four distinct types: Create, Read, Update, and Delete (CRUD), each representing a unique use case. "Create" refers to scenarios requiring the generation of original, varied content. "Read" involves responding to intricate questions in knowledge-intensive situations. "Update" focuses on revising and rectifying inaccuracies or inconsistencies in pre-existing texts. "Delete" pertains to the task of summarizing extensive texts into more concise forms. For each of these CRUD categories, we have developed comprehensive datasets to evaluate the performance of RAG systems. We also analyze the effects of various components of the RAG system, such as the retriever, the context length, the knowledge base construction, and the LLM. Finally, we provide useful insights for optimizing the RAG technology for different scenarios.
△ Less
Submitted 18 February, 2024; v1 submitted 30 January, 2024;
originally announced January 2024.