-
Measurement of the branching fraction of $D^+_s\to \ell^+ν_\ell$ via $e^+e^-\to D^{*+}_{s} D^{*-}_{s}$
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (634 additional authors not shown)
Abstract:
Based on $10.64~\mathrm{fb}^{-1}$ of $e^+e^-$ collision data taken at center-of-mass energies between 4.237 and 4.699 GeV with the BESIII detector, we study the leptonic $D^+_s$ decays using the $e^+e^-\to D^{*+}_{s} D^{*-}_{s}$ process. The branching fractions of $D_s^+\to\ell^+ν_{\ell}\,(\ell=μ,τ)$ are measured to be $\mathcal{B}(D_s^+\toμ^+ν_μ)=(\bfmuv)\%$ and…
▽ More
Based on $10.64~\mathrm{fb}^{-1}$ of $e^+e^-$ collision data taken at center-of-mass energies between 4.237 and 4.699 GeV with the BESIII detector, we study the leptonic $D^+_s$ decays using the $e^+e^-\to D^{*+}_{s} D^{*-}_{s}$ process. The branching fractions of $D_s^+\to\ell^+ν_{\ell}\,(\ell=μ,τ)$ are measured to be $\mathcal{B}(D_s^+\toμ^+ν_μ)=(\bfmuv)\%$ and $\mathcal{B}(D_s^+\toτ^+ν_τ)=(\bftauv)\%$, respectively. The product of the decay constant and Cabibbo-Kobayashi-Maskawa matrix element $|V_{cs}|$ is determined to be $f_{D_s^+}|V_{cs}|=(\mufdsxvcsresult)_{μν}~\mathrm{MeV}$ and $f_{D_s^+}|V_{cs}|=(\taufdsxvcsresult))_{τν}~\mathrm{MeV}$, respectively. Taking the value of $|V_{cs}|$ from a global fit in the Standard Model, we obtain ${f_{D^+_s}}=(\mufdsresult)_{μν}$\,MeV and ${f_{D^+_s}}=(\taufdsresult)_{τν}$\,MeV, respectively. Conversely, taking the value for $f_{D_s^+}$ from the latest lattice quantum chromodynamics calculation, we obtain $|V_{cs}| =(\muvcsresult)_{μν}$ and $|V_{cs}| = (\tauvcsresult)_{τν}$, respectively.
△ Less
Submitted 16 July, 2024;
originally announced July 2024.
-
High-Resolution Spectroscopy of the Intermediate Impurity States near a Quantum Phase Transition
Authors:
Yao Zhang,
Tao Xie,
Zhen-Yu Liu,
Rui Wang,
Wenhao Zhang,
Chaofei Liu,
Ying-Shuang Fu
Abstract:
The intermediate behavior near a quantum phase transition is crucial for understanding the quantum criticality of various competing phases and their separate origins, yet remains unexplored for the multiple Yu-Shiba-Rusinov (YSR) states. Here, we investigated the detailed spectroscopic change of the exchange coupling-dependent YSR states near a quantum phase transition. The initially developed one…
▽ More
The intermediate behavior near a quantum phase transition is crucial for understanding the quantum criticality of various competing phases and their separate origins, yet remains unexplored for the multiple Yu-Shiba-Rusinov (YSR) states. Here, we investigated the detailed spectroscopic change of the exchange coupling-dependent YSR states near a quantum phase transition. The initially developed one pair of YSR states, induced by the Fe vacancy in monolayer Fe(Te,Se) superconductor, are clearly resolved with high resolution showing an evolution into two pairs of YSR peaks yet with dichotomy in their spectral features. Interestingly, while the lower-energy YSR branch enters the quantum phase transition region, the higher-lying one remains rigidly away from the lower-energy counterpart with a constant energy difference. Spectral weight analysis of the higher-energy branch yields an exponential dependence on the exchange coupling, which can be well rationalized by taking the two pairs of YSR states as a result of field splitting by the magnetic anisotropy. Our results unveil the intermediate region of a quantum phase transition with a magnetic anisotropy-induced splitting of the YSR resonance, and highlight a prospect for developing functional electronics based on the flexibly controllable multiple quantum states.
△ Less
Submitted 16 July, 2024;
originally announced July 2024.
-
Generative AI Driven Task-Oriented Adaptive Semantic Communications
Authors:
Yuzhou Fu,
Wenchi Cheng,
Jingqing Wang,
Liuguo Yin,
Wei Zhang
Abstract:
Task-Oriented Semantic Communication (TOSC) has been regarded as a promising communication framework, serving for various Artificial Intelligence (AI) task driven applications. The existing TOSC frameworks focus on extracting the full semantic features of source data and learning low-dimensional channel inputs to transmit them within limited bandwidth resources. Although transmitting full semantic…
▽ More
Task-Oriented Semantic Communication (TOSC) has been regarded as a promising communication framework, serving for various Artificial Intelligence (AI) task driven applications. The existing TOSC frameworks focus on extracting the full semantic features of source data and learning low-dimensional channel inputs to transmit them within limited bandwidth resources. Although transmitting full semantic features can preserve the integrity of data meaning, this approach does not attain the performance threshold of the TOSC. In this paper, we propose a Task-oriented Adaptive Semantic Communication (TasCom) framework, which aims to effectively facilitate the execution of AI tasks by only sending task-related semantic features. In the TasCom framework, we first propose a Generative AI (GAI) architecture based Generative Joint Source-Channel Coding (G-JSCC) for efficient semantic transmission. Then, an Adaptive Coding Controller (ACC) is proposed to find the optimal coding scheme for the proposed G-JSCC, which allows the semantic features with significant contributions to the AI task to preferentially occupy limited bandwidth resources for wireless transmission. Furthermore, we propose a generative training algorithm to train the proposed TasCom for optimal performance. The simulation results show that the proposed TasCom outperforms the existing TOSC and traditional codec schemes on the object detection and instance segmentation tasks at all considered channel conditions.
△ Less
Submitted 15 July, 2024;
originally announced July 2024.
-
Digital-Analog Transmission Framework for Task-Oriented Semantic Communications
Authors:
Yuzhou Fu,
Wenchi Cheng,
Wei Zhang,
Wei Zhang
Abstract:
Task-Oriented Semantic Communication (TOSC) has been considered as a new communication paradigm to serve various samrt devices that depend on Artificial Intelligence (AI) tasks in future wireless networks. The existing TOSC frameworks rely on the Neural Network (NN) model to extract the semantic feature from the source data. The semantic feature, constituted by analog vectors of a lower dimensiona…
▽ More
Task-Oriented Semantic Communication (TOSC) has been considered as a new communication paradigm to serve various samrt devices that depend on Artificial Intelligence (AI) tasks in future wireless networks. The existing TOSC frameworks rely on the Neural Network (NN) model to extract the semantic feature from the source data. The semantic feature, constituted by analog vectors of a lower dimensionality relative to the original source data, reserves the meaning of the source data. By conveying the semantic feature, TOSCs can significantly reduce the amount of data transmission while ensuring the correct execution of the AI-driven downstream task. However, standardized wireless networks depend on digital signal processing for data transmission, yet they necessitate the conveyance of semantic features that are inherently analog. Although existing TOSC frameworks developed the Deep Learning (DL) based \emph{analog approach} or conventional \emph{digital approach} to transmit the semantic feature, but there are still many challenging problems to urgently be solved in actual deployment. In this article, we first propose several challenging issues associated with the development of the TOSC framework in the standardized wireless network. Then, we develop a Digital-Analog transmission framework based TOSC (DA-TOSC) to resolve these challenging issues. Future research directions are discussed to further improve the DA-TOSC.
△ Less
Submitted 15 July, 2024;
originally announced July 2024.
-
Scalable Extraction Based Semantic Communication for 6G Wireless Networks
Authors:
Yuzhou Fu,
Wenchi Cheng,
Wei Zhang,
Jingqing Wang
Abstract:
Due to the challenges of satisfying the demands for communication efficiency and intelligent connectivity, sixth-generation (6G) wireless network requires new communication frameworks to enable effective information exchange and the integrated Artificial Intelligence (AI) and communication. The Deep Learning (DL) based semantic communication, which can integrate application requirements and the da…
▽ More
Due to the challenges of satisfying the demands for communication efficiency and intelligent connectivity, sixth-generation (6G) wireless network requires new communication frameworks to enable effective information exchange and the integrated Artificial Intelligence (AI) and communication. The Deep Learning (DL) based semantic communication, which can integrate application requirements and the data meanings into data processing and transmission, is expected to become a new paradigm in 6G wireless networks. However, existing semantic communications frameworks rely on sending full semantic feature, which can maximize the semantic fidelity but fail to achieve the efficient semantic communications. In this article, we introduce a novel Scalable Extraction based Semantic Communication (SE-SC) model to support the potential applications in 6G wireless networks and then analyze its feasibility. Then, we propose a promising the SE-SC framework to highlight the potentials of SE-SC model in 6G wireless networks. Numerical results show that our proposed SE-SC scheme can offer an identical Quality of Service (QoS) for the downstream task with much fewer transmission symbols than the full semantic feature transmission and the traditional codec scheme. Finally, we discuss several challenges for further investigating the scalable extraction based semantic communications.
△ Less
Submitted 15 July, 2024;
originally announced July 2024.
-
Spider2-V: How Far Are Multimodal Agents From Automating Data Science and Engineering Workflows?
Authors:
Ruisheng Cao,
Fangyu Lei,
Haoyuan Wu,
Jixuan Chen,
Yeqiao Fu,
Hongcheng Gao,
Xinzhuang Xiong,
Hanchong Zhang,
Yuchen Mao,
Wenjing Hu,
Tianbao Xie,
Hongshen Xu,
Danyang Zhang,
Sida Wang,
Ruoxi Sun,
Pengcheng Yin,
Caiming Xiong,
Ansong Ni,
Qian Liu,
Victor Zhong,
Lu Chen,
Kai Yu,
Tao Yu
Abstract:
Data science and engineering workflows often span multiple stages, from warehousing to orchestration, using tools like BigQuery, dbt, and Airbyte. As vision language models (VLMs) advance in multimodal understanding and code generation, VLM-based agents could potentially automate these workflows by generating SQL queries, Python code, and GUI operations. This automation can improve the productivit…
▽ More
Data science and engineering workflows often span multiple stages, from warehousing to orchestration, using tools like BigQuery, dbt, and Airbyte. As vision language models (VLMs) advance in multimodal understanding and code generation, VLM-based agents could potentially automate these workflows by generating SQL queries, Python code, and GUI operations. This automation can improve the productivity of experts while democratizing access to large-scale data analysis. In this paper, we introduce Spider2-V, the first multimodal agent benchmark focusing on professional data science and engineering workflows, featuring 494 real-world tasks in authentic computer environments and incorporating 20 enterprise-level professional applications. These tasks, derived from real-world use cases, evaluate the ability of a multimodal agent to perform data-related tasks by writing code and managing the GUI in enterprise data software systems. To balance realistic simulation with evaluation simplicity, we devote significant effort to developing automatic configurations for task setup and carefully crafting evaluation metrics for each task. Furthermore, we supplement multimodal agents with comprehensive documents of these enterprise data software systems. Our empirical evaluation reveals that existing state-of-the-art LLM/VLM-based agents do not reliably automate full data workflows (14.0% success). Even with step-by-step guidance, these agents still underperform in tasks that require fine-grained, knowledge-intensive GUI actions (16.2%) and involve remote cloud-hosted workspaces (10.6%). We hope that Spider2-V paves the way for autonomous multimodal agents to transform the automation of data science and engineering workflow. Our code and data are available at https://spider2-v.github.io.
△ Less
Submitted 15 July, 2024;
originally announced July 2024.
-
Hierarchical search method for gravitational waves from stellar-mass binary black holes in noisy space-based detector data
Authors:
Yao Fu,
Yan Wang,
Soumya D. Mohanty
Abstract:
Future space-based laser interferometric detectors, such as LISA, will be able to detect gravitational waves (GWs) generated during the inspiral phase of stellar-mass binary black holes (SmBBHs). The detection and characterization of GWs from SmBBHs poses a formidable data analysis challenge, arising from the large number of wave cycles that make the search extremely sensitive to mismatches in sig…
▽ More
Future space-based laser interferometric detectors, such as LISA, will be able to detect gravitational waves (GWs) generated during the inspiral phase of stellar-mass binary black holes (SmBBHs). The detection and characterization of GWs from SmBBHs poses a formidable data analysis challenge, arising from the large number of wave cycles that make the search extremely sensitive to mismatches in signal and template parameters in a likelihood-based approach. This makes the search for the maximum of the likelihood function over the signal parameter space an extremely difficult task. We present a data analysis method that addresses this problem using both algorithmic innovations and hardware acceleration driven by GPUs. The method follows a hierarchical approach in which a semi-coherent $\mathcal{F}$-statistic is computed with different numbers of frequency domain partitions at different stages, with multiple particle swarm optimization (PSO) runs used in each stage for global optimization. An important step in the method is the judicious partitioning of the parameter space at each stage to improve the convergence probability of PSO and avoid premature convergence to noise-induced secondary maxima. The hierarchy of stages confines the semi-coherent searches to progressively smaller parameter ranges, with the final stage performing a search for the global maximum of the fully-coherent $\mathcal{F}$-statistic. We test our method on 2.5 years of a single LISA TDI combination and find that for an injected SmBBH signal with a SNR between $\approx 11$ and $\approx 14$, the method can estimate (i) the chirp mass with a relative error of $\lesssim 0.01\%$, (ii) the time of coalescence within $\approx 100$ sec, (iii) the sky location within $\approx 0.2$ ${\rm deg}^2$, and (iv) orbital eccentricity at a fiducial signal frequency of 10 mHz with a relative error of $\lesssim 1\%$. (abr.)
△ Less
Submitted 15 July, 2024;
originally announced July 2024.
-
Sudden polarization angle jumps of the repeating fast radio burst FRB 20201124A
Authors:
J. R. Niu,
W. Y. Wang,
J. C. Jiang,
Y. Qu,
D. J. Zhou,
W. W. Zhu,
K. J. Lee,
J. L. Han,
B. Zhang,
D. Li,
S. Cao,
Z. Y. Fang,
Y. Feng,
Q. Y. Fu,
P. Jiang,
W. C. Jing,
J. Li,
Y. Li,
R. Luo,
L. Q. Meng,
C. C. Miao,
X. L. Miao,
C. H. Niu,
Y. C. Pan,
B. J. Wang
, et al. (19 additional authors not shown)
Abstract:
We report the first detection of polarization angle (PA) orthogonal jumps, a phenomenon previously only observed from radio pulsars, from a fast radio burst (FRB) source FRB 20201124A. We find three cases of orthogonal jumps in over two thousand bursts, all resembling those observed in pulsar single pulses. We propose that the jumps are due to the superposition of two orthogonal emission modes tha…
▽ More
We report the first detection of polarization angle (PA) orthogonal jumps, a phenomenon previously only observed from radio pulsars, from a fast radio burst (FRB) source FRB 20201124A. We find three cases of orthogonal jumps in over two thousand bursts, all resembling those observed in pulsar single pulses. We propose that the jumps are due to the superposition of two orthogonal emission modes that could only be produced in a highly magnetized plasma, and they are caused by the line of sight sweeping across a rotating magnetosphere. The shortest jump timescale is of the order of one-millisecond, which hints that the emission modes come from regions smaller than the light cylinder of most pulsars or magnetars. This discovery provides convincing evidence that FRB emission originates from the complex magnetosphere of a magnetar, suggesting an FRB emission mechanism that is analogous to radio pulsars despite a huge luminosity difference between two types of objects.
△ Less
Submitted 15 July, 2024;
originally announced July 2024.
-
Accessing Vision Foundation Models at ImageNet-level Costs
Authors:
Yitian Zhang,
Xu Ma,
Yue Bai,
Huan Wang,
Yun Fu
Abstract:
Vision foundation models are renowned for their generalization ability due to massive training data. Nevertheless, they demand tremendous training resources, and the training data is often inaccessible, e.g., CLIP, DINOv2, posing great challenges to developing derivatives that could advance research in this field. In this work, we offer a very simple and general solution, named Proteus, to distill…
▽ More
Vision foundation models are renowned for their generalization ability due to massive training data. Nevertheless, they demand tremendous training resources, and the training data is often inaccessible, e.g., CLIP, DINOv2, posing great challenges to developing derivatives that could advance research in this field. In this work, we offer a very simple and general solution, named Proteus, to distill foundation models into smaller equivalents on ImageNet-1K without access to the original training data. Specifically, we remove the designs from conventional knowledge distillation settings that result in dataset bias and present three levels of training objectives, i.e., token, patch, and feature, to maximize the efficacy of knowledge transfer. In this manner, Proteus is trained at ImageNet-level costs with surprising ability, facilitating the accessibility of training foundation models for the broader research community. Leveraging DINOv2-g/14 as the teacher, Proteus-L/14 matches the performance of the Oracle method DINOv2-L/14 (142M training data) across 15 benchmarks and outperforms other vision foundation models including CLIP-L/14 (400M), OpenCLIP-L/14 (400M/2B) and SynCLR-L/14 (600M).
△ Less
Submitted 14 July, 2024;
originally announced July 2024.
-
TemporalStory: Enhancing Consistency in Story Visualization using Spatial-Temporal Attention
Authors:
Sixiao Zheng,
Yanwei Fu
Abstract:
Story visualization presents a challenging task in text-to-image generation, requiring not only the rendering of visual details from text prompt but also ensuring consistency across images. Recently, most approaches address inconsistency problem using an auto-regressive manner conditioned on previous image-sentence pairs. However, they overlook the fact that story context is dispersed across all s…
▽ More
Story visualization presents a challenging task in text-to-image generation, requiring not only the rendering of visual details from text prompt but also ensuring consistency across images. Recently, most approaches address inconsistency problem using an auto-regressive manner conditioned on previous image-sentence pairs. However, they overlook the fact that story context is dispersed across all sentences. The auto-regressive approach fails to encode information from susequent image-sentence pairs, thus unable to capture the entirety of the story context. To address this, we introduce TemporalStory, leveraging Spatial-Temporal attention to model complex spatial and temporal dependencies in images, enabling the generation of coherent images based on a given storyline. In order to better understand the storyline context, we introduce a text adapter capable of integrating information from other sentences into the embedding of the current sentence. Additionally, to utilize scene changes between story images as guidance for the model, we propose the StoryFlow Adapter to measure the degree of change between images. Through extensive experiments on two popular benchmarks, PororoSV and FlintstonesSV, our TemporalStory outperforms the previous state-of-the-art in both story visualization and story continuation tasks.
△ Less
Submitted 13 July, 2024;
originally announced July 2024.
-
SoupLM: Model Integration in Large Language and Multi-Modal Models
Authors:
Yue Bai,
Zichen Zhang,
Jiasen Lu,
Yun Fu
Abstract:
Training large language models (LLMs) and multimodal LLMs necessitates significant computing resources, and existing publicly available LLMs are typically pre-trained on diverse, privately curated datasets spanning various tasks. For instance, LLaMA, Vicuna, and LLaVA are three LLM variants trained with LLaMA base models using very different training recipes, tasks, and data modalities. The traini…
▽ More
Training large language models (LLMs) and multimodal LLMs necessitates significant computing resources, and existing publicly available LLMs are typically pre-trained on diverse, privately curated datasets spanning various tasks. For instance, LLaMA, Vicuna, and LLaVA are three LLM variants trained with LLaMA base models using very different training recipes, tasks, and data modalities. The training cost and complexity for such LLM variants grow rapidly. In this study, we propose to use a soup strategy to assemble these LLM variants into a single well-generalized multimodal LLM (SoupLM) in a cost-efficient manner. Assembling these LLM variants efficiently brings knowledge and specialities trained from different domains and data modalities into an integrated one (e.g., chatbot speciality from user-shared conversations for Vicuna, and visual capacity from vision-language data for LLaVA), therefore, to avoid computing costs of repetitive training on several different domains. We propose series of soup strategies to systematically benchmark performance gains across various configurations, and probe the soup behavior across base models in the interpolation space.
△ Less
Submitted 11 July, 2024;
originally announced July 2024.
-
Spectroastrometry and Reverberation Mapping (SARM) of Active Galactic Nuclei. I. The H$β$ Broad-line Region Structure and Black Hole Mass of Five Quasars
Authors:
Yan-Rong Li,
Chen Hu,
Zhu-Heng Yao,
Yong-Jie Chen,
Hua-Rui Bai,
Sen Yang,
Pu Du,
Feng-Na Fang,
Yi-Xin Fu,
Jun-Rong Liu,
Yue-Chang Peng,
Yu-Yang Songsheng,
Yi-Lin Wang,
Ming Xiao,
Shuo Zhai,
Hartmut Winkler,
Jin-Ming Bai,
Luis C. Ho,
Romain G. Petrov,
Jesus Aceituno,
Jian-Min Wang
Abstract:
We conduct a reverberation mapping (RM) campaign to spectroscopically monitor a sample of selected bright active galactic nuclei with large anticipated broad-line region (BLR) sizes adequate for spectroastrometric observations by the GRAVITY instrument on the Very Large Telescope Interferometer. We report the first results for five objects, IC 4329A, Mrk 335, Mrk 509, Mrk 1239, and PDS 456, among…
▽ More
We conduct a reverberation mapping (RM) campaign to spectroscopically monitor a sample of selected bright active galactic nuclei with large anticipated broad-line region (BLR) sizes adequate for spectroastrometric observations by the GRAVITY instrument on the Very Large Telescope Interferometer. We report the first results for five objects, IC 4329A, Mrk 335, Mrk 509, Mrk 1239, and PDS 456, among which Mrk 1239 and PDS 456 are for the first time spectroscopically monitored. We obtain multi-year monitoring data and perform multi-component spectral decomposition to extract the broad H$β$ profiles. We detect significant time lags between the H$β$ and continuum variations, generally obeying the previously established BLR size-luminosity relation. Velocity-resolved H$β$ time lags illustrate diverse, possibly evolving BLR kinematics. We further measure the H$β$ line widths from mean and rms spectra and the resulting virial products show good consistency among different seasons. Adopting a unity virial factor and the full width at half maximum of the broad H$β$ line from the mean spectrum as the measure of velocity, the obtained black hole mass averaged over seasons is $\log M_\bullet/M_\odot=8.02_{-0.14}^{+0.09}$, $6.92_{-0.12}^{+0.12}$, $8.01_{-0.25}^{+0.16}$, $7.44_{-0.14}^{+0.13}$, and $8.59_{-0.11}^{+0.07}$ for the five objects, respectively. The black hole mass estimations using other line width measures are also reported (up to the virial factors). For objects with previous RM campaigns, our mass estimates are in agreement with earlier results. In a companion paper, we will employ BLR dynamical modeling to directly infer the black hole mass and thereby determine the virial factors.
△ Less
Submitted 10 July, 2024;
originally announced July 2024.
-
Study of the decay and production properties of $D_{s1}(2536)$ and $D_{s2}^*(2573)$
Authors:
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere,
A. Brueggemann
, et al. (645 additional authors not shown)
Abstract:
The $e^+e^-\rightarrow D_s^+D_{s1}(2536)^-$ and $e^+e^-\rightarrow D_s^+D^*_{s2}(2573)^-$ processes are studied using data samples collected with the BESIII detector at center-of-mass energies from 4.530 to 4.946~GeV. The absolute branching fractions of $D_{s1}(2536)^- \rightarrow \bar{D}^{*0}K^-$ and $D_{s2}^*(2573)^- \rightarrow \bar{D}^0K^-$ are measured for the first time to be…
▽ More
The $e^+e^-\rightarrow D_s^+D_{s1}(2536)^-$ and $e^+e^-\rightarrow D_s^+D^*_{s2}(2573)^-$ processes are studied using data samples collected with the BESIII detector at center-of-mass energies from 4.530 to 4.946~GeV. The absolute branching fractions of $D_{s1}(2536)^- \rightarrow \bar{D}^{*0}K^-$ and $D_{s2}^*(2573)^- \rightarrow \bar{D}^0K^-$ are measured for the first time to be $(35.9\pm 4.8\pm 3.5)\%$ and $(37.4\pm 3.1\pm 4.6)\%$, respectively. The measurements are in tension with predictions based on the assumption that the $D_{s1}(2536)$ and $D_{s2}^*(2573)$ are dominated by a bare $c\bar{s}$ component. The $e^+e^-\rightarrow D_s^+D_{s1}(2536)^-$ and $e^+e^-\rightarrow D_s^+D^*_{s2}(2573)^-$ cross sections are measured, and a resonant structure at around 4.6~GeV with a width of 50~MeV is observed for the first time with a statistical significance of $15σ$ in the $e^+e^-\rightarrow D_s^+D^*_{s2}(2573)^-$ process. It could be the $Y(4626)$ found by the Belle collaboration in the $D_s^+D_{s1}(2536)^{-}$ final state, since they have similar masses and widths. There is also evidence for a structure at around 4.75~GeV in both processes.
△ Less
Submitted 10 July, 2024;
originally announced July 2024.
-
PerlDiff: Controllable Street View Synthesis Using Perspective-Layout Diffusion Models
Authors:
Jinhua Zhang,
Hualian Sheng,
Sijia Cai,
Bing Deng,
Qiao Liang,
Wen Li,
Ying Fu,
Jieping Ye,
Shuhang Gu
Abstract:
Controllable generation is considered a potentially vital approach to address the challenge of annotating 3D data, and the precision of such controllable generation becomes particularly imperative in the context of data production for autonomous driving. Existing methods focus on the integration of diverse generative information into controlling inputs, utilizing frameworks such as GLIGEN or Contr…
▽ More
Controllable generation is considered a potentially vital approach to address the challenge of annotating 3D data, and the precision of such controllable generation becomes particularly imperative in the context of data production for autonomous driving. Existing methods focus on the integration of diverse generative information into controlling inputs, utilizing frameworks such as GLIGEN or ControlNet, to produce commendable outcomes in controllable generation. However, such approaches intrinsically restrict generation performance to the learning capacities of predefined network architectures. In this paper, we explore the integration of controlling information and introduce PerlDiff (Perspective-Layout Diffusion Models), a method for effective street view image generation that fully leverages perspective 3D geometric information. Our PerlDiff employs 3D geometric priors to guide the generation of street view images with precise object-level control within the network learning process, resulting in a more robust and controllable output. Moreover, it demonstrates superior controllability compared to alternative layout control methods. Empirical results justify that our PerlDiff markedly enhances the precision of generation on the NuScenes and KITTI datasets. Our codes and models are publicly available at https://github.com/LabShuHangGU/PerlDiff.
△ Less
Submitted 16 July, 2024; v1 submitted 8 July, 2024;
originally announced July 2024.
-
MVGT: A Multi-view Graph Transformer Based on Spatial Relations for EEG Emotion Recognition
Authors:
Yanjie Cui,
Xiaohong Liu,
Jing Liang,
Yamin Fu
Abstract:
Electroencephalography (EEG), a medical imaging technique that captures scalp electrical activity of brain structures via electrodes, has been widely used in affective computing. The spatial domain of EEG is rich in affective information. However, few of the existing studies have simultaneously analyzed EEG signals from multiple perspectives of geometric and anatomical structures in spatial domain…
▽ More
Electroencephalography (EEG), a medical imaging technique that captures scalp electrical activity of brain structures via electrodes, has been widely used in affective computing. The spatial domain of EEG is rich in affective information. However, few of the existing studies have simultaneously analyzed EEG signals from multiple perspectives of geometric and anatomical structures in spatial domain. In this paper, we propose a multi-view Graph Transformer (MVGT) based on spatial relations, which integrates information from the temporal, frequency and spatial domains, including geometric and anatomical structures, so as to enhance the expressive power of the model comprehensively. We incorporate the spatial information of EEG channels into the model as encoding, thereby improving its ability to perceive the spatial structure of the channels. Meanwhile, experimental results based on publicly available datasets demonstrate that our proposed model outperforms state-of-the-art methods in recent years. In addition, the results also show that the MVGT could extract information from multiple domains and capture inter-channel relationships in EEG emotion recognition tasks effectively.
△ Less
Submitted 8 July, 2024; v1 submitted 3 July, 2024;
originally announced July 2024.
-
Measurement of the branching fraction of the decay $J/ψ\to p \bar{p} η$
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (639 additional authors not shown)
Abstract:
A high precision measurement of the branching fraction of the decay $J/ψ\to p \bar{p} η$ is performed using $(10 087 \pm 44) \times 10^6$ $J/ψ$ events recorded by the {BESIII} detector at the {BEPCII} storage ring. The branching fractions of the two decays $J/ψ\to p \bar{p} η(η\to γγ)$ and $J/ψ\to p \bar{p} η(η\to π^+ π^- π^0)$ are measured individually to be…
▽ More
A high precision measurement of the branching fraction of the decay $J/ψ\to p \bar{p} η$ is performed using $(10 087 \pm 44) \times 10^6$ $J/ψ$ events recorded by the {BESIII} detector at the {BEPCII} storage ring. The branching fractions of the two decays $J/ψ\to p \bar{p} η(η\to γγ)$ and $J/ψ\to p \bar{p} η(η\to π^+ π^- π^0)$ are measured individually to be $\mathcal{B}(J/ψ\to p \bar{p} η(η\to γγ)) = (1.480 \pm 0.001 \pm 0.024)\times\,10^{-3}$ and $\mathcal{B}(J/ψ\to p \bar{p} η(η\to π^+ π^- π^0)) = (1.557 \pm 0.003 \pm 0.038)\times\,10^{-3}$, where the first uncertainties are statistical and the second systematic. Both results are compatible within their uncorrelated systematic uncertainties. The combined result is $\mathcal{B}(J/ψ\to p \bar{p} η)=(1.495 \pm 0.001 \pm 0.023)\times\,10^{-3}$ where the first uncertainty is the combined statistical uncertainty and the second one the combined systematic uncertainty of both analyses, incorporating correlations between them. In addition, the $p \bar{p}$ threshold region is investigated for a potential threshold enhancement, and no evidence for one is observed.
△ Less
Submitted 3 July, 2024;
originally announced July 2024.
-
Inducing superconductivity in quantum anomalous Hall regime
Authors:
Yu Huang,
Yu Fu,
Peng Zhang,
Kang L. Wang,
Qing Lin He
Abstract:
Interfacing the quantum anomalous Hall insulator with a conventional superconductor is known to be a promising manner for realizing a topological superconductor, which has been continuously pursued for years. Such a proximity route depends to a great extent on the control of the delicate interfacial coupling of the two constituents. However, a recent experiment reported the failure to reproduce su…
▽ More
Interfacing the quantum anomalous Hall insulator with a conventional superconductor is known to be a promising manner for realizing a topological superconductor, which has been continuously pursued for years. Such a proximity route depends to a great extent on the control of the delicate interfacial coupling of the two constituents. However, a recent experiment reported the failure to reproduce such a topological superconductor, which is ascribed to the negligence of the electrical short by the superconductor in the theoretical proposal. Here, we reproduce this topological superconductor with attention to the interface control. The resulted conductance matrix under a wide magnetic field range agrees with the fingerprint of this topological superconductor. This allows us to develop a phase diagram that unveils three regions parameterized by various coupling limits, which not only supports the feasibility to fabricate the topological superconductor by proximity but also fully explains the origin of the previous debate. The present work provides a comprehensible guide on fabricating the topological superconductor.
△ Less
Submitted 2 July, 2024;
originally announced July 2024.
-
HC-GLAD: Dual Hyperbolic Contrastive Learning for Unsupervised Graph-Level Anomaly Detection
Authors:
Yali Fu,
Jindong Li,
Jiahong Liu,
Qianli Xing,
Qi Wang,
Irwin King
Abstract:
Unsupervised graph-level anomaly detection (UGAD) has garnered increasing attention in recent years due to its significance. However, most existing methods only rely on traditional graph neural networks to explore pairwise relationships but such kind of pairwise edges are not enough to describe multifaceted relationships involving anomaly. There is an emergency need to exploit node group informati…
▽ More
Unsupervised graph-level anomaly detection (UGAD) has garnered increasing attention in recent years due to its significance. However, most existing methods only rely on traditional graph neural networks to explore pairwise relationships but such kind of pairwise edges are not enough to describe multifaceted relationships involving anomaly. There is an emergency need to exploit node group information which plays a crucial role in UGAD. In addition, most previous works ignore the global underlying properties (e.g., hierarchy and power-law structure) which are common in real-world graph datasets and therefore are indispensable factors on UGAD task. In this paper, we propose a novel Dual Hyperbolic Contrastive Learning for Unsupervised Graph-Level Anomaly Detection (HC-GLAD in short). To exploit node group connections, we construct hypergraphs based on gold motifs and subsequently perform hypergraph convolution. Furthermore, to preserve the hierarchy of real-world graphs, we introduce hyperbolic geometry into this field and conduct both graph and hypergraph embedding learning in hyperbolic space with hyperboloid model. To the best of our knowledge, this is the first work to simultaneously apply hypergraph with node group connections and hyperbolic geometry into this field. Extensive experiments on several real world datasets of different fields demonstrate the superiority of HC-GLAD on UGAD task. The code is available at https://github.com/Yali-F/HC-GLAD.
△ Less
Submitted 2 July, 2024;
originally announced July 2024.
-
Investigating the Effects of Large-Scale Pseudo-Stereo Data and Different Speech Foundation Model on Dialogue Generative Spoken Language Model
Authors:
Yu-Kuan Fu,
Cheng-Kuang Lee,
Hsiu-Hsuan Wang,
Hung-yi Lee
Abstract:
Recent efforts in Spoken Dialogue Modeling aim to synthesize spoken dialogue without the need for direct transcription, thereby preserving the wealth of non-textual information inherent in speech. However, this approach faces a challenge when speakers talk simultaneously, requiring stereo dialogue data with speakers recorded on separate channels, a notably scarce resource. To address this, we have…
▽ More
Recent efforts in Spoken Dialogue Modeling aim to synthesize spoken dialogue without the need for direct transcription, thereby preserving the wealth of non-textual information inherent in speech. However, this approach faces a challenge when speakers talk simultaneously, requiring stereo dialogue data with speakers recorded on separate channels, a notably scarce resource. To address this, we have developed an innovative pipeline capable of transforming single-channel dialogue data into pseudo-stereo data. This expanded our training dataset from a mere 2,000 to an impressive 17,600 hours, significantly enriching the diversity and quality of the training examples available. The inclusion of this pseudo-stereo data has proven to be effective in improving the performance of spoken dialogue language models. Additionally, we explored the use of discrete units of different speech foundation models for spoken dialogue generation.
△ Less
Submitted 1 July, 2024;
originally announced July 2024.
-
MG-Verilog: Multi-grained Dataset Towards Enhanced LLM-assisted Verilog Generation
Authors:
Yongan Zhang,
Zhongzhi Yu,
Yonggan Fu,
Cheng Wan,
Yingyan Celine Lin
Abstract:
Large Language Models (LLMs) have recently shown promise in streamlining hardware design processes by encapsulating vast amounts of domain-specific data. In addition, they allow users to interact with the design processes through natural language instructions, thus making hardware design more accessible to developers. However, effectively leveraging LLMs in hardware design necessitates providing d…
▽ More
Large Language Models (LLMs) have recently shown promise in streamlining hardware design processes by encapsulating vast amounts of domain-specific data. In addition, they allow users to interact with the design processes through natural language instructions, thus making hardware design more accessible to developers. However, effectively leveraging LLMs in hardware design necessitates providing domain-specific data during inference (e.g., through in-context learning), fine-tuning, or pre-training. Unfortunately, existing publicly available hardware datasets are often limited in size, complexity, or detail, which hinders the effectiveness of LLMs in hardware design tasks. To address this issue, we first propose a set of criteria for creating high-quality hardware datasets that can effectively enhance LLM-assisted hardware design. Based on these criteria, we propose a Multi-Grained-Verilog (MG-Verilog) dataset, which encompasses descriptions at various levels of detail and corresponding code samples. To benefit the broader hardware design community, we have developed an open-source infrastructure that facilitates easy access, integration, and extension of the dataset to meet specific project needs. Furthermore, to fully exploit the potential of the MG-Verilog dataset, which varies in complexity and detail, we introduce a balanced fine-tuning scheme. This scheme serves as a unique use case to leverage the diverse levels of detail provided by the dataset. Extensive experiments demonstrate that the proposed dataset and fine-tuning scheme consistently improve the performance of LLMs in hardware design tasks.
△ Less
Submitted 3 July, 2024; v1 submitted 1 July, 2024;
originally announced July 2024.
-
Data on the Move: Traffic-Oriented Data Trading Platform Powered by AI Agent with Common Sense
Authors:
Yi Yu,
Shengyue Yao,
Tianchen Zhou,
Yexuan Fu,
Jingru Yu,
Ding Wang,
Xuhong Wang,
Cen Chen,
Yilun Lin
Abstract:
In the digital era, data has become a pivotal asset, advancing technologies such as autonomous driving. Despite this, data trading faces challenges like the absence of robust pricing methods and the lack of trustworthy trading mechanisms. To address these challenges, we introduce a traffic-oriented data trading platform named Data on The Move (DTM), integrating traffic simulation, data trading, an…
▽ More
In the digital era, data has become a pivotal asset, advancing technologies such as autonomous driving. Despite this, data trading faces challenges like the absence of robust pricing methods and the lack of trustworthy trading mechanisms. To address these challenges, we introduce a traffic-oriented data trading platform named Data on The Move (DTM), integrating traffic simulation, data trading, and Artificial Intelligent (AI) agents. The DTM platform supports evident-based data value evaluation and AI-based trading mechanisms. Leveraging the common sense capabilities of Large Language Models (LLMs) to assess traffic state and data value, DTM can determine reasonable traffic data pricing through multi-round interaction and simulations. Moreover, DTM provides a pricing method validation by simulating traffic systems, multi-agent interactions, and the heterogeneity and irrational behaviors of individuals in the trading market. Within the DTM platform, entities such as connected vehicles and traffic light controllers could engage in information collecting, data pricing, trading, and decision-making. Simulation results demonstrate that our proposed AI agent-based pricing approach enhances data trading by offering rational prices, as evidenced by the observed improvement in traffic efficiency. This underscores the effectiveness and practical value of DTM, offering new perspectives for the evolution of data markets and smart cities. To the best of our knowledge, this is the first study employing LLMs in data pricing and a pioneering data trading practice in the field of intelligent vehicles and smart cities.
△ Less
Submitted 1 July, 2024;
originally announced July 2024.
-
Multi-field quantum conferencing overcomes the network capacity limit
Authors:
Yuan-Mei Xie,
Yu-Shuo Lu,
Yao Fu,
Hua-Lei Yin,
Zeng-Bing Chen
Abstract:
Quantum conferencing enables multiple nodes within a quantum network to share a secure group key for private message broadcasting. The key rate, however, is limited by the repeaterless capacity to distribute multiparticle entangled states across the network. Currently, in the finite-size regime, no feasible schemes utilizing existing experimental techniques can overcome the fundamental rate-distan…
▽ More
Quantum conferencing enables multiple nodes within a quantum network to share a secure group key for private message broadcasting. The key rate, however, is limited by the repeaterless capacity to distribute multiparticle entangled states across the network. Currently, in the finite-size regime, no feasible schemes utilizing existing experimental techniques can overcome the fundamental rate-distance limit of quantum conferencing in quantum networks without repeaters. Here, we propose a practical, multi-field scheme that breaks this limit, involving virtually establishing Greenberger-Horne-Zeilinger states through post-measurement coincidence matching. This proposal features a measurement-device-independent characteristic and can directly scale to support any number of users. Simulations show that the fundamental limitation on the group key rate can be overcome in a reasonable running time of sending $10^{14}$ pulses. We predict that it offers an efficient design for long-distance broadcast communication in future quantum networks.
△ Less
Submitted 30 June, 2024;
originally announced July 2024.
-
Frequency-resolved Raman Thermometry Analysis via a Multi-layer Heat Transfer Model for Bulk and Low-dimensional Materials
Authors:
Taocheng Yu,
Yilu Fu,
Chenguang Fu,
Tiejun Zhu,
Wee-Liat Ong
Abstract:
Raman thermometry is advantageous for measuring the thermal transport of low-dimensional materials due to its non-contact nature. Transient Raman methods have improved the accuracy of steady-state Raman thermometry by removing the need for accurate temperature calibration and laser absorption evaluation. However, current methods often resort to finite element analysis (FEA) to decipher the measure…
▽ More
Raman thermometry is advantageous for measuring the thermal transport of low-dimensional materials due to its non-contact nature. Transient Raman methods have improved the accuracy of steady-state Raman thermometry by removing the need for accurate temperature calibration and laser absorption evaluation. However, current methods often resort to finite element analysis (FEA) to decipher the measured signals. This step is time-consuming and impedes its ubiquitous adaptation. In this work, we replace the FEA by fitting the transient-state Raman signal to a three-dimensional (3D) analytical heat transfer model for measuring the thermal conductivity of two bulk layered materials [i.e., molybdenum disulfide (MoS2) and bismuth selenide (Bi2Se3) crystals] and the interfacial thermal conductance (h) of CVD-grown MoS2 and molybdenum di-selenide (MoSe2) on quartz (SiO2). Our measured results agree reasonably well with literature and theoretical calculations. We also performed a quantitative sensitivity analysis to give insights on how to improve the measurement sensitivity. Our work provides an efficient way to process the data of transient-based Raman thermometry for high throughput measurements.
△ Less
Submitted 30 June, 2024;
originally announced July 2024.
-
Imaging of single barium atoms in a second matrix site in solid xenon for barium tagging in a $^{136}$Xe double beta decay experiment
Authors:
M. Yvaine,
D. Fairbank,
J. Soderstrom,
C. Taylor,
J. Stanley,
T. Walton,
C. Chambers,
A. Iverson,
W. Fairbank,
S. Al Kharusi,
A. Amy,
E. Angelico,
A. Anker,
I. J. Arnquist,
A. Atencio,
J. Bane,
V. Belov,
E. P. Bernard,
T. Bhatta,
A. Bolotnikov,
J. Breslin,
P. A. Breur,
J. P. Brodsky,
E. Brown,
T. Brunner
, et al. (112 additional authors not shown)
Abstract:
Neutrinoless double beta decay is one of the most sensitive probes for new physics beyond the Standard Model of particle physics. One of the isotopes under investigation is $^{136}$Xe, which would double beta decay into $^{136}$Ba. Detecting the single $^{136}$Ba daughter provides a sort of ultimate tool in the discrimination against backgrounds. Previous work demonstrated the ability to perform s…
▽ More
Neutrinoless double beta decay is one of the most sensitive probes for new physics beyond the Standard Model of particle physics. One of the isotopes under investigation is $^{136}$Xe, which would double beta decay into $^{136}$Ba. Detecting the single $^{136}$Ba daughter provides a sort of ultimate tool in the discrimination against backgrounds. Previous work demonstrated the ability to perform single atom imaging of Ba atoms in a single-vacancy site of a solid xenon matrix. In this paper, the effort to identify signal from individual barium atoms is extended to Ba atoms in a hexa-vacancy site in the matrix and is achieved despite increased photobleaching in this site. Abrupt fluorescence turn-off of a single Ba atom is also observed. Significant recovery of fluorescence signal lost through photobleaching is demonstrated upon annealing of Ba deposits in the Xe ice. Following annealing, it is observed that Ba atoms in the hexa-vacancy site exhibit antibleaching while Ba atoms in the tetra-vacancy site exhibit bleaching. This may be evidence for a matrix site transfer upon laser excitation. Our findings offer a path of continued research toward tagging of Ba daughters in all significant sites in solid xenon.
△ Less
Submitted 28 June, 2024;
originally announced July 2024.
-
Improved measurement of the semileptonic decay $D^+_{s}\to K^0 e^+ν_e$
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (643 additional authors not shown)
Abstract:
Analyzing $e^+e^-$ collision data corresponding to an integrated luminosity of $7.33~\mathrm{fb}^{-1}$ collected at center-of-mass energies between 4.128 and 4.226~GeV with the BESIII detector, we measure the branching fraction of the semileptonic decay $D^+_{s}\to K^0 e^+ν_e$ to be $(2.98\pm0.23\pm0.12)\times10^{-3}$. The $D_s^+\to K^0$ hadronic form factor is determined from the differential dec…
▽ More
Analyzing $e^+e^-$ collision data corresponding to an integrated luminosity of $7.33~\mathrm{fb}^{-1}$ collected at center-of-mass energies between 4.128 and 4.226~GeV with the BESIII detector, we measure the branching fraction of the semileptonic decay $D^+_{s}\to K^0 e^+ν_e$ to be $(2.98\pm0.23\pm0.12)\times10^{-3}$. The $D_s^+\to K^0$ hadronic form factor is determined from the differential decay rate of $D^+_s\to K^0 e^+ν_e$ to be $f^{K^0}_+(0)=0.636\pm0.049\pm0.013$. For both measurements, the first uncertainty is statistical and the second systematic. The branching fraction and form factor measurements are factors of 1.6 and 1.7 more precise than the previous world averages, respectively.
△ Less
Submitted 27 June, 2024;
originally announced June 2024.
-
FFN: a Fine-grained Chinese-English Financial Domain Parallel Corpus
Authors:
Yuxin Fu,
Shijing Si,
Leyi Mai,
Xi-ang Li
Abstract:
Large Language Models (LLMs) have stunningly advanced the field of machine translation, though their effectiveness within the financial domain remains largely underexplored. To probe this issue, we constructed a fine-grained Chinese-English parallel corpus of financial news called FFN. We acquired financial news articles spanning between January 1st, 2014, to December 31, 2023, from mainstream med…
▽ More
Large Language Models (LLMs) have stunningly advanced the field of machine translation, though their effectiveness within the financial domain remains largely underexplored. To probe this issue, we constructed a fine-grained Chinese-English parallel corpus of financial news called FFN. We acquired financial news articles spanning between January 1st, 2014, to December 31, 2023, from mainstream media websites such as CNN, FOX, and China Daily. The dataset consists of 1,013 main text and 809 titles, all of which have been manually corrected. We measured the translation quality of two LLMs -- ChatGPT and ERNIE-bot, utilizing BLEU, TER and chrF scores as the evaluation metrics. For comparison, we also trained an OpenNMT model based on our dataset. We detail problems of LLMs and provide in-depth analysis, intending to stimulate further research and solutions in this largely uncharted territory. Our research underlines the need to optimize LLMs within the specific field of financial translation to ensure accuracy and quality.
△ Less
Submitted 26 June, 2024;
originally announced June 2024.
-
Microscopic characteristics of SF6 partial discharge induced by a floating linear metal particle
Authors:
Zihao Feng,
Yuanyuan Jiang,
Liyang Zhang,
Zhigang Liu,
Kai Wang,
Xinxin Wang,
Xiaobing Zou,
Haiyun Luo,
Yangyang Fu
Abstract:
Direct current (DC) gas insulated transmission lines (GILs) have been widely used in power transmission, but might be threatened by partial discharge due to the presence of floating impurities (e.g., dust and metal particles) inside the sealed chamber. In this letter, by using a 2D fluid model we characterize the microscopic properties of the partial discharge induced by a floating linear metal pa…
▽ More
Direct current (DC) gas insulated transmission lines (GILs) have been widely used in power transmission, but might be threatened by partial discharge due to the presence of floating impurities (e.g., dust and metal particles) inside the sealed chamber. In this letter, by using a 2D fluid model we characterize the microscopic properties of the partial discharge induced by a floating linear metal particle in SF6 (both the discharge propagation and interaction between space charge and metal particle) under negative high voltage direct current (HVDC) conditions. Due to the strong electronegativity of SF6, the spatiotemporal distributions of the charged species (electrons, positive and negative ions), space charge, and reduced electric field are rather different from those in air. Notably, a negative ion region is observed around the top tip of the metal particle, and it plays an important role in the generation and propagation of primary and secondary streamers in SF6, which may lead to severe motion characteristics of the particle and aliasing of partial discharge signals. Additionally, we analyze the charging process and electric force reversal phenomenon, which may provide a more precise understanding of the underlying mechanisms of the firefly motion previously reported for DC GILs.
△ Less
Submitted 26 June, 2024;
originally announced June 2024.
-
EFCNet: Every Feature Counts for Small Medical Object Segmentation
Authors:
Lingjie Kong,
Qiaoling Wei,
Chengming Xu,
Han Chen,
Yanwei Fu
Abstract:
This paper explores the segmentation of very small medical objects with significant clinical value. While Convolutional Neural Networks (CNNs), particularly UNet-like models, and recent Transformers have shown substantial progress in image segmentation, our empirical findings reveal their poor performance in segmenting the small medical objects and lesions concerned in this paper. This limitation…
▽ More
This paper explores the segmentation of very small medical objects with significant clinical value. While Convolutional Neural Networks (CNNs), particularly UNet-like models, and recent Transformers have shown substantial progress in image segmentation, our empirical findings reveal their poor performance in segmenting the small medical objects and lesions concerned in this paper. This limitation may be attributed to information loss during their encoding and decoding process. In response to this challenge, we propose a novel model named EFCNet for small object segmentation in medical images. Our model incorporates two modules: the Cross-Stage Axial Attention Module (CSAA) and the Multi-Precision Supervision Module (MPS). These modules address information loss during encoding and decoding procedures, respectively. Specifically, CSAA integrates features from all stages of the encoder to adaptively learn suitable information needed in different decoding stages, thereby reducing information loss in the encoder. On the other hand, MPS introduces a novel multi-precision supervision mechanism to the decoder. This mechanism prioritizes attention to low-resolution features in the initial stages of the decoder, mitigating information loss caused by subsequent convolution and sampling processes and enhancing the model's global perception. We evaluate our model on two benchmark medical image datasets. The results demonstrate that EFCNet significantly outperforms previous segmentation methods designed for both medical and normal images.
△ Less
Submitted 26 June, 2024;
originally announced June 2024.
-
Measurement of the cross sections of $e^+e^-\to K^{-}\barΞ^{+}Λ/Σ^{0}$ at center-of-mass energies between 3.510 and 4.914 GeV
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (638 additional authors not shown)
Abstract:
Using $e^+e^-$ collision data collected with the BESIII detector at the BEPCII collider at center-of-mass energies between 3.510 and 4.914GeV, corresponding to an integrated luminosity of 25 fb$^{-1}$, we measure the Born cross sections for the process $e^+e^-\to K^-\barΞ^+Λ/Σ^{0}$ at thirty-five energy points with a partial-reconstruction strategy. By fitting the dressed cross sections of…
▽ More
Using $e^+e^-$ collision data collected with the BESIII detector at the BEPCII collider at center-of-mass energies between 3.510 and 4.914GeV, corresponding to an integrated luminosity of 25 fb$^{-1}$, we measure the Born cross sections for the process $e^+e^-\to K^-\barΞ^+Λ/Σ^{0}$ at thirty-five energy points with a partial-reconstruction strategy. By fitting the dressed cross sections of $e^+e^-\to K^-\barΞ^+Λ/Σ^{0}$, evidence for $ψ(4160) \to K^{-}\barΞ^{+}Λ$ is found for the first time with a significance of 4.4$σ$, including systematic uncertainties. No evidence for other possible resonances is found. In addition, the products of electronic partial width and branching fraction for all assumed resonances decaying into $K^{-}\barΞ^{+}Λ/Σ^{0}$ are determined.
△ Less
Submitted 26 June, 2024;
originally announced June 2024.
-
Measurements of $K_S^0$-$K_L^0$ asymmetries in the decays $Λ_c^+ \to pK_{L,S}^0$, $pK_{L,S}^0π^+π^-$ and $pK_{L,S}^0π^0$
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (643 additional authors not shown)
Abstract:
Using $e^+e^-$ annihilation data sets corresponding to an integrated luminosity of 4.5 $\text{fb}^{-1}$, collected with the BESIII detector at center-of-mass energies between 4.600 and 4.699 GeV, we report the first measurements of the absolute branching fractions $\mathcal{B}(Λ_c^+\to pK_{L}^{0})=(1.67 \pm 0.06 \pm 0. 04)\%$, $\mathcal{B}(Λ_c^+\to pK_{L}^{0}π^+π^-)=(1.69 \pm 0.10 \pm 0.05)\%$, an…
▽ More
Using $e^+e^-$ annihilation data sets corresponding to an integrated luminosity of 4.5 $\text{fb}^{-1}$, collected with the BESIII detector at center-of-mass energies between 4.600 and 4.699 GeV, we report the first measurements of the absolute branching fractions $\mathcal{B}(Λ_c^+\to pK_{L}^{0})=(1.67 \pm 0.06 \pm 0. 04)\%$, $\mathcal{B}(Λ_c^+\to pK_{L}^{0}π^+π^-)=(1.69 \pm 0.10 \pm 0.05)\%$, and $\mathcal{B}(Λ_c^+\to pK_{L}^{0}π^0)=(2.02 \pm 0.13 \pm 0.05)\%$, where the first uncertainties are statistical and the second systematic. Combining with the known branching fractions of $Λ_c^+ \to pK_{S}^{0}$, $Λ_c^+ \to pK_{S}^{0}π^+π^-$, and $Λ_c^+ \to pK_{S}^{0}π^0$, we present the first measurements of the $K_{S}^{0}$-$K_{L}^{0}$ asymmetries $R(Λ_c^+, K_{S,L}^0X) = \frac{\mathcal{B}(Λ_c^+ \to K_{S}^{0} X) - \mathcal{B}(Λ_c^+ \to K_{L}^{0} X)}{\mathcal{B}(Λ_c^+ \to K_{S}^{0} X) + \mathcal{B}(Λ_c^+ \to K_{L}^{0} X)}$ in charmed baryon decays: $R(Λ_c^+, pK_{S,L}^0) = -0.025 \pm 0.031$, $R(Λ_c^+, pK_{S,L}^0π^+π^-) = -0.027 \pm 0.048$, and $R(Λ_c^+, pK_{S,L}^0π^0) =-0.015 \pm 0.046$. No significant asymmetries within the uncertainties are observed.
△ Less
Submitted 26 June, 2024;
originally announced June 2024.
-
Study of the $f_{0}(980)$ through the decay $D_{s}^{+}\rightarrow π^{+}π^{+}π^{-}π^{0}$
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (649 additional authors not shown)
Abstract:
We perform the first amplitude analysis of $D^+_s \to π^+π^+π^-π^0$ decays, based on data samples of electron-positron collisions recorded with the BESIII detector at center-of-mass energies between 4.128 and 4.226 GeV, corresponding to an integrated luminosity of 7.33~fb$^{-1}$. We report the observation of $D_{s}^{+} \to f_0(980)ρ(770)^{+}$ with a statistical significance greater than 10$σ$ and…
▽ More
We perform the first amplitude analysis of $D^+_s \to π^+π^+π^-π^0$ decays, based on data samples of electron-positron collisions recorded with the BESIII detector at center-of-mass energies between 4.128 and 4.226 GeV, corresponding to an integrated luminosity of 7.33~fb$^{-1}$. We report the observation of $D_{s}^{+} \to f_0(980)ρ(770)^{+}$ with a statistical significance greater than 10$σ$ and determine the branching fractions $\mathcal{B}(D_s^+\toπ^+π^+π^-π^0|_{{\rm non}-η})=(2.04\pm0.08_{\rm stat.}\pm0.05_{\rm syst.})\%$ and $\mathcal{B}(D_s^+\toηπ^+)=(1.56\pm0.09_{\rm stat.}\pm0.04_{\rm syst.})\%$. Moreover, we measure the relative branching fraction between $φ\toπ^+π^-π^0$ and $φ\to K^+K^-$ to be $\frac{\mathcal{B}(φ(1020) \to π^+π^-π^0)}{\mathcal{B}(φ(1020) \to K^+K^-)}=0.230 \pm 0.014_{\rm stat.} \pm 0.010_{\rm syst.}$, which deviates from the world average value by more than $4σ$.
△ Less
Submitted 25 June, 2024;
originally announced June 2024.
-
Efficient source-independent quantum conference key agreement
Authors:
Yu Bao,
Yi-Ran Xiao,
Yu-Chen Song,
Yao Fu,
Xiao-Yu Cao,
Hua-Lei Yin,
Zeng-Bing Chen
Abstract:
Quantum conference key agreement (QCKA) enables the unconditional secure distribution of conference keys among multiple participants. Due to challenges in high-fidelity preparation and long-distance distribution of multi-photon entanglement, entanglement-based QCKA is facing severe limitations in both key rate and scalability. Here, we propose a source-independent QCKA scheme utilizing the post-ma…
▽ More
Quantum conference key agreement (QCKA) enables the unconditional secure distribution of conference keys among multiple participants. Due to challenges in high-fidelity preparation and long-distance distribution of multi-photon entanglement, entanglement-based QCKA is facing severe limitations in both key rate and scalability. Here, we propose a source-independent QCKA scheme utilizing the post-matching method, feasible within the entangled photon pair distribution network. We introduce an equivalent distributing virtual multi-photon entanglement protocol for providing the unconditional security proof even in the case of coherent attacks. For the symmetry star-network, comparing with previous $n$-photon entanglement protocol, the conference key rate is improved from $O(η^{n})$ to $O(η^{2})$, where $η$ is the transmittance from the entanglement source to one participant. Simulation results show that the performance of our protocol has multiple orders of magnitude advantages in the intercity distance. We anticipate that our approach will demonstrate its potential in the implementation of quantum networks.
△ Less
Submitted 25 June, 2024;
originally announced June 2024.
-
TRAWL: Tensor Reduced and Approximated Weights for Large Language Models
Authors:
Yiran Luo,
Het Patel,
Yu Fu,
Dawon Ahn,
Jia Chen,
Yue Dong,
Evangelos E. Papalexakis
Abstract:
Large language models (LLMs) have fundamentally transformed artificial intelligence, catalyzing recent advancements while imposing substantial environmental and computational burdens. We introduce TRAWL (Tensor Reduced and Approximated Weights for Large Language Models), a novel methodology for optimizing LLMs through tensor decomposition. TRAWL leverages diverse strategies to exploit matrices wit…
▽ More
Large language models (LLMs) have fundamentally transformed artificial intelligence, catalyzing recent advancements while imposing substantial environmental and computational burdens. We introduce TRAWL (Tensor Reduced and Approximated Weights for Large Language Models), a novel methodology for optimizing LLMs through tensor decomposition. TRAWL leverages diverse strategies to exploit matrices within transformer-based architectures, realizing notable performance enhancements without necessitating retraining. The most significant improvements were observed through a layer-by-layer intervention strategy, particularly when applied to fully connected weights of the final layers, yielding up to 16% enhancement in accuracy without the need for additional data or fine-tuning. These results underscore the importance of targeted and adaptive techniques in increasing the efficiency and effectiveness of large language model optimization, thereby promoting the development of more sustainable and accessible AI systems.
△ Less
Submitted 25 June, 2024;
originally announced June 2024.
-
Make Graph Neural Networks Great Again: A Generic Integration Paradigm of Topology-Free Patterns for Traffic Speed Prediction
Authors:
Yicheng Zhou,
Pengfei Wang,
Hao Dong,
Denghui Zhang,
Dingqi Yang,
Yanjie Fu,
Pengyang Wang
Abstract:
Urban traffic speed prediction aims to estimate the future traffic speed for improving urban transportation services. Enormous efforts have been made to exploit Graph Neural Networks (GNNs) for modeling spatial correlations and temporal dependencies of traffic speed evolving patterns, regularized by graph topology.While achieving promising results, current traffic speed prediction methods still su…
▽ More
Urban traffic speed prediction aims to estimate the future traffic speed for improving urban transportation services. Enormous efforts have been made to exploit Graph Neural Networks (GNNs) for modeling spatial correlations and temporal dependencies of traffic speed evolving patterns, regularized by graph topology.While achieving promising results, current traffic speed prediction methods still suffer from ignoring topology-free patterns, which cannot be captured by GNNs. To tackle this challenge, we propose a generic model for enabling the current GNN-based methods to preserve topology-free patterns. Specifically, we first develop a Dual Cross-Scale Transformer (DCST) architecture, including a Spatial Transformer and a Temporal Transformer, to preserve the cross-scale topology-free patterns and associated dynamics, respectively. Then, to further integrate both topology-regularized/-free patterns, we propose a distillation-style learning framework, in which the existing GNN-based methods are considered as the teacher model, and the proposed DCST architecture is considered as the student model. The teacher model would inject the learned topology-regularized patterns into the student model for integrating topology-free patterns. The extensive experimental results demonstrated the effectiveness of our methods.
△ Less
Submitted 24 June, 2024;
originally announced June 2024.
-
Odd Dipole Screening in Radial Inflation
Authors:
Yang Fu,
H. George E. Hentschel,
Pawandeep Kaur,
Avanish Kumar,
Itamar Procaccia
Abstract:
The inflation of an inner radial (or spherical) cavity in an amorphous solids confined in a disk (or a sphere), served as a fruitful case model for studying the effects of plastic deformations on the mechanical response. It was shown that when the field associated with Eshelby quadrupolar charges is non-uniform, the displacement field is riddled with dipole charges that screen elasticity, reminisc…
▽ More
The inflation of an inner radial (or spherical) cavity in an amorphous solids confined in a disk (or a sphere), served as a fruitful case model for studying the effects of plastic deformations on the mechanical response. It was shown that when the field associated with Eshelby quadrupolar charges is non-uniform, the displacement field is riddled with dipole charges that screen elasticity, reminiscent of Debye monopoles screening in electrostatics. In this paper we look deeper into the screening phenomenon, taking into account the consequences of irreversibility that are associated with the breaking of Chiral symmetry. We consider the equations for the displacement field with the presence of "Odd Dipole Screening", solve them analytically and compare with numerical simulations. Suggestions how to test the theory in experiments are provided.
△ Less
Submitted 27 June, 2024; v1 submitted 23 June, 2024;
originally announced June 2024.
-
Repeater-Like Asynchronous Measurement-Device-Independent Quantum Conference Key Agreement
Authors:
Yu-Shuo Lu,
Yuan-Mei Xie,
Yao Fu,
Hua-Lei Yin,
Zeng-Bing Chen
Abstract:
Quantum conference key agreement facilitates secure communication among multiple parties through multipartite entanglement and is anticipated to be an important cryptographic primitive for future quantum networks. However, the experimental complexity and low efficiency associated with the synchronous detection of multipartite entangled states have significantly hindered their practical application…
▽ More
Quantum conference key agreement facilitates secure communication among multiple parties through multipartite entanglement and is anticipated to be an important cryptographic primitive for future quantum networks. However, the experimental complexity and low efficiency associated with the synchronous detection of multipartite entangled states have significantly hindered their practical application. In this work, we propose a measurement-device-independent conference key agreement protocol that utilizes asynchronous Greenberger-Horne-Zeilinger state measurement.This approach achieves a linear scaling of the conference key rate among multiple parties, exhibiting performance similar to that of the single-repeater scheme in quantum networks. The asynchronous measurement strategy bypasses the need for complex global phase locking technologies, concurrently extending the intercity transmission distance with composable security in the finite key regime. Additionally, our work also showcases the advantages of the asynchronous pairing concept in multiparty quantum entanglement.
△ Less
Submitted 22 June, 2024;
originally announced June 2024.
-
Unveiling and Harnessing Hidden Attention Sinks: Enhancing Large Language Models without Training through Attention Calibration
Authors:
Zhongzhi Yu,
Zheng Wang,
Yonggan Fu,
Huihong Shi,
Khalid Shaikh,
Yingyan Celine Lin
Abstract:
Attention is a fundamental component behind the remarkable achievements of large language models (LLMs). However, our current understanding of the attention mechanism, especially regarding how attention distributions are established, remains limited. Inspired by recent studies that explore the presence of attention sink in the initial token, which receives disproportionately large attention scores…
▽ More
Attention is a fundamental component behind the remarkable achievements of large language models (LLMs). However, our current understanding of the attention mechanism, especially regarding how attention distributions are established, remains limited. Inspired by recent studies that explore the presence of attention sink in the initial token, which receives disproportionately large attention scores despite their lack of semantic importance, this work delves deeper into this phenomenon. We aim to provide a more profound understanding of the existence of attention sinks within LLMs and to uncover ways to enhance the achievable accuracy of LLMs by directly optimizing the attention distributions, without the need for weight finetuning. Specifically, this work begins with comprehensive visualizations of the attention distributions in LLMs during inference across various inputs and tasks. Based on these visualizations, to the best of our knowledge, we are the first to discover that (1) attention sinks occur not only at the start of sequences but also within later tokens of the input, and (2) not all attention sinks have a positive impact on the achievable accuracy of LLMs. Building upon our findings, we propose a training-free Attention Calibration Technique (ACT) that automatically optimizes the attention distributions on the fly during inference in an input-adaptive manner. Extensive experiments validate that ACT consistently enhances the accuracy of various LLMs across different applications. Specifically, ACT achieves an average improvement of up to 7.30% in accuracy across different datasets when applied to Llama-30B. Our code is available at https://github.com/GATECH-EIC/ACT.
△ Less
Submitted 22 June, 2024;
originally announced June 2024.
-
Search for the $e^+e^- \to φχ_{c1}(3872)$ process at BESIII
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (639 additional authors not shown)
Abstract:
Based on 368.5 pb$^{-1}$ of $e^+e^-$ collision data collected at center-of-mass energies 4.914 and 4.946 GeV by the BESIII detector, the $e^+e^- \to φχ_{c1}(3872)$ process is searched for the first time. No significant signal is observed and the upper limits at the 90\% confidence level on the product of the Born cross section $σ(e^+e^- \to φχ_{c1}(3872))$ and the branching fraction…
▽ More
Based on 368.5 pb$^{-1}$ of $e^+e^-$ collision data collected at center-of-mass energies 4.914 and 4.946 GeV by the BESIII detector, the $e^+e^- \to φχ_{c1}(3872)$ process is searched for the first time. No significant signal is observed and the upper limits at the 90\% confidence level on the product of the Born cross section $σ(e^+e^- \to φχ_{c1}(3872))$ and the branching fraction $\mathcal{B}[χ_{c1}(3872)\toπ^+π^- J/ψ]$ at 4.914 and 4.946 GeV are set to be 0.85 and 0.96 pb, respectively. These measurements provide useful information for the production of the $χ_{c1}(3872)$ at $e^+e^-$ collider and deepen our understanding about the nature of this particle.
△ Less
Submitted 21 June, 2024;
originally announced June 2024.
-
Through the Theory of Mind's Eye: Reading Minds with Multimodal Video Large Language Models
Authors:
Zhawnen Chen,
Tianchun Wang,
Yizhou Wang,
Michal Kosinski,
Xiang Zhang,
Yun Fu,
Sheng Li
Abstract:
Can large multimodal models have a human-like ability for emotional and social reasoning, and if so, how does it work? Recent research has discovered emergent theory-of-mind (ToM) reasoning capabilities in large language models (LLMs). LLMs can reason about people's mental states by solving various text-based ToM tasks that ask questions about the actors' ToM (e.g., human belief, desire, intention…
▽ More
Can large multimodal models have a human-like ability for emotional and social reasoning, and if so, how does it work? Recent research has discovered emergent theory-of-mind (ToM) reasoning capabilities in large language models (LLMs). LLMs can reason about people's mental states by solving various text-based ToM tasks that ask questions about the actors' ToM (e.g., human belief, desire, intention). However, human reasoning in the wild is often grounded in dynamic scenes across time. Thus, we consider videos a new medium for examining spatio-temporal ToM reasoning ability. Specifically, we ask explicit probing questions about videos with abundant social and emotional reasoning content. We develop a pipeline for multimodal LLM for ToM reasoning using video and text. We also enable explicit ToM reasoning by retrieving key frames for answering a ToM question, which reveals how multimodal LLMs reason about ToM.
△ Less
Submitted 19 June, 2024;
originally announced June 2024.
-
Jogging the Memory of Unlearned Model Through Targeted Relearning Attack
Authors:
Shengyuan Hu,
Yiwei Fu,
Zhiwei Steven Wu,
Virginia Smith
Abstract:
Machine unlearning is a promising approach to mitigate undesirable memorization of training data in ML models. However, in this work we show that existing approaches for unlearning in LLMs are surprisingly susceptible to a simple set of targeted relearning attacks. With access to only a small and potentially loosely related set of data, we find that we can 'jog' the memory of unlearned models to r…
▽ More
Machine unlearning is a promising approach to mitigate undesirable memorization of training data in ML models. However, in this work we show that existing approaches for unlearning in LLMs are surprisingly susceptible to a simple set of targeted relearning attacks. With access to only a small and potentially loosely related set of data, we find that we can 'jog' the memory of unlearned models to reverse the effects of unlearning. We formalize this unlearning-relearning pipeline, explore the attack across three popular unlearning benchmarks, and discuss future directions and guidelines that result from our study.
△ Less
Submitted 19 June, 2024;
originally announced June 2024.
-
Towards Adaptive Neighborhood for Advancing Temporal Interaction Graph Modeling
Authors:
Siwei Zhang,
Xi Chen,
Yun Xiong,
Xixi Wu,
Yao Zhang,
Yongrui Fu,
Yinglong Zhao,
Jiawei Zhang
Abstract:
Temporal Graph Networks (TGNs) have demonstrated their remarkable performance in modeling temporal interaction graphs. These works can generate temporal node representations by encoding the surrounding neighborhoods for the target node. However, an inherent limitation of existing TGNs is their reliance on fixed, hand-crafted rules for neighborhood encoding, overlooking the necessity for an adaptiv…
▽ More
Temporal Graph Networks (TGNs) have demonstrated their remarkable performance in modeling temporal interaction graphs. These works can generate temporal node representations by encoding the surrounding neighborhoods for the target node. However, an inherent limitation of existing TGNs is their reliance on fixed, hand-crafted rules for neighborhood encoding, overlooking the necessity for an adaptive and learnable neighborhood that can accommodate both personalization and temporal evolution across different timestamps. In this paper, we aim to enhance existing TGNs by introducing an adaptive neighborhood encoding mechanism. We present SEAN, a flexible plug-and-play model that can be seamlessly integrated with existing TGNs, effectively boosting their performance. To achieve this, we decompose the adaptive neighborhood encoding process into two phases: (i) representative neighbor selection, and (ii) temporal-aware neighborhood information aggregation. Specifically, we propose the Representative Neighbor Selector component, which automatically pinpoints the most important neighbors for the target node. It offers a tailored understanding of each node's unique surrounding context, facilitating personalization. Subsequently, we propose a Temporal-aware Aggregator, which synthesizes neighborhood aggregation by selectively determining the utilization of aggregation routes and decaying the outdated information, allowing our model to adaptively leverage both the contextually significant and current information during aggregation. We conduct extensive experiments by integrating SEAN into three representative TGNs, evaluating their performance on four public datasets and one financial benchmark dataset introduced in this paper. The results demonstrate that SEAN consistently leads to performance improvements across all models, achieving SOTA performance and exceptional robustness.
△ Less
Submitted 14 June, 2024;
originally announced June 2024.
-
AnyMaker: Zero-shot General Object Customization via Decoupled Dual-Level ID Injection
Authors:
Lingjie Kong,
Kai Wu,
Xiaobin Hu,
Wenhui Han,
Jinlong Peng,
Chengming Xu,
Donghao Luo,
Jiangning Zhang,
Chengjie Wang,
Yanwei Fu
Abstract:
Text-to-image based object customization, aiming to generate images with the same identity (ID) as objects of interest in accordance with text prompts and reference images, has made significant progress. However, recent customizing research is dominated by specialized tasks, such as human customization or virtual try-on, leaving a gap in general object customization. To this end, we introduce AnyM…
▽ More
Text-to-image based object customization, aiming to generate images with the same identity (ID) as objects of interest in accordance with text prompts and reference images, has made significant progress. However, recent customizing research is dominated by specialized tasks, such as human customization or virtual try-on, leaving a gap in general object customization. To this end, we introduce AnyMaker, an innovative zero-shot object customization framework capable of generating general objects with high ID fidelity and flexible text editability. The efficacy of AnyMaker stems from its novel general ID extraction, dual-level ID injection, and ID-aware decoupling. Specifically, the general ID extraction module extracts sufficient ID information with an ensemble of self-supervised models to tackle the diverse customization tasks for general objects. Then, to provide the diffusion UNet with the extracted ID as much while not damaging the text editability in the generation process, we design a global-local dual-level ID injection module, in which the global-level semantic ID is injected into text descriptions while the local-level ID details are injected directly into the model through newly added cross-attention modules. In addition, we propose an ID-aware decoupling module to disentangle ID-related information from non-ID elements in the extracted representations for high-fidelity generation of both identity and text descriptions. To validate our approach and boost the research of general object customization, we create the first large-scale general ID dataset, Multi-Category ID-Consistent (MC-IDC) dataset, with 315k text-image samples and 10k categories. Experiments show that AnyMaker presents remarkable performance in general object customization and outperforms specialized methods in corresponding tasks. Code and dataset will be released soon.
△ Less
Submitted 5 July, 2024; v1 submitted 17 June, 2024;
originally announced June 2024.
-
Technique Report of CVPR 2024 PBDL Challenges
Authors:
Ying Fu,
Yu Li,
Shaodi You,
Boxin Shi,
Linwei Chen,
Yunhao Zou,
Zichun Wang,
Yichen Li,
Yuze Han,
Yingkai Zhang,
Jianan Wang,
Qinglin Liu,
Wei Yu,
Xiaoqian Lv,
Jianing Li,
Shengping Zhang,
Xiangyang Ji,
Yuanpei Chen,
Yuhan Zhang,
Weihang Peng,
Liwen Zhang,
Zhe Xu,
Dingyong Gou,
Cong Li,
Senyan Xu
, et al. (75 additional authors not shown)
Abstract:
The intersection of physics-based vision and deep learning presents an exciting frontier for advancing computer vision technologies. By leveraging the principles of physics to inform and enhance deep learning models, we can develop more robust and accurate vision systems. Physics-based vision aims to invert the processes to recover scene properties such as shape, reflectance, light distribution, a…
▽ More
The intersection of physics-based vision and deep learning presents an exciting frontier for advancing computer vision technologies. By leveraging the principles of physics to inform and enhance deep learning models, we can develop more robust and accurate vision systems. Physics-based vision aims to invert the processes to recover scene properties such as shape, reflectance, light distribution, and medium properties from images. In recent years, deep learning has shown promising improvements for various vision tasks, and when combined with physics-based vision, these approaches can enhance the robustness and accuracy of vision systems. This technical report summarizes the outcomes of the Physics-Based Vision Meets Deep Learning (PBDL) 2024 challenge, held in CVPR 2024 workshop. The challenge consisted of eight tracks, focusing on Low-Light Enhancement and Detection as well as High Dynamic Range (HDR) Imaging. This report details the objectives, methodologies, and results of each track, highlighting the top-performing solutions and their innovative approaches.
△ Less
Submitted 12 July, 2024; v1 submitted 15 June, 2024;
originally announced June 2024.
-
Radial Projections in $\mathbb{R}^n$ Revisited
Authors:
Paige Bright,
Yuqiu Fu,
Kevin Ren
Abstract:
We generalize the recent results on radial projections by Orponen, Shmerkin, Wang using two different methods. In particular, we show that given $X,Y\subset \mathbb{R}^n$ Borel sets and $X\neq \emptyset$. If $\dim Y \in (k,k+1]$ for some $k\in \{1,\dots, n-1\}$, then \[ \sup_{x\in X} \dim π_x(Y\setminus \{x\}) \geq \min \{\dim X + \dim Y - k, k\}. \] Our results give a new approach to solving a co…
▽ More
We generalize the recent results on radial projections by Orponen, Shmerkin, Wang using two different methods. In particular, we show that given $X,Y\subset \mathbb{R}^n$ Borel sets and $X\neq \emptyset$. If $\dim Y \in (k,k+1]$ for some $k\in \{1,\dots, n-1\}$, then \[ \sup_{x\in X} \dim π_x(Y\setminus \{x\}) \geq \min \{\dim X + \dim Y - k, k\}. \] Our results give a new approach to solving a conjecture of Lund-Pham-Thu in all dimensions and for all ranges of $\dim Y$.
The first of our two methods for proving the above theorem is shorter, utilizing a result of the first author and Gan. Our second method, though longer, follows the original methodology of Orponen--Shmerkin--Wang, and requires a higher dimensional incidence estimate and a dual Furstenberg-set estimate for lines. These new estimates may be of independent interest.
△ Less
Submitted 14 June, 2024;
originally announced June 2024.
-
Search for $X(1870)$ via the decay $J/ψ\to ωK^+ K^-η$
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (644 additional authors not shown)
Abstract:
Using a sample of $(10087\pm 44)\times10^{6}$ $J/ψ$ events collected by the BESIII detector at the BEPCII collider, we search for the decay $X(1870)\to K^+ K^-η$ via the $J/ψ\to ωK^+ K^- η$ process for the first time. No significant $X(1870)$ signal is observed. The upper limit on the branching fraction of the decay $ J/ψ\to ωX(1870) \toωK^+ K^- η$ is determined to be $9.55\times 10^{-7}$ at the…
▽ More
Using a sample of $(10087\pm 44)\times10^{6}$ $J/ψ$ events collected by the BESIII detector at the BEPCII collider, we search for the decay $X(1870)\to K^+ K^-η$ via the $J/ψ\to ωK^+ K^- η$ process for the first time. No significant $X(1870)$ signal is observed. The upper limit on the branching fraction of the decay $ J/ψ\to ωX(1870) \toωK^+ K^- η$ is determined to be $9.55\times 10^{-7}$ at the $90\%$ confidence level. In addition, the branching faction $B(J/ψ\toωK^+ K^- η)$ is measured to be $(3.33\pm0.02(\rm{stat.})\pm 0.12(\rm{syst.}))\times 10^{-4}$.
△ Less
Submitted 13 June, 2024;
originally announced June 2024.
-
Adaptive Slot Attention: Object Discovery with Dynamic Slot Number
Authors:
Ke Fan,
Zechen Bai,
Tianjun Xiao,
Tong He,
Max Horn,
Yanwei Fu,
Francesco Locatello,
Zheng Zhang
Abstract:
Object-centric learning (OCL) extracts the representation of objects with slots, offering an exceptional blend of flexibility and interpretability for abstracting low-level perceptual features. A widely adopted method within OCL is slot attention, which utilizes attention mechanisms to iteratively refine slot representations. However, a major drawback of most object-centric models, including slot…
▽ More
Object-centric learning (OCL) extracts the representation of objects with slots, offering an exceptional blend of flexibility and interpretability for abstracting low-level perceptual features. A widely adopted method within OCL is slot attention, which utilizes attention mechanisms to iteratively refine slot representations. However, a major drawback of most object-centric models, including slot attention, is their reliance on predefining the number of slots. This not only necessitates prior knowledge of the dataset but also overlooks the inherent variability in the number of objects present in each instance. To overcome this fundamental limitation, we present a novel complexity-aware object auto-encoder framework. Within this framework, we introduce an adaptive slot attention (AdaSlot) mechanism that dynamically determines the optimal number of slots based on the content of the data. This is achieved by proposing a discrete slot sampling module that is responsible for selecting an appropriate number of slots from a candidate list. Furthermore, we introduce a masked slot decoder that suppresses unselected slots during the decoding process. Our framework, tested extensively on object discovery tasks with various datasets, shows performance matching or exceeding top fixed-slot models. Moreover, our analysis substantiates that our method exhibits the capability to dynamically adapt the slot number according to each instance's complexity, offering the potential for further exploration in slot attention research. Project will be available at https://kfan21.github.io/AdaSlot/
△ Less
Submitted 13 June, 2024;
originally announced June 2024.
-
Blind Super-Resolution via Meta-learning and Markov Chain Monte Carlo Simulation
Authors:
Jingyuan Xia,
Zhixiong Yang,
Shengxi Li,
Shuanghui Zhang,
Yaowen Fu,
Deniz Gündüz,
Xiang Li
Abstract:
Learning-based approaches have witnessed great successes in blind single image super-resolution (SISR) tasks, however, handcrafted kernel priors and learning based kernel priors are typically required. In this paper, we propose a Meta-learning and Markov Chain Monte Carlo (MCMC) based SISR approach to learn kernel priors from organized randomness. In concrete, a lightweight network is adopted as k…
▽ More
Learning-based approaches have witnessed great successes in blind single image super-resolution (SISR) tasks, however, handcrafted kernel priors and learning based kernel priors are typically required. In this paper, we propose a Meta-learning and Markov Chain Monte Carlo (MCMC) based SISR approach to learn kernel priors from organized randomness. In concrete, a lightweight network is adopted as kernel generator, and is optimized via learning from the MCMC simulation on random Gaussian distributions. This procedure provides an approximation for the rational blur kernel, and introduces a network-level Langevin dynamics into SISR optimization processes, which contributes to preventing bad local optimal solutions for kernel estimation. Meanwhile, a meta-learning-based alternating optimization procedure is proposed to optimize the kernel generator and image restorer, respectively. In contrast to the conventional alternating minimization strategy, a meta-learning-based framework is applied to learn an adaptive optimization strategy, which is less-greedy and results in better convergence performance. These two procedures are iteratively processed in a plug-and-play fashion, for the first time, realizing a learning-based but plug-and-play blind SISR solution in unsupervised inference. Extensive simulations demonstrate the superior performance and generalization ability of the proposed approach when comparing with state-of-the-arts on synthesis and real-world datasets. The code is available at https://github.com/XYLGroup/MLMC.
△ Less
Submitted 13 June, 2024;
originally announced June 2024.
-
CS-Bench: A Comprehensive Benchmark for Large Language Models towards Computer Science Mastery
Authors:
Xiaoshuai Song,
Muxi Diao,
Guanting Dong,
Zhengyang Wang,
Yujia Fu,
Runqi Qiao,
Zhexu Wang,
Dayuan Fu,
Huangxuan Wu,
Bin Liang,
Weihao Zeng,
Yejie Wang,
Zhuoma GongQue,
Jianing Yu,
Qiuna Tan,
Weiran Xu
Abstract:
Computer Science (CS) stands as a testament to the intricacies of human intelligence, profoundly advancing the development of artificial intelligence and modern society. However, the current community of large language models (LLMs) overly focuses on benchmarks for analyzing specific foundational skills (e.g. mathematics and code generation), neglecting an all-round evaluation of the computer scie…
▽ More
Computer Science (CS) stands as a testament to the intricacies of human intelligence, profoundly advancing the development of artificial intelligence and modern society. However, the current community of large language models (LLMs) overly focuses on benchmarks for analyzing specific foundational skills (e.g. mathematics and code generation), neglecting an all-round evaluation of the computer science field. To bridge this gap, we introduce CS-Bench, the first bilingual (Chinese-English) benchmark dedicated to evaluating the performance of LLMs in computer science. CS-Bench comprises approximately 5K meticulously curated test samples, covering 26 subfields across 4 key areas of computer science, encompassing various task forms and divisions of knowledge and reasoning. Utilizing CS-Bench, we conduct a comprehensive evaluation of over 30 mainstream LLMs, revealing the relationship between CS performance and model scales. We also quantitatively analyze the reasons for failures in existing LLMs and highlight directions for improvements, including knowledge supplementation and CS-specific reasoning. Further cross-capability experiments show a high correlation between LLMs' capabilities in computer science and their abilities in mathematics and coding. Moreover, expert LLMs specialized in mathematics and coding also demonstrate strong performances in several CS subfields. Looking ahead, we envision CS-Bench serving as a cornerstone for LLM applications in the CS field and paving new avenues in assessing LLMs' diverse reasoning capabilities. The CS-Bench data and evaluation code are available at https://github.com/csbench/csbench.
△ Less
Submitted 12 June, 2024;
originally announced June 2024.
-
Macroscopic Tunneling Probe of Moiré Spin Textures in Twisted CrI$_3$
Authors:
Bowen Yang,
Tarun Patel,
Meixin Cheng,
Kostyantyn Pichugin,
Lin Tian,
Nachiket Sherlekar,
Shaohua Yan,
Yang Fu,
Shangjie Tian,
Hechang Lei,
Michael E. Reimer,
Junichi Okamoto,
Adam W. Tsen
Abstract:
Various noncollinear spin textures and magnetic phases have been predicted in twisted two-dimensional CrI$_3$ due to competing ferromagnetic (FM) and antiferromagnetic (AFM) interlayer exchange from moiré stacking - with potential spintronic applications even when the underlying material possesses a negligible Dzyaloshinskii-Moriya or dipole-dipole interaction. Recent measurements have shown evide…
▽ More
Various noncollinear spin textures and magnetic phases have been predicted in twisted two-dimensional CrI$_3$ due to competing ferromagnetic (FM) and antiferromagnetic (AFM) interlayer exchange from moiré stacking - with potential spintronic applications even when the underlying material possesses a negligible Dzyaloshinskii-Moriya or dipole-dipole interaction. Recent measurements have shown evidence of coexisting FM and AFM layer order in small-twist-angle CrI$_3$ bilayers and double bilayers. Yet, the nature of the magnetic textures remains unresolved and possibilities for their manipulation and electrical readout are unexplored. Here, we use tunneling magnetoresistance to investigate the collective spin states of twisted double-bilayer CrI$_3$ under both out-of-plane and in-plane magnetic fields together with detailed micromagnetic simulations of domain dynamics based on magnetic circular dichroism. Our results capture hysteretic and anisotropic field evolutions of the magnetic states and we further uncover two distinct non-volatile spin textures (out-of-plane and in-plane domains) at $\approx$ 1° twist angle, with a different global tunneling resistance that can be switched by magnetic field.
△ Less
Submitted 12 June, 2024;
originally announced June 2024.
-
ProTrain: Efficient LLM Training via Memory-Aware Techniques
Authors:
Hanmei Yang,
Jin Zhou,
Yao Fu,
Xiaoqun Wang,
Ramine Roane,
Hui Guan,
Tongping Liu
Abstract:
It is extremely memory-hungry to train Large Language Models (LLM). To solve this problem, existing work exploits the combination of CPU and GPU for the training process, such as ZeRO-Offload. Such a technique largely democratizes billion-scale model training, making it possible to train with few consumer graphics cards. However, based on our observation, existing frameworks often provide coarse-g…
▽ More
It is extremely memory-hungry to train Large Language Models (LLM). To solve this problem, existing work exploits the combination of CPU and GPU for the training process, such as ZeRO-Offload. Such a technique largely democratizes billion-scale model training, making it possible to train with few consumer graphics cards. However, based on our observation, existing frameworks often provide coarse-grained memory management and require experienced experts in configuration tuning, leading to suboptimal hardware utilization and performance. This paper proposes ProTrain, a novel training system that intelligently balances memory usage and performance by coordinating memory, computation, and IO. ProTrain achieves adaptive memory management through Chunk-Based Model State Management and Block-Wise Activation Management, guided by a Memory-Aware Runtime Profiler without user intervention. ProTrain does not change the training algorithm and thus does not compromise accuracy. Experiments show that ProTrain improves training throughput by 1.43$\times$ to 2.71$\times$ compared to the SOTA training systems.
△ Less
Submitted 12 June, 2024;
originally announced June 2024.