-
Measurement of the branching fraction of $D^+_s\to \ell^+ν_\ell$ via $e^+e^-\to D^{*+}_{s} D^{*-}_{s}$
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (634 additional authors not shown)
Abstract:
Based on $10.64~\mathrm{fb}^{-1}$ of $e^+e^-$ collision data taken at center-of-mass energies between 4.237 and 4.699 GeV with the BESIII detector, we study the leptonic $D^+_s$ decays using the $e^+e^-\to D^{*+}_{s} D^{*-}_{s}$ process. The branching fractions of $D_s^+\to\ell^+ν_{\ell}\,(\ell=μ,τ)$ are measured to be $\mathcal{B}(D_s^+\toμ^+ν_μ)=(\bfmuv)\%$ and…
▽ More
Based on $10.64~\mathrm{fb}^{-1}$ of $e^+e^-$ collision data taken at center-of-mass energies between 4.237 and 4.699 GeV with the BESIII detector, we study the leptonic $D^+_s$ decays using the $e^+e^-\to D^{*+}_{s} D^{*-}_{s}$ process. The branching fractions of $D_s^+\to\ell^+ν_{\ell}\,(\ell=μ,τ)$ are measured to be $\mathcal{B}(D_s^+\toμ^+ν_μ)=(\bfmuv)\%$ and $\mathcal{B}(D_s^+\toτ^+ν_τ)=(\bftauv)\%$, respectively. The product of the decay constant and Cabibbo-Kobayashi-Maskawa matrix element $|V_{cs}|$ is determined to be $f_{D_s^+}|V_{cs}|=(\mufdsxvcsresult)_{μν}~\mathrm{MeV}$ and $f_{D_s^+}|V_{cs}|=(\taufdsxvcsresult))_{τν}~\mathrm{MeV}$, respectively. Taking the value of $|V_{cs}|$ from a global fit in the Standard Model, we obtain ${f_{D^+_s}}=(\mufdsresult)_{μν}$\,MeV and ${f_{D^+_s}}=(\taufdsresult)_{τν}$\,MeV, respectively. Conversely, taking the value for $f_{D_s^+}$ from the latest lattice quantum chromodynamics calculation, we obtain $|V_{cs}| =(\muvcsresult)_{μν}$ and $|V_{cs}| = (\tauvcsresult)_{τν}$, respectively.
△ Less
Submitted 16 July, 2024;
originally announced July 2024.
-
ST-RetNet: A Long-term Spatial-Temporal Traffic Flow Prediction Method
Authors:
Baichao Long,
Wang Zhu,
Jianli Xiao
Abstract:
Traffic flow forecasting is considered a critical task in the field of intelligent transportation systems. In this paper, to address the issue of low accuracy in long-term forecasting of spatial-temporal big data on traffic flow, we propose an innovative model called Spatial-Temporal Retentive Network (ST-RetNet). We extend the Retentive Network to address the task of traffic flow forecasting. At…
▽ More
Traffic flow forecasting is considered a critical task in the field of intelligent transportation systems. In this paper, to address the issue of low accuracy in long-term forecasting of spatial-temporal big data on traffic flow, we propose an innovative model called Spatial-Temporal Retentive Network (ST-RetNet). We extend the Retentive Network to address the task of traffic flow forecasting. At the spatial scale, we integrate a topological graph structure into Spatial Retentive Network(S-RetNet), utilizing an adaptive adjacency matrix to extract dynamic spatial features of the road network. We also employ Graph Convolutional Networks to extract static spatial features of the road network. These two components are then fused to capture dynamic and static spatial correlations. At the temporal scale, we propose the Temporal Retentive Network(T-RetNet), which has been demonstrated to excel in capturing long-term dependencies in traffic flow patterns compared to other time series models, including Recurrent Neural Networks based and transformer models. We achieve the spatial-temporal traffic flow forecasting task by integrating S-RetNet and T-RetNet to form ST-RetNet. Through experimental comparisons conducted on four real-world datasets, we demonstrate that ST-RetNet outperforms the state-of-the-art approaches in traffic flow forecasting.
△ Less
Submitted 12 July, 2024;
originally announced July 2024.
-
APC: Adaptive Patch Contrast for Weakly Supervised Semantic Segmentation
Authors:
Wangyu Wu,
Tianhong Dai,
Zhenhong Chen,
Xiaowei Huang,
Fei Ma,
Jimin Xiao
Abstract:
Weakly Supervised Semantic Segmentation (WSSS) using only image-level labels has gained significant attention due to its cost-effectiveness. The typical framework involves using image-level labels as training data to generate pixel-level pseudo-labels with refinements. Recently, methods based on Vision Transformers (ViT) have demonstrated superior capabilities in generating reliable pseudo-labels,…
▽ More
Weakly Supervised Semantic Segmentation (WSSS) using only image-level labels has gained significant attention due to its cost-effectiveness. The typical framework involves using image-level labels as training data to generate pixel-level pseudo-labels with refinements. Recently, methods based on Vision Transformers (ViT) have demonstrated superior capabilities in generating reliable pseudo-labels, particularly in recognizing complete object regions, compared to CNN methods. However, current ViT-based approaches have some limitations in the use of patch embeddings, being prone to being dominated by certain abnormal patches, as well as many multi-stage methods being time-consuming and lengthy in training, thus lacking efficiency. Therefore, in this paper, we introduce a novel ViT-based WSSS method named \textit{Adaptive Patch Contrast} (APC) that significantly enhances patch embedding learning for improved segmentation effectiveness. APC utilizes an Adaptive-K Pooling (AKP) layer to address the limitations of previous max pooling selection methods. Additionally, we propose a Patch Contrastive Learning (PCL) to enhance patch embeddings, thereby further improving the final results. Furthermore, we improve upon the existing multi-stage training framework without CAM by transforming it into an end-to-end single-stage training approach, thereby enhancing training efficiency. The experimental results show that our approach is effective and efficient, outperforming other state-of-the-art WSSS methods on the PASCAL VOC 2012 and MS COCO 2014 dataset within a shorter training duration.
△ Less
Submitted 15 July, 2024;
originally announced July 2024.
-
From Easy to Hard: Learning Curricular Shape-aware Features for Robust Panoptic Scene Graph Generation
Authors:
Hanrong Shi,
Lin Li,
Jun Xiao,
Yueting Zhuang,
Long Chen
Abstract:
Panoptic Scene Graph Generation (PSG) aims to generate a comprehensive graph-structure representation based on panoptic segmentation masks. Despite remarkable progress in PSG, almost all existing methods neglect the importance of shape-aware features, which inherently focus on the contours and boundaries of objects. To bridge this gap, we propose a model-agnostic Curricular shApe-aware FEature (CA…
▽ More
Panoptic Scene Graph Generation (PSG) aims to generate a comprehensive graph-structure representation based on panoptic segmentation masks. Despite remarkable progress in PSG, almost all existing methods neglect the importance of shape-aware features, which inherently focus on the contours and boundaries of objects. To bridge this gap, we propose a model-agnostic Curricular shApe-aware FEature (CAFE) learning strategy for PSG. Specifically, we incorporate shape-aware features (i.e., mask features and boundary features) into PSG, moving beyond reliance solely on bbox features. Furthermore, drawing inspiration from human cognition, we propose to integrate shape-aware features in an easy-to-hard manner. To achieve this, we categorize the predicates into three groups based on cognition learning difficulty and correspondingly divide the training process into three stages. Each stage utilizes a specialized relation classifier to distinguish specific groups of predicates. As the learning difficulty of predicates increases, these classifiers are equipped with features of ascending complexity. We also incorporate knowledge distillation to retain knowledge acquired in earlier stages. Due to its model-agnostic nature, CAFE can be seamlessly incorporated into any PSG model. Extensive experiments and ablations on two PSG tasks under both robust and zero-shot PSG have attested to the superiority and robustness of our proposed CAFE, which outperforms existing state-of-the-art methods by a large margin.
△ Less
Submitted 12 July, 2024;
originally announced July 2024.
-
Study of the decay and production properties of $D_{s1}(2536)$ and $D_{s2}^*(2573)$
Authors:
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere,
A. Brueggemann
, et al. (645 additional authors not shown)
Abstract:
The $e^+e^-\rightarrow D_s^+D_{s1}(2536)^-$ and $e^+e^-\rightarrow D_s^+D^*_{s2}(2573)^-$ processes are studied using data samples collected with the BESIII detector at center-of-mass energies from 4.530 to 4.946~GeV. The absolute branching fractions of $D_{s1}(2536)^- \rightarrow \bar{D}^{*0}K^-$ and $D_{s2}^*(2573)^- \rightarrow \bar{D}^0K^-$ are measured for the first time to be…
▽ More
The $e^+e^-\rightarrow D_s^+D_{s1}(2536)^-$ and $e^+e^-\rightarrow D_s^+D^*_{s2}(2573)^-$ processes are studied using data samples collected with the BESIII detector at center-of-mass energies from 4.530 to 4.946~GeV. The absolute branching fractions of $D_{s1}(2536)^- \rightarrow \bar{D}^{*0}K^-$ and $D_{s2}^*(2573)^- \rightarrow \bar{D}^0K^-$ are measured for the first time to be $(35.9\pm 4.8\pm 3.5)\%$ and $(37.4\pm 3.1\pm 4.6)\%$, respectively. The measurements are in tension with predictions based on the assumption that the $D_{s1}(2536)$ and $D_{s2}^*(2573)$ are dominated by a bare $c\bar{s}$ component. The $e^+e^-\rightarrow D_s^+D_{s1}(2536)^-$ and $e^+e^-\rightarrow D_s^+D^*_{s2}(2573)^-$ cross sections are measured, and a resonant structure at around 4.6~GeV with a width of 50~MeV is observed for the first time with a statistical significance of $15σ$ in the $e^+e^-\rightarrow D_s^+D^*_{s2}(2573)^-$ process. It could be the $Y(4626)$ found by the Belle collaboration in the $D_s^+D_{s1}(2536)^{-}$ final state, since they have similar masses and widths. There is also evidence for a structure at around 4.75~GeV in both processes.
△ Less
Submitted 10 July, 2024;
originally announced July 2024.
-
Higher-order Fuzzy Membership in Motif Modularity Optimization
Authors:
Jing Xiao,
Ya-Wei Wei,
Xiao-Ke Xu
Abstract:
Higher-order community detection (HCD) reveals both mesoscale structures and functional characteristics of real-life networks. Although many methods have been developed from diverse perspectives, to our knowledge, none can provide fine-grained higher-order fuzzy community information. This study presents a novel concept of higher-order fuzzy memberships that quantify the membership grades of motif…
▽ More
Higher-order community detection (HCD) reveals both mesoscale structures and functional characteristics of real-life networks. Although many methods have been developed from diverse perspectives, to our knowledge, none can provide fine-grained higher-order fuzzy community information. This study presents a novel concept of higher-order fuzzy memberships that quantify the membership grades of motifs to crisp higher-order communities, thereby revealing the partial community affiliations. Furthermore, we employ higher-order fuzzy memberships to enhance HCD via a general framework called fuzzy memberships assisted motif-based evolutionary modularity (FMMEM). In FFMEM, on the one hand, a fuzzy membership-based neighbor community modification (FM-NCM) strategy is designed to correct misassigned bridge nodes, thereby improving partition quality. On the other hand, a fuzzy membership-based local community merging (FM-LCM) strategy is also proposed to combine excessively fragmented communities for enhancing local search ability. Experimental results indicate that the FMMEM framework outperforms state-of-the-art methods in both synthetic and real-world datasets, particularly in the networks with ambiguous and complex structures.
△ Less
Submitted 9 July, 2024;
originally announced July 2024.
-
Transformation of a cellular skyrmion to polyomino-like structures
Authors:
Jing Xia,
Xichao Zhang,
Yan Zhou,
Xiaoxi Liu,
Guoping Zhao,
Masahito Mochizuki
Abstract:
Topological spin structures with transformable shapes may have potential implications on data storage and computation. Here, we demonstrate that a square cellular skyrmion on an artificial grid pinning pattern can be manipulated by programmed current pulses. We find that parallel short pulses could result in the elongation of the skyrmion mainly in the current direction, while parallel long pulses…
▽ More
Topological spin structures with transformable shapes may have potential implications on data storage and computation. Here, we demonstrate that a square cellular skyrmion on an artificial grid pinning pattern can be manipulated by programmed current pulses. We find that parallel short pulses could result in the elongation of the skyrmion mainly in the current direction, while parallel long pulses are able to induce the elongation in the direction perpendicular to the current due to the intrinsic skyrmion Hall effect. Consequently, a programmed sequence of parallel pulses could lead to the transformation of the skyrmion to I-, L-, and Z-shaped polyomino-like structures without affecting the topological charge. In addition, we find that orthogonal pulses could lead to the transformation to more complex polyomino-like structures, including the T-shaped and irregular ones. Particularly, when a small T-shaped structure is formed, the topological charge of the system is found to be non-integer due to incomplete compensation of local topological charge densities; however, the T-shaped structure is stable on the attractive pinning pattern. Our results offer an effective way to create polyomino-like spin structures toward functional applications.
△ Less
Submitted 9 July, 2024;
originally announced July 2024.
-
Fine-Tuning Linear Layers Only Is a Simple yet Effective Way for Task Arithmetic
Authors:
Ruochen Jin,
Bojian Hou,
Jiancong Xiao,
Weijie Su,
Li Shen
Abstract:
Task arithmetic has recently emerged as a cost-effective and scalable approach to edit pre-trained models directly in weight space, by adding the fine-tuned weights of different tasks. The performance has been further improved by a linear property which is illustrated by weight disentanglement. Yet, conventional linearization methods (e.g., NTK linearization) not only double the time and training…
▽ More
Task arithmetic has recently emerged as a cost-effective and scalable approach to edit pre-trained models directly in weight space, by adding the fine-tuned weights of different tasks. The performance has been further improved by a linear property which is illustrated by weight disentanglement. Yet, conventional linearization methods (e.g., NTK linearization) not only double the time and training cost but also have a disadvantage on single-task performance. We propose a simple yet effective and efficient method that only fine-tunes linear layers, which improves weight disentanglement and efficiency simultaneously. Specifically, our study reveals that only fine-tuning the linear layers in the attention modules makes the whole model occur in a linear regime, significantly improving weight disentanglement. To further understand how our method improves the disentanglement of task arithmetic, we present a comprehensive study of task arithmetic by differentiating the role of representation model and task-specific model. In particular, we find that the representation model plays an important role in improving weight disentanglement whereas the task-specific models such as the classification heads can degenerate the weight disentanglement performance. Overall, our work uncovers novel insights into the fundamental mechanisms of task arithmetic and offers a more reliable and effective approach to editing pre-trained models.
△ Less
Submitted 9 July, 2024;
originally announced July 2024.
-
KG-FPQ: Evaluating Factuality Hallucination in LLMs with Knowledge Graph-based False Premise Questions
Authors:
Yanxu Zhu,
Jinlin Xiao,
Yuhang Wang,
Jitao Sang
Abstract:
Recent studies have demonstrated that large language models (LLMs) are susceptible to being misled by false premise questions (FPQs), leading to errors in factual knowledge, know as factuality hallucination. Existing benchmarks that assess this vulnerability primarily rely on manual construction, resulting in limited scale and lack of scalability. In this work, we introduce an automated, scalable…
▽ More
Recent studies have demonstrated that large language models (LLMs) are susceptible to being misled by false premise questions (FPQs), leading to errors in factual knowledge, know as factuality hallucination. Existing benchmarks that assess this vulnerability primarily rely on manual construction, resulting in limited scale and lack of scalability. In this work, we introduce an automated, scalable pipeline to create FPQs based on knowledge graphs (KGs). The first step is modifying true triplets extracted from KGs to create false premises. Subsequently, utilizing the state-of-the-art capabilities of GPTs, we generate semantically rich FPQs. Based on the proposed method, we present a comprehensive benchmark, the Knowledge Graph-based False Premise Questions (KG-FPQ), which contains approximately 178k FPQs across three knowledge domains, at six levels of confusability, and in two task formats. Using KG-FPQ, we conduct extensive evaluations on several representative LLMs and provide valuable insights. The KG-FPQ dataset and code are available at~https://github.com/yanxuzhu/KG-FPQ.
△ Less
Submitted 8 July, 2024;
originally announced July 2024.
-
AIRA: A Low-cost IR-based Approach Towards Autonomous Precision Drone Landing and NLOS Indoor Navigation
Authors:
Yanchen Liu,
Minghui Zhao,
Kaiyuan Hou,
Junxi Xia,
Charlie Carver,
Stephen Xia,
Xia Zhou,
Xiaofan Jiang
Abstract:
Automatic drone landing is an important step for achieving fully autonomous drones. Although there are many works that leverage GPS, video, wireless signals, and active acoustic sensing to perform precise landing, autonomous drone landing remains an unsolved challenge for palm-sized microdrones that may not be able to support the high computational requirements of vision, wireless, or active audio…
▽ More
Automatic drone landing is an important step for achieving fully autonomous drones. Although there are many works that leverage GPS, video, wireless signals, and active acoustic sensing to perform precise landing, autonomous drone landing remains an unsolved challenge for palm-sized microdrones that may not be able to support the high computational requirements of vision, wireless, or active audio sensing. We propose AIRA, a low-cost infrared light-based platform that targets precise and efficient landing of low-resource microdrones. AIRA consists of an infrared light bulb at the landing station along with an energy efficient hardware photodiode (PD) sensing platform at the bottom of the drone. AIRA costs under 83 USD, while achieving comparable performance to existing vision-based methods at a fraction of the energy cost. AIRA requires only three PDs without any complex pattern recognition models to accurately land the drone, under $10$cm of error, from up to $11.1$ meters away, compared to camera-based methods that require recognizing complex markers using high resolution images with a range of only up to $1.2$ meters from the same height. Moreover, we demonstrate that AIRA can accurately guide drones in low light and partial non line of sight scenarios, which are difficult for traditional vision-based approaches.
△ Less
Submitted 8 July, 2024;
originally announced July 2024.
-
A Broadband Algorithm for Adiabatic Mode Evolution and An Application on Polarization Splitter-Rotator on LNOI Platform
Authors:
Geng Chen,
Chijun Li,
Xuanhao Wang,
Yuankang Huang,
Siyu Lu,
Yiqi Dai,
Xiangyu Meng,
Cheng Zeng,
Jinsong Xia
Abstract:
Adiabatic mode evolution waveguides (AMEWs) are widely utilized in integrated photonics, including tapered waveguides, edge couplers, mode converters, splitters, etc. An analytical theory and a novel AMEW design algorithm are developed to create shortcuts to adiabaticity (STA). With the new algorithm, we demonstrate a broadband and highly efficient polarization splitter-rotator (PSR) on a lithium-…
▽ More
Adiabatic mode evolution waveguides (AMEWs) are widely utilized in integrated photonics, including tapered waveguides, edge couplers, mode converters, splitters, etc. An analytical theory and a novel AMEW design algorithm are developed to create shortcuts to adiabaticity (STA). With the new algorithm, we demonstrate a broadband and highly efficient polarization splitter-rotator (PSR) on a lithium-niobate-on-insulator (LNOI) platform with an LN thickness of 500 nm. The fabricated PSR, with a total length of 2 mm, exhibits an insertion loss (IL) of 0.8 dB and a polarization extinction ratio (ER) of 12 dB over a wavelength range exceeding 76 nm.
△ Less
Submitted 6 July, 2024;
originally announced July 2024.
-
Planning with Large Language Models for Conversational Agents
Authors:
Zhigen Li,
Jianxiang Peng,
Yanmeng Wang,
Tianhao Shen,
Minghui Zhang,
Linxi Su,
Shang Wu,
Yihang Wu,
Yuqian Wang,
Ye Wang,
Wei Hu,
Jianfeng Li,
Shaojun Wang,
Jing Xiao,
Deyi Xiong
Abstract:
Controllability and proactivity are crucial properties of autonomous conversational agents (CAs). Controllability requires the CAs to follow the standard operating procedures (SOPs), such as verifying identity before activating credit cards. Proactivity requires the CAs to guide the conversation towards the goal during user uncooperation, such as persuasive dialogue. Existing research cannot be un…
▽ More
Controllability and proactivity are crucial properties of autonomous conversational agents (CAs). Controllability requires the CAs to follow the standard operating procedures (SOPs), such as verifying identity before activating credit cards. Proactivity requires the CAs to guide the conversation towards the goal during user uncooperation, such as persuasive dialogue. Existing research cannot be unified with controllability, proactivity, and low manual annotation. To bridge this gap, we propose a new framework for planning-based conversational agents (PCA) powered by large language models (LLMs), which only requires humans to define tasks and goals for the LLMs. Before conversation, LLM plans the core and necessary SOP for dialogue offline. During the conversation, LLM plans the best action path online referring to the SOP, and generates responses to achieve process controllability. Subsequently, we propose a semi-automatic dialogue data creation framework and curate a high-quality dialogue dataset (PCA-D). Meanwhile, we develop multiple variants and evaluation metrics for PCA, e.g., planning with Monte Carlo Tree Search (PCA-M), which searches for the optimal dialogue action while satisfying SOP constraints and achieving the proactive of the dialogue. Experiment results show that LLMs finetuned on PCA-D can significantly improve the performance and generalize to unseen domains. PCA-M outperforms other CoT and ToT baselines in terms of conversation controllability, proactivity, task success rate, and overall logical coherence, and is applicable in industry dialogue scenarios. The dataset and codes are available at XXXX.
△ Less
Submitted 4 July, 2024;
originally announced July 2024.
-
Compact ultra-broadband light coupling on chip via nonadiabatic pumping
Authors:
Weiwei Liu,
Chijun Li,
Bing Wang,
Tianyan Chai,
Lingzhi Zheng,
Zhuoxiong Liu,
Haoru Zhang,
Shuaifei Ren,
Xiaohong Li,
Cheng Zeng,
Jinsong Xia,
Peixiang Lu
Abstract:
Enlarging bandwidth capacity of the integrated photonic systems demands efficient and broadband light coupling among optical elements, which has been a vital issue in integrated photonics. Here, we have developed a compact ultra-broadband light coupling strategy based on nonadiabatic pumping in coupled optical waveguides, and experimentally demonstrated the designs in thin-film lithium niobate on…
▽ More
Enlarging bandwidth capacity of the integrated photonic systems demands efficient and broadband light coupling among optical elements, which has been a vital issue in integrated photonics. Here, we have developed a compact ultra-broadband light coupling strategy based on nonadiabatic pumping in coupled optical waveguides, and experimentally demonstrated the designs in thin-film lithium niobate on insulator (LNOI) platform. We found that nonadiabatic transition would produce a decreased dispersion of the phases related to eigenstates in the waveguides. As a consequence, we realized high-efficiency directional transfer between edgestates for various wavelengths covering a 1-dB bandwidth of ~320 nm in experiment (>400 nm in simulation), with a coupling length (~50 μm) approximately 1/10 of that required in the adiabatic regime. Furthermore, we have constructed complex functional devices including beamsplitter and multiple-level cascaded networks for broadband light routing and splitting. Our work preserves significant advantages simultaneously in extending the operation bandwidth and minimizing the footprint, which demonstrates great potential for large-scale and compact photonic integration on chip.
△ Less
Submitted 4 July, 2024;
originally announced July 2024.
-
BACON: Supercharge Your VLM with Bag-of-Concept Graph to Mitigate Hallucinations
Authors:
Zhantao Yang,
Ruili Feng,
Keyu Yan,
Huangji Wang,
Zhicai Wang,
Shangwen Zhu,
Han Zhang,
Jie Xiao,
Pingyu Wu,
Kai Zhu,
Jixuan Chen,
Chen-Wei Xie,
Chaojie Mao,
Yue Yang,
Hongyang Zhang,
Yu Liu,
Fan Cheng
Abstract:
This paper presents Bag-of-Concept Graph (BACON) to gift models with limited linguistic abilities to taste the privilege of Vision Language Models (VLMs) and boost downstream tasks such as detection, visual question answering (VQA), and image generation. Since the visual scenes in physical worlds are structured with complex relations between objects, BACON breaks down annotations into basic minimu…
▽ More
This paper presents Bag-of-Concept Graph (BACON) to gift models with limited linguistic abilities to taste the privilege of Vision Language Models (VLMs) and boost downstream tasks such as detection, visual question answering (VQA), and image generation. Since the visual scenes in physical worlds are structured with complex relations between objects, BACON breaks down annotations into basic minimum elements and presents them in a graph structure. Element-wise style enables easy understanding, and structural composition liberates difficult locating. Careful prompt design births the BACON captions with the help of public-available VLMs and segmentation methods. In this way, we gather a dataset with 100K annotated images, which endow VLMs with remarkable capabilities, such as accurately generating BACON, transforming prompts into BACON format, envisioning scenarios in the style of BACONr, and dynamically modifying elements within BACON through interactive dialogue and more. Wide representative experiments, including detection, VQA, and image generation tasks, tell BACON as a lifeline to achieve previous out-of-reach tasks or excel in their current cutting-edge solutions.
△ Less
Submitted 3 July, 2024;
originally announced July 2024.
-
Measurement of the branching fraction of the decay $J/ψ\to p \bar{p} η$
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (639 additional authors not shown)
Abstract:
A high precision measurement of the branching fraction of the decay $J/ψ\to p \bar{p} η$ is performed using $(10 087 \pm 44) \times 10^6$ $J/ψ$ events recorded by the {BESIII} detector at the {BEPCII} storage ring. The branching fractions of the two decays $J/ψ\to p \bar{p} η(η\to γγ)$ and $J/ψ\to p \bar{p} η(η\to π^+ π^- π^0)$ are measured individually to be…
▽ More
A high precision measurement of the branching fraction of the decay $J/ψ\to p \bar{p} η$ is performed using $(10 087 \pm 44) \times 10^6$ $J/ψ$ events recorded by the {BESIII} detector at the {BEPCII} storage ring. The branching fractions of the two decays $J/ψ\to p \bar{p} η(η\to γγ)$ and $J/ψ\to p \bar{p} η(η\to π^+ π^- π^0)$ are measured individually to be $\mathcal{B}(J/ψ\to p \bar{p} η(η\to γγ)) = (1.480 \pm 0.001 \pm 0.024)\times\,10^{-3}$ and $\mathcal{B}(J/ψ\to p \bar{p} η(η\to π^+ π^- π^0)) = (1.557 \pm 0.003 \pm 0.038)\times\,10^{-3}$, where the first uncertainties are statistical and the second systematic. Both results are compatible within their uncorrelated systematic uncertainties. The combined result is $\mathcal{B}(J/ψ\to p \bar{p} η)=(1.495 \pm 0.001 \pm 0.023)\times\,10^{-3}$ where the first uncertainty is the combined statistical uncertainty and the second one the combined systematic uncertainty of both analyses, incorporating correlations between them. In addition, the $p \bar{p}$ threshold region is investigated for a potential threshold enhancement, and no evidence for one is observed.
△ Less
Submitted 3 July, 2024;
originally announced July 2024.
-
Differential Encoding for Improved Representation Learning over Graphs
Authors:
Haimin Zhang,
Jiahao Xia,
Min Xu
Abstract:
Combining the message-passing paradigm with the global attention mechanism has emerged as an effective framework for learning over graphs. The message-passing paradigm and the global attention mechanism fundamentally generate node embeddings based on information aggregated from a node's local neighborhood or from the whole graph. The most basic and commonly used aggregation approach is to take the…
▽ More
Combining the message-passing paradigm with the global attention mechanism has emerged as an effective framework for learning over graphs. The message-passing paradigm and the global attention mechanism fundamentally generate node embeddings based on information aggregated from a node's local neighborhood or from the whole graph. The most basic and commonly used aggregation approach is to take the sum of information from a node's local neighbourhood or from the whole graph. However, it is unknown if the dominant information is from a node itself or from the node's neighbours (or the rest of the graph nodes). Therefore, there exists information lost at each layer of embedding generation, and this information lost could be accumulated and become more serious when more layers are used in the model. In this paper, we present a differential encoding method to address the issue of information lost. The idea of our method is to encode the differential representation between the information from a node's neighbours (or the rest of the graph nodes) and that from the node itself. The obtained differential encoding is then combined with the original aggregated local or global representation to generate the updated node embedding. By integrating differential encodings, the representational ability of generated node embeddings is improved. The differential encoding method is empirically evaluated on different graph tasks on seven benchmark datasets. The results show that it is a general method that improves the message-passing update and the global attention update, advancing the state-of-the-art performance for graph representation learning on these datasets.
△ Less
Submitted 2 July, 2024;
originally announced July 2024.
-
Hypermultiplexed off-chip hologram by on-chip integrated metasurface
Authors:
Xianjin Liu,
Zhanying Ma,
Dasen Zhang,
Qiwen Bao,
Zhenzhen Liu,
Jun-Jun Xiao
Abstract:
The waveguide-integrated metasurface introduces a novel photonic chip capable of converting guided modes into free-space light. This enables functions such as off-chip beam focusing, steering, and imaging. The challenge lies in achieving hypermultiplexing across diverse parameters, including guided-wave mode type, direction, polarization, and notably, multiple wavelengths. Here, we introduce a com…
▽ More
The waveguide-integrated metasurface introduces a novel photonic chip capable of converting guided modes into free-space light. This enables functions such as off-chip beam focusing, steering, and imaging. The challenge lies in achieving hypermultiplexing across diverse parameters, including guided-wave mode type, direction, polarization, and notably, multiple wavelengths. Here, we introduce a comprehensive end-to-end inverse design framework, rooted in a physical model, for the multifunctional design of on-chip metasurfaces. This framework allows for metasurface optimization through a target-field-driven iteration process. We demonstrate a hypermultiplexed on-chip metasurface capable of generating red-green-blue holograms at multiple target planes, with both independent and cooperative control over guided-wave direction. Significantly, the proposed method streamlines the design process utilizing only the positions of meta-atoms as the design variable. We demonstrate 9 independent holographic channels through a combination of wavelength and distance multiplexing. Moreover, by incorporating the excitation direction into the design, the metasurface produces a total of 36 distinct holograms. The robustness of these results against fabrication discrepancies is validated through 3D full-wave electromagnetic simulations, aligning well with advanced manufacturing techniques. Our research presents a universal design framework for the development of multifunctional on-chip metasurfaces, opening up new avenues for a wide range of applications.
△ Less
Submitted 2 July, 2024;
originally announced July 2024.
-
Observation of the Electromagnetic Dalitz Transition $h_c \rightarrow e^+e^-η_c$
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
S. Ahmed,
M. Albrecht,
R. Aliberti,
A. Amoroso,
M. R. An,
Q. An,
X. H. Bai,
Y. Bai,
O. Bakina,
R. Baldini Ferroli,
I. Balossino,
Y. Ban,
K. Begzsuren,
N. Berger,
M. Bertani,
D. Bettoni,
F. Bianchi,
J. Bloms,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (495 additional authors not shown)
Abstract:
Using $(27.12\pm 0.14)\times10^8$ $ψ(3686)$ decays and data samples of $e^+e^-$ collisions with $\sqrt{s}$ from 4.130 to 4.780~GeV collected with the BESIII detector, we report the first observation of the electromagnetic Dalitz transition $h_c\to e^+e^-η_c$ with a statistical significance of $5.4σ$. We measure the ratio of the branching fractions…
▽ More
Using $(27.12\pm 0.14)\times10^8$ $ψ(3686)$ decays and data samples of $e^+e^-$ collisions with $\sqrt{s}$ from 4.130 to 4.780~GeV collected with the BESIII detector, we report the first observation of the electromagnetic Dalitz transition $h_c\to e^+e^-η_c$ with a statistical significance of $5.4σ$. We measure the ratio of the branching fractions $\frac{\mathcal{B}(h_c\rightarrow e^+e^-η_c)}{\mathcal{B}(h_c\rightarrow γη_c)}$ separately for the $h_c$ samples produced via $ψ(3686)\toπ^0h_c$ and $e^+e^-\toπ^+π^-h_c$. The average ratio is determined to be $(0.59\pm0.10(\text{stat.})\pm0.04(\text{syst.}))\%$, where the uncertainty includes both statistical and systematic components.
△ Less
Submitted 2 July, 2024; v1 submitted 28 June, 2024;
originally announced July 2024.
-
Uncovering cognitive taskonomy through transfer learning in masked autoencoder-based fMRI reconstruction
Authors:
Youzhi Qu,
Junfeng Xia,
Xinyao Jian,
Wendu Li,
Kaining Peng,
Zhichao Liang,
Haiyan Wu,
Quanying Liu
Abstract:
Data reconstruction is a widely used pre-training task to learn the generalized features for many downstream tasks. Although reconstruction tasks have been applied to neural signal completion and denoising, neural signal reconstruction is less studied. Here, we employ the masked autoencoder (MAE) model to reconstruct functional magnetic resonance imaging (fMRI) data, and utilize a transfer learnin…
▽ More
Data reconstruction is a widely used pre-training task to learn the generalized features for many downstream tasks. Although reconstruction tasks have been applied to neural signal completion and denoising, neural signal reconstruction is less studied. Here, we employ the masked autoencoder (MAE) model to reconstruct functional magnetic resonance imaging (fMRI) data, and utilize a transfer learning framework to obtain the cognitive taskonomy, a matrix to quantify the similarity between cognitive tasks. Our experimental results demonstrate that the MAE model effectively captures the temporal dynamics patterns and interactions within the brain regions, enabling robust cross-subject fMRI signal reconstruction. The cognitive taskonomy derived from the transfer learning framework reveals the relationships among cognitive tasks, highlighting subtask correlations within motor tasks and similarities between emotion, social, and gambling tasks. Our study suggests that the fMRI reconstruction with MAE model can uncover the latent representation and the obtained taskonomy offers guidance for selecting source tasks in neural decoding tasks for improving the decoding performance on target tasks.
△ Less
Submitted 24 May, 2024;
originally announced July 2024.
-
Improved measurement of the semileptonic decay $D^+_{s}\to K^0 e^+ν_e$
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (643 additional authors not shown)
Abstract:
Analyzing $e^+e^-$ collision data corresponding to an integrated luminosity of $7.33~\mathrm{fb}^{-1}$ collected at center-of-mass energies between 4.128 and 4.226~GeV with the BESIII detector, we measure the branching fraction of the semileptonic decay $D^+_{s}\to K^0 e^+ν_e$ to be $(2.98\pm0.23\pm0.12)\times10^{-3}$. The $D_s^+\to K^0$ hadronic form factor is determined from the differential dec…
▽ More
Analyzing $e^+e^-$ collision data corresponding to an integrated luminosity of $7.33~\mathrm{fb}^{-1}$ collected at center-of-mass energies between 4.128 and 4.226~GeV with the BESIII detector, we measure the branching fraction of the semileptonic decay $D^+_{s}\to K^0 e^+ν_e$ to be $(2.98\pm0.23\pm0.12)\times10^{-3}$. The $D_s^+\to K^0$ hadronic form factor is determined from the differential decay rate of $D^+_s\to K^0 e^+ν_e$ to be $f^{K^0}_+(0)=0.636\pm0.049\pm0.013$. For both measurements, the first uncertainty is statistical and the second systematic. The branching fraction and form factor measurements are factors of 1.6 and 1.7 more precise than the previous world averages, respectively.
△ Less
Submitted 27 June, 2024;
originally announced June 2024.
-
Quantum annealing-based structural optimization with a multiplicative design update
Authors:
Naruethep Sukulthanasorn,
Junsen Xiao,
Koya Wagatsuma,
Shuji Moriguchi,
Kenjiro Terada
Abstract:
This paper presents a new structural design framework, developed based on iterative optimization via quantum annealing (QA). The novelty lies in its successful design update using an unknown design multiplier obtained by iteratively solving the optimization problems with QA. In addition, to align with density-based approaches in structural optimization, multipliers are multiplicative to represent…
▽ More
This paper presents a new structural design framework, developed based on iterative optimization via quantum annealing (QA). The novelty lies in its successful design update using an unknown design multiplier obtained by iteratively solving the optimization problems with QA. In addition, to align with density-based approaches in structural optimization, multipliers are multiplicative to represent design material and serve as design variables. In particular, structural analysis is performed on a classical computer using the finite element method, and QA is utilized for topology updating. The primary objective of the framework is to minimize compliance under an inequality volume constraint, while an encoding process for the design variable is adopted, enabling smooth iterative updates to the optimized design. The proposed framework incorporates both penalty methods and slack variables to transform the inequality constraint into an equality constraint and is implemented in a quadratic unconstrained binary optimization (QUBO) model through QA. To demonstrate its performance, design optimization is performed for both truss and continuum structures. Promising results from these applications indicate that the proposed framework is capable of creating an optimal shape and topology similar to those benchmarked by the optimality criteria (OC) method on a classical computer.
△ Less
Submitted 26 June, 2024;
originally announced June 2024.
-
Fully Exploiting Every Real Sample: SuperPixel Sample Gradient Model Stealing
Authors:
Yunlong Zhao,
Xiaoheng Deng,
Yijing Liu,
Xinjun Pei,
Jiazhi Xia,
Wei Chen
Abstract:
Model stealing (MS) involves querying and observing the output of a machine learning model to steal its capabilities. The quality of queried data is crucial, yet obtaining a large amount of real data for MS is often challenging. Recent works have reduced reliance on real data by using generative models. However, when high-dimensional query data is required, these methods are impractical due to the…
▽ More
Model stealing (MS) involves querying and observing the output of a machine learning model to steal its capabilities. The quality of queried data is crucial, yet obtaining a large amount of real data for MS is often challenging. Recent works have reduced reliance on real data by using generative models. However, when high-dimensional query data is required, these methods are impractical due to the high costs of querying and the risk of model collapse. In this work, we propose using sample gradients (SG) to enhance the utility of each real sample, as SG provides crucial guidance on the decision boundaries of the victim model. However, utilizing SG in the model stealing scenario faces two challenges: 1. Pixel-level gradient estimation requires extensive query volume and is susceptible to defenses. 2. The estimation of sample gradients has a significant variance. This paper proposes Superpixel Sample Gradient stealing (SPSG) for model stealing under the constraint of limited real samples. With the basic idea of imitating the victim model's low-variance patch-level gradients instead of pixel-level gradients, SPSG achieves efficient sample gradient estimation through two steps. First, we perform patch-wise perturbations on query images to estimate the average gradient in different regions of the image. Then, we filter the gradients through a threshold strategy to reduce variance. Exhaustive experiments demonstrate that, with the same number of real samples, SPSG achieves accuracy, agreements, and adversarial success rate significantly surpassing the current state-of-the-art MS methods. Codes are available at https://github.com/zyl123456aB/SPSG_attack.
△ Less
Submitted 18 May, 2024;
originally announced June 2024.
-
Measurement of the cross sections of $e^+e^-\to K^{-}\barΞ^{+}Λ/Σ^{0}$ at center-of-mass energies between 3.510 and 4.914 GeV
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (638 additional authors not shown)
Abstract:
Using $e^+e^-$ collision data collected with the BESIII detector at the BEPCII collider at center-of-mass energies between 3.510 and 4.914GeV, corresponding to an integrated luminosity of 25 fb$^{-1}$, we measure the Born cross sections for the process $e^+e^-\to K^-\barΞ^+Λ/Σ^{0}$ at thirty-five energy points with a partial-reconstruction strategy. By fitting the dressed cross sections of…
▽ More
Using $e^+e^-$ collision data collected with the BESIII detector at the BEPCII collider at center-of-mass energies between 3.510 and 4.914GeV, corresponding to an integrated luminosity of 25 fb$^{-1}$, we measure the Born cross sections for the process $e^+e^-\to K^-\barΞ^+Λ/Σ^{0}$ at thirty-five energy points with a partial-reconstruction strategy. By fitting the dressed cross sections of $e^+e^-\to K^-\barΞ^+Λ/Σ^{0}$, evidence for $ψ(4160) \to K^{-}\barΞ^{+}Λ$ is found for the first time with a significance of 4.4$σ$, including systematic uncertainties. No evidence for other possible resonances is found. In addition, the products of electronic partial width and branching fraction for all assumed resonances decaying into $K^{-}\barΞ^{+}Λ/Σ^{0}$ are determined.
△ Less
Submitted 26 June, 2024;
originally announced June 2024.
-
SynRS3D: A Synthetic Dataset for Global 3D Semantic Understanding from Monocular Remote Sensing Imagery
Authors:
Jian Song,
Hongruixuan Chen,
Weihao Xuan,
Junshi Xia,
Naoto Yokoya
Abstract:
Global semantic 3D understanding from single-view high-resolution remote sensing (RS) imagery is crucial for Earth Observation (EO). However, this task faces significant challenges due to the high costs of annotations and data collection, as well as geographically restricted data availability. To address these challenges, synthetic data offer a promising solution by being easily accessible and thu…
▽ More
Global semantic 3D understanding from single-view high-resolution remote sensing (RS) imagery is crucial for Earth Observation (EO). However, this task faces significant challenges due to the high costs of annotations and data collection, as well as geographically restricted data availability. To address these challenges, synthetic data offer a promising solution by being easily accessible and thus enabling the provision of large and diverse datasets. We develop a specialized synthetic data generation pipeline for EO and introduce SynRS3D, the largest synthetic RS 3D dataset. SynRS3D comprises 69,667 high-resolution optical images that cover six different city styles worldwide and feature eight land cover types, precise height information, and building change masks. To further enhance its utility, we develop a novel multi-task unsupervised domain adaptation (UDA) method, RS3DAda, coupled with our synthetic dataset, which facilitates the RS-specific transition from synthetic to real scenarios for land cover mapping and height estimation tasks, ultimately enabling global monocular 3D semantic understanding based on synthetic data. Extensive experiments on various real-world datasets demonstrate the adaptability and effectiveness of our synthetic dataset and proposed RS3DAda method. SynRS3D and related codes will be available.
△ Less
Submitted 26 June, 2024;
originally announced June 2024.
-
Measurements of $K_S^0$-$K_L^0$ asymmetries in the decays $Λ_c^+ \to pK_{L,S}^0$, $pK_{L,S}^0π^+π^-$ and $pK_{L,S}^0π^0$
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (643 additional authors not shown)
Abstract:
Using $e^+e^-$ annihilation data sets corresponding to an integrated luminosity of 4.5 $\text{fb}^{-1}$, collected with the BESIII detector at center-of-mass energies between 4.600 and 4.699 GeV, we report the first measurements of the absolute branching fractions $\mathcal{B}(Λ_c^+\to pK_{L}^{0})=(1.67 \pm 0.06 \pm 0. 04)\%$, $\mathcal{B}(Λ_c^+\to pK_{L}^{0}π^+π^-)=(1.69 \pm 0.10 \pm 0.05)\%$, an…
▽ More
Using $e^+e^-$ annihilation data sets corresponding to an integrated luminosity of 4.5 $\text{fb}^{-1}$, collected with the BESIII detector at center-of-mass energies between 4.600 and 4.699 GeV, we report the first measurements of the absolute branching fractions $\mathcal{B}(Λ_c^+\to pK_{L}^{0})=(1.67 \pm 0.06 \pm 0. 04)\%$, $\mathcal{B}(Λ_c^+\to pK_{L}^{0}π^+π^-)=(1.69 \pm 0.10 \pm 0.05)\%$, and $\mathcal{B}(Λ_c^+\to pK_{L}^{0}π^0)=(2.02 \pm 0.13 \pm 0.05)\%$, where the first uncertainties are statistical and the second systematic. Combining with the known branching fractions of $Λ_c^+ \to pK_{S}^{0}$, $Λ_c^+ \to pK_{S}^{0}π^+π^-$, and $Λ_c^+ \to pK_{S}^{0}π^0$, we present the first measurements of the $K_{S}^{0}$-$K_{L}^{0}$ asymmetries $R(Λ_c^+, K_{S,L}^0X) = \frac{\mathcal{B}(Λ_c^+ \to K_{S}^{0} X) - \mathcal{B}(Λ_c^+ \to K_{L}^{0} X)}{\mathcal{B}(Λ_c^+ \to K_{S}^{0} X) + \mathcal{B}(Λ_c^+ \to K_{L}^{0} X)}$ in charmed baryon decays: $R(Λ_c^+, pK_{S,L}^0) = -0.025 \pm 0.031$, $R(Λ_c^+, pK_{S,L}^0π^+π^-) = -0.027 \pm 0.048$, and $R(Λ_c^+, pK_{S,L}^0π^0) =-0.015 \pm 0.046$. No significant asymmetries within the uncertainties are observed.
△ Less
Submitted 26 June, 2024;
originally announced June 2024.
-
CMBFSCNN: Cosmic Microwave Background Polarization Foreground Subtraction with Convolutional Neural Network
Authors:
Ye-Peng Yan,
Si-Yu Li,
Guo-Jian Wang,
Zirui Zhang,
Jun-Qing Xia
Abstract:
In our previous study, we introduced a machine-learning technique, namely CMBFSCNN, for the removal of foreground contamination in cosmic microwave background (CMB) polarization data. This method was successfully employed on actual observational data from the Planck mission. In this study, we extend our investigation by considering the CMB lensing effect in simulated data and utilizing the CMBFSCN…
▽ More
In our previous study, we introduced a machine-learning technique, namely CMBFSCNN, for the removal of foreground contamination in cosmic microwave background (CMB) polarization data. This method was successfully employed on actual observational data from the Planck mission. In this study, we extend our investigation by considering the CMB lensing effect in simulated data and utilizing the CMBFSCNN approach to recover the CMB lensing B-mode power spectrum from multi-frequency observational maps. Our method is first applied to simulated data with the performance of CMB-S4 experiment. We achieve reliable recovery of the noisy CMB Q (or U) maps with a mean absolute difference of $0.016\pm0.008\ μ$K (or $0.021\pm0.002\ μ$K) for the CMB-S4 experiment. To address the residual instrumental noise in the foreground-cleaned map, we employ a "half-split maps" approach, where the entire dataset is divided into two segments sharing the same sky signal but having uncorrelated noise. Using cross-correlation techniques between two recovered half-split maps, we effectively reduce instrumental noise effects at the power spectrum level. As a result, we achieve precise recovery of the CMB EE and lensing B-mode power spectra. Furthermore, we also extend our pipeline to full-sky simulated data with the performance of LiteBIRD experiment. As expected, various foregrounds are cleanly removed from the foreground contamination observational maps, and recovered EE and lensing B-mode power spectra exhibit excellent agreement with the true results. Finally, we discuss the dependency of our method on the foreground models.
△ Less
Submitted 25 June, 2024;
originally announced June 2024.
-
Examples of non-scattering inhomogeneities
Authors:
Lucas Chesnel,
Houssem Haddar,
Hongjie Li,
Jingni Xiao
Abstract:
We consider the scattering of waves by a penetrable inclusion embedded in some reference medium. We exhibit examples of materials and geometries for which non-scattering frequencies exist, i.e., for which at some frequencies there are incident fields which produce null scattered fields outside of the inhomogeneity. We show in particular that certain domains with corners or even cusps can support n…
▽ More
We consider the scattering of waves by a penetrable inclusion embedded in some reference medium. We exhibit examples of materials and geometries for which non-scattering frequencies exist, i.e., for which at some frequencies there are incident fields which produce null scattered fields outside of the inhomogeneity. We show in particular that certain domains with corners or even cusps can support non-scattering frequencies. We relate the latter, for some inclusions, to resonance frequencies for Dirichlet or Neumann cavities. We also find situations where incident non-scattering fields solve the Helmholtz equation in a neighborhood of the inhomogeneity and not in the whole space. Finally, in relation with invisibility, we give examples of inclusions of anisotropic materials which are non-scattering for all real frequencies. We prove that corresponding material indices must have a special structure on the boundary.
△ Less
Submitted 25 June, 2024;
originally announced June 2024.
-
Indications of superconductivities in blend of variant apatite and covellite
Authors:
Hongyang Wang,
Yijing Zhao,
Hao Wu,
Ling Wang,
Zhixing Wu,
Zhihui Geng,
Jiewen Xiao,
Weiwei Xue,
Shufeng Ye,
Ning Chen,
Xianfeng Qiao,
Yao Yao
Abstract:
Through heavily doping sulfur into an apatite framework, we synthesize a new blend mainly comprising variant apatite and covellite (copper sulfide). Magnetic measurement exhibits that significant diamagnetism appears at around 260 K and drops dramatically below 30 K implying coexistence of two superconducting phases. The upper critical magnetic field is larger than 1000 Oe at 250 K. Electric measu…
▽ More
Through heavily doping sulfur into an apatite framework, we synthesize a new blend mainly comprising variant apatite and covellite (copper sulfide). Magnetic measurement exhibits that significant diamagnetism appears at around 260 K and drops dramatically below 30 K implying coexistence of two superconducting phases. The upper critical magnetic field is larger than 1000 Oe at 250 K. Electric measurement manifests that the current-voltage curves deviate from the normal linear lineshape suggesting the presence of zero-resistance effect, and the critical current is around 50 $μ$A at 140 K. These exotic magnetic and electric features strongly indicate these two components, variant apatite and covellite, individually trigger two superconducting phases at near-room and low temperatures.
△ Less
Submitted 25 June, 2024;
originally announced June 2024.
-
Study of the $f_{0}(980)$ through the decay $D_{s}^{+}\rightarrow π^{+}π^{+}π^{-}π^{0}$
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (649 additional authors not shown)
Abstract:
We perform the first amplitude analysis of $D^+_s \to π^+π^+π^-π^0$ decays, based on data samples of electron-positron collisions recorded with the BESIII detector at center-of-mass energies between 4.128 and 4.226 GeV, corresponding to an integrated luminosity of 7.33~fb$^{-1}$. We report the observation of $D_{s}^{+} \to f_0(980)ρ(770)^{+}$ with a statistical significance greater than 10$σ$ and…
▽ More
We perform the first amplitude analysis of $D^+_s \to π^+π^+π^-π^0$ decays, based on data samples of electron-positron collisions recorded with the BESIII detector at center-of-mass energies between 4.128 and 4.226 GeV, corresponding to an integrated luminosity of 7.33~fb$^{-1}$. We report the observation of $D_{s}^{+} \to f_0(980)ρ(770)^{+}$ with a statistical significance greater than 10$σ$ and determine the branching fractions $\mathcal{B}(D_s^+\toπ^+π^+π^-π^0|_{{\rm non}-η})=(2.04\pm0.08_{\rm stat.}\pm0.05_{\rm syst.})\%$ and $\mathcal{B}(D_s^+\toηπ^+)=(1.56\pm0.09_{\rm stat.}\pm0.04_{\rm syst.})\%$. Moreover, we measure the relative branching fraction between $φ\toπ^+π^-π^0$ and $φ\to K^+K^-$ to be $\frac{\mathcal{B}(φ(1020) \to π^+π^-π^0)}{\mathcal{B}(φ(1020) \to K^+K^-)}=0.230 \pm 0.014_{\rm stat.} \pm 0.010_{\rm syst.}$, which deviates from the world average value by more than $4σ$.
△ Less
Submitted 25 June, 2024;
originally announced June 2024.
-
Deep Learning Segmentation of Ascites on Abdominal CT Scans for Automatic Volume Quantification
Authors:
Benjamin Hou,
Sung-Won Lee,
Jung-Min Lee,
Christopher Koh,
Jing Xiao,
Perry J. Pickhardt,
Ronald M. Summers
Abstract:
Purpose: To evaluate the performance of an automated deep learning method in detecting ascites and subsequently quantifying its volume in patients with liver cirrhosis and ovarian cancer.
Materials and Methods: This retrospective study included contrast-enhanced and non-contrast abdominal-pelvic CT scans of patients with cirrhotic ascites and patients with ovarian cancer from two institutions, N…
▽ More
Purpose: To evaluate the performance of an automated deep learning method in detecting ascites and subsequently quantifying its volume in patients with liver cirrhosis and ovarian cancer.
Materials and Methods: This retrospective study included contrast-enhanced and non-contrast abdominal-pelvic CT scans of patients with cirrhotic ascites and patients with ovarian cancer from two institutions, National Institutes of Health (NIH) and University of Wisconsin (UofW). The model, trained on The Cancer Genome Atlas Ovarian Cancer dataset (mean age, 60 years +/- 11 [s.d.]; 143 female), was tested on two internal (NIH-LC and NIH-OV) and one external dataset (UofW-LC). Its performance was measured by the Dice coefficient, standard deviations, and 95% confidence intervals, focusing on ascites volume in the peritoneal cavity.
Results: On NIH-LC (25 patients; mean age, 59 years +/- 14 [s.d.]; 14 male) and NIH-OV (166 patients; mean age, 65 years +/- 9 [s.d.]; all female), the model achieved Dice scores of 0.855 +/- 0.061 (CI: 0.831-0.878) and 0.826 +/- 0.153 (CI: 0.764-0.887), with median volume estimation errors of 19.6% (IQR: 13.2-29.0) and 5.3% (IQR: 2.4-9.7) respectively. On UofW-LC (124 patients; mean age, 46 years +/- 12 [s.d.]; 73 female), the model had a Dice score of 0.830 +/- 0.107 (CI: 0.798-0.863) and median volume estimation error of 9.7% (IQR: 4.5-15.1). The model showed strong agreement with expert assessments, with r^2 values of 0.79, 0.98, and 0.97 across the test sets.
Conclusion: The proposed deep learning method performed well in segmenting and quantifying the volume of ascites in concordance with expert radiologist assessments.
△ Less
Submitted 22 June, 2024;
originally announced June 2024.
-
Privacy Preserving Machine Learning for Electronic Health Records using Federated Learning and Differential Privacy
Authors:
Naif A. Ganadily,
Han J. Xia
Abstract:
An Electronic Health Record (EHR) is an electronic database used by healthcare providers to store patients' medical records which may include diagnoses, treatments, costs, and other personal information. Machine learning (ML) algorithms can be used to extract and analyze patient data to improve patient care. Patient records contain highly sensitive information, such as social security numbers (SSN…
▽ More
An Electronic Health Record (EHR) is an electronic database used by healthcare providers to store patients' medical records which may include diagnoses, treatments, costs, and other personal information. Machine learning (ML) algorithms can be used to extract and analyze patient data to improve patient care. Patient records contain highly sensitive information, such as social security numbers (SSNs) and residential addresses, which introduces a need to apply privacy-preserving techniques for these ML models using federated learning and differential privacy.
△ Less
Submitted 22 June, 2024;
originally announced June 2024.
-
Search for the $e^+e^- \to φχ_{c1}(3872)$ process at BESIII
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (639 additional authors not shown)
Abstract:
Based on 368.5 pb$^{-1}$ of $e^+e^-$ collision data collected at center-of-mass energies 4.914 and 4.946 GeV by the BESIII detector, the $e^+e^- \to φχ_{c1}(3872)$ process is searched for the first time. No significant signal is observed and the upper limits at the 90\% confidence level on the product of the Born cross section $σ(e^+e^- \to φχ_{c1}(3872))$ and the branching fraction…
▽ More
Based on 368.5 pb$^{-1}$ of $e^+e^-$ collision data collected at center-of-mass energies 4.914 and 4.946 GeV by the BESIII detector, the $e^+e^- \to φχ_{c1}(3872)$ process is searched for the first time. No significant signal is observed and the upper limits at the 90\% confidence level on the product of the Born cross section $σ(e^+e^- \to φχ_{c1}(3872))$ and the branching fraction $\mathcal{B}[χ_{c1}(3872)\toπ^+π^- J/ψ]$ at 4.914 and 4.946 GeV are set to be 0.85 and 0.96 pb, respectively. These measurements provide useful information for the production of the $χ_{c1}(3872)$ at $e^+e^-$ collider and deepen our understanding about the nature of this particle.
△ Less
Submitted 21 June, 2024;
originally announced June 2024.
-
A Pure Transformer Pretraining Framework on Text-attributed Graphs
Authors:
Yu Song,
Haitao Mao,
Jiachen Xiao,
Jingzhe Liu,
Zhikai Chen,
Wei Jin,
Carl Yang,
Jiliang Tang,
Hui Liu
Abstract:
Pretraining plays a pivotal role in acquiring generalized knowledge from large-scale data, achieving remarkable successes as evidenced by large models in CV and NLP. However, progress in the graph domain remains limited due to fundamental challenges such as feature heterogeneity and structural heterogeneity. Recently, increasing efforts have been made to enhance node feature quality with Large Lan…
▽ More
Pretraining plays a pivotal role in acquiring generalized knowledge from large-scale data, achieving remarkable successes as evidenced by large models in CV and NLP. However, progress in the graph domain remains limited due to fundamental challenges such as feature heterogeneity and structural heterogeneity. Recently, increasing efforts have been made to enhance node feature quality with Large Language Models (LLMs) on text-attributed graphs (TAGs), demonstrating superiority to traditional bag-of-words or word2vec techniques. These high-quality node features reduce the previously critical role of graph structure, resulting in a modest performance gap between Graph Neural Networks (GNNs) and structure-agnostic Multi-Layer Perceptrons (MLPs). Motivated by this, we introduce a feature-centric pretraining perspective by treating graph structure as a prior and leveraging the rich, unified feature space to learn refined interaction patterns that generalizes across graphs. Our framework, Graph Sequence Pretraining with Transformer (GSPT), samples node contexts through random walks and employs masked feature reconstruction to capture pairwise proximity in the LLM-unified feature space using a standard Transformer. By utilizing unified text representations rather than varying structures, our framework achieves significantly better transferability among graphs within the same domain. GSPT can be easily adapted to both node classification and link prediction, demonstrating promising empirical success on various datasets.
△ Less
Submitted 19 June, 2024;
originally announced June 2024.
-
NTIRE 2024 Challenge on Night Photography Rendering
Authors:
Egor Ershov,
Artyom Panshin,
Oleg Karasev,
Sergey Korchagin,
Shepelev Lev,
Alexandr Startsev,
Daniil Vladimirov,
Ekaterina Zaychenkova,
Nikola Banić,
Dmitrii Iarchuk,
Maria Efimova,
Radu Timofte,
Arseniy Terekhin,
Shuwei Yue,
Yuyang Liu,
Minchen Wei,
Lu Xu,
Chao Zhang,
Yasi Wang,
Furkan Kınlı,
Doğa Yılmaz,
Barış Özcan,
Furkan Kıraç,
Shuai Liu,
Jingyuan Xiao
, et al. (25 additional authors not shown)
Abstract:
This paper presents a review of the NTIRE 2024 challenge on night photography rendering. The goal of the challenge was to find solutions that process raw camera images taken in nighttime conditions, and thereby produce a photo-quality output images in the standard RGB (sRGB) space. Unlike the previous year's competition, the challenge images were collected with a mobile phone and the speed of algo…
▽ More
This paper presents a review of the NTIRE 2024 challenge on night photography rendering. The goal of the challenge was to find solutions that process raw camera images taken in nighttime conditions, and thereby produce a photo-quality output images in the standard RGB (sRGB) space. Unlike the previous year's competition, the challenge images were collected with a mobile phone and the speed of algorithms was also measured alongside the quality of their output. To evaluate the results, a sufficient number of viewers were asked to assess the visual quality of the proposed solutions, considering the subjective nature of the task. There were 2 nominations: quality and efficiency. Top 5 solutions in terms of output quality were sorted by evaluation time (see Fig. 1). The top ranking participants' solutions effectively represent the state-of-the-art in nighttime photography rendering. More results can be found at https://nightimaging.org.
△ Less
Submitted 18 June, 2024;
originally announced June 2024.
-
PFID: Privacy First Inference Delegation Framework for LLMs
Authors:
Haoyan Yang,
Zhitao Li,
Yong Zhang,
Jianzong Wang,
Ning Cheng,
Ming Li,
Jing Xiao
Abstract:
This paper introduces a novel privacy-preservation framework named PFID for LLMs that addresses critical privacy concerns by localizing user data through model sharding and singular value decomposition. When users are interacting with LLM systems, their prompts could be subject to being exposed to eavesdroppers within or outside LLM system providers who are interested in collecting users' input. I…
▽ More
This paper introduces a novel privacy-preservation framework named PFID for LLMs that addresses critical privacy concerns by localizing user data through model sharding and singular value decomposition. When users are interacting with LLM systems, their prompts could be subject to being exposed to eavesdroppers within or outside LLM system providers who are interested in collecting users' input. In this work, we proposed a framework to camouflage user input, so as to alleviate privacy issues. Our framework proposes to place model shards on the client and the public server, we sent compressed hidden states instead of prompts to and from servers. Clients have held back information that can re-privatized the hidden states so that overall system performance is comparable to traditional LLMs services. Our framework was designed to be communication efficient, computation can be delegated to the local client so that the server's computation burden can be lightened. We conduct extensive experiments on machine translation tasks to verify our framework's performance.
△ Less
Submitted 17 June, 2024;
originally announced June 2024.
-
NovoBench: Benchmarking Deep Learning-based De Novo Peptide Sequencing Methods in Proteomics
Authors:
Jingbo Zhou,
Shaorong Chen,
Jun Xia,
Sizhe Liu,
Tianze Ling,
Wenjie Du,
Yue Liu,
Jianwei Yin,
Stan Z. Li
Abstract:
Tandem mass spectrometry has played a pivotal role in advancing proteomics, enabling the high-throughput analysis of protein composition in biological tissues. Many deep learning methods have been developed for \emph{de novo} peptide sequencing task, i.e., predicting the peptide sequence for the observed mass spectrum. However, two key challenges seriously hinder the further advancement of this im…
▽ More
Tandem mass spectrometry has played a pivotal role in advancing proteomics, enabling the high-throughput analysis of protein composition in biological tissues. Many deep learning methods have been developed for \emph{de novo} peptide sequencing task, i.e., predicting the peptide sequence for the observed mass spectrum. However, two key challenges seriously hinder the further advancement of this important task. Firstly, since there is no consensus for the evaluation datasets, the empirical results in different research papers are often not comparable, leading to unfair comparison. Secondly, the current methods are usually limited to amino acid-level or peptide-level precision and recall metrics. In this work, we present the first unified benchmark NovoBench for \emph{de novo} peptide sequencing, which comprises diverse mass spectrum data, integrated models, and comprehensive evaluation metrics. Recent impressive methods, including DeepNovo, PointNovo, Casanovo, InstaNovo, AdaNovo and $π$-HelixNovo are integrated into our framework. In addition to amino acid-level and peptide-level precision and recall, we evaluate the models' performance in terms of identifying post-tranlational modifications (PTMs), efficiency and robustness to peptide length, noise peaks and missing fragment ratio, which are important influencing factors while seldom be considered. Leveraging this benchmark, we conduct a large-scale study of current methods, report many insightful findings that open up new possibilities for future development. The benchmark will be open-sourced to facilitate future research and application.
△ Less
Submitted 16 June, 2024;
originally announced June 2024.
-
Frozen CLIP: A Strong Backbone for Weakly Supervised Semantic Segmentation
Authors:
Bingfeng Zhang,
Siyue Yu,
Yunchao Wei,
Yao Zhao,
Jimin Xiao
Abstract:
Weakly supervised semantic segmentation has witnessed great achievements with image-level labels. Several recent approaches use the CLIP model to generate pseudo labels for training an individual segmentation model, while there is no attempt to apply the CLIP model as the backbone to directly segment objects with image-level labels. In this paper, we propose WeCLIP, a CLIP-based single-stage pipel…
▽ More
Weakly supervised semantic segmentation has witnessed great achievements with image-level labels. Several recent approaches use the CLIP model to generate pseudo labels for training an individual segmentation model, while there is no attempt to apply the CLIP model as the backbone to directly segment objects with image-level labels. In this paper, we propose WeCLIP, a CLIP-based single-stage pipeline, for weakly supervised semantic segmentation. Specifically, the frozen CLIP model is applied as the backbone for semantic feature extraction, and a new decoder is designed to interpret extracted semantic features for final prediction. Meanwhile, we utilize the above frozen backbone to generate pseudo labels for training the decoder. Such labels cannot be optimized during training. We then propose a refinement module (RFM) to rectify them dynamically. Our architecture enforces the proposed decoder and RFM to benefit from each other to boost the final performance. Extensive experiments show that our approach significantly outperforms other approaches with less training cost. Additionally, our WeCLIP also obtains promising results for fully supervised settings. The code is available at https://github.com/zbf1991/WeCLIP.
△ Less
Submitted 16 June, 2024;
originally announced June 2024.
-
ViD-GPT: Introducing GPT-style Autoregressive Generation in Video Diffusion Models
Authors:
Kaifeng Gao,
Jiaxin Shi,
Hanwang Zhang,
Chunping Wang,
Jun Xiao
Abstract:
With the advance of diffusion models, today's video generation has achieved impressive quality. But generating temporal consistent long videos is still challenging. A majority of video diffusion models (VDMs) generate long videos in an autoregressive manner, i.e., generating subsequent clips conditioned on last frames of previous clip. However, existing approaches all involve bidirectional computa…
▽ More
With the advance of diffusion models, today's video generation has achieved impressive quality. But generating temporal consistent long videos is still challenging. A majority of video diffusion models (VDMs) generate long videos in an autoregressive manner, i.e., generating subsequent clips conditioned on last frames of previous clip. However, existing approaches all involve bidirectional computations, which restricts the receptive context of each autoregression step, and results in the model lacking long-term dependencies. Inspired from the huge success of large language models (LLMs) and following GPT (generative pre-trained transformer), we bring causal (i.e., unidirectional) generation into VDMs, and use past frames as prompt to generate future frames. For Causal Generation, we introduce causal temporal attention into VDM, which forces each generated frame to depend on its previous frames. For Frame as Prompt, we inject the conditional frames by concatenating them with noisy frames (frames to be generated) along the temporal axis. Consequently, we present Video Diffusion GPT (ViD-GPT). Based on the two key designs, in each autoregression step, it is able to acquire long-term context from prompting frames concatenated by all previously generated frames. Additionally, we bring the kv-cache mechanism to VDMs, which eliminates the redundant computation from overlapped frames, significantly boosting the inference speed. Extensive experiments demonstrate that our ViD-GPT achieves state-of-the-art performance both quantitatively and qualitatively on long video generation. Code will be available at https://github.com/Dawn-LX/Causal-VideoGen.
△ Less
Submitted 16 June, 2024;
originally announced June 2024.
-
Make Your Home Safe: Time-aware Unsupervised User Behavior Anomaly Detection in Smart Homes via Loss-guided Mask
Authors:
Jingyu Xiao,
Zhiyao Xu,
Qingsong Zou,
Qing Li,
Dan Zhao,
Dong Fang,
Ruoyu Li,
Wenxin Tang,
Kang Li,
Xudong Zuo,
Penghui Hu,
Yong Jiang,
Zixuan Weng,
Michael R. Lyv
Abstract:
Smart homes, powered by the Internet of Things, offer great convenience but also pose security concerns due to abnormal behaviors, such as improper operations of users and potential attacks from malicious attackers. Several behavior modeling methods have been proposed to identify abnormal behaviors and mitigate potential risks. However, their performance often falls short because they do not effec…
▽ More
Smart homes, powered by the Internet of Things, offer great convenience but also pose security concerns due to abnormal behaviors, such as improper operations of users and potential attacks from malicious attackers. Several behavior modeling methods have been proposed to identify abnormal behaviors and mitigate potential risks. However, their performance often falls short because they do not effectively learn less frequent behaviors, consider temporal context, or account for the impact of noise in human behaviors. In this paper, we propose SmartGuard, an autoencoder-based unsupervised user behavior anomaly detection framework. First, we design a Loss-guided Dynamic Mask Strategy (LDMS) to encourage the model to learn less frequent behaviors, which are often overlooked during learning. Second, we propose a Three-level Time-aware Position Embedding (TTPE) to incorporate temporal information into positional embedding to detect temporal context anomaly. Third, we propose a Noise-aware Weighted Reconstruction Loss (NWRL) that assigns different weights for routine behaviors and noise behaviors to mitigate the interference of noise behaviors during inference. Comprehensive experiments on three datasets with ten types of anomaly behaviors demonstrates that SmartGuard consistently outperforms state-of-the-art baselines and also offers highly interpretable results.
△ Less
Submitted 18 June, 2024; v1 submitted 16 June, 2024;
originally announced June 2024.
-
Geometric Distortion Guided Transformer for Omnidirectional Image Super-Resolution
Authors:
Cuixin Yang,
Rongkang Dong,
Jun Xiao,
Cong Zhang,
Kin-Man Lam,
Fei Zhou,
Guoping Qiu
Abstract:
As virtual and augmented reality applications gain popularity, omnidirectional image (ODI) super-resolution has become increasingly important. Unlike 2D plain images that are formed on a plane, ODIs are projected onto spherical surfaces. Applying established image super-resolution methods to ODIs, therefore, requires performing equirectangular projection (ERP) to map the ODIs onto a plane. ODI sup…
▽ More
As virtual and augmented reality applications gain popularity, omnidirectional image (ODI) super-resolution has become increasingly important. Unlike 2D plain images that are formed on a plane, ODIs are projected onto spherical surfaces. Applying established image super-resolution methods to ODIs, therefore, requires performing equirectangular projection (ERP) to map the ODIs onto a plane. ODI super-resolution needs to take into account geometric distortion resulting from ERP. However, without considering such geometric distortion of ERP images, previous deep-learning-based methods only utilize a limited range of pixels and may easily miss self-similar textures for reconstruction. In this paper, we introduce a novel Geometric Distortion Guided Transformer for Omnidirectional image Super-Resolution (GDGT-OSR). Specifically, a distortion modulated rectangle-window self-attention mechanism, integrated with deformable self-attention, is proposed to better perceive the distortion and thus involve more self-similar textures. Distortion modulation is achieved through a newly devised distortion guidance generator that produces guidance by exploiting the variability of distortion across latitudes. Furthermore, we propose a dynamic feature aggregation scheme to adaptively fuse the features from different self-attention modules. We present extensive experimental results on public datasets and show that the new GDGT-OSR outperforms methods in existing literature.
△ Less
Submitted 16 June, 2024;
originally announced June 2024.
-
Interplay between topology and correlations in the second moiré band of twisted bilayer MoTe2
Authors:
Fan Xu,
Xumin Chang,
Jiayong Xiao,
Yixin Zhang,
Feng Liu,
Zheng Sun,
Ning Mao,
Nikolai Peshcherenko,
Jiayi Li,
Kenji Watanabe,
Takashi Taniguchi,
Bingbing Tong,
Li Lu,
Jinfeng Jia,
Dong Qian,
Zhiwen Shi,
Yang Zhang,
Xiaoxue Liu,
Shengwei Jiang,
Tingxin Li
Abstract:
Topological flat bands formed in two-dimensional lattice systems offer unique opportunity to study the fractional phases of matter in the absence of an external magnetic field. Celebrated examples include fractional quantum anomalous Hall (FQAH) effects and fractional topological insulators. Recently, FQAH effects have been experimentally realized in both the twisted bilayer MoTe2 (tMoTe2) system…
▽ More
Topological flat bands formed in two-dimensional lattice systems offer unique opportunity to study the fractional phases of matter in the absence of an external magnetic field. Celebrated examples include fractional quantum anomalous Hall (FQAH) effects and fractional topological insulators. Recently, FQAH effects have been experimentally realized in both the twisted bilayer MoTe2 (tMoTe2) system and the rhombohedral stacked multilayer graphene/hBN moiré systems. To date, experimental studies mainly focus on the first moiré flat band, except a very recent work that reported the evidence of the integer and fractional quantum spin Hall effects in higher moiré bands of a 2.1° tMoTe2 device. Here, we present the systematical transport study of approximately 3° tMoTe2 devices, especially for the second moiré band. At ν = -2 and -4, time-reversal-symmetric single and double quantum spin Hall states formed, consistent with the previous observation in 2.1° tMoTe2 device. On the other hand, we observed ferromagnetism in the second moiré band, and a Chern insulator state driven by out-of-plane magnetic fields at ν = -3. At ν = -2.5, nonmonotonic temperature dependence of resistivity and large out-of-plane negative magnetoresistance have been observed, which likely arises from the frustrated liquid-like ground state and weak effective Ruderman-Kittel-Kasuya-Yosida interactions. Applying out-of-plane electric field can induce quantum phase transitions at both integer and fractional filling factors. Our studies pave the way for realizing tunable topological states and other unexpected magnetic phases beyond the first moiré flat band based on twisted MoTe2 platform.
△ Less
Submitted 13 June, 2024;
originally announced June 2024.
-
Search for $X(1870)$ via the decay $J/ψ\to ωK^+ K^-η$
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (644 additional authors not shown)
Abstract:
Using a sample of $(10087\pm 44)\times10^{6}$ $J/ψ$ events collected by the BESIII detector at the BEPCII collider, we search for the decay $X(1870)\to K^+ K^-η$ via the $J/ψ\to ωK^+ K^- η$ process for the first time. No significant $X(1870)$ signal is observed. The upper limit on the branching fraction of the decay $ J/ψ\to ωX(1870) \toωK^+ K^- η$ is determined to be $9.55\times 10^{-7}$ at the…
▽ More
Using a sample of $(10087\pm 44)\times10^{6}$ $J/ψ$ events collected by the BESIII detector at the BEPCII collider, we search for the decay $X(1870)\to K^+ K^-η$ via the $J/ψ\to ωK^+ K^- η$ process for the first time. No significant $X(1870)$ signal is observed. The upper limit on the branching fraction of the decay $ J/ψ\to ωX(1870) \toωK^+ K^- η$ is determined to be $9.55\times 10^{-7}$ at the $90\%$ confidence level. In addition, the branching faction $B(J/ψ\toωK^+ K^- η)$ is measured to be $(3.33\pm0.02(\rm{stat.})\pm 0.12(\rm{syst.}))\times 10^{-4}$.
△ Less
Submitted 13 June, 2024;
originally announced June 2024.
-
MGRQ: Post-Training Quantization For Vision Transformer With Mixed Granularity Reconstruction
Authors:
Lianwei Yang,
Zhikai Li,
Junrui Xiao,
Haisong Gong,
Qingyi Gu
Abstract:
Post-training quantization (PTQ) efficiently compresses vision models, but unfortunately, it accompanies a certain degree of accuracy degradation. Reconstruction methods aim to enhance model performance by narrowing the gap between the quantized model and the full-precision model, often yielding promising results. However, efforts to significantly improve the performance of PTQ through reconstruct…
▽ More
Post-training quantization (PTQ) efficiently compresses vision models, but unfortunately, it accompanies a certain degree of accuracy degradation. Reconstruction methods aim to enhance model performance by narrowing the gap between the quantized model and the full-precision model, often yielding promising results. However, efforts to significantly improve the performance of PTQ through reconstruction in the Vision Transformer (ViT) have shown limited efficacy. In this paper, we conduct a thorough analysis of the reasons for this limited effectiveness and propose MGRQ (Mixed Granularity Reconstruction Quantization) as a solution to address this issue. Unlike previous reconstruction schemes, MGRQ introduces a mixed granularity reconstruction approach. Specifically, MGRQ enhances the performance of PTQ by introducing Extra-Block Global Supervision and Intra-Block Local Supervision, building upon Optimized Block-wise Reconstruction. Extra-Block Global Supervision considers the relationship between block outputs and the model's output, aiding block-wise reconstruction through global supervision. Meanwhile, Intra-Block Local Supervision reduces generalization errors by aligning the distribution of outputs at each layer within a block. Subsequently, MGRQ is further optimized for reconstruction through Mixed Granularity Loss Fusion. Extensive experiments conducted on various ViT models illustrate the effectiveness of MGRQ. Notably, MGRQ demonstrates robust performance in low-bit quantization, thereby enhancing the practicality of the quantized model.
△ Less
Submitted 13 June, 2024;
originally announced June 2024.
-
Blind Super-Resolution via Meta-learning and Markov Chain Monte Carlo Simulation
Authors:
Jingyuan Xia,
Zhixiong Yang,
Shengxi Li,
Shuanghui Zhang,
Yaowen Fu,
Deniz Gündüz,
Xiang Li
Abstract:
Learning-based approaches have witnessed great successes in blind single image super-resolution (SISR) tasks, however, handcrafted kernel priors and learning based kernel priors are typically required. In this paper, we propose a Meta-learning and Markov Chain Monte Carlo (MCMC) based SISR approach to learn kernel priors from organized randomness. In concrete, a lightweight network is adopted as k…
▽ More
Learning-based approaches have witnessed great successes in blind single image super-resolution (SISR) tasks, however, handcrafted kernel priors and learning based kernel priors are typically required. In this paper, we propose a Meta-learning and Markov Chain Monte Carlo (MCMC) based SISR approach to learn kernel priors from organized randomness. In concrete, a lightweight network is adopted as kernel generator, and is optimized via learning from the MCMC simulation on random Gaussian distributions. This procedure provides an approximation for the rational blur kernel, and introduces a network-level Langevin dynamics into SISR optimization processes, which contributes to preventing bad local optimal solutions for kernel estimation. Meanwhile, a meta-learning-based alternating optimization procedure is proposed to optimize the kernel generator and image restorer, respectively. In contrast to the conventional alternating minimization strategy, a meta-learning-based framework is applied to learn an adaptive optimization strategy, which is less-greedy and results in better convergence performance. These two procedures are iteratively processed in a plug-and-play fashion, for the first time, realizing a learning-based but plug-and-play blind SISR solution in unsupervised inference. Extensive simulations demonstrate the superior performance and generalization ability of the proposed approach when comparing with state-of-the-arts on synthesis and real-world datasets. The code is available at https://github.com/XYLGroup/MLMC.
△ Less
Submitted 13 June, 2024;
originally announced June 2024.
-
A Single-Step Non-Autoregressive Automatic Speech Recognition Architecture with High Accuracy and Inference Speed
Authors:
Ziyang Zhuang,
Chenfeng Miao,
Kun Zou,
Shuai Gong,
Ming Fang,
Tao Wei,
Zijian Li,
Wei Hu,
Shaojun Wang,
Jing Xiao
Abstract:
Non-autoregressive (NAR) automatic speech recognition (ASR) models predict tokens independently and simultaneously, bringing high inference speed. However, there is still a gap in the accuracy of the NAR models compared to the autoregressive (AR) models. To further narrow the gap between the NAR and AR models, we propose a single-step NAR ASR architecture with high accuracy and inference speed, ca…
▽ More
Non-autoregressive (NAR) automatic speech recognition (ASR) models predict tokens independently and simultaneously, bringing high inference speed. However, there is still a gap in the accuracy of the NAR models compared to the autoregressive (AR) models. To further narrow the gap between the NAR and AR models, we propose a single-step NAR ASR architecture with high accuracy and inference speed, called EfficientASR. It uses an Index Mapping Vector (IMV) based alignment generator to generate alignments during training, and an alignment predictor to learn the alignments for inference. It can be trained end-to-end (E2E) with cross-entropy loss combined with alignment loss. The proposed EfficientASR achieves competitive results on the AISHELL-1 and AISHELL-2 benchmarks compared to the state-of-the-art (SOTA) models. Specifically, it achieves character error rates (CER) of 4.26%/4.62% on the AISHELL-1 dev/test dataset, which outperforms the SOTA AR Conformer with about 30x inference speedup.
△ Less
Submitted 13 June, 2024;
originally announced June 2024.
-
Constraints on Ultra Heavy Dark Matter Properties from Dwarf Spheroidal Galaxies with LHAASO Observations
Authors:
Zhen Cao,
F. Aharonian,
Q. An,
Axikegu,
Y. X. Bai,
Y. W. Bao,
D. Bastieri,
X. J. Bi,
Y. J. Bi,
J. T. Cai,
Q. Cao,
W. Y. Cao,
Zhe Cao,
J. Chang,
J. F. Chang,
A. M. Chen,
E. S. Chen,
Liang Chen,
Lin Chen,
Long Chen,
M. J. Chen,
M. L. Chen,
Q. H. Chen,
S. H. Chen,
S. Z. Chen
, et al. (255 additional authors not shown)
Abstract:
In this work we try to search for signals generated by ultra-heavy dark matter at the Large High Altitude Air Shower Observatory (LHAASO) data. We look for possible gamma-ray by dark matter annihilation or decay from 16 dwarf spheroidal galaxies in the field of view of LHAASO. Dwarf spheroidal galaxies are among the most promising targets for indirect detection of dark matter which have low fluxes…
▽ More
In this work we try to search for signals generated by ultra-heavy dark matter at the Large High Altitude Air Shower Observatory (LHAASO) data. We look for possible gamma-ray by dark matter annihilation or decay from 16 dwarf spheroidal galaxies in the field of view of LHAASO. Dwarf spheroidal galaxies are among the most promising targets for indirect detection of dark matter which have low fluxes of astrophysical $γ$-ray background while large amount of dark matter. By analyzing more than 700 days observational data at LHAASO, no significant dark matter signal from 1 TeV to 1 EeV is detected. Accordingly we derive the most stringent constraints on the ultra-heavy dark matter annihilation cross-section up to EeV. The constraints on the lifetime of dark matter in decay mode are also derived.
△ Less
Submitted 12 June, 2024;
originally announced June 2024.
-
What If We Recaption Billions of Web Images with LLaMA-3?
Authors:
Xianhang Li,
Haoqin Tu,
Mude Hui,
Zeyu Wang,
Bingchen Zhao,
Junfei Xiao,
Sucheng Ren,
Jieru Mei,
Qing Liu,
Huangjie Zheng,
Yuyin Zhou,
Cihang Xie
Abstract:
Web-crawled image-text pairs are inherently noisy. Prior studies demonstrate that semantically aligning and enriching textual descriptions of these pairs can significantly enhance model training across various vision-language tasks, particularly text-to-image generation. However, large-scale investigations in this area remain predominantly closed-source. Our paper aims to bridge this community eff…
▽ More
Web-crawled image-text pairs are inherently noisy. Prior studies demonstrate that semantically aligning and enriching textual descriptions of these pairs can significantly enhance model training across various vision-language tasks, particularly text-to-image generation. However, large-scale investigations in this area remain predominantly closed-source. Our paper aims to bridge this community effort, leveraging the powerful and \textit{open-sourced} LLaMA-3, a GPT-4 level LLM. Our recaptioning pipeline is simple: first, we fine-tune a LLaMA-3-8B powered LLaVA-1.5 and then employ it to recaption 1.3 billion images from the DataComp-1B dataset. Our empirical results confirm that this enhanced dataset, Recap-DataComp-1B, offers substantial benefits in training advanced vision-language models. For discriminative models like CLIP, we observe enhanced zero-shot performance in cross-modal retrieval tasks. For generative models like text-to-image Diffusion Transformers, the generated images exhibit a significant improvement in alignment with users' text instructions, especially in following complex queries. Our project page is https://www.haqtu.me/Recap-Datacomp-1B/
△ Less
Submitted 18 June, 2024; v1 submitted 12 June, 2024;
originally announced June 2024.
-
Observation of $η_{c}$(1S, 2S) and $χ_{cJ}$ decays to 2$(π^{+}π^{-})η$ via $ψ$(3686) radiative transitions
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (636 additional authors not shown)
Abstract:
Based on $2.7 \times 10^9~ψ(3686)$ decays collected with the BESIII detector, the radiative decay $ψ(3686)\to\gamma2(π^{+}π^{-})η$ is investigated to measure properties of S- and P-wave charmonium states. The branching fraction of the decay $η_{c}(1S) \to 2(π^{+}π^{-})η$, which is found to have a strong dependence on the interference pattern between $η_c(1S)$ and non-$η_c(1S)$ processes, is measur…
▽ More
Based on $2.7 \times 10^9~ψ(3686)$ decays collected with the BESIII detector, the radiative decay $ψ(3686)\to\gamma2(π^{+}π^{-})η$ is investigated to measure properties of S- and P-wave charmonium states. The branching fraction of the decay $η_{c}(1S) \to 2(π^{+}π^{-})η$, which is found to have a strong dependence on the interference pattern between $η_c(1S)$ and non-$η_c(1S)$ processes, is measured in both destructive and constructive interference scenarios for the first time. The mass and width of the $η_{c}(1S)$ are measured to be $M=(2984.14 \pm 0.13 \pm 0.38)$ MeV/$c^{2}$ and $Γ=(28.82 \pm 0.11 \pm 0.82)$ MeV, respectively. Clear signals for the decays of the $χ_{cJ}(J=0,1,2)$ and the $η_{c}(2S)$ to $2(π^{+}π^{-})η$ are also observed for the first time, and the corresponding branching fractions are measured. The ratio of the branching fractions between the $η_{c}(2S)$ and $η_{c}(1S)$ decays is significantly lower than the theoretical prediction, which might suggest different dynamics in their decays.
△ Less
Submitted 12 June, 2024;
originally announced June 2024.
-
Holistic Memory Diversification for Incremental Learning in Growing Graphs
Authors:
Ziyue Qiao,
Junren Xiao,
Qingqiang Sun,
Meng Xiao,
Hui Xiong
Abstract:
This paper addresses the challenge of incremental learning in growing graphs with increasingly complex tasks. The goal is to continually train a graph model to handle new tasks while retaining its inference ability on previous tasks. Existing methods usually neglect the importance of memory diversity, limiting in effectively selecting high-quality memory from previous tasks and remembering broad p…
▽ More
This paper addresses the challenge of incremental learning in growing graphs with increasingly complex tasks. The goal is to continually train a graph model to handle new tasks while retaining its inference ability on previous tasks. Existing methods usually neglect the importance of memory diversity, limiting in effectively selecting high-quality memory from previous tasks and remembering broad previous knowledge within the scarce memory on graphs. To address that, we introduce a novel holistic Diversified Memory Selection and Generation (DMSG) framework for incremental learning in graphs, which first introduces a buffer selection strategy that considers both intra-class and inter-class diversities, employing an efficient greedy algorithm for sampling representative training nodes from graphs into memory buffers after learning each new task. Then, to adequately rememorize the knowledge preserved in the memory buffer when learning new tasks, we propose a diversified memory generation replay method. This method first utilizes a variational layer to generate the distribution of buffer node embeddings and sample synthesized ones for replaying. Furthermore, an adversarial variational embedding learning method and a reconstruction-based decoder are proposed to maintain the integrity and consolidate the generalization of the synthesized node embeddings, respectively. Finally, we evaluate our model on node classification tasks involving increasing class numbers. Extensive experimental results on publicly accessible datasets demonstrate the superiority of DMSG over state-of-the-art methods.
△ Less
Submitted 11 June, 2024;
originally announced June 2024.
-
Strong and weak $CP$ tests in sequential decays of polarized $Σ^0$ hyperons
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (644 additional authors not shown)
Abstract:
The $J/ψ, ψ(3686) \to Σ^0 \barΣ^{0}$ processes and subsequent decays are studied using the world's largest $J/ψ$ and $ψ(3686)$ data samples collected with the BESIII detector. The strong-$CP$ symmetry is tested in the decays of the $Σ^0$ hyperons for the first time by measuring the decay parameters, $α_{Σ^0} = -0.0017 \pm 0.0021 \pm 0.0018$ and $\barα_{Σ^0} = 0.0021 \pm 0.0020 \pm 0.0022$. The wea…
▽ More
The $J/ψ, ψ(3686) \to Σ^0 \barΣ^{0}$ processes and subsequent decays are studied using the world's largest $J/ψ$ and $ψ(3686)$ data samples collected with the BESIII detector. The strong-$CP$ symmetry is tested in the decays of the $Σ^0$ hyperons for the first time by measuring the decay parameters, $α_{Σ^0} = -0.0017 \pm 0.0021 \pm 0.0018$ and $\barα_{Σ^0} = 0.0021 \pm 0.0020 \pm 0.0022$. The weak-$CP$ test is performed in the subsequent decays of their daughter particles $Λ$ and $\barΛ$. Also for the first time, the transverse polarizations of the $Σ^0$ hyperons in $J/ψ$ and $ψ(3686)$ decays are observed with opposite directions, and the ratios between the S-wave and D-wave contributions of the $J/ψ, ψ(3686) \to Σ^0 \barΣ^{0}$ decays are obtained. These results are crucial to understand the decay dynamics of the charmonium states and the production mechanism of the $Σ^0-\barΣ^0$ pairs.
△ Less
Submitted 16 July, 2024; v1 submitted 10 June, 2024;
originally announced June 2024.