-
Measurement of the branching fraction of $D^+_s\to \ell^+ν_\ell$ via $e^+e^-\to D^{*+}_{s} D^{*-}_{s}$
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (634 additional authors not shown)
Abstract:
Based on $10.64~\mathrm{fb}^{-1}$ of $e^+e^-$ collision data taken at center-of-mass energies between 4.237 and 4.699 GeV with the BESIII detector, we study the leptonic $D^+_s$ decays using the $e^+e^-\to D^{*+}_{s} D^{*-}_{s}$ process. The branching fractions of $D_s^+\to\ell^+ν_{\ell}\,(\ell=μ,τ)$ are measured to be $\mathcal{B}(D_s^+\toμ^+ν_μ)=(\bfmuv)\%$ and $\mathcal{B}(D_s^+\toτ^+ν_τ)=(\bftauv)\%$, respectively. The product of the decay constant and Cabibbo-Kobayashi-Maskawa matrix element $|V_{cs}|$ is determined to be $f_{D_s^+}|V_{cs}|=(\mufdsxvcsresult)_{μν}~\mathrm{MeV}$ and $f_{D_s^+}|V_{cs}|=(\taufdsxvcsresult)_{τν}~\mathrm{MeV}$, respectively. Taking the value of $|V_{cs}|$ from a global fit in the Standard Model, we obtain ${f_{D^+_s}}=(\mufdsresult)_{μν}$\,MeV and ${f_{D^+_s}}=(\taufdsresult)_{τν}$\,MeV, respectively. Conversely, taking the value for $f_{D_s^+}$ from the latest lattice quantum chromodynamics calculation, we obtain $|V_{cs}| =(\muvcsresult)_{μν}$ and $|V_{cs}| = (\tauvcsresult)_{τν}$, respectively.
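For orientation, the way a measured branching fraction is turned into the product $f_{D_s^+}|V_{cs}|$ can be sketched with the standard lowest-order Standard Model expression for a purely leptonic decay. The abstract does not quote this formula, so it is given here as an assumption about the relation being used; $G_F$ is the Fermi constant, $\tau_{D_s^+}$ the $D_s^+$ lifetime, $m_\ell$ and $m_{D_s^+}$ the lepton and meson masses, in natural units, and radiative corrections are neglected:

$$
\mathcal{B}(D_s^+\to\ell^+\nu_\ell) = \tau_{D_s^+}\,\frac{G_F^2}{8\pi}\, f_{D_s^+}^2\,|V_{cs}|^2\, m_\ell^2\, m_{D_s^+}\left(1-\frac{m_\ell^2}{m_{D_s^+}^2}\right)^{2}
$$

Solving for $f_{D_s^+}|V_{cs}|$ gives the product from the branching fraction; fixing either factor from an external input (a global fit or a lattice calculation) then yields the other.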
Submitted 16 July, 2024;
originally announced July 2024.
-
Search for the rare $Λ_c^+ \to p μ^+ μ^-$ decay
Authors:
LHCb collaboration,
R. Aaij,
A. S. W. Abdelmotteleb,
C. Abellan Beteta,
F. Abudinén,
T. Ackernley,
A. A. Adefisoye,
B. Adeva,
M. Adinolfi,
P. Adlarson,
C. Agapopoulou,
C. A. Aidala,
Z. Ajaltouni,
S. Akar,
K. Akiba,
P. Albicocco,
J. Albrecht,
F. Alessio,
M. Alexander,
Z. Aliouche,
P. Alvarez Cartelle,
R. Amalric,
S. Amato,
J. L. Amey,
Y. Amhis
, et al. (1062 additional authors not shown)
Abstract:
A search for the nonresonant $Λ_c^+ \to p μ^+ μ^-$ decay is performed using proton-proton collision data recorded at a centre-of-mass energy of 13 TeV by the LHCb experiment, corresponding to an integrated luminosity of 5.4 fb$^{-1}$. No evidence for the decay is found in the dimuon invariant-mass regions where the expected contribution of resonances is subdominant. The upper limit on the branching fraction of the $Λ_c^+ \to p μ^+ μ^-$ decay is determined to be $2.9~(3.2) \times 10^{-8}$ at 90% (95%) confidence level. The branching fractions in the dimuon invariant-mass regions dominated by the $η$, $ρ$ and $ω$ resonances are also determined.
Submitted 16 July, 2024;
originally announced July 2024.
-
cDP-MIL: Robust Multiple Instance Learning via Cascaded Dirichlet Process
Authors:
Yihang Chen,
Tsai Hor Chan,
Guosheng Yin,
Yuming Jiang,
Lequan Yu
Abstract:
Multiple instance learning (MIL) has been extensively applied to whole slide histopathology image (WSI) analysis. The existing aggregation strategy in MIL, which primarily relies on the first-order distance (e.g., mean difference) between instances, fails to accurately approximate the true feature distribution of each instance, leading to biased slide-level representations. Moreover, the scarcity of WSI observations easily leads to model overfitting, resulting in unstable testing performance and limited generalizability. To tackle these challenges, we propose a new Bayesian nonparametric framework for multiple instance learning, which adopts a cascade of Dirichlet processes (cDP) to incorporate the instance-to-bag characteristic of the WSIs. We perform feature aggregation based on the latent clusters formed by the Dirichlet process, which incorporates the covariances of the patch features and forms more representative clusters. We then perform bag-level prediction with another Dirichlet process model on the bags, which imposes a natural regularization on learning to prevent overfitting and enhance generalizability. Moreover, as a Bayesian nonparametric method, the cDP model can accurately generate posterior uncertainty, which allows for the detection of outlier samples and tumor localization. Extensive experiments on five WSI benchmarks validate the superior performance of our method, as well as its generalizability and ability to estimate uncertainties. Codes are available at https://github.com/HKU-MedAI/cDPMIL.
Submitted 16 July, 2024;
originally announced July 2024.
-
Animate3D: Animating Any 3D Model with Multi-view Video Diffusion
Authors:
Yanqin Jiang,
Chaohui Yu,
Chenjie Cao,
Fan Wang,
Weiming Hu,
Jin Gao
Abstract:
Recent advances in 4D generation mainly focus on generating 4D content by distilling pre-trained text or single-view image-conditioned models. It is inconvenient for them to take advantage of various off-the-shelf 3D assets with multi-view attributes, and their results suffer from spatiotemporal inconsistency owing to the inherent ambiguity in the supervision signals. In this work, we present Animate3D, a novel framework for animating any static 3D model. The core idea is two-fold: 1) We propose a novel multi-view video diffusion model (MV-VDM) conditioned on multi-view renderings of the static 3D object, which is trained on our presented large-scale multi-view video dataset (MV-Video). 2) Based on MV-VDM, we introduce a framework combining reconstruction and 4D Score Distillation Sampling (4D-SDS) to leverage the multi-view video diffusion priors for animating 3D objects. Specifically, for MV-VDM, we design a new spatiotemporal attention module to enhance spatial and temporal consistency by integrating 3D and video diffusion models. Additionally, we leverage the static 3D model's multi-view renderings as conditions to preserve its identity. For animating 3D models, an effective two-stage pipeline is proposed: we first reconstruct motions directly from generated multi-view videos, followed by the introduced 4D-SDS to refine both appearance and motion. Qualitative and quantitative experiments demonstrate that Animate3D significantly outperforms previous approaches. Data, code, and models will be open-released.
Submitted 16 July, 2024;
originally announced July 2024.
-
Feature Inference Attack on Shapley Values
Authors:
Xinjian Luo,
Yangfan Jiang,
Xiaokui Xiao
Abstract:
As a solution concept in cooperative game theory, the Shapley value is highly recognized in model interpretability studies and widely adopted by the leading Machine Learning as a Service (MLaaS) providers, such as Google, Microsoft, and IBM. However, although Shapley value-based model interpretability methods have been thoroughly studied, few researchers have considered the privacy risks incurred by Shapley values, even though interpretability and privacy are two foundations of machine learning (ML) models.
In this paper, we investigate the privacy risks of Shapley value-based model interpretability methods using feature inference attacks: reconstructing the private model inputs based on their Shapley value explanations. Specifically, we present two adversaries. The first adversary can reconstruct the private inputs by training an attack model based on an auxiliary dataset and black-box access to the model interpretability services. The second adversary, even without any background knowledge, can successfully reconstruct most of the private features by exploiting the local linear correlations between the model inputs and outputs. We perform the proposed attacks on the leading MLaaS platforms, i.e., Google Cloud, Microsoft Azure, and IBM aix360. The experimental results demonstrate the vulnerability of the state-of-the-art Shapley value-based model interpretability methods used in the leading MLaaS platforms and highlight the significance and necessity of designing privacy-preserving model interpretability methods in future studies. To the best of our knowledge, this is also the first work that investigates the privacy risks of Shapley values.
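To illustrate why linear input-output relations can leak features, the following minimal sketch computes exact Shapley values for a toy linear model and inverts them. It is not the paper's attack: the model weights, the baseline, and the helper names (`shapley_values`, `linear_model`) are hypothetical, and unlike the second adversary described above, this toy assumes the weights and baseline are known.

```python
from itertools import combinations
from math import factorial


def shapley_values(f, x, baseline):
    """Exact Shapley values of f at x, with absent features set to the baseline."""
    n = len(x)

    def value(subset):
        # Features in `subset` take their real value, the rest the baseline value.
        z = [x[i] if i in subset else baseline[i] for i in range(n)]
        return f(z)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for k in range(n):
            # Standard Shapley weight for coalitions of size k.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for s in combinations(others, k):
                s = set(s)
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return phi


# Hypothetical linear model and inputs, chosen only for illustration.
weights = [2.0, -1.0, 0.5]
bias = 0.3


def linear_model(z):
    return bias + sum(w * v for w, v in zip(weights, z))


baseline = [0.0, 1.0, 2.0]
private_x = [1.5, -2.0, 4.0]

phi = shapley_values(linear_model, private_x, baseline)
# For a linear model, phi_i = w_i * (x_i - baseline_i), so x_i can be solved for.
recovered = [b + p / w for b, p, w in zip(baseline, phi, weights)]
print(recovered)
```

For a linear model each explanation value is an affine function of the corresponding input, which is why the inversion in the last step recovers the private inputs up to floating-point error.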
Submitted 15 July, 2024;
originally announced July 2024.
-
Phases Calibration of RIS Using Backpropagation Algorithm
Authors:
Wei Zhang,
Bin Zhou,
Tianyi Zhang,
Yi Jiang,
Zhiyong Bu
Abstract:
Reconfigurable intelligent surface (RIS) technology has emerged in recent years as a promising solution to the ever-increasing demand for wireless communication capacity. In practice, however, elements of RIS may suffer from phase deviations, which need to be properly estimated and calibrated. This paper models the problem of over-the-air (OTA) estimation of the RIS elements as a quasi-neural network (QNN) so that the phase estimates can be obtained using the classic backpropagation (BP) algorithm. We also derive the Cramér Rao Bounds (CRBs) for the phases of the RIS elements as a benchmark of the proposed approach. The simulation results verify the effectiveness of the proposed algorithm by showing that the root mean square errors (RMSEs) of the phase estimates are close to the CRBs.
Submitted 15 July, 2024;
originally announced July 2024.
-
FabGPT: An Efficient Large Multimodal Model for Complex Wafer Defect Knowledge Queries
Authors:
Yuqi Jiang,
Xudong Lu,
Qian Jin,
Qi Sun,
Hanming Wu,
Cheng Zhuo
Abstract:
Intelligence is key to advancing integrated circuit (IC) fabrication. Recent breakthroughs in Large Multimodal Models (LMMs) have unlocked unparalleled abilities in understanding images and text, fostering intelligent fabrication. Leveraging the power of LMMs, we introduce FabGPT, a customized IC fabrication large multimodal model for wafer defect knowledge query. FabGPT manifests expertise in conducting defect detection in Scanning Electron Microscope (SEM) images, performing root cause analysis, and providing expert question-answering (Q&A) on fabrication processes. FabGPT matches enhanced multimodal features to automatically detect minute defects under complex wafer backgrounds and reduce the subjectivity of manual threshold settings. Besides, the proposed modulation module and interactive corpus training strategy embed wafer defect knowledge into the pre-trained model, effectively balancing Q&A queries related to defect knowledge and original knowledge and mitigating the modality bias issues. Experiments on in-house fab data (SEM-WaD) show that our FabGPT achieves significant performance improvement in wafer defect detection and knowledge querying.
Submitted 15 July, 2024;
originally announced July 2024.
-
Graphusion: Leveraging Large Language Models for Scientific Knowledge Graph Fusion and Construction in NLP Education
Authors:
Rui Yang,
Boming Yang,
Sixun Ouyang,
Tianwei She,
Aosong Feng,
Yuang Jiang,
Freddy Lecue,
Jinghui Lu,
Irene Li
Abstract:
Knowledge graphs (KGs) are crucial in the field of artificial intelligence and are widely applied in downstream tasks, such as enhancing Question Answering (QA) systems. The construction of KGs typically requires significant effort from domain experts. Recently, Large Language Models (LLMs) have been used for knowledge graph construction (KGC); however, most existing approaches focus on a local perspective, extracting knowledge triplets from individual sentences or documents. In this work, we introduce Graphusion, a zero-shot KGC framework from free text. The core fusion module provides a global view of triplets, incorporating entity merging, conflict resolution, and novel triplet discovery. We showcase how Graphusion could be applied to the natural language processing (NLP) domain and validate it in the educational scenario. Specifically, we introduce TutorQA, a new expert-verified benchmark for graph reasoning and QA, comprising six tasks and a total of 1,200 QA pairs. Our evaluation demonstrates that Graphusion surpasses supervised baselines by up to 10% in accuracy on link prediction. Additionally, it achieves average scores of 2.92 and 2.37 out of 3 in human evaluations for concept entity extraction and relation recognition, respectively.
Submitted 15 July, 2024;
originally announced July 2024.
-
Automated high-resolution backscattered-electron imaging at macroscopic scale
Authors:
Zhiyuan Lang,
Zunshuai Zhang,
Lei Wang,
Yuhan Liu,
Weixiong Qian,
Shenghua Zhou,
Ying Jiang,
Tongyi Zhang,
Jiong Yang
Abstract:
Scanning electron microscopy (SEM) has been widely utilized in the field of materials science due to its significant advantages, such as large depth of field, wide field of view, and excellent stereoscopic imaging. However, at high magnification, the limited imaging range in SEM cannot cover all the possible inhomogeneous microstructures. In this research, we propose a novel approach for generating high-resolution SEM images across multiple scales, enabling a single image to capture physical dimensions at the centimeter level while preserving submicron-level details. We take SEM imaging of the AlCoCrFeNi2.1 eutectic high entropy alloy (EHEA) as an example. SEM videos and image stitching are combined to fulfill this goal, and the video-extracted low-definition (LD) images are clarified by a well-trained denoising model. Furthermore, we segment the macroscopic image of the EHEA, and the areas of the various microstructures are distinguished. Combining the segmentation results and hardness experiments, we found that the hardness is positively correlated with the content of body-centered cubic (BCC) phase, negatively correlated with the lamella width, and not significantly related to the proportion of lamellar structures. Our work provides a feasible solution to generate macroscopic images based on SEMs for further analysis of the correlations between the microstructures and spatial distribution, and can be widely applied to other types of microscopes.
Submitted 15 July, 2024;
originally announced July 2024.
-
Exploring incentive strategies and predicting development trends for new energy vehicles
Authors:
Tao Jin,
Yulian Jiang,
Xingwen Liu
Abstract:
To facilitate the development of new energy vehicles (NEVs), we construct a game model between vehicle manufacturers and consumers to explore their interactions. In the model, we propose the Expectation Supply-Demand Game (ESDG), construct the consumer purchasing decision-making process with feedback and analyse the stability of the system under different feedback factors. We process the data of the model in numerical simulation through Min-Max normalisation and predict the development of NEVs. The results show that: (1) An evolutionary stabilisation strategy (ESS) emerges in the evolutionary game model with the introduction of feedback. (2) The Min-Max normalisation method is conducive to the accuracy of the model. (3) Excessive advertising and marketing may cause consumer boredom. (4) The establishment of an appropriate battery compensation and replacement insurance is conducive to the development of NEVs. (5) The production and sales ratios of China's NEVs are predicted to reach 37.2% and 36.9%, respectively, in 2024.
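As a reminder of what Min-Max normalisation does, the sketch below rescales a series linearly onto the interval [0, 1]. The function name `min_max_normalise` and the sample figures are hypothetical and are not taken from the paper; the abstract does not say how constant series or other edge cases are handled, so the choice made here is an assumption.

```python
def min_max_normalise(values):
    """Rescale values linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant series has no spread; mapping it to zeros is an assumption.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical yearly sales figures, used only to show the rescaling.
sales = [120.0, 350.0, 680.0, 950.0]
scaled = min_max_normalise(sales)
print(scaled)
```

The rescaling preserves the ordering and relative spacing of the data while putting quantities with different units on a common scale.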
Submitted 15 July, 2024;
originally announced July 2024.
-
Triggering the Untriggered: The First Einstein Probe-Detected Gamma-Ray Burst 240219A and Its Implications
Authors:
Yi-Han Iris Yin,
Bin-Bin Zhang,
Jun Yang,
Hui Sun,
Chen Zhang,
Yi-Xuan Shao,
You-Dong Hu,
Zi-Pei Zhu,
Dong Xu,
Li An,
He Gao,
Xue-Feng Wu,
Bing Zhang,
Alberto Javier Castro-Tirado,
Shashi B. Pandey,
Arne Rau,
Weihua Lei,
Wei Xie,
Giancarlo Ghirlanda,
Luigi Piro,
Paul O'Brien,
Eleonora Troja,
Peter Jonker,
Yun-Wei Yu,
Jie An
, et al. (26 additional authors not shown)
Abstract:
The Einstein Probe (EP) achieved its first detection and localization of a bright X-ray flare, EP240219a, on February 19, 2024, during its commissioning phase. Subsequent targeted searches triggered by the EP240219a alert identified a faint, untriggered gamma-ray burst (GRB) in the archived data of Fermi/GBM, Swift/BAT, Insight-HXMT/HE and INTEGRAL/SPI-ACS. The EP/WXT light curve reveals a long duration of approximately 160 seconds with a slow decay, whereas the Fermi/GBM light curve shows a total duration of approximately 70 seconds. The peak in the Fermi/GBM light curve occurs slightly later with respect to the peak seen in the EP/WXT light curve. Our spectral analysis shows that a single cutoff power-law model effectively describes the joint EP/WXT-Fermi/GBM spectra in general, indicating coherent broad emission typical of GRBs. The model yielded a photon index of $\sim -1.70 \pm 0.05$ and a peak energy of $\sim 257 \pm 134$ keV. After detection of GRB 240219A, long-term observations identified several candidates in optical and radio wavelengths, none of which was confirmed as the afterglow counterpart during subsequent optical and near-infrared follow-ups. The analysis of GRB 240219A classifies it as an X-ray rich GRB with a high peak energy, presenting both challenges and opportunities for studying the physical origins of X-ray flashes (XRFs), X-ray rich GRBs (XRRs), and classical GRBs (C-GRBs). Furthermore, linking the cutoff power-law component to non-thermal synchrotron radiation suggests that the burst is driven by a Poynting flux-dominated outflow.
Submitted 14 July, 2024;
originally announced July 2024.
-
FedVAE: Trajectory privacy preserving based on Federated Variational AutoEncoder
Authors:
Yuchen Jiang,
Ying Wu,
Shiyao Zhang,
James J. Q. Yu
Abstract:
The use of trajectory data with abundant spatial-temporal information is pivotal in Intelligent Transport Systems (ITS) and various traffic system tasks. Location-Based Services (LBS) capitalize on this trajectory data to offer users personalized services tailored to their location information. However, this trajectory data contains sensitive information about users' movement patterns and habits, necessitating confidentiality and protection from unknown collectors. To address this challenge, privacy-preserving methods like K-anonymity and Differential Privacy have been proposed to safeguard private information in the dataset. Despite their effectiveness, these methods can impact the original features by introducing perturbations or generating unrealistic trajectory data, leading to suboptimal performance in downstream tasks. To overcome these limitations, we propose a Federated Variational AutoEncoder (FedVAE) approach, which effectively generates a new trajectory dataset while preserving the confidentiality of private information and retaining the structure of the original features. In addition, FedVAE leverages Variational AutoEncoder (VAE) to maintain the original feature space and generate new trajectory data, and incorporates Federated Learning (FL) during the training stage, ensuring that users' data remains locally stored to protect their personal information. The results demonstrate its superior performance compared to other existing methods, affirming FedVAE as a promising solution for enhancing data privacy and utility in location-based applications.
Submitted 12 July, 2024;
originally announced July 2024.
-
Numerical Analysis on the Spatiotemporal Characteristics of the Portevin-Le Chatelier Effect in Ti-12Mo Alloy
Authors:
Shiyuan Luo,
Yongxin Jiang,
Sandrine Thuillier,
Philippe Castany,
Liangcai Zeng
Abstract:
A simplified 3D FE model based on McCormick's model is developed to numerically predict the spatiotemporal behaviors of the PLC effect in Ti-12Mo alloy tensile tests at 350 degrees C with strain rates from the order of $10^{-4}$ s$^{-1}$ to $10^{-2}$ s$^{-1}$. The material parameter identification procedure is first presented in detail, and the simulated results are highly consistent with experimental ones, especially in terms of stress drop magnitudes and PLC band widths. The distribution of simulated stress drop magnitudes at a constant tensile velocity (0.01 mm/s) follows a normal distribution and its peak value is in the range of 26-28 MPa. Furthermore, the simulated band width fluctuates slightly with increasing true strain and its average value is about 1.5 mm. In addition, the staircase behavior of strain-time curves and the hopping propagation of the PLC band are observed in the Ti-12Mo alloy tensile process, which are related to the strain localization and stress drop magnitudes.
Submitted 12 July, 2024;
originally announced July 2024.
-
Optimization of Long-Haul C+L+S Systems by means of a Closed Form EGN Model
Authors:
Y. Jiang,
J. Sarkis,
A. Nespola,
F. Forghieri,
S. Piciaccia,
A. Tanzi,
M. Ranjbar Zefreh,
P. Poggiolini
Abstract:
We investigate C+L+S long-haul systems using a closed-form GN/EGN non-linearity model. We perform accurate launch power and Raman pump optimization. We show a potential 4x throughput increase over legacy C-band systems in 1000 km links, using moderate S-only Raman amplification. We simultaneously achieve extra-flat GSNR, within +/-0.5 dB across the whole C+L+S spectrum.
Submitted 12 July, 2024;
originally announced July 2024.
-
Privacy-Preserving Collaborative Genomic Research: A Real-Life Deployment and Vision
Authors:
Zahra Rahmani,
Nahal Shahini,
Nadav Gat,
Zebin Yun,
Yuzhou Jiang,
Ofir Farchy,
Yaniv Harel,
Vipin Chaudhary,
Mahmood Sharif,
Erman Ayday
Abstract:
The data revolution holds significant promise for the health sector. Vast amounts of data collected from individuals will be transformed into knowledge, AI models, predictive systems, and best practices. One area of health that stands to benefit greatly is the genomic domain. Progress in AI, machine learning, and data science has opened new opportunities for genomic research, promising breakthroughs in personalized medicine. However, increasing awareness of privacy and cybersecurity necessitates robust solutions to protect sensitive data in collaborative research. This paper presents a practical deployment of a privacy-preserving framework for genomic research, developed in collaboration with Lynx$.$MD, a platform for secure health data collaboration. The framework addresses critical cybersecurity and privacy challenges, enabling the privacy-preserving sharing and analysis of genomic data while mitigating risks associated with data breaches. By integrating advanced privacy-preserving algorithms, the solution ensures the protection of individual privacy without compromising data utility. A unique feature of the system is its ability to balance trade-offs between data sharing and privacy, providing stakeholders tools to quantify privacy risks and make informed decisions. Implementing the framework within Lynx$.$MD involves encoding genomic data into binary formats and applying noise through controlled perturbation techniques. This approach preserves essential statistical properties of the data, facilitating effective research and analysis. Moreover, the system incorporates real-time data monitoring and advanced visualization tools, enhancing user experience and decision-making. The paper highlights the need for tailored privacy attacks and defenses specific to genomic data. Addressing these challenges fosters collaboration in genomic research, advancing personalized medicine and public health.
Submitted 12 July, 2024;
originally announced July 2024.
-
Infinite Motion: Extended Motion Generation via Long Text Instructions
Authors:
Mengtian Li,
Chengshuo Zhai,
Shengxiang Yao,
Zhifeng Xie,
Keyu Chen,
Yu-Gang Jiang
Abstract:
In the realm of motion generation, the creation of long-duration, high-quality motion sequences remains a significant challenge. This paper presents our groundbreaking work on "Infinite Motion", a novel approach that leverages long text for extended motion generation, effectively bridging the gap between short and long-duration motion synthesis. Our core insight is the strategic extension and reassembly of existing high-quality text-motion datasets, which has led to the creation of a novel benchmark dataset to facilitate the training of models for extended motion sequences. A key innovation of our model is its ability to accept arbitrary lengths of text as input, enabling the generation of motion sequences tailored to specific narratives or scenarios. Furthermore, we incorporate a timestamp design for text, which allows precise editing of local segments within the generated sequences, offering unparalleled control and flexibility in motion synthesis. We further demonstrate the versatility and practical utility of "Infinite Motion" through three specific applications: natural language interactive editing, motion sequence editing within long sequences, and splicing of independent motion sequences. Each application highlights the adaptability of our approach and broadens the spectrum of possibilities for research and development in motion generation. Through extensive experiments, we demonstrate the superior performance of our model in generating long sequence motions compared to existing methods. Project page: https://shuochengzhai.github.io/Infinite-motion.github.io/
Submitted 12 July, 2024; v1 submitted 11 July, 2024;
originally announced July 2024.
-
Accurate Cooperative Localization Utilizing LiDAR-equipped Roadside Infrastructure for Autonomous Driving
Authors:
Yuze Jiang,
Ehsan Javanmardi,
Manabu Tsukada,
Hiroshi Esaki
Abstract:
Recent advancements in LiDAR technology have significantly lowered costs and improved both its precision and resolution, thereby solidifying its role as a critical component in autonomous vehicle localization. Using sophisticated 3D registration algorithms, LiDAR now facilitates vehicle localization with centimeter-level accuracy. However, these high-precision techniques often face reliability challenges in environments devoid of identifiable map features. To address this limitation, we propose a novel approach that utilizes road side units (RSU) with vehicle-to-infrastructure (V2I) communications to assist vehicle self-localization. By using RSUs as stationary reference points and processing real-time LiDAR data, our method enhances localization accuracy through a cooperative localization framework. By placing RSUs in critical areas, our proposed method can improve the reliability and precision of vehicle localization when the traditional vehicle self-localization technique falls short. Evaluation results in an end-to-end autonomous driving simulator AWSIM show that the proposed method can improve localization accuracy by up to 80% under vulnerable environments compared to traditional localization methods. Additionally, our method also demonstrates robust resistance to network delays and packet loss in heterogeneous network environments.
Submitted 11 July, 2024;
originally announced July 2024.
-
Optimum Launch Power in Multiband Systems
Authors:
Yanchao Jiang,
Fabrizio Forghieri,
Stefano Piciaccia,
Gabriella Bosco,
Pierluigi Poggiolini
Abstract:
We investigate the residual throughput penalty due to ISRS, after power-optimization, in multiband systems. We show it to be mild. We also revisit the launch power optimization 3-dB rule. We find that using it is possible but not advisable due to increased GSNR non-uniformity.
Submitted 11 July, 2024;
originally announced July 2024.
-
Improving the Automated Coronal Jet Identification with U-NET
Authors:
Jiajia Liu,
Chunyu Ji,
Yimin Wang,
Szabolcs Soós,
Ye Jiang,
Robertus Erdélyi,
M. B. Korsós,
Yuming Wang
Abstract:
Coronal jets are one of the most common eruptive activities in the solar atmosphere. They are related to rich physics processes, including but not limited to magnetic reconnection, flaring, instabilities, and plasma heating. Automated identification of off-limb coronal jets has been difficult due to their abundant nature, complex appearance, and relatively small size compared to other features in the corona. In this paper, we present an automated coronal jet identification algorithm (AJIA) that utilizes true and fake jets previously detected by a laborious semi-automated jet detection algorithm (SAJIA, Liu et al. 2023) as the input of an image segmentation neural network U-NET. It is found that AJIA could achieve a much higher (0.81) detecting precision than SAJIA (0.34), while also giving the probability that each pixel in an input image belongs to a jet. We demonstrate that with the aid of artificial neural networks, AJIA could enable fast, accurate, and real-time coronal jet identification from SDO/AIA 304 Å observations, which are essential in studying the collective and long-term behavior of coronal jets and their relation with the solar activity cycles.
Submitted 10 July, 2024;
originally announced July 2024.
-
Study of the decay and production properties of $D_{s1}(2536)$ and $D_{s2}^*(2573)$
Authors:
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere,
A. Brueggemann
, et al. (645 additional authors not shown)
Abstract:
The $e^+e^-\rightarrow D_s^+D_{s1}(2536)^-$ and $e^+e^-\rightarrow D_s^+D^*_{s2}(2573)^-$ processes are studied using data samples collected with the BESIII detector at center-of-mass energies from 4.530 to 4.946~GeV. The absolute branching fractions of $D_{s1}(2536)^- \rightarrow \bar{D}^{*0}K^-$ and $D_{s2}^*(2573)^- \rightarrow \bar{D}^0K^-$ are measured for the first time to be $(35.9\pm 4.8\pm 3.5)\%$ and $(37.4\pm 3.1\pm 4.6)\%$, respectively. The measurements are in tension with predictions based on the assumption that the $D_{s1}(2536)$ and $D_{s2}^*(2573)$ are dominated by a bare $c\bar{s}$ component. The $e^+e^-\rightarrow D_s^+D_{s1}(2536)^-$ and $e^+e^-\rightarrow D_s^+D^*_{s2}(2573)^-$ cross sections are measured, and a resonant structure at around 4.6~GeV with a width of 50~MeV is observed for the first time with a statistical significance of $15σ$ in the $e^+e^-\rightarrow D_s^+D^*_{s2}(2573)^-$ process. It could be the $Y(4626)$ found by the Belle collaboration in the $D_s^+D_{s1}(2536)^{-}$ final state, since they have similar masses and widths. There is also evidence for a structure at around 4.75~GeV in both processes.
Submitted 10 July, 2024;
originally announced July 2024.
-
Closed-Form EGN Model with Comprehensive Raman Support
Authors:
Yanchao Jiang,
Antonino Nespola,
Stefano Straullu,
Alberto Tanzi,
Stefano Piciaccia,
Fabrizio Forghieri,
Dario Pilori,
Pierluigi Poggiolini
Abstract:
We present a series of experiments testing the accuracy of a new closed-form multiband EGN model, carried out over a full-Raman 9-span C+L link. Transmission regimes ranged from linear to strongly non-linear with large ISRS. We found good correspondence between predicted and measured performance.
Submitted 10 July, 2024;
originally announced July 2024.
-
Testing the cosmic distance duality relation using strong gravitational lensing time delays and Type Ia supernovae
Authors:
Jing-Zhao Qi,
Yi-Fan Jiang,
Wan-Ting Hou,
Xin Zhang
Abstract:
We present a comprehensive test of the cosmic distance duality relation (DDR) using a combination of strong gravitational lensing (SGL) time delay measurements and Type Ia supernovae (SNe Ia) data. We investigate three different parameterizations of potential DDR violations. To bridge the gap between SGL and SNe Ia datasets, we implement an artificial neural network (ANN) approach to reconstruct the distance modulus of SNe Ia. Our analysis uniquely considers both scenarios where the absolute magnitude of SNe Ia ($M_B$) is treated as a free parameter and where it is fixed to a Cepheid-calibrated value. Using a sample of six SGL systems and the Pantheon+ SNe Ia dataset, we find no statistically significant evidence for DDR violations across all parameterizations. The consistency of our findings across different parameterizations not only reinforces confidence in the standard DDR but also demonstrates the robustness of our analytical approach.
Submitted 9 July, 2024;
originally announced July 2024.
-
Limiting Over-Smoothing and Over-Squashing of Graph Message Passing by Deep Scattering Transforms
Authors:
Yuanhong Jiang,
Dongmian Zou,
Xiaoqun Zhang,
Yu Guang Wang
Abstract:
Graph neural networks (GNNs) have become pivotal tools for processing graph-structured data, leveraging the message passing scheme as their core mechanism. However, traditional GNNs often grapple with issues such as instability, over-smoothing, and over-squashing, which can degrade performance and create a trade-off dilemma. In this paper, we introduce a discriminatively trained, multi-layer Deep Scattering Message Passing (DSMP) neural network designed to overcome these challenges. By harnessing spectral transformation, the DSMP model aggregates neighboring nodes with global information, thereby enhancing the precision and accuracy of graph signal processing. We provide theoretical proofs demonstrating the DSMP's effectiveness in mitigating these issues under specific conditions. Additionally, we support our claims with empirical evidence and thorough frequency analysis, showcasing the DSMP's superior ability to address instability, over-smoothing, and over-squashing.
Submitted 9 July, 2024;
originally announced July 2024.
-
HyCIR: Boosting Zero-Shot Composed Image Retrieval with Synthetic Labels
Authors:
Yingying Jiang,
Hanchao Jia,
Xiaobing Wang,
Peng Hao
Abstract:
Composed Image Retrieval (CIR) aims to retrieve images based on a query image with text. Current Zero-Shot CIR (ZS-CIR) methods try to solve CIR tasks without using expensive triplet-labeled training datasets. However, the gap between ZS-CIR and triplet-supervised CIR is still large. In this work, we propose Hybrid CIR (HyCIR), which uses synthetic labels to boost the performance of ZS-CIR. A new label Synthesis pipeline for CIR (SynCir) is proposed, in which only unlabeled images are required. First, image pairs are extracted based on visual similarity. Second, query text is generated for each image pair based on vision-language model and LLM. Third, the data is further filtered in language space based on semantic similarity. To improve ZS-CIR performance, we propose a hybrid training strategy to work with both ZS-CIR supervision and synthetic CIR triplets. Two kinds of contrastive learning are adopted. One is to use large-scale unlabeled image dataset to learn an image-to-text mapping with good generalization. The other is to use synthetic CIR triplets to learn a better mapping for CIR tasks. Our approach achieves SOTA zero-shot performance on the common CIR benchmarks: CIRR and CIRCO.
Submitted 8 July, 2024; v1 submitted 8 July, 2024;
originally announced July 2024.
-
Retrieved In-Context Principles from Previous Mistakes
Authors:
Hao Sun,
Yong Jiang,
Bo Wang,
Yingyan Hou,
Yan Zhang,
Pengjun Xie,
Fei Huang
Abstract:
In-context learning (ICL) has been instrumental in adapting Large Language Models (LLMs) to downstream tasks using correct input-output examples. Recent advances have attempted to improve model performance through principles derived from mistakes, yet these approaches suffer from lack of customization and inadequate error coverage. To address these limitations, we propose Retrieved In-Context Principles (RICP), a novel teacher-student framework. In RICP, the teacher model analyzes mistakes from the student model to generate reasons and insights for preventing similar mistakes. These mistakes are clustered based on their underlying reasons for developing task-level principles, enhancing the error coverage of principles. During inference, the most relevant mistakes for each question are retrieved to create question-level principles, improving the customization of the provided guidance. RICP is orthogonal to existing prompting methods and does not require intervention from the teacher model during inference. Experimental results across seven reasoning benchmarks reveal that RICP effectively enhances performance when applied to various prompting strategies.
Submitted 8 July, 2024;
originally announced July 2024.
-
Unlocking Textual and Visual Wisdom: Open-Vocabulary 3D Object Detection Enhanced by Comprehensive Guidance from Text and Image
Authors:
Pengkun Jiao,
Na Zhao,
Jingjing Chen,
Yu-Gang Jiang
Abstract:
Open-vocabulary 3D object detection (OV-3DDet) aims to localize and recognize both seen and previously unseen object categories within any new 3D scene. While language and vision foundation models have achieved success in handling various open-vocabulary tasks with abundant training data, OV-3DDet faces a significant challenge due to the limited availability of training data. Although some pioneering efforts have integrated vision-language models (VLM) knowledge into OV-3DDet learning, the full potential of these foundational models has yet to be fully exploited. In this paper, we unlock the textual and visual wisdom to tackle the open-vocabulary 3D detection task by leveraging the language and vision foundation models. We leverage a vision foundation model to provide image-wise guidance for discovering novel classes in 3D scenes. Specifically, we utilize an object detection vision foundation model to enable the zero-shot discovery of objects in images, which serves as the initial seeds and filtering guidance to identify novel 3D objects. Additionally, to align the 3D space with the powerful vision-language space, we introduce a hierarchical alignment approach, where the 3D feature space is aligned with the vision-language feature space using a pre-trained VLM at the instance, category, and scene levels. Through extensive experimentation, we demonstrate significant improvements in accuracy and generalization, highlighting the potential of foundation models in advancing open-vocabulary 3D object detection in real-world scenarios.
Submitted 7 July, 2024;
originally announced July 2024.
-
Electrical magnetochiral anisotropy and quantum metric in chiral conductors
Authors:
Yiyang Jiang,
Qinyan Yi,
Binghai Yan
Abstract:
Electrical magnetochiral anisotropy (EMCA) refers to the chirality- and current-dependent nonlinear magnetoresistance in chiral conductors and is commonly interpreted in a semiclassical picture. In this work, we reveal a quantum geometry origin of EMCA by a chiral rectangular lattice model that resembles a chiral organic conductor (DM-EDT-TTF)${}_2$ClO${}_4$ studied for EMCA recently and exhibits symmetry-protected Dirac bands similar to those of graphene. Compared to the semiclassical term, we find that Dirac states contribute significantly to EMCA by the quantum metric when the Fermi energy is close to the Dirac point. In addition, we find that a topological insulator state can emerge once SOC is added to our chiral lattice model. Our work paves a path to understanding quantum geometry in the magneto-transport of chiral materials.
Submitted 6 July, 2024;
originally announced July 2024.
-
Robust Skin Color Driven Privacy Preserving Face Recognition via Function Secret Sharing
Authors:
Dong Han,
Yufan Jiang,
Yong Li,
Ricardo Mendes,
Joachim Denzler
Abstract:
In this work, we leverage the pure skin color patch from the face image as the additional information to train an auxiliary skin color feature extractor and face recognition model in parallel to improve performance of state-of-the-art (SOTA) privacy-preserving face recognition (PPFR) systems. Our solution is robust against black-box attacking and well-established generative adversarial network (GAN) based image restoration. We analyze the potential risk in previous work, where the proposed cosine similarity computation might directly leak the protected precomputed embedding stored on the server side. We propose a Function Secret Sharing (FSS) based face embedding comparison protocol without any intermediate result leakage. In addition, we show in experiments that the proposed protocol is more efficient compared to the Secret Sharing (SS) based protocol.
Submitted 6 July, 2024;
originally announced July 2024.
-
Toward Precise Robotic Weed Flaming Using a Mobile Manipulator with a Flamethrower
Authors:
Di Wang,
Chengsong Hu,
Shuangyu Xie,
Joe Johnson,
Hojun Ji,
Yingtao Jiang,
Muthukumar Bagavathiannan,
Dezhen Song
Abstract:
Robotic weed flaming is a new and environmentally friendly approach to weed removal in the agricultural field. Using a mobile manipulator equipped with a flamethrower, we design a new system and algorithm to enable effective weed flaming, which requires robotic manipulation with a soft and deformable end effector, as the thermal coverage of the flame is affected by dynamic or unknown environmental factors such as gravity, wind, atmospheric pressure, fuel tank pressure, and pose of the nozzle. System development includes overall design, hardware integration, and software pipeline. To enable precise weed removal, the greatest challenge is to detect and predict dynamic flame coverage in real time before motion planning, which is quite different from a conventional rigid gripper in grasping or a spray gun in painting. Based on the images from two onboard infrared cameras and the pose information of the flamethrower nozzle on a mobile manipulator, we propose a new dynamic flame coverage model. The flame model uses a center-arc curve with a Gaussian cross-section model to describe the flame coverage in real time. The experiments have demonstrated the working system and shown that our model and algorithm can achieve a mean average precision (mAP) of more than 76\% in the reprojected images during online prediction.
Submitted 5 July, 2024;
originally announced July 2024.
-
PECTP: Parameter-Efficient Cross-Task Prompts for Incremental Vision Transformer
Authors:
Qian Feng,
Hanbin Zhao,
Chao Zhang,
Jiahua Dong,
Henghui Ding,
Yu-Gang Jiang,
Hui Qian
Abstract:
Incremental Learning (IL) aims to learn deep models on sequential tasks continually, where each new task includes a batch of new classes and deep models have no access to task-ID information at the inference time. Recent vast pre-trained models (PTMs) have achieved outstanding performance by prompt technique in practical IL without the old samples (rehearsal-free) and with a memory constraint (memory-constrained): Prompt-extending and Prompt-fixed methods. However, prompt-extending methods need a large memory buffer to maintain an ever-expanding prompt pool and meet an extra challenging prompt selection problem. Prompt-fixed methods only learn a single set of prompts on one of the incremental tasks and can not handle all the incremental tasks effectively. To achieve a good balance between the memory cost and the performance on all the tasks, we propose a Parameter-Efficient Cross-Task Prompt (PECTP) framework with Prompt Retention Module (PRM) and classifier Head Retention Module (HRM). To make the final learned prompts effective on all incremental tasks, PRM constrains the evolution of cross-task prompts' parameters from Outer Prompt Granularity and Inner Prompt Granularity. Besides, we employ HRM to inherit old knowledge in the previously learned classifier heads to facilitate the cross-task prompts' generalization ability. Extensive experiments show the effectiveness of our method. The source codes will be available at \url{https://github.com/RAIAN08/PECTP}.
Submitted 4 July, 2024;
originally announced July 2024.
-
Enhancing Class Fairness in Classification with A Two-Player Game Approach
Authors:
Yunpeng Jiang,
Paul Weng,
Yutong Ban
Abstract:
Data augmentation is widely applied and has shown its benefits in different machine learning tasks. However, as recently observed in some downstream tasks, data augmentation may introduce an unfair impact on classifications. While it can improve the performance of some classes, it can actually be detrimental for other classes, which can be problematic in some application domains. In this paper, to counteract this phenomenon, we propose a FAir Classification approach with a Two-player game (FACT). We first formulate the training of a classifier with data augmentation as a fair optimization problem, which can be further written as an adversarial two-player game. Following this formulation, we propose a novel multiplicative weight optimization algorithm, for which we theoretically prove that it can converge to a solution that is fair over classes. Interestingly, our formulation also reveals that this fairness issue over classes is not due to data augmentation only, but is in fact a general phenomenon. Our empirical experiments demonstrate that the performance of our learned classifiers is indeed more fairly distributed over classes in five datasets, with only limited impact on the average accuracy.
Submitted 8 July, 2024; v1 submitted 30 May, 2024;
originally announced July 2024.
-
Electromagnetic Property Sensing Based on Diffusion Model in ISAC System
Authors:
Yuhua Jiang,
Feifei Gao,
Shi Jin,
Tie Jun Cui
Abstract:
Integrated sensing and communications (ISAC) has opened up numerous game-changing opportunities for future wireless systems. In this paper, we develop a novel ISAC scheme that utilizes the diffusion model to sense the electromagnetic (EM) property of the target in a predetermined sensing area. Specifically, we first estimate the sensing channel by using both the communications and the sensing signals echoed back from the target. Then we employ the diffusion model to generate the point cloud that represents the target and thus enables 3D visualization of the target's EM property distribution. In order to minimize the mean Chamfer distance (MCD) between the ground truth and the estimated point clouds, we further design the communications and sensing beamforming matrices under the constraint of a maximum transmit power and a minimum communications achievable rate for each user equipment (UE). Simulation results demonstrate the efficacy of the proposed method in achieving high-quality reconstruction of the target's shape, relative permittivity, and conductivity. Besides, the proposed method can sense the EM property of the target effectively in any position of the sensing area.
Submitted 3 July, 2024;
originally announced July 2024.
-
Measurement of the branching fraction of the decay $J/ψ\to p \bar{p} η$
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (639 additional authors not shown)
Abstract:
A high precision measurement of the branching fraction of the decay $J/ψ\to p \bar{p} η$ is performed using $(10 087 \pm 44) \times 10^6$ $J/ψ$ events recorded by the {BESIII} detector at the {BEPCII} storage ring. The branching fractions of the two decays $J/ψ\to p \bar{p} η(η\to γγ)$ and $J/ψ\to p \bar{p} η(η\to π^+ π^- π^0)$ are measured individually to be $\mathcal{B}(J/ψ\to p \bar{p} η(η\to γγ)) = (1.480 \pm 0.001 \pm 0.024)\times\,10^{-3}$ and $\mathcal{B}(J/ψ\to p \bar{p} η(η\to π^+ π^- π^0)) = (1.557 \pm 0.003 \pm 0.038)\times\,10^{-3}$, where the first uncertainties are statistical and the second systematic. Both results are compatible within their uncorrelated systematic uncertainties. The combined result is $\mathcal{B}(J/ψ\to p \bar{p} η)=(1.495 \pm 0.001 \pm 0.023)\times\,10^{-3}$ where the first uncertainty is the combined statistical uncertainty and the second one the combined systematic uncertainty of both analyses, incorporating correlations between them. In addition, the $p \bar{p}$ threshold region is investigated for a potential threshold enhancement, and no evidence for one is observed.
Submitted 3 July, 2024;
originally announced July 2024.
-
ADFQ-ViT: Activation-Distribution-Friendly Post-Training Quantization for Vision Transformers
Authors:
Yanfeng Jiang,
Ning Sun,
Xueshuo Xie,
Fei Yang,
Tao Li
Abstract:
Vision Transformers (ViTs) have exhibited exceptional performance across diverse computer vision tasks, while their substantial parameter size incurs significantly increased memory and computational demands, impeding effective inference on resource-constrained devices. Quantization has emerged as a promising solution to mitigate these challenges, yet existing methods still suffer from significant accuracy loss at low-bit. We attribute this issue to the distinctive distributions of post-LayerNorm and post-GELU activations within ViTs, rendering conventional hardware-friendly quantizers ineffective, particularly in low-bit scenarios. To address this issue, we propose a novel framework called Activation-Distribution-Friendly post-training Quantization for Vision Transformers, ADFQ-ViT. Concretely, we introduce the Per-Patch Outlier-aware Quantizer to tackle irregular outliers in post-LayerNorm activations. This quantizer refines the granularity of the uniform quantizer to a per-patch level while retaining a minimal subset of values exceeding a threshold at full-precision. To handle the non-uniform distributions of post-GELU activations between positive and negative regions, we design the Shift-Log2 Quantizer, which shifts all elements to the positive region and then applies log2 quantization. Moreover, we present the Attention-score enhanced Module-wise Optimization which adjusts the parameters of each quantizer by reconstructing errors to further mitigate quantization error. Extensive experiments demonstrate ADFQ-ViT provides significant improvements over various baselines in image classification, object detection, and instance segmentation tasks at 4-bit. Specifically, when quantizing the ViT-B model to 4-bit, we achieve a 10.23% improvement in Top-1 accuracy on the ImageNet dataset.
Submitted 2 July, 2024;
originally announced July 2024.
-
When Could Abelian Fractional Topological Insulators Exist in Twisted MoTe$_2$ (and Other Systems)
Authors:
Yves H. Kwan,
Glenn Wagner,
Jiabin Yu,
Andrea Kouta Dagnino,
Yi Jiang,
Xiaodong Xu,
B. Andrei Bernevig,
Titus Neupert,
Nicolas Regnault
Abstract:
Using comprehensive exact diagonalization calculations on $θ\approx 3.7 ^{\circ}$ twisted bilayer MoTe$_2$ ($t$MoTe$_2$), as well as idealized Landau level models also relevant for lower $θ$, we extract general principles for engineering fractional topological insulators (FTIs) in realistic situations. First, in a Landau level setup at $ν=1/3+1/3$, we investigate what features of the interaction destroy an FTI. For both pseudopotential interactions and realistic screened Coulomb interactions, we find that sufficient suppression of the short-range repulsion is needed for stabilizing an FTI. We then study $θ\approx 3.7 ^{\circ}$ $t$MoTe$_2$ with realistic band-mixing and anisotropic non-local dielectric screening. Our finite-size calculations only find an FTI phase at $ν=-4/3$ in the presence of a significant additional short-range attraction $g$ that acts to counter the Coulomb repulsion at short distances. We discuss how further finite-size drifts, dielectric engineering, Landau level character, and band-mixing effects may reduce the required value of $g$ closer towards the experimentally relevant conditions of $t$MoTe$_2$. Projective calculations into the $n=1$ Landau level, which resembles the second valence band of $θ\simeq 2.1^\circ$ $t$MoTe$_2$, do not yield FTIs for any $g$, suggesting that FTIs at low-angle $t$MoTe$_2$ for $ν=-8/3$ and $-10/3$ may be unlikely. While our study highlights the challenges, at least for the fillings considered, to obtaining an FTI with transport plateaus, even in large-angle $t$MoTe$_2$ where fractional Chern insulators are experimentally established, we also provide potential sample-engineering routes to improve the stability of FTI phases.
Submitted 2 July, 2024;
originally announced July 2024.
-
Unifying quantum spatial search, state transfer and uniform sampling on graphs: simple and exact
Authors:
Qingwen Wang,
Ying Jiang,
Lvzhou Li
Abstract:
This article presents a novel and succinct algorithmic framework via alternating quantum walks, unifying quantum spatial search, state transfer and uniform sampling on a large class of graphs. Using the framework, we can achieve exact uniform sampling over all vertices and perfect state transfer between any two vertices, provided that the eigenvalues of the Laplacian matrix of the graph are all integers. Furthermore, if the graph is vertex-transitive as well, then we can achieve deterministic quantum spatial search that finds a marked vertex with certainty. In contrast, existing quantum search algorithms generally have a certain probability of failure. Even if the graph is not vertex-transitive, such as the complete bipartite graph, we can still adjust the algorithmic framework to obtain deterministic spatial search, which demonstrates its flexibility. Besides unifying and improving plenty of previous results, our work provides new results on more graphs. The approach is easy to use since it has a succinct formalism that depends only on the depth of the Laplacian eigenvalue set of the graph, and may shed light on the solution of more problems related to graphs.
Submitted 1 July, 2024;
originally announced July 2024.
-
MMLongBench-Doc: Benchmarking Long-context Document Understanding with Visualizations
Authors:
Yubo Ma,
Yuhang Zang,
Liangyu Chen,
Meiqi Chen,
Yizhu Jiao,
Xinze Li,
Xinyuan Lu,
Ziyu Liu,
Yan Ma,
Xiaoyi Dong,
Pan Zhang,
Liangming Pan,
Yu-Gang Jiang,
Jiaqi Wang,
Yixin Cao,
Aixin Sun
Abstract:
Understanding documents with rich layouts and multi-modal components is a long-standing and practical task. Recent Large Vision-Language Models (LVLMs) have made remarkable strides in various tasks, particularly in single-page document understanding (DU). However, their abilities on long-context DU remain an open problem. This work presents MMLongBench-Doc, a long-context, multi-modal benchmark comprising 1,062 expert-annotated questions. Distinct from previous datasets, it is constructed upon 130 lengthy PDF-formatted documents with an average of 49.4 pages and 20,971 textual tokens. Towards comprehensive evaluation, answers to these questions rely on pieces of evidence from (1) different sources (text, image, chart, table, and layout structure) and (2) various locations (i.e. page number). Moreover, 33.2% of the questions are cross-page questions requiring evidence across multiple pages. 22.8% of the questions are designed to be unanswerable for detecting potential hallucinations. Experiments on 14 LVLMs demonstrate that long-context DU greatly challenges current models. Notably, the best-performing model, GPT-4o, achieves an F1 score of only 42.7%, while the second-best, GPT-4V, scores 31.4%. Furthermore, 12 LVLMs (all except GPT-4o and GPT-4V) even present worse performance than their LLM counterparts which are fed with lossy-parsed OCR documents. These results validate the necessity of future research toward more capable long-context LVLMs. Project Page: https://mayubo2333.github.io/MMLongBench-Doc
Submitted 10 July, 2024; v1 submitted 1 July, 2024;
originally announced July 2024.
-
ProductAgent: Benchmarking Conversational Product Search Agent with Asking Clarification Questions
Authors:
Jingheng Ye,
Yong Jiang,
Xiaobin Wang,
Yinghui Li,
Yangning Li,
Hai-Tao Zheng,
Pengjun Xie,
Fei Huang
Abstract:
This paper introduces the task of product demand clarification within an e-commerce scenario, where the user commences the conversation with ambiguous queries and the task-oriented agent is designed to achieve more accurate and tailored product searching by asking clarification questions. To address this task, we propose ProductAgent, a conversational information seeking agent equipped with abilities of strategic clarification question generation and dynamic product retrieval. Specifically, we develop the agent with strategies for product feature summarization, query generation, and product retrieval. Furthermore, we propose the benchmark called PROCLARE to evaluate the agent's performance both automatically and qualitatively with the aid of an LLM-driven user simulator. Experiments show that ProductAgent interacts positively with the user and enhances retrieval performance with increasing dialogue turns, where user demands become gradually more explicit and detailed. All the source code will be released after the review anonymity period.
Submitted 30 June, 2024;
originally announced July 2024.
-
Symplectic double groupoids and the generalized Kähler potential
Authors:
Daniel Álvarez,
Marco Gualtieri,
Yucong Jiang
Abstract:
A description of the fundamental degrees of freedom underlying a generalized Kähler manifold, which separates its holomorphic moduli from the space of compatible metrics in a similar way to the Kähler case, has been sought since its discovery in 1984. In this paper, we describe a full solution to this problem for arbitrary generalized Kähler manifolds, which involves the new concept of a holomorphic symplectic Morita 2-equivalence between double symplectic groupoids, equipped with a Lagrangian bisection of its real symplectic core. Essentially, any generalized Kähler manifold has an associated holomorphic symplectic manifold of quadruple dimension and equipped with an anti-holomorphic involution; the metric is determined by a Lagrangian submanifold of its fixed point locus. This finally resolves affirmatively a long-standing conjecture by physicists concerning the existence of a generalized Kähler potential.
We demonstrate the theory by constructing explicitly the above Morita 2-equivalence and Lagrangian bisection for the well-known generalized Kähler structures on compact even-dimensional semisimple Lie groups, which have until now escaped such analysis. We construct the required holomorphic symplectic manifolds by expressing them as moduli spaces of flat connections on surfaces with decorated boundary, through a quasi-Hamiltonian reduction.
Submitted 30 June, 2024;
originally announced July 2024.
-
A Whole-Process Certifiably Robust Aggregation Method Against Backdoor Attacks in Federated Learning
Authors:
Anqi Zhou,
Yezheng Liu,
Yidong Chai,
Hongyi Zhu,
Xinyue Ge,
Yuanchun Jiang,
Meng Wang
Abstract:
Federated Learning (FL) has garnered widespread adoption across various domains such as finance, healthcare, and cybersecurity. Nonetheless, FL remains under significant threat from backdoor attacks, wherein malicious actors insert triggers into trained models, enabling them to perform certain tasks while still meeting FL's primary objectives. In response, robust aggregation methods have been proposed, which can be divided into three types: ex-ante, ex-durante, and ex-post methods. Given the complementary nature of these methods, combining all three types is promising yet unexplored. Such a combination is non-trivial because it requires leveraging their advantages while overcoming their disadvantages. Our study proposes a novel whole-process certifiably robust aggregation (WPCRA) method for FL, which enhances robustness against backdoor attacks across three phases: ex-ante, ex-durante, and ex-post. Moreover, since the current geometric median estimation method fails to consider differences among clients, we propose a novel weighted geometric median estimation algorithm (WGME). This algorithm estimates the geometric median of model updates from clients based on each client's weight, further improving the robustness of WPCRA against backdoor attacks. We also theoretically prove that WPCRA offers improved certified robustness guarantees with a larger certified radius. We evaluate the advantages of our methods based on the task of loan status prediction. Comparison with baselines shows that our methods significantly improve FL's robustness against backdoor attacks. This study contributes to the literature with a novel WPCRA method and a novel WGME algorithm. Our code is available at https://github.com/brick-brick/WPCRAM.
Submitted 30 June, 2024;
originally announced July 2024.
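The abstract above describes estimating the geometric median of client model updates using per-client weights. The authors' WGME algorithm is not specified in the abstract, so the following is only a generic sketch of the classical weighted Weiszfeld iteration, a standard way to approximate a weighted geometric median. The function name `weighted_geometric_median`, the iteration count, and the toy data are assumptions for illustration.

```python
import math

def weighted_geometric_median(points, weights, iters=200, eps=1e-9):
    """Weighted Weiszfeld iteration (generic sketch, not the paper's WGME)."""
    dim = len(points[0])
    total = sum(weights)
    # Start from the weighted mean of the points.
    est = [sum(w * p[d] for p, w in zip(points, weights)) / total
           for d in range(dim)]
    for _ in range(iters):
        num = [0.0] * dim
        den = 0.0
        for p, w in zip(points, weights):
            dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(p, est)))
            coef = w / max(dist, eps)     # guard against division by zero
            den += coef
            for d in range(dim):
                num[d] += coef * p[d]
        est = [n / den for n in num]
    return est

# Toy example: four benign 2-D updates and one extreme (possibly malicious) update.
updates = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (100.0, 100.0)]
weights = [1.0, 1.0, 1.0, 1.0, 1.0]
median = weighted_geometric_median(updates, weights)
```

In this toy case the weighted mean is pulled to roughly (20.4, 20.4) by the outlier, while the geometric median stays close to the benign cluster, which is the robustness property that motivates median-based aggregation.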
-
Learning System Dynamics without Forgetting
Authors:
Xikun Zhang,
Dongjin Song,
Yushan Jiang,
Yixin Chen,
Dacheng Tao
Abstract:
Predicting the trajectories of systems with unknown dynamics (\textit{i.e.} the governing rules) is crucial in various research fields, including physics and biology. This challenge has gathered significant attention from diverse communities. Most existing works focus on learning fixed system dynamics within one single system. However, real-world applications often involve multiple systems with different types of dynamics or evolving systems with non-stationary dynamics (dynamics shifts). When data from those systems are continuously collected and sequentially fed to machine learning models for training, these models tend to be biased toward the most recently learned dynamics, leading to catastrophic forgetting of previously observed/learned system dynamics. To this end, we aim to learn system dynamics via continual learning. Specifically, we present a novel framework of Mode-switching Graph ODE (MS-GODE), which can continually learn varying dynamics and encode the system-specific dynamics into binary masks over the model parameters. During the inference stage, the model can select the most confident mask based on the observational data to identify the system and predict future trajectories accordingly. Empirically, we systematically investigate the task configurations and compare the proposed MS-GODE with state-of-the-art techniques. More importantly, we construct a novel benchmark of biological dynamic systems, featuring diverse systems with disparate dynamics and significantly enriching the research field of machine learning for dynamic systems.
Submitted 30 June, 2024;
originally announced July 2024.
-
UWBAD: Towards Effective and Imperceptible Jamming Attacks Against UWB Ranging Systems with COTS Chips
Authors:
Yuqiao Yang,
Zhongjie Wu,
Yongzhao Zhang,
Ting Chen,
Jun Li,
Jie Yang,
Wenhao Liu,
Xiaosong Zhang,
Ruicong Shi,
Jingwei Li,
Yu Jiang,
Zhuo Su
Abstract:
UWB ranging systems have been adopted in many critical and security-sensitive applications due to their precise positioning and secure ranging capabilities. We present a practical jamming attack, namely UWBAD, against commercial UWB ranging systems, which exploits the vulnerability of the adoption of the normalized cross-correlation process in UWB ranging and can selectively and quickly block ranging sessions without prior knowledge of the configurations of the victim devices, potentially leading to severe consequences such as property loss, unauthorized access, or vehicle theft. UWBAD achieves more effective and less perceptible jamming because (i) it efficiently blocks every ranging session by leveraging the field-level jamming, thereby exerting a tangible impact on commercial UWB ranging systems, and (ii) its compact, reactive, and selective system design is based on COTS UWB chips, making it affordable and less perceptible. We successfully conducted real attacks against commercial UWB ranging systems from the three largest UWB chip vendors on the market, namely Apple, NXP, and Qorvo. We reported our findings to Apple, related Original Equipment Manufacturers (OEM), and the Automotive Security Research Group, triggering internal security incident response procedures at Volkswagen, Audi, Bosch, and NXP. As of the writing of this paper, the related OEM has acknowledged this vulnerability in their automotive systems and has offered a $5,000 reward as a bounty.
Submitted 30 June, 2024;
originally announced July 2024.
-
Neural Network-Assisted End-to-End Design for Dispersive Full-Parameter Control of Meta-Optics
Authors:
Hanbin Chi,
Yueqiang Hu,
Xiangnian Ou,
Yuting Jiang,
Dian Yu,
Shaozhen Lou,
Quan Wang,
Qiong Xie,
Cheng-Wei Qiu,
Huigao Duan
Abstract:
Flexible control of the light field across multiple parameters is the cornerstone of versatile and miniaturized optical devices. Metasurfaces, comprising subwavelength scatterers, offer a potent platform for executing such precise manipulations. However, the inherent mutual constraints between parameters of metasurfaces make it challenging for traditional approaches to achieve full-parameter control across multiple wavelengths. Here, we propose a universal end-to-end inverse design framework to directly optimize the geometric parameter layout of meta-optics based on the target functionality of full-parameter control across multiple wavelengths. This framework employs a differentiable forward simulator integrating a neural network-based dispersive full-parameter Jones matrix and Fourier propagation to facilitate gradient-based optimization. Its superiority over sequential forward designs in dual-polarization channel color holography with higher quality and tri-polarization three-dimensional color holography with higher multiplexed capacity is showcased. To highlight the universality, we further present polarized spectral multi-information processing with six arbitrary polarizations and three wavelengths. This versatile, differentiable, system-level design framework is poised to expedite the advancement of meta-optics in integrated multi-information display, imaging, and communication, extending to multi-modal sensing applications.
Submitted 29 June, 2024;
originally announced July 2024.
-
Teola: Towards End-to-End Optimization of LLM-based Applications
Authors:
Xin Tan,
Yimin Jiang,
Yitao Yang,
Hong Xu
Abstract:
Large language model (LLM)-based applications consist of both LLM and non-LLM components, each contributing to the end-to-end latency. Despite great efforts to optimize LLM inference, end-to-end workflow optimization has been overlooked. Existing frameworks employ coarse-grained orchestration with task modules, which confines optimizations to within each module and yields suboptimal scheduling decisions. We propose fine-grained end-to-end orchestration, which utilizes task primitives as the basic units and represents each query's workflow as a primitive-level dataflow graph. This explicitly exposes a much larger design space, enables optimizations in parallelization and pipelining across primitives of different modules, and enhances scheduling to improve application-level performance. We build Teola, a novel orchestration framework for LLM-based applications that implements this scheme. Comprehensive experiments show that Teola can achieve up to 2.09x speedup over existing systems across various popular LLM applications.
Submitted 29 June, 2024;
originally announced July 2024.
-
Learning Unsupervised Gaze Representation via Eye Mask Driven Information Bottleneck
Authors:
Yangzhou Jiang,
Yinxin Lin,
Yaoming Wang,
Teng Li,
Bilian Ke,
Bingbing Ni
Abstract:
Appearance-based supervised methods with full-face image input have made tremendous advances in recent gaze estimation tasks. However, the intensive human annotation requirement inhibits current methods from achieving industrial-level accuracy and robustness. Although current unsupervised pre-training frameworks have achieved success in many image recognition tasks, due to the deep coupling between facial and eye features, such frameworks are still deficient in extracting useful gaze features from full-face images. To alleviate the above limitations, this work proposes a novel unsupervised/self-supervised gaze pre-training framework, which forces the full-face branch to learn a low-dimensional gaze embedding without gaze annotations, through collaborative feature contrast and squeeze modules. At the heart of this framework is an alternating eye-attended/unattended masking training scheme, which squeezes gaze-related information from the full-face branch into an eye-masked auto-encoder through an injection bottleneck design that successfully encourages the model to pay more attention to gaze direction rather than facial textures only, while still adopting the eye self-reconstruction objective. At the same time, a novel eye/gaze-related information contrastive loss has been designed to further boost the learned representation by forcing the model to focus on eye-centered regions. Extensive experimental results on several gaze benchmarks demonstrate that the proposed scheme achieves superior performance over the unsupervised state of the art.
Submitted 29 June, 2024;
originally announced July 2024.
-
Secure Outsourced Decryption for FHE-based Privacy-preserving Cloud Computing
Authors:
Xirong Ma,
Chuan Li,
Yuchang Hu,
Yunting Tao,
Yali Jiang,
Yanbin Li,
Fanyu Kong,
Chunpeng Ge
Abstract:
The demand for processing vast volumes of data has surged dramatically due to the advancement of machine learning technology. Large-scale data processing necessitates substantial computational resources, prompting individuals and enterprises to turn to cloud services. Accompanying this trend is a growing concern regarding data leakage and misuse. Homomorphic encryption (HE) is one solution for safeguarding data privacy, enabling encrypted data to be processed securely in the cloud. However, the encryption and decryption routines of some HE schemes require considerable computational resources, presenting non-trivial work for clients. In this paper, we propose an outsourced decryption protocol for the prevailing RLWE-based fully homomorphic encryption schemes. The protocol splits the original decryption into two routines, with the computationally intensive part executed remotely by the cloud. Its security relies on an invariant of the NTRU-search problem with a newly designed blinding key distribution. Cryptographic analyses are conducted to configure protocol parameters across varying security levels. Our experiments demonstrate that the proposed protocol achieves up to a $67\%$ acceleration in the client's local decryption, accompanied by a $50\%$ reduction in space usage.
Submitted 9 July, 2024; v1 submitted 28 June, 2024;
originally announced June 2024.
-
Improved measurement of the semileptonic decay $D^+_{s}\to K^0 e^+ν_e$
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (643 additional authors not shown)
Abstract:
Analyzing $e^+e^-$ collision data corresponding to an integrated luminosity of $7.33~\mathrm{fb}^{-1}$ collected at center-of-mass energies between 4.128 and 4.226~GeV with the BESIII detector, we measure the branching fraction of the semileptonic decay $D^+_{s}\to K^0 e^+ν_e$ to be $(2.98\pm0.23\pm0.12)\times10^{-3}$. The $D_s^+\to K^0$ hadronic form factor is determined from the differential decay rate of $D^+_s\to K^0 e^+ν_e$ to be $f^{K^0}_+(0)=0.636\pm0.049\pm0.013$. For both measurements, the first uncertainty is statistical and the second systematic. The branching fraction and form factor measurements are factors of 1.6 and 1.7 more precise than the previous world averages, respectively.
Submitted 27 June, 2024;
originally announced June 2024.
-
Symbolic Learning Enables Self-Evolving Agents
Authors:
Wangchunshu Zhou,
Yixin Ou,
Shengwei Ding,
Long Li,
Jialong Wu,
Tiannan Wang,
Jiamin Chen,
Shuai Wang,
Xiaohua Xu,
Ningyu Zhang,
Huajun Chen,
Yuchen Eleanor Jiang
Abstract:
The AI community has been exploring a pathway to artificial general intelligence (AGI) by developing "language agents", which are complex large language model (LLM) pipelines involving both prompting techniques and tool usage methods. While language agents have demonstrated impressive capabilities for many real-world tasks, a fundamental limitation of current language agents research is that they are model-centric, or engineering-centric. That is to say, the progress on prompts, tools, and pipelines of language agents requires substantial manual engineering efforts from human experts rather than automatically learning from data. We believe the transition from model-centric, or engineering-centric, to data-centric, i.e., the ability of language agents to autonomously learn and evolve in environments, is the key for them to possibly achieve AGI.
In this work, we introduce agent symbolic learning, a systematic framework that enables language agents to optimize themselves on their own in a data-centric way using symbolic optimizers. Specifically, we consider agents as symbolic networks where learnable weights are defined by prompts, tools, and the way they are stacked together. Agent symbolic learning is designed to optimize the symbolic network within language agents by mimicking two fundamental algorithms in connectionist learning: back-propagation and gradient descent. Instead of dealing with numeric weights, agent symbolic learning works with natural language simulacrums of weights, loss, and gradients. We conduct proof-of-concept experiments on both standard benchmarks and complex real-world tasks and show that agent symbolic learning enables language agents to update themselves after being created and deployed in the wild, resulting in "self-evolving agents".
Submitted 26 June, 2024;
originally announced June 2024.
-
Fast 3D 31P B1+ mapping with a weighted stack of spiral trajectory at 7 Tesla
Authors:
Mark Widmaier,
Antonia Kaiser,
Salome Baup,
Daniel Wenz,
Katarzyna Pierzchala,
Ying Xiao,
Zhiwei Huang,
Yun Jiang,
Lijing Xin
Abstract:
Purpose: Phosphorus Magnetic Resonance Spectroscopy (31P MRS) enables non-invasive assessment of energy metabolism, yet its application is hindered by sensitivity limitations. To overcome this, often high magnetic fields are used, leading to challenges such as spatial B_1^+ inhomogeneity and therefore the need for accurate flip angle determination in accelerated acquisitions with short repetition times (T_R). In response to these challenges, we propose a novel short T_R and look-up table-based Double-Angle Method for fast 3D 31P B_1^+ mapping (fDAM). Methods: Our method incorporates 3D weighted stack of spiral gradient echo acquisitions and a frequency-selective pulse to enable efficient B_1^+ mapping based on the phosphocreatine signal at 7T. Protocols were optimised using simulations and validated through phantom experiments. The method was validated in phantom experiments and skeletal muscle applications using a birdcage 1H/31P volume coil. Results: The results of fDAM were compared to the classical DAM (cDAM). A good correlation (r=0.94) was obtained between the two B_1^+ maps. A 3D 31P B_1^+ mapping in the human calf muscle was achieved in about 10 min using a birdcage volume coil, with a 20% extended coverage relative to that of the cDAM (24 min). fDAM also enabled the first full brain coverage 31P 3D B_1^+ mapping in approx. 10 min using a 1 Tx/ 32 Rx coil. Conclusion: fDAM is an efficient method for 31P 3D B_1^+ mapping, showing promise for future applications in rapid 31P MRSI.
Submitted 26 June, 2024;
originally announced June 2024.
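For readers unfamiliar with the classical Double-Angle Method (cDAM) that the abstract above uses as its reference, the following is a minimal sketch of the textbook relation only. It assumes fully relaxed signals proportional to sin(α), so that the ratio of the signals acquired at 2α and α equals 2 cos(α). It does not reproduce the short-T_R, look-up-table-based fDAM proposed by the authors, and the function names are hypothetical.

```python
import math

def dam_flip_angle(s_alpha, s_2alpha):
    """Classical DAM: alpha = arccos(S(2*alpha) / (2 * S(alpha))), in radians.

    Assumes fully relaxed signals with S proportional to sin(flip angle).
    """
    ratio = s_2alpha / (2.0 * s_alpha)
    ratio = max(-1.0, min(1.0, ratio))    # clamp against noise-induced overshoot
    return math.acos(ratio)

def relative_b1(s_alpha, s_2alpha, nominal_alpha):
    """Ratio of the measured to the nominal flip angle (relative B1+ scale)."""
    return dam_flip_angle(s_alpha, s_2alpha) / nominal_alpha

# Synthetic voxel: nominal 60 degrees, but the actual flip angle is 50 degrees.
actual = math.radians(50.0)
s1, s2 = math.sin(actual), math.sin(2.0 * actual)
b1_scale = relative_b1(s1, s2, math.radians(60.0))
```

Applying this voxel by voxel to the two acquisitions gives a relative B_1^+ map. The long repetition times needed for the full-relaxation assumption are what make the classical approach slow.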
-
Do LLMs dream of elephants (when told not to)? Latent concept association and associative memory in transformers
Authors:
Yibo Jiang,
Goutham Rajendran,
Pradeep Ravikumar,
Bryon Aragam
Abstract:
Large Language Models (LLMs) have the capacity to store and recall facts. Through experimentation with open-source models, we observe that this ability to retrieve facts can be easily manipulated by changing contexts, even without altering their factual meanings. These findings highlight that LLMs might behave like an associative memory model where certain tokens in the contexts serve as clues to retrieving facts. We mathematically explore this property by studying how transformers, the building blocks of LLMs, can complete such memory tasks. We study a simple latent concept association problem with a one-layer transformer and we show theoretically and empirically that the transformer gathers information using self-attention and uses the value matrix for associative memory.
Submitted 26 June, 2024;
originally announced June 2024.