-
Deform3DGS: Flexible Deformation for Fast Surgical Scene Reconstruction with Gaussian Splatting
Authors:
Shuojue Yang,
Qian Li,
Daiyun Shen,
Bingchen Gong,
Qi Dou,
Yueming Jin
Abstract:
Tissue deformation poses a key challenge for accurate surgical scene reconstruction. Despite yielding high reconstruction quality, existing methods suffer from slow rendering speeds and long training times, limiting their intraoperative applicability. Motivated by recent progress in 3D Gaussian Splatting, an emerging technology in real-time 3D rendering, this work presents a novel fast reconstruction framework, termed Deform3DGS, for deformable tissues during endoscopic surgery. Specifically, we introduce 3D GS into surgical scenes by integrating a point cloud initialization to improve reconstruction. Furthermore, we propose a novel flexible deformation modeling scheme (FDM) to learn tissue deformation dynamics at the level of individual Gaussians. Our FDM can model the surface deformation with efficient representations, allowing for real-time rendering performance. More importantly, FDM significantly accelerates surgical scene reconstruction, demonstrating considerable clinical value, particularly in intraoperative settings where time efficiency is crucial. Experiments on DaVinci robotic surgery videos indicate the efficacy of our approach, showcasing superior reconstruction fidelity (PSNR: 37.90) and rendering speed (338.8 FPS) while substantially reducing training time to only 1 minute/scene. Our code is available at https://github.com/jinlab-imvr/Deform3DGS.
Submitted 30 May, 2024; v1 submitted 28 May, 2024;
originally announced May 2024.
-
Comprehensive analysis of local and nonlocal amplitudes in the $B^0\rightarrow K^{*0}μ^+μ^-$ decay
Authors:
LHCb collaboration,
R. Aaij,
A. S. W. Abdelmotteleb,
C. Abellan Beteta,
F. Abudinén,
T. Ackernley,
A. A. Adefisoye,
B. Adeva,
M. Adinolfi,
P. Adlarson,
C. Agapopoulou,
C. A. Aidala,
Z. Ajaltouni,
S. Akar,
K. Akiba,
P. Albicocco,
J. Albrecht,
F. Alessio,
M. Alexander,
Z. Aliouche,
P. Alvarez Cartelle,
R. Amalric,
S. Amato,
J. L. Amey,
Y. Amhis
, et al. (1070 additional authors not shown)
Abstract:
A comprehensive study of the local and nonlocal amplitudes contributing to the decay $B^0\rightarrow K^{*0}(\to K^+π^-) μ^+μ^-$ is performed by analysing the phase-space distribution of the decay products. The analysis is based on proton-proton collision data corresponding to an integrated luminosity of 8.4 fb$^{-1}$ collected by the LHCb experiment. This measurement employs for the first time a model of both one-particle and two-particle nonlocal amplitudes, and utilises the complete dimuon mass spectrum without any veto regions around the narrow charmonium resonances. In this way it is possible to explicitly isolate the local and nonlocal contributions and capture the interference between them. The results show that interference with nonlocal contributions, although larger than predicted, only has a minor impact on the Wilson Coefficients determined from the fit to the data. For the local contributions, the Wilson Coefficient $C_9$, responsible for vector dimuon currents, exhibits a $2.1σ$ deviation from the Standard Model expectation. The Wilson Coefficients $C_{10}$, $C_{9}'$ and $C_{10}'$ are all in better agreement than $C_{9}$ with the Standard Model and the global significance is at the level of $1.5σ$. The model used also accounts for nonlocal contributions from $B^{0}\to K^{*0}\left[τ^+τ^-\to μ^+μ^-\right]$ rescattering, resulting in the first direct measurement of the $bsττ$ vector effective-coupling $C_{9τ}$.
Submitted 27 May, 2024;
originally announced May 2024.
-
Convex Relaxation for Solving Large-Margin Classifiers in Hyperbolic Space
Authors:
Sheng Yang,
Peihan Liu,
Cengiz Pehlevan
Abstract:
Hyperbolic spaces have increasingly been recognized for their outstanding performance in handling data with inherent hierarchical structures compared to their Euclidean counterparts. However, learning in hyperbolic spaces poses significant challenges. In particular, extending support vector machines to hyperbolic spaces is in general a constrained non-convex optimization problem. Previous and popular attempts to solve hyperbolic SVMs, primarily using projected gradient descent, are generally sensitive to hyperparameters and initializations, often leading to suboptimal solutions. In this work, by first rewriting the problem into a polynomial optimization, we apply semidefinite relaxation and sparse moment-sum-of-squares relaxation to effectively approximate the optima. From extensive empirical experiments, these methods are shown to perform better than the projected gradient descent approach.
Submitted 27 May, 2024;
originally announced May 2024.
-
Spectral regularization for adversarially-robust representation learning
Authors:
Sheng Yang,
Jacob A. Zavatone-Veth,
Cengiz Pehlevan
Abstract:
The vulnerability of neural network classifiers to adversarial attacks is a major obstacle to their deployment in safety-critical applications. Regularization of network parameters during training can be used to improve adversarial robustness and generalization performance. Usually, the network is regularized end-to-end, with parameters at all layers affected by regularization. However, in settings where learning representations is key, such as self-supervised learning (SSL), layers after the feature representation will be discarded when performing inference. For these models, regularizing up to the feature space is more suitable. To this end, we propose a new spectral regularizer for representation learning that encourages black-box adversarial robustness in downstream classification tasks. In supervised classification settings, we show empirically that this method is more effective in boosting test accuracy and robustness than previously-proposed methods that regularize all layers of the network. We then show that this method improves the adversarial robustness of classifiers using representations learned with self-supervised training or transferred from another classification task. In all, our work begins to unveil how representational structure affects adversarial robustness.
Submitted 27 May, 2024;
originally announced May 2024.
-
Superpixelwise Low-rank Approximation based Partial Label Learning for Hyperspectral Image Classification
Authors:
Shujun Yang,
Yu Zhang,
Yao Ding,
Danfeng Hong
Abstract:
Insufficient prior knowledge of a captured hyperspectral image (HSI) scene may lead the experts or the automatic labeling systems to offer incorrect labels or ambiguous labels (i.e., assigning each training sample to a group of candidate labels, among which only one of them is valid; this is also known as partial label learning) during the labeling process. Accordingly, how to learn from such data with ambiguous labels is a problem of great practical importance. In this paper, we propose a novel superpixelwise low-rank approximation (LRA)-based partial label learning method, namely SLAP, which is the first to take into account partial label learning in HSI classification. SLAP is mainly composed of two phases: disambiguating the training labels and acquiring the predictive model. Specifically, in the first phase, we propose a superpixelwise LRA-based model, preparing the affinity graph for the subsequent label propagation process while extracting the discriminative representation to enhance the following classification task of the second phase. Then to disambiguate the training labels, label propagation propagates the labeling information via the affinity graph of training pixels. In the second phase, we take advantage of the resulting disambiguated training labels and the discriminative representations to enhance the classification performance. The extensive experiments validate the advantage of the proposed SLAP method over state-of-the-art methods.
Submitted 27 May, 2024;
originally announced May 2024.
-
Inference for Optimal Linear Treatment Regimes in Personalized Decision-making
Authors:
Yuwen Cheng,
Shu Yang
Abstract:
Personalized decision-making, tailored to individual characteristics, is gaining significant attention. The optimal treatment regime aims to provide the best expected outcome in the entire population, known as the value function. One approach to determine this optimal regime is by maximizing the Augmented Inverse Probability Weighting (AIPW) estimator of the value function. However, the derived treatment regime can be intricate and nonlinear, limiting its use. For clarity and interpretability, we emphasize linear regimes and determine the optimal linear regime by optimizing the AIPW estimator within set constraints.
While the AIPW estimator offers a viable path to estimating the optimal regime, current methodologies predominantly focus on its asymptotic distribution, leaving a gap in studying the linear regime itself. However, there are many benefits to understanding the regime, as pinpointing significant covariates can enhance treatment effects and provide future clinical guidance. In this paper, we explore the asymptotic distribution of the estimated linear regime. Our results show that the parameter associated with the linear regime follows a cube-root convergence to a non-normal limiting distribution characterized by the maximizer of a centered Gaussian process with a quadratic drift. When making inferences for the estimated linear regimes with cube-root convergence in practical scenarios, the standard nonparametric bootstrap is invalid. As a solution, we adopt the Cattaneo et al. (2020) bootstrap technique to provide a consistent distributional approximation for the estimated linear regimes, validated further through simulations and real-world data applications from the eICU Collaborative Research Database.
Submitted 25 May, 2024;
originally announced May 2024.
-
Intrinsic localized excitons in MoSe$_2$/CrSBr heterostructures
Authors:
Xinyue Huang,
Zhigang Song,
Yuchen Gao,
Pingfan Gu,
Kenji Watanabe,
Takashi Taniguchi,
Shiqi Yang,
Zuxin Chen,
Yu Ye
Abstract:
We present a comprehensive investigation of optical properties in MoSe$_2$/CrSBr heterostructures, unveiling the presence of localized excitons represented by a new emission feature, X$^*$. We demonstrate through temperature- and power-dependent photoluminescence spectroscopy that X$^*$ originates from excitons confined by intrinsic defects within the CrSBr layer. The valley polarization of X$^*$ and trion peaks displays opposite polarity under a magnetic field, which closely correlates with the magnetic order of CrSBr. This is attributed to spin-dependent charge transfer mechanisms across the heterointerface, supported by density functional theory calculations revealing a type-II band alignment and spin-polarized band structures. Furthermore, the strong in-plane anisotropy of CrSBr induces unique polarization-dependent responses in MoSe$_2$ emissions. Our study highlights the crucial role of defects in shaping excitonic properties. It offers valuable insights into spectral-resolved proximity effects in van der Waals heterostructures between semiconductor and magnet, contributing to advancing spintronic and valleytronic devices.
Submitted 25 May, 2024;
originally announced May 2024.
-
Leveraging Logical Rules in Knowledge Editing: A Cherry on the Top
Authors:
Keyuan Cheng,
Muhammad Asif Ali,
Shu Yang,
Gang Lin,
Yuxuan Zhai,
Haoyang Fei,
Ke Xu,
Lu Yu,
Lijie Hu,
Di Wang
Abstract:
Multi-hop Question Answering (MQA) under knowledge editing (KE) is a key challenge in Large Language Models (LLMs). While best-performing solutions in this domain use a plan-and-solve paradigm to split a question into sub-questions followed by response generation, we claim that this approach is sub-optimal, as it fails for hard-to-decompose questions and does not explicitly cater to the correlated knowledge updates that result from knowledge edits. This has a detrimental impact on the overall consistency of the updated knowledge. To address these issues, in this paper, we propose a novel framework named RULE-KE, i.e., RULE based Knowledge Editing, which is a cherry on the top for augmenting the performance of all existing MQA methods under KE. Specifically, RULE-KE leverages rule discovery to discover a set of logical rules. Then, it uses these discovered rules to update knowledge about facts highly correlated with the edit. Experimental evaluation using existing and newly curated datasets (i.e., RKE-EVAL) shows that RULE-KE improves the performance of parameter-based and memory-based solutions by up to 92% and 112.9%, respectively.
Submitted 27 May, 2024; v1 submitted 24 May, 2024;
originally announced May 2024.
-
Ultra-sensitive solid-state organic molecular microwave quantum receiver
Authors:
Bo Zhang,
Yuchen Han,
Hong-Liang Wu,
Hao Wu,
Shuo Yang,
Mark Oxborrow,
Qing Zhao,
Yue Fu,
Weibin Li,
Yeliang Wang,
Dezhi Zheng,
Jun Zhang
Abstract:
High-accuracy microwave sensing is widely demanded in various fields, ranging from cosmology to microwave quantum technology. Quantum receivers based on inorganic solid-state spin systems are promising candidates for this purpose because of their stability and compatibility, but their best sensitivity is currently limited to a few pT/$\sqrt{\rm{Hz}}$. Here, by utilising an enhanced readout scheme with state-of-the-art solid-state maser technology, we develop a robust microwave quantum receiver based on organic molecular spins at ambient conditions. Owing to the maser amplification, the sensitivity of the receiver reaches 6.14 $\pm$ 0.17 fT/$\sqrt{\rm{Hz}}$, an improvement of three orders of magnitude over that of inorganic solid-state quantum receivers. The heterodyne detection without additional local oscillators improves the bandwidth of the receiver and allows frequency detection. The scheme can be extended to other solid-state spin systems without complicated control pulses and thus enables practical applications such as electron spin resonance spectroscopy, dark matter searches, and astronomical observations.
Submitted 23 May, 2024;
originally announced May 2024.
-
In-context Time Series Predictor
Authors:
Jiecheng Lu,
Yan Sun,
Shihao Yang
Abstract:
Recent Transformer-based large language models (LLMs) demonstrate in-context learning ability to perform various functions based solely on the provided context, without updating model parameters. To fully utilize the in-context capabilities in time series forecasting (TSF) problems, unlike previous Transformer-based or LLM-based time series forecasting methods, we reformulate "time series forecasting tasks" as input tokens by constructing a series of (lookback, future) pairs within the tokens. This method aligns more closely with the inherent in-context mechanisms, and is more parameter-efficient without the need of using pre-trained LLM parameters. Furthermore, it addresses issues such as overfitting in existing Transformer-based TSF models, consistently achieving better performance across full-data, few-shot, and zero-shot settings compared to previous architectures.
Submitted 23 May, 2024;
originally announced May 2024.
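As a rough illustration of the "(lookback, future) pairs as tokens" idea in the abstract above, the sketch below slices a univariate series into overlapping windows and concatenates each lookback segment with the future segment that follows it. The function name `make_pair_tokens`, the `stride` parameter, and the concatenated token layout are assumptions made for this example; the abstract does not specify how the authors build their tokens.

```python
def make_pair_tokens(series, lookback, horizon, stride=1):
    """Slice a series into (lookback, future) tokens.

    Each token is one list holding `lookback` past values followed by the
    `horizon` values that come right after them. This layout is a
    hypothetical choice for illustration, not the paper's implementation.
    """
    window = lookback + horizon
    tokens = []
    # Slide a window of length lookback + horizon over the series.
    for start in range(0, len(series) - window + 1, stride):
        tokens.append(list(series[start:start + window]))
    return tokens


# Toy series 0..9, lookback of 4 steps, forecast horizon of 2 steps.
series = [float(t) for t in range(10)]
tokens = make_pair_tokens(series, lookback=4, horizon=2, stride=2)

for token in tokens:
    past, future = token[:4], token[4:]
    print("lookback:", past, "future:", future)
```

In an in-context setup along these lines, the earlier tokens would act as solved examples of the forecasting task and the model would be asked to complete the future part of the final token.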
-
Video Diffusion Models are Training-free Motion Interpreter and Controller
Authors:
Zeqi Xiao,
Yifan Zhou,
Shuai Yang,
Xingang Pan
Abstract:
Video generation primarily aims to model authentic and customized motion across frames, making understanding and controlling the motion a crucial topic. Most diffusion-based studies on video motion focus on motion customization with training-based paradigms, which, however, demand substantial training resources and necessitate retraining for diverse models. Crucially, these approaches do not explore how video diffusion models encode cross-frame motion information in their features, and thus lack interpretability and transparency in their effectiveness. To answer this question, this paper introduces a novel perspective to understand, localize, and manipulate motion-aware features in video diffusion models. Through analysis using Principal Component Analysis (PCA), our work discloses that robust motion-aware features already exist in video diffusion models. We present a new MOtion FeaTure (MOFT) by eliminating content correlation information and filtering motion channels. MOFT provides a distinct set of benefits, including the ability to encode comprehensive motion information with clear interpretability, extraction without the need for training, and generalizability across diverse architectures. Leveraging MOFT, we propose a novel training-free video motion control framework. Our method demonstrates competitive performance in generating natural and faithful motion, providing architecture-agnostic insights and applicability in a variety of downstream tasks.
Submitted 23 May, 2024;
originally announced May 2024.
-
Low-Energy Line Codes for On-Chip Networks
Authors:
Beyza Dabak,
Major Glenn,
Jingyang Liu,
Alexander Buck,
Siyi Yang,
Robert Calderbank,
Natalie Enright Jerger,
Daniel J. Sorin
Abstract:
Energy is a primary constraint in processor design, and much of that energy is consumed in on-chip communication. Communication can be intra-core (e.g., from a register file to an ALU) or inter-core (e.g., over the on-chip network). In this paper, we use the on-chip network (OCN) as a case study for saving on-chip communication energy. We have identified a new way to reduce the OCN's link energy consumption by using line coding, a longstanding technique in information theory. Our line codes, called Low-Energy Line Codes (LELCs), reduce energy by reducing the frequency of voltage transitions of the links, and they achieve a range of energy/performance trade-offs.
Submitted 23 May, 2024;
originally announced May 2024.
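To make the "fewer voltage transitions means less link energy" idea in the abstract above concrete, the sketch below counts bit flips on a parallel link and applies classic bus-invert coding, a well-known transition-reducing line code. This is not the authors' LELC construction, which the abstract does not describe; bus-invert is swapped in purely as a familiar stand-in, and all function names are hypothetical.

```python
def transitions(prev, curr):
    """Number of wires that toggle when the link goes from prev to curr."""
    return bin(prev ^ curr).count("1")


def bus_invert_encode(words, width):
    """Classic bus-invert coding: send the complement of a word (and raise an
    extra 'invert' wire) whenever that toggles fewer data wires."""
    mask = (1 << width) - 1
    prev_data = 0  # assume the link starts with all wires low
    encoded = []
    for word in words:
        if transitions(prev_data, word) > width / 2:
            data, flag = (~word) & mask, 1
        else:
            data, flag = word, 0
        encoded.append((data, flag))
        prev_data = data
    return encoded


def bus_invert_decode(encoded, width):
    """Undo the coding using the invert flag carried with each word."""
    mask = (1 << width) - 1
    return [(~data) & mask if flag else data for data, flag in encoded]


def total_transitions(encoded):
    """Count toggles on the data wires plus the invert wire."""
    prev_data, prev_flag, total = 0, 0, 0
    for data, flag in encoded:
        total += transitions(prev_data, data) + (prev_flag ^ flag)
        prev_data, prev_flag = data, flag
    return total


words = [0xFF, 0x00, 0xFF, 0x0F]
uncoded = [(w, 0) for w in words]          # raw words, invert wire never used
coded = bus_invert_encode(words, width=8)

print("uncoded transitions:", total_transitions(uncoded))
print("coded transitions:  ", total_transitions(coded))
```

The example trades one extra wire for fewer toggles, which is one point on the kind of energy/performance trade-off curve the abstract mentions.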
-
Test of light-lepton universality in $τ$ decays with the Belle II experiment
Authors:
Belle II Collaboration,
I. Adachi,
K. Adamczyk,
L. Aggarwal,
H. Aihara,
N. Akopov,
A. Aloisio,
N. Anh Ky,
D. M. Asner,
H. Atmacan,
V. Aushev,
M. Aversano,
R. Ayad,
V. Babu,
H. Bae,
S. Bahinipati,
P. Bambade,
Sw. Banerjee,
S. Bansal,
M. Barrett,
J. Baudot,
A. Baur,
A. Beaubien,
F. Becherer,
J. Becker
, et al. (406 additional authors not shown)
Abstract:
We present a measurement of the ratio $R_μ= \mathcal{B}(τ^-\to μ^-\barν_μν_τ) / \mathcal{B}(τ^-\to e^-\barν_eν_τ)$ of branching fractions $\mathcal{B}$ of the $τ$ lepton decaying to muons or electrons using data collected with the Belle II detector at the SuperKEKB $e^+e^-$ collider. The sample has an integrated luminosity of 362 fb$^{-1}$ at a centre-of-mass energy of 10.58 GeV. Using an optimised event selection, a binned maximum likelihood fit is performed using the momentum spectra of the electron and muon candidates. The result, $R_μ= 0.9675 \pm 0.0007 \pm 0.0036$, where the first uncertainty is statistical and the second is systematic, is the most precise to date. It provides a stringent test of the light-lepton universality, translating to a ratio of the couplings of the muon and electron to the $W$ boson in $τ$ decays of $0.9974 \pm 0.0019$, in agreement with the standard model expectation of unity.
Submitted 23 May, 2024;
originally announced May 2024.
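The step from the measured $R_μ$ to the quoted coupling ratio in the abstract above can be checked with a short calculation. The sketch below assumes the standard leading-order phase-space function $f(x) = 1 - 8x + 8x^3 - x^4 - 12x^2\ln x$ with $x = m_\ell^2/m_τ^2$, ignores radiative corrections, and uses approximate lepton masses supplied here for illustration; none of these inputs come from the abstract itself.

```python
import math


def phase_space(x):
    """Leading-order phase-space factor for tau -> lepton nu nubar,
    with x = (m_lepton / m_tau)**2."""
    return 1 - 8 * x + 8 * x**3 - x**4 - 12 * x**2 * math.log(x)


# Approximate lepton masses in GeV (assumed values, not from the abstract).
M_E, M_MU, M_TAU = 0.000510999, 0.1056584, 1.77686

# Central value of the measured ratio quoted in the abstract.
R_MU_MEASURED = 0.9675

# Expected ratio if muon and electron couple identically to the W boson.
r_mu_expected = phase_space((M_MU / M_TAU) ** 2) / phase_space((M_E / M_TAU) ** 2)

# Branching fractions scale with the coupling squared, hence the square root.
coupling_ratio = math.sqrt(R_MU_MEASURED / r_mu_expected)

print(f"expected R_mu under universality: {r_mu_expected:.4f}")
print(f"muon/electron coupling ratio:     {coupling_ratio:.4f}")
```

Under these assumptions the result lands close to the 0.9974 quoted in the abstract; the uncertainty propagation is not reproduced here.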
-
Calibrated Self-Rewarding Vision Language Models
Authors:
Yiyang Zhou,
Zhiyuan Fan,
Dongjie Cheng,
Sihan Yang,
Zhaorun Chen,
Chenhang Cui,
Xiyao Wang,
Yun Li,
Linjun Zhang,
Huaxiu Yao
Abstract:
Large Vision-Language Models (LVLMs) have made substantial progress by integrating pre-trained large language models (LLMs) and vision models through instruction tuning. Despite these advancements, LVLMs often exhibit the hallucination phenomenon, where generated text responses appear linguistically plausible but contradict the input image, indicating a misalignment between image and text pairs. This misalignment arises because the model tends to prioritize textual information over visual input, even when both the language model and visual representations are of high quality. Existing methods leverage additional models or human annotations to curate preference data and enhance modality alignment through preference optimization. These approaches may not effectively reflect the target LVLM's preferences, making the curated preferences easily distinguishable. Our work addresses these challenges by proposing the Calibrated Self-Rewarding (CSR) approach, which enables the model to self-improve by iteratively generating candidate responses, evaluating the reward for each response, and curating preference data for fine-tuning. In the reward modeling, we employ a step-wise strategy and incorporate visual constraints into the self-rewarding process to place greater emphasis on visual input. Empirical results demonstrate that CSR enhances performance and reduces hallucinations across ten benchmarks and tasks, achieving substantial improvements over existing methods by 7.62%. Our empirical results are further supported by rigorous theoretical analysis, under mild assumptions, verifying the effectiveness of introducing visual constraints into the self-rewarding paradigm. Additionally, CSR shows compatibility with different vision-language models and the ability to incrementally improve performance through iterative fine-tuning. Our data and code are available at https://github.com/YiyangZhou/CSR.
Submitted 31 May, 2024; v1 submitted 23 May, 2024;
originally announced May 2024.
-
Visual Echoes: A Simple Unified Transformer for Audio-Visual Generation
Authors:
Shiqi Yang,
Zhi Zhong,
Mengjie Zhao,
Shusuke Takahashi,
Masato Ishii,
Takashi Shibuya,
Yuki Mitsufuji
Abstract:
In recent years, with their realistic generation results and wide range of personalized applications, diffusion-based generative models have gained considerable attention in both visual and audio generation. Compared to the considerable advancements in text2image or text2audio generation, research in audio2visual or visual2audio generation has been relatively slow. Recent audio-visual generation methods usually resort to huge large language models or composable diffusion models. Instead of designing another giant model for audio-visual generation, in this paper we take a step back and show that a simple and lightweight generative transformer, which has not been fully investigated in multi-modal generation, can achieve excellent results on image2audio generation. The transformer operates in the discrete audio and visual Vector-Quantized GAN space, and is trained in a mask-denoising manner. After training, classifier-free guidance can be deployed off the shelf to achieve better performance, without any extra training or modification. Since the transformer model is modality symmetrical, it can also be directly deployed for audio2image generation and co-generation. In the experiments, we show that our simple method surpasses recent image2audio generation methods. Generated audio samples can be found at https://docs.google.com/presentation/d/1ZtC0SeblKkut4XJcRaDsSTuCRIXB3ypxmSi7HTY3IyQ/
Submitted 24 May, 2024; v1 submitted 23 May, 2024;
originally announced May 2024.
-
Endowing Interpretability for Neural Cognitive Diagnosis by Efficient Kolmogorov-Arnold Networks
Authors:
Shangshang Yang,
Linrui Qin,
Xiaoshan Yu
Abstract:
In the realm of intelligent education, cognitive diagnosis plays a crucial role in subsequent recommendation tasks because it reveals students' proficiency in knowledge concepts. Although neural network-based cognitive diagnosis models (CDMs) have exhibited significantly better performance than traditional models, neural cognitive diagnosis is criticized for poor model interpretability due to the multi-layer perceptron (MLP) employed, even with the monotonicity assumption. Therefore, this paper proposes to improve the interpretability of neural cognitive diagnosis models through efficient Kolmogorov-Arnold networks (KANs), named KAN2CD, where KANs are designed to enhance interpretability in two manners. Specifically, in the first manner, KANs are directly used to replace the MLPs in existing neural CDMs; in the second manner, the student embedding, exercise embedding, and concept embedding are directly processed by several KANs, and then their outputs are further combined and learned in a unified KAN to get final predictions. To overcome the slow training of KANs, we modify the implementation of the original KANs to accelerate training. Experiments on four real-world datasets show that the proposed KAN2CD exhibits better performance than traditional CDMs and even slightly outperforms existing neural CDMs. More importantly, the learned structures of KANs enable the proposed KAN2CD to offer interpretability as good as that of traditional CDMs, which is superior to that of existing neural CDMs. Besides, the training cost of the proposed KAN2CD is competitive with that of existing models.
Submitted 23 May, 2024;
originally announced May 2024.
-
Towards Efficient LLM Grounding for Embodied Multi-Agent Collaboration
Authors:
Yang Zhang,
Shixin Yang,
Chenjia Bai,
Fei Wu,
Xiu Li,
Zhen Wang,
Xuelong Li
Abstract:
Grounding the reasoning ability of large language models (LLMs) for embodied tasks is challenging due to the complexity of the physical world. In particular, LLM planning for multi-agent collaboration requires communication among agents or credit assignment as feedback to re-adjust the proposed plans and achieve effective coordination. However, existing methods that rely heavily on physical verification or self-reflection suffer from excessive and inefficient querying of LLMs. In this paper, we propose a novel framework for multi-agent collaboration that introduces Reinforced Advantage feedback (ReAd) for efficient self-refinement of plans. Specifically, we perform critic regression to learn a sequential advantage function from LLM-planned data, and then treat the LLM planner as an optimizer to generate actions that maximize the advantage function. This endows the LLM with the foresight to discern whether an action contributes to accomplishing the final task. We provide theoretical analysis by extending advantage-weighted regression in reinforcement learning to multi-agent systems. Experiments on Overcooked-AI and a difficult variant of RoCoBench show that ReAd surpasses baselines in success rate and also significantly decreases the interaction steps of agents and query rounds of LLMs, demonstrating its high efficiency for grounding LLMs. More results are given at https://read-llm.github.io/.
Submitted 25 May, 2024; v1 submitted 23 May, 2024;
originally announced May 2024.
-
Exploring and Evaluating Real-world CXL: Use Cases and System Adoption
Authors:
Jie Liu,
Xi Wang,
Jianbo Wu,
Shuangyan Yang,
Jie Ren,
Bhanu Shankar,
Dong Li
Abstract:
Compute eXpress Link (CXL) is emerging as a promising memory interface technology. Because CXL devices are commonly unavailable, the performance of CXL memory is largely unknown. What are the use cases for CXL memory? What is the impact of CXL memory on application performance? How can CXL memory be used in combination with existing memory components? In this work, we study the performance of three genuine CXL memory-expansion cards from different vendors. We characterize the basic performance of CXL memory, study how HPC applications and large language models can benefit from it, and study the interplay between memory tiering and page interleaving. We also propose a novel data object-level interleaving policy to match the interleaving policy with memory access patterns. We reveal the challenges and opportunities of using CXL memory.
Submitted 23 May, 2024;
originally announced May 2024.
-
RET-CLIP: A Retinal Image Foundation Model Pre-trained with Clinical Diagnostic Reports
Authors:
Jiawei Du,
Jia Guo,
Weihang Zhang,
Shengzhu Yang,
Hanruo Liu,
Huiqi Li,
Ningli Wang
Abstract:
Vision-language foundation models are increasingly investigated in the fields of computer vision and natural language processing, yet their exploration in ophthalmology and broader medical applications remains limited. The challenge is the lack of labeled data for training foundation models. To handle this issue, a CLIP-style retinal image foundation model is developed in this paper. Our foundation model, RET-CLIP, is specifically trained on a dataset of 193,865 patients to extract general features of color fundus photographs (CFPs), employing a tripartite optimization strategy that focuses on the left-eye, right-eye, and patient levels to reflect real-world clinical scenarios. Extensive experiments demonstrate that RET-CLIP outperforms existing benchmarks across eight diverse datasets spanning four critical diagnostic categories: diabetic retinopathy, glaucoma, multiple disease diagnosis, and multi-label classification of multiple diseases, demonstrating the performance and generality of our foundation model. The source code and pre-trained model are available at https://github.com/sStonemason/RET-CLIP.
Submitted 22 May, 2024;
originally announced May 2024.
-
SN 2023zaw: the low-energy explosion of an ultra-stripped star, with non-radioactive heating
Authors:
Thomas Moore,
James Gillanders,
Matt Nicholl,
Mark Huber,
Stephen Smartt,
Shubham Srivastav,
Heloise Stevance,
Ting-Wan Chen,
Kenneth Chambers,
Joseph Anderson,
Michael Fulton,
Samantha Oates,
Charlotte Angus,
Giuliano Pignata,
Nicolas Erasmus,
Hua Gao,
Joanna Bulger,
Chien-Cheng Lin,
Thomas Lowe,
Eugene Magnier,
Paloma Minguez,
Chow-Choong Ngeow,
Xinyue Sheng,
Stuart A. Sim,
Ken Smith
, et al. (4 additional authors not shown)
Abstract:
Most stripped-envelope supernova progenitors are formed through binary interaction, losing hydrogen and/or helium from their outer layers. An emerging class of supernovae with the highest degree of envelope stripping is thought to be the product of stripping by a neutron star (NS) companion. However, relatively few examples are known, and the outcomes of such systems can be diverse and are poorly understood at present. Here, we present spectroscopic observations and high-cadence multi-band photometry of SN 2023zaw, a low ejecta mass and rapidly evolving supernova. SN 2023zaw was discovered in a nearby spiral galaxy at D = 39.7 Mpc, with significant Milky Way extinction, $E(B-V) = 0.21$, and significant (but uncertain) host extinction. Bayesian evidence comparison reveals that nickel is not the only power source and an additional energy source is required to explain our observations. Our models suggest that an ejecta mass of $M_{\rm ej} \sim 0.07\,\rm M_\odot$ and a synthesised nickel mass of $M_{\rm Ni} \sim 0.007\,\rm M_\odot$ are required to explain the explosion. However, additional heating from a magnetar or interaction with circumstellar material is required to power the early light curve.
Submitted 22 May, 2024;
originally announced May 2024.
-
Preservation of Topological Surface States in Millimeter-Scale Transferred Membranes
Authors:
Chi Ian Jess Ip,
Qiang Gao,
Khanhy Du Nguyen,
Chenhui Yan,
Gangbin Yan,
Eli Hoenig,
Thomas S. Marchese,
Minghao Zhang,
Woojoo Lee,
Hossein Rokni,
Ying Shirley Meng,
Chong Liu,
Shuolong Yang
Abstract:
Ultrathin topological insulator membranes are building blocks of exotic quantum matter. However, traditional epitaxy of these materials does not facilitate stacking in arbitrary orders, while mechanical exfoliation from bulk crystals is also challenging due to the non-negligible interlayer coupling therein. Here we liberate millimeter-scale films of the topological insulator Bi$_2$Se$_3$, grown by molecular beam epitaxy, down to 3 quintuple layers. We characterize the preservation of the topological surface states and quantum well states in transferred Bi$_2$Se$_3$ films using angle-resolved photoemission spectroscopy. Leveraging the photon-energy-dependent surface sensitivity, the photoemission spectra taken with $6$ eV and $21.2$ eV photons reveal a transfer-induced migration of the topological surface states from the top to the inner layers. By establishing clear electronic structures of the transferred films and unveiling the wavefunction relocation of the topological surface states, our work lays the physical foundation crucial for the future fabrication of artificially stacked topological materials with single-layer precision.
Submitted 21 May, 2024;
originally announced May 2024.
-
Search for the lepton-flavor violating decay $B^0_s\toφμ^\pmτ^\mp$
Authors:
LHCb collaboration,
R. Aaij,
A. S. W. Abdelmotteleb,
C. Abellan Beteta,
F. Abudinén,
T. Ackernley,
A. A. Adefisoye,
B. Adeva,
M. Adinolfi,
P. Adlarson,
C. Agapopoulou,
C. A. Aidala,
Z. Ajaltouni,
S. Akar,
K. Akiba,
P. Albicocco,
J. Albrecht,
F. Alessio,
M. Alexander,
Z. Aliouche,
P. Alvarez Cartelle,
R. Amalric,
S. Amato,
J. L. Amey,
Y. Amhis
, et al. (1062 additional authors not shown)
Abstract:
A search for the lepton-flavor violating decays $B^0_s\toφμ^\pmτ^\mp$ is presented, using a sample of proton-proton collisions at center-of-mass energies of 7, 8, and 13 TeV, collected with the LHCb detector and corresponding to a total integrated luminosity of $9\,\text{fb}^{-1}$. The $τ$ leptons are selected using decays with three charged pions. No significant excess is observed, and an upper limit on the branching fraction is determined to be ${\cal B}( B^0_s\toφμ^\pmτ^\mp) < 1.0\times 10^{-5}$ at 90% confidence level.
Submitted 21 May, 2024;
originally announced May 2024.
-
Study of $b$-hadron decays to $Λ_c^+ h^- h^{\prime -}$ final states
Authors:
LHCb collaboration,
R. Aaij,
A. S. W. Abdelmotteleb,
C. Abellan Beteta,
F. Abudinén,
T. Ackernley,
A. A. Adefisoye,
B. Adeva,
M. Adinolfi,
P. Adlarson,
C. Agapopoulou,
C. A. Aidala,
Z. Ajaltouni,
S. Akar,
K. Akiba,
P. Albicocco,
J. Albrecht,
F. Alessio,
M. Alexander,
Z. Aliouche,
P. Alvarez Cartelle,
R. Amalric,
S. Amato,
J. L. Amey,
Y. Amhis
, et al. (1072 additional authors not shown)
Abstract:
Decays of $Ξ_b^-$ and $Ω_b^-$ baryons to $Λ_c^+ h^- h^{\prime -}$ final states, with $h^- h^{\prime -}$ being $π^-π^-$, $K^-π^-$ and $K^-K^-$ meson pairs, are searched for using data collected with the LHCb detector. The data sample studied corresponds to an integrated luminosity of $8.7\,\mathrm{fb}^{-1}$ of $pp$ collisions collected at centre-of-mass energies $\sqrt{s} = 7$, $8$ and $13\,\mathrm{Te\kern -0.1em V}$. The products of the relative branching fractions and fragmentation fractions for each signal mode, relative to the $B^- \to Λ_c^+ \overline{p} π^-$ mode, are measured, with $Ξ_{b}^- \toΛ_{c}^+ K^- π^-$, $Ξ_{b}^- \toΛ_{c}^+ K^- K^-$ and $Ω_{b}^- \toΛ_{c}^+ K^- K^-$ decays being observed at over $5\,σ$ significance. The $Ξ_{b}^- \toΛ_{c}^+ K^- π^-$ mode is also used to measure the $Ξ_{b}^-$ production asymmetry, which is found to be consistent with zero. In addition, the $B^- \to Λ_{c}^+ \overline{p} K^-$ decay is observed for the first time, and its branching fraction is measured relative to that of the $B^- \to Λ_{c}^+ \overline{p} π^-$ mode.
Submitted 22 May, 2024; v1 submitted 21 May, 2024;
originally announced May 2024.
-
Three-dimensional mapping and electronic origin of large altermagnetic splitting near Fermi level in CrSb
Authors:
Guowei Yang,
Zhanghuan Li,
Sai Yang,
Jiyuan Li,
Hao Zheng,
Weifan Zhu,
Saizheng Cao,
Wenxuan Zhao,
Jiawen Zhang,
Mao Ye,
Yu Song,
Lun-Hui Hu,
Lexian Yang,
Ming Shi,
Huiqiu Yuan,
Yongjun Zhang,
Yuanfeng Xu,
Yang Liu
Abstract:
Recently, a new kind of collinear magnetism, dubbed altermagnetism, has attracted considerable interest. A key characteristic of altermagnets is the momentum-dependent band and spin splitting without net magnetization. However, finding altermagnetic materials with large splitting near the Fermi level, which necessarily requires three-dimensional k-space mapping and is crucial for spintronic applications and emergent phenomena, remains challenging. Here, by employing synchrotron-based angle-resolved photoemission spectroscopy (ARPES) and model calculations, we uncover a large altermagnetic splitting, up to ~1.0 eV, near the Fermi level in CrSb. We verify its bulk-type g-wave altermagnetism through systematic three-dimensional k-space mapping, which unambiguously reveals the altermagnetic symmetry and associated nodal planes. The ARPES results are well captured by density functional theory calculations. In addition, tight-binding model analysis indicates that the large altermagnetic splitting arises from strong third-nearest-neighbor hopping mediated by Sb ions, which breaks both the space-time reversal symmetry and the translational spin-rotation symmetry. The large band/spin splitting near the Fermi level in metallic CrSb, together with its high $T_N$ (up to 705 K) and simple spin configuration, paves the way for exploring emergent phenomena and spintronic applications based on altermagnets.
Submitted 21 May, 2024;
originally announced May 2024.
-
First joint oscillation analysis of Super-Kamiokande atmospheric and T2K accelerator neutrino data
Authors:
Super-Kamiokande,
T2K collaborations,
S. Abe,
K. Abe,
N. Akhlaq,
R. Akutsu,
H. Alarakia-Charles,
A. Ali,
Y. I. Alj Hakim,
S. Alonso Monsalve,
S. Amanai,
C. Andreopoulos,
L. H. V. Anthony,
M. Antonova,
S. Aoki,
K. A. Apte,
T. Arai,
T. Arihara,
S. Arimoto,
Y. Asada,
R. Asaka,
Y. Ashida,
E. T. Atkin,
N. Babu
, et al. (524 additional authors not shown)
Abstract:
The Super-Kamiokande and T2K collaborations present a joint measurement of neutrino oscillation parameters from their atmospheric and beam neutrino data. It uses a common interaction model for events overlapping in neutrino energy and correlated detector systematic uncertainties between the two datasets, which are found to be compatible. Using 3244.4 days of atmospheric data and a beam exposure of $19.7(16.3) \times 10^{20}$ protons on target in (anti)neutrino mode, the analysis finds a 1.9$σ$ exclusion of CP-conservation (defined as $J_{CP}=0$) and a preference for the normal mass ordering.
Submitted 21 May, 2024;
originally announced May 2024.
-
AdaAugment: A Tuning-Free and Adaptive Approach to Enhance Data Augmentation
Authors:
Suorong Yang,
Peijia Li,
Xin Xiong,
Furao Shen,
Jian Zhao
Abstract:
Data augmentation (DA) is widely employed to improve the generalization performance of deep models. However, most existing DA methods use augmentation operations with random magnitudes throughout training. While this fosters diversity, it can also inevitably introduce uncontrolled variability in augmented data, which may cause misalignment with the evolving training status of the target models. Both theoretical and empirical findings suggest that this misalignment increases the risks of underfitting and overfitting. To address these limitations, we propose AdaAugment, an innovative and tuning-free Adaptive Augmentation method that utilizes reinforcement learning to dynamically adjust augmentation magnitudes for individual training samples based on real-time feedback from the target network. Specifically, AdaAugment features a dual-model architecture consisting of a policy network and a target network, which are jointly optimized to effectively adapt augmentation magnitudes. The policy network optimizes the variability within the augmented data, while the target network utilizes the adaptively augmented samples for training. Extensive experiments across benchmark datasets and deep architectures demonstrate that AdaAugment consistently outperforms other state-of-the-art DA methods in effectiveness while maintaining remarkable efficiency.
Submitted 23 May, 2024; v1 submitted 19 May, 2024;
originally announced May 2024.
-
Causal Customer Churn Analysis with Low-rank Tensor Block Hazard Model
Authors:
Chenyin Gao,
Zhiming Zhang,
Shu Yang
Abstract:
This study introduces an innovative method for analyzing the impact of various interventions on customer churn, using the potential outcomes framework. We present a new causal model, the tensorized latent factor block hazard model, which incorporates tensor completion methods for a principled causal analysis of customer churn. A crucial element of our approach is the formulation of a 1-bit tensor completion for the parameter tensor. This captures hidden customer characteristics and temporal elements from churn records, effectively addressing the binary nature of churn data and its time-monotonic trends. Our model also uniquely categorizes interventions by their similar impacts, enhancing the precision and practicality of implementing customer retention strategies. For computational efficiency, we apply a projected gradient descent algorithm combined with spectral clustering. We lay down the theoretical groundwork for our model, including its non-asymptotic properties. The efficacy and superiority of our model are further validated through comprehensive experiments on both simulated and real-world applications.
Submitted 18 May, 2024;
originally announced May 2024.
-
Transverse polarization measurement of $Λ$ hyperons in $p$Ne collisions at $\sqrt{s_{NN}}$ = 68.4 GeV with the $\mbox{LHCb}$ detector
Authors:
LHCb collaboration,
R. Aaij,
A. S. W. Abdelmotteleb,
C. Abellan Beteta,
F. Abudinén,
T. Ackernley,
A. A. Adefisoye,
B. Adeva,
M. Adinolfi,
P. Adlarson,
C. Agapopoulou,
C. A. Aidala,
Z. Ajaltouni,
S. Akar,
K. Akiba,
P. Albicocco,
J. Albrecht,
F. Alessio,
M. Alexander,
Z. Aliouche,
P. Alvarez Cartelle,
R. Amalric,
S. Amato,
J. L. Amey,
Y. Amhis
, et al. (1065 additional authors not shown)
Abstract:
A measurement of the transverse polarization of the $Λ$ and $\barΛ$ hyperons in $p$Ne fixed-target collisions at $\sqrt{s_{NN}}$ = 68.4 GeV is presented using data collected by the LHCb detector. The polarization is studied using the decay $Λ\rightarrow p π^-$ together with its charge-conjugate process. The integrated values measured are
$$ P_Λ = 0.029 \pm 0.019 \, (\rm{stat}) \pm 0.012 \, (\rm{syst}) \, , $$ $$ P_{\barΛ} = 0.003 \pm 0.023 \, (\rm{stat}) \pm 0.014 \,(\rm{syst}) \,. $$
Furthermore, the results are shown as a function of the Feynman $x$ variable, transverse momentum, pseudorapidity and rapidity of the hyperons, and are compared with previous measurements.
Submitted 24 May, 2024; v1 submitted 18 May, 2024;
originally announced May 2024.
-
The Effect of Higher Harmonics On Gravitational Wave Dark Sirens
Authors:
Jian-Dong Liu,
Wen-Biao Han,
Qianyun Yun,
Shu-Cheng Yang
Abstract:
The gravitational wave (GW) signal from the merger of two black holes can serve as a standard siren for cosmological inference. However, a degeneracy exists between the luminosity distance and the inclination angle between the binary system's orbital angular momentum and the observer's line of sight, limiting the precise measurement of the luminosity distance. In this study, we investigate how higher harmonics affect luminosity distance estimation for third-generation (3G) GW detectors in binary black hole mergers. Our findings demonstrate that considering higher harmonics significantly enhances distance inference results compared with using only the (2,2) mode. This improved accuracy in distance estimates also strengthens constraints on host galaxies, enabling more precise measurements of the Hubble constant. These results highlight the significant influence of higher harmonics on the distance estimation accuracy of 3G ground-based GW detectors.
Submitted 18 May, 2024;
originally announced May 2024.
-
A Functional Model Method for Nonconvex Nonsmooth Conditional Stochastic Optimization
Authors:
Andrzej Ruszczyński,
Shangzhe Yang
Abstract:
We consider stochastic optimization problems involving an expected value of a nonlinear function of a base random vector and a conditional expectation of another function depending on the base random vector, a dependent random vector, and the decision variables. We call such problems conditional stochastic optimization problems. They arise in many applications, such as uplift modeling, reinforcement learning, and contextual optimization. We propose a specialized single time-scale stochastic method for nonconvex constrained conditional stochastic optimization problems with a Lipschitz smooth outer function and a generalized differentiable inner function. In the method, we approximate the inner conditional expectation with a rich parametric model whose mean squared error satisfies a stochastic version of a Łojasiewicz condition. The model is used by an inner learning algorithm. The main feature of our approach is that unbiased stochastic estimates of the directions used by the method can be generated with one observation from the joint distribution per iteration, which makes it applicable to real-time learning. The directions, however, are not gradients or subgradients of any overall objective function. We prove the convergence of the method with probability one, using the method of differential inclusions and a specially designed Lyapunov function, involving a stochastic generalization of the Bregman distance. Finally, a numerical illustration demonstrates the viability of our approach.
Submitted 17 May, 2024;
originally announced May 2024.
-
Not All Prompts Are Secure: A Switchable Backdoor Attack Against Pre-trained Vision Transformers
Authors:
Sheng Yang,
Jiawang Bai,
Kuofeng Gao,
Yong Yang,
Yiming Li,
Shu-tao Xia
Abstract:
Given the power of vision transformers, a new learning paradigm, pre-training and then prompting, makes it more efficient and effective to address downstream visual recognition tasks. In this paper, we identify a novel security threat to this paradigm from the perspective of backdoor attacks. Specifically, an extra prompt token, called the switch token in this work, can turn the backdoor mode on, i.e., convert a benign model into a backdoored one. Once in the backdoor mode, a specific trigger can force the model to predict a target class. This poses a severe risk to users of cloud APIs, since the malicious behavior cannot be activated or detected in the benign mode, making the attack very stealthy. To attack a pre-trained model, our proposed attack, named SWARM, learns a trigger and prompt tokens including a switch token. They are optimized with a clean loss, which encourages the model to always behave normally even when the trigger is present, and a backdoor loss, which ensures the backdoor can be activated by the trigger when the switch is on. In addition, we use cross-mode feature distillation to reduce the effect of the switch token on clean samples. Experiments on diverse visual recognition tasks confirm the success of our switchable backdoor attack, which achieves a 95%+ attack success rate and is hard to detect and remove. Our code is available at https://github.com/20000yshust/SWARM.
Submitted 17 May, 2024;
originally announced May 2024.
-
Dual-Robust Integrated Sensing and Communication: Beamforming under CSI Imperfection and Location Uncertainty
Authors:
Wanting Lyu,
Songjie Yang,
Yue Xiu,
Xinyi Chen,
Zhongpei Zhang,
Chadi Assi,
Chau Yuan
Abstract:
A dual-robust design of beamforming is investigated in an integrated sensing and communication (ISAC) system. Existing research on robust ISAC waveform design, while proposing solutions to imperfect channel state information (CSI), generally depends on prior knowledge of the target's approximate location to design waveforms. This approach, however, limits the precision in sensing the target's exact location. In this paper, considering both CSI imperfection and target location uncertainty, a novel framework of joint robust optimization is proposed that maximizes the weighted sum of the worst-case data rate and beampattern gain. To address this challenging problem, we propose an efficient two-layer iteration algorithm based on the S-Procedure and convex hull. Finally, numerical results verify the effectiveness and performance improvement of our dual-robust algorithm, as well as the trade-off between communication and sensing performance.
Submitted 17 May, 2024;
originally announced May 2024.
-
Flexible Beamforming for Movable Antenna-Enabled Integrated Sensing and Communication
Authors:
Wanting Lyu,
Songjie Yang,
Yue Xiu,
Zhongpei Zhang,
Chadi Assi,
Chau Yuen
Abstract:
This paper investigates flexible beamforming design in an integrated sensing and communication (ISAC) network with movable antennas (MAs). A bistatic radar system is integrated into a multi-user multiple-input-single-output (MU-MISO) system, with the base station (BS) equipped with MAs. This enables array response reconfiguration by adjusting the positions of antennas. Thus, a joint beamforming and antenna position optimization problem, namely flexible beamforming, is proposed to maximize communication rate and sensing mutual information (MI). The fractional programming (FP) method is adopted to transform the non-convex objective function, and we alternately update the beamforming matrix and antenna positions. Karush-Kuhn-Tucker (KKT) conditions are employed to derive the closed-form solution of the beamforming matrix, while we propose an efficient search-based projected gradient ascent (SPGA) method to update the antenna positions. Simulation results demonstrate that MAs significantly enhance the ISAC performance when employing our proposed algorithm, achieving a 59.8% performance gain compared to fixed uniform arrays.
Submitted 16 May, 2024;
originally announced May 2024.
-
Grounded 3D-LLM with Referent Tokens
Authors:
Yilun Chen,
Shuai Yang,
Haifeng Huang,
Tai Wang,
Ruiyuan Lyu,
Runsen Xu,
Dahua Lin,
Jiangmiao Pang
Abstract:
Prior studies on 3D scene understanding have primarily developed specialized models for specific tasks or required task-specific fine-tuning. In this study, we propose Grounded 3D-LLM, which explores the potential of 3D large multi-modal models (3D LMMs) to consolidate various 3D vision tasks within a unified generative framework. The model uses scene referent tokens as special noun phrases to reference 3D scenes, enabling the handling of sequences that interleave 3D and textual data. It offers a natural approach for translating 3D vision tasks into language formats using task-specific instruction templates. To facilitate the use of referent tokens in subsequent language modeling, we curate large-scale grounded language datasets that offer finer scene-text correspondence at the phrase level by bootstrapping existing object labels. We then introduce Contrastive LAnguage-Scene Pre-training (CLASP) to effectively leverage this data, thereby integrating 3D vision with language models. Our comprehensive evaluation covers open-ended tasks like dense captioning and 3D QA, alongside closed-ended tasks such as object detection and language grounding. Experiments across multiple 3D benchmarks reveal the leading performance and the broad applicability of Grounded 3D-LLM. Code and datasets will be released on the project page: https://groundedscenellm.github.io/grounded_3d-llm.github.io.
Submitted 16 May, 2024;
originally announced May 2024.
-
Analogist: Out-of-the-box Visual In-Context Learning with Image Diffusion Model
Authors:
Zheng Gu,
Shiyuan Yang,
Jing Liao,
Jing Huo,
Yang Gao
Abstract:
Visual In-Context Learning (ICL) has emerged as a promising research area due to its capability to accomplish various tasks with limited example pairs through analogical reasoning. However, training-based visual ICL has limitations in its ability to generalize to unseen tasks and requires the collection of a diverse task dataset. On the other hand, existing methods in the inference-based visual ICL category solely rely on textual prompts, which fail to capture fine-grained contextual information from given examples and can be time-consuming when converting from images to text prompts. To address these challenges, we propose Analogist, a novel inference-based visual ICL approach that exploits both visual and textual prompting techniques using a text-to-image diffusion model pretrained for image inpainting. For visual prompting, we propose a self-attention cloning (SAC) method to guide the fine-grained structural-level analogy between image examples. For textual prompting, we leverage GPT-4V's visual reasoning capability to efficiently generate text prompts and introduce a cross-attention masking (CAM) operation to enhance the accuracy of semantic-level analogy guided by text prompts. Our method is out-of-the-box and does not require fine-tuning or optimization. It is also generic and flexible, enabling a wide range of visual tasks to be performed in an in-context manner. Extensive experiments demonstrate the superiority of our method over existing approaches, both qualitatively and quantitatively.
Submitted 16 May, 2024;
originally announced May 2024.
-
RoScenes: A Large-scale Multi-view 3D Dataset for Roadside Perception
Authors:
Xiaosu Zhu,
Hualian Sheng,
Sijia Cai,
Bing Deng,
Shaopeng Yang,
Qiao Liang,
Ken Chen,
Lianli Gao,
Jingkuan Song,
Jieping Ye
Abstract:
We introduce RoScenes, the largest multi-view roadside perception dataset, which aims to shed light on the development of vision-centric Bird's Eye View (BEV) approaches for more challenging traffic scenes. The highlights of RoScenes include a significantly large perception area, full scene coverage, and crowded traffic. More specifically, our dataset achieves a surprising 21.13M 3D annotations within 64,000 $m^2$. To relieve the expensive costs of roadside 3D labeling, we present a novel BEV-to-3D joint annotation pipeline to efficiently collect such a large volume of data. We then conduct a comprehensive study of current BEV methods on RoScenes in terms of effectiveness and efficiency. Tested methods suffer from the vast perception area and the variation of sensor layout across scenes, resulting in performance levels falling below expectations. To this end, we propose RoBEV, which incorporates feature-guided position embedding for effective 2D-3D feature assignment. With its help, our method outperforms the state of the art by a large margin on the validation set without extra computational overhead. Our dataset and devkit will be made available at https://github.com/xiaosu-zhu/RoScenes.
Submitted 4 July, 2024; v1 submitted 16 May, 2024;
originally announced May 2024.
-
SQL-to-Schema Enhances Schema Linking in Text-to-SQL
Authors:
Sun Yang,
Qiong Su,
Zhishuai Li,
Ziyue Li,
Hangyu Mao,
Chenxi Liu,
Rui Zhao
Abstract:
Sophisticated existing Text-to-SQL methods exhibit errors in various proportions, including schema-linking errors (incorrect columns, tables, or extra columns), join errors, nested errors, and group-by errors. Consequently, there is a critical need to filter out unnecessary tables and columns, directing the language model's attention to relevant tables and columns with schema linking, to reduce errors during SQL generation. Previous approaches have involved sorting tables and columns based on their relevance to the question, selecting the top-ranked ones for sorting, or directly identifying the necessary tables and columns for SQL generation. However, these methods face challenges such as lengthy model training times, high consumption of expensive GPT-4 tokens in few-shot prompts, or suboptimal performance in schema linking. Therefore, we propose an inventive schema linking method in two steps: first, generate an initial SQL query by utilizing the complete database schema; then, extract tables and columns from the initial SQL query to create a concise schema. Using CodeLlama-34B, when comparing the schemas obtained by mainstream methods with ours for SQL generation, our schema performs optimally. Leveraging GPT-4, our SQL generation method achieved results that are comparable to mainstream Text-to-SQL methods on the Spider dataset.
Submitted 15 May, 2024;
originally announced May 2024.
-
On automorphism groups of smooth hypersurfaces
Authors:
Song Yang,
Xun Yu,
Zigang Zhu
Abstract:
We show that smooth hypersurfaces in complex projective spaces with automorphism groups of maximum size are isomorphic to Fermat hypersurfaces, with a few exceptions. For the exceptions, we give explicitly the defining equations and automorphism groups.
Submitted 15 May, 2024;
originally announced May 2024.
-
Enhancing Function Name Prediction using Votes-Based Name Tokenization and Multi-Task Learning
Authors:
Xiaoling Zhang,
Zhengzi Xu,
Shouguo Yang,
Zhi Li,
Zhiqiang Shi,
Limin Sun
Abstract:
Reverse engineers would acquire valuable insights from descriptive function names, which are absent in publicly released binaries. Recent advances in binary function name prediction using data-driven machine learning show promise. However, existing approaches encounter difficulties in capturing function semantics in diverse optimized binaries and fail to preserve the meaning of labels in function names. We propose Epitome, a framework that enhances function name prediction using votes-based name tokenization and multi-task learning, specifically tailored for binaries compiled at different optimization levels. Epitome learns comprehensive function semantics using a pre-trained assembly language model and a graph neural network, incorporating a function semantics similarity prediction task to maximize the similarity of function semantics across different compilation optimization levels. In addition, we present two data preprocessing methods to improve the comprehensibility of function names. We evaluate the performance of Epitome using 2,597,346 functions extracted from binaries compiled with 5 optimizations (O0-Os) for 4 architectures (x64, x86, ARM, and MIPS). Epitome outperforms the state-of-the-art function name prediction tool by up to 44.34%, 64.16%, and 54.44% in precision, recall, and F1 score, while also exhibiting superior generalizability.
Submitted 15 May, 2024;
originally announced May 2024.
-
The RoboDrive Challenge: Drive Anytime Anywhere in Any Condition
Authors:
Lingdong Kong,
Shaoyuan Xie,
Hanjiang Hu,
Yaru Niu,
Wei Tsang Ooi,
Benoit R. Cottereau,
Lai Xing Ng,
Yuexin Ma,
Wenwei Zhang,
Liang Pan,
Kai Chen,
Ziwei Liu,
Weichao Qiu,
Wei Zhang,
Xu Cao,
Hao Lu,
Ying-Cong Chen,
Caixin Kang,
Xinning Zhou,
Chengyang Ying,
Wentao Shang,
Xingxing Wei,
Yinpeng Dong,
Bo Yang,
Shengyin Jiang
, et al. (66 additional authors not shown)
Abstract:
In the realm of autonomous driving, robust perception under out-of-distribution conditions is paramount for the safe deployment of vehicles. Challenges such as adverse weather, sensor malfunctions, and environmental unpredictability can severely impact the performance of autonomous systems. The 2024 RoboDrive Challenge was crafted to propel the development of driving perception technologies that can withstand and adapt to these real-world variabilities. Focusing on four pivotal tasks -- BEV detection, map segmentation, semantic occupancy prediction, and multi-view depth estimation -- the competition laid down a gauntlet to innovate and enhance system resilience against typical and atypical disturbances. This year's challenge consisted of five distinct tracks and attracted 140 registered teams from 93 institutes across 11 countries, resulting in nearly one thousand submissions evaluated through our servers. The competition culminated in 15 top-performing solutions, which introduced a range of innovative approaches including advanced data augmentation, multi-sensor fusion, self-supervised learning for error correction, and new algorithmic strategies to enhance sensor robustness. These contributions significantly advanced the state of the art, particularly in handling sensor inconsistencies and environmental variability. Participants, through collaborative efforts, pushed the boundaries of current technologies, showcasing their potential in real-world scenarios. Extensive evaluations and analyses provided insights into the effectiveness of these solutions, highlighting key trends and successful strategies for improving the resilience of driving perception systems. This challenge has set a new benchmark in the field, providing a rich repository of techniques expected to guide future research in this field.
Submitted 29 May, 2024; v1 submitted 14 May, 2024;
originally announced May 2024.
-
Antiferromagnetic Quantum Anomalous Hall Effect Modulated by Spin Flips and Flops
Authors:
Zichen Lian,
Yongchao Wang,
Yongqian Wang,
Yang Feng,
Zehao Dong,
Shuai Yang,
Liangcai Xu,
Yaoxin Li,
Bohan Fu,
Yuetan Li,
Wanjun Jiang,
Chang Liu,
Jinsong Zhang,
Yayu Wang
Abstract:
The interplay between nontrivial band topology and layered antiferromagnetism in MnBi2Te4 has opened up a new avenue for exploring topological phases of matter. Representative examples include the quantum anomalous Hall effect and the axion insulator state observed in odd- and even-number layers of MnBi2Te4, when the top and bottom surfaces have parallel and antiparallel spin alignments, respectively. The rich and complex spin dynamics associated with the van der Waals antiferromagnetic order is expected to generate novel topological phases and phase transitions that are unique to MnBi2Te4. Here we fabricate a device of 7-septuple-layer MnBi2Te4 covered with an AlOx capping layer, which enables the investigation of the antiferromagnetic quantum anomalous Hall effect over wide parameter spaces. By tuning the gate voltage and perpendicular magnetic field, we uncover a cascade of quantum phase transitions that can be attributed to the influence of spin configurations on charge transport. Furthermore, we find that an in-plane magnetic field enhances both the coercive field and the exchange gap of the surface state, in sharp contrast to the behavior in the ferromagnetic quantum anomalous Hall state. We propose that these peculiar features arise from the spin flip and flop transitions inherent to van der Waals antiferromagnets. The versatile tunability of the quantum anomalous Hall effect in MnBi2Te4 paves the way for potential applications in topological antiferromagnetic spintronics.
Submitted 14 May, 2024;
originally announced May 2024.
-
Towards the Quantized Anomalous Hall effect in AlO$_x$-capped MnBi$_2$Te$_4$
Authors:
Yongqian Wang,
Bohan Fu,
Yongchao Wang,
Zicheng Lian,
Shuai Yang,
Yaoxin Li,
Liangcai Xu,
Zhiting Gao,
Wanjun Jiang,
Jinsong Zhang,
Yayu Wang,
Chang Liu
Abstract:
The quantum anomalous Hall effect in the layered antiferromagnet MnBi$_2$Te$_4$ harbors a rich interplay between magnetism and topology, holding significant promise for low-power electronic devices and topological antiferromagnetic spintronics. In recent years, MnBi$_2$Te$_4$ has garnered considerable attention as the only known material to exhibit the antiferromagnetic quantum anomalous Hall effect. However, this field faces significant challenges, as realizing quantized transport at zero magnetic field depends critically on fabricating high-quality devices. In this article, we address the detrimental influences of fabrication on MnBi$_2$Te$_4$ by simply depositing a thin AlO$_x$ layer on the surface prior to fabrication. Optical contrast and magnetotransport measurements on over 50 samples demonstrate that AlO$_x$ can effectively preserve the pristine state of the samples and significantly enhance the anomalous Hall effect towards quantization. Scaling analysis reveals the Berry-curvature-dominated mechanism of the anomalous Hall effect at various magnetic configurations. By adjusting the gate voltage, we uncover a gate-independent antiferromagnetism in MnBi$_2$Te$_4$. Our experiment not only paves the way for fabricating high-quality transport devices but also advances the exploration of exotic quantum physics in 2D materials.
Submitted 19 May, 2024; v1 submitted 14 May, 2024;
originally announced May 2024.
-
Discovery of Very-high-energy Gamma-ray Emissions from the Low Luminosity AGN NGC 4278 by LHAASO
Authors:
Zhen Cao,
F. Aharonian,
Q. An,
Axikegu,
Y. X. Bai,
Y. W. Bao,
D. Bastieri,
X. J. Bi,
Y. J. Bi,
J. T. Cai,
Q. Cao,
W. Y. Cao,
Zhe Cao,
J. Chang,
J. F. Chang,
A. M. Chen,
E. S. Chen,
Liang Chen,
Lin Chen,
Long Chen,
M. J. Chen,
M. L. Chen,
Q. H. Chen,
S. H. Chen,
S. Z. Chen
, et al. (255 additional authors not shown)
Abstract:
The first source catalog of the Large High Altitude Air Shower Observatory (LHAASO) reported the detection of a very-high-energy gamma-ray source, 1LHAASO J1219+2915. In this paper, a further detailed study of the spectral and temporal behavior of this point-like source has been carried out. The best-fit position of the TeV source ($\rm{RA}=185.05^{\circ}\pm0.04^{\circ}$, $\rm{Dec}=29.25^{\circ}\pm0.03^{\circ}$) is compatible with NGC 4278 within $\sim0.03$ degree. Variation analysis shows an indication of variability at the level of a few months in the TeV band, which is consistent with low-frequency observations. Based on these observations, we report the detection of TeV $γ$-ray emissions from this low-luminosity AGN NGC 4278. The observations by LHAASO-WCDA during the active period have a significance level of 8.8\,$σ$ with best-fit photon spectral index $\varGamma=2.56\pm0.14$ and a flux $f_{1-10\,\rm{TeV}}=(7.0\pm1.1_{\rm{sta}}\pm0.35_{\rm{syst}})\times10^{-13}\,\rm{photons\,cm^{-2}\,s^{-1}}$, or approximately $5\%$ of the Crab Nebula. The discovery of VHE emission from NGC 4278 indicates that the compact, weak radio jet can efficiently accelerate particles and emit TeV photons.
Submitted 13 May, 2024;
originally announced May 2024.
-
Search for lepton-flavor-violating $τ^- \to μ^-μ^+μ^-$ decays at Belle II
Authors:
Belle II Collaboration,
I. Adachi,
L. Aggarwal,
H. Aihara,
N. Akopov,
A. Aloisio,
N. Althubiti,
N. Anh Ky,
D. M. Asner,
H. Atmacan,
V. Aushev,
M. Aversano,
R. Ayad,
V. Babu,
H. Bae,
S. Bahinipati,
P. Bambade,
Sw. Banerjee,
S. Bansal,
M. Barrett,
J. Baudot,
A. Baur,
A. Beaubien,
F. Becherer,
J. Becker
, et al. (407 additional authors not shown)
Abstract:
We present the result of a search for the charged-lepton-flavor violating decay $τ^- \to μ^-μ^+μ^-$ using a $424\,\mathrm{fb}^{-1}$ sample of data recorded by the Belle II experiment at the SuperKEKB $e^{-}e^{+}$ collider. The selection of $e^{-}e^{+}\toτ^+τ^-$ events is based on an inclusive reconstruction of the non-signal tau decay, and on a boosted decision tree to suppress background. We observe one signal candidate, which is compatible with the expectation from background processes. We set a $90\%$ confidence level upper limit of $1.9 \times 10^{-8}$ on the branching fraction of the $τ^- \to μ^-μ^+μ^-$ decay, which is the most stringent bound to date.
Submitted 12 May, 2024;
originally announced May 2024.
-
Dark Matter Physics in General NMSSM
Authors:
Lei Meng,
Junjie Cao,
Shenshen Yang
Abstract:
In the General Next-to-Minimal Supersymmetric Standard Model (GNMSSM), singlet particles may form a secluded sector of dark matter (DM), in which Singlino-like DM could achieve the observed relic abundance through various channels such as $\tildeχ_1^0 \tildeχ_1^0 \to h_s h_s, A_s A_s, h_s A_s$, where $h_s$ and $A_s$ represent singlet-dominated CP-even and CP-odd Higgs bosons. We provide analytical formulas for both the spin-independent and spin-dependent cross sections of Singlino DM scattering with nucleons, illustrating their dependence on the model's parameters in a clear manner. We also present analytic expressions for the annihilation cross sections of these three important channels. Based on these preparations, we conducted Bayesian analyses of the GNMSSM and concluded that the theory significantly favored Singlino-dominated DM over Bino-like DM across a much broader range of parameters. The combined results from our numerical analyses and the formulas distinctly highlight crucial aspects of DM physics within the GNMSSM.
Submitted 11 May, 2024;
originally announced May 2024.
-
AHPPEBot: Autonomous Robot for Tomato Harvesting based on Phenotyping and Pose Estimation
Authors:
Xingxu Li,
Nan Ma,
Yiheng Han,
Shun Yang,
Siyi Zheng
Abstract:
To address the limitations inherent to conventional automated harvesting robots, specifically their suboptimal success rates and risk of crop damage, we design a novel robot named AHPPEBot, which is capable of autonomous harvesting based on crop phenotyping and pose estimation. In phenotyping, the detection, association, and maturity estimation of tomato trusses and individual fruits are accomplished through a multi-task YOLOv5 model coupled with a detection-based adaptive DBScan clustering algorithm. In pose estimation, we employ a deep learning model to predict seven semantic keypoints on the pedicel. These keypoints assist in the robot's path planning, minimize target contact, and facilitate the use of our specialized end effector for harvesting. In autonomous tomato harvesting experiments conducted in commercial greenhouses, our proposed robot achieved a harvesting success rate of 86.67%, with an average successful harvest time of 32.46 s, showcasing its continuous and robust harvesting capabilities. The result underscores the potential of harvesting robots to bridge the labor gap in agriculture.
Submitted 11 May, 2024;
originally announced May 2024.
-
Search for time-dependent $CP$ violation in $D^0 \rightarrow π^+ π^- π^0$ decays
Authors:
LHCb collaboration,
R. Aaij,
A. S. W. Abdelmotteleb,
C. Abellan Beteta,
F. Abudinén,
T. Ackernley,
A. A. Adefisoye,
B. Adeva,
M. Adinolfi,
P. Adlarson,
C. Agapopoulou,
C. A. Aidala,
Z. Ajaltouni,
S. Akar,
K. Akiba,
P. Albicocco,
J. Albrecht,
F. Alessio,
M. Alexander,
Z. Aliouche,
P. Alvarez Cartelle,
R. Amalric,
S. Amato,
J. L. Amey,
Y. Amhis
, et al. (1062 additional authors not shown)
Abstract:
A measurement of time-dependent $CP$ violation in $D^0 \rightarrow π^+ π^- π^0$ decays using a $pp$ collision data sample collected by the LHCb experiment in 2012 and from 2015 to 2018, corresponding to an integrated luminosity of 7.7$\,\mathrm{fb}^{-1}$, is presented. The initial flavour of each $D^0$ candidate is determined from the charge of the pion produced in the $D^*(2010)^+ \rightarrow D^0 π^+$ decay. The decay $D^0 \rightarrow K^- π^+ π^0$ is used as a control channel to validate the measurement procedure. The gradient of the time-dependent $CP$ asymmetry, $ΔY$, in $D^0 \rightarrow π^+ π^- π^0$ decays is measured to be $ΔY = (-1.3 \pm 6.3 \pm 2.4) \times 10^{-4}$, where the first uncertainty is statistical and the second is systematic. This result is compatible with $CP$ conservation.
Submitted 10 May, 2024;
originally announced May 2024.
-
Hybrid thin-film lithium niobate micro-ring acousto-optic modulator for microwave-to-optical conversion
Authors:
Lei Wan,
Jiying Huang,
Meixun Wen,
Huan Li,
Wenfeng Zhou,
Zhiqiang Yang,
Yuping Chen,
Huilong Liu,
Siqing Zeng,
Dong Liu,
Shuixian Yang,
Daoxin Dai,
Zhaohui Li
Abstract:
Highly efficient acousto-optic modulation plays a vital role in microwave-to-optical conversion. Herein, we demonstrate a hybrid thin-film lithium niobate (TFLN) racetrack micro-ring acousto-optic modulator (AOM) implemented with a low-loss chalcogenide (ChG) waveguide. By engineering the electrode configuration of the interdigital transducer, double-arm micro-ring acousto-optic modulation is experimentally confirmed in a nonsuspended ChG-loaded TFLN waveguide platform. By varying the position of the blue-detuned bias point, we obtain a half-wave-voltage-length product $V_πL$ of the hybrid TFLN micro-ring AOM as small as 9 mV cm. Accordingly, the acousto-optic coupling strength is estimated to be 0.48 Hz s$^{1/2}$ at an acoustic frequency of 0.84 GHz. By analyzing the number of phonons generated by the piezoelectric transducer, the microwave-to-optical conversion efficiency is calculated to be 0.05%, approximately one order of magnitude larger than that of the state-of-the-art suspended counterpart. Efficient microwave-to-optical conversion thus provides new opportunities for low-power-consumption quantum information transduction using TFLN-ChG hybrid piezo-optomechanical devices.
Submitted 10 May, 2024;
originally announced May 2024.
-
MaskMatch: Boosting Semi-Supervised Learning Through Mask Autoencoder-Driven Feature Learning
Authors:
Wenjin Zhang,
Keyi Li,
Sen Yang,
Chenyang Gao,
Wanzhao Yang,
Sifan Yuan,
Ivan Marsic
Abstract:
Conventional methods in semi-supervised learning (SSL) often face challenges related to limited data utilization, mainly due to their reliance on threshold-based techniques for selecting high-confidence unlabeled data during training. Various efforts (e.g., FreeMatch) have been made to enhance data utilization by tweaking the thresholds, yet none have managed to use 100% of the available data. To overcome this limitation and improve SSL performance, we introduce MaskMatch, a novel algorithm that fully utilizes unlabeled data to boost semi-supervised learning. MaskMatch integrates a self-supervised learning strategy, i.e., Masked Autoencoder (MAE), that uses all available data to enforce the visual representation learning. This enables the SSL algorithm to leverage all available data, including samples typically filtered out by traditional methods. In addition, we propose a synthetic data training approach to further increase data utilization and improve generalization. These innovations lead MaskMatch to achieve state-of-the-art results on challenging datasets. For instance, on CIFAR-100 with 2 labels per class, STL-10 with 4 labels per class, and Euro-SAT with 2 labels per class, MaskMatch achieves low error rates of 18.71%, 9.47%, and 3.07%, respectively. The code will be made publicly available.
Submitted 9 May, 2024;
originally announced May 2024.
-
Discovering the Mass-Scaled Damping Timescale from Microquasars to Blazars
Authors:
Haoyang Zhang,
Shenbang Yang,
Benzhong Dai
Abstract:
Studying the variability of the accretion disks of black holes and jets is important to identify their internal physical processes. In this letter, we obtain the characteristic damping timescale of 34 blazars and seven microquasars from the Fermi-Large Area Telescope and the XMM-Newton X-ray telescope, respectively. We found that the mass-scaled characteristic timescales, ranging from the microquasars of stellar-mass black holes to the blazars of supermassive black holes, exhibited a linear relationship with a slope of $\sim$0.57. Given that the damping timescales of the $γ$-ray emission in the blazars are associated with the jet, we propose that the timescales of the X-ray emission in these microquasars are also related to the jet. The mass-scaled damping timescale that we found was consistent with the radiation of the optical accretion disk. This can be attributed to the viscous timescale at the ultraviolet-emitting radii of the disk, which can affect the jet. Our study provides a new perspective on the origin of the region of radiation and the possible disk--jet connection based on time-domain analysis.
Submitted 9 May, 2024;
originally announced May 2024.