-
Dynamic protected states in the non-Hermitian system
Authors:
Lei Chen,
Zhen-Xia Niu,
Xingran Xu
Abstract:
The non-Hermitian skin effect and nonreciprocal behavior are sensitive to the boundary conditions, which are unique features of non-Hermitian systems. The eigenenergies become complex and all eigenstates are localized at the boundary, which distinguishes them from Hermitian topologies. In this work, we theoretically study the dynamic behavior of the propagation of Gaussian wavepackets inside a non-Hermitian lattice and analyze the self-acceleration process of bulk states or Gaussian wavepackets toward the system's boundary. The initial wavepackets not only propagate toward the side where the eigenstates are localized; their momentum also approaches a specific value at which the imaginary part of the energy dispersion is maximal. In addition, if the wavepackets cover this specific momentum, they eventually exhibit amplitudes that increase exponentially with time, maintaining the dynamic protected condition for an extended period of time until they approach the boundary. We also take two widely used toy models as examples in one and two dimensions to verify the correspondence between the non-Hermitian skin effect and the dynamic protected state.
Submitted 12 July, 2024;
originally announced July 2024.
-
The saddlepoint approximation for averages of conditionally independent random variables
Authors:
Ziang Niu,
Jyotishka Ray Choudhury,
Eugene Katsevich
Abstract:
Motivated by the application of saddlepoint approximations to resampling-based statistical tests, we prove that a Lugannani-Rice style approximation for conditional tail probabilities of averages of conditionally independent random variables has vanishing relative error. We also provide a general condition on the existence and uniqueness of the solution to the corresponding saddlepoint equation. The results are valid under a broad class of distributions involving no restrictions on the smoothness of the distribution function. The derived saddlepoint approximation formula can be directly applied to resampling-based hypothesis tests, including bootstrap, sign-flipping and conditional randomization tests. Our results extend and connect several classical saddlepoint approximation results. On the way to proving our main results, we prove a new conditional Berry-Esseen inequality for the sum of conditionally independent random variables, which may be of independent interest.
Submitted 11 July, 2024;
originally announced July 2024.
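To make the Lugannani-Rice approximation mentioned in the abstract above concrete, here is a minimal sketch of the classical (unconditional, i.i.d.) formula, which is a simplification and not the conditional setting the paper treats. The toy distribution Exp(1), its closed-form saddlepoint, and all function names are assumptions chosen for illustration, because the exact tail is then available for comparison.

```python
import math

def norm_cdf(z):
    # Standard normal distribution function.
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def norm_pdf(z):
    # Standard normal density.
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def lugannani_rice_tail(x, n, K, K2, s_hat):
    """Approximate P(mean of n i.i.d. draws >= x).

    K is the cumulant generating function of one draw, K2 its second
    derivative, and s_hat the saddlepoint solving K'(s) = x.
    Valid for x away from the mean, i.e. s_hat != 0.
    """
    w = math.copysign(math.sqrt(2.0 * n * (s_hat * x - K(s_hat))), s_hat)
    u = s_hat * math.sqrt(n * K2(s_hat))
    return 1.0 - norm_cdf(w) + norm_pdf(w) * (1.0 / u - 1.0 / w)

# Toy case (assumption): Exp(1) draws, K(s) = -log(1 - s) for s < 1.
def K_exp(s):
    return -math.log(1.0 - s)

def K2_exp(s):
    return 1.0 / (1.0 - s) ** 2

def exp_tail_approx(x, n):
    s_hat = 1.0 - 1.0 / x  # solves K'(s) = 1 / (1 - s) = x
    return lugannani_rice_tail(x, n, K_exp, K2_exp, s_hat)

def exp_tail_exact(x, n):
    # The sum of n Exp(1) draws is Gamma(n, 1); for integer n its
    # upper tail at t equals the Poisson(t) probability of at most n - 1.
    t = n * x
    return math.exp(-t) * sum(t ** k / math.factorial(k) for k in range(n))
```

Comparing `exp_tail_approx(1.5, 10)` with `exp_tail_exact(1.5, 10)` shows the kind of small relative error in the tail that makes the approximation attractive as a replacement for resampling.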
-
Computationally efficient and statistically accurate conditional independence testing with spaCRT
Authors:
Ziang Niu,
Jyotishka Ray Choudhury,
Eugene Katsevich
Abstract:
We introduce the saddlepoint approximation-based conditional randomization test (spaCRT), a novel conditional independence test that effectively balances statistical accuracy and computational efficiency, inspired by applications to single-cell CRISPR screens. Resampling-based methods like the distilled conditional randomization test (dCRT) offer statistical precision but at a high computational cost. The spaCRT leverages a saddlepoint approximation to the resampling distribution of the dCRT test statistic, achieving very similar finite-sample statistical performance with significantly reduced computational demands. We prove that the spaCRT p-value approximates the dCRT p-value with vanishing relative error, and that these two tests are asymptotically equivalent. Through extensive simulations and real data analysis, we demonstrate that the spaCRT controls Type-I error and maintains high power, outperforming other asymptotic and resampling-based tests. Our method is particularly well-suited for large-scale single-cell CRISPR screen analyses, facilitating the efficient and accurate assessment of perturbation-gene associations.
Submitted 11 July, 2024;
originally announced July 2024.
-
High power GaSb-based distributed feedback laser with laterally coupled dielectric gratings at 1.95μm
Authors:
Zhengqing Ding,
Juntian Cao,
Kun Zhan,
Yihang Chen,
Lidan Zhou,
Hao Tan,
Chenao Yang,
Ying Yu,
Zhichuan Niu,
Siyuan Yu
Abstract:
Traditional Distributed Feedback (DFB) or Distributed Bragg Reflector (DBR) lasers typically utilize buried gratings as frequency-selective optical feedback mechanisms. However, the fabrication of such gratings often necessitates regrowth processes, which can pose technical challenges for materials platforms such as GaAs and GaSb. Metal gratings have also been used for GaSb lasers, but they introduce additional absorption loss that limits device efficiency and output power. In this paper, we introduce a novel laterally coupled dielectric Bragg grating structure, which enables highly controllable, deterministic, and stable coupling between the grating and the optical mode. Our device demonstrates a continuous-wave output power of 47.02 mW at room temperature, exhibiting stable single-mode operation from 300-1000 mA and achieving a maximum side mode suppression ratio of 46.7 dB. These results underscore the laterally coupled dielectric grating as a feasible and technologically superior approach for fabricating DFB and DBR lasers, one that is applicable across different material platforms and wavelength bands.
Submitted 10 July, 2024;
originally announced July 2024.
-
Discovery of a dusty yellow supergiant progenitor for the Type IIb SN 2017gkk
Authors:
Zexi Niu,
Ning-Chen Sun,
Jifeng Liu
Abstract:
Type IIb supernovae are an important subclass of stripped-envelope supernovae (SNe), which show H lines only at early times. Their progenitors are believed to contain a low-mass H envelope before explosion. This work reports the discovery of a progenitor candidate in pre-explosion Hubble Space Telescope images for the Type IIb SN 2017gkk. With detailed analysis of its spectral energy distribution and local environment, we suggest that the progenitor is most likely a yellow supergiant with significant circumstellar extinction and has an initial mass of about 16 $M_\odot$, effective temperature $\log(T_{\rm eff}/{\rm K})=3.72\pm0.08$ and luminosity $\log(L/L_{\odot})=5.17\pm0.04$. This progenitor is not massive enough to strip its envelope through stellar wind; it supports an interacting binary progenitor channel and adds to the growing list of direct progenitor detections for Type IIb SNe. Future late-time observations will confirm whether this progenitor candidate has disappeared and reveal the putative binary companion that has survived the explosion.
Submitted 4 July, 2024;
originally announced July 2024.
-
C-MASS: Combinatorial Mobility-Aware Sensor Scheduling for Collaborative Perception with Second-Order Topology Approximation
Authors:
Yukuan Jia,
Yuxuan Sun,
Ruiqing Mao,
Zhaojun Nan,
Sheng Zhou,
Zhisheng Niu
Abstract:
Collaborative Perception (CP) is a promising solution to address occlusions in the traffic environment by sharing sensor data among collaborative vehicles (CoVs) via vehicle-to-everything (V2X) networks. With limited wireless bandwidth, CP necessitates task-oriented and receiver-aware sensor scheduling to prioritize important and complementary sensor data. However, due to vehicular mobility, it is challenging and costly to obtain the up-to-date perception topology, i.e., whether a combination of CoVs can jointly detect an object. In this paper, we propose a combinatorial mobility-aware sensor scheduling (C-MASS) framework for CP with minimal communication overhead. Specifically, detections are replayed with sensor data from individual CoVs and pairs of CoVs to maintain an empirical perception topology up to the second order, which approximately represents the complete perception topology. A hybrid greedy algorithm is then proposed to solve a variant of the budgeted maximum coverage problem with a worst-case performance guarantee. The C-MASS scheduling algorithm adapts the greedy algorithm by incorporating the topological uncertainty and the unexplored time of CoVs to balance exploration and exploitation, addressing the mobility challenge. Extensive numerical experiments demonstrate the near-optimality of the proposed C-MASS framework in both edge-assisted and distributed CP configurations. The weighted recall improvements over object-level CP are 5.8% and 4.2%, respectively. Compared to distance-based and area-based greedy heuristics, the gaps to the offline optimal solutions are reduced by up to 75% and 71%, respectively.
Submitted 29 June, 2024;
originally announced July 2024.
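The budgeted maximum coverage problem named in the abstract above can be illustrated with the textbook heuristic: run a gain-per-cost greedy, then return the better of that result and the best single affordable set. This is a sketch of the standard first-order problem only. It is not the paper's hybrid greedy algorithm or its second-order topology, and the data layout and function names are hypothetical.

```python
def coverage_value(chosen, sets, weights):
    # Total weight of the elements covered by the chosen sets.
    covered = set().union(*(sets[name] for name in chosen))
    return sum(weights[e] for e in covered)

def ratio_greedy(sets, costs, weights, budget):
    # Repeatedly add the affordable set with the best marginal gain per unit cost.
    # Costs are assumed to be strictly positive.
    chosen, covered, spent = [], set(), 0.0
    while True:
        best, best_ratio = None, 0.0
        for name in sorted(sets):  # sorted for deterministic tie-breaking
            if name in chosen or spent + costs[name] > budget:
                continue
            gain = sum(weights[e] for e in sets[name] - covered)
            ratio = gain / costs[name]
            if ratio > best_ratio:
                best, best_ratio = name, ratio
        if best is None:
            return chosen
        chosen.append(best)
        covered |= sets[best]
        spent += costs[best]

def budgeted_max_coverage(sets, costs, weights, budget):
    # Guard against the greedy being misled by a cheap, low-value set:
    # compare it with every single affordable set and keep the best.
    greedy = ratio_greedy(sets, costs, weights, budget)
    singles = [[name] for name in sorted(sets) if costs[name] <= budget]
    return max([greedy] + singles, key=lambda c: coverage_value(c, sets, weights))

# Hypothetical instance: each "sensor" covers a set of objects.
sets = {"small": {"x"}, "big": {"y"}}
costs = {"small": 1.0, "big": 10.0}
weights = {"x": 2.0, "y": 10.0}
result = budgeted_max_coverage(sets, costs, weights, budget=10.0)
```

In this instance the ratio greedy picks `small` first and can then no longer afford `big`, so the comparison against single sets is what recovers the better schedule.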
-
Dynamic Scheduling for Vehicle-to-Vehicle Communications Enhanced Federated Learning
Authors:
Jintao Yan,
Tan Chen,
Yuxuan Sun,
Zhaojun Nan,
Sheng Zhou,
Zhisheng Niu
Abstract:
Leveraging the computing and sensing capabilities of vehicles, vehicular federated learning (VFL) has been applied to edge training for connected vehicles. The dynamic and interconnected nature of vehicular networks presents unique opportunities to harness direct vehicle-to-vehicle (V2V) communications, enhancing VFL training efficiency. In this paper, we formulate a stochastic optimization problem to optimize the VFL training performance, considering the energy constraints and mobility of vehicles, and propose a V2V-enhanced dynamic scheduling (VEDS) algorithm to solve it. The model aggregation requirements of VFL and the limited transmission time due to mobility result in a stepwise objective function, which presents challenges in solving the problem. We thus propose a derivative-based drift-plus-penalty method to convert the long-term stochastic optimization problem to an online mixed integer nonlinear programming (MINLP) problem, and provide a theoretical analysis to bound the performance gap between the online solution and the offline optimal solution. Further analysis of the scheduling priority reduces the original problem into a set of convex optimization problems, which are efficiently solved using the interior-point method. Experimental results demonstrate that compared with the state-of-the-art benchmarks, the proposed algorithm enhances the image classification accuracy on the CIFAR-10 dataset by 3.18% and reduces the average displacement errors on the Argoverse trajectory prediction dataset by 10.21%.
Submitted 25 June, 2024;
originally announced June 2024.
-
Retrieve-Plan-Generation: An Iterative Planning and Answering Framework for Knowledge-Intensive LLM Generation
Authors:
Yuanjie Lyu,
Zihan Niu,
Zheyong Xie,
Chao Zhang,
Tong Xu,
Yang Wang,
Enhong Chen
Abstract:
Despite the significant progress of large language models (LLMs) in various tasks, they often produce factual errors due to their limited internal knowledge. Retrieval-Augmented Generation (RAG), which enhances LLMs with external knowledge sources, offers a promising solution. However, these methods can be misled by irrelevant paragraphs in retrieved documents. Due to the inherent uncertainty in LLM generation, inputting the entire document may introduce off-topic information, causing the model to deviate from the central topic and affecting the relevance of the generated content. To address these issues, we propose the Retrieve-Plan-Generation (RPG) framework. RPG generates plan tokens to guide subsequent generation in the plan stage. In the answer stage, the model selects relevant fine-grained paragraphs based on the plan and uses them for further answer generation. This plan-answer process is repeated iteratively until completion, enhancing generation relevance by focusing on specific topics. To implement this framework efficiently, we utilize a simple but effective multi-task prompt-tuning method, enabling the existing LLMs to handle both planning and answering. We comprehensively compare RPG with baselines across 5 knowledge-intensive generation tasks, demonstrating the effectiveness of our approach.
Submitted 21 June, 2024;
originally announced June 2024.
-
On the small boundary property, $\mathcal Z$-stability, and Bauer simplexes
Authors:
George A. Elliott,
Zhuang Niu
Abstract:
Let $X$ be a compact metrizable space, and let $Δ$ be a closed set of Borel probability measures on $X$. We study the small boundary property of the pair $(X, Δ)$. In particular, it is shown that $(X, Δ)$ has the small boundary property if it has a restricted version of property Gamma.
As an application, it is shown that, if $A$ is the crossed product C*-algebra $\mathrm{C}(X)\rtimes\mathbb Z^d$, where $(X, \mathbb Z^d)$ is a free minimal topological dynamical system, or if $A$ is an AH algebra with diagonal maps, then, $A$ is $\mathcal Z$-stable if the set of extreme tracial states is compact, regardless of its dimension.
Submitted 14 June, 2024;
originally announced June 2024.
-
Task-Oriented Wireless Communications for Collaborative Perception in Intelligent Unmanned Systems
Authors:
Sheng Zhou,
Yukuan Jia,
Ruiqing Mao,
Zhaojun Nan,
Yuxuan Sun,
Zhisheng Niu
Abstract:
Collaborative Perception (CP) has shown great potential to achieve more holistic and reliable environmental perception in intelligent unmanned systems (IUSs). However, implementing CP still faces key challenges due to the characteristics of the CP task and the dynamics of wireless channels. In this article, a task-oriented wireless communication framework is proposed to jointly optimize the communication scheme and the CP procedure. We first propose channel-adaptive compression and robust fusion approaches to extract and exploit the most valuable semantic information under wireless communication constraints. We then propose a task-oriented distributed scheduling algorithm to identify the best collaborators for CP under dynamic environments. The main idea is learning while scheduling, where the collaboration utility is effectively learned with low computation and communication overhead. Case studies are carried out in connected autonomous driving scenarios to verify the proposed framework. Finally, we identify several future research directions.
Submitted 5 June, 2024;
originally announced June 2024.
-
Continuous momentum state lasing and cavity frequency-pinning with laser-cooled strontium atoms
Authors:
V. M. Schäfer,
Z. Niu,
J. R. K. Cline,
D. J. Young,
E. Y. Song,
H. Ritsch,
J. K. Thompson
Abstract:
Laser-cooled gases of atoms interacting with the field of an optical cavity are a powerful tool for quantum sensing and the simulation of open and closed quantum systems. They can display spontaneous self-organisation phase transitions, time crystals, new lasing mechanisms, squeezed states for quantum sensing, protection of quantum coherence, and dynamical phase transitions. However, all of these phenomena are explored in a discontinuous manner due to the need to stop and reload a new ensemble of atoms. Here we report the observation of hours-long continuous lasing from laser-cooled $^{88}$Sr atoms continuously loaded into a ring cavity. The required inversion to produce lasing arises from inversion in the atomic momentum degree of freedom, a mechanism related directly to self-organisation phase transitions and collective atomic recoil lasing, both of which were previously only observed in a cyclic fashion compared to the truly continuous behavior here. Further, the sensitivity of the lasing frequency to cavity frequency changes is suppressed 120-fold due to an atomic loss mechanism, opening an interesting new path to compensate cavity frequency noise for realizing narrow frequency references. This work opens the way for continuous cavity QED quantum simulation experiments as well as continuous superradiant lasers.
Submitted 31 May, 2024;
originally announced May 2024.
-
Efficient LLM-Jailbreaking by Introducing Visual Modality
Authors:
Zhenxing Niu,
Yuyao Sun,
Haodong Ren,
Haoxuan Ji,
Quan Wang,
Xiaoke Ma,
Gang Hua,
Rong Jin
Abstract:
This paper focuses on jailbreaking attacks against large language models (LLMs), eliciting them to generate objectionable content in response to harmful user queries. Unlike previous LLM-jailbreaks that directly target LLMs, our approach begins by constructing a multimodal large language model (MLLM) through the incorporation of a visual module into the target LLM. Subsequently, we conduct an efficient MLLM-jailbreak to generate jailbreaking embeddings embJS. Finally, we convert the embJS into text space to facilitate the jailbreaking of the target LLM. Compared to direct LLM-jailbreaking, our approach is more efficient, as MLLMs are more vulnerable to jailbreaking than pure LLMs. Additionally, to improve the attack success rate (ASR) of jailbreaking, we propose an image-text semantic matching scheme to identify a suitable initial input. Extensive experiments demonstrate that our approach surpasses current state-of-the-art methods in terms of both efficiency and effectiveness. Moreover, our approach exhibits superior cross-class jailbreaking capabilities.
Submitted 30 May, 2024;
originally announced May 2024.
-
Towards Unified Robustness Against Both Backdoor and Adversarial Attacks
Authors:
Zhenxing Niu,
Yuyao Sun,
Qiguang Miao,
Rong Jin,
Gang Hua
Abstract:
Deep Neural Networks (DNNs) are known to be vulnerable to both backdoor and adversarial attacks. In the literature, these two types of attacks are commonly treated as distinct robustness problems and solved separately, since they are training-time and inference-time attacks, respectively. However, this paper reveals that there is an intriguing connection between them: (1) planting a backdoor into a model will significantly affect the model's adversarial examples; (2) for an infected model, its adversarial examples have features similar to those of the triggered images. Based on these observations, a novel Progressive Unified Defense (PUD) algorithm is proposed to defend against backdoor and adversarial attacks simultaneously. Specifically, our PUD has a progressive model purification scheme to jointly erase backdoors and enhance the model's adversarial robustness. At the early stage, the adversarial examples of infected models are utilized to erase backdoors. With the backdoor gradually erased, our model purification can naturally turn into a stage to boost the model's robustness against adversarial attacks. Besides, our PUD algorithm can effectively identify poisoned images, which allows the initial extra dataset not to be completely clean. Extensive experimental results show that our discovered connection between backdoor and adversarial attacks is ubiquitous, regardless of the type of backdoor attack. The proposed PUD outperforms state-of-the-art backdoor defenses, including the model repairing-based and data filtering-based methods. Besides, it is also competitive with the most advanced adversarial defense methods.
Submitted 28 May, 2024;
originally announced May 2024.
-
Interference between distinguishable photons
Authors:
Manman Wang,
Yanfeng Li,
Hanqing Liu,
Haiqiao Ni,
Zhichuan Niu,
Chengyong Hu
Abstract:
Two-photon interference (TPI) lies at the heart of photonic quantum technologies. TPI is generally regarded as quantum interference stemming from the indistinguishability of identical photons, hence a common intuition prevails that TPI would disappear if photons are distinguishable. Here we disprove this perspective and uncover the essence of TPI. We report the first demonstration of TPI between distinguishable photons with their frequency separation up to $10^4$ times larger than their linewidths. We perform time-resolved TPI between an independent laser and single photons with ultralong coherence time ($>10\ μ$s). We observe a maximum TPI visibility of $72\%\pm 2\%$ well above the $50\%$ classical limit indicating the quantum feature, and simultaneously a broad visibility background and a classical beat visibility of less than $50\%$ reflecting the classical feature. These visibilities are independent of the photon frequency separation and show no difference between distinguishable and indistinguishable photons. Based on a general wave superposition model, we derive the cross-correlation functions which fully reproduce and explain the experiments. Our results reveal that TPI as the fourth-order interference arises from the second-order interference of two photons within the mutual coherence time and TPI is not linked to the photon indistinguishability. This work provides new insights into the nature of TPI with great implications in both quantum optics and photonic quantum technologies.
Submitted 15 May, 2024;
originally announced May 2024.
-
Tomur: Traffic-Aware Performance Prediction of On-NIC Network Functions with Multi-Resource Contention
Authors:
Shaofeng Wu,
Qiang Su,
Zhixiong Niu,
Hong Xu
Abstract:
Network function (NF) offloading on SmartNICs has been widely used in modern data centers, offering benefits in host resource saving and programmability. Co-running NFs on the same SmartNICs can cause performance interference due to onboard resource contention. Therefore, to meet performance SLAs while ensuring efficient resource management, operators need mechanisms to predict NF performance under such contention. However, existing solutions lack SmartNIC-specific knowledge and exhibit limited traffic awareness, leading to poor accuracy for on-NIC NFs. This paper proposes Tomur, a novel performance prediction system for on-NIC NFs. Tomur builds upon the key observation that co-located NFs contend for multiple resources, including onboard accelerators and the memory subsystem. It also facilitates traffic awareness according to the behaviors of individual resources to maintain accuracy as the external traffic attributes vary. Evaluation using BlueField-2 SmartNIC shows that Tomur improves the prediction accuracy by 78.8% and reduces SLA violations by 92.2% compared to state-of-the-art approaches, and enables new practical use cases.
Submitted 31 May, 2024; v1 submitted 8 May, 2024;
originally announced May 2024.
-
A Hong Kong Sign Language Corpus Collected from Sign-interpreted TV News
Authors:
Zhe Niu,
Ronglai Zuo,
Brian Mak,
Fangyun Wei
Abstract:
This paper introduces TVB-HKSL-News, a new Hong Kong sign language (HKSL) dataset collected from a TV news program over a period of 7 months. The dataset is collected to enrich resources for HKSL and support research in large-vocabulary continuous sign language recognition (SLR) and translation (SLT). It consists of 16.07 hours of sign videos of two signers with a vocabulary of 6,515 glosses (for SLR) and 2,850 Chinese characters or 18K Chinese words (for SLT). One signer has 11.66 hours of sign videos and the other has 4.41 hours. One objective in building the dataset is to support the investigation of how well large-vocabulary continuous sign language recognition/translation can be done for a single signer given a (relatively) large amount of his/her training data, which could potentially lead to the development of new modeling methods. Besides, most parts of the data collection pipeline are automated with little human intervention; we believe that our collection method can easily be scaled up to collect more sign language data for SLT in the future for any sign language for which such sign-interpreted videos are available. We also run a SOTA SLR/SLT model on the dataset and obtain a baseline SLR word error rate of 34.08% and a baseline SLT BLEU-4 score of 23.58 for benchmarking future research on the dataset.
Submitted 1 May, 2024;
originally announced May 2024.
-
Nuclear mass predictions based on convolutional neural network
Authors:
Yanhua Lu,
Tianshuai Shang,
Pengxiang Du,
Jian Li,
Haozhao Liang,
Zhongming Niu
Abstract:
A convolutional neural network (CNN) is employed to investigate nuclear mass. By introducing the masses of neighboring nuclei and the pairing effects at the input layer of the network, local features of the target nucleus are extracted to predict its mass. Then, through learning the differences between the experimental nuclear masses and the nuclear masses predicted by the WS4 model, a new global-local model (CNN-WS4) is developed, which incorporates both the global nuclear mass model and local features. Due to the incorporation of local features, the CNN-WS4 model achieves high accuracy on the training set. When extrapolating for newly emerged nuclei, the CNN-WS4 also exhibits appreciable stability, thereby demonstrating its robustness.
Submitted 28 June, 2024; v1 submitted 23 April, 2024;
originally announced April 2024.
-
Prediction of Nuclear Charge Density Distribution with Feedback Neural Network
Authors:
Tian-Shuai Shang,
Jian Li,
Zhong-Ming Niu
Abstract:
The nuclear charge density distribution plays an important role in nuclear physics and atomic physics. As one of the most frequently used models to obtain charge density distribution, the two-parameter Fermi (2pF) model has been widely applied in both nuclear physics and atomic physics. Currently, the feedforward neural network has been employed to study the available 2pF model parameters for 86 nuclei, and it is found that by introducing A^{1/3} into the input parameter of the neural network, the accuracy and precision of the parameter learning effect are improved. Furthermore, the average result of multiple predictions is more reliable than the best result of a single prediction, and there is no significant difference between the average result of density value and of parameter value for the average charge density distribution. In addition, 2pF parameters of 284 (near) stable nuclei are also predicted in this work, which provides a reference for the experiment.
Submitted 16 April, 2024;
originally announced April 2024.
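For readers unfamiliar with the two-parameter Fermi model named in the abstract above, its standard form is rho(r) = rho0 / (1 + exp((r - c) / z)), where c is the half-density radius and z the diffuseness. The following is a minimal sketch of that formula only; the function name and the numerical values of c and z are illustrative assumptions, not parameters taken from the paper.

```python
import math

def two_parameter_fermi(r, c, z, rho0=1.0):
    """2pF charge density at radius r: rho0 / (1 + exp((r - c) / z)).

    c is the half-density radius and z the diffuseness, in the same
    length unit as r (typically fm). rho0 sets the central scale.
    """
    return rho0 / (1.0 + math.exp((r - c) / z))

# Illustrative (not fitted) parameters in fm; assumptions for this sketch only.
c, z = 6.6, 0.55
profile = [(r, two_parameter_fermi(r, c, z)) for r in (0.0, 3.0, 6.6, 9.0)]
```

By construction the density falls to half of rho0 at r = c, which is why c is called the half-density radius.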
-
Searching for Hyper-compact star clusters in the Milky Way using LAMOST and Gaia
Authors:
Hao Wu,
Haibo Yuan,
Yilun Wang,
Zexi Niu,
Huawei Zhang
Abstract:
During the early merger of the Milky Way, intermediate-mass black holes in merged dwarf galaxies may have been ejected from the center of their host galaxies due to gravitational waves, carrying some central stars along. This process can lead to the formation of hyper-compact star clusters, potentially hosting black holes in the mass range of $10^4$ to $10^5$ solar masses. These clusters are crucial targets for identifying and investigating intermediate-mass black holes. However, no hyper-compact star clusters in the Milky Way have been identified so far. In this paper, taking advantage of the high spatial resolution power of Gaia, we used data from Gaia EDR3 and LAMOST DR7, along with additional data from Pan-STARRS and SDSS, to conduct an initial screening of 6,138,049 sources using various parameters of Gaia EDR3. A total of 4,786 sources were selected for in-depth analysis. Each of these sources was meticulously scrutinized by examining their images, spectra, and nearby celestial objects to exclude various false positives, such as contaminations, galaxies, wide binaries, or wrong matches. We finally identified one likely hyper-compact star cluster candidate in the Milky Way, laying the foundation for further high-resolution imaging and spectral verification.
Submitted 12 April, 2024;
originally announced April 2024.
-
Quantum and Classical Two-photon Interference of Single Photons with Ultralong Coherence Time
Authors:
Manman Wang,
Yanfeng Li,
Hanqing Liu,
Haiqiao Ni,
Zhichuan Niu,
Xiaogang Wei,
Renfu Yang,
Chengyong Hu
Abstract:
Two-photon interference (TPI) is a fundamental phenomenon in quantum optics and plays a crucial role in quantum information science and technology. TPI is commonly considered as quantum interference with an upper bound of $100\%$ for both the TPI visibility and the beat visibility in contrast to its classical counterpart with a maximum visibility of $50\%$. However, this is not always the case. Here we report a simultaneous observation of quantum and classical TPI of single photons with ultralong coherence time which is longer than the photon correlation time by five orders of magnitude. We observe a TPI visibility of $94.3\%\pm 0.2\%$ but a beat visibility of $50\%$. Besides an anti-bunching central dip due to single-photon statistics, we observe two bunching side peaks in cross-correlation curves for indistinguishable photons. Using either classical wave superposition theory or quantum field approach, we derive the same expressions for the cross-correlation functions which reproduce and explain the experiments well. We conclude that quantum TPI with a stream of single photons is equivalent to classical TPI, both of which are the fourth-order interference arising from the second-order interference occurring on the time scale of photon coherence time.
Submitted 7 April, 2024;
originally announced April 2024.
-
Investigating the Robustness of Counterfactual Learning to Rank Models: A Reproducibility Study
Authors:
Zechun Niu,
Jiaxin Mao,
Qingyao Ai,
Ji-Rong Wen
Abstract:
Counterfactual learning to rank (CLTR) has attracted extensive attention in the IR community for its ability to leverage massive logged user interaction data to train ranking models. While the CLTR models can be theoretically unbiased when the user behavior assumption is correct and the propensity estimation is accurate, their effectiveness is usually empirically evaluated via simulation-based experiments due to a lack of widely-available, large-scale, real click logs. However, the mainstream simulation-based experiments are somewhat limited as they often feature a single, deterministic production ranker and simplified user simulation models to generate the synthetic click logs. As a result, the robustness of CLTR models in complex and diverse situations is largely unknown and needs further investigation.
To address this problem, in this paper, we aim to investigate the robustness of existing CLTR models in a reproducibility study with extensive simulation-based experiments that (1) use both deterministic and stochastic production rankers, each with different ranking performance, and (2) leverage multiple user simulation models with different user behavior assumptions. We find that the DLA models and IPS-DCM show better robustness under various simulation settings than IPS-PBM and PRS with offline propensity estimation. Besides, the existing CLTR models often fail to outperform the naive click baselines when the production ranker has relatively high ranking performance or certain randomness, which suggests an urgent need for developing new CLTR algorithms that work for these settings.
Submitted 4 April, 2024;
originally announced April 2024.
-
Convert laser light into single photons via interference
Authors:
Yanfeng Li,
Manman Wang,
Guoqi Huang,
Li Liu,
Wenyan Wang,
Weijie Ji,
Hanqing Liu,
Xiangbin Su,
Shulun Li,
Deyan Dai,
Xiangjun Shang,
Haiqiao Ni,
Zhichuan Niu,
Chengyong Hu
Abstract:
Laser light possesses perfect coherence, but cannot be attenuated to single photons via linear optics. An elegant route to convert laser light into single photons is based on photon blockade in a cavity with a single atom in the strong coupling regime. However, the single-photon purity achieved by this method remains relatively low. Here we propose an interference-based approach where laser light can be transformed into single photons by destructively interfering with a weak but super-bunched incoherent field emitted from a cavity coupled to a single quantum emitter. We demonstrate this idea by measuring the reflected light of a laser field which drives a double-sided optical microcavity containing a single artificial atom, a quantum dot (QD), in the Purcell regime. The reflected light consists of a superposition of the driving field with the cavity output field. We achieve the second-order autocorrelation $g^{(2)}(0)=0.030\pm0.002$ and the two-photon interference visibility $94.3\%\pm0.2\%$. By separating the coherent and incoherent fields in the reflected light, we observe that the incoherent field from the cavity exhibits super-bunching with $g^{(2)}(0)=41\pm2$ while the coherent field retains Poissonian statistics. By controlling the relative amplitude of coherent and incoherent fields, we verify that photon statistics of reflected light is tuneable from perfect anti-bunching to super-bunching in agreement with our predictions. Our results demonstrate photon statistics of light as a quantum interference phenomenon that a single QD can scatter two photons simultaneously at low driving fields in contrast to the common picture that a single two-level quantum emitter can only scatter (or absorb and emit) single photons. This work opens the door to tailoring photon statistics of laser light via cavity or waveguide quantum electrodynamics and interference.
Submitted 25 March, 2024;
originally announced March 2024.
-
Infrastructure-Assisted Collaborative Perception in Automated Valet Parking: A Safety Perspective
Authors:
Yukuan Jia,
Jiawen Zhang,
Shimeng Lu,
Baokang Fan,
Ruiqing Mao,
Sheng Zhou,
Zhisheng Niu
Abstract:
Environmental perception in Automated Valet Parking (AVP) has been a challenging task due to severe occlusions in parking garages. Although Collaborative Perception (CP) can be applied to broaden the field of view of connected vehicles, the limited bandwidth of vehicular communications restricts its application. In this work, we propose a BEV feature-based CP network architecture for infrastructure-assisted AVP systems. The model takes the roadside camera and LiDAR as optional inputs and adaptively fuses them with onboard sensors in a unified BEV representation. Autoencoder and downsampling are applied for channel-wise and spatial-wise dimension reduction, while sparsification and quantization further compress the feature map with little loss in data precision. Combining these techniques, the size of a BEV feature map is effectively compressed to fit in the feasible data rate of the NR-V2X network. With the synthetic AVP dataset, we observe that CP can effectively increase perception performance, especially for pedestrians. Moreover, the advantage of infrastructure-assisted CP is demonstrated in two typical safety-critical scenarios in the AVP setting, increasing the maximum safe cruising speed by up to 3 m/s in both scenarios.
Submitted 22 March, 2024;
originally announced March 2024.
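The abstract above mentions sparsification and quantization as compression steps for the feature map but does not say how they are done. As a rough illustration of the general idea only, the sketch below keeps the largest-magnitude entries of a flat list and quantizes them uniformly. The function names, the top-k rule, the keep ratio, and the 256-level quantizer are all assumptions made for this sketch, not the authors' method.

```python
def sparsify_and_quantize(values, keep_ratio=0.25, levels=256):
    """Keep the largest-magnitude entries, then quantize them uniformly.

    Returns a dict {index: integer code} plus the offset and step
    needed to map the codes back to approximate values.
    """
    k = max(1, int(len(values) * keep_ratio))
    # Indices of the k entries with the largest absolute value.
    keep = sorted(range(len(values)), key=lambda i: abs(values[i]), reverse=True)[:k]
    kept = {i: values[i] for i in keep}
    lo, hi = min(kept.values()), max(kept.values())
    step = (hi - lo) / (levels - 1) or 1.0  # avoid a zero step when hi == lo
    codes = {i: round((v - lo) / step) for i, v in kept.items()}
    return codes, lo, step

def dequantize(codes, lo, step, length):
    """Rebuild a dense list; dropped entries are restored as 0.0."""
    out = [0.0] * length
    for i, q in codes.items():
        out[i] = lo + q * step
    return out

# Hypothetical flattened feature values, for illustration only.
features = [0.1, -2.0, 0.05, 3.0, 0.0, 1.5, -0.2, 0.3]
codes, lo, step = sparsify_and_quantize(features, keep_ratio=0.25)
restored = dequantize(codes, lo, step, len(features))
```

Only the retained indices and their integer codes would need to be transmitted, which is where the bandwidth saving comes from in a scheme of this kind.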
-
The disappearance of the blue and luminous progenitor of the Type IIn SN 2010jl
Authors:
Zexi Niu,
Ning-Chen Sun,
Jifeng Liu
Abstract:
Type IIn supernovae (SNe) exhibit narrow hydrogen lines that arise from the strong interaction between ejecta and circumstellar material. It remains poorly understood, however, what progenitor stars give rise to these explosions. In this work, we perform a detailed analysis of the progenitor and environment of the nearby Type IIn SN 2010jl. With newer images taken by the Hubble Space Telescope, we confirm that the previously reported progenitor candidate is a blend of the progenitor itself and a field star cluster in its close vicinity. SN 2010jl has now become much fainter than the progenitor. The progenitor is very blue and luminous with an effective temperature of log $T_{\rm eff}/{\rm K}$=4.26$^{+0.11}_{-0.09}$ and a luminosity of log $L/L_{\odot}$ =6.52$^{+0.20}_{-0.16}$. It is located in a very young star-forming region, but its luminosity is much higher than that expected from the environmental stellar populations. We suggest that the progenitor was in outburst when observed. Its nature and evolutionary history remain to be investigated.
Submitted 18 April, 2024; v1 submitted 20 March, 2024;
originally announced March 2024.
-
Spatio-Temporal Fluid Dynamics Modeling via Physical-Awareness and Parameter Diffusion Guidance
Authors:
Hao Wu,
Fan Xu,
Yifan Duan,
Ziwei Niu,
Weiyan Wang,
Gaofeng Lu,
Kun Wang,
Yuxuan Liang,
Yang Wang
Abstract:
This paper proposes a two-stage framework named ST-PAD for spatio-temporal fluid dynamics modeling in the field of earth sciences, aiming to achieve high-precision simulation and prediction of fluid dynamics through spatio-temporal physics awareness and parameter diffusion guidance. In the upstream stage, we design a vector quantization reconstruction module with temporal evolution characteristics, ensuring balanced and resilient parameter distribution by introducing general physical constraints. In the downstream stage, a diffusion probability network involving parameters is utilized to generate high-quality future states of fluids, while enhancing the model's generalization ability by perceiving parameters in various physical setups. Extensive experiments on multiple benchmark datasets have verified the effectiveness and robustness of the ST-PAD framework, showing that ST-PAD outperforms current mainstream models in fluid dynamics modeling and prediction, especially in effectively capturing local representations and maintaining significant advantages in OOD generations.
Submitted 18 March, 2024;
originally announced March 2024.
-
FedSPU: Personalized Federated Learning for Resource-constrained Devices with Stochastic Parameter Update
Authors:
Ziru Niu,
Hai Dong,
A. K. Qin
Abstract:
Personalized Federated Learning (PFL) is widely employed in IoT applications to handle high-volume, non-iid client data while ensuring data privacy. However, heterogeneous edge devices owned by clients may impose varying degrees of resource constraints, causing computation and communication bottlenecks for PFL. Federated Dropout has emerged as a popular strategy to address this challenge, wherein only a subset of the global model, i.e. a \textit{sub-model}, is trained on a client's device, thereby reducing computation and communication overheads. Nevertheless, the dropout-based model-pruning strategy may introduce bias, particularly towards non-iid local data. When biased sub-models absorb highly divergent parameters from other clients, performance degradation becomes inevitable. In response, we propose federated learning with stochastic parameter update (FedSPU). Unlike dropout that tailors the global model to small-size local sub-models, FedSPU maintains the full model architecture on each device but randomly freezes a certain percentage of neurons in the local model during training while updating the remaining neurons. This approach ensures that a portion of the local model remains personalized, thereby enhancing the model's robustness against biased parameters from other clients. Experimental results demonstrate that FedSPU outperforms federated dropout by 7.57\% on average in terms of accuracy. Furthermore, an introduced early stopping scheme leads to a significant reduction of the training time by \(24.8\%\sim70.4\%\) while maintaining high accuracy.
Submitted 18 March, 2024;
originally announced March 2024.
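The abstract above describes FedSPU as keeping the full model on each device while randomly freezing a percentage of neurons during local training. The sketch below illustrates that idea for a single layer and a single gradient step. It is a simplified illustration under stated assumptions, not the authors' implementation: the function name, the row-per-neuron layout, the plain SGD rule, and the seeded sampling are all hypothetical.

```python
import random

def local_update_with_frozen_neurons(weights, grads, freeze_ratio, lr=0.1, seed=0):
    """One SGD step in which a random subset of neurons stays frozen.

    weights, grads: lists of rows, one row of incoming weights per neuron.
    freeze_ratio: fraction of neurons whose weights are left unchanged.
    Returns the updated weights and the set of frozen neuron indices.
    """
    rng = random.Random(seed)
    n = len(weights)
    frozen = set(rng.sample(range(n), int(freeze_ratio * n)))
    updated = []
    for i, (w_row, g_row) in enumerate(zip(weights, grads)):
        if i in frozen:
            updated.append(list(w_row))  # frozen neuron: weights kept as-is
        else:
            updated.append([w - lr * g for w, g in zip(w_row, g_row)])
    return updated, frozen

# Hypothetical 4-neuron layer with 2 inputs each, for illustration only.
layer = [[1.0, 1.0] for _ in range(4)]
layer_grads = [[1.0, 1.0] for _ in range(4)]
new_layer, frozen = local_update_with_frozen_neurons(layer, layer_grads, freeze_ratio=0.5)
```

In contrast to dropout-style pruning, every neuron is still present in the model here; the frozen ones simply skip the update in this step.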
-
Test for high-dimensional linear hypothesis of mean vectors via random integration
Authors:
Jianghao Li,
Shizhe Hong,
Zhenzhen Niu,
Zhidong Bai
Abstract:
In this paper, we investigate hypothesis testing for the linear combination of mean vectors across multiple populations through the method of random integration. We have established the asymptotic distributions of the test statistics under both null and alternative hypotheses. Additionally, we provide a theoretical explanation for the special use of our test statistics in situations when the nonzero signal in the linear combination of the true mean vectors is weakly dense. Moreover, Monte-Carlo simulations are presented to evaluate the suggested test against existing high-dimensional tests. The findings from these simulations reveal that our test not only aligns with the performance of other tests in terms of size but also exhibits superior power.
Submitted 12 March, 2024;
originally announced March 2024.
-
ACM MMSys 2024 Bandwidth Estimation in Real Time Communications Challenge
Authors:
Sami Khairy,
Gabriel Mittag,
Vishak Gopal,
Francis Y. Yan,
Zhixiong Niu,
Ezra Ameri,
Scott Inglis,
Mehrsa Golestaneh,
Ross Cutler
Abstract:
The quality of experience (QoE) delivered by video conferencing systems to end users depends in part on correctly estimating the capacity of the bottleneck link between the sender and the receiver over time. Bandwidth estimation for real-time communications (RTC) remains a significant challenge, primarily due to the continuously evolving heterogeneous network architectures and technologies. From the first bandwidth estimation challenge which was hosted at ACM MMSys 2021, we learned that bandwidth estimation models trained with reinforcement learning (RL) in simulations to maximize network-based reward functions may not be optimal in reality due to the sim-to-real gap and the difficulty of aligning network-based rewards with user-perceived QoE. This grand challenge aims to advance bandwidth estimation model design by aligning reward maximization with user-perceived QoE optimization using offline RL and a real-world dataset with objective rewards which have high correlations with subjective audio/video quality in Microsoft Teams. All models submitted to the grand challenge underwent initial evaluation on our emulation platform. For a comprehensive evaluation under diverse network conditions with temporal fluctuations, top models were further evaluated on our geographically distributed testbed by using each model to conduct 600 calls within a 12-day period. The winning model is shown to deliver comparable performance to the top behavior policy in the released dataset. By leveraging real-world data and integrating objective audio/video quality scores as rewards, offline RL can therefore facilitate the development of competitive bandwidth estimators for RTC.
Submitted 15 March, 2024; v1 submitted 10 March, 2024;
originally announced March 2024.
-
Simultaneous test of the mean vectors and covariance matrices for high-dimensional data using RMT
Authors:
Zhenzhen Niu,
Jianghao Li,
Wenya Luo,
Zhidong Bai
Abstract:
In this paper, we propose a new modified likelihood ratio test (LRT) for simultaneously testing mean vectors and covariance matrices of two-sample populations in high-dimensional settings. By employing tools from Random Matrix Theory (RMT), we derive the limiting null distribution of the modified LRT for generally distributed populations. Furthermore, we compare the proposed test with existing tests using simulation results, demonstrating that the modified LRT exhibits favorable properties in terms of both size and power.
Submitted 8 March, 2024;
originally announced March 2024.
-
Deep Coupling Network For Multivariate Time Series Forecasting
Authors:
Kun Yi,
Qi Zhang,
Hui He,
Kaize Shi,
Liang Hu,
Ning An,
Zhendong Niu
Abstract:
Multivariate time series (MTS) forecasting is crucial in many real-world applications. To achieve accurate MTS forecasting, it is essential to simultaneously consider both intra- and inter-series relationships among time series data. However, previous work has typically modeled intra- and inter-series relationships separately and has disregarded multi-order interactions present within and between time series data, which can seriously degrade forecasting accuracy. In this paper, we reexamine intra- and inter-series relationships from the perspective of mutual information and accordingly construct a comprehensive relationship learning mechanism tailored to simultaneously capture the intricate multi-order intra- and inter-series couplings. Based on the mechanism, we propose a novel deep coupling network for MTS forecasting, named DeepCN, which consists of a coupling mechanism dedicated to explicitly exploring the multi-order intra- and inter-series relationships among time series data concurrently, a coupled variable representation module aimed at encoding diverse variable patterns, and an inference module facilitating predictions through one forward step. Extensive experiments conducted on seven real-world datasets demonstrate that our proposed DeepCN achieves superior performance compared with the state-of-the-art baselines.
Submitted 23 February, 2024;
originally announced February 2024.
-
Interlayer ferroelectric polarization modulated anomalous Hall effects in four-layer MnBi2Te4 antiferromagnets
Authors:
Ziyu Niu,
Xiang-Long Yu,
Dingfu Shao,
Xixiang Jing,
Defeng Hou,
Xuhong Li,
Jing Sun,
Junqin Shi,
Xiaoli Fan,
Tengfei Cao
Abstract:
Van der Waals (vdW) assembly could efficiently modulate the symmetry of two-dimensional (2D) materials that ultimately governs their physical properties. Of particular interest is the ferroelectric polarization being introduced by proper vdW assembly that enables the realization of novel electronic, magnetic and transport properties of 2D materials. Four-layer antiferromagnetic MnBi2Te4 (F-MBT) offers an excellent platform to explore ferroelectric polarization effects on magnetic order and topological transport properties of nanomaterials. Here, by applying symmetry analyses and density-functional-theory calculations, the ferroelectric interface effects on magnetic order, anomalous Hall effect (AHE) or even quantum AHE (QAHE) on the F-MBT are analyzed. Interlayer ferroelectric polarization in F-MBT efficiently violates the PT symmetry (the combined symmetry of central inversion (P) and time reversal (T)) of the F-MBT by conferring magnetoelectric couplings, and stabilizes a specific antiferromagnetic order encompassing a ferromagnetic interface in the F-MBT. We predict that engineering an interlayer polarization in the top or bottom interface of F-MBT allows converting F-MBT from a trivial insulator to a Chern insulator. The switching of ferroelectric polarization at the middle interfaces results in a direction reversal of the quantum anomalous Hall current. Additionally, the interlayer polarization of the top and bottom interfaces can be aligned in the same direction, and the switching of polarization direction also reverses the direction of anomalous Hall currents. Overall, our work highlights the occurrence of quantum-transport phenomena in 2D vdW four-layer antiferromagnets through vdW assembly. These phenomena are absent in the bulk or thin-film in bulk-like stacking forms of MnBi2Te4.
Submitted 19 February, 2024;
originally announced February 2024.
-
A Survey on Domain Generalization for Medical Image Analysis
Authors:
Ziwei Niu,
Shuyi Ouyang,
Shiao Xie,
Yen-wei Chen,
Lanfen Lin
Abstract:
Medical Image Analysis (MedIA) has emerged as a crucial tool in computer-aided diagnosis systems, particularly with the advancement of deep learning (DL) in recent years. However, well-trained deep models often experience significant performance degradation when deployed in different medical sites, modalities, and sequences, known as a domain shift issue. In light of this, Domain Generalization (DG) for MedIA aims to address the domain shift challenge by generalizing effectively and performing robustly across unknown data distributions. This paper presents a comprehensive review of substantial developments in this area. First, we provide a formal definition of domain shift and domain generalization in the medical field, and discuss several related settings. Subsequently, we summarize the recent methods from three viewpoints: data manipulation level, feature representation level, and model training level, and present some algorithms in detail for each viewpoint. Furthermore, we introduce the commonly used datasets. Finally, we summarize existing literature and present some potential research topics for the future. For this survey, we also created a GitHub project by collecting the supporting resources, at the link: https://github.com/Ziwei-Niu/DG_for_MedIA
Submitted 13 February, 2024; v1 submitted 7 February, 2024;
originally announced February 2024.
-
Boosting Adversarial Transferability across Model Genus by Deformation-Constrained Warping
Authors:
Qinliang Lin,
Cheng Luo,
Zenghao Niu,
Xilin He,
Weicheng Xie,
Yuanbo Hou,
Linlin Shen,
Siyang Song
Abstract:
Adversarial examples generated by a surrogate model typically exhibit limited transferability to unknown target systems. To address this problem, many transferability enhancement approaches (e.g., input transformation and model augmentation) have been proposed. However, they show poor performance in attacking systems having different model genera from the surrogate model. In this paper, we propose a novel and generic attacking strategy, called Deformation-Constrained Warping Attack (DeCoWA), that can be effectively applied to cross model genus attack. Specifically, DeCoWA first augments input examples via an elastic deformation, namely Deformation-Constrained Warping (DeCoW), to obtain rich local details of the augmented input. To avoid severe distortion of global semantics caused by random deformation, DeCoW further constrains the strength and direction of the warping transformation by a novel adaptive control strategy. Extensive experiments demonstrate that the transferable examples crafted by our DeCoWA on CNN surrogates can significantly hinder the performance of Transformers (and vice versa) on various tasks, including image classification, video action recognition, and audio recognition. Code is made available at https://github.com/LinQinLiang/DeCoWA.
Submitted 6 February, 2024;
originally announced February 2024.
-
Nuclear mass table in deformed relativistic Hartree-Bogoliubov theory in continuum, II: Even-$Z$ nuclei
Authors:
DRHBc Mass Table Collaboration,
Peng Guo,
Xiaojie Cao,
Kangmin Chen,
Zhihui Chen,
Myung-Ki Cheoun,
Yong-Beom Choi,
Pak Chung Lam,
Wenmin Deng,
Jianmin Dong,
Pengxiang Du,
Xiaokai Du,
Kangda Duan,
Xiaohua Fan,
Wei Gao,
Lisheng Geng,
Eunja Ha,
Xiao-Tao He,
Jinniu Hu,
Jingke Huang,
Kun Huang,
Yanan Huang,
Zidan Huang,
Kim Da Hyung,
Hoi Yat Chan
, et al. (58 additional authors not shown)
Abstract:
The mass table in the deformed relativistic Hartree-Bogoliubov theory in continuum (DRHBc) with the PC-PK1 density functional has been established for even-$Z$ nuclei with $8\le Z\le120$, extended from the previous work for even-even nuclei [Zhang et al. (DRHBc Mass Table Collaboration), At. Data Nucl. Data Tables 144, 101488 (2022)]. The calculated binding energies, two-nucleon and one-neutron separation energies, root-mean-square (rms) radii of neutron, proton, matter, and charge distributions, quadrupole deformations, and neutron and proton Fermi surfaces are tabulated and compared with available experimental data. A total of 4829 even-$Z$ nuclei are predicted to be bound, with an rms deviation of 1.477 MeV from the 1244 mass data. Good agreement with the available experimental odd-even mass differences, $α$ decay energies, and charge radii is also achieved. The description accuracy for nuclear masses and nucleon separation energies as well as the prediction for drip lines is compared with the results obtained from other relativistic and nonrelativistic density functionals. The comparison shows that the DRHBc theory with PC-PK1 provides an excellent microscopic description for the masses of even-$Z$ nuclei. The systematics of the nucleon separation energies, odd-even mass differences, pairing energies, two-nucleon gaps, $α$ decay energies, rms radii, quadrupole deformations, potential energy curves, neutron density distributions, and neutron mean-field potentials are discussed.
Submitted 10 June, 2024; v1 submitted 5 February, 2024;
originally announced February 2024.
-
Jailbreaking Attack against Multimodal Large Language Model
Authors:
Zhenxing Niu,
Haodong Ren,
Xinbo Gao,
Gang Hua,
Rong Jin
Abstract:
This paper focuses on jailbreaking attacks against multi-modal large language models (MLLMs), seeking to elicit MLLMs to generate objectionable responses to harmful user queries. A maximum likelihood-based algorithm is proposed to find an image Jailbreaking Prompt (imgJP), enabling jailbreaks against MLLMs across multiple unseen prompts and images (i.e., data-universal property). Our approach exhibits strong model-transferability, as the generated imgJP can be transferred to jailbreak various models, including MiniGPT-v2, LLaVA, InstructBLIP, and mPLUG-Owl2, in a black-box manner. Moreover, we reveal a connection between MLLM-jailbreaks and LLM-jailbreaks. As a result, we introduce a construction-based method to harness our approach for LLM-jailbreaks, demonstrating greater efficiency than current state-of-the-art methods. The code is available here. Warning: some content generated by language models may be offensive to some readers.
Submitted 3 February, 2024;
originally announced February 2024.
-
Stability of vortices in exciton-polariton condensates with spin-orbital-angular-momentum coupling
Authors:
Xin-Xin Yang,
Wei Zhang,
Zhen-Xia Niu
Abstract:
The existence and dynamics of stable quantized vortices are an important subject of quantum many-body physics. Spin-orbital-angular-momentum coupling (SOAMC), a special type of spin-orbit coupling, has been experimentally achieved to create vortices in atomic Bose-Einstein condensates (BEC). Here, we generalize the concept of SOAMC to a two-component polariton BEC and analyze the emergence and configuration of vortices under a finite-size circular pumping beam. We find that the regular configuration of vortex lattices induced by a finite-size circular pump is significantly distorted by the spatially dependent Raman coupling of SOAMC, even in the presence of a repulsive polariton interaction which can assist the formation of a stable vortex configuration. Meanwhile, a pair of vortices induced by SOAMC located at the center of the polariton cloud remains stable. When the Raman coupling is sufficiently strong and the interaction is weak, the vortices spiraling in from the edge of the polariton cloud will disrupt the polariton BEC.
Submitted 3 May, 2024; v1 submitted 31 January, 2024;
originally announced January 2024.
-
Experimental test of the Crooks fluctuation theorem in a single nuclear spin
Authors:
Wei Cheng,
Wenquan Liu,
Zhibo Niu,
Chang-Kui Duan,
Xing Rong,
Jiangfeng Du
Abstract:
We experimentally test the Crooks fluctuation theorem in a quantum spin system. Our results show that the Crooks fluctuation theorem is valid for different speeds of the nonequilibrium processes and under various effective temperatures. Work is not an observable in quantum systems, which makes tests of quantum thermodynamic theorems challenging. In this work, we developed high-fidelity single-shot readouts of a single nuclear spin in diamond and implemented the two-point work measurement protocol, enabling a direct experimental test of the Crooks fluctuation theorem. Our results provide a quantum insight into fluctuations and the methods we developed can be utilized to study other quantum thermodynamic theorems.
Submitted 31 January, 2024;
originally announced January 2024.
-
Test for high-dimensional mean vectors via the weighted $L_2$-norm
Authors:
Jianghao Li,
Zhenzhen Niu,
Shizhe Hong,
Zhidong Bai
Abstract:
In this paper, we propose a novel approach to test the equality of high-dimensional mean vectors of several populations via the weighted $L_2$-norm. We establish the asymptotic normality of the test statistics under the null hypothesis. We also explain theoretically why our test statistics can be highly useful in weakly dense cases when the nonzero signal in mean vectors is present. Furthermore, we compare the proposed test with existing tests using simulation results, demonstrating that the weighted $L_2$-norm-based test statistic exhibits favorable properties in terms of both size and power.
Submitted 31 January, 2024; v1 submitted 30 January, 2024;
originally announced January 2024.
-
Memory-Inspired Temporal Prompt Interaction for Text-Image Classification
Authors:
Xinyao Yu,
Hao Sun,
Ziwei Niu,
Rui Qin,
Zhenjia Bai,
Yen-Wei Chen,
Lanfen Lin
Abstract:
In recent years, large-scale pre-trained multimodal models (LMMs) have emerged to integrate the vision and language modalities, achieving considerable success in various natural language processing and computer vision tasks. The growing size of LMMs, however, results in a significant computational cost for fine-tuning these models for downstream tasks. Hence, prompt-based interaction strategies have been studied to align modalities more efficiently. In this context, we propose a novel prompt-based multimodal interaction strategy inspired by human memory strategy, namely Memory-Inspired Temporal Prompt Interaction (MITP). Our proposed method involves two stages, as in human memory strategy: the acquiring stage, and the consolidation and activation stage. We utilize temporal prompts on intermediate layers to imitate the acquiring stage, leverage similarity-based prompt interaction to imitate memory consolidation, and employ a prompt generation strategy to imitate memory activation. The main strength of our paper is that we let the prompt vectors interact on intermediate layers to enable sufficient information exchange between modalities, with compressed trainable parameters and memory usage. We achieve competitive results on several datasets with relatively small memory usage and 2.0M trainable parameters (about 1% of the pre-trained foundation model).
Submitted 26 January, 2024;
originally announced January 2024.
-
VALL-T: Decoder-Only Generative Transducer for Robust and Decoding-Controllable Text-to-Speech
Authors:
Chenpeng Du,
Yiwei Guo,
Hankun Wang,
Yifan Yang,
Zhikang Niu,
Shuai Wang,
Hui Zhang,
Xie Chen,
Kai Yu
Abstract:
Recent TTS models with decoder-only Transformer architecture, such as SPEAR-TTS and VALL-E, achieve impressive naturalness and demonstrate the ability for zero-shot adaptation given a speech prompt. However, such decoder-only TTS models lack monotonic alignment constraints, sometimes leading to hallucination issues such as mispronunciation, word skipping and repeating. To address this limitation, we propose VALL-T, a generative Transducer model that introduces shifting relative position embeddings for input phoneme sequence, explicitly indicating the monotonic generation process while maintaining the architecture of decoder-only Transformer. Consequently, VALL-T retains the capability of prompt-based zero-shot adaptation and demonstrates better robustness against hallucinations with a relative reduction of 28.3% in the word error rate. Furthermore, the controllability of alignment in VALL-T during decoding facilitates the use of untranscribed speech prompts, even in unknown languages. It also enables the synthesis of lengthy speech by utilizing an aligned context window.
Submitted 29 January, 2024; v1 submitted 25 January, 2024;
originally announced January 2024.
-
A comprehensive correction of the Gaia DR3 XP spectra
Authors:
Bowen Huang,
Haibo Yuan,
Maosheng Xiang,
Yang Huang,
Kai Xiao,
Shuai Xu,
Ruoyi Zhang,
Lin Yang,
Zexi Niu,
Hongrui Gu
Abstract:
By combining spectra from the CALSPEC and NGSL, as well as spectroscopic data from the LAMOST Data Release 7 (DR7), we have analyzed and corrected the systematic errors of the Gaia DR3 BP/RP (XP) spectra. The errors depend on the normalized spectral energy distribution (simplified by two independent "colors") and $G$ magnitude. Our corrections are applicable in the range of approximately $-0.5<BP-RP<2$, $3<G<17.5$ and $E(B-V)<0.8$. To validate our correction, we conduct independent tests by comparing with the MILES and LEMONY spectra. The results demonstrate that the systematic errors of $BP-RP$ and $G$ have been effectively corrected, especially in the near ultraviolet. The consistency between the corrected Gaia XP spectra and the MILES and LEMONY is better than 2 per cent in the wavelength range of $336-400$ nm and 1 per cent in redder wavelengths. A global absolute calibration is also carried out by comparing the synthetic Gaia photometry from the corrected XP spectra with the corrected Gaia DR3 photometry. Our study opens up new possibilities for using XP spectra in many fields. A Python package is publicly available to do the corrections (https://doi.org/10.12149/101375 or https://github.com/HiromonGON/GaiaXPcorrection).
Submitted 22 January, 2024; v1 submitted 22 January, 2024;
originally announced January 2024.
-
Third-order exceptional line in a nitrogen-vacancy spin system
Authors:
Yang Wu,
Yunhan Wang,
Xiangyu Ye,
Wenquan Liu,
Zhibo Niu,
Chang-Kui Duan,
Ya Wang,
Xing Rong,
Jiangfeng Du
Abstract:
The exceptional points (EPs) arising from non-Hermiticity bring rich phenomena, such as exceptional nodal topologies, unidirectional invisibility, single-mode lasing, sensitivity enhancement and energy harvesting. Isolated high-order EPs have been observed to exhibit richer topological characteristics and better performance in sensing over 2nd-order EPs. Recently, high-order EP geometries, such as lines or rings formed entirely by high-order EPs, have been predicted to provide richer phenomena and advantages over stand-alone high-order EPs. However, experimental exploration of high-order EP geometries has so far been beyond reach due to the demand for more degrees of freedom in the Hamiltonian's parameter space or a higher level of symmetries. Here we report the observation of the third-order exceptional line (EL) at the atomic scale. By introducing multiple symmetries, the emergence of the third-order EL has been successfully realized with a single electron spin of a nitrogen-vacancy center in diamond. Furthermore, the behaviors of the EP structure under different symmetries are systematically investigated. The symmetries are shown to play essential roles in the occurrence of high-order EPs and the related EP geometries. Our work opens a new avenue to explore high-order EP-related topological physics at the atomic scale and the potential applications of high-order EPs in quantum technologies.
Submitted 17 January, 2024;
originally announced January 2024.
-
Mobility Accelerates Learning: Convergence Analysis on Hierarchical Federated Learning in Vehicular Networks
Authors:
Tan Chen,
Jintao Yan,
Yuxuan Sun,
Sheng Zhou,
Deniz Gündüz,
Zhisheng Niu
Abstract:
Hierarchical federated learning (HFL) enables distributed training of models across multiple devices with the help of several edge servers and a cloud edge server in a privacy-preserving manner. In this paper, we consider HFL with highly mobile devices, mainly targeting vehicular networks. Through convergence analysis, we show that mobility influences the convergence speed by both fusing the edge data and shuffling the edge models. While mobility is usually considered a challenge from the perspective of communication, we prove that it increases the convergence speed of HFL with edge-level heterogeneous data, since more diverse data can be incorporated. Furthermore, we demonstrate that a higher speed leads to faster convergence, since it accelerates the fusion of data. Simulation results show that mobility increases the model accuracy of HFL by up to 15.1% when training a convolutional neural network on the CIFAR-10 dataset.
Submitted 17 January, 2024;
originally announced January 2024.
-
Universal dynamics of the entropy of work distribution in spinor Bose-Einstein condensates
Authors:
Zhen-Xia Niu
Abstract:
Driving a quantum many-body system across a quantum phase transition (QPT) in finite time has been of interest in different branches of physics to explore various fundamental questions. Here, we analyze how the underlying QPT affects the work distribution $P(W)$ when the control parameter of a ferromagnetic spinor Bose-Einstein condensate is tuned through the critical point in finite time. We show that the work distribution undergoes a dramatic change with increasing driving time $τ$. To capture the characteristics of the work distribution, we analyze the entropy of $P(W)$ and find three different regions in the evolution of entropy as a function of $τ$. Specifically, the entropy is insensitive to the driving time in the region of very short $τ$, while it exhibits a universal power-law decay in the region with intermediate values of $τ$. In particular, the power-law scaling of the entropy is in accordance with the well-known Kibble-Zurek mechanism. For the region with large $τ$, the validity of the adiabatic perturbation theory leads to the entropy decaying as $τ^{-2}\lnτ$. Our results verify the usefulness of the entropy of the work distribution for understanding the critical dynamics and provide an alternative way to experimentally study nonequilibrium properties in quantum many-body systems.
Submitted 9 July, 2024; v1 submitted 11 January, 2024;
originally announced January 2024.
-
Masked AutoEncoder for Graph Clustering without Pre-defined Cluster Number k
Authors:
Yuanchi Ma,
Hui He,
Zhongxiang Lei,
Zhendong Niu
Abstract:
Graph clustering algorithms with autoencoder structures have recently gained popularity due to their efficient performance and low training cost. However, for existing graph autoencoder clustering algorithms based on GCN or GAT, not only do they lack good generalization ability, but also the number of clusters clustered by such autoencoder models is difficult to determine automatically. To solve this problem, we propose a new framework called Graph Clustering with Masked Autoencoders (GCMA). It employs our designed fusion autoencoder based on the graph masking method for the fusion coding of graph. It introduces our improved density-based clustering algorithm as a second decoder while decoding with multi-target reconstruction. By decoding the mask embedding, our model can capture more generalized and comprehensive knowledge. The number of clusters and clustering results can be output end-to-end while improving the generalization ability. As a nonparametric class method, extensive experiments demonstrate the superiority of GCMA over state-of-the-art baselines.
Submitted 9 January, 2024;
originally announced January 2024.
-
Coherence in resonance fluorescence
Authors:
Xu-Jie Wang,
Guoqi Huang,
Ming-Yang Li,
Yuan-Zhuo Wang,
Li Liu,
Bang Wu,
Hanqing Liu,
Haiqiao Ni,
Zhichuan Niu,
Weijie Ji,
Rongzhen Jiao,
Hua-Lei Yin,
Zhiliang Yuan
Abstract:
Resonance fluorescence (RF) of a two-level emitter displays persistent anti-bunching irrespective of the excitation intensity, but inherits the driving laser's linewidth under weak excitation. These properties were commonly explained separately, as the emitter's single-photon saturation or passive scattering of light, until a recent theory attributed anti-bunching to the laser-like spectrum's interference with the incoherently scattered light. However, the theory implies higher-order scattering processes, and led to an experiment purporting to validate an atom's simultaneous scattering of two photons. If true, it could complicate RF's prospects in quantum information applications. Here, we propose a unified model that treats all RF photons as spontaneous emission, one at a time, and can explain simultaneously both the RF's spectral and correlation properties. We theoretically derive the excitation power dependencies, with the strongest effects measurable at the single-photon incidence level, of the first-order coherence of the whole RF and super-bunching of the spectrally filtered, followed by experimental confirmation on a semiconductor quantum dot micro-pillar device. Furthermore, our model explains peculiar coincidence bunching observed in phase-dependent two-photon interference experiments. Our work provides novel understandings of coherent light-matter interaction and may stimulate new applications.
Submitted 30 May, 2024; v1 submitted 21 December, 2023;
originally announced December 2023.
-
Calibration of metallicity of LAMOST M dwarf stars Using FGK+M wide binaries
Authors:
Dan Qiu,
Jiadong Li,
Bo Zhang,
Chao Liu,
Haijun Tian,
Zexi Niu
Abstract:
Estimating precise metallicity of M dwarfs is a well-known difficult problem due to their complex spectra. In this work, we empirically calibrate the metallicity using wide binaries with an F, G, or K dwarf and an M dwarf companion. With 1308 FGK+M wide binaries well observed by LAMOST, we calibrated M dwarf's [Fe/H] by using the Stellar LAbel Machine (SLAM) model, a data-driven method based on support vector regression (SVR). The [Fe/H] labels of the training data are from FGK companions in the range of [-1,0.5] dex. The Teffs are selected from Li et al. (2021), spanning [3100,4400] K. The uncertainties in SLAM estimates of [Fe/H] and Teff are ~0.15 dex and ~40 K, respectively, at snri > 100, where snri is the signal-to-noise ratio (SNR) at i-band of M dwarf spectra. We applied the trained SLAM model to determine the [Fe/H] and Teff for ~630,000 M dwarfs with low-resolution spectra in LAMOST DR9. Compared to other literature also using FGK+M wide binaries for calibration, our [Fe/H] estimates show no bias but a scatter of ~0.14-0.18 dex. However, the [Fe/H] compared to APOGEE shows a systematic difference of ~0.10-0.15 dex with a scatter of ~0.15-0.20 dex. While the Teff compared to APOGEE has a bias of 3 K with a scatter of 62 K, it is systematically higher by 180 K compared to other calibrations based on the bolometric temperature. Finally, we calculated the zeta index for 1308 M dwarf secondaries and present a moderate correlation between zeta and [Fe/H].
Submitted 20 December, 2023;
originally announced December 2023.
-
Meili: Enabling SmartNIC as a Service in the Cloud
Authors:
Qiang Su,
Shaofeng Wu,
Zhixiong Niu,
Ran Shu,
Peng Cheng,
Yongqiang Xiong,
Chun Jason Xue,
Zaoxing Liu,
Hong Xu
Abstract:
SmartNICs are touted as an attractive substrate for network application offloading, offering benefits in programmability, host resource saving, and energy efficiency. The current usage restricts offloading to local hosts and confines SmartNIC ownership to individual application teams, resulting in poor resource efficiency and scalability. This paper presents Meili, a novel system that realizes SmartNIC as a service to address these issues. Meili organizes heterogeneous SmartNIC resources as a pool and offers a unified one-NIC abstraction to application developers. This allows developers to focus solely on the application logic while dynamically optimizing their performance needs. Our evaluation on NVIDIA BlueField series and AMD Pensando SmartNICs demonstrates that Meili achieves scalable single-flow throughput with a maximum 8 μs latency overhead and enhances resource efficiency by 3.07$\times$ compared to standalone deployments and 1.44$\times$ compared to state-of-the-art microservice deployments.
Submitted 24 February, 2024; v1 submitted 19 December, 2023;
originally announced December 2023.
-
Robust Few-Shot Named Entity Recognition with Boundary Discrimination and Correlation Purification
Authors:
Xiaojun Xue,
Chunxia Zhang,
Tianxiang Xu,
Zhendong Niu
Abstract:
Few-shot named entity recognition (NER) aims to recognize novel named entities in low-resource domains utilizing existing knowledge. However, the present few-shot NER models assume that the labeled data are all clean without noise or outliers, and there are few works focusing on the robustness of the cross-domain transfer learning ability to textual adversarial attacks in few-shot NER. In this work, we comprehensively explore and assess the robustness of few-shot NER models under the textual adversarial attack scenario, and find that existing few-shot NER models are vulnerable. Furthermore, we propose a robust two-stage few-shot NER method with Boundary Discrimination and Correlation Purification (BDCP). Specifically, in the span detection stage, the entity boundary discriminative module is introduced to provide a highly distinguishing boundary representation space to detect entity spans. In the entity typing stage, the correlations between entities and contexts are purified by minimizing the interference information and facilitating correlation generalization to alleviate the perturbations caused by textual adversarial attacks. In addition, we construct adversarial examples for few-shot NER based on public datasets Few-NERD and Cross-Dataset. Comprehensive evaluations on those two groups of few-shot NER datasets containing adversarial examples demonstrate the robustness and superiority of the proposed method.
Submitted 13 December, 2023;
originally announced December 2023.
-
ADOD: Adaptive Domain-Aware Object Detection with Residual Attention for Underwater Environments
Authors:
Lyes Saad Saoud,
Zhenwei Niu,
Atif Sultan,
Lakmal Seneviratne,
Irfan Hussain
Abstract:
This research presents ADOD, a novel approach to address domain generalization in underwater object detection. Our method enhances the model's ability to generalize across diverse and unseen domains, ensuring robustness in various underwater environments. The first key contribution is Residual Attention YOLOv3, a novel variant of the YOLOv3 framework empowered by residual attention modules. These modules enable the model to focus on informative features while suppressing background noise, leading to improved detection accuracy and adaptability to different domains. The second contribution is the attention-based domain classification module, vital during training. This module helps the model identify domain-specific information, facilitating the learning of domain-invariant features. Consequently, ADOD can generalize effectively to underwater environments with distinct visual characteristics. Extensive experiments on diverse underwater datasets demonstrate ADOD's superior performance compared to state-of-the-art domain generalization methods, particularly in challenging scenarios. The proposed model achieves exceptional detection performance in both seen and unseen domains, showcasing its effectiveness in handling domain shifts in underwater object detection tasks. ADOD represents a significant advancement in adaptive object detection, providing a promising solution for real-world applications in underwater environments. With the prevalence of domain shifts in such settings, the model's strong generalization ability becomes a valuable asset for practical underwater surveillance and marine research endeavors.
Submitted 11 December, 2023;
originally announced December 2023.