Showing 1–50 of 1,354 results for author: Tang, H

  1. arXiv:2407.10331  [pdf, other]

    cs.RO cs.LG eess.SY

    3D Foundation Models Enable Simultaneous Geometry and Pose Estimation of Grasped Objects

    Authors: Weiming Zhi, Haozhan Tang, Tianyi Zhang, Matthew Johnson-Roberson

    Abstract: Humans have the remarkable ability to use held objects as tools to interact with their environment. For this to occur, humans internally estimate how hand movements affect the object's movement. We wish to endow robots with this capability. We contribute methodology to jointly estimate the geometry and pose of objects grasped by a robot, from RGB images captured by an external camera. Notably, our… ▽ More

    Submitted 14 July, 2024; originally announced July 2024.

  2. arXiv:2407.10061  [pdf, other]

    cs.CV

    InfiniMotion: Mamba Boosts Memory in Transformer for Arbitrary Long Motion Generation

    Authors: Zeyu Zhang, Akide Liu, Qi Chen, Feng Chen, Ian Reid, Richard Hartley, Bohan Zhuang, Hao Tang

    Abstract: Text-to-motion generation holds potential for film, gaming, and robotics, yet current methods often prioritize short motion generation, making it challenging to produce long motion sequences effectively: (1) Current methods struggle to handle long motion sequences as a single input due to prohibitively high computational cost; (2) Breaking down the generation of long motion sequences into shorter… ▽ More

    Submitted 13 July, 2024; originally announced July 2024.

  3. arXiv:2407.09826  [pdf, other]

    cs.CV

    3D Weakly Supervised Semantic Segmentation with 2D Vision-Language Guidance

    Authors: Xiaoxu Xu, Yitian Yuan, Jinlong Li, Qiudan Zhang, Zequn Jie, Lin Ma, Hao Tang, Nicu Sebe, Xu Wang

    Abstract: In this paper, we propose 3DSS-VLG, a weakly supervised approach for 3D Semantic Segmentation with 2D Vision-Language Guidance, an alternative approach that a 3D model predicts dense-embedding for each point which is co-embedded with both the aligned image and text spaces from the 2D vision-language model. Specifically, our method exploits the superior generalization ability of the 2D vision-langu… ▽ More

    Submitted 13 July, 2024; originally announced July 2024.

  4. arXiv:2407.09648  [pdf, other]

    cs.CV

    3x2: 3D Object Part Segmentation by 2D Semantic Correspondences

    Authors: Anh Thai, Weiyao Wang, Hao Tang, Stefan Stojanov, Matt Feiszli, James M. Rehg

    Abstract: 3D object part segmentation is essential in computer vision applications. While substantial progress has been made in 2D object part segmentation, the 3D counterpart has received less attention, in part due to the scarcity of annotated 3D datasets, which are expensive to collect. In this work, we propose to leverage a few annotated 3D shapes or richly annotated 2D datasets to perform 3D object par… ▽ More

    Submitted 12 July, 2024; originally announced July 2024.

    Comments: Accepted to ECCV 2024

  5. arXiv:2407.09464  [pdf, other]

    physics.optics physics.app-ph

    Symmetric Second-Harmonic Generation in Sub-wavelength Periodically Poled Thin Film Lithium Niobate

    Authors: Fengyan Yang, Juanjuan Lu, Mohan Shen, Guangcanlan Yang, Hong X. Tang

    Abstract: Second harmonic generation (SHG) extensively employs periodically poled nonlinear crystals through forward quasi-phase-matching to achieve efficient frequency conversion. As poling periods approach sub-micrometers, backward quasi-phase-matching has also been demonstrated, albeit by utilizing pulsed laser drives. The realization of symmetric second harmonic generation, characterized by counterpropa… ▽ More

    Submitted 12 July, 2024; originally announced July 2024.

  6. arXiv:2407.08952  [pdf, other]

    cs.CL cs.AI

    Detect, Investigate, Judge and Determine: A Novel LLM-based Framework for Few-shot Fake News Detection

    Authors: Ye Liu, Jiajun Zhu, Kai Zhang, Haoyu Tang, Yanghai Zhang, Xukai Liu, Qi Liu, Enhong Chen

    Abstract: Few-Shot Fake News Detection (FS-FND) aims to distinguish inaccurate news from real ones in extremely low-resource scenarios. This task has garnered increased attention due to the widespread dissemination and harmful impact of fake news on social media. Large Language Models (LLMs) have demonstrated competitive performance with the help of their rich prior knowledge and excellent in-context learni… ▽ More

    Submitted 11 July, 2024; originally announced July 2024.

  7. arXiv:2407.08155  [pdf, other]

    hep-th

    Giant graviton expansion from eigenvalue instantons

    Authors: Yiming Chen, Raghu Mahajan, Haifeng Tang

    Abstract: Recently, S. Murthy has proposed a convergent expansion of free partition functions and superconformal indices of finite-$N$ purely adjoint gauge theories based on a Fredholm determinant expansion. This expansion has been dubbed the giant graviton expansion and takes the form of an infinite series of corrections to the $N=\infty$ result, with the $m^\text{th}$ correction being of order $e^{-mN}$.… ▽ More

    Submitted 10 July, 2024; originally announced July 2024.

    Comments: 12 pages, 1 figure

  8. arXiv:2407.07589  [pdf]

    cs.RO

    MSC-LIO: An MSCKF-Based LiDAR-Inertial Odometry with Same-Plane-Point Tracking

    Authors: Tisheng Zhang, Man Yuan, Linfu Wei, Hailiang Tang, Xiaoji Niu

    Abstract: The multi-state constraint Kalman filter (MSCKF) has been proven to be more efficient than graph optimization for visual-based odometry while with similar accuracy. However, it has not yet been properly considered and studied for LiDAR-based odometry. In this paper, we propose a novel tightly coupled LiDAR-inertial odometry based on the MSCKF framework, named MSC-LIO. An efficient LiDAR same-plane… ▽ More

    Submitted 10 July, 2024; originally announced July 2024.

    Comments: 9 pages

  9. arXiv:2407.06780  [pdf, other]

    cs.CV

    CoLA: Conditional Dropout and Language-driven Robust Dual-modal Salient Object Detection

    Authors: Shuang Hao, Chunlin Zhong, He Tang

    Abstract: The depth/thermal information is beneficial for detecting salient objects with conventional RGB images. However, in dual-modal salient object detection (SOD) models, the robustness against noisy inputs and modality missing is crucial but rarely studied. To tackle this problem, we introduce the Conditional Dropout and LAnguage-driven (CoLA) framework comprising two core componen… ▽ More

    Submitted 9 July, 2024; originally announced July 2024.

  10. arXiv:2407.05568  [pdf, other]

    math.NA

    High-order accurate entropy stable schemes for compressible Euler equations with van der Waals equation of state on adaptive moving meshes

    Authors: Shangting Li, Huazhong Tang

    Abstract: This paper develops the high-order entropy stable (ES) finite difference schemes for multi-dimensional compressible Euler equations with the van der Waals equation of state (EOS) on adaptive moving meshes. Semi-discrete schemes are first nontrivially constructed built on the newly derived high-order entropy conservative (EC) fluxes in curvilinear coordinates and scaled eigenvector matrices as well… ▽ More

    Submitted 7 July, 2024; originally announced July 2024.

    Comments: 3p pages, 16 figures, 5 tables

  11. GMC: A General Framework of Multi-stage Context Learning and Utilization for Visual Detection Tasks

    Authors: Xuan Wang, Hao Tang, Zhigang Zhu

    Abstract: Various contextual information has been employed by many approaches for visual detection tasks. However, most of the existing approaches only focus on specific context for specific tasks. In this paper, GMC, a general framework is proposed for multistage context learning and utilization, with various deep network architectures for various visual detection tasks. The GMC framework encompasses three… ▽ More

    Submitted 7 July, 2024; originally announced July 2024.

  12. arXiv:2407.05468  [pdf, other]

    physics.app-ph

    Non-contact excitation of multi-GHz lithium niobate electromechanical resonators

    Authors: Danqing Wang, Jiacheng Xie, Yu Guo, Mohan Shen, Hong X. Tang

    Abstract: The demand for high-performance electromechanical resonators is ever-growing across diverse applications, ranging from sensing and time-keeping to advanced communication devices. Among the electromechanical materials being explored, thin-film lithium niobate stands out for its strong piezoelectric properties and low acoustic loss. However, in nearly all existing lithium niobate electromechanical d… ▽ More

    Submitted 7 July, 2024; originally announced July 2024.

    Comments: 6 pages, 4 figures

  13. arXiv:2407.05047  [pdf, other]

    cs.AI

    MFE-ETP: A Comprehensive Evaluation Benchmark for Multi-modal Foundation Models on Embodied Task Planning

    Authors: Min Zhang, Jianye Hao, Xian Fu, Peilong Han, Hao Zhang, Lei Shi, Hongyao Tang, Yan Zheng

    Abstract: In recent years, Multi-modal Foundation Models (MFMs) and Embodied Artificial Intelligence (EAI) have been advancing side by side at an unprecedented pace. The integration of the two has garnered significant attention from the AI research community. In this work, we attempt to provide an in-depth and comprehensive evaluation of the performance of MFMs on embodied task planning, aiming to shed lig… ▽ More

    Submitted 6 July, 2024; originally announced July 2024.

  14. arXiv:2407.04033  [pdf]

    physics.comp-ph

    Exploring control of the emergent exciton insulator state in 1T-TiSe$_2$ monolayer by state-of-the-art theory models

    Authors: Hong Tang, Li Yin, Gábor I. Csonka, Adrienn Ruzsinszky

    Abstract: The layered transition metal dichalcogenide 1T-TiSe$_2$ is of great research interest, having intriguing properties of charge density waves (CDW) and superconductivity under doping or pressurizing. The monolayer form of 1T-TiSe$_2$ also shows a CDW with a higher transition temperature T_c than the bulk, indicating a stronger CDW interaction. By using the meta-generalized gradient approximation (me… ▽ More

    Submitted 4 July, 2024; originally announced July 2024.

  15. arXiv:2407.03712  [pdf, other]

    math.NA

    A second-order direct Eulerian GRP scheme for ten-moment Gaussian closure equations with source terms

    Authors: Jiangfu Wang, Huazhong Tang

    Abstract: This paper proposes a second-order accurate direct Eulerian generalized Riemann problem (GRP) scheme for the ten-moment Gaussian closure equations with source terms. The generalized Riemann invariants associated with the rarefaction waves, the contact discontinuity and the shear waves are given, and the 1D exact Riemann solver is obtained. After that, the generalized Riemann invariants and the Ran… ▽ More

    Submitted 4 July, 2024; originally announced July 2024.

    Comments: 54 pages, 20 figures, 2 tables

  16. arXiv:2407.03110  [pdf, other]

    cs.SD cs.AI eess.AS

    A Toolchain for Comprehensive Audio/Video Analysis Using Deep Learning Based Multimodal Approach (A use case of riot or violent context detection)

    Authors: Lam Pham, Phat Lam, Tin Nguyen, Hieu Tang, Alexander Schindler

    Abstract: In this paper, we present a toolchain for a comprehensive audio/video analysis by leveraging deep learning based multimodal approach. To this end, different specific tasks of Speech to Text (S2T), Acoustic Scene Classification (ASC), Acoustic Event Detection (AED), Visual Object Detection (VOD), Image Captioning (IC), and Video Captioning (VC) are conducted and integrated into the toolchain. By co… ▽ More

    Submitted 2 May, 2024; originally announced July 2024.

  17. arXiv:2407.02052  [pdf, other]

    eess.AS cs.SD

    The USTC-NERCSLIP Systems for The ICMC-ASR Challenge

    Authors: Minghui Wu, Luzhen Xu, Jie Zhang, Haitao Tang, Yanyan Yue, Ruizhi Liao, Jintao Zhao, Zhengzhe Zhang, Yichi Wang, Haoyin Yan, Hongliang Yu, Tongle Ma, Jiachen Liu, Chongliang Wu, Yongchao Li, Yanyong Zhang, Xin Fang, Yue Zhang

    Abstract: This report describes the submitted system to the In-Car Multi-Channel Automatic Speech Recognition (ICMC-ASR) challenge, which considers the ASR task with multi-speaker overlapping and Mandarin accent dynamics in the ICMC case. We implement the front-end speaker diarization using the self-supervised learning representation based multi-speaker embedding and beamforming using the speaker position,… ▽ More

    Submitted 2 July, 2024; originally announced July 2024.

    Comments: Accepted at ICASSP 2024

  18. arXiv:2407.02028  [pdf, other]

    cs.CL cs.AI cs.IR cs.LG

    Why does in-context learning fail sometimes? Evaluating in-context learning on open and closed questions

    Authors: Xiang Li, Haoran Tang, Siyu Chen, Ziwei Wang, Ryan Chen, Marcin Abram

    Abstract: We measure the performance of in-context learning as a function of task novelty and difficulty for open and closed questions. For that purpose, we created a novel benchmark consisting of hard scientific questions, each paired with a context of various relevancy. We show that counter-intuitively, a context that is more aligned with the topic does not always help more than a less relevant context. T… ▽ More

    Submitted 2 July, 2024; originally announced July 2024.

    Comments: 8 pages plus references, 4 main figures, 6 pages of supplementary material

  19. arXiv:2407.01963  [pdf, other]

    eess.AS

    Towards Unsupervised Speaker Diarization System for Multilingual Telephone Calls Using Pre-trained Whisper Model and Mixture of Sparse Autoencoders

    Authors: Phat Lam, Lam Pham, Tin Nguyen, Hieu Tang, Thinh Pham, Loi Khanh Nguyen, Alexander Schindler

    Abstract: Existing speaker diarization systems heavily rely on large amounts of manually annotated data, which is labor-intensive and challenging to collect in real-world scenarios. Additionally, the language-specific constraint in speaker diarization systems significantly hinders their applicability and scalability in multilingual settings. In this paper, we therefore propose a cluster-based speaker diariz… ▽ More

    Submitted 7 July, 2024; v1 submitted 2 July, 2024; originally announced July 2024.

    Comments: 8 pages, 7 figures

  20. arXiv:2406.19827  [pdf, other]

    cs.LG

    Towards Stable and Storage-efficient Dataset Distillation: Matching Convexified Trajectory

    Authors: Wenliang Zhong, Haoyu Tang, Qinghai Zheng, Mingzhu Xu, Yupeng Hu, Liqiang Nie

    Abstract: The rapid evolution of deep learning and large language models has led to an exponential growth in the demand for training data, prompting the development of Dataset Distillation methods to address the challenges of managing large datasets. Among these, Matching Training Trajectories (MTT) has been a prominent approach, which replicates the training trajectory of an expert network on real data wit… ▽ More

    Submitted 28 June, 2024; originally announced June 2024.

    Comments: 11 pages

  21. arXiv:2406.17632  [pdf, other]

    nucl-th hep-ph

    Transient spin modes from relaxational axial kinetic theory

    Authors: Shu Lin, Haiqin Tang

    Abstract: We study the dynamics of spin mode by solving the axial kinetic equations under the relaxation time approximation in the presence of dissipative sources. We find transient spin modes in response to electric field with spacetime inhomogeneity, fluid acceleration and shear. To the lowest order in spatial gradient $k$, we find the responses to electric field and acceleration can be interpreted as ret… ▽ More

    Submitted 25 June, 2024; originally announced June 2024.

    Comments: 20 pages, 1 figure

  22. arXiv:2406.16855  [pdf, other]

    cs.CV

    DreamBench++: A Human-Aligned Benchmark for Personalized Image Generation

    Authors: Yuang Peng, Yuxin Cui, Haomiao Tang, Zekun Qi, Runpei Dong, Jing Bai, Chunrui Han, Zheng Ge, Xiangyu Zhang, Shu-Tao Xia

    Abstract: Personalized image generation holds great promise in assisting humans in everyday work and life due to its impressive function in creatively generating personalized content. However, current evaluations either are automated but misalign with humans or require human evaluations that are time-consuming and expensive. In this work, we present DreamBench++, a human-aligned benchmark automated by advan… ▽ More

    Submitted 24 June, 2024; originally announced June 2024.

    Comments: Project page: https://dreambenchplus.github.io/

  23. arXiv:2406.14535  [pdf, other]

    stat.ME math.ST

    On estimation and order selection for multivariate extremes via clustering

    Authors: Shiyuan Deng, He Tang, Shuyang Bai

    Abstract: We investigate the estimation of multivariate extreme models with a discrete spectral measure using spherical clustering techniques. The primary contribution involves devising a method for selecting the order, that is, the number of clusters. The method consistently identifies the true order, i.e., the number of spectral atoms, and enjoys intuitive implementation in practice. Specifically, we intr… ▽ More

    Submitted 20 June, 2024; originally announced June 2024.

    Comments: 31 pages, 12 figures

    MSC Class: 62G32 (Primary); 60G70 (Secondary)

  24. arXiv:2406.13201  [pdf, other]

    cs.LG cs.SI

    Toward Structure Fairness in Dynamic Graph Embedding: A Trend-aware Dual Debiasing Approach

    Authors: Yicong Li, Yu Yang, Jiannong Cao, Shuaiqi Liu, Haoran Tang, Guandong Xu

    Abstract: Recent studies successfully learned static graph embeddings that are structurally fair by preventing the effectiveness disparity of high- and low-degree vertex groups in downstream graph mining tasks. However, achieving structure fairness in dynamic graph embedding remains an open problem. Neglecting degree changes in dynamic graphs will significantly impair embedding effectiveness without notably… ▽ More

    Submitted 19 June, 2024; originally announced June 2024.

  25. arXiv:2406.12208  [pdf, other]

    cs.CL cs.AI cs.CV cs.NE

    Knowledge Fusion By Evolving Weights of Language Models

    Authors: Guodong Du, Jing Li, Hanting Liu, Runhua Jiang, Shuyang Yu, Yifei Guo, Sim Kuan Goh, Ho-Kin Tang

    Abstract: Fine-tuning pre-trained language models, particularly large language models, demands extensive computing resources and can result in varying performance outcomes across different domains and datasets. This paper examines the approach of integrating multiple models from diverse training scenarios into a unified model. This unified model excels across various data domains and exhibits the ability to… ▽ More

    Submitted 17 June, 2024; originally announced June 2024.

    Comments: Accepted by ACL2024 Findings

  26. arXiv:2406.12200  [pdf, other]

    cs.LG cs.DC cs.ET cs.MM cs.NE

    SFedCA: Credit Assignment-Based Active Client Selection Strategy for Spiking Federated Learning

    Authors: Qiugang Zhan, Jinbo Cao, Xiurui Xie, Malu Zhang, Huajin Tang, Guisong Liu

    Abstract: Spiking federated learning is an emerging distributed learning paradigm that allows resource-constrained devices to train collaboratively at low power consumption without exchanging local data. It takes advantage of both the privacy computation property in federated learning (FL) and the energy efficiency in spiking neural networks (SNN). Thus, it is highly promising to revolutionize the efficient… ▽ More

    Submitted 17 June, 2024; originally announced June 2024.

    Comments: 9 pages

  27. arXiv:2406.12124  [pdf]

    cond-mat.mtrl-sci cond-mat.other

    Inside the Working Mechanism of Meta-generalized Gradient Density Functional Approximations: The Example of Quantum Spin-Hall Insulator 1T'-WTe2

    Authors: Li Yin, Hong Tang, Adrienn Ruzsinszky

    Abstract: Quantum spin Hall (QSH) insulators have attracted intensive experimental and theoretical studies due to their beneficial applications in spintronic devices. Density functional theory (DFT) meets challenges when describing the electronic structure of QSH materials. Only the Heyd-Scuseria-Ernzerhof (HSE06) with spin-orbit coupling (SOC) is effective in revealing the band opening in the typical QSH 1… ▽ More

    Submitted 17 June, 2024; originally announced June 2024.

    Comments: 15 pages, 5 figures, and Supplementary Materials

  28. arXiv:2406.11320  [pdf, other]

    hep-th cond-mat.dis-nn quant-ph

    Brownian Gaussian Unitary Ensemble: non-equilibrium dynamics, efficient $k$-design and application in classical shadow tomography

    Authors: Haifeng Tang

    Abstract: We construct and extensively study a Brownian generalization of the Gaussian Unitary Ensemble (BGUE). Our analysis begins with the non-equilibrium dynamics of BGUE, where we derive explicit analytical expressions for various one-replica and two-replica variables at finite $N$ and $t$. These variables include the spectral form factor and its fluctuation, the two-point function and its fluctuation,… ▽ More

    Submitted 17 June, 2024; originally announced June 2024.

    Comments: 27+6 pages, many figures

  29. arXiv:2406.11291  [pdf, ps, other]

    quant-ph

    Simulation of chiral motion of excitation within the ground-state manifolds of neutral atoms

    Authors: Hao-Yuan Tang, Xiao-Xuan Li, Jia-Bin You, Xiao-Qiang Shao

    Abstract: Laser-induced gauge fields in neutral atoms serve as a means of mimicking the effects of a magnetic field, providing researchers with a platform to explore behaviors analogous to those observed in condensed matter systems under real magnetic fields. Here, we propose a method to generate chiral motion in atomic excitations within the neutral atomic ground-state manifolds. This is achieved through t… ▽ More

    Submitted 17 June, 2024; originally announced June 2024.

    Comments: 9 pages, 7 figures, revised manuscript submitted to APL Quantum

  30. arXiv:2406.09200  [pdf, other]

    cs.CL

    Orthogonality and isotropy of speaker and phonetic information in self-supervised speech representations

    Authors: Mukhtar Mohamed, Oli Danyi Liu, Hao Tang, Sharon Goldwater

    Abstract: Self-supervised speech representations can hugely benefit downstream speech technologies, yet the properties that make them useful are still poorly understood. Two candidate properties related to the geometry of the representation space have been hypothesized to correlate well with downstream tasks: (1) the degree of orthogonality between the subspaces spanned by the speaker centroids and phone ce… ▽ More

    Submitted 13 June, 2024; originally announced June 2024.

    Comments: Accepted to Interspeech

  31. arXiv:2406.08131  [pdf, other]

    cond-mat.str-el cond-mat.quant-gas cond-mat.supr-con

    $d_{x^2-y^2}$-wave Bose Metal induced by the next-nearest-neighbor hopping t'

    Authors: Zhangkai Cao, Jianyu Li, Jiahao Su, Tao Ying, Ho-Kin Tang

    Abstract: Superconductivity arises when electrons form Cooper pairs with phase coherence. In contrast, a lack of phase coherence in Cooper pairs can lead to an uncondensed metallic ground state known as the Bose metal state. In this study, we investigate an attractively interacting fermionic system with nearest-neighbor (NN) hopping (t) and next-nearest-neighbor (NNN) hopping (t') anisotropy between two spe… ▽ More

    Submitted 12 June, 2024; originally announced June 2024.

    Comments: 13 pages, 11 figures. arXiv admin note: substantial text overlap with arXiv:2405.13405

  32. arXiv:2406.06579  [pdf, other]

    cs.CL cs.AI cs.CV

    From Redundancy to Relevance: Enhancing Explainability in Multimodal Large Language Models

    Authors: Xiaofeng Zhang, Chen Shen, Xiaosong Yuan, Shaotian Yan, Liang Xie, Wenxiao Wang, Chaochen Gu, Hao Tang, Jieping Ye

    Abstract: Recently, multimodal large language models have exploded with an endless variety, most of the popular Large Vision Language Models (LVLMs) depend on sequential visual representation, where images are converted into hundreds or thousands of tokens before being input into the Large Language Model (LLM) along with language prompts. The black-box design hinders the interpretability of visual-language… ▽ More

    Submitted 13 June, 2024; v1 submitted 4 June, 2024; originally announced June 2024.

  33. arXiv:2406.06020  [pdf, other]

    cond-mat.mtrl-sci

    Effect of Strain on the Band Gap of Monolayer MoS$_2$

    Authors: Raj K. Sah, Hong Tang, Chandra Shahi, Adrienn Ruzsinszky, John P. Perdew

    Abstract: Monolayer $\mathrm{MoS_2}$ under strain has many interesting properties and possible applications in technology. A recent experimental study examined the effect of strain on the bandgap of monolayer $\mathrm{MoS_2}$ on a mildly curved graphite surface, reporting that under biaxial strain with a Poisson's ratio of 0.44, the bandgap decreases at a rate of 400 meV/% strain. In this work, we performed… ▽ More

    Submitted 11 June, 2024; v1 submitted 10 June, 2024; originally announced June 2024.

  34. arXiv:2406.05464  [pdf, other]

    cs.SD cs.AI cs.LG eess.AS

    DAISY: Data Adaptive Self-Supervised Early Exit for Speech Representation Models

    Authors: Tzu-Quan Lin, Hung-yi Lee, Hao Tang

    Abstract: Self-supervised speech models have shown to be useful for various tasks, but their large size limits the use in devices with low computing power and memory. In this work, we explore early exit, an approach for reducing latency by exiting the forward process of a network early. Most approaches of early exit need a separate early exit model for each task, with some even requiring fine-tuning of the… ▽ More

    Submitted 8 June, 2024; originally announced June 2024.

    Comments: Accepted by Interspeech 2024

  35. arXiv:2406.04489  [pdf, other]

    physics.plasm-ph physics.acc-ph physics.app-ph physics.optics

    Efficient backward x-ray emission in a finite-length plasma irradiated by a laser pulse of ps duration

    Authors: I-Lin Yeh, Kavin Tangtartharakul, Hongmei Tang, Louise Willingale, Alexey Arefiev

    Abstract: Motivated by experiments employing ps-long, kilojoule laser pulses, we examined x-ray emission in a finite-length underdense plasma irradiated by such a pulse using two dimensional particle-in-cell simulations. We found that, in addition to the expected forward emission, the plasma also efficiently emits in the backward direction. Our simulations reveal that the backward emission occurs when the l… ▽ More

    Submitted 6 June, 2024; originally announced June 2024.

  36. arXiv:2406.02349  [pdf, other]

    cs.NE cs.AI cs.CV

    CADE: Cosine Annealing Differential Evolution for Spiking Neural Network

    Authors: Runhua Jiang, Guodong Du, Shuyang Yu, Yifei Guo, Sim Kuan Goh, Ho-Kin Tang

    Abstract: Spiking neural networks (SNNs) have gained prominence for their potential in neuromorphic computing and energy-efficient artificial intelligence, yet optimizing them remains a formidable challenge for gradient-based methods due to their discrete, spike-based computation. This paper attempts to tackle the challenges by introducing Cosine Annealing Differential Evolution (CADE), designed to modulate… ▽ More

    Submitted 4 June, 2024; originally announced June 2024.

  37. arXiv:2406.02236  [pdf, other]

    quant-ph physics.optics

    Demonstration of superior communication through thermodynamically free channels in an optical quantum switch

    Authors: Hao Tang, Yu Guo, Xiao-Min Hu, Yun-Feng Huang, Bi-Heng Liu, Chuan-Feng Li, Guang-Can Guo

    Abstract: The release of causal structure of physical events from a well-defined order to an indefinite one stimulates remarkable enhancements in various quantum information tasks. Some of these advantages, however, are questioned for the ambiguous role of the control system in the quantum switch that is an experimentally realized process with indefinite causal structure. In communications, for example, not… ▽ More

    Submitted 4 June, 2024; originally announced June 2024.

  38. arXiv:2406.01883  [pdf, other]

    cs.NE cs.HC

    Context Gating in Spiking Neural Networks: Achieving Lifelong Learning through Integration of Local and Global Plasticity

    Authors: Jiangrong Shen, Wenyao Ni, Qi Xu, Gang Pan, Huajin Tang

    Abstract: Humans learn multiple tasks in succession with minimal mutual interference, through the context gating mechanism in the prefrontal cortex (PFC). The brain-inspired models of spiking neural networks (SNN) have drawn massive attention for their energy efficiency and biological plausibility. To overcome catastrophic forgetting when learning multiple tasks in sequence, current SNN models for lifelong… ▽ More

    Submitted 3 June, 2024; originally announced June 2024.

  39. arXiv:2405.20092  [pdf, other]

    cs.CL cs.SE

    Divide-and-Conquer Meets Consensus: Unleashing the Power of Functions in Code Generation

    Authors: Jingchang Chen, Hongxuan Tang, Zheng Chu, Qianglong Chen, Zekun Wang, Ming Liu, Bing Qin

    Abstract: Despite recent progress made by large language models in code generation, they still struggle with programs that meet complex requirements. Recent work utilizes plan-and-solve decomposition to decrease the complexity and leverage self-tests to refine the generated program. Yet, planning deep-inside requirements in advance can be challenging, and the tests need to be accurate to accomplish self-imp… ▽ More

    Submitted 30 May, 2024; originally announced May 2024.

  40. arXiv:2405.19465  [pdf, other]

    cs.CV

    RAP: Efficient Text-Video Retrieval with Sparse-and-Correlated Adapter

    Authors: Meng Cao, Haoran Tang, Jinfa Huang, Peng Jin, Can Zhang, Ruyang Liu, Long Chen, Xiaodan Liang, Li Yuan, Ge Li

    Abstract: Text-Video Retrieval (TVR) aims to align relevant video content with natural language queries. To date, most state-of-the-art TVR methods learn image-to-video transfer learning based on large-scale pre-trained vision-language models (e.g., CLIP). However, fully fine-tuning these pre-trained models for TVR incurs prohibitively expensive computation costs. To this end, we propose to conduct efficient… ▽ More

    Submitted 29 May, 2024; originally announced May 2024.

    Comments: Accepted by ACL 2024 Findings

  41. arXiv:2405.19299  [pdf, other]

    cs.CL

    Expert-Guided Extinction of Toxic Tokens for Debiased Generation

    Authors: Xueyao Sun, Kaize Shi, Haoran Tang, Guandong Xu, Qing Li

    Abstract: Large language models (LLMs) can elicit social bias during generations, especially when inference with toxic prompts. Controlling the sensitive attributes in generation encounters challenges in data distribution, generalizability, and efficiency. Specifically, fine-tuning and retrieval demand extensive unbiased corpus, while direct prompting requires meticulously curated instructions for correctin… ▽ More

    Submitted 29 May, 2024; originally announced May 2024.

  42. arXiv:2405.18347  [pdf, other]

    cs.LG

    Dataset Growth

    Authors: Ziheng Qin, Zhaopan Xu, Yukun Zhou, Zangwei Zheng, Zebang Cheng, Hao Tang, Lei Shang, Baigui Sun, Xiaojiang Peng, Radu Timofte, Hongxun Yao, Kai Wang, Yang You

    Abstract: Deep learning benefits from the growing abundance of available data. Meanwhile, efficiently dealing with the growing data scale has become a challenge. Data publicly available are from different sources with various qualities, and it is impractical to do manual cleaning against noise and redundancy given today's data scale. There are existing techniques for cleaning/selecting the collected data. H… ▽ More

    Submitted 28 May, 2024; originally announced May 2024.

  43. Reliable Object Tracking by Multimodal Hybrid Feature Extraction and Transformer-Based Fusion

    Authors: Hongze Sun, Rui Liu, Wuque Cai, Jun Wang, Yue Wang, Huajin Tang, Yan Cui, Dezhong Yao, Daqing Guo

    Abstract: Visual object tracking, which is primarily based on visible light image sequences, encounters numerous challenges in complicated scenarios, such as low light conditions, high dynamic ranges, and background clutter. To address these challenges, incorporating the advantages of multiple visual modalities is a promising solution for achieving reliable object tracking. However, the existing approaches… ▽ More

    Submitted 28 May, 2024; originally announced May 2024.

    Comments: 16 pages, 7 figures, 9 tables; This work has been submitted for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible

  44. arXiv:2405.17895  [pdf, ps, other]

    math.AP

    Enhanced dissipation and temporal decay in the Euler-Poisson-Navier-Stokes equations

    Authors: Young-Pil Choi, Houzhi Tang, Weiyuan Zou

    Abstract: This paper investigates the global well-posedness and large-time behavior of solutions for a coupled fluid model in $\mathbb{R}^3$ consisting of the isothermal compressible Euler-Poisson system and incompressible Navier-Stokes equations coupled through the drag force. Notably, we exploit the dissipation effects inherent in the Poisson equation to achieve a faster decay of fluid density compared to…

    Submitted 28 May, 2024; originally announced May 2024.

    MSC Class: 35B40; 35B65; 76N10

  45. arXiv:2405.17727  [pdf, ps, other]

    math.AP

    Optimal stability of Hardy-Littlewood-Sobolev and Sobolev inequalities of arbitrary orders with dimension-dependent constants

    Authors: Lu Chen, Guozhen Lu, Hanli Tang

    Abstract: Dolbeault-Esteban-Figalli-Frank-Loss [19] and Chen-Lu-Tang [17] established the optimal asymptotic lower bound for stability of the first-order Sobolev inequality and fractional Sobolev inequality of order $s$ for $0<s<1$ respectively. However, it left the problem of the optimal lower bound for stability of high-order Sobolev inequality and high-order fractional Sobolev inequality unsolved. The pu…

    Submitted 27 May, 2024; originally announced May 2024.

    Comments: 38 pages

  46. arXiv:2405.17503  [pdf, other]

    cs.SE cs.AI cs.CL cs.PL

    Code Repair with LLMs gives an Exploration-Exploitation Tradeoff

    Authors: Hao Tang, Keya Hu, Jin Peng Zhou, Sicheng Zhong, Wei-Long Zheng, Xujie Si, Kevin Ellis

    Abstract: Iteratively improving and repairing source code with large language models (LLMs), known as refinement, has emerged as a popular way of generating programs that would be too complex to construct in one shot. Given a bank of test cases, together with a candidate program, an LLM can improve that program by being prompted with failed test cases. But it remains an open question how to best iteratively…

    Submitted 30 May, 2024; v1 submitted 26 May, 2024; originally announced May 2024.

  47. arXiv:2405.15927  [pdf]

    eess.SP cs.NE eess.SY

    Application based Evaluation of an Efficient Spike-Encoder, "Spiketrum"

    Authors: MHD Anas Alsakkal, Runze Wang, Jayawan Wijekoon, Huajin Tang

    Abstract: Spike-based encoders represent information as sequences of spikes or pulses, which are transmitted between neurons. A prevailing consensus suggests that spike-based approaches demonstrate exceptional capabilities in capturing the temporal dynamics of neural activity and have the potential to provide energy-efficient solutions for low-power applications. The Spiketrum encoder efficiently compresses…

    Submitted 31 May, 2024; v1 submitted 24 May, 2024; originally announced May 2024.

    Comments: To be published at "IEEE/ACM Transactions on Audio, Speech, and Language Processing"

  48. arXiv:2405.13405  [pdf, other]

    cond-mat.supr-con cond-mat.quant-gas cond-mat.str-el

    Exotic d-wave Bose Metal in two dimensions

    Authors: Zhangkai Cao, Jiahao Su, Jianyu Li, Tao Ying, WanSheng Wang, Jin-Hua Sun, Ho-Kin Tang, Haiqing Lin

    Abstract: The Landau Fermi liquid theory, a cornerstone in condensed matter physics, encounters limitations in explaining certain phenomena, like the peculiar behavior of strange metals in high-temperature superconductors. Non-Fermi liquids, like Bose metals with uncondensed bosonic ground state, offer potential explanations, yet constructing an elusive Bose metal phase in two dimensions (2D) remains a form…

    Submitted 24 May, 2024; v1 submitted 22 May, 2024; originally announced May 2024.

    Comments: 15 pages, 13 figures

  49. arXiv:2405.13374  [pdf, other]

    cs.CV cs.AI

    Collaboration of Teachers for Semi-supervised Object Detection

    Authors: Liyu Chen, Huaao Tang, Yi Wen, Hanting Chen, Wei Li, Junchao Liu, Jie Hu

    Abstract: Recent semi-supervised object detection (SSOD) has achieved remarkable progress by leveraging unlabeled data for training. Mainstream SSOD methods rely on Consistency Regularization methods and Exponential Moving Average (EMA), which form a cyclic data flow. However, the EMA updating training approach leads to weight coupling between the teacher and student models. This coupling in a cyclic data f…

    Submitted 22 May, 2024; originally announced May 2024.

  50. arXiv:2405.13190  [pdf, other]

    cs.LG cs.AI

    Interpretable Spatio-Temporal Embedding for Brain Structural-Effective Network with Ordinary Differential Equation

    Authors: Haoteng Tang, Guodong Liu, Siyuan Dai, Kai Ye, Kun Zhao, Wenlu Wang, Carl Yang, Lifang He, Alex Leow, Paul Thompson, Heng Huang, Liang Zhan

    Abstract: The MRI-derived brain network serves as a pivotal instrument in elucidating both the structural and functional aspects of the brain, encompassing the ramifications of diseases and developmental processes. However, prevailing methodologies, often focusing on synchronous BOLD signals from functional MRI (fMRI), may not capture directional influences among brain regions and rarely tackle temporal fun…

    Submitted 21 May, 2024; originally announced May 2024.