Showing 1–50 of 301 results for author: Gao, K

  1. arXiv:2407.11080  [pdf, other]

    eess.SP

    Performance analysis for a rotary compressor at high speed: experimental study and mathematical modeling

    Authors: Chuntai Zheng, Wei Zhao, Benshuai Lyu, Keke Gao, Hongjun Cao, Lei Zhong, Yi Gao, Ren Liao

    Abstract: This paper conducted a comprehensive study on the performance of a rotary compressor over a rotational speed range of 80Hz to 200Hz through experimental tests and mathematical modeling. A compressor performance test rig was designed to conduct the performance tests, with fast-response pressure sensors and displacement sensors capturing the P-V diagram and dynamic motion of the moving components. R…

    Submitted 13 July, 2024; originally announced July 2024.

  2. arXiv:2407.04675  [pdf, other]

    eess.AS cs.SD

    Seed-ASR: Understanding Diverse Speech and Contexts with LLM-based Speech Recognition

    Authors: Ye Bai, Jingping Chen, Jitong Chen, Wei Chen, Zhuo Chen, Chuang Ding, Linhao Dong, Qianqian Dong, Yujiao Du, Kepan Gao, Lu Gao, Yi Guo, Minglun Han, Ting Han, Wenchao Hu, Xinying Hu, Yuxiang Hu, Deyu Hua, Lu Huang, Mingkun Huang, Youjia Huang, Jishuo Jin, Fanliu Kong, Zongwei Lan, Tianyu Li, et al. (30 additional authors not shown)

    Abstract: Modern automatic speech recognition (ASR) model is required to accurately transcribe diverse speech signals (from different domains, languages, accents, etc) given the specific contextual information in various application scenarios. Classic end-to-end models fused with extra language models perform well, but mainly in data matching scenarios and are gradually approaching a bottleneck. In this wor…

    Submitted 10 July, 2024; v1 submitted 5 July, 2024; originally announced July 2024.

  3. arXiv:2407.02773  [pdf, other]

    cs.MM

    OpenVNA: A Framework for Analyzing the Behavior of Multimodal Language Understanding System under Noisy Scenarios

    Authors: Ziqi Yuan, Baozheng Zhang, Hua Xu, Zhiyun Liang, Kai Gao

    Abstract: We present OpenVNA, an open-source framework designed for analyzing the behavior of multimodal language understanding systems under noisy conditions. OpenVNA serves as an intuitive toolkit tailored for researchers, facilitating convenience batch-level robustness evaluation and on-the-fly instance-level demonstration. It primarily features a benchmark Python library for assessing global model robus…

    Submitted 2 July, 2024; originally announced July 2024.

    Comments: 10 pages, 4 figures, to be published in ACL 2024 System Demonstration Track

  4. arXiv:2407.02411  [pdf, other]

    cs.CV cs.CR cs.MM

    Video Watermarking: Safeguarding Your Video from (Unauthorized) Annotations by Video-based LLMs

    Authors: Jinmin Li, Kuofeng Gao, Yang Bai, Jingyun Zhang, Shu-Tao Xia

    Abstract: The advent of video-based Large Language Models (LLMs) has significantly enhanced video understanding. However, it has also raised some safety concerns regarding data protection, as videos can be more easily annotated, even without authorization. This paper introduces Video Watermarking, a novel technique to protect videos from unauthorized annotations by such video-based LLMs, especially concerni…

    Submitted 2 July, 2024; v1 submitted 2 July, 2024; originally announced July 2024.

    Comments: arXiv admin note: substantial text overlap with arXiv:2403.13507

  5. arXiv:2406.12556  [pdf, other]

    cs.NI

    Towards Deep Application-Network Integration: Architectures, Progress and Opportunities

    Authors: Berta Serracanta, Kai Gao, Jordi Ros-Giralt, Alberto Rodriguez-Natal, Luis M. Contreras, Richard Yang, Albert Cabellos

    Abstract: With the rise of a new generation of applications (e.g., virtual and augmented reality, artificial intelligence, etc) demanding stringent performance requirements, the need for networking solutions and architectures that can enable a higher Quality of Experience (QoE) is becoming increasingly important. While jointly optimizing application and network may increase the applications' QoE and simul…

    Submitted 18 June, 2024; originally announced June 2024.

  6. arXiv:2406.10981  [pdf, other]

    cs.CV

    ViD-GPT: Introducing GPT-style Autoregressive Generation in Video Diffusion Models

    Authors: Kaifeng Gao, Jiaxin Shi, Hanwang Zhang, Chunping Wang, Jun Xiao

    Abstract: With the advance of diffusion models, today's video generation has achieved impressive quality. But generating temporal consistent long videos is still challenging. A majority of video diffusion models (VDMs) generate long videos in an autoregressive manner, i.e., generating subsequent clips conditioned on last frames of previous clip. However, existing approaches all involve bidirectional computa…

    Submitted 16 June, 2024; originally announced June 2024.

    Comments: Code will be available at https://github.com/Dawn-LX/Causal-VideoGen

  7. arXiv:2406.08698  [pdf, other]

    astro-ph.HE hep-ph

    Constraints on Ultra Heavy Dark Matter Properties from Dwarf Spheroidal Galaxies with LHAASO Observations

    Authors: Zhen Cao, F. Aharonian, Q. An, Axikegu, Y. X. Bai, Y. W. Bao, D. Bastieri, X. J. Bi, Y. J. Bi, J. T. Cai, Q. Cao, W. Y. Cao, Zhe Cao, J. Chang, J. F. Chang, A. M. Chen, E. S. Chen, Liang Chen, Lin Chen, Long Chen, M. J. Chen, M. L. Chen, Q. H. Chen, S. H. Chen, S. Z. Chen, et al. (255 additional authors not shown)

    Abstract: In this work we try to search for signals generated by ultra-heavy dark matter at the Large High Altitude Air Shower Observatory (LHAASO) data. We look for possible gamma-ray by dark matter annihilation or decay from 16 dwarf spheroidal galaxies in the field of view of LHAASO. Dwarf spheroidal galaxies are among the most promising targets for indirect detection of dark matter which have low fluxes…

    Submitted 12 June, 2024; originally announced June 2024.

    Comments: 17 pages, 12 figures, accepted by PRL

  8. arXiv:2406.05797  [pdf, other]

    q-bio.BM cs.AI cs.CE cs.CL cs.LG

    3D-MolT5: Towards Unified 3D Molecule-Text Modeling with 3D Molecular Tokenization

    Authors: Qizhi Pei, Lijun Wu, Kaiyuan Gao, Jinhua Zhu, Rui Yan

    Abstract: The integration of molecule and language has garnered increasing attention in molecular science. Recent advancements in Language Models (LMs) have demonstrated potential for the comprehensive modeling of molecule and language. However, existing works exhibit notable limitations. Most existing works overlook the modeling of 3D information, which is crucial for understanding molecular structures and…

    Submitted 9 June, 2024; originally announced June 2024.

    Comments: 18 pages

  9. arXiv:2406.05392  [pdf, other]

    cs.CL cs.AI cs.CY cs.LG

    Deconstructing The Ethics of Large Language Models from Long-standing Issues to New-emerging Dilemmas

    Authors: Chengyuan Deng, Yiqun Duan, Xin Jin, Heng Chang, Yijun Tian, Han Liu, Henry Peng Zou, Yiqiao Jin, Yijia Xiao, Yichen Wang, Shenghao Wu, Zongxing Xie, Kuofeng Gao, Sihong He, Jun Zhuang, Lu Cheng, Haohan Wang

    Abstract: Large Language Models (LLMs) have achieved unparalleled success across diverse language modeling tasks in recent years. However, this progress has also intensified ethical concerns, impacting the deployment of LLMs in everyday contexts. This paper provides a comprehensive survey of ethical challenges associated with LLMs, from longstanding issues such as copyright infringement, systematic bias, an…

    Submitted 8 June, 2024; originally announced June 2024.

  10. arXiv:2405.17213  [pdf]

    physics.ao-ph

    Highly inhomogeneous interactions between background climate and urban warming across typical local climate zones in heatwave and non-heatwave days

    Authors: Jing Kong, Yongling Zhao, Kai Gao, Dominik Strebel, Jan Carmeliet, Chengwang Lei

    Abstract: Urban heat island (UHI) in conjunction with heatwave (HW) leads to exacerbation of thermal stress in urban areas. Prior research on UHI and HW has predominantly concentrated on examining the thermal conditions at the surface and near-surface, with few investigations extending to the radiative and dynamical interactions of UHI and HW, particularly with a focus on the inhomogeneities across local cl…

    Submitted 27 May, 2024; originally announced May 2024.

  11. arXiv:2405.15826  [pdf, other]

    cs.CV

    3D Learnable Supertoken Transformer for LiDAR Point Cloud Scene Segmentation

    Authors: Dening Lu, Jun Zhou, Kyle Gao, Linlin Xu, Jonathan Li

    Abstract: 3D Transformers have achieved great success in point cloud understanding and representation. However, there is still considerable scope for further development in effective and efficient Transformers for large-scale LiDAR point cloud scene segmentation. This paper proposes a novel 3D Transformer framework, named 3D Learnable Supertoken Transformer (3DLST). The key contributions are summarized as f…

    Submitted 23 May, 2024; originally announced May 2024.

    Comments: 13 pages, 10 figures, 7 tables

  12. arXiv:2405.12775  [pdf, other]

    cs.MM cs.AI cs.CL

    Unsupervised Multimodal Clustering for Semantics Discovery in Multimodal Utterances

    Authors: Hanlei Zhang, Hua Xu, Fei Long, Xin Wang, Kai Gao

    Abstract: Discovering the semantics of multimodal utterances is essential for understanding human language and enhancing human-machine interactions. Existing methods manifest limitations in leveraging nonverbal information for discerning complex semantics in unsupervised scenarios. This paper introduces a novel unsupervised multimodal clustering method (UMC), making a pioneering contribution to this field.…

    Submitted 21 May, 2024; originally announced May 2024.

    Comments: Accepted by ACL 2024, Main Conference, Long Paper

  13. arXiv:2405.11826  [pdf, other]

    astro-ph.IM hep-ex physics.ins-det

    Data quality control system and long-term performance monitor of the LHAASO-KM2A

    Authors: Zhen Cao, F. Aharonian, Axikegu, Y. X. Bai, Y. W. Bao, D. Bastieri, X. J. Bi, Y. J. Bi, W. Bian, A. V. Bukevich, Q. Cao, W. Y. Cao, Zhe Cao, J. Chang, J. F. Chang, A. M. Chen, E. S. Chen, H. X. Chen, Liang Chen, Lin Chen, Long Chen, M. J. Chen, M. L. Chen, Q. H. Chen, S. Chen, et al. (263 additional authors not shown)

    Abstract: The KM2A is the largest sub-array of the Large High Altitude Air Shower Observatory (LHAASO). It consists of 5216 electromagnetic particle detectors (EDs) and 1188 muon detectors (MDs). The data recorded by the EDs and MDs are used to reconstruct primary information of cosmic ray and gamma-ray showers. This information is used for physical analysis in gamma-ray astronomy and cosmic ray physics. To…

    Submitted 13 June, 2024; v1 submitted 20 May, 2024; originally announced May 2024.

    Comments: 15 pages, 9 figures

  14. arXiv:2405.11021  [pdf, other]

    cs.CV

    Enhanced 3D Urban Scene Reconstruction and Point Cloud Densification using Gaussian Splatting and Google Earth Imagery

    Authors: Kyle Gao, Dening Lu, Hongjie He, Linlin Xu, Jonathan Li

    Abstract: 3D urban scene reconstruction and modelling is a crucial research area in remote sensing with numerous applications in academia, commerce, industry, and administration. Recent advancements in view synthesis models have facilitated photorealistic 3D reconstruction solely from 2D images. Leveraging Google Earth imagery, we construct a 3D Gaussian Splatting model of the Waterloo region centered on th…

    Submitted 1 June, 2024; v1 submitted 17 May, 2024; originally announced May 2024.

    ACM Class: I.4; I.3

  15. arXiv:2405.10612  [pdf, other]

    cs.CV cs.CR cs.LG

    Not All Prompts Are Secure: A Switchable Backdoor Attack Against Pre-trained Vision Transformers

    Authors: Sheng Yang, Jiawang Bai, Kuofeng Gao, Yong Yang, Yiming Li, Shu-tao Xia

    Abstract: Given the power of vision transformers, a new learning paradigm, pre-training and then prompting, makes it more efficient and effective to address downstream visual recognition tasks. In this paper, we identify a novel security threat towards such a paradigm from the perspective of backdoor attacks. Specifically, an extra prompt token, called the switch token in this work, can turn the backdoor mo…

    Submitted 17 May, 2024; originally announced May 2024.

  16. arXiv:2405.09981  [pdf, other]

    cs.CV

    Adversarial Robustness for Visual Grounding of Multimodal Large Language Models

    Authors: Kuofeng Gao, Yang Bai, Jiawang Bai, Yong Yang, Shu-Tao Xia

    Abstract: Multi-modal Large Language Models (MLLMs) have recently achieved enhanced performance across various vision-language tasks including visual grounding capabilities. However, the adversarial robustness of visual grounding remains unexplored in MLLMs. To fill this gap, we use referring expression comprehension (REC) as an example task in visual grounding and propose three adversarial attack paradigms…

    Submitted 16 May, 2024; originally announced May 2024.

    Comments: ICLR 2024 Workshop on Reliable and Responsible Foundation Models

  17. arXiv:2405.07691  [pdf, other]

    astro-ph.HE

    Discovery of Very-high-energy Gamma-ray Emissions from the Low Luminosity AGN NGC 4278 by LHAASO

    Authors: Zhen Cao, F. Aharonian, Q. An, Axikegu, Y. X. Bai, Y. W. Bao, D. Bastieri, X. J. Bi, Y. J. Bi, J. T. Cai, Q. Cao, W. Y. Cao, Zhe Cao, J. Chang, J. F. Chang, A. M. Chen, E. S. Chen, Liang Chen, Lin Chen, Long Chen, M. J. Chen, M. L. Chen, Q. H. Chen, S. H. Chen, S. Z. Chen, et al. (255 additional authors not shown)

    Abstract: The first source catalog of Large High Altitude Air Shower Observatory reported the detection of a very-high-energy gamma ray source, 1LHAASO J1219+2915. In this paper a further detailed study of the spectral and temporal behavior of this point-like source have been carried. The best-fit position of the TeV source ($\rm{RA}=185.05^{\circ}\pm0.04^{\circ}$, $\rm{Dec}=29.25^{\circ}\pm0.03^{\circ}$) i…

    Submitted 13 May, 2024; originally announced May 2024.

    Comments: 11 pages, 5 figures

  18. arXiv:2405.07136  [pdf]

    physics.optics cond-mat.mtrl-sci

    Extremely long transverse optical needle focus for reflective metalens enabled by monolayer MoS$_2$

    Authors: Zhonglin Li, Kangyu Gao, Yingying Wang, Ruitong Bie, Dongliang Yang, Tianze Yu, Renxi Gao, Wenjun Liu, Bo Zhong, Linfeng Sun

    Abstract: Line-scan mode facilitates fast-speed and high-throughput imaging with developing a suitable optical transverse needle focus. Metasurface with periodic structures such as diffractive rings, ellipses, and gratings could enable discrete focus evolving into line focus under momentum conservation, but still face the challenge of extremely low light power utilization brought by inevitably multiple high…

    Submitted 11 May, 2024; originally announced May 2024.

    Comments: 22 pages, 5 figures

  19. arXiv:2405.04434  [pdf, other]

    cs.CL cs.AI

    DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model

    Authors: DeepSeek-AI, Aixin Liu, Bei Feng, Bin Wang, Bingxuan Wang, Bo Liu, Chenggang Zhao, Chengqi Dengr, Chong Ruan, Damai Dai, Daya Guo, Dejian Yang, Deli Chen, Dongjie Ji, Erhang Li, Fangyun Lin, Fuli Luo, Guangbo Hao, Guanting Chen, Guowei Li, H. Zhang, Hanwei Xu, Hao Yang, Haowei Zhang, Honghui Ding, et al. (132 additional authors not shown)

    Abstract: We present DeepSeek-V2, a strong Mixture-of-Experts (MoE) language model characterized by economical training and efficient inference. It comprises 236B total parameters, of which 21B are activated for each token, and supports a context length of 128K tokens. DeepSeek-V2 adopts innovative architectures including Multi-head Latent Attention (MLA) and DeepSeekMoE. MLA guarantees efficient inference…

    Submitted 19 June, 2024; v1 submitted 7 May, 2024; originally announced May 2024.

  20. arXiv:2404.19387  [pdf, other]

    eess.SY

    Online Electricity Purchase for Data Center with Dynamic Virtual Battery from Flexibility Aggregation

    Authors: Kekun Gao, Yuejun Yan, Yixuan Liu, Endong Liu, Pengcheng You

    Abstract: As a critical component of modern infrastructure, data centers account for a huge amount of power consumption and greenhouse gas emission. This paper studies the electricity purchase strategy for a data center to lower its energy cost while integrating local renewable generation under uncertainty. To facilitate efficient and scalable decision-making, we propose a two-layer hierarchy where the lowe…

    Submitted 30 April, 2024; originally announced April 2024.

  21. arXiv:2404.16565  [pdf, other]

    cs.SE

    PyRadar: Towards Automatically Retrieving and Validating Source Code Repository Information for PyPI Packages

    Authors: Kai Gao, Weiwei Xu, Wenhao Yang, Minghui Zhou

    Abstract: A package's source code repository records the development history of the package, providing indispensable information for the use and risk monitoring of the package. However, a package release often misses its source code repository due to the separation of the package's development platform from its distribution platform. Existing tools retrieve the release's repository information from its meta…

    Submitted 25 April, 2024; originally announced April 2024.

    Comments: This paper has been accepted at FSE 2024

  22. arXiv:2404.16557  [pdf, other]

    cs.CV cs.AI

    Energy-Latency Manipulation of Multi-modal Large Language Models via Verbose Samples

    Authors: Kuofeng Gao, Jindong Gu, Yang Bai, Shu-Tao Xia, Philip Torr, Wei Liu, Zhifeng Li

    Abstract: Despite the exceptional performance of multi-modal large language models (MLLMs), their deployment requires substantial computational resources. Once malicious users induce high energy consumption and latency time (energy-latency cost), it will exhaust computational resources and harm availability of service. In this paper, we investigate this vulnerability for MLLMs, particularly image-based and…

    Submitted 25 April, 2024; originally announced April 2024.

    Comments: arXiv admin note: substantial text overlap with arXiv:2401.11170

  23. arXiv:2404.16525  [pdf, other]

    cond-mat.quant-gas

    An efficient method to generate near-ideal hollow beams of different shapes for box potential of quantum gases

    Authors: Tongtong Ren, Yirong Wang, Xiaoyu Dai, Xiaoxu Gao, Guangren Sun, Xue Zhao, Kuiyi Gao, Zhiyue Zheng, Wei Zhang

    Abstract: Ultracold quantum gases are usually prepared in conservative traps for quantum simulation experiments. The atomic density inhomogeneity, together with the consequent position-dependent energy and time scales of cold atoms in traditional harmonic traps, makes it difficult to manipulate and detect the sample at a better level. These problems are partially solved by optical box traps of blue-detuned…

    Submitted 25 April, 2024; originally announced April 2024.

  24. arXiv:2404.14372  [pdf, other]

    cs.CL cs.AI

    Beyond Scaling: Predicting Patent Approval with Domain-specific Fine-grained Claim Dependency Graph

    Authors: Xiaochen Kev Gao, Feng Yao, Kewen Zhao, Beilei He, Animesh Kumar, Vish Krishnan, Jingbo Shang

    Abstract: Model scaling is becoming the default choice for many language tasks due to the success of large language models (LLMs). However, it can fall short in specific scenarios where simple customized methods excel. In this paper, we delve into the patent approval prediction task and unveil that simple domain-specific graph methods outperform enlarging the model, using the intrinsic dependencies within…

    Submitted 22 April, 2024; originally announced April 2024.

    Comments: 17 Pages, Under Review

  25. arXiv:2404.11070  [pdf]

    cs.CV eess.SP

    Sky-GVIO: an enhanced GNSS/INS/Vision navigation with FCN-based sky-segmentation in urban canyon

    Authors: Jingrong Wang, Bo Xu, Ronghe Jin, Shoujian Zhang, Kefu Gao, Jingnan Liu

    Abstract: Accurate, continuous, and reliable positioning is a critical component of achieving autonomous driving. However, in complex urban canyon environments, the vulnerability of a stand-alone sensor and non-line-of-sight (NLOS) caused by high buildings, trees, and elevated structures seriously affect positioning results. To address these challenges, a sky-view images segmentation algorithm based on Full…

    Submitted 17 April, 2024; originally announced April 2024.

  26. arXiv:2404.06758  [pdf, other]

    cs.RO

    Toward Holistic Planning and Control Optimization for Dual-Arm Rearrangement

    Authors: Kai Gao, Zihe Ye, Duo Zhang, Baichuan Huang, Jingjin Yu

    Abstract: Long-horizon task and motion planning (TAMP) is notoriously difficult to solve, let alone optimally, due to the tight coupling between the interleaved (discrete) task and (continuous) motion planning phases, where each phase on its own is frequently an NP-hard or even PSPACE-hard computational challenge. In this study, we tackle the even more challenging goal of jointly optimizing task and motion…

    Submitted 10 April, 2024; originally announced April 2024.

    Comments: First three authors made equal contributions to this study

  27. arXiv:2404.05211  [pdf, other]

    cs.CV

    Multi-level Graph Subspace Contrastive Learning for Hyperspectral Image Clustering

    Authors: Jingxin Wang, Renxiang Guan, Kainan Gao, Zihao Li, Hao Li, Xianju Li, Chang Tang

    Abstract: Hyperspectral image (HSI) clustering is a challenging task due to its high complexity. Despite subspace clustering shows impressive performance for HSI, traditional methods tend to ignore the global-local interaction in HSI data. In this study, we proposed a multi-level graph subspace contrastive learning (MLGSC) for HSI clustering. The model is divided into the following main parts. Graph convolu…

    Submitted 8 April, 2024; originally announced April 2024.

    Comments: IJCNN 2024

  28. arXiv:2404.04801  [pdf, ps, other]

    astro-ph.IM astro-ph.HE

    LHAASO-KM2A detector simulation using Geant4

    Authors: Zhen Cao, F. Aharonian, Q. An, Axikegu, Y. X. Bai, Y. W. Bao, D. Bastieri, X. J. Bi, Y. J. Bi, J. T. Cai, Q. Cao, W. Y. Cao, Zhe Cao, J. Chang, J. F. Chang, A. M. Chen, E. S. Chen, Liang Chen, Lin Chen, Long Chen, M. J. Chen, M. L. Chen, Q. H. Chen, S. H. Chen, S. Z. Chen, et al. (254 additional authors not shown)

    Abstract: KM2A is one of the main sub-arrays of LHAASO, working on gamma ray astronomy and cosmic ray physics at energies above 10 TeV. Detector simulation is the important foundation for estimating detector performance and data analysis. It is a big challenge to simulate the KM2A detector in the framework of Geant4 due to the need to track numerous photons from a large number of detector units (>6000) with…

    Submitted 7 April, 2024; originally announced April 2024.

  29. arXiv:2403.20261  [pdf, other]

    q-bio.BM cs.AI cs.LG

    FABind+: Enhancing Molecular Docking through Improved Pocket Prediction and Pose Generation

    Authors: Kaiyuan Gao, Qizhi Pei, Jinhua Zhu, Kun He, Lijun Wu

    Abstract: Molecular docking is a pivotal process in drug discovery. While traditional techniques rely on extensive sampling and simulation governed by physical principles, these methods are often slow and costly. The advent of deep learning-based approaches has shown significant promise, offering increases in both accuracy and efficiency. Building upon the foundational work of FABind, a model designed with…

    Submitted 7 April, 2024; v1 submitted 29 March, 2024; originally announced March 2024.

    Comments: 17 pages, 14 figures, 5 tables

  30. AI Ethics: A Bibliometric Analysis, Critical Issues, and Key Gaps

    Authors: Di Kevin Gao, Andrew Haverly, Sudip Mittal, Jiming Wu, Jingdao Chen

    Abstract: Artificial intelligence (AI) ethics has emerged as a burgeoning yet pivotal area of scholarly research. This study conducts a comprehensive bibliometric analysis of the AI ethics literature over the past two decades. The analysis reveals a discernible tripartite progression, characterized by an incubation phase, followed by a subsequent phase focused on imbuing AI with human-like attributes, culmi…

    Submitted 12 March, 2024; originally announced March 2024.

    Journal ref: International Journal of Business Analytics (IJBAN), 2024, 11(1), 1-19

  31. arXiv:2403.13507  [pdf, other]

    cs.CV

    FMM-Attack: A Flow-based Multi-modal Adversarial Attack on Video-based LLMs

    Authors: Jinmin Li, Kuofeng Gao, Yang Bai, Jingyun Zhang, Shu-tao Xia, Yisen Wang

    Abstract: Despite the remarkable performance of video-based large language models (LLMs), their adversarial threat remains unexplored. To fill this gap, we propose the first adversarial attack tailored for video-based LLMs by crafting flow-based multi-modal adversarial perturbations on a small fraction of frames within a video, dubbed FMM-Attack. Extensive experiments show that our attack can effectively in…

    Submitted 21 March, 2024; v1 submitted 20 March, 2024; originally announced March 2024.

  32. arXiv:2403.10943  [pdf, other]

    cs.MM cs.CL

    MIntRec2.0: A Large-scale Benchmark Dataset for Multimodal Intent Recognition and Out-of-scope Detection in Conversations

    Authors: Hanlei Zhang, Xin Wang, Hua Xu, Qianrui Zhou, Kai Gao, Jianhua Su, jinyue Zhao, Wenrui Li, Yanting Chen

    Abstract: Multimodal intent recognition poses significant challenges, requiring the incorporation of non-verbal modalities from real-world contexts to enhance the comprehension of human intentions. Existing benchmark datasets are limited in scale and suffer from difficulties in handling out-of-scope samples that arise in multi-turn conversational interactions. We introduce MIntRec2.0, a large-scale benchmar…

    Submitted 27 June, 2024; v1 submitted 16 March, 2024; originally announced March 2024.

    Comments: Accepted by ICLR 2024, Long Paper; The abstract is slightly modified due to the length limitation

  33. Measurements of All-Particle Energy Spectrum and Mean Logarithmic Mass of Cosmic Rays from 0.3 to 30 PeV with LHAASO-KM2A

    Authors: The LHAASO Collaboration, Zhen Cao, F. Aharonian, Q. An, A. Axikegu, Y. X. Bai, Y. W. Bao, D. Bastieri, X. J. Bi, Y. J. Bi, J. T. Cai, Q. Cao, W. Y. Cao, Zhe Cao, J. Chang, J. F. Chang, A. M. Chen, E. S. Chen, Liang Chen, Lin Chen, Long Chen, M. J. Chen, M. L. Chen, Q. H. Chen, S. H. Chen, et al. (256 additional authors not shown)

    Abstract: We present the measurements of all-particle energy spectrum and mean logarithmic mass of cosmic rays in the energy range of 0.3-30 PeV using data collected from LHAASO-KM2A between September 2021 and December 2022, which is based on a nearly composition-independent energy reconstruction method, achieving unprecedented accuracy. Our analysis reveals the position of the knee at…

    Submitted 26 March, 2024; v1 submitted 15 March, 2024; originally announced March 2024.

    Comments: 8 pages, 3 figures

    Journal ref: Physical Review Letters 132, 131002 (2024)

  34. arXiv:2403.06384  [pdf, other]

    physics.atom-ph

    Precision Spectroscopy and Nuclear Structure Parameters in 7Li+ ion

    Authors: Hua Guan, Xiao-Qiu Qi, Peng-Peng Zhou, Wei Sun, Shao-Long Chen, Xu-Rui Chang, Yao Huang, Pei-Pei Zhang, Zong-Chao Yan, G. W. F. Drake, Ai-Xi Chen, Zhen-Xiang Zhong, Ting-Yun Shi, Ke-Lin Gao

    Abstract: The optical Ramsey technique is used to obtain precise measurements of the hyperfine splittings in the $2\,^3\!S_1$ and $2\,^3\!P_J$ states of $^7$Li$^+$. Together with bound-state quantum electrodynamic theory, the Zemach radius and quadrupole moment of the $^7$Li nucleus are determined to be $3.35(1)$~fm and $-3.86(5)$~fm$^2$ respectively, with the quadrupole moment deviating from the recommende…

    Submitted 10 March, 2024; originally announced March 2024.

  35. arXiv:2403.05551  [pdf]

    cs.CY

    A Bibliometric View of AI Ethics Development

    Authors: Di Kevin Gao, Andrew Haverly, Sudip Mittal, Jingdao Chen

    Abstract: Artificial Intelligence (AI) Ethics is a nascent yet critical research field. Recent developments in generative AI and foundational models necessitate a renewed look at the problem of AI Ethics. In this study, we perform a bibliometric analysis of AI Ethics literature for the last 20 years based on keyword search. Our study reveals a three-phase development in AI Ethics, namely an incubation phase…

    Submitted 8 February, 2024; originally announced March 2024.

  36. arXiv:2403.01528  [pdf, other]

    cs.CL cs.AI q-bio.BM

    Leveraging Biomolecule and Natural Language through Multi-Modal Learning: A Survey

    Authors: Qizhi Pei, Lijun Wu, Kaiyuan Gao, Jinhua Zhu, Yue Wang, Zun Wang, Tao Qin, Rui Yan

    Abstract: The integration of biomolecular modeling with natural language (BL) has emerged as a promising interdisciplinary area at the intersection of artificial intelligence, chemistry and biology. This approach leverages the rich, multifaceted descriptions of biomolecules contained within textual data sources to enhance our fundamental understanding and enable downstream computational tasks such as biomol…

    Submitted 5 March, 2024; v1 submitted 3 March, 2024; originally announced March 2024.

    Comments: Survey Paper. 25 pages, 9 figures, and 3 tables

  37. arXiv:2402.17810  [pdf, other]

    q-bio.QM cs.AI cs.CE cs.LG q-bio.BM

    BioT5+: Towards Generalized Biological Understanding with IUPAC Integration and Multi-task Tuning

    Authors: Qizhi Pei, Lijun Wu, Kaiyuan Gao, Xiaozhuan Liang, Yin Fang, Jinhua Zhu, Shufang Xie, Tao Qin, Rui Yan

    Abstract: Recent research trends in computational biology have increasingly focused on integrating text and bio-entity modeling, especially in the context of molecules and proteins. However, previous efforts like BioT5 faced challenges in generalizing across diverse tasks and lacked a nuanced understanding of molecular structures, particularly in their textual representations (e.g., IUPAC). This paper intro…

    Submitted 31 May, 2024; v1 submitted 27 February, 2024; originally announced February 2024.

    Comments: Accepted by ACL 2024 (Findings)

  38. arXiv:2402.16088  [pdf]

    cond-mat.mes-hall

    Origin of giant magnetoresistance in layered nodal-line semimetal TaNiTe5 nanoflakes

    Authors: Ding-Bang Zhou, Kuang-Hong Gao, Meng-Fan Zhao, Zhi-Yan Jia, Xiao-Xia Hu, Qian-Jin Guo, Hai-Yan Du, Xiao-Ping Chen, Zhi-Qing Li

    Abstract: Layered transition metal chalcogenides have stimulated a wide research interest due to their many exotic physical properties. In this paper, we studied the magnetotransport properties of the exfoliated TaNiTe5, a recently discovered Dirac nodal-line semimetal. A giant positive magnetoresistance (MR) is observed when the current is parallel to the crystallographic c axis, while it is strongly dimin…

    Submitted 18 May, 2024; v1 submitted 25 February, 2024; originally announced February 2024.

    Comments: 21 pages, 7 figures, 1 table

  39. arXiv:2402.01347  [pdf, ps, other]

    cond-mat.supr-con cond-mat.dis-nn

    Quantum Griffiths singularity in three-dimensional MoTiN superconducting films

    Authors: Zi-Xiao Wang, Tian-Yu Jing, Zi-Yan Han, Kuang-Hong Gao, Song-Ci Li, Zhi-Qing Li

    Abstract: Quantum Griffiths singularity (QGS) has been experimentally observed in a range of two-dimensional (2D) superconducting systems. Although it is theoretically suggested that the QGS also exists in three-dimensional (3D) superconductors, there is almost no experimental support to the theoretical prediction. In the present paper, we observe the occurrence of QGS in a series of $\sim$80-nm-thick Mo…

    Submitted 2 February, 2024; originally announced February 2024.

    Comments: 11 pages, 9 figures

    Journal ref: Physical Review B 109, 224508 (2024)

  40. arXiv:2401.13488  [pdf, other]

    cs.NI

    Fast Inverse Model Transformation: Algebraic Framework for Fast Data Plane Verification

    Authors: Shenshen Chen, Jian Luo, Dong Guo, Kai Gao, Yang Richard Yang

    Abstract: Data plane verification (DPV) analyzes routing tables and detects routing abnormalities and policy violations during network operation and planning. Thus, it has become an important tool to harden the networking infrastructure and the computing systems built on top of it. Substantial advancements have been made in the last decade and state-of-the-art DPV systems can achieve sub-us verification for an…

    Submitted 26 February, 2024; v1 submitted 24 January, 2024; originally announced January 2024.

    Comments: 12 pages pre-reference

  41. arXiv:2401.11170  [pdf, other]

    cs.CV cs.CR

    Inducing High Energy-Latency of Large Vision-Language Models with Verbose Images

    Authors: Kuofeng Gao, Yang Bai, Jindong Gu, Shu-Tao Xia, Philip Torr, Zhifeng Li, Wei Liu

    Abstract: Large vision-language models (VLMs) such as GPT-4 have achieved exceptional performance across various multi-modal tasks. However, the deployment of VLMs necessitates substantial energy consumption and computational resources. Once attackers maliciously induce high energy consumption and latency time (energy-latency cost) during inference of VLMs, it will exhaust computational resources. In this p…

    Submitted 22 March, 2024; v1 submitted 20 January, 2024; originally announced January 2024.

    Comments: Accepted by ICLR 2024

  42. arXiv:2401.02954  [pdf, other]

    cs.CL cs.AI cs.LG

    DeepSeek LLM: Scaling Open-Source Language Models with Longtermism

    Authors: DeepSeek-AI, Xiao Bi, Deli Chen, Guanting Chen, Shanhuang Chen, Damai Dai, Chengqi Deng, Honghui Ding, Kai Dong, Qiushi Du, Zhe Fu, Huazuo Gao, Kaige Gao, Wenjun Gao, Ruiqi Ge, Kang Guan, Daya Guo, Jianzhong Guo, Guangbo Hao, Zhewen Hao, Ying He, Wenjie Hu, Panpan Huang, Erhang Li, et al. (63 additional authors not shown)

    Abstract: The rapid development of open-source large language models (LLMs) has been truly remarkable. However, the scaling law described in previous literature presents varying conclusions, which casts a dark cloud over scaling LLMs. We delve into the study of scaling laws and present our distinctive findings that facilitate scaling of large-scale models in two commonly used open-source configurations, 7B…

    Submitted 5 January, 2024; originally announced January 2024.

  43. arXiv:2312.14667  [pdf, other]

    cs.MM cs.LG

    Token-Level Contrastive Learning with Modality-Aware Prompting for Multimodal Intent Recognition

    Authors: Qianrui Zhou, Hua Xu, Hao Li, Hanlei Zhang, Xiaohan Zhang, Yifan Wang, Kai Gao

    Abstract: Multimodal intent recognition aims to leverage diverse modalities such as expressions, body movements and tone of speech to comprehend the user's intent, constituting a critical task for understanding human language and behavior in real-world multimodal scenarios. Nevertheless, the majority of existing methods ignore potential correlations among different modalities and have limitations in effectively…

    Submitted 5 June, 2024; v1 submitted 22 December, 2023; originally announced December 2023.

    Comments: Accepted by AAAI 2024 (Main Track, Long Paper)

  44. arXiv:2312.12123  [pdf, other]

    cs.LG

    Probabilistic Prediction of Longitudinal Trajectory Considering Driving Heterogeneity with Interpretability

    Authors: Shuli Wang, Kun Gao, Lanfang Zhang, Yang Liu, Lei Chen

    Abstract: Automated vehicles are envisioned to navigate safely in complex mixed-traffic scenarios alongside human-driven vehicles. To ensure a high degree of safety, accurately predicting the maneuvers of surrounding vehicles and their future positions is a critical task and attracts much attention. However, most existing studies focused on reasoning about positional information based on objective historic…

    Submitted 19 December, 2023; originally announced December 2023.

    Comments: 14 pages, 8 figures

  45. arXiv:2312.05104

    cs.RO

    An Autonomous Driving Model Integrated with BEV-V2X Perception, Fusion Prediction of Motion and Occupancy, and Driving Planning, in Complex Traffic Intersections

    Authors: Fukang Li, Wenlin Ou, Kunpeng Gao, Yuwen Pang, Yifei Li, Henry Fan

    Abstract: The comprehensiveness of vehicle-to-everything (V2X) recognition enriches and holistically shapes the global Bird's-Eye-View (BEV) perception, incorporating rich semantics and integrating driving scene information, thereby serving features of vehicle state prediction, decision-making and driving planning. Utilizing V2X message sets to form a BEV map proves to be an effective perception method for con…

    Submitted 22 April, 2024; v1 submitted 8 December, 2023; originally announced December 2023.

    Comments: The content of the paper has not received unanimous consent from all the members and requires further evaluation prior to submission

  46. arXiv:2311.16194  [pdf, other]

    cs.CV

    BadCLIP: Trigger-Aware Prompt Learning for Backdoor Attacks on CLIP

    Authors: Jiawang Bai, Kuofeng Gao, Shaobo Min, Shu-Tao Xia, Zhifeng Li, Wei Liu

    Abstract: Contrastive Vision-Language Pre-training, known as CLIP, has shown promising effectiveness in addressing downstream image recognition tasks. However, recent works revealed that the CLIP model can be implanted with a downstream-oriented backdoor. On downstream tasks, one victim model performs well on clean samples but predicts a specific target class whenever a specific trigger is present. For inje…

    Submitted 21 March, 2024; v1 submitted 26 November, 2023; originally announced November 2023.

    Comments: 14 pages, 6 figures

  47. arXiv:2311.12644  [pdf, other]

    cs.LG

    Careful Selection and Thoughtful Discarding: Graph Explicit Pooling Utilizing Discarded Nodes

    Authors: Chuang Liu, Wenhang Yu, Kuang Gao, Xueqi Ma, Yibing Zhan, Jia Wu, Bo Du, Wenbin Hu

    Abstract: Graph pooling has been increasingly recognized as crucial for Graph Neural Networks (GNNs) to facilitate hierarchical graph representation learning. Existing graph pooling methods commonly consist of two stages: selecting top-ranked nodes and discarding the remaining to construct coarsened graph representations. However, this paper highlights two key issues with these methods: 1) The process of se…

    Submitted 21 November, 2023; originally announced November 2023.

    Comments: 14 pages, 7 figures, 4 tables. Submitting to Science China Information Sciences

  48. Deeper Hedging: A New Agent-based Model for Effective Deep Hedging

    Authors: Kang Gao, Stephen Weston, Perukrishnen Vytelingum, Namid R. Stillman, Wayne Luk, Ce Guo

    Abstract: We propose the Chiarella-Heston model, a new agent-based model for improving the effectiveness of deep hedging strategies. This model includes momentum traders, fundamental traders, and volatility traders. The volatility traders participate in the market by innovatively following a Heston-style volatility signal. The proposed model generalises both the extended Chiarella model and the Heston stoch…

    Submitted 28 October, 2023; originally announced October 2023.

    Comments: Accepted in the 4th ACM International Conference on AI in Finance (ICAIF'23)

  49. arXiv:2310.17082  [pdf, ps, other]

    astro-ph.HE

    Does or did the supernova remnant Cassiopeia A operate as a PeVatron?

    Authors: Zhen Cao, F. Aharonian, Q. An, Axikegu, Y. X. Bai, Y. W. Bao, D. Bastieri, X. J. Bi, Y. J. Bi, J. T. Cai, Q. Cao, W. Y. Cao, Zhe Cao, J. Chang, J. F. Chang, A. M. Chen, E. S. Chen, Liang Chen, Lin Chen, Long Chen, M. J. Chen, M. L. Chen, Q. H. Chen, S. H. Chen, S. Z. Chen, et al. (255 additional authors not shown)

    Abstract: For decades, supernova remnants (SNRs) have been considered the prime sources of Galactic cosmic rays (CRs). But whether SNRs can accelerate CR protons to PeV energies and thus dominate CR flux up to the knee is currently under intensive theoretical and phenomenological debate. The direct test of the ability of SNRs to operate as CR PeVatrons can be provided by ultrahigh-energy (UHE;…

    Submitted 25 October, 2023; originally announced October 2023.

    Comments: 11 pages, 3 figures, accepted by ApJL

  50. arXiv:2310.16006  [pdf, other]

    cond-mat.quant-gas

    Machine-learning the phase diagram of a strongly-interacting Fermi gas

    Authors: M. Link, K. Gao, A. Kell, M. Breyer, D. Eberz, B. Rauf, M. Köhl

    Abstract: We determine the phase diagram of strongly correlated fermions in the crossover from Bose-Einstein condensates of molecules (BEC) to Cooper pairs of fermions (BCS) utilizing an artificial neural network. By applying advanced image recognition techniques to the momentum distribution of the fermions, a quantity which has been widely considered as featureless for providing information about the conde…

    Submitted 24 October, 2023; originally announced October 2023.

    Journal ref: Phys. Rev. Lett. 130, 203401 (2023)