Showing 1–50 of 406 results for author: He, P

  1. arXiv:2407.09121  [pdf, other]

    cs.CL cs.AI

    Refuse Whenever You Feel Unsafe: Improving Safety in LLMs via Decoupled Refusal Training

    Authors: Youliang Yuan, Wenxiang Jiao, Wenxuan Wang, Jen-tse Huang, Jiahao Xu, Tian Liang, Pinjia He, Zhaopeng Tu

    Abstract: This study addresses a critical gap in safety tuning practices for Large Language Models (LLMs) by identifying and tackling a refusal position bias within safety tuning data, which compromises the models' ability to appropriately refuse generating unsafe content. We introduce a novel approach, Decoupled Refusal Training (DeRTa), designed to empower LLMs to refuse compliance to harmful prompts at a…

    Submitted 12 July, 2024; originally announced July 2024.

  2. arXiv:2407.07304  [pdf, other]

    cs.AI

    Inference Performance Optimization for Large Language Models on CPUs

    Authors: Pujiang He, Shan Zhou, Wenhuan Huang, Changqing Li, Duyi Wang, Bin Guo, Chen Meng, Sheng Gui, Weifei Yu, Yi Xie

    Abstract: Large language models (LLMs) have shown exceptional performance and vast potential across diverse tasks. However, the deployment of LLMs with high performance in low-resource environments has garnered significant attention in the industry. When GPU hardware resources are limited, we can explore alternative options on CPUs. To mitigate the financial burden and alleviate constraints imposed by hardw…

    Submitted 9 July, 2024; originally announced July 2024.

    Comments: 5 pages, 6 figures, ICML 2024 on Foundation Models in the Wild

  3. arXiv:2407.05739  [pdf, other]

    cs.NE cs.AI

    Multi-Bit Mechanism: A Novel Information Transmission Paradigm for Spiking Neural Networks

    Authors: Yongjun Xiao, Xianlong Tian, Yongqi Ding, Pei He, Mengmeng Jing, Lin Zuo

    Abstract: Since proposed, spiking neural networks (SNNs) gain recognition for their high performance, low power consumption and enhanced biological interpretability. However, while bringing these advantages, the binary nature of spikes also leads to considerable information loss in SNNs, ultimately causing performance degradation. We claim that the limited expressiveness of current binary spikes, resulting…

    Submitted 8 July, 2024; originally announced July 2024.

    Comments: Under review

  4. arXiv:2407.02756  [pdf, other]

    hep-th cond-mat.stat-mech

    Probing Krylov Complexity in Scalar Field Theory with General Temperatures

    Authors: Peng-Zhang He, Hai-Qing Zhang

    Abstract: Krylov complexity characterizes the operator growth in quantum many-body systems or in quantum field theories. The existing literature has studied the Krylov complexity in the low temperature limit in quantum field theory. In this paper, we extend and systematically study the Krylov complexity and Krylov entropy in a scalar field theory with general temperatures. To this end, we propose a new me…

    Submitted 2 July, 2024; originally announced July 2024.

    Comments: 26 pages, 7 figures

  5. arXiv:2407.00029  [pdf, other]

    cs.DC

    Distributed Inference Performance Optimization for LLMs on CPUs

    Authors: Pujiang He, Shan Zhou, Changqing Li, Wenhuan Huang, Weifei Yu, Duyi Wang, Chen Meng, Sheng Gui

    Abstract: Large language models (LLMs) hold tremendous potential for addressing numerous real-world challenges, yet they typically demand significant computational resources and memory. Deploying LLMs onto a resource-limited hardware device with restricted memory capacity presents considerable challenges. Distributed computing emerges as a prevalent strategy to mitigate single-node memory constraints and ex…

    Submitted 16 May, 2024; originally announced July 2024.

    Comments: 4 pages, 3 figures, Practical ML for Low Resource Settings Workshop @ ICLR 2024

  6. arXiv:2406.14773  [pdf, other]

    cs.CR

    Mitigating the Privacy Issues in Retrieval-Augmented Generation (RAG) via Pure Synthetic Data

    Authors: Shenglai Zeng, Jiankun Zhang, Pengfei He, Jie Ren, Tianqi Zheng, Hanqing Lu, Han Xu, Hui Liu, Yue Xing, Jiliang Tang

    Abstract: Retrieval-augmented generation (RAG) enhances the outputs of language models by integrating relevant information retrieved from external knowledge sources. However, when the retrieval process involves private data, RAG systems may face severe privacy risks, potentially leading to the leakage of sensitive information. To address this issue, we propose using synthetic data as a privacy-preserving al…

    Submitted 20 June, 2024; originally announced June 2024.

  7. arXiv:2406.11645  [pdf, other]

    cs.HC cs.CV

    SeamPose: Repurposing Seams as Capacitive Sensors in a Shirt for Upper-Body Pose Tracking

    Authors: Tianhong Catherine Yu, Manru Zhang, Peter He, Chi-Jung Lee, Cassidy Cheesman, Saif Mahmud, Ruidong Zhang, François Guimbretière, Cheng Zhang

    Abstract: Seams are areas of overlapping fabric formed by stitching two or more pieces of fabric together in the cut-and-sew apparel manufacturing process. In SeamPose, we repurposed seams as capacitive sensors in a shirt for continuous upper-body pose estimation. Compared to previous all-textile motion-capturing garments that place the electrodes on the surface of clothing, our solution leverages existing…

    Submitted 17 June, 2024; originally announced June 2024.

  8. arXiv:2406.10794  [pdf, other]

    cs.CL

    Towards Understanding Jailbreak Attacks in LLMs: A Representation Space Analysis

    Authors: Yuping Lin, Pengfei He, Han Xu, Yue Xing, Makoto Yamada, Hui Liu, Jiliang Tang

    Abstract: Large language models (LLMs) are susceptible to a type of attack known as jailbreaking, which misleads LLMs to output harmful contents. Although there are diverse jailbreak attack strategies, there is no unified understanding on why some methods succeed and others fail. This paper explores the behavior of harmful and harmless prompts in the LLM's representation space to investigate the intrinsic p…

    Submitted 26 June, 2024; v1 submitted 15 June, 2024; originally announced June 2024.

  9. arXiv:2405.17229  [pdf, other]

    cs.HC

    InsigHTable: Insight-driven Hierarchical Table Visualization with Reinforcement Learning

    Authors: Guozheng Li, Peng He, Xinyu Wang, Runfei Li, Chi Harold Liu, Chuangxin Ou, Dong He, Guoren Wang

    Abstract: Embedding visual representations within original hierarchical tables can mitigate additional cognitive load stemming from the division of users' attention. The created hierarchical table visualizations can help users understand and explore complex data with multi-level attributes. However, because of many options available for transforming hierarchical tables and selecting subsets for embedding, t…

    Submitted 27 May, 2024; originally announced May 2024.

  10. arXiv:2405.04513  [pdf, other]

    cs.CL cs.AI cs.LG

    Switchable Decision: Dynamic Neural Generation Networks

    Authors: Shujian Zhang, Korawat Tanwisuth, Chengyue Gong, Pengcheng He, Mingyuan Zhou

    Abstract: Auto-regressive generation models achieve competitive performance across many different NLP tasks such as summarization, question answering, and classifications. However, they are also known for being slow in inference, which makes them challenging to deploy in real-time applications. We propose a switchable decision to accelerate inference by dynamically assigning computation resources for each d…

    Submitted 7 May, 2024; originally announced May 2024.

    Comments: Accepted to ICML 2024

  11. arXiv:2405.04133  [pdf, other]

    cs.CV

    Exposing AI-generated Videos: A Benchmark Dataset and a Local-and-Global Temporal Defect Based Detection Method

    Authors: Peisong He, Leyao Zhu, Jiaxing Li, Shiqi Wang, Haoliang Li

    Abstract: The generative model has made significant advancements in the creation of realistic videos, which causes security issues. However, this emerging risk has not been adequately addressed due to the absence of a benchmark dataset for AI-generated videos. In this paper, we first construct a video dataset using advanced diffusion-based video generation algorithms with various semantic contents. Besides,…

    Submitted 7 May, 2024; originally announced May 2024.

  12. arXiv:2405.03884  [pdf, other]

    cs.CV

    BadFusion: 2D-Oriented Backdoor Attacks against 3D Object Detection

    Authors: Saket S. Chaturvedi, Lan Zhang, Wenbin Zhang, Pan He, Xiaoyong Yuan

    Abstract: 3D object detection plays an important role in autonomous driving; however, its vulnerability to backdoor attacks has become evident. By injecting "triggers" to poison the training dataset, backdoor attacks manipulate the detector's prediction for inputs containing these triggers. Existing backdoor attacks against 3D object detection primarily poison 3D LiDAR signals, where large-sized 3D trigge…

    Submitted 6 May, 2024; originally announced May 2024.

    Comments: Accepted at IJCAI 2024 Conference

  13. arXiv:2405.03489  [pdf, other]

    cs.SE

    On the Influence of Data Resampling for Deep Learning-Based Log Anomaly Detection: Insights and Recommendations

    Authors: Xiaoxue Ma, Huiqi Zou, Jacky Keung, Pinjia He, Yishu Li, Xiao Yu, Federica Sarro

    Abstract: Numerous DL-based approaches have garnered considerable attention in the field of software Log Anomaly Detection. However, a practical challenge persists: the class imbalance in the public data commonly used to train the DL models. This imbalance is characterized by a substantial disparity in the number of abnormal log sequences compared to normal ones, for example, anomalies represent less than 1…

    Submitted 6 May, 2024; originally announced May 2024.

    Comments: 15 pages, 2 figures

  14. arXiv:2405.00920  [pdf, other]

    astro-ph.IM astro-ph.CO astro-ph.GA

    Identifying Halos in Cosmological Simulations with Continuous Wavelet Analysis: The 2D Case

    Authors: Minxing Li, Yun Wang, Ping He

    Abstract: Continuous wavelet analysis is gaining popularity in science and engineering for its ability to analyze data across spatial and scale domain simultaneously. In this study, we introduce a wavelet-based method to identify halos and assess its feasibility in two-dimensional (2D) scenarios. We begin with the generation of four pseudo-2D datasets from the SIMBA dark matter simulation by compressing thi…

    Submitted 3 May, 2024; v1 submitted 1 May, 2024; originally announced May 2024.

    Comments: 18 pages, 13 figures, 1 table, comments welcome

  15. arXiv:2404.18512  [pdf, other]

    cond-mat.mes-hall cond-mat.dis-nn cond-mat.quant-gas

    Floquet Amorphous Topological Orders in a Rydberg Glass

    Authors: Peng He, Jing-Xin Liu, Hong Wu, Z. D. Wang

    Abstract: We study the Floquet amorphous topological orders in experimentally accessible one-dimensional array of randomly pointed Rydberg atoms with periodic driving. The filling factor in the chain is tunable by applying a microwave field. We give a complete characterization of the topological properties from both the single-particle and many-body aspect. The periodic driving results in richer topological…

    Submitted 29 April, 2024; originally announced April 2024.

    Comments: 7 pages, 4 figures

  16. arXiv:2404.15838  [pdf, other]

    hep-ph astro-ph.HE gr-qc

    Abnormal threshold behaviors of photo-pion production off the proton in the GZK region

    Authors: Ping He, Bo-Qiang Ma

    Abstract: The confirmation of the existence of GZK cut-off was tortuous, leading to activities to explore new physics, such as the cosmic-ray new components, unidentified cosmic-ray origins, unknown propagation mechanism, and the modification of fundamental physics concepts like the tiny Lorentz invariance violation (LV). The confirmation of the GZK cut-off provides an opportunity to constrain the LV effect…

    Submitted 24 April, 2024; originally announced April 2024.

    Comments: 9 LaTeX pages, 3 figures, final version

    Journal ref: Eur. Phys. J. C 84 (2024) 401

  17. arXiv:2404.15819  [pdf, other]

    cs.AR

    APACHE: A Processing-Near-Memory Architecture for Multi-Scheme Fully Homomorphic Encryption

    Authors: Lin Ding, Song Bian, Penggao He, Yan Xu, Gang Qu, Jiliang Zhang

    Abstract: Fully Homomorphic Encryption (FHE) allows one to outsource computation over encrypted data to untrusted servers without worrying about data breaching. Since FHE is known to be extremely computationally-intensive, application-specific accelerators emerged as a powerful solution to narrow the performance gap. Nonetheless, due to the increasing complexities in FHE schemes per se and multi-scheme FHE…

    Submitted 24 April, 2024; originally announced April 2024.

  18. arXiv:2404.13177  [pdf, other]

    stat.ME stat.AP

    A Bayesian Hybrid Design with Borrowing from Historical Study

    Authors: Zhaohua Lu, John Toso, Girma Ayele, Philip He

    Abstract: In early phase drug development of combination therapy, the primary objective is to preliminarily assess whether there is additive activity when a novel agent is combined with an established monotherapy. Due to potential feasibility issues with a large randomized study, uncontrolled single-arm trials have been the mainstream approach in cancer clinical trials. However, such trials often present signi…

    Submitted 29 April, 2024; v1 submitted 19 April, 2024; originally announced April 2024.

  19. arXiv:2404.11255  [pdf, other]

    astro-ph.CO astro-ph.GA astro-ph.IM physics.flu-dyn

    Turbulence revealed by wavelet transform: power spectrum and intermittency for the velocity field of the cosmic baryonic fluid

    Authors: Yun Wang, Ping He

    Abstract: We use continuous wavelet transform techniques to construct the global and environment-dependent wavelet statistics, such as energy spectrum and kurtosis, to study the fluctuation and intermittency of the turbulent motion in the cosmic fluid velocity field with the IllustrisTNG simulation data. We find that the peak scales of the energy spectrum and the spectral ratio define two characteristic sca…

    Submitted 10 July, 2024; v1 submitted 17 April, 2024; originally announced April 2024.

    Comments: 19 pages, 11 figures, 2 tables, submitted to the ApJ

  20. arXiv:2404.08877  [pdf, other]

    cs.SE cs.CL cs.LG

    Aligning LLMs for FL-free Program Repair

    Authors: Junjielong Xu, Ying Fu, Shin Hwei Tan, Pinjia He

    Abstract: Large language models (LLMs) have achieved decent results on automated program repair (APR). However, the next token prediction training objective of decoder-only LLMs (e.g., GPT-4) is misaligned with the masked span prediction objective of current infilling-style methods, which impedes LLMs from fully leveraging pre-trained knowledge for program repair. In addition, while some LLMs are capable of…

    Submitted 12 April, 2024; originally announced April 2024.

  21. arXiv:2403.17574  [pdf, other]

    cs.SE cs.DC

    SPES: Towards Optimizing Performance-Resource Trade-Off for Serverless Functions

    Authors: Cheryl Lee, Zhouruixin Zhu, Tianyi Yang, Yintong Huo, Yuxin Su, Pinjia He, Michael R. Lyu

    Abstract: As an emerging cloud computing deployment paradigm, serverless computing is gaining traction due to its efficiency and ability to harness on-demand cloud resources. However, a significant hurdle remains in the form of the cold start problem, causing latency when launching new function instances from scratch. Existing solutions tend to use over-simplistic strategies for function pre-loading/unloadi…

    Submitted 26 March, 2024; originally announced March 2024.

    Comments: 12 pages, accepted by ICDE 2024 (40th IEEE International Conference on Data Engineering)

  22. arXiv:2403.12389  [pdf, other]

    cs.NE

    Learning-guided iterated local search for the minmax multiple traveling salesman problem

    Authors: Pengfei He, Jin-Kao Hao, Jinhui Xia

    Abstract: The minmax multiple traveling salesman problem involves minimizing the longest tour among a set of tours. The problem is of great practical interest because it can be used to formulate several real-life applications. To solve this computationally challenging problem, we propose a learning-driven iterated local search approach that combines an aggressive local search procedure with a probabilistic a…

    Submitted 18 March, 2024; originally announced March 2024.

  23. arXiv:2403.11773  [pdf, other]

    math.PR

    Scaling limit of heavy tailed nearly unstable cumulative INAR($\infty$) processes and rough fractional diffusions

    Authors: Yingli Wang, Chunhao Cai, Ping He

    Abstract: In this paper, we investigated the scaling limit of heavy-tailed unstable cumulative INAR($\infty$) processes. These processes exhibit a power law tail of the form $n^{-(1+α)}$, with $α\in (\frac{1}{2}, 1)$, where the $\ell^1$ norm of the kernel vector is close to $1$. The result is in contrast to scaling limit of the continuous-time heavy tailed unstable Hawkes processes and the one of INAR($p$)…

    Submitted 16 April, 2024; v1 submitted 18 March, 2024; originally announced March 2024.

    Comments: arXiv admin note: text overlap with arXiv:1504.03100 by other authors

    MSC Class: 60G22; 60F05

  24. arXiv:2403.09361  [pdf, other]

    cs.AI

    A Multi-population Integrated Approach for Capacitated Location Routing

    Authors: Pengfei He, Jin-Kao Hao, Qinghua Wu

    Abstract: The capacitated location-routing problem involves determining the depots from a set of candidate capacitated depot locations and finding the required routes from the selected depots to serve a set of customers whereas minimizing a cost function that includes the cost of opening the chosen depots, the fixed utilization cost per vehicle used, and the total cost (distance) of the routes. This paper p…

    Submitted 14 March, 2024; originally announced March 2024.

  25. arXiv:2403.07252  [pdf, ps, other]

    math.RT

    Serre functors and complete torsion pairs

    Authors: Zhe Han, Ping He

    Abstract: Given a torsion pair $(\mathcal{T},\mathcal{F})$ in an abelian category $\mathcal{A}$, there is a t-structure $(\mathcal{U}_\mathcal{T},\mathcal{V}_\mathcal{T})$ determined by $\mathcal{T}$ on the derived category $D^b(\mathcal{A})$. The existence of derived equivalence between heart $\mathcal{B}$ of the t-structure and $\mathcal{A}$ which naturally extends the embedding…

    Submitted 11 March, 2024; originally announced March 2024.

    Comments: 18 pages

  26. arXiv:2403.06884  [pdf, other]

    cs.CV

    A Holistic Framework Towards Vision-based Traffic Signal Control with Microscopic Simulation

    Authors: Pan He, Quanyi Li, Xiaoyong Yuan, Bolei Zhou

    Abstract: Traffic signal control (TSC) is crucial for reducing traffic congestion that leads to smoother traffic flow, reduced idling time, and mitigated CO2 emissions. In this study, we explore the computer vision approach for TSC that modulates on-road traffic flows through visual observation. Unlike traditional feature-based approaches, vision-based methods depend much less on heuristics and predefined f…

    Submitted 11 March, 2024; originally announced March 2024.

    Comments: Under review for IEEE publications

  27. arXiv:2403.04861  [pdf, other]

    cs.LG cs.NE

    A Survey of Lottery Ticket Hypothesis

    Authors: Bohan Liu, Zijie Zhang, Peixiong He, Zhensen Wang, Yang Xiao, Ruimeng Ye, Yang Zhou, Wei-Shinn Ku, Bo Hui

    Abstract: The Lottery Ticket Hypothesis (LTH) states that a dense neural network model contains a highly sparse subnetwork (i.e., winning tickets) that can achieve even better performance than the original model when trained in isolation. While LTH has been proved both empirically and theoretically in many works, there still are some open issues, such as efficiency and scalability, to be addressed. Also, th…

    Submitted 12 March, 2024; v1 submitted 7 March, 2024; originally announced March 2024.

  28. arXiv:2402.16893  [pdf, other]

    cs.CR cs.AI cs.CL

    The Good and The Bad: Exploring Privacy Issues in Retrieval-Augmented Generation (RAG)

    Authors: Shenglai Zeng, Jiankun Zhang, Pengfei He, Yue Xing, Yiding Liu, Han Xu, Jie Ren, Shuaiqiang Wang, Dawei Yin, Yi Chang, Jiliang Tang

    Abstract: Retrieval-augmented generation (RAG) is a powerful technique to facilitate language model with proprietary and private data, where data privacy is a pivotal concern. Whereas extensive research has demonstrated the privacy risks of large language models (LLMs), the RAG technique could potentially reshape the inherent behaviors of LLM generation, posing new privacy issues that are currently under-ex…

    Submitted 23 February, 2024; originally announced February 2024.

  29. arXiv:2402.12958  [pdf, other]

    cs.SE

    Go Static: Contextualized Logging Statement Generation

    Authors: Yichen Li, Yintong Huo, Renyi Zhong, Zhihan Jiang, Jinyang Liu, Junjie Huang, Jiazhen Gu, Pinjia He, Michael R. Lyu

    Abstract: Logging practices have been extensively investigated to assist developers in writing appropriate logging statements for documenting software behaviors. Although numerous automatic logging approaches have been proposed, their performance remains unsatisfactory due to the constraint of the single-method input, without informative programming context outside the method. Specifically, we identify thre…

    Submitted 20 February, 2024; originally announced February 2024.

    Comments: This paper was accepted by The ACM International Conference on the Foundations of Software Engineering (FSE 2024)

  30. Principle of multi-critical-points in the ALP-Higgs model and the corresponding phase transition

    Authors: Jiyuan Ke, Minxing Li, Ping He

    Abstract: The principle of multi-critical-points (PMCP) may be a convincing approach to determine the emerging parameter values in different kinds of beyond-standard-model (BSM) models. This could certainly be applied to solve the problem of undetermined new parameters in the ALP-Higgs interaction models. In this paper, we apply this principle to such model and investigate whether there are suitable solutio…

    Submitted 20 February, 2024; v1 submitted 19 February, 2024; originally announced February 2024.

    Comments: 6 pages, 3 figures

  31. arXiv:2402.11825  [pdf, other]

    physics.atom-ph quant-ph

    Photoelectron Polarization Vortexes in Strong-Field Ionization

    Authors: Pei-Lun He, Zhao-Han Zhang, Karen Z. Hatsagortsyan, Christoph H. Keitel

    Abstract: The spin polarization of photoelectrons induced by an intense linearly polarized laser field is investigated using numerical solutions of the time-dependent Schrödinger equation in companion with our analytic treatment via the spin-resolved strong-field approximation and classical trajectory Monte Carlo simulations. We demonstrate that, even though the total polarization vanishes upon averaging ov…

    Submitted 18 February, 2024; originally announced February 2024.

    Comments: 6 pages, 4 figures

  32. arXiv:2402.10907  [pdf]

    physics.app-ph physics.ins-det physics.optics

    Optically Levitated Nanoparticles as Receiving Antennas for Low Frequency Wireless Communication

    Authors: Zhenhai Fu, Jinsheng Xu, Shaochong Zhu, Chaoxiong He, Xunming Zhu, Xiaowen Gao, Han Cai, Peitong He, Zhiming Chen, Yizhou Zhang, Nan Li, Xingfan Chen, Ying Dong, Shiyao Zhu, Cheng Liu, Huizhu Hu

    Abstract: Low-frequency (LF) wireless communications play a crucial role in ensuring anti-interference, long-range, and efficient communication across various environments. However, in conventional LF communication systems, their antenna size is required to be inversely proportional to the wavelength, so that their mobility and flexibility are greatly limited. Here we introduce a novel prototype of LF recei…

    Submitted 10 January, 2024; originally announced February 2024.

  33. arXiv:2402.02777  [pdf, other]

    cond-mat.quant-gas cond-mat.mes-hall

    Realizing and detecting Stiefel-Whitney insulators in an optical Raman lattice

    Authors: Jian-Te Wang, Jing-Xin Liu, Hai-Tao Ding, Peng He

    Abstract: We propose a feasible scheme to realize a four-band Stiefel-Whitney insulator (SWI) with spin-orbit coupled ultracold atoms in an optical Raman lattice. Four selected spin states are coupled by carefully designed Raman lasers, to generate the desired spin-orbit interactions with spacetime inversion symmetry. We map out a phase diagram with respect to the experimental parameters, where a large topol…

    Submitted 5 February, 2024; originally announced February 2024.

    Journal ref: Phys. Rev. A 109, 053314 (2024)

  34. arXiv:2402.02333  [pdf, other]

    cs.CR cs.CV cs.LG

    Copyright Protection in Generative AI: A Technical Perspective

    Authors: Jie Ren, Han Xu, Pengfei He, Yingqian Cui, Shenglai Zeng, Jiankun Zhang, Hongzhi Wen, Jiayuan Ding, Hui Liu, Yi Chang, Jiliang Tang

    Abstract: Generative AI has witnessed rapid advancement in recent years, expanding their capabilities to create synthesized content such as text, images, audio, and code. The high fidelity and authenticity of contents generated by these Deep Generative Models (DGMs) have sparked significant copyright concerns. There have been various legal debates on how to effectively safeguard copyrights in DGMs. This wor…

    Submitted 3 February, 2024; originally announced February 2024.

    Comments: 26 pages

  35. arXiv:2402.02160  [pdf, other]

    cs.CR

    Data Poisoning for In-context Learning

    Authors: Pengfei He, Han Xu, Yue Xing, Hui Liu, Makoto Yamada, Jiliang Tang

    Abstract: In the domain of large language models (LLMs), in-context learning (ICL) has been recognized for its innovative ability to adapt to new tasks, relying on examples rather than retraining or fine-tuning. This paper delves into the critical issue of ICL's susceptibility to data poisoning attacks, an area not yet fully explored. We wonder whether ICL is vulnerable, with adversaries capable of manipula…

    Submitted 27 March, 2024; v1 submitted 3 February, 2024; originally announced February 2024.

  36. arXiv:2401.17426  [pdf, other]

    cs.LG cs.AI stat.ML

    Superiority of Multi-Head Attention in In-Context Linear Regression

    Authors: Yingqian Cui, Jie Ren, Pengfei He, Jiliang Tang, Yue Xing

    Abstract: We present a theoretical analysis of the performance of transformer with softmax attention in in-context learning with linear regression tasks. While the existing literature predominantly focuses on the convergence of transformers with single-/multi-head attention, our research centers on comparing their performance. We conduct an exact theoretical analysis to demonstrate that multi-head attention…

    Submitted 30 January, 2024; originally announced January 2024.

  37. arXiv:2401.05986  [pdf, other]

    cs.SE

    LogPTR: Variable-Aware Log Parsing with Pointer Network

    Authors: Yifan Wu, Bingxu Chai, Siyu Yu, Ying Li, Pinjia He, Wei Jiang, Jianguo Li

    Abstract: Due to the sheer size of software logs, developers rely on automated log analysis. Log parsing, which parses semi-structured logs into a structured format, is a prerequisite of automated log analysis. However, existing log parsers are unsatisfactory when applied in practice because: 1) they ignore categories of variables, and 2) have poor generalization ability. To address the limitations of exist…

    Submitted 11 January, 2024; originally announced January 2024.

  38. arXiv:2401.01912  [pdf, other]

    cs.CV cs.LG eess.IV

    Shrinking Your TimeStep: Towards Low-Latency Neuromorphic Object Recognition with Spiking Neural Network

    Authors: Yongqi Ding, Lin Zuo, Mengmeng Jing, Pei He, Yongjun Xiao

    Abstract: Neuromorphic object recognition with spiking neural networks (SNNs) is the cornerstone of low-power neuromorphic computing. However, existing SNNs suffer from significant latency, utilizing 10 to 40 timesteps or more, to recognize neuromorphic objects. At low latencies, the performance of existing SNNs is drastically degraded. In this work, we propose the Shrinking SNN (SSNN) to achieve low-latenc…

    Submitted 1 January, 2024; originally announced January 2024.

    Comments: Accepted by AAAI 2024

  39. arXiv:2401.00757  [pdf, other]

    cs.SE cs.AI cs.CL cs.LO

    A & B == B & A: Triggering Logical Reasoning Failures in Large Language Models

    Authors: Yuxuan Wan, Wenxuan Wang, Yiliu Yang, Youliang Yuan, Jen-tse Huang, Pinjia He, Wenxiang Jiao, Michael R. Lyu

    Abstract: Recent advancements in large language models (LLMs) have propelled Artificial Intelligence (AI) to new heights, enabling breakthroughs in various tasks such as writing assistance, code generation, and machine translation. A significant distinction of advanced LLMs, such as ChatGPT, is their demonstrated ability to "reason." However, evaluating the reasoning ability of LLMs remains a challenge as m…

    Submitted 1 January, 2024; originally announced January 2024.

  40. arXiv:2312.15352  [pdf]

    stat.AP

    A Bayesian Basket Trial Design Using Local Power Prior

    Authors: Haiming Zhou, Rex Shen, Sutan Wu, Philip He

    Abstract: In recent years, basket trials, which enable the evaluation of an experimental therapy across multiple tumor types within a single protocol, have gained prominence in early-phase oncology development. Unlike traditional trials, where each tumor type is evaluated separately with limited sample size, basket trials offer the advantage of borrowing information across various tumor types. However, a ke…

    Submitted 19 April, 2024; v1 submitted 23 December, 2023; originally announced December 2023.

  41. Large nonlinear Hall effect and Berry curvature in KTaO3 based two-dimensional electron gas

    Authors: Jinfeng Zhai, Mattia Trama, Hao Liu, Zhifei Zhu, Yinyan Zhu, Carmine Antonio Perroni, Roberta Citro, Pan He, Jian Shen

    Abstract: The two-dimensional electron gas (2DEG) at oxide interfaces exhibits various exotic properties stemming from interfacial inversion symmetry breaking. In this work, we report the emergence of large nonlinear Hall effects (NHE) in the LaAlO3/KTaO3(111) interface 2DEG under zero magnetic field. Skew scattering was identified as the dominant origin based on the cubic scaling of nonlinear Hall conducti…

    Submitted 9 December, 2023; originally announced December 2023.

    Journal ref: Nano Letters 2023

  42. Merging history of massive galaxies at 3<z<6

    Authors: Kemeng Li, Zhen Jiang, Ping He, Qi Guo, Jie Wang

    Abstract: The observational data of high redshift galaxies become increasingly abundant, especially since the operation of the James Webb Space Telescope (JWST), which allows us to verify and optimize the galaxy formation model at high redshifts. In this work, we investigate the merging history of massive galaxies at $3 < z < 6$ using a well-developed semi-analytic galaxy formation catalogue. We find that t…

    Submitted 6 December, 2023; originally announced December 2023.

    Comments: 9 pages, 8 figures

    Journal ref: 2023, Research in Astronomy and Astrophysics, 23(1), 015010

  43. arXiv:2312.00409  [pdf, ps, other]

    gr-qc astro-ph.HE hep-th

    White Paper and Roadmap for Quantum Gravity Phenomenology in the Multi-Messenger Era

    Authors: R. Alves Batista, G. Amelino-Camelia, D. Boncioli, J. M. Carmona, A. di Matteo, G. Gubitosi, I. Lobo, N. E. Mavromatos, C. Pfeifer, D. Rubiera-Garcia, E. N. Saridakis, T. Terzić, E. C. Vagenas, P. Vargas Moniz, H. Abdalla, M. Adamo, A. Addazi, F. K. Anagnostopoulos, V. Antonelli, M. Asorey, A. Ballesteros, S. Basilakos, D. Benisty, M. Boettcher, J. Bolmont, et al. (80 additional authors not shown)

    Abstract: The unification of quantum mechanics and general relativity has long been elusive. Only recently have empirical predictions of various possible theories of quantum gravity been put to test. The dawn of multi-messenger high-energy astrophysics has been tremendously beneficial, as it allows us to study particles with much higher energies and travelling much longer distances than possible in terrestr…

    Submitted 12 December, 2023; v1 submitted 1 December, 2023; originally announced December 2023.

    Comments: Submitted to CQG for the Focus Issue on "Quantum Gravity Phenomenology in the Multi-Messenger Era: Challenges and Perspectives". Please contact us to express interest in endorsing this white paper

  44. How do baryonic effects on the cosmic matter distribution vary with scale and local density environment?

    Authors: Yun Wang, Ping He

    Abstract: In this study, we investigate how the baryonic effects vary with scale and local density environment mainly by utilizing a novel statistic, the environment-dependent wavelet power spectrum (env-WPS). With four state-of-the-art cosmological simulation suites, EAGLE, SIMBA, Illustris, and IllustrisTNG, we compare the env-WPS of the total matter density field between the hydrodynamic and dark matter-…

    Submitted 21 January, 2024; v1 submitted 31 October, 2023; originally announced October 2023.

    Comments: 12 pages, 12 figures, and 3 tables; accepted by MNRAS

    Journal ref: 2024, MNRAS, Volume 528, Issue 2

  45. arXiv:2310.17304  [pdf, other]

    cs.CR cs.SE

    Static Semantics Reconstruction for Enhancing JavaScript-WebAssembly Multilingual Malware Detection

    Authors: Yifan Xia, Ping He, Xuhong Zhang, Peiyu Liu, Shouling Ji, Wenhai Wang

    Abstract: The emergence of WebAssembly allows attackers to hide the malicious functionalities of JavaScript malware in cross-language interoperations, termed JavaScript-WebAssembly multilingual malware (JWMM). However, existing anti-virus solutions based on static program analysis are still limited to monolingual code. As a result, their detection effectiveness decreases significantly against JWMM. The dete…

    Submitted 19 April, 2024; v1 submitted 26 October, 2023; originally announced October 2023.

    Comments: Accepted to ESORICS 2023

  46. arXiv:2310.11451  [pdf, other]

    cs.CL cs.AI cs.LG

    Seeking Neural Nuggets: Knowledge Transfer in Large Language Models from a Parametric Perspective

    Authors: Ming Zhong, Chenxin An, Weizhu Chen, Jiawei Han, Pengcheng He

    Abstract: Large Language Models (LLMs) inherently encode a wealth of knowledge within their parameters through pre-training on extensive corpora. While prior research has delved into operations on these parameters to manipulate the underlying implicit knowledge (encompassing detection, editing, and merging), there remains an ambiguous understanding regarding their transferability across models with varying…

    Submitted 8 May, 2024; v1 submitted 17 October, 2023; originally announced October 2023.

    Comments: ICLR 2024

  47. arXiv:2310.08659  [pdf, other]

    cs.CL cs.AI cs.LG

    LoftQ: LoRA-Fine-Tuning-Aware Quantization for Large Language Models

    Authors: Yixiao Li, Yifan Yu, Chen Liang, Pengcheng He, Nikos Karampatziakis, Weizhu Chen, Tuo Zhao

    Abstract: Quantization is an indispensable technique for serving Large Language Models (LLMs) and has recently found its way into LoRA fine-tuning. In this work we focus on the scenario where quantization and LoRA fine-tuning are applied together on a pre-trained model. In such cases it is common to observe a consistent gap in the performance on downstream tasks between full fine-tuning and quantization plu…

    Submitted 28 November, 2023; v1 submitted 12 October, 2023; originally announced October 2023.

  48. arXiv:2310.06714  [pdf, other]

    cs.AI cs.CL cs.LG

    Exploring Memorization in Fine-tuned Language Models

    Authors: Shenglai Zeng, Yaxin Li, Jie Ren, Yiding Liu, Han Xu, Pengfei He, Yue Xing, Shuaiqiang Wang, Jiliang Tang, Dawei Yin

    Abstract: Large language models (LLMs) have shown great capabilities in various tasks but also exhibited memorization of training data, raising tremendous privacy and copyright concerns. While prior works have studied memorization during pre-training, the exploration of memorization during fine-tuning is rather limited. Compared to pre-training, fine-tuning typically involves more sensitive data and diverse…

    Submitted 22 February, 2024; v1 submitted 10 October, 2023; originally announced October 2023.

  49. arXiv:2310.06433  [pdf, other]

    cs.SE cs.AI cs.CL cs.CV

    Retromorphic Testing: A New Approach to the Test Oracle Problem

    Authors: Boxi Yu, Qiuyang Mang, Qingshuo Guo, Pinjia He

    Abstract: A test oracle serves as a criterion or mechanism to assess the correspondence between software output and the anticipated behavior for a given input set. In automated testing, black-box techniques, known for their non-intrusive nature in test oracle construction, are widely used, including notable methodologies like differential testing and metamorphic testing. Inspired by the mathematical concept…

    Submitted 10 October, 2023; originally announced October 2023.

    ACM Class: D.3.0; I.2.7; I.4.0

  50. arXiv:2310.06389  [pdf, other]

    cs.CV stat.ML

    Learning Stackable and Skippable LEGO Bricks for Efficient, Reconfigurable, and Variable-Resolution Diffusion Modeling

    Authors: Huangjie Zheng, Zhendong Wang, Jianbo Yuan, Guanghan Ning, Pengcheng He, Quanzeng You, Hongxia Yang, Mingyuan Zhou

    Abstract: Diffusion models excel at generating photo-realistic images but come with significant computational costs in both training and sampling. While various techniques address these computational challenges, a less-explored issue is designing an efficient and adaptable network backbone for iterative refinement. Current options like U-Net and Vision Transformer often rely on resource-intensive deep netwo…

    Submitted 27 June, 2024; v1 submitted 10 October, 2023; originally announced October 2023.