Showing 1–50 of 392 results for author: Jin, H

  1. arXiv:2407.01599  [pdf, other]

    cs.CL cs.CR cs.CV cs.LG

    JailbreakZoo: Survey, Landscapes, and Horizons in Jailbreaking Large Language and Vision-Language Models

    Authors: Haibo Jin, Leyang Hu, Xinuo Li, Peiyan Zhang, Chonghan Chen, Jun Zhuang, Haohan Wang

    Abstract: The rapid evolution of artificial intelligence (AI) through developments in Large Language Models (LLMs) and Vision-Language Models (VLMs) has brought significant advancements across various technological domains. While these models enhance capabilities in natural language processing and visual interactive tasks, their growing adoption raises critical concerns regarding security and ethical alignm…

    Submitted 25 June, 2024; originally announced July 2024.

    Comments: 44 pages

  2. arXiv:2407.01527  [pdf, other]

    cs.CL

    KV Cache Compression, But What Must We Give in Return? A Comprehensive Benchmark of Long Context Capable Approaches

    Authors: Jiayi Yuan, Hongyi Liu, Shaochen Zhong, Yu-Neng Chuang, Songchen Li, Guanchu Wang, Duy Le, Hongye Jin, Vipin Chaudhary, Zhaozhuo Xu, Zirui Liu, Xia Hu

    Abstract: Long context capability is a crucial competency for large language models (LLMs) as it mitigates the human struggle to digest long-form texts. This capability enables complex task-solving scenarios such as book summarization, code assistance, and many more tasks that are traditionally manpower-intensive. However, transformer-based LLMs face significant challenges with long context input due to the…

    Submitted 1 July, 2024; originally announced July 2024.

  3. arXiv:2406.17990  [pdf, other]

    cs.CL cs.AI cs.LG

    Explicit Diversity Conditions for Effective Question Answer Generation with Large Language Models

    Authors: Vikas Yadav, Hyuk Joon Kwon, Vijay Srinivasan, Hongxia Jin

    Abstract: Question Answer Generation (QAG) is an effective data augmentation technique to improve the accuracy of question answering systems, especially in low-resource domains. While recent pretrained and large language model-based QAG methods have made substantial progress, they face the critical issue of redundant QA pair generation, affecting downstream QA systems. Implicit diversity techniques such as…

    Submitted 25 June, 2024; originally announced June 2024.

    Comments: Published at COLING 2024

  4. arXiv:2406.17880  [pdf, other]

    cs.CV

    MLLM as Video Narrator: Mitigating Modality Imbalance in Video Moment Retrieval

    Authors: Weitong Cai, Jiabo Huang, Shaogang Gong, Hailin Jin, Yang Liu

    Abstract: Video Moment Retrieval (VMR) aims to localize a specific temporal segment within an untrimmed long video given a natural language query. Existing methods often suffer from inadequate training annotations, i.e., the sentence typically matches with a fraction of the prominent video content in the foreground with limited wording diversity. This intrinsic modality imbalance leaves a considerable porti…

    Submitted 25 June, 2024; originally announced June 2024.

    Comments: Under review

  5. arXiv:2406.15093  [pdf, other]

    cs.CR cs.CV eess.IV

    ECLIPSE: Expunging Clean-label Indiscriminate Poisons via Sparse Diffusion Purification

    Authors: Xianlong Wang, Shengshan Hu, Yechao Zhang, Ziqi Zhou, Leo Yu Zhang, Peng Xu, Wei Wan, Hai Jin

    Abstract: Clean-label indiscriminate poisoning attacks add invisible perturbations to correctly labeled training images, thus dramatically reducing the generalization capability of the victim models. Recently, some defense mechanisms have been proposed such as adversarial training, image transformation techniques, and image purification. However, these schemes are either susceptible to adaptive attacks, bui…

    Submitted 24 June, 2024; v1 submitted 21 June, 2024; originally announced June 2024.

    Comments: Accepted by ESORICS 2024

  6. arXiv:2406.07520  [pdf, other]

    cs.CV cs.AI cs.GR

    Neural Gaffer: Relighting Any Object via Diffusion

    Authors: Haian Jin, Yuan Li, Fujun Luan, Yuanbo Xiangli, Sai Bi, Kai Zhang, Zexiang Xu, Jin Sun, Noah Snavely

    Abstract: Single-image relighting is a challenging task that involves reasoning about the complex interplay between geometry, materials, and lighting. Many prior methods either support only specific categories of images, such as portraits, or require special capture conditions, like using a flashlight. Alternatively, some methods explicitly decompose a scene into intrinsic components, such as normals and BR…

    Submitted 11 June, 2024; originally announced June 2024.

    Comments: Project Website: https://neural-gaffer.github.io

  7. arXiv:2406.05667  [pdf]

    eess.SP cs.IT

    Achieving High Capacity Transmission With N-Dimensional Quasi-Fractal UCA

    Authors: Hongyun Jin, Wenchi Cheng, Haiyue Jing, Jingqing Wang, Wei Zhang

    Abstract: The vortex electromagnetic wave carried by multiple orthogonal orbital angular momentum (OAM) modes in the same frequency band can be applied to the field of wireless communications, which greatly increases the spectrum efficiency. The uniform circular array (UCA) is widely used to generate and receive vortex electromagnetic waves with multiple OAM-modes. However, the maximum number of orthogonal…

    Submitted 9 June, 2024; originally announced June 2024.

  8. arXiv:2406.05663  [pdf]

    eess.SP cs.IT

    Movable Antenna Assisted OAM Wireless Communications With Misaligned Transceiver

    Authors: Hongyun Jin, Wenchi Cheng, Haiyue Jing, Jingqing Wang

    Abstract: The vortex electromagnetic wave carried by multiple orthogonal orbital angular momentum (OAM) modes in the same frequency band can be applied to the field of wireless communications, which greatly increases the spectrum efficiency. The uniform circular array (UCA) is the classical structure to generate and receive vortex electromagnetic waves with multiple OAM-modes. However, when the transmit and…

    Submitted 9 June, 2024; originally announced June 2024.

  9. arXiv:2406.03718  [pdf, other]

    cs.CR cs.AI cs.CL

    Generalization-Enhanced Code Vulnerability Detection via Multi-Task Instruction Fine-Tuning

    Authors: Xiaohu Du, Ming Wen, Jiahao Zhu, Zifan Xie, Bin Ji, Huijun Liu, Xuanhua Shi, Hai Jin

    Abstract: Code Pre-trained Models (CodePTMs) based vulnerability detection have achieved promising results over recent years. However, these models struggle to generalize as they typically learn superficial mapping from source code to labels instead of understanding the root causes of code vulnerabilities, resulting in poor performance in real-world scenarios beyond the training instances. To tackle this ch…

    Submitted 5 June, 2024; originally announced June 2024.

    Comments: Accepted to ACL 2024 Findings

  10. arXiv:2406.02847  [pdf, other]

    cs.LG stat.ML

    Exact Conversion of In-Context Learning to Model Weights in Linearized-Attention Transformers

    Authors: Brian K Chen, Tianyang Hu, Hui Jin, Hwee Kuan Lee, Kenji Kawaguchi

    Abstract: In-Context Learning (ICL) has been a powerful emergent property of large language models that has attracted increasing attention in recent years. In contrast to regular gradient-based learning, ICL is highly interpretable and does not require parameter updates. In this paper, we show that, for linearized transformer networks, ICL can be made explicit and permanent through the inclusion of bias ter…

    Submitted 6 June, 2024; v1 submitted 4 June, 2024; originally announced June 2024.

    Comments: Accepted to ICML 2024

  11. arXiv:2406.02131  [pdf, other]

    cs.LG cs.AI

    CondTSF: One-line Plugin of Dataset Condensation for Time Series Forecasting

    Authors: Jianrong Ding, Zhanyu Liu, Guanjie Zheng, Haiming Jin, Linghe Kong

    Abstract: Dataset condensation is a newborn technique that generates a small dataset that can be used in training deep neural networks to lower training costs. The objective of dataset condensation is to ensure that the model trained with the synthetic dataset can perform comparably to the model trained with full datasets. However, existing methods predominantly concentrate on classification tasks, posing c…

    Submitted 11 June, 2024; v1 submitted 4 June, 2024; originally announced June 2024.

    Comments: 23 pages, 13 figures

  12. arXiv:2405.20681  [pdf, other]

    cs.CR cs.AI

    No Free Lunch Theorem for Privacy-Preserving LLM Inference

    Authors: Xiaojin Zhang, Yulin Fei, Yan Kang, Wei Chen, Lixin Fan, Hai Jin, Qiang Yang

    Abstract: Individuals and businesses have been significantly benefited by Large Language Models (LLMs) including PaLM, Gemini and ChatGPT in various ways. For example, LLMs enhance productivity, reduce costs, and enable us to focus on more valuable tasks. Furthermore, LLMs possess the capacity to sift through extensive datasets, uncover underlying patterns, and furnish critical insights that propel the fron…

    Submitted 31 May, 2024; originally announced May 2024.

  13. arXiv:2405.20413  [pdf, other]

    cs.CR cs.CL cs.CV cs.LG

    Jailbreaking Large Language Models Against Moderation Guardrails via Cipher Characters

    Authors: Haibo Jin, Andy Zhou, Joe D. Menke, Haohan Wang

    Abstract: Large Language Models (LLMs) are typically harmless but remain vulnerable to carefully crafted prompts known as "jailbreaks", which can bypass protective measures and induce harmful behavior. Recent advancements in LLMs have incorporated moderation guardrails that can filter outputs, which trigger processing errors for certain malicious questions. Existing red-teaming benchmarks often neglect to…

    Submitted 30 May, 2024; originally announced May 2024.

    Comments: 20 pages

  14. arXiv:2405.20247  [pdf, other]

    cs.AI cs.CV cs.LG cs.SE

    KerasCV and KerasNLP: Vision and Language Power-Ups

    Authors: Matthew Watson, Divyashree Shivakumar Sreepathihalli, Francois Chollet, Martin Gorner, Kiranbir Sodhia, Ramesh Sampath, Tirth Patel, Haifeng Jin, Neel Kovelamudi, Gabriel Rasskin, Samaneh Saadat, Luke Wood, Chen Qian, Jonathan Bischof, Ian Stenbit, Abheesht Sharma, Anshuman Mishra

    Abstract: We present the Keras domain packages KerasCV and KerasNLP, extensions of the Keras API for Computer Vision and Natural Language Processing workflows, capable of running on either JAX, TensorFlow, or PyTorch. These domain packages are designed to enable fast experimentation, with a focus on ease-of-use and performance. We adopt a modular, layered design: at the library's lowest level of abstraction…

    Submitted 5 June, 2024; v1 submitted 30 May, 2024; originally announced May 2024.

    Comments: Submitted to Journal of Machine Learning Open Source Software

    ACM Class: I.2.5; I.2.7; I.2.10

  15. arXiv:2405.15747  [pdf, other]

    cs.NI cs.CR

    Over-the-Air Runtime Wi-Fi MAC Address Re-randomization

    Authors: Hongyu Jin, Panos Papadimitratos

    Abstract: Medium Access Control (MAC) address randomization is a key component for privacy protection in Wi-Fi networks. Current proposals periodically change the mobile device MAC addresses when it disconnects from the Access Point (AP). This way frames cannot be linked across changes, but the mobile device presence is exposed as long as it remains connected: all its communication is trivially linkable by…

    Submitted 24 May, 2024; originally announced May 2024.

  16. arXiv:2405.15302  [pdf, other]

    cs.AI cs.CL cs.LG

    Towards Understanding How Transformer Perform Multi-step Reasoning with Matching Operation

    Authors: Zhiwei Wang, Yunji Wang, Zhongwang Zhang, Zhangchen Zhou, Hui Jin, Tianyang Hu, Jiacheng Sun, Zhenguo Li, Yaoyu Zhang, Zhi-Qin John Xu

    Abstract: Large language models have consistently struggled with complex reasoning tasks, such as mathematical problem-solving. Investigating the internal reasoning mechanisms of these models can help us design better model architectures and training strategies, ultimately enhancing their reasoning capabilities. In this study, we examine the matching mechanism employed by Transformer for multi-step reasonin…

    Submitted 24 May, 2024; originally announced May 2024.

  17. arXiv:2405.10570  [pdf]

    eess.IV cs.AI

    Simultaneous Deep Learning of Myocardium Segmentation and T2 Quantification for Acute Myocardial Infarction MRI

    Authors: Yirong Zhou, Chengyan Wang, Mengtian Lu, Kunyuan Guo, Zi Wang, Dan Ruan, Rui Guo, Peijun Zhao, Jianhua Wang, Naiming Wu, Jianzhong Lin, Yinyin Chen, Hang Jin, Lianxin Xie, Lilan Wu, Liuhong Zhu, Jianjun Zhou, Congbo Cai, He Wang, Xiaobo Qu

    Abstract: In cardiac Magnetic Resonance Imaging (MRI) analysis, simultaneous myocardial segmentation and T2 quantification are crucial for assessing myocardial pathologies. Existing methods often address these tasks separately, limiting their synergistic potential. To address this, we propose SQNet, a dual-task network integrating Transformer and Convolutional Neural Network (CNN) components. SQNet features…

    Submitted 29 May, 2024; v1 submitted 17 May, 2024; originally announced May 2024.

    Comments: 10 pages, 8 figures, 6 tables

  18. arXiv:2405.10389  [pdf, other]

    eess.SY cs.LG

    Physics-Informed Heterogeneous Graph Neural Networks for DC Blocker Placement

    Authors: Hongwei Jin, Prasanna Balaprakash, Allen Zou, Pieter Ghysels, Aditi S. Krishnapriyan, Adam Mate, Arthur Barnes, Russell Bent

    Abstract: The threat of geomagnetic disturbances (GMDs) to the reliable operation of the bulk energy system has spurred the development of effective strategies for mitigating their impacts. One such approach involves placing transformer neutral blocking devices, which interrupt the path of geomagnetically induced currents (GICs) to limit their impact. The high cost of these devices and the sparsity of trans…

    Submitted 16 May, 2024; originally announced May 2024.

    Comments: Paper is accepted by PSCC 2024

  19. arXiv:2405.06133  [pdf, other]

    cs.DC

    Advancing Anomaly Detection in Computational Workflows with Active Learning

    Authors: Krishnan Raghavan, George Papadimitriou, Hongwei Jin, Anirban Mandal, Mariam Kiran, Prasanna Balaprakash, Ewa Deelman

    Abstract: A computational workflow, also known as workflow, consists of tasks that are executed in a certain order to attain a specific computational campaign. Computational workflows are commonly employed in science domains, such as physics, chemistry, genomics, to complete large-scale experiments in distributed and heterogeneous computing environments. However, running computations at such a large scale m…

    Submitted 9 May, 2024; originally announced May 2024.

  20. arXiv:2405.05523  [pdf, other]

    cs.CV cs.AI

    Prompt When the Animal is: Temporal Animal Behavior Grounding with Positional Recovery Training

    Authors: Sheng Yan, Xin Du, Zongying Li, Yi Wang, Hongcang Jin, Mengyuan Liu

    Abstract: Temporal grounding is crucial in multimodal learning, but it poses challenges when applied to animal behavior data due to the sparsity and uniform distribution of moments. To address these challenges, we propose a novel Positional Recovery Training framework (Port), which prompts the model with the start and end times of specific animal behaviors during training. Specifically, Port enhances the ba…

    Submitted 8 May, 2024; originally announced May 2024.

    Comments: Accepted by ICMEW 2024. arXiv admin note: text overlap with arXiv:2404.13657

  21. arXiv:2405.04026  [pdf, other]

    stat.ML cs.LG

    Federated Control in Markov Decision Processes

    Authors: Hao Jin, Yang Peng, Liangyu Zhang, Zhihua Zhang

    Abstract: We study problems of federated control in Markov Decision Processes. To solve an MDP with large state space, multiple learning agents are introduced to collaboratively learn its optimal policy without communication of locally collected experience. In our settings, these agents have limited capabilities, which means they are restricted within different regions of the overall state space during the…

    Submitted 7 May, 2024; originally announced May 2024.

  22. arXiv:2405.03236  [pdf, other]

    cs.LG stat.ML

    Federated Reinforcement Learning with Constraint Heterogeneity

    Authors: Hao Jin, Liangyu Zhang, Zhihua Zhang

    Abstract: We study a Federated Reinforcement Learning (FedRL) problem with constraint heterogeneity. In our setting, we aim to solve a reinforcement learning problem with multiple constraints while $N$ training agents are located in $N$ different environments with limited access to the constraint signals and they are expected to collaboratively learn a policy satisfying all constraint signals. Such learning…

    Submitted 6 May, 2024; originally announced May 2024.

  23. arXiv:2405.02466  [pdf, other]

    cs.CR cs.LG

    ProFLingo: A Fingerprinting-based Intellectual Property Protection Scheme for Large Language Models

    Authors: Heng Jin, Chaoyu Zhang, Shanghao Shi, Wenjing Lou, Y. Thomas Hou

    Abstract: Large language models (LLMs) have attracted significant attention in recent years. Due to their "Large" nature, training LLMs from scratch consumes immense computational resources. Since several major players in the artificial intelligence (AI) field have open-sourced their original LLMs, an increasing number of individual researchers and smaller companies are able to build derivative LLMs based o…

    Submitted 26 June, 2024; v1 submitted 3 May, 2024; originally announced May 2024.

    Comments: This is the author's pre-print version of the work. It is posted here for your personal use. Not for redistribution

  24. arXiv:2405.00888  [pdf, other]

    cs.CL

    DynaMo: Accelerating Language Model Inference with Dynamic Multi-Token Sampling

    Authors: Shikhar Tuli, Chi-Heng Lin, Yen-Chang Hsu, Niraj K. Jha, Yilin Shen, Hongxia Jin

    Abstract: Traditional language models operate autoregressively, i.e., they predict one token at a time. Rapid explosion in model sizes has resulted in high inference times. In this work, we propose DynaMo, a suite of multi-token prediction language models that reduce net inference times. Our models $\textit{dynamically}$ predict multiple tokens based on their confidence in the predicted joint probability di…

    Submitted 1 May, 2024; originally announced May 2024.

    Comments: Accepted at NAACL 2024

  25. arXiv:2404.18533  [pdf, other]

    cs.AI cs.HC

    Evaluating Concept-based Explanations of Language Models: A Study on Faithfulness and Readability

    Authors: Meng Li, Haoran Jin, Ruixuan Huang, Zhihao Xu, Defu Lian, Zijia Lin, Di Zhang, Xiting Wang

    Abstract: Despite the surprisingly high intelligence exhibited by Large Language Models (LLMs), we are somehow intimidated to fully deploy them into real-life applications considering their black-box nature. Concept-based explanations arise as a promising avenue for explaining what the LLMs have learned, making them more transparent to humans. However, current evaluations for concepts tend to be heuristic a…

    Submitted 29 April, 2024; v1 submitted 29 April, 2024; originally announced April 2024.

  26. arXiv:2404.17136  [pdf, other]

    cs.DB cs.AI cs.CL

    Automated Data Visualization from Natural Language via Large Language Models: An Exploratory Study

    Authors: Yang Wu, Yao Wan, Hongyu Zhang, Yulei Sui, Wucai Wei, Wei Zhao, Guandong Xu, Hai Jin

    Abstract: The Natural Language to Visualization (NL2Vis) task aims to transform natural-language descriptions into visual representations for a grounded table, enabling users to gain insights from vast amounts of data. Recently, many deep learning-based approaches have been developed for NL2Vis. Despite the considerable efforts made by these approaches, challenges persist in visualizing data sourced from un…

    Submitted 25 April, 2024; originally announced April 2024.

  27. arXiv:2404.15687  [pdf, other]

    cs.SE cs.AI cs.CR

    Graph Neural Networks for Vulnerability Detection: A Counterfactual Explanation

    Authors: Zhaoyang Chu, Yao Wan, Qian Li, Yang Wu, Hongyu Zhang, Yulei Sui, Guandong Xu, Hai Jin

    Abstract: Vulnerability detection is crucial for ensuring the security and reliability of software systems. Recently, Graph Neural Networks (GNNs) have emerged as a prominent code embedding approach for vulnerability detection, owing to their ability to capture the underlying semantic structure of source code. However, GNNs face significant challenges in explainability due to their inherently black-box natu…

    Submitted 24 April, 2024; originally announced April 2024.

    Comments: This paper was accepted in the proceedings of the 33rd ACM SIGSOFT International Symposium on Software Testing and Analysis (ISSTA 2024)

  28. arXiv:2404.14296  [pdf, other]

    cs.SE cs.AI

    Does Your Neural Code Completion Model Use My Code? A Membership Inference Approach

    Authors: Yao Wan, Guanghua Wan, Shijie Zhang, Hongyu Zhang, Yulei Sui, Pan Zhou, Hai Jin, Lichao Sun

    Abstract: Recent years have witnessed significant progress in developing deep learning-based models for automated code completion. Although using source code in GitHub has been a common practice for training deep-learning-based models for code completion, it may induce some legal and ethical issues such as copyright infringement. In this paper, we investigate the legal and ethical issues of current neural c…

    Submitted 22 April, 2024; originally announced April 2024.

  29. arXiv:2404.06691  [pdf]

    q-bio.BM cs.LG cs.NE

    Latent Chemical Space Searching for Plug-in Multi-objective Molecule Generation

    Authors: Ningfeng Liu, Jie Yu, Siyu Xiu, Xinfang Zhao, Siyu Lin, Bo Qiang, Ruqiu Zheng, Hongwei Jin, Liangren Zhang, Zhenming Liu

    Abstract: Molecular generation, an essential method for identifying new drug structures, has been supported by advancements in machine learning and computational technology. However, challenges remain in multi-objective generation, model adaptability, and practical application in drug discovery. In this study, we developed a versatile 'plug-in' molecular generation model that incorporates multiple objective…

    Submitted 9 April, 2024; originally announced April 2024.

  30. arXiv:2404.05558  [pdf, other]

    eess.IV cs.CV

    JDEC: JPEG Decoding via Enhanced Continuous Cosine Coefficients

    Authors: Woo Kyoung Han, Sunghoon Im, Jaedeok Kim, Kyong Hwan Jin

    Abstract: We propose a practical approach to JPEG image decoding, utilizing a local implicit neural representation with continuous cosine formulation. The JPEG algorithm significantly quantizes discrete cosine transform (DCT) spectra to achieve a high compression rate, inevitably resulting in quality degradation while encoding an image. We have designed a continuous cosine spectrum estimator to address the…

    Submitted 2 April, 2024; originally announced April 2024.

  31. arXiv:2404.04556  [pdf, other]

    cs.CV

    Rethinking Self-training for Semi-supervised Landmark Detection: A Selection-free Approach

    Authors: Haibo Jin, Haoxuan Che, Hao Chen

    Abstract: Self-training is a simple yet effective method for semi-supervised learning, during which pseudo-label selection plays an important role for handling confirmation bias. Despite its popularity, applying self-training to landmark detection faces three problems: 1) The selected confident pseudo-labels often contain data bias, which may hurt model performance; 2) It is not easy to decide a proper thre…

    Submitted 6 April, 2024; originally announced April 2024.

    Comments: Under review

  32. arXiv:2404.02678  [pdf, other]

    cs.CV

    Independently Keypoint Learning for Small Object Semantic Correspondence

    Authors: Hailong Jin, Huiying Li

    Abstract: Semantic correspondence remains a challenging task for establishing correspondences between a pair of images with the same category or similar scenes due to the large intra-class appearance. In this paper, we introduce a novel problem called 'Small Object Semantic Correspondence (SOSC).' This problem is challenging due to the close proximity of keypoints associated with small objects, which result…

    Submitted 3 April, 2024; originally announced April 2024.

  33. arXiv:2404.00589   

    cs.LG cs.CL

    Harnessing the Power of Large Language Model for Uncertainty Aware Graph Processing

    Authors: Zhenyu Qian, Yiming Qian, Yuting Song, Fei Gao, Hai Jin, Chen Yu, Xia Xie

    Abstract: Handling graph data is one of the most difficult tasks. Traditional techniques, such as those based on geometry and matrix factorization, rely on assumptions about the data relations that become inadequate when handling large and complex graph data. On the other hand, deep learning approaches demonstrate promising results in handling large graph data, but they often fall short of providing interpr…

    Submitted 12 April, 2024; v1 submitted 31 March, 2024; originally announced April 2024.

    Comments: Because my organization does not allow members to privately upload papers to arXiv, I am requesting a withdrawal of my submission

  34. arXiv:2404.00226  [pdf, other]

    cs.CV cs.CL

    Design as Desired: Utilizing Visual Question Answering for Multimodal Pre-training

    Authors: Tongkun Su, Jun Li, Xi Zhang, Haibo Jin, Hao Chen, Qiong Wang, Faqin Lv, Baoliang Zhao, Yin Hu

    Abstract: Multimodal pre-training demonstrates its potential in the medical domain, which learns medical visual representations from paired medical reports. However, many pre-training tasks require extra annotations from clinicians, and most of them fail to explicitly guide the model to learn the desired features of different pathologies. To the best of our knowledge, we are the first to utilize Visual Ques…

    Submitted 8 April, 2024; v1 submitted 29 March, 2024; originally announced April 2024.

  35. arXiv:2403.17377  [pdf, other]

    cs.CV cs.AI cs.LG

    Self-Rectifying Diffusion Sampling with Perturbed-Attention Guidance

    Authors: Donghoon Ahn, Hyoungwon Cho, Jaewon Min, Wooseok Jang, Jungwoo Kim, SeonHwa Kim, Hyun Hee Park, Kyong Hwan Jin, Seungryong Kim

    Abstract: Recent studies have demonstrated that diffusion models are capable of generating high-quality samples, but their quality heavily depends on sampling guidance techniques, such as classifier guidance (CG) and classifier-free guidance (CFG). These techniques are often not applicable in unconditional generation or in various downstream tasks such as image restoration. In this paper, we propose a novel…

    Submitted 26 March, 2024; originally announced March 2024.

    Comments: Project page is available at https://ku-cvlab.github.io/Perturbed-Attention-Guidance

  36. arXiv:2403.16792  [pdf, other]

    cs.CL cs.SE

    Iterative Refinement of Project-Level Code Context for Precise Code Generation with Compiler Feedback

    Authors: Zhangqian Bi, Yao Wan, Zheng Wang, Hongyu Zhang, Batu Guan, Fangxin Lu, Zili Zhang, Yulei Sui, Hai Jin, Xuanhua Shi

    Abstract: Large Language Models (LLMs) have shown remarkable progress in automated code generation. Yet, LLM-generated code may contain errors in API usage, class, data structure, or missing project-specific information. As much of this project-specific context cannot fit into the prompts of LLMs, we must find ways to allow the model to explore the project-level code context. We present CoCoGen, a new code…

    Submitted 10 June, 2024; v1 submitted 25 March, 2024; originally announced March 2024.

  37. The Power of Bamboo: On the Post-Compromise Security for Searchable Symmetric Encryption

    Authors: Tianyang Chen, Peng Xu, Stjepan Picek, Bo Luo, Willy Susilo, Hai Jin, Kaitai Liang

    Abstract: Dynamic searchable symmetric encryption (DSSE) enables users to delegate the keyword search over dynamically updated encrypted databases to an honest-but-curious server without losing keyword privacy. This paper studies a new and practical security risk to DSSE, namely, secret key compromise (e.g., a user's secret key is leaked or stolen), which threatens all the security guarantees offered by exi…

    Submitted 22 March, 2024; originally announced March 2024.

    Comments: This is a full version paper that includes the security proof. The paper with the same name has been published by NDSS 2023

    Journal ref: NDSS 2023

  38. arXiv:2403.10801  [pdf, other]

    cs.CV

    Securely Fine-tuning Pre-trained Encoders Against Adversarial Examples

    Authors: Ziqi Zhou, Minghui Li, Wei Liu, Shengshan Hu, Yechao Zhang, Wei Wan, Lulu Xue, Leo Yu Zhang, Dezhong Yao, Hai Jin

    Abstract: With the evolution of self-supervised learning, the pre-training paradigm has emerged as a predominant solution within the deep learning landscape. Model providers furnish pre-trained encoders designed to function as versatile feature extractors, enabling downstream users to harness the benefits of expansive models with minimal effort through fine-tuning. Nevertheless, recent works have exposed a…

    Submitted 18 March, 2024; v1 submitted 16 March, 2024; originally announced March 2024.

  39. arXiv:2403.03414  [pdf, other]

    cs.LG q-bio.NC

    Leveraging The Finite States of Emotion Processing to Study Late-Life Mental Health

    Authors: Yuanzhe Huang, Saurab Faruque, Minjie Wu, Akiko Mizuno, Eduardo Diniz, Shaolin Yang, George Dewitt Stetten, Noah Schweitzer, Hecheng Jin, Linghai Wang, Howard J. Aizenstein

    Abstract: Traditional approaches in mental health research apply General Linear Models (GLM) to describe the longitudinal dynamics of observed psycho-behavioral measurements (questionnaire summary scores). Similarly, GLMs are also applied to characterize relationships between neurobiological measurements (regional fMRI signals) and perceptual stimuli or other regional signals. While these methods are useful…

    Submitted 5 March, 2024; originally announced March 2024.

  40. arXiv:2403.02901  [pdf, other]

    cs.AI

    A Comprehensive Survey on Process-Oriented Automatic Text Summarization with Exploration of LLM-Based Methods

    Authors: Hanlei Jin, Yang Zhang, Dan Meng, Jun Wang, Jinghua Tan

    Abstract: Automatic Text Summarization (ATS), utilizing Natural Language Processing (NLP) algorithms, aims to create concise and accurate summaries, thereby significantly reducing the human effort required in processing large volumes of text. ATS has drawn considerable interest in both academic and industrial circles. Many studies have been conducted in the past to survey ATS methods; however, they generall…

    Submitted 5 March, 2024; originally announced March 2024.

  41. arXiv:2403.02742  [pdf, other]

    cs.CL

    Towards Training A Chinese Large Language Model for Anesthesiology

    Authors: Zhonghai Wang, Jie Jiang, Yibing Zhan, Bohao Zhou, Yanhong Li, Chong Zhang, Liang Ding, Hua Jin, Jun Peng, Xu Lin, Weifeng Liu

    Abstract: Medical large language models (LLMs) have gained popularity recently due to their significant practical utility. However, most existing research focuses on general medicine, and there is a need for in-depth study of LLMs in specific fields like anesthesiology. To fill the gap, we introduce Hypnos, a Chinese Anesthesia model built upon existing LLMs, e.g., Llama. Hypnos' contributions have three as…

    Submitted 5 March, 2024; originally announced March 2024.

  42. arXiv:2403.01479  [pdf, other]

    cs.CL cs.AI

    Align-to-Distill: Trainable Attention Alignment for Knowledge Distillation in Neural Machine Translation

    Authors: Heegon Jin, Seonil Son, Jemin Park, Youngseok Kim, Hyungjong Noh, Yeonsoo Lee

    Abstract: The advent of scalable deep models and large datasets has improved the performance of Neural Machine Translation. Knowledge Distillation (KD) enhances efficiency by transferring knowledge from a teacher model to a more compact student model. However, KD approaches to Transformer architecture often rely on heuristics, particularly when deciding which teacher layers to distill from. In this paper, w…

    Submitted 25 March, 2024; v1 submitted 3 March, 2024; originally announced March 2024.

    Comments: Accepted to LREC-COLING 2024

    MSC Class: 68T50; ACM Class: I.2.7

  43. arXiv:2402.12721  [pdf, other]

    cs.CV cs.AI

    PAC-FNO: Parallel-Structured All-Component Fourier Neural Operators for Recognizing Low-Quality Images

    Authors: Jinsung Jeon, Hyundong Jin, Jonghyun Choi, Sanghyun Hong, Dongeun Lee, Kookjin Lee, Noseong Park

    Abstract: A standard practice in developing image recognition models is to train a model on a specific image resolution and then deploy it. However, in real-world inference, models often encounter images different from the training sets in resolution and/or subject to natural variations such as weather changes, noise types and compression artifacts. While traditional solutions involve training multiple mode…

    Submitted 14 March, 2024; v1 submitted 20 February, 2024; originally announced February 2024.

    Comments: Accepted at ICLR 2024

  44. arXiv:2402.05350  [pdf, other]

    cs.CV eess.IV

    Descanning: From Scanned to the Original Images with a Color Correction Diffusion Model

    Authors: Junghun Cha, Ali Haider, Seoyun Yang, Hoeyeong Jin, Subin Yang, A. F. M. Shahab Uddin, Jaehyoung Kim, Soo Ye Kim, Sung-Ho Bae

    Abstract: A significant volume of analog information, i.e., documents and images, have been digitized in the form of scanned copies for storing, sharing, and/or analyzing in the digital world. However, the quality of such contents is severely degraded by various distortions caused by printing, storing, and scanning processes in the physical world. Although restoring high-quality content from scanned copies…

    Submitted 7 February, 2024; originally announced February 2024.

    Comments: Accepted to AAAI 2024

  45. arXiv:2402.03299  [pdf, other]

    cs.LG cs.CL cs.CV

    GUARD: Role-playing to Generate Natural-language Jailbreakings to Test Guideline Adherence of Large Language Models

    Authors: Haibo Jin, Ruoxi Chen, Andy Zhou, Yang Zhang, Haohan Wang

    Abstract: The discovery of "jailbreaks" to bypass safety filters of Large Language Models (LLMs) and harmful responses have encouraged the community to implement safety measures. One major safety measure is to proactively test the LLMs with jailbreaks prior to the release. Therefore, such testing will require a method that can generate jailbreaks massively and efficiently. In this paper, we follow a novel y…

    Submitted 30 May, 2024; v1 submitted 5 February, 2024; originally announced February 2024.

    Comments: 28 pages

  46. KIVI: A Tuning-Free Asymmetric 2bit Quantization for KV Cache

    Authors: Zirui Liu, Jiayi Yuan, Hongye Jin, Shaochen Zhong, Zhaozhuo Xu, Vladimir Braverman, Beidi Chen, Xia Hu

    Abstract: Efficiently serving large language models (LLMs) requires batching many requests together to reduce the cost per request. Yet, the key-value (KV) cache, which stores attention keys and values to avoid re-computations, significantly increases memory demands and becomes the new bottleneck in speed and memory usage. This memory demand increases with larger batch sizes and longer context lengths. Addi…

    Submitted 5 February, 2024; originally announced February 2024.

  47. arXiv:2401.17855  [pdf, other]

    stat.AP cs.HC cs.IR

    Network-based Topic Structure Visualization

    Authors: Yeseul Jeon, Jina Park, Ick Hoon Jin, Dongjun Chung

    Abstract: In the real world, many topics are inter-correlated, making it challenging to investigate their structure and relationships. Understanding the interplay between topics and their relevance can provide valuable insights for researchers, guiding their studies and informing the direction of research. In this paper, we utilize the topic-words distribution, obtained from topic models, as item-response d…

    Submitted 31 January, 2024; originally announced January 2024.

  48. arXiv:2401.13329  [pdf, other]

    cs.CV

    Generative Video Diffusion for Unseen Cross-Domain Video Moment Retrieval

    Authors: Dezhao Luo, Shaogang Gong, Jiabo Huang, Hailin Jin, Yang Liu

    Abstract: Video Moment Retrieval (VMR) requires precise modelling of fine-grained moment-text associations to capture intricate visual-language relationships. Due to the lack of a diverse and generalisable VMR dataset to facilitate learning scalable moment-text associations, existing methods resort to joint training on both source and target domain videos for cross-domain applications. Meanwhile, recent dev…

    Submitted 29 January, 2024; v1 submitted 24 January, 2024; originally announced January 2024.

  49. arXiv:2401.11089  [pdf, other]

    cs.CR cs.AI cs.DC cs.IR

    FedRKG: A Privacy-preserving Federated Recommendation Framework via Knowledge Graph Enhancement

    Authors: Dezhong Yao, Tongtong Liu, Qi Cao, Hai Jin

    Abstract: Federated Learning (FL) has emerged as a promising approach for preserving data privacy in recommendation systems by training models locally. Recently, Graph Neural Networks (GNN) have gained popularity in recommendation tasks due to their ability to capture high-order interactions between users and items. However, privacy concerns prevent the global sharing of the entire user-item graph. To addre…

    Submitted 19 January, 2024; originally announced January 2024.

  50. arXiv:2401.09767  [pdf, other]

    cs.CR cs.SE

    On the Effectiveness of Function-Level Vulnerability Detectors for Inter-Procedural Vulnerabilities

    Authors: Zhen Li, Ning Wang, Deqing Zou, Yating Li, Ruqian Zhang, Shouhuai Xu, Chao Zhang, Hai Jin

    Abstract: Software vulnerabilities are a major cyber threat and it is important to detect them. One important approach to detecting vulnerabilities is to use deep learning while treating a program function as a whole, known as function-level vulnerability detectors. However, the limitation of this approach is not understood. In this paper, we investigate its limitation in detecting one class of vulnerabilit…

    Submitted 20 January, 2024; v1 submitted 18 January, 2024; originally announced January 2024.

    Comments: 12 pages, 7 figures. To appear in the Proceedings of the 46th International Conference on Software Engineering (ICSE'24)