Skip to main content

Showing 1–50 of 364 results for author: Ding, J

  1. arXiv:2407.02818  [pdf, other

    cs.SE cs.ET cs.PL

    WizardMerge -- Save Us From Merging Without Any Clues

    Authors: Qingyu Zhang, Junzhe Li, Jiayi Lin, Jie Ding, Lanteng Lin, Chenxiong Qian

    Abstract: Modern software development necessitates efficient version-oriented collaboration among developers. While Git is the most popular version control system, it generates unsatisfactory version merging results due to textual-based workflow, leading to potentially unexpected results in the merged version of the project. Although numerous merging tools have been proposed for improving merge results, dev… ▽ More

    Submitted 3 July, 2024; originally announced July 2024.

    Comments: 22 pages

    ACM Class: D.2; D.3

  2. arXiv:2407.00747  [pdf, other

    cs.CL cs.AI

    A Comparative Study of Quality Evaluation Methods for Text Summarization

    Authors: Huyen Nguyen, Haihua Chen, Lavanya Pobbathi, Junhua Ding

    Abstract: Evaluating text summarization has been a challenging task in natural language processing (NLP). Automatic metrics which heavily rely on reference summaries are not suitable in many situations, while human evaluation is time-consuming and labor-intensive. To bridge this gap, this paper proposes a novel method based on large language models (LLMs) for evaluating text summarization. We also conducts… ▽ More

    Submitted 30 June, 2024; originally announced July 2024.

    Comments: The paper is under review at Empirical Methods in Natural Language Processing (EMNLP) 2024. It has 15 pages and 4 figures

  3. arXiv:2406.19875  [pdf, other

    cs.CV

    InfiniBench: A Comprehensive Benchmark for Large Multimodal Models in Very Long Video Understanding

    Authors: Kirolos Ataallah, Chenhui Gou, Eslam Abdelrahman, Khushbu Pahwa, Jian Ding, Mohamed Elhoseiny

    Abstract: Understanding long videos, ranging from tens of minutes to several hours, presents unique challenges in video comprehension. Despite the increasing importance of long-form video content, existing benchmarks primarily focus on shorter clips. To address this gap, we introduce InfiniBench a comprehensive benchmark for very long video understanding which presents 1)The longest video duration, averagin… ▽ More

    Submitted 28 June, 2024; originally announced June 2024.

    Comments: 16 page ,17 figures

  4. arXiv:2406.19614  [pdf, other

    cs.LG cs.AI

    A Survey on Data Quality Dimensions and Tools for Machine Learning

    Authors: Yuhan Zhou, Fengjiao Tu, Kewei Sha, Junhua Ding, Haihua Chen

    Abstract: Machine learning (ML) technologies have become substantial in practically all aspects of our society, and data quality (DQ) is critical for the performance, fairness, robustness, safety, and scalability of ML models. With the large and complex data in data-centric AI, traditional methods like exploratory data analysis (EDA) and cross-validation (CV) face challenges, highlighting the importance of… ▽ More

    Submitted 27 June, 2024; originally announced June 2024.

    Comments: This paper has been accepted by The 6th IEEE International Conference on Artificial Intelligence Testing (IEEE AITest 2024) as an invited paper

  5. arXiv:2406.18087  [pdf, other

    cs.SE cs.AI cs.CL

    EHR-Based Mobile and Web Platform for Chronic Disease Risk Prediction Using Large Language Multimodal Models

    Authors: Chun-Chieh Liao, Wei-Ting Kuo, I-Hsuan Hu, Yen-Chen Shih, Jun-En Ding, Feng Liu, Fang-Ming Hung

    Abstract: Traditional diagnosis of chronic diseases involves in-person consultations with physicians to identify the disease. However, there is a lack of research focused on predicting and developing application systems using clinical notes and blood test values. We collected five years of Electronic Health Records (EHRs) from Taiwan's hospital database between 2017 and 2021 as an AI database. Furthermore,… ▽ More

    Submitted 26 June, 2024; originally announced June 2024.

  6. arXiv:2406.12384  [pdf, other

    cs.CV

    VRSBench: A Versatile Vision-Language Benchmark Dataset for Remote Sensing Image Understanding

    Authors: Xiang Li, Jian Ding, Mohamed Elhoseiny

    Abstract: We introduce a new benchmark designed to advance the development of general-purpose, large-scale vision-language models for remote sensing images. Although several vision-language datasets in remote sensing have been proposed to pursue this goal, existing datasets are typically tailored to single tasks, lack detailed object information, or suffer from inadequate quality control. Exploring these im… ▽ More

    Submitted 18 June, 2024; originally announced June 2024.

    Comments: Submitted for consideration at a conference

  7. arXiv:2406.07732  [pdf, other

    quant-ph cs.CR

    Experimenting with D-Wave Quantum Annealers on Prime Factorization problems

    Authors: Jingwen Ding, Giuseppe Spallitta, Roberto Sebastiani

    Abstract: This paper builds on top of a paper we have published very recently, in which we have proposed a novel approach to prime factorization (PF) by quantum annealing, where 8,219,999=32,749x251 was the highest prime product we were able to factorize -- which, to the best of our knowledge is the largest number which was ever factorized by means of a quantum device. The series of annealing experiments wh… ▽ More

    Submitted 11 June, 2024; originally announced June 2024.

    Comments: Published on Frontiers

  8. arXiv:2406.06211  [pdf, other

    cs.CV

    iMotion-LLM: Motion Prediction Instruction Tuning

    Authors: Abdulwahab Felemban, Eslam Mohamed Bakr, Xiaoqian Shen, Jian Ding, Abduallah Mohamed, Mohamed Elhoseiny

    Abstract: We introduce iMotion-LLM: a Multimodal Large Language Models (LLMs) with trajectory prediction, tailored to guide interactive multi-agent scenarios. Different from conventional motion prediction approaches, iMotion-LLM capitalizes on textual instructions as key inputs for generating contextually relevant trajectories. By enriching the real-world driving scenarios in the Waymo Open Dataset with tex… ▽ More

    Submitted 11 June, 2024; v1 submitted 10 June, 2024; originally announced June 2024.

  9. arXiv:2406.02614  [pdf, other

    cs.LG cs.AI

    Frequency Enhanced Pre-training for Cross-city Few-shot Traffic Forecasting

    Authors: Zhanyu Liu, Jianrong Ding, Guanjie Zheng

    Abstract: The field of Intelligent Transportation Systems (ITS) relies on accurate traffic forecasting to enable various downstream applications. However, developing cities often face challenges in collecting sufficient training traffic data due to limited resources and outdated infrastructure. Recognizing this obstacle, the concept of cross-city few-shot forecasting has emerged as a viable approach. While… ▽ More

    Submitted 5 June, 2024; v1 submitted 3 June, 2024; originally announced June 2024.

    Comments: Accepted by ECMLPKDD 2024 (Research Track)

  10. arXiv:2406.02131  [pdf, other

    cs.LG cs.AI

    CondTSF: One-line Plugin of Dataset Condensation for Time Series Forecasting

    Authors: Jianrong Ding, Zhanyu Liu, Guanjie Zheng, Haiming Jin, Linghe Kong

    Abstract: Dataset condensation is a newborn technique that generates a small dataset that can be used in training deep neural networks to lower training costs. The objective of dataset condensation is to ensure that the model trained with the synthetic dataset can perform comparably to the model trained with full datasets. However, existing methods predominantly concentrate on classification tasks, posing c… ▽ More

    Submitted 11 June, 2024; v1 submitted 4 June, 2024; originally announced June 2024.

    Comments: 23 pages, 13 figures

  11. arXiv:2405.20694  [pdf, other

    cs.NE

    Robust Stable Spiking Neural Networks

    Authors: Jianhao Ding, Zhiyu Pan, Yujia Liu, Zhaofei Yu, Tiejun Huang

    Abstract: Spiking neural networks (SNNs) are gaining popularity in deep learning due to their low energy budget on neuromorphic hardware. However, they still face challenges in lacking sufficient robustness to guard safety-critical applications such as autonomous driving. Many studies have been conducted to defend SNNs from the threat of adversarial attacks. This paper aims to uncover the robustness of SNN… ▽ More

    Submitted 31 May, 2024; originally announced May 2024.

    Comments: Accepted by ICML2024

  12. arXiv:2405.20355  [pdf, other

    cs.NE cs.CR cs.CV cs.LG

    Enhancing Adversarial Robustness in SNNs with Sparse Gradients

    Authors: Yujia Liu, Tong Bu, Jianhao Ding, Zecheng Hao, Tiejun Huang, Zhaofei Yu

    Abstract: Spiking Neural Networks (SNNs) have attracted great attention for their energy-efficient operations and biologically inspired structures, offering potential advantages over Artificial Neural Networks (ANNs) in terms of energy efficiency and interpretability. Nonetheless, similar to ANNs, the robustness of SNNs remains a challenge, especially when facing adversarial attacks. Existing techniques, wh… ▽ More

    Submitted 30 May, 2024; originally announced May 2024.

    Comments: accepted by ICML 2024

  13. arXiv:2405.19856  [pdf, other

    cs.CL cs.SE

    DevEval: A Manually-Annotated Code Generation Benchmark Aligned with Real-World Code Repositories

    Authors: Jia Li, Ge Li, Yunfei Zhao, Yongmin Li, Huanyu Liu, Hao Zhu, Lecheng Wang, Kaibo Liu, Zheng Fang, Lanshen Wang, Jiazheng Ding, Xuanming Zhang, Yuqi Zhu, Yihong Dong, Zhi Jin, Binhua Li, Fei Huang, Yongbin Li

    Abstract: How to evaluate the coding abilities of Large Language Models (LLMs) remains an open question. We find that existing benchmarks are poorly aligned with real-world code repositories and are insufficient to evaluate the coding abilities of LLMs. To address the knowledge gap, we propose a new benchmark named DevEval, which has three advances. (1) DevEval aligns with real-world repositories in multi… ▽ More

    Submitted 30 May, 2024; originally announced May 2024.

    Comments: Accepted by the 62nd Annual Meeting of the Association for Computational Linguistics (ACL 2024). arXiv admin note: substantial text overlap with arXiv:2404.00599, arXiv:2401.06401

  14. arXiv:2405.19524  [pdf, other

    cs.CR cs.AI

    AI Risk Management Should Incorporate Both Safety and Security

    Authors: Xiangyu Qi, Yangsibo Huang, Yi Zeng, Edoardo Debenedetti, Jonas Geiping, Luxi He, Kaixuan Huang, Udari Madhushani, Vikash Sehwag, Weijia Shi, Boyi Wei, Tinghao Xie, Danqi Chen, Pin-Yu Chen, Jeffrey Ding, Ruoxi Jia, Jiaqi Ma, Arvind Narayanan, Weijie J Su, Mengdi Wang, Chaowei Xiao, Bo Li, Dawn Song, Peter Henderson, Prateek Mittal

    Abstract: The exposure of security vulnerabilities in safety-aligned language models, e.g., susceptibility to adversarial attacks, has shed light on the intricate interplay between AI safety and AI security. Although the two disciplines now come together under the overarching goal of AI risk management, they have historically evolved separately, giving rise to differing perspectives. Therefore, in this pape… ▽ More

    Submitted 29 May, 2024; originally announced May 2024.

  15. arXiv:2405.18937  [pdf, other

    cs.CV cs.CL

    Kestrel: Point Grounding Multimodal LLM for Part-Aware 3D Vision-Language Understanding

    Authors: Junjie Fei, Mahmoud Ahmed, Jian Ding, Eslam Mohamed Bakr, Mohamed Elhoseiny

    Abstract: While 3D MLLMs have achieved significant progress, they are restricted to object and scene understanding and struggle to understand 3D spatial structures at the part level. In this paper, we introduce Kestrel, representing a novel approach that empowers 3D MLLMs with part-aware understanding, enabling better interpretation and segmentation grounding of 3D objects at the part level. Despite its sig… ▽ More

    Submitted 29 May, 2024; originally announced May 2024.

  16. arXiv:2405.18146  [pdf, other

    cs.IR cs.LG

    Unified Low-rank Compression Framework for Click-through Rate Prediction

    Authors: Hao Yu, Minghao Fu, Jiandong Ding, Yusheng Zhou, Jianxin Wu

    Abstract: Deep Click-Through Rate (CTR) prediction models play an important role in modern industrial recommendation scenarios. However, high memory overhead and computational costs limit their deployment in resource-constrained environments. Low-rank approximation is an effective method for computer vision and natural language processing models, but its application in compressing CTR prediction models has… ▽ More

    Submitted 11 June, 2024; v1 submitted 28 May, 2024; originally announced May 2024.

    Comments: Accepted by KDD2024 Applied Data Science (ADS) Track

  17. arXiv:2405.16663  [pdf, ps, other

    cs.DS cs.LG stat.ML

    Private Edge Density Estimation for Random Graphs: Optimal, Efficient and Robust

    Authors: Hongjie Chen, Jingqiu Ding, Yiding Hua, David Steurer

    Abstract: We give the first polynomial-time, differentially node-private, and robust algorithm for estimating the edge density of Erdős-Rényi random graphs and their generalization, inhomogeneous random graphs. We further prove information-theoretical lower bounds, showing that the error rate of our algorithm is optimal up to logarithmic factors. Previous algorithms incur either exponential running time or… ▽ More

    Submitted 3 June, 2024; v1 submitted 26 May, 2024; originally announced May 2024.

    Comments: fix minor typos; add missing references

  18. arXiv:2405.13403  [pdf, other

    eess.IV cs.MM

    Adaptive Wireless Image Semantic Transmission and Over-The-Air Testing

    Authors: Jiarun Ding, Peiwen Jiang, Chao-Kai Wen, Shi Jin

    Abstract: Semantic communication has undergone considerable evolution due to the recent rapid development of artificial intelligence (AI), significantly enhancing both communication robustness and efficiency. Despite these advancements, most current semantic communication methods for image transmission pay little attention to the differing importance of objects and backgrounds in images. To address this iss… ▽ More

    Submitted 22 May, 2024; originally announced May 2024.

  19. arXiv:2405.13058  [pdf, other

    cs.SE cs.AI cs.CY cs.LG

    The AI Community Building the Future? A Quantitative Analysis of Development Activity on Hugging Face Hub

    Authors: Cailean Osborne, Jennifer Ding, Hannah Rose Kirk

    Abstract: Open model developers have emerged as key actors in the political economy of artificial intelligence (AI), but we still have a limited understanding of collaborative practices in the open AI ecosystem. This paper responds to this gap with a three-part quantitative analysis of development activity on the Hugging Face (HF) Hub, a popular platform for building, sharing, and demonstrating models. Firs… ▽ More

    Submitted 5 June, 2024; v1 submitted 20 May, 2024; originally announced May 2024.

    Comments: 27 pages, 5 figures, 9 tables

    ACM Class: K.4.1

  20. arXiv:2405.12107  [pdf, other

    cs.CV cs.CL

    Imp: Highly Capable Large Multimodal Models for Mobile Devices

    Authors: Zhenwei Shao, Zhou Yu, Jun Yu, Xuecheng Ouyang, Lihao Zheng, Zhenbiao Gai, Mingyang Wang, Jiajun Ding

    Abstract: By harnessing the capabilities of large language models (LLMs), recent large multimodal models (LMMs) have shown remarkable versatility in open-world multimodal understanding. Nevertheless, they are usually parameter-heavy and computation-intensive, thus hindering their applicability in resource-constrained scenarios. To this end, several lightweight LMMs have been proposed successively to maximiz… ▽ More

    Submitted 29 May, 2024; v1 submitted 20 May, 2024; originally announced May 2024.

    Comments: fix some typos and correct a few number in the tables

  21. arXiv:2405.11353  [pdf, other

    cs.CR cs.AR

    NTTSuite: Number Theoretic Transform Benchmarks for Accelerating Encrypted Computation

    Authors: Juran Ding, Yuanzhe Liu, Lingbin Sun, Brandon Reagen

    Abstract: Privacy concerns have thrust privacy-preserving computation into the spotlight. Homomorphic encryption (HE) is a cryptographic system that enables computation to occur directly on encrypted data, providing users with strong privacy (and security) guarantees while using the same services they enjoy today unprotected. While promising, HE has seen little adoption due to extremely high computational o… ▽ More

    Submitted 18 May, 2024; originally announced May 2024.

    Comments: 8 pages, 5 figures, and two tables. To download the source code, see https://github.com/Dragon201701/NTTSuite

  22. arXiv:2405.11155  [pdf, other

    eess.SY cs.CC

    Inner-approximate Reachability Computation via Zonotopic Boundary Analysis

    Authors: Dejin Ren, Zhen Liang, Chenyu Wu, Jianqiang Ding, Taoran Wu, Bai Xue

    Abstract: Inner-approximate reachability analysis involves calculating subsets of reachable sets, known as inner-approximations. This analysis is crucial in the fields of dynamic systems analysis and control theory as it provides a reliable estimation of the set of states that a system can reach from given initial states at a specific time instant. In this paper, we study the inner-approximate reachability… ▽ More

    Submitted 21 May, 2024; v1 submitted 17 May, 2024; originally announced May 2024.

    Comments: the extended version of the paper accepted by CAV 2024

  23. arXiv:2405.10255  [pdf, other

    cs.CV cs.RO

    When LLMs step into the 3D World: A Survey and Meta-Analysis of 3D Tasks via Multi-modal Large Language Models

    Authors: Xianzheng Ma, Yash Bhalgat, Brandon Smart, Shuai Chen, Xinghui Li, Jian Ding, Jindong Gu, Dave Zhenyu Chen, Songyou Peng, Jia-Wang Bian, Philip H Torr, Marc Pollefeys, Matthias Nießner, Ian D Reid, Angel X. Chang, Iro Laina, Victor Adrian Prisacariu

    Abstract: As large language models (LLMs) evolve, their integration with 3D spatial data (3D-LLMs) has seen rapid progress, offering unprecedented capabilities for understanding and interacting with physical spaces. This survey provides a comprehensive overview of the methodologies enabling LLMs to process, understand, and generate 3D data. Highlighting the unique advantages of LLMs, such as in-context lear… ▽ More

    Submitted 16 May, 2024; originally announced May 2024.

  24. arXiv:2405.10121  [pdf, other

    cs.CL cs.MM

    Distilling Implicit Multimodal Knowledge into LLMs for Zero-Resource Dialogue Generation

    Authors: Bo Zhang, Hui Ma, Jian Ding, Jian Wang, Bo Xu, Hongfei Lin

    Abstract: Integrating multimodal knowledge into large language models (LLMs) represents a significant advancement in dialogue generation capabilities. However, the effective incorporation of such knowledge in zero-resource scenarios remains a substantial challenge due to the scarcity of diverse, high-quality dialogue datasets. To address this, we propose the Visual Implicit Knowledge Distillation Framework… ▽ More

    Submitted 16 May, 2024; originally announced May 2024.

    Comments: Under Review

  25. arXiv:2405.08235  [pdf, other

    stat.ML cs.LG

    Additive-Effect Assisted Learning

    Authors: Jiawei Zhang, Yuhong Yang, Jie Ding

    Abstract: It is quite popular nowadays for researchers and data analysts holding different datasets to seek assistance from each other to enhance their modeling performance. We consider a scenario where different learners hold datasets with potentially distinct variables, and their observations can be aligned by a nonprivate identifier. Their collaboration faces the following difficulties: First, learners m… ▽ More

    Submitted 13 May, 2024; originally announced May 2024.

  26. arXiv:2405.07488  [pdf, other

    cs.LG cs.RO cs.SC

    Predictive Modeling of Flexible EHD Pumps using Kolmogorov-Arnold Networks

    Authors: Yanhong Peng, Miao He, Fangchao Hu, Zebing Mao, Xia Huang, Jun Ding

    Abstract: We present a novel approach to predicting the pressure and flow rate of flexible electrohydrodynamic pumps using the Kolmogorov-Arnold Network. Inspired by the Kolmogorov-Arnold representation theorem, KAN replaces fixed activation functions with learnable spline-based activation functions, enabling it to approximate complex nonlinear functions more effectively than traditional models like Multi-L… ▽ More

    Submitted 13 May, 2024; originally announced May 2024.

  27. arXiv:2404.19563  [pdf, other

    cs.CL

    RepEval: Effective Text Evaluation with LLM Representation

    Authors: Shuqian Sheng, Yi Xu, Tianhang Zhang, Zanwei Shen, Luoyi Fu, Jiaxin Ding, Lei Zhou, Xinbing Wang, Chenghu Zhou

    Abstract: Automatic evaluation metrics for generated texts play an important role in the NLG field, especially with the rapid growth of LLMs. However, existing metrics are often limited to specific scenarios, making it challenging to meet the evaluation requirements of expanding LLM applications. Therefore, there is a demand for new, flexible, and effective metrics. In this study, we introduce RepEval, the… ▽ More

    Submitted 30 April, 2024; originally announced April 2024.

  28. arXiv:2404.17456  [pdf, other

    cs.NE

    Converting High-Performance and Low-Latency SNNs through Explicit Modelling of Residual Error in ANNs

    Authors: Zhipeng Huang, Jianhao Ding, Zhiyu Pan, Haoran Li, Ying Fang, Zhaofei Yu, Jian K. Liu

    Abstract: Spiking neural networks (SNNs) have garnered interest due to their energy efficiency and superior effectiveness on neuromorphic chips compared with traditional artificial neural networks (ANNs). One of the mainstream approaches to implementing deep SNNs is the ANN-SNN conversion, which integrates the efficient training strategy of ANNs with the energy-saving potential and fast inference capability… ▽ More

    Submitted 26 April, 2024; originally announced April 2024.

  29. arXiv:2404.13844  [pdf, other

    cs.LG cs.AI

    ColA: Collaborative Adaptation with Gradient Learning

    Authors: Enmao Diao, Qi Le, Suya Wu, Xinran Wang, Ali Anwar, Jie Ding, Vahid Tarokh

    Abstract: A primary function of back-propagation is to compute both the gradient of hidden representations and parameters for optimization with gradient descent. Training large models requires high computational costs due to their vast parameter sizes. While Parameter-Efficient Fine-Tuning (PEFT) methods aim to train smaller auxiliary models to save computational space, they still present computational over… ▽ More

    Submitted 21 April, 2024; originally announced April 2024.

  30. arXiv:2404.10528  [pdf, other

    cs.MM

    AllTheDocks road safety dataset: A cyclist's perspective and experience

    Authors: Chia-Yen Chiang, Ruikang Zhong, Jennifer Ding, Joseph Wood, Stephen Bee, Mona Jaber

    Abstract: Active travel is an essential component in intelligent transportation systems. Cycling, as a form of active travel, shares the road space with motorised traffic which often affects the cyclists' safety and comfort and therefore peoples' propensity to uptake cycling instead of driving. This paper presents a unique dataset, collected by cyclists across London, that includes video footage, accelerome… ▽ More

    Submitted 16 April, 2024; originally announced April 2024.

  31. arXiv:2404.10318  [pdf, other

    cs.CV

    SRGS: Super-Resolution 3D Gaussian Splatting

    Authors: Xiang Feng, Yongbo He, Yubo Wang, Yan Yang, Wen Li, Yifei Chen, Zhenzhong Kuang, Jiajun ding, Jianping Fan, Yu Jun

    Abstract: Recently, 3D Gaussian Splatting (3DGS) has gained popularity as a novel explicit 3D representation. This approach relies on the representation power of Gaussian primitives to provide a high-quality rendering. However, primitives optimized at low resolution inevitably exhibit sparsity and texture deficiency, posing a challenge for achieving high-resolution novel view synthesis (HRNVS). To address t… ▽ More

    Submitted 18 June, 2024; v1 submitted 16 April, 2024; originally announced April 2024.

    Comments: The first to focus on the HRNVS of 3DGS

  32. arXiv:2404.07493  [pdf, other

    cs.LG cs.AI

    Characterizing the Influence of Topology on Graph Learning Tasks

    Authors: Kailong Wu, Yule Xie, Jiaxin Ding, Yuxiang Ren, Luoyi Fu, Xinbing Wang, Chenghu Zhou

    Abstract: Graph neural networks (GNN) have achieved remarkable success in a wide range of tasks by encoding features combined with topology to create effective representations. However, the fundamental problem of understanding and analyzing how graph topology influences the performance of learning models on downstream tasks has not yet been well understood. In this paper, we propose a metric, TopoInf, which… ▽ More

    Submitted 11 April, 2024; originally announced April 2024.

  33. Deep Reinforcement Learning Based Toolpath Generation for Thermal Uniformity in Laser Powder Bed Fusion Process

    Authors: Mian Qin, Junhao Ding, Shuo Qu, Xu Song, Charlie C. L. Wang, Wei-Hsin Liao

    Abstract: Laser powder bed fusion (LPBF) is a widely used metal additive manufacturing technology. However, the accumulation of internal residual stress during printing can cause significant distortion and potential failure. Although various scan patterns have been studied to reduce possible accumulated stress, such as zigzag scanning vectors with changing directions or a chessboard-based scan pattern with… ▽ More

    Submitted 16 February, 2024; originally announced April 2024.

    Journal ref: Additive Manufacturing, vol.79, 103937 (12 pages), January 2024

  34. arXiv:2404.03413  [pdf, other

    cs.CV

    MiniGPT4-Video: Advancing Multimodal LLMs for Video Understanding with Interleaved Visual-Textual Tokens

    Authors: Kirolos Ataallah, Xiaoqian Shen, Eslam Abdelrahman, Essam Sleiman, Deyao Zhu, Jian Ding, Mohamed Elhoseiny

    Abstract: This paper introduces MiniGPT4-Video, a multimodal Large Language Model (LLM) designed specifically for video understanding. The model is capable of processing both temporal visual and textual data, making it adept at understanding the complexities of videos. Building upon the success of MiniGPT-v2, which excelled in translating visual features into the LLM space for single images and achieved imp… ▽ More

    Submitted 4 April, 2024; originally announced April 2024.

    Comments: 6 pages,8 figures

  35. arXiv:2403.20025  [pdf, ps, other

    cs.IT eess.SP

    Secure Full-Duplex Communication via Movable Antennas

    Authors: Jingze Ding, Zijian Zhou, Chenbo Wang, Wenyao Li, Lifeng Lin, Bingli Jiao

    Abstract: This paper investigates physical layer security (PLS) for a movable antenna (MA)-assisted full-duplex (FD) system. In this system, an FD base station (BS) with multiple MAs for transmission and reception provides services for an uplink (UL) user and a downlink (DL) user. Each user operates in half-duplex (HD) mode and is equipped with a single fixed-position antenna (FPA), in the presence of a sin… ▽ More

    Submitted 29 March, 2024; originally announced March 2024.

    Comments: This paper has been submitted for possible publication

  36. arXiv:2403.18774  [pdf, other

    cs.CV cs.CR cs.LG

    RAW: A Robust and Agile Plug-and-Play Watermark Framework for AI-Generated Images with Provable Guarantees

    Authors: Xun Xian, Ganghua Wang, Xuan Bi, Jayanth Srinivasa, Ashish Kundu, Mingyi Hong, Jie Ding

    Abstract: Safeguarding intellectual property and preventing potential misuse of AI-generated images are of paramount importance. This paper introduces a robust and agile plug-and-play watermark detection framework, dubbed as RAW. As a departure from traditional encoder-decoder methods, which incorporate fixed binary codes as watermarks within latent representations, our approach introduces learnable waterma… ▽ More

    Submitted 23 January, 2024; originally announced March 2024.

  37. arXiv:2403.14275  [pdf, other

    cs.CL

    Is Reference Necessary in the Evaluation of NLG Systems? When and Where?

    Authors: Shuqian Sheng, Yi Xu, Luoyi Fu, Jiaxin Ding, Lei Zhou, Xinbing Wang, Chenghu Zhou

    Abstract: The majority of automatic metrics for evaluating NLG systems are reference-based. However, the challenge of collecting human annotation results in a lack of reliable references in numerous application scenarios. Despite recent advancements in reference-free metrics, it has not been well understood when and where they can be used as an alternative to reference-based metrics. In this study, by emplo… ▽ More

    Submitted 21 March, 2024; originally announced March 2024.

  38. arXiv:2403.13313  [pdf, other

    cs.AI cs.CL

    Polaris: A Safety-focused LLM Constellation Architecture for Healthcare

    Authors: Subhabrata Mukherjee, Paul Gamble, Markel Sanz Ausin, Neel Kant, Kriti Aggarwal, Neha Manjunath, Debajyoti Datta, Zhengliang Liu, Jiayuan Ding, Sophia Busacca, Cezanne Bianco, Swapnil Sharma, Rae Lasko, Michelle Voisard, Sanchay Harneja, Darya Filippova, Gerry Meixiong, Kevin Cha, Amir Youssefi, Meyhaa Buvanesh, Howard Weingram, Sebastian Bierman-Lytle, Harpreet Singh Mangat, Kim Parikh, Saad Godil , et al. (1 additional authors not shown)

    Abstract: We develop Polaris, the first safety-focused LLM constellation for real-time patient-AI healthcare conversations. Unlike prior LLM works in healthcare focusing on tasks like question answering, our work specifically focuses on long multi-turn voice conversations. Our one-trillion parameter constellation system is composed of several multibillion parameter LLMs as co-operative agents: a stateful pr… ▽ More

    Submitted 20 March, 2024; originally announced March 2024.

  39. arXiv:2403.12213  [pdf, ps, other

    cs.DS cs.CC cs.LG stat.ML

    Private graphon estimation via sum-of-squares

    Authors: Hongjie Chen, Jingqiu Ding, Tommaso d'Orsi, Yiding Hua, Chih-Hung Liu, David Steurer

    Abstract: We develop the first pure node-differentially-private algorithms for learning stochastic block models and for graphon estimation with polynomial running time for any constant number of blocks. The statistical utility guarantees match those of the previous best information-theoretic (exponential-time) node-private mechanisms for these problems. The algorithm is based on an exponential mechanism for… ▽ More

    Submitted 18 April, 2024; v1 submitted 18 March, 2024; originally announced March 2024.

    Comments: 71 pages, accepted to STOC 2024

  40. arXiv:2403.11397  [pdf, other

    cs.CV eess.IV

    Defense Against Adversarial Attacks on No-Reference Image Quality Models with Gradient Norm Regularization

    Authors: Yujia Liu, Chenxi Yang, Dingquan Li, Jianhao Ding, Tingting Jiang

    Abstract: The task of No-Reference Image Quality Assessment (NR-IQA) is to estimate the quality score of an input image without additional information. NR-IQA models play a crucial role in the media industry, aiding in performance evaluation and optimization guidance. However, these models are found to be vulnerable to adversarial attacks, which introduce imperceptible perturbations to input images, resulti… ▽ More

    Submitted 17 March, 2024; originally announced March 2024.

    Comments: accepted by CVPR 2024

  41. Rumor Mitigation in Social Media Platforms with Deep Reinforcement Learning

    Authors: Hongyuan Su, Yu Zheng, Jingtao Ding, Depeng Jin, Yong Li

    Abstract: Social media platforms have become one of the main channels where people disseminate and acquire information, of which the reliability is severely threatened by rumors widespread in the network. Existing approaches such as suspending users or broadcasting real information to combat rumors are either with high cost or disturbing users. In this paper, we introduce a novel rumor mitigation paradigm,… ▽ More

    Submitted 14 March, 2024; originally announced March 2024.

    Comments: WWW24 short

    MSC Class: 68T09

  42. MetroGNN: Metro Network Expansion with Reinforcement Learning

    Authors: Hongyuan Su, Yu Zheng, Jingtao Ding, Depeng Jin, Yong Li

    Abstract: Selecting urban regions for metro network expansion to meet maximal transportation demands is crucial for urban development, while computationally challenging to solve. The expansion process relies not only on complicated features like urban demographics and origin-destination (OD) flow but is also constrained by the existing metro network and urban geography. In this paper, we introduce a reinfor… ▽ More

    Submitted 14 March, 2024; originally announced March 2024.

    Comments: WWW24 short

    MSC Class: 68T09

  43. arXiv:2403.04785  [pdf, other

    cs.CL cs.AI

    Large Language Multimodal Models for 5-Year Chronic Disease Cohort Prediction Using EHR Data

    Authors: Jun-En Ding, Phan Nguyen Minh Thao, Wen-Chih Peng, Jian-Zhe Wang, Chun-Cheng Chug, Min-Chen Hsieh, Yun-Chien Tseng, Ling Chen, Dongsheng Luo, Chi-Te Wang, Pei-fu Chen, Feng Liu, Fang-Ming Hung

    Abstract: Chronic diseases such as diabetes are the leading causes of morbidity and mortality worldwide. Numerous research studies have been attempted with various deep learning models in diagnosis. However, most previous studies had certain limitations, including using publicly available datasets (e.g. MIMIC), and imbalanced data. In this study, we collected five-year electronic health records (EHRs) from… ▽ More

    Submitted 2 March, 2024; originally announced March 2024.

  44. arXiv:2403.02576  [pdf, other

    cs.DL cs.LG cs.SI

    AceMap: Knowledge Discovery through Academic Graph

    Authors: Xinbing Wang, Luoyi Fu, Xiaoying Gan, Ying Wen, Guanjie Zheng, Jiaxin Ding, Liyao Xiang, Nanyang Ye, Meng Jin, Shiyu Liang, Bin Lu, Haiwen Wang, Yi Xu, Cheng Deng, Shao Zhang, Huquan Kang, Xingli Wang, Qi Li, Zhixin Guo, Jiexing Qi, Pan Liu, Yuyang Ren, Lyuwen Wu, Jungang Yang, Jianping Zhou , et al. (1 additional authors not shown)

    Abstract: The exponential growth of scientific literature requires effective management and extraction of valuable insights. While existing scientific search engines excel at delivering search results based on relational databases, they often neglect the analysis of collaborations between scientific entities and the evolution of ideas, as well as the in-depth analysis of content within scientific publicatio… ▽ More

    Submitted 14 April, 2024; v1 submitted 4 March, 2024; originally announced March 2024.

    Comments: Technical Report for AceMap (https://www.acemap.info)

  45. arXiv:2403.01513  [pdf

    eess.IV cs.CV

    CDSE-UNet: Enhancing COVID-19 CT Image Segmentation with Canny Edge Detection and Dual-Path SENet Feature Fusion

    Authors: Jiao Ding, Jie Chang, Renrui Han, Li Yang

    Abstract: Accurate segmentation of COVID-19 CT images is crucial for reducing the severity and mortality rates associated with COVID-19 infections. In response to blurred boundaries and high variability characteristic of lesion areas in COVID-19 CT images, we introduce CDSE-UNet: a novel UNet-based segmentation model that integrates Canny operator edge detection and a dual-path SENet feature fusion mechanis… ▽ More

    Submitted 3 March, 2024; originally announced March 2024.

  46. arXiv:2403.00033  [pdf, other

    q-bio.NC cs.LG eess.SP

    Identification of Craving Maps among Marijuana Users via the Analysis of Functional Brain Networks with High-Order Attention Graph Neural Networks

    Authors: Jun-En Ding, Shihao Yang, Anna Zilverstand, Feng Liu

    Abstract: The excessive consumption of marijuana can induce substantial psychological and social consequences. In this investigation, we propose an elucidative framework termed high-order graph attention neural networks (HOGANN) for the classification of Marijuana addiction, coupled with an analysis of localized brain network communities exhibiting abnormal activities among chronic marijuana users. HOGANN i… ▽ More

    Submitted 26 March, 2024; v1 submitted 28 February, 2024; originally announced March 2024.

  47. arXiv:2402.16887  [pdf, other

    cs.SI cs.AI cs.LG physics.soc-ph

    Artificial Intelligence for Complex Network: Potential, Methodology and Application

    Authors: Jingtao Ding, Chang Liu, Yu Zheng, Yunke Zhang, Zihan Yu, Ruikun Li, Hongyi Chen, Jinghua Piao, Huandong Wang, Jiazhen Liu, Yong Li

    Abstract: Complex networks pervade various real-world systems, from the natural environment to human societies. The essence of these networks is in their ability to transition and evolve from microscopic disorder-where network topology and node dynamics intertwine-to a macroscopic order characterized by certain collective behaviors. Over the past two decades, complex network science has significantly enhanc… ▽ More

    Submitted 23 February, 2024; originally announced February 2024.

    Comments: 51 pages, 4 figures, 10 tables

  48. arXiv:2402.16479  [pdf, other

    cs.CV

    Edge Detectors Can Make Deep Convolutional Neural Networks More Robust

    Authors: Jin Ding, Jie-Chao Zhao, Yong-Zhi Sun, Ping Tan, Jia-Wei Wang, Ji-En Ma, You-Tong Fang

    Abstract: Deep convolutional neural networks (DCNN for short) are vulnerable to examples with small perturbations. Improving DCNN's robustness is of great significance to the safety-critical applications, such as autonomous driving and industry automation. Inspired by the principal way that human eyes recognize objects, i.e., largely relying on the shape features, this paper first employs the edge detectors… ▽ More

    Submitted 26 February, 2024; originally announced February 2024.

    Comments: 26 pages, 18 figures, 7 tables. submitted to Neural Networks, under review

  49. arXiv:2402.14103  [pdf, ps, other

    cs.LG cs.CC math.ST stat.ML

    Computational-Statistical Gaps for Improper Learning in Sparse Linear Regression

    Authors: Rares-Darius Buhai, Jingqiu Ding, Stefan Tiegel

    Abstract: We study computational-statistical gaps for improper learning in sparse linear regression. More specifically, given $n$ samples from a $k$-sparse linear model in dimension $d$, we ask what is the minimum sample complexity to efficiently (in time polynomial in $d$, $k$, and $n$) find a potentially dense estimate for the regression vector that achieves non-trivial prediction error on the $n$ samples… ▽ More

    Submitted 25 June, 2024; v1 submitted 21 February, 2024; originally announced February 2024.

    Comments: 24 pages; updated typos, some explanations, and references

  50. arXiv:2402.11922  [pdf, other

    cs.LG

    Spatio-Temporal Few-Shot Learning via Diffusive Neural Network Generation

    Authors: Yuan Yuan, Chenyang Shao, Jingtao Ding, Depeng Jin, Yong Li

    Abstract: Spatio-temporal modeling is foundational for smart city applications, yet it is often hindered by data scarcity in many cities and regions. To bridge this gap, we propose a novel generative pre-training framework, GPD, for spatio-temporal few-shot learning with urban knowledge transfer. Unlike conventional approaches that heavily rely on common feature extraction or intricate few-shot learning des… ▽ More

    Submitted 25 March, 2024; v1 submitted 19 February, 2024; originally announced February 2024.