Showing 1–50 of 945 results for author: Du, J

  1. arXiv:2407.11380  [pdf, other]

    cs.CV cs.LG

    NAMER: Non-Autoregressive Modeling for Handwritten Mathematical Expression Recognition

    Authors: Chenyu Liu, Jia Pan, Jinshui Hu, Baocai Yin, Bing Yin, Mingjun Chen, Cong Liu, Jun Du, Qingfeng Liu

    Abstract: Recently, Handwritten Mathematical Expression Recognition (HMER) has gained considerable attention in pattern recognition for its diverse applications in document understanding. Current methods typically approach HMER as an image-to-sequence generation task within an autoregressive (AR) encoder-decoder framework. However, these approaches suffer from several drawbacks: 1) a lack of overall languag…

    Submitted 16 July, 2024; originally announced July 2024.

    Comments: Accepted by ECCV 2024

  2. arXiv:2407.08154  [pdf, other]

    cs.CE

    Bayesian uncertainty analysis for underwater 3D reconstruction with neural radiance fields

    Authors: Haojie Lian, Xinhao Li, Yilin Qu, Jing Du, Zhuxuan Meng, Jie Liu, Leilei Chen

    Abstract: Neural radiance fields (NeRFs) are a deep learning technique that can generate novel views of 3D scenes using sparse 2D images from different viewing directions and camera poses. As an extension of conventional NeRFs in underwater environment, where light can get absorbed and scattered by water, SeaThru-NeRF was proposed to separate the clean appearance and geometric structure of underwater scene…

    Submitted 10 July, 2024; originally announced July 2024.

  3. arXiv:2407.07453  [pdf, other]

    physics.optics eess.SP

    Waveguide Superlattices with Artificial Gauge Field Towards Colorless and Crosstalkless Ultrahigh-Density Photonic Integration

    Authors: Xuelin Zhang, Jiangbing Du, Ke Xu, Zuyuan He

    Abstract: Dense waveguide arrays with low crosstalk and ultra-broadband remain a vital issue for chip-scale integrated photonics. However, the sub-wavelength regime of such devices has not been adequately explored in practice. Herein, we propose the advanced waveguide superlattices leveraging the artificial gauge field mechanism. This approach achieves remarkable -24 dB crosstalk suppression with an ultra-b…

    Submitted 10 July, 2024; originally announced July 2024.

  4. arXiv:2407.06519  [pdf, other]

    eess.IV

    F2PAD: A General Optimization Framework for Feature-Level to Pixel-Level Anomaly Detection

    Authors: Chengyu Tao, Hao Xu, Juan Du

    Abstract: Image-based inspection systems have been widely deployed in manufacturing production lines. Due to the scarcity of defective samples, unsupervised anomaly detection that only leverages normal samples during training to detect various defects is popular. Existing feature-based methods, utilizing deep features from pretrained neural networks, show their impressive performance in anomaly localization…

    Submitted 8 July, 2024; originally announced July 2024.

  5. arXiv:2407.05487  [pdf, other]

    cs.IT cs.LG eess.IV eess.SP

    Multi-level Reliability Interface for Semantic Communications over Wireless Networks

    Authors: Tze-Yang Tung, Homa Esfahanizadeh, Jinfeng Du, Harish Viswanathan

    Abstract: Semantic communication, when examined through the lens of joint source-channel coding (JSCC), maps source messages directly into channel input symbols, where the measure of success is defined by end-to-end distortion rather than traditional metrics such as block error rate. Previous studies have shown significant improvements achieved through deep learning (DL)-driven JSCC compared to traditional…

    Submitted 7 July, 2024; originally announced July 2024.

  6. arXiv:2407.04104  [pdf, other]

    stat.ME cs.LG stat.ML

    Network-based Neighborhood regression

    Authors: Yaoming Zhen, Jin-Hong Du

    Abstract: Given the ubiquity of modularity in biological systems, module-level regulation analysis is vital for understanding biological systems across various levels and their dynamics. Current statistical analysis on biological modules predominantly focuses on either detecting the functional modules in biological networks or sub-group regression on the biological features without using the network data. T…

    Submitted 4 July, 2024; originally announced July 2024.

  7. arXiv:2407.04053  [pdf, other]

    cs.DC

    Edge AI: A Taxonomy, Systematic Review and Future Directions

    Authors: Sukhpal Singh Gill, Muhammed Golec, Jianmin Hu, Minxian Xu, Junhui Du, Huaming Wu, Guneet Kaur Walia, Subramaniam Subramanian Murugesan, Babar Ali, Mohit Kumar, Kejiang Ye, Prabal Verma, Surendra Kumar, Felix Cuadrado, Steve Uhlig

    Abstract: Edge Artificial Intelligence (AI) incorporates a network of interconnected systems and devices that receive, cache, process, and analyse data in close communication with the location where the data is captured with AI technology. Recent advancements in AI efficiency, the widespread use of Internet of Things (IoT) devices, and the emergence of edge computing have unlocked the enormous scope of Edge…

    Submitted 4 July, 2024; originally announced July 2024.

    Comments: Preprint Version, 18 Figures

  8. arXiv:2407.03676  [pdf]

    physics.app-ph

    Out-of-Plane Polarization from Spin Reflection Induces Field-Free Spin-Orbit Torque Switching in Structures with Canted NiO Interfacial Moments

    Authors: Zhe Zhang, Zhuoyi Li, Yuzhe Chen, Fangyuan Zhu, Yu Yan, Yao Li, Liang He, Jun Du, Rong Zhang, Jing Wu, Xianyang Lu, Yongbing Xu

    Abstract: Realizing deterministic current-induced spin-orbit torque (SOT) magnetization switching, especially in systems exhibiting perpendicular magnetic anisotropy (PMA), typically requires the application of a collinear in-plane field, posing a challenging problem. In this study, we successfully achieve field-free SOT switching in the CoFeB/MgO system. In a Ta/CoFeB/MgO/NiO/Ta structure, spin reflection…

    Submitted 4 July, 2024; originally announced July 2024.

  9. arXiv:2407.01544  [pdf, other]

    cs.NI cs.AI

    Decentralized Multi-Party Multi-Network AI for Global Deployment of 6G Wireless Systems

    Authors: Merim Dzaferagic, Marco Ruffini, Nina Slamnik-Krijestorac, Joao F. Santos, Johann Marquez-Barja, Christos Tranoris, Spyros Denazis, Thomas Kyriakakis, Panagiotis Karafotis, Luiz DaSilva, Shashi Raj Pandey, Junya Shiraishi, Petar Popovski, Soren Kejser Jensen, Christian Thomsen, Torben Bach Pedersen, Holger Claussen, Jinfeng Du, Gil Zussman, Tingjun Chen, Yiran Chen, Seshu Tirupathi, Ivan Seskar, Daniel Kilper

    Abstract: Multiple visions of 6G networks elicit Artificial Intelligence (AI) as a central, native element. When 6G systems are deployed at a large scale, end-to-end AI-based solutions will necessarily have to encompass both the radio and the fiber-optical domain. This paper introduces the Decentralized Multi-Party, Multi-Network AI (DMMAI) framework for integrating AI into 6G networks deployed at scale. DM…

    Submitted 15 April, 2024; originally announced July 2024.

  10. arXiv:2407.00647  [pdf, other]

    cond-mat.mes-hall quant-ph

    Critical fluctuation and noise spectra in two-dimensional Fe$_{3}$GeTe$_{2}$ magnets

    Authors: Yuxin Li, Zhe Ding, Chen Wang, Haoyu Sun, Zhousheng Chen, Pengfei Wang, Ya Wang, Ming Gong, Hualing Zeng, Fazhan Shi, Jiangfeng Du

    Abstract: Critical fluctuations play a fundamental role in determining the spin orders for low-dimensional quantum materials, especially for recently discovered two-dimensional (2D) magnets. Here we employ the quantum decoherence imaging technique utilizing nitrogen-vacancy centers in diamond to explore the critical magnetic fluctuations and the associated temporal spin noise in van der Waals magnet…

    Submitted 30 June, 2024; originally announced July 2024.

  11. arXiv:2407.00119  [pdf, other]

    cs.LG cs.AI cs.CL

    Efficient Long-distance Latent Relation-aware Graph Neural Network for Multi-modal Emotion Recognition in Conversations

    Authors: Yuntao Shou, Wei Ai, Jiayi Du, Tao Meng, Haiyan Liu

    Abstract: The task of multi-modal emotion recognition in conversation (MERC) aims to analyze the genuine emotional state of each utterance based on the multi-modal information in the conversation, which is crucial for conversation understanding. Existing methods focus on using graph neural networks (GNN) to model conversational relationships and capture contextual latent semantic relationships. However, due…

    Submitted 27 June, 2024; originally announced July 2024.

    Comments: 11 pages, 3 tables

  12. arXiv:2406.18984  [pdf, other]

    cs.IR

    Amplify Graph Learning for Recommendation via Sparsity Completion

    Authors: Peng Yuan, Haojie Li, Minying Fang, Xu Yu, Yongjing Hao, Junwei Du

    Abstract: Graph learning models have been widely deployed in collaborative filtering (CF) based recommendation systems. Due to the issue of data sparsity, the graph structure of the original input lacks potential positive preference edges, which significantly reduces the performance of recommendations. In this paper, we study how to enhance the graph structure for CF more effectively, thereby optimizing the…

    Submitted 1 July, 2024; v1 submitted 27 June, 2024; originally announced June 2024.

  13. arXiv:2406.16253  [pdf, other]

    cs.CL

    LLMs Assist NLP Researchers: Critique Paper (Meta-)Reviewing

    Authors: Jiangshu Du, Yibo Wang, Wenting Zhao, Zhongfen Deng, Shuaiqi Liu, Renze Lou, Henry Peng Zou, Pranav Narayanan Venkit, Nan Zhang, Mukund Srinath, Haoran Ranran Zhang, Vipul Gupta, Yinghui Li, Tao Li, Fei Wang, Qin Liu, Tianlin Liu, Pengzhi Gao, Congying Xia, Chen Xing, Jiayang Cheng, Zhaowei Wang, Ying Su, Raj Sanjay Shah, Ruohao Guo , et al. (15 additional authors not shown)

    Abstract: This work is motivated by two key trends. On one hand, large language models (LLMs) have shown remarkable versatility in various generative tasks such as writing, drawing, and question answering, significantly reducing the time required for many routine tasks. On the other hand, researchers, whose work is not only time-consuming but also highly expertise-demanding, face increasing challenges as th…

    Submitted 25 June, 2024; v1 submitted 23 June, 2024; originally announced June 2024.

  14. arXiv:2406.16203  [pdf, other]

    cs.CL

    LLMs' Classification Performance is Overclaimed

    Authors: Hanzi Xu, Renze Lou, Jiangshu Du, Vahid Mahzoon, Elmira Talebianaraki, Zhuoan Zhou, Elizabeth Garrison, Slobodan Vucetic, Wenpeng Yin

    Abstract: In many classification tasks designed for AI or human to solve, gold labels are typically included within the label space by default, often posed as "which of the following is correct?" This standard setup has traditionally highlighted the strong performance of advanced AI, particularly top-performing Large Language Models (LLMs), in routine classification tasks. However, when the gold label is in…

    Submitted 3 July, 2024; v1 submitted 23 June, 2024; originally announced June 2024.

  15. arXiv:2406.15731  [pdf, other]

    cs.CR cs.AI

    Breaking Secure Aggregation: Label Leakage from Aggregated Gradients in Federated Learning

    Authors: Zhibo Wang, Zhiwei Chang, Jiahui Hu, Xiaoyi Pang, Jiacheng Du, Yongle Chen, Kui Ren

    Abstract: Federated Learning (FL) exhibits privacy vulnerabilities under gradient inversion attacks (GIAs), which can extract private information from individual gradients. To enhance privacy, FL incorporates Secure Aggregation (SA) to prevent the server from obtaining individual gradients, thus effectively resisting GIAs. In this paper, we propose a stealthy label inference attack to bypass SA and recover…

    Submitted 22 June, 2024; originally announced June 2024.

    Comments: 10 pages, conference to IEEE INFOCOM 2024

  16. arXiv:2406.15160  [pdf, other]

    eess.AS eess.SP

    Exploring Audio-Visual Information Fusion for Sound Event Localization and Detection In Low-Resource Realistic Scenarios

    Authors: Ya Jiang, Qing Wang, Jun Du, Maocheng Hu, Pengfei Hu, Zeyan Liu, Shi Cheng, Zhaoxu Nian, Yuxuan Dong, Mingqi Cai, Xin Fang, Chin-Hui Lee

    Abstract: This study presents an audio-visual information fusion approach to sound event localization and detection (SELD) in low-resource scenarios. We aim at utilizing audio and video modality information through cross-modal learning and multi-modal fusion. First, we propose a cross-modal teacher-student learning (TSL) framework to transfer information from an audio-only teacher model, trained on a rich c…

    Submitted 21 June, 2024; originally announced June 2024.

    Comments: accepted by icme2024

  17. arXiv:2406.13348  [pdf, other]

    cs.CR cs.AI cs.CL cs.LG

    Textual Unlearning Gives a False Sense of Unlearning

    Authors: Jiacheng Du, Zhibo Wang, Kui Ren

    Abstract: Language models (LMs) are susceptible to "memorizing" training data, including a large amount of private or copyright-protected content. To safeguard the right to be forgotten (RTBF), machine unlearning has emerged as a promising method for LMs to efficiently "forget" sensitive training content and mitigate knowledge leakage risks. However, despite its good intentions, could the unlearning mechani…

    Submitted 19 June, 2024; originally announced June 2024.

  18. arXiv:2406.12195  [pdf, other]

    quant-ph cs.LG

    Quantum Compiling with Reinforcement Learning on a Superconducting Processor

    Authors: Z. T. Wang, Qiuhao Chen, Yuxuan Du, Z. H. Yang, Xiaoxia Cai, Kaixuan Huang, Jingning Zhang, Kai Xu, Jun Du, Yinan Li, Yuling Jiao, Xingyao Wu, Wu Liu, Xiliang Lu, Huikai Xu, Yirong Jin, Ruixia Wang, Haifeng Yu, S. P. Zhao

    Abstract: To effectively implement quantum algorithms on noisy intermediate-scale quantum (NISQ) processors is a central task in modern quantum technology. NISQ processors feature tens to a few hundreds of noisy qubits with limited coherence times and gate operations with errors, so NISQ algorithms naturally require employing circuits of short lengths via quantum compilation. Here, we develop a reinforcemen…

    Submitted 17 June, 2024; originally announced June 2024.

  19. arXiv:2406.10304  [pdf, other]

    cs.CL

    Enhancing Voice Wake-Up for Dysarthria: Mandarin Dysarthria Speech Corpus Release and Customized System Design

    Authors: Ming Gao, Hang Chen, Jun Du, Xin Xu, Hongxiao Guo, Hui Bu, Jianxing Yang, Ming Li, Chin-Hui Lee

    Abstract: Smart home technology has gained widespread adoption, facilitating effortless control of devices through voice commands. However, individuals with dysarthria, a motor speech disorder, face challenges due to the variability of their speech. This paper addresses the wake-up word spotting (WWS) task for dysarthric individuals, aiming to integrate them into real-world applications. To support this, we…

    Submitted 13 June, 2024; originally announced June 2024.

    Comments: to be published in Interspeech 2024

  20. arXiv:2406.09057  [pdf, ps, other]

    math.QA math.RT

    The $q$-Schur algebras in type $D$, I, fundamental multiplication formulas

    Authors: Jie Du, Yiqiang Li, Zhaozhao Zhao

    Abstract: By embedding the Hecke algebra $\check H_q$ of type $D$ into the Hecke algebra $H_{q,1}$ of type $B$ with unequal parameters $(q,1)$, the $q$-Schur algebras $S^κ_q(n,r)$ of type $D$ is naturally defined as the endomorphism algebra of the tensor space with the $\check H_q$-action restricted from the $H_{q,1}$-action that defines the $(q,1)$-Schur algebra $S^\jmath_{q,1}(n,r)$ of type $B$. We invest…

    Submitted 13 June, 2024; originally announced June 2024.

    Comments: 33 pages. Comments welcome

    MSC Class: 16T20; 17B37; 20C08; 20C33; 20G43

  21. arXiv:2406.08757  [pdf, other]

    cs.CL cs.AI

    SRFUND: A Multi-Granularity Hierarchical Structure Reconstruction Benchmark in Form Understanding

    Authors: Jiefeng Ma, Yan Wang, Chenyu Liu, Jun Du, Yu Hu, Zhenrong Zhang, Pengfei Hu, Qing Wang, Jianshu Zhang

    Abstract: Accurately identifying and organizing textual content is crucial for the automation of document processing in the field of form understanding. Existing datasets, such as FUNSD and XFUND, support entity classification and relationship prediction tasks but are typically limited to local and entity-level annotations. This limitation overlooks the hierarchically structured representation of documents,…

    Submitted 12 June, 2024; originally announced June 2024.

    Comments: NeurIPS 2024 Track on Datasets and Benchmarks under review

  22. arXiv:2406.07625  [pdf, other]

    cond-mat.str-el cond-mat.quant-gas quant-ph

    Emergent Universal Quench Dynamics in Randomly Interacting Spin Models

    Authors: Yuchen Li, Tian-Gang Zhou, Ze Wu, Pai Peng, Shengyu Zhang, Riqiang Fu, Ren Zhang, Wei Zheng, Pengfei Zhang, Hui Zhai, Xinhua Peng, Jiangfeng Du

    Abstract: Universality often emerges in low-energy equilibrium physics of quantum many-body systems, despite their microscopic complexity and variety. Recently, there has been a growing interest in studying far-from-equilibrium dynamics of quantum many-body systems. Such dynamics usually involves highly excited states beyond the traditional low-energy theory description. Whether universal behaviors can also…

    Submitted 11 June, 2024; originally announced June 2024.

    Comments: 10 pages, 4 figures; Supplementary Information 26 pages, 11 figures, 2 tables

  23. arXiv:2406.07256  [pdf, ps, other]

    cs.SD cs.AI eess.AS

    AS-70: A Mandarin stuttered speech dataset for automatic speech recognition and stuttering event detection

    Authors: Rong Gong, Hongfei Xue, Lezhi Wang, Xin Xu, Qisheng Li, Lei Xie, Hui Bu, Shaomei Wu, Jiaming Zhou, Yong Qin, Binbin Zhang, Jun Du, Jia Bin, Ming Li

    Abstract: The rapid advancements in speech technologies over the past two decades have led to human-level performance in tasks like automatic speech recognition (ASR) for fluent speech. However, the efficacy of these models diminishes when applied to atypical speech, such as stuttering. This paper introduces AS-70, the first publicly available Mandarin stuttered speech dataset, which stands out as the large…

    Submitted 11 June, 2024; originally announced June 2024.

    Comments: Accepted by Interspeech 2024

  24. arXiv:2406.07081  [pdf, other]

    cs.CL

    Efficiently Exploring Large Language Models for Document-Level Machine Translation with In-context Learning

    Authors: Menglong Cui, Jiangcun Du, Shaolin Zhu, Deyi Xiong

    Abstract: Large language models (LLMs) exhibit outstanding performance in machine translation via in-context learning. In contrast to sentence-level translation, document-level translation (DOCMT) by LLMs based on in-context learning faces two major challenges: firstly, document translations generated by LLMs are often incoherent; secondly, the length of demonstration for in-context learning is usually limi…

    Submitted 11 June, 2024; originally announced June 2024.

    Comments: Accepted to ACL2024 long paper (Findings)

  25. arXiv:2406.04582  [pdf, other]

    eess.AS cs.SD

    Neural Codec-based Adversarial Sample Detection for Speaker Verification

    Authors: Xuanjun Chen, Jiawei Du, Haibin Wu, Jyh-Shing Roger Jang, Hung-yi Lee

    Abstract: Automatic Speaker Verification (ASV), increasingly used in security-critical applications, faces vulnerabilities from rising adversarial attacks, with few effective defenses available. In this paper, we propose a neural codec-based adversarial sample detection method for ASV. The approach leverages the codec's ability to discard redundant perturbations and retain essential information. Specificall…

    Submitted 6 June, 2024; originally announced June 2024.

    Comments: Accepted by Interspeech 2024

  26. arXiv:2406.02463  [pdf, other]

    cs.CR

    Click Without Compromise: Online Advertising Measurement via Per User Differential Privacy

    Authors: Yingtai Xiao, Jian Du, Shikun Zhang, Qiang Yan, Danfeng Zhang, Daniel Kifer

    Abstract: Online advertising is a cornerstone of the Internet ecosystem, with advertising measurement playing a crucial role in optimizing efficiency. Ad measurement entails attributing desired behaviors, such as purchases, to ad exposures across various platforms, necessitating the collection of user activities across these platforms. As this practice faces increasing restrictions due to rising privacy con…

    Submitted 4 June, 2024; originally announced June 2024.

  27. arXiv:2406.02262  [pdf, other]

    eess.SP

    A DAFT Based Unified Waveform Design Framework for High-Mobility Communications

    Authors: Xingyao Zhang, Haoran Yin, Yanqun Tang, Yu Zhou, Yuqing Liu, Jinming Du, Yipeng Ding

    Abstract: With the increasing demand for multi-carrier communication in high-mobility scenarios, it is urgent to design new multi-carrier communication waveforms that can resist large delay-Doppler spreads. Various multi-carrier waveforms in the transform domain were proposed for the fast time-varying channels, including orthogonal time frequency space (OTFS), orthogonal chirp division multiplexing (OCDM),…

    Submitted 4 June, 2024; originally announced June 2024.

  28. arXiv:2406.02110  [pdf, other]

    cs.CL cs.AI

    UniOQA: A Unified Framework for Knowledge Graph Question Answering with Large Language Models

    Authors: Zhuoyang Li, Liran Deng, Hui Liu, Qiaoqiao Liu, Junzhao Du

    Abstract: OwnThink stands as the most extensive Chinese open-domain knowledge graph introduced in recent times. Despite prior attempts in question answering over OwnThink (OQA), existing studies have faced limitations in model representation capabilities, posing challenges in further enhancing overall accuracy in question answering. In this paper, we introduce UniOQA, a unified framework that integrates two…

    Submitted 4 June, 2024; originally announced June 2024.

    Comments: 10 pages, 5 figures

  29. arXiv:2406.00915  [pdf]

    cond-mat.mtrl-sci physics.chem-ph

    Molecular-Resolution Imaging of Ice Crystallized from Liquid Water

    Authors: Jingshan S. Du, Suvo Banik, Henry Chan, Birk Fritsch, Ying Xia, Andreas Hutzler, Subramanian K. R. S. Sankaranarayanan, James J. De Yoreo

    Abstract: Despite the ubiquity of ice, a molecular-resolution image of ice crystallized from liquid water or the resulting defect structure has never been obtained. Here, we report the stabilization and angstrom-resolution electron imaging of ice Ih crystallized from liquid water. We combine lattice mapping with molecular dynamics simulations to reveal that ice formation is highly tolerant to nanoscale defe…

    Submitted 4 June, 2024; v1 submitted 2 June, 2024; originally announced June 2024.

    Comments: 15 pages, 4 figures, and 1 table. Supplementary Materials: 35 pages, 25 figures, and 9 tables. Updated to resolve the equation typesetting issue

  30. arXiv:2406.00625  [pdf, other]

    cs.CV

    SAM-LAD: Segment Anything Model Meets Zero-Shot Logic Anomaly Detection

    Authors: Yun Peng, Xiao Lin, Nachuan Ma, Jiayuan Du, Chuangwei Liu, Chengju Liu, Qijun Chen

    Abstract: Visual anomaly detection is vital in real-world applications, such as industrial defect detection and medical diagnosis. However, most existing methods focus on local structural anomalies and fail to detect higher-level functional anomalies under logical conditions. Although recent studies have explored logical anomaly detection, they can only address simple anomalies like missing or addition and…

    Submitted 5 June, 2024; v1 submitted 2 June, 2024; originally announced June 2024.

  31. arXiv:2405.17245  [pdf, other]

    cs.DC cs.AI cs.LG cs.NI

    Galaxy: A Resource-Efficient Collaborative Edge AI System for In-situ Transformer Inference

    Authors: Shengyuan Ye, Jiangsu Du, Liekang Zeng, Wenzhong Ou, Xiaowen Chu, Yutong Lu, Xu Chen

    Abstract: Transformer-based models have unlocked a plethora of powerful intelligent applications at the edge, such as voice assistant in smart home. Traditional deployment approaches offload the inference workloads to the remote cloud server, which would induce substantial pressure on the backbone network as well as raise users' privacy concerns. To address that, in-situ inference has been recently recogniz…

    Submitted 27 May, 2024; originally announced May 2024.

    Comments: Accepted by IEEE International Conference on Computer Communications 2024

  32. arXiv:2405.16952  [pdf, other]

    eess.AS

    A Variance-Preserving Interpolation Approach for Diffusion Models with Applications to Single Channel Speech Enhancement and Recognition

    Authors: Zilu Guo, Qing Wang, Jun Du, Jia Pan, Qing-Feng Liu, Chin-Hui Lee

    Abstract: In this paper, we propose a variance-preserving interpolation framework to improve diffusion models for single-channel speech enhancement (SE) and automatic speech recognition (ASR). This new variance-preserving interpolation diffusion model (VPIDM) approach requires only 25 iterative steps and obviates the need for a corrector, an essential element in the existing variance-exploding interpolation…

    Submitted 27 May, 2024; originally announced May 2024.

  33. arXiv:2405.16863  [pdf]

    cond-mat.mes-hall cond-mat.mtrl-sci

    All-voltage control of Giant Magnetoresistance

    Authors: Lujun Wei, Yiyang Zhang, Fei Huang, Jiajv Yang, Jincheng Peng, Yanghui Li, Yu Lu, Jiarui Chen, Tianyu Liu, Yong Pu, Jun Du

    Abstract: The aim of voltage control of magnetism is to reduce the power consumption of spintronic devices. For a spin valve, the magnetization directions of two ferromagnetic layers determine the giant magnetoresistance magnitude. However, achieving all-voltage manipulation of the magnetization directions between parallel and antiparallel states is a significant challenge. Here, we demonstrate that by util…

    Submitted 27 May, 2024; originally announced May 2024.

  34. arXiv:2405.15863  [pdf, other]

    cs.SD cs.AI eess.AS

    Quality-aware Masked Diffusion Transformer for Enhanced Music Generation

    Authors: Chang Li, Ruoyu Wang, Lijuan Liu, Jun Du, Yixuan Sun, Zilu Guo, Zhenrong Zhang, Yuan Jiang

    Abstract: In recent years, diffusion-based text-to-music (TTM) generation has gained prominence, offering a novel approach to synthesizing musical content from textual descriptions. Achieving high accuracy and diversity in this generation process requires extensive, high-quality data, which often constitutes only a fraction of available datasets. Within open-source datasets, the prevalence of issues like mi…

    Submitted 24 May, 2024; originally announced May 2024.

  35. arXiv:2405.14137  [pdf, other]

    cs.CV

    RET-CLIP: A Retinal Image Foundation Model Pre-trained with Clinical Diagnostic Reports

    Authors: Jiawei Du, Jia Guo, Weihang Zhang, Shengzhu Yang, Hanruo Liu, Huiqi Li, Ningli Wang

    Abstract: The Vision-Language Foundation model is increasingly investigated in the fields of computer vision and natural language processing, yet its exploration in ophthalmology and broader medical applications remains limited. The challenge is the lack of labeled data for the training of foundation model. To handle this issue, a CLIP-style retinal image foundation model is developed in this paper. Our fou…

    Submitted 22 May, 2024; originally announced May 2024.

  36. arXiv:2405.11862  [pdf, other]

    cs.CV

    SEMv3: A Fast and Robust Approach to Table Separation Line Detection

    Authors: Chunxia Qin, Zhenrong Zhang, Pengfei Hu, Chenyu Liu, Jiefeng Ma, Jun Du

    Abstract: Table structure recognition (TSR) aims to parse the inherent structure of a table from its input image. The "split-and-merge" paradigm is a pivotal approach to parse table structure, where the table separation line detection is crucial. However, challenges such as wireless and deformed tables make it demanding. In this paper, we adhere to the "split-and-merge" paradigm and propose SEMv3 (SEM: Spl…

    Submitted 20 May, 2024; originally announced May 2024.

    Comments: 9 pages, 6 figures, 5 tables. Accepted by IJCAI2024 main track

  37. arXiv:2405.10436  [pdf, other]

    cs.IR cs.AI

    Positional encoding is not the same as context: A study on positional encoding for Sequential recommendation

    Authors: Alejo Lopez-Avila, Jinhua Du, Abbas Shimary, Ze Li

    Abstract: The expansion of streaming media and e-commerce has led to a boom in recommendation systems, including Sequential recommendation systems, which consider the user's previous interactions with items. In recent years, research has focused on architectural improvements such as transformer blocks and feature extraction that can augment model information. Among these features are context and attributes.…

    Submitted 16 May, 2024; originally announced May 2024.

    Comments: 19 pages, 3 figures, 12 tables

    MSC Class: I.2.m

  38. arXiv:2405.09897  [pdf, other]

    cond-mat.mtrl-sci

    Towards Informatics-Driven Design of Nuclear Waste Forms

    Authors: Vinay I. Hegde, Miroslava Peterson, Sarah I. Allec, Xiaonan Lu, Thiruvillamalai Mahadevan, Thanh Nguyen, Jayani Kalahe, Jared Oshiro, Robert J. Seffens, Ethan K. Nickerson, Jincheng Du, Brian J. Riley, John D. Vienna, James E. Saal

    Abstract: Informatics-driven approaches, such as machine learning and sequential experimental design, have shown the potential to drastically impact next-generation materials discovery and design. In this perspective, we present a few guiding principles for applying informatics-based methods towards the design of novel nuclear waste forms. We advocate for adopting a system design approach, and describe the…

    Submitted 16 May, 2024; originally announced May 2024.

    Comments: 35 pages, 9 figures, 2 tables

  39. arXiv:2405.09791  [pdf]

    gr-qc astro-ph.IM physics.ins-det

    Challenging theories of dark energy with levitated force sensor

    Authors: Peiran Yin, Rui Li, Chengjiang Yin, Xiangyu Xu, Xiang Bian, Han Xie, Chang-Kui Duan, Pu Huang, Jian-hua He, Jiangfeng Du

    Abstract: The nature of dark energy is one of the most outstanding problems in physical science, and various theories have been proposed. It is therefore essential to directly verify or rule out these theories experimentally. However, despite substantial efforts in astrophysical observations and laboratory experiments, previous tests have not yet acquired enough accuracy to provide decisive conclusions as t…

    Submitted 15 May, 2024; originally announced May 2024.

    Journal ref: Nature Physics 18, 1181-1185 (2022)

  40. arXiv:2405.09353  [pdf, other]

    eess.IV cs.CV

    Large coordinate kernel attention network for lightweight image super-resolution

    Authors: Fangwei Hao, Jiesheng Wu, Haotian Lu, Ji Du, Jing Xu

    Abstract: The multi-scale receptive field and large kernel attention (LKA) module have been shown to significantly improve performance in the lightweight image super-resolution task. However, existing lightweight super-resolution (SR) methods seldom pay attention to designing efficient building block with multi-scale receptive field for local modeling, and their LKA modules face a quadratic increase in comp…

    Submitted 15 May, 2024; originally announced May 2024.

  41. arXiv:2405.05176  [pdf, other]

    cs.CL

    Encoder-Decoder Framework for Interactive Free Verses with Generation with Controllable High-Quality Rhyming

    Authors: Tommaso Pasini, Alejo López-Ávila, Husam Quteineh, Gerasimos Lampouras, Jinhua Du, Yubing Wang, Ze Li, Yusen Sun

    Abstract: Composing poetry or lyrics involves several creative factors, but a challenging aspect of generation is the adherence to a more or less strict metric and rhyming pattern. To address this challenge specifically, previous work on the task has mainly focused on reverse language modeling, which brings the critical selection of each rhyming word to the forefront of each verse. On the other hand, revers…

    Submitted 8 May, 2024; originally announced May 2024.

    Comments: 18 pages, 1 figure

    ACM Class: I.2.7

  42. arXiv:2404.17662  [pdf, other]

    cs.CL

    PLAYER*: Enhancing LLM-based Multi-Agent Communication and Interaction in Murder Mystery Games

    Authors: Qinglin Zhu, Runcong Zhao, Jinhua Du, Lin Gui, Yulan He

    Abstract: We propose PLAYER*, a novel framework that addresses the limitations of existing agent-based approaches built on Large Language Models (LLMs) in handling complex questions and understanding interpersonal relationships in dynamic environments. PLAYER* enhances path planning in Murder Mystery Games (MMGs) using an anytime sampling-based planner and a questioning-driven search framework. By equipping…

    Submitted 17 June, 2024; v1 submitted 26 April, 2024; originally announced April 2024.

  43. arXiv:2404.15075  [pdf, other]

    quant-ph

    An energy efficient quantum-enhanced machine

    Authors: Waner Hou, Xingyu Zhao, Kamran Rehan, Yi Li, Yue Li, Eric Lutz, Yiheng Lin, Jiangfeng Du

    Abstract: Quantum friction, a quantum analog of classical friction, reduces the performance of quantum machines, such as heat engines, and makes them less energy efficient. We here report the experimental realization of an energy efficient quantum engine coupled to a quantum battery that stores the produced work, using a single ion in a linear Paul trap. We first establish the quantum nature of the device b…

    Submitted 23 April, 2024; originally announced April 2024.

  44. arXiv:2404.12731  [pdf, other]

    hep-ex physics.ins-det

    Near-Quantum-limited Haloscope Detection of Dark Photon Dark Matter Enhanced by a High-Q Superconducting Cavity

    Authors: Runqi Kang, Man Jiao, Yu Tong, Yang Liu, Youpeng Zhong, Yi-Fu Cai, Jingwei Zhou, Xing Rong, Jiangfeng Du

    Abstract: We report new experimental results on the search for dark photons based on a near-quantum-limited haloscope equipped with a superconducting cavity. The loaded quality factor of the superconducting cavity is $6\times10^{5}$, so that the expected signal from dark photon dark matter can be enhanced by more than one order compared to a copper cavity. A Josephson parametric amplifier with a near-quantu…

    Submitted 19 April, 2024; originally announced April 2024.

  45. arXiv:2404.11313  [pdf, other]

    eess.IV cs.AI

    NTIRE 2024 Challenge on Short-form UGC Video Quality Assessment: Methods and Results

    Authors: Xin Li, Kun Yuan, Yajing Pei, Yiting Lu, Ming Sun, Chao Zhou, Zhibo Chen, Radu Timofte, Wei Sun, Haoning Wu, Zicheng Zhang, Jun Jia, Zhichao Zhang, Linhan Cao, Qiubo Chen, Xiongkuo Min, Weisi Lin, Guangtao Zhai, Jianhui Sun, Tianyi Wang, Lei Li, Han Kong, Wenxuan Wang, Bing Li, Cheng Luo, et al. (43 additional authors not shown)

    Abstract: This paper reviews the NTIRE 2024 Challenge on Shortform UGC Video Quality Assessment (S-UGC VQA), where various excellent solutions are submitted and evaluated on the collected dataset KVQ from popular short-form video platform, i.e., Kuaishou/Kwai Platform. The KVQ database is divided into three parts, including 2926 videos for training, 420 videos for validation, and 854 videos for testing. The…

    Submitted 17 April, 2024; originally announced April 2024.

    Comments: Accepted by CVPR2024 Workshop. The challenge report for CVPR NTIRE2024 Short-form UGC Video Quality Assessment Challenge

  46. arXiv:2404.09119  [pdf, other]

    stat.ME stat.AP stat.ML

    Causal Inference for Genomic Data with Multiple Heterogeneous Outcomes

    Authors: Jin-Hong Du, Zhenghao Zeng, Edward H. Kennedy, Larry Wasserman, Kathryn Roeder

    Abstract: With the evolution of single-cell RNA sequencing techniques into a standard approach in genomics, it has become possible to conduct cohort-level causal inferences based on single-cell-level measurements. However, the individual gene expression levels of interest are not directly observable; instead, only repeated proxy measurements from each individual's cells are available, providing a derived ou…

    Submitted 16 April, 2024; v1 submitted 13 April, 2024; originally announced April 2024.

    Comments: 26 pages and 6 figures for the main text, 30 pages and 3 figures for the supplement

  47. arXiv:2404.07748  [pdf, other]

    cs.CV cs.LG

    3D-CSAD: Untrained 3D Anomaly Detection for Complex Manufacturing Surfaces

    Authors: Xuanming Cao, Chengyu Tao, Juan Du

    Abstract: The surface quality inspection of manufacturing parts based on 3D point cloud data has attracted increasing attention in recent years. The reason is that the 3D point cloud can capture the entire surface of manufacturing parts, unlike the previous practices that focus on some key product characteristics. However, achieving accurate 3D anomaly detection is challenging, due to the complex surfaces o…

    Submitted 11 April, 2024; originally announced April 2024.

  48. arXiv:2404.05403  [pdf, other]

    cs.CR cs.AI

    SoK: Gradient Leakage in Federated Learning

    Authors: Jiacheng Du, Jiahui Hu, Zhibo Wang, Peng Sun, Neil Zhenqiang Gong, Kui Ren

    Abstract: Federated learning (FL) enables collaborative model training among multiple clients without raw data exposure. However, recent studies have shown that clients' private training data can be reconstructed from the gradients they share in FL, known as gradient inversion attacks (GIAs). While GIAs have demonstrated effectiveness under \emph{ideal settings and auxiliary assumptions}, their actual effic…

    Submitted 8 April, 2024; originally announced April 2024.

  49. arXiv:2404.04937  [pdf, other]

    cs.CR cs.GT

    Optimizing Information Propagation for Blockchain-empowered Mobile AIGC: A Graph Attention Network Approach

    Authors: Jiana Liao, Jinbo Wen, Jiawen Kang, Yang Zhang, Jianbo Du, Qihao Li, Weiting Zhang, Dong Yang

    Abstract: Artificial Intelligence-Generated Content (AIGC) is a rapidly evolving field that utilizes advanced AI algorithms to generate content. Through integration with mobile edge networks, mobile AIGC networks have gained significant attention, which can provide real-time customized and personalized AIGC services and products. Since blockchains can facilitate decentralized and transparent data management…

    Submitted 7 April, 2024; originally announced April 2024.

    Comments: arXiv admin note: substantial text overlap with arXiv:2403.13237

  50. arXiv:2404.04481  [pdf, other]

    cs.IR cs.AI cs.LG

    Joint Identifiability of Cross-Domain Recommendation via Hierarchical Subspace Disentanglement

    Authors: Jing Du, Zesheng Ye, Bin Guo, Zhiwen Yu, Lina Yao

    Abstract: Cross-Domain Recommendation (CDR) seeks to enable effective knowledge transfer across domains. Existing works rely on either representation alignment or transformation bridges, but they struggle on identifying domain-shared from domain-specific latent factors. Specifically, while CDR describes user representations as a joint distribution over two domains, these methods fail to account for its join…

    Submitted 5 April, 2024; originally announced April 2024.

    Comments: accepted to SIGIR 2024 as a Full Research Paper