
Showing 1–50 of 102 results for author: Luo, Q

  1. arXiv:2407.11018  [pdf, other]

    cs.NI eess.SP

    Online Multi-Task Offloading for Semantic-Aware Edge Computing Systems

    Authors: Xuyang Chen, Qu Luo, Gaojie Chen, Daquan Feng, Yao Sun

    Abstract: Mobile edge computing (MEC) provides low-latency offloading solutions for computationally intensive tasks, effectively improving the computing efficiency and battery life of mobile devices. However, for data-intensive tasks or scenarios with limited uplink bandwidth, network congestion might occur due to massive simultaneous offloading nodes, increasing transmission latency and affecting task perf…

    Submitted 28 June, 2024; originally announced July 2024.

  2. arXiv:2407.09756  [pdf, other]

    cs.CL

    LLM-Collaboration on Automatic Science Journalism for the General Audience

    Authors: Gongyao Jiang, Xinran Shi, Qiong Luo

    Abstract: Science journalism reports current scientific discoveries to non-specialists, aiming to enable public comprehension of the state of the art. However, this task can be challenging as the audience often lacks specific knowledge about the presented research. To address this challenge, we propose a framework that integrates three LLMs mimicking the real-world writing-reading-feedback-revision workflow…

    Submitted 12 July, 2024; originally announced July 2024.

    Comments: Under review

  3. arXiv:2406.10744  [pdf, other]

    cs.CV

    Technique Report of CVPR 2024 PBDL Challenges

    Authors: Ying Fu, Yu Li, Shaodi You, Boxin Shi, Linwei Chen, Yunhao Zou, Zichun Wang, Yichen Li, Yuze Han, Yingkai Zhang, Jianan Wang, Qinglin Liu, Wei Yu, Xiaoqian Lv, Jianing Li, Shengping Zhang, Xiangyang Ji, Yuanpei Chen, Yuhan Zhang, Weihang Peng, Liwen Zhang, Zhe Xu, Dingyong Gou, Cong Li, Senyan Xu, et al. (75 additional authors not shown)

    Abstract: The intersection of physics-based vision and deep learning presents an exciting frontier for advancing computer vision technologies. By leveraging the principles of physics to inform and enhance deep learning models, we can develop more robust and accurate vision systems. Physics-based vision aims to invert the processes to recover scene properties such as shape, reflectance, light distribution, a…

    Submitted 12 July, 2024; v1 submitted 15 June, 2024; originally announced June 2024.

    Comments: CVPR 2024 PBDL Challenges: https://pbdl-ws.github.io/pbdl2024/challenge/index.html

  4. arXiv:2405.12533  [pdf]

    cs.CV

    Dataset and Benchmark for Urdu Natural Scenes Text Detection, Recognition and Visual Question Answering

    Authors: Hiba Maryam, Ling Fu, Jiajun Song, Tajrian ABM Shafayet, Qidi Luo, Xiang Bai, Yuliang Liu

    Abstract: The development of Urdu scene text detection, recognition, and Visual Question Answering (VQA) technologies is crucial for advancing accessibility, information retrieval, and linguistic diversity in digital content, facilitating better understanding and interaction with Urdu-language visual data. This initiative seeks to bridge the gap between textual and visual comprehension. We propose a new mul…

    Submitted 21 May, 2024; originally announced May 2024.

    Comments: Accepted by the International Conference on Document Analysis and Recognition (ICDAR) 2024

  5. arXiv:2405.07530  [pdf, other]

    cs.SE

    Prompt-based Code Completion via Multi-Retrieval Augmented Generation

    Authors: Hanzhuo Tan, Qi Luo, Ling Jiang, Zizheng Zhan, Jing Li, Haotian Zhang, Yuqun Zhang

    Abstract: Automated code completion, which aims to generate subsequent tokens from unfinished code, has benefited significantly from recent progress in pre-trained Large Language Models (LLMs). However, these models often suffer from coherence issues and hallucinations when dealing with complex code logic or extrapolating beyond their training data. Existing Retrieval Augmented Generation (RAG) technique…

    Submitted 13 May, 2024; originally announced May 2024.

  6. arXiv:2405.06246  [pdf]

    cs.CV

    Comparative Analysis of Advanced Feature Matching Algorithms in Challenging High Spatial Resolution Optical Satellite Stereo Scenarios

    Authors: Qiyan Luo, Jidan Zhang, Yuzhen Xie, Xu Huang, Ting Han

    Abstract: Feature matching determines the orientation accuracy for the High Spatial Resolution (HSR) optical satellite stereos, subsequently impacting several significant applications such as 3D reconstruction and change detection. However, the matching of off-track HSR optical satellite stereos often encounters challenging conditions including wide-baseline observation, significant radiometric differences,…

    Submitted 10 May, 2024; originally announced May 2024.

    Comments: The manuscript is accepted as an oral presentation at the IEEE International Geoscience and Remote Sensing Symposium (IGARSS 2024)

  7. arXiv:2405.05708  [pdf]

    cs.IT

    Characteristic-Mode Based Conformal Design of Ultra-Wideband Antenna Array

    Authors: Zhan Chen, Wei Hu, Yuchen Gao, Qi Luo, Xiangbo Wang, Steven Gao

    Abstract: An innovative design method of conformal array antennas is presented by utilizing characteristic mode analysis (CMA) in this work. A single-layer continuous perfect electric conductor under bending conditions is conducted by CMA to evaluate the variations in operating performance. By using this method, the design process of a conformal array is simplified. The results indicate that the operating p…

    Submitted 9 May, 2024; originally announced May 2024.

  8. arXiv:2404.17128  [pdf, other]

    q-bio.NC cs.SI

    Network Structure Trumps Neuron Dynamics: Insights from Drosophila Connectome Simulations

    Authors: Xiaoyu Zhang, Pengcheng Yang, Jiawei Feng, Qiang Luo, Wei Lin, Xin Lu

    Abstract: Despite the success of artificial neural networks, the necessity of real network structures in simulating intelligence remains unclear. Utilizing the largest adult Drosophila connectome data set, we constructed a large-scale network communication model framework based on simple neuronal activation mechanisms to simulate the activation behavior observed in the connectome. The results demonstrate th…

    Submitted 30 June, 2024; v1 submitted 25 April, 2024; originally announced April 2024.

  9. arXiv:2404.10606  [pdf, other]

    cs.RO cs.AI cs.LG

    InfoCon: Concept Discovery with Generative and Discriminative Informativeness

    Authors: Ruizhe Liu, Qian Luo, Yanchao Yang

    Abstract: We focus on the self-supervised discovery of manipulation concepts that can be adapted and reassembled to address various robotic tasks. We propose that the decision to conceptualize a physical procedure should not depend on how we name it (semantics) but rather on the significance of the informativeness in its representation regarding the low-level physical state and state changes. We model manip…

    Submitted 14 March, 2024; originally announced April 2024.

    Comments: 27 pages, 15 figures. Published as a conference paper at ICLR 2024

  10. arXiv:2404.04542  [pdf, other]

    cs.IT cs.ET

    Adaptive Polynomial Chaos Expansion for Uncertainty Quantification and Optimization of Horn Antennas at SubTHz Frequencies

    Authors: Aristeides D. Papadopoulos, Yihan Ma, Qi Luo, George C. Alexandropoulos

    Abstract: Sub-terahertz (subTHz) antennas will play an important role in the next generations of wireless communication systems. However, when it comes to the subTHz frequency spectrum, the antenna fabrication tolerance needs to be accurately considered during the design stage. The classic approach to studying the average performance of an antenna design considering fabrication tolerances is through the use of…

    Submitted 6 April, 2024; originally announced April 2024.

    Comments: 10 pages, 12 figures, submitted to an IEEE Transactions Journal

  11. arXiv:2404.02827  [pdf, other]

    cs.LG

    BAdam: A Memory Efficient Full Parameter Optimization Method for Large Language Models

    Authors: Qijun Luo, Hengxu Yu, Xiao Li

    Abstract: This work presents BAdam, an optimization method that leverages the block coordinate descent framework with Adam as the inner solver. BAdam offers a memory efficient approach to the full parameter finetuning of large language models. We conduct theoretical convergence analysis for BAdam in the deterministic case. Experimentally, we apply BAdam to instruction-tune the Llama 2-7B and Llama 3-8B mode…

    Submitted 22 May, 2024; v1 submitted 3 April, 2024; originally announced April 2024.

    Comments: 21 pages

  12. arXiv:2404.02638  [pdf, other]

    cs.CV

    SG-BEV: Satellite-Guided BEV Fusion for Cross-View Semantic Segmentation

    Authors: Junyan Ye, Qiyan Luo, Jinhua Yu, Huaping Zhong, Zhimeng Zheng, Conghui He, Weijia Li

    Abstract: This paper aims at achieving fine-grained building attribute segmentation in a cross-view scenario, i.e., using satellite and street-view image pairs. The main challenge lies in overcoming the significant perspective differences between street views and satellite views. In this work, we introduce SG-BEV, a novel approach for satellite-guided BEV fusion for cross-view semantic segmentation. To over…

    Submitted 3 April, 2024; originally announced April 2024.

    Comments: accepted by CVPR 2024

  13. arXiv:2404.00726  [pdf, other]

    eess.IV cs.CV cs.LG

    MugenNet: A Novel Combined Convolution Neural Network and Transformer Network with its Application for Colonic Polyp Image Segmentation

    Authors: Chen Peng, Zhiqin Qian, Kunyu Wang, Qi Luo, Zhuming Bi, Wenjun Zhang

    Abstract: Biomedical image segmentation is a very important part of disease diagnosis. The term "colonic polyps" refers to polypoid lesions that occur on the surface of the colonic mucosa within the intestinal lumen. In clinical practice, early detection of polyps is conducted through colonoscopy examinations and biomedical image processing. Therefore, accurate polyp image segmentation is of great signi…

    Submitted 31 March, 2024; originally announced April 2024.

  14. arXiv:2403.16826  [pdf, ps, other]

    cs.IT

    A Progressive Codebook Optimization Scheme for Sparse Code Multiple Access in Downlink Channels

    Authors: Tuofeng Lei, Qu Luo, Shuyan Ni, Shimiao Chen, Xin Song, Pei Xiao

    Abstract: Sparse code multiple access (SCMA) is a promising technique for enabling massive connectivity and high spectrum efficiency in future machine-type communication networks. However, its performance crucially depends on well-designed multi-dimensional codebooks. In this paper, we propose a novel progressive codebook optimization scheme that can achieve near-optimal performance over downlink fading cha…

    Submitted 4 April, 2024; v1 submitted 25 March, 2024; originally announced March 2024.

  15. LSTTN: A Long-Short Term Transformer-based Spatio-temporal Neural Network for Traffic Flow Forecasting

    Authors: Qinyao Luo, Silu He, Xing Han, Yuhan Wang, Haifeng Li

    Abstract: Accurate traffic forecasting is a fundamental problem in intelligent transportation systems and learning long-range traffic representations with key information through spatiotemporal graph neural networks (STGNNs) is a basic assumption of current traffic flow prediction models. However, due to structural limitations, existing STGNNs can only utilize short-range traffic flow data; therefore, the m…

    Submitted 25 March, 2024; originally announced March 2024.

    Comments: 15 pages, 10 figures, 6 tables

    Journal ref: Knowledge-Based Systems 2024

  16. arXiv:2403.13492  [pdf, ps, other]

    cs.CR cs.DB

    Secure Query Processing with Linear Complexity

    Authors: Qiyao Luo, Yilei Wang, Wei Dong, Ke Yi

    Abstract: We present LINQ, the first join protocol with linear complexity (in both running time and communication) under the secure multi-party computation model (MPC). It can also be extended to support all free-connex queries, a large class of select-join-aggregate queries, still with linear complexity. This matches the plaintext result for the query processing problem, as free-connex queries are the larg…

    Submitted 23 April, 2024; v1 submitted 20 March, 2024; originally announced March 2024.

  17. arXiv:2403.11095  [pdf, other]

    cs.RO eess.SY

    PyroTrack: Belief-Based Deep Reinforcement Learning Path Planning for Aerial Wildfire Monitoring in Partially Observable Environments

    Authors: Sahand Khoshdel, Qi Luo, Fatemeh Afghah

    Abstract: Motivated by agility, 3D mobility, and low-risk operation compared to human-operated management systems of autonomous unmanned aerial vehicles (UAVs), this work studies UAV-based active wildfire monitoring where a UAV detects fire incidents in remote areas and tracks the fire frontline. A UAV path planning solution is proposed considering realistic wildfire management missions, where a single low-…

    Submitted 17 March, 2024; originally announced March 2024.

    Comments: 7 pages, Accepted in American Control Conference (ACC) 2024, July 10-12th, Toronto, ON, Canada

    MSC Class: 68T40

  18. arXiv:2403.10116  [pdf, other]

    cs.CR cs.DS

    Instance-optimal Clipping for Summation Problems in the Shuffle Model of Differential Privacy

    Authors: Wei Dong, Qiyao Luo, Giulia Fanti, Elaine Shi, Ke Yi

    Abstract: Differentially private mechanisms achieving worst-case optimal error bounds (e.g., the classical Laplace mechanism) are well-studied in the literature. However, when typical data are far from the worst case, instance-specific error bounds -- which depend on the largest value in the dataset -- are more meaningful. For example, consider the sum estimation problem, where each user has an integ…

    Submitted 15 March, 2024; originally announced March 2024.

  19. arXiv:2403.07889  [pdf, other]

    cs.NI cs.AR cs.ET cs.IT

    Reconfigurable Intelligent Surfaces for THz: Hardware Design and Signal Processing Challenges

    Authors: George C. Alexandropoulos, Antonio Clemente, Sérgio Matos, Ryan Husbands, Sean Ahearne, Qi Luo, Verónica Lain-Rubio, Thomas Kürner, Luís M. Pessoa

    Abstract: Wireless communications in the THz frequency band is an envisioned revolutionary technology for sixth Generation (6G) networks. However, such frequencies impose certain coverage and device design challenges that need to be efficiently overcome. To this end, the development of cost- and energy-efficient approaches for scaling these networks to realistic scenarios constitutes a necessity. Among the r…

    Submitted 2 February, 2024; originally announced March 2024.

    Comments: 5 pages, 7 figures, EuCAP 2024

  20. arXiv:2403.05286  [pdf, other]

    cs.PL cs.CL

    LLM4Decompile: Decompiling Binary Code with Large Language Models

    Authors: Hanzhuo Tan, Qi Luo, Jing Li, Yuqun Zhang

    Abstract: Decompilation aims to convert binary code to high-level source code, but traditional tools like Ghidra often produce results that are difficult to read and execute. Motivated by the advancements in Large Language Models (LLMs), we propose LLM4Decompile, the first and largest open-source LLM series (1.3B to 33B) trained to decompile binary code. We optimize the LLM training process and introduce th…

    Submitted 18 June, 2024; v1 submitted 8 March, 2024; originally announced March 2024.

  21. arXiv:2403.00897  [pdf, other]

    eess.IV astro-ph.GA cs.AI cs.CV cs.LG

    VisRec: A Semi-Supervised Approach to Radio Interferometric Data Reconstruction

    Authors: Ruoqi Wang, Haitao Wang, Qiong Luo, Feng Wang, Hejun Wu

    Abstract: Radio telescopes produce visibility data about celestial objects, but these data are sparse and noisy. As a result, images created on raw visibility data are of low quality. Recent studies have used deep learning models to reconstruct visibility data to get cleaner images. However, these methods rely on a substantial amount of labeled training data, which requires significant labeling effort from…

    Submitted 1 March, 2024; originally announced March 2024.

  22. arXiv:2402.16667  [pdf, other]

    cs.CL cs.AI

    RepoAgent: An LLM-Powered Open-Source Framework for Repository-level Code Documentation Generation

    Authors: Qinyu Luo, Yining Ye, Shihao Liang, Zhong Zhang, Yujia Qin, Yaxi Lu, Yesai Wu, Xin Cong, Yankai Lin, Yingli Zhang, Xiaoyin Che, Zhiyuan Liu, Maosong Sun

    Abstract: Generative models have demonstrated considerable potential in software engineering, particularly in tasks such as code generation and debugging. However, their utilization in the domain of code documentation generation remains underexplored. To this end, we introduce RepoAgent, a large language model powered open-source framework aimed at proactively generating, maintaining, and updating code docu…

    Submitted 26 February, 2024; originally announced February 2024.

    ACM Class: I.2.7; F.2.2

  23. arXiv:2402.13840  [pdf, other]

    cs.IR cs.AI

    LLM4SBR: A Lightweight and Effective Framework for Integrating Large Language Models in Session-based Recommendation

    Authors: Shutong Qiao, Chen Gao, Junhao Wen, Wei Zhou, Qun Luo, Peixuan Chen, Yong Li

    Abstract: Traditional session-based recommendation (SBR) utilizes session behavior sequences from anonymous users for recommendation. Although this strategy is highly efficient, it sacrifices the inherent semantic information of the items, making it difficult for the model to understand the true intent of the session and resulting in a lack of interpretability in the recommended results. Recently, large lan…

    Submitted 21 February, 2024; originally announced February 2024.

  24. arXiv:2402.13804  [pdf, other]

    cs.IT cs.ET

    Reconfigurable Intelligent Surfaces for THz: Hardware Impairments and Switching Technologies

    Authors: Sérgio Matos, Yihan Ma, Qi Luo, Jonas Deuermeier, Luca Lucci, Panagiotis Gavriilidis, Asal Kiazadeh, Verónica Lain-Rubio, Tung D. Phan, Ping Jack Soh, Antonio Clemente, Luís M. Pessoa, George C. Alexandropoulos

    Abstract: The demand for unprecedented performance in the upcoming 6G wireless networks is fomenting the research on THz communications empowered by Reconfigurable Intelligent Surfaces (RISs). A wide range of use cases have been proposed, most of them assuming high-level RIS models that overlook some of the hardware impairments that this technology faces. The expectation is that the emergent reconfigurable…

    Submitted 21 February, 2024; originally announced February 2024.

    Comments: 6 pages, 6 figures, submitted for a conference presentation

  25. arXiv:2402.13741  [pdf, other]

    cs.CL cs.AI

    Unlocking Instructive In-Context Learning with Tabular Prompting for Relational Triple Extraction

    Authors: Guozheng Li, Wenjun Ke, Peng Wang, Zijie Xu, Ke Ji, Jiajun Liu, Ziyu Shang, Qiqing Luo

    Abstract: In-context learning (ICL) for relational triple extraction (RTE) has achieved promising performance, but still encounters two key challenges: (1) how to design effective prompts and (2) how to select proper demonstrations. Existing methods, however, fail to address these challenges appropriately. On the one hand, they usually recast the RTE task to text-to-text prompting formats, which is unnatura…

    Submitted 21 February, 2024; originally announced February 2024.

    Comments: LREC-COLING 2024

  26. arXiv:2402.13496  [pdf, other]

    cs.LG cs.SI

    HetTree: Heterogeneous Tree Graph Neural Network

    Authors: Mingyu Guan, Jack W. Stokes, Qinlong Luo, Fuchen Liu, Purvanshi Mehta, Elnaz Nouri, Taesoo Kim

    Abstract: The recent past has seen an increasing interest in Heterogeneous Graph Neural Networks (HGNNs) since many real-world graphs are heterogeneous in nature, from citation graphs to email graphs. However, existing methods ignore a tree hierarchy among metapaths, which is naturally constituted by different node types and relation types. In this paper, we present HetTree, a novel heterogeneous tree graph…

    Submitted 20 February, 2024; originally announced February 2024.

  27. arXiv:2402.08212  [pdf, other]

    cs.RO

    BBSEA: An Exploration of Brain-Body Synchronization for Embodied Agents

    Authors: Sizhe Yang, Qian Luo, Anumpam Pani, Yanchao Yang

    Abstract: Embodied agents capable of complex physical skills can improve productivity, elevate life quality, and reshape human-machine collaboration. We aim at autonomous training of embodied agents for various tasks involving mainly large foundation models. It is believed that these models could act as a brain for embodied agents; however, existing methods heavily rely on humans for task proposal and scene…

    Submitted 12 February, 2024; originally announced February 2024.

  28. arXiv:2401.04408  [pdf, other]

    cs.IR cs.LG

    Fine-Grained Embedding Dimension Optimization During Training for Recommender Systems

    Authors: Qinyi Luo, Penghan Wang, Wei Zhang, Fan Lai, Jiachen Mao, Xiaohan Wei, Jun Song, Wei-Yu Tsai, Shuai Yang, Yuxi Hu, Xuehai Qian

    Abstract: Huge embedding tables in modern Deep Learning Recommender Models (DLRM) require prohibitively large memory during training and inference. Aiming to reduce the memory footprint of training, this paper proposes FIne-grained In-Training Embedding Dimension optimization (FIITED). Given the observation that embedding vectors are not equally important, FIITED adjusts the dimension of each individual emb…

    Submitted 9 January, 2024; originally announced January 2024.

    Comments: 16 pages, 9 figures

    ACM Class: I.2.6; H.3.3

  29. arXiv:2312.11302  [pdf, other]

    cs.IT eess.SP

    AFDM-SCMA: A Promising Waveform for Massive Connectivity over High Mobility Channels

    Authors: Qu Luo, Pei Xiao, Zilong Liu, Ziwei Wan, Thomos Nikolaos, Zhen Gao, Ziming He

    Abstract: This paper studies the affine frequency division multiplexing (AFDM)-empowered sparse code multiple access (SCMA) system, referred to as AFDM-SCMA, for supporting massive connectivity in high-mobility environments. First, by placing the sparse codewords on the AFDM chirp subcarriers, the input-output (I/O) relation of AFDM-SCMA systems is presented. Next, we delve into the generalized receiver des…

    Submitted 11 June, 2024; v1 submitted 18 December, 2023; originally announced December 2023.

  30. CAT: A Causally Graph Attention Network for Trimming Heterophilic Graph

    Authors: Silu He, Qinyao Luo, Xinsha Fu, Ling Zhao, Ronghua Du, Haifeng Li

    Abstract: Local Attention-guided Message Passing Mechanism (LAMP) adopted in Graph Attention Networks (GATs) is designed to adaptively learn the importance of neighboring nodes for better local aggregation on the graph, which can bring the representations of similar neighbors closer effectively, thus showing stronger discrimination ability. However, existing GATs suffer from a significant discrimination abi…

    Submitted 17 June, 2024; v1 submitted 14 December, 2023; originally announced December 2023.

    Comments: 25 pages, 18 figures, 5 tables

    Journal ref: Information Science 2024

  31. arXiv:2312.01126  [pdf, other]

    cs.IT eess.SP

    BER Analysis of SCMA-OFDM Systems in the Presence of Carrier Frequency Offset

    Authors: Haibo Liu, Qu Luo, Zilong Liu, Shan Luo, Pei Xiao, Rongping Lin

    Abstract: Sparse code multiple access (SCMA) building upon orthogonal frequency division multiplexing (OFDM) is a promising wireless technology for supporting massive connectivity in future machine-type communication networks. However, the sensitivity of OFDM to carrier frequency offset (CFO) poses a major challenge because it leads to orthogonality loss and incurs intercarrier interference (ICI). In this p…

    Submitted 2 December, 2023; originally announced December 2023.

  32. arXiv:2312.01125  [pdf, other]

    cs.IT eess.SP

    Design and Performance Analysis of Index Modulation Empowered AFDM System

    Authors: Jing Zhu, Qu Luo, Gaojie Chen, Pei Xiao, Lixia Xiao

    Abstract: In this letter, we incorporate index modulation (IM) into affine frequency division multiplexing (AFDM), called AFDM-IM, to enhance the bit error rate (BER) and energy efficiency (EE) performance. In this scheme, the information bits are conveyed not only by $M$-ary constellation symbols, but also by the activation of the chirp subcarriers (SCs) indices, which are determined based on the incoming…

    Submitted 2 December, 2023; originally announced December 2023.

  33. arXiv:2311.03687  [pdf, other]

    cs.PF cs.CL cs.LG

    Dissecting the Runtime Performance of the Training, Fine-tuning, and Inference of Large Language Models

    Authors: Longteng Zhang, Xiang Liu, Zeyu Li, Xinglin Pan, Peijie Dong, Ruibo Fan, Rui Guo, Xin Wang, Qiong Luo, Shaohuai Shi, Xiaowen Chu

    Abstract: Large Language Models (LLMs) have seen great advances in both academia and industry, and their popularity has resulted in numerous open-source frameworks and techniques for accelerating LLM pre-training, fine-tuning, and inference. Training and deploying LLMs is expensive as it requires considerable computing resources and memory, hence many efficient approaches have been developed for improving system…

    Submitted 1 December, 2023; v1 submitted 6 November, 2023; originally announced November 2023.

  34. arXiv:2310.06182  [pdf, other]

    cs.LG

    PAC-Bayesian Spectrally-Normalized Bounds for Adversarially Robust Generalization

    Authors: Jiancong Xiao, Ruoyu Sun, Zhi-Quan Luo

    Abstract: Deep neural networks (DNNs) are vulnerable to adversarial attacks. It is found empirically that adversarially robust generalization is crucial in establishing defense algorithms against adversarial attacks. Therefore, it is interesting to study the theoretical guarantee of robust generalization. This paper focuses on norm-based complexity, based on a PAC-Bayes approach (Neyshabur et al., 2017). Th…

    Submitted 28 October, 2023; v1 submitted 9 October, 2023; originally announced October 2023.

    Comments: NeurIPS 2023

  35. arXiv:2309.15223  [pdf, other]

    cs.CL cs.AI cs.LG cs.NE cs.SD eess.AS

    Low-rank Adaptation of Large Language Model Rescoring for Parameter-Efficient Speech Recognition

    Authors: Yu Yu, Chao-Han Huck Yang, Jari Kolehmainen, Prashanth G. Shivakumar, Yile Gu, Sungho Ryu, Roger Ren, Qi Luo, Aditya Gourav, I-Fan Chen, Yi-Chieh Liu, Tuan Dinh, Ankur Gandhe, Denis Filimonov, Shalini Ghosh, Andreas Stolcke, Ariya Rastow, Ivan Bulyko

    Abstract: We propose a neural language modeling system based on low-rank adaptation (LoRA) for speech recognition output rescoring. Although pretrained language models (LMs) like BERT have shown superior performance in second-pass rescoring, the high computational cost of scaling up the pretraining stage and adapting the pretrained models to specific domains limit their practical use in rescoring. Here we p…

    Submitted 10 October, 2023; v1 submitted 26 September, 2023; originally announced September 2023.

    Comments: Accepted to IEEE ASRU 2023. Internal Review Approved. Revised 2nd version with Andreas and Huck. The first version is in Sep 29th. 8 pages

    Journal ref: Proc. IEEE ASRU Workshop, Dec. 2023

  36. arXiv:2309.11489  [pdf, other]

    cs.LG cs.AI cs.CL cs.RO

    Text2Reward: Reward Shaping with Language Models for Reinforcement Learning

    Authors: Tianbao Xie, Siheng Zhao, Chen Henry Wu, Yitao Liu, Qian Luo, Victor Zhong, Yanchao Yang, Tao Yu

    Abstract: Designing reward functions is a longstanding challenge in reinforcement learning (RL); it requires specialized knowledge or domain data, leading to high costs for development. To address this, we introduce Text2Reward, a data-free framework that automates the generation and shaping of dense reward functions based on large language models (LLMs). Given a goal described in natural language, Text2Rew…

    Submitted 25 May, 2024; v1 submitted 20 September, 2023; originally announced September 2023.

    Comments: ICLR 2024 camera ready, 37 pages, 12 figures

  37. arXiv:2308.14610  [pdf, other]

    astro-ph.IM cs.AI cs.CV

    PolarRec: Radio Interferometric Data Reconstruction with Polar Coordinate Representation

    Authors: Ruoqi Wang, Zhuoyang Chen, Jiayi Zhu, Qiong Luo, Feng Wang

    Abstract: In radio astronomy, visibility data, which are measurements of wave signals from radio telescopes, are transformed into images for observation of distant celestial objects. However, these resultant images usually contain both real sources and artifacts, due to signal sparsity and other factors. One way to obtain cleaner images is to reconstruct samples into dense forms before imaging. Unfortunatel…

    Submitted 27 November, 2023; v1 submitted 28 August, 2023; originally announced August 2023.

  38. arXiv:2308.13330  [pdf, other]

    cs.IT eess.SP

    Enhancing Signal Space Diversity for SCMA Over Rayleigh Fading Channels

    Authors: Qu Luo, Zilong Liu, Gaojie Chen, Pei Xiao

    Abstract: Sparse code multiple access (SCMA) is a promising technique for the enabling of massive connectivity in future machine-type communication networks, but it suffers from a limited diversity order which is a bottleneck for significant improvement of error performance. This paper aims for enhancing the signal space diversity of sparse code multiple access (SCMA) by introducing quadrature component del…

    Submitted 25 August, 2023; originally announced August 2023.

  39. arXiv:2307.15973  [pdf, other]

    cs.IR

    Debiased Pairwise Learning from Positive-Unlabeled Implicit Feedback

    Authors: Bin Liu, Qin Luo, Bang Wang

    Abstract: Learning contrastive representations from pairwise comparisons has achieved remarkable success in various fields, such as natural language processing, computer vision, and information retrieval. Collaborative filtering algorithms based on pairwise learning are also rooted in this paradigm. A significant concern is the absence of labels for negative instances in implicit feedback data, which often resu…

    Submitted 29 July, 2023; originally announced July 2023.

    Comments: 13 pages

  40. Rhythm-controllable Attention with High Robustness for Long Sentence Speech Synthesis

    Authors: Dengfeng Ke, Yayue Deng, Yukang Jia, Jinlong Xue, Qi Luo, Ya Li, Jianqing Sun, Jiaen Liang, Binghuai Lin

    Abstract: Regressive Text-to-Speech (TTS) system utilizes attention mechanism to generate alignment between text and acoustic feature sequence. Alignment determines synthesis robustness (e.g., the occurrence of skipping, repeating, and collapse) and rhythm via duration control. However, current attention algorithms used in speech synthesis cannot control rhythm using external duration information to generate…

    Submitted 5 June, 2023; originally announced June 2023.

    Comments: 5 pages, 3 figures, Published in: 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP)

  41. arXiv:2305.09121  [pdf, other]

    astro-ph.IM astro-ph.GA cs.CV cs.LG eess.IV

    A Conditional Denoising Diffusion Probabilistic Model for Radio Interferometric Image Reconstruction

    Authors: Ruoqi Wang, Zhuoyang Chen, Qiong Luo, Feng Wang

    Abstract: In radio astronomy, signals from radio telescopes are transformed into images of observed celestial objects, or sources. However, these images, called dirty images, contain real sources as well as artifacts due to signal sparsity and other factors. Therefore, radio interferometric image reconstruction is performed on dirty images, aiming to produce clean images in which artifacts are reduced and r…

    Submitted 29 August, 2023; v1 submitted 15 May, 2023; originally announced May 2023.

    Comments: Accepted by ECAI 2023

  42. arXiv:2304.12866  [pdf]

    cs.NE cs.LG eess.SP physics.data-an

    Binary stochasticity enabled highly efficient neuromorphic deep learning achieves better-than-software accuracy

    Authors: Yang Li, Wei Wang, Ming Wang, Chunmeng Dou, Zhengyu Ma, Huihui Zhou, Peng Zhang, Nicola Lepri, Xumeng Zhang, Qing Luo, Xiaoxin Xu, Guanhua Yang, Feng Zhang, Ling Li, Daniele Ielmini, Ming Liu

    Abstract: Deep learning needs high-precision handling of forwarding signals, backpropagating errors, and updating weights. This is inherently required by the learning algorithm since the gradient descent learning rule relies on the chain product of partial derivatives. However, it is challenging to implement deep learning in hardware systems that use noisy analog memristors as artificial synapses, as well a…

    Submitted 25 April, 2023; originally announced April 2023.

  43. arXiv:2304.03427  [pdf, other]

    cs.CL cs.AI cs.CY cs.LG

    Cleansing Jewel: A Neural Spelling Correction Model Built On Google OCR-ed Tibetan Manuscripts

    Authors: Queenie Luo, Yung-Sung Chuang

    Abstract: Scholars in the humanities rely heavily on ancient manuscripts to study history, religion, and socio-political structures in the past. Many efforts have been devoted to digitizing these precious manuscripts using OCR technology, but most manuscripts were blemished over the centuries so that an Optical Character Recognition (OCR) program cannot be expected to capture faded graphs and stains on page…

    Submitted 14 May, 2024; v1 submitted 6 April, 2023; originally announced April 2023.

  44. arXiv:2303.17919  [pdf, other]

    cs.RO

    Grounding Object Relations in Language-Conditioned Robotic Manipulation with Semantic-Spatial Reasoning

    Authors: Qian Luo, Yunfei Li, Yi Wu

    Abstract: Grounded understanding of natural language in physical scenes can greatly benefit robots that follow human instructions. In object manipulation scenarios, existing end-to-end models are proficient at understanding semantic concepts, but typically cannot handle complex instructions involving spatial relations among multiple objects, which require both reasoning about object-level spatial relations and le…

    Submitted 31 March, 2023; originally announced March 2023.

    Comments: AAAI 2023 RL Ready for Production Workshop

  45. PROCTER: PROnunciation-aware ConTextual adaptER for personalized speech recognition in neural transducers

    Authors: Rahul Pandey, Roger Ren, Qi Luo, Jing Liu, Ariya Rastrow, Ankur Gandhe, Denis Filimonov, Grant Strimel, Andreas Stolcke, Ivan Bulyko

    Abstract: End-to-End (E2E) automatic speech recognition (ASR) systems used in voice assistants often have difficulties recognizing infrequent words personalized to the user, such as names and places. Rare words often have non-trivial pronunciations, and in such cases, human knowledge in the form of a pronunciation lexicon can be useful. We propose a PROnunCiation-aware conTextual adaptER (PROCTER) that dyna…

    Submitted 29 March, 2023; originally announced March 2023.

    Comments: To appear in Proc. IEEE ICASSP

    Journal ref: Proc. IEEE ICASSP, June 2023

  46. arXiv:2303.16281  [pdf]

    cs.CY cs.AI cs.CL cs.LG cs.SI

    A "Perspectival" Mirror of the Elephant: Investigating Language Bias on Google, ChatGPT, YouTube, and Wikipedia

    Authors: Queenie Luo, Michael J. Puett, Michael D. Smith

    Abstract: Contrary to Google Search's mission of delivering information from "many angles so you can form your own understanding of the world," we find that Google and its most prominent returned results - Wikipedia and YouTube - simply reflect a narrow set of culturally dominant views tied to the search language for complex topics like "Buddhism," "Liberalism," "colonization," "Iran" and "America." Simply…

    Submitted 7 March, 2024; v1 submitted 28 March, 2023; originally announced March 2023.

  47. arXiv:2212.09479  [pdf]

    cs.NE cs.AI

    Performance assessment and exhaustive listing of 500+ nature inspired metaheuristic algorithms

    Authors: Zhongqiang Ma, Guohua Wu, Ponnuthurai N. Suganthan, Aijuan Song, Qizhang Luo

    Abstract: Metaheuristics are popularly used in various fields, and they have attracted much attention in the scientific and industrial communities. In recent years, the number of new metaheuristic names has been continuously growing. Generally, the inventors attribute the novelties of these new algorithms to inspirations from either biology, human behaviors, physics, or other phenomena. In addition, these n…

    Submitted 19 December, 2022; originally announced December 2022.

    Report number: 45 pages

  48. arXiv:2211.16752  [pdf, other]

    cs.LG

    DimenFix: A novel meta-dimensionality reduction method for feature preservation

    Authors: Qiaodan Luo, Leonardo Christino, Fernando V Paulovich, Evangelos Milios

    Abstract: Dimensionality reduction has become an important research topic as demand for interpreting high-dimensional datasets has been increasing rapidly in recent years. There have been many dimensionality reduction methods with good performance in preserving the overall relationship among data points when mapping them to a lower-dimensional space. However, these existing methods fail to incorporate the d…

    Submitted 30 November, 2022; originally announced November 2022.

  49. STGC-GNNs: A GNN-based traffic prediction framework with a spatial-temporal Granger causality graph

    Authors: Silu He, Qinyao Luo, Ronghua Du, Ling Zhao, Haifeng Li

    Abstract: The key to traffic prediction is to accurately depict the temporal dynamics of traffic flow traveling in a road network, so it is important to model the spatial dependence of the road network. The essence of spatial dependence is to accurately describe how traffic information transmission is affected by other nodes in the road network, and the GNN-based traffic prediction model, as a benchmark for…

    Submitted 30 October, 2022; originally announced October 2022.

    Comments: 14 pages, 16 figures, 4 tables

  50. arXiv:2210.15154  [pdf, other]

    cs.IR cs.AI

    AutoAttention: Automatic Field Pair Selection for Attention in User Behavior Modeling

    Authors: Zuowu Zheng, Xiaofeng Gao, Junwei Pan, Qi Luo, Guihai Chen, Dapeng Liu, Jie Jiang

    Abstract: In click-through rate (CTR) prediction models, a user's interest is usually represented as a fixed-length vector based on her historical behaviors. Recently, several methods have been proposed to learn an attentive weight for each user behavior and conduct weighted sum pooling. However, these methods only manually select several fields from the target item side as the query to interact with the behaviors,…

    Submitted 26 October, 2022; originally announced October 2022.

    Comments: Accepted by ICDM 2022