Skip to main content

Showing 1–50 of 4,552 results for author: Chen, W

  1. arXiv:2407.11875  [pdf, ps, other

    eess.SP

    Cramer-Rao Bound Minimization for Movable Antenna-Assisted Multiuser Integrated Sensing and Communications

    Authors: Haoran Qin, Wen Chen, Qingqing Wu, Ziheng Zhang, Zhendong Li, Nan Cheng

    Abstract: This paper investigates a movable antenna (MA)-assisted multiuser integrated sensing and communication (ISAC) system, where the base station (BS) and communication users are all equipped with MA for improving both the sensing and communication performance. We employ the Cramer-Rao bound (CRB) as the performance metric of sensing, thus a joint beamforming design and MAs' position optimizing problem… ▽ More

    Submitted 16 July, 2024; originally announced July 2024.

  2. arXiv:2407.11569  [pdf, other

    cs.CV

    SFPNet: Sparse Focal Point Network for Semantic Segmentation on General LiDAR Point Clouds

    Authors: Yanbo Wang, Wentao Zhao, Chuan Cao, Tianchen Deng, Jingchuan Wang, Weidong Chen

    Abstract: Although LiDAR semantic segmentation advances rapidly, state-of-the-art methods often incorporate specifically designed inductive bias derived from benchmarks originating from mechanical spinning LiDAR. This can limit model generalizability to other kinds of LiDAR technologies and make hyperparameter tuning more complex. To tackle these issues, we propose a generalized framework to accommodate var… ▽ More

    Submitted 16 July, 2024; originally announced July 2024.

    Comments: Accepted by ECCV 2024

  3. arXiv:2407.11268  [pdf, other

    stat.ML cs.CE cs.LG

    Heterogenous Multi-Source Data Fusion Through Input Mapping and Latent Variable Gaussian Process

    Authors: Yigitcan Comlek, Sandipp Krishnan Ravi, Piyush Pandita, Sayan Ghosh, Liping Wang, Wei Chen

    Abstract: Artificial intelligence and machine learning frameworks have served as computationally efficient mapping between inputs and outputs for engineering problems. These mappings have enabled optimization and analysis routines that have warranted superior designs, ingenious material systems and optimized manufacturing processes. A common occurrence in such modeling endeavors is the existence of multiple… ▽ More

    Submitted 15 July, 2024; originally announced July 2024.

    Comments: 20 Pages,9 Figures, Data is available per request

  4. arXiv:2407.11162  [pdf, other

    cs.CV

    Integrating Amortized Inference with Diffusion Models for Learning Clean Distribution from Corrupted Images

    Authors: Yifei Wang, Weimin Bai, Weijian Luo, Wenzheng Chen, He Sun

    Abstract: Diffusion models (DMs) have emerged as powerful generative models for solving inverse problems, offering a good approximation of prior distributions of real-world image data. Typically, diffusion models rely on large-scale clean signals to accurately learn the score functions of ground truth clean image distributions. However, such a requirement for large amounts of clean data is often impractical… ▽ More

    Submitted 15 July, 2024; originally announced July 2024.

  5. arXiv:2407.11079  [pdf, ps, other

    eess.SP cs.IT

    One-Bit MIMO Detection: From Global Maximum-Likelihood Detector to Amplitude Retrieval Approach

    Authors: Mingjie Shao, Wei-Kun Chen, Cheng-Yang Yu, Ya-Feng Liu, Wing-Kin Ma

    Abstract: As communication systems advance towards the future 6G era, the incorporation of large-scale antenna arrays in base stations (BSs) presents challenges such as increased hardware costs and energy consumption. To address these issues, the use of one-bit analog-to-digital converters (ADCs)/digital-to-analog converters (DACs) has gained significant attentions. This paper focuses on one-bit multiple-in… ▽ More

    Submitted 13 July, 2024; originally announced July 2024.

  6. arXiv:2407.10892  [pdf, other

    hep-ex astro-ph.SR nucl-ex

    First Measurement of Solar $^8$B Neutrino Flux through Coherent Elastic Neutrino-Nucleus Scattering in PandaX-4T

    Authors: PandaX Collaboration, Zihao Bo, Wei Chen, Xun Chen, Yunhua Chen, Zhaokan Cheng, Xiangyi Cui, Yingjie Fan, Deqing Fang, Zhixing Gao, Lisheng Geng, Karl Giboni, Xunan Guo, Xuyuan Guo, Zichao Guo, Chencheng Han, Ke Han, Changda He, Jinrong He, Di Huang, Houqi Huang, Junting Huang, Ruquan Hou, Yu Hou, Xiangdong Ji , et al. (77 additional authors not shown)

    Abstract: The PandaX-4T liquid xenon detector at the China Jinping Underground Laboratory is used to measure the solar $^8$B neutrino flux by detecting neutrinos through coherent scattering with xenon nuclei. Data samples requiring the coincidence of scintillation and ionization signals (paired), as well as unpaired ionization-only signals (US2), are selected with energy threshold of approximately 1.1 keV (… ▽ More

    Submitted 15 July, 2024; originally announced July 2024.

  7. arXiv:2407.10627  [pdf, other

    cs.CL cs.AI cs.LG

    Arena Learning: Build Data Flywheel for LLMs Post-training via Simulated Chatbot Arena

    Authors: Haipeng Luo, Qingfeng Sun, Can Xu, Pu Zhao, Qingwei Lin, Jianguang Lou, Shifeng Chen, Yansong Tang, Weizhu Chen

    Abstract: Assessing the effectiveness of large language models (LLMs) presents substantial challenges. The method of conducting human-annotated battles in an online Chatbot Arena is a highly effective evaluative technique. However, this approach is limited by the costs and time required for human annotation. In this paper, we introduce Arena Learning, an innovative offline strategy designed to simulate thes… ▽ More

    Submitted 15 July, 2024; originally announced July 2024.

  8. arXiv:2407.10339  [pdf, other

    hep-ex astro-ph.HE astro-ph.IM astro-ph.SR nucl-ex physics.ins-det

    Supernova Pointing Capabilities of DUNE

    Authors: DUNE Collaboration, A. Abed Abud, B. Abi, R. Acciarri, M. A. Acero, M. R. Adames, G. Adamov, M. Adamowski, D. Adams, M. Adinolfi, C. Adriano, A. Aduszkiewicz, J. Aguilar, B. Aimard, F. Akbar, K. Allison, S. Alonso Monsalve, M. Alrashed, A. Alton, R. Alvarez, T. Alves, H. Amar, P. Amedo, J. Anderson, D. A. Andrade , et al. (1340 additional authors not shown)

    Abstract: The determination of the direction of a stellar core collapse via its neutrino emission is crucial for the identification of the progenitor for a multimessenger follow-up. A highly effective method of reconstructing supernova directions within the Deep Underground Neutrino Experiment (DUNE) is introduced. The supernova neutrino pointing resolution is studied by simulating and reconstructing electr… ▽ More

    Submitted 14 July, 2024; originally announced July 2024.

    Comments: 25 pages, 16 figures

    Report number: FERMILAB-PUB-24-0319-LBNF

  9. arXiv:2407.10058  [pdf, other

    cs.CL cs.AI

    Learning to Refuse: Towards Mitigating Privacy Risks in LLMs

    Authors: Zhenhua Liu, Tong Zhu, Chuanyuan Tan, Wenliang Chen

    Abstract: Large language models (LLMs) exhibit remarkable capabilities in understanding and generating natural language. However, these models can inadvertently memorize private information, posing significant privacy risks. This study addresses the challenge of enabling LLMs to protect specific individuals' private data without the need for complete retraining. We propose \return, a Real-world pErsonal daT… ▽ More

    Submitted 13 July, 2024; originally announced July 2024.

  10. arXiv:2407.09928  [pdf, other

    physics.ins-det hep-ex

    Results for pixel and strip centimeter-scale AC-LGAD sensors with a 120 GeV proton beam

    Authors: Irene Dutta, Christopher Madrid, Ryan Heller, Shirsendu Nanda, Danush Shekar, Claudio San Martín, Matías Barría, Artur Apresyan, Zhenyu Ye, William K. Brooks, Wei Chen, Gabriele D'Amen, Gabriele Giacomini, Alessandro Tricoli, Aram Hayrapetyan, Hakseong Lee, Ohannes Kamer Köseyan, Sergey Los, Koji Nakamura, Sayuka Kita, Tomoka Imamura, Cristían Peña, Si Xie

    Abstract: We present the results of an extensive evaluation of strip and pixel AC-LGAD sensors tested with a 120 GeV proton beam, focusing on the influence of design parameters on the sensor temporal and spatial resolutions. Results show that reducing the thickness of pixel sensors significantly enhances their time resolution, with 20 $μ$m-thick sensors achieving around 20 ps. Uniform performance is attaina… ▽ More

    Submitted 13 July, 2024; originally announced July 2024.

  11. arXiv:2407.09893  [pdf, other

    cs.CL

    Synergistic Multi-Agent Framework with Trajectory Learning for Knowledge-Intensive Tasks

    Authors: Shengbin Yue, Siyuan Wang, Wei Chen, Xuanjing Huang, Zhongyu Wei

    Abstract: Recent advancements in Large Language Models (LLMs) have led to significant breakthroughs in various natural language processing tasks. However, generating factually consistent responses in knowledge-intensive scenarios remains a challenge due to issues such as hallucination, difficulty in acquiring long-tailed knowledge, and limited memory expansion. This paper introduces SMART, a novel multi-age… ▽ More

    Submitted 13 July, 2024; originally announced July 2024.

  12. arXiv:2407.08961  [pdf

    eess.IV cs.CV

    Tissue-Contrastive Semi-Masked Autoencoders for Segmentation Pretraining on Chest CT

    Authors: Jie Zheng, Ru Wen, Haiqin Hu, Lina Wei, Kui Su, Wei Chen, Chen Liu, Jun Wang

    Abstract: Existing Masked Image Modeling (MIM) depends on a spatial patch-based masking-reconstruction strategy to perceive objects'features from unlabeled images, which may face two limitations when applied to chest CT: 1) inefficient feature learning due to complex anatomical details presented in CT images, and 2) suboptimal knowledge transfer owing to input disparity between upstream and downstream model… ▽ More

    Submitted 11 July, 2024; originally announced July 2024.

  13. arXiv:2407.08949  [pdf, other

    cs.CV

    One-Shot Pose-Driving Face Animation Platform

    Authors: He Feng, Donglin Di, Yongjia Ma, Wei Chen, Tonghua Su

    Abstract: The objective of face animation is to generate dynamic and expressive talking head videos from a single reference face, utilizing driving conditions derived from either video or audio inputs. Current approaches often require fine-tuning for specific identities and frequently fail to produce expressive videos due to the limited effectiveness of Wav2Pose modules. To facilitate the generation of one-… ▽ More

    Submitted 11 July, 2024; originally announced July 2024.

  14. arXiv:2407.08693  [pdf, other

    cs.RO cs.LG

    Robotic Control via Embodied Chain-of-Thought Reasoning

    Authors: Michał Zawalski, William Chen, Karl Pertsch, Oier Mees, Chelsea Finn, Sergey Levine

    Abstract: A key limitation of learned robot control policies is their inability to generalize outside their training data. Recent works on vision-language-action models (VLAs) have shown that the use of large, internet pre-trained vision-language models as the backbone of learned robot policies can substantially improve their robustness and generalization ability. Yet, one of the most exciting capabilities… ▽ More

    Submitted 12 July, 2024; v1 submitted 11 July, 2024; originally announced July 2024.

    Comments: Project Website: https://embodied-cot.github.io

  15. arXiv:2407.08559  [pdf

    physics.app-ph

    Study of a Novel Capacitive Pressure Sensor Using Spiral Comb Electrodes

    Authors: Wenjie Chen, Qi Yang, Qi Liu, Yiqun Zhang, Liang He, Yuanlin Xia, Zhuqing Wang, Yubo Huang, Jianfeng Chen, Cao Xia

    Abstract: For traditional capacitive pressure sensors, high nonlinearity and poor sensitivity greatly limited their sensing applications. Hence, an innovative design of capacitors based on spiral comb electrodes is proposed for high-sensitivity pressure detection in this work. Compared to traditional capacitive pressure sensors with straight plate electrodes, the proposed sensor with the spiral electrodes i… ▽ More

    Submitted 11 July, 2024; originally announced July 2024.

    Comments: 20 pages, 14 figures

    MSC Class: -

  16. arXiv:2407.08370  [pdf

    nlin.CD cond-mat.soft

    Nonlinear vibration and stability of a dielectric elastomer balloon based on a strain-stiffening model

    Authors: Amin Alibakhshi, Weiqiu Chen, Michel Destrade

    Abstract: Limiting chain extensibility is a characteristic that plays a vital role in the stretching of highly elastic materials. The Gent model has been widely used to capture this behaviour, as it performs very well in fitting stress-stretch data in simple tension, and involves two material parameters only. Recently, Anssari-Benam and Bucchi [Int. J. Non. Linear. Mech. 2021, 128, 103626] introduced a diff… ▽ More

    Submitted 11 July, 2024; originally announced July 2024.

    Journal ref: Journal of Elasticity, 153 (2022) 533-548

  17. arXiv:2407.08353  [pdf

    cond-mat.mtrl-sci

    One-dimensional flat bands in phosphorene nanoribbons with pentagonal nature

    Authors: Shuo Sun, Jing-Yang You, Zhihao Cai, Jie Su, Tong Yang, Xinnan Peng, Yihe Wang, Daiyu Geng, Jian Gou, Yuli Huang, Sisheng Duan, Lan Chen, Kehui Wu, Andrew T. S. Wee, Yuan Ping Feng, Jia Lin Zhang, Jiong Lu, Baojie Feng, Wei Chen

    Abstract: Materials with topological flat bands can serve as a promising platform to investigate strongly interacting phenomena. However, experimental realization of ideal flat bands is mostly limited to artificial lattices or moiré systems. Here we report a general way to construct one-dimensional (1D) flat bands in phosphorene nanoribbons (PNRs) with pentagonal nature: penta-hexa-PNRs and penta-dodeca-PNR… ▽ More

    Submitted 11 July, 2024; originally announced July 2024.

    Comments: 13 pages, 4 figures

  18. arXiv:2407.07577  [pdf, other

    cs.CV cs.AI

    IDA-VLM: Towards Movie Understanding via ID-Aware Large Vision-Language Model

    Authors: Yatai Ji, Shilong Zhang, Jie Wu, Peize Sun, Weifeng Chen, Xuefeng Xiao, Sidi Yang, Yujiu Yang, Ping Luo

    Abstract: The rapid advancement of Large Vision-Language models (LVLMs) has demonstrated a spectrum of emergent capabilities. Nevertheless, current models only focus on the visual content of a single scenario, while their ability to associate instances across different scenes has not yet been explored, which is essential for understanding complex visual content, such as movies with multiple characters and i… ▽ More

    Submitted 10 July, 2024; originally announced July 2024.

  19. arXiv:2407.07575  [pdf, other

    cs.LG cs.NI

    Resource Allocation for Twin Maintenance and Computing Task Processing in Digital Twin Vehicular Edge Computing Network

    Authors: Yu Xie, Qiong Wu, Pingyi Fan, Nan Cheng, Wen Chen, Jiangzhou Wang, Khaled B. Letaief

    Abstract: As a promising technology, vehicular edge computing (VEC) can provide computing and caching services by deploying VEC servers near vehicles. However, VEC networks still face challenges such as high vehicle mobility. Digital twin (DT), an emerging technology, can predict, estimate, and analyze real-time states by digitally modeling objects in the physical world. By integrating DT with VEC, a virtua… ▽ More

    Submitted 10 July, 2024; originally announced July 2024.

    Comments: This paper has been submitted to IEEE Journal. The source code has been released at:https://github.com/qiongwu86/Resource-allocation-for-twin-maintenance-and-computing-tasks-in-digital-twin-mobile-edge-network

  20. arXiv:2407.07331  [pdf, ps, other

    cs.CV cs.AI

    Learning with Instance-Dependent Noisy Labels by Anchor Hallucination and Hard Sample Label Correction

    Authors: Po-Hsuan Huang, Chia-Ching Lin, Chih-Fan Hsu, Ming-Ching Chang, Wei-Chao Chen

    Abstract: Learning from noisy-labeled data is crucial for real-world applications. Traditional Noisy-Label Learning (NLL) methods categorize training data into clean and noisy sets based on the loss distribution of training samples. However, they often neglect that clean samples, especially those with intricate visual patterns, may also yield substantial losses. This oversight is particularly significant in… ▽ More

    Submitted 9 July, 2024; originally announced July 2024.

    Comments: ICIP 2024

  21. arXiv:2407.07061  [pdf, other

    cs.CL

    Internet of Agents: Weaving a Web of Heterogeneous Agents for Collaborative Intelligence

    Authors: Weize Chen, Ziming You, Ran Li, Yitong Guan, Chen Qian, Chenyang Zhao, Cheng Yang, Ruobing Xie, Zhiyuan Liu, Maosong Sun

    Abstract: The rapid advancement of large language models (LLMs) has paved the way for the development of highly capable autonomous agents. However, existing multi-agent frameworks often struggle with integrating diverse capable third-party agents due to reliance on agents defined within their own ecosystems. They also face challenges in simulating distributed environments, as most frameworks are limited to… ▽ More

    Submitted 10 July, 2024; v1 submitted 9 July, 2024; originally announced July 2024.

    Comments: work in progress

  22. arXiv:2407.06957  [pdf, other

    eess.AS cs.CL cs.CY

    Listen and Speak Fairly: A Study on Semantic Gender Bias in Speech Integrated Large Language Models

    Authors: Yi-Cheng Lin, Tzu-Quan Lin, Chih-Kai Yang, Ke-Han Lu, Wei-Chih Chen, Chun-Yi Kuan, Hung-yi Lee

    Abstract: Speech Integrated Large Language Models (SILLMs) combine large language models with speech perception to perform diverse tasks, such as emotion recognition to speaker verification, demonstrating universal audio understanding capability. However, these models may amplify biases present in training data, potentially leading to biased access to information for marginalized groups. This work introduce… ▽ More

    Submitted 9 July, 2024; originally announced July 2024.

  23. arXiv:2407.06905  [pdf, ps, other

    math.AP

    Blowing-up solutions for the Choquard type Brezis-Nirenberg problem in dimension three

    Authors: Wenjing Chen, Zexi Wang

    Abstract: In this paper, we are interested in the existence of solutions for the following Choquard type Brezis-Nirenberg problem \begin{align*} \left\{ \begin{array}{ll} -Δu=\displaystyle\Big(\int\limits_Ω\frac{u^{6-α}(y)}{|x-y|^α}dy\Big)u^{5-α}+λu, \ \ &\mbox{in}\ Ω, u=0, \ \ &\mbox{on}\ \partial Ω, \end{array} \right. \end{align*} where $Ω$ is a smooth bounded domain in $\mathbb{R}^3$,… ▽ More

    Submitted 9 July, 2024; originally announced July 2024.

  24. arXiv:2407.06886  [pdf, other

    cs.CV cs.AI cs.LG cs.MA cs.RO

    Aligning Cyber Space with Physical World: A Comprehensive Survey on Embodied AI

    Authors: Yang Liu, Weixing Chen, Yongjie Bai, Jingzhou Luo, Xinshuai Song, Kaixuan Jiang, Zhida Li, Ganlong Zhao, Junyi Lin, Guanbin Li, Wen Gao, Liang Lin

    Abstract: Embodied Artificial Intelligence (Embodied AI) is crucial for achieving Artificial General Intelligence (AGI) and serves as a foundation for various applications that bridge cyberspace and the physical world. Recently, the emergence of Multi-modal Large Models (MLMs) and World Models (WMs) have attracted significant attention due to their remarkable perception, interaction, and reasoning capabilit… ▽ More

    Submitted 11 July, 2024; v1 submitted 9 July, 2024; originally announced July 2024.

    Comments: The first comprehensive review of Embodied AI in the era of MLMs, 37 pages. We also provide the paper list for Embodied AI: https://github.com/HCPLab-SYSU/Embodied_AI_Paper_List

  25. arXiv:2407.06767  [pdf, other

    cs.IT eess.SP

    Enhancing Robustness and Security in ISAC Network Design: Leveraging Transmissive Reconfigurable Intelligent Surface with RSMA

    Authors: Ziwei Liu, Wen Chen, Qingqing Wu, Zhendong Li, Xusheng Zhu, Qiong Wu, Nan Cheng

    Abstract: In this paper, we propose a novel transmissive reconfigurable intelligent surface transceiver-enhanced robust and secure integrated sensing and communication network. A time-division sensing communication mechanism is designed for the scenario, which enables communication and sensing to share wireless resources. To address the interference management problem and hinder eavesdropping, we implement… ▽ More

    Submitted 9 July, 2024; originally announced July 2024.

  26. arXiv:2407.06518  [pdf, other

    cs.LG cs.NI

    Graph Neural Networks and Deep Reinforcement Learning Based Resource Allocation for V2X Communications

    Authors: Maoxin Ji, Qiong Wu, Pingyi Fan, Nan Cheng, Wen Chen, Jiangzhou Wang, Khaled B. Letaief

    Abstract: In the rapidly evolving landscape of Internet of Vehicles (IoV) technology, Cellular Vehicle-to-Everything (C-V2X) communication has attracted much attention due to its superior performance in coverage, latency, and throughput. Resource allocation within C-V2X is crucial for ensuring the transmission of safety information and meeting the stringent requirements for ultra-low latency and high reliab… ▽ More

    Submitted 8 July, 2024; originally announced July 2024.

    Comments: 14 pages, 11 figures. This paper has been submitted to IEEE Journal. The source code has been released at: https://github.com/qiongwu86/GNN-and-DRL-Based-Resource-Allocation-for-V2X-Communications

  27. arXiv:2407.06027  [pdf, other

    cs.CL

    PAS: Data-Efficient Plug-and-Play Prompt Augmentation System

    Authors: Miao Zheng, Hao Liang, Fan Yang, Haoze Sun, Tianpeng Li, Lingchu Xiong, Yan Zhang, Youzhen Wu, Kun Li, Yanjun Shen, Mingan Lin, Tao Zhang, Guosheng Dong, Yujing Qiao, Kun Fang, Weipeng Chen, Bin Cui, Wentao Zhang, Zenan Zhou

    Abstract: In recent years, the rise of Large Language Models (LLMs) has spurred a growing demand for plug-and-play AI systems. Among the various AI techniques, prompt engineering stands out as particularly significant. However, users often face challenges in writing prompts due to the steep learning curve and significant time investment, and existing automatic prompt engineering (APE) models can be difficul… ▽ More

    Submitted 12 July, 2024; v1 submitted 8 July, 2024; originally announced July 2024.

  28. arXiv:2407.05331  [pdf, ps, other

    eess.SY

    Channel Characterization of IRS-assisted Resonant Beam Communication Systems

    Authors: Wen Fang, Wen Chen, Qingqing Wu, Xusheng Zhu, Qiong Wu, Nan Cheng

    Abstract: To meet the growing demand for data traffic, spectrum-rich optical wireless communication (OWC) has emerged as a key technological driver for the development of 6G. The resonant beam communication (RBC) system, which employs spatially separated laser cavities as the transmitter and receiver, is a high-speed OWC technology capable of self-alignment without tracking. However, its transmission throug… ▽ More

    Submitted 7 July, 2024; originally announced July 2024.

  29. arXiv:2407.05216  [pdf, other

    cs.CL

    Large Language Model as an Assignment Evaluator: Insights, Feedback, and Challenges in a 1000+ Student Course

    Authors: Cheng-Han Chiang, Wei-Chih Chen, Chun-Yi Kuan, Chienchou Yang, Hung-yi Lee

    Abstract: Using large language models (LLMs) for automatic evaluation has become an important evaluation method in NLP research. However, it is unclear whether these LLM-based evaluators can be applied in real-world classrooms to assess student assignments. This empirical report shares how we use GPT-4 as an automatic assignment evaluator in a university course with 1,028 students. Based on student response… ▽ More

    Submitted 6 July, 2024; originally announced July 2024.

    Comments: An empirical report of our course: Introduction to Generative AI 2024 Spring (https://speech.ee.ntu.edu.tw/~hylee/genai/2024-spring.php)

  30. arXiv:2407.05052  [pdf, other

    eess.SP

    A Marginal Distributionally Robust Kalman Filter for Centralized Fusion

    Authors: Weizhi Chen

    Abstract: State estimation is a fundamental problem for multi-sensor information fusion, essential in applications such as target tracking, power systems, and control automation. Previous research mostly ignores the correlation between sensors and assumes independent or known distributions. However, in practice, these distributions are often correlated and difAcult to estimate. This paper proposes a novel m… ▽ More

    Submitted 6 July, 2024; originally announced July 2024.

  31. arXiv:2407.04958  [pdf, other

    cs.LG cs.CV

    Entropy-Informed Weighting Channel Normalizing Flow

    Authors: Wei Chen, Shian Du, Shigui Li, Delu Zeng, John Paisley

    Abstract: Normalizing Flows (NFs) have gained popularity among deep generative models due to their ability to provide exact likelihood estimation and efficient sampling. However, a crucial limitation of NFs is their substantial memory requirements, arising from maintaining the dimension of the latent space equal to that of the input space. Multi-scale architectures bypass this limitation by progressively re… ▽ More

    Submitted 6 July, 2024; originally announced July 2024.

  32. arXiv:2407.04888  [pdf, other

    eess.IV cs.CV

    Unraveling Radiomics Complexity: Strategies for Optimal Simplicity in Predictive Modeling

    Authors: Mahdi Ait Lhaj Loutfi, Teodora Boblea Podasca, Alex Zwanenburg, Taman Upadhaya, Jorge Barrios, David R. Raleigh, William C. Chen, Dante P. I. Capaldi, Hong Zheng, Olivier Gevaert, Jing Wu, Alvin C. Silva, Paul J. Zhang, Harrison X. Bai, Jan Seuntjens, Steffen Löck, Patrick O. Richard, Olivier Morin, Caroline Reinhold, Martin Lepage, Martin Vallières

    Abstract: Background: The high dimensionality of radiomic feature sets, the variability in radiomic feature types and potentially high computational requirements all underscore the need for an effective method to identify the smallest set of predictive features for a given clinical problem. Purpose: Develop a methodology and tools to identify and explain the smallest set of predictive radiomic features. Mat… ▽ More

    Submitted 5 July, 2024; originally announced July 2024.

  33. arXiv:2407.04804  [pdf, other

    cs.LG cs.CY cs.DS

    Fair Submodular Cover

    Authors: Wenjing Chen, Shuo Xing, Samson Zhou, Victoria G. Crawford

    Abstract: Submodular optimization is a fundamental problem with many applications in machine learning, often involving decision-making over datasets with sensitive attributes such as gender or age. In such settings, it is often desirable to produce a diverse solution set that is fairly distributed with respect to these attributes. Motivated by this, we initiate the study of Fair Submodular Cover (FSC), wher… ▽ More

    Submitted 5 July, 2024; originally announced July 2024.

  34. arXiv:2407.04751  [pdf, ps, other

    cs.LG cs.AI cs.CR

    A Unified Learn-to-Distort-Data Framework for Privacy-Utility Trade-off in Trustworthy Federated Learning

    Authors: Xiaojin Zhang, Mingcong Xu, Wei Chen

    Abstract: In this paper, we first give an introduction to the theoretical basis of the privacy-utility equilibrium in federated learning based on Bayesian privacy definitions and total variation distance privacy definitions. We then present the \textit{Learn-to-Distort-Data} framework, which provides a principled approach to navigate the privacy-utility equilibrium by explicitly modeling the distortion intr… ▽ More

    Submitted 9 July, 2024; v1 submitted 5 July, 2024; originally announced July 2024.

  35. arXiv:2407.04675  [pdf, other

    eess.AS cs.SD

    Seed-ASR: Understanding Diverse Speech and Contexts with LLM-based Speech Recognition

    Authors: Ye Bai, Jingping Chen, Jitong Chen, Wei Chen, Zhuo Chen, Chuang Ding, Linhao Dong, Qianqian Dong, Yujiao Du, Kepan Gao, Lu Gao, Yi Guo, Minglun Han, Ting Han, Wenchao Hu, Xinying Hu, Yuxiang Hu, Deyu Hua, Lu Huang, Mingkun Huang, Youjia Huang, Jishuo Jin, Fanliu Kong, Zongwei Lan, Tianyu Li , et al. (30 additional authors not shown)

    Abstract: Modern automatic speech recognition (ASR) model is required to accurately transcribe diverse speech signals (from different domains, languages, accents, etc) given the specific contextual information in various application scenarios. Classic end-to-end models fused with extra language models perform well, but mainly in data matching scenarios and are gradually approaching a bottleneck. In this wor… ▽ More

    Submitted 10 July, 2024; v1 submitted 5 July, 2024; originally announced July 2024.

  36. arXiv:2407.04336  [pdf, ps, other

    eess.SP cs.AI

    AI-Based Beam-Level and Cell-Level Mobility Management for High Speed Railway Communications

    Authors: Wen Li, Wei Chen, Shiyue Wang, Yuanyuan Zhang, Michail Matthaiou, Bo Ai

    Abstract: High-speed railway (HSR) communications are pivotal for ensuring rail safety, operations, maintenance, and delivering passenger information services. The high speed of trains creates rapidly time-varying wireless channels, increases the signaling overhead, and reduces the system throughput, making it difficult to meet the growing and stringent needs of HSR applications. In this article, we explore… ▽ More

    Submitted 5 July, 2024; originally announced July 2024.

  37. arXiv:2407.03992  [pdf, other

    eess.IV

    Performance of Medical Image Fusion in High-level Analysis Tasks: A Mutual Enhancement Framework for Unaligned PAT and MRI Image Fusion

    Authors: Yutian Zhong, Jinchuan He, Zhichao Liang, Shuangyang Zhang, Qianjin Feng, Wufan Chen, Li Qi

    Abstract: Photoacoustic tomography (PAT) offers optical contrast, whereas magnetic resonance imaging (MRI) excels in imaging soft tissue and organ anatomy. The fusion of PAT with MRI holds promising application prospects due to their complementary advantages. Existing image fusion have made considerable progress in pre-registered images, yet spatial deformations are difficult to avoid in medical imaging sce… ▽ More

    Submitted 4 July, 2024; originally announced July 2024.

  38. arXiv:2407.03750  [pdf, other

    cs.DB

    GriDB: Scaling Blockchain Database via Sharding and Off-Chain Cross-Shard Mechanism

    Authors: Zicong Hong, Song Guo, Enyuan Zhou, Wuhui Chen, Huawei Huang, Albert Zomaya

    Abstract: Blockchain databases have attracted widespread attention but suffer from poor scalability due to underlying non-scalable blockchains. While blockchain sharding is necessary for a scalable blockchain database, it poses a new challenge named on-chain cross-shard database services. Each cross-shard database service (e.g., cross-shard queries or inter-shard load balancing) involves massive cross-shard… ▽ More

    Submitted 4 July, 2024; originally announced July 2024.

  39. arXiv:2407.03502  [pdf, other

    cs.AI cs.CL cs.LG

    AgentInstruct: Toward Generative Teaching with Agentic Flows

    Authors: Arindam Mitra, Luciano Del Corro, Guoqing Zheng, Shweti Mahajan, Dany Rouhana, Andres Codas, Yadong Lu, Wei-ge Chen, Olga Vrousgos, Corby Rosset, Fillipe Silva, Hamed Khanpour, Yash Lara, Ahmed Awadallah

    Abstract: Synthetic data is becoming increasingly important for accelerating the development of language models, both large and small. Despite several successful use cases, researchers also raised concerns around model collapse and drawbacks of imitating other models. This discrepancy can be attributed to the fact that synthetic data varies in quality and diversity. Effective use of synthetic data usually r… ▽ More

    Submitted 3 July, 2024; originally announced July 2024.

  40. arXiv:2407.02631  [pdf, other

    cs.CL cs.AI cs.SD eess.AS

    Nollywood: Let's Go to the Movies!

    Authors: John E. Ortega, Ibrahim Said Ahmad, William Chen

    Abstract: Nollywood, based on the idea of Bollywood from India, is a series of outstanding movies that originate from Nigeria. Unfortunately, while the movies are in English, they are hard to understand for many native speakers due to the dialect of English that is spoken. In this article, we accomplish two goals: (1) create a phonetic sub-title model that is able to translate Nigerian English speech to Ame… ▽ More

    Submitted 2 July, 2024; originally announced July 2024.

    Comments: 8 pages, 4 figures, 2 tables

  41. arXiv:2407.02601  [pdf, other

    cs.LG cs.DS

    Linear Submodular Maximization with Bandit Feedback

    Authors: Wenjing Chen, Victoria G. Crawford

    Abstract: Submodular optimization with bandit feedback has recently been studied in a variety of contexts. In a number of real-world applications such as diversified recommender systems and data summarization, the submodular function exhibits additional linear structure. We consider developing approximation algorithms for the maximization of a submodular objective function $f:2^U\to\mathbb{R}_{\geq 0}$, whe… ▽ More

    Submitted 2 July, 2024; originally announced July 2024.

  42. arXiv:2407.02342  [pdf, ps, other

    cs.LG cs.DC cs.MA cs.NI

    Optimizing Age of Information in Vehicular Edge Computing with Federated Graph Neural Network Multi-Agent Reinforcement Learning

    Authors: Wenhua Wang, Qiong Wu, Pingyi Fan, Nan Cheng, Wen Chen, Jiangzhou Wang, Khaled B. Letaief

    Abstract: With the rapid development of intelligent vehicles and Intelligent Transport Systems (ITS), the sensors such as cameras and LiDAR installed on intelligent vehicles provides higher capacity of executing computation-intensive and delay-sensitive tasks, thereby raising deployment costs. To address this issue, Vehicular Edge Computing (VEC) has been proposed to process data through Road Side Units (RS… ▽ More

    Submitted 1 July, 2024; originally announced July 2024.

    Comments: This paper has been submitted to IEEE Journal. The source code has been released at: https://github.com/qiongwu86/Optimizing-AoI-in-VEC-with-Federated-Graph-Neural-Network-Multi-Agent-Reinforcement-Learning

  43. arXiv:2407.02235  [pdf

    cs.CL

    Towards a Holistic Framework for Multimodal Large Language Models in Three-dimensional Brain CT Report Generation

    Authors: Cheng-Yi Li, Kao-Jung Chang, Cheng-Fu Yang, Hsin-Yu Wu, Wenting Chen, Hritik Bansal, Ling Chen, Yi-Ping Yang, Yu-Chun Chen, Shih-Pin Chen, Jiing-Feng Lirng, Kai-Wei Chang, Shih-Hwa Chiou

    Abstract: Multi-modal large language models (MLLMs) have been given free rein to explore exciting medical applications with a primary focus on radiology report generation. Nevertheless, the preliminary success in 2D radiology captioning is incompetent to reflect the real-world diagnostic challenge in the volumetric 3D anatomy. To mitigate three crucial limitation aspects in the existing literature, includin… ▽ More

    Submitted 2 July, 2024; originally announced July 2024.

    Comments: 6 figures, 5 supplementary figures, 8 supplementary tables

  44. arXiv:2407.01926  [pdf

    physics.med-ph cs.CV

    Chemical Shift Encoding based Double Bonds Quantification in Triglycerides using Deep Image Prior

    Authors: Chaoxing Huang, Ziqiang Yu, Zijian Gao, Qiuyi Shen, Queenie Chan, Vincent Wai-Sun Wong, Winnie Chiu-Wing Chu, Weitian Chen

    Abstract: This study evaluated a deep learning-based method using Deep Image Prior (DIP) to quantify triglyceride double bonds from chemical-shift encoded multi-echo gradient echo images without network training. We employed a cost function based on signal constraints to iteratively update the neural network on a single dataset. The method was validated using phantom experiments and in vivo scans. Results s… ▽ More

    Submitted 3 July, 2024; v1 submitted 1 July, 2024; originally announced July 2024.

  45. arXiv:2407.01893  [pdf, other

    cs.HC

    CausalPrism: A Visual Analytics Approach for Subgroup-based Causal Heterogeneity Exploration

    Authors: Jiehui Zhou, Xumeng Wang, Wong Kam-Kwai, Wei Zhang, Xingyu Liu, Juntian Zhang, Minfeng Zhu, Wei Chen

    Abstract: In causal inference, estimating Heterogeneous Treatment Effects (HTEs) from observational data is critical for understanding how different subgroups respond to treatments, with broad applications such as precision medicine and targeted advertising. However, existing work on HTE, subgroup discovery, and causal visualization is insufficient to address two challenges: first, the sheer number of poten… ▽ More

    Submitted 1 July, 2024; originally announced July 2024.

    Comments: 12 pages, 7 figures

  46. arXiv:2407.01613  [pdf, other

    cs.LG cs.AI stat.ML

    Self-adaptive weights based on balanced residual decay rate for physics-informed neural networks and deep operator networks

    Authors: Wenqian Chen, Amanda A. Howard, Panos Stinis

    Abstract: Physics-informed deep learning has emerged as a promising alternative for solving partial differential equations. However, for complex problems, training these networks can still be challenging, often resulting in unsatisfactory accuracy and efficiency. In this work, we demonstrate that the failure of plain physics-informed neural networks arises from the significant discrepancy in the convergence… ▽ More

    Submitted 27 June, 2024; originally announced July 2024.

    Comments: 13 figures, 4 tables

    Report number: PNNL-SA-199965

  47. arXiv:2407.01239  [pdf, other

    cs.CV cs.AI

    SGCCNet: Single-Stage 3D Object Detector With Saliency-Guided Data Augmentation and Confidence Correction Mechanism

    Authors: Ao Liang, Wenyu Chen, Jian Fang, Huaici Zhao

    Abstract: The single-stage point-based 3D object detectors have attracted widespread research interest due to their advantages of lightweight and fast inference speed. However, they still face challenges such as inadequate learning of low-quality objects (ILQ) and misalignment between localization accuracy and classification confidence (MLC). In this paper, we propose SGCCNet to alleviate these two issues.… ▽ More

    Submitted 1 July, 2024; originally announced July 2024.

    Comments: 16 pages, 16 figures

  48. arXiv:2407.01027  [pdf, other

    cs.CV

    Blind Inversion using Latent Diffusion Priors

    Authors: Weimin Bai, Siyi Chen, Wenzheng Chen, He Sun

    Abstract: Diffusion models have emerged as powerful tools for solving inverse problems due to their exceptional ability to model complex prior distributions. However, existing methods predominantly assume known forward operators (i.e., non-blind), limiting their applicability in practical settings where acquiring such operators is costly. Additionally, many current approaches rely on pixel-space diffusion m… ▽ More

    Submitted 1 July, 2024; originally announced July 2024.

  49. arXiv:2407.01014  [pdf, other

    cs.CV

    An Expectation-Maximization Algorithm for Training Clean Diffusion Models from Corrupted Observations

    Authors: Weimin Bai, Yifei Wang, Wenzheng Chen, He Sun

    Abstract: Diffusion models excel in solving imaging inverse problems due to their ability to model complex image priors. However, their reliance on large, clean datasets for training limits their practical use where clean data is scarce. In this paper, we propose EMDiffusion, an expectation-maximization (EM) approach to train diffusion models from corrupted observations. Our method alternates between recons… ▽ More

    Submitted 1 July, 2024; originally announced July 2024.

  50. arXiv:2407.01006  [pdf, other

    eess.SP

    Multi-Functional Beamforming Design for Integrated Sensing, Communication, and Computation

    Authors: Yapeng Zhao, Qingqing Wu, Wen Chen, Yong Zeng, Ruiqi Liu, Weidong Mei, Fen Hou, Shaodan Ma

    Abstract: Integrated sensing and communication (ISAC) systems may face a heavy computation burden since the sensory data needs to be further processed. This paper studies a novel system that integrates sensing, communication, and computation, aiming to provide services for different objectives efficiently. This system consists of a multi-antenna multi-functional base station (BS), an edge server, a target,… ▽ More

    Submitted 1 July, 2024; originally announced July 2024.