Showing 1–50 of 7,416 results for author: Wang, C

  1. arXiv:2407.11935  [pdf, other]

    cs.CV

    Learning Multi-view Anomaly Detection

    Authors: Haoyang He, Jiangning Zhang, Guanzhong Tian, Chengjie Wang, Lei Xie

    Abstract: This study explores the recently proposed challenging multi-view Anomaly Detection (AD) task. Single-view tasks would encounter blind spots from other perspectives, resulting in inaccuracies in sample-level prediction. Therefore, we introduce the Multi-View Anomaly Detection (MVAD) framework, which learns and integrates features from multiple views. Specif…

    Submitted 16 July, 2024; originally announced July 2024.

    Comments: 10 pages

  2. arXiv:2407.11628  [pdf, ps, other]

    math.OC math.NA

    Block triangular preconditioning for elliptic boundary optimal control with mixed boundary conditions

    Authors: Chaojie Wang

    Abstract: In this paper, preconditioning the saddle point problem arising from the elliptic boundary optimal control problem with mixed boundary conditions is considered. A block triangular preconditioning method is proposed based on permutations of the saddle point problem and approximations of the corresponding Schur complement. The spectral properties of the preconditioned matrix are analyzed. Numerical ex…

    Submitted 16 July, 2024; originally announced July 2024.

  3. arXiv:2407.11474  [pdf, other]

    hep-ex

    Search for the rare $Λ_c^+ \to p μ^+ μ^-$ decay

    Authors: LHCb collaboration, R. Aaij, A. S. W. Abdelmotteleb, C. Abellan Beteta, F. Abudinén, T. Ackernley, A. A. Adefisoye, B. Adeva, M. Adinolfi, P. Adlarson, C. Agapopoulou, C. A. Aidala, Z. Ajaltouni, S. Akar, K. Akiba, P. Albicocco, J. Albrecht, F. Alessio, M. Alexander, Z. Aliouche, P. Alvarez Cartelle, R. Amalric, S. Amato, J. L. Amey, Y. Amhis, et al. (1062 additional authors not shown)

    Abstract: A search for the nonresonant $Λ_c^+ \to p μ^+ μ^-$ decay is performed using proton-proton collision data recorded at a centre-of-mass energy of 13 TeV by the LHCb experiment, corresponding to an integrated luminosity of 5.4 fb$^{-1}$. No evidence for the decay is found in the dimuon invariant-mass regions where the expected contributions of resonances are subdominant. The upper limit on the branchi…

    Submitted 16 July, 2024; originally announced July 2024.

    Comments: All figures and tables, along with any supplementary material and additional information, are available at https://cern.ch/lhcbproject/Publications/p/LHCb-PAPER-2024-005.html (LHCb public pages)

    Report number: LHCb-PAPER-2024-005, CERN-EP-2024-158

  4. arXiv:2407.11364  [pdf, ps, other]

    cs.DS cs.LG

    Learning-augmented Maximum Independent Set

    Authors: Vladimir Braverman, Prathamesh Dharangutte, Vihan Shah, Chen Wang

    Abstract: We study the Maximum Independent Set (MIS) problem on general graphs within the framework of learning-augmented algorithms. The MIS problem is known to be NP-hard and is also NP-hard to approximate to within a factor of $n^{1-δ}$ for any $δ>0$. We show that we can break this barrier in the presence of an oracle obtained through predictions from a machine learning model that answers vertex membersh…

    Submitted 16 July, 2024; originally announced July 2024.

    Comments: APPROX 2024

  5. arXiv:2407.11343  [pdf, other]

    cs.CV

    Ev-GS: Event-based Gaussian splatting for Efficient and Accurate Radiance Field Rendering

    Authors: Jingqian Wu, Shuo Zhu, Chutian Wang, Edmund Y. Lam

    Abstract: Computational neuromorphic imaging (CNI) with event cameras offers advantages such as minimal motion blur and enhanced dynamic range, compared to conventional frame-based methods. Existing event-based radiance field rendering methods are built on neural radiance fields, which are computationally heavy and slow in reconstruction. Motivated by these two aspects, we introduce Ev-GS, the first CNI-i…

    Submitted 15 July, 2024; originally announced July 2024.

  6. arXiv:2407.11208  [pdf]

    cond-mat.mtrl-sci physics.comp-ph

    Machine learning accelerated prediction of Ce-based ternary compounds involving antagonistic pairs

    Authors: Weiyi Xia, Wei-Shen Tee, Paul C. Canfield, Fernando Assis Garcia, Raquel D Ribeiro, Yongbin Lee, Liqin Ke, Rebecca Flint, Cai-Zhuang Wang

    Abstract: The discovery of novel quantum materials within ternary phase spaces containing antagonistic pairs, such as Fe with Bi, Pb, In, and Ag, presents significant challenges yet holds great potential. In this work, we investigate the stabilization of these immiscible pairs through the integration of Cerium (Ce), an abundant and cost-effective rare-earth element. By employing a machine learning (ML)-guided…

    Submitted 15 July, 2024; originally announced July 2024.

  7. arXiv:2407.10975  [pdf]

    cs.OH cs.AI cs.CL

    Stream State-tying for Sign Language Recognition

    Authors: Jiyong Ma, Wen Gao, Chunli Wang

    Abstract: In this paper, a novel approach to sign language recognition based on state tying in each of the data streams is presented. In this framework, it is assumed that the hand gesture signal is represented in terms of six synchronous data streams, i.e., the left/right hand position, left/right hand orientation and left/right handshape. This approach offers a very accurate representation of the sign space and k…

    Submitted 21 April, 2024; originally announced July 2024.

  8. arXiv:2407.10467  [pdf, ps, other]

    math.GT

    A lower bound of the crossing number of composite knots

    Authors: Ruifeng Qiu, Chao Wang

    Abstract: Let $c(K)$ denote the crossing number of a knot $K$ and let $K_1\# K_2$ denote the connected sum of two oriented knots $K_1$ and $K_2$. It is a long-standing open question whether $c(K_1\# K_2)=c(K_1)+c(K_2)$. In this paper we show that $c(K_1\# K_2)> (c(K_1)+c(K_2))/16$.

    Submitted 15 July, 2024; originally announced July 2024.

    Comments: 46 pages, 45 figures

    MSC Class: Primary 57M25; Secondary 57N10

  9. arXiv:2407.10433  [pdf, other]

    cs.CV cs.AI

    A Multi-Stage Framework for 3D Individual Tooth Segmentation in Dental CBCT

    Authors: Chunshi Wang, Bin Zhao, Shuxue Ding

    Abstract: Cone beam computed tomography (CBCT) is a common way of diagnosing dental-related diseases. Accurate segmentation of 3D teeth is important for treatment. Although deep learning based methods have achieved convincing results in medical image processing, they need a large amount of annotated data for network training, making data collection and annotation very time-consuming. Besides, domain…

    Submitted 15 July, 2024; originally announced July 2024.

    Comments: Semi-supervised Tooth Segmentation MICCAI 2023 Challenge

  10. arXiv:2407.09417  [pdf, other]

    cs.CL cs.IR

    Mitigating Entity-Level Hallucination in Large Language Models

    Authors: Weihang Su, Yichen Tang, Qingyao Ai, Changyue Wang, Zhijing Wu, Yiqun Liu

    Abstract: The emergence of Large Language Models (LLMs) has revolutionized how users access information, shifting from traditional search engines to direct question-and-answer interactions with LLMs. However, the widespread adoption of LLMs has revealed a significant challenge known as hallucination, wherein LLMs generate coherent yet factually inaccurate responses. This hallucination phenomenon has led to…

    Submitted 12 July, 2024; originally announced July 2024.

  11. arXiv:2407.09329  [pdf, ps, other]

    math.FA

    Function spaces on formal manifolds

    Authors: Fulin Chen, Binyong Sun, Chuyun Wang

    Abstract: This is a paper in a series that studies smooth relative Lie algebra homologies and cohomologies based on the theory of formal manifolds and formal Lie groups. In a previous paper, we introduced the notion of formal manifolds and developed their foundational framework. In this paper, we study various function spaces on formal manifolds, including generalizations of vector-valued gen…

    Submitted 10 July, 2024; originally announced July 2024.

    Comments: The preprint arXiv:2401.01535v1 of ours was split into three separate papers. This is the second paper about vector-valued generalized functions and vector-valued distributions

  12. arXiv:2407.09247  [pdf, other]

    cs.AI

    Constrained Intrinsic Motivation for Reinforcement Learning

    Authors: Xiang Zheng, Xingjun Ma, Chao Shen, Cong Wang

    Abstract: This paper investigates two fundamental problems that arise when utilizing Intrinsic Motivation (IM) for reinforcement learning in Reward-Free Pre-Training (RFPT) tasks and Exploration with Intrinsic Motivation (EIM) tasks: 1) how to design an effective intrinsic objective in RFPT tasks, and 2) how to reduce the bias introduced by the intrinsic objective in EIM tasks. Existing IM methods suffer fr…

    Submitted 12 July, 2024; originally announced July 2024.

    Comments: Accepted by IJCAI 2024

  13. arXiv:2407.09048  [pdf, other]

    cs.AI

    KUNPENG: An Embodied Large Model for Intelligent Maritime

    Authors: Naiyao Wang, Tongbang Jiang, Ye Wang, Shaoyang Qiu, Bo Zhang, Xinqiang Xie, Munan Li, Chunliu Wang, Yiyang Wang, Hongxiang Ren, Ruili Wang, Hongjun Shan, Hongbo Liu

    Abstract: Intelligent maritime, as an essential component of smart ocean construction, deeply integrates advanced artificial intelligence technology and data analysis methods. It covers multiple aspects such as smart vessels, route optimization, and safe navigation, aiming to enhance the efficiency of ocean resource utilization and the intelligence of transportation networks. However, the complex and dynamic…

    Submitted 12 July, 2024; originally announced July 2024.

    Comments: 9 pages, 3 figures

  14. arXiv:2407.08948  [pdf, other]

    eess.IV cs.CV

    Symmetry Awareness Encoded Deep Learning Framework for Brain Imaging Analysis

    Authors: Yang Ma, Dongang Wang, Peilin Liu, Lynette Masters, Michael Barnett, Weidong Cai, Chenyu Wang

    Abstract: The heterogeneity of neurological conditions, ranging from structural anomalies to functional impairments, presents a significant challenge in medical imaging analysis tasks. Moreover, the limited availability of well-annotated datasets constrains the development of robust analysis models. Against this backdrop, this study introduces a novel approach leveraging the inherent anatomical symmetrical…

    Submitted 11 July, 2024; originally announced July 2024.

    Comments: MICCAI 2024

    ACM Class: I.2.10; I.4.10

  15. arXiv:2407.08865  [pdf, other]

    cs.CV

    Single-Image Shadow Removal Using Deep Learning: A Comprehensive Survey

    Authors: Lanqing Guo, Chong Wang, Yufei Wang, Siyu Huang, Wenhan Yang, Alex C. Kot, Bihan Wen

    Abstract: Shadow removal aims at restoring the image content within shadow regions, pursuing a uniform distribution of illumination that is consistent between shadow and non-shadow regions. Compared to other image restoration tasks, there are two unique challenges in shadow removal: 1) The patterns of shadows are arbitrary, varied, and often have highly complex trace structures, making "trace-less" ima…

    Submitted 11 July, 2024; originally announced July 2024.

    Comments: url: https://github.com/GuoLanqing/Awesome-Shadow-Removal

  16. arXiv:2407.08855  [pdf, other]

    eess.IV cs.CV

    BraTS-PEDs: Results of the Multi-Consortium International Pediatric Brain Tumor Segmentation Challenge 2023

    Authors: Anahita Fathi Kazerooni, Nastaran Khalili, Xinyang Liu, Debanjan Haldar, Zhifan Jiang, Anna Zapaishchykova, Julija Pavaine, Lubdha M. Shah, Blaise V. Jones, Nakul Sheth, Sanjay P. Prabhu, Aaron S. McAllister, Wenxin Tu, Khanak K. Nandolia, Andres F. Rodriguez, Ibraheem Salman Shaikh, Mariana Sanchez Montano, Hollie Anne Lai, Maruf Adewole, Jake Albrecht, Udunna Anazodo, Hannah Anderson, Syed Muhammed Anwar, Alejandro Aristizabal, Sina Bagheri, et al. (54 additional authors not shown)

    Abstract: Pediatric central nervous system tumors are the leading cause of cancer-related deaths in children. The five-year survival rate for high-grade glioma in children is less than 20%. The development of new treatments is dependent upon multi-institutional collaborative clinical trials requiring reproducible and accurate centralized response assessment. We present the results of the BraTS-PEDs 2023 cha…

    Submitted 11 July, 2024; originally announced July 2024.

  17. arXiv:2407.08726  [pdf, other]

    cs.CV

    Map It Anywhere (MIA): Empowering Bird's Eye View Mapping using Large-scale Public Data

    Authors: Cherie Ho, Jiaye Zou, Omar Alama, Sai Mitheran Jagadesh Kumar, Benjamin Chiang, Taneesh Gupta, Chen Wang, Nikhil Keetha, Katia Sycara, Sebastian Scherer

    Abstract: Top-down Bird's Eye View (BEV) maps are a popular representation for ground robot navigation due to their richness and flexibility for downstream tasks. While recent methods have shown promise for predicting BEV maps from First-Person View (FPV) images, their generalizability is limited to small regions captured by current autonomous vehicle-based datasets. In this context, we show that a more sca…

    Submitted 11 July, 2024; originally announced July 2024.

  18. arXiv:2407.08706  [pdf, other]

    cs.CV

    HiRes-LLaVA: Restoring Fragmentation Input in High-Resolution Large Vision-Language Models

    Authors: Runhui Huang, Xinpeng Ding, Chunwei Wang, Jianhua Han, Yulong Liu, Hengshuang Zhao, Hang Xu, Lu Hou, Wei Zhang, Xiaodan Liang

    Abstract: High-resolution inputs enable Large Vision-Language Models (LVLMs) to discern finer visual details, enhancing their comprehension capabilities. To reduce the training and computation costs caused by high-resolution input, one promising direction is to use sliding windows to slice the input into uniform patches, each matching the input size of the well-trained vision encoder. Although efficient, th…

    Submitted 11 July, 2024; originally announced July 2024.

  19. arXiv:2407.08224  [pdf, other]

    q-bio.QM cs.AI

    stEnTrans: Transformer-based deep learning for spatial transcriptomics enhancement

    Authors: Shuailin Xue, Fangfang Zhu, Changmiao Wang, Wenwen Min

    Abstract: The spatial location of cells within tissues and organs is crucial for the manifestation of their specific functions. Spatial transcriptomics technology enables comprehensive measurement of the gene expression patterns in tissues while retaining spatial information. However, current popular spatial transcriptomics techniques either have shallow sequencing depth or low resolution. We present stEnTra…

    Submitted 11 July, 2024; originally announced July 2024.

    Comments: ISBRA2024, Code: https://github.com/shuailinxue/stEnTrans

  20. arXiv:2407.08216  [pdf, other]

    eess.IV cs.AI cs.CV q-bio.QM

    Multimodal contrastive learning for spatial gene expression prediction using histology images

    Authors: Wenwen Min, Zhiceng Shi, Jun Zhang, Jun Wan, Changmiao Wang

    Abstract: In recent years, the advent of spatial transcriptomics (ST) technology has unlocked unprecedented opportunities for delving into the complexities of gene expression patterns within intricate biological systems. Despite its transformative potential, the prohibitive cost of ST technology remains a significant barrier to its widespread adoption in large-scale studies. An alternative, more cost-effect…

    Submitted 11 July, 2024; originally announced July 2024.

    Comments: BIB, Code: https://github.com/shizhiceng/mclSTExp

  21. arXiv:2407.08200  [pdf, other]

    cs.CV

    Deep Understanding of Soccer Match Videos

    Authors: Shikun Xu, Yandong Zhu, Gen Li, Changhu Wang

    Abstract: Soccer is one of the most popular sports worldwide, with live broadcasts frequently available for major matches. However, extracting detailed, frame-by-frame information on player actions from these videos remains a challenge. Utilizing state-of-the-art computer vision technologies, our system can detect key objects such as soccer balls, players and referees. It also tracks the movements of players…

    Submitted 11 July, 2024; originally announced July 2024.

  22. arXiv:2407.08199  [pdf, other]

    cs.CV

    SRPose: Two-view Relative Pose Estimation with Sparse Keypoints

    Authors: Rui Yin, Yulun Zhang, Zherong Pan, Jianjun Zhu, Cheng Wang, Biao Jia

    Abstract: Two-view pose estimation is essential for map-free visual relocalization and object pose tracking tasks. However, traditional matching methods suffer from time-consuming robust estimators, while deep learning-based pose regressors only cater to camera-to-world pose estimation, lacking generalizability to different image sizes and camera intrinsics. In this paper, we propose SRPose, a sparse keypoi…

    Submitted 11 July, 2024; originally announced July 2024.

    Comments: 30 pages, 11 figures, to be published in ECCV 2024

  23. arXiv:2407.08187  [pdf, other]

    cs.CV

    ScaleDepth: Decomposing Metric Depth Estimation into Scale Prediction and Relative Depth Estimation

    Authors: Ruijie Zhu, Chuxin Wang, Ziyang Song, Li Liu, Tianzhu Zhang, Yongdong Zhang

    Abstract: Estimating depth from a single image is a challenging visual task. Compared to relative depth estimation, metric depth estimation attracts more attention due to its practical physical significance and critical applications in real-life scenarios. However, existing metric depth estimation methods are typically trained on specific datasets with similar scenes, facing challenges in generalizing acros…

    Submitted 11 July, 2024; originally announced July 2024.

    Comments: 14 pages, 11 figures, 13 tables

  24. arXiv:2407.07954  [pdf]

    physics.med-ph cond-mat.mtrl-sci cond-mat.soft

    3D E-textile for Exercise Physiology and Clinical Maternal Health Monitoring

    Authors: Junyi Zhao, Chansoo Kim, Weilun Li, Zichao Wen, Zhili Xiao, Yong Wang, Shantanu Chakrabartty, Chuan Wang

    Abstract: Electronic textiles (E-textiles) offer great wearing comfort and unobtrusiveness, thus holding potential for next-generation health monitoring wearables. However, the practical implementation is hampered by challenges associated with poor signal quality, substantial motion artifacts, durability for long-term usage, and non-ideal user experience. Here, we report a cost-effective E-textile system th…

    Submitted 10 July, 2024; originally announced July 2024.

    Comments: 16 pages, 6 figures

  25. arXiv:2407.07697  [pdf]

    quant-ph

    Revealing spontaneous symmetry breaking in continuous time crystals

    Authors: Yuanjiang Tang, Chenyang Wang, Bei Liu, Jin Peng, Chao Liang, Yaohua Li, Xian Zhao, Cuicui Lu, Shuang Zhang, Yong-Chun Liu

    Abstract: Spontaneous symmetry breaking plays a pivotal role in physics ranging from the emergence of elementary particles to the phase transitions of matter. The spontaneous breaking of continuous time translation symmetry leads to a novel state of matter named continuous time crystal (CTC). It exhibits periodic oscillation without the need for periodic driving, and the relative phases for repetitively rea…

    Submitted 10 July, 2024; originally announced July 2024.

  26. arXiv:2407.07518  [pdf, other]

    cs.CV

    Multi-modal Crowd Counting via a Broker Modality

    Authors: Haoliang Meng, Xiaopeng Hong, Chenhao Wang, Miao Shang, Wangmeng Zuo

    Abstract: Multi-modal crowd counting involves estimating crowd density from both visual and thermal/depth images. This task is challenging due to the significant gap between these distinct modalities. In this paper, we propose a novel approach by introducing an auxiliary broker modality and on this basis frame the task as a triple-modal learning problem. We devise a fusion-based method to generate this brok…

    Submitted 10 July, 2024; originally announced July 2024.

    Comments: This is the preprint version of the paper and supplemental material to appear in ECCV 2024. Please cite the final published version. Code is available at https://github.com/HenryCilence/Broker-Modality-Crowd-Counting

  27. arXiv:2407.07099  [pdf, other]

    cs.CL cs.AI cs.GT cs.LG

    Nash CoT: Multi-Path Inference with Preference Equilibrium

    Authors: Ziqi Zhang, Cunxiang Wang, Xiong Xiao, Yue Zhang, Donglin Wang

    Abstract: Chain-of-thought (CoT) prompting has emerged as a powerful technique for enhancing the reasoning capabilities of Large Language Models (LLMs) on complex problems. Among CoT-related studies, self-consistency (Multi-path inference with answer filtering through voting) involves generating multiple reasoning paths using the CoT framework and then selecting the most frequently produced outputs standing…

    Submitted 18 June, 2024; originally announced July 2024.

  28. arXiv:2407.07020  [pdf, other]

    cs.AI cs.RO

    Less is More: Efficient Brain-Inspired Learning for Autonomous Driving Trajectory Prediction

    Authors: Haicheng Liao, Yongkang Li, Zhenning Li, Chengyue Wang, Chunlin Tian, Yuming Huang, Zilin Bian, Kaiqun Zhu, Guofa Li, Ziyuan Pu, Jia Hu, Zhiyong Cui, Chengzhong Xu

    Abstract: Accurately and safely predicting the trajectories of surrounding vehicles is essential for fully realizing autonomous driving (AD). This paper presents the Human-Like Trajectory Prediction model (HLTP++), which emulates human cognitive processes to improve trajectory prediction in AD. HLTP++ incorporates a novel teacher-student knowledge distillation framework. The "teacher" model equipped with an…

    Submitted 9 July, 2024; originally announced July 2024.

    Comments: arXiv admin note: substantial text overlap with arXiv:2402.19251

  29. arXiv:2407.06938  [pdf, other]

    cs.CV

    RodinHD: High-Fidelity 3D Avatar Generation with Diffusion Models

    Authors: Bowen Zhang, Yiji Cheng, Chunyu Wang, Ting Zhang, Jiaolong Yang, Yansong Tang, Feng Zhao, Dong Chen, Baining Guo

    Abstract: We present RodinHD, which can generate high-fidelity 3D avatars from a portrait image. Existing methods fail to capture intricate details such as hairstyles, which we tackle in this paper. We first identify an overlooked problem of catastrophic forgetting that arises when fitting triplanes sequentially on many avatars, caused by the MLP decoder sharing scheme. To overcome this issue, we raise a nov…

    Submitted 10 July, 2024; v1 submitted 9 July, 2024; originally announced July 2024.

    Comments: ECCV 2024; project page: https://rodinhd.github.io/

  30. arXiv:2407.06698  [pdf, ps, other]

    cs.CV cs.LG

    PSPU: Enhanced Positive and Unlabeled Learning by Leveraging Pseudo Supervision

    Authors: Chengjie Wang, Chengming Xu, Zhenye Gan, Jianlong Hu, Wenbing Zhu, Lizhuang Ma

    Abstract: Positive and Unlabeled (PU) learning, a binary classification model trained with only positive and unlabeled data, generally suffers from overfitted risk estimation due to inconsistent data distributions. To address this, we introduce a pseudo-supervised PU learning framework (PSPU), in which we train the PU model first, use it to gather confident samples for the pseudo supervision, and then apply…

    Submitted 9 July, 2024; originally announced July 2024.

    Comments: Accepted by ICME 2024

  31. arXiv:2407.05909  [pdf, other]

    cs.CV

    Multi-clue Consistency Learning to Bridge Gaps Between General and Oriented Object in Semi-supervised Detection

    Authors: Chenxu Wang, Chunyan Xu, Ziqi Gu, Zhen Cui

    Abstract: While existing semi-supervised object detection (SSOD) methods perform well in general scenes, they encounter challenges in handling oriented objects in aerial images. We experimentally find three gaps between general and oriented object detection in semi-supervised learning: 1) Sampling inconsistency: the common center sampling is not suitable for oriented objects with larger aspect ratios when s…

    Submitted 8 July, 2024; originally announced July 2024.

  32. arXiv:2407.05764  [pdf, other]

    eess.IV

    Neuromorphic Imaging with Super-Resolution

    Authors: Pei Zhang, Shuo Zhu, Chutian Wang, Yaping Zhao, Edmund Y. Lam

    Abstract: Neuromorphic imaging is a bio-inspired technique that imitates the human retina to sense variations in a dynamic scene. It responds to pixel-level brightness changes by asynchronous streaming events and boasts microsecond temporal precision over a high dynamic range, yielding blur-free recordings under extreme illumination. Nevertheless, such a modality falls short in spatial resolution and leads…

    Submitted 8 July, 2024; originally announced July 2024.

    Comments: 11 pages, 13 figures, and 3 tables

  33. arXiv:2407.05749  [pdf, other]

    eess.SP cs.HC cs.LG

    LDGCN: An Edge-End Lightweight Dual GCN Based on Single-Channel EEG for Driver Drowsiness Monitoring

    Authors: Jingwei Huang, Chuansheng Wang, Jiayan Huang, Haoyi Fan, Antoni Grau, Fuquan Zhang

    Abstract: Driver drowsiness electroencephalography (EEG) signal monitoring can timely alert drivers of their drowsiness status, thereby reducing the probability of traffic accidents. Graph convolutional networks (GCNs) have shown significant advancements in processing the non-stationary, time-varying, and non-Euclidean nature of EEG signals. However, the existing single-channel EEG adjacency graph construct…

    Submitted 8 July, 2024; originally announced July 2024.

  34. arXiv:2407.05540  [pdf, other]

    cs.CV

    GTP-4o: Modality-prompted Heterogeneous Graph Learning for Omni-modal Biomedical Representation

    Authors: Chenxin Li, Xinyu Liu, Cheng Wang, Yifan Liu, Weihao Yu, Jing Shao, Yixuan Yuan

    Abstract: Recent advances in multi-modal representation learning have seen success in biomedical domains. While established techniques can handle multi-modal information, challenges arise when they are extended to various clinical modalities and practical modality-missing settings, owing to the inherent modality gaps. To tackle these, we propose an innovative Modality-prompted Heterogeneous Graph fo…

    Submitted 7 July, 2024; originally announced July 2024.

    Comments: Accepted by ECCV2024

  35. arXiv:2407.05361  [pdf, other]

    eess.AS cs.CL

    Emilia: An Extensive, Multilingual, and Diverse Speech Dataset for Large-Scale Speech Generation

    Authors: Haorui He, Zengqiang Shang, Chaoren Wang, Xuyuan Li, Yicheng Gu, Hua Hua, Liwei Liu, Chen Yang, Jiaqi Li, Peiyang Shi, Yuancheng Wang, Kai Chen, Pengyuan Zhang, Zhizheng Wu

    Abstract: Recently, speech generation models have made significant progress by using large-scale training data. However, the research community struggles to produce highly spontaneous and human-like speech due to the lack of large-scale, diverse, and spontaneous speech data. This paper presents Emilia, the first multilingual speech generation dataset from in-the-wild speech data, and Emilia-Pipe, the first op…

    Submitted 12 July, 2024; v1 submitted 7 July, 2024; originally announced July 2024.

    Comments: Fix typos

  36. arXiv:2407.05358  [pdf, other]

    cs.CV

    CPM: Class-conditional Prompting Machine for Audio-visual Segmentation

    Authors: Yuanhong Chen, Chong Wang, Yuyuan Liu, Hu Wang, Gustavo Carneiro

    Abstract: Audio-visual segmentation (AVS) is an emerging task that aims to accurately segment sounding objects based on audio-visual cues. The success of AVS learning systems depends on the effectiveness of cross-modal interaction. Such a requirement can be naturally fulfilled by leveraging transformer-based segmentation architecture due to its inherent ability to capture long-range dependencies and flexibi…

    Submitted 15 July, 2024; v1 submitted 7 July, 2024; originally announced July 2024.

  37. arXiv:2407.05253  [pdf, other]

    math.NA

    A Third-order Implicit-Explicit Runge-Kutta Method for Landau-Lifshitz Equation with Arbitrary Damping Parameters

    Authors: Yan Gui, Rui Du, Cheng Wang

    Abstract: A third-order accurate implicit-explicit Runge-Kutta time marching numerical scheme is proposed and implemented for the Landau-Lifshitz-Gilbert equation, which models magnetization dynamics in ferromagnetic materials, with arbitrary damping parameters. This method has three remarkable advantages: (1) only a linear system with constant coefficients needs to be solved at each Runge-Kutta stage, whic…

    Submitted 7 July, 2024; originally announced July 2024.

    Comments: This paper has been accepted by Numerical Mathematics: Theory, Methods and Applications and is prepared for publication

  38. arXiv:2407.04981  [pdf, other]

    cs.CL cs.LG

    TRACE: TRansformer-based Attribution using Contrastive Embeddings in LLMs

    Authors: Cheng Wang, Xinyang Lu, See-Kiong Ng, Bryan Kian Hsiang Low

    Abstract: The rapid evolution of large language models (LLMs) represents a substantial leap forward in natural language understanding and generation. However, alongside these advancements come significant challenges related to the accountability and transparency of LLM responses. Reliable source attribution is essential to adhering to stringent legal and regulatory standards, including those set forth by th…

    Submitted 6 July, 2024; originally announced July 2024.

  39. arXiv:2407.04969  [pdf, other]

    cs.CL

    EVA-Score: Evaluation of Long-form Summarization on Informativeness through Extraction and Validation

    Authors: Yuchen Fan, Xin Zhong, Chengsi Wang, Gaoche Wu, Bowen Zhou

    Abstract: Summarization is a fundamental task in natural language processing (NLP), and since the arrival of large language models (LLMs) such as GPT-4 and Claude, increasing attention has been paid to long-form summarization, whose input sequences are much longer and thus contain more information. The current evaluation metrics either use similarity-based metrics like ROUGE and BERTScore, which rely on sim…

    Submitted 6 July, 2024; originally announced July 2024.

    Comments: 16 pages, 3 figures, submitted to EMNLP

  40. arXiv:2407.04922  [pdf, other]

    cond-mat.mtrl-sci

    Revolutionizing Alloy Microstructure Segmentation through SAM and Domain Knowledge without Extra Training

    Authors: Xudong Ma, Yuqi Zhang, Chenchong Wang, Wei Xu

    Abstract: Fundamental models, trained on large-scale datasets and adapted to new data using innovative learning methods, have revolutionized various fields. In materials science, microstructure image segmentation plays a pivotal role in understanding alloy properties. However, conventional supervised modelling algorithms often necessitate extensive annotations and intricate optimization procedures. The segm…

    Submitted 5 July, 2024; originally announced July 2024.

  41. arXiv:2407.04842  [pdf, other]

    cs.CV cs.CL cs.LG

    MJ-Bench: Is Your Multimodal Reward Model Really a Good Judge for Text-to-Image Generation?

    Authors: Zhaorun Chen, Yichao Du, Zichen Wen, Yiyang Zhou, Chenhang Cui, Zhenzhen Weng, Haoqin Tu, Chaoqi Wang, Zhengwei Tong, Qinglan Huang, Canyu Chen, Qinghao Ye, Zhihong Zhu, Yuqing Zhang, Jiawei Zhou, Zhuokai Zhao, Rafael Rafailov, Chelsea Finn, Huaxiu Yao

    Abstract: While text-to-image models like DALLE-3 and Stable Diffusion are rapidly proliferating, they often encounter challenges such as hallucination, bias, and the production of unsafe, low-quality output. To effectively address these issues, it is crucial to align these models with desired behaviors based on feedback from a multimodal judge. Despite their significance, current multimodal judges frequent…

    Submitted 5 July, 2024; originally announced July 2024.

    Comments: 42 pages, 13 figures, 33 tables

  42. arXiv:2407.04787  [pdf, other]

    cs.CL cs.AI cs.LG

    Re-Tuning: Overcoming the Compositionality Limits of Large Language Models with Recursive Tuning

    Authors: Eric Pasewark, Kyle Montgomery, Kefei Duan, Dawn Song, Chenguang Wang

    Abstract: We present a new method for large language models to solve compositional tasks. Although they have shown strong performance on traditional language understanding tasks, large language models struggle to solve compositional tasks, where the solution depends on solving smaller instances of the same problem. We propose a natural approach to solve compositional tasks recursively. Our method, Re-Tuning…

    Submitted 5 July, 2024; originally announced July 2024.

    Comments: Accepted to ACL 2024

  43. arXiv:2407.04486  [pdf, other]

    q-bio.QM cs.AI

    Variational and Explanatory Neural Networks for Encoding Cancer Profiles and Predicting Drug Responses

    Authors: Tianshu Feng, Rohan Gnanaolivu, Abolfazl Safikhani, Yuanhang Liu, Jun Jiang, Nicholas Chia, Alexander Partin, Priyanka Vasanthakumari, Yitan Zhu, Chen Wang

    Abstract: Human cancers present a significant public health challenge and require the discovery of novel drugs through translational research. Transcriptomics profiling data that describe molecular activities in tumors and cancer cell lines are widely utilized for predicting anti-cancer drug responses. However, existing AI models face challenges due to noise in transcriptomics data and lack of biological i…

    Submitted 5 July, 2024; originally announced July 2024.

  44. arXiv:2407.04296  [pdf]

    physics.plasm-ph

    The study of propagation characteristics of millimeter-wave vortex in magnetized plasma by using FDTD Method

    Authors: Chenxu Wang, Hideki Kawaguchi, Hiroaki Nakamura, Shin Kubo

    Abstract: It has been pointed out that a millimeter-wave vortex may contribute to efficient plasma heating, since the millimeter-wave vortex was found to propagate in magnetized plasma even where a normal plane wave is in the cut-off condition. There, it was assumed that the vortex field was the Laguerre-Gaussian (L-G) mode, which is a free-space solution, but the generation and stable propagation of the L-G m…

    Submitted 5 July, 2024; originally announced July 2024.

    Comments: 9 pages, 5 figures

  45. arXiv:2407.03900  [pdf, other]

    cs.CV

    Oracle Bone Inscriptions Multi-modal Dataset

    Authors: Bang Li, Donghao Luo, Yujie Liang, Jing Yang, Zengmao Ding, Xu Peng, Boyuan Jiang, Shengwei Han, Dan Sui, Peichao Qin, Pian Wu, Chaoyang Wang, Yun Qi, Taisong Jin, Chengjie Wang, Xiaoming Huang, Zhan Shu, Rongrong Ji, Yongge Liu, Yunsheng Wu

    Abstract: Oracle bone inscriptions (OBI) is the earliest developed writing system in China, bearing invaluable written exemplifications of early Shang history and paleography. However, the task of deciphering OBI, in the current climate of scholarship, can prove extremely challenging. Out of the 4,500 oracle bone characters excavated, only a third have been successfully identified. Therefore, leveraging…

    Submitted 4 July, 2024; originally announced July 2024.

  46. arXiv:2407.03772  [pdf, other]

    eess.IV cs.CV q-bio.QM

    CS3: Cascade SAM for Sperm Segmentation

    Authors: Yi Shi, Xu-Peng Tian, Yun-Kai Wang, Tie-Yi Zhang, Bin Yao, Hui Wang, Yong Shao, Cen-Cen Wang, Rong Zeng, De-Chuan Zhan

    Abstract: Automated sperm morphology analysis plays a crucial role in the assessment of male fertility, yet its efficacy is often compromised by the challenges in accurately segmenting sperm images. Existing segmentation techniques, including the Segment Anything Model (SAM), are notably inadequate in addressing the complex issue of sperm overlap, a frequent occurrence in clinical samples. Our exploratory stu…

    Submitted 9 July, 2024; v1 submitted 4 July, 2024; originally announced July 2024.

    Comments: Early accepted by MICCAI2024

  47. arXiv:2407.03548  [pdf, other]

    cs.CV

    HiDiff: Hybrid Diffusion Framework for Medical Image Segmentation

    Authors: Tao Chen, Chenhui Wang, Zhihao Chen, Yiming Lei, Hongming Shan

    Abstract: Medical image segmentation has been significantly advanced with the rapid development of deep learning (DL) techniques. Existing DL-based segmentation models are typically discriminative; i.e., they aim to learn a mapping from the input image to segmentation masks. However, these discriminative methods neglect the underlying data distribution and intrinsic class characteristics, suffering from uns…

    Submitted 3 July, 2024; originally announced July 2024.

    Comments: Accepted by IEEE Transactions on Medical Imaging 2024

  48. arXiv:2407.03531  [pdf, other]

    cs.RO

    OrbitGrasp: $SE(3)$-Equivariant Grasp Learning

    Authors: Boce Hu, Xupeng Zhu, Dian Wang, Zihao Dong, Haojie Huang, Chenghao Wang, Robin Walters, Robert Platt

    Abstract: While grasp detection is an important part of any robotic manipulation pipeline, reliable and accurate grasp detection in $SE(3)$ remains a research challenge. Many robotics applications in unstructured environments such as the home or warehouse would benefit greatly from better grasp performance. This paper proposes a novel framework for detecting $SE(3)$ grasp poses based on point cloud input. Our…

    Submitted 3 July, 2024; originally announced July 2024.

  49. arXiv:2407.03449  [pdf, other]

    eess.SP

    A Tutorial on Fluid Antenna System for 6G Networks: Encompassing Communication Theory, Optimization Methods and Hardware Designs

    Authors: Wee Kiat New, Kai-Kit Wong, Hao Xu, Chao Wang, Farshad Rostami Ghadi, Jichen Zhang, Junhui Rao, Ross Murch, Pablo Ramírez-Espinosa, David Morales-Jimenez, Chan-Byoung Chae, Kin-Fai Tong

    Abstract: The advent of the sixth-generation (6G) networks presents another round of revolution for the mobile communication landscape, promising an immersive experience, robust reliability, minimal latency, extreme connectivity, ubiquitous coverage, and capabilities beyond communication, including intelligence and sensing. To achieve these ambitious goals, it is apparent that 6G networks need to incorporat…

    Submitted 3 July, 2024; originally announced July 2024.

    Comments: 50 pages, 45 figures, 5 tables. Submitted for potential publication

  50. arXiv:2407.03374  [pdf]

    cs.AI cs.SE eess.SP eess.SY

    An Outline of Prognostics and Health Management Large Model: Concepts, Paradigms, and Challenges

    Authors: Laifa Tao, Shangyu Li, Haifei Liu, Qixuan Huang, Liang Ma, Guoao Ning, Yiling Chen, Yunlong Wu, Bin Li, Weiwei Zhang, Zhengduo Zhao, Wenchao Zhan, Wenyan Cao, Chao Wang, Hongmei Liu, Jian Ma, Mingliang Suo, Yujie Cheng, Yu Ding, Dengwei Song, Chen Lu

    Abstract: Prognosis and Health Management (PHM), critical for ensuring task completion by complex systems and preventing unexpected failures, is widely adopted in aerospace, manufacturing, maritime, rail, energy, etc. However, PHM's development is constrained by bottlenecks like generalization, interpretation and verification abilities. Presently, generative artificial intelligence (AI), represented by Larg…

    Submitted 1 July, 2024; originally announced July 2024.