Showing 1–50 of 322 results for author: Xiang, Y

  1. arXiv:2407.02280  [pdf, other]

    cs.CV cs.AI

    FedIA: Federated Medical Image Segmentation with Heterogeneous Annotation Completeness

    Authors: Yangyang Xiang, Nannan Wu, Li Yu, Xin Yang, Kwang-Ting Cheng, Zengqiang Yan

    Abstract: Federated learning has emerged as a compelling paradigm for medical image segmentation, particularly in light of increasing privacy concerns. However, most of the existing research relies on relatively stringent assumptions regarding the uniformity and completeness of annotations across clients. Contrary to this, this paper highlights a prevalent challenge in medical practice: incomplete annotatio…

    Submitted 3 July, 2024; v1 submitted 2 July, 2024; originally announced July 2024.

    Comments: Early accepted by MICCAI 2024

  2. arXiv:2406.17969  [pdf, other]

    cs.CL cs.AI

    Encourage or Inhibit Monosemanticity? Revisit Monosemanticity from a Feature Decorrelation Perspective

    Authors: Hanqi Yan, Yanzheng Xiang, Guangyi Chen, Yifei Wang, Lin Gui, Yulan He

    Abstract: To better interpret the intrinsic mechanism of large language models (LLMs), recent studies focus on monosemanticity on its basic units. A monosemantic neuron is dedicated to a single and specific concept, which forms a one-to-one correlation between neurons and concepts. Despite extensive research in monosemanticity probing, it remains unclear whether monosemanticity is beneficial or harmful to m…

    Submitted 25 June, 2024; originally announced June 2024.

  3. arXiv:2406.15222  [pdf]

    eess.IV cs.AI cs.CV

    Rapid and Accurate Diagnosis of Acute Aortic Syndrome using Non-contrast CT: A Large-scale, Retrospective, Multi-center and AI-based Study

    Authors: Yujian Hu, Yilang Xiang, Yan-Jie Zhou, Yangyan He, Shifeng Yang, Xiaolong Du, Chunlan Den, Youyao Xu, Gaofeng Wang, Zhengyao Ding, Jingyong Huang, Wenjun Zhao, Xuejun Wu, Donglin Li, Qianqian Zhu, Zhenjiang Li, Chenyang Qiu, Ziheng Wu, Yunjun He, Chen Tian, Yihui Qiu, Zuodong Lin, Xiaolong Zhang, Yuan He, Zhenpeng Yuan, et al. (15 additional authors not shown)

    Abstract: Chest pain symptoms are highly prevalent in emergency departments (EDs), where acute aortic syndrome (AAS) is a catastrophic cardiovascular emergency with a high fatality rate, especially when timely and accurate treatment is not administered. However, current triage practices in the ED can cause up to approximately half of patients with AAS to have an initially missed diagnosis or be misdiagnosed…

    Submitted 24 June, 2024; v1 submitted 13 June, 2024; originally announced June 2024.

    Comments: under peer review

  4. arXiv:2406.07232  [pdf, other]

    cs.CL cs.AI

    DUAL-REFLECT: Enhancing Large Language Models for Reflective Translation through Dual Learning Feedback Mechanisms

    Authors: Andong Chen, Lianzhang Lou, Kehai Chen, Xuefeng Bai, Yang Xiang, Muyun Yang, Tiejun Zhao, Min Zhang

    Abstract: Recently, large language models (LLMs) enhanced by self-reflection have achieved promising performance on machine translation. The key idea is guiding LLMs to generate translation with human-like feedback. However, existing self-reflection methods lack effective feedback information, limiting the translation performance. To address this, we introduce a DUAL-REFLECT framework, leveraging the dual l…

    Submitted 21 June, 2024; v1 submitted 11 June, 2024; originally announced June 2024.

    Comments: Accepted to ACL 2024 main conference

  5. arXiv:2406.07036  [pdf, other]

    cs.CL cs.AI

    Paying More Attention to Source Context: Mitigating Unfaithful Translations from Large Language Model

    Authors: Hongbin Zhang, Kehai Chen, Xuefeng Bai, Yang Xiang, Min Zhang

    Abstract: Large language models (LLMs) have showcased impressive multilingual machine translation ability. However, unlike encoder-decoder style models, decoder-only LLMs lack an explicit alignment between source and target contexts. Analyzing contribution scores during generation processes revealed that LLMs can be biased towards previously generated tokens over corresponding source tokens, leading to unfa…

    Submitted 11 June, 2024; originally announced June 2024.

    Comments: Accepted by ACL2024 Findings

  6. arXiv:2406.06843  [pdf, other]

    cs.CV

    HO-Cap: A Capture System and Dataset for 3D Reconstruction and Pose Tracking of Hand-Object Interaction

    Authors: Jikai Wang, Qifan Zhang, Yu-Wei Chao, Bowen Wen, Xiaohu Guo, Yu Xiang

    Abstract: We introduce a data capture system and a new dataset named HO-Cap that can be used to study 3D reconstruction and pose tracking of hands and objects in videos. The capture system uses multiple RGB-D cameras and a HoloLens headset for data collection, avoiding the use of expensive 3D scanners or mocap systems. We propose a semi-automatic method to obtain annotations of shape and pose of hands and o…

    Submitted 16 June, 2024; v1 submitted 10 June, 2024; originally announced June 2024.

  7. arXiv:2406.03880  [pdf, other]

    cs.LG cs.AI

    Memorization in deep learning: A survey

    Authors: Jiaheng Wei, Yanjun Zhang, Leo Yu Zhang, Ming Ding, Chao Chen, Kok-Leong Ong, Jun Zhang, Yang Xiang

    Abstract: Deep Learning (DL) powered by Deep Neural Networks (DNNs) has revolutionized various domains, yet understanding the intricacies of DNN decision-making and learning processes remains a significant challenge. Recent investigations have uncovered an interesting memorization phenomenon in which DNNs tend to memorize specific details from examples rather than learning general patterns, affecting model…

    Submitted 6 June, 2024; originally announced June 2024.

  8. arXiv:2406.02630  [pdf, other]

    cs.CR cs.AI

    AI Agents Under Threat: A Survey of Key Security Challenges and Future Pathways

    Authors: Zehang Deng, Yongjian Guo, Changzhou Han, Wanlun Ma, Junwu Xiong, Sheng Wen, Yang Xiang

    Abstract: An Artificial Intelligence (AI) agent is a software entity that autonomously performs tasks or makes decisions based on pre-defined objectives and data inputs. AI agents, capable of perceiving user inputs, reasoning and planning tasks, and executing actions, have seen remarkable advancements in algorithm development and task performance. However, the security challenges they pose remain under-expl…

    Submitted 3 June, 2024; originally announced June 2024.

    Comments: ACM Computing Survey

  9. arXiv:2405.17859  [pdf, other]

    cs.CV cs.RO

    Adapting Pre-Trained Vision Models for Novel Instance Detection and Segmentation

    Authors: Yangxiao Lu, Jishnu Jaykumar P, Yunhui Guo, Nicholas Ruozzi, Yu Xiang

    Abstract: Novel Instance Detection and Segmentation (NIDS) aims at detecting and segmenting novel object instances given a few examples of each instance. We propose a unified framework (NIDS-Net) comprising object proposal generation, embedding creation for both instance templates and proposal regions, and embedding matching for instance label assignment. Leveraging recent advancements in large vision metho…

    Submitted 28 May, 2024; originally announced May 2024.

    Comments: 22 pages, 9 figures, Code is available at: https://github.com/YoungSean/NIDS-Net

  10. arXiv:2405.16594  [pdf, ps, other]

    stat.ML cs.LG

    Training-Conditional Coverage Bounds under Covariate Shift

    Authors: Mehrdad Pournaderi, Yu Xiang

    Abstract: Training-conditional coverage guarantees in conformal prediction concern the concentration of the error distribution, conditional on the training data, below some nominal level. The conformal prediction methodology has recently been generalized to the covariate shift setting, namely, the covariate distribution changes between the training and test data. In this paper, we study the training-conditi…

    Submitted 26 May, 2024; originally announced May 2024.

    Comments: arXiv admin note: text overlap with arXiv:2404.13731

  11. arXiv:2405.15258  [pdf, other]

    cs.CR

    Leakage-Resilient and Carbon-Neutral Aggregation Featuring the Federated AI-enabled Critical Infrastructure

    Authors: Zehang Deng, Ruoxi Sun, Minhui Xue, Sheng Wen, Seyit Camtepe, Surya Nepal, Yang Xiang

    Abstract: AI-enabled critical infrastructures (ACIs) integrate artificial intelligence (AI) technologies into various essential systems and services that are vital to the functioning of society, offering significant implications for efficiency, security and resilience. While adopting decentralized AI approaches (such as federated learning technology) in ACIs is plausible, private and sensitive data are stil…

    Submitted 24 May, 2024; originally announced May 2024.

  12. arXiv:2405.14099  [pdf, other]

    cs.LG math.NA

    Automatic Differentiation is Essential in Training Neural Networks for Solving Differential Equations

    Authors: Chuqi Chen, Yahong Yang, Yang Xiang, Wenrui Hao

    Abstract: Neural network-based approaches have recently shown significant promise in solving partial differential equations (PDEs) in science and engineering, especially in scenarios featuring complex domains or the incorporation of empirical data. One advantage of the neural network method for PDEs lies in its automatic differentiation (AD), which necessitates only the sample points themselves, unlike trad…

    Submitted 22 May, 2024; originally announced May 2024.

  13. arXiv:2405.12114  [pdf, other]

    cs.CV math.NA

    A New Cross-Space Total Variation Regularization Model for Color Image Restoration with Quaternion Blur Operator

    Authors: Zhigang Jia, Yuelian Xiang, Meixiang Zhao, Tingting Wu, Michael K. Ng

    Abstract: The cross-channel deblurring problem in color image processing is difficult to solve due to the complex coupling and structural blurring of color pixels. Until now, there are few efficient algorithms that can reduce color infection in deblurring process. To solve this challenging problem, we present a novel cross-space total variation (CSTV) regularization model for color image deblurring by intro…

    Submitted 20 May, 2024; originally announced May 2024.

    Comments: 15 pages, 10 figures

  14. arXiv:2405.10616  [pdf, other]

    cs.CL cs.LG

    Feature-based Low-Rank Compression of Large Language Models via Bayesian Optimization

    Authors: Yixin Ji, Yang Xiang, Juntao Li, Wei Chen, Zhongyi Liu, Kehai Chen, Min Zhang

    Abstract: In recent years, large language models (LLMs) have driven advances in natural language processing. Still, their growing scale has increased the computational burden, necessitating a balance between efficiency and performance. Low-rank compression, a promising technique, reduces non-essential parameters by decomposing weight matrices into products of two low-rank matrices. Yet, its application in L…

    Submitted 17 May, 2024; originally announced May 2024.

    Comments: Accepted by 2024 ACL findings

  15. arXiv:2405.09298  [pdf]

    eess.IV cs.CV

    Deep Blur Multi-Model (DeepBlurMM) -- a strategy to mitigate the impact of image blur on deep learning model performance in histopathology image analysis

    Authors: Yujie Xiang, Bojing Liu, Mattias Rantalainen

    Abstract: AI-based analysis of histopathology whole slide images (WSIs) is central in computational pathology. However, image quality, including unsharp areas of WSIs, impacts model performance. We investigate the impact of blur and propose a multi-model approach to mitigate negative impact of unsharp image areas. In this study, we use a simulation approach, evaluating model performance under varying levels…

    Submitted 23 May, 2024; v1 submitted 15 May, 2024; originally announced May 2024.

    ACM Class: I.4; J.3

  16. arXiv:2405.06902   

    cs.LG stat.ML

    Causal Inference from Slowly Varying Nonstationary Processes

    Authors: Kang Du, Yu Xiang

    Abstract: Causal inference from observational data following the restricted structural causal models (SCM) framework hinges largely on the asymmetry between cause and effect from the data generating mechanisms, such as non-Gaussianity or non-linearity. This methodology can be adapted to stationary time series, yet inferring causal relationships from nonstationary time series remains a challenging task. In t…

    Submitted 29 May, 2024; v1 submitted 11 May, 2024; originally announced May 2024.

    Comments: This work was intended as a replacement of arXiv:2012.13025 and any subsequent updates will appear there

  17. arXiv:2405.05498  [pdf, other]

    cs.SD eess.AS

    The RoyalFlush Automatic Speech Diarization and Recognition System for In-Car Multi-Channel Automatic Speech Recognition Challenge

    Authors: Jingguang Tian, Shuaishuai Ye, Shunfei Chen, Yang Xiang, Zhaohui Yin, Xinhui Hu, Xinkang Xu

    Abstract: This paper presents our system submission for the In-Car Multi-Channel Automatic Speech Recognition (ICMC-ASR) Challenge, which focuses on speaker diarization and speech recognition in complex multi-speaker scenarios. To address these challenges, we develop end-to-end speaker diarization models that notably decrease the diarization error rate (DER) by 49.58% compared to the official baseline on t…

    Submitted 8 May, 2024; originally announced May 2024.

  18. arXiv:2405.04858  [pdf, other]

    cs.CV

    Pedestrian Attribute Recognition as Label-balanced Multi-label Learning

    Authors: Yibo Zhou, Hai-Miao Hu, Yirong Xiang, Xiaokang Zhang, Haotian Wu

    Abstract: Rooting in the scarcity of most attributes, realistic pedestrian attribute datasets exhibit unduly skewed data distribution, from which two types of model failures are delivered: (1) label imbalance: model predictions lean greatly towards the side of majority labels; (2) semantics imbalance: model is easily overfitted on the under-represented attributes due to their insufficient semantic diversity…

    Submitted 8 May, 2024; originally announced May 2024.

    Comments: Accepted as ICML2024 main conference paper

  19. arXiv:2405.00273  [pdf, other]

    cs.CL cs.HC

    Social Life Simulation for Non-Cognitive Skills Learning

    Authors: Zihan Yan, Yaohong Xiang, Yun Huang

    Abstract: Non-cognitive skills are crucial for personal and social life well-being, and such skill development can be supported by narrative-based (e.g., storytelling) technologies. While generative AI enables interactive and role-playing storytelling, little is known about how users engage with and perceive the use of AI in social life simulation for non-cognitive skills learning. To this end, we introduce…

    Submitted 30 April, 2024; originally announced May 2024.

  20. arXiv:2405.00026  [pdf]

    cs.CE cs.AI

    Enhancing Credit Card Fraud Detection A Neural Network and SMOTE Integrated Approach

    Authors: Mengran Zhu, Ye Zhang, Yulu Gong, Changxin Xu, Yafei Xiang

    Abstract: Credit card fraud detection is a critical challenge in the financial sector, demanding sophisticated approaches to accurately identify fraudulent transactions. This research proposes an innovative methodology combining Neural Networks (NN) and Synthetic Minority Over-sampling Technique (SMOTE) to enhance the detection performance. The study addresses the inherent imbalance in credit card transact…

    Submitted 26 February, 2024; originally announced May 2024.

  21. PromptCL: Improving Event Representation via Prompt Template and Contrastive Learning

    Authors: Yubo Feng, Lishuang Li, Yi Xiang, Xueyang Qin

    Abstract: The representation of events in text plays a significant role in various NLP tasks. Recent research demonstrates that contrastive learning has the ability to improve event comprehension capabilities of Pre-trained Language Models (PLMs) and enhance the performance of event representation learning. However, the efficacy of event representation learning based on contrastive learning and PLMs is limi…

    Submitted 27 April, 2024; originally announced April 2024.

    Comments: NLPCC 2023 Best Student Paper

    Journal ref: Natural Language Processing and Chinese Computing (NLPCC 2023)

  22. arXiv:2404.15245  [pdf, other]

    stat.ME cs.LG

    Mining Invariance from Nonlinear Multi-Environment Data: Binary Classification

    Authors: Austin Goddard, Kang Du, Yu Xiang

    Abstract: Making predictions in an unseen environment given data from multiple training environments is a challenging task. We approach this problem from an invariance perspective, focusing on binary classification to shed light on general nonlinear data generation mechanisms. We identify a unique form of invariance that exists solely in a binary setting that allows us to train models invariant over environ…

    Submitted 23 April, 2024; originally announced April 2024.

    Comments: Accepted to the 2024 International Symposium on Information Theory (ISIT)

  23. arXiv:2404.13731  [pdf, ps, other]

    stat.ML cs.LG

    Training-Conditional Coverage Bounds for Uniformly Stable Learning Algorithms

    Authors: Mehrdad Pournaderi, Yu Xiang

    Abstract: The training-conditional coverage performance of the conformal prediction is known to be empirically sound. Recently, there have been efforts to support this observation with theoretical guarantees. The training-conditional coverage bounds for jackknife+ and full-conformal prediction regions have been established via the notion of $(m,n)$-stability by Liang and Barber [2023]. Although this notion…

    Submitted 21 April, 2024; originally announced April 2024.

    Comments: Accepted to the ISIT 2024 workshop on Information-Theoretic Methods for Trustworthy Machine Learning (IT-TML)

  24. arXiv:2404.12715  [pdf, other]

    cs.CL

    Ensemble Learning for Heterogeneous Large Language Models with Deep Parallel Collaboration

    Authors: Yichong Huang, Xiaocheng Feng, Baohang Li, Yang Xiang, Hui Wang, Bing Qin, Ting Liu

    Abstract: Large language models (LLMs) exhibit complementary strengths in various tasks, motivating the research of LLM ensembling. However, existing work focuses on training an extra reward model or fusion model to select or combine all candidate answers, posing a great challenge to the generalization on unseen data distributions. Besides, prior methods use textual responses as communication media, ignorin…

    Submitted 30 May, 2024; v1 submitted 19 April, 2024; originally announced April 2024.

    Comments: 16 pages, 9 figures, 9 tables

  25. arXiv:2404.11667  [pdf, other]

    cs.LG cs.AI cs.CV stat.ML

    Deep Dependency Networks and Advanced Inference Schemes for Multi-Label Classification

    Authors: Shivvrat Arya, Yu Xiang, Vibhav Gogate

    Abstract: We present a unified framework called deep dependency networks (DDNs) that combines dependency networks and deep learning architectures for multi-label classification, with a particular emphasis on image and video data. The primary advantage of dependency networks is their ease of training, in contrast to other probabilistic graphical models like Markov networks. In particular, when combined with…

    Submitted 17 April, 2024; originally announced April 2024.

    Comments: Will appear in AISTATS 2024. arXiv admin note: substantial text overlap with arXiv:2302.00633

  26. arXiv:2404.08690  [pdf, other]

    cs.CL cs.AI cs.CR cs.LG

    Towards Building a Robust Toxicity Predictor

    Authors: Dmitriy Bespalov, Sourav Bhabesh, Yi Xiang, Liutong Zhou, Yanjun Qi

    Abstract: Recent NLP literature pays little attention to the robustness of toxicity language predictors, while these systems are most likely to be used in adversarial contexts. This paper presents a novel adversarial attack, ToxicTrap, introducing small word-level perturbations to fool SOTA text classifiers to predict toxic text samples as benign. ToxicTrap exploits greedy based search strategies t…

    Submitted 9 April, 2024; originally announced April 2024.

    Comments: ACL 2023

  27. arXiv:2404.06452  [pdf, other]

    cs.RO eess.SY

    PAAM: A Framework for Coordinated and Priority-Driven Accelerator Management in ROS 2

    Authors: Daniel Enright, Yecheng Xiang, Hyunjong Choi, Hyoseung Kim

    Abstract: This paper proposes a Priority-driven Accelerator Access Management (PAAM) framework for multi-process robotic applications built on top of the Robot Operating System (ROS) 2 middleware platform. The framework addresses the issue of predictable execution of time- and safety-critical callback chains that require hardware accelerators such as GPUs and TPUs. PAAM provides a standalone ROS executor th…

    Submitted 9 April, 2024; originally announced April 2024.

    Comments: 14 Pages, 14 Figures

  28. arXiv:2403.16523  [pdf, other]

    stat.ML cs.AI cs.LG

    Causal Discovery from Poisson Branching Structural Causal Model Using High-Order Cumulant with Path Analysis

    Authors: Jie Qiao, Yu Xiang, Zhengming Chen, Ruichu Cai, Zhifeng Hao

    Abstract: Count data naturally arise in many fields, such as finance, neuroscience, and epidemiology, and discovering causal structure among count data is a crucial task in various scientific and industrial scenarios. One of the most common characteristics of count data is the inherent branching structure described by a binomial thinning operator and an independent Poisson distribution that captures both br…

    Submitted 25 March, 2024; originally announced March 2024.

    Comments: Accepted by AAAI-2024

  29. arXiv:2403.13258  [pdf, other]

    cs.CV

    SAMCT: Segment Any CT Allowing Labor-Free Task-Indicator Prompts

    Authors: Xian Lin, Yangyang Xiang, Zhehao Wang, Kwang-Ting Cheng, Zengqiang Yan, Li Yu

    Abstract: Segment anything model (SAM), a foundation model with superior versatility and generalization across diverse segmentation tasks, has attracted widespread attention in medical imaging. However, it has been proved that SAM would encounter severe performance degradation due to the lack of medical knowledge in training and local feature encoding. Though several SAM-based models have been proposed for…

    Submitted 19 March, 2024; originally announced March 2024.

  30. arXiv:2403.12504  [pdf, other]

    cs.RO

    TON-VIO: Online Time Offset Modeling Networks for Robust Temporal Alignment in High Dynamic Motion VIO

    Authors: Chaoran Xiong, Guoqing Liu, Qi Wu, Songpengcheng Xia, Tong Hua, Kehui Ma, Zhen Sun, Yan Xiang, Ling Pei

    Abstract: Temporal misalignment (time offset) between sensors is common in low cost visual-inertial odometry (VIO) systems. Such temporal misalignment introduces inconsistent constraints for state estimation, leading to a significant positioning drift especially in high dynamic motion scenarios. In this article, we focus on online temporal calibration to reduce the positioning drift caused by the time offse…

    Submitted 19 March, 2024; originally announced March 2024.

  31. arXiv:2403.11544  [pdf, ps, other]

    cs.LG

    RL in Markov Games with Independent Function Approximation: Improved Sample Complexity Bound under the Local Access Model

    Authors: Junyi Fan, Yuxuan Han, Jialin Zeng, Jian-Feng Cai, Yang Wang, Yang Xiang, Jiheng Zhang

    Abstract: Efficiently learning equilibria with large state and action spaces in general-sum Markov games while overcoming the curse of multi-agency is a challenging problem. Recent works have attempted to solve this problem by employing independent linear function classes to approximate the marginal $Q$-value for each agent. However, existing sample complexity bounds under such a framework have a suboptimal…

    Submitted 19 March, 2024; v1 submitted 18 March, 2024; originally announced March 2024.

    Comments: Accepted at the 27th International Conference on Artificial Intelligence and Statistics (AISTATS 2024)

  32. arXiv:2403.09841  [pdf, other]

    cs.RO

    MultiGripperGrasp: A Dataset for Robotic Grasping from Parallel Jaw Grippers to Dexterous Hands

    Authors: Luis Felipe Casas Murrilo, Ninad Khargonkar, Balakrishnan Prabhakaran, Yu Xiang

    Abstract: We introduce a large-scale dataset named MultiGripperGrasp for robotic grasping. Our dataset contains 30.4M grasps from 11 grippers for 345 objects. These grippers range from two-finger grippers to five-finger grippers, including a human hand. All grasps in the dataset are verified in Isaac Sim to classify them as successful and unsuccessful grasps. Additionally, the object fall-off time for each…

    Submitted 14 March, 2024; originally announced March 2024.

  33. arXiv:2403.08822  [pdf]

    cs.LG cs.CL

    LoRA-SP: Streamlined Partial Parameter Adaptation for Resource-Efficient Fine-Tuning of Large Language Models

    Authors: Yichao Wu, Yafei Xiang, Shuning Huo, Yulu Gong, Penghao Liang

    Abstract: In addressing the computational and memory demands of fine-tuning Large Language Models (LLMs), we propose LoRA-SP (Streamlined Partial Parameter Adaptation), a novel approach utilizing randomized half-selective parameter freezing within the Low-Rank Adaptation (LoRA) framework. This method efficiently balances pre-trained knowledge retention and adaptability for task-specific optimizations. Through a…

    Submitted 28 February, 2024; originally announced March 2024.

  34. arXiv:2403.06174  [pdf, other]

    cs.LG cs.AI

    Domain Adversarial Active Learning for Domain Generalization Classification

    Authors: Jianting Chen, Ling Ding, Yunxiao Yang, Zaiyuan Di, Yang Xiang

    Abstract: Domain generalization models aim to learn cross-domain knowledge from source domain data, to improve performance on unknown target domains. Recent research has demonstrated that diverse and rich source domain samples can enhance domain generalization capability. This paper argues that the impact of each sample on the model's generalization ability varies. Despite its small scale, a high-quality da…

    Submitted 10 March, 2024; originally announced March 2024.

  35. arXiv:2403.05466  [pdf, other]

    cs.RO cs.CV

    Grasping Trajectory Optimization with Point Clouds

    Authors: Yu Xiang, Sai Haneesh Allu, Rohith Peddi, Tyler Summers, Vibhav Gogate

    Abstract: We introduce a new trajectory optimization method for robotic grasping based on a point-cloud representation of robots and task spaces. In our method, robots are represented by 3D points on their link surfaces. The task space of a robot is represented by a point cloud that can be obtained from depth sensors. Using the point-cloud representation, goal reaching in grasping can be formulated as point…

    Submitted 8 March, 2024; originally announced March 2024.

  36. arXiv:2403.01731  [pdf, other]

    cs.CV cs.RO

    RISeg: Robot Interactive Object Segmentation via Body Frame-Invariant Features

    Authors: Howard H. Qian, Yangxiao Lu, Kejia Ren, Gaotian Wang, Ninad Khargonkar, Yu Xiang, Kaiyu Hang

    Abstract: In order to successfully perform manipulation tasks in new environments, such as grasping, robots must be proficient in segmenting unseen objects from the background and/or other objects. Previous works perform unseen object instance segmentation (UOIS) by training deep neural networks on large-scale data to learn RGB/RGB-D feature embeddings, where cluttered environments often result in inaccurat…

    Submitted 4 March, 2024; originally announced March 2024.

    Comments: 7 pages, 5 figures, ICRA 2024

  37. arXiv:2402.18304  [pdf, other]

    cs.DC cs.DB cs.GT

    Play like a Vertex: A Stackelberg Game Approach for Streaming Graph Partitioning

    Authors: Zezhong Ding, Yongan Xiang, Shangyou Wang, Xike Xie, S. Kevin Zhou

    Abstract: In the realm of distributed systems tasked with managing and processing large-scale graph-structured data, optimizing graph partitioning stands as a pivotal challenge. The primary goal is to minimize communication overhead and runtime cost. However, alongside the computational complexity associated with optimal graph partitioning, a critical factor to consider is memory overhead. Real-world graphs…

    Submitted 28 February, 2024; originally announced February 2024.

    Comments: This paper has been accepted by SIGMOD 2024

    MSC Class: 97P30 ACM Class: H.2.4

  38. arXiv:2402.16836  [pdf, other]

    cs.RO cs.AI cs.CL cs.CV

    PhyGrasp: Generalizing Robotic Grasping with Physics-informed Large Multimodal Models

    Authors: Dingkun Guo, Yuqi Xiang, Shuqi Zhao, Xinghao Zhu, Masayoshi Tomizuka, Mingyu Ding, Wei Zhan

    Abstract: Robotic grasping is a fundamental aspect of robot functionality, defining how robots interact with objects. Despite substantial progress, its generalizability to counter-intuitive or long-tailed scenarios, such as objects with uncommon materials or shapes, remains a challenge. In contrast, humans can easily apply their intuitive physics to grasp skillfully and change grasps efficiently, even for o…

    Submitted 26 February, 2024; originally announced February 2024.

  39. arXiv:2402.16038  [pdf]

    cs.CL cs.AI cs.LG

    Deep Learning Approaches for Improving Question Answering Systems in Hepatocellular Carcinoma Research

    Authors: Shuning Huo, Yafei Xiang, Hanyi Yu, Mengran Zhu, Yulu Gong

    Abstract: In recent years, advancements in natural language processing (NLP) have been fueled by deep learning techniques, particularly through the utilization of powerful computing resources like GPUs and TPUs. Models such as BERT and GPT-3, trained on vast amounts of data, have revolutionized language understanding and generation. These pre-trained models serve as robust bases for various tasks including…

    Submitted 25 February, 2024; originally announced February 2024.

  40. arXiv:2402.16036  [pdf]

    cs.RO cs.CV cs.LG

    Machine Learning-Based Vehicle Intention Trajectory Recognition and Prediction for Autonomous Driving

    Authors: Hanyi Yu, Shuning Huo, Mengran Zhu, Yulu Gong, Yafei Xiang

    Abstract: In recent years, the expansion of internet technology and advancements in automation have brought significant attention to autonomous driving technology. Major automobile manufacturers, including Volvo, Mercedes-Benz, and Tesla, have progressively introduced products ranging from assisted-driving vehicles to semi-autonomous vehicles. However, this period has also witnessed several traffic safety i…

    Submitted 25 February, 2024; originally announced February 2024.

  41. arXiv:2402.16035  [pdf]

    cs.CL cs.AI

    Text Understanding and Generation Using Transformer Models for Intelligent E-commerce Recommendations

    Authors: Yafei Xiang, Hanyi Yu, Yulu Gong, Shuning Huo, Mengran Zhu

    Abstract: With the rapid development of artificial intelligence technology, Transformer structural pre-training model has become an important tool for large language model (LLM) tasks. In the field of e-commerce, these models are especially widely used, from text understanding to generating recommendation systems, which provide powerful technical support for improving user experience and optimizing service…

    Submitted 25 February, 2024; originally announced February 2024.

  42. arXiv:2402.15637  [pdf, other]

    cs.CL

    Addressing Order Sensitivity of In-Context Demonstration Examples in Causal Language Models

    Authors: Yanzheng Xiang, Hanqi Yan, Lin Gui, Yulan He

    Abstract: In-context learning has become a popular paradigm in natural language processing. However, its performance can be significantly influenced by the order of in-context demonstration examples. In this paper, we found that causal language models (CausalLMs) are more sensitive to this order compared to prefix language models (PrefixLMs). We attribute this phenomenon to the auto-regressive attention mas…

    Submitted 6 June, 2024; v1 submitted 23 February, 2024; originally announced February 2024.

  43. arXiv:2402.09830  [pdf]

    cs.LG cs.AI cs.CE

    Utilizing GANs for Fraud Detection: Model Training with Synthetic Transaction Data

    Authors: Mengran Zhu, Yulu Gong, Yafei Xiang, Hanyi Yu, Shuning Huo

    Abstract: Anomaly detection is a critical challenge across various research domains, aiming to identify instances that deviate from normal data distributions. This paper explores the application of Generative Adversarial Networks (GANs) in fraud detection, comparing their advantages with traditional methods. GANs, a type of Artificial Neural Network (ANN), have shown promise in modeling complex data distrib…

    Submitted 15 February, 2024; originally announced February 2024.

  44. arXiv:2402.09820  [pdf]

    cs.CR cs.AI cs.LG q-fin.GN

    Utilizing Deep Learning for Enhancing Network Resilience in Finance

    Authors: Yulu Gong, Mengran Zhu, Shuning Huo, Yafei Xiang, Hanyi Yu

    Abstract: In the age of the Internet, people's lives are increasingly dependent on today's network technology. Maintaining network integrity and protecting the legitimate interests of users is at the heart of network construction. Threat detection is an important part of a complete and effective defense system. How to effectively detect unknown threats is one of the concerns of network protection. Currently…

    Submitted 18 February, 2024; v1 submitted 15 February, 2024; originally announced February 2024.

  45. arXiv:2402.06884  [pdf, other]

    stat.ML cs.LG

    Low-Rank Approximation of Structural Redundancy for Self-Supervised Learning

    Authors: Kang Du, Yu Xiang

    Abstract: We study the data-generating mechanism for reconstructive SSL to shed light on its effectiveness. With an infinite amount of labeled samples, we provide a sufficient and necessary condition for perfect linear approximation. The condition reveals a full-rank component that preserves the label classes of Y, along with a redundant component. Motivated by the condition, we propose to approximate the r…

    Submitted 27 May, 2024; v1 submitted 9 February, 2024; originally announced February 2024.

    Comments: Accepted to the 3rd Conference on Causal Learning and Reasoning (CLeaR)

  46. arXiv:2401.12983  [pdf]

    cs.CL cs.AI physics.ed-ph

    Assessing Large Language Models in Mechanical Engineering Education: A Study on Mechanics-Focused Conceptual Understanding

    Authors: Jie Tian, Jixin Hou, Zihao Wu, Peng Shu, Zhengliang Liu, Yujie Xiang, Beikang Gu, Nicholas Filla, Yiwei Li, Ning Liu, Xianyan Chen, Keke Tang, Tianming Liu, Xianqiao Wang

    Abstract: This study is a pioneering endeavor to investigate the capabilities of Large Language Models (LLMs) in addressing conceptual questions within the domain of mechanical engineering with a focus on mechanics. Our examination involves a manually crafted exam encompassing 126 multiple-choice questions, spanning various aspects of mechanics courses, including Fluid Mechanics, Mechanical Vibration, Engin…

    Submitted 13 January, 2024; originally announced January 2024.

    Comments: 30 pages, 7 figures, and 1 table

  47. arXiv:2401.09452  [pdf, other]

    cs.LG cs.AI

    Incorporating Riemannian Geometric Features for Learning Coefficient of Pressure Distributions on Airplane Wings

    Authors: Liwei Hu, Wenyong Wang, Yu Xiang, Stefan Sommer

    Abstract: The aerodynamic coefficients of aircrafts are significantly impacted by its geometry, especially when the angle of attack (AoA) is large. In the field of aerodynamics, traditional polynomial-based parameterization uses as few parameters as possible to describe the geometry of an airfoil. However, because the 3D geometry of a wing is more complicated than the 2D airfoil, polynomial-based parameteri…

    Submitted 22 December, 2023; originally announced January 2024.

  48. arXiv:2401.07718  [pdf]

    cs.CY cs.HC cs.SI

    How Social Media Big Data Can Improve Suicide Prevention

    Authors: Anastasia Peshkovskaya, Yu-Tao Xiang

    Abstract: In the light of increasing clues on social media impact on self-harm and suicide risks, there is still no evidence on who are and how factually engaged in suicide-related online behaviors. This study reports new findings of high-performance supercomputing investigation of publicly accessible big data sourced from one of the world-largest social networking site. Three-month supercomputer searching…

    Submitted 15 January, 2024; originally announced January 2024.

    Comments: 7 pages, 1 figure, 1 table

  49. arXiv:2401.05437  [pdf, other]

    eess.SP cs.AI cs.LG

    Representation Learning for Wearable-Based Applications in the Case of Missing Data

    Authors: Janosch Jungo, Yutong Xiang, Shkurta Gashi, Christian Holz

    Abstract: Wearable devices continuously collect sensor data and use it to infer an individual's behavior, such as sleep, physical activity, and emotions. Despite the significant interest and advancements in this field, modeling multimodal sensor data in real-world environments is still challenging due to low data quality and limited data annotations. In this work, we investigate representation learning for…

    Submitted 12 January, 2024; v1 submitted 8 January, 2024; originally announced January 2024.

    Comments: Paper accepted in Human-Centric Representation Learning workshop at AAAI 2024 (https://hcrl-workshop.github.io/2024/)

  50. arXiv:2401.05055  [pdf, other]

    cs.CV

    Application of Deep Learning in Blind Motion Deblurring: Current Status and Future Prospects

    Authors: Yawen Xiang, Heng Zhou, Chengyang Li, Fangwei Sun, Zhongbo Li, Yongqiang Xie

    Abstract: Motion deblurring is one of the fundamental problems of computer vision and has received continuous attention. The variability in blur, both within and across images, imposes limitations on non-blind deblurring techniques that rely on estimating the blur kernel. As a response, blind motion deblurring has emerged, aiming to restore clear and detailed images without prior knowledge of the blur type,…

    Submitted 10 January, 2024; originally announced January 2024.

    Comments: 29 pages, 13 figures, more than 150 papers have been included