Showing 1–50 of 87 results for author: Ding, R

  1. arXiv:2407.03162  [pdf, other]

    cs.RO cs.CV cs.LG

    Bunny-VisionPro: Real-Time Bimanual Dexterous Teleoperation for Imitation Learning

    Authors: Runyu Ding, Yuzhe Qin, Jiyue Zhu, Chengzhe Jia, Shiqi Yang, Ruihan Yang, Xiaojuan Qi, Xiaolong Wang

    Abstract: Teleoperation is a crucial tool for collecting human demonstrations, but controlling robots with bimanual dexterous hands remains a challenge. Existing teleoperation systems struggle to handle the complexity of coordinating two hands for intricate manipulations. We introduce Bunny-VisionPro, a real-time bimanual dexterous teleoperation system that leverages a VR headset. Unlike previous vision-bas…

    Submitted 3 July, 2024; originally announced July 2024.

    Comments: project page: https://dingry.github.io/projects/bunny_visionpro.html

  2. arXiv:2406.10537  [pdf, other]

    cs.LG cs.AI stat.ML

    Scalable Differentiable Causal Discovery in the Presence of Latent Confounders with Skeleton Posterior (Extended Version)

    Authors: Pingchuan Ma, Rui Ding, Qiang Fu, Jiaru Zhang, Shuai Wang, Shi Han, Dongmei Zhang

    Abstract: Differentiable causal discovery has made significant advancements in the learning of directed acyclic graphs. However, its application to real-world datasets remains restricted due to the ubiquity of latent confounders and the requirement to learn maximal ancestral graphs (MAGs). To date, existing differentiable MAG learning algorithms have been limited to small datasets and failed to scale to lar…

    Submitted 15 June, 2024; originally announced June 2024.

  3. arXiv:2406.10216  [pdf, other]

    cs.CL cs.AI

    Regularizing Hidden States Enables Learning Generalizable Reward Model for LLMs

    Authors: Rui Yang, Ruomeng Ding, Yong Lin, Huan Zhang, Tong Zhang

    Abstract: Reward models trained on human preference data have been proven to be effective for aligning Large Language Models (LLMs) with human intent within the reinforcement learning from human feedback (RLHF) framework. However, the generalization capabilities of current reward models to unseen prompts and responses are limited. This limitation can lead to an unexpected phenomenon known as reward over-opt…

    Submitted 14 June, 2024; originally announced June 2024.

    Comments: 21 pages

  4. arXiv:2406.06558  [pdf, other]

    cs.CL cs.AI

    Enhancing Text Authenticity: A Novel Hybrid Approach for AI-Generated Text Detection

    Authors: Ye Zhang, Qian Leng, Mengran Zhu, Rui Ding, Yue Wu, Jintong Song, Yulu Gong

    Abstract: The rapid advancement of Large Language Models (LLMs) has ushered in an era where AI-generated text is increasingly indistinguishable from human-generated content. Detecting AI-generated text has become imperative to combat misinformation, ensure content authenticity, and safeguard against malicious uses of AI. In this paper, we propose a novel hybrid approach that combines traditional TF-IDF tech…

    Submitted 1 June, 2024; originally announced June 2024.

  5. arXiv:2406.04377  [pdf, other]

    eess.IV cs.LG

    Combining Graph Neural Network and Mamba to Capture Local and Global Tissue Spatial Relationships in Whole Slide Images

    Authors: Ruiwen Ding, Kha-Dinh Luong, Erika Rodriguez, Ana Cristina Araujo Lemos da Silva, William Hsu

    Abstract: In computational pathology, extracting spatial features from gigapixel whole slide images (WSIs) is a fundamental task, but due to their large size, WSIs are typically segmented into smaller tiles. A critical aspect of this analysis is aggregating information from these tiles to make predictions at the WSI level. We introduce a model that combines a message-passing graph neural network (GNN) with…

    Submitted 5 June, 2024; originally announced June 2024.

    Comments: This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible

  6. arXiv:2404.16304  [pdf, other]

    cs.CV

    BezierFormer: A Unified Architecture for 2D and 3D Lane Detection

    Authors: Zhiwei Dong, Xi Zhu, Xiya Cao, Ran Ding, Wei Li, Caifa Zhou, Yongliang Wang, Qiangbo Liu

    Abstract: Lane detection has made significant progress in recent years, but there is not a unified architecture for its two sub-tasks: 2D lane detection and 3D lane detection. To fill this gap, we introduce BézierFormer, a unified 2D and 3D lane detection architecture based on Bézier curve lane representation. BézierFormer formulates queries as Bézier control points and incorporates a novel Bézier curve atten…

    Submitted 24 April, 2024; originally announced April 2024.

    Comments: ICME 2024, 11 pages, 8 figures

  7. arXiv:2403.14760  [pdf, other]

    cs.CV

    Can 3D Vision-Language Models Truly Understand Natural Language?

    Authors: Weipeng Deng, Jihan Yang, Runyu Ding, Jiahui Liu, Yijiang Li, Xiaojuan Qi, Edith Ngai

    Abstract: Rapid advancements in 3D vision-language (3D-VL) tasks have opened up new avenues for human interaction with embodied agents or robots using natural language. Despite this progress, we find a notable limitation: existing 3D-VL models exhibit sensitivity to the styles of language input, struggling to understand sentences with the same semantic meaning but written in different variants. This observa…

    Submitted 3 July, 2024; v1 submitted 21 March, 2024; originally announced March 2024.

    Comments: https://github.com/VincentDENGP/3D-LR

  8. arXiv:2402.19248  [pdf, other]

    cs.CL

    Let LLMs Take on the Latest Challenges! A Chinese Dynamic Question Answering Benchmark

    Authors: Zhikun Xu, Yinghui Li, Ruixue Ding, Xinyu Wang, Boli Chen, Yong Jiang, Hai-Tao Zheng, Wenlian Lu, Pengjun Xie, Fei Huang

    Abstract: How to better evaluate the capabilities of Large Language Models (LLMs) is the focal point and hot topic in current LLMs research. Previous work has noted that due to the extremely high cost of iterative updates of LLMs, they are often unable to answer the latest dynamic questions well. To promote the improvement of Chinese LLMs' ability to answer dynamic questions, in this paper, we introduce CDQ…

    Submitted 1 March, 2024; v1 submitted 29 February, 2024; originally announced February 2024.

    Comments: Work in progress!

  9. arXiv:2402.03310  [pdf, other]

    cs.AI cs.CV

    V-IRL: Grounding Virtual Intelligence in Real Life

    Authors: Jihan Yang, Runyu Ding, Ellis Brown, Xiaojuan Qi, Saining Xie

    Abstract: There is a sensory gulf between the Earth that humans inhabit and the digital realms in which modern AI agents are created. To develop AI agents that can sense, think, and act as flexibly as humans in real-world settings, it is imperative to bridge the realism gap between the digital and physical worlds. How can we embody agents in an environment as rich and diverse as the one we inhabit, without…

    Submitted 5 February, 2024; originally announced February 2024.

    Comments: Project page: https://virl-platform.github.io

  10. arXiv:2401.05395  [pdf, other]

    econ.GN cs.AI cs.CY cs.LG

    SRNI-CAR: A comprehensive dataset for analyzing the Chinese automotive market

    Authors: Ruixin Ding, Bowei Chen, James M. Wilson, Zhi Yan, Yufei Huang

    Abstract: The automotive industry plays a critical role in the global economy, and particularly important is the expanding Chinese automobile market due to its immense scale and influence. However, existing automotive sector datasets are limited in their coverage, failing to adequately consider the growing demand for more and diverse variables. This paper aims to bridge this data gap by introducing a compre…

    Submitted 19 December, 2023; originally announced January 2024.

    Journal ref: Proceedings of 2023 IEEE International Conference on Big Data (BigData), page 3405-3412

  11. arXiv:2401.02138  [pdf, other]

    cs.CV

    Explore Human Parsing Modality for Action Recognition

    Authors: Jinfu Liu, Runwei Ding, Yuhang Wen, Nan Dai, Fanyang Meng, Shen Zhao, Mengyuan Liu

    Abstract: Multimodal-based action recognition methods have achieved high success using pose and RGB modality. However, skeleton sequences lack appearance depiction and RGB images suffer from irrelevant noise due to modality limitations. To address this, we introduce human parsing feature map as a novel modality, since it can selectively retain effective semantic features of the body parts, while filtering out m…

    Submitted 4 January, 2024; originally announced January 2024.

    Comments: arXiv admin note: text overlap with arXiv:2307.07977

  12. arXiv:2312.13671  [pdf, other]

    cs.CL cs.LG

    Text2Analysis: A Benchmark of Table Question Answering with Advanced Data Analysis and Unclear Queries

    Authors: Xinyi He, Mengyu Zhou, Xinrun Xu, Xiaojun Ma, Rui Ding, Lun Du, Yan Gao, Ran Jia, Xu Chen, Shi Han, Zejian Yuan, Dongmei Zhang

    Abstract: Tabular data analysis is crucial in various fields, and large language models show promise in this area. However, current research mostly focuses on rudimentary tasks like Text2SQL and TableQA, neglecting advanced analysis like forecasting and chart generation. To address this gap, we developed the Text2Analysis benchmark, incorporating advanced analysis tasks that go beyond the SQL-compatible ope…

    Submitted 21 December, 2023; originally announced December 2023.

    Comments: Accepted by AAAI'2024

  13. arXiv:2311.04254  [pdf, other]

    cs.AI cs.LG

    Everything of Thoughts: Defying the Law of Penrose Triangle for Thought Generation

    Authors: Ruomeng Ding, Chaoyun Zhang, Lu Wang, Yong Xu, Minghua Ma, Wei Zhang, Si Qin, Saravan Rajmohan, Qingwei Lin, Dongmei Zhang

    Abstract: Recent advancements in Large Language Models (LLMs) have revolutionized decision-making by breaking down complex problems into more manageable language sequences referred to as "thoughts". An effective thought design should consider three key perspectives: performance, efficiency, and flexibility. However, existing thought can at most exhibit two of these attributes. To address these limitations,…

    Submitted 23 February, 2024; v1 submitted 7 November, 2023; originally announced November 2023.

    Comments: 17 pages, 5 figures

  14. arXiv:2310.18740  [pdf, other]

    cs.SE

    TraceDiag: Adaptive, Interpretable, and Efficient Root Cause Analysis on Large-Scale Microservice Systems

    Authors: Ruomeng Ding, Chaoyun Zhang, Lu Wang, Yong Xu, Minghua Ma, Xiaomin Wu, Meng Zhang, Qingjun Chen, Xin Gao, Xuedong Gao, Hao Fan, Saravan Rajmohan, Qingwei Lin, Dongmei Zhang

    Abstract: Root Cause Analysis (RCA) is becoming increasingly crucial for ensuring the reliability of microservice systems. However, performing RCA on modern microservice systems can be challenging due to their large scale, as they usually comprise hundreds of components, leading to significant human effort. This paper proposes TraceDiag, an end-to-end RCA framework that addresses the challenges for large-scale…

    Submitted 28 October, 2023; originally announced October 2023.

  15. arXiv:2310.07176  [pdf, other]

    cs.CV

    Improving mitosis detection on histopathology images using large vision-language models

    Authors: Ruiwen Ding, James Hall, Neil Tenenholtz, Kristen Severson

    Abstract: In certain types of cancerous tissue, mitotic count has been shown to be associated with tumor proliferation, poor prognosis, and therapeutic resistance. Due to the high inter-rater variability of mitotic counting by pathologists, convolutional neural networks (CNNs) have been employed to reduce the subjectivity of mitosis detection in hematoxylin and eosin (H&E)-stained whole slide images. Howeve…

    Submitted 11 October, 2023; originally announced October 2023.

    Comments: Submitted to IEEE ISBI 2024. Under review

  16. arXiv:2310.00268  [pdf, other]

    cs.LG cs.AI

    Unravel Anomalies: An End-to-end Seasonal-Trend Decomposition Approach for Time Series Anomaly Detection

    Authors: Zhenwei Zhang, Ruiqi Wang, Ran Ding, Yuantao Gu

    Abstract: Traditional Time-series Anomaly Detection (TAD) methods often struggle with the composite nature of complex time-series data and a diverse array of anomalies. We introduce TADNet, an end-to-end TAD model that leverages Seasonal-Trend Decomposition to link various types of anomalies to specific decomposition components, thereby simplifying the analysis of complex time-series and enhancing detection…

    Submitted 14 December, 2023; v1 submitted 30 September, 2023; originally announced October 2023.

    Comments: Published in ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), scheduled for 14-19 April 2024 in Seoul, Korea

  17. arXiv:2309.01606  [pdf, other]

    cs.CL

    Geo-Encoder: A Chunk-Argument Bi-Encoder Framework for Chinese Geographic Re-Ranking

    Authors: Yong Cao, Ruixue Ding, Boli Chen, Xianzhi Li, Min Chen, Daniel Hershcovich, Pengjun Xie, Fei Huang

    Abstract: Chinese geographic re-ranking task aims to find the most relevant addresses among retrieved candidates, which is crucial for location-related services such as navigation maps. Unlike the general sentences, geographic contexts are closely intertwined with geographical concepts, from general spans (e.g., province) to specific spans (e.g., road). Given this feature, we propose an innovative framework…

    Submitted 2 February, 2024; v1 submitted 4 September, 2023; originally announced September 2023.

    Comments: 15 pages, 5 figures, EACL 2024 main

  18. arXiv:2308.10305  [pdf, other]

    cs.CV cs.AI cs.LG

    Co-Evolution of Pose and Mesh for 3D Human Body Estimation from Video

    Authors: Yingxuan You, Hong Liu, Ti Wang, Wenhao Li, Runwei Ding, Xia Li

    Abstract: Despite significant progress in single image-based 3D human mesh recovery, accurately and smoothly recovering 3D human motion from a video remains challenging. Existing video-based methods generally recover human mesh by estimating the complex pose and shape parameters from coupled image features, whose high complexity and low representation ability often result in inconsistent pose motion and lim…

    Submitted 20 August, 2023; originally announced August 2023.

    Comments: Accepted by ICCV 2023. Project page: https://kasvii.github.io/PMCE

  19. arXiv:2308.09259  [pdf, other]

    cs.LG

    FRGNN: Mitigating the Impact of Distribution Shift on Graph Neural Networks via Test-Time Feature Reconstruction

    Authors: Rui Ding, Jielong Yang, Feng Ji, Xionghu Zhong, Linbo Xie

    Abstract: Due to inappropriate sample selection and limited training data, a distribution shift often exists between the training and test sets. This shift can adversely affect the test performance of Graph Neural Networks (GNNs). Existing approaches mitigate this issue by either enhancing the robustness of GNNs to distribution shift or reducing the shift itself. However, both approaches necessitate retrain…

    Submitted 13 October, 2023; v1 submitted 17 August, 2023; originally announced August 2023.

  20. arXiv:2308.01469  [pdf, other]

    cs.LG cs.AI cs.CR

    VertexSerum: Poisoning Graph Neural Networks for Link Inference

    Authors: Ruyi Ding, Shijin Duan, Xiaolin Xu, Yunsi Fei

    Abstract: Graph neural networks (GNNs) have brought superb performance to various applications utilizing graph structural data, such as social analysis and fraud detection. The graph links, e.g., social relationships and transaction history, are sensitive and valuable information, which raises privacy concerns when using GNNs. To exploit these vulnerabilities, we propose VertexSerum, a novel graph poisoning…

    Submitted 2 August, 2023; originally announced August 2023.

  21. arXiv:2308.00353  [pdf, other]

    cs.CV

    Lowis3D: Language-Driven Open-World Instance-Level 3D Scene Understanding

    Authors: Runyu Ding, Jihan Yang, Chuhui Xue, Wenqing Zhang, Song Bai, Xiaojuan Qi

    Abstract: Open-world instance-level scene understanding aims to locate and recognize unseen object categories that are not present in the annotated dataset. This task is challenging because the model needs to both localize novel 3D objects and infer their semantic categories. A key factor for the recent progress in 2D open-world perception is the availability of large-scale image-text pairs from the Interne…

    Submitted 1 August, 2023; originally announced August 2023.

    Comments: submit to TPAMI

  22. arXiv:2307.07977  [pdf, other]

    cs.CV

    Integrating Human Parsing and Pose Network for Human Action Recognition

    Authors: Runwei Ding, Yuhang Wen, Jinfu Liu, Nan Dai, Fanyang Meng, Mengyuan Liu

    Abstract: Human skeletons and RGB sequences are both widely-adopted input modalities for human action recognition. However, skeletons lack appearance features and color data suffer from a large amount of irrelevant depiction. To address this, we introduce human parsing feature map as a novel modality, since it can selectively retain spatiotemporal features of the body parts, while filtering out noises regarding ou…

    Submitted 16 July, 2023; originally announced July 2023.

    Comments: CICAI 2023 Camera-ready Version

  23. PKU-GoodsAD: A Supermarket Goods Dataset for Unsupervised Anomaly Detection and Segmentation

    Authors: Jian Zhang, Runwei Ding, Miaoju Ban, Ge Yang

    Abstract: Visual anomaly detection is essential and commonly used for many tasks in the field of computer vision. Recent anomaly detection datasets mainly focus on industrial automated inspection, medical image analysis and video surveillance. In order to broaden the application and research of anomaly detection in unmanned supermarkets and smart manufacturing, we introduce the supermarket goods anomaly det…

    Submitted 26 July, 2023; v1 submitted 10 July, 2023; originally announced July 2023.

    Comments: 8 pages, 6 figures

    Journal ref: IEEE Robotics and Automation Letters, 2024

  24. arXiv:2307.00754  [pdf, other]

    cs.LG cs.AI

    ImDiffusion: Imputed Diffusion Models for Multivariate Time Series Anomaly Detection

    Authors: Yuhang Chen, Chaoyun Zhang, Minghua Ma, Yudong Liu, Ruomeng Ding, Bowen Li, Shilin He, Saravan Rajmohan, Qingwei Lin, Dongmei Zhang

    Abstract: Anomaly detection in multivariate time series data is of paramount importance for ensuring the efficient operation of large-scale systems across diverse domains. However, accurately detecting anomalies in such data poses significant challenges. Existing approaches, including forecasting and reconstruction-based methods, struggle to address these challenges effectively. To overcome these limitation…

    Submitted 14 November, 2023; v1 submitted 3 July, 2023; originally announced July 2023.

    Comments: To appear in VLDB 2024. Code: https://github.com/17000cyh/IMDiffusion.git

  25. arXiv:2306.00440  [pdf, other]

    cs.CV

    Edge-guided Representation Learning for Underwater Object Detection

    Authors: Linhui Dai, Hong Liu, Pinhao Song, Hao Tang, Runwei Ding, Shengquan Li

    Abstract: Underwater object detection (UOD) is crucial for marine economic development, environmental protection, and the planet's sustainable development. The main challenges of this task arise from low-contrast, small objects, and mimicry of aquatic organisms. The key to addressing these challenges is to focus the model on obtaining more discriminative information. We observe that the edges of underwater…

    Submitted 1 June, 2023; originally announced June 2023.

  26. arXiv:2305.11618  [pdf, other]

    cs.CR cs.CV

    DAP: A Dynamic Adversarial Patch for Evading Person Detectors

    Authors: Amira Guesmi, Ruitian Ding, Muhammad Abdullah Hanif, Ihsen Alouani, Muhammad Shafique

    Abstract: Patch-based adversarial attacks were proven to compromise the robustness and reliability of computer vision systems. However, their conspicuous and easily detectable nature challenges their practicality in real-world settings. To address this, recent work has proposed using Generative Adversarial Networks (GANs) to generate naturalistic patches that may not attract human attention. However, such app…

    Submitted 20 November, 2023; v1 submitted 19 May, 2023; originally announced May 2023.

  27. arXiv:2305.06545  [pdf, other]

    cs.CL cs.AI

    GeoGLUE: A GeoGraphic Language Understanding Evaluation Benchmark

    Authors: Dongyang Li, Ruixue Ding, Qiang Zhang, Zheng Li, Boli Chen, Pengjun Xie, Yao Xu, Xin Li, Ning Guo, Fei Huang, Xiaofeng He

    Abstract: With a fast developing pace of geographic applications, automatable and intelligent models are essential to be designed to handle the large volume of information. However, few researchers focus on geographic natural language processing, and there has never been a benchmark to build a unified standard. In this work, we propose a GeoGraphic Language Understanding Evaluation benchmark, named GeoGLUE.…

    Submitted 10 May, 2023; originally announced May 2023.

  28. arXiv:2304.14045  [pdf, other]

    cs.CV cs.AI cs.LG

    Interweaved Graph and Attention Network for 3D Human Pose Estimation

    Authors: Ti Wang, Hong Liu, Runwei Ding, Wenhao Li, Yingxuan You, Xia Li

    Abstract: Despite substantial progress in 3D human pose estimation from a single-view image, prior works rarely explore global and local correlations, leading to insufficient learning of human skeleton representations. To address this issue, we propose a novel Interweaved Graph and Attention Network (IGANet) that allows bidirectional communications between graph convolutional networks (GCNs) and attentions.…

    Submitted 27 April, 2023; originally announced April 2023.

    Comments: Accepted by ICASSP2023

  29. arXiv:2304.00962  [pdf, other]

    cs.CV cs.AI

    RegionPLC: Regional Point-Language Contrastive Learning for Open-World 3D Scene Understanding

    Authors: Jihan Yang, Runyu Ding, Weipeng Deng, Zhe Wang, Xiaojuan Qi

    Abstract: We propose a lightweight and scalable Regional Point-Language Contrastive learning framework, namely RegionPLC, for open-world 3D scene understanding, aiming to identify and recognize open-set objects and categories. Specifically, based on our empirical studies, we introduce a 3D-aware SFusion strategy that fuses 3D vision-language pairs derived from multiple 2D foundation models, yieldin…

    Submitted 5 May, 2024; v1 submitted 3 April, 2023; originally announced April 2023.

    Comments: To appear in CVPR2024. Project page: https://jihanyang.github.io/projects/RegionPLC

  30. arXiv:2304.00477  [pdf, other]

    cs.DB cs.AI cs.HC

    Demonstration of InsightPilot: An LLM-Empowered Automated Data Exploration System

    Authors: Pingchuan Ma, Rui Ding, Shuai Wang, Shi Han, Dongmei Zhang

    Abstract: Exploring data is crucial in data analysis, as it helps users understand and interpret the data more effectively. However, performing effective data exploration requires in-depth knowledge of the dataset and expertise in data analysis techniques. Not being familiar with either can create obstacles that make the process time-consuming and overwhelming for data analysts. To address this issue, we in…

    Submitted 12 November, 2023; v1 submitted 2 April, 2023; originally announced April 2023.

  31. arXiv:2303.15571  [pdf, other]

    cs.CR cs.AI

    EMShepherd: Detecting Adversarial Samples via Side-channel Leakage

    Authors: Ruyi Ding, Cheng Gongye, Siyue Wang, Aidong Ding, Yunsi Fei

    Abstract: Deep Neural Networks (DNN) are vulnerable to adversarial perturbations: small changes crafted deliberately on the input to mislead the model for wrong predictions. Adversarial attacks have disastrous consequences for deep learning-empowered critical applications. Existing defense and detection techniques both require extensive knowledge of the model, testing inputs, and even execution details. They…

    Submitted 27 March, 2023; originally announced March 2023.

  32. arXiv:2303.10547  [pdf, other]

    cs.MA

    Event-triggered privacy preserving consensus control with edge-based additive noise

    Authors: Limei Liang, Ruiqi Ding, Shuai Liu

    Abstract: In this article, we investigate the distributed privacy preserving weighted consensus control problem for linear continuous-time multi-agent systems under the event-triggering communication mode. A novel event-triggered privacy preserving consensus scheme is proposed, which can be divided into three phases. First, for each agent, an event-triggered mechanism is designed to determine whether the cu…

    Submitted 18 March, 2023; originally announced March 2023.

  33. arXiv:2303.05652  [pdf, other]

    cs.CV cs.AI cs.LG

    GATOR: Graph-Aware Transformer with Motion-Disentangled Regression for Human Mesh Recovery from a 2D Pose

    Authors: Yingxuan You, Hong Liu, Xia Li, Wenhao Li, Ti Wang, Runwei Ding

    Abstract: 3D human mesh recovery from a 2D pose plays an important role in various applications. However, it is hard for existing methods to simultaneously capture the multiple relations during the evolution from skeleton to mesh, including joint-joint, joint-vertex and vertex-vertex relations, which often leads to implausible results. To address this issue, we propose a novel solution, called GATOR, that c…

    Submitted 9 March, 2023; originally announced March 2023.

    Comments: Accepted by ICASSP 2023

  34. arXiv:2302.09790  [pdf, other]

    cs.CV cs.HC cs.LG

    HTNet: Human Topology Aware Network for 3D Human Pose Estimation

    Authors: Jialun Cai, Hong Liu, Runwei Ding, Wenhao Li, Jianbing Wu, Miaoju Ban

    Abstract: 3D human pose estimation errors would propagate along the human body topology and accumulate at the end joints of limbs. Inspired by the backtracking mechanism in automatic control systems, we design an Intra-Part Constraint module that utilizes the parent nodes as the reference to build topological constraints for end joints at the part level. Further considering the hierarchy of the human topolo…

    Submitted 20 February, 2023; originally announced February 2023.

    Comments: ICASSP23 Accepted Paper

  35. Cross-center Early Sepsis Recognition by Medical Knowledge Guided Collaborative Learning for Data-scarce Hospitals

    Authors: Ruiqing Ding, Fangjie Rong, Xiao Han, Leye Wang

    Abstract: There are significant regional inequities in health resources around the world. It has become one of the most focused topics to improve health services for data-scarce hospitals and promote health equity through knowledge sharing among medical institutions. Because electronic medical records (EMRs) contain sensitive personal information, privacy protection is unavoidable and essential for multi-ho…

    Submitted 11 February, 2023; originally announced February 2023.

    Comments: 7 pages, 2 figures

  36. MGeo: Multi-Modal Geographic Pre-Training Method

    Authors: Ruixue Ding, Boli Chen, Pengjun Xie, Fei Huang, Xin Li, Qiang Zhang, Yao Xu

    Abstract: As a core task in location-based services (LBS) (e.g., navigation maps), query and point of interest (POI) matching connects users' intent with real-world geographic information. Recently, pre-trained models (PTMs) have made advancements in many natural language processing (NLP) tasks. Generic text-based PTMs do not have enough geographic knowledge for query-POI matching. To overcome this limitati…

    Submitted 24 May, 2023; v1 submitted 10 January, 2023; originally announced January 2023.

    Comments: 10 pages, 5 figures

  37. arXiv:2301.00406  [pdf, other]

    cs.CV eess.IV

    Curvature regularization for Non-line-of-sight Imaging from Under-sampled Data

    Authors: Rui Ding, Juntian Ye, Qifeng Gao, Feihu Xu, Yuping Duan

    Abstract: Non-line-of-sight (NLOS) imaging aims to reconstruct the three-dimensional hidden scenes from the data measured in the line-of-sight, which uses photon time-of-flight information encoded in light after multiple diffuse reflections. The under-sampled scanning data can facilitate fast imaging. However, the resulting reconstruction problem becomes a serious ill-posed inverse problem, the solution of…

    Submitted 6 March, 2024; v1 submitted 1 January, 2023; originally announced January 2023.

  38. arXiv:2212.05251  [pdf, other]

    cs.CL

    A Unified Knowledge Graph Augmentation Service for Boosting Domain-specific NLP Tasks

    Authors: Ruiqing Ding, Xiao Han, Leye Wang

    Abstract: By focusing the pre-training process on domain-specific corpora, some domain-specific pre-trained language models (PLMs) have achieved state-of-the-art results. However, it is under-investigated to design a unified paradigm to inject domain knowledge in the PLM fine-tuning stage. We propose KnowledgeDA, a unified domain language model development service to enhance the task-specific training proce…

    Submitted 5 June, 2023; v1 submitted 10 December, 2022; originally announced December 2022.

    Comments: Accepted by ACL Findings 2023

  39. arXiv:2211.16312  [pdf, other]

    cs.CV

    PLA: Language-Driven Open-Vocabulary 3D Scene Understanding

    Authors: Runyu Ding, Jihan Yang, Chuhui Xue, Wenqing Zhang, Song Bai, Xiaojuan Qi

    Abstract: Open-vocabulary scene understanding aims to localize and recognize unseen categories beyond the annotated label space. The recent breakthrough of 2D open-vocabulary perception is largely driven by Internet-scale paired image-text data with rich vocabulary concepts. However, this success cannot be directly transferred to 3D scenarios due to the inaccessibility of large-scale 3D-text pairs. To this…

    Submitted 22 March, 2023; v1 submitted 29 November, 2022; originally announced November 2022.

    Comments: CVPR2023

  40. arXiv:2210.10293  [pdf, other]

    cs.CL

    Forging Multiple Training Objectives for Pre-trained Language Models via Meta-Learning

    Authors: Hongqiu Wu, Ruixue Ding, Hai Zhao, Boli Chen, Pengjun Xie, Fei Huang, Min Zhang

    Abstract: Multiple pre-training objectives fill the vacancy of the understanding capability of single-objective language modeling, which serves the ultimate purpose of pre-trained language models (PrLMs), generalizing well on a mass of scenarios. However, learning multiple training objectives in a single model is challenging due to the unknown relative significance as well as the potential contrariety betwe…

    Submitted 19 October, 2022; originally announced October 2022.

    Comments: EMNLP 2022 (findings)

  41. arXiv:2208.00207  [pdf, other]

    eess.IV cs.CV math.OC

    LRIP-Net: Low-Resolution Image Prior based Network for Limited-Angle CT Reconstruction

    Authors: Qifeng Gao, Rui Ding, Linyuan Wang, Bin Xue, Yuping Duan

    Abstract: In the practical applications of computed tomography imaging, the projection data may be acquired within a limited-angle range and corrupted by noises due to the limitation of scanning conditions. The noisy incomplete projection data results in the ill-posedness of the inverse problems. In this work, we theoretically verify that the low-resolution reconstruction problem has better numerical stabil…

    Submitted 30 July, 2022; originally announced August 2022.

  42. arXiv:2207.12718  [pdf, other]

    cs.DB cs.AI

    XInsight: eXplainable Data Analysis Through The Lens of Causality

    Authors: Pingchuan Ma, Rui Ding, Shuai Wang, Shi Han, Dongmei Zhang

    Abstract: In light of the growing popularity of Exploratory Data Analysis (EDA), understanding the underlying causes of the knowledge acquired by EDA is crucial. However, it remains under-researched. This study promotes a transparent and explicable perspective on data analysis, called eXplainable Data Analysis (XDA). For this reason, we present XInsight, a general framework for XDA. XInsight provides data a…

    Submitted 30 May, 2023; v1 submitted 26 July, 2022; originally announced July 2022.

  43. arXiv:2206.15463  [pdf, other]

    cs.AR cs.LG

    QUIDAM: A Framework for Quantization-Aware DNN Accelerator and Model Co-Exploration

    Authors: Ahmet Inci, Siri Garudanagiri Virupaksha, Aman Jain, Ting-Wu Chin, Venkata Vivek Thallam, Ruizhou Ding, Diana Marculescu

    Abstract: As the machine learning and systems communities strive to achieve higher energy-efficiency through custom deep neural network (DNN) accelerators, varied precision or quantization levels, and model compression techniques, there is a need for design space exploration frameworks that incorporate quantization-aware processing elements into the accelerator design space while having accurate and fast po…

    Submitted 30 June, 2022; originally announced June 2022.

    Comments: 25 pages, 12 figures. arXiv admin note: substantial text overlap with arXiv:2205.13045, arXiv:2205.08648

  44. arXiv:2206.12608  [pdf, other]

    cs.CL cs.LG

    Adversarial Self-Attention for Language Understanding

    Authors: Hongqiu Wu, Ruixue Ding, Hai Zhao, Pengjun Xie, Fei Huang, Min Zhang

    Abstract: Deep neural models (e.g. Transformer) naturally learn spurious features, which create a "shortcut" between the labels and inputs, thus impairing the generalization and robustness. This paper advances the self-attention mechanism to its robust variant for Transformer-based pre-trained language models (e.g. BERT). We propose Adversarial Self-Attention mechanism (ASA), which adversarially…

    Submitted 8 February, 2023; v1 submitted 25 June, 2022; originally announced June 2022.

    Comments: Accepted by AAAI 2023

  45. arXiv:2206.06420  [pdf, other]

    cs.CV cs.AI cs.LG

    GraphMLP: A Graph MLP-Like Architecture for 3D Human Pose Estimation

    Authors: Wenhao Li, Hong Liu, Tianyu Guo, Runwei Ding, Hao Tang

    Abstract: Modern multi-layer perceptron (MLP) models have shown competitive results in learning visual representations without self-attention. However, existing MLP models are not good at capturing local details and lack prior knowledge of human body configurations, which limits their modeling power for skeletal representation learning. To address these issues, we propose a simple yet effective graph-reinfo…

    Submitted 21 April, 2023; v1 submitted 13 June, 2022; originally announced June 2022.

    Comments: Open Sourced

  46. arXiv:2206.04890  [pdf, other]

    cs.LG

    Adversarial Counterfactual Environment Model Learning

    Authors: Xiong-Hui Chen, Yang Yu, Zheng-Mao Zhu, Zhihua Yu, Zhenjun Chen, Chenghe Wang, Yinan Wu, Hongqiu Wu, Rong-Jun Qin, Ruijin Ding, Fangsheng Huang

    Abstract: A good model for action-effect prediction, named environment model, is important to achieve sample-efficient decision-making policy learning in many domains like robot control, recommender systems, and patients' treatment selection. We can take unlimited trials with such a model to identify the appropriate actions so that the costs of queries in the real world can be saved. It requires the model t…

    Submitted 8 October, 2023; v1 submitted 10 June, 2022; originally announced June 2022.

  47. Unsupervised Learning of 3D Scene Flow from Monocular Camera

    Authors: Guangming Wang, Xiaoyu Tian, Ruiqi Ding, Hesheng Wang

    Abstract: Scene flow represents the motion of points in the 3D space, which is the counterpart of the optical flow that represents the motion of pixels in the 2D image. However, it is difficult to obtain the ground truth of scene flow in the real scenes, and recent studies are based on synthetic data for training. Therefore, how to train a scene flow network with unsupervised methods based on real-world dat…

    Submitted 8 June, 2022; originally announced June 2022.

    Comments: ICRA2021

    Journal ref: 2021 IEEE International Conference on Robotics and Automation (ICRA)

  48. arXiv:2205.15156  [pdf, other]

    cs.CV cs.AI cs.LG

    Towards Efficient 3D Object Detection with Knowledge Distillation

    Authors: Jihan Yang, Shaoshuai Shi, Runyu Ding, Zhe Wang, Xiaojuan Qi

    Abstract: Despite substantial progress in 3D object detection, advanced 3D detectors often suffer from heavy computation overheads. To this end, we explore the potential of knowledge distillation (KD) for developing efficient 3D object detectors, focusing on popular pillar- and voxel-based detectors. In the absence of well-developed teacher-student pairs, we first study how to obtain student models with good…

    Submitted 13 October, 2022; v1 submitted 30 May, 2022; originally announced May 2022.

    Comments: NeurIPS 2022

  49. arXiv:2205.13045  [pdf, other]

    cs.AR cs.LG

    QADAM: Quantization-Aware DNN Accelerator Modeling for Pareto-Optimality

    Authors: Ahmet Inci, Siri Garudanagiri Virupaksha, Aman Jain, Venkata Vivek Thallam, Ruizhou Ding, Diana Marculescu

    Abstract: As the machine learning and systems communities strive to achieve higher energy-efficiency through custom deep neural network (DNN) accelerators, varied bit precision or quantization levels, there is a need for design space exploration frameworks that incorporate quantization-aware processing elements (PE) into the accelerator design space while having accurate and fast power, performance, and are…

    Submitted 20 May, 2022; originally announced May 2022.

    Comments: Accepted paper at the Machine Learning for Computer Architecture and Systems (MLArchSys) Workshop in conjunction with ISCA 2021. This is an extended version of arXiv:2205.08648

  50. arXiv:2205.08648  [pdf, other]

    cs.AR cs.LG

    QAPPA: Quantization-Aware Power, Performance, and Area Modeling of DNN Accelerators

    Authors: Ahmet Inci, Siri Garudanagiri Virupaksha, Aman Jain, Venkata Vivek Thallam, Ruizhou Ding, Diana Marculescu

    Abstract: As the machine learning and systems community strives to achieve higher energy-efficiency through custom DNN accelerators and model compression techniques, there is a need for a design space exploration framework that incorporates quantization-aware processing elements into the accelerator design space while having accurate and fast power, performance, and area models. In this work, we present QAP…

    Submitted 17 May, 2022; originally announced May 2022.

    Comments: Accepted paper at the On-Device Intelligence Workshop in conjunction with MLSys Conference 2021