Showing 1–50 of 378 results for author: Liang, D

  1. arXiv:2407.10563  [pdf, other]

    cs.CV

    Pathformer3D: A 3D Scanpath Transformer for 360° Images

    Authors: Rong Quan, Yantao Lai, Mengyu Qiu, Dong Liang

    Abstract: Scanpath prediction in 360° images can help realize rapid rendering and better user interaction in Virtual/Augmented Reality applications. However, existing scanpath prediction models for 360° images execute scanpath prediction on 2D equirectangular projection plane, which always result in big computation error owing to the 2D plane's distortion and coordinate discontinuity. In this work, we perfo…

    Submitted 15 July, 2024; originally announced July 2024.

    Comments: ECCV 2024

  2. arXiv:2407.05617  [pdf, other]

    eess.IV

    LINEAR: Learning Implicit Neural Representation With Explicit Physical Priors for Accelerated Quantitative T1rho Mapping

    Authors: Yuanyuan Liu, Jinwen Xie, Zhuo-Xu Cui, Qingyong Zhu, Jing Cheng, Dong Liang, Yanjie Zhu

    Abstract: Quantitative T1rho parameter mapping has shown promise in clinical and research studies. However, it suffers from long scan times. Deep learning-based techniques have been successfully applied in accelerated quantitative MR parameter mapping. However, most methods require fully-sampled training dataset, which is impractical in the clinic. In this study, a novel subject-specific unsupervised method…

    Submitted 8 July, 2024; originally announced July 2024.

    Comments: Yuanyuan Liu and Jinwen Xie contributed equally to this work

  3. arXiv:2407.04289  [pdf, other]

    cond-mat.mes-hall quant-ph

    Electronic Correlations in Multielectron Silicon Quantum Dots

    Authors: Dylan H. Liang, MengKe Feng, Philip Y. Mai, Jesus D. Cifuentes, Andrew S. Dzurak, Andre Saraiva

    Abstract: Silicon quantum computing has the potential to revolutionize technology with capabilities to solve real-life problems that are computationally complex or even intractable for modern computers [1] by offering sufficient high quality qubits to perform complex error-corrected calculations. Silicon metal-oxide-semiconductor based quantum dots present a promising pathway for realizing practical quantum…

    Submitted 5 July, 2024; originally announced July 2024.

  4. arXiv:2407.03719  [pdf, other]

    cs.CV

    Relative Difficulty Distillation for Semantic Segmentation

    Authors: Dong Liang, Yue Sun, Yun Du, Songcan Chen, Sheng-Jun Huang

    Abstract: Current knowledge distillation (KD) methods primarily focus on transferring various structured knowledge and designing corresponding optimization goals to encourage the student network to imitate the output of the teacher network. However, introducing too many additional optimization objectives may lead to unstable training, such as gradient conflicts. Moreover, these methods ignored the guideline…

    Submitted 4 July, 2024; originally announced July 2024.

  5. arXiv:2407.03699  [pdf, other]

    cs.CV

    Generalized Robust Fundus Photography-based Vision Loss Estimation for High Myopia

    Authors: Zipei Yan, Zhile Liang, Zhengji Liu, Shuai Wang, Rachel Ka-Man Chun, Jizhou Li, Chea-su Kee, Dong Liang

    Abstract: High myopia significantly increases the risk of irreversible vision loss. Traditional perimetry-based visual field (VF) assessment provides systematic quantification of visual loss but it is subjective and time-consuming. Consequently, machine learning models utilizing fundus photographs to estimate VF have emerged as promising alternatives. However, due to the high variability and the limited ava…

    Submitted 4 July, 2024; originally announced July 2024.

    Comments: Accepted by MICCAI 2024, code: https://github.com/yanzipei/VF_RED

  6. arXiv:2407.03263  [pdf, other]

    cs.CV

    A Unified Framework for 3D Scene Understanding

    Authors: Wei Xu, Chunsheng Shi, Sifan Tu, Xin Zhou, Dingkang Liang, Xiang Bai

    Abstract: We propose UniSeg3D, a unified 3D segmentation framework that achieves panoptic, semantic, instance, interactive, referring, and open-vocabulary semantic segmentation tasks within a single model. Most previous 3D segmentation approaches are specialized for a specific task, thereby limiting their understanding of 3D scenes to a task-specific perspective. In contrast, the proposed method unifies six…

    Submitted 3 July, 2024; originally announced July 2024.

    Comments: The code will be available at https://dk-liang.github.io/UniSeg3D/

  7. arXiv:2407.01016  [pdf, other]

    cs.CV

    SOOD++: Leveraging Unlabeled Data to Boost Oriented Object Detection

    Authors: Dingkang Liang, Wei Hua, Chunsheng Shi, Zhikang Zou, Xiaoqing Ye, Xiang Bai

    Abstract: Semi-supervised object detection (SSOD), leveraging unlabeled data to boost object detectors, has become a hot topic recently. However, existing SSOD approaches mainly focus on horizontal objects, leaving multi-oriented objects common in aerial images unexplored. At the same time, the annotation cost of multi-oriented objects is significantly higher than that of their horizontal counterparts. Ther…

    Submitted 1 July, 2024; originally announced July 2024.

  8. arXiv:2406.14952  [pdf, other]

    cs.CL

    ESC-Eval: Evaluating Emotion Support Conversations in Large Language Models

    Authors: Haiquan Zhao, Lingyu Li, Shisong Chen, Shuqi Kong, Jiaan Wang, Kexin Huang, Tianle Gu, Yixu Wang, Dandan Liang, Zhixu Li, Yan Teng, Yanghua Xiao, Yingchun Wang

    Abstract: Emotion Support Conversation (ESC) is a crucial application, which aims to reduce human stress, offer emotional guidance, and ultimately enhance human mental and physical well-being. With the advancement of Large Language Models (LLMs), many researchers have employed LLMs as the ESC models. However, the evaluation of these LLM-based ESCs remains uncertain. Inspired by the awesome development of ro…

    Submitted 24 June, 2024; v1 submitted 21 June, 2024; originally announced June 2024.

    Comments: Pre-print

  9. arXiv:2406.14067  [pdf]

    physics.optics eess.SP

    A microwave photonic prototype for concurrent radar detection and spectrum sensing over an 8 to 40 GHz bandwidth

    Authors: Taixia Shi, Dingding Liang, Lu Wang, Lin Li, Shaogang Guo, Jiawei Gao, Xiaowei Li, Chulun Lin, Lei Shi, Baogang Ding, Shiyang Liu, Fangyi Yang, Chi Jiang, Yang Chen

    Abstract: In this work, a microwave photonic prototype for concurrent radar detection and spectrum sensing is proposed, designed, built, and investigated. A direct digital synthesizer and an analog electronic circuit are integrated to generate an intermediate frequency (IF) linearly frequency-modulated (LFM) signal with a tunable center frequency from 2.5 to 9.5 GHz and an instantaneous bandwidth of 1 GHz.…

    Submitted 20 June, 2024; originally announced June 2024.

    Comments: 18 pages, 12 figures, 1 table

  10. arXiv:2406.11397  [pdf, other]

    cs.LG cs.AI stat.ML

    DistPred: A Distribution-Free Probabilistic Inference Method for Regression and Forecasting

    Authors: Daojun Liang, Haixia Zhang, Dongfeng Yuan

    Abstract: Traditional regression and prediction tasks often only provide deterministic point estimates. To estimate the uncertainty or distribution information of the response variable, methods such as Bayesian inference, model ensembling, or MC Dropout are typically used. These methods either assume that the posterior distribution of samples follows a Gaussian process or require thousands of forward passes…

    Submitted 17 June, 2024; originally announced June 2024.

  11. arXiv:2406.07594  [pdf, other]

    cs.CL cs.AI cs.CR

    MLLMGuard: A Multi-dimensional Safety Evaluation Suite for Multimodal Large Language Models

    Authors: Tianle Gu, Zeyang Zhou, Kexin Huang, Dandan Liang, Yixu Wang, Haiquan Zhao, Yuanqi Yao, Xingge Qiao, Keqing Wang, Yujiu Yang, Yan Teng, Yu Qiao, Yingchun Wang

    Abstract: Powered by remarkable advancements in Large Language Models (LLMs), Multimodal Large Language Models (MLLMs) demonstrate impressive capabilities in manifold tasks. However, the practical application scenarios of MLLMs are intricate, exposing them to potential malicious instructions and thereby posing safety risks. While current benchmarks do incorporate certain safety considerations, they often la…

    Submitted 13 June, 2024; v1 submitted 11 June, 2024; originally announced June 2024.

  12. arXiv:2406.04801  [pdf, other]

    cs.CV

    MoE Jetpack: From Dense Checkpoints to Adaptive Mixture of Experts for Vision Tasks

    Authors: Xingkui Zhu, Yiran Guan, Dingkang Liang, Yuchao Chen, Yuliang Liu, Xiang Bai

    Abstract: The sparsely activated mixture of experts (MoE) model presents a promising alternative to traditional densely activated (dense) models, enhancing both quality and computational efficiency. However, training MoE models from scratch demands extensive data and computational resources. Moreover, public repositories like timm mainly provide pre-trained dense checkpoints, lacking similar resources for M…

    Submitted 7 June, 2024; originally announced June 2024.

    Comments: 9 pages, 6 figures

    ACM Class: I.2

  13. arXiv:2405.19736  [pdf, other]

    cs.AI

    Intrinsic Dynamics-Driven Generalizable Scene Representations for Vision-Oriented Decision-Making Applications

    Authors: Dayang Liang, Jinyang Lai, Yunlong Liu

    Abstract: How to improve the ability of scene representation is a key issue in vision-oriented decision-making applications, and current approaches usually learn task-relevant state representations within visual reinforcement learning to address this problem. While prior work typically introduces one-step behavioral similarity metrics with elements (e.g., rewards and actions) to extract task-relevant state…

    Submitted 30 June, 2024; v1 submitted 30 May, 2024; originally announced May 2024.

    Comments: This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible

  14. arXiv:2405.15830  [pdf, other]

    eess.IV

    Diff-DTI: Fast Diffusion Tensor Imaging Using A Feature-Enhanced Joint Diffusion Model

    Authors: Lang Zhang, Jinling He, Dong Liang, Hairong Zheng, Yanjie Zhu

    Abstract: Magnetic resonance diffusion tensor imaging (DTI) is a critical tool for neural disease diagnosis. However, long scan time greatly hinders the widespread clinical use of DTI. To accelerate image acquisition, a feature-enhanced joint diffusion model (Diff-DTI) is proposed to obtain accurate DTI parameter maps from a limited number of diffusion-weighted images (DWIs). Diff-DTI introduces a joint dif…

    Submitted 24 May, 2024; originally announced May 2024.

    Comments: 11 pages, 7 figures

  15. arXiv:2405.15271  [pdf]

    eess.SY physics.ins-det physics.optics

    Seamless Integration and Implementation of Distributed Contact and Contactless Vital Sign Monitoring

    Authors: Dingding Liang, Yang Chen, Jiawei Gao, Taixia Shi, Jianping Yao

    Abstract: Real-time vital sign monitoring is gaining immense significance not only in the medical field but also in personal health management. Facing the needs of different application scenarios of the smart and healthy city in the future, the low-cost, large-scale, scalable, and distributed vital sign monitoring system is of great significance. In this work, a seamlessly integrated contact and contactless…

    Submitted 24 May, 2024; originally announced May 2024.

    Comments: 14 pages, 9 figures

  16. arXiv:2405.13152  [pdf, other]

    cs.CV cs.AI

    Enhancing Interaction Modeling with Agent Selection and Physical Methods for Trajectory Prediction

    Authors: Shiji Huang, Lei Ye, Min Chen, Wenhai Luo, Chenqi Xu, Deyuan Liang, Dihong Wang

    Abstract: In this study, we address the limitations inherent in most existing vehicle trajectory prediction methodologies that indiscriminately incorporate all agents within a predetermined proximity when accounting for inter-agent interactions. These approaches commonly employ attention-based architecture or graph neural networks for encoding interactions, which introduces three challenges: (i) The indiscr…

    Submitted 21 May, 2024; originally announced May 2024.

    Comments: code: https://github.com/kkk00714/ASPILin

  17. arXiv:2405.12434  [pdf, other]

    cs.CL

    Resolving Word Vagueness with Scenario-guided Adapter for Natural Language Inference

    Authors: Yonghao Liu, Mengyu Li, Di Liang, Ximing Li, Fausto Giunchiglia, Lan Huang, Xiaoyue Feng, Renchu Guan

    Abstract: Natural Language Inference (NLI) is a crucial task in natural language processing that involves determining the relationship between two sentences, typically referred to as the premise and the hypothesis. However, traditional NLI models solely rely on the semantic information inherent in independent sentences and lack relevant situational visual information, which can hinder a complete understandi…

    Submitted 20 May, 2024; originally announced May 2024.

    Comments: IJCAI24

  18. arXiv:2405.12119  [pdf, other]

    cs.IR cs.AI cs.CL

    Reindex-Then-Adapt: Improving Large Language Models for Conversational Recommendation

    Authors: Zhankui He, Zhouhang Xie, Harald Steck, Dawen Liang, Rahul Jha, Nathan Kallus, Julian McAuley

    Abstract: Large language models (LLMs) are revolutionizing conversational recommender systems by adeptly indexing item content, understanding complex conversational contexts, and generating relevant item titles. However, controlling the distribution of recommended items remains a challenge. This leads to suboptimal performance due to the failure to capture rapidly changing data distributions, such as item p…

    Submitted 20 May, 2024; originally announced May 2024.

  19. arXiv:2405.10242  [pdf, ps, other]

    quant-ph

    Quantum State Learning Implies Circuit Lower Bounds

    Authors: Nai-Hui Chia, Daniel Liang, Fang Song

    Abstract: We establish connections between state tomography, pseudorandomness, quantum state synthesis, and circuit lower bounds. In particular, let $\mathfrak{C}$ be a family of non-uniform quantum circuits of polynomial size and suppose that there exists an algorithm that, given copies of $|ψ\rangle$, distinguishes whether $|ψ\rangle$ is produced by $\mathfrak{C}$ or is Haar random, promised one of these…

    Submitted 16 May, 2024; originally announced May 2024.

    Comments: 53 pages

  20. arXiv:2405.05763  [pdf]

    cs.CV cs.AI

    DP-MDM: Detail-Preserving MR Reconstruction via Multiple Diffusion Models

    Authors: Mengxiao Geng, Jiahao Zhu, Xiaolin Zhu, Qiqing Liu, Dong Liang, Qiegen Liu

    Abstract: Detail features of magnetic resonance images play a crucial role in accurate medical diagnosis and treatment, as they capture subtle changes that pose challenges for doctors when performing precise judgments. However, the widely utilized naive diffusion model has limitations, as it fails to accurately capture more intricate details. To enhance the quality of MRI reconstruction, we propose a com…

    Submitted 9 May, 2024; originally announced May 2024.

  21. arXiv:2404.16680  [pdf, other]

    gr-qc

    Unrevealing the existence of nontensorial gravitational-wave polarizations from individual supermassive black hole binaries with pulsar timing arrays

    Authors: Dicong Liang, Siyuan Chen, Chao Zhang, Lijing Shao

    Abstract: With the strong evidence for a gravitational wave (GW) background in the nanohertz frequency band from pulsar timing arrays, the detection of continuous GWs from individual supermassive black hole binaries is already at the dawn. Utilizing continuous GWs to test theories of gravity, especially to test the polarizations of GWs is becoming more and more realistic. In this theoretical study, assuming…

    Submitted 25 April, 2024; originally announced April 2024.

    Comments: 10 pages, 5 figures. Comments are welcome

  22. arXiv:2404.08882  [pdf, other]

    physics.med-ph physics.optics

    Explanations of MTF discrepancy in grating-based X-ray differential phase contrast CT imaging

    Authors: Yuhang Tan, Jiecheng Yang, Hairong Zheng, Dong Liang, Peiping Zhu, Yongshuai Ge

    Abstract: As a multi-contrast X-ray computed tomography (CT) imaging system, the grating-based Talbot-Lau interferometer is able to generate the absorption contrast and differential phase contrast (DPC) images concurrently. However, experiments found that the absorption CT (ACT) images have better spatial resolution, i.e., higher modulation transfer function (MTF), than the differential phase contrast CT (D…

    Submitted 12 April, 2024; originally announced April 2024.

    Comments: 7 pages, 3 figures

    ACM Class: J.2

  23. arXiv:2404.08450  [pdf, other]

    cs.CV

    Joint Physical-Digital Facial Attack Detection Via Simulating Spoofing Clues

    Authors: Xianhua He, Dashuang Liang, Song Yang, Zhanlong Hao, Hui Ma, Binjie Mao, Xi Li, Yao Wang, Pengfei Yan, Ajian Liu

    Abstract: Face recognition systems are frequently subjected to a variety of physical and digital attacks of different types. Previous methods have achieved satisfactory performance in scenarios that address physical attacks and digital attacks, respectively. However, few methods are considered to integrate a model that simultaneously addresses both physical and digital attacks, implying the necessity to dev…

    Submitted 12 April, 2024; originally announced April 2024.

    Comments: 10 pages with 6 figures, Accepted by CVPRW 2024

  24. arXiv:2404.08023  [pdf, other]

    q-bio.QM cs.LG

    Pathology-genomic fusion via biologically informed cross-modality graph learning for survival analysis

    Authors: Zeyu Zhang, Yuanshen Zhao, Jingxian Duan, Yaou Liu, Hairong Zheng, Dong Liang, Zhenyu Zhang, Zhi-Cheng Li

    Abstract: The diagnosis and prognosis of cancer are typically based on multi-modal clinical data, including histology images and genomic data, due to the complex pathogenesis and high heterogeneity. Despite the advancements in digital pathology and high-throughput genome sequencing, establishing effective multi-modal fusion models for survival prediction and revealing the potential association between histo…

    Submitted 11 April, 2024; originally announced April 2024.

  25. arXiv:2404.04586  [pdf, other]

    cs.CV

    PIE: Physics-inspired Low-light Enhancement

    Authors: Dong Liang, Zhengyan Xu, Ling Li, Mingqiang Wei, Songcan Chen

    Abstract: In this paper, we propose a physics-inspired contrastive learning paradigm for low-light enhancement, called PIE. PIE primarily addresses three issues: (i) To resolve the problem of existing learning-based methods often training a LLE model with strict pixel-correspondence image pairs, we eliminate the need for pixel-correspondence paired training data and instead train with unpaired images. (ii)…

    Submitted 6 April, 2024; originally announced April 2024.

    Comments: arXiv admin note: text overlap with arXiv:2112.06451

  26. arXiv:2404.03813  [pdf, ps, other]

    quant-ph cs.LG

    Agnostic Tomography of Stabilizer Product States

    Authors: Sabee Grewal, Vishnu Iyer, William Kretschmer, Daniel Liang

    Abstract: We define a quantum learning task called agnostic tomography, where given copies of an arbitrary state $ρ$ and a class of quantum states $\mathcal{C}$, the goal is to output a succinct description of a state that approximates $ρ$ at least as well as any state in $\mathcal{C}$ (up to some small error $\varepsilon$). This task generalizes ordinary quantum tomography of states in $\mathcal{C}$ and is…

    Submitted 4 April, 2024; originally announced April 2024.

    Comments: 20 pages

  27. arXiv:2404.02546  [pdf, ps, other]

    math.OC math.NA

    Analysis and approximation to parabolic optimal control problems with measure-valued controls in time

    Authors: Wei Gong, Dongdong Liang

    Abstract: In this paper, we investigate an optimal control problem governed by parabolic equations with measure-valued controls over time. We establish the well-posedness of the optimal control problem and derive the first-order optimality condition using Clarke's subgradients, revealing a sparsity structure in time for the optimal control. Consequently, these optimal control problems represent a generaliza…

    Submitted 3 April, 2024; originally announced April 2024.

    MSC Class: 65M06; 68M99; 49M40

  28. Multi-Level Label Correction by Distilling Proximate Patterns for Semi-supervised Semantic Segmentation

    Authors: Hui Xiao, Yuting Hong, Li Dong, Diqun Yan, Jiayan Zhuang, Junjie Xiong, Dongtai Liang, Chengbin Peng

    Abstract: Semi-supervised semantic segmentation relieves the reliance on large-scale labeled data by leveraging unlabeled data. Recent semi-supervised semantic segmentation approaches mainly resort to pseudo-labeling methods to exploit unlabeled data. However, unreliable pseudo-labeling can undermine the semi-supervision processes. In this paper, we propose an algorithm called Multi-Level Label Correction (…

    Submitted 9 April, 2024; v1 submitted 2 April, 2024; originally announced April 2024.

    Comments: 12 pages, 8 figures. IEEE Transactions on Multimedia, 2024

  29. arXiv:2404.00126  [pdf, ps, other]

    quant-ph cs.CC

    Pseudoentanglement Ain't Cheap

    Authors: Sabee Grewal, Vishnu Iyer, William Kretschmer, Daniel Liang

    Abstract: We show that any pseudoentangled state ensemble with a gap of $t$ bits of entropy requires $Ω(t)$ non-Clifford gates to prepare. This bound is tight up to polylogarithmic factors if linear-time quantum-secure pseudorandom functions exist. Our result follows from a polynomial-time algorithm to estimate the entanglement entropy of a quantum state across any cut of qubits. When run on an $n$-qubit st…

    Submitted 11 April, 2024; v1 submitted 29 March, 2024; originally announced April 2024.

    Comments: 15 pages; v2: slight edits to concurrent work section

  30. arXiv:2403.15132  [pdf, other]

    cs.CV eess.IV

    Transfer CLIP for Generalizable Image Denoising

    Authors: Jun Cheng, Dong Liang, Shan Tan

    Abstract: Image denoising is a fundamental task in computer vision. While prevailing deep learning-based supervised and self-supervised methods have excelled in eliminating in-distribution noise, their susceptibility to out-of-distribution (OOD) noise remains a significant challenge. The recent emergence of contrastive language-image pre-training (CLIP) model has showcased exceptional capabilities in open-w…

    Submitted 22 March, 2024; originally announced March 2024.

    Comments: Accepted by CVPR2024

  31. arXiv:2403.14972  [pdf, other]

    cs.AI cs.CL cs.MA cs.MM

    A Picture Is Worth a Graph: Blueprint Debate on Graph for Multimodal Reasoning

    Authors: Changmeng Zheng, Dayong Liang, Wengyu Zhang, Xiao-Yong Wei, Tat-Seng Chua, Qing Li

    Abstract: This paper presents a pilot study aimed at introducing multi-agent debate into multimodal reasoning. The study addresses two key challenges: the trivialization of opinions resulting from excessive summarization and the diversion of focus caused by distractor concepts introduced from images. These challenges stem from the inductive (bottom-up) nature of existing debating schemes. To address the iss…

    Submitted 22 March, 2024; originally announced March 2024.

    Comments: Work in progress

  32. arXiv:2403.09493  [pdf, other]

    cs.CV

    Anomaly Detection by Adapting a pre-trained Vision Language Model

    Authors: Yuxuan Cai, Xinwei He, Dingkang Liang, Ao Tong, Xiang Bai

    Abstract: Recently, large vision and language models have shown their success when adapting them to many downstream tasks. In this paper, we present a unified framework named CLIP-ADA for Anomaly Detection by Adapting a pre-trained CLIP model. To this end, we make two important improvements: 1) To acquire unified anomaly detection across industrial images of multiple categories, we introduce the learnable p…

    Submitted 14 March, 2024; originally announced March 2024.

  33. arXiv:2403.06323  [pdf, other]

    cs.LG

    Risk-Sensitive RL with Optimized Certainty Equivalents via Reduction to Standard RL

    Authors: Kaiwen Wang, Dawen Liang, Nathan Kallus, Wen Sun

    Abstract: We study Risk-Sensitive Reinforcement Learning (RSRL) with the Optimized Certainty Equivalent (OCE) risk, which generalizes Conditional Value-at-risk (CVaR), entropic risk and Markowitz's mean-variance. Using an augmented Markov Decision Process (MDP), we propose two general meta-algorithms via reductions to standard RL: one based on optimistic algorithms and another based on policy optimization.…

    Submitted 10 March, 2024; originally announced March 2024.

  34. arXiv:2403.05385  [pdf, other]

    cs.LG

    Switching the Loss Reduces the Cost in Batch Reinforcement Learning

    Authors: Alex Ayoub, Kaiwen Wang, Vincent Liu, Samuel Robertson, James McInerney, Dawen Liang, Nathan Kallus, Csaba Szepesvári

    Abstract: We propose training fitted Q-iteration with log-loss (FQI-LOG) for batch reinforcement learning (RL). We show that the number of samples needed to learn a near-optimal policy with FQI-LOG scales with the accumulated cost of the optimal policy, which is zero in problems where acting optimally achieves the goal and incurs no cost. In doing so, we provide a general framework for proving…

    Submitted 12 March, 2024; v1 submitted 8 March, 2024; originally announced March 2024.

  35. arXiv:2403.02151  [pdf, other]

    cs.CV

    TripoSR: Fast 3D Object Reconstruction from a Single Image

    Authors: Dmitry Tochilkin, David Pankratz, Zexiang Liu, Zixuan Huang, Adam Letts, Yangguang Li, Ding Liang, Christian Laforte, Varun Jampani, Yan-Pei Cao

    Abstract: This technical report introduces TripoSR, a 3D reconstruction model leveraging transformer architecture for fast feed-forward 3D generation, producing 3D mesh from a single image in under 0.5 seconds. Building upon the LRM network architecture, TripoSR integrates substantial improvements in data processing, model design, and training techniques. Evaluations on public datasets show that TripoSR exh…

    Submitted 4 March, 2024; originally announced March 2024.

    Comments: Model: https://huggingface.co/stabilityai/TripoSR Code: https://github.com/VAST-AI-Research/TripoSR Demo: https://huggingface.co/spaces/stabilityai/TripoSR

  36. arXiv:2403.01439  [pdf, other]

    cs.CV

    Dynamic Adapter Meets Prompt Tuning: Parameter-Efficient Transfer Learning for Point Cloud Analysis

    Authors: Xin Zhou, Dingkang Liang, Wei Xu, Xingkui Zhu, Yihan Xu, Zhikang Zou, Xiang Bai

    Abstract: Point cloud analysis has achieved outstanding performance by transferring point cloud pre-trained models. However, existing methods for model adaptation usually update all model parameters, i.e., full fine-tuning paradigm, which is inefficient as it relies on high computational costs (e.g., training GPU memory) and massive storage space. In this paper, we aim to study parameter-efficient transfer…

    Submitted 5 April, 2024; v1 submitted 3 March, 2024; originally announced March 2024.

    Comments: Accepted to CVPR 2024. Code is available at https://github.com/LMD0311/DAPT

  37. arXiv:2402.17521  [pdf, other]

    cs.CV

    AVS-Net: Point Sampling with Adaptive Voxel Size for 3D Scene Understanding

    Authors: Hongcheng Yang, Dingkang Liang, Dingyuan Zhang, Zhe Liu, Zhikang Zou, Xingyu Jiang, Yingying Zhu

    Abstract: The recent advancements in point cloud learning have enabled intelligent vehicles and robots to comprehend 3D environments better. However, processing large-scale 3D scenes remains a challenging problem, such that efficient downsampling methods play a crucial role in point cloud learning. Existing downsampling methods either require a huge computational burden or sacrifice fine-grained geometric i…

    Submitted 15 April, 2024; v1 submitted 27 February, 2024; originally announced February 2024.

    Comments: 10 pages, 7 figures

  38. arXiv:2402.14598  [pdf, other]

    cs.NE cs.LG

    Brain-inspired Distributed Memorization Learning for Efficient Feature-free Unsupervised Domain Adaptation

    Authors: Jianming Lv, Depin Liang, Zequan Liang, Yaobin Zhang, Sijun Xia

    Abstract: Compared with gradient based artificial neural networks, biological neural networks usually show a more powerful generalization ability to quickly adapt to unknown environments without using any gradient back-propagation procedure. Inspired by the distributed memory mechanism of human brains, we propose a novel gradient-free Distributed Memorization Learning mechanism, namely DML, to support quick…

    Submitted 4 February, 2024; originally announced February 2024.

    Comments: 15 pages, 15 figures

  39. arXiv:2402.13188  [pdf, other]

    cs.CL

    Question Calibration and Multi-Hop Modeling for Temporal Question Answering

    Authors: Chao Xue, Di Liang, Pengfei Wang, Jing Zhang

    Abstract: Many models that leverage knowledge graphs (KGs) have recently demonstrated remarkable success in question answering (QA) tasks. In the real world, many facts contained in KGs are time-constrained thus temporal KGQA has received increasing attention. Despite the fruitful efforts of previous models in temporal KGQA, they still have several limitations. (I) They adopt pre-trained language models (PL…

    Submitted 20 February, 2024; originally announced February 2024.

    Comments: Accepted by AAAI 2024

  40. arXiv:2402.10739  [pdf, other]

    cs.CV

    PointMamba: A Simple State Space Model for Point Cloud Analysis

    Authors: Dingkang Liang, Xin Zhou, Wei Xu, Xingkui Zhu, Zhikang Zou, Xiaoqing Ye, Xiao Tan, Xiang Bai

    Abstract: Transformers have become one of the foundational architectures in point cloud analysis tasks due to their excellent global modeling ability. However, the attention mechanism has quadratic complexity, making the design of a linear complexity method with global modeling appealing. In this paper, we propose PointMamba, transferring the success of Mamba, a recent representative state space model (SSM)…

    Submitted 29 May, 2024; v1 submitted 16 February, 2024; originally announced February 2024.

    Comments: Update the architecture and performance. The code is available at https://github.com/LMD0311/PointMamba

  41. arXiv:2402.05954  [pdf, other]

    cs.LG

    EasyFS: an Efficient Model-free Feature Selection Framework via Elastic Transformation of Features

    Authors: Jianming Lv, Sijun Xia, Depin Liang, Wei Chen

    Abstract: Traditional model-free feature selection methods treat each feature independently while disregarding the interrelationships among features, which leads to relatively poor performance compared with the model-aware methods. To address this challenge, we propose an efficient model-free feature selection framework via elastic expansion and compression of the features, namely EasyFS, to achieve better…

    Submitted 4 February, 2024; originally announced February 2024.

  42. arXiv:2402.05740  [pdf, other]

    cs.IR

    CounterCLR: Counterfactual Contrastive Learning with Non-random Missing Data in Recommendation

    Authors: Jun Wang, Haoxuan Li, Chi Zhang, Dongxu Liang, Enyun Yu, Wenwu Ou, Wenjia Wang

    Abstract: Recommender systems are designed to learn user preferences from observed feedback and comprise many fundamental tasks, such as rating prediction and post-click conversion rate (pCVR) prediction. However, the observed feedback usually suffer from two issues: selection bias and data sparsity, where biased and insufficient feedback seriously degrade the performance of recommender systems in terms of…

    Submitted 8 February, 2024; originally announced February 2024.

    Comments: 2023 IEEE International Conference on Data Mining (ICDM)

  43. arXiv:2402.02332  [pdf, other]

    cs.LG

    Minusformer: Improving Time Series Forecasting by Progressively Learning Residuals

    Authors: Daojun Liang, Haixia Zhang, Dongfeng Yuan, Bingzheng Zhang, Minggao Zhang

    Abstract: In this paper, we find that ubiquitous time series (TS) forecasting models are prone to severe overfitting. To cope with this problem, we embrace a de-redundancy approach to progressively reinstate the intrinsic values of TS for future intervals. Specifically, we introduce a dual-stream and subtraction mechanism, which is a deep Boosting ensemble learning method. And the vanilla Transformer is ren…

    Submitted 17 June, 2024; v1 submitted 3 February, 2024; originally announced February 2024.

  44. arXiv:2401.17509  [pdf, other]

    cs.CV

    Anything in Any Scene: Photorealistic Video Object Insertion

    Authors: Chen Bai, Zeman Shao, Guoxiang Zhang, Di Liang, Jie Yang, Zhuorui Zhang, Yujian Guo, Chengzhang Zhong, Yiqiao Qiu, Zhendong Wang, Yichen Guan, Xiaoyin Zheng, Tao Wang, Cheng Lu

    Abstract: Realistic video simulation has shown significant potential across diverse applications, from virtual reality to film production. This is particularly true for scenarios where capturing videos in real-world settings is either impractical or expensive. Existing approaches in video simulation often fail to accurately model the lighting environment, represent the object geometry, or achieve high level…

    Submitted 30 January, 2024; originally announced January 2024.

  45. You Only Look Bottom-Up for Monocular 3D Object Detection

    Authors: Kaixin Xiong, Dingyuan Zhang, Dingkang Liang, Zhe Liu, Hongcheng Yang, Wondimu Dikubab, Jianwei Cheng, Xiang Bai

    Abstract: Monocular 3D Object Detection is an essential task for autonomous driving. Meanwhile, accurate 3D object detection from pure images is very challenging due to the loss of depth information. Most existing image-based methods infer objects' location in 3D space based on their 2D sizes on the image plane, which usually ignores the intrinsic position clues from images, leading to unsatisfactory perfor…

    Submitted 27 January, 2024; originally announced January 2024.

    Comments: Accepted by IEEE Robotics and Automation Letters (RA-L)

  46. arXiv:2401.13757  [pdf]

    physics.optics physics.app-ph

    Heterogeneously Integrated Laser on Silicon with Non-Volatile Wavelength Tuning

    Authors: Bassem Tossoun, Di Liang, Xia Sheng, John Paul Strachan, Raymond G. Beausoleil

    Abstract: The von-Neumann bottleneck has constrained computing systems from efficiently operating on the increasingly large demand in data from networks and devices. Silicon (Si) photonics offers a powerful solution for this issue by providing a platform for high-bandwidth, energy-efficient interconnects. Furthermore, memristors have emerged as a fundamental building block for non-volatile data storage and…

    Submitted 24 January, 2024; originally announced January 2024.

  47. arXiv:2401.10516  [pdf, other]

    cs.LG cs.AI

    Episodic Reinforcement Learning with Expanded State-reward Space

    Authors: Dayang Liang, Yaru Zhang, Yunlong Liu

    Abstract: Empowered by deep neural networks, deep reinforcement learning (DRL) has demonstrated tremendous empirical successes in various domains, including games, health care, and autonomous driving. Despite these advancements, DRL is still identified as data-inefficient as effective policies demand vast numbers of environmental samples. Recently, episodic control (EC)-based model-free DRL methods enable s…

    Submitted 19 January, 2024; originally announced January 2024.

    Comments: Accepted at AAMAS'24

  48. Dynamic instability analysis for bumblebee black holes: the odd parity

    Authors: Zhan-Feng Mai, Rui Xu, Dicong Liang, Lijing Shao

    Abstract: Spherical black-hole (BH) solutions have been found in the bumblebee gravity where a vector field nonminimally couples to the Ricci tensor. We study dynamic (in)stability associated with the gravitational and vector perturbations of odd parity against these bumblebee BHs. Under the plane-wave approximation, we find that bumblebee BHs do not suffer ghost instability, but gradient instability and ta…

    Submitted 8 April, 2024; v1 submitted 15 January, 2024; originally announced January 2024.

    Comments: Accepted by PRD, 13 pages, 4 figures, references added

    Journal ref: Phys. Rev. D 109 (2024) 084076

  49. Graphical Principal Component Analysis of Multivariate Functional Time Series

    Authors: Jianbin Tan, Decai Liang, Yongtao Guan, Hui Huang

    Abstract: In this paper, we consider multivariate functional time series with a two-way dependence structure: a serial dependence across time points and a graphical interaction among the multiple functions within each time point. We develop the notion of dynamic weak separability, a more general condition than those assumed in literature, and use it to characterize the two-way structure in multivariate func…

    Submitted 13 January, 2024; originally announced January 2024.

    Comments: Journal of the American Statistical Association (2024)

    Journal ref: Journal of the American Statistical Association (2024): 1-24

  50. arXiv:2401.05506  [pdf, ps, other]

    math.KT math.GR math.NT math.RA math.RT

    On the Coherency of Completed Group Algebra

    Authors: David Burns, Yu Kuang, Dingli Liang

    Abstract: We investigate coherency properties of certain completed integral group rings, precisely for compact $p$-adic Lie groups.

    Submitted 16 January, 2024; v1 submitted 10 January, 2024; originally announced January 2024.

    Comments: 16 pages. Submitted

    MSC Class: 16D10; 16E05; 20E18(primary); 16S34(secondary)