Showing 1–50 of 485 results for author: He, R

  1. arXiv:2407.08601  [pdf, ps, other]

    cond-mat.str-el cond-mat.mtrl-sci cond-mat.supr-con physics.comp-ph

    DFT+DMFT study of correlated electronic structure in the monolayer-trilayer phase of La$_3$Ni$_2$O$_7$

    Authors: Zhenfeng Ouyang, Rong-Qiang He, Zhong-Yi Lu

    Abstract: By performing DFT+DMFT calculations, we systematically investigate the correlated electronic structure in the newly discovered monolayer-trilayer (ML-TL) phase of La$_3$Ni$_2$O$_7$ (1313-La327). Our calculated Fermi surfaces are in good agreement with the angle-resolved photoemission spectroscopy (ARPES) results. We find that 1313-La327 is a multiorbital correlated metal. An orbital-selective Mott…

    Submitted 11 July, 2024; originally announced July 2024.

    Comments: 7 pages, 3 figures, 3 tables

  2. arXiv:2407.07707  [pdf, other]

    cs.IT math.OC stat.ML

    Group Projected Subspace Pursuit for Block Sparse Signal Reconstruction: Convergence Analysis and Applications

    Authors: Roy Y. He, Haixia Liu, Hao Liu

    Abstract: In this paper, we present a convergence analysis of the Group Projected Subspace Pursuit (GPSP) algorithm proposed by He et al. [HKL+23] (Group Projected subspace pursuit for IDENTification of variable coefficient differential equations (GP-IDENT), Journal of Computational Physics, 494, 112526) and extend its application to general tasks of block sparse signal recovery. We prove that when the samp…

    Submitted 13 July, 2024; v1 submitted 1 June, 2024; originally announced July 2024.

    Comments: 35 pages

  3. arXiv:2407.02794  [pdf, other]

    cs.CV

    Euler's Elastica Based Cartoon-Smooth-Texture Image Decomposition

    Authors: Roy Y. He, Hao Liu

    Abstract: We propose a novel model for decomposing grayscale images into three distinct components: the structural part, representing sharp boundaries and regions with strong light-to-dark transitions; the smooth part, capturing soft shadows and shades; and the oscillatory part, characterizing textures and noise. To capture the homogeneous structures, we introduce a combination of $L^0$-gradient and curvatu…

    Submitted 2 July, 2024; originally announced July 2024.

    MSC Class: 68U10; 94A08; 65D18

  4. arXiv:2407.02345  [pdf, other]

    cs.CL

    MORPHEUS: Modeling Role from Personalized Dialogue History by Exploring and Utilizing Latent Space

    Authors: Yihong Tang, Bo Wang, Dongming Zhao, Xiaojia Jin, Jijun Zhang, Ruifang He, Yuexian Hou

    Abstract: Personalized Dialogue Generation (PDG) aims to create coherent responses according to roles or personas. Traditional PDG relies on external role data, which can be scarce and raise privacy concerns. Approaches address these issues by extracting role information from dialogue history, which often fail to generically model roles in continuous space. To overcome these limitations, we introduce a nove…

    Submitted 2 July, 2024; originally announced July 2024.

  5. arXiv:2407.00330  [pdf]

    cond-mat.mtrl-sci

    A compositional ordering-driven morphotropic phase boundary in ferroelectric solid solutions

    Authors: Yubai Shi, Yifan Shan, Hongyu Wu, Zhicheng Zhong, Ri He, Run-Wei Li

    Abstract: Ferroelectric solid solutions usually exhibit giant dielectric response and high piezoelectricity in the vicinity of the morphotropic phase boundary (MPB), where the structural phase transitions between the rhombohedral and the tetragonal phases as a result of the composition or strain variation. Here, we propose a compositional ordering-driven MPB in the specified compositional solid solutions. B…

    Submitted 29 June, 2024; originally announced July 2024.

  6. arXiv:2406.17248  [pdf, other]

    quant-ph

    MindSpore Quantum: A User-Friendly, High-Performance, and AI-Compatible Quantum Computing Framework

    Authors: Xusheng Xu, Jiangyu Cui, Zidong Cui, Runhong He, Qingyu Li, Xiaowei Li, Yanling Lin, Jiale Liu, Wuxin Liu, Jiale Lu, Maolin Luo, Chufan Lyu, Shijie Pan, Mosharev Pavel, Runqiu Shu, Jialiang Tang, Ruoqian Xu, Shu Xu, Kang Yang, Fan Yu, Qingguo Zeng, Haiying Zhao, Qiang Zheng, Junyuan Zhou, Xu Zhou, et al. (14 additional authors not shown)

    Abstract: We introduce MindSpore Quantum, a pioneering hybrid quantum-classical framework with a primary focus on the design and implementation of noisy intermediate-scale quantum (NISQ) algorithms. Leveraging the robust support of MindSpore, an advanced open-source deep learning training/inference framework, MindSpore Quantum exhibits exceptional efficiency in the design and training of variational quantum…

    Submitted 10 July, 2024; v1 submitted 24 June, 2024; originally announced June 2024.

  7. arXiv:2406.14635  [pdf, other]

    cs.AI cs.LG

    Harvesting Efficient On-Demand Order Pooling from Skilled Couriers: Enhancing Graph Representation Learning for Refining Real-time Many-to-One Assignments

    Authors: Yile Liang, Jiuxia Zhao, Donghui Li, Jie Feng, Chen Zhang, Xuetao Ding, Jinghua Hao, Renqing He

    Abstract: The recent past has witnessed a notable surge in on-demand food delivery (OFD) services, offering delivery fulfillment within dozens of minutes after an order is placed. In OFD, pooling multiple orders for simultaneous delivery in real-time order assignment is a pivotal efficiency source, which may in turn extend delivery time. Constructing high-quality order pooling to harmonize platform efficien…

    Submitted 20 June, 2024; originally announced June 2024.

    Comments: Accepted in KDD 2024 ADS Track

  8. arXiv:2406.12754  [pdf, other]

    cs.CL cs.AI

    Chumor 1.0: A Truly Funny and Challenging Chinese Humor Understanding Dataset from Ruo Zhi Ba

    Authors: Ruiqi He, Yushu He, Longju Bai, Jiarui Liu, Zhenjie Sun, Zenghao Tang, He Wang, Hanchen Xia, Naihao Deng

    Abstract: Existing humor datasets and evaluations predominantly focus on English, lacking resources for culturally nuanced humor in non-English languages like Chinese. To address this gap, we construct Chumor, a dataset sourced from Ruo Zhi Ba (RZB), a Chinese Reddit-like platform dedicated to sharing intellectually challenging and culturally specific jokes. We annotate explanations for each joke and evalua…

    Submitted 18 June, 2024; originally announced June 2024.

  9. arXiv:2406.12207  [pdf, other]

    cond-mat.str-el cond-mat.other

    The Green's function Monte Carlo combined with projected entangled pair state approach to the frustrated $J_1$-$J_2$ Heisenberg model

    Authors: He-Yu Lin, Yibin Guo, Rong-Qiang He, Z. Y. Xie, Zhong-Yi Lu

    Abstract: The tensor network algorithm, a family of prevalent numerical methods for quantum many-body problems, aptly captures the entanglement properties intrinsic to quantum systems, enabling precise representation of quantum states. However, its computational cost is notably high, particularly in calculating physical observables like correlation functions. To surmount the computational challenge and enha…

    Submitted 17 June, 2024; originally announced June 2024.

    Comments: 11 pages, 15 figures

    Journal ref: Phys. Rev. B 109, 235133 (2024)

  10. arXiv:2406.09025  [pdf, other]

    eess.SP

    Site-Specific Radio Channel Representation -- Current State and Future Applications

    Authors: Thomas Zemen, Jorge Gomez-Ponce, Aniruddha Chandra, Michael Walter, Enes Aksoy, Ruisi He, David Matolak, Minseok Kim, Jun-ichi Takada, Sana Salous, Reinaldo Valenzuela, Andreas F. Molisch

    Abstract: A site-specific radio channel representation considers the surroundings of the communication system through the environment geometry, such as buildings, vegetation, and mobile objects including their material and surface properties. In this article, we focus on communication technologies for 5G and beyond that are increasingly able to exploit the specific environment geometry for both communicatio…

    Submitted 18 June, 2024; v1 submitted 13 June, 2024; originally announced June 2024.

    Comments: 7 pages, 5 figures, submitted to the IEEE Communications Magazine

  11. arXiv:2406.08855  [pdf, other]

    cs.RO

    Trajectory Planning for Autonomous Driving in Unstructured Scenarios Based on Graph Neural Network and Numerical Optimization

    Authors: Sumin Zhang, Kuo Li, Rui He, Zhiwei Meng, Yupeng Chang, Xiaosong Jin, Ri Bai

    Abstract: In unstructured environments, obstacles are diverse and lack lane markings, making trajectory planning for intelligent vehicles a challenging task. Traditional trajectory planning methods typically involve multiple stages, including path planning, speed planning, and trajectory optimization. These methods require the manual design of numerous parameters for each stage, resulting in significant wor…

    Submitted 13 June, 2024; originally announced June 2024.

  12. arXiv:2406.00908  [pdf, other]

    cs.CV

    ZeroSmooth: Training-free Diffuser Adaptation for High Frame Rate Video Generation

    Authors: Shaoshu Yang, Yong Zhang, Xiaodong Cun, Ying Shan, Ran He

    Abstract: Video generation has made remarkable progress in recent years, especially since the advent of the video diffusion models. Many video generation models can produce plausible synthetic videos, e.g., Stable Video Diffusion (SVD). However, most video models can only generate low frame rate videos due to the limited GPU memory as well as the difficulty of modeling a large set of frames. The training vi…

    Submitted 2 June, 2024; originally announced June 2024.

  13. arXiv:2405.20044  [pdf, other]

    cs.CV

    A Point-Neighborhood Learning Framework for Nasal Endoscope Image Segmentation

    Authors: Pengyu Jie, Wanquan Liu, Chenqiang Gao, Yihui Wen, Rui He, Pengcheng Li, Jintao Zhang, Deyu Meng

    Abstract: The lesion segmentation on endoscopic images is challenging due to its complex and ambiguous features. Fully-supervised deep learning segmentation methods can receive good performance based on entirely pixel-level labeled dataset but greatly increase experts' labeling burden. Semi-supervised and weakly supervised methods can ease labeling burden, but heavily strengthen the learning difficulty. To…

    Submitted 30 May, 2024; originally announced May 2024.

    Comments: 10 pages, 10 figures

  14. arXiv:2405.17815  [pdf, other]

    cs.CV

    Visual Anchors Are Strong Information Aggregators For Multimodal Large Language Model

    Authors: Haogeng Liu, Quanzeng You, Xiaotian Han, Yongfei Liu, Huaibo Huang, Ran He, Hongxia Yang

    Abstract: In the realm of Multimodal Large Language Models (MLLMs), vision-language connector plays a crucial role to link the pre-trained vision encoders with Large Language Models (LLMs). Despite its importance, the vision-language connector has been relatively less explored. In this study, we aim to propose a strong vision-language connector that enables MLLMs to achieve high accuracy while maintain low…

    Submitted 28 May, 2024; originally announced May 2024.

  15. arXiv:2405.16240  [pdf, other]

    cs.LG

    Analytic Federated Learning

    Authors: Huiping Zhuang, Run He, Kai Tong, Di Fang, Han Sun, Haoran Li, Tianyi Chen, Ziqian Zeng

    Abstract: In this paper, we introduce analytic federated learning (AFL), a new training paradigm that brings analytical (i.e., closed-form) solutions to the federated learning (FL) community. Our AFL draws inspiration from analytic learning -- a gradient-free technique that trains neural networks with analytical solutions in one epoch. In the local client training stage, the AFL facilitates a one-epoch trai…

    Submitted 25 May, 2024; originally announced May 2024.

  16. arXiv:2405.16093  [pdf, other]

    cs.CV

    Diverse Teacher-Students for Deep Safe Semi-Supervised Learning under Class Mismatch

    Authors: Qikai Wang, Rundong He, Yongshun Gong, Chunxiao Ren, Haoliang Sun, Xiaoshui Huang, Yilong Yin

    Abstract: Semi-supervised learning can significantly boost model performance by leveraging unlabeled data, particularly when labeled data is scarce. However, real-world unlabeled data often contain unseen-class samples, which can hinder the classification of seen classes. To address this issue, mainstream safe SSL methods suggest detecting and discarding unseen-class samples from unlabeled data. Nevertheles…

    Submitted 25 May, 2024; originally announced May 2024.

  17. arXiv:2405.13949  [pdf, other]

    cs.CV

    PitVQA: Image-grounded Text Embedding LLM for Visual Question Answering in Pituitary Surgery

    Authors: Runlong He, Mengya Xu, Adrito Das, Danyal Z. Khan, Sophia Bano, Hani J. Marcus, Danail Stoyanov, Matthew J. Clarkson, Mobarakol Islam

    Abstract: Visual Question Answering (VQA) within the surgical domain, utilizing Large Language Models (LLMs), offers a distinct opportunity to improve intra-operative decision-making and facilitate intuitive surgeon-AI interaction. However, the development of LLMs for surgical VQA is hindered by the scarcity of diverse and extensive datasets with complex reasoning tasks. Moreover, contextual fusion of the i…

    Submitted 22 May, 2024; originally announced May 2024.

    Comments: 10 pages, 3 figures

  18. arXiv:2405.13337  [pdf, other]

    cs.CV

    Semantic Equitable Clustering: A Simple, Fast and Effective Strategy for Vision Transformer

    Authors: Qihang Fan, Huaibo Huang, Mingrui Chen, Ran He

    Abstract: The Vision Transformer (ViT) has gained prominence for its superior relational modeling prowess. However, its global attention mechanism's quadratic complexity poses substantial computational burdens. A common remedy spatially groups tokens for self-attention, reducing computational requirements. Nonetheless, this strategy neglects semantic information in tokens, possibly scattering semantically-l…

    Submitted 22 May, 2024; originally announced May 2024.

  19. arXiv:2405.13335  [pdf, other]

    cs.CV

    Vision Transformer with Sparse Scan Prior

    Authors: Qihang Fan, Huaibo Huang, Mingrui Chen, Ran He

    Abstract: In recent years, Transformers have achieved remarkable progress in computer vision tasks. However, their global modeling often comes with substantial computational overhead, in stark contrast to the human eye's efficient information processing. Inspired by the human eye's sparse scanning mechanism, we propose a Sparse Scan Self-Attention mechanism (…

    Submitted 22 May, 2024; originally announced May 2024.

  20. arXiv:2405.07830  [pdf, other]

    eess.SP

    Joint Precoding for RIS-Assisted Wideband THz Cell-Free Massive MIMO Systems

    Authors: Xin Su, Ruisi He, Peng Zhang, Bo Ai

    Abstract: Terahertz (THz) cell-free massive multiple-input-multiple-output (mMIMO) networks have been envisioned as a prospective technology for achieving higher system capacity, improved performance, and ultra-high reliability in 6G networks. However, due to severe attenuation and limited scattering in THz transmission, as well as high power consumption for increased number of access points (APs), further…

    Submitted 13 May, 2024; originally announced May 2024.

  21. arXiv:2405.07508  [pdf, other]

    cs.SE

    Revealing the value of Repository Centrality in lifespan prediction of Open Source Software Projects

    Authors: Runzhi He, Hengzhi Ye, Minghui Zhou

    Abstract: Background: Open Source Software is the building block of modern software. However, the prevalence of project deprecation in the open source world weakens the integrity of the downstream systems and the broad ecosystem. Therefore it calls for efforts in monitoring and predicting project deprecations, empowering stakeholders to take proactive measures. Challenge: Existing techniques mainly focus on…

    Submitted 13 May, 2024; originally announced May 2024.

  22. arXiv:2405.07303  [pdf, other]

    hep-ex hep-ph physics.ins-det

    Search for solar axions by Primakoff effect with the full dataset of the CDEX-1B Experiment

    Authors: L. T. Yang, S. K. Liu, Q. Yue, K. J. Kang, Y. J. Li, H. P. An, Greeshma C., J. P. Chang, Y. H. Chen, J. P. Cheng, W. H. Dai, Z. Deng, C. H. Fang, X. P. Geng, H. Gong, Q. J. Guo, T. Guo, X. Y. Guo, L. He, J. R. He, J. W. Hu, H. X. Huang, T. C. Huang, L. Jiang, S. Karmakar, et al. (61 additional authors not shown)

    Abstract: We present the first limit on $g_{Aγ}$ coupling constant using the Bragg-Primakoff conversion based on an exposure of 1107.5 kg days of data from the CDEX-1B experiment at the China Jinping Underground Laboratory. The data are consistent with the null signal hypothesis, and no excess signals are observed. Limits of the coupling $g_{Aγ}<2.08\times10^{-9}$ GeV$^{-1}$ (95\% C.L.) are derived for axio…

    Submitted 12 May, 2024; originally announced May 2024.

    Comments: 7 pages, 5 figures

  23. arXiv:2405.06141  [pdf]

    cond-mat.mtrl-sci cond-mat.other

    Recycling failed photoelectrons via tertiary photoemission

    Authors: M. Matzelle, Wei-Chi Chiu, Caiyun Hong, Barun Ghosh, Pengxu Ran, R. S. Markiewicz, B. Barbiellini, Changxi Zheng, Sheng Li, Rui-Hua He, Arun Bansil

    Abstract: A key insight of Einstein's theory of the photoelectric effect is that a minimum energy is required for photoexcited electrons to escape from a material. For the past century it has been assumed that photoexcited electrons of lower energies make no contribution to the photoemission spectrum. Here we demonstrate the conceptual possibility that the energy of these 'failed' photoelectrons-primary or…

    Submitted 9 May, 2024; originally announced May 2024.

    Comments: 45 Pages, 14 Figures

  24. arXiv:2405.05353  [pdf, other]

    eess.SY

    Eco-driving Accounting for Interactive Cut-in Vehicles

    Authors: Chaozhe R. He, Nan Li

    Abstract: Automated vehicles can gather information about surrounding traffic and plan safe and energy-efficient driving behavior, which is known as eco-driving. Conventional eco-driving designs only consider preceding vehicles in the same lane as the ego vehicle. In heavy traffic, however, vehicles in adjacent lanes may cut into the ego vehicle's lane, influencing the ego vehicle's eco-driving behavior and…

    Submitted 8 May, 2024; originally announced May 2024.

    Comments: Accepted at 2024 IEEE International Conference on Mobility: Operations, Services, and Technologies (MOST)

  25. arXiv:2404.18778  [pdf, ps, other]

    math.PR

    On Approximating the Potts Model with Contracting Glauber Dynamics

    Authors: Roxanne He, Jackie Lok

    Abstract: We show that the Potts model on a graph can be approximated by a sequence of independent and identically distributed spins in terms of Wasserstein distance at high temperatures. We prove a similar result for the Curie-Weiss-Potts model on the complete graph, conditioned on being close enough to any of its equilibrium macrostates, in the low-temperature regime. Our proof technique is based on Stein…

    Submitted 1 May, 2024; v1 submitted 29 April, 2024; originally announced April 2024.

    Comments: 34 pages, changes to formatting

    MSC Class: 60J10; 60K35; 68Q87

  26. arXiv:2404.15684  [pdf, other]

    cs.NI

    Generative Diffusion Model (GDM) for Optimization of Wi-Fi Networks

    Authors: Tie Liu, Xuming Fang, Rong He

    Abstract: Generative Diffusion Models (GDMs) have made significant strides in modeling complex data distributions across diverse domains. Meanwhile, Deep Reinforcement Learning (DRL) has demonstrated substantial improvements in optimizing Wi-Fi network performance. Wi-Fi optimization problems are highly challenging to model mathematically, and DRL methods can bypass complex mathematical modeling, while GDM…

    Submitted 24 April, 2024; originally announced April 2024.

    Comments: This paper has been submitted to GlobeCom 2024 and is currently under review

  27. arXiv:2404.10451  [pdf, other]

    cond-mat.mtrl-sci

    Ultrahigh Stability of O-Sublattice in $β$-Ga$_2$O$_3$

    Authors: Ru He, Junlei Zhao, Jesper Byggmästar, Huan He, Flyura Djurabekova

    Abstract: Recently reported remarkably high radiation tolerance of $γ$/$β$-Ga$_2$O$_3$ double-polymorphic structure brings this ultrawide bandgap semiconductor to the frontiers of power electronics applications that are able to operate in challenging environments. Understanding the mechanism of radiation tolerance is crucial for further material modification and tailoring of the desired properties. In this…

    Submitted 18 April, 2024; v1 submitted 16 April, 2024; originally announced April 2024.

  28. arXiv:2404.09793  [pdf, other]

    hep-ex hep-ph physics.ins-det

    First Search for Light Fermionic Dark Matter Absorption on Electrons Using Germanium Detector in CDEX-10 Experiment

    Authors: J. X. Liu, L. T. Yang, Q. Yue, K. J. Kang, Y. J. Li, H. P. An, Greeshma C., J. P. Chang, Y. H. Chen, J. P. Cheng, W. H. Dai, Z. Deng, C. H. Fang, X. P. Geng, H. Gong, Q. J. Guo, T. Guo, X. Y. Guo, L. He, J. R. He, J. W. Hu, H. X. Huang, T. C. Huang, L. Jiang, S. Karmakar, et al. (61 additional authors not shown)

    Abstract: We present the first results of the search for sub-MeV fermionic dark matter absorbed by electron targets of Germanium using the 205.4~kg$\cdot$day data collected by the CDEX-10 experiment, with the analysis threshold of 160~eVee. No significant dark matter (DM) signals over the background are observed. Results are presented as limits on the cross section of DM--electron interaction. We present ne…

    Submitted 15 April, 2024; originally announced April 2024.

    Comments: 6 pages, 4 figures

  29. arXiv:2404.06022   

    cs.CV cs.AI cs.MM

    Band-Attention Modulated RetNet for Face Forgery Detection

    Authors: Zhida Zhang, Jie Cao, Wenkui Yang, Qihang Fan, Kai Zhou, Ran He

    Abstract: The transformer networks are extensively utilized in face forgery detection due to their scalability across large datasets. Despite their success, transformers face challenges in balancing the capture of global context, which is crucial for unveiling forgery clues, with computational complexity. To mitigate this issue, we introduce Band-Attention modulated RetNet (BAR-Net), a lightweight network des…

    Submitted 1 July, 2024; v1 submitted 9 April, 2024; originally announced April 2024.

    Comments: The essay is poorly expressed in writing and will be re-optimised

  30. arXiv:2404.04565  [pdf, other]

    cs.CV

    SportsHHI: A Dataset for Human-Human Interaction Detection in Sports Videos

    Authors: Tao Wu, Runyu He, Gangshan Wu, Limin Wang

    Abstract: Video-based visual relation detection tasks, such as video scene graph generation, play important roles in fine-grained video understanding. However, current video visual relation detection datasets have two main limitations that hinder the progress of research in this area. First, they do not explore complex human-human interactions in multi-person scenarios. Second, the relation types of existin…

    Submitted 6 April, 2024; originally announced April 2024.

    Comments: Accepted by CVPR 2024

  31. arXiv:2404.00323  [pdf, other]

    cs.CV cs.LG

    CLIP-driven Outliers Synthesis for few-shot OOD detection

    Authors: Hao Sun, Rundong He, Zhongyi Han, Zhicong Lin, Yongshun Gong, Yilong Yin

    Abstract: Few-shot OOD detection focuses on recognizing out-of-distribution (OOD) images that belong to classes unseen during training, with the use of only a small number of labeled in-distribution (ID) images. Up to now, a mainstream strategy is based on large-scale vision-language models, such as CLIP. However, these methods overlook a crucial issue: the lack of reliable OOD supervision information, whic…

    Submitted 30 March, 2024; originally announced April 2024.

    Comments: 9 pages, 5 figures

  32. arXiv:2403.18361  [pdf, other]

    cs.CV

    ViTAR: Vision Transformer with Any Resolution

    Authors: Qihang Fan, Quanzeng You, Xiaotian Han, Yongfei Liu, Yunzhe Tao, Huaibo Huang, Ran He, Hongxia Yang

    Abstract: This paper tackles a significant challenge faced by Vision Transformers (ViTs): their constrained scalability across different image resolutions. Typically, ViTs experience a performance decline when processing resolutions different from those seen during training. Our work introduces two key innovations to address this issue. Firstly, we propose a novel module for dynamic resolution adjustment, d…

    Submitted 28 March, 2024; v1 submitted 27 March, 2024; originally announced March 2024.

  33. arXiv:2403.17765  [pdf, other]

    cs.CV

    MUTE-SLAM: Real-Time Neural SLAM with Multiple Tri-Plane Hash Representations

    Authors: Yifan Yan, Ruomin He, Zhenghua Liu

    Abstract: We introduce MUTE-SLAM, a real-time neural RGB-D SLAM system employing multiple tri-plane hash-encodings for efficient scene representation. MUTE-SLAM effectively tracks camera positions and incrementally builds a scalable multi-map representation for both small and large indoor environments. As previous methods often require pre-defined scene boundaries, MUTE-SLAM dynamically allocates sub-maps f…

    Submitted 7 July, 2024; v1 submitted 26 March, 2024; originally announced March 2024.

  34. arXiv:2403.17503  [pdf, other]

    cs.LG cs.CV

    DS-AL: A Dual-Stream Analytic Learning for Exemplar-Free Class-Incremental Learning

    Authors: Huiping Zhuang, Run He, Kai Tong, Ziqian Zeng, Cen Chen, Zhiping Lin

    Abstract: Class-incremental learning (CIL) under an exemplar-free constraint has presented a significant challenge. Existing methods adhering to this constraint are prone to catastrophic forgetting, far more so than replay-based techniques that retain access to past samples. In this paper, to solve the exemplar-free CIL problem, we propose a Dual-Stream Analytic Learning (DS-AL) approach. The DS-AL contains…

    Submitted 26 March, 2024; originally announced March 2024.

    Comments: Accepted in AAAI 2024

  35. arXiv:2403.15751  [pdf, other]

    cs.CV

    AOCIL: Exemplar-free Analytic Online Class Incremental Learning with Low Time and Resource Consumption

    Authors: Huiping Zhuang, Yuchen Liu, Run He, Kai Tong, Ziqian Zeng, Cen Chen, Yi Wang, Lap-Pui Chau

    Abstract: Online Class Incremental Learning (OCIL) aims to train the model in a task-by-task manner, where data arrive in mini-batches at a time while previous data are not accessible. A significant challenge is known as Catastrophic Forgetting, i.e., loss of the previous knowledge on old data. To address this, replay-based methods show competitive results but invade data privacy, while exemplar-free method…

    Submitted 23 March, 2024; originally announced March 2024.

  36. arXiv:2403.15706  [pdf, other]

    cs.LG cs.CV

    G-ACIL: Analytic Learning for Exemplar-Free Generalized Class Incremental Learning

    Authors: Huiping Zhuang, Yizhu Chen, Di Fang, Run He, Kai Tong, Hongxin Wei, Ziqian Zeng, Cen Chen

    Abstract: Class incremental learning (CIL) trains a network on sequential tasks with separated categories but suffers from catastrophic forgetting, where models quickly lose previously learned knowledge when acquiring new tasks. The generalized CIL (GCIL) aims to address the CIL problem in a more real-world scenario, where incoming data have mixed data categories and unknown sample size distribution, leadin…

    Submitted 13 April, 2024; v1 submitted 22 March, 2024; originally announced March 2024.

  37. Exciton-activated effective phonon magnetic moment in monolayer MoS2

    Authors: Chunli Tang, Gaihua Ye, Cynthia Nnokwe, Mengqi Fang, Li Xiang, Masoud Mahjouri-Samani, Dmitry Smirnov, Eui-Hyeok Yang, Tingting Wang, Lifa Zhang, Rui He, Wencan Jin

    Abstract: Optical excitation of chiral phonons plays a vital role in studying the phonon-driven magnetic phenomena in solids. Transition metal dichalcogenides host chiral phonons at high symmetry points of the Brillouin zone, providing an ideal platform to explore the interplay between chiral phonons and valley degree of freedom. Here, we investigate the helicity-resolved magneto-Raman response of monolayer…

    Submitted 7 April, 2024; v1 submitted 22 March, 2024; originally announced March 2024.

    Journal ref: Phys. Rev. B 109, 155426 (2024)

  38. arXiv:2403.13804  [pdf, other]

    cs.CV cs.CL cs.LG

    Learning from Models and Data for Visual Grounding

    Authors: Ruozhen He, Paola Cascante-Bonilla, Ziyan Yang, Alexander C. Berg, Vicente Ordonez

    Abstract: We introduce SynGround, a novel framework that combines data-driven learning and knowledge transfer from various large-scale pretrained models to enhance the visual grounding capabilities of a pretrained vision-and-language model. The knowledge transfer from the models initiates the generation of image descriptions through an image description generator. These descriptions serve dual purposes: the…

    Submitted 20 March, 2024; originally announced March 2024.

    Comments: Project Page: https://catherine-r-he.github.io/SynGround/

  39. arXiv:2403.13522  [pdf, other]

    cs.LG cs.CV

    REAL: Representation Enhanced Analytic Learning for Exemplar-free Class-incremental Learning

    Authors: Run He, Huiping Zhuang, Di Fang, Yizhu Chen, Kai Tong, Cen Chen

    Abstract: Exemplar-free class-incremental learning (EFCIL) aims to mitigate catastrophic forgetting in class-incremental learning without available historical data. Compared with its counterpart (replay-based CIL) that stores historical samples, the EFCIL suffers more from forgetting issues under the exemplar-free constraint. In this paper, inspired by the recently developed analytic learning (AL) based CIL…

    Submitted 20 March, 2024; originally announced March 2024.

  40. arXiv:2403.10098  [pdf, other]

    cs.CV

    DiffMAC: Diffusion Manifold Hallucination Correction for High Generalization Blind Face Restoration

    Authors: Nan Gao, Jia Li, Huaibo Huang, Zhi Zeng, Ke Shang, Shuwu Zhang, Ran He

    Abstract: Blind face restoration (BFR) is a highly challenging problem due to the uncertainty of degradation patterns. Current methods have low generalization across photorealistic and heterogeneous domains. In this paper, we propose a Diffusion-Information-Diffusion (DID) framework to tackle diffusion manifold hallucination correction (DiffMAC), which achieves high-generalization face restoration in divers…

    Submitted 15 March, 2024; originally announced March 2024.

    Comments: 15 pages, 12 figures

  41. arXiv:2403.05924  [pdf, other]

    cs.CV

    CSCNET: Class-Specified Cascaded Network for Compositional Zero-Shot Learning

    Authors: Yanyi Zhang, Qi Jia, Xin Fan, Yu Liu, Ran He

    Abstract: Attribute and object (A-O) disentanglement is a fundamental and critical problem for Compositional Zero-shot Learning (CZSL), whose aim is to recognize novel A-O compositions based on foregone knowledge. Existing methods based on disentangled representation learning lose sight of the contextual dependency between the A-O primitive pairs. Inspired by this, we propose a novel A-O disentangled framew…

    Submitted 13 March, 2024; v1 submitted 9 March, 2024; originally announced March 2024.

    Comments: ICASSP 2024

  42. arXiv:2403.03015  [pdf, other]

    cs.IT eess.SP

    Low Complexity Channel Estimation for RIS-Assisted THz Systems with Beam Split

    Authors: Xin Su, Ruisi He, Peng Zhang, Bo Ai

    Abstract: To support extremely high data rates, reconfigurable intelligent surface (RIS)-assisted terahertz (THz) communication is considered to be a promising technology for future sixth-generation networks. However, due to the typical employment of hybrid beamforming architecture in THz systems, as well as the passive nature of RIS which lacks the capability to process pilot signals, obtaining channel sta…

    Submitted 5 March, 2024; originally announced March 2024.

  43. arXiv:2403.01487  [pdf, other]

    cs.CV

    InfiMM-HD: A Leap Forward in High-Resolution Multimodal Understanding

    Authors: Haogeng Liu, Quanzeng You, Xiaotian Han, Yiqi Wang, Bohan Zhai, Yongfei Liu, Yunzhe Tao, Huaibo Huang, Ran He, Hongxia Yang

    Abstract: Multimodal Large Language Models (MLLMs) have experienced significant advancements recently. Nevertheless, challenges persist in the accurate recognition and comprehension of intricate details within high-resolution images. Despite being indispensable for the development of robust MLLMs, this area remains underinvestigated. To tackle this challenge, our work introduces InfiMM-HD, a novel architect…

    Submitted 3 March, 2024; originally announced March 2024.

  44. arXiv:2403.00605  [pdf, other]

    eess.SP

    Channel Measurements and Modeling for Dynamic Vehicular ISAC Scenarios at 28 GHz

    Authors: Zhengyu Zhang, Ruisi He, Bo Ai, Mi Yang, Xuejian Zhang, Ziyi Qi, Yuan Yuan

    Abstract: Integrated sensing and communication (ISAC) is a promising technology for 6G, with the goal of providing end-to-end information processing and inherent perception capabilities for future communication systems. Within ISAC emerging application scenarios, vehicular ISAC technologies have the potential to enhance traffic efficiency and safety through integration of communication and synchronized perc…

    Submitted 1 March, 2024; originally announced March 2024.

  45. arXiv:2403.00569  [pdf, other]

    eess.SP

    Characterization of Wireless Channel Semantics: A New Paradigm

    Authors: Zhengyu Zhang, Ruisi He, Mi Yang, Xuejian Zhang, Ziyi Qi, Yuan Yuan, Bo Ai

    Abstract: Recently, deep learning enabled semantic communications have been developed to understand transmission content from semantic level, which realize effective and accurate information transfer. Aiming to the vision of sixth generation (6G) networks, wireless devices are expected to have native perception and intelligent capabilities, which associate wireless channel with surrounding environments from…

    Submitted 1 March, 2024; originally announced March 2024.

  46. arXiv:2403.00557  [pdf, other]

    eess.SP

    Non-stationarity Characteristics in Dynamic Vehicular ISAC Channels at 28 GHz

    Authors: Zhengyu Zhang, Ruisi He, Mi Yang, Xuejian Zhang, Ziyi Qi, Hang Mi, Guiqi Sun, Jingya Yang, Bo Ai

    Abstract: Integrated sensing and communications (ISAC) is a potential technology of 6G, aiming to enable end-to-end information processing ability and native perception capability for future communication systems. As an important part of the ISAC application scenarios, ISAC aided vehicle-to-everything (V2X) can improve the traffic efficiency and safety through intercommunication and synchronous perception.…

    Submitted 1 March, 2024; originally announced March 2024.

  47. arXiv:2403.00505  [pdf, other]

    eess.SP

    A Cluster-Based Statistical Channel Model for Integrated Sensing and Communication Channels

    Authors: Zhengyu Zhang, Ruisi He, Bo Ai, Mi Yang, Yong Niu, Zhangdui Zhong, Yujian Li, Xuejian Zhang, Jing Li

    Abstract: The emerging 6G network envisions integrated sensing and communication (ISAC) as a promising solution to meet growing demand for native perception ability. To optimize and evaluate ISAC systems and techniques, it is crucial to have an accurate and realistic wireless channel model. However, some important features of ISAC channels have not been well characterized, for example, most existing ISAC ch…

    Submitted 1 March, 2024; originally announced March 2024.

  48. arXiv:2402.15080  [pdf, other]

    cs.CL

    Infusing Hierarchical Guidance into Prompt Tuning: A Parameter-Efficient Framework for Multi-level Implicit Discourse Relation Recognition

    Authors: Haodong Zhao, Ruifang He, Mengnan Xiao, Jing Xu

    Abstract: Multi-level implicit discourse relation recognition (MIDRR) aims at identifying hierarchical discourse relations among arguments. Previous methods achieve the promotion through fine-tuning PLMs. However, due to the data scarcity and the task gap, the pre-trained feature space cannot be accurately tuned to the task-specific space, which even aggravates the collapse of the vanilla space. Besides, th…

    Submitted 22 February, 2024; originally announced February 2024.

    Comments: accepted to ACL 2023

  49. arXiv:2402.14600  [pdf, other]

    cs.AI

    Diffusion Model-Based Multiobjective Optimization for Gasoline Blending Scheduling

    Authors: Wenxuan Fang, Wei Du, Renchu He, Yang Tang, Yaochu Jin, Gary G. Yen

    Abstract: Gasoline blending scheduling uses resource allocation and operation sequencing to meet a refinery's production requirements. The presence of nonlinearity, integer constraints, and a large number of decision variables adds complexity to this problem, posing challenges for traditional and evolutionary algorithms. This paper introduces a novel multiobjective optimization approach driven by a diffusio…

    Submitted 4 February, 2024; originally announced February 2024.

  50. arXiv:2402.14577  [pdf, other]

    cs.CV

    Debiasing Text-to-Image Diffusion Models

    Authors: Ruifei He, Chuhui Xue, Haoru Tan, Wenqing Zhang, Yingchen Yu, Song Bai, Xiaojuan Qi

    Abstract: Learning-based Text-to-Image (TTI) models like Stable Diffusion have revolutionized the way visual content is generated in various domains. However, recent research has shown that nonnegligible social bias exists in current state-of-the-art TTI systems, which raises important concerns. In this work, we target resolving the social bias in TTI diffusion models. We begin by formalizing the problem se…

    Submitted 22 February, 2024; originally announced February 2024.