Skip to main content

Showing 1–50 of 1,305 results for author: He, H

  1. arXiv:2407.11935  [pdf, other

    cs.CV

    Learning Multi-view Anomaly Detection

    Authors: Haoyang He, Jiangning Zhang, Guanzhong Tian, Chengjie Wang, Lei Xie

    Abstract: This study explores the recently proposed challenging multi-view Anomaly Detection (AD) task. Single-view tasks would encounter blind spots from other perspectives, resulting in inaccuracies in sample-level prediction. Therefore, we introduce the \textbf{M}ulti-\textbf{V}iew \textbf{A}nomaly \textbf{D}etection (\textbf{MVAD}) framework, which learns and integrates features from multi-views. Specif… ▽ More

    Submitted 16 July, 2024; originally announced July 2024.

    Comments: 10 pages

  2. arXiv:2407.08725  [pdf, other

    cs.CV cs.AI cs.RO

    MetaUrban: A Simulation Platform for Embodied AI in Urban Spaces

    Authors: Wayne Wu, Honglin He, Yiran Wang, Chenda Duan, Jack He, Zhizheng Liu, Quanyi Li, Bolei Zhou

    Abstract: Public urban spaces like streetscapes and plazas serve residents and accommodate social life in all its vibrant variations. Recent advances in Robotics and Embodied AI make public urban spaces no longer exclusive to humans. Food delivery bots and electric wheelchairs have started sharing sidewalks with pedestrians, while diverse robot dogs and humanoids have recently emerged in the street. Ensurin… ▽ More

    Submitted 11 July, 2024; originally announced July 2024.

    Comments: Technical report. Project page: https://metadriverse.github.io/metaurban/

  3. arXiv:2407.08427  [pdf, other

    astro-ph.CO hep-th

    Constraining Holographic Dark Energy and Analyzing Cosmological Tensions

    Authors: Xin Tang, Yin-Zhe Ma, Wei-Ming Dai, Hong-Jian He

    Abstract: We investigate cosmological constraints on the holographic dark energy (HDE) using the state-of-the-art cosmological datasets: Planck CMB angular power spectra and weak lensing power spectra, Atacama Cosmology Telescope (ACT) temperature power spectra, baryon acoustic oscillation (BAO) and redshift-space distortion (RSD) measurements from six-degree-field galaxy survey and Sloan Digital Sky Survey… ▽ More

    Submitted 11 July, 2024; originally announced July 2024.

    Comments: 11 pages, 7 figures, 5 tables

    Journal ref: Physics of the Dark Universe 46 (2024) 101568

  4. arXiv:2407.08330  [pdf, other

    cs.LG

    HDT: Hierarchical Document Transformer

    Authors: Haoyu He, Markus Flicke, Jan Buchmann, Iryna Gurevych, Andreas Geiger

    Abstract: In this paper, we propose the Hierarchical Document Transformer (HDT), a novel sparse Transformer architecture tailored for structured hierarchical documents. Such documents are extremely important in numerous domains, including science, law or medicine. However, most existing solutions are inefficient and fail to make use of the structure inherent to documents. HDT exploits document structure by… ▽ More

    Submitted 11 July, 2024; originally announced July 2024.

  5. arXiv:2407.05420  [pdf, ps, other

    cs.IR

    Towards Bridging the Cross-modal Semantic Gap for Multi-modal Recommendation

    Authors: Xinglong Wu, Anfeng Huang, Hongwei Yang, Hui He, Yu Tai, Weizhe Zhang

    Abstract: Multi-modal recommendation greatly enhances the performance of recommender systems by modeling the auxiliary information from multi-modality contents. Most existing multi-modal recommendation models primarily exploit multimedia information propagation processes to enrich item representations and directly utilize modal-specific embedding vectors independently obtained from upstream pre-trained mode… ▽ More

    Submitted 7 July, 2024; originally announced July 2024.

  6. arXiv:2407.05361  [pdf, other

    eess.AS cs.CL

    Emilia: An Extensive, Multilingual, and Diverse Speech Dataset for Large-Scale Speech Generation

    Authors: Haorui He, Zengqiang Shang, Chaoren Wang, Xuyuan Li, Yicheng Gu, Hua Hua, Liwei Liu, Chen Yang, Jiaqi Li, Peiyang Shi, Yuancheng Wang, Kai Chen, Pengyuan Zhang, Zhizheng Wu

    Abstract: Recently, speech generation models have made significant progress by using large-scale training data. However, the research community struggle to produce highly spontaneous and human-like speech due to the lack of large-scale, diverse, and spontaneous speech data. This paper present Emilia, the first multilingual speech generation dataset from in-the-wild speech data, and Emilia-Pipe, the first op… ▽ More

    Submitted 12 July, 2024; v1 submitted 7 July, 2024; originally announced July 2024.

    Comments: Fix typos

  7. arXiv:2407.04549  [pdf, other

    cs.CL cs.AI

    Spontaneous Reward Hacking in Iterative Self-Refinement

    Authors: Jane Pan, He He, Samuel R. Bowman, Shi Feng

    Abstract: Language models are capable of iteratively improving their outputs based on natural language feedback, thus enabling in-context optimization of user preference. In place of human users, a second language model can be used as an evaluator, providing feedback along with numerical ratings which the generator attempts to optimize. However, because the evaluator is an imperfect proxy of user preference… ▽ More

    Submitted 5 July, 2024; originally announced July 2024.

  8. arXiv:2407.02395  [pdf, other

    cs.SE cs.CL

    Is Your AI-Generated Code Really Safe? Evaluating Large Language Models on Secure Code Generation with CodeSecEval

    Authors: Jiexin Wang, Xitong Luo, Liuwen Cao, Hongkui He, Hailin Huang, Jiayuan Xie, Adam Jatowt, Yi Cai

    Abstract: Large language models (LLMs) have brought significant advancements to code generation and code repair, benefiting both novice and experienced developers. However, their training using unsanitized data from open-source repositories, like GitHub, raises the risk of inadvertently propagating security vulnerabilities. Despite numerous studies investigating the safety of code LLMs, there remains a gap… ▽ More

    Submitted 4 July, 2024; v1 submitted 2 July, 2024; originally announced July 2024.

    Comments: arXiv admin note: text overlap with arXiv:2310.16263

  9. arXiv:2407.01716  [pdf, other

    astro-ph.GA

    PHANGS-MeerKAT and MHONGOOSE HI observations of nearby spiral galaxies: physical drivers of the molecular gas fraction, $R_{\mathrm{mol}}$

    Authors: Cosima Eibensteiner, Jiayi Sun, Frank Bigiel, Adam K. Leroy, Eva Schinnerer, Erik Rosolowsky, Sushma Kurapati, D. J. Pisano, W. J. G de Blok, Ashley T. Barnes, Mallory Thorp, Dario Colombo, Eric W. Koch, I-Da Chiang, Eve C. Ostriker, Eric J. Murphy, Nikki Zabel, Sebstian Laudage, Filippo M. Maccagni, Julia Healy, Srikrishna Sekhar, Dyas Utomo, Jakob den Brok, Yixian Cao, Mélanie Chevance , et al. (14 additional authors not shown)

    Abstract: The molecular-to-atomic gas ratio is crucial to the evolution of the interstellar medium in galaxies. We investigate the balance between the atomic ($Σ_{\rm HI}$) and molecular gas ($Σ_{\rm H2}$) surface densities in eight nearby star-forming galaxies using new high-quality observations from MeerKAT and ALMA (for HI and CO, respectively). We define the molecular gas ratio as… ▽ More

    Submitted 1 July, 2024; originally announced July 2024.

    Comments: accepted for publication in A&A; 20 pages, 12 Figures (+4 appendix pages)

  10. arXiv:2407.01067  [pdf, other

    cs.AI cs.CL cs.CV cs.HC cs.LG

    Human-like object concept representations emerge naturally in multimodal large language models

    Authors: Changde Du, Kaicheng Fu, Bincheng Wen, Yi Sun, Jie Peng, Wei Wei, Ying Gao, Shengpei Wang, Chuncheng Zhang, Jinpeng Li, Shuang Qiu, Le Chang, Huiguang He

    Abstract: The conceptualization and categorization of natural objects in the human mind have long intrigued cognitive scientists and neuroscientists, offering crucial insights into human perception and cognition. Recently, the rapid development of Large Language Models (LLMs) has raised the attractive question of whether these models can also develop human-like object representations through exposure to vas… ▽ More

    Submitted 1 July, 2024; originally announced July 2024.

  11. arXiv:2406.19287  [pdf, other

    astro-ph.HE

    Isotropy of cosmic rays beyond $10^{20}$ eV favors their heavy mass composition

    Authors: Telescope Array Collaboration, R. U. Abbasi, Y. Abe, T. Abu-Zayyad, M. Allen, Y. Arai, R. Arimura, E. Barcikowski, J. W. Belz, D. R. Bergman, S. A. Blake, I. Buckland, B. G. Cheon, M. Chikawa, T. Fujii, K. Fujisue, K. Fujita, R. Fujiwara, M. Fukushima, G. Furlich, N. Globus, R. Gonzalez, W. Hanlon, N. Hayashida, H. He , et al. (118 additional authors not shown)

    Abstract: We report an estimation of the injected mass composition of ultra-high energy cosmic rays (UHECRs) at energies higher than 10 EeV. The composition is inferred from an energy-dependent sky distribution of UHECR events observed by the Telescope Array surface detector by comparing it to the Large Scale Structure of the local Universe. In the case of negligible extra-galactic magnetic fields the resul… ▽ More

    Submitted 3 July, 2024; v1 submitted 27 June, 2024; originally announced June 2024.

    Comments: 8 pages, 3 figures, accepted for publication in PRL

  12. arXiv:2406.19286  [pdf, other

    astro-ph.HE

    Mass composition of ultra-high energy cosmic rays from distribution of their arrival directions with the Telescope Array

    Authors: Telescope Array Collaboration, R. U. Abbasi, Y. Abe, T. Abu-Zayyad, M. Allen, Y. Arai, R. Arimura, E. Barcikowski, J. W. Belz, D. R. Bergman, S. A. Blake, I. Buckland, B. G. Cheon, M. Chikawa, T. Fujii, K. Fujisue, K. Fujita, R. Fujiwara, M. Fukushima, G. Furlich, N. Globus, R. Gonzalez, W. Hanlon, N. Hayashida, H. He , et al. (118 additional authors not shown)

    Abstract: We use a new method to estimate the injected mass composition of ultrahigh cosmic rays (UHECRs) at energies higher than 10 EeV. The method is based on comparison of the energy-dependent distribution of cosmic ray arrival directions as measured by the Telescope Array experiment (TA) with that calculated in a given putative model of UHECR under the assumption that sources trace the large-scale struc… ▽ More

    Submitted 3 July, 2024; v1 submitted 27 June, 2024; originally announced June 2024.

    Comments: 18 pages, 11 figures, accepted for publication in PRD

  13. Self-assembly behaviour of diblock copolymer-diblock copolymer under oscillating shear field

    Authors: Y. Guo, H. He, X. Fu

    Abstract: The self-assembly behaviour of a diblock copolymer-diblock copolymer mixture under an oscillating shear field is investigated via cell dynamics simulation. The results indicate that the macrophase separation of the composite system is accompanied by the corresponding microphase separation induced by the oscillating shear field. With an increase in the shear frequency, the AB phase changes from a t… ▽ More

    Submitted 30 June, 2024; v1 submitted 26 June, 2024; originally announced June 2024.

    Comments: 13 pages, 7 figures

    Journal ref: Condensed Matter Physics, 2024, Vol. 27, No. 2, 23801

  14. arXiv:2406.16596  [pdf, other

    q-bio.MN q-bio.QM

    A compact model of Escherichia coli core and biosynthetic metabolism

    Authors: Marco Corrao, Hai He, Wolfram Liebermeister, Elad Noor

    Abstract: Metabolic models condense biochemical knowledge about organisms in a structured and standardised way. As large-scale network reconstructions are readily available for many organisms of interest, genome-scale models are being widely used among modellers and engineers. However, these large models can be difficult to analyse and visualise, and occasionally generate hard-to-interpret or even biologica… ▽ More

    Submitted 24 June, 2024; originally announced June 2024.

  15. arXiv:2406.16323  [pdf, other

    eess.SP

    Low-Complexity CSI Feedback for FDD Massive MIMO Systems via Learning to Optimize

    Authors: Yifan Ma, Hengtao He, Shenghui Song, Jun Zhang, Khaled B. Letaief

    Abstract: In frequency-division duplex (FDD) massive multiple-input multiple-output (MIMO) systems, the growing number of base station antennas leads to prohibitive feedback overhead for downlink channel state information (CSI). To address this challenge, state-of-the-art (SOTA) fully data-driven deep learning (DL)-based CSI feedback schemes have been proposed. However, the high computational complexity and… ▽ More

    Submitted 24 June, 2024; originally announced June 2024.

    Comments: submitted to IEEE for publication

  16. arXiv:2406.16087  [pdf, other

    cs.RO cs.AI cs.CV cs.LG

    Imperative Learning: A Self-supervised Neural-Symbolic Learning Framework for Robot Autonomy

    Authors: Chen Wang, Kaiyi Ji, Junyi Geng, Zhongqiang Ren, Taimeng Fu, Fan Yang, Yifan Guo, Haonan He, Xiangyu Chen, Zitong Zhan, Qiwei Du, Shaoshu Su, Bowen Li, Yuheng Qiu, Yi Du, Qihang Li, Yifan Yang, Xiao Lin, Zhipeng Zhao

    Abstract: Data-driven methods such as reinforcement and imitation learning have achieved remarkable success in robot autonomy. However, their data-centric nature still hinders them from generalizing well to ever-changing environments. Moreover, collecting large datasets for robotic tasks is often impractical and expensive. To overcome these challenges, we introduce a new self-supervised neural-symbolic (NeS… ▽ More

    Submitted 6 July, 2024; v1 submitted 23 June, 2024; originally announced June 2024.

  17. arXiv:2406.14136  [pdf, other

    cs.RO

    One Fling to Goal: Environment-aware Dynamics for Goal-conditioned Fabric Flinging

    Authors: Linhan Yang, Lei Yang, Haoran Sun, Zeqing Zhang, Haibin He, Fang Wan, Chaoyang Song, Jia Pan

    Abstract: Fabric manipulation dynamically is commonly seen in manufacturing and domestic settings. While dynamically manipulating a fabric piece to reach a target state is highly efficient, this task presents considerable challenges due to the varying properties of different fabrics, complex dynamics when interacting with environments, and meeting required goal conditions. To address these challenges, we pr… ▽ More

    Submitted 20 June, 2024; originally announced June 2024.

  18. arXiv:2406.14132  [pdf, other

    cs.AI

    Enhancing Monotonic Modeling with Spatio-Temporal Adaptive Awareness in Diverse Marketing

    Authors: Bin Li, Jiayan Pei, Feiyang Xiao, Yifan Zhao, Zhixing Zhang, Diwei Liu, HengXu He, Jia Jia

    Abstract: In the mobile internet era, the Online Food Ordering Service (OFOS) emerges as an integral component of inclusive finance owing to the convenience it brings to people. OFOS platforms offer dynamic allocation incentives to users and merchants through diverse marketing campaigns to encourage payments while maintaining the platforms' budget efficiency. Despite significant progress, the marketing doma… ▽ More

    Submitted 20 June, 2024; originally announced June 2024.

    Comments: 7 pages

  19. arXiv:2406.13671  [pdf, other

    hep-th cond-mat.str-el gr-qc

    Topological Equivalence Theorem and Double-Copy for Chern-Simons Scattering Amplitudes

    Authors: Yan-Feng Hang, Hong-Jian He, Cong Shen

    Abstract: We study the mechanism of topological mass-generation for 3d Chern-Simons gauge theories and propose a brand-new Topological Equivalence Theorem to connect scattering amplitudes of the physical gauge boson states to that of the transverse states under high energy expansion. We prove a general energy cancellation mechanism for $N$-point physical gauge boson amplitudes, which predicts large cancella… ▽ More

    Submitted 19 June, 2024; originally announced June 2024.

    Comments: 8+5 pages, to match Journal Version. This Letter paper is a companion of the systematic longer paper arXiv:2110.05399 (46pp)

    Journal ref: Research 6 (2023) 0072

  20. arXiv:2406.13583  [pdf, other

    cs.CV

    Low-Rank Mixture-of-Experts for Continual Medical Image Segmentation

    Authors: Qian Chen, Lei Zhu, Hangzhou He, Xinliang Zhang, Shuang Zeng, Qiushi Ren, Yanye Lu

    Abstract: The primary goal of continual learning (CL) task in medical image segmentation field is to solve the "catastrophic forgetting" problem, where the model totally forgets previously learned features when it is extended to new categories (class-level) or tasks (task-level). Due to the privacy protection, the historical data labels are inaccessible. Prevalent continual learning methods primarily focus… ▽ More

    Submitted 19 June, 2024; originally announced June 2024.

  21. arXiv:2406.13358  [pdf, other

    cs.CV eess.IV

    Multi-scale Restoration of Missing Data in Optical Time-series Images with Masked Spatial-Temporal Attention Network

    Authors: Zaiyan Zhang, Jining Yan, Yuanqi Liang, Jiaxin Feng, Haixu He, Wei Han

    Abstract: Due to factors such as thick cloud cover and sensor limitations, remote sensing images often suffer from significant missing data, resulting in incomplete time-series information. Existing methods for imputing missing values in remote sensing images do not fully exploit spatio-temporal auxiliary information, leading to limited accuracy in restoration. Therefore, this paper proposes a novel deep le… ▽ More

    Submitted 19 June, 2024; originally announced June 2024.

  22. arXiv:2406.12713  [pdf, other

    hep-th gr-qc hep-ph

    Structure of Massive Gauge/Gravity Scattering Amplitudes, Equivalence Theorems, and Extended Double-Copy with Compactified Warped Space

    Authors: Yanfeng Hang, Wei-Wei Zhao, Hong-Jian He, Yin-Long Qiu

    Abstract: We study the structure of scattering amplitudes of massive Kaluza-Klein (KK) states in the compactified 5-dimensional warped gauge and gravity theories. We present systematic formulations of the gauge theory equivalence theorem (GAET) and the gravitational equivalence theorem (GRET) for warped KK theories in $R_ξ^{}$ gauge, where the GAET connects the scattering amplitudes of longitudinal KK gauge… ▽ More

    Submitted 27 June, 2024; v1 submitted 18 June, 2024; originally announced June 2024.

    Comments: 91 pages. Improved version, references added

  23. arXiv:2406.12158  [pdf, other

    cs.CL cs.AI

    LLMs Are Prone to Fallacies in Causal Inference

    Authors: Nitish Joshi, Abulhair Saparov, Yixin Wang, He He

    Abstract: Recent work shows that causal facts can be effectively extracted from LLMs through prompting, facilitating the creation of causal graphs for causal inference tasks. However, it is unclear if this success is limited to explicitly-mentioned causal facts in the pretraining data which the model can memorize. Thus, this work investigates: Can LLMs infer causal relations from other relational data in te… ▽ More

    Submitted 17 June, 2024; originally announced June 2024.

  24. arXiv:2406.08698  [pdf, other

    astro-ph.HE hep-ph

    Constraints on Ultra Heavy Dark Matter Properties from Dwarf Spheroidal Galaxies with LHAASO Observations

    Authors: Zhen Cao, F. Aharonian, Q. An, Axikegu, Y. X. Bai, Y. W. Bao, D. Bastieri, X. J. Bi, Y. J. Bi, J. T. Cai, Q. Cao, W. Y. Cao, Zhe Cao, J. Chang, J. F. Chang, A. M. Chen, E. S. Chen, Liang Chen, Lin Chen, Long Chen, M. J. Chen, M. L. Chen, Q. H. Chen, S. H. Chen, S. Z. Chen , et al. (255 additional authors not shown)

    Abstract: In this work we try to search for signals generated by ultra-heavy dark matter at the Large High Altitude Air Shower Observatory (LHAASO) data. We look for possible gamma-ray by dark matter annihilation or decay from 16 dwarf spheroidal galaxies in the field of view of LHAASO. Dwarf spheroidal galaxies are among the most promising targets for indirect detection of dark matter which have low fluxes… ▽ More

    Submitted 12 June, 2024; originally announced June 2024.

    Comments: 17 pages, 12 figures, accepted by PRL

  25. arXiv:2406.08612  [pdf, other

    astro-ph.HE

    Observation of Declination Dependence in the Cosmic Ray Energy Spectrum

    Authors: The Telescope Array Collaboration, R. U. Abbasi, T. Abu-Zayyad, M. Allen, J. W. Belz, D. R. Bergman, I. Buckland, W. Campbell, B. G. Cheon, K. Endo, A. Fedynitch, T. Fujii, K. Fujisue, K. Fujita, M. Fukushima, G. Furlich, Z. Gerber, N. Globus, W. Hanlon, N. Hayashida, H. He, K. Hibino, R. Higuchi, D. Ikeda, T. Ishii , et al. (101 additional authors not shown)

    Abstract: We report on an observation of the difference between northern and southern skies of the ultrahigh energy cosmic ray energy spectrum with a significance of ${\sim}8σ$. We use measurements from the two largest experiments$\unicode{x2014}$the Telescope Array observing the northern hemisphere and the Pierre Auger Observatory viewing the southern hemisphere. Since the comparison of two measurements fr… ▽ More

    Submitted 12 June, 2024; originally announced June 2024.

    Comments: 8 pages, 6 figures

  26. arXiv:2406.07529  [pdf, other

    cs.LG

    MAP: Low-compute Model Merging with Amortized Pareto Fronts via Quadratic Approximation

    Authors: Lu Li, Tianyu Zhang, Zhiqi Bu, Suyuchen Wang, Huan He, Jie Fu, Yonghui Wu, Jiang Bian, Yong Chen, Yoshua Bengio

    Abstract: Model merging has emerged as an effective approach to combine multiple single-task models, fine-tuned from the same pre-trained model, into a multitask model. This process typically involves computing a weighted average of the model parameters without any additional training. Existing model-merging methods focus on enhancing average task accuracy. However, interference and conflicts between the ob… ▽ More

    Submitted 18 June, 2024; v1 submitted 11 June, 2024; originally announced June 2024.

  27. arXiv:2406.05452  [pdf, other

    eess.SP cs.IT

    Near-Field Channel Estimation for Extremely Large-Scale Terahertz Communications

    Authors: Songjie Yang, Yizhou Peng, Wanting Lyu, Ya Li, Hongjun He, Zhongpei Zhang, Chau Yuen

    Abstract: Future Terahertz communications exhibit significant potential in accommodating ultra-high-rate services. Employing extremely large-scale array antennas is a key approach to realize this potential, as they can harness substantial beamforming gains to overcome the severe path loss and leverage the electromagnetic advantages in the near field. This paper proposes novel estimation methods designed to… ▽ More

    Submitted 8 June, 2024; originally announced June 2024.

  28. arXiv:2406.03262  [pdf, other

    cs.CV

    ADer: A Comprehensive Benchmark for Multi-class Visual Anomaly Detection

    Authors: Jiangning Zhang, Haoyang He, Zhenye Gan, Qingdong He, Yuxuan Cai, Zhucun Xue, Yabiao Wang, Chengjie Wang, Lei Xie, Yong Liu

    Abstract: Visual anomaly detection aims to identify anomalous regions in images through unsupervised learning paradigms, with increasing application demand and value in fields such as industrial inspection and medical lesion detection. Despite significant progress in recent years, there is a lack of comprehensive benchmarks to adequately evaluate the performance of various mainstream methods across differen… ▽ More

    Submitted 6 June, 2024; v1 submitted 5 June, 2024; originally announced June 2024.

  29. arXiv:2406.03081  [pdf, other

    quant-ph

    A Quantum Neural Network-Based Approach to Power Quality Disturbances Detection and Recognition

    Authors: Guo-Dong Li, Hai-Yan He, Yue Li, Xin-Hao Li, Hao Liu, Qing-Le Wang, Long Cheng

    Abstract: Power quality disturbances (PQDs) significantly impact the stability and reliability of power systems, necessitating accurate and efficient detection and recognition methods. While numerous classical algorithms for PQDs detection and recognition have been extensively studied and applied, related work in the quantum domain is still in its infancy. In this paper, an improved quantum neural networks… ▽ More

    Submitted 5 June, 2024; originally announced June 2024.

  30. arXiv:2406.02578  [pdf, other

    cs.LG

    Pretrained Mobility Transformer: A Foundation Model for Human Mobility

    Authors: Xinhua Wu, Haoyu He, Yanchao Wang, Qi Wang

    Abstract: Ubiquitous mobile devices are generating vast amounts of location-based service data that reveal how individuals navigate and utilize urban spaces in detail. In this study, we utilize these extensive, unlabeled sequences of user trajectories to develop a foundation model for understanding urban space and human mobility. We introduce the \textbf{P}retrained \textbf{M}obility \textbf{T}ransformer (P… ▽ More

    Submitted 28 May, 2024; originally announced June 2024.

  31. arXiv:2406.02213  [pdf, other

    cs.LG

    Rectifying Reinforcement Learning for Reward Matching

    Authors: Haoran He, Emmanuel Bengio, Qingpeng Cai, Ling Pan

    Abstract: The Generative Flow Network (GFlowNet) is a probabilistic framework in which an agent learns a stochastic policy and flow functions to sample objects with probability proportional to an unnormalized reward function. GFlowNets share a strong resemblance to reinforcement learning (RL), that typically aims to maximize reward, due to their sequential decision-making processes. Recent works have studie… ▽ More

    Submitted 4 June, 2024; originally announced June 2024.

  32. arXiv:2406.01150  [pdf, other

    cs.LG

    Looking Backward: Retrospective Backward Synthesis for Goal-Conditioned GFlowNets

    Authors: Haoran He, Can Chang, Huazhe Xu, Ling Pan

    Abstract: Generative Flow Networks (GFlowNets) are amortized sampling methods for learning a stochastic policy to sequentially generate compositional objects with probabilities proportional to their rewards. GFlowNets exhibit a remarkable ability to generate diverse sets of high-reward objects, in contrast to standard return maximization reinforcement learning approaches, which often converge to a single op… ▽ More

    Submitted 3 June, 2024; originally announced June 2024.

  33. arXiv:2406.00050  [pdf, other

    cs.CL cs.AI

    An Empirical Analysis on Large Language Models in Debate Evaluation

    Authors: Xinyi Liu, Pinxin Liu, Hangfeng He

    Abstract: In this study, we investigate the capabilities and inherent biases of advanced large language models (LLMs) such as GPT-3.5 and GPT-4 in the context of debate evaluation. We discover that LLM's performance exceeds humans and surpasses the performance of state-of-the-art methods fine-tuned on extensive datasets in debate evaluation. We additionally explore and analyze biases present in LLMs, includ… ▽ More

    Submitted 4 June, 2024; v1 submitted 28 May, 2024; originally announced June 2024.

    Comments: Accepted to ACL 2024 main

  34. arXiv:2405.20763  [pdf, other

    cs.LG math.OC stat.ML

    Improving Generalization and Convergence by Enhancing Implicit Regularization

    Authors: Mingze Wang, Haotian He, Jinbo Wang, Zilin Wang, Guanhua Huang, Feiyu Xiong, Zhiyu Li, Weinan E, Lei Wu

    Abstract: In this work, we propose an Implicit Regularization Enhancement (IRE) framework to accelerate the discovery of flat solutions in deep learning, thereby improving generalization and convergence. Specifically, IRE decouples the dynamics of flat and sharp directions, which boosts the sharpness reduction along flat directions while maintaining the training stability in sharp directions. We show that I… ▽ More

    Submitted 31 May, 2024; originally announced May 2024.

    Comments: 35 pages

  35. arXiv:2405.20600  [pdf, other

    cs.AI

    Multi-label Class Incremental Emotion Decoding with Augmented Emotional Semantics Learning

    Authors: Kaicheng Fu, Changde Du, Xiaoyu Chen, Jie Peng, Huiguang He

    Abstract: Emotion decoding plays an important role in affective human-computer interaction. However, previous studies ignored the dynamic real-world scenario, where human experience a blend of multiple emotions which are incrementally integrated into the model, leading to the multi-label class incremental learning (MLCIL) problem. Existing methods have difficulty in solving MLCIL issue due to notorious cata… ▽ More

    Submitted 30 May, 2024; originally announced May 2024.

  36. arXiv:2405.19678  [pdf, other

    cs.CV cs.AI

    View-Consistent Hierarchical 3D SegmentationUsing Ultrametric Feature Fields

    Authors: Haodi He, Colton Stearns, Adam W. Harley, Leonidas J. Guibas

    Abstract: Large-scale vision foundation models such as Segment Anything (SAM) demonstrate impressive performance in zero-shot image segmentation at multiple levels of granularity. However, these zero-shot predictions are rarely 3D-consistent. As the camera viewpoint changes in a scene, so do the segmentation predictions, as well as the characterizations of ``coarse" or ``fine" granularity. In this work, we… ▽ More

    Submitted 30 May, 2024; originally announced May 2024.

  37. arXiv:2405.19586  [pdf, other

    cs.CV cs.LG cs.RO

    SAM-E: Leveraging Visual Foundation Model with Sequence Imitation for Embodied Manipulation

    Authors: Junjie Zhang, Chenjia Bai, Haoran He, Wenke Xia, Zhigang Wang, Bin Zhao, Xiu Li, Xuelong Li

    Abstract: Acquiring a multi-task imitation policy in 3D manipulation poses challenges in terms of scene understanding and action prediction. Current methods employ both 3D representation and multi-view 2D representation to predict the poses of the robot's end-effector. However, they still require a considerable amount of high-quality robot trajectories, and suffer from limited generalization in unseen tasks… ▽ More

    Submitted 29 May, 2024; originally announced May 2024.

    Comments: ICML 2024. Project page: https://sam-embodied.github.io

  38. arXiv:2405.18726  [pdf, other

    cs.SD cs.CV cs.MM eess.AS

    Reverse the auditory processing pathway: Coarse-to-fine audio reconstruction from fMRI

    Authors: Che Liu, Changde Du, Xiaoyu Chen, Huiguang He

    Abstract: Drawing inspiration from the hierarchical processing of the human auditory system, which transforms sound from low-level acoustic features to high-level semantic understanding, we introduce a novel coarse-to-fine audio reconstruction method. Leveraging non-invasive functional Magnetic Resonance Imaging (fMRI) data, our approach mimics the inverse pathway of auditory processing. Initially, we utili… ▽ More

    Submitted 28 May, 2024; originally announced May 2024.

  39. arXiv:2405.18399  [pdf, ps, other

    math.NA

    A simple, randomized algorithm for diagonalizing normal matrices

    Authors: Haoze He, Daniel Kressner

    Abstract: We present and analyze a simple numerical method that diagonalizes a complex normal matrix A by diagonalizing the Hermitian matrix obtained from a random linear combination of the Hermitian and skew-Hermitian parts of A.

    Submitted 28 May, 2024; originally announced May 2024.

    MSC Class: 65F15; 15B57; 15A18

  40. arXiv:2405.17976  [pdf

    cs.AI cs.CL

    Yuan 2.0-M32: Mixture of Experts with Attention Router

    Authors: Shaohua Wu, Jiangang Luo, Xi Chen, Lingjun Li, Xudong Zhao, Tong Yu, Chao Wang, Yue Wang, Fei Wang, Weixu Qiao, Houbo He, Zeru Zhang, Zeyu Sun, Junxiong Mao, Chong Shen

    Abstract: Yuan 2.0-M32, with a similar base architecture as Yuan-2.0 2B, uses a mixture-of-experts architecture with 32 experts of which 2 experts are active. A new router network, Attention Router, is proposed and adopted for a more efficient selection of experts, which improves the accuracy compared to the model with classical router network. Yuan 2.0-M32 is trained with 2000B tokens from scratch, and the… ▽ More

    Submitted 29 May, 2024; v1 submitted 28 May, 2024; originally announced May 2024.

    Comments: 14 pages,3 figures, 7 tables

  41. arXiv:2405.17414  [pdf, other

    cs.CV cs.GR

    Collaborative Video Diffusion: Consistent Multi-video Generation with Camera Control

    Authors: Zhengfei Kuang, Shengqu Cai, Hao He, Yinghao Xu, Hongsheng Li, Leonidas Guibas, Gordon Wetzstein

    Abstract: Research on video generation has recently made tremendous progress, enabling high-quality videos to be generated from text prompts or images. Adding control to the video generation process is an important goal moving forward and recent approaches that condition video generation models on camera trajectories make strides towards it. Yet, it remains challenging to generate a video of the same scene… ▽ More

    Submitted 27 May, 2024; originally announced May 2024.

  42. arXiv:2405.16730  [pdf, other

    cs.LG cs.AI stat.AP

    Latent Energy-Based Odyssey: Black-Box Optimization via Expanded Exploration in the Energy-Based Latent Space

    Authors: Peiyu Yu, Dinghuai Zhang, Hengzhi He, Xiaojian Ma, Ruiyao Miao, Yifan Lu, Yasi Zhang, Deqian Kong, Ruiqi Gao, Jianwen Xie, Guang Cheng, Ying Nian Wu

    Abstract: Offline Black-Box Optimization (BBO) aims at optimizing a black-box function using the knowledge from a pre-collected offline dataset of function values and corresponding input designs. However, the high-dimensional and highly-multimodal input design space of black-box function pose inherent challenges for most existing methods that model and operate directly upon input designs. These issues inclu… ▽ More

    Submitted 26 May, 2024; originally announced May 2024.

  43. arXiv:2405.16289  [pdf

    physics.optics

    Intensity adaptive optics

    Authors: Zimo Zhao, Yifei Ma, Jacopo Antonello, Zipei Song, Jiahe Cui, Binguo Chen, Jingyu Wang, Bangshan Sun, Honghui He, Lin Luo, Julian A. J. Fells, Steve J. Elston, Martin J. Booth, Stephen M. Morris, Chao He

    Abstract: Adaptive optics (AO) is a powerful tool used in a wide range of research areas spanning from aerospace to microscopy. To date, AO has largely been applied to optical phase aberration correction, with recent advances extending to include the vectorial properties of light. However, intensity errors widely exist in optical systems, yet their associated correction methods are still very much in their… ▽ More

    Submitted 25 May, 2024; originally announced May 2024.

  44. arXiv:2405.15525  [pdf, other

    cs.CL

    Sparse Matrix in Large Language Model Fine-tuning

    Authors: Haoze He, Juncheng Billy Li, Xuan Jiang, Heather Miller

    Abstract: LoRA and its variants have become popular parameter-efficient fine-tuning (PEFT) methods due to their ability to avoid excessive computational costs. However, an accuracy gap often exists between PEFT methods and full fine-tuning (FT), and this gap has yet to be systematically studied. In this work, we introduce a method for selecting sparse sub-matrices that aim to minimize the performance gap be… ▽ More

    Submitted 29 May, 2024; v1 submitted 24 May, 2024; originally announced May 2024.

    Comments: 14 pages

  45. arXiv:2405.15214  [pdf, other

    cs.CV

    PointRWKV: Efficient RWKV-Like Model for Hierarchical Point Cloud Learning

    Authors: Qingdong He, Jiangning Zhang, Jinlong Peng, Haoyang He, Yabiao Wang, Chengjie Wang

    Abstract: Transformers have revolutionized the point cloud learning task, but the quadratic complexity hinders its extension to long sequence and makes a burden on limited computational resources. The recent advent of RWKV, a fresh breed of deep sequence models, has shown immense potential for sequence modeling in NLP tasks. In this paper, we present PointRWKV, a model of linear complexity derived from the… ▽ More

    Submitted 24 May, 2024; originally announced May 2024.

  46. arXiv:2405.14018  [pdf, other

    cs.CR cs.LG stat.AP

    Watermarking Generative Tabular Data

    Authors: Hengzhi He, Peiyu Yu, Junpeng Ren, Ying Nian Wu, Guang Cheng

    Abstract: In this paper, we introduce a simple yet effective tabular data watermarking mechanism with statistical guarantees. We show theoretically that the proposed watermark can be effectively detected, while faithfully preserving the data fidelity, and also demonstrates appealing robustness against additive noise attack. The general idea is to achieve the watermarking through a strategic embedding based… ▽ More

    Submitted 22 May, 2024; originally announced May 2024.

  47. arXiv:2405.11826  [pdf, other

    astro-ph.IM hep-ex physics.ins-det

    Data quality control system and long-term performance monitor of the LHAASO-KM2A

    Authors: Zhen Cao, F. Aharonian, Axikegu, Y. X. Bai, Y. W. Bao, D. Bastieri, X. J. Bi, Y. J. Bi, W. Bian, A. V. Bukevich, Q. Cao, W. Y. Cao, Zhe Cao, J. Chang, J. F. Chang, A. M. Chen, E. S. Chen, H. X. Chen, Liang Chen, Lin Chen, Long Chen, M. J. Chen, M. L. Chen, Q. H. Chen, S. Chen , et al. (263 additional authors not shown)

    Abstract: The KM2A is the largest sub-array of the Large High Altitude Air Shower Observatory (LHAASO). It consists of 5216 electromagnetic particle detectors (EDs) and 1188 muon detectors (MDs). The data recorded by the EDs and MDs are used to reconstruct primary information of cosmic ray and gamma-ray showers. This information is used for physical analysis in gamma-ray astronomy and cosmic ray physics. To… ▽ More

    Submitted 13 June, 2024; v1 submitted 20 May, 2024; originally announced May 2024.

    Comments: 15 pages, 9 figures

  48. arXiv:2405.11739  [pdf

    cs.LG cs.AI cs.CY

    Contactless Polysomnography: What Radio Waves Tell Us about Sleep

    Authors: Hao He, Chao Li, Wolfgang Ganglberger, Kaileigh Gallagher, Rumen Hristov, Michail Ouroutzoglou, Haoqi Sun, Jimeng Sun, Brandon Westover, Dina Katabi

    Abstract: The ability to assess sleep at home, capture sleep stages, and detect the occurrence of apnea (without on-body sensors) simply by analyzing the radio waves bouncing off people's bodies while they sleep is quite powerful. Such a capability would allow for longitudinal data collection in patients' homes, informing our understanding of sleep and its interaction with various diseases and their therape… ▽ More

    Submitted 19 May, 2024; originally announced May 2024.

    Comments: The first two authors contributed equally to this work

  49. arXiv:2405.11389  [pdf, other

    cs.LG

    Adjacent Leader Decentralized Stochastic Gradient Descent

    Authors: Haoze He, Jing Wang, Anna Choromanska

    Abstract: This work focuses on the decentralized deep learning optimization framework. We propose Adjacent Leader Decentralized Gradient Descent (AL-DSGD), for improving final model performance, accelerating convergence, and reducing the communication overhead of decentralized deep learning optimizers. AL-DSGD relies on two main ideas. Firstly, to increase the influence of the strongest learners on the lear… ▽ More

    Submitted 18 May, 2024; originally announced May 2024.

    Comments: 9 pages of main paper, and 12 pages of appendix

  50. arXiv:2405.11021  [pdf, other

    cs.CV

    Enhanced 3D Urban Scene Reconstruction and Point Cloud Densification using Gaussian Splatting and Google Earth Imagery

    Authors: Kyle Gao, Dening Lu, Hongjie He, Linlin Xu, Jonathan Li

    Abstract: 3D urban scene reconstruction and modelling is a crucial research area in remote sensing with numerous applications in academia, commerce, industry, and administration. Recent advancements in view synthesis models have facilitated photorealistic 3D reconstruction solely from 2D images. Leveraging Google Earth imagery, we construct a 3D Gaussian Splatting model of the Waterloo region centered on th… ▽ More

    Submitted 1 June, 2024; v1 submitted 17 May, 2024; originally announced May 2024.

    ACM Class: I.4; I.3