Showing 1–50 of 9,603 results for author: Zhang, H

  1. arXiv:2407.11921  [pdf, other]

    cs.CV cs.CR

    IPA-NeRF: Illusory Poisoning Attack Against Neural Radiance Fields

    Authors: Wenxiang Jiang, Hanwei Zhang, Shuo Zhao, Zhongwen Guo, Hao Wang

    Abstract: Neural Radiance Field (NeRF) represents a significant advancement in computer vision, offering implicit neural network-based scene representation and novel view synthesis capabilities. Its applications span diverse fields including robotics, urban mapping, autonomous navigation, virtual reality/augmented reality, etc., some of which are considered high-risk AI applications. However, despite its wi…

    Submitted 16 July, 2024; originally announced July 2024.

  2. arXiv:2407.11895  [pdf, other]

    cs.CV

    OmniBind: Large-scale Omni Multimodal Representation via Binding Spaces

    Authors: Zehan Wang, Ziang Zhang, Hang Zhang, Luping Liu, Rongjie Huang, Xize Cheng, Hengshuang Zhao, Zhou Zhao

    Abstract: Recently, human-computer interaction with various modalities has shown promising applications, like GPT-4o and Gemini. Given the foundational role of multimodal joint representation in understanding and generation pipelines, high-quality omni joint representations would be a step toward co-processing more diverse multimodal information. In this work, we present OmniBind, large-scale multimodal joi…

    Submitted 16 July, 2024; originally announced July 2024.

    Comments: Homepage is http://omnibind.github.io

  3. A cryogenic on-chip microwave pulse generator for large-scale superconducting quantum computing

    Authors: Zenghui Bao, Yan Li, Zhiling Wang, Jiahui Wang, Jize Yang, Haonan Xiong, Yipu Song, Yukai Wu, Hongyi Zhang, Luming Duan

    Abstract: For superconducting quantum processors, microwave signals are delivered to each qubit from room-temperature electronics to the cryogenic environment through coaxial cables. Limited by the heat load of cabling and the massive cost of electronics, such an architecture is not viable for millions of qubits required for fault-tolerant quantum computing. Monolithic integration of the control electronics…

    Submitted 16 July, 2024; originally announced July 2024.

    Comments: 12 pages, 4 figures

    Journal ref: Nat Commun 15, 5958 (2024)

  4. arXiv:2407.11736  [pdf, other]

    cs.RO cs.CV

    GV-Bench: Benchmarking Local Feature Matching for Geometric Verification of Long-term Loop Closure Detection

    Authors: Jingwen Yu, Hanjing Ye, Jianhao Jiao, Ping Tan, Hong Zhang

    Abstract: Visual loop closure detection is an important module in visual simultaneous localization and mapping (SLAM), which associates current camera observation with previously visited places. Loop closures correct drifts in trajectory estimation to build a globally consistent map. However, a false loop closure can be fatal, so verification is required as an additional step to ensure robustness by rejecti…

    Submitted 16 July, 2024; originally announced July 2024.

    Comments: 9 pages, 11 figures, Accepted by IROS(2024)

  5. arXiv:2407.11734  [pdf, other]

    q-bio.QM cs.LG q-bio.GN

    Generating Multi-Modal and Multi-Attribute Single-Cell Counts with CFGen

    Authors: Alessandro Palma, Till Richter, Hanyi Zhang, Manuel Lubetzki, Alexander Tong, Andrea Dittadi, Fabian Theis

    Abstract: Generative modeling of single-cell RNA-seq data has shown invaluable potential in community-driven tasks such as trajectory inference, batch effect removal and gene expression generation. However, most recent deep models generating synthetic single cells from noise operate on pre-processed continuous gene expression approximations, ignoring the inherently discrete and over-dispersed nature of sing…

    Submitted 16 July, 2024; originally announced July 2024.

    Comments: 28 pages, 12 figures

  6. arXiv:2407.11727  [pdf, ps, other]

    hep-ex hep-ph

    Measurement of the branching fraction of $D^+_s\to \ell^+ν_\ell$ via $e^+e^-\to D^{*+}_{s} D^{*-}_{s}$

    Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere , et al. (634 additional authors not shown)

    Abstract: Based on $10.64~\mathrm{fb}^{-1}$ of $e^+e^-$ collision data taken at center-of-mass energies between 4.237 and 4.699 GeV with the BESIII detector, we study the leptonic $D^+_s$ decays using the $e^+e^-\to D^{*+}_{s} D^{*-}_{s}$ process. The branching fractions of $D_s^+\to\ell^+ν_{\ell}\,(\ell=μ,τ)$ are measured to be $\mathcal{B}(D_s^+\toμ^+ν_μ)=(\bfmuv)\%$ and…

    Submitted 16 July, 2024; originally announced July 2024.

    Comments: 27 pages, 13 figures

  7. arXiv:2407.11684  [pdf, ps, other]

    math.DS

    $α$-SGHN: A Robust Model for Learning Particle Interactions in Lattice Systems

    Authors: Yixian Gao, Ru Geng, Panayotis Kevrekidis, Hong-Kun Zhang, Jian Zu

    Abstract: We propose an $α$-separable graph Hamiltonian network ($α$-SGHN) that reveals complex interaction patterns between particles in lattice systems. Utilizing trajectory data, $α$-SGHN infers potential interactions without prior knowledge about particle coupling, overcoming the limitations of traditional graph neural networks that require predefined links. Furthermore, $α$-SGHN preserves all conservat…

    Submitted 16 July, 2024; originally announced July 2024.

    Comments: 17 pages

  8. arXiv:2407.11682  [pdf, other]

    cs.CV

    MapDistill: Boosting Efficient Camera-based HD Map Construction via Camera-LiDAR Fusion Model Distillation

    Authors: Xiaoshuai Hao, Ruikai Li, Hui Zhang, Dingzhe Li, Rong Yin, Sangil Jung, Seung-In Park, ByungIn Yoo, Haimei Zhao, Jing Zhang

    Abstract: Online high-definition (HD) map construction is an important and challenging task in autonomous driving. Recently, there has been a growing interest in cost-effective multi-view camera-based methods without relying on other sensors like LiDAR. However, these methods suffer from a lack of explicit depth information, necessitating the use of large models to achieve satisfactory performance. To addre…

    Submitted 16 July, 2024; originally announced July 2024.

    Comments: Accepted by ECCV2024

  9. arXiv:2407.11619  [pdf, ps, other]

    cs.LG cs.GT

    Strategic Littlestone Dimension: Improved Bounds on Online Strategic Classification

    Authors: Saba Ahmadi, Kunhe Yang, Hanrui Zhang

    Abstract: We study the problem of online binary classification in settings where strategic agents can modify their observable features to receive a positive classification. We model the set of feasible manipulations by a directed graph over the feature space, and assume the learner only observes the manipulated features instead of the original ones. We introduce the Strategic Littlestone Dimension, a new co…

    Submitted 16 July, 2024; originally announced July 2024.

  10. arXiv:2407.11529  [pdf, other]

    eess.IV cs.AI cs.CV

    Cross-Phase Mutual Learning Framework for Pulmonary Embolism Identification on Non-Contrast CT Scans

    Authors: Bizhe Bai, Yan-Jie Zhou, Yujian Hu, Tony C. W. Mok, Yilang Xiang, Le Lu, Hongkun Zhang, Minfeng Xu

    Abstract: Pulmonary embolism (PE) is a life-threatening condition where rapid and accurate diagnosis is imperative yet difficult due to predominantly atypical symptomatology. Computed tomography pulmonary angiography (CTPA) is acknowledged as the gold standard imaging tool in clinics, yet it can be contraindicated for emergency department (ED) patients and represents an onerous procedure, thus necessitating…

    Submitted 16 July, 2024; originally announced July 2024.

    Comments: Early accept by MICCAI 2024

  11. arXiv:2407.11509  [pdf, other]

    cond-mat.str-el cond-mat.quant-gas

    Exact eigenstates with off-diagonal long-range order for interacting bosonic systems

    Authors: C. H. Zhang, Z. Song

    Abstract: Fermions and hardcore bosons share the same restriction: no more than one particle can occupy a single site in a lattice system. Specifically, in one dimension, two systems can share the same matrix representation. In this work, we investigate both the fermion and hardcore-boson models with nearest-neighbor (NN) interaction in a ring lattice. We construct the exact eigenstates of the hardcore-boso…

    Submitted 16 July, 2024; originally announced July 2024.

  12. Incremental high average-utility itemset mining: survey and challenges

    Authors: Jing Chen, Shengyi Yang, Weiping Ding, Peng Li, Aijun Liu, Hongjun Zhang, Tian Li

    Abstract: The High Average Utility Itemset Mining (HAUIM) technique, a variation of High Utility Itemset Mining (HUIM), uses the average utility of the itemsets. Historically, most HAUIM algorithms were designed for static databases. However, practical applications like market basket analysis and business decision-making necessitate regular updates of the database with new transactions. As a result, researc…

    Submitted 16 July, 2024; originally announced July 2024.

    Comments: 25 pages, 23 figures

  13. arXiv:2407.11422  [pdf, other]

    cs.CV

    Reflective Instruction Tuning: Mitigating Hallucinations in Large Vision-Language Models

    Authors: Jinrui Zhang, Teng Wang, Haigang Zhang, Ping Lu, Feng Zheng

    Abstract: Large vision-language models (LVLMs) have shown promising performance on a variety of vision-language tasks. However, they remain susceptible to hallucinations, generating outputs misaligned with visual content or instructions. While various mitigation strategies have been proposed, they often neglect a key contributor to hallucinations: lack of fine-grained reasoning supervision during training.…

    Submitted 16 July, 2024; originally announced July 2024.

    Comments: To appear at ECCV2024

  14. arXiv:2407.11382  [pdf, other]

    cs.CV cs.AI cs.RO

    Segment, Lift and Fit: Automatic 3D Shape Labeling from 2D Prompts

    Authors: Jianhao Li, Tianyu Sun, Zhongdao Wang, Enze Xie, Bailan Feng, Hongbo Zhang, Ze Yuan, Ke Xu, Jiaheng Liu, Ping Luo

    Abstract: This paper proposes an algorithm for automatically labeling 3D objects from 2D point or box prompts, especially focusing on applications in autonomous driving. Unlike previous arts, our auto-labeler predicts 3D shapes instead of bounding boxes and does not require training on a specific dataset. We propose a Segment, Lift, and Fit (SLF) paradigm to achieve this goal. Firstly, we segment high-quali…

    Submitted 16 July, 2024; originally announced July 2024.

    Comments: Accepted to ECCV 2024

  15. arXiv:2407.11322  [pdf, ps, other]

    eess.SP

    Reconfigurable-Intelligent-Surface Assisted Orbital-Angular-Momentum Secure Communications

    Authors: Minmin Wang, Liping Liang, Wenchi Cheng, Wei Zhang, Ruirui Chen, Hailin Zhang

    Abstract: As a kind of wavefront with helical phase, orbital angular momentum (OAM) shows the great potential to enhance the security results of wireless communications due to its unique orthogonality and central hollow electromagnetic wave structure. Therefore, in this paper we propose the reconfigurable-intelligent-surface (RIS) assisted OAM scheme, where RIS is deployed to weaken the information acquisit…

    Submitted 15 July, 2024; originally announced July 2024.

    Comments: arXiv admin note: text overlap with arXiv:2406.05799

  16. arXiv:2407.11087  [pdf, other]

    eess.IV cs.CV

    Restore-RWKV: Efficient and Effective Medical Image Restoration with RWKV

    Authors: Zhiwen Yang, Hui Zhang, Dan Zhao, Bingzheng Wei, Yan Xu

    Abstract: Transformers have revolutionized medical image restoration, but the quadratic complexity still poses limitations for their application to high-resolution medical images. The recent advent of RWKV in the NLP field has attracted much attention as it can process long sequences efficiently. To leverage its advanced design, we propose Restore-RWKV, the first RWKV-based model for medical image restorati…

    Submitted 14 July, 2024; originally announced July 2024.

    Comments: This paper introduces the first RWKV-based model for image restoration

  17. arXiv:2407.10988  [pdf, other]

    cs.LG

    Residual resampling-based physics-informed neural network for neutron diffusion equations

    Authors: Heng Zhang, Yun-Ling He, Dong Liu, Qin Hang, He-Min Yao, Di Xiang

    Abstract: The neutron diffusion equation plays a pivotal role in the analysis of nuclear reactors. Nevertheless, employing the Physics-Informed Neural Network (PINN) method for its solution entails certain limitations. Traditional PINN approaches often utilize fully connected network (FCN) architecture, which is susceptible to overfitting, training instability, and gradient vanishing issues as the network d…

    Submitted 23 June, 2024; originally announced July 2024.

  18. arXiv:2407.10982  [pdf, other]

    cs.NI

    ARA-O-RAN: End-to-End Programmable O-RAN Living Lab for Agriculture and Rural Communities

    Authors: Tianyi Zhang, Joshua Ofori Boateng, Taimoor UI Islam, Arsalan Ahmad, Hongwei Zhang, Daji Qiao

    Abstract: As wireless networks evolve towards open architectures like O-RAN, testing, and integration platforms are crucial to address challenges like interoperability. This paper describes ARA-O-RAN, a novel O-RAN testbed established through the NSF Platforms for Advanced Wireless Research (PAWR) ARA platform. ARA provides an at-scale rural wireless living lab focused on technologies for digital agricultur…

    Submitted 14 June, 2024; originally announced July 2024.

  19. arXiv:2407.10957  [pdf, other]

    cs.CV cs.AI

    Ref-AVS: Refer and Segment Objects in Audio-Visual Scenes

    Authors: Yaoting Wang, Peiwen Sun, Dongzhan Zhou, Guangyao Li, Honggang Zhang, Di Hu

    Abstract: Traditional reference segmentation tasks have predominantly focused on silent visual scenes, neglecting the integral role of multimodal perception and interaction in human experiences. In this work, we introduce a novel task called Reference Audio-Visual Segmentation (Ref-AVS), which seeks to segment objects within the visual domain based on expressions containing multimodal cues. Such expressions…

    Submitted 15 July, 2024; originally announced July 2024.

    Comments: Accepted by ECCV2024

  20. arXiv:2407.10956  [pdf, other]

    cs.AI cs.CL

    Spider2-V: How Far Are Multimodal Agents From Automating Data Science and Engineering Workflows?

    Authors: Ruisheng Cao, Fangyu Lei, Haoyuan Wu, Jixuan Chen, Yeqiao Fu, Hongcheng Gao, Xinzhuang Xiong, Hanchong Zhang, Yuchen Mao, Wenjing Hu, Tianbao Xie, Hongshen Xu, Danyang Zhang, Sida Wang, Ruoxi Sun, Pengcheng Yin, Caiming Xiong, Ansong Ni, Qian Liu, Victor Zhong, Lu Chen, Kai Yu, Tao Yu

    Abstract: Data science and engineering workflows often span multiple stages, from warehousing to orchestration, using tools like BigQuery, dbt, and Airbyte. As vision language models (VLMs) advance in multimodal understanding and code generation, VLM-based agents could potentially automate these workflows by generating SQL queries, Python code, and GUI operations. This automation can improve the productivit…

    Submitted 15 July, 2024; originally announced July 2024.

    Comments: 34 pages, 14 figures, 10 tables

  21. arXiv:2407.10947  [pdf, other]

    cs.CV

    Can Textual Semantics Mitigate Sounding Object Segmentation Preference?

    Authors: Yaoting Wang, Peiwen Sun, Yuanchao Li, Honggang Zhang, Di Hu

    Abstract: The Audio-Visual Segmentation (AVS) task aims to segment sounding objects in the visual space using audio cues. However, in this work, it is recognized that previous AVS methods show a heavy reliance on detrimental segmentation preferences related to audible objects, rather than precise audio guidance. We argue that the primary reason is that audio lacks robust semantics compared to vision, especi…

    Submitted 15 July, 2024; originally announced July 2024.

    Comments: Accepted by ECCV2024

  22. arXiv:2407.10701  [pdf, other]

    cs.CL

    DOCBENCH: A Benchmark for Evaluating LLM-based Document Reading Systems

    Authors: Anni Zou, Wenhao Yu, Hongming Zhang, Kaixin Ma, Deng Cai, Zhuosheng Zhang, Hai Zhao, Dong Yu

    Abstract: Recently, there has been a growing interest among large language model (LLM) developers in LLM-based document reading systems, which enable users to upload their own documents and pose questions related to the document contents, going beyond simple reading comprehension tasks. Consequently, these systems have been carefully designed to tackle challenges such as file parsing, metadata extraction, m…

    Submitted 15 July, 2024; originally announced July 2024.

    Comments: Work in progress

  23. arXiv:2407.10691  [pdf, other]

    cs.IR cs.CL

    $\texttt{MixGR}$: Enhancing Retriever Generalization for Scientific Domain through Complementary Granularity

    Authors: Fengyu Cai, Xinran Zhao, Tong Chen, Sihao Chen, Hongming Zhang, Iryna Gurevych, Heinz Koeppl

    Abstract: Recent studies show the growing significance of document retrieval in the generation of LLMs, i.e., RAG, within the scientific domain by bridging their knowledge gap. However, dense retrievers often struggle with domain-specific retrieval and complex query-document relationships, particularly when query segments correspond to various parts of a document. To alleviate such prevalent challenges, thi…

    Submitted 15 July, 2024; originally announced July 2024.

  24. arXiv:2407.10670  [pdf, other]

    cs.CL cs.AI

    Enhancing Retrieval and Managing Retrieval: A Four-Module Synergy for Improved Quality and Efficiency in RAG Systems

    Authors: Yunxiao Shi, Xing Zi, Zijing Shi, Haimin Zhang, Qiang Wu, Min Xu

    Abstract: Retrieval-augmented generation (RAG) techniques leverage the in-context learning capabilities of large language models (LLMs) to produce more accurate and relevant responses. Originating from the simple 'retrieve-then-read' approach, the RAG framework has evolved into a highly flexible and modular paradigm. A critical component, the Query Rewriter module, enhances knowledge retrieval by generating…

    Submitted 15 July, 2024; originally announced July 2024.

    Comments: ECAI2024 #1304

  25. arXiv:2407.10421  [pdf, ps, other]

    gr-qc

    Constraining Weyl type f(Q,T) gravity with Big Bang Nucleosynthesis

    Authors: Jian Ge, Lei Ming, Shi-Dong Liang, Hong-Hao Zhang, Tiberiu Harko

    Abstract: The Weyl type $f(Q,T)$ modified gravity theory is an extension of the $f(Q)$ and $f(Q,T)$ type theories, where $T$ is the trace of the matter energy-momentum tensor, and the scalar non-metricity $Q$ is represented in its standard Weyl form, and it is fully determined by a vector field $ω_μ$. The theory can give a good description of the observational data, and of the evolution of the late-time Uni…

    Submitted 14 July, 2024; originally announced July 2024.

    Comments: 19 pages, 12 figures

  26. arXiv:2407.10352  [pdf]

    cond-mat.supr-con cond-mat.str-el

    Signature of Orbital Driven Finite Momentum Pairing in a 3D Ising Superconductor

    Authors: F. Z. Yang, H. D. Zhang, Saswata Mandal, F. Y. Meng, G. Fabbris, A. Said, P. Mercado Lozano, A. Rajapitamahuni, E. Vescovo, C. Nelson, S. Lin, Y. Park, E. M. Clements, T. Z. Ward, H. -N. Lee, H. C. Lei, C. X. Liu, H. Miao

    Abstract: The finite momentum superconducting pairing states (FMPs), where Cooper pairs carry non-zero momentum, are believed to give rise to exotic physical phenomena including the pseudogap phase of cuprate high-Tc superconductors and Majorana fermions in topological superconductivity. FMPs can emerge in intertwined electronic liquids with strong spin-spin interactions or be induced by lifting the spin de…

    Submitted 14 July, 2024; originally announced July 2024.

  27. arXiv:2407.10287  [pdf, ps, other]

    hep-th

    Gauss Relations in Feynman Integrals

    Authors: Tai-Fu Feng, Yang Zhou, Hai-Bin Zhang

    Abstract: Embedding Feynman integrals in Grassmannians, we express Feynman integrals as linear combinations of generalized hypergeometric functions. Here we present general methods to obtain Gauss relations among those generalized hypergeometric functions. The hypergeometric expressions of Feynman integral are analytically continued from some connected component to another by the Gauss inverse relations, th…

    Submitted 14 July, 2024; originally announced July 2024.

    Comments: 75 pages, including text of 22 pages + 1 figure +appendices of 52 pages

  28. arXiv:2407.10233  [pdf, other]

    cs.CV cs.AI

    Visual Prompt Selection for In-Context Learning Segmentation

    Authors: Wei Suo, Lanqing Lai, Mengyang Sun, Hanwang Zhang, Peng Wang, Yanning Zhang

    Abstract: As a fundamental and extensively studied task in computer vision, image segmentation aims to locate and identify different semantic concepts at the pixel level. Recently, inspired by In-Context Learning (ICL), several generalist segmentation frameworks have been proposed, providing a promising paradigm for segmenting specific objects. However, existing works mostly ignore the value of visual promp…

    Submitted 14 July, 2024; originally announced July 2024.

    Comments: Accept by ECCV2024

  29. arXiv:2407.10199  [pdf, other]

    nucl-ex nucl-th

    Charge radii of $^{11-16}$C, $^{13-17}$N and $^{15-18}$O determined from their charge-changing cross-sections and the mirror-difference charge radii

    Authors: J. W. Zhao, B. -H. Sun, I. Tanihata, J. Y. Xu, K. Y. Zhang, A. Prochazka, L. H. Zhu, S. Terashima, J. Meng, L. C. He, C. Y. Liu, G. S. Li, C. G. Lu, W. J. Lin, W. P. Lin, Z. Liu, P. P Ren, Z. Y. Sun, F. Wang, J. Wang, M. Wang, S. T. Wang, X. L. Wei, X. D. Xu, J. C. Zhang , et al. (2 additional authors not shown)

    Abstract: Charge-changing cross-sections of $^{11-16}$C, $^{13-17}$N and $^{15-18}$O on a carbon target have been determined at energies around 300 MeV/nucleon. A nucleon separation energy dependent correction factor has been introduced to the Glauber model calculation for extracting the nuclear charge radii from the experimental CCCSs. The charge radii of $^{11}$C, $^{13,16}$N and $^{15}$O thus were determ…

    Submitted 14 July, 2024; originally announced July 2024.

    Comments: 3 figures, submitted to Physics Letters B

  30. arXiv:2407.10124  [pdf, other]

    cs.RO

    Adaptive Model Predictive Control with Data-driven Error Model for Quadrupedal Locomotion

    Authors: Xuanqi Zeng, Hongbo Zhang, Linzhu Yue, Zhitao Song, Linwei Zhang, Yun-Hui Liu

    Abstract: Model Predictive Control (MPC) relies heavily on the robot model for its control law. However, a gap always exists between the reduced-order control model with uncertainties and the real robot, which degrades its performance. To address this issue, we propose the controller of integrating a data-driven error model into traditional MPC for quadruped robots. Our approach leverages real-world data fr…

    Submitted 14 July, 2024; originally announced July 2024.

    Comments: 7 Pages, 7 figures, conference(ICRA 2024)

  31. arXiv:2407.10081  [pdf, other]

    cs.IR

    All Roads Lead to Rome: Unveiling the Trajectory of Recommender Systems Across the LLM Era

    Authors: Bo Chen, Xinyi Dai, Huifeng Guo, Wei Guo, Weiwen Liu, Yong Liu, Jiarui Qin, Ruiming Tang, Yichao Wang, Chuhan Wu, Yaxiong Wu, Hao Zhang

    Abstract: Recommender systems (RS) are vital for managing information overload and delivering personalized content, responding to users' diverse information needs. The emergence of large language models (LLMs) offers a new horizon for redefining recommender systems with vast general knowledge and reasoning capabilities. Standing across this LLM era, we aim to integrate recommender systems into a broader pic…

    Submitted 14 July, 2024; originally announced July 2024.

  32. arXiv:2407.09984  [pdf, ps, other]

    cs.RO

    Stabilizing Dynamic Systems through Neural Network Learning: A Robust Approach

    Authors: Yu Zhang, Haoyu Zhang, Yongxiang Zou, Houcheng Li, Long Cheng

    Abstract: Point-to-point and periodic motions are ubiquitous in the world of robotics. To master these motions, Autonomous Dynamic System (DS) based algorithms are fundamental in the domain of Learning from Demonstration (LfD). However, these algorithms face the significant challenge of balancing precision in learning with the maintenance of system stability. This paper addresses this challenge by presentin…

    Submitted 13 July, 2024; originally announced July 2024.

    Comments: arXiv admin note: text overlap with arXiv:2309.08849

  33. arXiv:2407.09943  [pdf, other]

    cs.CL

    Minimizing PLM-Based Few-Shot Intent Detectors

    Authors: Haode Zhang, Xiao-Ming Wu, Albert Y. S. Lam

    Abstract: Recent research has demonstrated the feasibility of training efficient intent detectors based on pre-trained language model (PLM) with limited labeled data. However, deploying these detectors in resource-constrained environments such as mobile devices poses challenges due to their large sizes. In this work, we aim to address this issue by exploring techniques to minimize the size of PLM-based inte…

    Submitted 13 July, 2024; originally announced July 2024.

  34. arXiv:2407.09942  [pdf, other]

    quant-ph

    Deterministic Benchmarking of Quantum Gates

    Authors: Vinay Tripathi, Daria Kowsari, Kumar Saurav, Haimeng Zhang, Eli M. Levenson-Falk, Daniel A. Lidar

    Abstract: We introduce deterministic benchmarking (DB), a protocol designed to identify the interplay of coherent and incoherent errors overlooked by randomized benchmarking (RB) and related benchmarking methods. DB provides a set of four parameters that characterize both incoherent and coherent errors in the single-qubit gate set. Furthermore, DB reveals asymmetries in gate performance induced by strong re…

    Submitted 13 July, 2024; originally announced July 2024.

    Comments: 13 pages, 5 figures, comments are welcome

  35. Scheme for measuring topological transitions in a continuous variable system

    Authors: Bi-Yao Wang, Hao-Long Zhang, Shou-Bang Yang, Fan Wu, Zhen-Biao Yang, Shi-Biao Zheng

    Abstract: We propose a scheme for measuring topological properties in a two-photon-driven Kerr-nonlinear resonator (KNR) subjected to a single-photon modulation. The topological properties are revealed through the observation of the Berry curvature and hence the first Chern number, as a nonadiabatic response of the physical observable to the change rate of the control parameter of the modulated drive. The p…

    Submitted 13 July, 2024; originally announced July 2024.

    Journal ref: Advanced Quantum Technologies, 2023

  36. arXiv:2407.09528  [pdf]

    cs.HC

    Prism XR -- A Curated Exhibition Experience in Virtual Reality with Peer Annotation Features and Virtual Guides for Art and Archaeology Classes

    Authors: Huopu Zhang

    Abstract: The Prism XR project is a curated exhibition experience in virtual reality (VR) for art and archaeology education with features designed for the enhancement of interactivity and collaborative learning. The project integrates peer annotations and a virtual exhibition guide to augment educational experiences. The peer annotation features are intended for facilitating visitor critiques and comments p…

    Submitted 15 July, 2024; v1 submitted 24 June, 2024; originally announced July 2024.

  37. arXiv:2407.09268  [pdf, other]

    eess.IV cs.CV

    Region Attention Transformer for Medical Image Restoration

    Authors: Zhiwen Yang, Haowei Chen, Ziniu Qian, Yang Zhou, Hui Zhang, Dan Zhao, Bingzheng Wei, Yan Xu

    Abstract: Transformer-based methods have demonstrated impressive results in medical image restoration, attributed to the multi-head self-attention (MSA) mechanism in the spatial dimension. However, the majority of existing Transformers conduct attention within fixed and coarsely partitioned regions (e.g. the entire image or fixed patches), resulting in interference from irrelevant regions and fragmen…

    Submitted 12 July, 2024; originally announced July 2024.

    Comments: This paper has been accepted by MICCAI 2024

  38. arXiv:2407.09265  [pdf, other]

    hep-ph

    Novel structures and collapse of solitons in nonminimally gravitating dark matter halos

    Authors: Jiajun Chen, Hong-Yi Zhang

    Abstract: Ultralight dark matter simulations predict Bose-Einstein condensations with short-range correlation, known as solitons or boson stars, at the centers of dark matter halos. This paper investigates the formation and collapse of dark matter solitons influenced by nonminimal gravitational effects, characterized by gradient-dependent self-interactions of dark matter and an additional source in Poisson'…

    Submitted 12 July, 2024; originally announced July 2024.

    Comments: 10 pages, 7 big figures

  39. arXiv:2407.08924  [pdf, other]

    cs.CR

    Disassembling Obfuscated Executables with LLM

    Authors: Huanyao Rong, Yue Duan, Hang Zhang, XiaoFeng Wang, Hongbo Chen, Shengchen Duan, Shen Wang

    Abstract: Disassembly is a challenging task, particularly for obfuscated executables containing junk bytes, which is designed to induce disassembly errors. Existing solutions rely on heuristics or leverage machine learning techniques, but only achieve limited successes. Fundamentally, such obfuscation cannot be defeated without in-depth understanding of the binary executable's semantics, which is made possi…

    Submitted 11 July, 2024; originally announced July 2024.

  40. arXiv:2407.08882  [pdf, ps, other]

    cs.HC

    Emerging Practices for Large Multimodal Model (LMM) Assistance for People with Visual Impairments: Implications for Design

    Authors: Jingyi Xie, Rui Yu, He Zhang, Sooyeon Lee, Syed Masum Billah, John M. Carroll

    Abstract: People with visual impairments perceive their environment non-visually and often use AI-powered assistive tools to obtain textual descriptions of visual information. Recent large vision-language model-based AI-powered tools like Be My AI are more capable of understanding users' inquiries in natural language and describing the scene in audible text; however, the extent to which these tools are usef…

    Submitted 11 July, 2024; originally announced July 2024.

  41. Data Adaptive Traceback for Vision-Language Foundation Models in Image Classification

    Authors: Wenshuo Peng, Kaipeng Zhang, Yue Yang, Hao Zhang, Yu Qiao

    Abstract: Vision-language foundation models have been incredibly successful in a wide range of downstream computer vision tasks using adaptation methods. However, due to the high cost of obtaining pre-training datasets, pairs with weak image-text correlation in the data exist in large numbers. We call them weak-paired samples. Due to the limitations of these weak-paired samples, the pre-training model are u…

    Submitted 11 July, 2024; originally announced July 2024.

    Comments: 9 pages,4 figures

  42. arXiv:2407.08569  [pdf, other]

    cs.CV

    Approaching Outside: Scaling Unsupervised 3D Object Detection from 2D Scene

    Authors: Ruiyang Zhang, Hu Zhang, Hang Yu, Zhedong Zheng

    Abstract: The unsupervised 3D object detection is to accurately detect objects in unstructured environments with no explicit supervisory signals. This task, given sparse LiDAR point clouds, often results in compromised performance for detecting distant or small objects due to the inherent sparsity and limited spatial resolution. In this paper, we are among the early attempts to integrate LiDAR data with 2D…

    Submitted 11 July, 2024; originally announced July 2024.

    Comments: Accepted by ECCV'24, 18 pages, 5 figures, 6 tables

  43. arXiv:2407.08265  [pdf, other]

    cs.CV

    Enhancing Thermal Infrared Tracking with Natural Language Modeling and Coordinate Sequence Generation

    Authors: Miao Yan, Ping Zhang, Haofei Zhang, Ruqian Hao, Juanxiu Liu, Xiaoyang Wang, Lin Liu

    Abstract: Thermal infrared tracking is an essential topic in computer vision tasks because of its advantage of all-weather imaging. However, most conventional methods utilize only hand-crafted features, while deep learning-based correlation filtering methods are limited by simple correlation operations. Transformer-based methods ignore temporal and coordinate information, which is critical for TIR tracking…

    Submitted 11 July, 2024; originally announced July 2024.

  44. arXiv:2407.08093  [pdf, other]

    eess.IV cs.AI cs.CV eess.SP

    MemWarp: Discontinuity-Preserving Cardiac Registration with Memorized Anatomical Filters

    Authors: Hang Zhang, Xiang Chen, Renjiu Hu, Dongdong Liu, Gaolei Li, Rongguang Wang

    Abstract: Many existing learning-based deformable image registration methods impose constraints on deformation fields to ensure they are globally smooth and continuous. However, this assumption does not hold in cardiac image registration, where different anatomical regions exhibit asymmetric motions during respiration and movements due to sliding organs within the chest. Consequently, such global constraint…

    Submitted 10 July, 2024; originally announced July 2024.

    Comments: 11 pages, 2 figures, 2 tables

  45. arXiv:2407.07959  [pdf, other]

    cs.SE cs.AI

    Source Code Summarization in the Era of Large Language Models

    Authors: Weisong Sun, Yun Miao, Yuekang Li, Hongyu Zhang, Chunrong Fang, Yi Liu, Gelei Deng, Yang Liu, Zhenyu Chen

    Abstract: To support software developers in understanding and maintaining programs, various automatic (source) code summarization techniques have been proposed to generate a concise natural language summary (i.e., comment) for a given code snippet. Recently, the emergence of large language models (LLMs) has led to a great boost in the performance of code-related tasks. In this paper, we undertake a systemat…

    Submitted 9 July, 2024; originally announced July 2024.

    Comments: Just accepted to the 47th International Conference on Software Engineering (ICSE 2025)

    MSC Class: 68-04 ACM Class: D.2.3; I.2.7

  46. arXiv:2407.07895  [pdf, other]

    cs.CV cs.CL cs.LG

    LLaVA-NeXT-Interleave: Tackling Multi-image, Video, and 3D in Large Multimodal Models

    Authors: Feng Li, Renrui Zhang, Hao Zhang, Yuanhan Zhang, Bo Li, Wei Li, Zejun Ma, Chunyuan Li

    Abstract: Visual instruction tuning has made considerable strides in enhancing the capabilities of Large Multimodal Models (LMMs). However, existing open LMMs largely focus on single-image tasks, and their application to multi-image scenarios remains less explored. Additionally, prior LMM research separately tackles different scenarios, leaving it impossible to generalize across scenarios with new emerging capa…

    Submitted 10 July, 2024; originally announced July 2024.

    Comments: Project Page: https://llava-vl.github.io/blog/2024-06-16-llava-next-interleave/

  47. arXiv:2407.07731  [pdf, other]

    cond-mat.mtrl-sci cond-mat.mes-hall

    Large spin-orbit torque in a-plane $α$-Fe$_{2}$O$_{3}$/Pt bilayers

    Authors: Igor Lyalin, Hantao Zhang, Justin Michel, Daniel Russell, Fengyuan Yang, Ran Cheng, Roland K. Kawakami

    Abstract: Realization of efficient spin-orbit torque switching of the Néel vector in insulating antiferromagnets is a challenge, often complicated by spurious effects. Quantifying the spin-orbit torques in antiferromagnet/heavy metal heterostructures is an important first step towards this goal. Here, we employ magneto-optic techniques to study damping-like spin-orbit torque (DL-SOT) in a-plane $α$-Fe$_2$O…

    Submitted 10 July, 2024; originally announced July 2024.

    Comments: 6 pages, 3 figures

  48. arXiv:2407.07651  [pdf, other]

    hep-ex physics.data-an

    Study of the decay and production properties of $D_{s1}(2536)$ and $D_{s2}^*(2573)$

    Authors: M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere, A. Brueggemann, et al. (645 additional authors not shown)

    Abstract: The $e^+e^-\rightarrow D_s^+D_{s1}(2536)^-$ and $e^+e^-\rightarrow D_s^+D^*_{s2}(2573)^-$ processes are studied using data samples collected with the BESIII detector at center-of-mass energies from 4.530 to 4.946~GeV. The absolute branching fractions of $D_{s1}(2536)^- \rightarrow \bar{D}^{*0}K^-$ and $D_{s2}^*(2573)^- \rightarrow \bar{D}^0K^-$ are measured for the first time to be…

    Submitted 10 July, 2024; originally announced July 2024.

  49. arXiv:2407.07554  [pdf, other]

    cs.GR cs.SD eess.AS

    Beat-It: Beat-Synchronized Multi-Condition 3D Dance Generation

    Authors: Zikai Huang, Xuemiao Xu, Cheng Xu, Huaidong Zhang, Chenxi Zheng, Jing Qin, Shengfeng He

    Abstract: Dance, as an art form, fundamentally hinges on the precise synchronization with musical beats. However, achieving aesthetically pleasing dance sequences from music is challenging, with existing methods often falling short in controllability and beat alignment. To address these shortcomings, this paper introduces Beat-It, a novel framework for beat-specific, key pose-guided dance generation. Unlike…

    Submitted 10 July, 2024; originally announced July 2024.

    Comments: ECCV 2024

  50. arXiv:2407.07351  [pdf, other]

    cs.CV

    Unity in Diversity: Multi-expert Knowledge Confrontation and Collaboration for Generalizable Vehicle Re-identification

    Authors: Zhenyu Kuang, Hongyang Zhang, Lidong Cheng, Yinhao Liu, Yue Huang, Xinghao Ding

    Abstract: Generalizable vehicle re-identification (ReID) aims to enable a model well trained on diverse source domains to broadly adapt to unknown target domains without additional fine-tuning or retraining. However, it still faces the challenge of domain shift and has difficulty accurately generalizing to unknown target domains. This limitation occurs because the model relies heavily on primary…

    Submitted 10 July, 2024; originally announced July 2024.