Showing 1–50 of 602 results for author: Xiong, Y

  1. arXiv:2407.10195  [pdf, other]

    cs.CV

    V2I-Calib: A Novel Calibration Approach for Collaborative Vehicle and Infrastructure LiDAR Systems

    Authors: Qianxin Qu, Yijin Xiong, Xin Wu, Hanyu Li, Shichun Guo

    Abstract: Cooperative vehicle and infrastructure LiDAR systems hold great potential, yet their implementation faces numerous challenges. Calibration of LiDAR systems across heterogeneous vehicle and infrastructure endpoints is a critical step to ensure the accuracy and consistency of perception system data, necessitating calibration methods that are real-time and stable. To this end, this paper introduces a…

    Submitted 14 July, 2024; originally announced July 2024.

    Comments: to be published in 2024 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2024)

  2. arXiv:2407.09932  [pdf, other]

    quant-ph

    Quantum Clock Synchronization Network with Silicon-chip Dual-Pumped Entangled Photon Source

    Authors: J. A. Li, H. Han, X. P. Huang, B. Y. Tang, K. Guo, J. Q. Huang, S. Y. Xiong, W. R. Yu, Z. J. Zhang, J. B. Yang, B. Liu, H. Chen, Z. K. Lu

    Abstract: In this paper, we propose a quantum clock synchronization (QCS) network scheme with a silicon-chip dual-pumped entangled photon source. This scheme couples two pump beams into the silicon-based waveguide, where degenerate and non-degenerate spontaneous four-wave mixing (SFWM) occurs, generating entanglement between one signal channel and three idler channels. The entangled photons are distributed to…

    Submitted 13 July, 2024; originally announced July 2024.

  3. arXiv:2407.09816  [pdf, other]

    cs.CL

    MaskMoE: Boosting Token-Level Learning via Routing Mask in Mixture-of-Experts

    Authors: Zhenpeng Su, Zijia Lin, Xue Bai, Xing Wu, Yizhe Xiong, Haoran Lian, Guangyuan Ma, Hui Chen, Guiguang Ding, Wei Zhou, Songlin Hu

    Abstract: Scaling model capacity enhances its capabilities but significantly increases computation. Mixture-of-Experts models (MoEs) address this by allowing model capacity to scale without substantially increasing training or inference costs. Despite their promising results, MoE models encounter several challenges. Primarily, the dispersion of training tokens across multiple experts can lead to underfittin…

    Submitted 13 July, 2024; originally announced July 2024.

    Comments: Work in progress

  4. arXiv:2407.09761  [pdf, other]

    stat.ME

    Exploring Differences between Two Decades of Mental Health Related Emergency Department Visits by Youth via Recurrent Events Analyses

    Authors: Yi Xiong, Joan Hu, Rhonda Rosychuk

    Abstract: We aim to develop a tool for understanding how the mental health of youth aged less than 18 years evolves over time through administrative records of mental health related emergency department (MHED) visits in two decades. Administrative health data usually contain rich information for investigating public health issues; however, many restrictions and regulations apply to their use. Moreover, the d…

    Submitted 12 July, 2024; originally announced July 2024.

  5. arXiv:2407.06691  [pdf, other]

    cs.IT eess.SP

    OFDM Achieves the Lowest Ranging Sidelobe Under Random ISAC Signaling

    Authors: Fan Liu, Ying Zhang, Yifeng Xiong, Shuangyang Li, Weijie Yuan, Feifei Gao, Shi Jin, Giuseppe Caire

    Abstract: This paper aims to answer a fundamental question in the area of Integrated Sensing and Communications (ISAC): What is the optimal communication-centric ISAC waveform for ranging? Towards that end, we first established a generic framework to analyze the sensing performance of communication-centric ISAC waveforms built upon orthonormal signaling bases and random data symbols. Then, we evaluated thei…

    Submitted 9 July, 2024; originally announced July 2024.

    Comments: 14 pages, 12 figures, submitted to IEEE for possible publication

  6. arXiv:2407.06358  [pdf, other]

    cs.CV

    MiraData: A Large-Scale Video Dataset with Long Durations and Structured Captions

    Authors: Xuan Ju, Yiming Gao, Zhaoyang Zhang, Ziyang Yuan, Xintao Wang, Ailing Zeng, Yu Xiong, Qiang Xu, Ying Shan

    Abstract: Sora's high-motion intensity and long consistent videos have significantly impacted the field of video generation, attracting unprecedented attention. However, existing publicly available datasets are inadequate for generating Sora-like videos, as they mainly contain short videos with low motion intensity and brief captions. To address these issues, we propose MiraData, a high-quality video datase…

    Submitted 8 July, 2024; originally announced July 2024.

  7. arXiv:2407.04954  [pdf, other]

    eess.SP

    Extremely Large-Scale Dynamic Metasurface Antennas (XL-DMAs): Near-Field Modeling and Channel Estimation

    Authors: Songjie Yang, Wanting Lyu, Boyu Ning, Yue Xiu, Youzhi Xiong, Hua Chen, Chadi Assi, Chau Yuen

    Abstract: Dynamic metasurface antennas (DMAs) represent a novel transceiver array architecture for extremely large-scale (XL) communications, offering the advantages of reduced power consumption and lower hardware costs compared to conventional arrays. This paper focuses on near-field channel estimation for XL-DMAs. We begin by analyzing the near-field characteristics of uniform planar arrays (UPAs) and i…

    Submitted 6 July, 2024; originally announced July 2024.

  8. arXiv:2407.01296  [pdf, other]

    cond-mat.mes-hall cond-mat.quant-gas math-ph physics.optics quant-ph

    Non-Hermitian skin effect in arbitrary dimensions: non-Bloch band theory and classification

    Authors: Yuncheng Xiong, Ze-Yu Xing, Haiping Hu

    Abstract: Non-Hermitian skin effect (NHSE) is a distinctive phenomenon in non-Hermitian systems, characterized by a significant accumulation of eigenstates at system boundaries. While well-understood in one dimension via non-Bloch band theory, unraveling the NHSE in higher dimensions faces formidable challenges due to the diversity of open boundary conditions or lattice geometries and inevitable numerical e…

    Submitted 1 July, 2024; originally announced July 2024.

    Comments: 24 pages, 17 figures

  9. arXiv:2406.19444  [pdf, other]

    cond-mat.mes-hall quant-ph

    Kramers Nonlinearity in PT Symmetric Magnets

    Authors: Oles Matsyshyn, Ying Xiong, Justin C. W. Song

    Abstract: Kramers degeneracies play an essential role in the spectrum of electronic materials. Here we argue that beyond spectral properties, Kramers degeneracy plays a critical role in the nonlinear response of PT symmetric magnets. In particular, we uncover a class of second-order Kramers nonlinearities that only arise in the presence of Kramers degeneracy, vanishing in non-degenerate PT symmetric materia…

    Submitted 27 June, 2024; originally announced June 2024.

    Comments: 9 pages, 2 figures

  10. arXiv:2406.18485  [pdf, other]

    cs.DC

    LoongTrain: Efficient Training of Long-Sequence LLMs with Head-Context Parallelism

    Authors: Diandian Gu, Peng Sun, Qinghao Hu, Ting Huang, Xun Chen, Yingtong Xiong, Guoteng Wang, Qiaoling Chen, Shangchun Zhao, Jiarui Fang, Yonggang Wen, Tianwei Zhang, Xin Jin, Xuanzhe Liu

    Abstract: Efficiently training LLMs with long sequences is important yet challenged by the massive computation and memory requirements. Sequence parallelism has been proposed to tackle these problems, but existing methods suffer from scalability or efficiency issues. We propose LoongTrain, a novel system to efficiently train LLMs with long sequences at scale. The core of LoongTrain is the 2D-Attention mecha…

    Submitted 26 June, 2024; originally announced June 2024.

  11. arXiv:2406.12187  [pdf, other]

    cond-mat.mtrl-sci

    Diverse Responses in Lattice Thermal Conductivity of $n$-type/$p$-type Semiconductors Driven by Asymmetric Electron-Phonon Interactions

    Authors: Jianshi Sun, Shouhang Li, Zhen Tong, Cheng Shao, Han Xie, Meng An, Chuang Zhang, Xiongfei Zhu, Chen Huang, Yucheng Xiong, Xiangjun Liu

    Abstract: Accurately assessing the impact of electron-phonon interaction (EPI) on the lattice thermal conductivity of semiconductors is crucial for the thermal management of electronic devices and a unified physical understanding of this issue is highly desired. In this work, we predict the lattice thermal conductivities of typical direct and indirect bandgap semiconductors accounting for EPI based on mode-…

    Submitted 17 June, 2024; originally announced June 2024.

    Comments: 8 pages, 5 figures

  12. arXiv:2406.11891  [pdf, other]

    cs.SI cs.AI cs.LG

    Towards Adaptive Neighborhood for Advancing Temporal Interaction Graph Modeling

    Authors: Siwei Zhang, Xi Chen, Yun Xiong, Xixi Wu, Yao Zhang, Yongrui Fu, Yinglong Zhao, Jiawei Zhang

    Abstract: Temporal Graph Networks (TGNs) have demonstrated their remarkable performance in modeling temporal interaction graphs. These works can generate temporal node representations by encoding the surrounding neighborhoods for the target node. However, an inherent limitation of existing TGNs is their reliance on fixed, hand-crafted rules for neighborhood encoding, overlooking the necessity for an adaptiv…

    Submitted 14 June, 2024; originally announced June 2024.

    Comments: KDD'2024 Research Track Paper

  13. arXiv:2406.11836  [pdf, other]

    cs.CV cs.GR

    RetinaGS: Scalable Training for Dense Scene Rendering with Billion-Scale 3D Gaussians

    Authors: Bingling Li, Shengyi Chen, Luchao Wang, Kaimin Liao, Sijie Yan, Yuanjun Xiong

    Abstract: In this work, we explore the possibility of training high-parameter 3D Gaussian splatting (3DGS) models on large-scale, high-resolution datasets. We design a general model parallel training method for 3DGS, named RetinaGS, which uses a proper rendering equation and can be applied to any scene and arbitrary distribution of Gaussian primitives. It enables us to explore the scaling behavior of 3DGS i…

    Submitted 22 June, 2024; v1 submitted 17 June, 2024; originally announced June 2024.

  14. arXiv:2406.11833  [pdf, other]

    cs.CV cs.AI cs.LG

    MMDU: A Multi-Turn Multi-Image Dialog Understanding Benchmark and Instruction-Tuning Dataset for LVLMs

    Authors: Ziyu Liu, Tao Chu, Yuhang Zang, Xilin Wei, Xiaoyi Dong, Pan Zhang, Zijian Liang, Yuanjun Xiong, Yu Qiao, Dahua Lin, Jiaqi Wang

    Abstract: Generating natural and meaningful responses to communicate with multi-modal human inputs is a fundamental capability of Large Vision-Language Models (LVLMs). While current open-source LVLMs demonstrate promising performance in simplified scenarios such as single-turn single-image input, they fall short in real-world conversation scenarios such as following instructions in a long context history wit…

    Submitted 17 June, 2024; originally announced June 2024.

    Comments: This project is available at https://github.com/Liuziyu77/MMDU

  15. arXiv:2406.08717  [pdf, other]

    cond-mat.str-el

    Comparison of superconducting pairing in doped cuprates and nickelates within an extended Hubbard model

    Authors: Yicheng Xiong, Hang Ma, Hongxing Liu, Runyu Ma, Tianxing Ma

    Abstract: Within an extended Hubbard model, we investigate the superconducting pairing behavior of infinite-layer nickelate $\mathrm{NdNiO_2}$ and cuprate superconductors by using the determinant quantum Monte Carlo method. Our focus is on comparing their dominant pairing symmetries. The results indicate that the $d_{x^2-y^2}$ pairing interaction is significantly enhanced at low temperatures in both doped…

    Submitted 12 June, 2024; originally announced June 2024.

    Comments: 6 pages and 8 figures

  16. arXiv:2406.07548  [pdf, other]

    cs.CV cs.IT cs.LG eess.IV

    Image and Video Tokenization with Binary Spherical Quantization

    Authors: Yue Zhao, Yuanjun Xiong, Philipp Krähenbühl

    Abstract: We propose a new transformer-based image and video tokenizer with Binary Spherical Quantization (BSQ). BSQ projects the high-dimensional visual embedding to a lower-dimensional hypersphere and then applies binary quantization. BSQ is (1) parameter-efficient without an explicit codebook, (2) scalable to arbitrary token dimensions, and (3) compact: compressing visual data by up to 100$\times$ with m…

    Submitted 11 June, 2024; originally announced June 2024.

    Comments: Tech report

  17. arXiv:2406.06609  [pdf, other]

    cs.LG cs.AI cs.CV

    Mitigating Bias in Dataset Distillation

    Authors: Justin Cui, Ruochen Wang, Yuanhao Xiong, Cho-Jui Hsieh

    Abstract: Dataset Distillation has emerged as a technique for compressing large datasets into smaller synthetic counterparts, facilitating downstream training tasks. In this paper, we study the impact of bias inside the original dataset on the performance of dataset distillation. With a comprehensive empirical evaluation on canonical datasets with color, corruption and background biases, we found that color…

    Submitted 10 July, 2024; v1 submitted 6 June, 2024; originally announced June 2024.

    Comments: ICML

  18. arXiv:2406.06069  [pdf, other]

    cs.CV

    PointABM: Integrating Bidirectional State Space Model with Multi-Head Self-Attention for Point Cloud Analysis

    Authors: Jia-wei Chen, Yu-jie Xiong, Yong-bin Gao

    Abstract: Mamba, based on the state space model (SSM), with its linear complexity and great success in classification, has proven its superiority in 3D point cloud analysis. Prior to that, Transformer has emerged as one of the most prominent and successful architectures for point cloud analysis. We present PointABM, a hybrid model that integrates the Mamba and Transformer architectures for enhancing local feature to…

    Submitted 10 June, 2024; originally announced June 2024.

  19. arXiv:2406.05940  [pdf, other]

    cs.SE

    M2CVD: Multi-Model Collaboration for Code Vulnerability Detection

    Authors: Ziliang Wang, Ge Li, Jia Li, Yingfei Xiong, Jia Li, Zhi Jin

    Abstract: Large Language Models (LLMs) have strong capabilities in code comprehension, but fine-tuning costs and semantic alignment issues limit their project-specific optimization; conversely, code models such as CodeBERT are easy to fine-tune, but it is often difficult for them to learn vulnerability semantics from complex code languages. To address these challenges, this paper introduces the Multi-Model Collaborativ…

    Submitted 9 June, 2024; originally announced June 2024.

  20. arXiv:2406.04292  [pdf, other]

    cs.IR cs.CL cs.CV

    VISTA: Visualized Text Embedding For Universal Multi-Modal Retrieval

    Authors: Junjie Zhou, Zheng Liu, Shitao Xiao, Bo Zhao, Yongping Xiong

    Abstract: Multi-modal retrieval is becoming increasingly popular in practice. However, the existing retrievers are mostly text-oriented and lack the capability to process visual information. Despite the presence of vision-language models like CLIP, the current methods are severely limited in representing the text-only and image-only data. In this work, we present a new embedding model VISTA for universal mul…

    Submitted 6 June, 2024; originally announced June 2024.

    Comments: Accepted to ACL 2024 main conference

  21. arXiv:2406.04264  [pdf, other]

    cs.CV cs.AI cs.CL

    MLVU: A Comprehensive Benchmark for Multi-Task Long Video Understanding

    Authors: Junjie Zhou, Yan Shu, Bo Zhao, Boya Wu, Shitao Xiao, Xi Yang, Yongping Xiong, Bo Zhang, Tiejun Huang, Zheng Liu

    Abstract: The evaluation of Long Video Understanding (LVU) performance poses an important but challenging research problem. Despite previous efforts, the existing video understanding benchmarks are severely constrained by several issues, especially the insufficient lengths of videos, a lack of diversity in video types and evaluation tasks, and the inappropriateness for evaluating LVU performances. To addres…

    Submitted 19 June, 2024; v1 submitted 6 June, 2024; originally announced June 2024.

  22. arXiv:2406.02874  [pdf, other]

    cond-mat.mtrl-sci physics.comp-ph

    Giant enhancement of hole mobility for 4H-silicon carbide through suppressing interband electron-phonon scattering

    Authors: Jianshi Sun, Shouhang Li, Zhen Tong, Cheng Shao, Meng An, Xiongfei Zhu, Chuang Zhang, Xiangchuan Chen, Yucheng Xiong, Thomas Frauenheim, Xiangjun Liu

    Abstract: 4H-Silicon Carbide (4H-SiC) possesses a high Baliga figure of merit, making it a promising material for power electronics. However, its applications are limited by its low hole mobility. Herein, we found that the hole mobility of 4H-SiC is mainly limited by the strong interband electron-phonon scattering using mode-level first-principles calculations. Our research indicates that applying compressi…

    Submitted 20 June, 2024; v1 submitted 4 June, 2024; originally announced June 2024.

    Comments: 22 pages, 4 figures

  23. arXiv:2406.00093  [pdf, other]

    cs.CV cs.AI cs.GR cs.LG cs.MM

    Bootstrap3D: Improving 3D Content Creation with Synthetic Data

    Authors: Zeyi Sun, Tong Wu, Pan Zhang, Yuhang Zang, Xiaoyi Dong, Yuanjun Xiong, Dahua Lin, Jiaqi Wang

    Abstract: Recent years have witnessed remarkable progress in multi-view diffusion models for 3D content creation. However, there remains a significant gap in image quality and prompt-following ability compared to 2D diffusion models. A critical bottleneck is the scarcity of high-quality 3D assets with detailed captions. To address this challenge, we propose Bootstrap3D, a novel framework that automatically…

    Submitted 31 May, 2024; originally announced June 2024.

    Comments: Project Page: https://sunzey.github.io/Bootstrap3D/

  24. arXiv:2405.19731  [pdf, other]

    cs.DC

    Some New Approaches to MPI Implementations

    Authors: Yuqing Xiong

    Abstract: This paper provides some new approaches to MPI implementations to improve MPI performance. These approaches include dynamically composable libraries, reducing average layer numbers of MPI libraries, and a single entity of MPI-network, MPI-protocol, and MPI.

    Submitted 30 May, 2024; originally announced May 2024.

  25. arXiv:2405.19487  [pdf, other]

    cs.CL

    A Full-duplex Speech Dialogue Scheme Based On Large Language Models

    Authors: Peng Wang, Songshuo Lu, Yaohua Tang, Sijie Yan, Yuanjun Xiong, Wei Xia

    Abstract: We present a generative dialogue system capable of operating in a full-duplex manner, allowing for seamless interaction. It is based on a large language model (LLM) carefully aligned to be aware of a perception module, a motor function module, and the concept of a simple finite state machine (called neural FSM) with two states. The perception and motor function modules operate simultaneously, allo…

    Submitted 29 May, 2024; originally announced May 2024.

  26. arXiv:2405.19119  [pdf, other]

    cs.LG

    Can Graph Learning Improve Task Planning?

    Authors: Xixi Wu, Yifei Shen, Caihua Shan, Kaitao Song, Siwei Wang, Bohang Zhang, Jiarui Feng, Hong Cheng, Wei Chen, Yun Xiong, Dongsheng Li

    Abstract: Task planning is emerging as an important research topic alongside the development of large language models (LLMs). It aims to break down complex user requests into solvable sub-tasks, thereby fulfilling the original requests. In this context, the sub-tasks can be naturally viewed as a graph, where the nodes represent the sub-tasks, and the edges denote the dependencies among them. Consequently, t…

    Submitted 29 May, 2024; originally announced May 2024.

  27. arXiv:2405.17463  [pdf, other]

    cs.GT cs.LG

    No Algorithmic Collusion in Two-Player Blindfolded Game with Thompson Sampling

    Authors: Ningyuan Chen, Xuefeng Gao, Yi Xiong

    Abstract: When two players are engaged in a repeated game with unknown payoff matrices, they may be completely unaware of the existence of each other and use multi-armed bandit algorithms to choose the actions, which is referred to as the "blindfolded game" in this paper. We show that when the players use Thompson sampling, the game dynamics converges to the Nash equilibrium under a mild assumption on the…

    Submitted 23 May, 2024; originally announced May 2024.

  28. arXiv:2405.17247  [pdf, other]

    cs.LG

    An Introduction to Vision-Language Modeling

    Authors: Florian Bordes, Richard Yuanzhe Pang, Anurag Ajay, Alexander C. Li, Adrien Bardes, Suzanne Petryk, Oscar Mañas, Zhiqiu Lin, Anas Mahmoud, Bargav Jayaraman, Mark Ibrahim, Melissa Hall, Yunyang Xiong, Jonathan Lebensold, Candace Ross, Srihari Jayakumar, Chuan Guo, Diane Bouchacourt, Haider Al-Tahan, Karthik Padthe, Vasu Sharma, Hu Xu, Xiaoqing Ellen Tan, Megan Richards, Samuel Lavoie, et al. (16 additional authors not shown)

    Abstract: Following the recent popularity of Large Language Models (LLMs), several attempts have been made to extend them to the visual domain. From having a visual assistant that could guide us through unfamiliar environments to generative models that produce images using only a high-level text description, the vision-language model (VLM) applications will significantly impact our relationship with technol…

    Submitted 27 May, 2024; originally announced May 2024.

  29. arXiv:2405.16127  [pdf, other]

    cs.IR

    Finetuning Large Language Model for Personalized Ranking

    Authors: Zhuoxi Bai, Ning Wu, Fengyu Cai, Xinyi Zhu, Yun Xiong

    Abstract: Large Language Models (LLMs) have demonstrated remarkable performance across various domains, motivating researchers to investigate their potential use in recommendation systems. However, directly applying LLMs to recommendation tasks has proven challenging due to the significant disparity between the data used for pre-training LLMs and the specific requirements of recommendation tasks. In this st…

    Submitted 20 June, 2024; v1 submitted 25 May, 2024; originally announced May 2024.

  30. arXiv:2405.15198  [pdf, other]

    cs.CL

    RAEE: A Training-Free Retrieval-Augmented Early Exiting Framework for Efficient Inference

    Authors: Lianming Huang, Shangyu Wu, Yufei Cui, Ying Xiong, Xue Liu, Tei-Wei Kuo, Nan Guan, Chun Jason Xue

    Abstract: Deploying large language model inference remains challenging due to their high computational overhead. Early exiting accelerates model inference by adaptively reducing the number of inference layers. Existing methods require training internal classifiers to determine whether to exit at each intermediate layer. However, such classifier-based early exiting frameworks require significant effort to de…

    Submitted 24 May, 2024; originally announced May 2024.

  31. arXiv:2405.11914  [pdf, other]

    cs.CV

    PT43D: A Probabilistic Transformer for Generating 3D Shapes from Single Highly-Ambiguous RGB Images

    Authors: Yiheng Xiong, Angela Dai

    Abstract: Generating 3D shapes from single RGB images is essential in various applications such as robotics. Current approaches typically target images containing clear and complete visual descriptions of the object, without considering common realistic cases where observations of objects are largely occluded or truncated. We thus propose a transformer-based autoregressive model to generate the probabi…

    Submitted 20 May, 2024; originally announced May 2024.

    Comments: 10 pages, 6 figures

  32. arXiv:2405.11535  [pdf, ps, other]

    cs.PL

    Proving Functional Program Equivalence via Directed Lemma Synthesis

    Authors: Yican Sun, Ruyi Ji, Jian Fang, Xuanlin Jiang, Mingshuai Chen, Yingfei Xiong

    Abstract: Proving equivalence between functional programs is a fundamental problem in program verification, which often amounts to reasoning about algebraic data types (ADTs) and compositions of structural recursions. Modern theorem provers address this problem by applying structural induction, which is insufficient for proving many equivalence theorems. In such cases, one has to invent a set of lemmas, pro…

    Submitted 19 May, 2024; originally announced May 2024.

    Comments: 21 pages

  33. arXiv:2405.10300  [pdf, other]

    cs.CV

    Grounding DINO 1.5: Advance the "Edge" of Open-Set Object Detection

    Authors: Tianhe Ren, Qing Jiang, Shilong Liu, Zhaoyang Zeng, Wenlong Liu, Han Gao, Hongjie Huang, Zhengyu Ma, Xiaoke Jiang, Yihao Chen, Yuda Xiong, Hao Zhang, Feng Li, Peijun Tang, Kent Yu, Lei Zhang

    Abstract: This paper introduces Grounding DINO 1.5, a suite of advanced open-set object detection models developed by IDEA Research, which aims to advance the "Edge" of open-set object detection. The suite encompasses two models: Grounding DINO 1.5 Pro, a high-performance model designed for stronger generalization capability across a wide range of scenarios, and Grounding DINO 1.5 Edge, an efficient model o…

    Submitted 31 May, 2024; v1 submitted 16 May, 2024; originally announced May 2024.

    Comments: homepage: https://deepdataspace.com/home

  34. arXiv:2405.10132  [pdf, other]

    cs.CV

    Cooperative Visual-LiDAR Extrinsic Calibration Technology for Intersection Vehicle-Infrastructure: A review

    Authors: Xinyu Zhang, Yijin Xiong, Qianxin Qu, Renjie Wang, Xin Gao, Jing Liu, Shichun Guo, Jun Li

    Abstract: In the typical urban intersection scenario, both vehicles and infrastructures are equipped with visual and LiDAR sensors. By successfully integrating the data from vehicle-side and road monitoring devices, a more comprehensive and accurate environmental perception and information acquisition can be achieved. The calibration of sensors, as an essential component of autonomous driving technology, ha…

    Submitted 16 May, 2024; originally announced May 2024.

  35. arXiv:2405.07144  [pdf, other]

    quant-ph

    Optical transition parameters of the silicon T centre

    Authors: Chloe Clear, Sara Hosseini, Amirhossein AlizadehKhaledi, Nicholas Brunelle, Austin Woolverton, Joshua Kanaganayagam, Moein Kazemi, Camille Chartrand, Mehdi Keshavarz, Yihuang Xiong, Oney O. Soykal, Geoffroy Hautier, Valentin Karassiouk, Mike Thewalt, Daniel Higginbottom, Stephanie Simmons

    Abstract: The silicon T centre's narrow, telecommunications-band optical emission, long spin coherence, and direct photonic integration have spurred interest in this emitter as a spin-photon interface for distributed quantum computing and networking. However, key parameters of the T centre's spin-selective optical transitions remain undetermined or ambiguous in literature. In this paper we present a Hamilto…

    Submitted 11 May, 2024; originally announced May 2024.

    Comments: 9 pages and 6 figures in the main manuscript. 10 pages and 6 figures in the supplementary information

  36. arXiv:2405.05165  [pdf, other]

    cond-mat.mtrl-sci quant-ph

    Discovery of T center-like quantum defects in silicon

    Authors: Yihuang Xiong, Jiongzhi Zheng, Shay McBride, Xueyue Zhang, Sinéad M. Griffin, Geoffroy Hautier

    Abstract: Quantum technologies would benefit from the development of high performance quantum defects acting as single-photon emitters or spin-photon interface. Finding such a quantum defect in silicon is especially appealing in view of its favorable spin bath and high processability. While some color centers in silicon have been emerging in quantum applications, there is still a need to search and develop…

    Submitted 8 May, 2024; originally announced May 2024.

  37. arXiv:2405.04434  [pdf, other]

    cs.CL cs.AI

    DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model

    Authors: DeepSeek-AI, Aixin Liu, Bei Feng, Bin Wang, Bingxuan Wang, Bo Liu, Chenggang Zhao, Chengqi Dengr, Chong Ruan, Damai Dai, Daya Guo, Dejian Yang, Deli Chen, Dongjie Ji, Erhang Li, Fangyun Lin, Fuli Luo, Guangbo Hao, Guanting Chen, Guowei Li, H. Zhang, Hanwei Xu, Hao Yang, Haowei Zhang, Honghui Ding, et al. (132 additional authors not shown)

    Abstract: We present DeepSeek-V2, a strong Mixture-of-Experts (MoE) language model characterized by economical training and efficient inference. It comprises 236B total parameters, of which 21B are activated for each token, and supports a context length of 128K tokens. DeepSeek-V2 adopts innovative architectures including Multi-head Latent Attention (MLA) and DeepSeekMoE. MLA guarantees efficient inference…

    Submitted 19 June, 2024; v1 submitted 7 May, 2024; originally announced May 2024.

  38. arXiv:2405.03597  [pdf, other]

    eess.SP

    Improving the Ranging Performance of Random ISAC Signals Through Pulse Shaping Design

    Authors: Zihan Liao, Fan Liu, Shuangyang Li, Yifeng Xiong, Weijie Yuan, Marco Lops

    Abstract: In this paper, we propose a novel pulse shaping design for single-carrier integrated sensing and communication (ISAC) transmission. Due to the communication information embedded in the ISAC signal, the resulting auto-correlation function (ACF) is determined by both the information-conveying random symbol sequence and the signaling pulse, where the former leads to random fluctuations in the sidelob…

    Submitted 6 May, 2024; v1 submitted 6 May, 2024; originally announced May 2024.

  39. arXiv:2404.17808  [pdf, other]

    cs.CL

    Scaffold-BPE: Enhancing Byte Pair Encoding with Simple and Effective Scaffold Token Removal

    Authors: Haoran Lian, Yizhe Xiong, Jianwei Niu, Shasha Mo, Zhenpeng Su, Zijia Lin, Peng Liu, Hui Chen, Guiguang Ding

    Abstract: Byte Pair Encoding (BPE) serves as a foundation method for text tokenization in the Natural Language Processing (NLP) field. Despite its wide adoption, the original BPE algorithm harbors an inherent flaw: it inadvertently introduces a frequency imbalance for tokens in the text corpus. Since BPE iteratively merges the most frequent token pair in the text corpus while keeping all tokens that have be…

    Submitted 27 April, 2024; originally announced April 2024.

  40. arXiv:2404.17785  [pdf, other]

    cs.CL

    Temporal Scaling Law for Large Language Models

    Authors: Yizhe Xiong, Xiansheng Chen, Xin Ye, Hui Chen, Zijia Lin, Haoran Lian, Zhenpeng Su, Jianwei Niu, Guiguang Ding

    Abstract: Recently, Large Language Models (LLMs) have been widely adopted in a wide range of tasks, leading to increasing attention towards the research on how scaling LLMs affects their performance. Existing works, termed Scaling Laws, have discovered that the final test loss of LLMs scales as power-laws with model size, computational budget, and dataset size. However, the temporal change of the test loss…

    Submitted 16 June, 2024; v1 submitted 27 April, 2024; originally announced April 2024.

    Comments: 8 pages, 3 figures; Under review

  41. arXiv:2404.16037  [pdf, other]

    cs.CV cs.LG physics.ao-ph

    VN-Net: Vision-Numerical Fusion Graph Convolutional Network for Sparse Spatio-Temporal Meteorological Forecasting

    Authors: Yutong Xiong, Xun Zhu, Ming Wu, Weiqing Li, Fanbin Mo, Chuang Zhang, Bin Zhang

    Abstract: Sparse meteorological forecasting is indispensable for fine-grained weather forecasting and deserves extensive attention. Recent studies have highlighted the potential of spatio-temporal graph convolutional networks (ST-GCNs) in predicting numerical data from ground weather stations. However, as one of the highest fidelity and lowest latency data, the application of the vision data from satellites…

    Submitted 26 January, 2024; originally announced April 2024.

  42. arXiv:2404.08188  [pdf, other]

    cs.IT eess.SP

    Fundamental Limits of Communication-Assisted Sensing in ISAC Systems

    Authors: Fuwang Dong, Fan Liu, Shihang Liu, Yifeng Xiong, Weijie Yuan, Yuanhao Cui

    Abstract: In this paper, we introduce a novel communication-assisted sensing (CAS) framework that explores the potential coordination gains offered by the integrated sensing and communication technique. The CAS system endows users with beyond-line-of-the-sight sensing capabilities, supported by a dual-functional base station that enables simultaneous sensing and communication. To delve into the system's fun…

    Submitted 23 April, 2024; v1 submitted 11 April, 2024; originally announced April 2024.

    Comments: This paper has been accepted by ISIT. The updated version will be coming soon

  43. Characterizing visual cortical magnification with topological smoothing and optimal transportation

    Authors: Yujian Xiong, Yanshuai Tu, Zhong-Lin Lu, Yalin Wang

    Abstract: Human vision has different concentration on visual fields. Cortical magnification factor (CMF) is a popular measurement on visual acuity and cortex concentration. In order to achieve thorough measurement of CMF across the whole visual field, we propose a method to measure planar CMF upon retinotopic maps generated by pRF decoding, with help of our proposed methods: optimal transportation and topol…

    Submitted 9 April, 2024; originally announced April 2024.

    Comments: Accepted by SPIE 2023

    Journal ref: Proc. SPIE 12464, Medical Imaging 2023: Image Processing, 124641Z (3 April 2023)

  44. arXiv:2404.05232  [pdf, ps, other]

    math.AG math.RT

    Invariant stability conditions on local $\mathbb{P}^1\times \mathbb{P}^1$ (after Del Monte-Longhi)

    Authors: Yirui Xiong

    Abstract: Let $X$ be the total space of canonical bundle of $\mathbb{P}^1\times \mathbb{P}^1$, we study an invariant subspace of stability conditions on $X$ under an autoequivalence of $D^b(X)$. We describe the complete set of stable objects with respect to the invariant stability conditions and characterize the space of invariant stability conditions.

    Submitted 8 April, 2024; originally announced April 2024.

    Comments: All comments are welcome!

  45. arXiv:2404.05107  [pdf, other]

    cs.CV

    Reconstructing Retinal Visual Images from 3T fMRI Data Enhanced by Unsupervised Learning

    Authors: Yujian Xiong, Wenhui Zhu, Zhong-Lin Lu, Yalin Wang

    Abstract: The reconstruction of human visual inputs from brain activity, particularly through functional Magnetic Resonance Imaging (fMRI), holds promising avenues for unraveling the mechanisms of the human visual system. Despite the significant strides made by deep learning methods in improving the quality and interpretability of visual reconstruction, there remains a substantial demand for high-quality, l…

    Submitted 7 April, 2024; originally announced April 2024.

    Comments: Accepted by ISBI 2024

    Journal ref: 2024 IEEE International Symposium on Biomedical Imaging

  46. arXiv:2404.03253  [pdf, other]

    eess.IV cs.AI cs.CV cs.LG

    A dataset of primary nasopharyngeal carcinoma MRI with multi-modalities segmentation

    Authors: Yin Li, Qi Chen, Kai Wang, Meige Li, Liping Si, Yingwei Guo, Yu Xiong, Qixing Wang, Yang Qin, Ling Xu, Patrick van der Smagt, Jun Tang, Nutan Chen

    Abstract: Multi-modality magnetic resonance imaging data with various sequences facilitate the early diagnosis, tumor segmentation, and disease staging in the management of nasopharyngeal carcinoma (NPC). The lack of publicly available, comprehensive datasets limits advancements in diagnosis, treatment planning, and the development of machine learning algorithms for NPC. Addressing this critical need, we in…

    Submitted 4 April, 2024; originally announced April 2024.

  47. arXiv:2404.02628  [pdf, other]

    physics.comp-ph cond-mat.quant-gas physics.chem-ph

    GPU acceleration of ab initio simulations of large-scale identical particles based on path integral molecular dynamics

    Authors: Yunuo Xiong

    Abstract: Path integral Monte Carlo (PIMC) and path integral molecular dynamics (PIMD) provide the golden standard for the ab initio simulations of identical particles. In this work, we achieved significant GPU acceleration based on PIMD, which is equivalent to PIMC in the ab initio simulations, and developed an open-source PIMD code repository that does not rely on any other third party library. Numerical…

    Submitted 3 April, 2024; originally announced April 2024.

    Comments: 23 pages. 7 figures

  48. arXiv:2404.00567  [pdf, ps, other]

    math.CO

    Characterizations of amorphic schemes and fusions of pairs

    Authors: Edwin R. van Dam, Jack H. Koolen, Yanzhen Xiong

    Abstract: An association scheme is called amorphic if every possible fusion of relations gives rise to a fusion scheme. We call a pair of relations fusing if fusing that pair gives rise to a fusion scheme. We define the fusing-relations graph on the set of relations, where a pair forms an edge if it fuses. We show that if the fusing-relations graph is connected but not a path, then the association scheme is…

    Submitted 31 March, 2024; originally announced April 2024.

  49. arXiv:2403.19907  [pdf, ps, other]

    cs.LG cs.AI

    Beyond the Known: Novel Class Discovery for Open-world Graph Learning

    Authors: Yucheng Jin, Yun Xiong, Juncheng Fang, Xixi Wu, Dongxiao He, Xing Jia, Bingchen Zhao, Philip Yu

    Abstract: Node classification on graphs is of great importance in many applications. Due to the limited labeling capability and evolution in real-world open scenarios, novel classes can emerge on unlabeled testing nodes. However, little attention has been paid to novel class discovery on graphs. Discovering novel classes is challenging as novel and known class nodes are correlated by edges, which makes thei…

    Submitted 28 March, 2024; originally announced March 2024.

  50. arXiv:2403.19395  [pdf, ps, other]

    math.ST

    Kernel entropy estimation for linear processes II

    Authors: Yudan Xiong, Fangjun Xu

    Abstract: Let $X=\{X_n: n\in \mathbb{N}\}$ be a linear process with bounded probability density function $f(x)$. Under certain conditions, we use the kernel estimator \[ \frac{2}{n(n-1)h_n} \sum_{1\le i<j\le n}K\Big(\frac{X_i-X_j}{h_n}\Big) \] to estimate the quadratic functional of $\int_{\mathbb{R}}f^2(x)dx$ of the linear process $X=\{X_n: n\in \mathbb{N}\}$ and improve the corresponding results in [4].

    Submitted 28 March, 2024; originally announced March 2024.