Showing 1–50 of 416 results for author: Cao, M

  1. arXiv:2407.09853  [pdf, other]

    cs.CV

    Image Compression for Machine and Human Vision with Spatial-Frequency Adaptation

    Authors: Han Li, Shaohui Li, Shuangrui Ding, Wenrui Dai, Maida Cao, Chenglin Li, Junni Zou, Hongkai Xiong

    Abstract: Image compression for machine and human vision (ICMH) has gained increasing attention in recent years. Existing ICMH methods are limited by high training and storage overheads due to heavy design of task-specific networks. To address this issue, in this paper, we develop a novel lightweight adapter-based tuning framework for ICMH, named Adapt-ICMH, that better balances task performance and bitrate…

    Submitted 13 July, 2024; originally announced July 2024.

    Comments: Accepted by ECCV2024, project: https://github.com/qingshi9974/ECCV2024-AdpatICMH

  2. arXiv:2407.08340  [pdf, other]

    cs.LG

    SLRL: Structured Latent Representation Learning for Multi-view Clustering

    Authors: Zhangci Xiong, Meng Cao

    Abstract: In recent years, Multi-View Clustering (MVC) has attracted increasing attention for its potential to reduce the annotation burden associated with large datasets. The aim of MVC is to exploit the inherent consistency and complementarity among different views, thereby integrating information from multiple perspectives to improve clustering outcomes. Despite extensive research in MVC, most existing…

    Submitted 11 July, 2024; originally announced July 2024.

  3. arXiv:2407.05097  [pdf, other]

    physics.optics

    $\mathcal{PT}$-symmetric photonic lattices with type-II Dirac cones

    Authors: Qian Tang, Milivoj R. Belić, Hua Zhong, Meng Cao, Yongdong Li, Yiqi Zhang

    Abstract: The type-II Dirac cone is a special feature of the band structure, whose Fermi level is represented by a pair of crossing lines. It has been demonstrated that such a structure is useful for investigating topological edge solitons, and more specifically, for mimicking the Klein tunneling. However, it is still not clear what the interplay between type-II Dirac cones and the non-Hermiticity mechanism…

    Submitted 6 July, 2024; originally announced July 2024.

    Comments: 5 pages, 4 figures, to appear in Optics Letters. Comments are welcome

  4. arXiv:2407.02787  [pdf]

    physics.optics quant-ph

    A versatile quantum microwave photonic signal processing platform based on coincidence window selection technique

    Authors: Xinghua Li, Yifan Guo, Xiao Xiang, Runai Quan, Mingtao Cao, Ruifang Dong, Tao Liu, Ming Li, Shougang Zhang

    Abstract: Quantum microwave photonics (QMWP) is an innovative approach that combines energy-time entangled biphoton sources as the optical carrier with time-correlated single-photon detection for high-speed RF signal recovery. This groundbreaking method offers unique advantages such as nonlocal RF signal encoding and robust resistance to dispersion-induced frequency fading. This paper explores the versatili…

    Submitted 2 July, 2024; originally announced July 2024.

  5. arXiv:2407.02774  [pdf]

    physics.optics quant-ph

    Quantum microwave photonic mixer with a large spurious-free dynamic range

    Authors: Xinghua Li, Yifan Guo, Xiao Xiang, Runai Quan, Mingtao Cao, Ruifang Dong, Tao Liu, Ming Li, Shougang Zhang

    Abstract: As one of the most fundamental functionalities of microwave photonics, microwave frequency mixing plays an essential role in modern radars and wireless communication systems. However, the commonly utilized intensity modulation in the systems often leads to inadequate spurious-free dynamic range (SFDR) for many sought-after applications. Quantum microwave photonics technique offers a promising solu…

    Submitted 2 July, 2024; originally announced July 2024.

  6. arXiv:2406.16935  [pdf, other]

    eess.SP cs.AI

    Benchmarking Out-of-Distribution Generalization Capabilities of DNN-based Encoding Models for the Ventral Visual Cortex

    Authors: Spandan Madan, Will Xiao, Mingran Cao, Hanspeter Pfister, Margaret Livingstone, Gabriel Kreiman

    Abstract: We characterized the generalization capabilities of DNN-based encoding models when predicting neuronal responses from the visual cortex. We collected \textit{MacaqueITBench}, a large-scale dataset of neural population responses from the macaque inferior temporal (IT) cortex to over $300,000$ images, comprising $8,233$ unique natural images presented to seven monkeys over $109$ sessions. Using \tex…

    Submitted 16 June, 2024; originally announced June 2024.

  7. arXiv:2406.14887  [pdf, other]

    cs.CL

    InternLM-Law: An Open Source Chinese Legal Large Language Model

    Authors: Zhiwei Fei, Songyang Zhang, Xiaoyu Shen, Dawei Zhu, Xiao Wang, Maosong Cao, Fengzhe Zhou, Yining Li, Wenwei Zhang, Dahua Lin, Kai Chen, Jidong Ge

    Abstract: While large language models (LLMs) have showcased impressive capabilities, they struggle with addressing legal queries due to the intricate complexities and specialized expertise required in the legal field. In this paper, we introduce InternLM-Law, a specialized LLM tailored for addressing diverse legal queries related to Chinese laws, spanning from responding to standard legal questions (e.g., l…

    Submitted 21 June, 2024; originally announced June 2024.

    Comments: Our dataset, code and models will be released at https://github.com/InternLM/InternLM-Law

  8. arXiv:2406.14060  [pdf, ps, other]

    math.OC

    Distributed Event-Triggered Bandit Convex Optimization with Time-Varying Constraints

    Authors: Kunpeng Zhang, Xinlei Yi, Guanghui Wen, Ming Cao, Karl H. Johansson, Tianyou Chai, Tao Yang

    Abstract: This paper considers the distributed bandit convex optimization problem with time-varying inequality constraints over a network of agents, where the goal is to minimize network regret and cumulative constraint violation. Existing distributed online algorithms require that each agent broadcasts its decision to its neighbors at each iteration. To better utilize the limited communication resources, w…

    Submitted 20 June, 2024; originally announced June 2024.

    Comments: 34 pages, 4 figures. arXiv admin note: text overlap with arXiv:2311.01957

  9. arXiv:2406.12703  [pdf, other]

    eess.IV cs.CV

    Coarse-Fine Spectral-Aware Deformable Convolution For Hyperspectral Image Reconstruction

    Authors: Jincheng Yang, Lishun Wang, Miao Cao, Huan Wang, Yinping Zhao, Xin Yuan

    Abstract: We study the inverse problem of Coded Aperture Snapshot Spectral Imaging (CASSI), which captures a spatial-spectral data cube using snapshot 2D measurements and uses algorithms to reconstruct 3D hyperspectral images (HSI). However, current methods based on Convolutional Neural Networks (CNNs) struggle to capture long-range dependencies and non-local similarities. The recently popular Transformer-b…

    Submitted 18 June, 2024; originally announced June 2024.

    Comments: 7 pages, 5 figures, Accepted by ICIP2024

  10. arXiv:2406.10318  [pdf, other]

    cs.CV cs.AI

    Creating a Lens of Chinese Culture: A Multimodal Dataset for Chinese Pun Rebus Art Understanding

    Authors: Tuo Zhang, Tiantian Feng, Yibin Ni, Mengqin Cao, Ruying Liu, Katharine Butler, Yanjun Weng, Mi Zhang, Shrikanth S. Narayanan, Salman Avestimehr

    Abstract: Large vision-language models (VLMs) have demonstrated remarkable abilities in understanding everyday content. However, their performance in the domain of art, particularly culturally rich art forms, remains less explored. As a pearl of human wisdom and creativity, art encapsulates complex cultural narratives and symbolism. In this paper, we offer the Pun Rebus Art Dataset, a multimodal dataset for…

    Submitted 14 June, 2024; originally announced June 2024.

  11. arXiv:2406.09838  [pdf, other]

    cs.CV cs.AI

    Vision-Language Models Meet Meteorology: Developing Models for Extreme Weather Events Detection with Heatmaps

    Authors: Jian Chen, Peilin Zhou, Yining Hua, Dading Chong, Meng Cao, Yaowei Li, Zixuan Yuan, Bing Zhu, Junwei Liang

    Abstract: Real-time detection and prediction of extreme weather protect human lives and infrastructure. Traditional methods rely on numerical threshold setting and manual interpretation of weather heatmaps with Geographic Information Systems (GIS), which can be slow and error-prone. Our research redefines Extreme Weather Events Detection (EWED) by framing it as a Visual Question Answering (VQA) problem, the…

    Submitted 14 June, 2024; originally announced June 2024.

  12. arXiv:2406.06329  [pdf, other]

    cs.CL eess.AS

    A Parameter-efficient Language Extension Framework for Multilingual ASR

    Authors: Wei Liu, Jingyong Hou, Dong Yang, Muyong Cao, Tan Lee

    Abstract: Covering all languages with a multilingual speech recognition model (MASR) is very difficult. Performing language extension on top of an existing MASR is a desirable choice. In this study, the MASR continual learning problem is probabilistically decomposed into language identity prediction (LP) and cross-lingual adaptation (XLA) sub-problems. Based on this, we propose an architecture-based framewo…

    Submitted 10 June, 2024; originally announced June 2024.

    Comments: Accepted by Interspeech 2024

  13. arXiv:2405.20607  [pdf, other]

    cs.CV

    Textual Inversion and Self-supervised Refinement for Radiology Report Generation

    Authors: Yuanjiang Luo, Hongxiang Li, Xuan Wu, Meng Cao, Xiaoshuang Huang, Zhihong Zhu, Peixi Liao, Hu Chen, Yi Zhang

    Abstract: Existing mainstream approaches follow the encoder-decoder paradigm for generating radiology reports. They focus on improving the network structure of encoders and decoders, which leads to two shortcomings: overlooking the modality gap and ignoring report content constraints. In this paper, we propose Textual Inversion and Self-supervised Refinement (TISR) to address the above two issues. Specific…

    Submitted 6 June, 2024; v1 submitted 30 May, 2024; originally announced May 2024.

    Comments: This paper has been early accepted by MICCAI 2024!

  14. arXiv:2405.19689  [pdf, other]

    cs.CV cs.IR

    Uncertainty-aware sign language video retrieval with probability distribution modeling

    Authors: Xuan Wu, Hongxiang Li, Yuanjiang Luo, Xuxin Cheng, Xianwei Zhuang, Meng Cao, Keren Fu

    Abstract: Sign language video retrieval plays a key role in facilitating information access for the deaf community. Despite significant advances in video-text retrieval, the complexity and inherent uncertainty of sign language preclude the direct application of these techniques. Previous methods achieve the mapping between sign language video and text through fine-grained modal alignment. However, due to th…

    Submitted 30 May, 2024; originally announced May 2024.

  15. arXiv:2405.19465  [pdf, other]

    cs.CV

    RAP: Efficient Text-Video Retrieval with Sparse-and-Correlated Adapter

    Authors: Meng Cao, Haoran Tang, Jinfa Huang, Peng Jin, Can Zhang, Ruyang Liu, Long Chen, Xiaodan Liang, Li Yuan, Ge Li

    Abstract: Text-Video Retrieval (TVR) aims to align relevant video content with natural language queries. To date, most state-of-the-art TVR methods learn image-to-video transfer learning based on large-scale pre-trained vision-language models (e.g., CLIP). However, fully fine-tuning these pre-trained models for TVR incurs prohibitively expensive computation costs. To this end, we propose to conduct efficient…

    Submitted 29 May, 2024; originally announced May 2024.

    Comments: Accepted by ACL 2024 Findings

  16. arXiv:2405.18969  [pdf, ps, other]

    eess.SY

    Global and local observability of hypergraphs

    Authors: Chencheng Zhang, Hao Yang, Shaoxuan Cui, Bin Jiang, Ming Cao

    Abstract: This paper studies observability for non-uniform hypergraphs with inputs and outputs. To capture higher-order interactions, we define a canonical non-homogeneous dynamical system with nonlinear outputs on hypergraphs. We then construct algebraic necessary and sufficient conditions based on polynomial ideals and varieties for global observability at an initial state of hypergraphs. An example is gi…

    Submitted 29 May, 2024; originally announced May 2024.

  17. arXiv:2405.18333  [pdf, other]

    eess.SY

    On the analysis of a higher-order Lotka-Volterra model: an application of S-tensors and the polynomial complementarity problem

    Authors: Shaoxuan Cui, Qi Zhao, Guofeng Zhang, Hildeberto Jardón-Kojakhmetov, Ming Cao

    Abstract: It is known that the effect of species' density on species' growth is non-additive in real ecological systems. This challenges the conventional Lotka-Volterra model, where the interactions are always pairwise and their effects are additive. To address this challenge, we introduce HOIs (Higher-Order Interactions) which are able to capture, for example, the indirect effect of one species on a second…

    Submitted 8 July, 2024; v1 submitted 28 May, 2024; originally announced May 2024.

  18. arXiv:2405.13865  [pdf, other]

    cs.CV

    ReVideo: Remake a Video with Motion and Content Control

    Authors: Chong Mou, Mingdeng Cao, Xintao Wang, Zhaoyang Zhang, Ying Shan, Jian Zhang

    Abstract: Despite significant advancements in video generation and editing using diffusion models, achieving accurate and localized video editing remains a substantial challenge. Additionally, most existing video editing methods primarily focus on altering visual content, with limited research dedicated to motion editing. In this paper, we present a novel attempt to Remake a Video (ReVideo) which stands out…

    Submitted 22 May, 2024; originally announced May 2024.

  19. arXiv:2405.10915  [pdf, other]

    math.DS

    Strategic control for a Boltzmann like decision-making model

    Authors: Luis Guillermo Venegas-Pineda, Hildeberto Jardón-Kojakhmetov, Maximilian Engel, Jobst Heitzig, Muhittin Cenk Eser, Ming Cao

    Abstract: We study a prototypical non-polynomial decision-making model for which agents in a population potentially alternate between two consumption strategies, one related to the exploitation of an unlimited but considerably expensive resource and the other a comparably cheaper but restricted and slowly renewable source. In particular, we study a model following a Boltzmann-like exploration policy, enhanc…

    Submitted 17 May, 2024; originally announced May 2024.

    Comments: 40 pages, 20 figures

  20. arXiv:2405.07740  [pdf, ps, other]

    cs.IT

    The $σ$ hulls of matrix-product codes and related entanglement-assisted quantum error-correcting codes

    Authors: Meng Cao

    Abstract: Let $\mathrm{SLAut}(\mathbb{F}_{q}^{n})$ denote the group of all semilinear isometries on $\mathbb{F}_{q}^{n}$, where $q=p^{e}$ is a prime power. Matrix-product (MP) codes are a class of long classical codes generated by combining several commensurate classical codes with a defining matrix. We give an explicit formula for calculating the dimension of the $σ$ hull of a MP code. As a result, we give…

    Submitted 13 May, 2024; originally announced May 2024.

  21. arXiv:2405.02538  [pdf, other]

    cs.CV

    AdaFPP: Adapt-Focused Bi-Propagating Prototype Learning for Panoramic Activity Recognition

    Authors: Meiqi Cao, Rui Yan, Xiangbo Shu, Guangzhao Dai, Yazhou Yao, Guo-Sen Xie

    Abstract: Panoramic Activity Recognition (PAR) aims to identify multi-granularity behaviors performed by multiple persons in panoramic scenes, including individual activities, group activities, and global activities. Previous methods 1) heavily rely on manually annotated detection boxes in training and inference, hindering further practical deployment; or 2) directly employ normal detectors to detect multip…

    Submitted 3 May, 2024; originally announced May 2024.

  22. arXiv:2405.02285  [pdf, ps, other]

    cs.IT

    Special matrices over finite fields and their applications to quantum error-correcting codes

    Authors: Meng Cao

    Abstract: The matrix-product (MP) code $\mathcal{C}_{A,k}:=[\mathcal{C}_{1},\mathcal{C}_{2},\ldots,\mathcal{C}_{k}]\cdot A$ with a non-singular by column (NSC) matrix $A$ plays an important role in constructing good quantum error-correcting codes. In this paper, we study the MP code when the defining matrix $A$ satisfies the condition that $AA^†$ is $(D,τ)$-monomial. We give an explicit formula for calculat…

    Submitted 11 May, 2024; v1 submitted 3 May, 2024; originally announced May 2024.

  23. arXiv:2404.18686  [pdf]

    quant-ph

    Dynamic temperature compensation for wavelength-stable entangled biphoton generation

    Authors: Yuting Liu, Huibo Hong, Xiao Xiang, Runai Quan, Tao Liu, Mingtao Cao, Shougang Zhang, Ruifang Dong

    Abstract: A dynamic temperature compensation method is presented to stabilize the wavelength of the entangled biphoton source, which is generated via the spontaneous parametric down-conversion based on a MgO:PPLN waveguide. Utilizing the dispersive Fourier transformation technique combined with a digital proportional-integral-differential algorithm, the small amount of wavelength variation can be instantly…

    Submitted 29 April, 2024; originally announced April 2024.

  24. arXiv:2404.18106  [pdf, other]

    cs.CV

    Semi-supervised Text-based Person Search

    Authors: Daming Gao, Yang Bai, Min Cao, Hao Dou, Mang Ye, Min Zhang

    Abstract: Text-based person search (TBPS) aims to retrieve images of a specific person from a large image gallery based on a natural language description. Existing methods rely on massive annotated image-text data to achieve satisfactory performance in fully-supervised learning. It poses a significant challenge in practice, as acquiring person images from surveillance videos is relatively easy, while obtain…

    Submitted 28 April, 2024; originally announced April 2024.

    Comments: 13 pages

  25. arXiv:2404.14013  [pdf, ps, other]

    math.CA

    A characterization of compactness via bilinear $T1$ theorem

    Authors: Mingming Cao, Honghai Liu, Zengyan Si, Kôzô Yabuta

    Abstract: We establish a bilinear $T1$ theorem to characterize the weighted compactness of bilinear Calderón--Zygmund operators. Let $T$ be a bilinear operator associated with a standard bilinear Calderón--Zygmund kernel. We demonstrate that $T$ can be extended to a compact bilinear operator from $L^{p_1}(w_1^{p_1}) \times L^{p_2}(w_2^{p_2})$ to $L^p(w^p)$ for all exponents…

    Submitted 22 April, 2024; originally announced April 2024.

    Comments: This is just a draft, but we post the file in its current form now, in response to several queries about the result and method. Eventually, these results will be a part of a more extensive work about compactness of bilinear singular integrals

    MSC Class: 42B20; 42B35

  26. arXiv:2404.09842  [pdf, other]

    cs.CV

    STMixer: A One-Stage Sparse Action Detector

    Authors: Tao Wu, Mengqi Cao, Ziteng Gao, Gangshan Wu, Limin Wang

    Abstract: Traditional video action detectors typically adopt the two-stage pipeline, where a person detector is first employed to generate actor boxes and then 3D RoIAlign is used to extract actor-specific features for classification. This detection paradigm requires multi-stage training and inference, and the feature sampling is constrained inside the box, failing to effectively leverage richer context inf…

    Submitted 15 April, 2024; originally announced April 2024.

    Comments: Extended version of the paper arXiv:2303.15879 presented at CVPR 2023. Accepted by TPAMI 2024

  27. arXiv:2404.06784  [pdf]

    quant-ph cond-mat.mes-hall cs.AR eess.SY

    Statistical evaluation of 571 GaAs quantum point contact transistors showing the 0.7 anomaly in quantized conductance using millikelvin cryogenic on-chip multiplexing

    Authors: Pengcheng Ma, Kaveh Delfanazari, Reuben K. Puddy, Jiahui Li, Moda Cao, Teng Yi, Jonathan P. Griffiths, Harvey E. Beere, David A. Ritchie, Michael J. Kelly, Charles G. Smith

    Abstract: The mass production and the practical number of cryogenic quantum devices producible in a single chip are limited to the number of electrical contact pads and wiring of the cryostat or dilution refrigerator. It is, therefore, beneficial to contrast the measurements of hundreds of devices fabricated in a single chip in one cooldown process to promote the scalability, integrability, reliability, and…

    Submitted 10 April, 2024; originally announced April 2024.

  28. arXiv:2404.06350  [pdf, other]

    cs.CV

    Rolling Shutter Correction with Intermediate Distortion Flow Estimation

    Authors: Mingdeng Cao, Sidi Yang, Yujiu Yang, Yinqiang Zheng

    Abstract: This paper proposes to correct the rolling shutter (RS) distorted images by estimating the distortion flow from the global shutter (GS) to RS directly. Existing methods usually perform correction using the undistortion flow from the RS to GS. They initially predict the flow from consecutive RS frames, subsequently rescaling it as the displacement fields from the RS frame to the underlying GS image…

    Submitted 9 April, 2024; originally announced April 2024.

    Comments: CVPR2024

  29. arXiv:2404.02845  [pdf, other]

    cs.CV

    Cross-Modal Conditioned Reconstruction for Language-guided Medical Image Segmentation

    Authors: Xiaoshuang Huang, Hongxiang Li, Meng Cao, Long Chen, Chenyu You, Dong An

    Abstract: Recent developments underscore the potential of textual information in enhancing learning models for a deeper understanding of medical visual semantics. However, language-guided medical image segmentation still faces a challenging issue. Previous works employ implicit and ambiguous architectures to embed textual information. This leads to segmentation results that are inconsistent with the semanti…

    Submitted 7 July, 2024; v1 submitted 3 April, 2024; originally announced April 2024.

  30. arXiv:2403.19238  [pdf, other]

    cs.CV cs.AI eess.IV

    Taming Lookup Tables for Efficient Image Retouching

    Authors: Sidi Yang, Binxiao Huang, Mingdeng Cao, Yatai Ji, Hanzhong Guo, Ngai Wong, Yujiu Yang

    Abstract: The widespread use of high-definition screens in edge devices, such as end-user cameras, smartphones, and televisions, is spurring a significant demand for image enhancement. Existing enhancement models often optimize for high performance while falling short of reducing hardware inference time and power consumption, especially on edge devices with constrained computing and storage resources. To th…

    Submitted 13 July, 2024; v1 submitted 28 March, 2024; originally announced March 2024.

    Comments: Accepted by ECCV2024

  31. arXiv:2403.18167  [pdf, other]

    cs.CL cs.AI

    Mechanistic Understanding and Mitigation of Language Model Non-Factual Hallucinations

    Authors: Lei Yu, Meng Cao, Jackie Chi Kit Cheung, Yue Dong

    Abstract: State-of-the-art language models (LMs) sometimes generate non-factual hallucinations that misalign with world knowledge. To explore the mechanistic causes of these hallucinations, we create diagnostic datasets with subject-relation queries and adapt interpretability methods to trace hallucinations through internal model representations. We discover two general and distinct mechanistic causes of ha…

    Submitted 17 June, 2024; v1 submitted 26 March, 2024; originally announced March 2024.

  32. arXiv:2403.17297  [pdf, other]

    cs.CL cs.AI

    InternLM2 Technical Report

    Authors: Zheng Cai, Maosong Cao, Haojiong Chen, Kai Chen, Keyu Chen, Xin Chen, Xun Chen, Zehui Chen, Zhi Chen, Pei Chu, Xiaoyi Dong, Haodong Duan, Qi Fan, Zhaoye Fei, Yang Gao, Jiaye Ge, Chenya Gu, Yuzhe Gu, Tao Gui, Aijia Guo, Qipeng Guo, Conghui He, Yingfan Hu, Ting Huang, Tao Jiang, et al. (75 additional authors not shown)

    Abstract: The evolution of Large Language Models (LLMs) like ChatGPT and GPT-4 has sparked discussions on the advent of Artificial General Intelligence (AGI). However, replicating such advancements in open-source models has been challenging. This paper introduces InternLM2, an open-source LLM that outperforms its predecessors in comprehensive evaluations across 6 dimensions and 30 benchmarks, long-context m…

    Submitted 25 March, 2024; originally announced March 2024.

  33. arXiv:2403.15805  [pdf, other]

    cs.RO

    AirCrab: A Hybrid Aerial-Ground Manipulator with An Active Wheel

    Authors: Muqing Cao, Jiayan Zhao, Xinhang Xu, Lihua Xie

    Abstract: Inspired by the behavior of birds, we present AirCrab, a hybrid aerial ground manipulator (HAGM) with a single active wheel and a 3-degree of freedom (3-DoF) manipulator. AirCrab leverages a single point of contact with the ground to reduce position drift and improve manipulation accuracy. The single active wheel enables locomotion on narrow surfaces without adding significant weight to the robot.…

    Submitted 23 March, 2024; originally announced March 2024.

  34. arXiv:2403.14668  [pdf, other]

    cs.CY cs.AI cs.CL cs.LG

    Predicting Learning Performance with Large Language Models: A Study in Adult Literacy

    Authors: Liang Zhang, Jionghao Lin, Conrad Borchers, John Sabatini, John Hollander, Meng Cao, Xiangen Hu

    Abstract: Intelligent Tutoring Systems (ITSs) have significantly enhanced adult literacy training, a key factor for societal participation, employment opportunities, and lifelong learning. Our study investigates the application of advanced AI models, including Large Language Models (LLMs) like GPT-4, for predicting learning performance in adult literacy programs in ITSs. This research is motivated by the po…

    Submitted 4 March, 2024; originally announced March 2024.

    Comments: 26th International Conference on Human-Computer Interaction

  35. arXiv:2403.14416  [pdf, other]

    quant-ph cs.IT

    Quantum Channel Simulation in Fidelity is no more difficult than State Splitting

    Authors: Michael X. Cao, Rahul Jain, Marco Tomamichel

    Abstract: Characterizing the minimal communication needed for the quantum channel simulation is a fundamental task in the quantum information theory. In this paper, we show that, in fidelity, the quantum channel simulation can be directly achieved via quantum state splitting without using a technique known as the de Finetti reduction, and thus provide a pair of tighter one-shot bounds. Using the bounds, we…

    Submitted 24 June, 2024; v1 submitted 21 March, 2024; originally announced March 2024.

  36. arXiv:2403.14173  [pdf, other]

    cs.RO

    HCTO: Optimality-Aware LiDAR Inertial Odometry with Hybrid Continuous Time Optimization for Compact Wearable Mapping System

    Authors: Jianping Li, Shenghai Yuan, Muqing Cao, Thien-Minh Nguyen, Kun Cao, Lihua Xie

    Abstract: Compact wearable mapping systems (WMS) have gained significant attention due to their convenience in various applications. Specifically, they provide an efficient way to collect prior maps for 3D structure inspection and robot-based "last-mile delivery" in complex environments. However, vibrations in human motion and the uneven distribution of point cloud features in complex environments often lead t…

    Submitted 21 March, 2024; originally announced March 2024.

  37. arXiv:2403.13839  [pdf, other]

    cs.LG cs.AI cs.PL

    depyf: Open the Opaque Box of PyTorch Compiler for Machine Learning Researchers

    Authors: Kaichao You, Runsheng Bai, Meng Cao, Jianmin Wang, Ion Stoica, Mingsheng Long

    Abstract: PyTorch \texttt{2.x} introduces a compiler designed to accelerate deep learning programs. However, for machine learning researchers, adapting to the PyTorch compiler to full potential can be challenging. The compiler operates at the Python bytecode level, making it appear as an opaque box. To address this, we introduce \texttt{depyf}, a tool designed to demystify the inner workings of the PyTorch…

    Submitted 14 March, 2024; originally announced March 2024.

    Comments: 16 pages, 2 figures

  38. arXiv:2403.12549  [pdf, other]

    math.CO

    Treewidth of generalized Hamming graph, bipartite Kneser graph and generalized Petersen graph

    Authors: Yichen Wang, Mengyu Cao, Zequn Lv, Mei Lu

    Abstract: Let $t,q$ and $n$ be positive integers. Write $[q] = \{1,2,\ldots,q\}$. The generalized Hamming graph $H(t,q,n)$ is the graph whose vertex set is the cartesian product of $n$ copies of $[q]$$(q\ge 2)$, where two vertices are adjacent if their Hamming distance is at most $t$. In particular, $H(1,q,n)$ is the well-known Hamming graph and $H(1,2,n)$ is the hypercube. In 2006, Chandran and Kavitha des…

    Submitted 19 March, 2024; originally announced March 2024.

  39. arXiv:2403.11183  [pdf, other]

    cs.CL

    Decoding Continuous Character-based Language from Non-invasive Brain Recordings

    Authors: Cenyuan Zhang, Xiaoqing Zheng, Ruicheng Yin, Shujie Geng, Jianhan Xu, Xuan Gao, Changze Lv, Zixuan Ling, Xuanjing Huang, Miao Cao, Jianfeng Feng

    Abstract: Deciphering natural language from brain activity through non-invasive devices remains a formidable challenge. Previous non-invasive decoders either require multiple experiments with identical stimuli to pinpoint cortical regions and enhance signal-to-noise ratios in brain activity, or they are limited to discerning basic linguistic elements such as letters and words. We propose a novel approach to…

    Submitted 19 March, 2024; v1 submitted 17 March, 2024; originally announced March 2024.

  40. arXiv:2403.09323  [pdf, other]

    cs.CV

    E2E-MFD: Towards End-to-End Synchronous Multimodal Fusion Detection

    Authors: Jiaqing Zhang, Mingxiang Cao, Xue Yang, Weiying Xie, Jie Lei, Daixun Li, Wenbo Huang, Yunsong Li

    Abstract: Multimodal image fusion and object detection are crucial for autonomous driving. While current methods have advanced the fusion of texture details and semantic information, their complex training processes hinder broader applications. Addressing this challenge, we introduce E2E-MFD, a novel end-to-end algorithm for multimodal fusion detection. E2E-MFD streamlines the process, achieving high perfor…

    Submitted 23 May, 2024; v1 submitted 14 March, 2024; originally announced March 2024.

  41. arXiv:2403.03416  [pdf, other]

    eess.SY

    On discrete-time polynomial dynamical systems on hypergraphs

    Authors: Shaoxuan Cui, Guofeng Zhang, Hildeberto Jardón-Kojakhmetov, Ming Cao

    Abstract: This paper studies the stability of discrete-time polynomial dynamical systems on hypergraphs by utilizing the Perron-Frobenius theorem for nonnegative tensors with respect to the tensors' Z-eigenvalues and Z-eigenvectors. Firstly, for a multilinear polynomial system on a uniform hypergraph, we study the stability of the origin of the corresponding systems. Next, we extend our results to non-homoge…

    Submitted 5 June, 2024; v1 submitted 5 March, 2024; originally announced March 2024.

    Comments: arXiv admin note: text overlap with arXiv:2401.03652

  42. arXiv:2403.03048  [pdf, other]

    eess.SY cs.CR

    Design of Stochastic Quantizers for Privacy Preservation

    Authors: Le Liu, Yu Kawano, Ming Cao

    Abstract: In this paper, we examine the role of stochastic quantizers for privacy preservation. We first employ a static stochastic quantizer and investigate its corresponding privacy-preserving properties. Specifically, we demonstrate that a sufficiently large quantization step guarantees $(0, δ)$ differential privacy. Additionally, the degradation of control performance caused by quantization is evaluated…

    Submitted 5 March, 2024; originally announced March 2024.

    Comments: 11 pages, 4 figures

  43. arXiv:2403.02146  [pdf, ps, other]

    math.OC math.DS

    Reinforcement Learning for Inverse Non-Cooperative Linear-Quadratic Output-feedback Differential Games

    Authors: Emin Martirosyan, Ming Cao

    Abstract: In this paper, we address the inverse problem for linear-quadratic differential non-cooperative games with output-feedback. Given players' stabilizing feedback laws, the goal is to find cost function parameters that lead to a game for which the observed game dynamics are at a Nash equilibrium. Using the given feedback laws, we introduce a model-based algorithm that generates cost function paramete…

    Submitted 4 March, 2024; originally announced March 2024.

    Comments: This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible

  44. arXiv:2403.01322  [pdf, ps, other]

    math.OC

    A Communication-Efficient Stochastic Gradient Descent Algorithm for Distributed Nonconvex Optimization

    Authors: Antai Xie, Xinlei Yi, Xiaofan Wang, Ming Cao, Xiaoqiang Ren

    Abstract: This paper studies distributed nonconvex optimization problems with stochastic gradients for a multi-agent system, in which each agent aims to minimize the sum of all agents' cost functions by using local compressed information exchange. We propose a distributed stochastic gradient descent (SGD) algorithm, suitable for a general class of compressors. We show that the proposed algorithm achieves th…

    Submitted 2 March, 2024; originally announced March 2024.

  45. arXiv:2403.01225  [pdf, other]

    cs.RO

    A Cost-Effective Cooperative Exploration and Inspection Strategy for Heterogeneous Aerial System

    Authors: Xinhang Xu, Muqing Cao, Shenghai Yuan, Thien Hoang Nguyen, Thien-Minh Nguyen, Lihua Xie

    Abstract: In this paper, we propose a cost-effective strategy for heterogeneous UAV swarm systems for cooperative aerial inspection. Unlike previous swarm inspection works, the proposed method does not rely on precise prior knowledge of the environment and can complete full 3D surface coverage of objects of any shape. In this work, agents are partitioned into teams, with each drone assigned a different task,…

    Submitted 2 March, 2024; originally announced March 2024.

    Comments: Baseline method of CARIC at CDC 2023, Singapore

  46. arXiv:2402.13942  [pdf, other]

    physics.plasm-ph physics.flu-dyn

    The Maintenance of Coherent Vortex Topology by Lagrangian Chaos in Drift-Rossby Wave Turbulence

    Authors: Norman M. Cao, Di Qi

    Abstract: This work introduces the "potential vorticity bucket brigade," a mechanism for explaining the resilience of vortex structures in magnetically confined fusion plasmas and geophysical flows. Drawing parallels with zonal jet formation, we show how inhomogeneous patterns of mixing can reinforce, rather than destroy, non-zonal flow structure. We accomplish this through an exact stochastic Lagrangian rep…

    Submitted 3 June, 2024; v1 submitted 21 February, 2024; originally announced February 2024.

    Journal ref: Physics of Fluids 36, 061701 (2024)

  47. arXiv:2402.13822  [pdf, other]

    cs.CV

    MSTAR: Multi-Scale Backbone Architecture Search for Timeseries Classification

    Authors: Tue M. Cao, Nhat H. Tran, Hieu H. Pham, Hung T. Nguyen, Le P. Nguyen

    Abstract: Most previous approaches to Time Series Classification (TSC) highlight the significance of receptive fields and frequencies while overlooking the time resolution. Hence, they unavoidably suffer from scalability issues, as they integrate an extensive range of receptive fields into classification models. Other methods, while adapting better to large datasets, require manual design an…

    Submitted 21 February, 2024; originally announced February 2024.

  48. arXiv:2402.11907  [pdf, other]

    cs.CL

    Direct Large Language Model Alignment Through Self-Rewarding Contrastive Prompt Distillation

    Authors: Aiwei Liu, Haoping Bai, Zhiyun Lu, Xiang Kong, Simon Wang, Jiulong Shan, Meng Cao, Lijie Wen

    Abstract: Aligning large language models (LLMs) with human expectations without human-annotated preference data is an important problem. In this paper, we propose a method to evaluate the response preference by using the output probabilities of response pairs under contrastive prompt pairs, which could achieve better performance on LLaMA2-7B and LLaMA2-13B compared to RLAIF. Based on this, we propose an aut…

    Submitted 19 February, 2024; originally announced February 2024.

    Comments: 24 pages, 5 pages

    MSC Class: 68T50 ACM Class: I.2.7

  49. arXiv:2402.09752  [pdf]

    physics.optics eess.SY physics.app-ph quant-ph

    Vector spectrometer with Hertz-level resolution and super-recognition capability

    Authors: Ting Qing, Shupeng Li, Huashan Yang, Lihan Wang, Yijie Fang, Xiaohu Tang, Meihui Cao, Jianming Lu, Jijun He, Junqiu Liu, Yueguang Lyu, Shilong Pan

    Abstract: High-resolution optical spectrometers are crucial in revealing intricate characteristics of signals, determining laser frequencies, measuring physical constants, identifying substances, and advancing biosensing applications. Conventional spectrometers, however, often grapple with inherent trade-offs among spectral resolution, wavelength range, and accuracy. Furthermore, even at high resolution, re…

    Submitted 6 March, 2024; v1 submitted 15 February, 2024; originally announced February 2024.

    Comments: 21 pages, 6 figures

  50. arXiv:2402.05431  [pdf, ps, other]

    quant-ph

    Dynamical quantum state tomography with time-dependent channels

    Authors: Meng Cao, Yu Wang

    Abstract: In this paper, we establish a dynamical quantum state tomography framework. Under this framework, it is feasible to obtain complete knowledge of any unknown state of a $d$-level system via only an arbitrary operator of certain types of IC-POVMs in dimension $d$. We show that under the time-dependent average channel, we can acquire a collection of projective operators that is informationally comple…

    Submitted 8 February, 2024; originally announced February 2024.

    Comments: 23 pages, 1 table