Showing 1–50 of 539 results for author: Xie, S

  1. arXiv:2407.11036  [pdf, other]

    cs.AI cs.NI

    Hybrid-Generative Diffusion Models for Attack-Oriented Twin Migration in Vehicular Metaverses

    Authors: Yingkai Kang, Jinbo Wen, Jiawen Kang, Tao Zhang, Hongyang Du, Dusit Niyato, Rong Yu, Shengli Xie

    Abstract: The vehicular metaverse is envisioned as a blended immersive domain that promises to bring revolutionary changes to the automotive industry. As a core component of vehicular metaverses, Vehicle Twins (VTs) are digital twins that cover the entire life cycle of vehicles, providing immersive virtual services for Vehicular Metaverse Users (VMUs). Vehicles with limited resources offload the computation…

    Submitted 5 July, 2024; originally announced July 2024.

  2. arXiv:2407.10342  [pdf]

    physics.app-ph cond-mat.mtrl-sci

    Demonstration of Si-doped Al-rich thin regrown Al(Ga)N films on AlN on sapphire templates with $\gt10^{15}/cm^3$ free carrier concentration using close-coupled showerhead MOCVD reactor

    Authors: Swarnav Mukhopadhyay, Parthasarathy Seshadri, Mobinul Haque, Shuwen Xie, Ruixin Bai, Surjava Sanyal, Guangying Wang, Chirag Gupta, Shubhra S. Pasayat

    Abstract: Thin Si-doped Al-rich (Al>0.85) regrown Al(Ga)N layers were deposited on AlN on Sapphire template using metal-organic chemical vapor deposition (MOCVD) techniques. The optimization of the deposition conditions such as temperature, V/III ratio, deposition rate, and Si concentration resulted in a high charge carrier concentration (>$10^{15}/cm^{3}$) in the Si-doped Al-rich Al(Ga)N films. A pulsed de…

    Submitted 14 July, 2024; originally announced July 2024.

    Comments: 13 pages, 5 figures

  3. arXiv:2407.09928  [pdf, other]

    physics.ins-det hep-ex

    Results for pixel and strip centimeter-scale AC-LGAD sensors with a 120 GeV proton beam

    Authors: Irene Dutta, Christopher Madrid, Ryan Heller, Shirsendu Nanda, Danush Shekar, Claudio San Martín, Matías Barría, Artur Apresyan, Zhenyu Ye, William K. Brooks, Wei Chen, Gabriele D'Amen, Gabriele Giacomini, Alessandro Tricoli, Aram Hayrapetyan, Hakseong Lee, Ohannes Kamer Köseyan, Sergey Los, Koji Nakamura, Sayuka Kita, Tomoka Imamura, Cristían Peña, Si Xie

    Abstract: We present the results of an extensive evaluation of strip and pixel AC-LGAD sensors tested with a 120 GeV proton beam, focusing on the influence of design parameters on the sensor temporal and spatial resolutions. Results show that reducing the thickness of pixel sensors significantly enhances their time resolution, with 20 $μ$m-thick sensors achieving around 20 ps. Uniform performance is attaina…

    Submitted 13 July, 2024; originally announced July 2024.

  4. arXiv:2407.05125  [pdf, other]

    cs.DC cs.LG

    A Joint Approach to Local Updating and Gradient Compression for Efficient Asynchronous Federated Learning

    Authors: Jiajun Song, Jiajun Luo, Rongwei Lu, Shuzhao Xie, Bin Chen, Zhi Wang

    Abstract: Asynchronous Federated Learning (AFL) confronts inherent challenges arising from the heterogeneity of devices (e.g., their computation capacities) and low-bandwidth environments, both potentially causing stale model updates (e.g., local gradients) for global aggregation. Traditional approaches mitigating the staleness of updates typically focus on either adjusting the local updating or gradient co…

    Submitted 6 July, 2024; originally announced July 2024.

  5. arXiv:2407.04929  [pdf, other]

    cs.RO

    Toward Precise Robotic Weed Flaming Using a Mobile Manipulator with a Flamethrower

    Authors: Di Wang, Chengsong Hu, Shuangyu Xie, Joe Johnson, Hojun Ji, Yingtao Jiang, Muthukumar Bagavathiannan, Dezhen Song

    Abstract: Robotic weed flaming is a new and environmentally friendly approach to weed removal in the agricultural field. Using a mobile manipulator equipped with a flamethrower, we design a new system and algorithm to enable effective weed flaming, which requires robotic manipulation with a soft and deformable end effector, as the thermal coverage of the flame is affected by dynamic or unknown environmental…

    Submitted 5 July, 2024; originally announced July 2024.

    Comments: IROS 2024

  6. arXiv:2407.04675  [pdf, other]

    eess.AS cs.SD

    Seed-ASR: Understanding Diverse Speech and Contexts with LLM-based Speech Recognition

    Authors: Ye Bai, Jingping Chen, Jitong Chen, Wei Chen, Zhuo Chen, Chuang Ding, Linhao Dong, Qianqian Dong, Yujiao Du, Kepan Gao, Lu Gao, Yi Guo, Minglun Han, Ting Han, Wenchao Hu, Xinying Hu, Yuxiang Hu, Deyu Hua, Lu Huang, Mingkun Huang, Youjia Huang, Jishuo Jin, Fanliu Kong, Zongwei Lan, Tianyu Li, et al. (30 additional authors not shown)

    Abstract: Modern automatic speech recognition (ASR) model is required to accurately transcribe diverse speech signals (from different domains, languages, accents, etc) given the specific contextual information in various application scenarios. Classic end-to-end models fused with extra language models perform well, but mainly in data matching scenarios and are gradually approaching a bottleneck. In this wor…

    Submitted 10 July, 2024; v1 submitted 5 July, 2024; originally announced July 2024.

  7. arXiv:2407.02376  [pdf, other]

    astro-ph.HE

    A new subclass of gamma-ray burst originating from compact binary merger

    Authors: Chen-Wei Wang, Wen-Jun Tan, Shao-Lin Xiong, Shu-Xu Yi, Rahim Moradi, Bing Li, Zhen Zhang, Yu Wang, Yan-Zhi Meng, Jia-Cong Liu, Yue Wang, Sheng-Lun Xie, Wang-Chen Xue, Zheng-Hang Yu, Peng Zhang, Wen-Long Zhang, Yan-Qiu Zhang, Chao Zheng

    Abstract: Type I gamma-ray bursts (GRBs) are believed to originate from compact binary merger usually with duration less than 2 seconds for the main emission. However, recent observations of GRB 211211A and GRB 230307A indicate that some merger-origin GRBs could last much longer. Since they show strikingly similar properties (indicating a common mechanism) which are different from the classic "long"-short b…

    Submitted 2 July, 2024; originally announced July 2024.

  8. arXiv:2407.01003  [pdf, other]

    cs.CV cs.AI

    Embedded Prompt Tuning: Towards Enhanced Calibration of Pretrained Models for Medical Images

    Authors: Wenqiang Zu, Shenghao Xie, Qing Zhao, Guoqi Li, Lei Ma

    Abstract: Foundation models pre-trained on large-scale data have been widely witnessed to achieve success in various natural imaging downstream tasks. Parameter-efficient fine-tuning (PEFT) methods aim to adapt foundation models to new domains by updating only a small portion of parameters in order to reduce computational overhead. However, the effectiveness of these PEFT methods, especially in cross-domain…

    Submitted 2 July, 2024; v1 submitted 1 July, 2024; originally announced July 2024.

    Comments: 16 pages, 7 figures. arXiv admin note: text overlap with arXiv:2306.09579, arXiv:2203.12119 by other authors

  9. arXiv:2406.19853  [pdf, other]

    cs.CL cs.AI

    YuLan: An Open-source Large Language Model

    Authors: Yutao Zhu, Kun Zhou, Kelong Mao, Wentong Chen, Yiding Sun, Zhipeng Chen, Qian Cao, Yihan Wu, Yushuo Chen, Feng Wang, Lei Zhang, Junyi Li, Xiaolei Wang, Lei Wang, Beichen Zhang, Zican Dong, Xiaoxue Cheng, Yuhan Chen, Xinyu Tang, Yupeng Hou, Qiangqiang Ren, Xincheng Pang, Shufang Xie, Wayne Xin Zhao, Zhicheng Dou, et al. (13 additional authors not shown)

    Abstract: Large language models (LLMs) have become the foundation of many applications, leveraging their extensive capabilities in processing and understanding natural language. While many open-source LLMs have been released with technical reports, the lack of training details hinders further research and development. This paper presents the development of YuLan, a series of open-source LLMs with $12$ billi…

    Submitted 28 June, 2024; originally announced June 2024.

  10. arXiv:2406.19114  [pdf, ps, other]

    math.CV math.AG math.PR

    Hole probabilities of random zeros on compact Riemann surfaces

    Authors: Hao Wu, Song-Yan Xie

    Abstract: We establish a convergence speed estimate for hole probabilities of zeros of random holomorphic sections on compact Riemann surfaces.

    Submitted 1 July, 2024; v1 submitted 27 June, 2024; originally announced June 2024.

    MSC Class: 31A15; 32C30; 32L10; 60B10; 60D05

  11. arXiv:2406.18533  [pdf, other]

    cs.CV

    On Scaling Up 3D Gaussian Splatting Training

    Authors: Hexu Zhao, Haoyang Weng, Daohan Lu, Ang Li, Jinyang Li, Aurojit Panda, Saining Xie

    Abstract: 3D Gaussian Splatting (3DGS) is increasingly popular for 3D reconstruction due to its superior visual quality and rendering speed. However, 3DGS training currently occurs on a single GPU, limiting its ability to handle high-resolution and large-scale 3D reconstruction tasks due to memory constraints. We introduce Grendel, a distributed system designed to partition 3DGS parameters and parallelize c…

    Submitted 26 June, 2024; originally announced June 2024.

    Comments: Code: https://github.com/nyu-systems/Grendel-GS ; Project page: https://daohanlu.github.io/scaling-up-3dgs

    ACM Class: I.4.5

  12. arXiv:2406.16860  [pdf, other]

    cs.CV

    Cambrian-1: A Fully Open, Vision-Centric Exploration of Multimodal LLMs

    Authors: Shengbang Tong, Ellis Brown, Penghao Wu, Sanghyun Woo, Manoj Middepogu, Sai Charitha Akula, Jihan Yang, Shusheng Yang, Adithya Iyer, Xichen Pan, Austin Wang, Rob Fergus, Yann LeCun, Saining Xie

    Abstract: We introduce Cambrian-1, a family of multimodal LLMs (MLLMs) designed with a vision-centric approach. While stronger language models can enhance multimodal capabilities, the design choices for vision components are often insufficiently explored and disconnected from visual representation learning research. This gap hinders accurate sensory grounding in real-world scenarios. Our study uses LLMs and…

    Submitted 24 June, 2024; originally announced June 2024.

    Comments: Website at https://cambrian-mllm.github.io

  13. arXiv:2406.15261  [pdf, other]

    cond-mat.mtrl-sci cond-mat.mes-hall

    Tailored topotactic chemistry unlocks heterostructures of magnetic intercalation compounds

    Authors: Samra Husremović, Oscar Gonzalez, Berit H. Goodge, Lilia S. Xie, Zhizhi Kong, Wanlin Zhang, Sae Hee Ryu, Stephanie M. Ribet, Karen C. Bustillo, Chengyu Song, Jim Ciston, Takashi Taniguchi, Kenji Watanabe, Colin Ophus, Chris Jozwiak, Aaron Bostwick, Eli Rotenberg, D. Kwabena Bediako

    Abstract: The construction of thin film heterostructures has been a widely successful archetype for fabricating materials with emergent physical properties. This strategy is of particular importance for the design of multilayer magnetic architectures in which direct interfacial spin--spin interactions between magnetic phases in dissimilar layers lead to emergent and controllable magnetic behavior. However,…

    Submitted 21 June, 2024; originally announced June 2024.

  14. arXiv:2406.06258  [pdf, other]

    cs.CV

    Tuning-Free Visual Customization via View Iterative Self-Attention Control

    Authors: Xiaojie Li, Chenghao Gu, Shuzhao Xie, Yunpeng Bai, Weixiang Zhang, Zhi Wang

    Abstract: Fine-Tuning Diffusion Models enable a wide range of personalized generation and editing applications on diverse visual modalities. While Low-Rank Adaptation (LoRA) accelerates the fine-tuning process, it still requires multiple reference images and time-consuming training, which constrains its scalability for large-scale and real-time applications. In this paper, we propose \textit{View Iterative…

    Submitted 10 June, 2024; v1 submitted 10 June, 2024; originally announced June 2024.

    Comments: Under review

  15. arXiv:2406.05675  [pdf, ps, other]

    math.CO

    Finding irregular subgraphs via local adjustments

    Authors: Jie Ma, Shengjie Xie

    Abstract: For a graph $H$, let $m(H,k)$ denote the number of vertices of degree $k$ in $H$. A conjecture of Alon and Wei states that for any $d\geq 3$, every $n$-vertex $d$-regular graph contains a spanning subgraph $H$ satisfying $|m(H,k)-\frac{n}{d+1}|\leq 2$ for every $0\leq k \leq d$. This holds easily when $d\leq 2$. An asymptotic version of this conjecture was initially established by Frieze, Gould, K…

    Submitted 9 June, 2024; originally announced June 2024.

  16. arXiv:2406.04790  [pdf, ps, other]

    math.AP

    On location of maximal gradient of torsion function over some non-symmetric planar domains

    Authors: Qinfeng Li, Shuangquan Xie, Hang Yang, Ruofei Yao

    Abstract: We investigate the location of the maximal gradient of the torsion function on some non-symmetric planar domains. First, for triangles, by reflection method, we show that the maximal gradient of the torsion function always occurs on the longest sides, lying between the foot of the altitude and the middle point. Moreover, via nodal line analysis and continuity method, we demonstrate that restricted…

    Submitted 14 June, 2024; v1 submitted 7 June, 2024; originally announced June 2024.

  17. arXiv:2406.02470  [pdf, other]

    quant-ph cs.LG

    Meta-Designing Quantum Experiments with Language Models

    Authors: Sören Arlt, Haonan Duan, Felix Li, Sang Michael Xie, Yuhuai Wu, Mario Krenn

    Abstract: Artificial Intelligence (AI) has the potential to significantly advance scientific discovery by finding solutions beyond human capabilities. However, these super-human solutions are often unintuitive and require considerable effort to uncover underlying principles, if possible at all. Here, we show how a code-generating language model trained on synthetic data can not only find solutions to specif…

    Submitted 4 June, 2024; originally announced June 2024.

    Comments: 10+3 pages, 5 figures

  18. arXiv:2406.00036  [pdf, other]

    cs.CL cs.AI cs.LG

    EMERGE: Integrating RAG for Improved Multimodal EHR Predictive Modeling

    Authors: Yinghao Zhu, Changyu Ren, Zixiang Wang, Xiaochen Zheng, Shiyun Xie, Junlan Feng, Xi Zhu, Zhoujun Li, Liantao Ma, Chengwei Pan

    Abstract: The integration of multimodal Electronic Health Records (EHR) data has notably advanced clinical predictive capabilities. However, current models that utilize clinical notes and multivariate time-series EHR data often lack the necessary medical context for precise clinical tasks. Previous methods using knowledge graphs (KGs) primarily focus on structured knowledge extraction. To address this, we p…

    Submitted 27 May, 2024; originally announced June 2024.

    Comments: arXiv admin note: text overlap with arXiv:2402.07016

  19. arXiv:2405.20774  [pdf, other]

    cs.CR cs.AI

    Exploring Backdoor Attacks against Large Language Model-based Decision Making

    Authors: Ruochen Jiao, Shaoyuan Xie, Justin Yue, Takami Sato, Lixu Wang, Yixuan Wang, Qi Alfred Chen, Qi Zhu

    Abstract: Large Language Models (LLMs) have shown significant promise in decision-making tasks when fine-tuned on specific applications, leveraging their inherent common sense and reasoning abilities learned from vast amounts of data. However, these systems are exposed to substantial safety and security risks during the fine-tuning phase. In this work, we propose the first comprehensive framework for Backdo…

    Submitted 27 May, 2024; originally announced May 2024.

    Comments: 27 pages, including main paper, references, and appendix

  20. arXiv:2405.20325  [pdf, other]

    cs.CV

    MotionFollower: Editing Video Motion via Lightweight Score-Guided Diffusion

    Authors: Shuyuan Tu, Qi Dai, Zihao Zhang, Sicheng Xie, Zhi-Qi Cheng, Chong Luo, Xintong Han, Zuxuan Wu, Yu-Gang Jiang

    Abstract: Despite impressive advancements in diffusion-based video editing models in altering video attributes, there has been limited exploration into modifying motion information while preserving the original protagonist's appearance and background. In this paper, we propose MotionFollower, a lightweight score-guided diffusion model for video motion editing. To introduce conditional controls to the denois…

    Submitted 30 May, 2024; originally announced May 2024.

    Comments: 23 pages, 18 figures. Project page at https://francis-rings.github.io/MotionFollower/

    MSC Class: 68T45; 68T10

  21. arXiv:2405.17846  [pdf, other]

    cs.RO cs.AI

    Safety Control of Service Robots with LLMs and Embodied Knowledge Graphs

    Authors: Yong Qi, Gabriel Kyebambo, Siyuan Xie, Wei Shen, Shenghui Wang, Bitao Xie, Bin He, Zhipeng Wang, Shuo Jiang

    Abstract: Safety limitations in service robotics across various industries have raised significant concerns about the need for robust mechanisms ensuring that robots adhere to safe practices, thereby preventing actions that might harm humans or cause property damage. Despite advances, including the integration of Knowledge Graphs (KGs) with Large Language Models (LLMs), challenges in ensuring consistent saf…

    Submitted 28 May, 2024; originally announced May 2024.

  22. arXiv:2405.17426  [pdf, other]

    cs.CV cs.RO

    Benchmarking and Improving Bird's Eye View Perception Robustness in Autonomous Driving

    Authors: Shaoyuan Xie, Lingdong Kong, Wenwei Zhang, Jiawei Ren, Liang Pan, Kai Chen, Ziwei Liu

    Abstract: Recent advancements in bird's eye view (BEV) representations have shown remarkable promise for in-vehicle 3D perception. However, while these methods have achieved impressive results on standard benchmarks, their robustness in varied conditions remains insufficiently assessed. In this study, we present RoboBEV, an extensive benchmark suite designed to evaluate the resilience of BEV algorithms. Thi…

    Submitted 27 May, 2024; originally announced May 2024.

    Comments: Preprint; 17 pages, 13 figures, 11 tables; Code at https://github.com/Daniel-xsy/RoboBEV

  23. arXiv:2405.17147  [pdf, other]

    cs.MM

    Large Language Models (LLMs): Deployment, Tokenomics and Sustainability

    Authors: Haiwei Dong, Shuang Xie

    Abstract: The rapid advancement of Large Language Models (LLMs) has significantly impacted human-computer interaction, epitomized by the release of GPT-4o, which introduced comprehensive multi-modality capabilities. In this paper, we first explored the deployment strategies, economic considerations, and sustainability challenges associated with the state-of-the-art LLMs. More specifically, we discussed the…

    Submitted 27 May, 2024; originally announced May 2024.

    Comments: Accepted by IEEE CTSoc-NCT

  24. arXiv:2405.16852  [pdf, other]

    cs.LG cs.AI stat.ML

    EM Distillation for One-step Diffusion Models

    Authors: Sirui Xie, Zhisheng Xiao, Diederik P Kingma, Tingbo Hou, Ying Nian Wu, Kevin Patrick Murphy, Tim Salimans, Ben Poole, Ruiqi Gao

    Abstract: While diffusion models can learn complex distributions, sampling requires a computationally expensive iterative process. Existing distillation methods enable efficient sampling, but have notable limitations, such as performance degradation with very few sampling steps, reliance on training data access, or mode-seeking optimization that may fail to capture the full distribution. We propose EM Disti…

    Submitted 27 May, 2024; originally announced May 2024.

  25. arXiv:2405.15638  [pdf, other]

    cs.CV cs.CL

    M4U: Evaluating Multilingual Understanding and Reasoning for Large Multimodal Models

    Authors: Hongyu Wang, Jiayu Xu, Senwei Xie, Ruiping Wang, Jialin Li, Zhaojie Xie, Bin Zhang, Chuyan Xiong, Xilin Chen

    Abstract: Multilingual multimodal reasoning is a core component in achieving human-level intelligence. However, most existing benchmarks for multilingual multimodal reasoning struggle to differentiate between models of varying performance; even language models without visual capabilities can easily achieve high scores. This leaves a comprehensive evaluation of leading multilingual multimodal models largely…

    Submitted 24 May, 2024; originally announced May 2024.

    Comments: Work in progress

  26. arXiv:2405.10292  [pdf, other]

    cs.AI cs.CL cs.CV cs.LG

    Fine-Tuning Large Vision-Language Models as Decision-Making Agents via Reinforcement Learning

    Authors: Yuexiang Zhai, Hao Bai, Zipeng Lin, Jiayi Pan, Shengbang Tong, Yifei Zhou, Alane Suhr, Saining Xie, Yann LeCun, Yi Ma, Sergey Levine

    Abstract: Large vision-language models (VLMs) fine-tuned on specialized visual instruction-following data have exhibited impressive language reasoning capabilities across various scenarios. However, this fine-tuning paradigm may not be able to efficiently learn optimal decision-making agents in multi-step goal-directed tasks from interactive environments. To address this challenge, we propose an algorithmic…

    Submitted 16 May, 2024; v1 submitted 16 May, 2024; originally announced May 2024.

  27. arXiv:2405.09144  [pdf, other]

    cs.HC

    Evaluation scheme for children-centered language interaction competence of AI-driven robots

    Authors: Siqi Xie, Jiantao Li

    Abstract: This article explores the evaluation method for the language communication proficiency of AI-driven robots engaging in interactive communication with children. The utilization of AI-driven robots in children's everyday communication is swiftly advancing, underscoring the importance of evaluating these robots' language communication skills. Based on 11 Chinese families' interviews and thematic analy…

    Submitted 15 May, 2024; originally announced May 2024.

    Comments: 7 pages, CHI 2024 The Second Workshop on Child-Centred AI

  28. arXiv:2405.08816  [pdf, other]

    cs.CV cs.RO

    The RoboDrive Challenge: Drive Anytime Anywhere in Any Condition

    Authors: Lingdong Kong, Shaoyuan Xie, Hanjiang Hu, Yaru Niu, Wei Tsang Ooi, Benoit R. Cottereau, Lai Xing Ng, Yuexin Ma, Wenwei Zhang, Liang Pan, Kai Chen, Ziwei Liu, Weichao Qiu, Wei Zhang, Xu Cao, Hao Lu, Ying-Cong Chen, Caixin Kang, Xinning Zhou, Chengyang Ying, Wentao Shang, Xingxing Wei, Yinpeng Dong, Bo Yang, Shengyin Jiang, et al. (66 additional authors not shown)

    Abstract: In the realm of autonomous driving, robust perception under out-of-distribution conditions is paramount for the safe deployment of vehicles. Challenges such as adverse weather, sensor malfunctions, and environmental unpredictability can severely impact the performance of autonomous systems. The 2024 RoboDrive Challenge was crafted to propel the development of driving perception technologies that c…

    Submitted 29 May, 2024; v1 submitted 14 May, 2024; originally announced May 2024.

    Comments: ICRA 2024; 32 pages, 24 figures, 5 tables; Code at https://robodrive-24.github.io/

  29. arXiv:2405.06876  [pdf, other]

    physics.ins-det

    Enhancing Low-Energy Neutron and Gamma Ray Detection Using Convolutional Neural Networks with EJ-276 Scintillators

    Authors: Fengzhao Shen, Tao Li, Jingkui He, Shenghui Xie, Yuehuan Wei, Tuchen Huang, Wei Wang

    Abstract: Organic scintillators, such as plastic scintillators, are widely used to detect fast neutrons and gamma rays. The EJ-276 scintillator offers a versatile solution for detecting fast neutrons and gamma rays simultaneously, making it ideal for mixed neutron-gamma field detection applications. This study evaluates the Pulse Shape Discrimination (PSD) capabilities of the EJ-276 scintillator paired with…

    Submitted 10 May, 2024; originally announced May 2024.

  30. arXiv:2405.06523  [pdf, ps, other]

    math.NT

    Forms in prime variables and differing degrees

    Authors: Jianya Liu, Sizhe Xie

    Abstract: Let $F_1,\ldots,F_R$ be homogeneous polynomials with integer coefficients in $n$ variables with differing degrees. Write $\boldsymbol{F}=(F_1,\ldots,F_R)$ with $D$ being the maximal degree. Suppose that $\boldsymbol{F}$ is a nonsingular system and $n\ge D^2 4^{D+6}R^5$. We prove an asymptotic formula for the number of prime solutions to $\boldsymbol{F}(\boldsymbol{x})=\boldsymbol{0}$, whose main t…

    Submitted 10 May, 2024; originally announced May 2024.

    Comments: 35 pages

  31. arXiv:2405.05170  [pdf, other]

    cs.MM cs.CV eess.IV

    Picking watermarks from noise (PWFN): an improved robust watermarking model against intensive distortions

    Authors: Sijing Xie, Chengxin Zhao, Nan Sun, Wei Li, Hefei Ling

    Abstract: Digital watermarking is the process of embedding secret information by altering images in an undetectable way to the human eye. To increase the robustness of the model, many deep learning-based watermarking methods use the encoder-noise-decoder architecture by adding different noises to the noise layer. The decoder then extracts the watermarked information from the distorted image. However, this m…

    Submitted 17 May, 2024; v1 submitted 8 May, 2024; originally announced May 2024.

  32. arXiv:2405.03458  [pdf, other]

    cs.CV

    SSyncOA: Self-synchronizing Object-aligned Watermarking to Resist Cropping-paste Attacks

    Authors: Chengxin Zhao, Hefei Ling, Sijing Xie, Han Fang, Yaokun Fang, Nan Sun

    Abstract: Modern image processing tools have made it easy for attackers to crop the region or object of interest in images and paste it into other images. The challenge this cropping-paste attack poses to the watermarking technology is that it breaks the synchronization of the image watermark, introducing multiple superimposed desynchronization distortions, such as rotation, scaling, and translation. Howeve…

    Submitted 6 May, 2024; originally announced May 2024.

    Comments: 7 pages, 5 figures (accepted by ICME 2024)

  33. arXiv:2405.03436  [pdf, other]

    cs.CV cs.MM

    DBDH: A Dual-Branch Dual-Head Neural Network for Invisible Embedded Regions Localization

    Authors: Chengxin Zhao, Hefei Ling, Sijing Xie, Nan Sun, Zongyi Li, Yuxuan Shi, Jiazhong Chen

    Abstract: Embedding invisible hyperlinks or hidden codes in images to replace QR codes has become a hot topic recently. This technology requires first localizing the embedded region in the captured photos before decoding. Existing methods that train models to find the invisible embedded region struggle to obtain accurate localization results, leading to degraded decoding accuracy. This limitation is primari…

    Submitted 6 May, 2024; originally announced May 2024.

    Comments: 7 pages, 6 figures (accepted by IJCNN 2024)

  34. arXiv:2404.18279  [pdf, other]

    cs.CV

    Out-of-distribution Detection in Medical Image Analysis: A survey

    Authors: Zesheng Hong, Yubiao Yue, Yubin Chen, Lele Cong, Huanjie Lin, Yuanmei Luo, Mini Han Wang, Weidong Wang, Jialong Xu, Xiaoqi Yang, Hechang Chen, Zhenzhang Li, Sihong Xie

    Abstract: Computer-aided diagnostics has benefited from the development of deep learning-based computer vision techniques in these years. Traditional supervised deep learning methods assume that the test sample is drawn from the identical distribution as the training data. However, it is possible to encounter out-of-distribution samples in real-world clinical scenarios, which may cause silent failure in dee…

    Submitted 3 July, 2024; v1 submitted 28 April, 2024; originally announced April 2024.

    Comments: 23 pages, 3 figures

  35. arXiv:2404.16030  [pdf, other]

    cs.CV cs.AI cs.CL cs.LG

    MoDE: CLIP Data Experts via Clustering

    Authors: Jiawei Ma, Po-Yao Huang, Saining Xie, Shang-Wen Li, Luke Zettlemoyer, Shih-Fu Chang, Wen-Tau Yih, Hu Xu

    Abstract: The success of contrastive language-image pretraining (CLIP) relies on the supervision from the pairing between images and captions, which tends to be noisy in web-crawled data. We present Mixture of Data Experts (MoDE) and learn a system of CLIP data experts via clustering. Each data expert is trained on one data cluster, being less sensitive to false negative noises in other clusters. At inferen…

    Submitted 24 April, 2024; originally announced April 2024.

    Comments: IEEE CVPR 2024 Camera Ready. Code Link: https://github.com/facebookresearch/MetaCLIP/tree/main/mode

  36. arXiv:2404.14642  [pdf, other]

    cs.LG

    Uncertainty Quantification on Graph Learning: A Survey

    Authors: Chao Chen, Chenghua Guo, Rui Xu, Xiangwen Liao, Xi Zhang, Sihong Xie, Hui Xiong, Philip Yu

    Abstract: Graphical models, including Graph Neural Networks (GNNs) and Probabilistic Graphical Models (PGMs), have demonstrated their exceptional capabilities across numerous fields. These models necessitate effective uncertainty quantification to ensure reliable decision-making amid the challenges posed by model training discrepancies and unpredictable testing scenarios. This survey examines recent works t…

    Submitted 22 April, 2024; originally announced April 2024.

  37. arXiv:2404.14511  [pdf]

    cs.HC

    Children's Overtrust and Shifting Perspectives of Generative AI

    Authors: Jaemarie Solyst, Ellia Yang, Shixian Xie, Jessica Hammer, Amy Ogan, Motahhare Eslami

    Abstract: The capabilities of generative AI (genAI) have dramatically increased in recent times, and there are opportunities for children to leverage new features for personal and school-related endeavors. However, while the future of genAI is taking form, there remain potentially harmful limitations, such as generation of outputs with misinformation and bias. We ran a workshop study focused on ChatGPT to e…

    Submitted 29 June, 2024; v1 submitted 22 April, 2024; originally announced April 2024.

    Journal ref: Proceedings of the 18th International Society of the Learning Sciences (ICLS) 2024

  38. arXiv:2404.13848  [pdf, other]

    cs.CV

    DSDRNet: Disentangling Representation and Reconstruct Network for Domain Generalization

    Authors: Juncheng Yang, Zuchao Li, Shuai Xie, Wei Yu, Shijun Li

    Abstract: Domain generalization faces challenges due to the distribution shift between training and testing sets, and the presence of unseen target domains. Common solutions include domain alignment, meta-learning, data augmentation, or ensemble learning, all of which rely on domain labels or domain adversarial techniques. In this paper, we propose a Dual-Stream Separation and Reconstruction Network, dubbed…

    Submitted 21 April, 2024; originally announced April 2024.

    Comments: This paper is accepted to IJCNN 2024

  39. arXiv:2404.12588  [pdf, other]

    cs.CV cs.LG

    Cross-Modal Adapter: Parameter-Efficient Transfer Learning Approach for Vision-Language Models

    Authors: Juncheng Yang, Zuchao Li, Shuai Xie, Weiping Zhu, Wei Yu, Shijun Li

    Abstract: Adapter-based parameter-efficient transfer learning has achieved exciting results in vision-language models. Traditional adapter methods often require training or fine-tuning, facing challenges such as insufficient samples or resource limitations. While some methods overcome the need for training by leveraging image modality cache and retrieval, they overlook the text modality's importance and cro…

    Submitted 18 April, 2024; originally announced April 2024.

    Comments: This paper is accepted to ICME 2024

  40. Finding the Particularity of the Active Episode of SGR J1935+2154 during Which FRB 20200428 Occurred: Implication from Statistics of Fermi/GBM X-Ray Bursts

    Authors: Sheng-Lun Xie, Yun-Wei Yu, Shao-Lin Xiong, Lin Lin, Ping Wang, Yi Zhao, Yue Wang, Wen-Long Zhang

    Abstract: By using the Fermi/Gamma-ray Burst Monitor data of the X-ray bursts (XRBs) of SGR J1935+2154, we investigate the temporal clustering of the bursts and the cumulative distribution of the waiting time and fluence/flux. It is found that the bursts occurring in the episode hosting FRB 20200428 have obviously shorter waiting times than those in the other episodes. The general statistical properties of…

    Submitted 8 June, 2024; v1 submitted 17 April, 2024; originally announced April 2024.

  41. arXiv:2404.09001  [pdf, other]

    cs.RO cs.AI cs.CV

    Smart Help: Strategic Opponent Modeling for Proactive and Adaptive Robot Assistance in Households

    Authors: Zhihao Cao, Zidong Wang, Siwen Xie, Anji Liu, Lifeng Fan

    Abstract: Despite the significant demand for assistive technology among vulnerable groups (e.g., the elderly, children, and the disabled) in daily tasks, research into advanced AI-driven assistive solutions that genuinely accommodate their diverse needs remains sparse. Traditional human-machine interaction tasks often require machines to simply help without nuanced consideration of human abilities and feeli…

    Submitted 13 April, 2024; originally announced April 2024.

  42. arXiv:2404.05221  [pdf, other]

    cs.CL cs.AI

    LLM Reasoners: New Evaluation, Library, and Analysis of Step-by-Step Reasoning with Large Language Models

    Authors: Shibo Hao, Yi Gu, Haotian Luo, Tianyang Liu, Xiyan Shao, Xinyuan Wang, Shuhua Xie, Haodi Ma, Adithya Samavedhi, Qiyue Gao, Zhen Wang, Zhiting Hu

    Abstract: Generating accurate step-by-step reasoning is essential for Large Language Models (LLMs) to address complex problems and enhance robustness and interpretability. Despite the flux of research on developing advanced reasoning approaches, systematically analyzing the diverse LLMs and reasoning strategies in generating reasoning chains remains a significant challenge. The difficulties stem from the la…

    Submitted 8 April, 2024; originally announced April 2024.

    Comments: Project website: https://www.llm-reasoners.net/

  43. arXiv:2404.04538  [pdf, other]

    cs.AI cs.CL

    Soft-Prompting with Graph-of-Thought for Multi-modal Representation Learning

    Authors: Juncheng Yang, Zuchao Li, Shuai Xie, Wei Yu, Shijun Li, Bo Du

    Abstract: The chain-of-thought technique has been well received in multi-modal tasks. It is a step-by-step linear reasoning process that adjusts the length of the chain to improve the performance of generated prompts. However, human thought processes are predominantly non-linear, as they encompass multiple aspects simultaneously and employ dynamic adjustment and updating mechanisms. Therefore, we propose a…

    Submitted 6 April, 2024; originally announced April 2024.

    Comments: This paper is accepted to LREC-COLING 2024

  44. arXiv:2404.04454  [pdf, other]

    cs.LG math.OC stat.ML

    Implicit Bias of AdamW: $\ell_\infty$ Norm Constrained Optimization

    Authors: Shuo Xie, Zhiyuan Li

    Abstract: Adam with decoupled weight decay, also known as AdamW, is widely acclaimed for its superior performance in language modeling tasks, surpassing Adam with $\ell_2$ regularization in terms of generalization and optimization. However, this advantage is not theoretically well-understood. One challenge here is that though intuitively Adam with $\ell_2$ regularization optimizes the $\ell_2$ regularized l…

    Submitted 5 April, 2024; originally announced April 2024.

  45. arXiv:2404.02545  [pdf, other]

    cs.LG cs.AI

    Grid-Mapping Pseudo-Count Constraint for Offline Reinforcement Learning

    Authors: Yi Shen, Hanyan Huang, Shan Xie

    Abstract: Offline reinforcement learning learns from a static dataset without interacting with the environment, which ensures security and thus has good prospects for application. However, directly applying naive reinforcement learning methods usually fails in an offline environment due to function approximation errors caused by out-of-distribution (OOD) actions. To solve this problem, existing algorithms m…

    Submitted 3 April, 2024; originally announced April 2024.

  46. arXiv:2404.01664  [pdf, other]

    physics.soc-ph nlin.AO nlin.PS physics.bio-ph

    Nonreciprocal interactions in crowd dynamics: investigating the impact of moving threats on pedestrian speed preferences

    Authors: Shaocong Xie, Rui Ye, Xiaolian Li, Zhongyi Huang, Shuchao Cao, Wei Lv, Hong He, Ping Zhang, Zhiming Fang, Jun Zhang, Weiguo Song

    Abstract: Nonreciprocal interaction crowd systems, such as human-human, human-vehicle, and human-robot systems, often have serious impacts on pedestrian safety and social order. A more comprehensive understanding of these systems is needed to optimize system stability and efficiency. Despite the importance of these interactions, empirical research in this area remains limited. Thus, in our study we explore…

    Submitted 2 April, 2024; originally announced April 2024.

  47. arXiv:2404.01217  [pdf, other]

    cs.LG cs.AI

    Incorporating Domain Differential Equations into Graph Convolutional Networks to Lower Generalization Discrepancy

    Authors: Yue Sun, Chao Chen, Yuesheng Xu, Sihong Xie, Rick S. Blum, Parv Venkitasubramaniam

    Abstract: Ensuring both accuracy and robustness in time series prediction is critical to many applications, ranging from urban planning to pandemic management. With sufficient training data where all spatiotemporal patterns are well-represented, existing deep-learning models can make reasonably accurate predictions. However, existing methods fail when the training data are drawn from different circumstances…

    Submitted 1 April, 2024; originally announced April 2024.

  48. arXiv:2403.15285  [pdf, other]

    cs.NI cs.CR cs.HC cs.LG

    Blockchain-based Pseudonym Management for Vehicle Twin Migrations in Vehicular Edge Metaverse

    Authors: Jiawen Kang, Xiaofeng Luo, Jiangtian Nie, Tianhao Wu, Haibo Zhou, Yonghua Wang, Dusit Niyato, Shiwen Mao, Shengli Xie

    Abstract: Driven by the great advances in metaverse and edge computing technologies, vehicular edge metaverses are expected to disrupt the current paradigm of intelligent transportation systems. As highly computerized avatars of Vehicular Metaverse Users (VMUs), the Vehicle Twins (VTs) deployed in edge servers can provide valuable metaverse services to improve driving safety and on-board satisfaction for th…

    Submitted 22 March, 2024; originally announced March 2024.

    Comments: 14 pages, 9 figures

  49. arXiv:2403.15025  [pdf, other]

    cs.LG stat.ML

    Robust Conformal Prediction under Distribution Shift via Physics-Informed Structural Causal Model

    Authors: Rui Xu, Yue Sun, Chao Chen, Parv Venkitasubramaniam, Sihong Xie

    Abstract: Uncertainty is critical to reliable decision-making with machine learning. Conformal prediction (CP) handles uncertainty by predicting a set on a test input, in the hope that the set covers the true label with at least $(1-\alpha)$ confidence. This coverage can be guaranteed on test data even if the marginal distributions $P_X$ differ between calibration and test datasets. However, as it is common in practice,…

    Submitted 22 March, 2024; originally announced March 2024.

  50. arXiv:2403.13237  [pdf, ps, other]

    cs.CR math.OC

    Graph Attention Network-based Block Propagation with Optimal AoI and Reputation in Web 3.0

    Authors: Jiana Liao, Jinbo Wen, Jiawen Kang, Changyan Yi, Yang Zhang, Yutao Jiao, Dusit Niyato, Dong In Kim, Shengli Xie

    Abstract: Web 3.0 is recognized as a pioneering paradigm that empowers users to securely oversee data without reliance on a centralized authority. Blockchains, as a core technology to realize Web 3.0, can facilitate decentralized and transparent data management. Nevertheless, the evolution of blockchain-enabled Web 3.0 is still in its nascent phase, grappling with challenges such as ensuring efficiency and…

    Submitted 8 May, 2024; v1 submitted 19 March, 2024; originally announced March 2024.