Showing 1–50 of 396 results for author: Gu, Z

  1. arXiv:2407.05909  [pdf, other]

    cs.CV

    Multi-clue Consistency Learning to Bridge Gaps Between General and Oriented Object in Semi-supervised Detection

    Authors: Chenxu Wang, Chunyan Xu, Ziqi Gu, Zhen Cui

    Abstract: While existing semi-supervised object detection (SSOD) methods perform well in general scenes, they encounter challenges in handling oriented objects in aerial images. We experimentally find three gaps between general and oriented object detection in semi-supervised learning: 1) Sampling inconsistency: the common center sampling is not suitable for oriented objects with larger aspect ratios when s…

    Submitted 8 July, 2024; originally announced July 2024.

  2. arXiv:2407.04895  [pdf, ps, other]

    math.AT

    Retractive spaces and Bousfield-Kan completions

    Authors: Zeshen Gu, John E. Harper

    Abstract: In this short paper we apply some recent techniques developed by Schonsheck, and subsequently Carr-Harper, in the context of operadic algebras in spectra -- on convergence of Bousfield-Kan completions and comparisons with convergence of the Taylor tower of the identity functor in Goodwillie's functor calculus -- to the setting of retractive spaces: this arises when working with spaces centered awa…

    Submitted 5 July, 2024; originally announced July 2024.

  3. arXiv:2407.04845  [pdf, other]

    cs.NI

    Poster: Flexible Scheduling of Network and Computing Resources for Distributed AI Tasks

    Authors: Ruikun Wang, Jiawei Zhang, Qiaolun Zhang, Bojun Zhang, Zhiqun Gu, Aryanaz Attarpour, Yuefeng Ji, Massimo Tornatore

    Abstract: Many emerging Artificial Intelligence (AI) applications require on-demand provisioning of large-scale computing, which can only be enabled by leveraging distributed computing services interconnected through networking. To address such increasing demand for networking to serve AI tasks, we investigate new scheduling strategies to improve communication efficiency and test them on a programmable test…

    Submitted 5 July, 2024; originally announced July 2024.

  4. arXiv:2407.03942  [pdf, other]

    cs.AI cs.CL cs.HC

    Diverse and Fine-Grained Instruction-Following Ability Exploration with Synthetic Data

    Authors: Zihui Gu, Xingwu Sun, Fengzong Lian, Zhanhui Kang, Cheng-Zhong Xu, Ju Fan

    Abstract: Instruction-following is particularly crucial for large language models (LLMs) to support diverse user requests. While existing work has made progress in aligning LLMs with human preferences, evaluating their capabilities on instruction following remains a challenge due to complexity and diversity of real-world user instructions. While existing evaluation methods focus on general skills, they suff…

    Submitted 4 July, 2024; originally announced July 2024.

    Journal ref: AAAI 2024

  5. arXiv:2407.02730  [pdf, other]

    cs.CV cs.AI

    MedVH: Towards Systematic Evaluation of Hallucination for Large Vision Language Models in the Medical Context

    Authors: Zishan Gu, Changchang Yin, Fenglin Liu, Ping Zhang

    Abstract: Large Vision Language Models (LVLMs) have recently achieved superior performance in various tasks on natural image and text data, which has inspired a large number of studies on LVLM fine-tuning and training. Despite their advancements, there has been scant research on the robustness of these models against hallucination when fine-tuned on smaller datasets. In this study, we introduce a new benchmar…

    Submitted 2 July, 2024; originally announced July 2024.

  6. arXiv:2406.14250  [pdf, other]

    cs.CV cs.HC

    E-ANT: A Large-Scale Dataset for Efficient Automatic GUI NavigaTion

    Authors: Ke Wang, Tianyu Xia, Zhangxuan Gu, Yi Zhao, Shuheng Shen, Changhua Meng, Weiqiang Wang, Ke Xu

    Abstract: Online GUI navigation on mobile devices has drawn a lot of attention in recent years since it contributes to many real-world applications. With the rapid development of large language models (LLMs), multimodal large language models (MLLMs) have tremendous potential on this task. However, existing MLLMs need high-quality data to improve their ability to make the correct navigation decisions according…

    Submitted 1 July, 2024; v1 submitted 20 June, 2024; originally announced June 2024.

    Comments: 9 pages, 5 figures, Under review

  7. arXiv:2406.13726  [pdf, other]

    math.OC cs.LG econ.GN

    Global Solutions to Master Equations for Continuous Time Heterogeneous Agent Macroeconomic Models

    Authors: Zhouzhou Gu, Mathieu Laurière, Sebastian Merkel, Jonathan Payne

    Abstract: We propose and compare new global solution algorithms for continuous time heterogeneous agent economies with aggregate shocks. First, we approximate the agent distribution so that equilibrium in the economy can be characterized by a high, but finite, dimensional non-linear partial differential equation. We consider different approximations: discretizing the number of agents, discretizing the agent…

    Submitted 19 June, 2024; originally announced June 2024.

  8. arXiv:2406.12641  [pdf, other]

    cs.CL

    DetectBench: Can Large Language Model Detect and Piece Together Implicit Evidence?

    Authors: Zhouhong Gu, Lin Zhang, Xiaoxuan Zhu, Jiangjie Chen, Wenhao Huang, Yikai Zhang, Shusen Wang, Zheyu Ye, Yan Gao, Hongwei Feng, Yanghua Xiao

    Abstract: Detecting evidence within the context is a key step in reasoning tasks. Evaluating and enhancing the capabilities of LLMs in evidence detection will strengthen context-based reasoning performance. This paper proposes a benchmark called DetectBench for verifying the ability to detect and piece together implicit evidence within a long context. DetectBench contains 3,928 multiple-choice…

    Submitted 18 June, 2024; originally announced June 2024.

  9. arXiv:2406.11931  [pdf, other]

    cs.SE cs.AI cs.LG

    DeepSeek-Coder-V2: Breaking the Barrier of Closed-Source Models in Code Intelligence

    Authors: DeepSeek-AI, Qihao Zhu, Daya Guo, Zhihong Shao, Dejian Yang, Peiyi Wang, Runxin Xu, Y. Wu, Yukun Li, Huazuo Gao, Shirong Ma, Wangding Zeng, Xiao Bi, Zihui Gu, Hanwei Xu, Damai Dai, Kai Dong, Liyue Zhang, Yishi Piao, Zhibin Gou, Zhenda Xie, Zhewen Hao, Bingxuan Wang, Junxiao Song, Deli Chen, et al. (15 additional authors not shown)

    Abstract: We present DeepSeek-Coder-V2, an open-source Mixture-of-Experts (MoE) code language model that achieves performance comparable to GPT4-Turbo in code-specific tasks. Specifically, DeepSeek-Coder-V2 is further pre-trained from an intermediate checkpoint of DeepSeek-V2 with additional 6 trillion tokens. Through this continued pre-training, DeepSeek-Coder-V2 substantially enhances the coding and mathe…

    Submitted 17 June, 2024; originally announced June 2024.

  10. arXiv:2406.11374  [pdf, other]

    cond-mat.str-el

    Pseudogap with Fermi arcs and Fermi pockets in half-filled twisted transition metal dichalcogenides

    Authors: Yong-Yue Zong, Zhao-Long Gu, Jian-Xin Li

    Abstract: Twisted transition metal dichalcogenides are a new platform for realizing strongly correlated physics with high tunability. Recent transport experiments have reported the realization of a Mott insulator, its bandwidth-driven evolution to a metal, and the strange metal behavior in proximity to the transition via the tuning of a displacement field in twisted $\mathrm{WSe_2}$ ($\mathrm{tWSe_2}$) fixed…

    Submitted 17 June, 2024; originally announced June 2024.

    Comments: 8+11 pages, 4+8 figures

  11. arXiv:2406.10621  [pdf, other]

    cs.CL cs.AI

    StrucText-Eval: An Autogenerated Benchmark for Evaluating Large Language Model's Ability in Structure-Rich Text Understanding

    Authors: Zhouhong Gu, Haoning Ye, Zeyang Zhou, Hongwei Feng, Yanghua Xiao

    Abstract: Given the substantial volumes of structured data held by many companies, enabling Large Language Models (LLMs) to directly understand structured text in non-structured forms could significantly enhance their capabilities across various business scenarios. To this end, we propose an evaluation data generation method for assessing LLMs' ability to understand structure-rich text, which generates…

    Submitted 30 June, 2024; v1 submitted 15 June, 2024; originally announced June 2024.

  12. arXiv:2405.19707  [pdf, other]

    cs.CV

    DeMamba: AI-Generated Video Detection on Million-Scale GenVideo Benchmark

    Authors: Haoxing Chen, Yan Hong, Zizheng Huang, Zhuoer Xu, Zhangxuan Gu, Yaohui Li, Jun Lan, Huijia Zhu, Jianfu Zhang, Weiqiang Wang, Huaxiong Li

    Abstract: Recently, video generation techniques have advanced rapidly. Given the popularity of video content on social media platforms, these models intensify concerns about the spread of fake information. Therefore, there is a growing demand for detectors capable of distinguishing between fake AI-generated videos and mitigating the potential harm caused by fake information. However, the lack of large-scale…

    Submitted 16 July, 2024; v1 submitted 30 May, 2024; originally announced May 2024.

  13. arXiv:2405.15216  [pdf, other]

    cs.LG cs.CL cs.SD eess.AS

    Denoising LM: Pushing the Limits of Error Correction Models for Speech Recognition

    Authors: Zijin Gu, Tatiana Likhomanenko, He Bai, Erik McDermott, Ronan Collobert, Navdeep Jaitly

    Abstract: Language models (LMs) have long been used to improve results of automatic speech recognition (ASR) systems, but they are unaware of the errors that ASR systems make. Error correction models are designed to fix ASR errors; however, they showed little improvement over traditional LMs, mainly due to the lack of supervised training data. In this paper, we present Denoising LM (DLM), which is a…

    Submitted 24 May, 2024; originally announced May 2024.

    Comments: under review

  14. arXiv:2405.12247  [pdf, other]

    cs.CV

    Focus on Low-Resolution Information: Multi-Granular Information-Lossless Model for Low-Resolution Human Pose Estimation

    Authors: Zejun Gu, Zhong-Qiu Zhao, Hao Shen, Zhao Zhang

    Abstract: In real-world applications of human pose estimation, low-resolution input images are frequently encountered when the performance of the image acquisition equipment is limited or the shooting distance is too far. However, existing state-of-the-art models for human pose estimation perform poorly on low-resolution images. One key reason is the presence of downsampling layers in these models, e.g., st…

    Submitted 19 May, 2024; originally announced May 2024.

    Comments: 8 pages, 5 figures, conference

  15. arXiv:2405.11640  [pdf, other]

    cs.AI cs.CL cs.CV

    Inquire, Interact, and Integrate: A Proactive Agent Collaborative Framework for Zero-Shot Multimodal Medical Reasoning

    Authors: Zishan Gu, Fenglin Liu, Changchang Yin, Ping Zhang

    Abstract: The adoption of large language models (LLMs) in healthcare has attracted significant research interest. However, their performance in healthcare remains under-investigated and potentially limited, due to i) they lack rich domain-specific knowledge and medical reasoning skills; and ii) most state-of-the-art LLMs are unimodal, text-only models that cannot directly process multimodal inputs. To this…

    Submitted 19 May, 2024; originally announced May 2024.

  16. arXiv:2405.11448  [pdf, other]

    cs.CV

    Cross-Domain Knowledge Distillation for Low-Resolution Human Pose Estimation

    Authors: Zejun Gu, Zhong-Qiu Zhao, Henghui Ding, Hao Shen, Zhao Zhang, De-Shuang Huang

    Abstract: In practical applications of human pose estimation, low-resolution inputs frequently occur, and existing state-of-the-art models perform poorly with low-resolution images. This work focuses on boosting the performance of low-resolution models by distilling knowledge from a high-resolution model. However, we face the challenge of feature size mismatch and class number mismatch when applying knowled…

    Submitted 19 May, 2024; originally announced May 2024.

    Comments: 11 pages, 5 figures

  17. arXiv:2405.10691  [pdf, other]

    eess.IV cs.CV

    LoCI-DiffCom: Longitudinal Consistency-Informed Diffusion Model for 3D Infant Brain Image Completion

    Authors: Zihao Zhu, Tianli Tao, Yitian Tao, Haowen Deng, Xinyi Cai, Gaofeng Wu, Kaidong Wang, Haifeng Tang, Lixuan Zhu, Zhuoyang Gu, Jiawei Huang, Dinggang Shen, Han Zhang

    Abstract: The infant brain undergoes rapid development in the first few years after birth. Compared to cross-sectional studies, longitudinal studies can depict the trajectories of infants brain development with higher accuracy, statistical power and flexibility. However, the collection of infant longitudinal magnetic resonance (MR) data suffers a notorious dropout problem, resulting in incomplete datasets wit…

    Submitted 17 May, 2024; originally announced May 2024.

  18. arXiv:2405.10316  [pdf, other]

    cs.CV cs.GR

    Analogist: Out-of-the-box Visual In-Context Learning with Image Diffusion Model

    Authors: Zheng Gu, Shiyuan Yang, Jing Liao, Jing Huo, Yang Gao

    Abstract: Visual In-Context Learning (ICL) has emerged as a promising research area due to its capability to accomplish various tasks with limited example pairs through analogical reasoning. However, training-based visual ICL has limitations in its ability to generalize to unseen tasks and requires the collection of a diverse task dataset. On the other hand, existing methods in the inference-based visual IC…

    Submitted 16 May, 2024; originally announced May 2024.

    Comments: Project page: https://analogist2d.github.io

  19. arXiv:2405.08447  [pdf, other]

    cs.HC

    AI-Resilient Interfaces

    Authors: Elena L. Glassman, Ziwei Gu, Jonathan K. Kummerfeld

    Abstract: AI is powerful, but it can make choices that result in objective errors, contextually inappropriate outputs, and disliked options. We need AI-resilient interfaces that help people be resilient to the AI choices that are not right, or not right for them. To support this goal, interfaces need to help users notice and have the context to appropriately judge those AI choices. Existing human-AI interac…

    Submitted 14 May, 2024; originally announced May 2024.

  20. arXiv:2405.04434  [pdf, other]

    cs.CL cs.AI

    DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model

    Authors: DeepSeek-AI, Aixin Liu, Bei Feng, Bin Wang, Bingxuan Wang, Bo Liu, Chenggang Zhao, Chengqi Dengr, Chong Ruan, Damai Dai, Daya Guo, Dejian Yang, Deli Chen, Dongjie Ji, Erhang Li, Fangyun Lin, Fuli Luo, Guangbo Hao, Guanting Chen, Guowei Li, H. Zhang, Hanwei Xu, Hao Yang, Haowei Zhang, Honghui Ding, et al. (132 additional authors not shown)

    Abstract: We present DeepSeek-V2, a strong Mixture-of-Experts (MoE) language model characterized by economical training and efficient inference. It comprises 236B total parameters, of which 21B are activated for each token, and supports a context length of 128K tokens. DeepSeek-V2 adopts innovative architectures including Multi-head Latent Attention (MLA) and DeepSeekMoE. MLA guarantees efficient inference…

    Submitted 19 June, 2024; v1 submitted 7 May, 2024; originally announced May 2024.

  21. arXiv:2405.01882  [pdf, other]

    cs.RO cs.AI eess.SP

    Millimeter Wave Radar-based Human Activity Recognition for Healthcare Monitoring Robot

    Authors: Zhanzhong Gu, Xiangjian He, Gengfa Fang, Chengpei Xu, Feng Xia, Wenjing Jia

    Abstract: Healthcare monitoring is crucial, especially for the daily care of elderly individuals living alone. It can detect dangerous occurrences, such as falls, and provide timely alerts to save lives. Non-invasive millimeter wave (mmWave) radar-based healthcare monitoring systems using advanced human activity recognition (HAR) models have recently gained significant attention. However, they encounter cha…

    Submitted 3 May, 2024; originally announced May 2024.

  22. arXiv:2405.00797  [pdf, other]

    cs.RO cs.CV

    ADM: Accelerated Diffusion Model via Estimated Priors for Robust Motion Prediction under Uncertainties

    Authors: Jiahui Li, Tianle Shen, Zekai Gu, Jiawei Sun, Chengran Yuan, Yuhang Han, Shuo Sun, Marcelo H. Ang Jr

    Abstract: Motion prediction is a challenging problem in autonomous driving as it demands the system to comprehend stochastic dynamics and the multi-modal nature of real-world agent interactions. Diffusion models have recently risen to prominence, and have proven particularly effective in pedestrian motion prediction tasks. However, the significant time consumption and sensitivity to noise have limited the r…

    Submitted 1 May, 2024; originally announced May 2024.

    Comments: 7 pages, 4 figures

  23. arXiv:2404.16770  [pdf, other]

    cond-mat.str-el

    Pseudogap phase as fluctuating pair density wave

    Authors: Zheng-Yuan Yue, Zheng-Tao Xu, Shuo Yang, Zheng-Cheng Gu

    Abstract: The physical nature of pseudogap phase is one of the most important and intriguing problems towards understanding the key mechanism of high temperature superconductivity in cuprates. Theoretically, the square-lattice $t$-$J$ model is widely believed to be the simplest toy model that captures the essential physics of cuprate superconductors. We employ the Grassmann tensor product state approach to…

    Submitted 15 May, 2024; v1 submitted 25 April, 2024; originally announced April 2024.

    Comments: 10 pages, 13 figures, references added

  24. arXiv:2404.15254  [pdf, other]

    cs.CV

    UniMERNet: A Universal Network for Real-World Mathematical Expression Recognition

    Authors: Bin Wang, Zhuangcheng Gu, Chao Xu, Bo Zhang, Botian Shi, Conghui He

    Abstract: This paper presents the UniMER dataset to provide the first study on Mathematical Expression Recognition (MER) towards complex real-world scenarios. The UniMER dataset consists of a large-scale training set UniMER-1M offering an unprecedented scale and diversity with one million training instances and a meticulously designed test set UniMER-Test that reflects a diverse range of formula distributio…

    Submitted 23 April, 2024; originally announced April 2024.

    Comments: 17 pages, 5 figures

  25. arXiv:2404.13671  [pdf, other]

    cs.CV cs.LG

    FiLo: Zero-Shot Anomaly Detection by Fine-Grained Description and High-Quality Localization

    Authors: Zhaopeng Gu, Bingke Zhu, Guibo Zhu, Yingying Chen, Hao Li, Ming Tang, Jinqiao Wang

    Abstract: Zero-shot anomaly detection (ZSAD) methods entail detecting anomalies directly without access to any known normal or abnormal samples within the target item categories. Existing approaches typically rely on the robust generalization capabilities of multimodal pretrained models, computing similarities between manually crafted textual features representing "normal" or "abnormal" semantics and image…

    Submitted 21 April, 2024; originally announced April 2024.

  26. arXiv:2404.09872  [pdf, other]

    cs.CV

    Conditional Prototype Rectification Prompt Learning

    Authors: Haoxing Chen, Yaohui Li, Zizheng Huang, Yan Hong, Zhuoer Xu, Zhangxuan Gu, Jun Lan, Huijia Zhu, Weiqiang Wang

    Abstract: Pre-trained large-scale vision-language models (VLMs) have acquired profound understanding of general visual concepts. Recent advancements in efficient transfer learning (ETL) have shown remarkable success in fine-tuning VLMs within the scenario of limited data, introducing only a few parameters to harness task-specific insights from VLMs. Despite significant progress, current leading ETL methods…

    Submitted 15 April, 2024; originally announced April 2024.

  27. arXiv:2404.07598  [pdf, other]

    physics.optics physics.app-ph

    Electro-optically Modulated Nonlinear Metasurfaces

    Authors: Zhengqing He, Lun Qu, Wei Wu, Jikun Liu, Jingfei You, Weiye Liu, Lu Bai, Chunyan Jin, Chenxiong Wang, Zhidong Gu, Wei Cai, Mengxin Ren, Jingjun Xu

    Abstract: Tunable nonlinearity facilitates the creation of reconfigurable nonlinear metasurfaces, enabling innovative applications in signal processing, light switching, and sensing. This paper presents a novel approach to electrically modulate SHG from a lithium niobate (LN) metasurface, exploiting the electro-optical (EO) effect. By fabricating a nanohole array metasurface on a thin LN film and applying a…

    Submitted 11 April, 2024; originally announced April 2024.

    Comments: 4 pages, 4 figures

  28. arXiv:2404.05685  [pdf, other]

    cond-mat.str-el quant-ph

    Global phase diagram of doped quantum spin liquid on the Kagome lattice

    Authors: Zheng-Tao Xu, Zheng-Cheng Gu, Shuo Yang

    Abstract: It has long been believed that doped quantum spin liquids (QSLs) can give rise to fascinating quantum phases, including the possibility of high-temperature superconductivity (SC) as proposed by P. W. Anderson's resonating valence bond (RVB) scenario. The Kagome lattice $t$-$J$ model is known to exhibit spin liquid behavior at half-filling, making it an ideal system for studying the properties of d…

    Submitted 8 April, 2024; originally announced April 2024.

    Comments: 11 pages, 17 figures

  29. arXiv:2403.16062  [pdf]

    eess.SP

    Holography inspired self-controlled reconfigurable intelligent surface

    Authors: Jieao Zhu, Ze Gu, Qian Ma, Linglong Dai, Tie Jun Cui

    Abstract: Among various promising candidate technologies for the sixth-generation (6G) wireless communications, recent advances in microwave metasurfaces have sparked a new research area of reconfigurable intelligent surfaces (RISs). By controllably reprogramming the wireless propagation channel, RISs are envisioned to achieve low-cost wireless capacity boosting, coverage extension, and enhanced energy effi…

    Submitted 24 March, 2024; originally announced March 2024.

    Comments: Traditional BS-controlled RISs suffer from complicated control cables. To "cut" the control cables, we propose a self-controlled RIS by leveraging the holographic interference principle, thus realizing autonomous RIS beamforming

  30. arXiv:2403.15993  [pdf, other]

    cs.RO

    Robust-Locomotion-by-Logic: Perturbation-Resilient Bipedal Locomotion via Signal Temporal Logic Guided Model Predictive Control

    Authors: Zhaoyuan Gu, Yuntian Zhao, Yipu Chen, Rongming Guo, Jennifer K. Leestma, Gregory S. Sawicki, Ye Zhao

    Abstract: This study introduces a robust planning framework that utilizes a model predictive control (MPC) approach, enhanced by incorporating signal temporal logic (STL) specifications. This marks the first-ever study to apply STL-guided trajectory optimization for bipedal locomotion, specifically designed to handle both translational and orientational perturbations. Existing recovery strategies often stru…

    Submitted 23 March, 2024; originally announced March 2024.

  31. arXiv:2403.15718  [pdf, other]

    math.AP

    On a dryout point for a stationary incompressible thermal fluid with phase transition in a pipe

    Authors: Yoshikazu Giga, Zhongyang Gu

    Abstract: A dryout point is recognized as the position where the phase transition from liquid to vapor occurs. In the one-dimensional case, by solving the stationary incompressible Navier-Stokes-Fourier equations with phase transition, we derive a necessary and sufficient condition for a dryout point to exist when the temperature at the liquid-vapor interface is given. In addition, we show by considering th…

    Submitted 23 March, 2024; originally announced March 2024.

    Comments: 20 pages, 12 figures

    MSC Class: 35Q79; 76D05; 80A22

  32. arXiv:2403.13433  [pdf, other]

    cs.AI cs.CL cs.CY

    AgentGroupChat: An Interactive Group Chat Simulacra For Better Eliciting Emergent Behavior

    Authors: Zhouhong Gu, Xiaoxuan Zhu, Haoran Guo, Lin Zhang, Yin Cai, Hao Shen, Jiangjie Chen, Zheyu Ye, Yifei Dai, Yan Gao, Yao Hu, Hongwei Feng, Yanghua Xiao

    Abstract: Language significantly influences the formation and evolution of Human emergent behavior, which is crucial in understanding collective intelligence within human societies. Considering that the study of how language affects human behavior needs to put it into the dynamic scenarios in which it is used, we introduce AgentGroupChat in this paper, a simulation that delves into the complex role of langu…

    Submitted 4 April, 2024; v1 submitted 20 March, 2024; originally announced March 2024.

  33. arXiv:2403.12580  [pdf, other]

    cs.CV

    Real-IAD: A Real-World Multi-View Dataset for Benchmarking Versatile Industrial Anomaly Detection

    Authors: Chengjie Wang, Wenbing Zhu, Bin-Bin Gao, Zhenye Gan, Jianning Zhang, Zhihao Gu, Shuguang Qian, Mingang Chen, Lizhuang Ma

    Abstract: Industrial anomaly detection (IAD) has garnered significant attention and experienced rapid development. However, the recent development of IAD approach has encountered certain difficulties due to dataset limitations. On the one hand, most of the state-of-the-art methods have achieved saturation (over 99% in AUROC) on mainstream datasets such as MVTec, and the differences of methods cannot be well…

    Submitted 19 March, 2024; originally announced March 2024.

    Comments: It is accepted by CVPR2024

  34. arXiv:2403.07825  [pdf, other]

    cs.CL

    The Missing Piece in Model Editing: A Deep Dive into the Hidden Damage Brought By Model Editing

    Authors: Jianchen Wang, Zhouhong Gu, Xiaoxuan Zhu, Lin Zhang, Haoning Ye, Zhuozhi Xiong, Hongwei Feng, Yanghua Xiao

    Abstract: Large Language Models have revolutionized numerous tasks with their remarkable efficacy. However, editing these models, crucial for rectifying outdated or erroneous information, often leads to a complex issue known as the ripple effect in the hidden space. While difficult to detect, this effect can significantly impede the efficacy of model editing tasks and deteriorate model performance. This pap…

    Submitted 2 July, 2024; v1 submitted 12 March, 2024; originally announced March 2024.

  35. arXiv:2403.05644  [pdf, other]

    stat.ME stat.AP

    TSSS: A Novel Triangulated Spherical Spline Smoothing for Surface-based Data

    Authors: Zhiling Gu, Shan Yu, Guannan Wang, Ming-Jun Lai, Li Wang

    Abstract: Surface-based data is commonly observed in diverse practical applications spanning various fields. In this paper, we introduce a novel nonparametric method to discover the underlying signals from data distributed on complex surface-based domains. Our approach involves a penalized spline estimator defined on a triangulation of surface patches, which enables effective signal extraction and recovery.…

    Submitted 8 March, 2024; originally announced March 2024.

    Comments: 56 pages, 16 figures

    MSC Class: 62G05; 62G08

  36. arXiv:2403.04652  [pdf, other]

    cs.CL cs.AI

    Yi: Open Foundation Models by 01.AI

    Authors: 01.AI, Alex Young, Bei Chen, Chao Li, Chengen Huang, Ge Zhang, Guanwei Zhang, Heng Li, Jiangcheng Zhu, Jianqun Chen, Jing Chang, Kaidong Yu, Peng Liu, Qiang Liu, Shawn Yue, Senbin Yang, Shiming Yang, Tao Yu, Wen Xie, Wenhao Huang, Xiaohui Hu, Xiaoyi Ren, Xinyao Niu, Pengcheng Nie, et al. (7 additional authors not shown)

    Abstract: We introduce the Yi model family, a series of language and multimodal models that demonstrate strong multi-dimensional capabilities. The Yi model family is based on 6B and 34B pretrained language models, then we extend them to chat models, 200K long context models, depth-upscaled models, and vision-language models. Our base models achieve strong performance on a wide range of benchmarks like MMLU,…

    Submitted 7 March, 2024; originally announced March 2024.

  37. arXiv:2403.01704  [pdf]

    physics.optics

    Giant second harmonic generation in supertwisted WS2 spirals grown in step edge particle induced non-Euclidean surfaces

    Authors: Tong Tong, Ruijie Chen, Yuxuan Ke, Qian Wang, Xinchao Wang, Qinjun Sun, Jie Chen, Zhiyuan Gu, Ying Yu, Hongyan Wei, Yuying Hao, Xiaopeng Fan, Qing Zhang

    Abstract: In moire crystals resulting from the stacking of twisted two-dimensional (2D) layered materials, a subtle adjustment in the twist angle surprisingly gives rise to a wide range of correlated optical and electrical properties. Herein, we report the synthesis of supertwisted WS2 spirals and the observation of giant second harmonic generation (SHG) in these spirals. Supertwisted WS2 spirals featuring…

    Submitted 12 March, 2024; v1 submitted 3 March, 2024; originally announced March 2024.

    Comments: 21 pages, 4 figures

  38. arXiv:2402.19270  [pdf, other]

    cs.CV

    Learning Intra-view and Cross-view Geometric Knowledge for Stereo Matching

    Authors: Rui Gong, Weide Liu, Zaiwang Gu, Xulei Yang, Jun Cheng

    Abstract: Geometric knowledge has been shown to be beneficial for the stereo matching task. However, prior attempts to integrate geometric insights into stereo matching algorithms have largely focused on geometric knowledge from single images while crucial cross-view factors such as occlusion and matching uniqueness have been overlooked. To address this gap, we propose a novel Intra-view and Cross-view Geom…

    Submitted 6 March, 2024; v1 submitted 29 February, 2024; originally announced February 2024.

    Comments: Accepted to CVPR2024

  39. arXiv:2402.18986  [pdf, other]

    cs.CR

    Always be Pre-Training: Representation Learning for Network Intrusion Detection with GNNs

    Authors: Zhengyao Gu, Diego Troy Lopez, Lilas Alrahis, Ozgur Sinanoglu

    Abstract: Graph neural network-based network intrusion detection systems have recently demonstrated state-of-the-art performance on benchmark datasets. Nevertheless, these methods suffer from a reliance on target encoding for data pre-processing, limiting widespread adoption due to the associated need for annotated labels--a cost-prohibitive requirement. In this work, we propose a solution involving in-cont…

    Submitted 29 February, 2024; originally announced February 2024.

    Comments: Will appear in the 2024 International Symposium on Quality Electronic Design (ISQED'24)

  40. arXiv:2402.13776  [pdf, other]

    eess.IV cs.CV cs.LG

    Cas-DiffCom: Cascaded diffusion model for infant longitudinal super-resolution 3D medical image completion

    Authors: Lianghu Guo, Tianli Tao, Xinyi Cai, Zihao Zhu, Jiawei Huang, Lixuan Zhu, Zhuoyang Gu, Haifeng Tang, Rui Zhou, Siyan Han, Yan Liang, Qing Yang, Dinggang Shen, Han Zhang

    Abstract: Early infancy is a rapid and dynamic neurodevelopmental period for behavior and neurocognition. Longitudinal magnetic resonance imaging (MRI) is an effective tool to investigate such a crucial stage by capturing the developmental trajectories of the brain structures. However, longitudinal MRI acquisition always meets a serious data-missing problem due to participant dropout and failed scans, makin…

    Submitted 21 February, 2024; originally announced February 2024.

  41. arXiv:2402.13714  [pdf, other]

    q-bio.QM cs.AI cs.LG

    An Evaluation of Large Language Models in Bioinformatics Research

    Authors: Hengchuang Yin, Zhonghui Gu, Fanhao Wang, Yiparemu Abuduhaibaier, Yanqiao Zhu, Xinming Tu, Xian-Sheng Hua, Xiao Luo, Yizhou Sun

    Abstract: Large language models (LLMs) such as ChatGPT have gained considerable interest across diverse research communities. Their notable ability for text completion and generation has inaugurated a novel paradigm for language-interfaced problem solving. However, the potential and efficacy of these models in bioinformatics remain incompletely explored. In this work, we study the performance of LLMs on a wide…

    Submitted 21 February, 2024; originally announced February 2024.

    Comments: Under review

  42. arXiv:2402.08238  [pdf, other]

    cs.IT cs.NI eess.SP

    Opportunistic Scheduling Using Statistical Information of Wireless Channels

    Authors: Zhouyou Gu, Wibowo Hardjawana, Branka Vucetic

    Abstract: This paper considers opportunistic scheduler (OS) design using statistical channel state information (CSI). We apply max-weight schedulers (MWSs) to maximize a utility function of users' average data rates. MWSs schedule the user with the highest weighted instantaneous data rate every time slot. Existing methods require hundreds of time slots to adjust the MWS's weights according to the instantane…

    Submitted 13 February, 2024; originally announced February 2024.

    Comments: This work has been accepted in the IEEE Transactions on Wireless Communications. Copyright may be transferred without notice, after which this version may no longer be accessible

  43. arXiv:2402.06783  [pdf, other]

    cs.RO cs.LG

    Learn to Teach: Improve Sample Efficiency in Teacher-student Learning for Sim-to-Real Transfer

    Authors: Feiyang Wu, Zhaoyuan Gu, Ye Zhao, Anqi Wu

    Abstract: Simulation-to-reality (sim-to-real) transfer is a fundamental problem for robot learning. Domain Randomization, which adds randomization during training, is a powerful technique that effectively addresses the sim-to-real gap. However, the noise in observations makes learning significantly harder. Recently, studies have shown that employing a teacher-student learning paradigm can accelerate trainin…

    Submitted 9 February, 2024; originally announced February 2024.

  44. arXiv:2402.00879  [pdf, other]

    cs.NI cs.LG eess.SP

    Graph Representation Learning for Contention and Interference Management in Wireless Networks

    Authors: Zhouyou Gu, Branka Vucetic, Kishore Chikkam, Pasquale Aliberti, Wibowo Hardjawana

    Abstract: Restricted access window (RAW) in Wi-Fi 802.11ah networks manages contention and interference by grouping users and allocating periodic time slots for each group's transmissions. We will find the optimal user grouping decisions in RAW to maximize the network's worst-case user throughput. We review existing user grouping approaches and highlight their performance limitations in the above problem. W…

    Submitted 15 January, 2024; originally announced February 2024.

    Comments: This work has been accepted in the IEEE/ACM Transactions on Networking. Copyright may be transferred without notice, after which this version may no longer be accessible

  45. arXiv:2401.15980  [pdf]

    cond-mat.supr-con cond-mat.mtrl-sci cond-mat.str-el

    Superconductivity in freestanding infinite-layer nickelate membranes

    Authors: Shengjun Yan, Wei Mao, Wenjie Sun, Yueying Li, Haoying Sun, Jiangfeng Yang, Bo Hao, Wei Guo, Leyan Nian, Zhengbin Gu, Peng Wang, Yuefeng Nie

    Abstract: The observation of superconductivity in infinite-layer nickelates has attracted significant attention due to its potential as a new platform for exploring high-$T_c$ superconductivity. However, thus far, superconductivity has only been observed in epitaxial thin films, which limits the manipulation capabilities and modulation methods compared to two-dimensional exfoliated mat…

    Submitted 29 January, 2024; originally announced January 2024.

    Comments: 20 pages, 9 figures

  46. arXiv:2401.15979  [pdf, other]

    cond-mat.supr-con cond-mat.mtrl-sci cond-mat.str-el

    In situ preparation of superconducting infinite-layer nickelate thin films with atomically flat surface

    Authors: Wenjie Sun, Zhichao Wang, Bo Hao, Shengjun Yan, Haoying Sun, Zhengbin Gu, Yu Deng, Yuefeng Nie

    Abstract: Since their discovery, the infinite-layer nickelates have been regarded as an appealing system for gaining deeper insights into high temperature superconductivity (HTSC). However, the synthesis of superconducting samples has proven challenging. Here, we develop an ultrahigh vacuum (UHV) in situ reduction method using atomic hydrogen as the reducing agent and apply it i…

    Submitted 29 January, 2024; originally announced January 2024.

    Comments: 5 pages, 4 figures

    Journal ref: Adv. Mater. 2024, 2401342

  47. arXiv:2401.13726  [pdf, other]

    cs.HC cs.LG

    Supporting Sensemaking of Large Language Model Outputs at Scale

    Authors: Katy Ilonka Gero, Chelse Swoopes, Ziwei Gu, Jonathan K. Kummerfeld, Elena L. Glassman

    Abstract: Large language models (LLMs) are capable of generating multiple responses to a single prompt, yet little effort has been expended to help end-users or system designers make use of this capability. In this paper, we explore how to present many LLM responses at once. We design five features, which include both pre-existing and novel methods for computing similarities and differences across textual d…

    Submitted 24 January, 2024; originally announced January 2024.

    Comments: 34 pages, 13 figures, conditionally accepted to ACM Conference on Human Factors in Computing Systems 2024

  48. arXiv:2401.10873  [pdf, other]

    cs.HC

    An AI-Resilient Text Rendering Technique for Reading and Skimming Documents

    Authors: Ziwei Gu, Ian Arawjo, Kenneth Li, Jonathan K. Kummerfeld, Elena L. Glassman

    Abstract: Readers find text difficult to consume for many reasons. Summarization can address some of these difficulties, but introduces others, such as omitting, misrepresenting, or hallucinating information, which can be hard for a reader to notice. One approach to addressing this problem is to instead modify how the original text is rendered to make important information more salient. We introduce Grammar-…

    Submitted 19 January, 2024; originally announced January 2024.

    Comments: Conditionally accepted to CHI 2024

  49. arXiv:2401.09946  [pdf, other]

    hep-th cond-mat.dis-nn gr-qc

    Neural ODEs for holographic transport models without translation symmetry

    Authors: Zhuo-Fan Gu, Yu-Kun Yan, Shao-Feng Wu

    Abstract: We investigate the data-driven holographic transport models without translation symmetry. Our data are the real part of the frequency-dependent shear viscosity, denoted as $η_{\mathrm{re}}(ω)$. We develop a radial flow equation of the shear response and establish its relation to $η_{\mathrm{re}}(ω)$ for a wide class of holographic models. This allows us to determine $η_{\mathrm{re}}(ω)$ of a stron…

    Submitted 18 June, 2024; v1 submitted 18 January, 2024; originally announced January 2024.

    Comments: 26 pages, 12 figures, 3 tables

  50. arXiv:2401.07422  [pdf, other]

    eess.SP

    Multiperson Detection and Vital-Sign Sensing Empowered by Space-Time-Coding RISs

    Authors: Xinyu Li, Jian Wei You, Ze Gu, Qian Ma, Jingyuan Zhang, Long Chen, Tie Jun Cui

    Abstract: Passive human sensing using wireless signals has attracted increasing attention because it is non-contact and robust under various lighting conditions. However, when multiple human individuals are present, their reflected signals could be intertwined in the time, frequency and spatial domains, making it challenging to separate them. To address this issue, this paper proposes a novel…

    Submitted 14 January, 2024; originally announced January 2024.