Showing 1–50 of 206 results for author: Zou, D

  1. arXiv:2407.10648  [pdf, other]

    cs.RO

    Back to Newton's Laws: Learning Vision-based Agile Flight via Differentiable Physics

    Authors: Yuang Zhang, Yu Hu, Yunlong Song, Danping Zou, Weiyao Lin

    Abstract: Swarm navigation in cluttered environments is a grand challenge in robotics. This work combines deep learning with first-principle physics through differentiable simulation to enable autonomous navigation of multiple aerial robots through complex environments at high speed. Our approach optimizes a neural network control policy directly by backpropagating loss gradients through the robot simulatio… ▽ More

    Submitted 15 July, 2024; v1 submitted 15 July, 2024; originally announced July 2024.

  2. arXiv:2407.10495  [pdf, other]

    cs.LG cs.CV

    Improving Hyperbolic Representations via Gromov-Wasserstein Regularization

    Authors: Yifei Yang, Wonjun Lee, Dongmian Zou, Gilad Lerman

    Abstract: Hyperbolic representations have shown remarkable efficacy in modeling inherent hierarchies and complexities within data structures. Hyperbolic neural networks have been commonly applied for learning such representations from data, but they often fall short in preserving the geometric structures of the original feature spaces. In response to this challenge, our work applies the Gromov-Wasserstein (… ▽ More

    Submitted 15 July, 2024; originally announced July 2024.

    Comments: Accepted for ECCV 2024

  3. arXiv:2407.10109  [pdf]

    eess.SP

    Hardware-Efficient and Reliable Coherent DSCM Systems Enabled by Single-Pilot-Tone-Based Polarization Demultiplexing

    Authors: Wei Wang, Dongdong Zou, Weihao Ni, Fan Li

    Abstract: Recently, coherent digital subcarrier multiplexing (DSCM) technology has become an attractive solution for next-generation ultra-high-speed datacenter interconnects (DCIs). To meet the requirements of low-cost and low-power consumption in DCI applications, a comprehensive simplification of the coherent DSCM system has been investigated. The pilot-tone-based polarization demultiplexing (PT-PDM) tec… ▽ More

    Submitted 14 July, 2024; originally announced July 2024.

  4. arXiv:2407.06988  [pdf, other]

    math.SP

    Limiting Over-Smoothing and Over-Squashing of Graph Message Passing by Deep Scattering Transforms

    Authors: Yuanhong Jiang, Dongmian Zou, Xiaoqun Zhang, Yu Guang Wang

    Abstract: Graph neural networks (GNNs) have become pivotal tools for processing graph-structured data, leveraging the message passing scheme as their core mechanism. However, traditional GNNs often grapple with issues such as instability, over-smoothing, and over-squashing, which can degrade performance and create a trade-off dilemma. In this paper, we introduce a discriminatively trained, multi-layer Deep… ▽ More

    Submitted 9 July, 2024; originally announced July 2024.

    Comments: 35 pages, 6 figures

  5. arXiv:2407.06562  [pdf, other]

    gr-qc

    Identifying \textit{doppelgänge} Black Holes through Shadow Images

    Authors: Yukun Xu, Hyat Huang, Meng-Yun Lai, De-Cheng Zou

    Abstract: Recently, an interesting \textit{doppelgänge} black hole solution is obtained in the string-inspired Euler-Heisenberg theory, where the black holes have the same radii but share different charges. We found, however, they possess different ISCOs and photon spheres, and hence affect their shadow images. In this work, we investigate the optical appearances, illuminated by an optically and geometrical… ▽ More

    Submitted 9 July, 2024; originally announced July 2024.

    Comments: 20 pages, 13 figures

  6. arXiv:2407.03757  [pdf, other]

    cs.CV

    DiffRetouch: Using Diffusion to Retouch on the Shoulder of Experts

    Authors: Zheng-Peng Duan, Jiawei Zhang, Zheng Lin, Xin Jin, Dongqing Zou, Chunle Guo, Chongyi Li

    Abstract: Image retouching aims to enhance the visual quality of photos. Considering the different aesthetic preferences of users, the target of retouching is subjective. However, current retouching methods mostly adopt deterministic models, which not only neglects the style diversity in the expert-retouched results and tends to learn an average style during training, but also lacks sample diversity during… ▽ More

    Submitted 4 July, 2024; originally announced July 2024.

  7. arXiv:2406.12752  [pdf, other]

    cs.CR cs.CV cs.LG

    Extracting Training Data from Unconditional Diffusion Models

    Authors: Yunhao Chen, Xingjun Ma, Difan Zou, Yu-Gang Jiang

    Abstract: As diffusion probabilistic models (DPMs) are being employed as mainstream models for generative artificial intelligence (AI), the study of their memorization of the raw training data has attracted growing attention. Existing works in this direction aim to establish an understanding of whether or to what extent DPMs learn by memorization. Such an understanding is crucial for identifying potential r… ▽ More

    Submitted 18 June, 2024; originally announced June 2024.

  8. arXiv:2406.11163  [pdf, other]

    eess.SP

    Explainable Bayesian Recurrent Neural Smoother to Capture Global State Evolutionary Correlations

    Authors: Shi Yan, Yan Liang, Huayu Zhang, Le Zheng, Difan Zou, Binglu Wang

    Abstract: Through integrating the evolutionary correlations across global states in the bidirectional recursion, an explainable Bayesian recurrent neural smoother (EBRNS) is proposed for offline data-assisted fixed-interval state smoothing. At first, the proposed model, containing global states in the evolutionary interval, is transformed into an equivalent model with bidirectional memory. This transformati… ▽ More

    Submitted 16 June, 2024; originally announced June 2024.

  9. arXiv:2406.10650  [pdf, other]

    stat.ML cs.LG

    The Implicit Bias of Adam on Separable Data

    Authors: Chenyang Zhang, Difan Zou, Yuan Cao

    Abstract: Adam has become one of the most favored optimizers in deep learning problems. Despite its success in practice, numerous mysteries persist regarding its theoretical understanding. In this paper, we study the implicit bias of Adam in linear logistic regression. Specifically, we show that when the training data are linearly separable, Adam converges towards a linear classifier that achieves the maxim… ▽ More

    Submitted 15 June, 2024; originally announced June 2024.

    Comments: 33 pages, 2 figures

  10. arXiv:2406.02721  [pdf, other]

    cs.CL cs.AI

    Self-Control of LLM Behaviors by Compressing Suffix Gradient into Prefix Controller

    Authors: Min Cai, Yuchen Zhang, Shichang Zhang, Fan Yin, Difan Zou, Yisong Yue, Ziniu Hu

    Abstract: We propose Self-Control, a novel method utilizing suffix gradients to control the behavior of large language models (LLMs) without explicit human annotations. Given a guideline expressed in suffix string and the model's self-assessment of adherence, Self-Control computes the gradient of this self-judgment concerning the model's hidden states, directly influencing the auto-regressive generation pro… ▽ More

    Submitted 18 June, 2024; v1 submitted 4 June, 2024; originally announced June 2024.

    Comments: 41 pages, 12 figures, 41 tables; Website: https://llm-self-control.github.io/

  11. arXiv:2405.20914  [pdf, other]

    cs.CR

    RASE: Efficient Privacy-preserving Data Aggregation against Disclosure Attacks for IoTs

    Authors: Zuyan Wang, Jun Tao, Dika Zou

    Abstract: The growing popular awareness of personal privacy raises the following quandary: what is the new paradigm for collecting and protecting the data produced by ever-increasing sensor devices. Most previous studies on co-design of data aggregation and privacy preservation assume that a trusted fusion center adheres to privacy regimes. Very recent work has taken steps towards relaxing the assumption by… ▽ More

    Submitted 31 May, 2024; originally announced May 2024.

    Comments: 14 pages, 19 figures

  12. Knowledge Enhanced Multi-intent Transformer Network for Recommendation

    Authors: Ding Zou, Wei Wei, Feida Zhu, Chuanyu Xu, Tao Zhang, Chengfu Huo

    Abstract: Incorporating Knowledge Graphs into Recommendation has attracted growing attention in industry, due to the great potential of KG in providing abundant supplementary information and interpretability for the underlying models. However, simply integrating KG into recommendation usually brings in negative feedback in industry, due to the ignorance of the following two factors: i) users' multiple inten… ▽ More

    Submitted 30 May, 2024; originally announced May 2024.

    Comments: Accepted by The Web Conf 2024 (WWW 2024) Industry Track. arXiv admin note: text overlap with arXiv:2204.08807

  13. arXiv:2405.20494  [pdf, other]

    cs.CV cs.AI cs.LG

    Slight Corruption in Pre-training Data Makes Better Diffusion Models

    Authors: Hao Chen, Yujin Han, Diganta Misra, Xiang Li, Kai Hu, Difan Zou, Masashi Sugiyama, Jindong Wang, Bhiksha Raj

    Abstract: Diffusion models (DMs) have shown remarkable capabilities in generating realistic high-quality images, audios, and videos. They benefit significantly from extensive pre-training on large-scale datasets, including web-crawled data with paired data and conditions, such as image-text and image-class pairs. Despite rigorous filtering, these pre-training datasets often inevitably contain corrupted pair… ▽ More

    Submitted 30 May, 2024; originally announced May 2024.

    Comments: 50 pages, 33 figures, 4 tables

  14. arXiv:2405.18208  [pdf, other]

    cs.AI cs.CL cs.LG

    A Human-Like Reasoning Framework for Multi-Phases Planning Task with Large Language Models

    Authors: Chengxing Xie, Difan Zou

    Abstract: Recent studies have highlighted the proficiency of large language models (LLMs) in some simple tasks like writing and coding through various reasoning strategies. However, LLM agents still struggle with tasks that require comprehensive planning, a process that challenges current models and remains a critical research issue. In this study, we concentrate on travel planning, a Multi-Phases planning problem that involves multipl…

    Submitted 28 May, 2024; originally announced May 2024.

  15. arXiv:2405.16734  [pdf, other]

    stat.ML cs.LG

    Faster Sampling via Stochastic Gradient Proximal Sampler

    Authors: Xunpeng Huang, Difan Zou, Yi-An Ma, Hanze Dong, Tong Zhang

    Abstract: Stochastic gradients have been widely integrated into Langevin-based methods to improve their scalability and efficiency in solving large-scale sampling problems. However, the proximal sampler, which exhibits much faster convergence than Langevin-based algorithms in the deterministic setting (Lee et al., 2021), has yet to be explored in its stochastic variants. In this paper, we study the Stochasti…

    Submitted 26 May, 2024; originally announced May 2024.

    Comments: 48 pages, 2 figures, 5 tables

  16. arXiv:2405.16387  [pdf, other]

    stat.ML cs.LG

    Reverse Transition Kernel: A Flexible Framework to Accelerate Diffusion Inference

    Authors: Xunpeng Huang, Difan Zou, Hanze Dong, Yi Zhang, Yi-An Ma, Tong Zhang

    Abstract: To generate data from trained diffusion models, most inference algorithms, such as DDPM, DDIM, and other variants, rely on discretizing the reverse SDEs or their equivalent ODEs. In this paper, we view such approaches as decomposing the entire denoising diffusion process into several segments, each corresponding to a reverse transition kernel (RTK) sampling subproblem. Specifically, DDPM uses a Ga… ▽ More

    Submitted 25 May, 2024; originally announced May 2024.

    Comments: 68 pages, 2 figures

  17. arXiv:2404.19521  [pdf, other]

    gr-qc

    Nonlinear scalarization of Schwarzschild black holes in Einstein-scalar-Gauss-Bonnet gravity

    Authors: Chao-Ming Zhang, Zhen-Hao Yang, Meng-Yun Lai, Yun Soo Myung, De-Cheng Zou

    Abstract: In this paper, we propose a fully nonlinear mechanism for obtaining scalarized black holes in Einstein-scalar-Gauss-Bonnet (EsGB) gravity which is beyond the spontaneous scalarization. Introducing three coupling functions $f(\varphi)$ satisfying $f''(0) = 0$, we find that Schwarzschild black hole is linearly stable against scalar perturbation, whereas it is unstable against nonlinear scalar pertur… ▽ More

    Submitted 30 April, 2024; originally announced April 2024.

    Comments: 14 pages, 6 figures

  18. arXiv:2404.19423  [pdf, other]

    gr-qc

    Thermodynamics of charged Lifshitz black holes with scalar hair

    Authors: Shan Wu, Kai-Qiang Qian, Rui-Hong Yue, Ming Zhang, De-Cheng Zou

    Abstract: In this work, we discuss the generalized Einstein-Maxwell-Dilaton gravity theory with a nonminimal coupling between the Maxwell field and scalar field. Considering different geometric properties of black hole horizon structure, the charged dilaton Lifshitz black hole solutions are presented in 4-dimensional spacetimes. Later, utilizing the Wald Formalism, we derive the thermodynamic first law of b… ▽ More

    Submitted 30 April, 2024; v1 submitted 30 April, 2024; originally announced April 2024.

    Comments: 12 pages, 2 figures

  19. arXiv:2404.13815  [pdf, other]

    cs.LG

    Improving Group Robustness on Spurious Correlation Requires Preciser Group Inference

    Authors: Yujin Han, Difan Zou

    Abstract: Standard empirical risk minimization (ERM) models may prioritize learning spurious correlations between spurious features and true labels, leading to poor accuracy on groups where these correlations do not hold. Mitigating this issue often requires expensive spurious attribute (group) labels or relies on trained ERM models to infer group labels when group information is unavailable. However, the s… ▽ More

    Submitted 3 June, 2024; v1 submitted 21 April, 2024; originally announced April 2024.

    Comments: 25 pages, 13 figures, 8 tables

  20. arXiv:2404.11888  [pdf, other]

    cs.LG cs.AI

    The Dog Walking Theory: Rethinking Convergence in Federated Learning

    Authors: Kun Zhai, Yifeng Gao, Xingjun Ma, Difan Zou, Guangnan Ye, Yu-Gang Jiang

    Abstract: Federated learning (FL) is a collaborative learning paradigm that allows different clients to train one powerful global model without sharing their private data. Although FL has demonstrated promising results in various applications, it is known to suffer from convergence issues caused by the data distribution shift across different clients, especially on non-independent and identically distribute… ▽ More

    Submitted 18 April, 2024; originally announced April 2024.

  21. arXiv:2404.09463  [pdf]

    cs.LG

    PRIME: A CyberGIS Platform for Resilience Inference Measurement and Enhancement

    Authors: Debayan Mandal, Dr. Lei Zou, Rohan Singh Wilkho, Joynal Abedin, Bing Zhou, Dr. Heng Cai, Dr. Furqan Baig, Dr. Nasir Gharaibeh, Dr. Nina Lam

    Abstract: In an era of increased climatic disasters, there is an urgent need to develop reliable frameworks and tools for evaluating and improving community resilience to climatic hazards at multiple geographical and temporal scales. Defining and quantifying resilience in the social domain is relatively subjective due to the intricate interplay of socioeconomic factors with disaster resilience. Meanwhile, t… ▽ More

    Submitted 15 April, 2024; originally announced April 2024.

    Comments: 28 pages, 6 figures

  22. arXiv:2404.07545  [pdf, other]

    cs.CV

    Stereo-LiDAR Depth Estimation with Deformable Propagation and Learned Disparity-Depth Conversion

    Authors: Ang Li, Anning Hu, Wei Xi, Wenxian Yu, Danping Zou

    Abstract: Accurate and dense depth estimation with stereo cameras and LiDAR is an important task for automatic driving and robotic perception. While sparse hints from LiDAR points have improved cost aggregation in stereo matching, their effectiveness is limited by the low density and non-uniform distribution. To address this issue, we propose a novel stereo-LiDAR depth estimation network with Semi-Dense hin… ▽ More

    Submitted 11 April, 2024; originally announced April 2024.

    Comments: Accepted in ICRA 2024. 8 pages, 6 figures

  23. arXiv:2404.01601  [pdf, other]

    cs.LG

    What Can Transformer Learn with Varying Depth? Case Studies on Sequence Learning Tasks

    Authors: Xingwu Chen, Difan Zou

    Abstract: We study the capabilities of the transformer architecture with varying depth. Specifically, we designed a novel set of sequence learning tasks to systematically evaluate and comprehend how the depth of transformer affects its ability to perform memorization, reasoning, generalization, and contextual generalization. We show a transformer with only one attention layer can excel in memorization but f… ▽ More

    Submitted 1 April, 2024; originally announced April 2024.

  24. arXiv:2403.17592  [pdf, other]

    cs.LG stat.ML

    On the Benefits of Over-parameterization for Out-of-Distribution Generalization

    Authors: Yifan Hao, Yong Lin, Difan Zou, Tong Zhang

    Abstract: In recent years, machine learning models have achieved success based on the independently and identically distributed assumption. However, this assumption can be easily violated in real-world applications, leading to the Out-of-Distribution (OOD) problem. Understanding how modern over-parameterized DNNs behave under non-trivial natural distributional shifts is essential, as current theoretical und… ▽ More

    Submitted 26 March, 2024; originally announced March 2024.

  25. arXiv:2403.16409  [pdf]

    astro-ph.IM astro-ph.CO

    Large-scale Array for Radio Astronomy on the Farside

    Authors: Xuelei Chen, Feng Gao, Fengquan Wu, Yechi Zhang, Tong Wang, Weilin Liu, Dali Zou, Furen Deng, Yang Gong, Kai He, Jixia Li, Shijie Sun, Nanben Suo, Yougang Wang, Pengju Wu, Jiaqin Xu, Yidong Xu, Bin Yue, Cong Zhang, Jia Zhou, Minquan Zhou, Chenguang Zhu, Jiacong Zhu

    Abstract: At the Royal Society meeting in 2023, we mainly presented our lunar orbit array concept, called DSL, and also briefly introduced a concept for a lunar surface array, LARAF. As the DSL concept had been presented before, in this article we introduce LARAF. We propose to build an array on the far side of the Moon, with a master station which handles the data collection and processing, and 20 s…

    Submitted 24 March, 2024; originally announced March 2024.

    Comments: final submission version, 30 pages, 16 figures

    Journal ref: Phil. Trans. R. Soc. A 382, 20230094 (2024)

  26. Thermodynamics of charged black holes in Maxwell-dilaton-massive gravity

    Authors: Rui-Hong Yue, Kai-Qiang Qian, Bo Liu, De-Cheng Zou

    Abstract: Considering the nonminimal coupling of the dilaton field to the massive graviton field in Maxwell-dilaton-massive gravity, we obtain a class of analytical solutions of charged black holes, which are neither asymptotically flat nor (A)dS. The calculated thermodynamic quantities, such as mass, temperature and entropy, verify the validity of the first law of black hole thermodynamics. Moreover, we fu… ▽ More

    Submitted 23 May, 2024; v1 submitted 20 March, 2024; originally announced March 2024.

    Comments: 20 pages, 5 figures, 8 tables

    Journal ref: Chinese Physics C Vol. 48, No. 7 (2024) 075104

  27. arXiv:2403.12521  [pdf]

    eess.SY

    Multi-mode Fault Diagnosis Datasets of Gearbox Under Variable Working Conditions

    Authors: Shijin Chen, Zeyi Liu, Xiao He, Dongliang Zou, Donghua Zhou

    Abstract: The gearbox is a critical component of electromechanical systems. The occurrence of multiple faults can significantly impact system accuracy and service life. The vibration signal of the gearbox is an effective indicator of its operational status and fault information. However, gearboxes in real industrial settings often operate under variable working conditions, such as varying speeds and loads.… ▽ More

    Submitted 8 April, 2024; v1 submitted 19 March, 2024; originally announced March 2024.

    Comments: 10 pages, 12 figures

  28. arXiv:2403.11845  [pdf]

    eess.SP

    Simplified Self-homodyne Coherent System Based on Alamouti Coding and Digital Subcarrier Multiplexing

    Authors: Wei Wang, Dongdong Zou, Zhenpeng Wu, Qi Sui, Xingwen Yi, Fan Li, Chao Lu, Zhaohui Li

    Abstract: Coherent technology inherent with more available degrees of freedom is deemed a competitive solution for next-generation ultra-high-speed short-reach optical interconnects. However, the fatal barriers to implementing the conventional coherent system in short-reach optical interconnect are the cost, footprint, and power consumption. Self-homodyne coherent system exhibits its potential to reduce the power…

    Submitted 18 March, 2024; originally announced March 2024.

  29. arXiv:2403.08585  [pdf, other]

    cs.LG

    Improving Implicit Regularization of SGD with Preconditioning for Least Square Problems

    Authors: Junwei Su, Difan Zou, Chuan Wu

    Abstract: Stochastic gradient descent (SGD) exhibits strong algorithmic regularization effects in practice and plays an important role in the generalization of modern machine learning. However, prior research has revealed instances where the generalization performance of SGD is worse than ridge regression due to uneven optimization along different dimensions. Preconditioning offers a natural solution to thi… ▽ More

    Submitted 26 May, 2024; v1 submitted 13 March, 2024; originally announced March 2024.

  30. arXiv:2403.06183  [pdf, other]

    cs.LG math.OC math.ST stat.ML

    An Improved Analysis of Langevin Algorithms with Prior Diffusion for Non-Log-Concave Sampling

    Authors: Xunpeng Huang, Hanze Dong, Difan Zou, Tong Zhang

    Abstract: Understanding the dimension dependency of computational complexity in high-dimensional sampling problems is a fundamental question, both from a practical and theoretical perspective. Compared with samplers with unbiased stationary distribution, e.g., Metropolis-adjusted Langevin algorithm (MALA), biased samplers, e.g., Underdamped Langevin Dynamics (ULD), perform better in low-accuracy cases just be…

    Submitted 10 March, 2024; originally announced March 2024.

    Comments: 32 pages

  31. arXiv:2403.04010  [pdf, other]

    cs.LG

    Three Revisits to Node-Level Graph Anomaly Detection: Outliers, Message Passing and Hyperbolic Neural Networks

    Authors: Jing Gu, Dongmian Zou

    Abstract: Graph anomaly detection plays a vital role for identifying abnormal instances in complex networks. Despite advancements of methodology based on deep learning in recent years, existing benchmarking approaches exhibit limitations that hinder a comprehensive comparison. In this paper, we revisit datasets and approaches for unsupervised node-level graph anomaly detection tasks from three aspects. Firs… ▽ More

    Submitted 6 March, 2024; originally announced March 2024.

    Comments: Presented at the Second Learning on Graphs Conference (LoG 2023)

  32. arXiv:2403.03962  [pdf, other]

    cs.SI cs.AI cs.NE

    Identify Critical Nodes in Complex Network with Large Language Models

    Authors: Jinzhu Mao, Dongyun Zou, Li Sheng, Siyi Liu, Chen Gao, Yue Wang, Yong Li

    Abstract: Identifying critical nodes in networks is a classical decision-making task, and many methods struggle to strike a balance between adaptability and utility. Therefore, we propose an approach that empowers Evolutionary Algorithm (EA) with Large Language Models (LLMs), to generate a function called "score_nodes" which can further be used to identify crucial nodes based on their assigned scores. Our…

    Submitted 1 March, 2024; originally announced March 2024.

  33. arXiv:2403.01092  [pdf, other]

    cs.LG

    Pairwise Alignment Improves Graph Domain Adaptation

    Authors: Shikun Liu, Deyu Zou, Han Zhao, Pan Li

    Abstract: Graph-based methods, pivotal for label inference over interconnected objects in many real-world applications, often encounter generalization challenges, if the graph used for model training differs significantly from the graph used for testing. This work delves into Graph Domain Adaptation (GDA) to address the unique complexities of distribution shifts over graph data, where interconnected data po… ▽ More

    Submitted 4 June, 2024; v1 submitted 1 March, 2024; originally announced March 2024.

    Comments: ICML 2024. Our code and data are available at: https://github.com/Graph-COM/Pair-Align

  34. arXiv:2402.14308  [pdf, other]

    cs.RO

    Ground-Fusion: A Low-cost Ground SLAM System Robust to Corner Cases

    Authors: Jie Yin, Ang Li, Wei Xi, Wenxian Yu, Danping Zou

    Abstract: We introduce Ground-Fusion, a low-cost sensor fusion simultaneous localization and mapping (SLAM) system for ground vehicles. Our system features efficient initialization, effective sensor anomaly detection and handling, real-time dense color mapping, and robust localization in diverse environments. We tightly integrate RGB-D images, inertial measurements, wheel odometer and GNSS signals within a… ▽ More

    Submitted 22 February, 2024; originally announced February 2024.

  35. arXiv:2402.12987  [pdf, other]

    cs.LG

    Towards Robust Graph Incremental Learning on Evolving Graphs

    Authors: Junwei Su, Difan Zou, Zijun Zhang, Chuan Wu

    Abstract: Incremental learning is a machine learning approach that involves training a model on a sequence of tasks, rather than all tasks at once. This ability to learn incrementally from a stream of tasks is crucial for many real-world applications. However, incremental learning is a challenging problem on graph-structured data, as many graph-related problems involve prediction tasks for each individual n… ▽ More

    Submitted 20 February, 2024; originally announced February 2024.

  36. arXiv:2402.04284  [pdf, other]

    cs.LG

    PRES: Toward Scalable Memory-Based Dynamic Graph Neural Networks

    Authors: Junwei Su, Difan Zou, Chuan Wu

    Abstract: Memory-based Dynamic Graph Neural Networks (MDGNNs) are a family of dynamic graph neural networks that leverage a memory module to extract, distill, and memorize long-term temporal dependencies, leading to superior performance compared to memory-less counterparts. However, training MDGNNs faces the challenge of handling entangled temporal and structural dependencies, requiring sequential and chron… ▽ More

    Submitted 26 February, 2024; v1 submitted 5 February, 2024; originally announced February 2024.

  37. arXiv:2401.09767  [pdf, other]

    cs.CR cs.SE

    On the Effectiveness of Function-Level Vulnerability Detectors for Inter-Procedural Vulnerabilities

    Authors: Zhen Li, Ning Wang, Deqing Zou, Yating Li, Ruqian Zhang, Shouhuai Xu, Chao Zhang, Hai Jin

    Abstract: Software vulnerabilities are a major cyber threat and it is important to detect them. One important approach to detecting vulnerabilities is to use deep learning while treating a program function as a whole, known as function-level vulnerability detectors. However, the limitation of this approach is not understood. In this paper, we investigate its limitation in detecting one class of vulnerabilit… ▽ More

    Submitted 20 January, 2024; v1 submitted 18 January, 2024; originally announced January 2024.

    Comments: 12 pages, 7 figures. To appear in the Proceedings of the 46th International Conference on Software Engineering (ICSE'24)

  38. arXiv:2401.06325  [pdf, other]

    stat.ML cs.LG math.OC stat.CO

    Faster Sampling without Isoperimetry via Diffusion-based Monte Carlo

    Authors: Xunpeng Huang, Difan Zou, Hanze Dong, Yian Ma, Tong Zhang

    Abstract: To sample from a general target distribution $p_*\propto e^{-f_*}$ beyond the isoperimetric condition, Huang et al. (2023) proposed to perform sampling through reverse diffusion, giving rise to Diffusion-based Monte Carlo (DMC). Specifically, DMC follows the reverse SDE of a diffusion process that transforms the target distribution to the standard Gaussian, utilizing a non-parametric score estimat… ▽ More

    Submitted 11 January, 2024; originally announced January 2024.

    Comments: 54 pages

  39. arXiv:2401.00144  [pdf, other]

    gr-qc

    Scalarization of Kerr-Newman black holes in the Einstein-Chern-Simons-scalar theory

    Authors: Kun-Hui Fan, Yun Soo Myung, De-Cheng Zou, Meng-Yun Lai

    Abstract: We investigate the tachyonic instability of the Kerr-Newman (KN) black hole with a rotation parameter $a$ in the Einstein-Chern-Simons-scalar theory coupled with a quadratic massive scalar field. This instability analysis corresponds to exploring the onset of spontaneous scalarization for KN black holes. First, we find no $a$-bound for the $\alpha<0$ case by considering a (1+1)-dimensional analytical method. A d…

    Submitted 2 January, 2024; v1 submitted 30 December, 2023; originally announced January 2024.

    Comments: 18 pages, 4 figures, minor changes

  40. arXiv:2312.08886  [pdf, other]

    cs.CV

    Diffusion-based Blind Text Image Super-Resolution

    Authors: Yuzhe Zhang, Jiawei Zhang, Hao Li, Zhouxia Wang, Luwei Hou, Dongqing Zou, Liheng Bian

    Abstract: Recovering degraded low-resolution text images is challenging, especially for Chinese text images with complex strokes and severe degradation in real-world scenarios. Ensuring both text fidelity and style realness is crucial for high-quality text image super-resolution. Recently, diffusion models have achieved great success in natural image synthesis and restoration due to their powerful data dist… ▽ More

    Submitted 3 March, 2024; v1 submitted 13 December, 2023; originally announced December 2023.

    Comments: Accepted by CVPR 2024

  41. arXiv:2312.03222  [pdf, other]

    cs.CV

    Predicting Scores of Various Aesthetic Attribute Sets by Learning from Overall Score Labels

    Authors: Heng Huang, Xin Jin, Yaqi Liu, Hao Lou, Chaoen Xiao, Shuai Cui, Xinning Li, Dongqing Zou

    Abstract: Many mobile phones now embed deep-learning models for evaluation or guidance on photography. These models cannot provide detailed results like human pose scores or scene color scores because corresponding aesthetic attribute data is scarce. However, the annotation of image aesthetic attribute scores requires experienced artists and professional photographers, which hinders the collection of l…

    Submitted 5 December, 2023; originally announced December 2023.

  42. arXiv:2312.03017  [pdf, other]

    cs.LG physics.optics

    AI-driven emergence of frequency information non-uniform distribution via THz metasurface spectrum prediction

    Authors: Xiaohua Xing, Yuqi Ren, Die Zou, Qiankun Zhang, Bingxuan Mao, Jianquan Yao, Deyi Xiong, Shuang Zhang, Liang Wu

    Abstract: Recently, artificial intelligence has been extensively deployed across various scientific disciplines, optimizing and guiding the progression of experiments through the integration of abundant datasets, whilst continuously probing the vast theoretical space encapsulated within the data. Particularly, deep learning models, due to their end-to-end adaptive learning capabilities, are capable of auton… ▽ More

    Submitted 4 December, 2023; originally announced December 2023.

    Comments: 11 pages, 4 figures

  43. arXiv:2311.12864  [pdf, other]

    math.OC cs.LG

    OptScaler: A Hybrid Proactive-Reactive Framework for Robust Autoscaling in the Cloud

    Authors: Ding Zou, Wei Lu, Zhibo Zhu, Xingyu Lu, Jun Zhou, Xiaojin Wang, Kangyu Liu, Haiqing Wang, Kefan Wang, Renen Sun

    Abstract: Autoscaling is a vital mechanism in cloud computing that supports the autonomous adjustment of computing resources under dynamic workloads. A primary goal of autoscaling is to stabilize resource utilization at a desirable level, thus reconciling the need for resource-saving with the satisfaction of Service Level Objectives (SLOs). Existing proactive autoscaling methods anticipate the future worklo… ▽ More

    Submitted 26 October, 2023; originally announced November 2023.

  44. GRAM: An Interpretable Approach for Graph Anomaly Detection using Gradient Attention Maps

    Authors: Yifei Yang, Peng Wang, Xiaofan He, Dongmian Zou

    Abstract: Detecting unusual patterns in graph data is a crucial task in data mining. However, existing methods face challenges in consistently achieving satisfactory performance and often lack interpretability, which hinders our understanding of anomaly detection decisions. In this paper, we propose a novel approach to graph anomaly detection that leverages the power of interpretability to enhance performan…

    Submitted 26 June, 2024; v1 submitted 10 November, 2023; originally announced November 2023.

    Journal ref: Neural Networks 178 (2024) 106463

  45. arXiv:2311.02880  [pdf, other]

    cs.LG

    MultiSPANS: A Multi-range Spatial-Temporal Transformer Network for Traffic Forecast via Structural Entropy Optimization

    Authors: Dongcheng Zou, Senzhang Wang, Xuefeng Li, Hao Peng, Yuandong Wang, Chunyang Liu, Kehua Sheng, Bo Zhang

    Abstract: Traffic forecasting is a complex multivariate time-series regression task of paramount importance for traffic management and planning. However, existing approaches often struggle to model complex multi-range dependencies using local spatiotemporal features and road network hierarchical knowledge. To address this, we propose MultiSPANS. First, considering that an individual recording point cannot r…

    Submitted 6 November, 2023; originally announced November 2023.

    Comments: 10 pages, 7 figures, conference. The work has been accepted by WSDM 2024

  46. arXiv:2311.01375  [pdf, other]

    cs.LG math.NA

    Monotone Generative Modeling via a Gromov-Monge Embedding

    Authors: Wonjun Lee, Yifei Yang, Dongmian Zou, Gilad Lerman

    Abstract: Generative adversarial networks (GANs) are popular for generative tasks; however, they often require careful architecture selection, extensive empirical tuning, and are prone to mode collapse. To overcome these challenges, we propose a novel model that identifies the low-dimensional structure of the underlying data distribution, maps it into a low-dimensional latent space while preserving the unde…

    Submitted 3 July, 2024; v1 submitted 2 November, 2023; originally announced November 2023.

    Comments: 21 pages excluding references

  47. arXiv:2310.17074  [pdf, other]

    cs.LG math.OC stat.ML

    Benign Oscillation of Stochastic Gradient Descent with Large Learning Rates

    Authors: Miao Lu, Beining Wu, Xiaodong Yang, Difan Zou

    Abstract: In this work, we theoretically investigate the generalization properties of neural networks (NN) trained by the stochastic gradient descent (SGD) algorithm with large learning rates. Under such a training regime, we find that the oscillation of the NN weights caused by large-learning-rate SGD training turns out to be beneficial to the generalization of the NN, which potentially improves ov…

    Submitted 25 October, 2023; originally announced October 2023.

    Comments: 63 pages, 10 figures

  48. Black holes in massive Einstein-dilaton gravity

    Authors: Bo Liu, Rui-Hong Yue, De-Cheng Zou, Lina Zhang, Zhan-Ying Yang, Qiyuan Pan

    Abstract: In this paper, we focus on massive Einstein-dilaton gravity including the coupling of dilaton scalar field to massive graviton terms, and then derive static and spherically symmetric solutions of dilatonic black holes in four dimensional spacetime. We find that the dilatonic black hole could possess two horizons (event and cosmological), extreme (Nariai) and naked singularity for the suitably fixe…

    Submitted 5 March, 2024; v1 submitted 23 October, 2023; originally announced October 2023.

    Comments: 23 pages, 12 figures

    Journal ref: Phys. Rev. D 109, 064013 (2024)

  49. arXiv:2310.08677  [pdf, other]

    cs.LG cs.AI

    GDL-DS: A Benchmark for Geometric Deep Learning under Distribution Shifts

    Authors: Deyu Zou, Shikun Liu, Siqi Miao, Victor Fung, Shiyu Chang, Pan Li

    Abstract: Geometric deep learning (GDL) has gained significant attention in various scientific fields, chiefly for its proficiency in modeling data with intricate geometric structures. Yet, very few works have delved into its capability of tackling the distribution shift problem, a prevalent challenge in many relevant applications. To bridge this gap, we propose GDL-DS, a comprehensive benchmark designed fo…

    Submitted 12 October, 2023; originally announced October 2023.

    Comments: Code and data are available at https://github.com/Graph-COM/GDL_DS

  50. arXiv:2310.08391  [pdf, other]

    stat.ML cs.LG

    How Many Pretraining Tasks Are Needed for In-Context Learning of Linear Regression?

    Authors: Jingfeng Wu, Difan Zou, Zixiang Chen, Vladimir Braverman, Quanquan Gu, Peter L. Bartlett

    Abstract: Transformers pretrained on diverse tasks exhibit remarkable in-context learning (ICL) capabilities, enabling them to solve unseen tasks solely based on input contexts without adjusting model parameters. In this paper, we study ICL in one of its simplest setups: pretraining a linearly parameterized single-layer linear attention model for linear regression with a Gaussian prior. We establish a stati…

    Submitted 14 March, 2024; v1 submitted 12 October, 2023; originally announced October 2023.

    Comments: ICLR 2024 Camera Ready