-
UP-Diff: Latent Diffusion Model for Remote Sensing Urban Prediction
Authors:
Zeyu Wang,
Zecheng Hao,
Jingyu Lin,
Yuchao Feng,
Yufei Guo
Abstract:
This study introduces a novel Remote Sensing (RS) Urban Prediction (UP) task focused on future urban planning, which aims to forecast urban layouts by utilizing information from existing urban layouts and planned change maps. To address the proposed RS UP task, we propose UP-Diff, which leverages a Latent Diffusion Model (LDM) to capture position-aware embeddings of pre-change urban layouts and planned change maps. Specifically, the trainable cross-attention layers within UP-Diff's iterative diffusion modules enable the model to dynamically highlight crucial regions for targeted modifications. By utilizing UP-Diff, designers can effectively refine and adjust future urban plans by modifying the change maps in a dynamic and adaptive manner. Compared with conventional RS Change Detection (CD) methods, the proposed UP-Diff for the RS UP task avoids the requirement of paired pre-change and post-change images, which enhances its practical usage in city development. Experimental results on the LEVIR-CD and SYSU-CD datasets show UP-Diff's ability to accurately predict future urban layouts with high fidelity, demonstrating its potential for urban planning. Code and model weights are available at https://github.com/zeyuwang-zju/UP-Diff.
Submitted 16 July, 2024; v1 submitted 16 July, 2024;
originally announced July 2024.
-
PAPM: A Physics-aware Proxy Model for Process Systems
Authors:
Pengwei Liu,
Zhongkai Hao,
Xingyu Ren,
Hangjie Yuan,
Jiayang Ren,
Dong Ni
Abstract:
In the context of proxy modeling for process systems, traditional data-driven deep learning approaches frequently encounter significant challenges, such as substantial training costs induced by large amounts of data, and limited generalization capabilities. As a promising alternative, physics-aware models incorporate partial physics knowledge to ameliorate these challenges. Although demonstrating efficacy, they fall short in terms of exploration depth and universality. To address these shortcomings, we introduce a physics-aware proxy model (PAPM) that fully incorporates partial prior physics of process systems, which includes multiple input conditions and the general form of conservation relations, resulting in better out-of-sample generalization. Additionally, PAPM contains a holistic temporal-spatial stepping module for flexible adaptation across various process systems. Through systematic comparisons with state-of-the-art pure data-driven and physics-aware models across five two-dimensional benchmarks in nine generalization tasks, PAPM notably achieves an average performance improvement of 6.7%, while requiring fewer FLOPs, and just 1% of the parameters compared to the prior leading method. The code is available at https://github.com/pengwei07/PAPM.
Submitted 6 July, 2024;
originally announced July 2024.
-
Collision Avoidance for Multiple UAVs in Unknown Scenarios with Causal Representation Disentanglement
Authors:
Jiafan Zhuang,
Zihao Xia,
Gaofei Han,
Boxi Wang,
Wenji Li,
Dongliang Wang,
Zhifeng Hao,
Ruichu Cai,
Zhun Fan
Abstract:
Deep reinforcement learning (DRL) has achieved remarkable progress in online path planning tasks for multi-UAV systems. However, existing DRL-based methods often suffer from performance degradation when tackling unseen scenarios, since the non-causal factors in visual representations adversely affect policy learning. To address this issue, we propose a novel representation learning approach, i.e., causal representation disentanglement, which can identify the causal and non-causal factors in representations. We then pass only the causal factors to subsequent policy learning and thus explicitly eliminate the influence of non-causal factors, which effectively improves the generalization ability of DRL models. Experimental results show that our proposed method achieves robust navigation performance and effective collision avoidance, especially in unseen scenarios, significantly outperforming existing SOTA algorithms.
Submitted 15 July, 2024; v1 submitted 4 July, 2024;
originally announced July 2024.
-
Robust Policy Learning for Multi-UAV Collision Avoidance with Causal Feature Selection
Authors:
Jiafan Zhuang,
Gaofei Han,
Zihao Xia,
Boxi Wang,
Wenji Li,
Dongliang Wang,
Zhifeng Hao,
Ruichu Cai,
Zhun Fan
Abstract:
In unseen and complex outdoor environments, collision avoidance navigation for unmanned aerial vehicle (UAV) swarms presents a challenging problem. It requires UAVs to navigate through various obstacles and complex backgrounds. Existing collision avoidance navigation methods based on deep reinforcement learning show promising performance but suffer from poor generalization abilities, resulting in performance degradation in unseen environments. To address this issue, we investigate the cause of weak generalization ability in DRL and propose a novel causal feature selection module. This module can be integrated into the policy network and effectively filters out non-causal factors in representations, thereby reducing the influence of spurious correlations between non-causal factors and action predictions. Experimental results demonstrate that our proposed method can achieve robust navigation performance and effective collision avoidance especially in scenarios with unseen backgrounds and obstacles, which significantly outperforms existing state-of-the-art algorithms.
Submitted 15 July, 2024; v1 submitted 4 July, 2024;
originally announced July 2024.
-
Estimating Long-term Heterogeneous Dose-response Curve: Generalization Bound Leveraging Optimal Transport Weights
Authors:
Zeqin Yang,
Weilin Chen,
Ruichu Cai,
Yuguang Yan,
Zhifeng Hao,
Zhipeng Yu,
Zhichao Zou,
Zhen Peng,
Jiecheng Guo
Abstract:
Long-term causal effect estimation is a significant but challenging problem in many applications. Existing methods rely on ideal assumptions to estimate long-term average effects, e.g., no unobserved confounders or a binary treatment, while in numerous real-world applications these assumptions could be violated and average effects are unable to provide individual-level suggestions. In this paper, we address a more general problem of estimating the long-term heterogeneous dose-response curve (HDRC) while accounting for unobserved confounders. Specifically, to remove unobserved confounding in observational data, we introduce an optimal transport weighting framework to align the observational data to the experimental data with theoretical guarantees. Furthermore, to accurately predict the heterogeneous effects of continuous treatment, we establish a generalization bound on counterfactual prediction error by leveraging the reweighted distribution induced by optimal transport. Finally, we develop an HDRC estimator building upon the above theoretical foundations. Extensive experimental studies conducted on multiple synthetic and semi-synthetic datasets demonstrate the effectiveness of our proposed method.
Submitted 27 June, 2024;
originally announced June 2024.
-
Inverse Design of Planar Clamped-Free Elastic Rods from Noisy Data
Authors:
Dezhong Tong,
Zhuonan Hao,
Weicheng Huang
Abstract:
Slender structures, such as rods, often exhibit large nonlinear geometrical deformations even under moderate external forces (e.g., gravity). This characteristic results in a rich variety of morphological changes, making them appealing for engineering design and applications, such as soft robots, submarine cables, decorative knots, and more. Prior studies have demonstrated that the natural shape of a rod significantly influences its deformed geometry. Consequently, the natural shape of the rod should be considered when manufacturing and designing rod-like structures. Here, we focus on an inverse problem: can we determine the natural shape of a suspended 2D planar rod so that it deforms into a desired target shape? We begin by formulating a theoretical framework based on the statics of planar rod equilibrium that can compute the natural shape of a planar rod given its target shape. Furthermore, we analyze the impact of uncertainties (e.g., noise in the data) on the accuracy of the theoretical framework. The results reveal the shortcomings of the theoretical framework in handling uncertainties in the inverse problem, a fact often overlooked in previous works. To mitigate the influence of the uncertainties, we combine the statics of the planar rod with the adjoint method for parameter sensitivity analysis, constructing a learning framework that can efficiently explore the natural shape of the designed rod with enhanced robustness. This framework is validated numerically for its accuracy and robustness, offering valuable insights into the inverse design of soft structures for various applications, including soft robotics and animation of morphing structures.
Submitted 21 June, 2024;
originally announced June 2024.
-
Superfluid stiffness of twisted multilayer graphene superconductors
Authors:
Abhishek Banerjee,
Zeyu Hao,
Mary Kreidel,
Patrick Ledwith,
Isabelle Phinney,
Jeong Min Park,
Andrew M. Zimmerman,
Kenji Watanabe,
Takashi Taniguchi,
Robert M Westervelt,
Pablo Jarillo-Herrero,
Pavel A. Volkov,
Ashvin Vishwanath,
Kin Chung Fong,
Philip Kim
Abstract:
The robustness of the macroscopic quantum nature of a superconductor can be characterized by the superfluid stiffness, $ρ_s$, a quantity that describes the energy required to vary the phase of the macroscopic quantum wave function. In unconventional superconductors, such as cuprates, the low-temperature behavior of $ρ_s$ drastically differs from that of conventional superconductors due to quasiparticle excitations from gapless points (nodes) in momentum space. Intensive research on the recently discovered magic-angle twisted graphene family has revealed, in addition to superconducting states, strongly correlated electronic states associated with spontaneously broken symmetries, inviting the study of $ρ_s$ to uncover the potentially unconventional nature of its superconductivity. Here we report the measurement of $ρ_s$ in magic-angle twisted trilayer graphene (TTG), revealing unconventional nodal-gap superconductivity. Utilizing radio-frequency reflectometry techniques to measure the kinetic inductive response of superconducting TTG coupled to a microwave resonator, we find a linear temperature dependence of $ρ_s$ at low temperatures and nonlinear Meissner effects in the current bias dependence, both indicating nodal structures in the superconducting order parameter. Furthermore, the doping dependence shows a linear correlation between the zero temperature $ρ_s$ and the superconducting transition temperature $T_c$, reminiscent of Uemura's relation in cuprates, suggesting phase-coherence-limited superconductivity. Our results provide strong evidence for nodal superconductivity in TTG and put strong constraints on the mechanisms of these graphene-based superconductors.
Submitted 19 June, 2024;
originally announced June 2024.
-
Low Mach Number Limit of the Viscous and Heat Conductive Flow with general pressure law on torus
Authors:
Yuhan Chen,
Guilong Gui,
Zhen Hao,
Ning Jiang
Abstract:
We prove the low Mach number limit of the compressible Navier-Stokes-Fourier system with a general pressure law around a constant state on the torus $\mathbb{T}^N_a$. We view this limit as a special case of the weakly nonlinear-dissipative approximation of the general hyperbolic-parabolic system with entropy. In particular, we consider ill-prepared initial data, for which the group of fast acoustic waves needs to be filtered. This extends previous works, in particular Danchin [Amer. J. Math. 124 (2002), 1153-1219], in two ways: 1. We treat the fully general non-isentropic flow, i.e. the pressure depends on the density $ρ$ and temperature $θ$ through the basic thermodynamic law. We illustrate the role played by the entropy structure of the system in the coupling of the acoustic waves and the incompressible flow, and in the construction of the filtering group. 2. We refine the small divisor estimate, which helps us give the first explicit convergence rate of the filtered acoustic waves, whose propagation is governed by a non-local averaged system. In previous works, only the convergence rate of the incompressible limit was obtained.
Submitted 18 June, 2024;
originally announced June 2024.
-
Intrinsic high-fidelity spin polarization of charged vacancies in hexagonal boron nitride
Authors:
Wonjae Lee,
Vincent S. Liu,
Zhelun Zhang,
Sangha Kim,
Ruotian Gong,
Xinyi Du,
Khanh Pham,
Thomas Poirier,
Zeyu Hao,
James H. Edgar,
Philip Kim,
Chong Zu,
Emily J. Davis,
Norman Y. Yao
Abstract:
The negatively charged boron vacancy ($\mathrm{V}_{\mathrm{B}}^-$) in hexagonal boron nitride (hBN) has garnered significant attention among defects in two-dimensional materials. This owes, in part, to its deterministic generation, well-characterized atomic structure, and optical polarizability at room temperature. We investigate the latter through extensive measurements probing both the ground and excited state polarization dynamics. We develop a semiclassical model based on these measurements that predicts a near-unity degree of spin polarization, surpassing other solid-state spin defects under ambient conditions. Building upon our model, we include the presence of nuclear spin degrees of freedom adjacent to the $\mathrm{V}_{\mathrm{B}}^-$ and perform a comprehensive set of Lindbladian numerics to investigate the hyperfine-induced polarization of the nuclear spins. Our simulations predict a number of important features that emerge as a function of magnetic field which are borne out by experiment.
Submitted 17 June, 2024;
originally announced June 2024.
-
DeepSeek-Coder-V2: Breaking the Barrier of Closed-Source Models in Code Intelligence
Authors:
DeepSeek-AI,
Qihao Zhu,
Daya Guo,
Zhihong Shao,
Dejian Yang,
Peiyi Wang,
Runxin Xu,
Y. Wu,
Yukun Li,
Huazuo Gao,
Shirong Ma,
Wangding Zeng,
Xiao Bi,
Zihui Gu,
Hanwei Xu,
Damai Dai,
Kai Dong,
Liyue Zhang,
Yishi Piao,
Zhibin Gou,
Zhenda Xie,
Zhewen Hao,
Bingxuan Wang,
Junxiao Song,
Deli Chen
, et al. (15 additional authors not shown)
Abstract:
We present DeepSeek-Coder-V2, an open-source Mixture-of-Experts (MoE) code language model that achieves performance comparable to GPT4-Turbo in code-specific tasks. Specifically, DeepSeek-Coder-V2 is further pre-trained from an intermediate checkpoint of DeepSeek-V2 with an additional 6 trillion tokens. Through this continued pre-training, DeepSeek-Coder-V2 substantially enhances the coding and mathematical reasoning capabilities of DeepSeek-V2, while maintaining comparable performance in general language tasks. Compared to DeepSeek-Coder-33B, DeepSeek-Coder-V2 demonstrates significant advancements in various aspects of code-related tasks, as well as reasoning and general capabilities. Additionally, DeepSeek-Coder-V2 expands its support for programming languages from 86 to 338, while extending the context length from 16K to 128K. In standard benchmark evaluations, DeepSeek-Coder-V2 achieves superior performance compared to closed-source models such as GPT4-Turbo, Claude 3 Opus, and Gemini 1.5 Pro in coding and math benchmarks.
Submitted 17 June, 2024;
originally announced June 2024.
-
A Peek into Token Bias: Large Language Models Are Not Yet Genuine Reasoners
Authors:
Bowen Jiang,
Yangxinyu Xie,
Zhuoqun Hao,
Xiaomeng Wang,
Tanwi Mallick,
Weijie J. Su,
Camillo J. Taylor,
Dan Roth
Abstract:
This study introduces a hypothesis-testing framework to assess whether large language models (LLMs) possess genuine reasoning abilities or primarily depend on token bias. We go beyond evaluating LLMs on accuracy; rather, we aim to investigate their token bias in solving logical reasoning tasks. Specifically, we develop carefully controlled synthetic datasets, featuring conjunction fallacy and syllogistic problems. Our framework outlines a list of hypotheses where token biases are readily identifiable, with all null hypotheses assuming genuine reasoning capabilities of LLMs. The findings in this study suggest, with statistical guarantee, that most LLMs still struggle with logical reasoning. While they may perform well on classic problems, their success largely depends on recognizing superficial patterns with strong token bias, thereby raising concerns about their actual reasoning and generalization abilities.
Submitted 16 June, 2024;
originally announced June 2024.
-
Teaching Large Language Models to Express Knowledge Boundary from Their Own Signals
Authors:
Lida Chen,
Zujie Liang,
Xintao Wang,
Jiaqing Liang,
Yanghua Xiao,
Feng Wei,
Jinglei Chen,
Zhenghong Hao,
Bing Han,
Wei Wang
Abstract:
Large language models (LLMs) have achieved great success, but their occasional content fabrication, or hallucination, limits their practical application. Hallucination arises because LLMs struggle to admit ignorance due to inadequate training on knowledge boundaries. We call it a limitation of LLMs that they cannot accurately express their knowledge boundary, answering questions they know while admitting ignorance of questions they do not know. In this paper, we aim to teach LLMs to recognize and express their knowledge boundary, so that they can reduce hallucinations caused by fabricating answers when they do not know. We propose CoKE, which first probes LLMs' knowledge boundary via internal confidence given a set of questions, and then leverages the probing results to elicit the expression of the knowledge boundary. Extensive experiments show CoKE helps LLMs express knowledge boundaries, answering known questions while declining unknown ones, significantly improving in-domain and out-of-domain performance.
Submitted 16 June, 2024;
originally announced June 2024.
-
A simple and fast finite difference method for the integral fractional Laplacian of variable order
Authors:
Zhaopeng Hao,
Siyuan Shi,
Zhongqiang Zhang,
Rui Du
Abstract:
For the fractional Laplacian of variable order, efficient and accurate numerical evaluation in multiple dimensions is challenging because of the singular nature of the integral. We propose a simple and easy-to-implement finite difference scheme for the multi-dimensional variable-order fractional Laplacian defined by a hypersingular integral. We prove that the scheme is second-order convergent and apply the developed finite difference scheme to solve various equations with the variable-order fractional Laplacian. We present a fast solver with quasi-linear complexity for computing the variable-order fractional Laplacian and the corresponding PDEs. Several numerical examples demonstrate the accuracy and efficiency of our algorithm and verify our theory.
Submitted 15 June, 2024;
originally announced June 2024.
-
Learning Discrete Latent Variable Structures with Tensor Rank Conditions
Authors:
Zhengming Chen,
Ruichu Cai,
Feng Xie,
Jie Qiao,
Anpeng Wu,
Zijian Li,
Zhifeng Hao,
Kun Zhang
Abstract:
Unobserved discrete data are ubiquitous in many scientific disciplines, and how to learn the causal structure of these latent variables is crucial for uncovering data patterns. Most studies focus on the linear latent variable model or impose strict constraints on latent structures, and thus fail to address cases in discrete data involving non-linear relationships or complex latent structures. To address such cases, we explore a tensor rank condition on contingency tables for an observed variable set $\mathbf{X}_p$, showing that the rank is determined by the minimum support of a specific conditional set (not necessarily in $\mathbf{X}_p$) that d-separates all variables in $\mathbf{X}_p$. With this condition, one can locate the latent variables by probing the rank on different observed variable sets, and further identify the latent causal structure under some structural assumptions. We present the corresponding identification algorithm and conduct simulated experiments to verify the effectiveness of our method. In general, our results elegantly extend the identification boundary for causal discovery with discrete latent variables and expand the application scope of causal discovery with latent variables.
Submitted 11 June, 2024;
originally announced June 2024.
-
S$^2$GSL: Incorporating Segment to Syntactic Enhanced Graph Structure Learning for Aspect-based Sentiment Analysis
Authors:
Bingfeng Chen,
Qihan Ouyang,
Yongqi Luo,
Boyan Xu,
Ruichu Cai,
Zhifeng Hao
Abstract:
Previous graph-based approaches in Aspect-based Sentiment Analysis (ABSA) have demonstrated impressive performance by utilizing graph neural networks and attention mechanisms to learn structures of static dependency trees and dynamic latent trees. However, incorporating both semantic and syntactic information simultaneously within complex global structures can introduce irrelevant contexts and syntactic dependencies during the process of graph structure learning, potentially resulting in inaccurate predictions. To address these issues, we propose S$^2$GSL, incorporating Segment to Syntactic enhanced Graph Structure Learning for ABSA. Specifically, S$^2$GSL features segment-aware semantic graph learning and syntax-based latent graph learning, enabling the removal of irrelevant contexts and dependencies, respectively. We further propose a self-adaptive aggregation network that facilitates the fusion of the two graph learning branches, thereby achieving complementarity across diverse structures. Experimental results on four benchmarks demonstrate the effectiveness of our framework.
Submitted 7 June, 2024; v1 submitted 4 June, 2024;
originally announced June 2024.
-
UA-Track: Uncertainty-Aware End-to-End 3D Multi-Object Tracking
Authors:
Lijun Zhou,
Tao Tang,
Pengkun Hao,
Zihang He,
Kalok Ho,
Shuo Gu,
Wenbo Hou,
Zhihui Hao,
Haiyang Sun,
Kun Zhan,
Peng Jia,
Xianpeng Lang,
Xiaodan Liang
Abstract:
3D multiple object tracking (MOT) plays a crucial role in autonomous driving perception. Recent end-to-end query-based trackers simultaneously detect and track objects, which have shown promising potential for the 3D MOT task. However, existing methods overlook the uncertainty issue, which refers to the lack of precise confidence about the state and location of tracked objects. Uncertainty arises owing to various factors during motion observation by cameras, especially occlusions and the small size of target objects, resulting in an inaccurate estimation of the object's position, label, and identity. To this end, we propose an Uncertainty-Aware 3D MOT framework, UA-Track, which tackles the uncertainty problem from multiple aspects. Specifically, we first introduce an Uncertainty-aware Probabilistic Decoder to capture the uncertainty in object prediction with probabilistic attention. Secondly, we propose an Uncertainty-guided Query Denoising strategy to further enhance the training process. We also utilize Uncertainty-reduced Query Initialization, which leverages predicted 2D object location and depth information to reduce query uncertainty. As a result, our UA-Track achieves state-of-the-art performance on the nuScenes benchmark, i.e., 66.3% AMOTA on the test split, surpassing the previous best end-to-end solution by a significant margin of 8.9% AMOTA.
Submitted 4 June, 2024;
originally announced June 2024.
-
Convergence rate of the Euler-Maruyama scheme to density dependent SDEs driven by $α$-stable additive noise
Authors:
Ke Song,
Zimo Hao
Abstract:
In this paper, we establish the weak convergence rate of density-dependent stochastic differential equations with bounded drift driven by $α$-stable processes with $α\in(1,2)$. The well-posedness of these equations has been previously obtained in \cite{wu2023well}. We derive an explicit convergence rate in total variation for the Euler-Maruyama scheme, employing a technique rooted in \cite{hao2023}.
Submitted 31 May, 2024;
originally announced May 2024.
-
Enhancing Adversarial Robustness in SNNs with Sparse Gradients
Authors:
Yujia Liu,
Tong Bu,
Jianhao Ding,
Zecheng Hao,
Tiejun Huang,
Zhaofei Yu
Abstract:
Spiking Neural Networks (SNNs) have attracted great attention for their energy-efficient operations and biologically inspired structures, offering potential advantages over Artificial Neural Networks (ANNs) in terms of energy efficiency and interpretability. Nonetheless, similar to ANNs, the robustness of SNNs remains a challenge, especially when facing adversarial attacks. Existing techniques, whether adapted from ANNs or specifically designed for SNNs, exhibit limitations in training SNNs or defending against strong attacks. In this paper, we propose a novel approach to enhance the robustness of SNNs through gradient sparsity regularization. We observe that SNNs exhibit greater resilience to random perturbations compared to adversarial perturbations, even at larger scales. Motivated by this, we aim to narrow the gap between SNNs under adversarial and random perturbations, thereby improving their overall robustness. To achieve this, we theoretically prove that this performance gap is upper bounded by the gradient sparsity of the probability associated with the true label concerning the input image, laying the groundwork for a practical strategy to train robust SNNs by regularizing the gradient sparsity. We validate the effectiveness of our approach through extensive experiments on both image-based and event-based datasets. The results demonstrate notable improvements in the robustness of SNNs. Our work highlights the importance of gradient sparsity in SNNs and its role in enhancing robustness.
Submitted 30 May, 2024;
originally announced May 2024.
-
Flow-distribution dependent SDEs and Navier-Stokes equations with $\mathbf f$B$\mathbf m$
Authors:
Zimo Hao,
Michael Röckner,
Xicheng Zhang
Abstract:
Motivated by the probabilistic representation of the Navier-Stokes equations, we introduce a novel class of stochastic differential equations that depend on flow distribution. We establish the existence and uniqueness of both strong and weak solutions under one-sided Lipschitz conditions and singular drifts. These newly proposed flow-distribution dependent stochastic differential equations are closely connected to quasilinear backward Kolmogorov equations and forward Fokker-Planck equations. Furthermore, we investigate a stochastic version of the 2D-Navier-Stokes equation associated with fractional Brownian noise. We demonstrate the global well-posedness and smoothness of solutions when the Hurst parameter $H$ lies in the range $(0, \frac12)$ and the initial vorticity is a finite signed measure.
Submitted 29 May, 2024;
originally announced May 2024.
-
Nonreciprocal singularities dominated by the dissipative photon-magnon coupling in non-Hermitian systems
Authors:
Yongzhang Shi,
Chi Zhang,
Zhenhui Hao,
Changjun Jiang,
C. K. Ong,
Ke Xia,
Guozhi Chai
Abstract:
We investigated the magnon-photon coupling in an open cavity magnonic system, which leads to two different nonreciprocal singularities dominated by the dissipative coupling. One type of singularity is the exceptional point, which is just on the exceptional surface in parameter space. The other type of singularity is the bound state in the continuum discovered in the level-attraction-like coupling, which is above the exceptional surface. In experiment, we realized the two different singularities with nonreciprocity and selectivity in an open cavity magnonic system with suitable dissipation rating. Our results can be understood well with the pseudo-Hermitian theory of magnon-polariton system.
Submitted 31 May, 2024; v1 submitted 28 May, 2024;
originally announced May 2024.
-
Reference Neural Operators: Learning the Smooth Dependence of Solutions of PDEs on Geometric Deformations
Authors:
Ze Cheng,
Zhongkai Hao,
Xiaoqiang Wang,
Jianing Huang,
Youjia Wu,
Xudan Liu,
Yiru Zhao,
Songming Liu,
Hang Su
Abstract:
For partial differential equations on domains of arbitrary shapes, existing neural operators attempt to learn a mapping from geometries to solutions. This often requires a large dataset of geometry-solution pairs to obtain a sufficiently accurate neural operator. However, for many industrial applications, e.g., engineering design optimization, satisfying this requirement can be prohibitive, since even a single simulation may take hours or days of computation. To address this issue, we propose reference neural operators (RNO), a novel way of implementing neural operators that learns the smooth dependence of solutions on geometric deformations. Specifically, given a reference solution, RNO can predict solutions corresponding to arbitrary deformations of the reference geometry. This approach turns out to be much more data efficient. Through extensive experiments, we show that RNO can learn the dependence across various types and different numbers of geometry objects with relatively small datasets. RNO outperforms baseline models in accuracy by a large margin and achieves up to 80% error reduction.
Submitted 27 May, 2024;
originally announced May 2024.
-
From Orthogonality to Dependency: Learning Disentangled Representation for Multi-Modal Time-Series Sensing Signals
Authors:
Ruichu Cai,
Zhifang Jiang,
Zijian Li,
Weilin Chen,
Xuexin Chen,
Zhifeng Hao,
Yifan Shen,
Guangyi Chen,
Kun Zhang
Abstract:
Existing methods for multi-modal time series representation learning aim to disentangle the modality-shared and modality-specific latent variables. Although they achieve notable performance on downstream tasks, they usually assume an orthogonal latent space. However, the modality-specific and modality-shared latent variables might be dependent in real-world scenarios. Therefore, we propose a general generation process, where the modality-shared and modality-specific latent variables are dependent, and further develop a Multi-modAl TEmporal Disentanglement (MATE) model. Specifically, our MATE model is built on a temporally variational inference architecture with the modality-shared and modality-specific prior networks for the disentanglement of latent variables. Furthermore, we establish identifiability results to show that the extracted representation is disentangled. More specifically, we first achieve the subspace identifiability for modality-shared and modality-specific latent variables by leveraging the pairing of multi-modal data. Then we establish the component-wise identifiability of modality-specific latent variables by employing sufficient changes of historical latent variables. Extensive experimental studies on multi-modal sensors, human activity recognition, and healthcare datasets show a general improvement in different downstream tasks, highlighting the effectiveness of our method in real-world scenarios.
Submitted 25 May, 2024;
originally announced May 2024.
-
PEAC: Unsupervised Pre-training for Cross-Embodiment Reinforcement Learning
Authors:
Chengyang Ying,
Zhongkai Hao,
Xinning Zhou,
Xuezhou Xu,
Hang Su,
Xingxing Zhang,
Jun Zhu
Abstract:
Designing generalizable agents capable of adapting to diverse embodiments has achieved significant attention in Reinforcement Learning (RL), which is critical for deploying RL agents in various real-world applications. Previous Cross-Embodiment RL approaches have focused on transferring knowledge across embodiments within specific tasks. These methods often result in knowledge tightly coupled with those tasks and fail to adequately capture the distinct characteristics of different embodiments. To address this limitation, we introduce the notion of Cross-Embodiment Unsupervised RL (CEURL), which leverages unsupervised learning to enable agents to acquire embodiment-aware and task-agnostic knowledge through online interactions within reward-free environments. We formulate CEURL as a novel Controlled Embodiment Markov Decision Process (CE-MDP) and systematically analyze CEURL's pre-training objectives under CE-MDP. Based on these analyses, we develop a novel algorithm Pre-trained Embodiment-Aware Control (PEAC) for handling CEURL, incorporating an intrinsic reward function specifically designed for cross-embodiment pre-training. PEAC not only provides an intuitive optimization strategy for cross-embodiment pre-training but also can integrate flexibly with existing unsupervised RL methods, facilitating cross-embodiment exploration and skill discovery. Extensive experiments in both simulated (e.g., DMC and Robosuite) and real-world environments (e.g., legged locomotion) demonstrate that PEAC significantly improves adaptation performance and cross-embodiment generalization, demonstrating its effectiveness in overcoming the unique challenges of CEURL.
Submitted 22 May, 2024;
originally announced May 2024.
-
P-adic Rankin-Selberg L-functions in universal deformation families and functional equations
Authors:
Zeping Hao,
David Loeffler
Abstract:
We construct a $p$-adic Rankin-Selberg $L$-function associated to the product of two families of modular forms, where the first is an ordinary (Hida) family, and the second an arbitrary universal-deformation family (without any ordinarity condition at $p$). This gives a function on a 4-dimensional base space - strictly larger than the ordinary eigenvariety, which is 3-dimensional in this case. We prove our $p$-adic $L$-function interpolates all critical values of the Rankin-Selberg $L$-functions for the classical specialisations of our family, and derive a functional equation for our $p$-adic $L$-function.
Submitted 21 May, 2024;
originally announced May 2024.
-
SeBot: Structural Entropy Guided Multi-View Contrastive Learning for Social Bot Detection
Authors:
Yingguang Yang,
Qi Wu,
Buyun He,
Hao Peng,
Renyu Yang,
Zhifeng Hao,
Yong Liao
Abstract:
Recent advancements in social bot detection have been driven by the adoption of Graph Neural Networks. The social graph, constructed from social network interactions, contains benign and bot accounts that influence each other. However, previous graph-based detection methods that follow the transductive message-passing paradigm may not fully utilize hidden graph information and are vulnerable to adversarial bot behavior. The indiscriminate message passing between nodes from different categories and communities results in excessively homogeneous node representations, ultimately reducing the effectiveness of social bot detectors. In this paper, we propose SEBot, a novel multi-view graph-based contrastive learning-enabled social bot detector. In particular, we use structural entropy as an uncertainty metric to optimize the entire graph's structure and subgraph-level granularity, revealing the implicitly existing hierarchical community structure. And we design an encoder to enable message passing beyond the homophily assumption, enhancing robustness to adversarial behaviors of social bots. Finally, we employ multi-view contrastive learning to maximize mutual information between different views and enhance the detection performance through multi-task learning. Experimental results demonstrate that our approach significantly improves the performance of social bot detection compared with SOTA methods.
Submitted 18 May, 2024;
originally announced May 2024.
-
Deep Data Consistency: a Fast and Robust Diffusion Model-based Solver for Inverse Problems
Authors:
Hanyu Chen,
Zhixiu Hao,
Liying Xiao
Abstract:
Diffusion models have become a successful approach for solving various image inverse problems by providing a powerful diffusion prior. Many studies tried to combine the measurement into diffusion by score function replacement, matrix decomposition, or optimization algorithms, but it is hard to balance the data consistency and realness. The slow sampling speed is also a main obstacle to its wide application. To address the challenges, we propose Deep Data Consistency (DDC) to update the data consistency step with a deep learning model when solving inverse problems with diffusion models. By analyzing existing methods, the variational bound training objective is used to maximize the conditional posterior and reduce its impact on the diffusion process. In comparison with state-of-the-art methods in linear and non-linear tasks, DDC demonstrates its outstanding performance of both similarity and realness metrics in generating high-quality solutions with only 5 inference steps in 0.77 seconds on average. In addition, the robustness of DDC is well illustrated in the experiments across datasets, with large noise and the capacity to solve multiple tasks in only one pre-trained model.
Submitted 17 May, 2024;
originally announced May 2024.
-
Propagation of chaos for moderately interacting particle systems related to singular kinetic McKean-Vlasov SDEs
Authors:
Zimo Hao,
Jean-Francois Jabir,
Stéphane Menozzi,
Michael Röckner,
Xicheng Zhang
Abstract:
We study the propagation of chaos in a class of moderately interacting particle systems for the approximation of singular kinetic McKean-Vlasov SDEs driven by α-stable processes. Diffusion parts include Brownian (α=2) and pure-jump (1<α<2) perturbations, and interaction kernels are considered in a non-smooth anisotropic Besov space. Using the Duhamel formula, sharp density estimates (recently established in Hao, Röckner and Zhang 2023), and suitable martingale functional inequalities, we obtain direct estimates on the convergence rate of the empirical measure of the particle systems toward the McKean-Vlasov distribution. These estimates further lead to quantitative propagation of chaos results in the weak and strong sense.
Submitted 15 May, 2024;
originally announced May 2024.
-
Multi-Relational Structural Entropy
Authors:
Yuwei Cao,
Hao Peng,
Angsheng Li,
Chenyu You,
Zhifeng Hao,
Philip S Yu
Abstract:
Structural Entropy (SE) measures the structural information contained in a graph. Minimizing or maximizing SE helps to reveal or obscure the intrinsic structural patterns underlying graphs in an interpretable manner, finding applications in various tasks driven by networked data. However, SE ignores the heterogeneity inherent in the graph relations, which is ubiquitous in modern networks. In this work, we extend SE to consider heterogeneous relations and propose the first metric for multi-relational graph structural information, namely, Multi-relational Structural Entropy (MrSE). To this end, we first cast SE through the novel lens of the stationary distribution from random surfing, which readily extends to multi-relational networks by considering the choices of both nodes and relation types simultaneously at each step. The resulting MrSE is then optimized by a new greedy algorithm to reveal the essential structures within a multi-relational network. Experimental results highlight that the proposed MrSE offers a more insightful interpretation of the structure of multi-relational graphs compared to SE. Additionally, it enhances the performance of two tasks that involve real-world multi-relational graphs, including node clustering and social event detection.
Submitted 11 May, 2024;
originally announced May 2024.
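As a point of reference for the quantity that MrSE extends, the sketch below computes the one-dimensional structural entropy of an undirected, unweighted graph, i.e., the Shannon entropy of the degree-proportional stationary distribution of a simple random walk. This is the commonly used single-relation definition, not the MrSE metric or the greedy optimizer described in the abstract; the function name and the edge-list input format are assumptions made for illustration.

```python
import math
from collections import defaultdict


def one_dim_structural_entropy(edges):
    """One-dimensional structural entropy (in bits) of an undirected graph.

    `edges` is an iterable of (u, v) pairs; this input format is a
    hypothetical choice for the sketch, not taken from the paper.
    """
    degree = defaultdict(int)
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    volume = sum(degree.values())  # equals 2 * number_of_edges
    # Stationary distribution of a simple random walk: p_i = d_i / volume.
    return -sum((d / volume) * math.log2(d / volume) for d in degree.values())


# A 4-cycle is regular, so the walk is uniform over 4 nodes.
cycle = [(0, 1), (1, 2), (2, 3), (3, 0)]
# A star concentrates half of the stationary mass on the hub.
star = [(0, 1), (0, 2), (0, 3)]

print(one_dim_structural_entropy(cycle))  # 2.0
print(one_dim_structural_entropy(star) < one_dim_structural_entropy(cycle))  # True
```

The star graph scores lower than the cycle because its stationary distribution is less uniform, which is the sense in which lower entropy signals more pronounced structure.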
-
DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model
Authors:
DeepSeek-AI,
Aixin Liu,
Bei Feng,
Bin Wang,
Bingxuan Wang,
Bo Liu,
Chenggang Zhao,
Chengqi Deng,
Chong Ruan,
Damai Dai,
Daya Guo,
Dejian Yang,
Deli Chen,
Dongjie Ji,
Erhang Li,
Fangyun Lin,
Fuli Luo,
Guangbo Hao,
Guanting Chen,
Guowei Li,
H. Zhang,
Hanwei Xu,
Hao Yang,
Haowei Zhang,
Honghui Ding
, et al. (132 additional authors not shown)
Abstract:
We present DeepSeek-V2, a strong Mixture-of-Experts (MoE) language model characterized by economical training and efficient inference. It comprises 236B total parameters, of which 21B are activated for each token, and supports a context length of 128K tokens. DeepSeek-V2 adopts innovative architectures including Multi-head Latent Attention (MLA) and DeepSeekMoE. MLA guarantees efficient inference through significantly compressing the Key-Value (KV) cache into a latent vector, while DeepSeekMoE enables training strong models at an economical cost through sparse computation. Compared with DeepSeek 67B, DeepSeek-V2 achieves significantly stronger performance, and meanwhile saves 42.5% of training costs, reduces the KV cache by 93.3%, and boosts the maximum generation throughput to 5.76 times. We pretrain DeepSeek-V2 on a high-quality and multi-source corpus consisting of 8.1T tokens, and further perform Supervised Fine-Tuning (SFT) and Reinforcement Learning (RL) to fully unlock its potential. Evaluation results show that, even with only 21B activated parameters, DeepSeek-V2 and its chat versions still achieve top-tier performance among open-source models.
Submitted 19 June, 2024; v1 submitted 7 May, 2024;
originally announced May 2024.
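The headline figures in this abstract can be put on a common scale with a little arithmetic. The sketch below only restates numbers quoted in the abstract (236B total and 21B activated parameters, a 93.3% KV cache reduction relative to DeepSeek 67B); the variable names are illustrative, and nothing here is measured or taken from the model's code.

```python
# Figures quoted in the abstract (parameter counts in billions).
total_params_b = 236.0
active_params_b = 21.0
kv_cache_reduction = 0.933  # relative to DeepSeek 67B, per the abstract

# Share of parameters activated for each token.
active_fraction = active_params_b / total_params_b

# A 93.3% reduction leaves 6.7% of the baseline KV cache,
# which is the same as shrinking it by a factor of 1 / 0.067.
kv_remaining = 1.0 - kv_cache_reduction
kv_shrink_factor = 1.0 / kv_remaining

print(f"activated share of parameters: {active_fraction:.1%}")
print(f"KV cache shrink factor: {kv_shrink_factor:.1f}x")
```

Under these numbers, about 8.9% of the parameters are active per token, and the KV cache is roughly 15 times smaller than the DeepSeek 67B baseline.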
-
Acceleration Algorithms in GNNs: A Survey
Authors:
Lu Ma,
Zeang Sheng,
Xunkai Li,
Xinyi Gao,
Zhezheng Hao,
Ling Yang,
Wentao Zhang,
Bin Cui
Abstract:
Graph Neural Networks (GNNs) have demonstrated effectiveness in various graph-based tasks. However, their inefficiency in training and inference presents challenges for scaling up to real-world and large-scale graph applications. To address the critical challenges, a range of algorithms have been proposed to accelerate training and inference of GNNs, attracting increasing attention from the research community. In this paper, we present a systematic review of acceleration algorithms in GNNs, which can be categorized into three main topics based on their purpose: training acceleration, inference acceleration, and execution acceleration. Specifically, we summarize and categorize the existing approaches for each main topic, and provide detailed characterizations of the approaches within each category. Additionally, we review several libraries related to acceleration algorithms in GNNs and discuss our Scalable Graph Learning (SGL) library. Finally, we propose promising directions for future research. A complete summary is presented in our GitHub repository: https://github.com/PKU-DAIR/SGL/blob/main/Awsome-GNN-Acceleration.md.
Submitted 7 May, 2024;
originally announced May 2024.
-
Doubly Robust Causal Effect Estimation under Networked Interference via Targeted Learning
Authors:
Weilin Chen,
Ruichu Cai,
Zeqin Yang,
Jie Qiao,
Yuguang Yan,
Zijian Li,
Zhifeng Hao
Abstract:
Causal effect estimation under networked interference is an important but challenging problem. Available parametric methods are limited in their model space, while previous semiparametric methods, e.g., leveraging neural networks to fit only one single nuisance function, may still encounter misspecification problems under networked interference without appropriate assumptions on the data generation process. To mitigate bias stemming from misspecification, we propose a novel doubly robust causal effect estimator under networked interference, by adapting the targeted learning technique to the training of neural networks. Specifically, we generalize the targeted learning technique into the networked interference setting and establish the condition under which an estimator achieves double robustness. Based on the condition, we devise an end-to-end causal effect estimator by transforming the identified theoretical condition into a targeted loss. Moreover, we provide a theoretical analysis of our designed estimator, revealing a faster convergence rate compared to a single nuisance model. Extensive experimental results on two real-world networks with semisynthetic data demonstrate the effectiveness of our proposed estimators.
Submitted 5 July, 2024; v1 submitted 6 May, 2024;
originally announced May 2024.
-
Bundling and Tumbling in Bacterial-inspired Bi-flagellated Soft Robots for Attitude Adjustment
Authors:
Zhuonan Hao,
Siddharth Zalavadia,
Mohammad Khalid Jawed
Abstract:
We create a mechanism inspired by bacterial swimmers, featuring two flexible flagella with individual control over rotation speed and direction in viscous fluid environments. Using readily available materials, we design and fabricate silicone-based helical flagella. To simulate the robot's motion, we develop a physics-based computational tool, drawing inspiration from computer graphics. The framework incorporates the Discrete Elastic Rod method, modeling the flagella as Kirchhoff's elastic rods, and couples it with the Regularized Stokeslet Segments method for hydrodynamics, along with the Implicit Contact Model to handle contact. This approach effectively captures polymorphic phenomena like bundling and tumbling. Our study reveals how these emergent behaviors affect the robot's attitude angles, demonstrating its ability to self-reorient in both simulations and experiments. We anticipate that this framework will enhance our understanding of the directional change capabilities of flagellated robots, potentially stimulating further exploration on microscopic robot mobility.
Submitted 19 January, 2024;
originally announced May 2024.
-
Unsupervised Social Bot Detection via Structural Information Theory
Authors:
Hao Peng,
Jingyun Zhang,
Xiang Huang,
Zhifeng Hao,
Angsheng Li,
Zhengtao Yu,
Philip S. Yu
Abstract:
Research on social bot detection plays a crucial role in maintaining the order and reliability of information dissemination while increasing trust in social interactions. The current mainstream social bot detection models rely on black-box neural network technology, e.g., Graph Neural Network, Transformer, etc., which lacks interpretability. In this work, we present UnDBot, a novel unsupervised, interpretable, yet effective and practical framework for detecting social bots. This framework is built upon structural information theory. We begin by designing three social relationship metrics that capture various aspects of social bot behaviors: Posting Type Distribution, Posting Influence, and Follow-to-follower Ratio. Three new relationships are utilized to construct a new, unified, and weighted social multi-relational graph, aiming to model the relevance of social user behaviors and discover long-distance correlations between users. Second, we introduce a novel method for optimizing heterogeneous structural entropy. This method involves the personalized aggregation of edge information from the social multi-relational graph to generate a two-dimensional encoding tree. The heterogeneous structural entropy facilitates decoding of the substantial structure of the social bots network and enables hierarchical clustering of social bots. Thirdly, a new community labeling method is presented to distinguish social bot communities by computing the user's stationary distribution, measuring user contributions to network structure, and counting the intensity of user aggregation within the community. Compared with ten representative social bot detection approaches, comprehensive experiments demonstrate the advantages of effectiveness and interpretability of UnDBot on four real social network datasets.
Submitted 21 April, 2024;
originally announced April 2024.
-
GhostNetV3: Exploring the Training Strategies for Compact Models
Authors:
Zhenhua Liu,
Zhiwei Hao,
Kai Han,
Yehui Tang,
Yunhe Wang
Abstract:
Compact neural networks are specially designed for applications on edge devices with faster inference speed yet modest performance. However, training strategies of compact models are borrowed from that of conventional models at present, which ignores their difference in model capacity and thus may impede the performance of compact models. In this paper, by systematically investigating the impact of different training ingredients, we introduce a strong training strategy for compact models. We find that the appropriate designs of re-parameterization and knowledge distillation are crucial for training high-performance compact models, while some commonly used data augmentations for training conventional models, such as Mixup and CutMix, lead to worse performance. Our experiments on ImageNet-1K dataset demonstrate that our specialized training strategy for compact models is applicable to various architectures, including GhostNetV2, MobileNetV2 and ShuffleNetV2. Specifically, equipped with our strategy, GhostNetV3 1.3$\times$ achieves a top-1 accuracy of 79.1% with only 269M FLOPs and a latency of 14.46ms on mobile devices, surpassing its ordinarily trained counterpart by a large margin. Moreover, our observation can also be extended to object detection scenarios. PyTorch code and checkpoints can be found at https://github.com/huawei-noah/Efficient-AI-Backbones/tree/master/ghostnetv3_pytorch.
Submitted 21 April, 2024; v1 submitted 17 April, 2024;
originally announced April 2024.
-
Joint Physical-Digital Facial Attack Detection Via Simulating Spoofing Clues
Authors:
Xianhua He,
Dashuang Liang,
Song Yang,
Zhanlong Hao,
Hui Ma,
Binjie Mao,
Xi Li,
Yao Wang,
Pengfei Yan,
Ajian Liu
Abstract:
Face recognition systems are frequently subjected to a variety of physical and digital attacks of different types. Previous methods have achieved satisfactory performance in scenarios that address physical attacks and digital attacks, respectively. However, few methods are considered to integrate a model that simultaneously addresses both physical and digital attacks, implying the necessity to develop and maintain multiple models. To jointly detect physical and digital attacks within a single model, we propose an innovative approach that can adapt to any network architecture. Our approach mainly contains two types of data augmentation, which we call Simulated Physical Spoofing Clues augmentation (SPSC) and Simulated Digital Spoofing Clues augmentation (SDSC). SPSC and SDSC augment live samples into simulated attack samples by simulating spoofing clues of physical and digital attacks, respectively, which significantly improve the capability of the model to detect "unseen" attack types. Extensive experiments show that SPSC and SDSC can achieve state-of-the-art generalization in Protocols 2.1 and 2.2 of the UniAttackData dataset, respectively. Our method won first place in "Unified Physical-Digital Face Attack Detection" of the 5th Face Anti-spoofing Challenge@CVPR2024. Our final submission obtains 3.75% APCER, 0.93% BPCER, and 2.34% ACER, respectively. Our code is available at https://github.com/Xianhua-He/cvpr2024-face-anti-spoofing-challenge.
Submitted 12 April, 2024;
originally announced April 2024.
-
Holographic reconstruction of flat spacetime
Authors:
Zezhuang Hao
Abstract:
The flat/CFT dictionary between the bulk gravitational theory and boundary conformal field theory is systematically developed in this paper. Asymptotically flat spacetime is built up by asymptotically AdS hyperboloid slices in terms of Fefferman Graham coordinates together with soft modes propagating between different slices near the null boundary. Then we construct the flat holography dictionary based on studying Einstein equation at zero and first order and it turns out that these correspond to the description of hard and soft sector for the field theory from the boundary point of view. The explicit expression for energy-stress tensor is also determined by performing holographic renormalisation on the Einstein Hilbert action. By studying the anomalies of the energy-stress tensor, we obtain the leading and subleading contribution to the central charge. Einstein equations in the bulk are related to the Ward identities of the boundary theory and we find that the boundary CFT energy-stress tensor is not conserved due to the existence of radiative soft modes which will generate the energy flow through the null boundary.
Submitted 1 April, 2024;
originally announced April 2024.
-
Causal Discovery from Poisson Branching Structural Causal Model Using High-Order Cumulant with Path Analysis
Authors:
Jie Qiao,
Yu Xiang,
Zhengming Chen,
Ruichu Cai,
Zhifeng Hao
Abstract:
Count data naturally arise in many fields, such as finance, neuroscience, and epidemiology, and discovering causal structure among count data is a crucial task in various scientific and industrial scenarios. One of the most common characteristics of count data is the inherent branching structure described by a binomial thinning operator and an independent Poisson distribution that captures both branching and noise. For instance, in a population count scenario, mortality and immigration contribute to the count, where survival follows a Bernoulli distribution, and immigration follows a Poisson distribution. However, causal discovery from such data is challenging due to the non-identifiability issue: a single causal pair is Markov equivalent, i.e., $X\rightarrow Y$ and $Y\rightarrow X$ are distributionally equivalent. Fortunately, in this work, we find that the causal order from $X$ to its child $Y$ is identifiable if $X$ is a root vertex and has at least two directed paths to $Y$, or the ancestor of $X$ with the most directed paths to $X$ has a directed path to $Y$ without passing through $X$. Specifically, we propose a Poisson Branching Structural Causal Model (PB-SCM) and perform a path analysis on PB-SCM using high-order cumulants. Theoretical results establish the connection between the path and cumulant and demonstrate that the path information can be obtained from the cumulant. With the path information, causal order is identifiable under some graphical conditions. A practical algorithm for learning causal structure under PB-SCM is proposed, and the experiments verify the effectiveness of the proposed method.
Submitted 25 March, 2024;
originally announced March 2024.
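The branching structure described in the abstract above (a binomial thinning of the parent count plus independent Poisson noise) can be sketched for a single causal pair. This is an illustrative simulation only, not the authors' PB-SCM code: the function names, the parameter values, and the choice of a Poisson-distributed root are all assumptions made for the example.

```python
import math
import random


def poisson(rng: random.Random, lam: float) -> int:
    """Draw one Poisson(lam) sample with Knuth's multiplication method."""
    limit = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1


def thin(rng: random.Random, count: int, alpha: float) -> int:
    """Binomial thinning: each of `count` units survives with probability alpha."""
    return sum(1 for _ in range(count) if rng.random() < alpha)


def sample_pair(rng: random.Random, mu: float, alpha: float, lam: float):
    """Sample (X, Y) with X ~ Poisson(mu) and Y = thin(X, alpha) + Poisson(lam)."""
    x = poisson(rng, mu)          # hypothetical root count, e.g. last year's population
    y = thin(rng, x, alpha) + poisson(rng, lam)  # survivors plus immigration
    return x, y


rng = random.Random(0)
pairs = [sample_pair(rng, mu=5.0, alpha=0.6, lam=2.0) for _ in range(20000)]
mean_y = sum(y for _, y in pairs) / len(pairs)
print(mean_y)  # should be close to alpha * mu + lam = 5.0
```

In this two-variable setting both X and Y are marginally Poisson, which is one way to see why a single causal pair is hard to orient from the distribution alone.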
-
Energy-Efficient Hybrid Beamforming with Dynamic On-off Control for Integrated Sensing, Communications, and Powering
Authors:
Zeyu Hao,
Yuan Fang,
Xianghao Yu,
Jie Xu,
Ling Qiu,
Lexi Xu,
Shuguang Cui
Abstract:
This paper investigates the energy-efficient hybrid beamforming design for a multi-functional integrated sensing, communications, and powering (ISCAP) system. In this system, a base station (BS) with a hybrid analog-digital (HAD) architecture sends unified wireless signals to communicate with multiple information receivers (IRs), sense multiple point targets, and wirelessly charge multiple energy receivers (ERs) at the same time. To facilitate the energy-efficient design, we present a novel HAD architecture for the BS transmitter, which allows dynamic on-off control of its radio frequency (RF) chains and analog phase shifters (PSs) through a switch network. We also consider a practical and comprehensive power consumption model for the BS, by taking into account the power-dependent non-linear power amplifier (PA) efficiency, and the on-off non-transmission power consumption model of RF chains and PSs. We jointly design the hybrid beamforming and dynamic on-off control at the BS, aiming to minimize its total power consumption, while guaranteeing the performance requirements on communication rates, sensing Cramér-Rao bound (CRB), and harvested power levels. The formulation also takes into consideration the per-antenna transmit power constraint and the constant modulus constraints for the analog beamformer at the BS. The resulting optimization problem for ISCAP is highly non-convex. Please refer to the paper for a complete abstract.
Submitted 24 March, 2024;
originally announced March 2024.
-
SpikingResformer: Bridging ResNet and Vision Transformer in Spiking Neural Networks
Authors:
Xinyu Shi,
Zecheng Hao,
Zhaofei Yu
Abstract:
The remarkable success of Vision Transformers in Artificial Neural Networks (ANNs) has led to a growing interest in incorporating the self-attention mechanism and transformer-based architecture into Spiking Neural Networks (SNNs). While existing methods propose spiking self-attention mechanisms that are compatible with SNNs, they lack reasonable scaling methods, and the overall architectures proposed by these methods suffer from a bottleneck in effectively extracting local features. To address these challenges, we propose a novel spiking self-attention mechanism named Dual Spike Self-Attention (DSSA) with a reasonable scaling method. Based on DSSA, we propose a novel spiking Vision Transformer architecture called SpikingResformer, which combines the ResNet-based multi-stage architecture with our proposed DSSA to improve both performance and energy efficiency while reducing parameters. Experimental results show that SpikingResformer achieves higher accuracy with fewer parameters and lower energy consumption than other spiking Vision Transformer counterparts. Notably, our SpikingResformer-L achieves 79.40% top-1 accuracy on ImageNet with 4 time-steps, which is the state-of-the-art result in the SNN field.
Submitted 28 March, 2024; v1 submitted 21 March, 2024;
originally announced March 2024.
-
Mitigating Data Consistency Induced Discrepancy in Cascaded Diffusion Models for Sparse-view CT Reconstruction
Authors:
Hanyu Chen,
Zhixiu Hao,
Lin Guo,
Liying Xiao
Abstract:
Sparse-view Computed Tomography (CT) image reconstruction is a promising approach to reduce radiation exposure, but it inevitably leads to image degradation. Although diffusion model-based approaches are computationally expensive and suffer from the training-sampling discrepancy, they provide a potential solution to the problem. This study introduces a novel Cascaded Diffusion with Discrepancy Mitigation (CDDM) framework, including the low-quality image generation in latent space and the high-quality image generation in pixel space which contains data consistency and discrepancy mitigation in a one-step reconstruction process. The cascaded framework minimizes computational costs by moving some inference steps from pixel space to latent space. The discrepancy mitigation technique addresses the training-sampling gap induced by data consistency, ensuring the data distribution is close to the original manifold. A specialized Alternating Direction Method of Multipliers (ADMM) is employed to process image gradients in separate directions, offering a more targeted approach to regularization. Experimental results across two datasets demonstrate CDDM's superior performance in high-quality image generation with clearer boundaries compared to existing methods, highlighting the framework's computational efficiency.
Submitted 14 March, 2024;
originally announced March 2024.
-
Photonic simulation of Majorana-based Jones polynomials
Authors:
Jia-Kun Li,
Kai Sun,
Ze-Yan Hao,
Jia-He Liang,
Si-Jing Tao,
Jiannis K. Pachos,
Jin-Shi Xu,
Yong-Jian Han,
Chuan-Feng Li,
Guang-Can Guo
Abstract:
Jones polynomials were introduced as a tool to distinguish between topologically different links. Recently, they emerged as the central building block of topological quantum computation: by braiding non-Abelian anyons it is possible to realise quantum algorithms through the computation of Jones polynomials. So far, it has been a formidable task to evaluate Jones polynomials through the control and manipulation of non-Abelian anyons. In this study, a photonic quantum system employing two-photon correlations and non-dissipative imaginary-time evolution is utilized to simulate two inequivalent braiding operations of Majorana zero modes. The resulting amplitudes are shown to be mathematically equivalent to Jones polynomials at a particular value of their parameter. The high-fidelity of our optical platform allows us to distinguish between a wide range of links, such as Hopf links, Solomon links, Trefoil knots, Figure Eight knots and Borromean rings, through determining their corresponding Jones polynomials. Our photonic quantum simulator represents a significant step towards executing fault-tolerant quantum algorithms based on topological quantum encoding and manipulation.
Submitted 31 May, 2024; v1 submitted 7 March, 2024;
originally announced March 2024.
-
Optical and spin properties of nitrogen vacancy centers formed along the tracks of high energy heavy ions
Authors:
Wei Liu,
Aleksi A. M. Leino,
Arun Persaud,
Qing Ji,
Kaushalya Jhuria,
Edward S. Barnard,
Shaul Aloni,
Christina Trautmann,
Marilena Tomut,
Ralf Wunderlich,
Hunter Ocker,
Nishanth Anand,
Zhao Hao,
Flyura Djurabekova,
Thomas Schenkel
Abstract:
Exposure of nitrogen doped diamond to high energy, heavy ions induces formation of vacancy related color centers aligned along the trajectories of the ions. Quasi 1D chains of coupled NV centers with lengths of a few tens of microns can be building blocks for quantum information processing and they provide insights into harsh radiation-matter interactions. Here, we report on color center formation in diamond (1 ppm nitrogen) with 1 GeV gold and uranium ions. Using depth-resolved photoluminescence, we observe direct formation of single vacancy related color centers (GR1 centers) along the ion tracks. Mobile vacancies can form NV-centers with native nitrogen atoms during thermal annealing. Molecular dynamics simulations indicate that both isolated vacancies and defect clusters form along ion trajectory through electronic stopping processes, leading to broad color center profiles that range from the sample surface to a depth of about 25 microns. We quantify the spin properties of NV-centers formed by swift heavy ions through optical detection of magnetic resonance (ODMR) and validate the feasibility of using swift-heavy-ion-generated NV$^{-}$ along quasi 1D chains (for isolated tracks from low fluence irradiations) or in thin sheets of coupled 1D spin chains (formed with higher ion fluences) for NV-based magnetometry and for the exploration of quasi 1D and 2D spin textures in diamond.
Submitted 6 March, 2024;
originally announced March 2024.
-
DPOT: Auto-Regressive Denoising Operator Transformer for Large-Scale PDE Pre-Training
Authors:
Zhongkai Hao,
Chang Su,
Songming Liu,
Julius Berner,
Chengyang Ying,
Hang Su,
Anima Anandkumar,
Jian Song,
Jun Zhu
Abstract:
Pre-training has been investigated to improve the efficiency and performance of training neural operators in data-scarce settings. However, it is largely in its infancy due to the inherent complexity and diversity, such as long trajectories, multiple scales and varying dimensions of partial differential equations (PDEs) data. In this paper, we present a new auto-regressive denoising pre-training strategy, which allows for more stable and efficient pre-training on PDE data and generalizes to various downstream tasks. Moreover, by designing a flexible and scalable model architecture based on Fourier attention, we can easily scale up the model for large-scale pre-training. We train our PDE foundation model with up to 0.5B parameters on 10+ PDE datasets with more than 100k trajectories. Extensive experiments show that we achieve SOTA on these benchmarks and validate the strong generalizability of our model to significantly enhance performance on diverse downstream PDE tasks like 3D data. Code is available at https://github.com/thu-ml/DPOT.
Submitted 6 May, 2024; v1 submitted 6 March, 2024;
originally announced March 2024.
-
SAM-DiffSR: Structure-Modulated Diffusion Model for Image Super-Resolution
Authors:
Chengcheng Wang,
Zhiwei Hao,
Yehui Tang,
Jianyuan Guo,
Yujie Yang,
Kai Han,
Yunhe Wang
Abstract:
Diffusion-based super-resolution (SR) models have recently garnered significant attention due to their potent restoration capabilities. But conventional diffusion models perform noise sampling from a single distribution, constraining their ability to handle real-world scenes and complex textures across semantic regions. With the success of segment anything model (SAM), generating sufficiently fine-grained region masks can enhance the detail recovery of diffusion-based SR model. However, directly integrating SAM into SR models will result in much higher computational cost. In this paper, we propose the SAM-DiffSR model, which can utilize the fine-grained structure information from SAM in the process of sampling noise to improve the image quality without additional computational cost during inference. In the process of training, we encode structural position information into the segmentation mask from SAM. Then the encoded mask is integrated into the forward diffusion process by modulating it to the sampled noise. This adjustment allows us to independently adapt the noise mean within each corresponding segmentation area. The diffusion model is trained to estimate this modulated noise. Crucially, our proposed framework does NOT change the reverse diffusion process and does NOT require SAM at inference. Experimental results demonstrate the effectiveness of our proposed method, showcasing superior performance in suppressing artifacts, and surpassing existing diffusion-based methods by 0.74 dB at the maximum in terms of PSNR on DIV2K dataset. The code and dataset are available at https://github.com/lose4578/SAM-DiffSR.
Submitted 26 February, 2024;
originally announced February 2024.
-
Variable martingale Hardy-Lorentz-Karamata spaces and their applications in Fourier Analysis
Authors:
Zhiwei Hao,
Xinru Ding,
Libo Li,
Ferenc Weisz
Abstract:
In this paper, we introduce a new class of function spaces, which unify and generalize Lorentz-Karamata spaces, variable Lorentz spaces and other several classical function spaces. Based on the new spaces, we develop the theory of variable martingale Hardy-Lorentz-Karamata spaces and apply it to Fourier Analysis. To be precise, we discuss the basic properties of Lorentz-Karamata spaces with variable exponents. We introduce five variable martingale Hardy-Lorentz-Karamata spaces and characterize them via simple atoms as well as via atoms. As applications of the atomic decompositions, dual theorems and the generalized John-Nirenberg theorem for the new framework are presented. Moreover, we obtain the boundedness of $σ$-sublinear operator defined on variable martingale Hardy-Lorentz-Karamata spaces, which leads to martingale inequalities and the relation of the five variable martingale Hardy-Lorentz-Karamata spaces. Also, we investigate the boundedness of fractional integral operators in this new framework. Finally, we deal with the applications of variable martingale Hardy-Lorentz-Karamata spaces in Fourier analysis by using the previous results. More precisely, we show that the partial sums of the Walsh-Fourier series converge to the function in norm if $f\in L_{p(\cdot),q,b}$ with $1<p_-\le p_+<\infty$. The Fejér summability method is also studied and it is proved that the maximal Fejér operator is bounded from variable martingale Hardy-Lorentz-Karamata spaces to variable Lorentz-Karamata spaces. As a consequence, we get conclusions about almost everywhere and norm convergence of Fejér means. The results obtained in this paper generalize the results for martingale Hardy-Lorentz-Karamata spaces and variable martingale Hardy-Lorentz spaces. Especially, we remove the condition that $b$ is nondecreasing in previous literature.
Submitted 25 February, 2024;
originally announced February 2024.
-
Debiased Model-based Interactive Recommendation
Authors:
Zijian Li,
Ruichu Cai,
Haiqin Huang,
Sili Zhang,
Yuguang Yan,
Zhifeng Hao,
Zhenghua Dong
Abstract:
Existing model-based interactive recommendation systems are trained by querying a world model to capture the user preference, but learning the world model from historical logged data will easily suffer from bias issues such as popularity bias and sampling bias. This is why some debiased methods have been proposed recently. However, two essential drawbacks still remain: 1) ignoring the dynamics of the time-varying popularity results in a false reweighting of items; 2) taking the unknown samples as negative samples in negative sampling results in sampling bias. To overcome these two drawbacks, we develop a model called identifiable Debiased Model-based Interactive Recommendation (iDMIR for short). In iDMIR, for the first drawback, we devise a debiased causal world model based on the causal mechanism of the time-varying recommendation generation process with identification guarantees; for the second drawback, we devise a debiased contrastive policy, which coincides with debiased contrastive learning and avoids sampling bias. Moreover, we demonstrate that the proposed method not only outperforms several of the latest interactive recommendation algorithms but also enjoys diverse recommendation performance.
Submitted 24 February, 2024;
originally announced February 2024.
-
Turbulent flows over porous and rough substrates
Authors:
Zengrong Hao,
Ricardo García-Mayoral
Abstract:
Turbulent flows over porous substrates are studied via a systematic exploration of the dependence of the flow properties on the substrate parameters, including permeability $K$, grain size $L$, and depth $h$. The study uses direct numerical simulations for staggered-cube substrates with $L^+\approx10$ - $50$, $\sqrt{K}/L\approx0.01$ - $0.25$, and depths from $h=O(L)$ to $h\gg L$, ranging from typical impermeable rough surfaces to deep porous substrates. The results indicate that the permeability has significantly greater relevance than the grain size for the properties of the overlying flow, including the mean-flow slip and the shear across the interface, the drag increase relative to smooth-wall flow, and the statistics and spectra of the overlying turbulence, whereas the direct effect of grain size is only noticeable near the interface as grain-coherent flow fluctuations. The substrate depth also has a significant effect, with shallower substrates suppressing the effective transpiration at the interface excited by pressure fluctuations. We propose an empirical `equivalent permeability' $K_{eq}^t$, that incorporates this effect and scales well the overlying turbulence for substrates with different depths. Based on this, we propose a conceptual $h^+$-$\sqrt{K^+}$ regime diagram where turbulence transitions smoothly from that over impermeable rough surfaces with $h=O(L)$ to that over deep porous substrates with $h^+\gtrsim50$, with the latter limit determined by the typical lengthscale of the overlying pressure fluctuations.
Submitted 23 February, 2024;
originally announced February 2024.
-
Robust single divacancy defects near stacking faults in 4H-SiC under resonant excitation
Authors:
Zhen-Xuan He,
Ji-Yang Zhou,
Wu-Xi Lin,
Qiang Li,
Rui-Jian Liang,
Jun-Feng Wang,
Xiao-Lei Wen,
Zhi-He Hao,
Wei Liu,
Shuo Ren,
Hao Li,
Li-Xing You,
Jian-Shun Tang,
Jin-Shi Xu,
Chuan-Feng Li,
Guang-Can Guo
Abstract:
Color centers in silicon carbide (SiC) have demonstrated significant promise for quantum information processing. However, the undesirable ionization process that occurs during optical manipulation frequently causes fluctuations in the charge state and performance of these defects, thereby restricting the effectiveness of spin-photon interfaces. Recent predictions indicate that divacancy defects near stacking faults possess the capability to stabilize their neutral charge states, thereby providing robustness against photoionization effects. In this work, we present a comprehensive protocol for the scalable and targeted fabrication of single divacancy arrays in 4H-SiC using a high-resolution focused helium ion beam. Through photoluminescence emission (PLE) experiments, we demonstrate long-term emission stability with minimal linewidth shift ($\sim$ 50 MHz over 3 hours) for the single c-axis divacancies within stacking faults. By measuring the ionization rate for different polytypes of divacancies, we found that the divacancies within stacking faults are more robust against resonant excitation. Additionally, angle-resolved PLE spectra reveal their two resonant-transition lines with mutually orthogonal polarizations. Notably, the PLE linewidths are approximately 7 times narrower and the spin-coherent times are 6 times longer compared to divacancies generated via carbon-ion implantation. These findings highlight the immense potential of SiC divacancies for on-chip quantum photonics and the construction of efficient spin-to-photon interfaces, indicating a significant step forward in the development of quantum technologies.
Submitted 20 February, 2024;
originally announced February 2024.
-
Kerr nonlinearity and parametric amplification with an Al-InAs superconductor-semiconductor Josephson junction
Authors:
Z. Hao,
T. Shaw,
M. Hatefipour,
W. M. Strickland,
B. H. Elfeky,
D. Langone,
J. Shabani,
S. Shankar
Abstract:
Nearly quantum-limited Josephson parametric amplifiers (JPAs) are essential components in superconducting quantum circuits. However, higher-order nonlinearities of the Josephson cosine potential are known to cause gain compression, thereby limiting scalability. In an effort to reduce the fourth-order, or Kerr, nonlinearity, we realize a parametric amplifier with an Al-InAs superconductor-semiconductor hybrid Josephson junction (JJ). We extract the Kerr nonlinearity of the Al-InAs JJ from two different devices and show that it is three orders of magnitude lower than that of an Al-$\text{AlO}_\text{X}$ junction with identical Josephson inductance. We then demonstrate a four-wave-mixing (4WM) parametric amplifier made with an Al-InAs junction that achieves more than 20 dB of gain and -119 dBm of compression power, which outperforms single resonant JPAs based on Al junctions.
Submitted 22 February, 2024; v1 submitted 16 February, 2024;
originally announced February 2024.
-
Unifying Invariance and Spuriousity for Graph Out-of-Distribution via Probability of Necessity and Sufficiency
Authors:
Xuexin Chen,
Ruichu Cai,
Kaitao Zheng,
Zhifan Jiang,
Zhengting Huang,
Zhifeng Hao,
Zijian Li
Abstract:
Graph Out-of-Distribution (OOD) generalization, which requires that models trained on biased data generalize to unseen test data, has a large number of real-world applications. One of the most mainstream methods is to extract the invariant subgraph by aligning the original and augmented data with the help of environment augmentation. However, these solutions might lead to the loss or redundancy of the semantic subgraph and further result in suboptimal generalization. To address this challenge, we propose a unified framework to exploit the Probability of Necessity and Sufficiency to extract the Invariant Substructure (PNSIS). Beyond that, this framework further leverages the spurious subgraph to boost the generalization performance in an ensemble manner to enhance the robustness on noisy data. Specifically, we first consider the data generation process for graph data. Under mild conditions, we show that the invariant subgraph can be extracted by minimizing an upper bound, which is built on the theoretical advance of probability of necessity and sufficiency. To further bridge the theory and algorithm, we devise the PNSIS model, which involves an invariant subgraph extractor for invariant graph learning as well as invariant and spurious subgraph classifiers for generalization enhancement. Experimental results demonstrate that our PNSIS model outperforms the state-of-the-art techniques on graph OOD on several benchmarks, highlighting the effectiveness in real-world scenarios.
Submitted 14 February, 2024;
originally announced February 2024.