Bridge Past and Future: Overcoming Information Asymmetry in Incremental Object Detection

Authors: Qijie Mo, Yipeng Gao, Shenghao Fu, Junkai Yan, Ancong Wu, Wei-Shi Zheng

Abstract: In incremental object detection, knowledge distillation has been proven to be an effective way to alleviate catastrophic forgetting. However, previous works focused on preserving the knowledge of old models, ignoring that images could simultaneously contain categories from past, present, and future stages. The co-occurrence of objects makes the optimization objectives inconsistent across different stages since the definition for foreground objects differs across various stages, which limits the model's performance greatly. To overcome this problem, we propose a method called ``Bridge Past and Future'' (BPF), which aligns models across stages, ensuring consistent optimization directions. In addition, we propose a novel Distillation with Future (DwF) loss, fully leveraging the background probability to mitigate the forgetting of old classes while ensuring a high level of adaptability in learning new classes. Extensive experiments are conducted on both Pascal VOC and MS COCO benchmarks. Without memory, BPF outperforms current state-of-the-art methods under various settings. The code is available at https://github.com/iSEE-Laboratory/BPF.

Submitted 16 July, 2024; originally announced July 2024.

Comments: Accepted to ECCV 2024

arXiv:2407.10681 [pdf, other]

doi 10.1145/3637528.3671700

GeoMix: Towards Geometry-Aware Data Augmentation

Authors: Wentao Zhao, Qitian Wu, Chenxiao Yang, Junchi Yan

Abstract: Mixup has shown considerable success in mitigating the challenges posed by limited labeled data in image classification. By synthesizing samples through the interpolation of features and labels, Mixup effectively addresses the issue of data scarcity. However, it has rarely been explored in graph learning tasks due to the irregularity and connectivity of graph data. Specifically, in node classification tasks, Mixup presents a challenge in creating connections for synthetic data. In this paper, we propose Geometric Mixup (GeoMix), a simple and interpretable Mixup approach leveraging in-place graph editing. It effectively utilizes geometry information to interpolate features and labels with those from the nearby neighborhood, generating synthetic nodes and establishing connections for them. We conduct theoretical analysis to elucidate the rationale behind employing geometry information for node Mixup, emphasizing the significance of locality enhancement, a critical aspect of our method's design. Extensive experiments demonstrate that our lightweight Geometric Mixup achieves state-of-the-art results on a wide variety of standard datasets with limited labeled data. Furthermore, it significantly improves the generalization capability of underlying GNNs across various challenging out-of-distribution generalization tasks. Our code is available at https://github.com/WtaoZhao/geomix.

Submitted 15 July, 2024; originally announced July 2024.

Comments: Published as a conference paper at KDD 2024

arXiv:2407.09790 [pdf, other]

Team up GBDTs and DNNs: Advancing Efficient and Effective Tabular Prediction with Tree-hybrid MLPs

Authors: Jiahuan Yan, Jintai Chen, Qianxing Wang, Danny Z. Chen, Jian Wu

Abstract: Tabular datasets play a crucial role in various applications. Thus, developing efficient, effective, and widely compatible prediction algorithms for tabular data is important. Currently, two prominent model types, Gradient Boosted Decision Trees (GBDTs) and Deep Neural Networks (DNNs), have demonstrated performance advantages on distinct tabular prediction tasks. However, selecting an effective model for a specific tabular dataset is challenging, often demanding time-consuming hyperparameter tuning. To address this model selection dilemma, this paper proposes a new framework that amalgamates the advantages of both GBDTs and DNNs, resulting in a DNN algorithm that is as efficient as GBDTs and is competitively effective regardless of dataset preferences for GBDTs or DNNs. Our idea is rooted in an observation that deep learning (DL) offers a larger parameter space that can represent a well-performing GBDT model, yet the current back-propagation optimizer struggles to efficiently discover such optimal functionality. On the other hand, during GBDT development, hard tree pruning, entropy-driven feature gate, and model ensemble have proved to be more adaptable to tabular data. By combining these key components, we present a Tree-hybrid simple MLP (T-MLP). In our framework, a tensorized, rapidly trained GBDT feature gate, a DNN architecture pruning approach, as well as a vanilla back-propagation optimizer collaboratively train a randomly initialized MLP model. Comprehensive experiments show that T-MLP is competitive with extensively tuned DNNs and GBDTs in their dominating tabular benchmarks (88 datasets) respectively, all achieved with compact model storage and significantly reduced training duration.

Submitted 13 July, 2024; originally announced July 2024.

Comments: Accepted at KDD 2024 Research Track, codes will be available at https://github.com/jyansir/tmlp

arXiv:2407.06677 [pdf, other]

Mixture-of-Modules: Reinventing Transformers as Dynamic Assemblies of Modules

Authors: Zhuocheng Gong, Ang Lv, Jian Guan, Junxi Yan, Wei Wu, Huishuai Zhang, Minlie Huang, Dongyan Zhao, Rui Yan

Abstract: Is it always necessary to compute tokens from shallow to deep layers in Transformers? The continued success of vanilla Transformers and their variants suggests an undoubted "yes". In this work, however, we attempt to break the depth-ordered convention by proposing a novel architecture dubbed mixture-of-modules (MoM), which is motivated by an intuition that any layer, regardless of its position, can be used to compute a token as long as it possesses the needed processing capabilities. The construction of MoM starts from a finite set of modules defined by multi-head attention and feed-forward networks, each distinguished by its unique parameterization. Two routers then iteratively select attention modules and feed-forward modules from the set to process a token. The selection dynamically expands the computation graph in the forward pass of the token, culminating in an assembly of modules. We show that MoM provides not only a unified framework for Transformers and their numerous variants but also a flexible and learnable approach for reducing redundancy in Transformer parameterization. We pre-train various MoMs using OpenWebText. Empirical results demonstrate that MoMs, of different parameter counts, consistently outperform vanilla transformers on both GLUE and XSUM benchmarks. More interestingly, with a fixed parameter budget, MoM-large enables an over 38% increase in depth for computation graphs compared to GPT-2-large, resulting in absolute gains of 1.4 on GLUE and 1 on XSUM. On the other hand, MoM-large also enables an over 60% reduction in depth while involving more modules per layer, yielding a 16% reduction in TFLOPs and a 43% decrease in memory usage compared to GPT-2-large, while maintaining comparable performance.

Submitted 9 July, 2024; originally announced July 2024.

arXiv:2407.06675 [pdf, other]

Arbitrary $H$-linked oriented graphs

Authors: Jia Zhou, Jin Yan

Abstract: Suppose that $D$ is a digraph, and $H$ is a multi-digraph on $k$ vertices with $q$ arcs. Let $\mathcal{P}(D)$ be the set of paths in a digraph $D$. An $H$-subdivision $(f,g)$ in a digraph $D$ is a pair of bijections $f : V(H)\rightarrow V(D)$ and $g : A(H) \rightarrow \mathcal{P}(D)$ such that for every arc $uv\in A(H)$, $g(uv)$ is a path from $f(u)$ to $f (v)$, and distinct arcs map into internally vertex disjoint paths in $D$. Further, $D$ is arbitrary $H$-linked if any $k$ distinct vertices in $D$ can be extended to an $H$-subdivision $(f,g)$, and the length of each subdivision path can be specified as a number of at least four. In this paper, we prove that there exists a positive integer $n_0 = n_0(k,q)$ such that if $D$ is an oriented graph of order $n\geq n_0$ with minimum semi-degree at least $(3n+3k+6q-3)/8$, then $D$ is arbitrary $H$-linked. This minimum semi-degree is sharp. Also, we refine the bounds on the semi-degree of sufficiently large arbitrary $k$-linked oriented graphs, sufficiently large arbitrary $l$-ordered oriented graphs, and sufficiently large oriented graphs with disjoint cycles of prescribed lengths containing prescribed arcs.

Submitted 9 July, 2024; originally announced July 2024.

Comments: 26 pages, 2 figures

MSC Class: 05C20(Primary); 05C38; 05C70(Secondary)

arXiv:2407.06373 [pdf]

Enhancing super-resolution ultrasound localisation through multi-frame deconvolution exploiting spatiotemporal coherence

Authors: Su Yan, Clotilde Vié, Marcelo Lerendegui, Herman Verinaz-Jadan, Jipeng Yan, Martina Tashkova, James Burn, Bingxue Wang, Gary Frost, Kevin G. Murphy, Meng-Xing Tang

Abstract: Super-resolution ultrasound imaging through microbubble (MB) localisation and tracking, also known as ultrasound localisation microscopy, allows non-invasive sub-diffraction resolution imaging of microvasculature in animals and humans. The number of MBs localised from the acquired contrast-enhanced ultrasound (CEUS) images and the localisation precision directly influence the quality of the resulting super-resolution microvasculature images. However, non-negligible noise present in the CEUS images can make localising MBs challenging. To enhance the MB localisation performance, we propose a Multi-Frame Deconvolution (MF-Decon) framework that can exploit the spatiotemporal coherence inherent in the CEUS data, with new spatial and temporal regularisers designed based on total variation (TV) and regularisation by denoising (RED). Based on the MF-Decon framework, we introduce two novel methods: MF-Decon with spatial and temporal TVs (MF-Decon+3DTV) and MF-Decon with spatial RED and temporal TV (MF-Decon+RED+TV). Results from in silico simulations indicate that our methods outperform two widely used methods using deconvolution or normalised cross-correlation across all evaluation metrics, including precision, recall, $F_1$ score, mean and standard localisation errors. In particular, our methods improve MB localisation precision by up to 39% and recall by up to 12%. Super-resolution microvasculature maps generated with our methods on a publicly available in vivo rat brain dataset show less noise, better contrast, higher resolution and more vessel structures.

Submitted 8 July, 2024; originally announced July 2024.

Comments: 26 pages, 1 table, 7 figures

arXiv:2407.05681 [pdf]

Bulk high-temperature superconductivity in the high-pressure tetragonal phase of bilayer La2PrNi2O7

Authors: Ningning Wang, Gang Wang, Xiaoling Shen, Jun Hou, Jun Luo, Xiaoping Ma, Huaixin Yang, Lifen Shi, Jie Dou, Jie Feng, Jie Yang, Yunqing Shi, Zhian Ren, Hanming Ma, Pengtao Yang, Ziyi Liu, Yue Liu, Hua Zhang, Xiaoli Dong, Yuxin Wang, Kun Jiang, Jiangping Hu, Stuart Calder, Jiaqiang Yan, Jianping Sun, et al. (4 additional authors not shown)

Abstract: The Ruddlesden-Popper (R-P) bilayer nickelate, La3Ni2O7, was recently found to show signatures of high-temperature superconductivity (HTSC) at pressures above 14 GPa. Subsequent investigations achieved zero resistance in single- and poly-crystalline samples under hydrostatic pressure conditions. Yet, obvious diamagnetic signals, the other hallmark of superconductors, are still lacking owing to the filamentary nature with low superconducting volume fraction. The presence of a novel "1313" polymorph and competing R-P phases obscured proper identification of the phase for HTSC. Thus, achieving bulk HTSC and identifying the phase at play are the most prominent tasks at present. Here, we address these issues in the praseodymium (Pr)-doped La2PrNi2O7 polycrystalline samples. We find that the substitution of Pr for La effectively inhibits the intergrowth of different R-P phases, resulting in nearly pure bilayer structure. For La2PrNi2O7, pressure-induced orthorhombic-to-tetragonal structural transition takes place at Pc ~ 11 GPa, above which HTSC emerges gradually upon further compression. The superconducting transition temperatures at 18-20 GPa reach Tconset = 82.5 K and Tczero = 60 K, which are the highest values among known nickelate superconductors. More importantly, bulk HTSC was testified by detecting clear diamagnetic signals below ~75 K corresponding to an estimated superconducting volume fraction ~ 57(5)% at 20 GPa. Our results not only resolve the existing controversies but also illuminate directions for exploring bulk HTSC in the bilayer nickelates.

Submitted 8 July, 2024; originally announced July 2024.

arXiv:2407.03658 [pdf, other]

GPT-4 vs. Human Translators: A Comprehensive Evaluation of Translation Quality Across Languages, Domains, and Expertise Levels

Authors: Jianhao Yan, Pingchuan Yan, Yulong Chen, Judy Li, Xianchao Zhu, Yue Zhang

Abstract: This study comprehensively evaluates the translation quality of Large Language Models (LLMs), specifically GPT-4, against human translators of varying expertise levels across multiple language pairs and domains. Through carefully designed annotation rounds, we find that GPT-4 performs comparably to junior translators in terms of total errors made but lags behind medium and senior translators. We also observe the imbalanced performance across different languages and domains, with GPT-4's translation capability gradually weakening from resource-rich to resource-poor directions. In addition, we qualitatively study the translation given by GPT-4 and human translators, and find that GPT-4 translator suffers from literal translations, but human translators sometimes overthink the background information. To our knowledge, this study is the first to evaluate LLMs against human translators and analyze the systematic differences between their outputs, providing valuable insights into the current state of LLM-based translation and its potential limitations.

Submitted 4 July, 2024; originally announced July 2024.

arXiv:2407.03625 [pdf, other]

Augmenting LLMs to Repair Obsolete Test Cases with Static Collector and Neural Reranker

Authors: Jun Liu, Jiwei Yan, Yuanyuan Xie, Jun Yan, Jian Zhang

Abstract: During software evolution, it is advocated that test code should co-evolve with production code. In real development scenarios, test updating may lag behind production code changing, which may cause the project to fail to compile or bring other troubles. Existing techniques based on pre-trained language models can be adopted to repair obsolete tests caused by such unsynchronized code changes, especially syntactic-related ones. However, the lack of target-oriented contextual information affects repair accuracy on large-scale projects. Starting from an obsoleted test, the key challenging task is precisely identifying and constructing Test-Repair-Oriented Contexts (TROCtx) from the whole repository within a limited token size. In this paper, we propose SynBCIATR (Syntactic-Breaking-Change-Induced Automated Test Repair), a novel approach to automatically repair obsolete test cases via precise and concise TROCtx construction. Inspired by developers' programming practices of the task, we design three types of TROCtx: class contexts, usage contexts, and environment contexts. For every type of TROCtx, SynBCIATR automatically collects the changed-token-related code information through static analysis techniques. Then it generates reranking queries to identify the most relevant TROCtxs, which will be taken as the repair-required key context and be input to the Large Language Model for the final test repair. To evaluate the effectiveness of SynBCIATR, we construct a benchmark dataset that contains diverse syntactic breaking changes. The experimental results show that SynBCIATR outperforms baseline approaches both on textual- and intent-matching metrics. With the augmentation of TROCtx constructed by SynBCIATR, hallucinations are reduced by 57.1%.

Submitted 4 July, 2024; originally announced July 2024.

arXiv:2407.02888 [pdf, ps, other]

Joint Optimization of Resource Allocation and Data Selection for Fast and Cost-Efficient Federated Edge Learning

Authors: Yunjian Jia, Zhen Huang, Jiping Yan, Yulu Zhang, Kun Luo, Wanli Wen

Abstract: Deploying federated learning at the wireless edge introduces federated edge learning (FEEL). Given FEEL's limited communication resources and potential mislabeled data on devices, improper resource allocation or data selection can hurt convergence speed and increase training costs. Thus, to realize an efficient FEEL system, this paper emphasizes jointly optimizing resource allocation and data selection. Specifically, in this work, through rigorously modeling the training process and deriving an upper bound on FEEL's one-round convergence rate, we establish a problem of joint resource allocation and data selection, which, unfortunately, cannot be solved directly. Toward this end, we equivalently transform the original problem into a solvable form via a variable substitution and then break it into two subproblems, that is, the resource allocation problem and the data selection problem. The two subproblems are mixed-integer non-convex and integer non-convex problems, respectively, and achieving their optimal solutions is a challenging task. Based on the matching theory and applying the convex-concave procedure and gradient projection methods, we devise a low-complexity suboptimal algorithm for the two subproblems, respectively. Finally, the superiority of our proposed scheme of joint resource allocation and data selection is validated by numerical results.

Submitted 3 July, 2024; originally announced July 2024.

arXiv:2407.01891 [pdf, other]

Refined Motion Compensation with Soft Laser Manipulators using Data-Driven Surrogate Models

Authors: Yongjun Yan, Qingpeng Ding, Mingwu Li, Junyan Yan, Shing Shin Cheng

Abstract: Non-contact laser ablation, a precise thermal technique, simultaneously cuts and coagulates tissue without the insertion errors associated with rigid needles. Human organ motions, such as those in the liver, exhibit rhythmic components influenced by respiratory and cardiac cycles, making effective laser energy delivery to target lesions while compensating for tumor motion crucial. This research introduces a data-driven method to derive surrogate models of a soft manipulator. These low-dimensional models offer computational efficiency when integrated into the Model Predictive Control (MPC) framework, while still capturing the manipulator's dynamics with and without control input. Spectral Submanifolds (SSM) theory models the manipulator's autonomous dynamics, acknowledging its tendency to reach equilibrium when external forces are removed. Preliminary results show that the MPC controller using the surrogate model outperforms two other models within the same MPC framework. The data-driven MPC controller also supports a design-agnostic feature, allowing the interchangeability of different soft manipulators within the laser ablation surgery robot system.

Submitted 1 July, 2024; originally announced July 2024.

arXiv:2407.00690 [pdf, other]

MnRhBi3: A Cleavable Antiferromagnetic Metal

Authors: Eleanor M. Clements, Dmitry Ovchinnikov, Parul R. Raghuvanshi, Valentino R. Cooper, Satoshi Okamoto, Andrew D. Christianson, Joseph A. M. Paddison, Brenden R. Ortiz, Stuart Calder, Andrew F. May, Xiaodong Xu, Jiaqiang Yan, Michael A. McGuire

Abstract: Cleavable metallic antiferromagnets may be of use for low-dissipation spintronic devices; however, few are currently known. Here we present orthorhombic MnRhBi3 as one such compound and present a thorough study of its physical properties. Exfoliation is demonstrated experimentally, and the cleavage energy and electronic structure are examined by density functional theory calculations. It is concluded that MnRhBi3 is a van der Waals layered material that cleaves easily between neighboring Bi layers, and that the Bi atoms have lone pairs extending into the van der Waals gaps. A series of four phase transitions are observed below room temperature, and neutron diffraction shows that at least two of the transitions involve the formation of antiferromagnetic order. Anomalous thermal expansion points to a crystallographic phase transition and/or strong magnetoelastic coupling. This work reveals a complex phase evolution in MnRhBi3 and establishes this cleavable antiferromagnetic metal as an interesting material for studying the interplay of structure, magnetism, and transport in the bulk and ultrathin limits as well as the role of lone pair electrons in interface chemistry and proximity effects in van der Waals heterostructures.

Submitted 30 June, 2024; originally announced July 2024.

arXiv:2406.17555 [pdf, ps, other]

A response to commenter Ke Lan's comment on our paper published in Nature Communications (2023)14:5782 by J. Yan et al

Authors: Ji Yan, Jiwei Li, X. T. He, Lifeng Wang, Yaohua Chen, Feng Wang, Xiaoying Han, Kaiqiang Pan, Juxi Liang, Yulong Li, Zanyang Guan, Xiangming Liu, Xingsen Che, Zhongjing Chen, Xing Zhang, Yan Xu, Bin Li, Minging He, Hongbo Cai, Liang. Hao, Zhanjun Liu, Chunyang Zheng, Zhensheng Dai, Zhengfeng Fan, Bin Qiao, et al. (4 additional authors not shown)

Abstract: A response to commenter Ke Lan's comment on our paper published in Nature Communications (2023)14:5782 by J. Yan et al

Submitted 25 June, 2024; originally announced June 2024.

arXiv:2406.17470 [pdf, other]

Dynamic Scheduling for Vehicle-to-Vehicle Communications Enhanced Federated Learning

Authors: Jintao Yan, Tan Chen, Yuxuan Sun, Zhaojun Nan, Sheng Zhou, Zhisheng Niu

Abstract: Leveraging the computing and sensing capabilities of vehicles, vehicular federated learning (VFL) has been applied to edge training for connected vehicles. The dynamic and interconnected nature of vehicular networks presents unique opportunities to harness direct vehicle-to-vehicle (V2V) communications, enhancing VFL training efficiency. In this paper, we formulate a stochastic optimization problem to optimize the VFL training performance, considering the energy constraints and mobility of vehicles, and propose a V2V-enhanced dynamic scheduling (VEDS) algorithm to solve it. The model aggregation requirements of VFL and the limited transmission time due to mobility result in a stepwise objective function, which presents challenges in solving the problem. We thus propose a derivative-based drift-plus-penalty method to convert the long-term stochastic optimization problem to an online mixed integer nonlinear programming (MINLP) problem, and provide a theoretical analysis to bound the performance gap between the online solution and the offline optimal solution. Further analysis of the scheduling priority reduces the original problem into a set of convex optimization problems, which are efficiently solved using the interior-point method. Experimental results demonstrate that compared with the state-of-the-art benchmarks, the proposed algorithm enhances the image classification accuracy on the CIFAR-10 dataset by 3.18% and reduces the average displacement errors on the Argoverse trajectory prediction dataset by 10.21%.

Submitted 25 June, 2024; originally announced June 2024.

Comments: Submitted to IEEE for possible publication

arXiv:2406.16949 [pdf, other]

Fair Differentiable Neural Network Architecture Search for Long-Tailed Data with Self-Supervised Learning

Authors: Jiaming Yan

Abstract: Recent advancements in artificial intelligence (AI) have positioned deep learning (DL) as a pivotal technology in fields like computer vision, data mining, and natural language processing. A critical factor in DL performance is the selection of neural network architecture. Traditional predefined architectures often fail to adapt to different data distributions, making it challenging to achieve optimal performance. Neural architecture search (NAS) offers a solution by automatically designing architectures tailored to specific datasets. However, the effectiveness of NAS diminishes on long-tailed datasets, where a few classes have abundant samples, and many have few, leading to biased models. In this paper, we explore how to improve the searching and training performance of NAS on long-tailed datasets. Specifically, we first discuss the related works about NAS and the deep learning method for long-tailed datasets. Then, we focus on an existing work, called SSF-NAS, which integrates self-supervised learning and fair differentiable NAS to make NAS achieve better performance on long-tailed datasets. A detailed description of the fundamental techniques for SSF-NAS is provided in this paper, including DARTS, FairDARTS, and Barlow Twins. Finally, we conducted a series of experiments on the CIFAR10-LT dataset for performance evaluation, where the results align with our expectations.

Submitted 19 June, 2024; originally announced June 2024.

arXiv:2406.16367 [pdf, other]

On the Role of Long-tail Knowledge in Retrieval Augmented Large Language Models

Authors: Dongyang Li, Junbing Yan, Taolin Zhang, Chengyu Wang, Xiaofeng He, Longtao Huang, Hui Xue, Jun Huang

Abstract: Retrieval augmented generation (RAG) exhibits outstanding performance in promoting the knowledge capabilities of large language models (LLMs) with retrieved documents related to user queries. However, RAG only focuses on improving the response quality of LLMs via enhancing queries indiscriminately with retrieved information, paying little attention to what type of knowledge LLMs really need to answer original queries more accurately. In this paper, we suggest that long-tail knowledge is crucial for RAG as LLMs have already remembered common world knowledge during large-scale pre-training. Based on our observation, we propose a simple but effective long-tail knowledge detection method for LLMs. Specifically, the novel Generative Expected Calibration Error (GECE) metric is derived to measure the ``long-tailness'' of knowledge based on both statistics and semantics. Hence, we retrieve relevant documents and infuse them into the model for patching knowledge loopholes only when the input query relates to long-tail knowledge. Experiments show that, compared to existing RAG pipelines, our method achieves over 4x speedup in average inference time and consistent performance improvement in downstream tasks.

Submitted 24 June, 2024; originally announced June 2024.

arXiv:2406.15836 [pdf, other]

Decentralized Transformers with Centralized Aggregation are Sample-Efficient Multi-Agent World Models

Authors: Yang Zhang, Chenjia Bai, Bin Zhao, Junchi Yan, Xiu Li, Xuelong Li

Abstract: Learning a world model for model-free Reinforcement Learning (RL) agents can significantly improve the sample efficiency by learning policies in imagination. However, building a world model for Multi-Agent RL (MARL) can be particularly challenging due to the scalability issue in a centralized architecture arising from a large number of agents, and also the non-stationarity issue in a decentralized architecture stemming from the inter-dependency among agents. To address both challenges, we propose a novel world model for MARL that learns decentralized local dynamics for scalability, combined with a centralized representation aggregation from all agents. We cast the dynamics learning as an auto-regressive sequence modeling problem over discrete tokens by leveraging the expressive Transformer architecture, in order to model complex local dynamics across different agents and provide accurate and consistent long-term imaginations. As the first pioneering Transformer-based world model for multi-agent systems, we introduce a Perceiver Transformer as an effective solution to enable centralized representation aggregation within this context. Results on Starcraft Multi-Agent Challenge (SMAC) show that it outperforms strong model-free approaches and existing model-based methods in both sample efficiency and overall performance.

Submitted 22 June, 2024; originally announced June 2024.

arXiv:2406.14143 [pdf, other]

Using the Transport of Intensity and the Transport of Phase Equation for Phase Retrieval

Authors: Clemens Kirisits, Kemal Raik, Otmar Scherzer, Christina Strohmenger, Jikai Yan

Abstract: We investigate the transport of intensity equation (TIE) and the transport of phase equation (TPE) for solving the phase retrieval problem. Both the TIE and the TPE are derived from the paraxial Helmholtz equation and relate phase information to the intensity. The TIE is usually favored since the TPE is nonlinear. The main contribution of this paper is that we discuss situations in which it is possible to use the two equations in a hybrid manner: We show that 2-dimensional phase information retrieved by the TIE can be used as initial data for the TPE, enabling the acquisition of 3-dimensional phase information. The latter is solved using the method of characteristics and viscosity methods. Both the TIE and the viscosity method are numerically implemented with finite element methods.

Submitted 20 June, 2024; originally announced June 2024.

MSC Class: 65Z05

arXiv:2406.13945 [pdf, other]

CityBench: Evaluating the Capabilities of Large Language Model as World Model

Authors: Jie Feng, Jun Zhang, Junbo Yan, Xin Zhang, Tianjian Ouyang, Tianhui Liu, Yuwei Du, Siqi Guo, Yong Li

Abstract: Large language models (LLMs) with powerful generalization ability have been widely used in many domains. A systematic and reliable evaluation of LLMs is a crucial step in their development and applications, especially for specific professional fields. In the urban domain, there have been some early explorations of the usability of LLMs, but a systematic and scalable evaluation benchmark is still lacking. The challenge in constructing a systematic evaluation benchmark for the urban domain lies in the diversity of data and scenarios, as well as the complex and dynamic nature of cities. In this paper, we propose CityBench, an interactive simulator-based evaluation platform, as the first systematic evaluation benchmark for the capability of LLMs in the urban domain. First, we build CitySim to integrate multi-source data and simulate fine-grained urban dynamics. Based on CitySim, we design 7 tasks in 2 categories, perception-understanding and decision-making, to evaluate the capability of LLMs as city-scale world models for the urban domain. Due to the flexibility and ease of use of CitySim, our evaluation platform CityBench can be easily extended to any city in the world. We evaluate 13 well-known LLMs, including open-source and commercial LLMs, in 13 cities around the world. Extensive experiments demonstrate the scalability and effectiveness of the proposed CityBench and shed light on the future development of LLMs in the urban domain.
The dataset, benchmark and source codes are openly accessible to the research community via https://github.com/tsinghua-fib-lab/CityBench

Submitted 19 June, 2024; originally announced June 2024.

arXiv:2406.13358 [pdf, other]

Multi-scale Restoration of Missing Data in Optical Time-series Images with Masked Spatial-Temporal Attention Network

Authors: Zaiyan Zhang, Jining Yan, Yuanqi Liang, Jiaxin Feng, Haixu He, Wei Han

Abstract: Due to factors such as thick cloud cover and sensor limitations, remote sensing images often suffer from significant missing data, resulting in incomplete time-series information. Existing methods for imputing missing values in remote sensing images do not fully exploit spatio-temporal auxiliary information, leading to limited accuracy in restoration. Therefore, this paper proposes a novel deep learning-based approach called MS2TAN (Multi-scale Masked Spatial-Temporal Attention Network), for reconstructing time-series remote sensing images. Firstly, we introduce an efficient spatio-temporal feature extractor based on Masked Spatial-Temporal Attention (MSTA), to obtain high-quality representations of the spatio-temporal neighborhood features in the missing regions. Secondly, a Multi-scale Restoration Network consisting of the MSTA-based Feature Extractors, is employed to progressively refine the missing values by exploring spatio-temporal neighborhood features at different scales. Thirdly, we propose a ``Pixel-Structure-Perception'' Multi-Objective Joint Optimization method to enhance the visual effects of the reconstruction results from multiple perspectives and preserve more texture structures. Furthermore, the proposed method reconstructs missing values in all input temporal phases in parallel (i.e., Multi-In Multi-Out), achieving higher processing efficiency.
Finally, experimental evaluations on two typical missing data restoration tasks across multiple research areas demonstrate that the proposed method outperforms state-of-the-art methods with an improvement of 0.40dB/1.17dB in mean peak signal-to-noise ratio (mPSNR) and 3.77/9.41 thousandths in mean structural similarity (mSSIM), while exhibiting stronger texture and structural consistency.

Submitted 19 June, 2024; originally announced June 2024.

arXiv:2406.12878 [pdf, other]

Beam test results of the prototype of the multi wire drift chamber for the CSR external-target experiment

Authors: Zhi Qin, Zhoubo He, Zhe Cao, Tao Chen, Zhi Deng, Limin Duan, Dong Guo, Rongjiang Hu, Jie Kong, Canwen Liu, Peng Ma, Xianglun Wei, Shihai Wen, Xiangjie Wen, Junwei Yan, Herun Yang, Zuoqiao Yang, Yuhong Yu, Zhigang Xiao

Abstract: The half-size prototype of the multi wire drift chamber (MWDC) for the cooling storage ring (CSR) external-target experiment (CEE) was assembled and tested in 350 MeV/u Kr+Fe reactions on the heavy ion research facility in Lanzhou (HIRFL). The prototype consists of 6 sense layers, where the sense wires are stretched in three directions X, U and V, meeting $0^\circ$, $30^\circ$ and $-30^\circ$ with respect to the vertical axis, respectively. The sensitive area of the prototype is $76 {\rm cm} \times 76 {\rm cm}$. The amplified and shaped signals from the anode wires are digitized in a serial capacity array. Being operated with 1500 V high voltage on the anode wires, the efficiency for each layer is beyond 95\%. The tracking residual is about $301 \pm 2 \rm μm$. The performance meets the requirements of CEE.

Submitted 15 May, 2024; originally announced June 2024.

arXiv:2406.11633 [pdf, other]

DocGenome: An Open Large-scale Scientific Document Benchmark for Training and Testing Multi-modal Large Language Models

Authors: Renqiu Xia, Song Mao, Xiangchao Yan, Hongbin Zhou, Bo Zhang, Haoyang Peng, Jiahao Pi, Daocheng Fu, Wenjie Wu, Hancheng Ye, Shiyang Feng, Bin Wang, Chao Xu, Conghui He, Pinlong Cai, Min Dou, Botian Shi, Sheng Zhou, Yongwei Wang, Bin Wang, Junchi Yan, Fei Wu, Yu Qiao

Abstract: Scientific documents record research findings and valuable human knowledge, comprising a vast corpus of high-quality data. Leveraging multi-modality data extracted from these documents and assessing large models' abilities to handle scientific document-oriented tasks is therefore meaningful. Despite promising advancements, large models still perform poorly on multi-page scientific document extraction and understanding tasks, and their capacity to process within-document data formats such as charts and equations remains under-explored. To address these issues, we present DocGenome, a structured document benchmark constructed by annotating 500K scientific documents from 153 disciplines in the arXiv open-access community, using our custom auto-labeling pipeline. DocGenome features four key characteristics: 1) Completeness: It is the first dataset to structure data from all modalities including 13 layout attributes along with their LaTeX source codes. 2) Logicality: It provides 6 logical relationships between different entities within each scientific document. 3) Diversity: It covers various document-oriented tasks, including document classification, visual grounding, document layout detection, document transformation, open-ended single-page QA and multi-page QA. 4) Correctness: It undergoes rigorous quality control checks conducted by a specialized team. We conduct extensive experiments to demonstrate the advantages of DocGenome and objectively evaluate the performance of large models on our benchmark.

Submitted 17 June, 2024; originally announced June 2024.

Comments: Homepage of DocGenome: https://unimodal4reasoning.github.io/DocGenome_page 22 pages, 11 figures

arXiv:2406.11207 [pdf]

Doping-tunable Fermi surface with persistent topological Hall effect in axion candidate EuIn$_2$As$_2$

Authors: Jian Yan, Jianguo Si, Zhongzhu Jiang, Hanming Ma, Yoshiya Uwatoko, Bao-Tian Wang, Xuan Luo, Yuping Sun, Minoru Yamashita

Abstract: Rare-earth Zintl compound EuIn$_2$As$_2$ has been theoretically recognized as a candidate for realizing an intrinsic antiferromagnetic (AFM) bulk axion insulator and a higher-order topological state, which provides a fertile platform to explore novel topological transport phenomena. However, the axion state has yet to be realized because EuIn$_2$As$_2$ is highly hole-doped. Here, we synthesized a series of high-quality Ca-doped EuIn$_2$As$_2$ (Ca$_x$Eu$_{1-x}$In$_2$As$_2$, x = 0 ~ 0.25) single crystals to tune the Fermi energy above the hole pocket. Our Hall measurements reveal that the isovalent Ca substitution decreases the hole carrier density by shrinking the lattice spacing, which is also confirmed by our first-principles calculations. We further find that both the temperature dependence of the magnetic susceptibility with a local maximum at the Néel temperature and the topological Hall effect originating from the finite real-space spin chirality persist in the Ca-doped samples as observed in the pristine EuIn$_2$As$_2$, even though the nonmagnetic Ca substitution decreases the effective moment and the Néel temperature. These results show that the Ca substitution tunes the Fermi energy while keeping the AFM magnetic structure, suggesting that the axion insulating state may be realized by further Ca substitution.

Submitted 17 June, 2024; originally announced June 2024.

Comments: 18 pages, 8 figures

arXiv:2406.10661 [pdf, other]

A GPU-accelerated Large-scale Simulator for Transportation System Optimization Benchmarking

Authors: Jun Zhang, Wenxuan Ao, Junbo Yan, Depeng Jin, Yong Li

Abstract: With the development of artificial intelligence techniques, transportation system optimization is evolving from traditional methods relying on expert experience to simulation and learning-based decision optimization methods. Learning-based optimization methods require extensive interaction with highly realistic microscopic traffic simulators for optimization. However, existing microscopic traffic simulators are computationally inefficient in large-scale scenarios and therefore significantly reduce the efficiency of the data sampling process of optimization algorithms. In addition, the optimization scenarios supported by existing simulators are limited, mainly focusing on the traffic signal control. To address these challenges and limitations, we propose the first open-source GPU-accelerated large-scale microscopic simulator for transportation system simulation. The simulator is able to iterate at 84.09Hz, which achieves 88.92 times computational acceleration in the large-scale scenario with more than a million vehicles compared to the best baseline. Based on the simulator, we implement a set of microscopic and macroscopic controllable objects and metrics to support most typical transportation system optimization scenarios. These controllable objects and metrics are all provided by Python API for ease of use. We choose five important and representative transportation system optimization scenarios and benchmark classical rule-based algorithms, reinforcement learning, and black-box optimization in four cities.
The codes are available at https://github.com/tsinghua-fib-lab/moss-benchmark with the MIT License.

Submitted 15 June, 2024; originally announced June 2024.

Comments: Submitted to NeurIPS 2024 Datasets and Benchmarks Track

arXiv:2406.09410 [pdf, other]

STAR: A First-Ever Dataset and A Large-Scale Benchmark for Scene Graph Generation in Large-Size Satellite Imagery

Authors: Yansheng Li, Linlin Wang, Tingzhu Wang, Xue Yang, Junwei Luo, Qi Wang, Youming Deng, Wenbin Wang, Xian Sun, Haifeng Li, Bo Dang, Yongjun Zhang, Yi Yu, Junchi Yan

Abstract: Scene graph generation (SGG) in satellite imagery (SAI) helps promote understanding of geospatial scenarios from perception to cognition. In SAI, objects exhibit great variations in scales and aspect ratios, and there exist rich relationships between objects (even between spatially disjoint objects), which makes it attractive to holistically conduct SGG in large-size very-high-resolution (VHR) SAI. However, such SGG datasets are lacking. Due to the complexity of large-size SAI, mining triplets <subject, relationship, object> heavily relies on long-range contextual reasoning. Consequently, SGG models designed for small-size natural imagery are not directly applicable to large-size SAI. This paper constructs a large-scale dataset for SGG in large-size VHR SAI with image sizes ranging from 512 x 768 to 27,860 x 31,096 pixels, named STAR (Scene graph generaTion in lArge-size satellite imageRy), encompassing over 210K objects and over 400K triplets. To realize SGG in large-size SAI, we propose a context-aware cascade cognition (CAC) framework to understand SAI regarding object detection (OBD), pair pruning and relationship prediction for SGG. We also release a SAI-oriented SGG toolkit with about 30 OBD and 10 SGG methods which need further adaptation by our devised modules on our challenging STAR dataset. The dataset and toolkit are available at: https://linlin-dev.github.io/project/STAR.

Submitted 3 July, 2024; v1 submitted 13 June, 2024; originally announced June 2024.

Comments: 18 pages, 11 figures

arXiv:2406.09385 [pdf, other]

Towards Vision-Language Geo-Foundation Model: A Survey

Authors: Yue Zhou, Litong Feng, Yiping Ke, Xue Jiang, Junchi Yan, Xue Yang, Wayne Zhang

Abstract: Vision-Language Foundation Models (VLFMs) have made remarkable progress on various multimodal tasks, such as image captioning, image-text retrieval, visual question answering, and visual grounding. However, most methods rely on training with general image datasets, and the lack of geospatial data leads to poor performance on earth observation. Numerous geospatial image-text pair datasets and VLFMs fine-tuned on them have been proposed recently. These new approaches aim to leverage large-scale, multimodal geospatial data to build versatile intelligent models with diverse geo-perceptive capabilities, which we refer to as Vision-Language Geo-Foundation Models (VLGFMs). This paper thoroughly reviews VLGFMs, summarizing and analyzing recent developments in the field. In particular, we introduce the background and motivation behind the rise of VLGFMs, highlighting their unique research significance. Then, we systematically summarize the core technologies employed in VLGFMs, including data construction, model architectures, and applications of various multimodal geospatial tasks. Finally, we conclude with insights, issues, and discussions regarding future research directions. To the best of our knowledge, this is the first comprehensive literature review of VLGFMs. We keep tracing related works at https://github.com/zytx121/Awesome-VLGFM.

Submitted 13 June, 2024; originally announced June 2024.

Comments: 18 pages, 4 figures

arXiv:2406.08806 [pdf, ps, other]

Adaptive Cooperative Streaming of Holographic Video Over Wireless Networks: A Proximal Policy Optimization Solution

Authors: Wanli Wen, Jiping Yan, Yulu Zhang, Zhen Huang, Liang Liang, Yunjian Jia

Abstract: Adapting holographic video streaming to fluctuating wireless channels is essential to maintain consistent and satisfactory Quality of Experience (QoE) for users, which, however, is a challenging task due to the dynamic and uncertain characteristics of wireless networks. To address this issue, we propose a holographic video cooperative streaming framework designed for a generic wireless network in which multiple access points can cooperatively transmit video with different bitrates to multiple users. Additionally, we model a novel QoE metric tailored specifically for holographic video streaming, which can effectively encapsulate the nuances of holographic video quality, quality fluctuations, and rebuffering occurrences simultaneously. Furthermore, we formulate a formidable QoE maximization problem, which is a non-convex mixed integer nonlinear programming problem. Using proximal policy optimization (PPO), a new class of reinforcement learning algorithms, we devise a joint beamforming and bitrate control scheme, which can be wisely adapted to fluctuations in the wireless channel. The numerical results demonstrate the superiority of the proposed scheme over representative baselines.

Submitted 13 June, 2024; originally announced June 2024.

Comments: This paper has been accepted for publication in IEEE Wireless Communications Letters

arXiv:2406.08698 [pdf, other]

Constraints on Ultra Heavy Dark Matter Properties from Dwarf Spheroidal Galaxies with LHAASO Observations

Authors: Zhen Cao, F. Aharonian, Q. An, Axikegu, Y. X. Bai, Y. W. Bao, D. Bastieri, X. J. Bi, Y. J. Bi, J. T. Cai, Q. Cao, W. Y. Cao, Zhe Cao, J. Chang, J. F. Chang, A. M. Chen, E. S. Chen, Liang Chen, Lin Chen, Long Chen, M. J. Chen, M. L. Chen, Q. H. Chen, S. H. Chen, S. Z. Chen, et al. (255 additional authors not shown)

Abstract: In this work we search for signals generated by ultra-heavy dark matter in the Large High Altitude Air Shower Observatory (LHAASO) data. We look for possible gamma rays from dark matter annihilation or decay in 16 dwarf spheroidal galaxies in the field of view of LHAASO. Dwarf spheroidal galaxies are among the most promising targets for indirect detection of dark matter, since they have low fluxes of astrophysical $γ$-ray background yet large amounts of dark matter. By analyzing more than 700 days of observational data at LHAASO, no significant dark matter signal from 1 TeV to 1 EeV is detected. Accordingly we derive the most stringent constraints on the ultra-heavy dark matter annihilation cross-section up to EeV. The constraints on the lifetime of dark matter in decay mode are also derived.

Submitted 12 June, 2024; originally announced June 2024.

Comments: 17 pages, 12 figures, accepted by PRL

arXiv:2406.07786 [pdf, ps, other]

doi 10.1364/OE.496966

Field Test of Quantum Key Distribution with High Key Creation Efficiency

Authors: Yung-Cheng Kao, Sheng-Hsuan Huang, Chin-Hsuan Chang, Chih-Hsiang Wu, Shih-Hsien Chu, Jian Jiang, An-Chi Zhang, Sheng-Yao Huang, Jhih-Heng Yan, Kai-Ming Feng, Chih-Sung Chuu

Abstract: Quantum key distribution (QKD) promises unconditional security for communication. However, the random choices of the measurement basis in QKD usually result in low key creation efficiency. This drawback is overcome in the differential-phase-shift QKD, provided that each photon can be prepared in a large number of time bins with a proper waveform. In this work we develop a miniature 1550-nm single-photon source to generate narrowband single photons in 50 time bins with a nearly optimal waveform for achieving unity key creation efficiency. By utilizing these single photons in the field test, we demonstrate the differential-phase-shift QKD with a key creation efficiency of 97%. Our work shows that the practical QKD can benefit from the narrowband single photons with controllable waveforms.

Submitted 11 June, 2024; originally announced June 2024.

Comments: 9 pages, 4 figures

Journal ref: Opt. Express 31, 30239-30247 (2023)

arXiv:2406.07355 [pdf]

Insulator-to-Metal Transition and Anomalously Slow Hot Carrier Cooling in a Photo-doped Mott Insulator

Authors: Usama Choudhry, Jin Zhang, Kewen Huang, Emma Low, Yujie Quan, Basamat Shaheen, Ryan Gnabasik, Jiaqiang Yan, Angel Rubio, Kenneth S. Burch, Bolin Liao

Abstract: Photo-doped Mott insulators can exhibit novel photocarrier transport and relaxation dynamics and non-equilibrium phases. However, time-resolved real-space imaging of these processes is still lacking. Here, we use scanning ultrafast electron microscopy (SUEM) to directly visualize the spatial-temporal evolution of photoexcited species in a spin-orbit assisted Mott insulator α-RuCl3. At low optical fluences, we observe extremely long hot photocarrier transport time over one nanosecond, almost an order of magnitude longer than any known values in conventional semiconductors. At higher optical fluences, we observe nonlinear features suggesting a photo-induced insulator-to-metal transition, which is unusual in a large-gap Mott insulator. Our results demonstrate the rich physics in a photo-doped Mott insulator that can be extracted from spatial-temporal imaging and showcase the capability of SUEM to sensitively probe photoexcitations in strongly correlated electron systems.

Submitted 11 June, 2024; originally announced June 2024.

Comments: Comments are welcome. Please email feedback to bliao@ucsb.edu

arXiv:2406.04963 [pdf, other]

Learning Divergence Fields for Shift-Robust Graph Representations

Authors: Qitian Wu, Fan Nie, Chenxiao Yang, Junchi Yan

Abstract: Real-world data generation often involves certain geometries (e.g., graphs) that induce instance-level interdependence. This characteristic makes the generalization of learning models more difficult due to the intricate interdependent patterns that impact data-generative distributions and can vary from training to testing. In this work, we propose a geometric diffusion model with learnable divergence fields for the challenging generalization problem with interdependent data. We generalize the diffusion equation with stochastic diffusivity at each time step, which aims to capture the multi-faceted information flows among interdependent data. Furthermore, we derive a new learning objective through causal inference, which can guide the model to learn generalizable patterns of interdependence that are insensitive across domains. Regarding practical implementation, we introduce three model instantiations that can be considered as the generalized versions of GCN, GAT, and Transformers, respectively, which possess advanced robustness against distribution shifts. We demonstrate their promising efficacy for out-of-distribution generalization on diverse real-world datasets.

Submitted 7 June, 2024; originally announced June 2024.

Comments: Accepted to ICML 2024. Source codes at https://github.com/fannie1208/GLIND

arXiv:2406.04133 [pdf]

GLOBUS: Global building renovation potential by 2070

Authors: Shufan Zhang, Minda Ma, Nan Zhou, Jinyue Yan

Abstract: Surpassing the two large emission sectors of transportation and industry, the building sector accounted for 34% and 37% of global energy consumption and carbon emissions in 2021, respectively. The building sector, the final piece to be addressed in the transition to net-zero carbon emissions, requires a comprehensive, multisectoral strategy for reducing emissions. Until now, the absence of data on global building floorspace has impeded the measurement of building carbon intensity (carbon emissions per floorspace) and the identification of ways to achieve carbon neutrality for buildings. For this study, we develop a global building stock model (GLOBUS) to fill that data gap. Our study's primary contribution lies in providing a dataset of global building stock turnover using scenarios that incorporate various levels of building renovation. By unifying the evaluation indicators, the dataset empowers building science researchers to perform comparative analyses based on floorspace. Specifically, the building stock dataset establishes a reference for measuring carbon emission intensity and decarbonization intensity of buildings within different countries. Further, we emphasize the sufficiency of existing buildings by incorporating building renovation into the model. Renovation can minimize the need to expand the building stock, thereby bolstering decarbonization of the building sector.

Submitted 6 June, 2024; originally announced June 2024.

Comments: 26 pages, 6 figures

arXiv:2406.04074 [pdf]

Estimation of Global Building Stocks by 2070: Unlocking Renovation Potential

Authors: Shufan Zhang, Minda Ma, Nan Zhou, Jinyue Yan, Wei Feng, Ran Yan, Kairui You, Jingjing Zhang, Jing Ke

Abstract: Buildings produce one-third of carbon emissions globally; however, the absence of data on global floorspace poses challenges in advancing building carbon neutrality. We compile the measured building stocks for 14 major economies and apply our global building stock model, GLOBUS, to evaluate future trends in stock turnover. Based on a scenario not considering renovation, by 2070 the building stock in developed economies will be ~1.4 times that of 2020 (100 billion m²); in developing economies it is expected to be 2.2 times that of 2020 (313 billion m²). Based on a techno-economic potential scenario, however, stocks in developed economies will decline to approximately 0.8 times the 2020 level, while stocks in developing economies will increase to nearly twice the 2020 level because they currently have fewer buildings. Overall, GLOBUS provides a way of calculating the global building stock, helping scientists, engineers, and policymakers conduct a range of investigations across various future scenarios.

Submitted 6 June, 2024; originally announced June 2024.

Comments: 25 pages, 4 figures

arXiv:2406.03877 [pdf, other]

Bench2Drive: Towards Multi-Ability Benchmarking of Closed-Loop End-To-End Autonomous Driving

Authors: Xiaosong Jia, Zhenjie Yang, Qifeng Li, Zhiyuan Zhang, Junchi Yan

Abstract: In an era marked by the rapid scaling of foundation models, autonomous driving technologies are approaching a transformative threshold where end-to-end autonomous driving (E2E-AD) emerges due to its potential for scaling up in a data-driven manner. However, existing E2E-AD methods are mostly evaluated in an open-loop log-replay manner with L2 errors and collision rate as metrics (e.g., in nuScenes), which cannot fully reflect the driving performance of algorithms, as recently acknowledged in the community. E2E-AD methods evaluated under the closed-loop protocol are tested on fixed routes (e.g., Town05Long and Longest6 in CARLA) with the driving score as the metric, which is known for high variance due to the unsmoothed metric function and large randomness in long routes. Besides, these methods usually collect their own data for training, which makes fair algorithm-level comparison infeasible. To fulfill the paramount need for comprehensive, realistic, and fair testing environments for Full Self-Driving (FSD), we present Bench2Drive, the first benchmark for evaluating E2E-AD systems' multiple abilities in a closed-loop manner. Bench2Drive's official training data consists of 2 million fully annotated frames, collected from 10,000 short clips uniformly distributed under 44 interactive scenarios (cut-in, overtaking, detour, etc.), 23 weathers (sunny, foggy, rainy, etc.), and 12 towns (urban, village, university, etc.) in CARLA v2. Its evaluation protocol requires E2E-AD models to pass 44 interactive scenarios under different locations and weathers, which sums up to 220 routes and thus provides a comprehensive and disentangled assessment of their driving capability under different situations. We implement state-of-the-art E2E-AD models and evaluate them in Bench2Drive, providing insights regarding the current status and future directions.

Submitted 11 June, 2024; v1 submitted 6 June, 2024; originally announced June 2024.

Comments: Fix typos in text and Table 4. More reference

arXiv:2405.20583 [pdf, other]

The Gestalt Computational Model

Authors: Yu Chen, Hongwei Lin, Jiacong Yan

Abstract: Widely employed in cognitive psychology, Gestalt theory elucidates basic principles in visual perception, but meanwhile presents significant challenges for computation. The advancement of artificial intelligence requires the emulation of human cognitive behavior, for which Gestalt theory serves as a fundamental framework describing human visual cognitive behavior. In this paper, we utilize persistent homology, a mathematical tool in computational topology, to develop a computational model for Gestalt theory, addressing the challenges of quantification and computation. The Gestalt computational model not only holds promise for applications in artificial intelligence and computer vision, but also opens a new research direction of computational visual perception.

Submitted 30 May, 2024; originally announced May 2024.

arXiv:2405.18132 [pdf, other]

EG4D: Explicit Generation of 4D Object without Score Distillation

Authors: Qi Sun, Zhiyang Guo, Ziyu Wan, Jing Nathan Yan, Shengming Yin, Wengang Zhou, Jing Liao, Houqiang Li

Abstract: In recent years, the increasing demand for dynamic 3D assets in design and gaming applications has given rise to powerful generative pipelines capable of synthesizing high-quality 4D objects. Previous methods generally rely on the score distillation sampling (SDS) algorithm to infer the unseen views and motion of 4D objects, thus leading to unsatisfactory results with defects like over-saturation and the Janus problem. Therefore, inspired by recent progress in video diffusion models, we propose to optimize a 4D representation by explicitly generating multi-view videos from one input image. However, it is far from trivial to handle the practical challenges faced by such a pipeline, including dramatic temporal inconsistency, inter-frame geometry and texture diversity, and semantic defects brought by video generation results. To address these issues, we propose EG4D, a novel multi-stage framework that generates high-quality and consistent 4D assets without score distillation. Specifically, collaborative techniques and solutions are developed, including an attention injection strategy to synthesize temporally consistent multi-view videos, a robust and efficient dynamic reconstruction method based on Gaussian Splatting, and a refinement stage with diffusion prior for semantic restoration. The qualitative results and user preference study demonstrate that our framework outperforms the baselines in generation quality by a considerable margin. Code will be released at https://github.com/jasongzy/EG4D.

Submitted 28 May, 2024; originally announced May 2024.

arXiv:2405.16759 [pdf, other]

Greedy Growing Enables High-Resolution Pixel-Based Diffusion Models

Authors: Cristina N. Vasconcelos, Abdullah Rashwan, Austin Waters, Trevor Walker, Keyang Xu, Jimmy Yan, Rui Qian, Shixin Luo, Zarana Parekh, Andrew Bunner, Hongliang Fei, Roopal Garg, Mandy Guo, Ivana Kajic, Yeqing Li, Henna Nandwani, Jordi Pont-Tuset, Yasumasa Onoe, Sarah Rosston, Su Wang, Wenlei Zhou, Kevin Swersky, David J. Fleet, Jason M. Baldridge, Oliver Wang

Abstract: We address the long-standing problem of how to learn effective pixel-based image diffusion models at scale, introducing a remarkably simple greedy growing method for stable training of large-scale, high-resolution models, without the need for cascaded super-resolution components. The key insight stems from careful pre-training of core components, namely, those responsible for text-to-image alignment vs. high-resolution rendering. We first demonstrate the benefits of scaling a Shallow UNet, with no down(up)-sampling enc(dec)oder. Scaling its deep core layers is shown to improve alignment, object structure, and composition. Building on this core model, we propose a greedy algorithm that grows the architecture into high-resolution end-to-end models, while preserving the integrity of the pre-trained representation, stabilizing training, and reducing the need for large high-resolution datasets. This enables a single-stage model capable of generating high-resolution images without the need for a super-resolution cascade. Our key results rely on public datasets and show that we are able to train non-cascaded models up to 8B parameters with no further regularization schemes. Vermeer, our full pipeline model trained with internal datasets to produce 1024x1024 images, without cascades, is preferred over SDXL by human evaluators (44.0% vs. 21.4%).

Submitted 26 May, 2024; originally announced May 2024.

arXiv:2405.16432 [pdf, other]

Revealing the hidden Dirac gap in a topological antiferromagnet using Floquet-Bloch manipulation

Authors: Nina Bielinski, Rajas Chari, Julian May-Mann, Soyeun Kim, Jack Zwettler, Yujun Deng, Anuva Aishwarya, Subhajit Roychowdhury, Chandra Shekhar, Makoto Hashimoto, Donghui Lu, Jiaqiang Yan, Claudia Felser, Vidya Madhavan, Zhi-Xun Shen, Taylor L. Hughes, Fahad Mahmood

Abstract: Manipulating solids using the time-periodic drive of a laser pulse is a promising route to generate new phases of matter. Whether such 'Floquet-Bloch' manipulation can be achieved in topological magnetic systems with disorder has so far been unclear. In this work, we realize Floquet-Bloch manipulation of the Dirac surface-state mass of the topological antiferromagnet (AFM) MnBi$_2$Te$_4$. Using time- and angle-resolved photoemission spectroscopy (tr-ARPES), we show that opposite helicities of mid-infrared circularly polarized light result in substantially different Dirac mass gaps in the AFM phase, despite the equilibrium Dirac cone being massless. We explain our findings in terms of a Dirac fermion with a random mass. Our results underscore Floquet-Bloch manipulation as a powerful tool for controlling topology even in the presence of disorder, and for uncovering properties of materials that may elude conventional probes.

Submitted 26 May, 2024; originally announced May 2024.

arXiv:2405.16187 [pdf, other]

An X-Ray High-Frequency QPO in NGC 1365

Authors: Yongkang Yan, Peng Zhang, Qingzhong Liu, Zhi Chang, Gaochao Liu, Jingzhi Yan, Xiangyun Zeng

Abstract: This study presents the detection of a high-frequency Quasi-Periodic Oscillation (QPO) in the Seyfert galaxy NGC 1365, based on observational data obtained by XMM-Newton in January 2004. Utilizing the Weighted Wavelet Z-transform (WWZ) and Lomb-Scargle Periodogram (LSP) methods, a QPO signal was identified at a frequency of $2.19 \times 10^{-4}$ Hz (4566 s), with a confidence level of 3.6 sigma. The signal was notably absent in the lower 0.2-1.0 keV energy band, with the primary contribution emerging from the 2.0-10.0 keV band, where the confidence level reached 3.9 sigma. Spectral analysis shows that there are multiple absorption and emission lines in the high-energy band (> 6 keV). The correlation between the QPO frequency (f_QPO) and the mass of the central black hole of NGC 1365 (M_BH) aligns with the established logarithmic trend observed across black holes, indicating that the QPO is of high frequency. This discovery provides new clues for studying the generation mechanism of QPOs in Seyfert galaxies, which helps us understand the accretion process around supermassive black holes and the characteristics of strong gravitational fields in active galactic nuclei.

Submitted 25 May, 2024; originally announced May 2024.

Comments: 6 pages, 5 figures, 1 table
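
As a quick check of the figures quoted in the abstract above, the QPO period is simply the reciprocal of its frequency. The sketch below is only an illustration of that arithmetic; the function name `qpo_period_seconds` is hypothetical and not taken from the paper.

```python
def qpo_period_seconds(frequency_hz: float) -> float:
    """Return the oscillation period in seconds for a given frequency in Hz."""
    if frequency_hz <= 0:
        raise ValueError("frequency must be positive")
    return 1.0 / frequency_hz


# Frequency quoted in the abstract: 2.19e-4 Hz
period = qpo_period_seconds(2.19e-4)
print(f"{period:.0f} s")            # period in seconds, rounded
print(f"{period / 3600:.2f} h")     # the same period expressed in hours
```

The rounded period agrees with the 4566 s given in the abstract.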

arXiv:2405.15908 [pdf, other]

Knowledge-Informed Auto-Penetration Testing Based on Reinforcement Learning with Reward Machine

Authors: Yuanliang Li, Hanzheng Dai, Jun Yan

Abstract: Automated penetration testing (AutoPT) based on reinforcement learning (RL) has proven its ability to improve the efficiency of vulnerability identification in information systems. However, RL-based PT encounters several challenges, including poor sampling efficiency, intricate reward specification, and limited interpretability. To address these issues, we propose a knowledge-informed AutoPT framework called DRLRM-PT, which leverages reward machines (RMs) to encode domain knowledge as guidelines for training a PT policy. In our study, we specifically focus on lateral movement as a PT case study and formulate it as a partially observable Markov decision process (POMDP) guided by RMs. We design two RMs based on the MITRE ATT&CK knowledge base for lateral movement. To solve the POMDP and optimize the PT policy, we employ the deep Q-learning algorithm with RM (DQRM). The experimental results demonstrate that the DQRM agent exhibits higher training efficiency in PT compared to agents without knowledge embedding. Moreover, RMs encoding more detailed domain knowledge demonstrated better PT performance compared to RMs with simpler knowledge.

Submitted 24 May, 2024; originally announced May 2024.

arXiv:2405.14854 [pdf, other]

TerDiT: Ternary Diffusion Models with Transformers

Authors: Xudong Lu, Aojun Zhou, Ziyi Lin, Qi Liu, Yuhui Xu, Renrui Zhang, Yafei Wen, Shuai Ren, Peng Gao, Junchi Yan, Hongsheng Li

Abstract: Recent developments in large-scale pre-trained text-to-image diffusion models have significantly improved the generation of high-fidelity images, particularly with the emergence of diffusion models based on transformer architecture (DiTs). Among these diffusion models, diffusion transformers have demonstrated superior image generation capabilities, achieving lower FID scores and higher scalability. However, deploying large-scale DiT models can be expensive due to their large parameter counts. Although existing research has explored efficient deployment techniques for diffusion models such as model quantization, there is still little work concerning DiT-based models. To tackle this research gap, in this paper, we propose TerDiT, a quantization-aware training (QAT) and efficient deployment scheme for ternary diffusion models with transformers. We focus on the ternarization of DiT networks and scale model sizes from 600M to 4.2B. Our work contributes to the exploration of efficient deployment strategies for large-scale DiT models, demonstrating the feasibility of training extremely low-bit diffusion transformer models from scratch while maintaining competitive image generation capacities compared to full-precision models. Code will be available at https://github.com/Lucky-Lance/TerDiT.

Submitted 23 May, 2024; originally announced May 2024.

Comments: 18 pages, 13 figures

arXiv:2405.13279 [pdf, other]

Constraints on Einstein-dilation-Gauss-Bonnet gravity and electric charge of compact binary systems from GW230529

Authors: Bo Gao, Shao-Peng Tang, Hai-Tian Wang, Jingzhi Yan, Yi-Zhong Fan

Abstract: In this work, we study the implications of GW230529 for gravity theories and the charge of black holes. GW230529, which was initially released in O4a, is most likely a neutron star-black hole (NSBH) merger. We reanalyze the data from the GW230529 event to obtain bounds on the Einstein-dilaton-Gauss-Bonnet (EdGB) gravity parameter $\sqrt{α_{\rm EdGB}}$ and the electric charge of compact binary systems. The event places a $90\%$ credible upper bound on $\sqrt{α_{\rm EdGB}}$ of $\lesssim 0.298$ km. After including high-order corrections of EdGB gravity, the bound improves to $\sqrt{α_{\rm EdGB}} \lesssim 0.260$ km. Analyses of GW230529 also yield a $90\%$ credible upper bound on the combination of charge-to-mass ratios of the binary components, $ζ\lesssim 0.024$. The constraints are more stringent than those derived from previously observed single gravitational wave merger events.

Submitted 12 July, 2024; v1 submitted 21 May, 2024; originally announced May 2024.

Comments: 7 pages, 4 figures, PRD accepted

arXiv:2405.12788 [pdf, other]

What Have We Achieved on Non-autoregressive Translation?

Authors: Yafu Li, Huajian Zhang, Jianhao Yan, Yongjing Yin, Yue Zhang

Abstract: Recent advances have made non-autoregressive translation (NAT) comparable to autoregressive translation (AT). However, their evaluation using BLEU has been shown to correlate weakly with human annotations. Limited research compares non-autoregressive translation and autoregressive translation comprehensively, leaving uncertainty about the true proximity of NAT to AT. To address this gap, we systematically evaluate four representative NAT methods across various dimensions, including human evaluation. Our empirical results demonstrate that despite narrowing the performance gap, state-of-the-art NAT still underperforms AT under more reliable evaluation metrics. Furthermore, we discover that explicitly modeling dependencies is crucial for generating natural language and generalizing to out-of-distribution sequences.

Submitted 21 May, 2024; originally announced May 2024.

Comments: ACL 2024 Findings

arXiv:2405.12520 [pdf, other]

MOSS: A Large-scale Open Microscopic Traffic Simulation System

Authors: Jun Zhang, Wenxuan Ao, Junbo Yan, Can Rong, Depeng Jin, Wei Wu, Yong Li

Abstract: In the research of Intelligent Transportation Systems (ITS), traffic simulation is a key procedure for the evaluation of new methods and optimization of strategies. However, existing traffic simulation systems face two challenges. First, balancing simulation scale with realism is a dilemma. Second, it is hard to simulate realistic results, which requires realistic travel demand data and a realistic simulator. These problems limit computer-aided optimization of traffic management strategies for large-scale road networks and reduce the usability of traffic simulations in areas where real-world travel demand data are lacking. To address these problems, we design and implement the MObility Simulation System (MOSS). MOSS adopts GPU acceleration to significantly improve the efficiency and scale of microscopic traffic simulation, which enables realistic and fast simulations for large-scale road networks. It provides realistic travel Origin-Destination (OD) matrix generation through a pre-trained generative neural network model based on publicly available data on a global scale, such as satellite imagery, to help researchers build meaningful travel demand data. It also provides a complete open toolchain to help users with road network construction, demand generation, simulation, and result analysis. The whole toolchain including the simulator can be accessed at https://moss.fiblab.net and the code is open-source for community collaboration.

Submitted 21 May, 2024; originally announced May 2024.

Comments: Submitted to IEEE ITSC 2024

arXiv:2405.11826 [pdf, other]

Data quality control system and long-term performance monitor of the LHAASO-KM2A

Authors: Zhen Cao, F. Aharonian, Axikegu, Y. X. Bai, Y. W. Bao, D. Bastieri, X. J. Bi, Y. J. Bi, W. Bian, A. V. Bukevich, Q. Cao, W. Y. Cao, Zhe Cao, J. Chang, J. F. Chang, A. M. Chen, E. S. Chen, H. X. Chen, Liang Chen, Lin Chen, Long Chen, M. J. Chen, M. L. Chen, Q. H. Chen, S. Chen , et al. (263 additional authors not shown)

Abstract: The KM2A is the largest sub-array of the Large High Altitude Air Shower Observatory (LHAASO). It consists of 5216 electromagnetic particle detectors (EDs) and 1188 muon detectors (MDs). The data recorded by the EDs and MDs are used to reconstruct primary information of cosmic ray and gamma-ray showers. This information is used for physical analysis in gamma-ray astronomy and cosmic ray physics. To ensure the reliability of the LHAASO-KM2A data, a three-level quality control system has been established. It is used to monitor the status of detector units, the stability of reconstructed parameters, and the performance of the array based on observations of the Crab Nebula and Moon shadow. This paper introduces the control system and its application to the LHAASO-KM2A data collected from August 2021 to July 2023. During this period, the pointing and angular resolution of the array were stable. The results obtained from the observations of the Moon shadow and the Crab Nebula are consistent with each other. According to the observation of the Crab Nebula at energies from 25 TeV to 100 TeV, the time-averaged pointing errors are estimated to be $-0.003^{\circ} \pm 0.005^{\circ}$ and $0.001^{\circ} \pm 0.006^{\circ}$ in the R.A. and Dec directions, respectively.

Submitted 13 June, 2024; v1 submitted 20 May, 2024; originally announced May 2024.

Comments: 15 pages, 9 figures

arXiv:2405.10889 [pdf]

Unconventional Unidirectional Magnetoresistance in vdW Heterostructures

Authors: I-Hsuan Kao, Junyu Tang, Gabriel Calderon Ortiz, Menglin Zhu, Sean Yuan, Rahul Rao, Jiahan Li, James H. Edgar, Jiaqiang Yan, David G. Mandrus, Kenji Watanabe, Takashi Taniguchi, Jinwoo Hwang, Ran Cheng, Jyoti Katoch, Simranjeet Singh

Abstract: Electrical readout of magnetic states is key to realizing novel spintronics devices for efficient computing and data storage. Unidirectional magnetoresistance (UMR) in bilayer systems, consisting of a spin source material and a magnetic layer, refers to a change in the longitudinal resistance upon the reversal of magnetization, which typically originates from the interaction of spin current and magnetization at the interface. Because of UMR's linear dependence on applied charge current and magnetization, it can be used to electrically read the magnetization state. However, in conventional spin source materials, the spin polarization of an electric-field-induced spin current is restricted to be in the film plane, and hence the ensuing UMR can only respond to the in-plane component of the magnetization. On the other hand, magnets with perpendicular magnetic anisotropy (PMA) are highly desired for magnetic memory and spin-logic devices, while the electrical readout of PMA magnets through UMR is critically missing. Here, we report the discovery of an unconventional UMR in bilayer heterostructures of a topological semimetal (WTe2) and a PMA ferromagnetic insulator (Cr2Ge2Te6, CGT), which allows electrical readout of the up and down magnetic states of the CGT layer by measuring the longitudinal resistance. Our theoretical calculations based on a tight-binding model show that the unconventional UMR originates from the interplay of crystal symmetry breaking in WTe2 and magnetic exchange interaction across the WTe2 and CGT interface. Combined with the ability of WTe2 to achieve magnetic-field-free switching of PMA magnets, our discoveries open an exciting pathway to achieve two-terminal magnetic memory devices that operate solely on spin-orbit torque and UMR, which is critical for developing next-generation non-volatile and low-power data storage technologies.

Submitted 17 May, 2024; originally announced May 2024.

arXiv:2405.10559 [pdf, other]

Observational test of ${\cal R}^{2}$ spacetimes with the S2 star in the Milky Way galactic center

Authors: Jian-Ming Yan, Tao Zhu, Mustapha Azreg-Aïnou, Mubasher Jamil, Hoang Ky Nguyen

Abstract: A novel class of vacuum metrics expressible in analytical form was recently found for pure $\mathcal R^2$ gravity, based on groundwork put forth by Buchdahl in 1962. These Buchdahl-inspired solutions offer a practical framework for testing ${\cal R}^2$ gravity through empirical observations. Within a subclass of asymptotically flat Buchdahl-inspired vacuum spacetimes, we identified a parameter $ε$ measuring the deviation from the classic Schwarzschild metric, which corresponds to $ε=0$. In this paper, we employ observational data from the S2 star's orbit around Sgr A* in the Milky Way galactic center and perform Monte Carlo Markov Chain simulations to probe the effects of the new metrics on the orbit of the S2 star. Our analysis reports a range at the 95% confidence level on the deviation parameter of $ε\in(-0.4930,\ 0.5001)$. While no decisive evidence either in favor or in disfavor of the asymptotically flat Buchdahl-inspired spacetimes has been achieved, the obtained bound is compatible with the tighter results using other data of a different nature, as recently reported in Eur. Phys. J. C 84, 330 (2024). As a meaningful test probing a strong-field regime, our present study calls for further observations over a prolonged period and with improved accuracy in order to tighten the bound for $ε$ using the S2 star orbit.

Submitted 17 May, 2024; originally announced May 2024.

Comments: 11 pages, 5 figures, 1 table

arXiv:2405.07691 [pdf, other]

Discovery of Very-high-energy Gamma-ray Emissions from the Low Luminosity AGN NGC 4278 by LHAASO

Authors: Zhen Cao, F. Aharonian, Q. An, Axikegu, Y. X. Bai, Y. W. Bao, D. Bastieri, X. J. Bi, Y. J. Bi, J. T. Cai, Q. Cao, W. Y. Cao, Zhe Cao, J. Chang, J. F. Chang, A. M. Chen, E. S. Chen, Liang Chen, Lin Chen, Long Chen, M. J. Chen, M. L. Chen, Q. H. Chen, S. H. Chen, S. Z. Chen , et al. (255 additional authors not shown)

Abstract: The first source catalog of the Large High Altitude Air Shower Observatory reported the detection of a very-high-energy gamma-ray source, 1LHAASO J1219+2915. In this paper, a further detailed study of the spectral and temporal behavior of this point-like source has been carried out. The best-fit position of the TeV source ($\rm{RA}=185.05^{\circ}\pm0.04^{\circ}$, $\rm{Dec}=29.25^{\circ}\pm0.03^{\circ}$) is compatible with NGC 4278 within $\sim0.03$ degree. Variation analysis shows an indication of variability at the level of a few months in the TeV band, which is consistent with low-frequency observations. Based on these observations, we report the detection of TeV $γ$-ray emissions from the low-luminosity AGN NGC 4278. The observations by LHAASO-WCDA during the active period have a significance level of $8.8\,σ$ with best-fit photon spectral index $\varGamma=2.56\pm0.14$ and a flux $f_{1-10\,\rm{TeV}}=(7.0\pm1.1_{\rm{sta}}\pm0.35_{\rm{syst}})\times10^{-13}\,\rm{photons\,cm^{-2}\,s^{-1}}$, or approximately $5\%$ of the Crab Nebula. The discovery of VHE emission from NGC 4278 indicates that the compact, weak radio jet can efficiently accelerate particles and emit TeV photons.

Submitted 13 May, 2024; originally announced May 2024.

Comments: 11 pages, 5 figures

arXiv:2405.05880 [pdf, other]

Optical contrast analysis of α-RuCl$_3$ nanoflakes on oxidized silicon wafers

Authors: Tatyana V. Ivanova, Daniel Andres-Penares, Yiping Wang, Jiaqiang Yan, Daniel Forbes, Servet Ozdemir, Kenneth S. Burch, Brian D. Gerardot, Mauro Brotons-Gisbert

Abstract: α-RuCl$_3$, a narrow-band Mott insulator with large work function, offers intriguing potential as a quantum material or as a charge acceptor for electrical contacts in van der Waals devices. In this work, we perform a systematic study of the optical reflection contrast of α-RuCl$_3$ nanoflakes on oxidized silicon wafers and estimate the accuracy of this imaging technique to assess the crystal thickness. Via spectroscopic micro-ellipsometry measurements, we characterize the wavelength-dependent complex refractive index of α-RuCl$_3$ nanoflakes of varying thickness in the visible and near-infrared. Building on these results, we simulate the optical contrast of α-RuCl$_3$ nanoflakes with thicknesses below 100 nm on SiO$_2$/Si substrates under different illumination conditions. We compare the simulated optical contrast with experimental values extracted from optical microscopy images and obtain good agreement. Finally, we show that optical contrast imaging allows us to retrieve the thickness of the RuCl$_3$ nanoflakes exfoliated on an oxidized silicon substrate with a mean deviation of -0.2 nm for thicknesses below 100 nm with a standard deviation of only 1 nm. Our results demonstrate that optical contrast can be used as a non-invasive, fast, and reliable technique to estimate the α-RuCl$_3$ thickness.

Submitted 9 May, 2024; originally announced May 2024.

arXiv:2405.05519 [pdf, ps, other]

Numerical model of Phobos' motion incorporating the effects of free rotation

Authors: Yongzhang Yang, Jianguo Yan, Nianchuan Jian, Koji Matsumoto, Jean-Pierre Barriot

Abstract: High-precision ephemerides are not only useful in supporting space missions, but also in investigating the physical nature of celestial bodies. This paper reports an update to the orbit and rotation model of the Martian moon Phobos. In contrast to earlier numerical models, this paper details a dynamical model that fully considers the rotation of Phobos. Here, Phobos' rotation is first described by Euler's rotational equations and integrated simultaneously with the orbital motion equations. We discuss this dynamical model, along with the differences with respect to the model now in use. We present the variational equation for Phobos' rotation employing the symbolic \emph{Maple} computation software. The adjustment test simulations confirm the latitude libration of Phobos, suggesting that gravity field coefficients obtained using a shape model and homogeneous density hypothesis should be re-examined in the future in the context of dynamics. Furthermore, the simulations with different $k_2$ values indicate that it is difficult to determine $k_2$ efficiently using the current data.

Submitted 8 May, 2024; originally announced May 2024.

Showing 1–50 of 1,492 results for author: Yan, J