doi 10.1007/s12540-022-01226-4

Numerical Analysis on the Spatiotemporal Characteristics of the Portevin-Le Chatelier Effect in Ti-12Mo Alloy

Authors: Shiyuan Luo, Yongxin Jiang, Sandrine Thuillier, Philippe Castany, Liangcai Zeng

Abstract: A simplified 3D FE model based on McCormick's model is developed to numerically predict the spatiotemporal behaviors of the PLC effect in Ti-12Mo alloy tensile tests at 350 degrees C with strain rates from the order of $10^{-4}$ s$^{-1}$ to $10^{-2}$ s$^{-1}$. The material parameter identification procedure is firstly presented in details, and the simulated results are highly consistent with experimental ones, especially in terms of stress drop magnitudes and PLC band widths. The distribution of simulated stress drop magnitudes at a constant tensile velocity (0.01 mm/s) follows a normal distribution and its peak value is in the range of 26-28 MPa. Furthermore, the simulated band width slightly fluctuates with the increase of true strain and its average value is about 1.5 mm. Besides, the staircase behavior of strain-time curves and the hopping propagation of the PLC band are observed in Ti-12Mo alloy tensile process, which are related to the strain localization and stress drop magnitudes.

Submitted 12 July, 2024; originally announced July 2024.

Journal ref: Metals and Materials International, 2023, 29 (2), pp.269-279

arXiv:2407.08348 [pdf, other]

Skywork-Math: Data Scaling Laws for Mathematical Reasoning in Large Language Models -- The Story Goes On

Authors: Liang Zeng, Liangjun Zhong, Liang Zhao, Tianwen Wei, Liu Yang, Jujie He, Cheng Cheng, Rui Hu, Yang Liu, Shuicheng Yan, Han Fang, Yahui Zhou

Abstract: In this paper, we investigate the underlying factors that potentially enhance the mathematical reasoning capabilities of large language models (LLMs). We argue that the data scaling law for math reasoning capabilities in modern LLMs is far from being saturated, highlighting how the model's quality improves with increases in data quantity. To support this claim, we introduce the Skywork-Math model series, supervised fine-tuned (SFT) on common 7B LLMs using our proposed 2.5M-instance Skywork-MathQA dataset. Skywork-Math 7B has achieved impressive accuracies of 51.2% on the competition-level MATH benchmark and 83.9% on the GSM8K benchmark using only SFT data, outperforming an early version of GPT-4 on MATH. The superior performance of Skywork-Math models contributes to our novel two-stage data synthesis and model SFT pipelines, which include three different augmentation methods and a diverse seed problem set, ensuring both the quantity and quality of Skywork-MathQA dataset across varying difficulty levels. Most importantly, we provide several practical takeaways to enhance math reasoning abilities in LLMs for both research and industry applications.

Submitted 11 July, 2024; originally announced July 2024.

arXiv:2406.19791 [pdf, other]

Mobile Robot Oriented Large-Scale Indoor Dataset for Dynamic Scene Understanding

Authors: Yifan Tang, Cong Tai, Fangxing Chen, Wanting Zhang, Tao Zhang, Xueping Liu, Yongjin Liu, Long Zeng

Abstract: Most existing robotic datasets capture static scene data and thus are limited in evaluating robots' dynamic performance. To address this, we present a mobile robot oriented large-scale indoor dataset, denoted as THUD (Tsinghua University Dynamic) robotic dataset, for training and evaluating their dynamic scene understanding algorithms. Specifically, the THUD dataset construction is first detailed, including organization, acquisition, and annotation methods. It comprises both real-world and synthetic data, collected with a real robot platform and a physical simulation platform, respectively. Our current dataset includes 13 large-scale dynamic scenarios, 90K image frames, 20M 2D/3D bounding boxes of static and dynamic objects, camera poses, and IMU. The dataset is still continuously expanding. Then, the performance of mainstream indoor scene understanding tasks, e.g. 3D object detection, semantic segmentation, and robot relocalization, is evaluated on our THUD dataset. These experiments reveal serious challenges for some robot scene understanding tasks in dynamic scenes. By sharing this dataset, we aim to foster and iterate new mobile robot algorithms quickly for robot actual working dynamic environment, i.e. complex crowded dynamic scenes.

Submitted 30 June, 2024; v1 submitted 28 June, 2024; originally announced June 2024.

Comments: This version has been accepted by ICRA 2024. The dataset has been published; the link can be found in the paper

Journal ref: IEEE International Conference on Robotics & Automation, 2024

arXiv:2406.19613 [pdf, other]

Online Optimization of DNN Inference Network Utility in Collaborative Edge Computing

Authors: Rui Li, Tao Ouyang, Liekang Zeng, Guocheng Liao, Zhi Zhou, Xu Chen

Abstract: Collaborative Edge Computing (CEC) is an emerging paradigm that collaborates heterogeneous edge devices as a resource pool to compute DNN inference tasks in proximity such as edge video analytics. Nevertheless, as the key knob to improve network utility in CEC, existing works mainly focus on the workload routing strategies among edge devices with the aim of minimizing the routing cost, remaining an open question for joint workload allocation and routing optimization problem from a system perspective. To this end, this paper presents a holistic, learned optimization for CEC towards maximizing the total network utility in an online manner, even though the utility functions of task input rates are unknown a priori. In particular, we characterize the CEC system in a flow model and formulate an online learning problem in a form of cross-layer optimization. We propose a nested-loop algorithm to solve workload allocation and distributed routing iteratively, using the tools of gradient sampling and online mirror descent. To improve the convergence rate over the nested-loop version, we further devise a single-loop algorithm. Rigorous analysis is provided to show its inherent convexity, efficient convergence, as well as algorithmic optimality. Finally, extensive numerical simulations demonstrate the superior performance of our solutions.

Submitted 27 June, 2024; originally announced June 2024.

Comments: Accepted by IEEE/ACM TRANSACTIONS ON NETWORKING (ToN)

arXiv:2406.17192 [pdf, other]

Upgrading the Submillimeter Array: wSMA and beyond

Authors: Paul K. Grimes, Garrett K. Keating, Raymond Blundell, Robert D. Christensen, Mark Gurwell, Attila Kovacs, Timothy Norton, Scott N. Paine, Ramprasad Rao, Edward C. -Y. Tong, Jonathan Weintroub, David Wilner, Robert W. Wilson, Lingzhen Zeng, Qizhou Zhang

Abstract: The Submillimeter Array (SMA) is an array of 8 antennas operating at millimeter and submillimeter wavelengths on Maunakea, Hawaii, operated by the Smithsonian Astrophysical Observatory and Academia Sinica Institute of Astronomy and Astrophysics, Taiwan. Over the past several years, we have been preparing a major upgrade to the SMA that will replace the aging original receiver cryostats and receiver cartridges with all new cryostats and new 230 and 345 GHz receiver designs. This wideband upgrade (wSMA) will also include significantly increased instantaneous bandwidth, improved sensitivity, and greater capabilities for dual frequency observations. In this paper, we will describe the wSMA receiver upgrade and status, as well as the future upgrades that will be enabled by the deployment of the wSMA receivers.

Submitted 24 June, 2024; originally announced June 2024.

Comments: To be published in the proceedings of SPIE Astronomical Telescopes + Instrumentation 2024, paper number 13096-122

arXiv:2406.14843 [pdf, other]

Synthesis of Electron Microbunching Rotation for Generating Isolated Attosecond Soft X-ray Free-electron Laser Pulses

Authors: Hao Sun, Xiaofan Wang, Li Zeng, Weiqing Zhang

Abstract: Attosecond x-ray pulses play a crucial role in the study of ultrafast phenomena occurring within inner and valence electrons. Especially isolated attosecond pulses with high photon energy and high peak power are of great significance in single-shot imaging in the soft x-ray region, life sciences, and attosecond pump-probe experiments. In modern accelerators, laser manipulation of electrons can be used to tailor the ultrafast properties of free-electron laser (FEL) pulses. In this paper, we propose a novel laser manipulation technique that makes use of two laser beams with mutual delays and tilted wavefronts to synthesize microbunching rotation on the scale of infrared laser wavelengths within the electron bunch for generating isolated attosecond soft x-ray pulses. This microbunching rotation ultimately leads to an enhanced current contrast ratio between the main peak and the surrounding satellite peaks within the bunch. By properly accounting for the longitudinal space charge fields within the FEL undulator, a tapered undulator can further suppress the side peaks in the radiation pulse and enable the selection of an isolated, hundred-attosecond, GW-level soft x-ray pulse.

Submitted 20 June, 2024; originally announced June 2024.

arXiv:2406.11800 [pdf, other]

Magnetic field in mini starburst complex Sgr B2

Authors: Xing Pan, Qizhou Zhang, Keping Qiu, Ramprasad Rao, Lingzhen Zeng, Xing Lu, Junhao Liu

Abstract: We report the first arcsecond-resolution observations of the magnetic field in the mini starburst complex Sgr B2. SMA polarization observations revealed magnetic field morphology in three dense cores of Sgr B2 N(orth), M(ain), and S(outh). The total plane-of-sky magnetic field strengths in these cores are estimated to be 4.3-10.0 mG, 6.2-14.7 mG, and 1.9-4.5 mG derived from the angular dispersion function method after applying the correction factors of 0.21 and 0.5. Combining with analyses of the parsec-scale polarization data from SOFIA, we found that a magnetically supercritical condition is present from the cloud-scale ($\sim$10 pc) to core-scale ($\sim$0.2 pc) in Sgr B2, which is consistent with the burst of star formation activities in the region likely resulted from a multi-scale gravitational collapse from the cloud to dense cores.

Submitted 18 June, 2024; v1 submitted 17 June, 2024; originally announced June 2024.

Comments: 17 pages, 4 figures, accepted for publication in ApJ

arXiv:2406.08877 [pdf, other]

EgoExo-Fitness: Towards Egocentric and Exocentric Full-Body Action Understanding

Authors: Yuan-Ming Li, Wei-Jin Huang, An-Lan Wang, Ling-An Zeng, Jing-Ke Meng, Wei-Shi Zheng

Abstract: We present EgoExo-Fitness, a new full-body action understanding dataset, featuring fitness sequence videos recorded from synchronized egocentric and fixed exocentric (third-person) cameras. Compared with existing full-body action understanding datasets, EgoExo-Fitness not only contains videos from first-person perspectives, but also provides rich annotations. Specifically, two-level temporal boundaries are provided to localize single action videos along with sub-steps of each action. More importantly, EgoExo-Fitness introduces innovative annotations for interpretable action judgement--including technical keypoint verification, natural language comments on action execution, and action quality scores. Combining all of these, EgoExo-Fitness provides new resources to study egocentric and exocentric full-body action understanding across dimensions of "what", "when", and "how well". To facilitate research on egocentric and exocentric full-body action understanding, we construct benchmarks on a suite of tasks (i.e., action classification, action localization, cross-view sequence verification, cross-view skill determination, and a newly proposed task of guidance-based execution verification), together with detailed analysis. Code and data will be available at https://github.com/iSEE-Laboratory/EgoExo-Fitness/tree/main.

Submitted 16 July, 2024; v1 submitted 13 June, 2024; originally announced June 2024.

Comments: Accepted by ECCV2024

arXiv:2406.06563 [pdf, other]

Skywork-MoE: A Deep Dive into Training Techniques for Mixture-of-Experts Language Models

Authors: Tianwen Wei, Bo Zhu, Liang Zhao, Cheng Cheng, Biye Li, Weiwei Lü, Peng Cheng, Jianhao Zhang, Xiaoyu Zhang, Liang Zeng, Xiaokun Wang, Yutuan Ma, Rui Hu, Shuicheng Yan, Han Fang, Yahui Zhou

Abstract: In this technical report, we introduce the training methodologies implemented in the development of Skywork-MoE, a high-performance mixture-of-experts (MoE) large language model (LLM) with 146 billion parameters and 16 experts. It is initialized from the pre-existing dense checkpoints of our Skywork-13B model. We explore the comparative effectiveness of upcycling versus training from scratch initializations. Our findings suggest that the choice between these two approaches should consider both the performance of the existing dense checkpoints and the MoE training budget. We highlight two innovative techniques: gating logit normalization, which improves expert diversification, and adaptive auxiliary loss coefficients, allowing for layer-specific adjustment of auxiliary loss coefficients. Our experimental results validate the effectiveness of these methods. Leveraging these techniques and insights, we trained our upcycled Skywork-MoE on a condensed subset of our SkyPile corpus. The evaluation results demonstrate that our model delivers strong performance across a wide range of benchmarks.

Submitted 2 June, 2024; originally announced June 2024.

arXiv:2406.04603 [pdf, ps, other]

Simplify Implant Depth Prediction as Video Grounding: A Texture Perceive Implant Depth Prediction Network

Authors: Xinquan Yang, Xuguang Li, Xiaoling Luo, Leilei Zeng, Yudi Zhang, Linlin Shen, Yongqiang Deng

Abstract: Surgical guide plate is an important tool for the dental implant surgery. However, the design process heavily relies on the dentist to manually simulate the implant angle and depth. When deep neural networks have been applied to assist the dentist quickly locates the implant position, most of them are not able to determine the implant depth. Inspired by the video grounding task which localizes the starting and ending time of the target video segment, in this paper, we simplify the implant depth prediction as video grounding and develop a Texture Perceive Implant Depth Prediction Network (TPNet), which enables us to directly output the implant depth without complex measurements of oral bone. TPNet consists of an implant region detector (IRD) and an implant depth prediction network (IDPNet). IRD is an object detector designed to crop the candidate implant volume from the CBCT, which greatly saves the computation resource. IDPNet takes the cropped CBCT data to predict the implant depth. A Texture Perceive Loss (TPL) is devised to enable the encoder of IDPNet to perceive the texture variation among slices. Extensive experiments on a large dental implant dataset demonstrated that the proposed TPNet achieves superior performance than the existing methods.

Submitted 6 June, 2024; originally announced June 2024.

Journal ref: MICCAI'2024

arXiv:2406.00605 [pdf, other]

LongSkywork: A Training Recipe for Efficiently Extending Context Length in Large Language Models

Authors: Liang Zhao, Tianwen Wei, Liang Zeng, Cheng Cheng, Liu Yang, Peng Cheng, Lijie Wang, Chenxia Li, Xuejie Wu, Bo Zhu, Yimeng Gan, Rui Hu, Shuicheng Yan, Han Fang, Yahui Zhou

Abstract: We introduce LongSkywork, a long-context Large Language Model (LLM) capable of processing up to 200,000 tokens. We provide a training recipe for efficiently extending context length of LLMs. We identify that the critical element in enhancing long-context processing capability is to incorporate a long-context SFT stage following the standard SFT stage. A mere 200 iterations can convert the standard SFT model into a long-context model. To reduce the effort in collecting and annotating data for long-context language modeling, we develop two novel methods for creating synthetic data. These methods are applied during the continual pretraining phase as well as the Supervised Fine-Tuning (SFT) phase, greatly enhancing the training efficiency of our long-context LLMs. Our findings suggest that synthetic long-context SFT data can surpass the performance of data curated by humans to some extent. LongSkywork achieves outstanding performance on a variety of long-context benchmarks. In the Needle test, a benchmark for long-context information retrieval, our models achieved perfect accuracy across multiple context spans. Moreover, in realistic application scenarios, LongSkywork-13B demonstrates performance on par with Claude2.1, the leading long-context model, underscoring the effectiveness of our proposed methods.

Submitted 1 June, 2024; originally announced June 2024.

arXiv:2406.00023 [pdf, other]

LocMoE+: Enhanced Router with Token Feature Awareness for Efficient LLM Pre-Training

Authors: Jing Li, Zhijie Sun, Dachao Lin, Xuan He, Yi Lin, Binfan Zheng, Li Zeng, Rongqian Zhao, Xin Chen

Abstract: Mixture-of-Experts (MoE) architectures have recently gained increasing popularity within the domain of large language models (LLMs) due to their ability to significantly reduce training and inference overhead. However, MoE architectures face challenges, such as significant disparities in the number of tokens assigned to each expert and a tendency toward homogenization among experts, which adversely affects the model's semantic generation capabilities. In this paper, we introduce LocMoE+, a refined version of the low-overhead LocMoE, incorporating the following enhancements: (1) Quantification and definition of the affinity between experts and tokens. (2) Implementation of a global-level adaptive routing strategy to rearrange tokens based on their affinity scores. (3) Reestimation of the lower bound for expert capacity, which has been shown to progressively decrease as the token feature distribution evolves. Experimental results demonstrate that, without compromising model convergence or efficacy, the number of tokens each expert processes can be reduced by over 60%. Combined with communication optimizations, this leads to an average improvement in training efficiency ranging from 5.4% to 46.6%. After fine-tuning, LocMoE+ exhibits a performance improvement of 9.7% to 14.1% across the GDAD, C-Eval, and TeleQnA datasets.

Submitted 23 May, 2024; originally announced June 2024.

arXiv:2405.19745 [pdf, other]

doi 10.1145/3641519.3657417

GaussianPrediction: Dynamic 3D Gaussian Prediction for Motion Extrapolation and Free View Synthesis

Authors: Boming Zhao, Yuan Li, Ziyu Sun, Lin Zeng, Yujun Shen, Rui Ma, Yinda Zhang, Hujun Bao, Zhaopeng Cui

Abstract: Forecasting future scenarios in dynamic environments is essential for intelligent decision-making and navigation, a challenge yet to be fully realized in computer vision and robotics. Traditional approaches like video prediction and novel-view synthesis either lack the ability to forecast from arbitrary viewpoints or to predict temporal dynamics. In this paper, we introduce GaussianPrediction, a novel framework that empowers 3D Gaussian representations with dynamic scene modeling and future scenario synthesis in dynamic environments. GaussianPrediction can forecast future states from any viewpoint, using video observations of dynamic scenes. To this end, we first propose a 3D Gaussian canonical space with deformation modeling to capture the appearance and geometry of dynamic scenes, and integrate the lifecycle property into Gaussians for irreversible deformations. To make the prediction feasible and efficient, a concentric motion distillation approach is developed by distilling the scene motion with key points. Finally, a Graph Convolutional Network is employed to predict the motions of key points, enabling the rendering of photorealistic images of future scenarios. Our framework shows outstanding performance on both synthetic and real-world datasets, demonstrating its efficacy in predicting and rendering future environments.

Submitted 30 May, 2024; originally announced May 2024.

Comments: Accepted to SIGGRAPH 2024 Conference. Project Page: https://zju3dv.github.io/gaussian-prediction/

arXiv:2405.19469 [pdf, other]

Constraining Inflation with the BICEP/Keck CMB Polarization Experiments

Authors: The BICEP/Keck Collaboration: P. A. R. Ade, Z. Ahmed, M. Amiri, D. Barkats, R. Basu Thakur, C. A. Bischoff, D. Beck, J. J. Bock, H. Boenish, V. Buza, J. R. Cheshire IV, J. Connors, J. Cornelison, M. Crumrine, A. Cukierman, E. V. Denison, M. Dierickx, L. Duband, M. Eiben, B. Elwood, S. Fatigoni, J. P. Filippini, M. Gao, et al. (63 additional authors not shown)

Abstract: The BICEP/$\textit{Keck}$ (BK) series of cosmic microwave background (CMB) polarization experiments has, over the past decade and a half, produced a series of field-leading constraints on cosmic inflation via measurements of the "B-mode" polarization of the CMB. Primordial B modes are directly tied to the amplitude of primordial gravitational waves (PGW), their strength parameterized by the tensor-to-scalar ratio, $r$, and thus the energy scale of inflation. Having set the most sensitive constraints to-date on $r$, $σ(r)=0.009$ ($r_{0.05}<0.036, 95\%$ C.L.) using data through the 2018 observing season ("BK18"), the BICEP/$\textit{Keck}$ program has continued to improve its dataset in the years since. We give a brief overview of the BK program and the "BK18" result before discussing the program's ongoing efforts, including the deployment and performance of the $\textit{Keck Array}$'s successor instrument, BICEP Array, improvements to data processing and internal consistency testing, new techniques such as delensing, and how those will ultimately serve to allow BK reach $σ(r) \lesssim 0.003$ using data through the 2027 observing season.

Submitted 11 July, 2024; v1 submitted 29 May, 2024; originally announced May 2024.

Comments: 9 pages, 5 figures. Contribution to the 2024 Cosmology session of the 58th Rencontres de Moriond

arXiv:2405.18739 [pdf, other]

FlocOff: Data Heterogeneity Resilient Federated Learning with Communication-Efficient Edge Offloading

Authors: Mulei Ma, Chenyu Gong, Liekang Zeng, Yang Yang, Liantao Wu

Abstract: Federated Learning (FL) has emerged as a fundamental learning paradigm to harness massive data scattered at geo-distributed edge devices in a privacy-preserving way. Given the heterogeneous deployment of edge devices, however, their data are usually Non-IID, introducing significant challenges to FL including degraded training accuracy, intensive communication costs, and high computing complexity. Towards that, traditional approaches typically utilize adaptive mechanisms, which may suffer from scalability issues, increased computational overhead, and limited adaptability to diverse edge environments. To address that, this paper instead leverages the observation that the computation offloading involves inherent functionalities such as node matching and service correlation to achieve data reshaping and proposes Federated learning based on computing Offloading (FlocOff) framework, to address data heterogeneity and resource-constrained challenges. Specifically, FlocOff formulates the FL process with Non-IID data in edge scenarios and derives rigorous analysis on the impact of imbalanced data distribution. Based on this, FlocOff decouples the optimization in two steps, namely: (1) Minimizes the Kullback-Leibler (KL) divergence via Computation Offloading scheduling (MKL-CO); (2) Minimizes the Communication Cost through Resource Allocation (MCC-RA). Extensive experimental results demonstrate that the proposed FlocOff effectively improves model convergence and accuracy by 14.3\%-32.7\% while reducing data heterogeneity under various data distributions.

Submitted 28 May, 2024; originally announced May 2024.

arXiv:2405.18435 [pdf, other]

QUBIQ: Uncertainty Quantification for Biomedical Image Segmentation Challenge

Authors: Hongwei Bran Li, Fernando Navarro, Ivan Ezhov, Amirhossein Bayat, Dhritiman Das, Florian Kofler, Suprosanna Shit, Diana Waldmannstetter, Johannes C. Paetzold, Xiaobin Hu, Benedikt Wiestler, Lucas Zimmer, Tamaz Amiranashvili, Chinmay Prabhakar, Christoph Berger, Jonas Weidner, Michelle Alonso-Basant, Arif Rashid, Ujjwal Baid, Wesam Adel, Deniz Ali, Bhakti Baheti, Yingbin Bai, Ishaan Bhatt, Sabri Can Cetindag, et al. (55 additional authors not shown)

Abstract: Uncertainty in medical image segmentation tasks, especially inter-rater variability, arising from differences in interpretations and annotations by various experts, presents a significant challenge in achieving consistent and reliable image segmentation. This variability not only reflects the inherent complexity and subjective nature of medical image interpretation but also directly impacts the development and evaluation of automated segmentation algorithms. Accurately modeling and quantifying this variability is essential for enhancing the robustness and clinical applicability of these algorithms. We report the set-up and summarize the benchmark results of the Quantification of Uncertainties in Biomedical Image Quantification Challenge (QUBIQ), which was organized in conjunction with International Conferences on Medical Image Computing and Computer-Assisted Intervention (MICCAI) 2020 and 2021. The challenge focuses on the uncertainty quantification of medical image segmentation which considers the omnipresence of inter-rater variability in imaging datasets. The large collection of images with multi-rater annotations features various modalities such as MRI and CT; various organs such as the brain, prostate, kidney, and pancreas; and different image dimensions 2D-vs-3D. A total of 24 teams submitted different solutions to the problem, combining various baseline models, Bayesian neural networks, and ensemble model techniques. The obtained results indicate the importance of the ensemble models, as well as the need for further research to develop efficient 3D methods for uncertainty quantification methods in 3D segmentation tasks.

Submitted 24 June, 2024; v1 submitted 19 March, 2024; originally announced May 2024.

Comments: initial technical report

arXiv:2405.17245 [pdf, other]

Galaxy: A Resource-Efficient Collaborative Edge AI System for In-situ Transformer Inference

Authors: Shengyuan Ye, Jiangsu Du, Liekang Zeng, Wenzhong Ou, Xiaowen Chu, Yutong Lu, Xu Chen

Abstract: Transformer-based models have unlocked a plethora of powerful intelligent applications at the edge, such as voice assistants in smart homes. Traditional deployment approaches offload the inference workloads to a remote cloud server, which puts substantial pressure on the backbone network and raises user privacy concerns. To address that, in-situ inference has recently been recognized for edge intelligence, but it still confronts significant challenges stemming from the conflict between intensive workloads and limited on-device computing resources. In this paper, we leverage our observation that many edge environments usually comprise a rich set of accompanying trusted edge devices with idle resources and propose Galaxy, a collaborative edge AI system that breaks the resource walls across heterogeneous edge devices for efficient Transformer inference acceleration. Galaxy introduces a novel hybrid model parallelism to orchestrate collaborative inference, along with heterogeneity-aware parallelism planning for fully exploiting the resource potential. Furthermore, Galaxy devises a tile-based fine-grained overlapping of communication and computation to mitigate the impact of tensor synchronizations on inference latency under bandwidth-constrained edge environments. Extensive evaluation based on a prototype implementation demonstrates that Galaxy remarkably outperforms state-of-the-art approaches under various edge environment setups, achieving up to 2.5x end-to-end latency reduction.

Submitted 27 May, 2024; originally announced May 2024.

Comments: Accepted by IEEE International Conference on Computer Communications 2024

arXiv:2405.16773 [pdf, other]

On the origin of infrared bands attributed to tryptophan in Spitzer observations of IC 348

Authors: Aditya Dhariwal, Thomas H. Speak, Linshan Zeng, Amirhossein Rashidi, Brendan Moore, Olivier Berné, Anthony J. Remijan, Ilane Schroetter, Brett A. McGuire, Víctor M. Rivilla, Arnaud Belloche, Jes K. Jørgensen, Pavle Djuricanin, Takamasa Momose, Ilsa R. Cooke

Abstract: Infrared emission features toward interstellar gas of the IC 348 star cluster in Perseus have been recently proposed to originate from the amino acid tryptophan. The assignment was based on laboratory infrared spectra of tryptophan pressed into pellets, a method which is known to cause large frequency shifts compared to the gas phase. We assess the validity of the assignment based on the original Spitzer data as well as new data from JWST. In addition, we report new spectra of tryptophan condensed in para-hydrogen matrices to compare with the observed spectra. The JWST MIRI data do not show evidence for tryptophan, despite deeper integration toward IC 348. In addition, we show that several of the lines attributed to tryptophan are likely due to instrumental artifacts. This, combined with the new laboratory data, allows us to conclude that there is no compelling evidence for the tryptophan assignment.

Submitted 26 May, 2024; originally announced May 2024.

arXiv:2405.11323 [pdf]

High-yield fabrication of bubble-free magic-angle twisted bilayer graphene devices with high twist-angle homogeneity

Authors: J. Diez-Merida, I. Das, G. Di Battista, A. Diez-Carlon, M. Lee, L. Zeng, K. Watanabe, T. Taniguchi, E. Olsson, D. K. Efetov

Abstract: Magic-angle twisted bilayer graphene (MATBG) stands as one of the most versatile materials in condensed-matter physics due to its hosting of a wide variety of exotic phases while also offering convenient tunability. However, the fabrication of MATBG is still manual and remains a challenging and inefficient process, with devices being highly dependent on specific fabrication methods that often result in inconsistency and variability. In this work, we present an optimized protocol for the fabrication of MATBG samples, for which we use deterministic graphene anchoring to stabilize the twist-angle, and careful bubble-removal techniques to ensure a high twist-angle homogeneity. We use low-temperature transport experiments to extract the average twist-angle between pairs of leads. We find that up to 38 percent of the devices fabricated this way show square-micrometer-sized regions with a twist-angle in the range 1.1 ± 0.1 degrees and a twist-angle variation of only 0.02 degrees; in some instances such regions were as large as 36 square micrometers. We are certain that the discussed protocols can be directly transferred to non-graphene materials, and will be useful for the growing field of moiré materials.

Submitted 18 May, 2024; originally announced May 2024.

arXiv:2405.03924 [pdf, other]

NeurDB: An AI-powered Autonomous Data System

Authors: Beng Chin Ooi, Shaofeng Cai, Gang Chen, Yanyan Shen, Kian-Lee Tan, Yuncheng Wu, Xiaokui Xiao, Naili Xing, Cong Yue, Lingze Zeng, Meihui Zhang, Zhanhao Zhao

Abstract: In the wake of rapid advancements in artificial intelligence (AI), we stand on the brink of a transformative leap in data systems. The imminent fusion of AI and DB (AIxDB) promises a new generation of data systems, which will relieve the burden on end-users across all industry sectors by featuring AI-enhanced functionalities, such as personalized and automated in-database AI-powered analytics and self-driving capabilities for improved system performance. In this paper, we explore the evolution of data systems with a focus on deepening the fusion of AI and DB. We present NeurDB, an AI-powered autonomous data system designed to fully embrace AI design in each major system component and provide in-database AI-powered analytics. We outline the conceptual and architectural overview of NeurDB, discuss its design choices and key components, and report its current development and future plans.

Submitted 4 July, 2024; v1 submitted 6 May, 2024; originally announced May 2024.

arXiv:2405.00568 [pdf, other]

Powering In-Database Dynamic Model Slicing for Structured Data Analytics

Authors: Lingze Zeng, Naili Xing, Shaofeng Cai, Gang Chen, Beng Chin Ooi, Jian Pei, Yuncheng Wu

Abstract: Relational database management systems (RDBMS) are widely used for the storage and retrieval of structured data. To derive insights beyond statistical aggregation, we typically have to extract specific subdatasets from the database using conventional database operations, and then apply deep neural network (DNN) training and inference on these respective subdatasets in a separate machine learning system. The process can be prohibitively expensive, especially when there are a combinatorial number of subdatasets extracted for different analytical purposes. This calls for efficient in-database support of advanced analytical methods. In this paper, we introduce LEADS, a novel SQL-aware dynamic model slicing technique to customize models for subdatasets specified by SQL queries. LEADS improves the predictive modeling of structured data via the mixture of experts (MoE) technique and maintains inference efficiency by a SQL-aware gating network. At the core of LEADS is the construction of a general model with multiple expert sub-models via MoE trained over the entire database. This SQL-aware MoE technique scales up the modeling capacity, enhances effectiveness, and preserves efficiency by activating only necessary experts via the gating network during inference. Additionally, we introduce two regularization terms during the training process of LEADS to strike a balance between effectiveness and efficiency. We also design and build an in-database inference system, called INDICES, to support end-to-end advanced structured data analytics by non-intrusively incorporating LEADS onto PostgreSQL. Our extensive experiments on real-world datasets demonstrate that LEADS consistently outperforms baseline models, and INDICES delivers effective in-database analytics with a considerable reduction in inference latency compared to traditional solutions.

Submitted 1 May, 2024; originally announced May 2024.

arXiv:2404.17766 [pdf, other]

Implementation of Big AI Models for Wireless Networks with Collaborative Edge Computing

Authors: Liekang Zeng, Shengyuan Ye, Xu Chen, Yang Yang

Abstract: Big Artificial Intelligence (AI) models have emerged as a crucial element in various intelligent applications at the edge, such as voice assistants in smart homes and autonomous robotics in smart factories. Training big AI models, e.g., for personalized fine-tuning and continual model refinement, poses significant challenges to edge devices due to the inherent conflict between limited computing resources and the intensive workload associated with training. Given the constraints of on-device training, traditional approaches usually resort to aggregating training data and sending it to a remote cloud for centralized training. Nevertheless, this approach is neither sustainable, since it strains long-range backhaul transmission and energy-consuming datacenters, nor private, since it shares users' raw data with remote infrastructures. To address these challenges, we alternatively observe that prevalent edge environments usually contain a diverse collection of trusted edge devices with untapped idle resources, which can be leveraged for edge training acceleration. Motivated by this, in this article, we propose collaborative edge training, a novel training mechanism that orchestrates a group of trusted edge devices as a resource pool for expedited, sustainable big AI model training at the edge. As an initial step, we present a comprehensive framework for building collaborative edge training systems and analyze in depth its merits and sustainable scheduling choices following its workflow. To further investigate the impact of its parallelism design, we empirically study a case of four typical parallelisms from the perspective of energy demand with realistic testbeds. Finally, we discuss open challenges for sustainable collaborative edge training to point to future directions of edge-centric big AI model training.

Submitted 26 April, 2024; originally announced April 2024.

arXiv:2404.13692 [pdf, other]

A sustainable development perspective on urban-scale roof greening priorities and benefits

Authors: Jie Shao, Wei Yao, Lei Luo, Linzhou Zeng, Zhiyi He, Puzuo Wang, Huadong Guo

Abstract: Greenspaces are tightly linked to human well-being. Yet, rapid urbanization has exacerbated inequality in greenspace exposure and the decline in quality of life. Roof greening has been recognized as an effective strategy to mitigate these negative impacts. Understanding priorities and benefits is crucial to promoting green roofs. Here, using geospatial big data, we conduct an urban-scale assessment of roof greening at a single building level in Hong Kong from a sustainable development perspective. We identify that 85.3% of buildings reveal potential and urgent demand for roof greening. We further find green roofs could increase greenspace exposure by ~61% and produce hundreds of millions (HK$) in economic benefits annually but play a small role in urban heat mitigation (~0.15 °C) and annual carbon emission offsets (~0.8%). Our study offers a comprehensive assessment of roof greening, which could provide reference for sustainable development in cities worldwide, from data utilization to solutions and findings.

Submitted 21 April, 2024; originally announced April 2024.

arXiv:2404.12036 [pdf, other]

Exploring the Premelting Transition through Molecular Simulations Powered by Neural Network Potentials

Authors: Limin Zeng, Ang Gao

Abstract: The system has addressed the error of "Bad character(s) in field Abstract" for no reason. Please refer to manuscript for the full abstract.

Submitted 18 April, 2024; originally announced April 2024.

Comments: 10 pages, 6 figures

arXiv:2404.08681 [pdf, other]

EFSA: Towards Event-Level Financial Sentiment Analysis

Authors: Tianyu Chen, Yiming Zhang, Guoxin Yu, Dapeng Zhang, Li Zeng, Qing He, Xiang Ao

Abstract: In this paper, we extend financial sentiment analysis (FSA) to the event level, since events usually serve as the subject of the sentiment in financial text. Though extracting events from financial text may be conducive to accurate sentiment predictions, it poses specialized challenges due to the length and discontinuity of events in a financial text. To this end, we reconceptualize event extraction as a classification task by designing a categorization comprising coarse-grained and fine-grained event categories. Under this setting, we formulate the Event-Level Financial Sentiment Analysis (EFSA for short) task, which outputs quintuples consisting of (company, industry, coarse-grained event, fine-grained event, sentiment) from financial text. A large-scale Chinese dataset containing 12,160 news articles and 13,725 quintuples is released as a brand new testbed for our task. A four-hop Chain-of-Thought LLM-based approach is devised for this task. Systematic investigations are conducted on our dataset, and the empirical results provide benchmark scores for existing methods and show that our proposed method reaches the current state of the art. Our dataset and framework implementation are available at https://anonymous.4open.science/r/EFSA-645E

Submitted 8 April, 2024; originally announced April 2024.

arXiv:2404.04661 [pdf, other]

Transform then Explore: a Simple and Effective Technique for Exploratory Combinatorial Optimization with Reinforcement Learning

Authors: Tianle Pu, Changjun Fan, Mutian Shen, Yizhou Lu, Li Zeng, Zohar Nussinov, Chao Chen, Zhong Liu

Abstract: Many complex problems encountered in both production and daily life can be conceptualized as combinatorial optimization problems (COPs) over graphs. In recent years, reinforcement learning (RL) based models have emerged as a promising direction, treating COP solving as a heuristic learning problem. However, current finite-horizon-MDP based RL models have inherent limitations. They are not allowed to explore adequately to improve solutions at test time, which may be necessary given the complexity of NP-hard optimization tasks. Some recent attempts address this issue by focusing on reward design and state feature engineering, which are tedious and ad hoc. In this work, we instead propose a much simpler but more effective technique, named gauge transformation (GT). The technique originates from physics, but is very effective in enabling RL agents to explore and continuously improve solutions at test time. Moreover, GT is very simple: it can be implemented in fewer than 10 lines of Python code and can be applied to a vast majority of RL models. Experimentally, we show that traditional RL models with the GT technique achieve state-of-the-art performance on the MaxCut problem. Furthermore, since GT is independent of any RL model, it can be seamlessly integrated into various RL frameworks, paving the way for more effective exploration by these models in solving general COPs.

Submitted 6 April, 2024; originally announced April 2024.
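
The abstract above says the gauge transformation comes from physics and fits in a few lines of Python. The sketch below is not the authors' code; it only illustrates the textbook gauge symmetry of an Ising-type objective, which is assumed here to be the underlying idea. Flipping spins by a sign vector t while rescaling each coupling J[i][j] by t[i]*t[j] leaves the energy unchanged, so choosing t equal to the current solution maps that solution onto the all-ones state of an equivalent problem. The function names `ising_energy` and `gauge_transform` are hypothetical.

```python
def ising_energy(J, s):
    """Energy -sum_{i<j} J[i][j] * s[i] * s[j] for spins s[i] in {-1, +1}."""
    n = len(s)
    return -sum(J[i][j] * s[i] * s[j] for i in range(n) for j in range(i + 1, n))


def gauge_transform(J, s, t):
    """Apply the gauge t (entries in {-1, +1}) to couplings and spins.

    Couplings become J[i][j] * t[i] * t[j] and spins become s[i] * t[i].
    Because t[i] ** 2 == 1, the energy of the transformed pair is unchanged.
    """
    n = len(s)
    J_new = [[J[i][j] * t[i] * t[j] for j in range(n)] for i in range(n)]
    s_new = [s[i] * t[i] for i in range(n)]
    return J_new, s_new


# Illustrative 3-node instance (made-up couplings, symmetric, zero diagonal).
J = [[0, 1, -2],
     [1, 0, 3],
     [-2, 3, 0]]
s = [1, -1, 1]

# Choosing t = s maps the current solution onto the all-ones state.
J_gauged, s_gauged = gauge_transform(J, s, s)
```

In this reading, an agent that always starts from the all-ones state could be restarted on `J_gauged` and keep searching from where it stopped; how the paper wires this into an RL loop is not stated in the abstract.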

arXiv:2404.01875 [pdf, other]

Satellite Federated Edge Learning: Architecture Design and Convergence Analysis

Authors: Yuanming Shi, Li Zeng, Jingyang Zhu, Yong Zhou, Chunxiao Jiang, Khaled B. Letaief

Abstract: The proliferation of low-earth-orbit (LEO) satellite networks leads to the generation of vast volumes of remote sensing data which is traditionally transferred to the ground server for centralized processing, raising privacy and bandwidth concerns. Federated edge learning (FEEL), as a distributed machine learning approach, has the potential to address these challenges by sharing only model parameters instead of raw data. Although promising, the dynamics of LEO networks, characterized by the high mobility of satellites and short ground-to-satellite link (GSL) duration, pose unique challenges for FEEL. Notably, frequent model transmission between the satellites and ground incurs prolonged waiting time and large transmission latency. This paper introduces a novel FEEL algorithm, named FEDMEGA, tailored to LEO mega-constellation networks. By integrating inter-satellite links (ISL) for intra-orbit model aggregation, the proposed algorithm significantly reduces the usage of low data rate and intermittent GSL. Our proposed method includes a ring all-reduce based intra-orbit aggregation mechanism, coupled with a network flow-based transmission scheme for global model aggregation, which enhances transmission efficiency. Theoretical convergence analysis is provided to characterize the algorithm performance. Extensive simulations show that our FEDMEGA algorithm outperforms existing satellite FEEL algorithms, exhibiting an approximate 30% improvement in convergence rate.

Submitted 2 April, 2024; originally announced April 2024.

Comments: 16 pages, 15 figures
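
The abstract above mentions a ring all-reduce based intra-orbit aggregation mechanism. As a rough illustration of generic ring all-reduce (not FEDMEGA itself), the sketch below simulates n workers arranged in a ring, each holding a parameter vector, and averages the vectors with a reduce-scatter phase followed by an all-gather phase. The function name `ring_all_reduce_mean` and the in-memory message passing are assumptions made for this example; real satellites would exchange chunks over inter-satellite links.

```python
def ring_all_reduce_mean(models):
    """Average equal-length vectors with a simulated ring all-reduce.

    Worker i only ever sends to worker (i + 1) % n. Each vector is split
    into n chunks; every phase takes n - 1 steps of one chunk per worker.
    """
    n = len(models)
    dim = len(models[0])
    bounds = [(c * dim) // n for c in range(n + 1)]  # chunk boundaries
    bufs = [list(m) for m in models]                 # per-worker buffers

    def chunk(c):
        return range(bounds[c], bounds[c + 1])

    # Reduce-scatter: after n - 1 steps, worker i holds the full sum
    # of chunk (i + 1) % n.
    for step in range(n - 1):
        # Snapshot all outgoing messages first so sends are simultaneous.
        msgs = [((i - step) % n, None) for i in range(n)]
        msgs = [(c, [bufs[i][k] for k in chunk(c)]) for i, (c, _) in enumerate(msgs)]
        for i, (c, payload) in enumerate(msgs):
            dst = (i + 1) % n
            for k, v in zip(chunk(c), payload):
                bufs[dst][k] += v

    # All-gather: circulate the fully reduced chunks around the ring.
    for step in range(n - 1):
        msgs = [((i + 1 - step) % n, None) for i in range(n)]
        msgs = [(c, [bufs[i][k] for k in chunk(c)]) for i, (c, _) in enumerate(msgs)]
        for i, (c, payload) in enumerate(msgs):
            dst = (i + 1) % n
            for k, v in zip(chunk(c), payload):
                bufs[dst][k] = v

    # Every worker now holds the full sum; divide to get the mean.
    return [[v / n for v in b] for b in bufs]


# Three hypothetical satellites in one orbit, each with a 4-parameter model.
local_models = [[1.0, 2.0, 3.0, 4.0],
                [3.0, 4.0, 5.0, 6.0],
                [5.0, 6.0, 7.0, 8.0]]
averaged = ring_all_reduce_mean(local_models)
```

Each worker transmits roughly 2(n - 1)/n of one model's size in total, regardless of n, which is the usual reason for preferring a ring over sending every model to a single aggregator.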

arXiv:2404.01050 [pdf, other]

Drag Your Noise: Interactive Point-based Editing via Diffusion Semantic Propagation

Authors: Haofeng Liu, Chenshu Xu, Yifei Yang, Lihua Zeng, Shengfeng He

Abstract: Point-based interactive editing serves as an essential tool to complement the controllability of existing generative models. A concurrent work, DragDiffusion, updates the diffusion latent map in response to user inputs, causing global latent map alterations. This results in imprecise preservation of the original content and unsuccessful editing due to gradient vanishing. In contrast, we present DragNoise, offering robust and accelerated editing without retracing the latent map. The core rationale of DragNoise lies in utilizing the predicted noise output of each U-Net as a semantic editor. This approach is grounded in two critical observations: firstly, the bottleneck features of U-Net inherently possess semantically rich features ideal for interactive editing; secondly, high-level semantics, established early in the denoising process, show minimal variation in subsequent stages. Leveraging these insights, DragNoise edits diffusion semantics in a single denoising step and efficiently propagates these changes, ensuring stability and efficiency in diffusion editing. Comparative experiments reveal that DragNoise achieves superior control and semantic retention, reducing the optimization time by over 50% compared to DragDiffusion. Our code is available at https://github.com/haofengl/DragNoise.

Submitted 1 April, 2024; originally announced April 2024.

Comments: Accepted by CVPR 2024

arXiv:2403.16283 [pdf, other]

Sample Empirical Likelihood Methods for Causal Inference

Authors: Jingyue Huang, Changbao Wu, Leilei Zeng

Abstract: Causal inference is crucial for understanding the true impact of interventions, policies, or actions, enabling informed decision-making and providing insights into the underlying mechanisms that shape our world. In this paper, we establish a framework for the estimation and inference of average treatment effects using a two-sample empirical likelihood function. Two different approaches to incorporating propensity scores are developed. The first approach introduces propensity-score-calibrated constraints in addition to the standard model-calibration constraints; the second approach uses the propensity scores to form weighted versions of the model-calibration constraints. The resulting estimators from both approaches are doubly robust. The limiting distributions of the two-sample empirical likelihood ratio statistics are derived, facilitating the construction of confidence intervals and hypothesis tests for the average treatment effect. Bootstrap methods for constructing sample empirical likelihood ratio confidence intervals are also discussed for both approaches. Finite sample performances of the methods are investigated through simulation studies.

Submitted 24 March, 2024; originally announced March 2024.

arXiv:2403.10003 [pdf, ps, other]

An improved light-cone harmonic oscillator model for the $φ$-meson longitudinal leading-twist light-cone distribution amplitude

Authors: Dan-Dan Hu, Xing-Gang Wu, Long Zeng, Hai-Bing Fu, Tao Zhong

Abstract: In the present paper, we study the properties of the $φ$-meson longitudinal leading-twist light-cone distribution amplitude $φ_{2;φ}^{\|}(x,μ)$ by starting from a light-cone harmonic oscillator model for its wavefunction. To fix the input parameters, we derive the first ten $ξ$-moments of $φ_{2;φ}^{\|}(x,μ)$ by using the QCD sum rules approach under the background field theory. The shape of $φ_{2;φ}^{\|}(x,μ=2~{\rm GeV})$ tends toward single-peaked behavior, which is consistent with the latest Lattice QCD result. As an application, we derive the $D^+_s \to φ$ transition form factors (TFFs) by using the light-cone sum rules approach. At the large recoil point, we obtain $A_1(0) = 0.512_{-0.020}^{+0.030}$, $A_2(0) = 0.402_{-0.067}^{+0.078}$, $A_0(0) = 0.596_{-0.020}^{+0.025}$ and $V(0) = 0.882_{-0.036}^{+0.040}$. As for the two typical ratios $γ_V$ and $γ_2$, we obtain $γ_V = 1.723_{-0.021}^{+0.023}$ and $γ_2 = 0.785_{-0.104}^{+0.100}$. After extrapolating those TFFs to the physically allowable region, we then obtain the transverse, longitudinal and total decay widths for the semi-leptonic decay $D^+_s\toφ\ell^+ν_{\ell}$. The resulting branching fractions are ${\cal B}(D^+_s\to φe^+ν_e) = (2.367_{-0.132}^{+0.256})\times 10^{-3}$ and ${\cal B}(D^+_s\to φμ^+ν_μ) = (2.349_{-0.132}^{+0.255})\times 10^{-3}$, which show good agreement with the data issued by the BESIII, CLEO, and BABAR Collaborations. We finally calculate the $D^+_s\toφ\ell^+ ν_\ell$ polarization and asymmetry parameters.

Submitted 15 March, 2024; originally announced March 2024.

Comments: 32 pages, 7 figures, comments welcome

arXiv:2403.09317 [pdf, other]

SD-Net: Symmetric-Aware Keypoint Prediction and Domain Adaptation for 6D Pose Estimation In Bin-picking Scenarios

Authors: Ding-Tao Huang, En-Te Lin, Lipeng Chen, Li-Fu Liu, Long Zeng

Abstract: Despite the success in 6D pose estimation in bin-picking scenarios, existing methods still struggle to produce accurate prediction results for symmetric objects and real-world scenarios. The primary bottlenecks include 1) ambiguous keypoints caused by object symmetries; 2) the domain gap between real and synthetic data. To circumvent these problems, we propose a new 6D pose estimation network with symmetric-aware keypoint prediction and self-training domain adaptation (SD-Net). SD-Net builds on pointwise keypoint regression and deep Hough voting to perform reliable keypoint detection under clutter and occlusion. Specifically, at the keypoint prediction stage, we design a robust 3D keypoint selection strategy considering the symmetry class of objects and equivalent keypoints, which facilitates locating 3D keypoints even in highly occluded scenes. Additionally, we build an effective filtering algorithm on predicted keypoints to dynamically eliminate ambiguous and outlier keypoint candidates. At the domain adaptation stage, we propose a self-training framework using a student-teacher training scheme. To carefully distinguish reliable predictions, we harness a tailored heuristic for 3D geometry pseudo labelling based on semi-chamfer distance. On the public Siléane dataset, SD-Net achieves state-of-the-art results, obtaining an average precision of 96%. When testing learning and generalization abilities on the public Parametric datasets, SD-Net is 8% higher than the state-of-the-art method. The code is available at https://github.com/dingthuang/SD-Net.

Submitted 14 March, 2024; originally announced March 2024.

arXiv:2403.05115 [pdf]

Superconductivity of the New Medium-Entropy Alloy V4Ti2W with a Body-Centered Cubic Structure

Authors: Kuan Li, Weijie Lin, Ruixin Guo, Shu Guo, Lingyong Zeng, Longfu Li, Peifeng Yu, Kangwang Wang, Chao Zhang, Huixia Luo

Abstract: Medium- and high-entropy alloy (MEA and HEA) superconductors have attracted considerable interest since their discovery. This paper reports the superconducting properties of the ternary tungsten-containing MEA V4Ti2W for the first time. V4Ti2W is a type II superconductor with a body-centered cubic (BCC) structure. Experimental results of resistivity, magnetization, and heat capacity indicate that the superconducting transition temperature of the MEA V4Ti2W is roughly 5.0 K. The upper and lower critical magnetic fields are 9.93(2) T and 40.7(3) mT, respectively. Interestingly, few BCC MEA superconductors with a valence electron count (VEC) greater than 4.8 have been found. The addition of tungsten leads to a VEC of 4.83 e/a for V4Ti2W, one of the rare cases exceeding 4.8. Adding tungsten expands the variety of MEAs, which may improve the microstructure and mechanical properties of the materials and even their superconducting properties. This material could potentially offer a new platform for the investigation of innovative MEA and HEA superconductors.

Submitted 8 March, 2024; originally announced March 2024.

Comments: 20 pages, 6 figures

Journal ref: Materials Today Communications, 2024, 38, 108444

arXiv:2403.01767 [pdf, other]

doi 10.1109/ICASSP48485.2024.10447643

KeNet: Knowledge-enhanced Doc-Label Attention Network for Multi-label text classification

Authors: Bo Li, Yuyan Chen, Liang Zeng

Abstract: Multi-Label Text Classification (MLTC) is a fundamental task in the field of Natural Language Processing (NLP) that involves the assignment of multiple labels to a given text. MLTC has gained significant importance and has been widely applied in various domains such as topic recognition, recommendation systems, sentiment analysis, and information retrieval. However, traditional machine learning and deep neural networks have not yet addressed certain issues, such as the fact that some documents are brief but have a large number of labels, and how to establish relationships between the labels. It is also important to acknowledge that knowledge plays a significant role in MLTC. To address these issues, we provide a novel approach known as Knowledge-enhanced Doc-Label Attention Network (KeNet). Specifically, we design an Attention Network that incorporates external knowledge, label embedding, and a comprehensive attention mechanism. In contrast to conventional methods, we use a comprehensive representation of documents, knowledge and labels to predict all labels for each single text. Our approach has been validated by comprehensive research conducted on three multi-label datasets. Experimental results demonstrate that our method outperforms state-of-the-art MLTC methods. Additionally, a case study is undertaken to illustrate the practical implementation of KeNet.

Submitted 4 March, 2024; originally announced March 2024.

Comments: Accepted in ICASSP 2024

arXiv:2403.01198 [pdf]

Organic solvent boosts charge storage and charging dynamics of conductive MOF supercapacitors

Authors: Ming Chen, Taizheng Wu, Liang Niu, Ting Ye, Wenlei Dai, Liang Zeng, Alexei A. Kornyshev, Zhenxiang Wang, Zhou Liu, Guang Feng

Abstract: Conductive metal-organic frameworks (c-MOFs) and ionic liquids (ILs) have emerged as auspicious combinations for high-performance supercapacitors. However, the nanoconfinement from c-MOFs and high viscosity of ILs slow down the charging process. This hindrance can, however, be resolved by adding solvent. Here, we performed constant-potential molecular simulations to scrutinize the solvent impact on charge storage and charging dynamics of MOF-IL-based supercapacitors. We find conditions for >100% enhancement in capacity and ~6 times increase in charging speed. These improvements were confirmed by synthesizing near-ideal c-MOFs and developing multiscale models linking molecular simulations to electrochemical measurements. Fundamentally, our findings elucidate that the solvent acts as an ionophobic agent to induce a substantial enhancement in charge storage, and as an ion traffic police to eliminate convoluted counterion and co-ion motion paths and create two distinct ion transport highways to accelerate charging dynamics. This work paves the way for the optimal design of MOF supercapacitors.

Submitted 2 March, 2024; originally announced March 2024.

arXiv:2403.00723 [pdf, other]

Characterization of process-related interfacial dielectric loss in aluminum-on-silicon by resonator microwave measurements, materials analysis, and imaging

Authors: Lert Chayanun, Janka Biznárová, Lunjie Zeng, Per Malmberg, Andreas Nylander, Amr Osman, Marcus Rommel, Pui Lam Tam, Eva Olsson, August Yurgens, Jonas Bylander, Anita Fadavi Roudsari

Abstract: We systematically investigate the influence of the fabrication process on dielectric loss in aluminum-on-silicon superconducting coplanar waveguide resonators with internal quality factors ($Q_i$) of about one million at the single-photon level. These devices are essential components in superconducting quantum processors; they also serve as proxies for understanding the energy loss of superconducting qubits. By systematically varying several fabrication steps, we identify the relative importance of reducing loss at the substrate-metal and the substrate-air interfaces. We find that it is essential to clean the silicon substrate in hydrogen fluoride (HF) prior to aluminum deposition. A post-fabrication removal of the oxides on the surface of the silicon substrate and the aluminum film by immersion in HF further improves the $Q_i$. We observe a small, but noticeable, adverse effect on the loss by omitting either standard cleaning (SC1), pre-deposition heating of the substrate to 300$°$C, or in-situ post-deposition oxidation of the film's top surface. We find no improvement due to excessive pumping meant to reach a background pressure below $6{\times} 10^{-8}$ mbar. We correlate the measured loss with microscopic properties of the substrate-metal interface through characterization with X-ray photoelectron spectroscopy (XPS), time-of-flight secondary ion mass spectroscopy (ToF-SIMS), transmission electron microscopy (TEM), energy-dispersive X-ray spectroscopy (EDS), and atomic force microscopy (AFM).

Submitted 1 March, 2024; originally announced March 2024.

Comments: 22 pages, 11 figures

arXiv:2403.00331 [pdf, other]

WindGP: Efficient Graph Partitioning on Heterogenous Machines

Authors: Li Zeng, Haohan Huang, Binfan Zheng, Kang Yang, Shengcheng Shao, Jinhua Zhou, Jun Xie, Rongqian Zhao, Xin Chen

Abstract: Graph Partitioning is widely used in many real-world applications such as fraud detection and social network analysis, in order to enable the distributed graph computing on large graphs. However, existing works fail to balance the computation cost and communication cost on machines with different power (including computing capability, network bandwidth and memory size), as they only consider replication factor and neglect the difference of machines in realistic data centers. In this paper, we propose a general graph partitioning algorithm WindGP, which can support fast and high-quality edge partitioning on heterogeneous machines. WindGP designs novel preprocessing techniques to simplify the metric and balance the computation cost according to the characteristics of graphs and machines. Also, best-first search is proposed instead of BFS and DFS, in order to generate clusters with high cohesion. Furthermore, WindGP adaptively tunes the partition results by sophisticated local search methods. Extensive experiments show that WindGP outperforms all state-of-the-art partition methods by 1.35 - 27 times on both dense and sparse distributed graph algorithms, and has good scalability with graph size and machine number.

Submitted 6 March, 2024; v1 submitted 1 March, 2024; originally announced March 2024.

Comments: 19 pages, 15 figures, 18 tables

arXiv:2402.14247 [pdf, ps, other]

Spectrum of the Dirac operator on Compact Riemannian Manifolds

Authors: Lingzhong Zeng

Abstract: In this paper, we consider the eigenvalue problem of the Dirac operator on a compact Riemannian manifold isometrically immersed into Euclidean space and derive some extrinsic estimates for the sum of arbitrary consecutive $n$ eigenvalues of the square of the Dirac operator acting on some Dirac invariant subbundles. As applications, we deduce some eigenvalue inequalities on compact submanifolds immersed into Euclidean space, the unit sphere or projective spaces, and further obtain some bounds of general Reilly type. In addition, we establish some universal bounds under a certain curvature condition and, in the meantime, provide an alternative proof of Anghel's result. In particular, utilizing the Atiyah-Singer index theorem, we derive an upper bound estimate for the sum of the first $n$ nontrivial eigenvalues of the Atiyah-Singer Laplacian acting on spin manifolds without dimensional assumption.

Submitted 21 February, 2024; originally announced February 2024.

Comments: 28 pages

arXiv:2402.12772 [pdf, other]

GazePrompt: Enhancing Low Vision People's Reading Experience with Gaze-Aware Augmentations

Authors: Ru Wang, Zach Potter, Yun Ho, Daniel Killough, Linxiu Zeng, Sanbrita Mondal, Yuhang Zhao

Abstract: Reading is a challenging task for low vision people. While conventional low vision aids (e.g., magnification) offer certain support, they cannot fully address the difficulties faced by low vision users, such as locating the next line and distinguishing similar words. To fill this gap, we present GazePrompt, a gaze-aware reading aid that provides timely and targeted visual and audio augmentations based on users' gaze behaviors. GazePrompt includes two key features: (1) a Line-Switching support that highlights the line a reader intends to read; and (2) a Difficult-Word support that magnifies or reads aloud a word that the reader hesitates with. Through a study with 13 low vision participants who performed well-controlled reading-aloud tasks with and without GazePrompt, we found that GazePrompt significantly reduced participants' line switching time, reduced word recognition errors, and improved their subjective reading experiences. A follow-up silent-reading study showed that GazePrompt can enhance users' concentration and perceived comprehension of the reading contents. We further derive design considerations for future gaze-based low vision aids.

Submitted 22 February, 2024; v1 submitted 20 February, 2024; originally announced February 2024.

ACM Class: H.5.2

arXiv:2402.09444 [pdf, other]

doi 10.1109/TIP.2024.3362135

Multimodal Action Quality Assessment

Authors: Ling-An Zeng, Wei-Shi Zheng

Abstract: Action quality assessment (AQA) is to assess how well an action is performed. Previous works perform modelling by only the use of visual information, ignoring audio information. We argue that although AQA is highly dependent on visual information, the audio is useful complementary information for improving the score regression accuracy, especially for sports with background music, such as figure skating and rhythmic gymnastics. To leverage multimodal information for AQA, i.e., RGB, optical flow and audio information, we propose a Progressive Adaptive Multimodal Fusion Network (PAMFN) that separately models modality-specific information and mixed-modality information. Our model consists of three modality-specific branches that independently explore modality-specific information and a mixed-modality branch that progressively aggregates the modality-specific information from the modality-specific branches. To build the bridge between modality-specific branches and the mixed-modality branch, three novel modules are proposed. First, a Modality-specific Feature Decoder module is designed to selectively transfer modality-specific information to the mixed-modality branch. Second, when exploring the interaction between modality-specific information, we argue that using an invariant multimodal fusion policy may lead to suboptimal results, so as to take the potential diversity in different parts of an action into consideration. Therefore, an Adaptive Fusion Module is proposed to learn adaptive multimodal fusion policies in different parts of an action. This module consists of several FusionNets for exploring different multimodal fusion strategies and a PolicyNet for deciding which FusionNets are enabled. Third, a module called Cross-modal Feature Decoder is designed to transfer cross-modal features generated by Adaptive Fusion Module to the mixed-modality branch.

Submitted 20 February, 2024; v1 submitted 31 January, 2024; originally announced February 2024.

Comments: IEEE Transactions on Image Processing 2024

ACM Class: I.2.10

arXiv:2402.07959 [pdf]

The Shifting Impact of Recurrent Flooding on Transportation Accessibility: A Case Study of Affected Populations in The Hampton Roads Region

Authors: Luwei Zeng, T. Donna Chen, John S. Miller, Faria Tuz Zahura, Jonathan L. Goodall

Abstract: Accelerated sea level rise has resulted in recurrent flooding in coastal regions, increasingly impacting both transportation systems and local populations. Using the Hampton Roads region in Virginia as a case study, this study (a) identifies hotspots with frequent, significant accessibility reduction for work and nonwork travel utilizing crowdsourced WAZE flood report data during the month of August over 5 years, 2018 to 2022; and (b) examines the shifts in social vulnerability in populations residing in these hotspots over the 5 year period using 2016 and 2021 American Community Survey data. Results show that approximately 12 percent and 3 percent of the population of the region reside in hotspots experiencing significant recurrent flooding-induced accessibility reduction for work and nonwork trips, respectively. Social vulnerability analysis revealed that populations with greater socioeconomic and transportation vulnerabilities are more susceptible to recurrent flooding-induced accessibility impacts in terms of both extent and frequency. Furthermore, a comparison of social vulnerability indices between 2016 and 2021 shows an increasing trend of social vulnerability for highly impacted zones, with low income, disabled, and households with young children having restricted ability to relocate from these zones. The findings reinforce the necessity for spatially and temporally disaggregated studies of climate event impacts. Furthermore, the longer term population trends highlight the importance of dynamic assessment of climate event impacts at different time scales.

Submitted 10 February, 2024; originally announced February 2024.

arXiv:2402.06789 [pdf]

ELM-free Enhanced Dα H-mode with Near Zero NBI Torque Injection in DIII-D Tokamak

Authors: T. Macwan, K. Barada, J. F. Parisi, R. Groebner, T. L. Rhodes, S. Banerjee, C. Chrystal, Q. Pratt, Z. Yan, H. Wang, L. Zeng, M. E. Austin, N. A. Crocker, W. A. Peebles

Abstract: Enhanced $D_α$ H-mode (EDA H-mode), an ELM-free H-mode regime, is explored in neutral beam heated, lower single null plasmas with near zero torque injection. This regime exhibits a good energy confinement ($\mathrm{H}_{\mathrm{98y2}}$ $\sim 1$) with $β_N \sim 2$, high density, regime access at low input power, and no ELMs. This paper further presents the time-resolved measurements of electron and ion density, temperature, plasma rotation, and radial electric field during the EDA H-mode phase and examines the dynamics of the edge quasi-coherent mode (QCM). Measurements using multiple fluctuation diagnostics reveal the QCM to be a separatrix spanning mode, peaking just inside the separatrix, existing in a wide range of $k_{\perp}ρ_s \sim 0.1-1.2$ with multiple harmonics, and propagating with a very small phase velocity in the plasma frame, where $k_{\perp}$ is the binormal wavenumber and $ρ_s$ is the ion sound radius. Linear gyrokinetic simulations of an EDA H-mode discharge with CGYRO indicate that the trapped electron mode (TEM) and electron temperature gradient (ETG) are dominant instabilities in the region where QCM is unstable. Qualitative analysis indicates that the properties of TEM are consistent with the experimentally observed characteristics of the QCM. These similarities suggest that the QCM might be a TEM instability existing in the edge region of the EDA H-mode plasmas.

Submitted 9 February, 2024; originally announced February 2024.

Comments: 28 pages, 15 figures, 1 table

arXiv:2402.03335 [pdf]

Assessing The Spatially Heterogeneous Impact of Recurrent Flooding On Accessibility: A Case Study of The Hampton Roads Region: Part 2 Transit Accessibility

Authors: Luwei Zeng, T. Donna Chen, John S. Miller, Jonathan L. Goodall, Faria Tuz Zahura

Abstract: Due to accelerated sea level rise and climate change, the transportation system is increasingly affected by recurrent flooding in coastal regions, yet the cumulative travel disruption effects are not well understood. In Part 1 of this study, the accessibility impacts of recurrent flooding on the auto mode were examined. In this paper (Part 2 of the study), the impact of recurrent flooding on transit service accessibility was quantified with the aid of spatially and temporally disaggregated crowdsourced flood incident data from WAZE. A fixed route transit network is built for five time of day periods for 710 traffic analysis zones (TAZs), to capture the spatial and temporal variation of transit accessibility reduction due to recurrent flooding. Results show that the greatest transit accessibility reduction occurs during the morning peak hour, with individual TAZ transit accessibility reduction ranging from 0 to 88.2% for work trips (with an average of 6.4%) and ranging from 0 to 99.9% for non-work trips (with an average of 3.7%). Furthermore, social vulnerability analysis indicates that TAZs with a greater share of people with higher vulnerability in transportation and socioeconomic status are more likely to experience recurrent flooding-induced transit accessibility reduction. Results from this study reinforce the notion that transportation impacts under recurrent flooding are not uniformly experienced throughout a region, and this spatial and temporal variation translates to different impacts borne by various population groups. Disaggregate impact analysis like this study can support transportation engineers and planners to prioritize resources to ensure equitable transit accessibility under increasing climate disruptions.

Submitted 12 January, 2024; originally announced February 2024.

Comments: Under review of the Journal of Transport Geography

arXiv:2402.03334 [pdf]

Assessing The Spatially Heterogeneous Transportation Impacts of Recurrent Flooding in The Hampton Roads Region: Part 1 Auto Accessibility

Authors: Luwei Zeng, T. Donna Chen, John S. Miller, Jonathan L. Goodall, Faria Tuz Zahura

Abstract: Recurrent flooding has increased rapidly in coastal regions due to sea level rise and climate change. A key metric for evaluating transportation system degradation is accessibility, yet the lack of temporally and spatially disaggregate data means that the impact of recurrent flooding on accessibility, and hence transportation system performance, is not well understood. Using crowdsourced WAZE flood incident data from the Hampton Roads region in Virginia, this study (Part 1) examines changes in the roadway network accessibility for travelers residing in 1,113 traffic analysis zones (TAZs) across five time of day periods. Additionally, a social vulnerability index framework is developed to understand the socioeconomic characteristics of TAZs that experience high accessibility reduction under recurrent flooding. Results show that TAZs experience the most accessibility reduction under recurrent flooding during the morning peak period (6 to 9am) with large differences across different zones, ranging from 0 to 49.6 percent for work trips (with population weighted mean reduction of 1.71 percent) and 0 to 87.9 percent for nonwork trips (with population weighted mean reduction of 0.81 percent). Furthermore, the social vulnerability analysis showed that zones with higher percentages of lower socioeconomic status, unemployed, less educated, and limited English proficiency residents experience greater accessibility reduction for work trips. In contrast to previous studies that aggregate the effects of recurrent flooding across a city, these results demonstrate that there exists large spatial and temporal variation in recurrent flooding's impacts on accessibility. This study also highlights the need to include social vulnerability analysis in assessing impacts of climate events, to ensure equitable outcomes as investments are made to create resilient transportation infrastructure.

Submitted 12 January, 2024; originally announced February 2024.

Comments: Under review of the Journal of Transport Geography

arXiv:2401.17145 [pdf]

Moment-Tensor-Based Constant-Potential Modeling of Electrical Double Layers

Authors: Zhenxiang Wang, Ming Chen, Jiedu Wu, Xiangyu Ji, Liang Zeng, Jiaxing Peng, Jiawei Yan, Alexei A. Kornyshev, Bingwei Mao, Guang Feng

Abstract: Constant-potential molecular dynamics (MD) simulations are indispensable for understanding the capacitance, structure, and dynamics of electrical double layers (EDLs) at the atomistic level. However, the classical constant-potential method, relying on the so-called 'floating charges' to keep electrode equipotential, overlooks quantum effects on the electrode and always underestimates EDL capacitance for typical electrochemical systems featuring metal electrodes in aqueous electrolytes. Here, we propose a universal theoretical framework as moment-tensor-based constant potential method (mCPM) to capture electronic structure variations with electric moments. For EDLs at Au(111) electrodes, mCPM-based MD reveals bell-shaped capacitance curves in magnitude and shape both quantitatively consistent with experiments. It further unveils the potential-dependent local electric fields, agreeing with experimental observations of redshift vibration of interfacial water under negative polarization and predicting a blueshift under positive polarization, and identifies geometry dependence of two time scales during EDL formation.

Submitted 30 January, 2024; originally announced January 2024.

arXiv:2401.13920 [pdf, other]

LocMoE: A Low-Overhead MoE for Large Language Model Training

Authors: Jing Li, Zhijie Sun, Xuan He, Li Zeng, Yi Lin, Entong Li, Binfan Zheng, Rongqian Zhao, Xin Chen

Abstract: The Mixtures-of-Experts (MoE) model is a widespread distributed and integrated learning method for large language models (LLM), which is favored due to its ability to sparsify and expand models efficiently. However, the performance of MoE is limited by load imbalance and high latency of All-to-All communication, along with relatively redundant computation owing to large expert capacity. Load imbalance may result from existing routing policies that consistently tend to select certain experts. The frequent inter-node communication in the All-to-All procedure also significantly prolongs the training time. To alleviate the above performance problems, we propose a novel routing strategy that combines load balance and locality by converting partial inter-node communication to that of intra-node. Notably, we elucidate that there is a minimum threshold for expert capacity, calculated through the maximal angular deviation between the gating weights of the experts and the assigned tokens. We port these modifications on the PanGu-Sigma model based on the MindSpore framework with multi-level routing and conduct experiments on Ascend clusters. The experiment results demonstrate that the proposed LocMoE reduces training time per epoch by 12.68% to 22.24% compared to classical routers, such as hash router and switch router, without impacting the model accuracy.

Submitted 23 May, 2024; v1 submitted 24 January, 2024; originally announced January 2024.

Comments: 1. Update the font size of all figures. 2. Update the name of the proposed layer Grouped Average Pooling (GrAP). 3. Change the order of the Section Contribution Statement

arXiv:2401.09707 [pdf]

Spring-block friction model for landslides: Application to Vaiont and Maoxian landslides

Authors: Rong Qiang Wei, Qing Li Zeng

Abstract: It is necessary to study the kinematics of a landslide prior to its failure in order to accurately estimate the time of landslide instability. Based on a spring-block model with Dieterich-Ruina friction, the kinematic displacement and velocity of a landslide along the slip surface are analyzed under the quasistatic approximation. An algebraic relationship with three parameters between the displacement (or velocity) and time is obtained, and then applied to two typical landslides: Vaiont in Italy, and Maoxian in China. The results show that the proposed spring-block friction model can describe well the kinematic data of landslides before their failure. If effective displacement data can be obtained to determine the three parameters above, this simple physical model could be used to estimate the time of landslide instability. This spring-block friction model also provides a clear physical basis for the usual inverse velocity method of landslide warning, the stick-slip of some landslides, and the scaling relationship between the number of landslides and their volume.

Submitted 29 January, 2024; v1 submitted 17 January, 2024; originally announced January 2024.

Comments: 16 pages; 7 figures; 1 Table

arXiv:2401.08472 [pdf, other]

Instilling Multi-round Thinking to Text-guided Image Generation

Authors: Lidong Zeng, Zhedong Zheng, Yinwei Wei, Tat-seng Chua

Abstract: This paper delves into the text-guided image editing task, focusing on modifying a reference image according to user-specified textual feedback to embody specific attributes. Despite recent advancements, a persistent challenge remains that the single-round generation often overlooks crucial details, particularly in the realm of fine-grained changes like shoes or sleeves. This issue compounds over multiple rounds of interaction, severely limiting customization quality. In an attempt to address this challenge, we introduce a new self-supervised regularization, i.e., multi-round regularization, which is compatible with existing methods. Specifically, the multi-round regularization encourages the model to maintain consistency across different modification orders. It builds upon the observation that the modification order generally should not affect the final result. Different from traditional one-round generation, the mechanism underpinning the proposed method is the error amplification of initially minor inaccuracies in capturing intricate details. Qualitative and quantitative experiments affirm that the proposed method achieves high-fidelity editing quality, especially the local modification, in both single-round and multiple-round generation, while also showcasing robust generalization to irregular text inputs. The effectiveness of our semantic alignment with textual feedback is further substantiated by the retrieval improvements on FashionIQ and Fashion200k.

Submitted 9 March, 2024; v1 submitted 16 January, 2024; originally announced January 2024.

Comments: 14 pages, 6 figures

arXiv:2401.06919 [pdf, other]

Pseudo-Empirical Likelihood Methods for Causal Inference

Authors: Jingyue Huang, Changbao Wu, Leilei Zeng

Abstract: Causal inference problems have remained an important research topic over the past several decades due to their general applicability in assessing a treatment effect in many different real-world settings. In this paper, we propose two inferential procedures on the average treatment effect (ATE) through a two-sample pseudo-empirical likelihood (PEL) approach. The first procedure uses the estimated propensity scores for the formulation of the PEL function, and the resulting maximum PEL estimator of the ATE is equivalent to the inverse probability weighted estimator discussed in the literature. Our focus in this scenario is on the PEL ratio statistic and establishing its theoretical properties. The second procedure incorporates outcome regression models for PEL inference through model-calibration constraints, and the resulting maximum PEL estimator of the ATE is doubly robust. Our main theoretical result in this case is the establishment of the asymptotic distribution of the PEL ratio statistic. We also propose a bootstrap method for constructing PEL ratio confidence intervals for the ATE to bypass the scaling constant which is involved in the asymptotic distribution of the PEL ratio statistic but is very difficult to calculate. Finite sample performances of our proposed methods with comparisons to existing ones are investigated through simulation studies.

Submitted 12 January, 2024; originally announced January 2024.

arXiv:2312.15859 [pdf, other]

SCPMan: Shape Context and Prior Constrained Multi-scale Attention Network for Pancreatic Segmentation

Authors: Leilei Zeng, Xuechen Li, Xinquan Yang, Linlin Shen, Song Wu

Abstract: Due to the poor prognosis of pancreatic cancer, accurate early detection and segmentation are critical for improving treatment outcomes. However, pancreatic segmentation is challenged by blurred boundaries, high shape variability, and class imbalance. To tackle these problems, we propose a multi-scale attention network with shape context and prior constraint for robust pancreas segmentation. Specifically, we propose a Multi-scale Feature Extraction Module (MFE) and a Mixed-scale Attention Integration Module (MAI) to address unclear pancreas boundaries. Furthermore, a Shape Context Memory (SCM) module is introduced to jointly model semantics across scales and pancreatic shape. An Active Shape Model (ASM) is further used to model the shape priors. Experiments on the NIH and MSD datasets demonstrate the efficacy of our model, which improves the state-of-the-art Dice score by 1.01% and 1.03%, respectively. Our architecture provides robust segmentation performance against blurry boundaries and variations in the scale and shape of the pancreas.

Submitted 25 December, 2023; originally announced December 2023.

Comments: 9 pages, 6 figures

arXiv:2312.10618 [pdf]

Sparse Learning and Class Probability Estimation with Weighted Support Vector Machines

Authors: Liyun Zeng, Hao Helen Zhang

Abstract: Classification and probability estimation have broad applications in modern machine learning and data science, including biology, medicine, engineering, and computer science. The recent development of a class of weighted Support Vector Machines (wSVMs) has shown great value in robustly predicting the class probability and classification for various problems with high accuracy. The current framework is based on the $\ell^2$-norm regularized binary wSVMs optimization problem, which only works with dense features and performs poorly on sparse features with redundant noise in most real applications. The sparse learning process requires a prescreening of the important variables for each binary wSVM to accurately estimate the pairwise conditional probability. In this paper, we propose novel wSVMs frameworks that incorporate automatic variable selection with accurate probability estimation for sparse learning problems. We develop efficient algorithms for effective variable selection by solving either the $\ell^1$-norm or elastic net regularized binary wSVMs optimization problems. The binary class probability is then estimated either by the $\ell^2$-norm regularized wSVMs framework with selected variables or by elastic net regularized wSVMs directly. The two-step approach of $\ell^1$-norm followed by $\ell^2$-norm wSVMs shows a great advantage in both automatic variable selection and reliable probability estimation, with the shortest computation time. The elastic net regularized wSVMs offer the best performance in terms of variable selection and probability estimation, with the additional advantage of variable grouping, at the cost of more computation time for high-dimensional problems. The proposed wSVMs-based sparse learning methods have wide applications and can be further extended to $K$-class problems through ensemble learning.

Submitted 17 December, 2023; originally announced December 2023.

Showing 1–50 of 368 results for author: Zeng, L