arXiv:2407.11949 [pdf, other]

Minimally Entangled Typical Thermal States for Classical and Quantum Simulation of Gauge Theories at Finite Temperature and Density

Authors: I-Chi Chen, João C. Getelina, Klée Pollock, Srimoyee Sen, Yong-Xin Yao, Thomas Iadecola

Abstract: Simulating strongly coupled gauge theories at finite temperature and density is a longstanding challenge in nuclear and high-energy physics that also has fundamental implications for condensed matter physics. In this work, we investigate the utility of minimally entangled typical thermal state (METTS) approaches to facilitate both classical and quantum computational studies of such systems. METTS techniques combine classical random sampling with imaginary time evolution, which can be performed on either a classical or a quantum computer, to estimate thermal averages of observables. We study the simplest model of a confining gauge theory, namely $\mathbb{Z}_2$ gauge theory coupled to spinless fermionic matter in 1+1 dimensions, which can be directly mapped to a local quantum spin chain with two- and three-body interactions. We benchmark both a classical matrix-product-state implementation of METTS and a recently proposed adaptive variational approach to METTS that is a promising candidate for implementation on near-term quantum devices, focusing on the equation of state as well as on various measures of fermion confinement. Of particular importance is the choice of basis for obtaining new METTS samples, which impacts both the classical sampling complexity (a key factor in both classical and quantum simulation applications) and complexity of circuits used in the quantum computing approach. Our work sets the stage for future studies of strongly coupled gauge theories with both classical and quantum hardware.

Submitted 16 July, 2024; originally announced July 2024.

Comments: 14 pages, 8 figures

arXiv:2407.11781 [pdf, other]

SlingBAG: Sliding ball adaptive growth algorithm with differentiable radiation enables super-efficient iterative 3D photoacoustic image reconstruction

Authors: Shuang Li, Yibing Wang, Jian Gao, Chulhong Kim, Seongwook Choi, Yu Zhang, Qian Chen, Yao Yao, Changhui Li

Abstract: High-quality 3D photoacoustic imaging (PAI) reconstruction under sparse view or limited view has long been challenging. Traditional 3D iterative-based reconstruction methods suffer from both slow speed and high memory consumption. Recently, in computer graphics, differentiable rendering has made significant progress, particularly with the rise of 3D Gaussian Splatting. Inspired by this, we introduce differentiable radiation into PAI, developing a novel reconstruction algorithm: the Sliding Ball Adaptive Growth algorithm (SlingBAG) for 3D PAI, which shows ability in high-quality 3D PAI reconstruction both under extremely sparse view and limited view. We established the point cloud dataset in PAI, and used a unique differentiable rapid radiator based on the spherical decomposition strategy and the randomly initialized point cloud adaptively optimized according to sparse sensor data. Each point undergoes updates in 3D coordinates, initial pressure, and resolution (denoted by the radius of the ball). Points undergo adaptive growth during the iterative process, including point destroying, splitting and duplicating along the gradient of their positions, manifesting the sliding ball effect. Finally, our point cloud to voxel grid shader renders the final reconstruction results. Simulation and in vivo experiments demonstrate that our SlingBAG reconstruction result's SNR can be more than 40 dB under extremely sparse view, while the SNR of the traditional back-projection algorithm's result is less than 20 dB. Moreover, the result of SlingBAG's structural similarity to the ground truth is significantly higher, with an SSIM value of 95.6%. Notably, our differentiable rapid radiator can conduct forward PA simulation in homogeneous, non-viscous media substantially faster than current methods that numerically simulate the wave propagation, such as k-Wave. The dataset and all code will be open source.

Submitted 16 July, 2024; originally announced July 2024.

arXiv:2407.11197 [pdf]

A Vision to Enhance Trust Requirements for Peer Support Systems by Revisiting Trust Theories

Authors: Yasaman Gheidar, Lysanne Lessard, Yao Yao

Abstract: This vision paper focuses on the mental health crisis impacting healthcare workers (HCWs), which, exacerbated by the COVID-19 pandemic, leads to increased stress and psychological issues like burnout. Peer Support Programs (PSP) are a recognized intervention for mitigating these issues. These programs are increasingly being delivered virtually through Peer Support Systems (PSS) for increased convenience and accessibility. However, HCWs' perception of these systems results in fear of information sharing, perceived lack of safety, and low participation rate, which challenges these systems' ability to achieve their goals. In line with the rich body of research on the requirements and properties of trustworthy systems, we posit that increasing HCWs' trust in PSS could address these challenges. However, extant research focuses on objectively defined trustworthiness rather than perceptual trust because trustworthy requirements are viewed as more controllable and easier to operationalize. This study proposes a novel approach to elicit perceptual trust requirements by proposing a trust framework anchored in recognized trust theories from different disciplines that unpacks trust into its recognized types and their antecedents. This approach allows the identification of trust requirements beyond those already proposed for trustworthy systems, providing a strong foundation for improving the effectiveness of PSS for HCWs. Keywords: Trust Requirements, Requirements elicitation, Peer support systems, Healthcare workers

Submitted 5 June, 2024; originally announced July 2024.

Comments: Accepted for publication at the RE@Next! track of RE 2024

arXiv:2407.10923 [pdf, other]

OPa-Ma: Text Guided Mamba for 360-degree Image Out-painting

Authors: Penglei Gao, Kai Yao, Tiandi Ye, Steven Wang, Yuan Yao, Xiaofeng Wang

Abstract: In this paper, we tackle the recently popular topic of generating 360-degree images given the conventional narrow field of view (NFoV) images that could be taken from a single camera or cellphone. This task aims to predict the reasonable and consistent surroundings from the NFoV images. Existing methods for feature extraction and fusion, often built with transformer-based architectures, incur substantial memory usage and computational expense. They also have limitations in maintaining visual continuity across the entire 360-degree images, which could cause inconsistent texture and style generation. To solve the aforementioned issues, we propose a novel text-guided out-painting framework equipped with a State-Space Model called Mamba to utilize its long-sequence modelling and spatial continuity. Furthermore, incorporating textual information is an effective strategy for guiding image generation, enriching the process with detailed context and increasing diversity. Efficiently extracting textual features and integrating them with image attributes presents a significant challenge for 360-degree image out-painting. To address this, we develop two modules, Visual-textual Consistency Refiner (VCR) and Global-local Mamba Adapter (GMA). VCR enhances contextual richness by fusing the modified text features with the image features, while GMA provides adaptive state-selective conditions by capturing the information flow from global to local representations. Our proposed method achieves state-of-the-art performance with extensive experiments on two broadly used 360-degree image datasets, including indoor and outdoor settings.

Submitted 15 July, 2024; originally announced July 2024.

arXiv:2407.10671 [pdf, other]

Qwen2 Technical Report

Authors: An Yang, Baosong Yang, Binyuan Hui, Bo Zheng, Bowen Yu, Chang Zhou, Chengpeng Li, Chengyuan Li, Dayiheng Liu, Fei Huang, Guanting Dong, Haoran Wei, Huan Lin, Jialong Tang, Jialin Wang, Jian Yang, Jianhong Tu, Jianwei Zhang, Jianxin Ma, Jin Xu, Jingren Zhou, Jinze Bai, Jinzheng He, Junyang Lin, Kai Dang, et al. (34 additional authors not shown)

Abstract: This report introduces the Qwen2 series, the latest addition to our large language models and large multimodal models. We release a comprehensive suite of foundational and instruction-tuned language models, encompassing a parameter range from 0.5 to 72 billion, featuring dense models and a Mixture-of-Experts model. Qwen2 surpasses most prior open-weight models, including its predecessor Qwen1.5, and exhibits competitive performance relative to proprietary models across diverse benchmarks on language understanding, generation, multilingual proficiency, coding, mathematics, and reasoning. The flagship model, Qwen2-72B, showcases remarkable performance: 84.2 on MMLU, 37.9 on GPQA, 64.6 on HumanEval, 89.5 on GSM8K, and 82.4 on BBH as a base language model. The instruction-tuned variant, Qwen2-72B-Instruct, attains 9.1 on MT-Bench, 48.1 on Arena-Hard, and 35.7 on LiveCodeBench. Moreover, Qwen2 demonstrates robust multilingual capabilities, proficient in approximately 30 languages, spanning English, Chinese, Spanish, French, German, Arabic, Russian, Korean, Japanese, Thai, Vietnamese, and more, underscoring its versatility and global reach. To foster community innovation and accessibility, we have made the Qwen2 model weights openly available on Hugging Face and ModelScope, and the supplementary materials including example code on GitHub. These platforms also include resources for quantization, fine-tuning, and deployment, facilitating a wide range of applications and research endeavors.

Submitted 16 July, 2024; v1 submitted 15 July, 2024; originally announced July 2024.

Comments: 25 pages, 1 figure

arXiv:2407.09833 [pdf, other]

LiveHPS++: Robust and Coherent Motion Capture in Dynamic Free Environment

Authors: Yiming Ren, Xiao Han, Yichen Yao, Xiaoxiao Long, Yujing Sun, Yuexin Ma

Abstract: LiDAR-based human motion capture has garnered significant interest in recent years for its practicability in large-scale and unconstrained environments. However, most methods rely on cleanly segmented human point clouds as input; the accuracy and smoothness of their motion results are compromised when faced with noisy data, rendering them unsuitable for practical applications. To address these limitations and enhance the robustness and precision of motion capture with noise interference, we introduce LiveHPS++, an innovative and effective solution based on a single LiDAR system. Benefiting from three meticulously designed modules, our method can learn dynamic and kinematic features from human movements, and further enable the precise capture of coherent human motions in open settings, making it highly applicable to real-world scenarios. Through extensive experiments, LiveHPS++ has proven to significantly surpass existing state-of-the-art methods across various datasets, establishing a new benchmark in the field.

Submitted 13 July, 2024; originally announced July 2024.

Comments: Accepted by ECCV 2024

arXiv:2407.08596 [pdf, other]

Modeling X-Ray Multi-Reflection in Super-Eddington Winds

Authors: Zijian Zhang, Lars Lund Thomsen, Lixin Dai, Christopher S. Reynolds, Javier A. García, Erin Kara, Riley Connors, Megan Masterson, Yuhan Yao, Thomas Dauser

Abstract: It has been recently discovered that a few super-Eddington sources undergoing black hole super-Eddington accretion exhibit X-ray reflection signatures. In such new systems, one expects that the coronal X-ray emissions are mainly reflected by optically thick super-Eddington winds instead of thin disks. In this paper, we conduct a series of general relativistic ray-tracing and Monte Carlo radiative transfer simulations to model the X-ray reflection signatures, especially the characteristic Fe K$\alpha$ line, produced from super-Eddington accretion flows. In particular, we allow the photons emitted by a lamppost corona to be reflected multiple times in a cone-like funnel surrounded by fast winds. We find that the Fe K$\alpha$ line profile most sensitively depends on the wind kinematics, while its exact shape also depends on the funnel open angle and corona height. Furthermore, very interestingly, we find that the Fe K$\alpha$ line can have a prominent double-peak profile in certain parameter spaces even with a face-on orientation. Moreover, we compare the Fe K$\alpha$ line profiles produced from super-Eddington and thin disks and show that such lines can provide important insights into the understanding of black hole systems undergoing super-Eddington accretion.

Submitted 11 July, 2024; originally announced July 2024.

Comments: 23 pages, 21 figures, 2 tables. Comments are welcome

arXiv:2407.08512 [pdf, other]

Anchored symplectic embeddings

Authors: Michael Hutchings, Agniva Roy, Morgan Weiler, Yuan Yao

Abstract: Given two four-dimensional symplectic manifolds, together with knots in their boundaries, we define an ``anchored symplectic embedding'' to be a symplectic embedding, together with a two-dimensional symplectic cobordism between the knots (in the four-dimensional cobordism determined by the embedding). We use techniques from embedded contact homology to determine quantitative criteria for when anchored symplectic embeddings exist, for many examples of toric domains. In particular, we find examples where ordinarily symplectic embeddings exist, but they cannot be upgraded to anchored symplectic embeddings unless one enlarges the target domain.

Submitted 11 July, 2024; originally announced July 2024.

Comments: 30 pages, 3 figures

MSC Class: 57K43

arXiv:2407.05765 [pdf, other]

Enlarging Feature Support Overlap for Domain Generalization

Authors: Yaoyao Zhu, Xiuding Cai, Dong Miao, Yu Yao, Zhongliang Fu

Abstract: Deep models often struggle with out-of-distribution (OOD) generalization, limiting their real-world applicability beyond controlled laboratory settings. Invariant risk minimization (IRM) addresses this issue by learning invariant features and minimizing the risk across different domains. Thus, it avoids the pitfalls of pseudo-invariant features and spurious causality associated with empirical risk minimization (ERM). However, according to the support overlap theorem, ERM and IRM may fail to address the OOD problem when pseudo-invariant features have insufficient support overlap. To this end, we propose a novel method to enlarge feature support overlap for domain generalization. Specifically, we introduce Bayesian random semantic data augmentation to increase sample diversity and overcome the deficiency of IRM. Experiments on several challenging OOD generalization benchmarks demonstrate that our approach surpasses existing models, delivering superior performance and robustness. The code is available at https://github.com/YaoyaoZhu19/BSDG.

Submitted 8 July, 2024; originally announced July 2024.

arXiv:2407.04509 [pdf, other]

Analysis of SIR Reaction diffusion system with constant birth and death rate

Authors: Yiting Yao

Abstract: This is a truncation of the second year group project at Imperial College London. In this paper, we consider a semilinear reaction-diffusion system for the SIR model that includes birth and death rates. We first prove the non-negativity and global existence theorem to ensure that the model makes sense. We prove the uniform convergence of the infection-free solution and study an example in which separable solutions can be computed. We also focus on the steady state solution, for which we prove non-uniqueness and investigate the regularity of the general solution. In the end, we also introduce an interesting phenomenon, called the Turing instability, caused by the diffusion in the model.

Submitted 5 July, 2024; originally announced July 2024.

arXiv:2407.04493 [pdf, other]

doi 10.1007/s10994-024-06575-2

PROUD: PaRetO-gUided Diffusion Model for Multi-objective Generation

Authors: Yinghua Yao, Yuangang Pan, Jing Li, Ivor Tsang, Xin Yao

Abstract: Recent advancements in the realm of deep generative models focus on generating samples that satisfy multiple desired properties. However, prevalent approaches optimize these property functions independently, thus omitting the trade-offs among them. In addition, the property optimization is often improperly integrated into the generative models, resulting in an unnecessary compromise on generation quality (i.e., the quality of generated samples). To address these issues, we formulate a constrained optimization problem. It seeks to optimize generation quality while ensuring that generated samples reside at the Pareto front of multiple property objectives. Such a formulation enables the generation of samples that cannot be further improved simultaneously on the conflicting property functions and preserves good quality of generated samples. Building upon this formulation, we introduce the PaRetO-gUided Diffusion model (PROUD), wherein the gradients in the denoising process are dynamically adjusted to enhance generation quality while the generated samples adhere to Pareto optimality. Experimental evaluations on image generation and protein generation tasks demonstrate that our PROUD consistently maintains superior generation quality while approaching Pareto optimality across multiple property functions compared to various baselines.

Submitted 5 July, 2024; originally announced July 2024.

Journal ref: Machine Learning 2024

arXiv:2407.03917 [pdf, other]

Timestep-Aware Correction for Quantized Diffusion Models

Authors: Yuzhe Yao, Feng Tian, Jun Chen, Haonan Lin, Guang Dai, Yong Liu, Jingdong Wang

Abstract: Diffusion models have marked a significant breakthrough in the synthesis of semantically coherent images. However, their extensive noise estimation networks and the iterative generation process limit their wider application, particularly on resource-constrained platforms like mobile devices. Existing post-training quantization (PTQ) methods have managed to compress diffusion models to low precision. Nevertheless, due to the iterative nature of diffusion models, quantization errors tend to accumulate throughout the generation process. This accumulation of error becomes particularly problematic in low-precision scenarios, leading to significant distortions in the generated images. We attribute this accumulation issue to two main causes: error propagation and exposure bias. To address these problems, we propose a timestep-aware correction method for quantized diffusion models, which dynamically corrects the quantization error. By leveraging the proposed method in low-precision diffusion models, substantial enhancement of output quality could be achieved with only negligible computation overhead. Extensive experiments underscore our method's effectiveness and generalizability. By employing the proposed correction strategy, we achieve state-of-the-art (SOTA) results on low-precision models.

Submitted 4 July, 2024; originally announced July 2024.

Comments: ECCV 2024

arXiv:2407.03178 [pdf, other]

Relating CNN-Transformer Fusion Network for Change Detection

Authors: Yuhao Gao, Gensheng Pei, Mengmeng Sheng, Zeren Sun, Tao Chen, Yazhou Yao

Abstract: While deep learning, particularly convolutional neural networks (CNNs), has revolutionized remote sensing (RS) change detection (CD), existing approaches often miss crucial features due to neglecting global context and incomplete change learning. Additionally, transformer networks struggle with low-level details. RCTNet addresses these limitations by introducing (1) an early fusion backbone to exploit both spatial and temporal features early on, (2) a Cross-Stage Aggregation (CSA) module for enhanced temporal representation, (3) a Multi-Scale Feature Fusion (MSF) module for enriched feature extraction in the decoder, and (4) an Efficient Self-deciphering Attention (ESA) module utilizing transformers to capture global information and fine-grained details for accurate change detection. Extensive experiments demonstrate RCTNet's clear superiority over traditional RS image CD methods, showing significant improvement and an optimal balance between accuracy and computational cost.