-
Minimally Entangled Typical Thermal States for Classical and Quantum Simulation of Gauge Theories at Finite Temperature and Density
Authors:
I-Chi Chen,
João C. Getelina,
Klée Pollock,
Srimoyee Sen,
Yong-Xin Yao,
Thomas Iadecola
Abstract:
Simulating strongly coupled gauge theories at finite temperature and density is a longstanding challenge in nuclear and high-energy physics that also has fundamental implications for condensed matter physics. In this work, we investigate the utility of minimally entangled typical thermal state (METTS) approaches to facilitate both classical and quantum computational studies of such systems. METTS techniques combine classical random sampling with imaginary time evolution, which can be performed on either a classical or a quantum computer, to estimate thermal averages of observables. We study the simplest model of a confining gauge theory, namely $\mathbb{Z}_2$ gauge theory coupled to spinless fermionic matter in 1+1 dimensions, which can be directly mapped to a local quantum spin chain with two- and three-body interactions. We benchmark both a classical matrix-product-state implementation of METTS and a recently proposed adaptive variational approach to METTS that is a promising candidate for implementation on near-term quantum devices, focusing on the equation of state as well as on various measures of fermion confinement. Of particular importance is the choice of basis for obtaining new METTS samples, which impacts both the classical sampling complexity (a key factor in both classical and quantum simulation applications) and complexity of circuits used in the quantum computing approach. Our work sets the stage for future studies of strongly coupled gauge theories with both classical and quantum hardware.
Submitted 16 July, 2024;
originally announced July 2024.
-
SlingBAG: Sliding ball adaptive growth algorithm with differentiable radiation enables super-efficient iterative 3D photoacoustic image reconstruction
Authors:
Shuang Li,
Yibing Wang,
Jian Gao,
Chulhong Kim,
Seongwook Choi,
Yu Zhang,
Qian Chen,
Yao Yao,
Changhui Li
Abstract:
High-quality 3D photoacoustic imaging (PAI) reconstruction under sparse view or limited view has long been challenging. Traditional 3D iterative-based reconstruction methods suffer from both slow speed and high memory consumption. Recently, in computer graphics, the differentiable rendering has made significant progress, particularly with the rise of 3D Gaussian Splatting. Inspired by these, we introduce differentiable radiation into PAI, developing a novel reconstruction algorithm: the Sliding Ball Adaptive Growth algorithm (SlingBAG) for 3D PAI, which shows ability in high-quality 3D PAI reconstruction both under extremely sparse view and limited view.
We establish a point cloud dataset for PAI and use a unique differentiable rapid radiator, based on a spherical decomposition strategy, together with a randomly initialized point cloud that is adaptively optimized according to sparse sensor data. Each point undergoes updates in 3D coordinates, initial pressure, and resolution (denoted by the radius of the ball). Points undergo adaptive growth during the iterative process, including point destroying, splitting, and duplicating along the gradient of their positions, manifesting the sliding ball effect.
Finally, our point-cloud-to-voxel-grid shader renders the final reconstruction results. Simulation and in vivo experiments demonstrate that the SNR of the SlingBAG reconstruction can exceed 40 dB under extremely sparse view, while the SNR of the traditional back-projection result is less than 20 dB. Moreover, the structural similarity of the SlingBAG result to the ground truth is significantly higher, with an SSIM value of 95.6%.
Notably, our differentiable rapid radiator can conduct forward PA simulation in homogeneous, non-viscous media substantially faster than current methods that numerically simulate the wave propagation, such as k-Wave. The dataset and all code will be open source.
Submitted 16 July, 2024;
originally announced July 2024.
-
A Vision to Enhance Trust Requirements for Peer Support Systems by Revisiting Trust Theories
Authors:
Yasaman Gheidar,
Lysanne Lessard,
Yao Yao
Abstract:
This vision paper focuses on the mental health crisis impacting healthcare workers (HCWs), which, exacerbated by the COVID-19 pandemic, leads to increased stress and psychological issues like burnout. Peer Support Programs (PSP) are a recognized intervention for mitigating these issues. These programs are increasingly being delivered virtually through Peer Support Systems (PSS) for increased convenience and accessibility. However, HCWs' perception of these systems results in fear of information sharing, perceived lack of safety, and a low participation rate, which challenges these systems' ability to achieve their goals. In line with the rich body of research on the requirements and properties of trustworthy systems, we posit that increasing HCWs' trust in PSS could address these challenges. However, extant research focuses on objectively defined trustworthiness rather than perceptual trust because trustworthiness requirements are viewed as more controllable and easier to operationalize. This study proposes a novel approach to elicit perceptual trust requirements by proposing a trust framework anchored in recognized trust theories from different disciplines that unpacks trust into its recognized types and their antecedents. This approach allows the identification of trust requirements beyond those already proposed for trustworthy systems, providing a strong foundation for improving the effectiveness of PSS for HCWs. Keywords: Trust Requirements, Requirements elicitation, Peer support systems, Healthcare workers
Submitted 5 June, 2024;
originally announced July 2024.
-
OPa-Ma: Text Guided Mamba for 360-degree Image Out-painting
Authors:
Penglei Gao,
Kai Yao,
Tiandi Ye,
Steven Wang,
Yuan Yao,
Xiaofeng Wang
Abstract:
In this paper, we tackle the recently popular topic of generating 360-degree images given the conventional narrow field of view (NFoV) images that could be taken from a single camera or cellphone. This task aims to predict the reasonable and consistent surroundings from the NFoV images. Existing methods for feature extraction and fusion, often built with transformer-based architectures, incur substantial memory usage and computational expense. They also have limitations in maintaining visual continuity across the entire 360-degree images, which could cause inconsistent texture and style generation. To solve the aforementioned issues, we propose a novel text-guided out-painting framework equipped with a State-Space Model called Mamba to utilize its long-sequence modelling and spatial continuity. Furthermore, incorporating textual information is an effective strategy for guiding image generation, enriching the process with detailed context and increasing diversity. Efficiently extracting textual features and integrating them with image attributes presents a significant challenge for 360-degree image out-painting. To address this, we develop two modules, Visual-textual Consistency Refiner (VCR) and Global-local Mamba Adapter (GMA). VCR enhances contextual richness by fusing the modified text features with the image features, while GMA provides adaptive state-selective conditions by capturing the information flow from global to local representations. Our proposed method achieves state-of-the-art performance with extensive experiments on two broadly used 360-degree image datasets, including indoor and outdoor settings.
Submitted 15 July, 2024;
originally announced July 2024.
-
Qwen2 Technical Report
Authors:
An Yang,
Baosong Yang,
Binyuan Hui,
Bo Zheng,
Bowen Yu,
Chang Zhou,
Chengpeng Li,
Chengyuan Li,
Dayiheng Liu,
Fei Huang,
Guanting Dong,
Haoran Wei,
Huan Lin,
Jialong Tang,
Jialin Wang,
Jian Yang,
Jianhong Tu,
Jianwei Zhang,
Jianxin Ma,
Jin Xu,
Jingren Zhou,
Jinze Bai,
Jinzheng He,
Junyang Lin,
Kai Dang
, et al. (34 additional authors not shown)
Abstract:
This report introduces the Qwen2 series, the latest addition to our large language models and large multimodal models. We release a comprehensive suite of foundational and instruction-tuned language models, encompassing a parameter range from 0.5 to 72 billion, featuring dense models and a Mixture-of-Experts model. Qwen2 surpasses most prior open-weight models, including its predecessor Qwen1.5, and exhibits competitive performance relative to proprietary models across diverse benchmarks on language understanding, generation, multilingual proficiency, coding, mathematics, and reasoning.
The flagship model, Qwen2-72B, showcases remarkable performance: 84.2 on MMLU, 37.9 on GPQA, 64.6 on HumanEval, 89.5 on GSM8K, and 82.4 on BBH as a base language model. The instruction-tuned variant, Qwen2-72B-Instruct, attains 9.1 on MT-Bench, 48.1 on Arena-Hard, and 35.7 on LiveCodeBench. Moreover, Qwen2 demonstrates robust multilingual capabilities, proficient in approximately 30 languages, spanning English, Chinese, Spanish, French, German, Arabic, Russian, Korean, Japanese, Thai, Vietnamese, and more, underscoring its versatility and global reach.
To foster community innovation and accessibility, we have made the Qwen2 model weights openly available on Hugging Face and ModelScope, and the supplementary materials including example code on GitHub. These platforms also include resources for quantization, fine-tuning, and deployment, facilitating a wide range of applications and research endeavors.
Submitted 16 July, 2024; v1 submitted 15 July, 2024;
originally announced July 2024.
-
LiveHPS++: Robust and Coherent Motion Capture in Dynamic Free Environment
Authors:
Yiming Ren,
Xiao Han,
Yichen Yao,
Xiaoxiao Long,
Yujing Sun,
Yuexin Ma
Abstract:
LiDAR-based human motion capture has garnered significant interest in recent years for its practicability in large-scale and unconstrained environments. However, most methods rely on cleanly segmented human point clouds as input, so the accuracy and smoothness of their motion results are compromised when faced with noisy data, rendering them unsuitable for practical applications. To address these limitations and enhance the robustness and precision of motion capture with noise interference, we introduce LiveHPS++, an innovative and effective solution based on a single LiDAR system. Benefiting from three meticulously designed modules, our method can learn dynamic and kinematic features from human movements, and further enable the precise capture of coherent human motions in open settings, making it highly applicable to real-world scenarios. Through extensive experiments, LiveHPS++ has proven to significantly surpass existing state-of-the-art methods across various datasets, establishing a new benchmark in the field.
Submitted 13 July, 2024;
originally announced July 2024.
-
Modeling X-Ray Multi-Reflection in Super-Eddington Winds
Authors:
Zijian Zhang,
Lars Lund Thomsen,
Lixin Dai,
Christopher S. Reynolds,
Javier A. García,
Erin Kara,
Riley Connors,
Megan Masterson,
Yuhan Yao,
Thomas Dauser
Abstract:
It has been recently discovered that a few super-Eddington sources undergoing black hole super-Eddington accretion exhibit X-ray reflection signatures. In such new systems, one expects that the coronal X-ray emissions are mainly reflected by optically thick super-Eddington winds instead of thin disks. In this paper, we conduct a series of general relativistic ray-tracing and Monte Carlo radiative transfer simulations to model the X-ray reflection signatures, especially the characteristic Fe K$α$ line, produced from super-Eddington accretion flows. In particular, we allow the photons emitted by a lamppost corona to be reflected multiple times in a cone-like funnel surrounded by fast winds. We find that the Fe K$α$ line profile most sensitively depends on the wind kinematics, while its exact shape also depends on the funnel open angle and corona height. Furthermore, very interestingly, we find that the Fe K$α$ line can have a prominent double-peak profile in certain parameter spaces even with a face-on orientation. Moreover, we compare the Fe K$α$ line profiles produced from super-Eddington and thin disks and show that such lines can provide important insights into the understanding of black hole systems undergoing super-Eddington accretion.
Submitted 11 July, 2024;
originally announced July 2024.
-
Anchored symplectic embeddings
Authors:
Michael Hutchings,
Agniva Roy,
Morgan Weiler,
Yuan Yao
Abstract:
Given two four-dimensional symplectic manifolds, together with knots in their boundaries, we define an "anchored symplectic embedding" to be a symplectic embedding, together with a two-dimensional symplectic cobordism between the knots (in the four-dimensional cobordism determined by the embedding). We use techniques from embedded contact homology to determine quantitative criteria for when anchored symplectic embeddings exist, for many examples of toric domains. In particular we find examples where ordinarily symplectic embeddings exist, but they cannot be upgraded to anchored symplectic embeddings unless one enlarges the target domain.
Submitted 11 July, 2024;
originally announced July 2024.
-
Enlarging Feature Support Overlap for Domain Generalization
Authors:
Yaoyao Zhu,
Xiuding Cai,
Dong Miao,
Yu Yao,
Zhongliang Fu
Abstract:
Deep models often struggle with out-of-distribution (OOD) generalization, limiting their real-world applicability beyond controlled laboratory settings. Invariant risk minimization (IRM) addresses this issue by learning invariant features and minimizing the risk across different domains. Thus, it avoids the pitfalls of pseudo-invariant features and spurious causality associated with empirical risk minimization (ERM). However, according to the support overlap theorem, ERM and IRM may fail to address the OOD problem when pseudo-invariant features have insufficient support overlap. To this end, we propose a novel method to enlarge feature support overlap for domain generalization. Specifically, we introduce Bayesian random semantic data augmentation to increase sample diversity and overcome the deficiency of IRM. Experiments on several challenging OOD generalization benchmarks demonstrate that our approach surpasses existing models, delivering superior performance and robustness. The code is available at https://github.com/YaoyaoZhu19/BSDG.
Submitted 8 July, 2024;
originally announced July 2024.
-
Analysis of SIR Reaction diffusion system with constant birth and death rate
Authors:
Yiting Yao
Abstract:
This is a truncation of the second-year group project at Imperial College London. In this paper, we consider a semilinear reaction-diffusion system for the SIR model that includes birth and death rates. We first prove non-negativity and a global existence theorem to ensure that the model makes sense. We prove the uniform convergence of the infection-free solution and study an example in which separable solutions can be computed. We also focus on the steady-state solution, for which we prove non-uniqueness and investigate the regularity of the general solution. Finally, we introduce an interesting phenomenon, the Turing instability, caused by diffusion in the model.
Submitted 5 July, 2024;
originally announced July 2024.
-
PROUD: PaRetO-gUided Diffusion Model for Multi-objective Generation
Authors:
Yinghua Yao,
Yuangang Pan,
Jing Li,
Ivor Tsang,
Xin Yao
Abstract:
Recent advancements in the realm of deep generative models focus on generating samples that satisfy multiple desired properties. However, prevalent approaches optimize these property functions independently, thus omitting the trade-offs among them. In addition, the property optimization is often improperly integrated into the generative models, resulting in an unnecessary compromise on generation quality (i.e., the quality of generated samples). To address these issues, we formulate a constrained optimization problem. It seeks to optimize generation quality while ensuring that generated samples reside at the Pareto front of multiple property objectives. Such a formulation enables the generation of samples that cannot be further improved simultaneously on the conflicting property functions and preserves good quality of generated samples. Building upon this formulation, we introduce the PaRetO-gUided Diffusion model (PROUD), wherein the gradients in the denoising process are dynamically adjusted to enhance generation quality while the generated samples adhere to Pareto optimality. Experimental evaluations on image generation and protein generation tasks demonstrate that our PROUD consistently maintains superior generation quality while approaching Pareto optimality across multiple property functions compared to various baselines.
Submitted 5 July, 2024;
originally announced July 2024.
-
Timestep-Aware Correction for Quantized Diffusion Models
Authors:
Yuzhe Yao,
Feng Tian,
Jun Chen,
Haonan Lin,
Guang Dai,
Yong Liu,
Jingdong Wang
Abstract:
Diffusion models have marked a significant breakthrough in the synthesis of semantically coherent images. However, their extensive noise estimation networks and the iterative generation process limit their wider application, particularly on resource-constrained platforms like mobile devices. Existing post-training quantization (PTQ) methods have managed to compress diffusion models to low precision. Nevertheless, due to the iterative nature of diffusion models, quantization errors tend to accumulate throughout the generation process. This accumulation of error becomes particularly problematic in low-precision scenarios, leading to significant distortions in the generated images. We attribute this accumulation issue to two main causes: error propagation and exposure bias. To address these problems, we propose a timestep-aware correction method for quantized diffusion models, which dynamically corrects the quantization error. By leveraging the proposed method in low-precision diffusion models, substantial enhancement of output quality can be achieved with only negligible computation overhead. Extensive experiments underscore our method's effectiveness and generalizability. By employing the proposed correction strategy, we achieve state-of-the-art (SOTA) results on low-precision models.
Submitted 4 July, 2024;
originally announced July 2024.
-
Relating CNN-Transformer Fusion Network for Change Detection
Authors:
Yuhao Gao,
Gensheng Pei,
Mengmeng Sheng,
Zeren Sun,
Tao Chen,
Yazhou Yao
Abstract:
While deep learning, particularly convolutional neural networks (CNNs), has revolutionized remote sensing (RS) change detection (CD), existing approaches often miss crucial features due to neglecting global context and incomplete change learning. Additionally, transformer networks struggle with low-level details. RCTNet addresses these limitations by introducing (1) an early fusion backbone to exploit both spatial and temporal features early on, (2) a Cross-Stage Aggregation (CSA) module for enhanced temporal representation, (3) a Multi-Scale Feature Fusion (MSF) module for enriched feature extraction in the decoder, and (4) an Efficient Self-deciphering Attention (ESA) module utilizing transformers to capture global information and fine-grained details for accurate change detection. Extensive experiments demonstrate RCTNet's clear superiority over traditional RS image CD methods, showing significant improvement and an optimal balance between accuracy and computational cost.
Submitted 3 July, 2024;
originally announced July 2024.
-
Stereo Risk: A Continuous Modeling Approach to Stereo Matching
Authors:
Ce Liu,
Suryansh Kumar,
Shuhang Gu,
Radu Timofte,
Yao Yao,
Luc Van Gool
Abstract:
We introduce Stereo Risk, a new deep-learning approach to solve the classical stereo-matching problem in computer vision. As it is well-known that stereo matching boils down to a per-pixel disparity estimation problem, the popular state-of-the-art stereo-matching approaches widely rely on regressing the scene disparity values, yet via discretization of scene disparity values. Such discretization often fails to capture the nuanced, continuous nature of scene depth. Stereo Risk departs from the conventional discretization approach by formulating the scene disparity as an optimal solution to a continuous risk minimization problem, hence the name "stereo risk". We demonstrate that $L^1$ minimization of the proposed continuous risk function enhances stereo-matching performance for deep networks, particularly for disparities with multi-modal probability distributions. Furthermore, to enable the end-to-end network training of the non-differentiable $L^1$ risk optimization, we exploited the implicit function theorem, ensuring a fully differentiable network. A comprehensive analysis demonstrates our method's theoretical soundness and superior performance over the state-of-the-art methods across various benchmark datasets, including KITTI 2012, KITTI 2015, ETH3D, SceneFlow, and Middlebury 2014.
Submitted 3 July, 2024;
originally announced July 2024.
-
Evolution of Band Structure in a Kagome Superconductor Cs(V1-xCrx)3Sb5: Toward Universal Understanding of CDW and Superconducting Phase Diagrams
Authors:
Shuto Suzuki,
Takemi Kato,
Yongkai Li,
Kosuke Nakayama,
Zhiwei Wang,
Seigo Souma,
Kenichi Ozawa,
Miho Kitamura,
Koji Horiba,
Hiroshi Kumigashira,
Takashi Takahashi,
Yugui Yao,
Takafumi Sato
Abstract:
Kagome superconductors AV3Sb5 (A = K, Rb, Cs) exhibit a characteristic superconducting and charge-density wave (CDW) phase diagram upon carrier doping and chemical substitution. However, the key electronic states responsible for such a phase diagram have yet to be clarified. Here we report a systematic micro-focused angle-resolved photoemission spectroscopy (ARPES) study of Cs(V1-xCrx)3Sb5 as a function of Cr content x, where Cr substitution causes monotonic reduction of superconducting and CDW transition temperatures. We found that the V-derived bands forming saddle points at the M point and Dirac nodes along high-symmetry cuts show an energy shift due to electron doping by Cr substitution, whereas the Sb-derived electron band at the Gamma point remains almost unchanged, signifying an orbital-selective band shift. We also found that band doubling associated with the emergence of three-dimensional CDW identified at x = 0 vanishes at x = 0.25, in line with the disappearance of CDW. A comparison of band diagrams among Ti-, Nb-, and Cr-substituted Cs(V1-xCrx)3Sb5 suggests that it is important to simultaneously take into account the two saddle points at the M point and their proximity to the Fermi energy in order to understand the complex phase diagram against carrier doping and chemical pressure.
Submitted 3 July, 2024;
originally announced July 2024.
-
Anti-Collapse Loss for Deep Metric Learning Based on Coding Rate Metric
Authors:
Xiruo Jiang,
Yazhou Yao,
Xili Dai,
Fumin Shen,
Xian-Sheng Hua,
Heng-Tao Shen
Abstract:
Deep metric learning (DML) aims to learn a discriminative high-dimensional embedding space for downstream tasks like classification, clustering, and retrieval. Prior literature predominantly focuses on pair-based and proxy-based methods to maximize inter-class discrepancy and minimize intra-class diversity. However, these methods tend to suffer from the collapse of the embedding space due to their over-reliance on label information. This leads to sub-optimal feature representation and inferior model performance. To maintain the structure of embedding space and avoid feature collapse, we propose a novel loss function called Anti-Collapse Loss. Specifically, our proposed loss primarily draws inspiration from the principle of Maximal Coding Rate Reduction. It promotes the sparseness of feature clusters in the embedding space to prevent collapse by maximizing the average coding rate of sample features or class proxies. Moreover, we integrate our proposed loss with pair-based and proxy-based methods, resulting in notable performance improvement. Comprehensive experiments on benchmark datasets demonstrate that our proposed method outperforms existing state-of-the-art methods. Extensive ablation studies verify the effectiveness of our method in preventing embedding space collapse and promoting generalization performance.
Submitted 3 July, 2024;
originally announced July 2024.
-
Hilbert band complexes and their applications
Authors:
Zeying Zhang,
Y. X. Zhao,
Yugui Yao,
Shengyuan A. Yang
Abstract:
The study of band connectivity is a fundamental problem in condensed matter physics. Here, we develop a new method for analyzing band connectivity, which completely solves the outstanding questions of the reducibility and decomposition of band complexes. By translating the symmetry conditions into a set of band balance equations, we show that all possible band structure solutions can be described by a positive affine monoid structure, which has a unique minimal set of generators, called the Hilbert basis. We show that the Hilbert basis completely determines whether a band complex is reducible and how it can be decomposed. The band complexes corresponding to Hilbert basis vectors, termed Hilbert band complexes (HBCs), can be regarded as elementary building blocks of band structures. We develop algorithms to construct HBCs, analyze their graph features, and merge them into large complexes. We find some interesting examples, such as HBCs corresponding to complete bipartite graphs, and complexes which can grow without bound by successively merging an HBC.
Submitted 3 July, 2024;
originally announced July 2024.
-
52B to 1T: Lessons Learned via Tele-FLM Series
Authors:
Xiang Li,
Yiqun Yao,
Xin Jiang,
Xuezhi Fang,
Chao Wang,
Xinzhang Liu,
Zihan Wang,
Yu Zhao,
Xin Wang,
Yuyao Huang,
Shuangyong Song,
Yongxiang Li,
Zheng Zhang,
Bo Zhao,
Aixin Sun,
Yequan Wang,
Zhongjiang He,
Zhongyuan Wang,
Xuelong Li,
Tiejun Huang
Abstract:
Large Language Models (LLMs) represent a significant stride toward Artificial General Intelligence. As scaling laws underscore the potential of increasing model sizes, the academic community has intensified its investigations into LLMs with capacities exceeding 50 billion parameters. This technical report builds on our prior work with Tele-FLM (also known as FLM-2), a publicly available 52-billion-parameter model. We delve into two primary areas: we first discuss our observation of Supervised Fine-tuning (SFT) on Tele-FLM-52B, which supports the "less is more" approach for SFT data construction; second, we demonstrate our experiments and analyses on the best practices for progressively growing a model from 52 billion to 102 billion, and subsequently to 1 trillion parameters. We will open-source a 1T model checkpoint, namely Tele-FLM-1T, to advance further training and research.
Submitted 2 July, 2024;
originally announced July 2024.
-
Foster Adaptivity and Balance in Learning with Noisy Labels
Authors:
Mengmeng Sheng,
Zeren Sun,
Tao Chen,
Shuchao Pang,
Yucheng Wang,
Yazhou Yao
Abstract:
Label noise is ubiquitous in real-world scenarios, posing a practical challenge to supervised models because it hurts the generalization performance of deep neural networks. Existing methods primarily employ the sample selection paradigm and usually rely on dataset-dependent prior knowledge (e.g., a pre-defined threshold) to cope with label noise, inevitably degrading the adaptivity. Moreover, existing methods tend to neglect the class balance in selecting samples, leading to biased model performance. To this end, we propose a simple yet effective approach named SED to deal with label noise in a Self-adaptivE and class-balanceD manner. Specifically, we first design a novel sample selection strategy to empower self-adaptivity and class balance when identifying clean and noisy data. A mean-teacher model is then employed to correct labels of noisy samples. Subsequently, we propose a self-adaptive and class-balanced sample re-weighting mechanism to assign different weights to detected noisy samples. Finally, we additionally employ consistency regularization on selected clean samples to improve model generalization performance. Extensive experimental results on synthetic and real-world datasets demonstrate the effectiveness and superiority of our proposed method. The source code has been made available at https://github.com/NUST-Machine-Intelligence-Laboratory/SED.
Submitted 2 July, 2024;
originally announced July 2024.
-
Knowledge Transfer with Simulated Inter-Image Erasing for Weakly Supervised Semantic Segmentation
Authors:
Tao Chen,
XiRuo Jiang,
Gensheng Pei,
Zeren Sun,
Yucheng Wang,
Yazhou Yao
Abstract:
Though adversarial erasing has prevailed in weakly supervised semantic segmentation to help activate integral object regions, existing approaches still suffer from the dilemma of under-activation and over-expansion due to the difficulty in determining when to stop erasing. In this paper, we propose a Knowledge Transfer with Simulated Inter-Image Erasing (KTSE) approach for weakly supervised semantic segmentation to alleviate the above problem. In contrast to existing erasing-based methods that remove the discriminative part for more object discovery, we propose a simulated inter-image erasing scenario to weaken the original activation by introducing extra object information. Then, object knowledge is transferred from the anchor image to the consequent less activated localization map to strengthen network localization ability. Considering the adopted bidirectional alignment will also weaken the anchor image activation if appropriate constraints are missing, we propose a self-supervised regularization module to maintain the reliable activation in discriminative regions and improve the inter-class object boundary recognition for complex images with multiple categories of objects. In addition, we resort to intra-image erasing and propose a multi-granularity alignment module to gently enlarge the object activation to boost the object knowledge transfer. Extensive experiments and ablation studies on PASCAL VOC 2012 and COCO datasets demonstrate the superiority of our proposed approach. Source codes and models are available at https://github.com/NUST-Machine-Intelligence-Laboratory/KTSE.
Submitted 2 July, 2024;
originally announced July 2024.
-
The Inverted 3-Sum Box: General Formulation and Quantum Information Theoretic Optimality
Authors:
Yuhang Yao,
Syed A. Jafar
Abstract:
The $N$-sum box protocol specifies a class of $\mathbb{F}_d$ linear functions $f(W_1,\cdots,W_K)=V_1W_1+V_2W_2+\cdots+V_KW_K\in\mathbb{F}_d^{m\times 1}$ that can be computed at information theoretically optimal communication cost (minimum number of qudits $Δ_1,\cdots,Δ_K$ sent by the transmitters Alice$_1$, Alice$_2$,$\cdots$, Alice$_K$, respectively, to the receiver, Bob, per computation instance) over a noise-free quantum multiple access channel (QMAC), when the input data streams $W_k\in\mathbb{F}_d^{m_k\times 1}, k\in[K]$, originate at the distributed transmitters, who share quantum entanglement in advance but are not otherwise allowed to communicate with each other. In prior work this set of optimally computable functions is identified in terms of a strong self-orthogonality (SSO) condition on the transfer function of the $N$-sum box. In this work we consider an `inverted' scenario, where instead of a feasible $N$-sum box transfer function, we are given an arbitrary $\mathbb{F}_d$ linear function, i.e., arbitrary matrices $V_k\in\mathbb{F}_d^{m\times m_k}$ are specified, and the goal is to characterize the set of all feasible communication cost tuples $(Δ_1,\cdots,Δ_K)$, not just based on $N$-sum box protocols, but across all possible quantum coding schemes. As our main result, we fully solve this problem for $K=3$ transmitters ($K\geq 4$ settings remain open). Coding schemes based on the $N$-sum box protocol (along with elementary ideas such as treating qudits as classical dits, time-sharing and batch-processing) are shown to be information theoretically optimal in all cases. As an example, in the symmetric case where rk$(V_1)$=rk$(V_2)$=rk$(V_3) \triangleq r_1$, rk$([V_1, V_2])$=rk$([V_2, V_3])$=rk$([V_3, V_1])\triangleq r_2$, and rk$([V_1, V_2, V_3])\triangleq r_3$ (rk = rank), the minimum total-download cost is $\max \{1.5r_1 + 0.75(r_3 - r_2), r_3\}$.
Submitted 1 July, 2024;
originally announced July 2024.
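The symmetric-case expression quoted in the abstract above, $\max \{1.5r_1 + 0.75(r_3 - r_2), r_3\}$, can be evaluated directly. The following is a minimal sketch only: the function name `min_total_download_cost` and the sample rank values are hypothetical and do not come from the paper.

```python
def min_total_download_cost(r1: float, r2: float, r3: float) -> float:
    """Evaluate max{1.5*r1 + 0.75*(r3 - r2), r3} for the symmetric K=3 case.

    r1: common rank of each single matrix V_k
    r2: common rank of each pairwise concatenation [V_i, V_j]
    r3: rank of the full concatenation [V_1, V_2, V_3]
    """
    return max(1.5 * r1 + 0.75 * (r3 - r2), r3)


if __name__ == "__main__":
    # Hypothetical rank values, chosen only to exercise both branches of the max.
    print(min_total_download_cost(2, 3, 4))
    print(min_total_download_cost(4, 5, 5))
```

The first branch of the maximum dominates when the individual ranks are large relative to the joint rank, and the second branch, $r_3$, dominates otherwise.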
-
Adaptive variational quantum computing approaches for Green's functions and nonlinear susceptibilities
Authors:
Martin Mootz,
Thomas Iadecola,
Yong-Xin Yao
Abstract:
We present and benchmark quantum computing approaches for calculating real-time single-particle Green's functions and nonlinear susceptibilities of Hamiltonian systems. The approaches leverage adaptive variational quantum algorithms for state preparation and propagation. Using automatically generated compact circuits, the dynamical evolution is performed over sufficiently long times to achieve adequate frequency resolution of the response functions. We showcase accurate Green's function calculations using a statevector simulator for Fermi-Hubbard chains of 4 and 6 sites, with maximal circuit depth of 65 and 424 layers, respectively. Additionally, we consider an antiferromagnetic quantum spin-1 model that incorporates the Dzyaloshinskii-Moriya interaction to illustrate calculations of the third-order nonlinear susceptibilities, which can be measured in two-dimensional coherent spectroscopy experiments. These results demonstrate that real-time approaches using adaptive parameterized circuits to evaluate linear and nonlinear response functions can be feasible with near-term quantum processors.
Submitted 1 July, 2024;
originally announced July 2024.
-
Flat bands and distinct density wave orders in correlated Kagome superconductor CsCr$_3$Sb$_5$
Authors:
Shuting Peng,
Yulei Han,
Yongkai Li,
Jianchang Shen,
Yu Miao,
Yang Luo,
Linwei Huai,
Zhipeng Ou,
Hongyu Li,
Ziji Xiang,
Zhengtai Liu,
Dawei Shen,
Makoto Hashimoto,
Donghui Lu,
Yugui Yao,
Zhenhua Qiao,
Zhiwei Wang,
Junfeng He
Abstract:
Kagome metal CsV$_3$Sb$_5$ has attracted much recent attention due to the coexistence of multiple exotic orders and the associated proposals to mimic unconventional high temperature superconductors. Nevertheless, magnetism and strong electronic correlations, two essential ingredients for unconventional superconductivity, are absent in this V-based Kagome metal. CsCr$_3$Sb$_5$ is a newly discovered Cr-based parallel of CsV$_3$Sb$_5$, in which magnetism appears with charge density wave and superconductivity in different temperature and pressure regions. Enhanced electronic correlations are also suggested by theoretical proposals due to the calculated flat bands. Here, we report angle-resolved photoemission measurements and first-principles calculations on this new material system. Electron energy bands and the associated orbitals are resolved. Flat bands are observed near the Fermi level. Doping-dependent measurements on Cs(Cr$_x$V$_{1-x}$)$_3$Sb$_5$ reveal a gradually enhanced band renormalization from CsV$_3$Sb$_5$ to CsCr$_3$Sb$_5$, accompanied by distinct spatial symmetry breaking states in the phase diagram.
Submitted 26 June, 2024; v1 submitted 25 June, 2024;
originally announced June 2024.
-
Indications of superconductivities in blend of variant apatite and covellite
Authors:
Hongyang Wang,
Yijing Zhao,
Hao Wu,
Ling Wang,
Zhixing Wu,
Zhihui Geng,
Jiewen Xiao,
Weiwei Xue,
Shufeng Ye,
Ning Chen,
Xianfeng Qiao,
Yao Yao
Abstract:
Through heavily doping sulfur into an apatite framework, we synthesize a new blend mainly comprising variant apatite and covellite (copper sulfide). Magnetic measurements show that significant diamagnetism appears at around 260 K and drops dramatically below 30 K, implying the coexistence of two superconducting phases. The upper critical magnetic field is larger than 1000 Oe at 250 K. Electric measurements show that the current-voltage curves deviate from the normal linear lineshape, suggesting the presence of a zero-resistance effect, and the critical current is around 50 $μ$A at 140 K. These exotic magnetic and electric features strongly indicate that these two components, variant apatite and covellite, individually trigger two superconducting phases at near-room and low temperatures.
Submitted 25 June, 2024;
originally announced June 2024.
-
Adaptive Payoff-driven Interaction in Networked Snowdrift Games
Authors:
Xiaojin Xiong,
Yichao Yao,
Minyu Feng,
Manuel Chica
Abstract:
In social dilemmas, most interactions are transient and susceptible to restructuring, leading to continuous changes in social networks over time. Typically, agents assess the rewards of their current interactions and adjust their connections to optimize outcomes. In this paper, we introduce an adaptive network model in the snowdrift game to examine dynamic levels of cooperation and network topology, involving the potential for both the termination of existing connections and the establishment of new ones. In particular, we define the agent's asymmetric disassociation tendency toward their neighbors, which fundamentally determines the probability of edge dismantlement. The mechanism allows agents to selectively sever and rewire their connections to alternative individuals to refine partnerships. Our findings reveal that adaptive networks are particularly effective in promoting a robust evolution toward states of either pure cooperation or complete defection, especially under conditions of extreme cost-benefit ratios, as compared to static network models. Moreover, the dynamic restructuring of connections and the distribution of network degrees among agents are closely linked to the levels of cooperation in stationary states. Specifically, cooperators tend to seek broader neighborhoods when confronted with the invasion of multiple defectors.
Submitted 24 June, 2024;
originally announced June 2024.
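For readers unfamiliar with the snowdrift game referenced in the abstract above, the sketch below encodes the standard textbook payoff matrix (benefit `b`, shared cost `c`, with `b > c > 0`) and the well-mixed equilibrium cooperator fraction. This is an assumption for illustration: the abstract does not state the paper's payoff parameterization, and the code does not model the adaptive rewiring mechanism the paper introduces. The function names are hypothetical.

```python
def snowdrift_payoff(me: str, other: str, b: float, c: float) -> float:
    """Payoff to `me` in one round of the textbook snowdrift game.

    'C' = cooperate, 'D' = defect. Two cooperators split the cost c;
    a lone cooperator pays it in full; both receive benefit b if anyone cooperates.
    """
    if me == "C" and other == "C":
        return b - c / 2.0
    if me == "C" and other == "D":
        return b - c
    if me == "D" and other == "C":
        return b
    return 0.0  # mutual defection: nobody clears the snowdrift


def mixed_equilibrium_cooperation(b: float, c: float) -> float:
    """Equilibrium cooperator fraction in a well-mixed population.

    Equals 1 - r, where r = c / (2b - c) is the cost-to-net-benefit ratio.
    """
    r = c / (2.0 * b - c)
    return 1.0 - r


if __name__ == "__main__":
    b, c = 1.0, 0.5  # hypothetical values with b > c > 0
    x = mixed_equilibrium_cooperation(b, c)
    # At equilibrium, cooperators and defectors earn the same expected payoff.
    pay_c = x * snowdrift_payoff("C", "C", b, c) + (1 - x) * snowdrift_payoff("C", "D", b, c)
    pay_d = x * snowdrift_payoff("D", "C", b, c) + (1 - x) * snowdrift_payoff("D", "D", b, c)
    print(x, pay_c, pay_d)
```

In this static baseline, cooperation and defection coexist for any intermediate cost-to-benefit ratio, which is the reference point against which the abstract contrasts adaptive networks.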
-
Quantum geometry embedded in unitarity of evolution: revealing its impacts as quantum oscillation and dephasing in spin resonance and crystal bands
Authors:
B. Q. Song,
J. D. H. Smith,
T. Jiang,
Y. X. Yao,
J. Wang
Abstract:
Quantum Hall effects provide intuitive ways of revealing the topology in crystals, i.e., each quantized "step" represents a distinct topological state. Here, we seek a counterpart for "visualizing" quantum geometry, which is a broader concept. We show how geometry emerges in quantum systems as an intrinsic consequence of unitary evolution, independent of specific details or approximations, suggesting quantum geometry may have widespread applicability. Indeed, we exemplify geometric observables, such as oscillation and dephasing, in spin and band scenarios. These phenomena are robust owing to the continuity of geometry, and can be tuned by geometric parameters. Anomalies, supported by both analytic and numerical solutions, underscore the advantages of adopting a geometric perspective, potentially yielding distinguishable experimental signatures.
Submitted 22 June, 2024;
originally announced June 2024.
-
CPL effective dark energy from the backreaction effect
Authors:
Yan-Hong Yao,
Xin-He Meng
Abstract:
In this paper, we interpret dark energy as an effect caused by small-scale inhomogeneities of the universe, using the spatial averaging approach of Buchert. The model considered here adopts the Chevallier-Polarski-Linder (CPL) parameterization of the equation of state of the effective perfect fluid from the backreaction effect. Thanks to the effective geometry introduced by Larena et al. \cite{larena2009testing} in their previous work, we confront this backreaction model with the latest type Ia supernova and Hubble parameter observations, obtaining results that reveal the difference between the Friedmann-Lemaître-Robertson-Walker model and the backreaction model.
Submitted 30 May, 2024;
originally announced June 2024.
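The Chevallier-Polarski-Linder form named in the abstract above is conventionally written $w(a) = w_0 + w_a(1 - a)$ with scale factor $a = 1/(1+z)$. The sketch below evaluates that standard form; the function name `cpl_w` and the sample parameter values are hypothetical and are not fitted values from the paper, which applies the parameterization to the effective backreaction fluid rather than to a fundamental dark energy component.

```python
def cpl_w(z: float, w0: float, wa: float) -> float:
    """Standard CPL equation of state w(a) = w0 + wa * (1 - a), with a = 1 / (1 + z)."""
    a = 1.0 / (1.0 + z)  # scale factor at redshift z
    return w0 + wa * (1.0 - a)


if __name__ == "__main__":
    # Hypothetical parameters, for illustration only.
    w0, wa = -1.0, 0.5
    for z in (0.0, 0.5, 1.0, 2.0):
        print(z, cpl_w(z, w0, wa))
```

By construction $w$ equals $w_0$ today ($z = 0$) and approaches $w_0 + w_a$ at high redshift, which is why the two parameters are easy to interpret when confronting a model with supernova and Hubble parameter data.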
-
MR-BEN: A Comprehensive Meta-Reasoning Benchmark for Large Language Models
Authors:
Zhongshen Zeng,
Yinhong Liu,
Yingjia Wan,
Jingyao Li,
Pengguang Chen,
Jianbo Dai,
Yuxuan Yao,
Rongwu Xu,
Zehan Qi,
Wanru Zhao,
Linling Shen,
Jianqiao Lu,
Haochen Tan,
Yukang Chen,
Hao Zhang,
Zhan Shi,
Bailin Wang,
Zhijiang Guo,
Jiaya Jia
Abstract:
Large language models (LLMs) have shown increasing capability in problem-solving and decision-making, largely based on the step-by-step chain-of-thought reasoning processes. However, it has been increasingly challenging to evaluate the reasoning capability of LLMs. Concretely, existing outcome-based benchmarks are beginning to saturate and are becoming less sufficient for monitoring progress. To this end, we present a process-based benchmark MR-BEN that demands a meta-reasoning skill, where LLMs are asked to locate and analyse potential errors in automatically generated reasoning steps. MR-BEN is a comprehensive benchmark comprising 5,975 questions collected from human experts, covering various subjects such as physics, chemistry, logic, coding, and more. Through our designed metrics for assessing meta-reasoning on this benchmark, we identify interesting limitations and weaknesses of current LLMs (open-source and closed-source models). For example, open-source models are seemingly comparable to GPT-4 on outcome-based benchmarks, but they lag far behind on our benchmark, revealing the underlying reasoning capability gap between them. Our dataset and code are available at https://randolph-zeng.github.io/Mr-Ben.github.io/.
Submitted 19 June, 2024;
originally announced June 2024.
-
Microscopic theory of magnetoresistance in ferromagnetic materials
Authors:
X. -P. Zhang,
X. Wang,
Y. Yao
Abstract:
The magnetoresistance (MR) effect, which stems from the spin-exchange coupling between local moments and itinerant electrons in magnetic materials, is a challenging many-body and open-quantum problem. Here, we develop a comprehensive microscopic theory of MR from an open-quantum system perspective. The theory not only predicts the magnetic field and temperature dependencies of MR which are related to spin relaxation time and spin-exchange field but also obtains the universal cosine-square law of anisotropic MR that microscopically elucidates diverse MR effects from the magnon-induced spin flip, anisotropic spin relaxation, and Hanle spin precession of itinerant electrons. Moreover, we reveal fruitful behaviors of the MR effect that enable the simple detection of the microscopic spin-exchange coupling through an electrical approach. Our theory contributes to a deeper understanding of the fundamental physics underlying MR and provides insights for experiments involving magnetic materials.
Submitted 19 June, 2024;
originally announced June 2024.
-
PRESTO: Progressive Pretraining Enhances Synthetic Chemistry Outcomes
Authors:
He Cao,
Yanjun Shao,
Zhiyuan Liu,
Zijing Liu,
Xiangru Tang,
Yuan Yao,
Yu Li
Abstract:
Multimodal Large Language Models (MLLMs) have seen growing adoption across various scientific disciplines. These advancements encourage the investigation of molecule-text modeling within synthetic chemistry, a field dedicated to designing and conducting chemical reactions to synthesize new compounds with desired properties and applications. Current approaches, however, often neglect the critical role of multiple molecule graph interaction in understanding chemical reactions, leading to suboptimal performance in synthetic chemistry tasks. This study introduces PRESTO (Progressive Pretraining Enhances Synthetic Chemistry Outcomes), a new framework that bridges the molecule-text modality gap by integrating a comprehensive benchmark of pretraining strategies and dataset configurations. It progressively improves multimodal LLMs through cross-modal alignment and multi-graph understanding. Our extensive experiments demonstrate that PRESTO offers competitive results in downstream synthetic chemistry tasks. The code can be found at https://github.com/IDEA-XL/PRESTO.
Submitted 18 June, 2024;
originally announced June 2024.
-
An extrapolation-driven network architecture for physics-informed deep learning
Authors:
Yong Wang,
Yanzhong Yao,
Zhiming Gao
Abstract:
Deep learning with physics-informed neural networks (PINNs) has emerged as a highly popular and effective approach for solving partial differential equations(PDEs). In this paper, we first investigate the extrapolation capability of the PINN method for time-dependent PDEs. Taking advantage of this extrapolation property, we can generalize the training result obtained in the time subinterval to the large interval by adding a correction term to the network parameters of the subinterval. The correction term is determined by further training with the sample points in the added subinterval. Secondly, by designing an extrapolation control function with special characteristics and combining it with the correction term, we construct a new neural network architecture whose network parameters are coupled with the time variable, which we call the extrapolation-driven network architecture. Based on this architecture, using a single neural network, we can obtain the overall PINN solution of the whole domain with the following two characteristics: (1) it completely inherits the local solution of the interval obtained from the previous training, (2) at the interval node, it strictly maintains the continuity and smoothness that the true solution has. The extrapolation-driven network architecture allows us to divide a large time domain into multiple subintervals and solve the time-dependent PDEs one by one in chronological order. This training scheme respects the causality principle and effectively overcomes the difficulties of the conventional PINN method in solving the evolution equation on a large time domain. Numerical experiments verify the performance of our proposed method.
Submitted 21 June, 2024; v1 submitted 18 June, 2024;
originally announced June 2024.
-
JEN-1 DreamStyler: Customized Musical Concept Learning via Pivotal Parameters Tuning
Authors:
Boyu Chen,
Peike Li,
Yao Yao,
Alex Wang
Abstract:
Large models for text-to-music generation have achieved significant progress, facilitating the creation of high-quality and varied musical compositions from provided text prompts. However, input text prompts may not precisely capture user requirements, particularly when the objective is to generate music that embodies a specific concept derived from a designated reference collection. In this paper, we propose a novel method for customized text-to-music generation, which can capture the concept from a two-minute reference music and generate a new piece of music conforming to the concept. We achieve this by fine-tuning a pretrained text-to-music model using the reference music. However, directly fine-tuning all parameters leads to overfitting issues. To address this problem, we propose a Pivotal Parameters Tuning method that enables the model to assimilate the new concept while preserving its original generative capabilities. Additionally, we identify a potential concept conflict when introducing multiple concepts into the pretrained model. We present a concept enhancement strategy to distinguish multiple concepts, enabling the fine-tuned model to generate music incorporating either individual or multiple concepts simultaneously. Since we are the first to work on the customized music generation task, we also introduce a new dataset and evaluation protocol for the new task. Our proposed Jen1-DreamStyler outperforms several baselines in both qualitative and quantitative evaluations. Demos will be available at https://www.jenmusic.ai/research#DreamStyler.
Submitted 18 June, 2024;
originally announced June 2024.
-
Intrinsic high-fidelity spin polarization of charged vacancies in hexagonal boron nitride
Authors:
Wonjae Lee,
Vincent S. Liu,
Zhelun Zhang,
Sangha Kim,
Ruotian Gong,
Xinyi Du,
Khanh Pham,
Thomas Poirier,
Zeyu Hao,
James H. Edgar,
Philip Kim,
Chong Zu,
Emily J. Davis,
Norman Y. Yao
Abstract:
The negatively charged boron vacancy ($\mathrm{V}_{\mathrm{B}}^-$) in hexagonal boron nitride (hBN) has garnered significant attention among defects in two-dimensional materials. This owes, in part, to its deterministic generation, well-characterized atomic structure, and optical polarizability at room temperature. We investigate the latter through extensive measurements probing both the ground and excited state polarization dynamics. We develop a semiclassical model based on these measurements that predicts a near-unity degree of spin polarization, surpassing other solid-state spin defects under ambient conditions. Building upon our model, we include the presence of nuclear spin degrees of freedom adjacent to the $\mathrm{V}_{\mathrm{B}}^-$ and perform a comprehensive set of Lindbladian numerics to investigate the hyperfine-induced polarization of the nuclear spins. Our simulations predict a number of important features that emerge as a function of magnetic field which are borne out by experiment.
Submitted 17 June, 2024;
originally announced June 2024.
-
GUICourse: From General Vision Language Models to Versatile GUI Agents
Authors:
Wentong Chen,
Junbo Cui,
Jinyi Hu,
Yujia Qin,
Junjie Fang,
Yue Zhao,
Chongyi Wang,
Jun Liu,
Guirong Chen,
Yupeng Huo,
Yuan Yao,
Yankai Lin,
Zhiyuan Liu,
Maosong Sun
Abstract:
Utilizing Graphical User Interfaces (GUIs) for human-computer interaction is essential for accessing a wide range of digital tools. Recent advancements in Vision Language Models (VLMs) highlight the compelling potential to develop versatile agents to help humans complete GUI navigation tasks. However, current VLMs are challenged in terms of fundamental abilities (OCR and grounding) and GUI knowledge (the functions and control methods of GUI elements), preventing them from becoming practical GUI agents. To solve these challenges, we contribute GUICourse, a suite of datasets to train visual-based GUI agents from general VLMs. First, we introduce the GUIEnv dataset to strengthen the OCR and grounding capabilities of VLMs. Then, we introduce the GUIAct and GUIChat datasets to enrich their knowledge of GUI components and interactions. Experiments demonstrate that our GUI agents have better performance on common GUI tasks than their baseline VLMs. Even the small-size GUI agent (with 3.1B parameters) can still work well on single-step and multi-step GUI tasks. Finally, we analyze the different varieties in the training stage of this agent by ablation study. Our source code and datasets are released at https://github.com/yiye3/GUICourse.
Submitted 17 June, 2024;
originally announced June 2024.
-
Planar Hall Plateau in Magnetic Weyl Semimetals
Authors:
Lei Li,
Chaoxi Cui,
Run-Wu Zhang,
Zhi-Ming Yu,
Yugui Yao
Abstract:
Despite the rapid progress in the study of the planar Hall effect (PHE) in recent years, all previous works only showed that the PHE is connected to local geometric quantities, such as Berry curvature. Here, for the first time, we point out that the PHE in magnetic Weyl semimetals is directly related to a global quantity, namely, the Chern number of the Weyl point. This leads to the remarkable consequence that the PHE observation predicted here is robust against many system details, including the Fermi energy. The main difference between non-magnetic and magnetic Weyl points is that the latter break time-reversal symmetry T, thus generally possessing an energy tilt. Via semiclassical Boltzmann theory, we investigate the PHE in generic magnetic Weyl models with energy tilt and arbitrary Chern number. We find that by aligning the magnetic and electric fields in the same direction, the trace of the PHE conductivity contributed from Berry curvature and orbital moment is proportional to the Chern number and the energy tilt of the Weyl points, resulting in a previously undiscovered quantized PHE plateau as the Fermi energy is varied. We further confirm the existence of PHE plateaus in a more realistic lattice model without T symmetry. By proposing a new quantized physical quantity, our work not only provides a new tool for extracting the topological character of the Weyl points but also suggests that the interplay between topology and magnetism can give rise to intriguing physics.
Submitted 17 June, 2024;
originally announced June 2024.
-
All-electron $BSE@GW$ method with Numeric Atom-Centered Orbitals for Extended Systems
Authors:
Ruiyi Zhou,
Yi Yao,
Volker Blum,
Xinguo Ren,
Yosuke Kanai
Abstract:
Green's function theory has emerged as a powerful many-body approach not only in condensed matter physics but also in quantum chemistry in recent years. We have developed a new all-electron implementation of the BSE@GW formalism using numeric atom-centered orbital basis sets (Liu et al., J. Chem. Phys. 152, 044105 (2020)). We present our recent developments in implementing this formalism for extended systems with periodic boundary conditions. We discuss its numerical implementation and various convergence tests pertaining to numerical atom-centered orbitals, auxiliary basis sets for the resolution-of-identity formalism, and Brillouin zone sampling. Proof-of-principle examples are presented to compare with other formalisms, illustrating the new all-electron BSE@GW method for extended systems.
Submitted 16 June, 2024;
originally announced June 2024.
-
Toward Optimal LLM Alignments Using Two-Player Games
Authors:
Rui Zheng,
Hongyi Guo,
Zhihan Liu,
Xiaoying Zhang,
Yuanshun Yao,
Xiaojun Xu,
Zhaoran Wang,
Zhiheng Xi,
Tao Gui,
Qi Zhang,
Xuanjing Huang,
Hang Li,
Yang Liu
Abstract:
The standard Reinforcement Learning from Human Feedback (RLHF) framework primarily focuses on optimizing the performance of large language models using pre-collected prompts. However, collecting prompts that provide comprehensive coverage is both tedious and challenging, and often fails to include scenarios that LLMs need to improve on the most. In this paper, we investigate alignment through the lens of two-agent games, involving iterative interactions between an adversarial and a defensive agent. The adversarial agent's task at each step is to generate prompts that expose the weakness of the defensive agent. In return, the defensive agent seeks to improve its responses to these newly identified prompts it struggled with, based on feedback from the reward model. We theoretically demonstrate that this iterative reinforcement learning optimization converges to a Nash Equilibrium for the game induced by the agents. Experimental results in safety scenarios demonstrate that learning in such a competitive environment not only fully trains agents but also leads to policies with enhanced generalization capabilities for both adversarial and defensive agents.
Submitted 16 June, 2024;
originally announced June 2024.
-
TorchOpera: A Compound AI System for LLM Safety
Authors:
Shanshan Han,
Yuhang Yao,
Zijian Hu,
Dimitris Stripelis,
Zhaozhuo Xu,
Chaoyang He
Abstract:
We introduce TorchOpera, a compound AI system for enhancing the safety and quality of prompts and responses for Large Language Models. TorchOpera ensures that all user prompts are safe, contextually grounded, and effectively processed, while enhancing LLM responses to be relevant and high quality. TorchOpera utilizes the vector database for contextual grounding, rule-based wrappers for flexible modifications, and specialized mechanisms for detecting and adjusting unsafe or incorrect content. We also provide a view of the compound AI system to reduce the computational cost. Extensive experiments show that TorchOpera ensures the safety, reliability, and applicability of LLMs in real-world settings while maintaining the efficiency of LLM responses.
Submitted 16 June, 2024;
originally announced June 2024.
-
Hallo: Hierarchical Audio-Driven Visual Synthesis for Portrait Image Animation
Authors:
Mingwang Xu,
Hui Li,
Qingkun Su,
Hanlin Shang,
Liwei Zhang,
Ce Liu,
Jingdong Wang,
Yao Yao,
Siyu Zhu
Abstract:
The field of portrait image animation, driven by speech audio input, has experienced significant advancements in the generation of realistic and dynamic portraits. This research delves into the complexities of synchronizing facial movements and creating visually appealing, temporally consistent animations within the framework of diffusion-based methodologies. Moving away from traditional paradigms that rely on parametric models for intermediate facial representations, our innovative approach embraces the end-to-end diffusion paradigm and introduces a hierarchical audio-driven visual synthesis module to enhance the precision of alignment between audio inputs and visual outputs, encompassing lip, expression, and pose motion. Our proposed network architecture seamlessly integrates diffusion-based generative models, a UNet-based denoiser, temporal alignment techniques, and a reference network. The proposed hierarchical audio-driven visual synthesis offers adaptive control over expression and pose diversity, enabling more effective personalization tailored to different identities. Through a comprehensive evaluation that incorporates both qualitative and quantitative analyses, our approach demonstrates obvious enhancements in image and video quality, lip synchronization precision, and motion diversity. Further visualization and access to the source code can be found at: https://fudan-generative-vision.github.io/hallo.
Submitted 16 June, 2024; v1 submitted 13 June, 2024;
originally announced June 2024.
-
Constraints on Ultra Heavy Dark Matter Properties from Dwarf Spheroidal Galaxies with LHAASO Observations
Authors:
Zhen Cao,
F. Aharonian,
Q. An,
Axikegu,
Y. X. Bai,
Y. W. Bao,
D. Bastieri,
X. J. Bi,
Y. J. Bi,
J. T. Cai,
Q. Cao,
W. Y. Cao,
Zhe Cao,
J. Chang,
J. F. Chang,
A. M. Chen,
E. S. Chen,
Liang Chen,
Lin Chen,
Long Chen,
M. J. Chen,
M. L. Chen,
Q. H. Chen,
S. H. Chen,
S. Z. Chen
, et al. (255 additional authors not shown)
Abstract:
In this work we search for signals generated by ultra-heavy dark matter in data from the Large High Altitude Air Shower Observatory (LHAASO). We look for possible gamma rays from dark matter annihilation or decay in 16 dwarf spheroidal galaxies in the field of view of LHAASO. Dwarf spheroidal galaxies are among the most promising targets for indirect detection of dark matter, as they have low fluxes of astrophysical $γ$-ray background yet contain large amounts of dark matter. In an analysis of more than 700 days of LHAASO observational data, no significant dark matter signal from 1 TeV to 1 EeV is detected. Accordingly, we derive the most stringent constraints on the ultra-heavy dark matter annihilation cross-section up to EeV. The constraints on the lifetime of dark matter in decay mode are also derived.
Submitted 12 June, 2024;
originally announced June 2024.
-
Reconfigurable, Multifunctional Origami Electronic Membranes for Mechanical and Environmental Sensing
Authors:
Yao Yao,
Guanghui Li,
Xin Ning
Abstract:
This work introduces a concept of origami electronic membranes that leverages the design and fabrication of flexible electronics and the mechanical behavior of engineering origami to achieve unique multifunctional, shape-reconfigurable, and adaptive membranes for mechanical and environmental sensing in benign and harsh conditions. This paper presents the materials, design, and fabrication methods for realizing six origami electronic membranes capable of reconfiguring planar or three-dimensional shapes based on the modified flasher, Kresling, Miura-ori, circular, letter, and Tachi-Miura origami patterns. These origami-based, thin-film flexible electronics can obtain both expansion and folding of their shapes, as well as transformation between different geometries. The origami electronic membranes can achieve mechanical and environmental sensing functions such as measuring motions, mechanical strains, temperatures, UV light, and humidity. The results reported here demonstrate the promise of combining engineering origami with flexible electronics to advance the state-of-the-art in multifunctional foldable and deployable electronics and systems.
Submitted 11 June, 2024;
originally announced June 2024.
-
Label Smoothing Improves Machine Unlearning
Authors:
Zonglin Di,
Zhaowei Zhu,
Jinghan Jia,
Jiancheng Liu,
Zafar Takhirov,
Bo Jiang,
Yuanshun Yao,
Sijia Liu,
Yang Liu
Abstract:
The objective of machine unlearning (MU) is to eliminate previously learned data from a model. However, it is challenging to strike a balance between computation cost and performance when using existing MU techniques. Taking inspiration from the influence of label smoothing on model confidence and differential privacy, we propose a simple gradient-based MU approach that uses an inverse process of label smoothing. This work introduces UGradSL, a simple, plug-and-play MU approach that uses smoothed labels. We provide theoretical analyses demonstrating why properly introducing label smoothing improves MU performance. We conducted extensive experiments on six datasets of various sizes and different modalities, demonstrating the effectiveness and robustness of our proposed method. The consistent improvement in MU performance is only at a marginal cost of additional computations. For instance, UGradSL improves over the gradient ascent MU baseline by 66% unlearning accuracy without sacrificing unlearning efficiency.
Submitted 11 June, 2024;
originally announced June 2024.
-
MLLMGuard: A Multi-dimensional Safety Evaluation Suite for Multimodal Large Language Models
Authors:
Tianle Gu,
Zeyang Zhou,
Kexin Huang,
Dandan Liang,
Yixu Wang,
Haiquan Zhao,
Yuanqi Yao,
Xingge Qiao,
Keqing Wang,
Yujiu Yang,
Yan Teng,
Yu Qiao,
Yingchun Wang
Abstract:
Powered by remarkable advancements in Large Language Models (LLMs), Multimodal Large Language Models (MLLMs) demonstrate impressive capabilities in manifold tasks. However, the practical application scenarios of MLLMs are intricate, exposing them to potential malicious instructions and thereby posing safety risks. While current benchmarks do incorporate certain safety considerations, they often lack comprehensive coverage and fail to exhibit the necessary rigor and robustness. For instance, the common practice of employing GPT-4V as both the evaluator and a model to be evaluated lacks credibility, as it tends to exhibit a bias toward its own responses. In this paper, we present MLLMGuard, a multidimensional safety evaluation suite for MLLMs, including a bilingual image-text evaluation dataset, inference utilities, and a lightweight evaluator. MLLMGuard's assessment comprehensively covers two languages (English and Chinese) and five important safety dimensions (Privacy, Bias, Toxicity, Truthfulness, and Legality), each with corresponding rich subtasks. Focusing on these dimensions, our evaluation dataset is primarily sourced from platforms such as social media, and it integrates text-based and image-based red teaming techniques with meticulous annotation by human experts. This can prevent inaccurate evaluation caused by data leakage when using open-source datasets and ensures the quality and challenging nature of our benchmark. Additionally, a fully automated lightweight evaluator termed GuardRank is developed, which achieves significantly higher evaluation accuracy than GPT-4. Our evaluation results across 13 advanced models indicate that MLLMs still have a substantial journey ahead before they can be considered safe and responsible.
Submitted 13 June, 2024; v1 submitted 11 June, 2024;
originally announced June 2024.
-
Simple smooth modules over the Ramond algebra and applications to vertex operator superalgebras
Authors:
Yulu Chen,
Ran Shen,
Yufeng Yao,
Kaiming Zhao
Abstract:
Simple smooth modules over the Virasoro algebra and over one of the super-Virasoro algebras, the Neveu-Schwarz algebra, have been classified. This problem remained unsolved for the other super-Virasoro algebra, called the Ramond algebra. In this paper, all simple smooth modules over the Ramond algebra are classified. More precisely, a simple smooth module over the Ramond algebra is either a simple highest weight module or isomorphic to an induced module from a simple module over a finite-dimensional solvable Lie superalgebra. As an application, we obtain all simple weak $ψ$-twisted modules over some vertex operator superalgebras.
Submitted 10 June, 2024;
originally announced June 2024.
-
Biderivations of Lie algebras
Authors:
Qiufan Chen,
Yufeng Yao,
Kaiming Zhao
Abstract:
In this paper, we first introduce the concepts of symmetric biderivation radicals and characteristic subalgebras of Lie algebras, and study their properties. Based on these results, we precisely determine the biderivations of some Lie algebras, including finite-dimensional simple Lie algebras over arbitrary fields of characteristic not $2$ or $3$, and the Witt algebras $\mathcal{W}^+_n$ over fields of characteristic $0$. As an application, commutative post-Lie algebra structures on the aforementioned Lie algebras are shown to be trivial.
Submitted 10 June, 2024;
originally announced June 2024.
-
Revisiting Non-Autoregressive Transformers for Efficient Image Synthesis
Authors:
Zanlin Ni,
Yulin Wang,
Renping Zhou,
Jiayi Guo,
Jinyi Hu,
Zhiyuan Liu,
Shiji Song,
Yuan Yao,
Gao Huang
Abstract:
The field of image synthesis is currently flourishing due to the advancements in diffusion models. While diffusion models have been successful, their computational intensity has prompted the pursuit of more efficient alternatives. As a representative work, non-autoregressive Transformers (NATs) have been recognized for their rapid generation. However, a major drawback of these models is their inferior performance compared to diffusion models. In this paper, we aim to re-evaluate the full potential of NATs by revisiting the design of their training and inference strategies. Specifically, we identify the complexities in properly configuring these strategies and indicate the possible sub-optimality in existing heuristic-driven designs. Recognizing this, we propose to go beyond existing methods by directly solving the optimal strategies in an automatic framework. The resulting method, named AutoNAT, advances the performance boundaries of NATs notably, and is able to perform comparably with the latest diffusion models at a significantly reduced inference cost. The effectiveness of AutoNAT is validated on four benchmark datasets, i.e., ImageNet-256 & 512, MS-COCO, and CC3M. Our code is available at https://github.com/LeapLabTHU/ImprovedNAT.
Submitted 8 June, 2024;
originally announced June 2024.
-
MatrixGate: A High-performance Data Ingestion Tool for Time-series Databases
Authors:
Shuhui Wang,
Zihan Sun,
Chaochen Hu,
Chao Li,
Yong Zhang,
Yandong Yao,
Hao Wang,
Chunxiao Xing
Abstract:
Recent years have seen massive time-series data generated in many areas. This new scenario brings new challenges, particularly in terms of data ingestion, where existing technologies struggle to handle such massive time-series data, leading to low loading speed and poor timeliness. To address these challenges, this paper presents MatrixGate, a new and efficient data ingestion approach for massive time-series data. MatrixGate implements both single-instance and multi-instance parallel procedures, which are based on its unique ingestion strategies. First, MatrixGate uses policies to tune the slots that are synchronized with segments to ingest data, which eliminates the cost of starting transactions and enhances efficiency. Second, multiple coroutines are responsible for transferring data, which can increase the degree of parallelism significantly. Third, lock-free queues are used to enable direct data transfer without the need for disk storage or lodging in the master instance. Experimental results on multiple datasets show that MatrixGate outperforms state-of-the-art methods by 3 to 100 times in loading speed, and cuts query latency by about 80%. Furthermore, MatrixGate scales out efficiently under a distributed architecture, achieving scalability of 86%.
Submitted 8 June, 2024;
originally announced June 2024.
-
Exact quantization of topological order parameter in SU($N$) spin models, $N$-ality transformation and ingappabilities
Authors:
Hang Su,
Yuan Yao,
Akira Furusaki
Abstract:
We show that the ground-state expectation value of the twisting operator is a topological order parameter for $\text{U}(1)$- and $\mathbb{Z}_{N}$-symmetric symmetry-protected topological (SPT) phases in one-dimensional ``spin'' systems: it is quantized in the thermodynamic limit and can be used to identify different SPT phases and to diagnose phase transitions among them. We prove that this (non-local) order parameter must take values in the $N$-th roots of unity, and its value can be changed by a generalized lattice translation acting as an $N$-ality transformation connecting distinct phases. This result also implies the Lieb-Schultz-Mattis (LSM) ingappability for SU($N$) spins if we further impose a general translation symmetry. Furthermore, our exact result for the order parameter of SPT phases can predict a large number of LSM ingappabilities by the general lattice translation. We also apply the $N$-ality property to provide an efficient way to construct possible multi-critical phase transitions starting from a single Hamiltonian with a unique gapped ground state.
Submitted 8 June, 2024;
originally announced June 2024.
-
Charge self-consistent density functional theory plus ghost rotationally-invariant slave-boson theory for correlated materials
Authors:
Tsung-Han Lee,
Corey Melnick,
Ran Adler,
Xue Sun,
Yongxin Yao,
Nicola Lanatà,
Gabriel Kotliar
Abstract:
We present a charge self-consistent density functional theory combined with the ghost-rotationally-invariant slave-boson (DFT+gRISB) formalism for studying correlated materials. This method is applied to SrVO$_3$ and NiO, representing prototypical correlated metals and charge-transfer insulators. For SrVO$_3$, we demonstrate that DFT+gRISB yields an accurate equilibrium volume and effective mass close to experimentally observed values. Regarding NiO, DFT+gRISB enables the simultaneous description of charge transfer and Mott-Hubbard bands, significantly enhancing the accuracy of the original DFT+RISB approach. Furthermore, the calculated equilibrium volume and spectral function reasonably agree with experimental observations.
Submitted 13 June, 2024; v1 submitted 7 June, 2024;
originally announced June 2024.
-
BugBlitz-AI: An Intelligent QA Assistant
Authors:
Yi Yao,
Jun Wang,
Yabai Hu,
Lifeng Wang,
Yi Zhou,
Jack Chen,
Xuming Gai,
Zhenming Wang,
Wenjun Liu
Abstract:
The evolution of software testing from manual to automated methods has significantly influenced quality assurance (QA) practices. However, challenges persist in post-execution phases, particularly in result analysis and reporting. Traditional post-execution validation phases require manual intervention for result analysis and report generation, leading to inefficiencies and potential development cycle delays. This paper introduces BugBlitz-AI, an AI-powered validation toolkit designed to enhance end-to-end test automation by automating result analysis and bug reporting processes. BugBlitz-AI leverages recent advancements in artificial intelligence to reduce the time-intensive tasks of manual result analysis and report generation, allowing QA teams to focus more on crucial aspects of product quality. By adopting BugBlitz-AI, organizations can advance automated testing practices and integrate AI into QA processes, ensuring higher product quality and faster time-to-market. The paper outlines BugBlitz-AI's architecture, discusses related work, details its quality enhancement strategies, and presents results demonstrating its effectiveness in real-world scenarios.
Submitted 17 May, 2024;
originally announced June 2024.