arXiv:2407.07418 [pdf, other]

Dynamics of asymmetrically deformed skyrmion driven by internal forces and strain force in a flower-shaped magnetic nanostructure

Authors: Zhen-Yu Tan, Ji-Pei Chen, Yu-Ke Shi, Yuan Chen, Ming-Hui Qin, Xing-Sen Gao, Jun-Ming Liu

Abstract: Magnetic skyrmions emerge as promising quasi-particles for encoding information in next-generation spintronic devices. Their innate flexibility in shape is essential for applications, although they have often been treated as ideal rigid particles. In this work, we investigated the voltage-controlled uniform strain mediated dynamics of deformed skyrmions in heterostructures with a flower-shaped magnetic nanostructure, using micromagnetic simulations. The simulated results revealed the possible states of an isolated skyrmion nucleated in the nanostructure, which can be mutually switched by applying suitable in-plane strain pulses. In addition, it was found that the skyrmion motions are driven by the emerging internal forces and strain force, which originate from the asymmetric deformation of skyrmion structures. Furthermore, an analytical model of deformed skyrmions was proposed to interpret the dependences of internal forces and strain force on the asymmetric deformation of the skyrmion, with some formulae derived for these forces in a semi-analytical approach. Further calculations based on these formulae verified the forces appearing in the skyrmion motion, with the resulting forces showing consistency with the simulated data. This suggested that our semi-analytical model successfully captures the main physics responsible for the motion of a deformed skyrmion in the nanostructure. Our work extends the understanding of the mechanics emerging in deformed skyrmions, and provides an effective approach for deterministic manipulation of deformed skyrmion motion via strain forces and internal forces, which may be instructive for the design of skyrmion-based spintronic devices.

Submitted 10 July, 2024; originally announced July 2024.

arXiv:2407.06984 [pdf, other]

Category-level Object Detection, Pose Estimation and Reconstruction from Stereo Images

Authors: Chuanrui Zhang, Yonggen Ling, Minglei Lu, Minghan Qin, Haoqian Wang

Abstract: We study the 3D object understanding task for manipulating everyday objects with different material properties (diffuse, specular, transparent and mixed). Existing monocular and RGB-D methods suffer from scale ambiguity due to missing or imprecise depth measurements. We present CODERS, a one-stage approach for Category-level Object Detection, pose Estimation and Reconstruction from Stereo images. The base of our pipeline is an implicit stereo matching module that combines stereo image features with 3D position information. Concatenating this module with the subsequent transform-decoder architecture leads to end-to-end learning of the multiple tasks required by robot manipulation. Our approach significantly outperforms all competing methods on the public TOD dataset. Furthermore, trained on simulated data, CODERS generalizes well to unseen category-level object instances in real-world robot manipulation experiments. Our dataset, code, and demos will be available on our project page.

Submitted 9 July, 2024; originally announced July 2024.

arXiv:2407.03202 [pdf, other]

Clifford Circuits Augmented Time-Dependent Variational Principle

Authors: Xiangjian Qian, Jiale Huang, Mingpu Qin

Abstract: The recently proposed Clifford Circuits Augmented Matrix Product States (CA-MPS) (arXiv:2405.09217) seamlessly augments Density Matrix Renormalization Group with Clifford circuits. In CA-MPS, the entanglement from stabilizers is transferred to the Clifford circuits, which can be easily handled according to the Gottesman-Knill theorem. As a result, MPS needs only to deal with the non-stabilizer entanglement, which greatly reduces the bond dimension and the resources required for the accurate simulation of many-body systems. In this work, we generalize CA-MPS to the framework of Time-Dependent Variational Principle (TDVP) for time evolution simulations. In this method, we apply Clifford circuits to the resulting MPS in each TDVP step with a two-site sweeping process similar to that in DMRG, aiming at reducing the entanglement entropy in the MPS, and the Hamiltonian is transformed accordingly using the chosen Clifford circuits. As in CA-MPS, the Clifford circuits do not increase the number of terms in the Hamiltonian, which makes the overhead of the new method very small. We test this method on both the XXZ chain and the two-dimensional Heisenberg model. The results show that the Clifford circuits augmented TDVP method can reduce the entanglement entropy in the time evolution process and hence makes the simulation reliable for longer times. The Clifford circuits augmented Time-Dependent Variational Principle provides a useful tool for the simulation of the time evolution of many-body systems in the future.

Submitted 3 July, 2024; originally announced July 2024.

arXiv:2407.02813 [pdf, other]

Data Overfitting for On-Device Super-Resolution with Dynamic Algorithm and Compiler Co-Design

Authors: Gen Li, Zhihao Shu, Jie Ji, Minghai Qin, Fatemeh Afghah, Wei Niu, Xiaolong Ma

Abstract: Deep neural networks (DNNs) are frequently employed in a variety of computer vision applications. An emerging trend in current video distribution systems is to take advantage of DNNs' overfitting properties to perform video resolution upscaling. By splitting videos into chunks and applying a super-resolution (SR) model to overfit each chunk, this scheme of SR models plus video chunks is able to replace traditional video transmission to enhance video quality and transmission efficiency. However, many models and chunks are needed to guarantee high performance, which leads to tremendous overhead on model switching and memory footprints at the user end. To resolve such problems, we propose a Dynamic Deep neural network assisted by a Content-Aware data processing pipeline to reduce the model number down to one (Dy-DCA), which helps promote performance while conserving computational resources. Additionally, to achieve real acceleration on the user end, we designed a framework that optimizes dynamic features (e.g., dynamic shapes, sizes, and control flow) in Dy-DCA to enable a series of compilation optimizations, including fused code generation, static execution planning, etc. By employing such techniques, our method achieves better PSNR and real-time performance (33 FPS) on an off-the-shelf mobile phone. Meanwhile, assisted by our compilation optimization, we achieve a 1.7$\times$ speedup while saving up to 1.61$\times$ memory consumption. Code is available at https://github.com/coulsonlee/Dy-DCA-ECCV2024.

Submitted 11 July, 2024; v1 submitted 3 July, 2024; originally announced July 2024.

Comments: ECCV2024

arXiv:2406.17417 [pdf, other]

Is the Valence Bond Solid state in $J_1$-$J_2$ Square Lattice Heisenberg Model Plaquette or Columnar?

Authors: Jiale Huang, Xiangjian Qian, Mingpu Qin

Abstract: We utilize Density Matrix Renormalization Group (DMRG) and Fully Augmented Matrix Product States (FAMPS) methods to investigate the Valence Bond Solid (VBS) phase in the $J_1$-$J_2$ square lattice Heisenberg model. To differentiate between the Columnar Valence Bond Solid (CVBS) and Plaquette Valence Bond Solid (PVBS) phases, we introduce an anisotropy $Δ_y$ in the nearest neighboring coupling in the $y$-direction, aiming at detecting the possible spontaneous rotational symmetry breaking in the VBS phase. In the calculations, we push the bond dimension to as large as $D = 25000$ in FAMPS, simulating systems at a maximum size of $14 \times 14$. With a careful extrapolation of the truncation errors and appropriate finite-size scaling, followed by finite $Δ_y$ scaling analysis of the VBS dimer order parameters, we identify the VBS phase as a PVBS type, meaning there is no spontaneous rotational symmetry breaking in the VBS phase. This study not only resolves the long-standing issue of the characterization of the VBS order in the $J_1$-$J_2$ square lattice Heisenberg model but also highlights the capabilities of FAMPS in the study of two-dimensional quantum many-body systems.

Submitted 25 June, 2024; originally announced June 2024.

arXiv:2406.14537 [pdf, other]

MacroHFT: Memory Augmented Context-aware Reinforcement Learning On High Frequency Trading

Authors: Chuqiao Zong, Chaojie Wang, Molei Qin, Lei Feng, Xinrun Wang, Bo An

Abstract: High-frequency trading (HFT), which executes algorithmic trading on short time scales, has recently occupied the majority of the cryptocurrency market. Besides traditional quantitative trading methods, reinforcement learning (RL) has become another appealing approach for HFT due to its strong ability to handle high-dimensional financial data and solve sophisticated sequential decision-making problems; e.g., hierarchical reinforcement learning (HRL) has shown promising performance on second-level HFT by training a router to select only one sub-agent from the agent pool to execute the current transaction. However, existing RL methods for HFT still have some defects: 1) standard RL-based trading agents suffer from the overfitting issue, preventing them from making effective policy adjustments based on financial context; 2) due to the rapid changes in market conditions, investment decisions made by an individual agent are usually one-sided and highly biased, which might lead to significant loss in extreme markets. To tackle these problems, we propose a novel Memory Augmented Context-aware Reinforcement learning method On HFT, a.k.a. MacroHFT, which consists of two training phases: 1) we first train multiple types of sub-agents with the market data decomposed according to various financial indicators, specifically market trend and volatility, where each agent owns a conditional adapter to adjust its trading policy according to market conditions; 2) then we train a hyper-agent to mix the decisions from these sub-agents and output a consistently profitable meta-policy to handle rapid market fluctuations, equipped with a memory mechanism to enhance the capability of decision-making. Extensive experiments on various cryptocurrency markets demonstrate that MacroHFT can achieve state-of-the-art performance on minute-level trading tasks.

Submitted 20 June, 2024; originally announced June 2024.

Comments: Accepted to KDD 2024

arXiv:2406.09098 [pdf, other]

SciKnowEval: Evaluating Multi-level Scientific Knowledge of Large Language Models

Authors: Kehua Feng, Keyan Ding, Weijie Wang, Xiang Zhuang, Zeyuan Wang, Ming Qin, Yu Zhao, Jianhua Yao, Qiang Zhang, Huajun Chen

Abstract: The burgeoning utilization of Large Language Models (LLMs) in scientific research necessitates advanced benchmarks capable of evaluating their understanding and application of scientific knowledge comprehensively. To address this need, we introduce the SciKnowEval benchmark, a novel framework that systematically evaluates LLMs across five progressive levels of scientific knowledge: studying extensively, inquiring earnestly, thinking profoundly, discerning clearly, and practicing assiduously. These levels aim to assess the breadth and depth of scientific knowledge in LLMs, including knowledge coverage, inquiry and exploration capabilities, reflection and reasoning abilities, ethics and safety considerations, as well as practice proficiency. Specifically, we take biology and chemistry as the two instances of SciKnowEval and construct a dataset encompassing 50K multi-level scientific problems and solutions. By leveraging this dataset, we benchmark 20 leading open-source and proprietary LLMs using zero-shot and few-shot prompting strategies. The results reveal that despite achieving state-of-the-art performance, the proprietary LLMs still have considerable room for improvement, particularly in addressing scientific computations and applications. We anticipate that SciKnowEval will establish a comprehensive standard for benchmarking LLMs in science research and discovery, and promote the development of LLMs that integrate scientific knowledge with strong safety awareness. The dataset and code are publicly available at https://github.com/hicai-zju/sciknoweval.

Submitted 13 June, 2024; originally announced June 2024.

Comments: 48 pages, 2 figures

arXiv:2405.20277 [pdf, other]

Pre-train and Refine: Towards Higher Efficiency in K-Agnostic Community Detection without Quality Degradation

Authors: Meng Qin, Chaorui Zhang, Yu Gao, Weixi Zhang, Dit-Yan Yeung

Abstract: Community detection (CD) is a classic graph inference task that partitions nodes of a graph into densely connected groups. While many CD methods have been proposed with either impressive quality or efficiency, balancing the two aspects remains a challenge. This study explores the potential of deep graph learning to achieve a better trade-off between the quality and efficiency of K-agnostic CD, where the number of communities K is unknown. We propose PRoCD (Pre-training & Refinement fOr Community Detection), a simple yet effective method that reformulates K-agnostic CD as binary node pair classification. PRoCD follows a pre-training & refinement paradigm inspired by recent advances in pre-training techniques. We first conduct the offline pre-training of PRoCD on small synthetic graphs covering various topology properties. Based on the inductive inference across graphs, we then generalize the pre-trained model (with frozen parameters) to large real graphs and use the derived CD results as the initialization of an existing efficient CD method (e.g., InfoMap) to further refine the quality of CD results. In addition to benefiting from the transfer ability regarding quality, the online generalization and refinement can also help achieve high inference efficiency, since there is no time-consuming model optimization. Experiments on public datasets with various scales demonstrate that PRoCD can ensure higher efficiency in K-agnostic CD without significant quality degradation.

Submitted 7 June, 2024; v1 submitted 30 May, 2024; originally announced May 2024.

Comments: Accepted by ACM KDD 2024

arXiv:2405.15125 [pdf, other]

HDR-GS: Efficient High Dynamic Range Novel View Synthesis at 1000x Speed via Gaussian Splatting

Authors: Yuanhao Cai, Zihao Xiao, Yixun Liang, Minghan Qin, Yulun Zhang, Xiaokang Yang, Yaoyao Liu, Alan Yuille

Abstract: High dynamic range (HDR) novel view synthesis (NVS) aims to create photorealistic images from novel viewpoints using HDR imaging techniques. The rendered HDR images capture a wider range of brightness levels containing more details of the scene than normal low dynamic range (LDR) images. Existing HDR NVS methods are mainly based on NeRF. They suffer from long training time and slow inference speed. In this paper, we propose a new framework, High Dynamic Range Gaussian Splatting (HDR-GS), which can efficiently render novel HDR views and reconstruct LDR images with a user input exposure time. Specifically, we design a Dual Dynamic Range (DDR) Gaussian point cloud model that uses spherical harmonics to fit HDR color and employs an MLP-based tone-mapper to render LDR color. The HDR and LDR colors are then fed into two Parallel Differentiable Rasterization (PDR) processes to reconstruct HDR and LDR views. To establish the data foundation for the research of 3D Gaussian splatting-based methods in HDR NVS, we recalibrate the camera parameters and compute the initial positions for Gaussian point clouds. Experiments demonstrate that our HDR-GS surpasses the state-of-the-art NeRF-based method by 3.84 and 1.91 dB on LDR and HDR NVS while enjoying 1000x inference speed and only requiring 6.3% training time. Code, models, and recalibrated data will be publicly available at https://github.com/caiyuanhao1998/HDR-GS

Submitted 27 May, 2024; v1 submitted 23 May, 2024; originally announced May 2024.

Comments: The first 3D Gaussian Splatting-based method for HDR imaging

arXiv:2405.14192 [pdf, other]

IB-AdCSCNet: Adaptive Convolutional Sparse Coding Network Driven by Information Bottleneck

Authors: He Zou, Meng'en Qin, Yu Song, Xiaohui Yang

Abstract: In the realm of neural network models, the perpetual challenge remains in retaining task-relevant information while effectively discarding redundant data during propagation. In this paper, we introduce IB-AdCSCNet, a deep learning model grounded in information bottleneck theory. IB-AdCSCNet seamlessly integrates the information bottleneck trade-off strategy into deep networks by dynamically adjusting the trade-off hyperparameter $λ$ through gradient descent, updating it within the FISTA (Fast Iterative Shrinkage-Thresholding Algorithm) framework. By optimizing the compressive excitation loss function induced by the information bottleneck principle, IB-AdCSCNet achieves an optimal balance between compression and fitting at a global level, approximating the globally optimal representation feature. This information bottleneck trade-off strategy driven by downstream tasks not only helps to learn effective features of the data, but also improves the generalization of the model. This study's contribution lies in presenting a model with consistent performance and offering a fresh perspective on merging deep learning with sparse representation theory, grounded in the information bottleneck concept. Experimental results on CIFAR-10 and CIFAR-100 datasets demonstrate that IB-AdCSCNet not only matches the performance of deep residual convolutional networks but also outperforms them when handling corrupted data. Through the inference of the IB trade-off, the model's robustness is notably enhanced.

Submitted 23 May, 2024; originally announced May 2024.

arXiv:2405.09217 [pdf, other]

Augmenting Density Matrix Renormalization Group with Clifford Circuits

Authors: Xiangjian Qian, Jiale Huang, Mingpu Qin

Abstract: Density Matrix Renormalization Group (DMRG) or Matrix Product States (MPS) are widely acknowledged as highly effective and accurate methods for solving one-dimensional quantum many-body systems. However, the direct application of DMRG to the study of two-dimensional systems encounters challenges due to the limited entanglement encoded in the wave-function ansatz. Conversely, Clifford circuits offer a promising avenue for simulating states with substantial entanglement, albeit confined to stabilizer states. In this work, we present the seamless integration of Clifford circuits within the DMRG algorithm, leveraging the advantages of both Clifford circuits and DMRG. This integration leads to a significant enhancement in simulation accuracy with small additional computational cost. Moreover, this framework is useful not only for its current application but also for its potential to be easily adapted to various other numerical approaches.

Submitted 15 May, 2024; originally announced May 2024.

arXiv:2405.08423 [pdf, other]

NAFRSSR: a Lightweight Recursive Network for Efficient Stereo Image Super-Resolution

Authors: Yihong Chen, Zhen Fan, Shuai Dong, Zhiwei Chen, Wenjie Li, Minghui Qin, Min Zeng, Xubing Lu, Guofu Zhou, Xingsen Gao, Jun-Ming Liu

Abstract: Stereo image super-resolution (SR) refers to the reconstruction of a high-resolution (HR) image from a pair of low-resolution (LR) images as typically captured by a dual-camera device. To enhance the quality of SR images, most previous studies focused on increasing the number and size of feature maps and introducing complex and computationally intensive structures, resulting in models with high computational complexity. Here, we propose a simple yet efficient stereo image SR model called NAFRSSR, which is modified from the previous state-of-the-art model NAFSSR by introducing recursive connections and lightweighting the constituent modules. Our NAFRSSR model is composed of nonlinear activation free and group convolution-based blocks (NAFGCBlocks) and depth-separated stereo cross attention modules (DSSCAMs). The NAFGCBlock improves feature extraction and reduces the number of parameters by removing the simple channel attention mechanism from NAFBlock and using group convolution. The DSSCAM enhances feature fusion and reduces the number of parameters by replacing the 1x1 pointwise convolution in SCAM with a weight-shared 3x3 depthwise convolution. In addition, we propose to incorporate a trainable edge detection operator into NAFRSSR to further improve the model performance. Four variants of NAFRSSR with different sizes, namely, NAFRSSR-Mobile (NAFRSSR-M), NAFRSSR-Tiny (NAFRSSR-T), NAFRSSR-Super (NAFRSSR-S) and NAFRSSR-Base (NAFRSSR-B) are designed, and they all exhibit fewer parameters, higher PSNR/SSIM, and faster speed than the previous state-of-the-art models. In particular, to the best of our knowledge, NAFRSSR-M is the lightest (0.28M parameters) and fastest (50 ms inference time) model achieving an average PSNR/SSIM as high as 24.657 dB/0.7622 on the benchmark datasets. Codes and models will be released at https://github.com/JNUChenYiHong/NAFRSSR.

Submitted 14 May, 2024; originally announced May 2024.

arXiv:2405.02861 [pdf, other]

Revisiting a Pain in the Neck: Semantic Phrase Processing Benchmark for Language Models

Authors: Yang Liu, Melissa Xiaohui Qin, Hongming Li, Chao Huang

Abstract: We introduce LexBench, a comprehensive evaluation suite designed to test language models (LMs) on ten semantic phrase processing tasks. Unlike prior studies, it is the first work to propose a framework from the comparative perspective to model the general semantic phrase (i.e., lexical collocation) and three fine-grained semantic phrases, including idiomatic expression, noun compound, and verbal construction. Using LexBench, we assess the performance of 15 LMs across model architectures and parameter scales in classification, extraction, and interpretation tasks. Through the experiments, we first validate the scaling law and find that, as expected, large models perform better than smaller ones in most tasks. Second, we investigate further through the scaling semantic relation categorization and find that few-shot LMs still lag behind vanilla fine-tuned models in the task. Third, through human evaluation, we find that the performance of strong models is comparable to the human level regarding semantic phrase processing. Our benchmarking findings can serve future research aiming to improve the generic capability of LMs on semantic phrase comprehension. Our source code and data are available at https://github.com/jacklanda/LexBench

Submitted 5 May, 2024; originally announced May 2024.

Comments: 24 pages, 17 figures, 10 tables

MSC Class: 68T50; ACM Class: I.2.7

arXiv:2404.09448 [pdf, ps, other]

On maximum residual block Kaczmarz method for solving large consistent linear systems

Authors: Wen-Ning Sun, Mei Qin

Abstract: For solving large consistent linear systems by iterative methods, inspired by the maximum residual Kaczmarz method and the randomized block Kaczmarz method, we propose the maximum residual block Kaczmarz method, which is designed to preferentially eliminate the largest block in the residual vector $r_{k}$ at each iteration. At the same time, in order to further improve the convergence rate, we construct the maximum residual average block Kaczmarz method to avoid the calculation of the pseudo-inverse in block iteration, which completes the iteration by projecting the iteration vector $x_{k}$ onto each row of the constrained subset of $A$ and applying different extrapolation step sizes to average them. We prove the convergence of these two methods and give the upper bounds on their convergence rates, respectively. Numerical experiments validate our theory and show that our proposed methods are superior to some other block Kaczmarz methods.

Submitted 15 April, 2024; originally announced April 2024.
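
For orientation, the following is a minimal sketch of the single-row maximum residual Kaczmarz iteration that the abstract above cites as one of its starting points. It is not the authors' block or average-block method: it only illustrates the "pick the largest residual entry, then project onto that row" idea. The function name `max_residual_kaczmarz`, the parameters `max_iters` and `tol`, and the small 2x2 test system are assumptions made for this illustration.

```python
from typing import List, Sequence


def max_residual_kaczmarz(
    A: Sequence[Sequence[float]],
    b: Sequence[float],
    max_iters: int = 1000,
    tol: float = 1e-12,
) -> List[float]:
    """Single-row maximum residual Kaczmarz for a consistent system A x = b.

    Illustrative sketch only; assumes A has no all-zero rows.
    """
    x = [0.0] * len(A[0])
    # Squared Euclidean norm of each row, used to scale the projection step.
    row_norms_sq = [sum(a * a for a in row) for row in A]
    for _ in range(max_iters):
        # Residual r = b - A x for the current iterate.
        residual = [
            bi - sum(a * xj for a, xj in zip(row, x)) for row, bi in zip(A, b)
        ]
        # Greedy selection: the row whose residual entry is largest in magnitude.
        i = max(range(len(residual)), key=lambda k: abs(residual[k]))
        if abs(residual[i]) < tol:
            break
        # Orthogonal projection of x onto the hyperplane a_i . x = b_i.
        step = residual[i] / row_norms_sq[i]
        x = [xj + step * a for xj, a in zip(x, A[i])]
    return x


# Hypothetical consistent system whose exact solution is x = (1, 2).
A = [[2.0, 1.0], [1.0, 3.0]]
b = [4.0, 7.0]
solution = max_residual_kaczmarz(A, b)
```

A block variant along the lines described in the abstract would select a group of rows with the largest residual norm instead of a single row; the details of that selection and of the extrapolation step sizes are in the paper and are not reproduced here.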

arXiv:2404.07209 [pdf]

doi: 10.1016/j.addma.2023.103937

Deep Reinforcement Learning Based Toolpath Generation for Thermal Uniformity in Laser Powder Bed Fusion Process

Authors: Mian Qin, Junhao Ding, Shuo Qu, Xu Song, Charlie C. L. Wang, Wei-Hsin Liao

Abstract: Laser powder bed fusion (LPBF) is a widely used metal additive manufacturing technology. However, the accumulation of internal residual stress during printing can cause significant distortion and potential failure. Although various scan patterns have been studied to reduce possible accumulated stress, such as zigzag scanning vectors with changing directions or a chessboard-based scan pattern with divided small islands, most conventional scan patterns cannot significantly reduce residual stress. The proposed adaptive toolpath generation (ATG) algorithms, aiming to minimize the thermal gradients, may result in excessive heat accumulation in the temperature field in some cases. To address these issues, we developed a deep reinforcement learning (DRL)-based toolpath generation framework, with the goal of achieving uniformly distributed heat and avoiding regions of extreme thermal accumulation during the LPBF process. We first developed an overall pipeline for the DRL-based toolpath generation framework, which includes uniform sampling, agent moving and environment observation, action selection, moving constraints, rewards calculation, and the training process. To accelerate the training process, we simplified the data-intensive numerical model by considering the turning angles on the toolpath. We designed the action spaces with three options, including the minimum temperature value, the smoothest path, and the second smoothest path. The reward function was designed to minimize energy density to ensure the temperature field remains relatively stable. To verify the effectiveness of the proposed DRL-based toolpath generation framework, we performed numerical simulations of polygon shape printing domains. In addition, four groups of thin plate samples with different scan patterns were compared using the LPBF process.

Submitted 16 February, 2024; originally announced April 2024.

Journal ref: Additive Manufacturing, vol.79, 103937 (12 pages), January 2024

arXiv:2404.01979 [pdf, other]

The ground state of electron-doped $t-t'-J$ model on cylinders

Authors: Yang Shen, Xiangjian Qian, Mingpu Qin

Abstract: We perform a comprehensive study of the electron-doped $t-t'-J$ model on cylinders with Density Matrix Renormalization Group (DMRG). We adopt both periodic and anti-periodic boundary conditions along the circumference direction to explore the finite size effect. We study doping levels of $1/6$, $1/8$, and $1/12$ which represent the most interesting region in the phase diagram of electron-doped cuprates. We find that for width-4 and 6 systems, the ground state for fixed doping switches between anti-ferromagnetic Neel state and stripe state under different boundary conditions and with system widths, indicating the presence of large finite size effect in the $t-t'-J$ model. We also have a careful analysis of the $d$-wave pairing correlations which also change quantitatively with boundary conditions and widths of the system. However, the pairing correlations are enhanced when the system becomes wider for all dopings, suggesting the existence of possible long-ranged superconducting order in the thermodynamic limit. The width-8 results are found to be dependent on the starting state in the DMRG calculation for the kept states we can reach. For width-8 system only Neel (stripe) state can be stabilized in DMRG calculation for $1/12$ ($1/6$) doping, while both stripe and Neel states are stable in the DMRG sweep for $1/8$ doping, regardless of the boundary conditions. These results indicate that $1/8$ doping is likely to lie in the boundary of a phase transition between the Neel phase with lower doping and the stripe phase with higher doping, consistent with the previous study. The sensitivity of ground state on boundary conditions and size observed in this work is similar to that in the $t'$-Hubbard model.

Submitted 2 April, 2024; originally announced April 2024.

Comments: 8 pages, 5 figures and 1 table

arXiv:2403.15704 [pdf, other]

Gaussian in the Wild: 3D Gaussian Splatting for Unconstrained Image Collections

Authors: Dongbin Zhang, Chuming Wang, Weitao Wang, Peihao Li, Minghan Qin, Haoqian Wang

Abstract: Novel view synthesis from unconstrained in-the-wild images remains a meaningful but challenging task. The photometric variation and transient occluders in those unconstrained images make it difficult to reconstruct the original scene accurately. Previous approaches tackle the problem by introducing a global appearance feature in Neural Radiance Fields (NeRF). However, in the real world, the unique appearance of each tiny point in a scene is determined by its independent intrinsic material attributes and the varying environmental impacts it receives. Inspired by this fact, we propose Gaussian in the wild (GS-W), a method that uses 3D Gaussian points to reconstruct the scene and introduces separated intrinsic and dynamic appearance feature for each point, capturing the unchanged scene appearance along with dynamic variation like illumination and weather. Additionally, an adaptive sampling strategy is presented to allow each Gaussian point to focus on the local and detailed information more effectively. We also reduce the impact of transient occluders using a 2D visibility map. More experiments have demonstrated better reconstruction quality and details of GS-W compared to NeRF-based methods, with a faster rendering speed. Video results and code are available at https://eastbeanzhang.github.io/GS-W/.

Submitted 14 July, 2024; v1 submitted 22 March, 2024; originally announced March 2024.

arXiv:2403.03186 [pdf, other]

Cradle: Empowering Foundation Agents Towards General Computer Control

Authors: Weihao Tan, Wentao Zhang, Xinrun Xu, Haochong Xia, Ziluo Ding, Boyu Li, Bohan Zhou, Junpeng Yue, Jiechuan Jiang, Yewen Li, Ruyi An, Molei Qin, Chuqiao Zong, Longtao Zheng, Yujie Wu, Xiaoqiang Chai, Yifei Bi, Tianbao Xie, Pengjie Gu, Xiyun Li, Ceyao Zhang, Long Tian, Chaojie Wang, Xinrun Wang, Börje F. Karlsson , et al. (3 additional authors not shown)

Abstract: Despite the success in specific scenarios, existing foundation agents still struggle to generalize across various virtual scenarios, mainly due to the dramatically different encapsulations of environments with manually designed observation and action spaces. To handle this issue, we propose the General Computer Control (GCC) setting to restrict foundation agents to interact with software through the most unified and standardized interface, i.e., using screenshots as input and keyboard and mouse actions as output. We introduce Cradle, a modular and flexible LMM-powered framework, as a preliminary attempt towards GCC. Enhanced by six key modules, Cradle can understand input screenshots and output executable code for low-level keyboard and mouse control after high-level planning, so that Cradle can interact with any software and complete long-horizon complex tasks without relying on any built-in APIs. Experimental results show that Cradle exhibits remarkable generalizability and impressive performance across four previously unexplored commercial video games, five software applications, and a comprehensive benchmark, OSWorld. Cradle is the first to enable foundation agents to follow the main storyline and complete 40-minute-long real missions in the complex AAA game Red Dead Redemption 2 (RDR2). Cradle can also create a city of a thousand people in Cities: Skylines, farm and harvest parsnips in Stardew Valley, and trade and bargain with a maximal weekly total profit of 87% in Dealer's Life 2. Cradle can not only operate daily software, like Chrome, Outlook, and Feishu, but also edit images and videos using Meitu and CapCut. Cradle greatly extends the reach of foundation agents by enabling the easy conversion of any software, especially complex games, into benchmarks to evaluate agents' various abilities and facilitate further data collection, thus paving the way for generalist agents.

Submitted 2 July, 2024; v1 submitted 5 March, 2024; originally announced March 2024.

arXiv:2402.18892 [pdf, other]

Aligning Knowledge Graph with Visual Perception for Object-goal Navigation

Authors: Nuo Xu, Wen Wang, Rong Yang, Mengjie Qin, Zheyuan Lin, Wei Song, Chunlong Zhang, Jason Gu, Chao Li

Abstract: Object-goal navigation is a challenging task that requires guiding an agent to specific objects based on first-person visual observations. The ability of agent to comprehend its surroundings plays a crucial role in achieving successful object finding. However, existing knowledge-graph-based navigators often rely on discrete categorical one-hot vectors and vote counting strategy to construct graph representation of the scenes, which results in misalignment with visual images. To provide more accurate and coherent scene descriptions and address this misalignment issue, we propose the Aligning Knowledge Graph with Visual Perception (AKGVP) method for object-goal navigation. Technically, our approach introduces continuous modeling of the hierarchical scene architecture and leverages visual-language pre-training to align natural language description with visual perception. The integration of a continuous knowledge graph architecture and multimodal feature alignment empowers the navigator with a remarkable zero-shot navigation capability. We extensively evaluate our method using the AI2-THOR simulator and conduct a series of experiments to demonstrate the effectiveness and efficiency of our navigator. Code available: https://github.com/nuoxu/AKGVP.

Submitted 25 April, 2024; v1 submitted 29 February, 2024; originally announced February 2024.

Comments: Accepted to ICRA 2024

arXiv:2402.18485 [pdf, other]

A Multimodal Foundation Agent for Financial Trading: Tool-Augmented, Diversified, and Generalist

Authors: Wentao Zhang, Lingxuan Zhao, Haochong Xia, Shuo Sun, Jiaze Sun, Molei Qin, Xinyi Li, Yuqing Zhao, Yilei Zhao, Xinyu Cai, Longtao Zheng, Xinrun Wang, Bo An

Abstract: Financial trading is a crucial component of the markets, informed by a multimodal information landscape encompassing news, prices, and Kline charts, and encompasses diverse tasks such as quantitative trading and high-frequency trading with various assets. While advanced AI techniques like deep learning and reinforcement learning are extensively utilized in finance, their application in financial trading tasks often faces challenges due to inadequate handling of multimodal data and limited generalizability across various tasks. To address these challenges, we present FinAgent, a multimodal foundational agent with tool augmentation for financial trading. FinAgent's market intelligence module processes a diverse range of data (numerical, textual, and visual) to accurately analyze the financial market. Its unique dual-level reflection module not only enables rapid adaptation to market dynamics but also incorporates a diversified memory retrieval system, enhancing the agent's ability to learn from historical data and improve decision-making processes. The agent's emphasis on reasoning for actions fosters trust in its financial decisions. Moreover, FinAgent integrates established trading strategies and expert insights, ensuring that its trading approaches are both data-driven and rooted in sound financial principles. With comprehensive experiments on 6 financial datasets, including stocks and Crypto, FinAgent significantly outperforms 9 state-of-the-art baselines in terms of 6 financial metrics with over 36% average improvement on profit. Specifically, a 92.27% return (an 84.39% relative improvement) is achieved on one dataset. Notably, FinAgent is the first advanced multimodal foundation agent designed for financial trading tasks.

Submitted 28 June, 2024; v1 submitted 28 February, 2024; originally announced February 2024.

arXiv:2402.13770 [pdf]

doi 10.1038/s41467-024-45310-2

Room-temperature sub-100 nm Néel-type skyrmions in non-stoichiometric van der Waals ferromagnet $\rm Fe_{3-x}GaTe_{2}$ with ultrafast laser writability

Authors: Zefang Li, Huai Zhang, Guanqi Li, Jiangteng Guo, Qingping Wang, Ying Deng, Yue Hu, Xuange Hu, Can Liu, Minghui Qin, Xi Shen, Richeng Yu, Xingsen Gao, Zhimin Liao, Junming Liu, Zhipeng Hou, Yimei Zhu, Xuewen Fu

Abstract: Realizing room-temperature magnetic skyrmions in two-dimensional van der Waals ferromagnets offers unparalleled prospects for future spintronic applications. However, due to the intrinsic spin fluctuations that suppress atomic long-range magnetic order and the inherent inversion crystal symmetry that excludes the presence of the Dzyaloshinskii-Moriya interaction, achieving room-temperature skyrmions in 2D magnets remains a formidable challenge. In this study, we target room-temperature 2D magnet $\rm Fe_3GaTe_2$ and unveil that the introduction of iron deficiency into this compound enables spatial inversion symmetry breaking, thus inducing a significant Dzyaloshinskii-Moriya interaction that brings about room-temperature Néel-type skyrmions with unprecedentedly small size. To further enhance the practical applications of this finding, we employ a homemade in-situ optical Lorentz transmission electron microscopy to demonstrate ultrafast writing of skyrmions in $\rm Fe_{3-x}GaTe_2$ using a single femtosecond laser pulse. Our results manifest the $\rm Fe_{3-x}GaTe_2$ as a promising building block for realizing skyrmion-based magneto-optical functionalities.

Submitted 21 February, 2024; originally announced February 2024.

arXiv:2401.14656 [pdf, other]

Scientific Large Language Models: A Survey on Biological & Chemical Domains

Authors: Qiang Zhang, Keyang Ding, Tianwen Lyv, Xinda Wang, Qingyu Yin, Yiwen Zhang, Jing Yu, Yuhao Wang, Xiaotong Li, Zhuoyi Xiang, Xiang Zhuang, Zeyuan Wang, Ming Qin, Mengyao Zhang, Jinlu Zhang, Jiyu Cui, Renjun Xu, Hongyang Chen, Xiaohui Fan, Huabin Xing, Huajun Chen

Abstract: Large Language Models (LLMs) have emerged as a transformative power in enhancing natural language comprehension, representing a significant stride toward artificial general intelligence. The application of LLMs extends beyond conventional linguistic boundaries, encompassing specialized linguistic systems developed within various scientific disciplines. This growing interest has led to the advent of scientific LLMs, a novel subclass specifically engineered for facilitating scientific discovery. As a burgeoning area in the community of AI for Science, scientific LLMs warrant comprehensive exploration. However, a systematic and up-to-date survey introducing them is currently lacking. In this paper, we endeavor to methodically delineate the concept of "scientific language", whilst providing a thorough review of the latest advancements in scientific LLMs. Given the expansive realm of scientific disciplines, our analysis adopts a focused lens, concentrating on the biological and chemical domains. This includes an in-depth examination of LLMs for textual knowledge, small molecules, macromolecular proteins, genomic sequences, and their combinations, analyzing them in terms of model architectures, capabilities, datasets, and evaluation. Finally, we critically examine the prevailing challenges and point out promising research directions along with the advances of LLMs. By offering a comprehensive overview of technical developments in this field, this survey aspires to be an invaluable resource for researchers navigating the intricate landscape of scientific LLMs.

Submitted 26 January, 2024; originally announced January 2024.

arXiv:2401.14140 [pdf, ps, other]

doi 10.1103/PhysRevB.109.155424

Efficient photon-pair generation empowered by dual quasi-bound states in the continuum

Authors: Tingting Liu, Meibao Qin, Siqi Feng, Xu Tu, Tianjing Guo, Feng Wu, Shuyuan Xiao

Abstract: Here we demonstrate the efficient photon-pair generation via spontaneous parametric down conversion from a semiconductor metasurface supporting dual quasi-bound states in the continuum (quasi-BICs). In a simple metasurface design composed of AlGaAs ellipse nano-cylinders, the two high-$Q$ quasi-BIC resonances that coincide with the generated signal and idler frequencies significantly boost the local electric field. This leads to a substantial enhancement in the reverse classical nonlinear process of sum frequency generation and subsequently the remarkably high generation rate of photon pairs under the quantum-classical correspondence principle. Within a narrowband wavelength regime around the quasi-BIC resonances, the rate of pair production is enhanced up to $\sim10^{4}$ Hz, two orders of magnitude larger than that in the Mie resonant AlGaAs nanoantennas. Moreover, the photon pair emission is mainly concentrated in the normal direction with respect to the metasurface, and shows tunable rate with the $Q$ factor by engineering the rotation angle of nano-cylinders. The presented work enables nanoscale sources of high-quality entangled photons which will find applications in advanced quantum imaging and communications.

Submitted 25 January, 2024; originally announced January 2024.

Journal ref: Physical Review B 109 (15), 155424 (2024)

arXiv:2401.13940 [pdf, other]

doi 10.1145/3597503.3639197

How Are Paid and Volunteer Open Source Developers Different? A Study of the Rust Project

Authors: Yuxia Zhang, Mian Qin, Klaas-Jan Stol, Minghui Zhou, Hui Liu

Abstract: It is now commonplace for organizations to pay developers to work on specific open source software (OSS) projects to pursue their business goals. Such paid developers work alongside voluntary contributors, but given the different motivations of these two groups of developers, conflict may arise, which may pose a threat to a project's sustainability. This paper presents an empirical study of paid developers and volunteers in Rust, a popular open source programming language project. Rust is a particularly interesting case given considerable concerns about corporate participation. We compare volunteers and paid developers through contribution characteristics and long-term participation, and solicit volunteers' perceptions on paid developers. We find that core paid developers tend to contribute more frequently; commits contributed by one-time paid developers have bigger sizes; peripheral paid developers implement more features; and being paid plays a positive role in becoming a long-term contributor. We also find that volunteers do have some prejudices against paid developers. This study suggests that the dichotomous view of paid vs. volunteer developers is too simplistic and that further subgroups can be identified. Companies should become more sensitive to how they engage with OSS communities, in certain ways as suggested by this study.

Submitted 24 January, 2024; originally announced January 2024.

arXiv:2401.07659 [pdf, other]

Parent Hamiltonian for Fully-augmented Matrix Product States

Authors: Xiangjian Qian, Mingpu Qin

Abstract: Fully-augmented Matrix Product States (FAMPS) was proposed recently (Chin. Phys. Lett. 40, 057102 (2023)) as an accurate numerical tool to study two-dimensional quantum many-body systems. It is constructed by including a disentangler layer upon MPS. The cost of simulating quantum models with FAMPS is similar to that of DMRG (with small overhead), but FAMPS can support area-law entanglement entropy for two-dimensional systems. These properties make FAMPS an effective and efficient tool. In this work, we demonstrate that for each FAMPS we can construct a two-dimensional Hamiltonian with the FAMPS being its ground state. We show how to construct the parent Hamiltonian for given FAMPS. We also perform numerical simulation to show that the algorithm proposed in Chin. Phys. Lett. 40, 057102 (2023) can find the exact FAMPS for the parent Hamiltonian. FAMPS and the corresponding parent Hamiltonian provide a useful framework for the future study of two-dimensional quantum many-body systems.

Submitted 15 January, 2024; originally announced January 2024.

arXiv:2401.07410 [pdf, other]

Multi-Task DNS Security Analysis via High-Order Heterogeneous Graph Embedding

Authors: Meng Qin

Abstract: DNS is an essential Internet infrastructure to support network applications and services, but is also a significant tool exploited by various cyberattacks. Existing DNS security analysis techniques mostly focus on one specific task associated with one single entity (e.g., domain) via conventional feature engineering. They rely heavily on the labor-intensive feature selection and largely ignore the intrinsic correlations among the heterogeneous DNS entities (e.g., domain and IP). In this paper, I explore the potential of heterogeneous graph embedding to automatically learn the behavior features of multiple DNS entities, and to simultaneously support more than one security task. Considering the joint optimization of malicious domain detection and IP reputation evaluation as an example, I propose a novel joint DNS embedding (JDE) model to formulate the DNS query behavior via a similarity-enhanced graph with heterogeneous entities. The random walk technique is applied to the heterogeneous graph to comprehensively explore the hidden homogeneous and heterogeneous high-order proximities among domains and IPs. Extensive experiments on real DNS traffic demonstrate that the joint optimization of multiple tasks with the latent high-order proximities can lead to better security analysis performance for all the tasks than respectively optimizing each single task with the observable low-order proximity.

Submitted 14 January, 2024; originally announced January 2024.

arXiv:2401.03444 [pdf, other]

Towards a Unified Method for Network Dynamic via Adversarial Weighted Link Prediction

Authors: Meng Qin

Abstract: Network dynamic (e.g., traffic burst in data center networks and channel fading in cellular WiFi networks) has a great impact on the performance of communication networks (e.g., throughput, capacity, delay, and jitter). This article proposes a unified prediction-based method to handle the dynamic of various network systems. From the view of graph deep learning, I generally formulate the dynamic prediction of networks as a temporal link prediction task and analyze the possible challenges of the prediction of weighted networks, where link weights have the wide-value-range and sparsity issues. Inspired by the high-resolution video frame prediction with generative adversarial network (GAN), I try to adopt adversarial learning to generate high-quality predicted snapshots for network dynamic, which is expected to support the precise and fine-grained network control. A novel high-quality temporal link prediction (HQ-TLP) model with GAN is then developed to illustrate the potential of my basic idea. Extensive experiments for various application scenarios further demonstrate the powerful capability of HQ-TLP.

Submitted 7 January, 2024; originally announced January 2024.

arXiv:2401.00651 [pdf, other]

IRWE: Inductive Random Walk for Joint Inference of Identity and Position Network Embedding

Authors: Meng Qin, Dit-Yan Yeung

Abstract: Network embedding, which maps graphs to distributed representations, is a unified framework for various graph inference tasks. According to the topology properties (e.g., structural roles and community memberships of nodes) to be preserved, it can be categorized into the identity and position embedding. However, existing methods can only capture one type of property. Some approaches can support the inductive inference that generalizes the embedding model to new nodes or graphs but relies on the availability of attributes. Due to the complicated correlations between topology and attributes, it is unclear for some inductive methods which type of property they can capture. In this study, we explore a unified framework for the joint inductive inference of identity and position embeddings without attributes. An inductive random walk embedding (IRWE) method is proposed, which combines multiple attention units to handle the random walk on graph topology and simultaneously derives identity and position embeddings that are jointly optimized. In particular, we demonstrate that some random walk statistics can be informative features to characterize node identities and positions while supporting the inductive embedding inference. Experiments validate the superior performance of IRWE beyond various baselines for the transductive and inductive inference of identity and position embeddings.

Submitted 12 May, 2024; v1 submitted 31 December, 2023; originally announced January 2024.

arXiv:2312.16084 [pdf, other]

LangSplat: 3D Language Gaussian Splatting

Authors: Minghan Qin, Wanhua Li, Jiawei Zhou, Haoqian Wang, Hanspeter Pfister

Abstract: Humans live in a 3D world and commonly use natural language to interact with a 3D scene. Modeling a 3D language field to support open-ended language queries in 3D has gained increasing attention recently. This paper introduces LangSplat, which constructs a 3D language field that enables precise and efficient open-vocabulary querying within 3D spaces. Unlike existing methods that ground CLIP language embeddings in a NeRF model, LangSplat advances the field by utilizing a collection of 3D Gaussians, each encoding language features distilled from CLIP, to represent the language field. By employing a tile-based splatting technique for rendering language features, we circumvent the costly rendering process inherent in NeRF. Instead of directly learning CLIP embeddings, LangSplat first trains a scene-wise language autoencoder and then learns language features on the scene-specific latent space, thereby alleviating substantial memory demands imposed by explicit modeling. Existing methods struggle with imprecise and vague 3D language fields, which fail to discern clear boundaries between objects. We delve into this issue and propose to learn hierarchical semantics using SAM, thereby eliminating the need for extensively querying the language field across various scales and the regularization of DINO features. Extensive experimental results show that LangSplat significantly outperforms the previous state-of-the-art method LERF by a large margin. Notably, LangSplat is extremely efficient, achieving a 199 $\times$ speedup compared to LERF at the resolution of 1440 $\times$ 1080. We strongly recommend readers to check out our video results at https://langsplat.github.io/

Submitted 31 March, 2024; v1 submitted 26 December, 2023; originally announced December 2023.

Comments: CVPR 2024. Project Page: https://langsplat.github.io

arXiv:2312.15584 [pdf]

doi 10.1103/PhysRevB.109.174412

Controllable magnon frequency comb in synthetic ferrimagnets

Authors: Y. Liu, T. T. Liu, Q. Q. Yang, G. Tian, Z. P. Hou, D. Y. Chen, Z. Fan, M. Zeng, X. B. Lu, X. S. Gao, M. H. Qin, J. M. Liu

Abstract: Magnon frequency comb provides opportunities for exploring magnon nonlinear effects and measuring the transmission magnon frequency in magnets, whose controllability becomes vital for modulating the operating frequency and improving the measurement accuracy. Nevertheless, such controllable frequency comb remains to be explored. In this work, we investigate theoretically and numerically the skyrmion-induced magnon frequency comb effect generated by interaction between the magnon excitation mode and skyrmion breathing mode in synthetic ferrimagnets. It is revealed that both the skyrmion breathing mode and the magnon frequency gap closely depend on the net angular momentum δs, emphasizing the pivotal role of δs as an effective control parameter in governing the comb teeth. With the increase of δs, the skyrmion size decreases, which results in the enlargement of the breathing frequency and the distance between the comb teeth. Moreover, the dependences of the magnon frequency gap on δs and the inter-layer coupling allow one to modulate the comb lowest coherent frequency via structural control. Consequently, the coherent modes generated by the comb may range from gigahertz to terahertz frequencies, serving as a bridge between microwave and terahertz waves. Thus, this work represents a substantial advance in understanding the magnon frequency comb effect in ferrimagnets.

Submitted 11 March, 2024; v1 submitted 24 December, 2023; originally announced December 2023.

Comments: 27 pages, 8 figures

Journal ref: Physical Review B 109, 174412 (2024)

arXiv:2312.08554 [pdf, other]

Adaptive Robot Coordination: A Subproblem-based Approach for Hybrid Multi-Robot Motion Planning

Authors: Irving Solis, James Motes, Mike Qin, Marco Morales, Nancy M. Amato

Abstract: This work presents Adaptive Robot Coordination (ARC), a novel hybrid framework for multi-robot motion planning (MRMP) that employs local subproblems to resolve inter-robot conflicts. ARC creates subproblems centered around conflicts, and the solutions represent the robot motions required to resolve these conflicts. The use of subproblems enables an inexpensive hybrid exploration of the multi-robot planning space. ARC leverages the hybrid exploration by dynamically adjusting the coupling and decoupling of the multi-robot planning space. This allows ARC to adapt the levels of coordination efficiently by planning in decoupled spaces, where robots can operate independently, and in coupled spaces where coordination is essential. ARC is probabilistically complete, can be used for any robot, and produces efficient cost solutions in reduced planning times. Through extensive evaluation across representative scenarios with different robots requiring various levels of coordination, ARC demonstrates its ability to provide simultaneous scalability and precise coordination. ARC is the only method capable of solving all the scenarios and is competitive with coupled, decoupled, and hybrid baselines.

Submitted 13 December, 2023; originally announced December 2023.

Comments: This work has been submitted for review

arXiv:2312.04360 [pdf, other]

The Computational Advantage of MIP* Vanishes in the Presence of Noise

Authors: Yangjing Dong, Honghao Fu, Anand Natarajan, Minglong Qin, Haochen Xu, Penghui Yao

Abstract: Quantum multiprover interactive proof systems with entanglement MIP* are much more powerful than their classical counterpart MIP (Babai et al. '91, Ji et al. '20): while MIP = NEXP, the quantum class MIP* is equal to RE, a class including the halting problem. This is because the provers in MIP* can share unbounded quantum entanglement. However, recent works of Qin and Yao '21 and '23 have shown that this advantage is significantly reduced if the provers' shared state contains noise. This paper attempts to exactly characterize the effect of noise on the computational power of quantum multiprover interactive proof systems. We investigate the quantum two-prover one-round interactive system MIP*[poly, O(1)], where the verifier sends polynomially many bits to the provers and the provers send back constantly many bits. We show noise completely destroys the computational advantage given by shared entanglement in this model. Specifically, we show that if the provers are allowed to share arbitrarily many EPR states, where each EPR state is affected by an arbitrarily small constant amount of noise, the resulting complexity class is contained in NEXP = MIP. This improves significantly on the previous best-known bound of NEEEXP (nondeterministic triply exponential time) by Qin and Yao '21. We also show that this collapse in power is due to the noise, rather than the O(1) answer size, by showing that allowing for noiseless EPR states gives the class the full power of RE = MIP*[poly, poly]. Along the way, we develop two technical tools of independent interest. First, we give a new, deterministic tester for the positivity of an exponentially large matrix, provided it has a low-degree Fourier decomposition in terms of Pauli matrices. Secondly, we develop a new invariance principle for smooth matrix functions having bounded third-order Fréchet derivatives or which are Lipschitz continuous.

Submitted 7 December, 2023; originally announced December 2023.

Comments: Comments are welcome!

arXiv:2312.01560 [pdf, ps, other]

RaftGP: Random Fast Graph Partitioning

Authors: Yu Gao, Meng Qin, Yibin Ding, Li Zeng, Chaorui Zhang, Weixi Zhang, Wei Han, Rongqian Zhao, Bo Bai

Abstract: Graph partitioning (GP), a.k.a. community detection, is a classic problem that divides the node set of a graph into densely-connected blocks. Following prior work on the IEEE HPEC Graph Challenge benchmark and recent advances in graph machine learning, we propose a novel RAndom FasT Graph Partitioning (RaftGP) method based on an efficient graph embedding scheme. It uses the Gaussian random projection to extract community-preserving features from classic GP objectives. These features are fed into a graph neural network (GNN) to derive low-dimensional node embeddings. Surprisingly, our experiments demonstrate that a randomly initialized GNN even without training is enough for RaftGP to derive informative community-preserving embeddings and support high-quality GP. To enable the derived embeddings to tackle GP, we introduce a hierarchical model selection algorithm that simultaneously determines the number of blocks and the corresponding GP result. We evaluate RaftGP on the Graph Challenge benchmark and compare the performance with five baselines, where our method can achieve a better trade-off between quality and efficiency. In particular, compared to the baseline algorithm of the IEEE HPEC Graph Challenge, our method is 6.68x -- 23.9x faster on graphs with 1E3 -- 5E4 nodes and at least 64.5x faster on larger (1E5 node) graphs on which the baseline takes more than 1E4 seconds. Our method achieves better accuracy on all test cases. We also develop a new graph generator to address some limitations of the original generator in the benchmark.

Submitted 3 December, 2023; originally announced December 2023.

arXiv:2311.16482 [pdf, other]

Animatable 3D Gaussian: Fast and High-Quality Reconstruction of Multiple Human Avatars

Authors: Yang Liu, Xiang Huang, Minghan Qin, Qinwei Lin, Haoqian Wang

Abstract: Neural radiance fields are capable of reconstructing high-quality drivable human avatars but are expensive to train and render. To reduce consumption, we propose Animatable 3D Gaussian, which learns human avatars from input images and poses. We extend 3D Gaussians to dynamic human scenes by modeling a set of skinned 3D Gaussians and a corresponding skeleton in canonical space and deforming 3D Gaussians to posed space according to the input poses. We introduce hash-encoded shape and appearance to speed up training and propose time-dependent ambient occlusion to achieve high-quality reconstructions in scenes containing complex motions and dynamic shadows. On both novel view synthesis and novel pose synthesis tasks, our method outperforms existing methods in terms of training time, rendering speed, and reconstruction quality. Our method can be easily extended to multi-human scenes and achieve comparable novel view synthesis results on a scene with ten people in only 25 seconds of training.

Submitted 29 November, 2023; v1 submitted 27 November, 2023; originally announced November 2023.

arXiv:2311.10840 [pdf]

Integration and Implementation Strategies for AI Algorithm Deployment with Smart Routing Rules and Workflow Management

Authors: Barbaros Selnur Erdal, Vikash Gupta, Mutlu Demirer, Kim H. Fair, Richard D. White, Jeff Blair, Barbara Deichert, Laurie Lafleur, Ming Melvin Qin, David Bericat, Brad Genereaux

Abstract: This paper reviews the challenges hindering the widespread adoption of artificial intelligence (AI) solutions in the healthcare industry, focusing on computer vision applications for medical imaging, and how interoperability and enterprise-grade scalability can be used to address these challenges. The complex nature of healthcare workflows, intricacies in managing large and secure medical imaging data, and the absence of standardized frameworks for AI development pose significant barriers and require a new paradigm to address them. The role of interoperability is examined in this paper as a crucial factor in connecting disparate applications within healthcare workflows. Standards such as DICOM, Health Level 7 (HL7), and Integrating the Healthcare Enterprise (IHE) are highlighted as foundational for common imaging workflows. A specific focus is placed on the role of DICOM gateways, with Smart Routing Rules and Workflow Management leading transformational efforts in this area. To drive enterprise scalability, new tools are needed. Project MONAI, established in 2019, is introduced as an initiative aiming to redefine the development of medical AI applications. The MONAI Deploy App SDK, a component of Project MONAI, is identified as a key tool in simplifying the packaging and deployment process, enabling repeatable, scalable, and standardized deployment patterns for AI applications. The abstract underscores the potential impact of successful AI adoption in healthcare, offering physicians both life-saving and time-saving insights and driving efficiencies in radiology department workflows. The collaborative efforts between academia and industry are emphasized as essential for advancing the adoption of healthcare AI solutions.

Submitted 21 November, 2023; v1 submitted 17 November, 2023; originally announced November 2023.

Comments: 13 pages, 6 figures

ACM Class: I.2.m

arXiv:2311.02035 [pdf, other]

A Highly-Compact Direct-Injection Universal Power Flow and Quality Control Circuit

Authors: Mowei Lu, Mengjie Qin, Jan Kacetl, Eeshta Suresh, Teng Long, Stefan M. Goetz

Abstract: This paper presents a novel direct-injection modular universal power flow and quality control topology exclusively using lower power components. In addition to conventional high-voltage applications, it is particularly attractive for the distribution and secondary grids, e.g., in soft open points, down to low voltage as it can exploit the latest developments in low-voltage high-current semiconductors. In contrast to other concepts that do not interface the grid through transformers, it does not need to convert the entire line power but only the injected or extracted power difference. The proposed power flow and quality (f/q) controller comprises a shunt active front end, together with high-frequency links serving as a power supply for a series floating module per phase. Each of the floating modules is in series with one phase of the line, floating with the electric potential of that particular phase, avoiding any ground connection. Omitting bulky and dynamically limited line transformers of conventional universal power flow controllers, the presented direct-injection f/q controller enables exceptionally small size and volume, high power density, high frequency content, and fast response. In contrast to direct-injection concepts with full back-to-back converters, it only needs to handle a fraction of the power. The circuit combines grid-voltage low-current electronics in the shunt unit and low-voltage high-current modules in the floating series injection units. Simulations and experiments demonstrate and validate the concept.

Submitted 3 November, 2023; originally announced November 2023.

arXiv:2310.11774 [pdf, other]

doi 10.1088/1361-648X/ad21a8

On the Magnetization of the $120^\circ$ order of the Spin-1/2 Triangular Lattice Heisenberg Model: a DMRG revisit

Authors: Jiale Huang, Xiangjian Qian, Mingpu Qin

Abstract: We revisit the issue about the magnetization of the $120^\circ$ order in the spin-1/2 triangular lattice Heisenberg model (TLHM) with Density Matrix Renormalization Group (DMRG). The accurate determination of the magnetization of this model is challenging for numerical methods and its value exhibits substantial disparities across various methods. We perform a large-scale DMRG calculation of this model by employing bond dimension as large as $D = 24000$ and by studying the system with width as large as $L_\mathrm{y} = 12$. With careful extrapolation with truncation error and suitable finite size scaling, we give a conservative estimation of the magnetization as $M_0 = 0.208(8)$. The ground state energy per site we obtain is $E_g = -0.5503(8)$. Our results provide valuable benchmark values for the development of new methods in the future.

Submitted 18 October, 2023; originally announced October 2023.

Comments: 6 pages, 6 figures

Journal ref: J. Phys.: Condens. Matter 36 185602 (2024)

arXiv:2310.07683 [pdf, other]

Controllable Data Generation Via Iterative Data-Property Mutual Mappings

Authors: Bo Pan, Muran Qin, Shiyu Wang, Yifei Zhang, Liang Zhao

Abstract: Deep generative models have been widely used for their ability to generate realistic data samples in various areas, such as images, molecules, text, and speech. One major goal of data generation is controllability, namely to generate new data with desired properties. Despite growing interest in the area of controllable generation, significant challenges still remain, including 1) disentangling desired properties with unrelated latent variables, 2) out-of-distribution property control, and 3) objective optimization for out-of-distribution property control. To address these challenges, in this paper, we propose a general framework to enhance VAE-based data generators with property controllability and ensure disentanglement. Our proposed objective can be optimized on both data seen and unseen in the training set. We propose a training procedure to train the objective in a semi-supervised manner by iteratively conducting mutual mappings between the data and properties. The proposed framework is implemented on four VAE-based controllable generators to evaluate its performance on property error, disentanglement, generation quality, and training time. The results indicate that our proposed framework enables more precise control over the properties of generated samples in a short training time, ensuring the disentanglement and keeping the validity of the generated samples.

Submitted 11 October, 2023; originally announced October 2023.

arXiv:2310.06275 [pdf, other]

High-Fidelity 3D Head Avatars Reconstruction through Spatially-Varying Expression Conditioned Neural Radiance Field

Authors: Minghan Qin, Yifan Liu, Yuelang Xu, Xiaochen Zhao, Yebin Liu, Haoqian Wang

Abstract: One crucial aspect of 3D head avatar reconstruction lies in the details of facial expressions. Although recent NeRF-based photo-realistic 3D head avatar methods achieve high-quality avatar rendering, they still encounter challenges retaining intricate facial expression details because they overlook the potential of specific expression variations at different spatial positions when conditioning the radiance field. Motivated by this observation, we introduce a novel Spatially-Varying Expression (SVE) conditioning. The SVE can be obtained by a simple MLP-based generation network, encompassing both spatial positional features and global expression information. Benefiting from rich and diverse information of the SVE at different positions, the proposed SVE-conditioned neural radiance field can deal with intricate facial expressions and achieve realistic rendering and geometry details of high-fidelity 3D head avatars. Additionally, to further elevate the geometric and rendering quality, we introduce a new coarse-to-fine training strategy, including a geometry initialization strategy at the coarse stage and an adaptive importance sampling strategy at the fine stage. Extensive experiments indicate that our method outperforms other state-of-the-art (SOTA) methods in rendering and geometry quality on mobile phone-collected and public datasets.

Submitted 9 October, 2023; originally announced October 2023.

Comments: 9 pages, 5 figures

arXiv:2310.05381 [pdf, other]

doi 10.1007/978-3-031-44696-2_48

CCAE: A Corpus of Chinese-based Asian Englishes

Authors: Yang Liu, Melissa Xiaohui Qin, Long Wang, Chao Huang

Abstract: Language models have been foundations in various scenarios of NLP applications, but they have not been well applied in language variety studies, even for the most popular language like English. This paper represents one of the few initial efforts to utilize the NLP technology in the paradigm of World Englishes, specifically in creating a multi-variety corpus for studying Asian Englishes. We present an overview of the CCAE -- Corpus of Chinese-based Asian English, a suite of corpora comprising six Chinese-based Asian English varieties. It is based on 340 million tokens in 448 thousand web documents from six regions. The ontology of data would make the corpus a helpful resource with enormous research potential for Asian Englishes (especially for Chinese Englishes, for which there has not been a publicly accessible corpus so far) and an ideal source for variety-specific language modeling and downstream tasks, thus setting the stage for NLP-based World Englishes studies. Preliminary experiments on this corpus reveal the practical value of CCAE. Finally, we make CCAE available at https://huggingface.co/datasets/CCAE/CCAE-Corpus.

Submitted 8 October, 2023; originally announced October 2023.

Comments: NLPCC'2023 (12 pages, 3 figures, 4 charts)

MSC Class: 68T50; ACM Class: I.2.7

arXiv:2310.03269 [pdf, other]

InstructProtein: Aligning Human and Protein Language via Knowledge Instruction

Authors: Zeyuan Wang, Qiang Zhang, Keyan Ding, Ming Qin, Xiang Zhuang, Xiaotong Li, Huajun Chen

Abstract: Large Language Models (LLMs) have revolutionized the field of natural language processing, but they fall short in comprehending biological sequences such as proteins. To address this challenge, we propose InstructProtein, an innovative LLM that possesses bidirectional generation capabilities in both human and protein languages: (i) taking a protein sequence as input to predict its textual function description and (ii) using natural language to prompt protein sequence generation. To achieve this, we first pre-train an LLM on both protein and natural language corpora, enabling it to comprehend individual languages. Then supervised instruction tuning is employed to facilitate the alignment of these two distinct languages. Herein, we introduce a knowledge graph-based instruction generation framework to construct a high-quality instruction dataset, addressing annotation imbalance and instruction deficits in existing protein-text corpus. In particular, the instructions inherit the structural relations between proteins and function annotations in knowledge graphs, which empowers our model to engage in the causal modeling of protein functions, akin to the chain-of-thought processes in natural languages. Extensive experiments on bidirectional protein-text generation tasks show that InstructProtein outperforms state-of-the-art LLMs by large margins. Moreover, InstructProtein serves as a pioneering step towards text-based protein function prediction and sequence design, effectively bridging the gap between protein and human language understanding.

Submitted 4 October, 2023; originally announced October 2023.

arXiv:2309.13630 [pdf, other]

doi 10.1103/PhysRevB.109.L161103

Absence of spin liquid phase in the $J_1-J_2$ Heisenberg model on the square lattice

Authors: Xiangjian Qian, Mingpu Qin

Abstract: We perform an in-depth investigation of the phase diagram of the $J_1-J_2$ Heisenberg model on the square lattice. We take advantage of Density Matrix Renormalization Group and Fully-Augmented Matrix Product States methods and reach unprecedented accuracy with large bond dimensions. We utilize excited-level crossing analysis to pinpoint the phase transition points. It was believed before that there exists a narrow spin liquid phase sandwiched by the Néel antiferromagnetic (AFM) and valence bond solid (VBS) phases. Through careful finite size scaling of the level crossing points, we find a direct phase transition between the Néel AFM and VBS phases at $J_2/J_1 = 0.535(3)$, suggesting the absence of an intermediate spin liquid phase. We also provide accurate results for ground state energies for a variety of sizes, from which we find that the transition between the Néel AFM and VBS phases is continuous. These results indicate the existence of a deconfined quantum critical point at $J_2/J_1 = 0.535(3)$ in the model. From the crossing of the first derivative of the energies with $J_2$ for different sizes, we also determine the precise location of the first order phase transition between the VBS and stripe AFM phases at $J_2/J_1=0.610(5)$.

Submitted 7 April, 2024; v1 submitted 24 September, 2023; originally announced September 2023.

Comments: close to the published version

Journal ref: Phys. Rev. B 109, L161103 (2024)

arXiv:2309.12891 [pdf, other]

EarnHFT: Efficient Hierarchical Reinforcement Learning for High Frequency Trading

Authors: Molei Qin, Shuo Sun, Wentao Zhang, Haochong Xia, Xinrun Wang, Bo An

Abstract: High-frequency trading (HFT) uses computer algorithms to make trading decisions in short time scales (e.g., second-level), which is widely used in the Cryptocurrency (Crypto) market (e.g., Bitcoin). Reinforcement learning (RL) in financial research has shown stellar performance on many quantitative trading tasks. However, most methods focus on low-frequency trading, e.g., day-level, which cannot be directly applied to HFT because of two challenges. First, RL for HFT involves dealing with extremely long trajectories (e.g., 2.4 million steps per month), which is hard to optimize and evaluate. Second, the dramatic price fluctuations and market trend changes of Crypto make existing algorithms fail to maintain satisfactory performance. To tackle these challenges, we propose an Efficient hieArchical Reinforcement learNing method for High Frequency Trading (EarnHFT), a novel three-stage hierarchical RL framework for HFT. In stage I, we compute a Q-teacher, i.e., the optimal action value based on dynamic programming, for enhancing the performance and training efficiency of second-level RL agents. In stage II, we construct a pool of diverse RL agents for different market trends, distinguished by return rates, where hundreds of RL agents are trained with different preferences of return rates and only a tiny fraction of them will be selected into the pool based on their profitability. In stage III, we train a minute-level router which dynamically picks a second-level agent from the pool to achieve stable performance across different markets. Through extensive experiments in various market trends on Crypto markets in a high-fidelity simulation trading environment, we demonstrate that EarnHFT significantly outperforms 6 state-of-the-art baselines in 6 popular financial criteria, exceeding the runner-up by 30% in profitability.

Submitted 22 September, 2023; originally announced September 2023.

arXiv:2309.08941 [pdf, ps, other]

Quantum Pseudorandom Scramblers

Authors: Chuhan Lu, Minglong Qin, Fang Song, Penghui Yao, Mingnan Zhao

Abstract: Quantum pseudorandom state generators (PRSGs) have stimulated exciting developments in recent years. A PRSG, on a fixed initial (e.g., all-zero) state, produces an output state that is computationally indistinguishable from a Haar random state. However, pseudorandomness of the output state is not guaranteed on other initial states. In fact, known PRSG constructions provably fail on some initial state. In this work, we propose and construct quantum Pseudorandom State Scramblers (PRSSs), which can produce a pseudorandom state on an arbitrary initial state. In the information-theoretical setting, we obtain a scrambler which maps an arbitrary initial state to a distribution of quantum states that is close to Haar random in total variation distance. As a result, our PRSS exhibits a dispersing property. Loosely, it can span an $ε$-net of the state space. This significantly strengthens what standard PRSGs can induce, as they may only concentrate on a small region of the state space as long as the average output state approximates a Haar random state in total variation distance. Our PRSS construction develops a parallel extension of the famous Kac's walk, and we show that it mixes exponentially faster than the standard Kac's walk. This constitutes the core of our proof. We also describe a few applications of PRSSs. While our PRSS construction assumes a post-quantum one-way function, PRSSs are potentially a weaker primitive and can be separated from one-way functions in a relativized world similar to standard PRSGs.

Submitted 16 September, 2023; originally announced September 2023.
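
For readers unfamiliar with the standard Kac's walk named in the abstract of the entry above, the following is a minimal sketch of the classical version on a real unit vector: each step picks two distinct coordinates at random and rotates them by a random angle, which preserves the Euclidean norm. This is an illustration only. The function names (`kac_step`, `kac_walk`, `norm`), the fixed seed, and the use of real rather than complex amplitudes are assumptions made here; the parallel extension developed in the paper is not reproduced.

```python
import math
import random


def kac_step(x, rng):
    """Apply one step of the classical Kac's walk to the list x, in place."""
    # Pick two distinct coordinates uniformly at random.
    i, j = rng.sample(range(len(x)), 2)
    # Pick a rotation angle uniformly from [0, 2*pi).
    theta = rng.uniform(0.0, 2.0 * math.pi)
    c, s = math.cos(theta), math.sin(theta)
    xi, xj = x[i], x[j]
    # Rotate the (i, j) plane; all other coordinates are untouched.
    x[i] = c * xi - s * xj
    x[j] = s * xi + c * xj


def kac_walk(x0, steps, seed=0):
    """Run `steps` Kac steps from x0 and return the resulting vector."""
    rng = random.Random(seed)  # hypothetical fixed seed, for reproducibility
    x = list(x0)
    for _ in range(steps):
        kac_step(x, rng)
    return x


def norm(x):
    """Euclidean norm of a list of floats."""
    return math.sqrt(sum(v * v for v in x))


if __name__ == "__main__":
    start = [1.0] + [0.0] * 7      # a unit vector in dimension 8
    end = kac_walk(start, steps=200)
    print(norm(end))               # stays on the unit sphere up to rounding
```

Each step touches only two coordinates, so many sequential steps are needed before the vector is spread over the sphere; this slow mixing is the behavior the abstract contrasts with its parallel variant.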

arXiv:2309.07539 [pdf]

doi 10.1029/2023GL103519

Electron Precipitation Observed by ELFIN Using Proton Precipitation as a Proxy for Electromagnetic Ion Cyclotron (EMIC) Waves

Authors: Luisa Capannolo, Wen Li, Qianli Ma, Murong Qin, Xiao-Chen Shen, Vassilis Angelopoulos, Anton Artemyev, Xiao-Jia Zhang, Mirek Hanzelka

Abstract: Electromagnetic Ion Cyclotron (EMIC) waves can drive radiation belt depletion and Low-Earth Orbit (LEO) satellites can detect the resulting electron and proton precipitation. The ELFIN (Electron Losses and Fields InvestigatioN) CubeSats provide an excellent opportunity to study the properties of EMIC-driven electron precipitation with much higher energy and pitch-angle resolution than previously allowed. We collect EMIC-driven electron precipitation events from ELFIN observations and use POES (Polar Orbiting Environmental Satellites) to search for 10s-100s keV proton precipitation nearby as a proxy of EMIC wave activity. Electron precipitation mainly occurs on localized radial scales (0.3 L), over 15-24 MLT and 5-8 L shells, stronger at MeV energies and weaker down to 100-200 keV. Additionally, the observed loss cone pitch-angle distribution agrees with quasilinear predictions at >250 keV (more filled loss cone with increasing energy), while additional mechanisms are needed to explain the observed low-energy precipitation.

Submitted 14 September, 2023; originally announced September 2023.

Comments: This manuscript has been accepted by Geophysical Research Letters in June 2023, currently pending publication. The version here is the accepted version

arXiv:2307.13518 [pdf, other]

doi 10.1103/PhysRevA.109.042202

Effective Hamiltonian approach to the quantum phase transitions in the extended Jaynes-Cummings model

Authors: H. T. Cui, Y. A. Yan, M. Qin, X. X. Yi

Abstract: The study of phase transitions in dissipative quantum systems based on the Liouvillian is often hindered by the difficulty of constructing a time-local master equation when the system-environment coupling is strong. To address this issue, the complex discretization approximation for the environment is proposed to study the quantum phase transition in the extended Jaynes-Cummings model with an infinite number of boson modes. This approach yields a non-Hermitian effective Hamiltonian that can be used to simulate the dynamics of the spin. It is found that the ground state of this effective Hamiltonian determines the spin dynamics in the single-excitation subspace. Depending on the opening of the energy gap and the maximum population of excitations on the spin degree of freedom, three distinct phases can be identified: fast decaying, localized, and stretched dynamics of the spin. This approach can be extended to multiple excitations, and similar dynamics were found in the double-excitation subspace, indicating the robustness of the single-excitation phase.

Submitted 6 April, 2024; v1 submitted 25 July, 2023; originally announced July 2023.

Comments: 12 pages, published version

Journal ref: Phys. Rev. A 109, 042202 (2024)

arXiv:2306.14137 [pdf]

doi 10.1109/LRA.2024.3359548

BotanicGarden: A High-Quality Dataset for Robot Navigation in Unstructured Natural Environments

Authors: Yuanzhi Liu, Yujia Fu, Minghui Qin, Yufeng Xu, Baoxin Xu, Fengdong Chen, Bart Goossens, Poly Z. H. Sun, Hongwei Yu, Chun Liu, Long Chen, Wei Tao, Hui Zhao

Abstract: The rapid developments of mobile robotics and autonomous navigation over the years are largely empowered by public datasets for testing and upgrading, such as sensor odometry and SLAM tasks. Impressive demos and benchmark scores have arisen, which may suggest the maturity of existing navigation techniques. However, these results are primarily based on moderate structured scenario testing. When transitioning to challenging unstructured environments, especially in GNSS-denied, texture-monotonous, and dense-vegetated natural fields, their performance can hardly sustain at a high level and requires further validation and improvement. To bridge this gap, we build a novel robot navigation dataset in a luxuriant botanic garden of more than 48000 m$^2$. Comprehensive sensors are used, including Gray and RGB stereo cameras, spinning and MEMS 3D LiDARs, and low-cost and industrial-grade IMUs, all of which are well calibrated and hardware-synchronized. An all-terrain wheeled robot is employed for data collection, traversing through thick woods, riversides, narrow trails, bridges, and grasslands, which are scarce in previous resources. This yields 33 short and long sequences, forming 17.1 km trajectories in total. Excitingly, both highly-accurate ego-motions and 3D map ground truth are provided, along with fine-annotated vision semantics. We firmly believe that our dataset can advance robot navigation and sensor fusion research to a higher level.

Submitted 2 March, 2024; v1 submitted 25 June, 2023; originally announced June 2023.

Comments: This article has been accepted for publication in IEEE Robotics and Automation Letters

arXiv:2306.10894 [pdf, other]

doi 10.1103/PhysRevB.109.024505

Half-filled stripe to Néel antiferromagnetism transition in the $t'$-Hubbard model on honeycomb lattice

Authors: Yang Shen, Mingpu Qin

Abstract: We study the ground state of the doped Hubbard model on honeycomb lattice with both nearest ($t$) and next-nearest neighboring hoppings ($t'$) in the small doping and strongly interacting region. Previous study on the model without $t'$ showed the ground state is a half-filled stripe. We employ density matrix renormalization group and extrapolate the results with truncation errors in the converged region. In the $t' < 0$ side, we find the half-filled stripe phase at $t' = 0$ is stable against the frustration of $t'$ until a critical point $-0.4 < t'_c < -0.3$, beyond which the ground state switches to anti-ferromagnetic Néel phase with charge modulation. With further increase of $t'$ to $-0.7$, the ground state becomes paramagnetic. In the $t' > 0$ side, the half-filled stripe stretches to $t' \approx 0.7$. We don't find obvious enhancement of pairing for the range of $t'$ studied. We study width-4 cylinders in this work but the results for spin, charge, and pairing correlation agree qualitatively for periodic and anti-periodic boundary conditions in the half-filled stripe and anti-ferromagnetic Néel phases, suggesting the results are likely to be representative for true two-dimensional systems. The half-filled stripe to anti-ferromagnetic Néel phase transition can be realized on real materials or ultra-cold atom platform.

Submitted 19 June, 2023; originally announced June 2023.

Comments: 5 pages, 4 figures

Journal ref: Phys. Rev. B 109, 024505 (2024)

arXiv:2306.07837 [pdf, other]

doi 10.1088/0256-307X/40/12/127401

Effective bi-layer model Hamiltonian and density-matrix renormalization group study for the high-Tc superconductivity in La$_{3}$Ni$_{2}$O$_{7}$ under high pressure

Authors: Yang Shen, Mingpu Qin, Guang-Ming Zhang

Abstract: High-Tc superconductivity with possible $T_{c}\approx 80K$ has been reported in the single crystal of $\text{La}_{3}\text{Ni}_{2}\text{O}_{7}$ under high pressure. Based on the electronic structure given from the density functional theory calculations, we propose an effective bi-layer model Hamiltonian including both $3d_{z^{2}}$ and $3d_{x^{2}-y^{2}}$ orbital electrons of the nickel cations. The main feature of the model is that the $3d_{z^{2}}$ electrons form inter-layer $σ$-bonding and anti-bonding bands via the apical oxygen anions between the two layers, while the $3d_{x^{2}-y^{2}}$ electrons hybridize with the $3d_{z^{2}}$ electrons within each NiO$_2$ plane. The chemical potential difference of these two orbital electrons ensures that the $3d_{z^{2}}$ orbitals are close to half-filling and the $3d_{x^{2}-y^{2}}$ orbitals are near quarter-filling. The strong on-site Hubbard repulsion of the $3d_{z^{2}}$ orbital electrons gives rise to an effective inter-layer antiferromagnetic spin super-exchange $J$. Applying pressure can self-dope holes on the $3d_{z^{2}}$ orbitals with the same amount of electrons doped on the $3d_{x^{2}-y^{2}}$ orbitals. By performing numerical density-matrix renormalization group calculations on a minimum setup and focusing on the limit of large $J$ and small doping of $3d_{z^{2}}$ orbitals, we find the superconducting instability on both the $3d_{z^{2}}$ and $3d_{x^{2}-y^{2}}$ orbitals by calculating the equal-time spin singlet pair-pair correlation function. Our numerical results have provided useful insights in the high-Tc superconductivity in single crystal La$_3$Ni$_2$O$_7$ under high pressure.

Submitted 21 November, 2023; v1 submitted 13 June, 2023; originally announced June 2023.

Comments: 6 pages, 4 figures; published version

Journal ref: Chinese Physics Letters 40, 127401 (2023)

arXiv:2306.04851 [pdf, other]

The Performance of VQE across a phase transition point in the $J_1$-$J_2$ model on kagome lattice

Authors: Yuheng Guo, Mingpu Qin

Abstract: Variational quantum eigensolver (VQE) is an efficient classical-quantum hybrid method to take advantage of quantum computers in the Noisy Intermediate-Scale Quantum (NISQ) era. In this work we test the performance of VQE by studying the $J_1$-$J_2$ anti-ferromagnetic Heisenberg model on the kagome lattice, which is found to display a first order phase transition at $J_2 / J_1 \approx 0.01$. By comparing the VQE states with the exact diagonalization results, we find VQE energies agree well with the exact values in most regions of the parameter space for the 18-site system we studied. However, near the phase transition point, VQE tends to converge to the excited states when the number of variational parameters is not large enough. For the system studied in this work, this issue can be solved by either increasing the number of parameters or by initializing the parameters with converged values for $J_2/J_1$ away from the phase transition point. Our results provide useful guidance for the practical application of VQE on real quantum computers to study strongly correlated quantum many-body systems.

Submitted 7 June, 2023; originally announced June 2023.

Comments: 7 pages, 5 figures
