Universal replication of chaotic characteristics by classical and quantum machine learning

Abstract: Replicating chaotic characteristics of non-linear dynamics by machine learning (ML) has recently drawn wide attention. In this work, we propose that an ML model, trained to predict the state one step ahead from the several most recent historical states, can accurately replicate the bifurcation diagram and the Lyapunov exponents of discrete dynamic systems. The characteristics for different values of the hyper-parameters are captured universally by a single ML model, whereas previous works trained the ML model independently for each fixed value of the hyper-parameters. Our benchmarks on the one- and two-dimensional Logistic maps show that a variational quantum circuit can reproduce the long-term characteristics with higher accuracy than long short-term memory (a well-recognized classical ML model). Our work reveals an essential difference between ML for chaotic characteristics and ML for standard tasks, from the perspective of the relation between performance and model complexity. Our results suggest that the quantum circuit model exhibits potential advantages in mitigating over-fitting and achieving higher accuracy and stability.

Submitted 14 May, 2024; originally announced May 2024.

Comments: 8 pages, 4 figures
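
As a point of reference for the Lyapunov exponents mentioned in the abstract above, the following sketch estimates the exponent of the standard one-dimensional logistic map $x_{n+1} = r x_n (1 - x_n)$ directly from its derivative. It is not the authors' ML model; the function name `lyapunov_logistic`, the initial state, and the iteration counts are illustrative assumptions.

```python
import math


def lyapunov_logistic(r, x0=0.2, burn_in=1000, steps=20000):
    """Estimate the Lyapunov exponent of x -> r*x*(1-x) (illustrative helper)."""
    x = x0
    # Discard the transient so the orbit settles onto the attractor.
    for _ in range(burn_in):
        x = r * x * (1.0 - x)
    total = 0.0
    for _ in range(steps):
        # |f'(x)| = |r*(1 - 2x)|; the floor avoids log(0) if x lands on 0.5.
        total += math.log(max(abs(r * (1.0 - 2.0 * x)), 1e-300))
        x = r * x * (1.0 - x)
    return total / steps


if __name__ == "__main__":
    # r values chosen for illustration: fixed point, 2-cycle, chaotic regime.
    for r in (2.5, 3.2, 3.9):
        print(f"r = {r}: lambda ~ {lyapunov_logistic(r):.3f}")
```

A negative estimate corresponds to a stable fixed point or cycle, and a positive estimate to chaotic behavior; a learned model would presumably be judged by how closely its own iterates reproduce this curve as $r$ varies.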

arXiv:2311.11258 [pdf, other]

doi 10.34133/icomputing.0061

Tensor networks for interpretable and efficient quantum-inspired machine learning

Authors: Shi-Ju Ran, Gang Su

Abstract: It is a critical challenge to simultaneously gain high interpretability and efficiency with the current schemes of deep machine learning (ML). Tensor network (TN), which is a well-established mathematical tool originating from quantum mechanics, has shown its unique advantages in developing efficient ``white-box'' ML schemes. Here, we give a brief review of the inspiring progress made in TN-based ML. On one hand, interpretability of TN ML is accommodated with the solid theoretical foundation based on quantum information and many-body physics. On the other hand, high efficiency can be rendered from the powerful TN representations and the advanced computational techniques developed in quantum many-body physics. With the fast development of quantum computers, TN is expected to conceive novel schemes runnable on quantum hardware, heading towards the ``quantum artificial intelligence'' in the forthcoming future.

Submitted 19 November, 2023; originally announced November 2023.

Comments: 12 pages, 3 figures

Journal ref: Intelligent Computing 2, 0061 (2023)

arXiv:2307.11609 [pdf, other]

Persistent Ballistic Entanglement Spreading with Optimal Control in Quantum Spin Chains

Authors: Ying Lu, Pei Shi, Xiao-Han Wang, Jie Hu, Shi-Ju Ran

Abstract: Entanglement propagation provides a key routine to understand quantum many-body dynamics in and out of equilibrium. In this work, we uncover that the ``variational entanglement-enhancing'' field (VEEF) robustly induces a persistent ballistic spreading of entanglement in quantum spin chains. The VEEF is time dependent, and is optimally controlled to maximize the bipartite entanglement entropy (EE) of the final state. Such a linear growth persists till the EE reaches the genuine saturation $\tilde{S} = - \log_{2} 2^{-\frac{N}{2}}=\frac{N}{2}$ with $N$ the total number of spins. The EE satisfies $S(t) = v t$ for the time $t \leq \frac{N}{2v}$, with $v$ the velocity. These results are in sharp contrast with the behaviors without VEEF, where the EE generally approaches a sub-saturation known as the Page value $\tilde{S}_{P} =\tilde{S} - \frac{1}{2\ln{2}}$ in the long-time limit, and the entanglement growth deviates from being linear before the Page value is reached. The dependence between the velocity and interactions is explored, with $v \simeq 2.76$, $4.98$, and $5.75$ for the spin chains with Ising, XY, and Heisenberg interactions, respectively. We further show that the nonlinear growth of EE emerges with the presence of long-range interactions.

Submitted 21 July, 2023; originally announced July 2023.

Comments: 5 pages, 4 figures
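
The formulas quoted in the abstract above can be evaluated numerically. The sketch below computes the saturation entropy $N/2$, the Page value $N/2 - 1/(2\ln 2)$, and the time $N/(2v)$ at which $S(t) = vt$ saturates, using the velocities stated in the abstract. The chain length `n = 20` and all function names are hypothetical, chosen only for illustration.

```python
import math


def saturation_entropy(n_spins):
    """Maximal bipartite EE (base-2 logarithm) for a half-chain cut: N/2."""
    return n_spins / 2.0


def page_value(n_spins):
    """Page value as quoted in the abstract: N/2 - 1/(2 ln 2)."""
    return saturation_entropy(n_spins) - 1.0 / (2.0 * math.log(2.0))


def saturation_time(n_spins, velocity):
    """Time at which the linear growth S(t) = v*t reaches N/2."""
    return saturation_entropy(n_spins) / velocity


if __name__ == "__main__":
    n = 20  # hypothetical chain length, not taken from the paper
    velocities = {"Ising": 2.76, "XY": 4.98, "Heisenberg": 5.75}  # from the abstract
    print(f"N = {n}: saturation = {saturation_entropy(n)}, "
          f"Page value = {page_value(n):.4f}")
    for name, v in velocities.items():
        print(f"{name}: t_sat = {saturation_time(n, v):.3f}")
```

The gap between the two entropies, $1/(2\ln 2) \approx 0.72$, does not depend on $N$, so it becomes a smaller fraction of the total as the chain grows.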

arXiv:2307.05567 [pdf, other]

Event Extraction as Question Generation and Answering

Authors: Di Lu, Shihao Ran, Joel Tetreault, Alejandro Jaimes

Abstract: Recent work on Event Extraction has reframed the task as Question Answering (QA), with promising results. The advantage of this approach is that it addresses the error propagation issue found in traditional token-based classification approaches by directly predicting event arguments without extracting candidates first. However, the questions are typically based on fixed templates and they rarely leverage contextual information such as relevant arguments. In addition, prior QA-based approaches have difficulty handling cases where there are multiple arguments for the same role. In this paper, we propose QGA-EE, which enables a Question Generation (QG) model to generate questions that incorporate rich contextual information instead of using fixed templates. We also propose dynamic templates to assist the training of QG model. Experiments show that QGA-EE outperforms all prior single-task-based models on the ACE05 English dataset.

Submitted 9 July, 2023; originally announced July 2023.

Comments: Accepted to ACL 2023

arXiv:2306.17695 [pdf, other]

A New Task and Dataset on Detecting Attacks on Human Rights Defenders

Authors: Shihao Ran, Di Lu, Joel Tetreault, Aoife Cahill, Alejandro Jaimes

Abstract: The ability to conduct retrospective analyses of attacks on human rights defenders over time and by location is important for humanitarian organizations to better understand historical or ongoing human rights violations and thus better manage the global impact of such events. We hypothesize that NLP can support such efforts by quickly processing large collections of news articles to detect and summarize the characteristics of attacks on human rights defenders. To that end, we propose a new dataset for detecting Attacks on Human Rights Defenders (HRDsAttack) consisting of crowdsourced annotations on 500 online news articles. The annotations include fine-grained information about the type and location of the attacks, as well as information about the victim(s). We demonstrate the usefulness of the dataset by using it to train and evaluate baseline models on several sub-tasks to predict the annotated characteristics.

Submitted 30 June, 2023; originally announced June 2023.

arXiv:2305.06058 [pdf, other]

Compressing neural network by tensor network with exponentially fewer variational parameters

Authors: Yong Qing, Ke Li, Peng-Fei Zhou, Shi-Ju Ran

Abstract: Neural network (NN) designed for challenging machine learning tasks is in general a highly nonlinear mapping that contains massive variational parameters. High complexity of NN, if unbounded or unconstrained, might unpredictably cause severe issues including over-fitting, loss of generalization power, and unbearable cost of hardware. In this work, we propose a general compression scheme that significantly reduces the variational parameters of NN by encoding them to deep automatically-differentiable tensor network (ADTN) that contains exponentially-fewer free parameters. Superior compression performance of our scheme is demonstrated on several widely-recognized NN's (FC-2, LeNet-5, AlexNet, ZFNet and VGG-16) and datasets (MNIST, CIFAR-10 and CIFAR-100). For instance, we compress two linear layers in VGG-16 with approximately $10^{7}$ parameters to two ADTN's with just 424 parameters, where the testing accuracy on CIFAR-10 is improved from $90.17 \%$ to $91.74\%$. Our work suggests TN as an exceptionally efficient mathematical structure for representing the variational parameters of NN's, which exhibits superior compressibility over the commonly-used matrices and multi-way arrays.

Submitted 3 May, 2024; v1 submitted 10 May, 2023; originally announced May 2023.

Comments: 6 pages, 3 figures for the main text and 3 pages for the appendices

arXiv:2303.06340 [pdf, other]

Intelligent diagnostic scheme for lung cancer screening with Raman spectra data by tensor network machine learning

Authors: Yu-Jia An, Sheng-Chen Bai, Lin Cheng, Xiao-Guang Li, Cheng-en Wang, Xiao-Dong Han, Gang Su, Shi-Ju Ran, Cong Wang

Abstract: Artificial intelligence (AI) has brought tremendous impacts on biomedical sciences from academic research to clinical applications, such as in biomarkers' detection and diagnosis, optimization of treatment, and identification of new therapeutic targets in drug discovery. However, the contemporary AI technologies, particularly deep machine learning (ML), severely suffer from non-interpretability, which might uncontrollably lead to incorrect predictions. Interpretability is particularly crucial to ML for clinical diagnosis as the consumers must gain necessary sense of security and trust from firm grounds or convincing interpretations. In this work, we propose a tensor-network (TN)-ML method to reliably predict lung cancer patients and their stages via screening Raman spectra data of volatile organic compounds (VOCs) in exhaled breath, which are generally suitable as biomarkers and are considered to be an ideal way for non-invasive lung cancer screening. The prediction of TN-ML is based on the mutual distances of the breath samples mapped to the quantum Hilbert space. Thanks to the quantum probabilistic interpretation, the certainty of the predictions can be quantitatively characterized. The accuracy of the samples with high certainty is almost 100$\%$. The incorrectly-classified samples exhibit obviously lower certainty, and thus can be decipherably identified as anomalies, which will be handled by human experts to guarantee high reliability. Our work sheds light on shifting the ``AI for biomedical sciences'' from the conventional non-interpretable ML schemes to the interpretable human-ML interactive approaches, for the purpose of high accuracy and reliability.

Submitted 11 March, 2023; originally announced March 2023.

Comments: 10 pages, 7 figures

arXiv:2212.09955 [pdf, other]

BUMP: A Benchmark of Unfaithful Minimal Pairs for Meta-Evaluation of Faithfulness Metrics

Authors: Liang Ma, Shuyang Cao, Robert L. Logan IV, Di Lu, Shihao Ran, Ke Zhang, Joel Tetreault, Alejandro Jaimes

Abstract: The proliferation of automatic faithfulness metrics for summarization has produced a need for benchmarks to evaluate them. While existing benchmarks measure the correlation with human judgements of faithfulness on model-generated summaries, they are insufficient for diagnosing whether metrics are: 1) consistent, i.e., indicate lower faithfulness as errors are introduced into a summary, 2) effective on human-written texts, and 3) sensitive to different error types (as summaries can contain multiple errors). To address these needs, we present a benchmark of unfaithful minimal pairs (BUMP), a dataset of 889 human-written, minimally different summary pairs, where a single error is introduced to a summary from the CNN/DailyMail dataset to produce an unfaithful summary. We find BUMP complements existing benchmarks in a number of ways: 1) the summaries in BUMP are harder to discriminate and less probable under SOTA summarization models, 2) unlike non-pair-based datasets, BUMP can be used to measure the consistency of metrics, and reveals that the most discriminative metrics tend not to be the most consistent, and 3) unlike datasets containing generated summaries with multiple errors, BUMP enables the measurement of metrics' performance on individual error types.

Submitted 4 June, 2023; v1 submitted 19 December, 2022; originally announced December 2022.

Comments: Accepted as a long main conference paper at ACL 2023

arXiv:2210.14190 [pdf, other]

CrisisLTLSum: A Benchmark for Local Crisis Event Timeline Extraction and Summarization

Authors: Hossein Rajaby Faghihi, Bashar Alhafni, Ke Zhang, Shihao Ran, Joel Tetreault, Alejandro Jaimes

Abstract: Social media has increasingly played a key role in emergency response: first responders can use public posts to better react to ongoing crisis events and deploy the necessary resources where they are most needed. Timeline extraction and abstractive summarization are critical technical tasks to leverage large numbers of social media posts about events. Unfortunately, there are few datasets for benchmarking technical approaches for those tasks. This paper presents CrisisLTLSum, the largest dataset of local crisis event timelines available to date. CrisisLTLSum contains 1,000 crisis event timelines across four domains: wildfires, local fires, traffic, and storms. We built CrisisLTLSum using a semi-automated cluster-then-refine approach to collect data from the public Twitter stream. Our initial experiments indicate a significant gap between the performance of strong baselines compared to the human performance on both tasks. Our dataset, code, and models are publicly available.

Submitted 25 October, 2022; originally announced October 2022.

arXiv:2208.04119 [pdf, other]

Deep Machine Learning Reconstructing Lattice Topology with Strong Thermal Fluctuations

Authors: Xiao-Han Wang, Pei Shi, Bin Xi, Jie Hu, Shi-Ju Ran

Abstract: Applying artificial intelligence to scientific problems (namely AI for science) is currently under hot debate. However, the scientific problems differ much from the conventional ones with images, texts, etc., where new challenges emerge with the unbalanced scientific data and complicated effects from the physical setups. In this work, we demonstrate the validity of the deep convolutional neural network (CNN) on reconstructing the lattice topology (i.e., spin connectivities) in the presence of strong thermal fluctuations and unbalanced data. Taking the kinetic Ising model with Glauber dynamics as an example, the CNN maps the time-dependent local magnetic momenta (a single-node feature) evolved from a specific initial configuration (dubbed as an evolution instance) to the probabilities of the presences of the possible couplings. Our scheme differs from the previous ones that might require the knowledge on the node dynamics, the responses from perturbations, or the evaluations of statistical quantities such as correlations or transfer entropy from many evolution instances. The fine tuning avoids the "barren plateau" caused by the strong thermal fluctuations at high temperatures. Accurate reconstructions can be made where the thermal fluctuations dominate over the correlations and consequently the statistical methods in general fail. Meanwhile, we unveil the generalization of CNN on dealing with the instances evolved from the unlearnt initial spin configurations and those with the unlearnt lattices. We raise an open question on the learning with unbalanced data in the nearly "double-exponentially" large sample space.

Submitted 8 August, 2022; originally announced August 2022.

Comments: 5 pages, 4 figures

arXiv:2207.06031 [pdf, other]

doi 10.1088/0256-307X/39/10/100701

Unsupervised Recognition of Informative Features via Tensor Network Machine Learning and Quantum Entanglement Variations

Authors: Sheng-Chen Bai, Yi-Cheng Tang, Shi-Ju Ran

Abstract: Given an image of a white shoe drawn on a blackboard, how are the white pixels deemed (say by human minds) to be informative for recognizing the shoe without any labeling information on the pixels? Here we investigate such a ``white shoe'' recognition problem from the perspective of tensor network (TN) machine learning and quantum entanglement. Utilizing a generative TN that captures the probability distribution of the features as quantum amplitudes, we propose an unsupervised recognition scheme of informative features with the variations of entanglement entropy (EE) caused by designed measurements. In this way, a given sample, where the values of its features are statistically meaningless, is mapped to the variations of EE that statistically characterize the gain of information. We show that the EE variations identify the features that are critical to recognize this specific sample, and the EE itself reveals the information distribution of the probabilities represented by the TN model. The signs of the variations further reveal the entanglement structures among the features. We test the validity of our scheme on a toy dataset of strip images, the MNIST dataset of hand-drawn digits, the fashion-MNIST dataset of the pictures of fashion articles, and the images of brain cells. Our scheme opens the avenue to the quantum-inspired and interpreted unsupervised learning, which can be applied to, e.g., image segmentation and object detection.

Submitted 8 August, 2022; v1 submitted 13 July, 2022; originally announced July 2022.

Comments: 7 pages, 8 figures. In the updated version, we added the unsupervised segmentation of images of cells, and the results with multi-qubit measurements

Journal ref: Chinese Physics Letters 39, 100701 (2022)

arXiv:2207.04788 [pdf, other]

DCCF: Deep Comprehensible Color Filter Learning Framework for High-Resolution Image Harmonization

Authors: Ben Xue, Shenghui Ran, Quan Chen, Rongfei Jia, Binqiang Zhao, Xing Tang

Abstract: Image color harmonization algorithm aims to automatically match the color distribution of foreground and background images captured in different conditions. Previous deep learning based models neglect two issues that are critical for practical applications, namely high resolution (HR) image processing and model comprehensibility. In this paper, we propose a novel Deep Comprehensible Color Filter (DCCF) learning framework for high-resolution image harmonization. Specifically, DCCF first downsamples the original input image to its low-resolution (LR) counterpart, then learns four human comprehensible neural filters (i.e. hue, saturation, value and attentive rendering filters) in an end-to-end manner, and finally applies these filters to the original input image to get the harmonized result. Benefiting from the comprehensible neural filters, we could provide a simple yet efficient handler for users to cooperate with the deep model to get the desired results with very little effort when necessary. Extensive experiments demonstrate the effectiveness of the DCCF learning framework, and it outperforms the state-of-the-art post-processing method on the iHarmony4 dataset on images' full-resolutions by achieving 7.63% and 1.69% relative improvements on MSE and PSNR respectively.

Submitted 19 July, 2022; v1 submitted 11 July, 2022; originally announced July 2022.

Comments: ECCV 2022 (Oral)

arXiv:2203.15574 [pdf, other]

doi 10.1103/PhysRevResearch.5.023096

Quantum compiling with a variational instruction set for accurate and fast quantum computing

Authors: Ying Lu, Peng-Fei Zhou, Shao-Ming Fei, Shi-Ju Ran

Abstract: The quantum instruction set (QIS) is defined as the quantum gates that are physically realizable by controlling the qubits in quantum hardware. Compiling quantum circuits into the product of the gates in a properly defined QIS is a fundamental step in quantum computing. We here propose the quantum variational instruction set (QuVIS) formed by flexibly designed multi-qubit gates for higher speed and accuracy of quantum computing. The controlling of qubits for realizing the gates in a QuVIS is variationally achieved using the fine-grained time optimization algorithm. Significant reductions in both the error accumulation and time cost are demonstrated in realizing the swaps of multiple qubits and quantum Fourier transformations, compared with the compiling by a standard QIS such as the quantum microinstruction set (QuMIS, formed by several one- and two-qubit gates including one-qubit rotations and controlled-NOT gates). With the same requirement on quantum hardware, the time cost for QuVIS is reduced to less than one half of that for QuMIS. Simultaneously, the error is suppressed algebraically as the depth of the compiled circuit is reduced. As a general compiling approach with high flexibility and efficiency, QuVIS can be defined for different quantum circuits and be adapted to the quantum hardware with different interactions.

Submitted 16 May, 2023; v1 submitted 29 March, 2022; originally announced March 2022.

Comments: Main text: 8 pages, 5 figures + Supplemental material

Journal ref: Phys. Rev. Research 5, 023096 (2023)

arXiv:2107.00195 [pdf, other]

doi 10.3390/math10060940

Non-parametric Semi-Supervised Learning in Many-body Hilbert Space with Rescaled Logarithmic Fidelity

Authors: Wei-Ming Li, Shi-Ju Ran

Abstract: In quantum and quantum-inspired machine learning, the very first step is to embed the data in quantum space known as Hilbert space. Developing quantum kernel function (QKF), which defines the distances among the samples in the Hilbert space, belongs to the fundamental topics for machine learning. In this work, we propose the rescaled logarithmic fidelity (RLF) and non-parametric semi-supervised learning in the quantum space, which we name as RLF-NSSL. The rescaling takes advantage of the non-linearity of the kernel to tune the mutual distances of samples in the Hilbert space, and meanwhile avoids the exponentially-small fidelities between quantum many-qubit states. Being non-parametric excludes the possible effects from the variational parameters, and evidently demonstrates the advantages from the space itself. We compare RLF-NSSL with several well-known non-parametric algorithms including naive Bayes classifiers, k-nearest neighbors, and spectral clustering. Our method exhibits better accuracy particularly for the unsupervised case with no labeled samples and the few-shot cases with small numbers of labeled samples. With the visualizations by t-stochastic neighbor embedding, our results imply that the machine learning in the Hilbert space complies with the principles of maximal coding rate reduction, where the low-dimensional data exhibit within-class compressibility, between-class discrimination, and overall diversity. Our proposals can be applied to other quantum and quantum-inspired machine learning, including the methods using the parametric models such as tensor networks, quantum circuits, and quantum neural networks.

Submitted 16 September, 2021; v1 submitted 30 June, 2021; originally announced July 2021.

Comments: 8 pages, 5 figures

Journal ref: Mathematics 10, 940 (2022)

arXiv:2106.03126 [pdf, other]

doi 10.21468/SciPostPhysCore.4.3.022

Predicting Quantum Potentials by Deep Neural Network and Metropolis Sampling

Authors: Rui Hong, Peng-Fei Zhou, Bin Xi, Jie Hu, An-Chun Ji, Shi-Ju Ran

Abstract: The hybridizations of machine learning and quantum physics have had an essential impact on the methodology in both fields. Inspired by quantum potential neural network, we here propose to solve the potential in the Schrödinger equation provided the eigenstate, by combining Metropolis sampling with deep neural network, which we dub the Metropolis potential neural network (MPNN). A loss function is proposed to explicitly involve the energy in the optimization for its accurate evaluation. Benchmarking on the harmonic oscillator and hydrogen atom, MPNN shows excellent accuracy and stability on predicting not just the potential to satisfy the Schrödinger equation, but also the eigen-energy. Our proposal could be potentially applied to the ab-initio simulations, and to inversely solving other partial differential equations in physics and beyond.

Submitted 8 August, 2021; v1 submitted 6 June, 2021; originally announced June 2021.

Journal ref: SciPost Phys. Core 4, 022 (2021)
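
To illustrate the Metropolis sampling ingredient named in the abstract above, the sketch below draws positions from the probability density of the harmonic-oscillator ground state, $|\psi_0(x)|^2 \propto e^{-x^2}$ (assuming units with $\hbar = m = \omega = 1$). This is a generic textbook Metropolis sampler, not the MPNN method itself; the function names, step size, and sample count are assumptions made for illustration.

```python
import math
import random


def ground_state_log_density(x):
    """Unnormalized log |psi_0(x)|^2 for the harmonic oscillator (hbar=m=omega=1)."""
    return -x * x


def metropolis_samples(log_prob, n_samples, step=1.0, x0=0.0, seed=0):
    """Draw samples from exp(log_prob) using a symmetric uniform proposal."""
    rng = random.Random(seed)
    x, lp = x0, log_prob(x0)
    samples = []
    for _ in range(n_samples):
        x_new = x + rng.uniform(-step, step)
        lp_new = log_prob(x_new)
        # Accept with probability min(1, p(x_new) / p(x)).
        if rng.random() < math.exp(min(0.0, lp_new - lp)):
            x, lp = x_new, lp_new
        samples.append(x)  # a rejected move repeats the current state
    return samples


if __name__ == "__main__":
    xs = metropolis_samples(ground_state_log_density, 20000)
    mean = sum(xs) / len(xs)
    var = sum((x - mean) ** 2 for x in xs) / len(xs)
    # For this density the exact mean is 0 and the exact variance is 1/2.
    print(f"mean ~ {mean:.3f}, variance ~ {var:.3f}")
```

Samples of this kind could serve as the points at which a loss based on the Schrödinger equation is evaluated, although how the paper actually uses them is not specified in the abstract.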

arXiv:2106.01779 [pdf, other]

doi 10.1103/PhysRevA.104.052413

Preparation of Many-body Ground States by Time Evolution with Variational Microscopic Magnetic Fields and Incomplete Interactions

Authors: Ying Lu, Yue-Min Li, Peng-Fei Zhou, Shi-Ju Ran

Abstract: State preparation is of fundamental importance in quantum physics, which can be realized by constructing the quantum circuit as a unitary that transforms the initial state to the target, or implementing a quantum control protocol to evolve to the target state with a designed Hamiltonian. In this work, we study the latter on quantum many-body systems by the time evolution with fixed couplings and variational magnetic fields. Specifically, we consider preparing the ground states of the Hamiltonians containing certain interactions that are missing in the Hamiltonians for the time evolution. An optimization method is proposed to optimize the magnetic fields by "fine-graining" the discretization of time, in order to gain high precision and stability. The back propagation technique is utilized to obtain the gradients of the fields against the logarithmic fidelity. Our method is tested on preparing the ground state of the Heisenberg chain with the time evolution by the XY and Ising interactions, and its performance surpasses two baseline methods that use local and global optimization strategies, respectively. Our work can be applied and generalized to other quantum models such as those defined on higher dimensional lattices. It sheds light on reducing the complexity of the required interactions for implementing quantum control or other tasks in quantum information and computation by means of optimizing the magnetic fields.

Submitted 21 November, 2021; v1 submitted 3 June, 2021; originally announced June 2021.

arXiv:2104.14949 [pdf, other]

doi 10.1103/PhysRevA.104.042601

Automatically Differentiable Quantum Circuit for Many-qubit State Preparation

Authors: Peng-Fei Zhou, Rui Hong, Shi-Ju Ran

Abstract: Constructing quantum circuits for efficient state preparation belongs to the central topics in the field of quantum information and computation. As the number of qubits grows fast, methods to derive large-scale quantum circuits are strongly desired. In this work, we propose the automatically differentiable quantum circuit (ADQC) approach to efficiently prepare arbitrary quantum many-qubit states. A key ingredient is to introduce the latent gates whose decompositions give the unitary gates that form the quantum circuit. The circuit is optimized by updating the latent gates using back propagation to minimize the distance between the evolved and target states. Taking the ground states of quantum lattice models and random matrix product states as examples, with the number of qubits where processing the full coefficients is unlikely, ADQC obtains high fidelities with small numbers of layers $N_L \sim O(1)$. Superior accuracy is reached compared with the existing state-preparation approach based on the matrix product disentangler. The parameter complexity of MPS can be significantly reduced by ADQC with the compression ratio $r \sim O(10^{-3})$. Our work sheds light on the "intelligent construction" of quantum circuits for many-qubit systems by combining with the machine learning methods.

Submitted 30 April, 2021; originally announced April 2021.

Comments: 5 pages, 5 figures

Journal ref: Phys. Rev. A 104, 042601 (2021)

arXiv:2012.11841 [pdf, other]

doi 10.21468/SciPostPhys.14.6.142

Residual Matrix Product State for Machine Learning

Authors: Ye-Ming Meng, Jing Zhang, Peng Zhang, Chao Gao, Shi-Ju Ran

Abstract: Tensor network, which originates from quantum physics, is emerging as an efficient tool for classical and quantum machine learning. Nevertheless, there still exists a considerable accuracy gap between tensor network and the sophisticated neural network models for classical machine learning. In this work, we combine the ideas of matrix product state (MPS), the simplest tensor network structure, and residual neural network and propose the residual matrix product state (ResMPS). The ResMPS can be treated as a network where its layers map the "hidden" features to the outputs (e.g., classifications), and the variational parameters of the layers are the functions of the features of the samples (e.g., pixels of images). This is different from neural network, where the layers map feed-forwardly the features to the output. The ResMPS can be equipped with non-linear activations and dropout layers, and outperforms the state-of-the-art tensor network models in terms of efficiency, stability, and expression power. Besides, ResMPS is interpretable from the perspective of polynomial expansion, where the factorization and exponential machines naturally emerge. Our work contributes to connecting and hybridizing neural and tensor networks, which is crucial to further enhance our understanding of the working mechanisms and improve the performance of both models.

Submitted 3 December, 2021; v1 submitted 22 December, 2020; originally announced December 2020.

Journal ref: SciPost Phys. 14, 142 (2023)

arXiv:2012.03019 [pdf, other]

doi 10.1088/0256-307X/38/11/110301

Deep learning Local Reduced Density Matrices for Many-body Hamiltonian Estimation

Authors: Xinran Ma, Z. C. Tu, Shi-Ju Ran

Abstract: Human experts cannot efficiently access the physical information of quantum many-body states by simply "reading" the coefficients, but have to rely on prior knowledge such as order parameters and quantum measurements. In this work, we demonstrate that a convolutional neural network (CNN) can learn from the coefficients of local reduced density matrices to estimate the physical parameters of the many-body Hamiltonians, such as coupling strengths and magnetic fields, provided the states are the ground states. We propose QubismNet that consists of two main parts: the Qubism map that visualizes the ground states (or the purified reduced density matrices) as images, and a CNN that maps the images to the target physical parameters. By assuming certain constraints on the training set for the sake of balance, QubismNet exhibits impressive powers of learning and generalization on several quantum spin models. While the training samples are restricted to the states from certain ranges of the parameters, QubismNet can accurately estimate the parameters of the states beyond such training regions. For instance, our results show that QubismNet can estimate the magnetic fields near the critical point by learning from the states away from the critical vicinity. Our work illuminates a data-driven way to infer the Hamiltonians that give the designed ground states, and therefore would benefit the existing and future generalizations of quantum technologies such as Hamiltonian-based quantum simulations and state tomography.

Submitted 7 August, 2021; v1 submitted 5 December, 2020; originally announced December 2020.

Comments: 11 pages, 8 figures

Journal ref: Chin. Phys. Lett. 38 110301 (2021)

arXiv:2001.04029 [pdf, other]

doi 10.1103/PhysRevE.102.012152

Tangent-Space Gradient Optimization of Tensor Network for Machine Learning

Authors: Zheng-zhi Sun, Shi-ju Ran, Gang Su

Abstract: The gradient-based optimization method for deep machine learning models suffers from gradient vanishing and exploding problems, particularly when the computational graph becomes deep. In this work, we propose the tangent-space gradient optimization (TSGO) for the probabilistic models to keep the gradients from vanishing or exploding. The central idea is to guarantee the orthogonality between the variational parameters and the gradients. The optimization is then implemented by rotating the parameter vector towards the direction of the gradient. We explain and test TSGO in tensor network (TN) machine learning, where the TN describes the joint probability distribution as a normalized state $|\psi\rangle$ in Hilbert space. We show that the gradient can be restricted to the tangent space of the $\langle\psi|\psi\rangle = 1$ hyper-sphere. Instead of additional adaptive methods to control the learning rate in deep learning, the learning rate of TSGO is naturally determined by the angle $\theta$ as $\eta = \tan\theta$. Our numerical results reveal better convergence of TSGO in comparison to the off-the-shelf Adam.

Submitted 10 January, 2020; originally announced January 2020.

Comments: 5 pages, 4 figures

Journal ref: Phys. Rev. E 102, 012152 (2020)
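
The rotation update described in the TSGO abstract above can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `tsgo_step`, the use of real-valued parameter vectors, and the descent sign convention (rotating away from the gradient so that the loss decreases) are all assumptions made for illustration. The sketch only shows that projecting the gradient onto the tangent space of the unit sphere and stepping with $\eta = \tan\theta$ yields a rotation of the normalized parameter vector by exactly the angle $\theta$.

```python
import math


def tsgo_step(w, grad, theta):
    """One illustrative tangent-space update (hypothetical helper).

    w     : parameter vector (list of floats), normalized internally
    grad  : gradient of the loss with respect to w
    theta : rotation angle in radians, with |theta| < pi/2
    """
    # Normalize w so that <w|w> = 1.
    norm_w = math.sqrt(sum(x * x for x in w))
    w = [x / norm_w for x in w]

    # Project the gradient onto the tangent space of the unit sphere,
    # which makes it orthogonal to the parameter vector.
    overlap = sum(x * g for x, g in zip(w, grad))
    g_tan = [g - overlap * x for x, g in zip(w, grad)]
    g_norm = math.sqrt(sum(g * g for g in g_tan))
    if g_norm < 1e-12:
        return w  # no tangent component: nothing to rotate towards

    # A step of size eta = tan(theta) along the unit tangent direction,
    # followed by renormalization, gives cos(theta)*w - sin(theta)*g_hat.
    eta = math.tan(theta)
    w_new = [x - eta * g / g_norm for x, g in zip(w, g_tan)]
    norm_new = math.sqrt(sum(x * x for x in w_new))
    return [x / norm_new for x in w_new]


if __name__ == "__main__":
    w0 = [1.0, 0.0, 0.0]
    g0 = [0.5, 2.0, 0.0]
    w1 = tsgo_step(w0, g0, theta=0.1)
    # The overlap between old and new vectors equals cos(theta).
    print(sum(a * b for a, b in zip(w0, w1)), math.cos(0.1))
```

Because the step length never depends on the magnitude of the gradient, only on its direction, the update size stays fixed by $\theta$ even when the raw gradient is very small or very large.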

arXiv:1912.12923 [pdf, other]

Bayesian Tensor Network with Polynomial Complexity for Probabilistic Machine Learning

Authors: Shi-Ju Ran

Abstract: It is known that describing or calculating the conditional probabilities of multiple events is exponentially expensive. In this work, Bayesian tensor network (BTN) is proposed to efficiently capture the conditional probabilities of multiple sets of events with polynomial complexity. BTN is a directed acyclic graphical model that forms a subset of TN. To test its validity for exponentially many events, BTN is applied to image recognition, where the classification is mapped to capturing the conditional probabilities in an exponentially large sample space. Competitive performance is achieved by the BTN with simple tree network structures. Analogous to the tensor network simulations of quantum systems, the validity of the simple-tree BTN implies an "area law" of fluctuations in image recognition problems.

Submitted 7 January, 2020; v1 submitted 30 December, 2019; originally announced December 2019.

Comments: 7 pages, 5 figures; in the second version, results of the BTN with a new structure were added; other modifications including the formulation of Bayes' equation in tensor forms were made

arXiv:1907.10290 [pdf, other]

doi 10.1103/PhysRevResearch.2.033293

Quantum Compressed Sensing with Unsupervised Tensor-Network Machine Learning

Authors: Shi-Ju Ran, Zheng-Zhi Sun, Shao-Ming Fei, Gang Su, Maciej Lewenstein

Abstract: We propose tensor-network compressed sensing (TNCS) by combining the ideas of compressed sensing, tensor network (TN), and machine learning, which permits novel and efficient quantum communications of realistic data. The strategy is to use the unsupervised TN machine learning algorithm to obtain the entangled state $|\Psi\rangle$ that describes the probability distribution of a huge amount of classical information considered to be communicated. To transfer a specific piece of information with $|\Psi\rangle$, our proposal is to encode such information in the separable state with the minimal distance to the measured state $|\Phi\rangle$ that is obtained by partially measuring on $|\Psi\rangle$ in a designed way. To this end, a measuring protocol analogous to the compressed sensing with neural-network machine learning is suggested, where the measurements are designed to minimize uncertainty of information from the probability distribution given by $|\Phi\rangle$. In this way, those who have $|\Phi\rangle$ can reliably access the information by simply measuring on $|\Phi\rangle$. We propose q-sparsity to characterize the sparsity of quantum states and the efficiency of the quantum communications by TNCS. The high q-sparsity is essentially due to the fact that the TN states describing nicely the probability distribution obey the area law of entanglement entropy. Testing on realistic datasets (hand-written digits and fashion images), TNCS is shown to possess high efficiency and accuracy, where the security of communications is guaranteed by the fundamental quantum principles.

Submitted 13 October, 2019; v1 submitted 24 July, 2019; originally announced July 2019.

Comments: 5+6 pages, 3+6 figures. Essential changes and new data were added to this new version

Journal ref: Phys. Rev. Research 2, 033293 (2020)

arXiv:1903.10742 [pdf, other]

doi 10.1103/PhysRevB.101.075135

Generative Tensor Network Classification Model for Supervised Machine Learning

Authors: Zheng-Zhi Sun, Cheng Peng, Ding Liu, Shi-Ju Ran, Gang Su

Abstract: Tensor network (TN) has recently triggered extensive interest in developing machine-learning models in quantum many-body Hilbert space. Here we propose a generative TN classification (GTNC) approach for supervised learning. The strategy is to train the generative TN for each class of the samples to construct the classifiers. The classification is implemented by comparing the distance in the many-body Hilbert space. The numerical experiments by GTNC show impressive performance on the MNIST and Fashion-MNIST datasets. The testing accuracy is competitive with the state-of-the-art convolutional neural network while higher than the naive Bayes classifier (a generative classifier) and support vector machine. Moreover, GTNC is more efficient than the existing TN models that are in general discriminative. By investigating the distances in the many-body Hilbert space, we find that (a) the samples are naturally clustering in such a space; and (b) bounding the bond dimensions of the TN's to finite values corresponds to removing redundant information in the image recognition. These two characteristics make GTNC an adaptive and universal model of excellent performance.

Submitted 26 March, 2019; originally announced March 2019.

Comments: 7 pages, 5 figures

Journal ref: Phys. Rev. B 101, 075135 (2020)

arXiv:1803.09111 [pdf, other]

doi 10.3389/fams.2021.716044

Entanglement-guided architectures of machine learning by quantum tensor network

Authors: Yuhan Liu, Xiao Zhang, Maciej Lewenstein, Shi-Ju Ran

Abstract: It is a fundamental, but still elusive question whether the schemes based on quantum mechanics, in particular on quantum entanglement, can be used for classical information processing and machine learning. Even a partial answer to this question would bring important insights to both fields of machine learning and quantum mechanics. In this work, we implement simple numerical experiments, related to pattern/image classification, in which we represent the classifiers by many-qubit quantum states written in the matrix product states (MPS). Classical machine learning algorithm is applied to these quantum states to learn the classical data. We explicitly show how quantum entanglement (i.e., single-site and bipartite entanglement) can emerge in such represented images. Entanglement characterizes here the importance of data, and such information is practically used to guide the architecture of MPS, and improve the efficiency. The number of needed qubits can be reduced to less than 1/10 of the original number, which is within the access of the state-of-the-art quantum computers. We expect such numerical experiments could open new paths in characterizing classical machine learning algorithms, and at the same time shed light on the generic quantum simulations/computations of machine learning tasks.

Submitted 25 June, 2018; v1 submitted 24 March, 2018; originally announced March 2018.

Comments: 10 pages, 5 figures

Journal ref: Front. Appl. Math. Stat., 06 August 2021

Showing 1–24 of 24 results for author: Ran, S