-
BadCLM: Backdoor Attack in Clinical Language Models for Electronic Health Records
Authors:
Weimin Lyu,
Zexin Bi,
Fusheng Wang,
Chao Chen
Abstract:
The advent of clinical language models integrated into electronic health records (EHR) for clinical decision support has marked a significant advancement, leveraging the depth of clinical notes for improved decision-making. Despite their success, the potential vulnerabilities of these models remain largely unexplored. This paper delves into the realm of backdoor attacks on clinical language models, introducing an innovative attention-based backdoor attack method, BadCLM (Bad Clinical Language Models). This technique clandestinely embeds a backdoor within the models, causing them to produce incorrect predictions when a pre-defined trigger is present in inputs, while functioning accurately otherwise. We demonstrate the efficacy of BadCLM through an in-hospital mortality prediction task with the MIMIC III dataset, showcasing its potential to compromise model integrity. Our findings illuminate a significant security risk in clinical decision support systems and pave the way for future endeavors in fortifying clinical language models against such vulnerabilities.
Submitted 6 July, 2024;
originally announced July 2024.
-
STOC-TOT: Stochastic Tree-of-Thought with Constrained Decoding for Complex Reasoning in Multi-Hop Question Answering
Authors:
Zhenyu Bi,
Daniel Hajialigol,
Zhongkai Sun,
Jie Hao,
Xuan Wang
Abstract:
Multi-hop question answering (MHQA) requires a model to retrieve and integrate information from multiple passages to answer a complex question. Recent systems leverage the power of large language models and integrate evidence retrieval with reasoning prompts (e.g., chain-of-thought reasoning) for the MHQA task. However, the complexities in the question types (bridge vs. comparison questions) and the reasoning types (sequential vs. parallel reasoning) require more novel and fine-grained prompting methods to enhance the performance of MHQA under the zero-shot setting. In this paper, we propose STOC-TOT, a stochastic tree-of-thought reasoning prompting method with constrained decoding for MHQA, and conduct a detailed comparison with other reasoning prompts on different question types and reasoning types. Specifically, we construct a tree-like reasoning structure by prompting the model to break down the original question into smaller sub-questions to form different reasoning paths. In addition, we prompt the model to provide a probability estimation for each reasoning path at each reasoning step. At answer time, we conduct constrained decoding on the model to generate more grounded answers and reduce hallucination. Experiments on two MHQA datasets with five large language models show that STOC-TOT outperforms other reasoning prompts by a significant margin.
Submitted 4 July, 2024;
originally announced July 2024.
-
AutoDSL: Automated domain-specific language design for structural representation of procedures with constraints
Authors:
Yu-Zhe Shi,
Haofei Hou,
Zhangqian Bi,
Fanxu Meng,
Xiang Wei,
Lecheng Ruan,
Qining Wang
Abstract:
Accurate representation of procedures in restricted scenarios, such as non-standardized scientific experiments, requires precise depiction of constraints. Unfortunately, a Domain-specific Language (DSL), although an effective tool to express constraints structurally, often requires case-by-case hand-crafting, necessitating customized, labor-intensive efforts. To overcome this challenge, we introduce the AutoDSL framework to automate DSL-based constraint design across various domains. Utilizing domain-specific experimental protocol corpora, AutoDSL optimizes syntactic constraints and abstracts semantic constraints. Quantitative and qualitative analyses of the DSLs designed by AutoDSL across five distinct domains highlight its potential as an auxiliary module for language models, aiming to improve procedural planning and execution.
Submitted 18 June, 2024;
originally announced June 2024.
-
Strong-to-Weak Spontaneous Symmetry Breaking in Mixed Quantum States
Authors:
Leonardo A. Lessa,
Ruochen Ma,
Jian-Hao Zhang,
Zhen Bi,
Meng Cheng,
Chong Wang
Abstract:
Symmetry in mixed quantum states can manifest in two distinct forms: \textit{strong symmetry}, where each individual pure state in the quantum ensemble is symmetric with the same charge, and \textit{weak symmetry}, which applies only to the entire ensemble. This paper explores a novel type of spontaneous symmetry breaking (SSB) where a strong symmetry is broken to a weak one. While the SSB of a weak symmetry is measured by the long-ranged two-point correlation function $\mathrm{Tr}(O_xO^{\dagger}_y\rho)$, the strong-to-weak SSB (SW-SSB) is measured by the fidelity $F(\rho, O_xO^{\dagger}_y\rho O_yO^{\dagger}_x)$, dubbed the \textit{fidelity correlator}. We prove that SW-SSB is a universal property of mixed-state quantum phases, in the sense that the phenomenon of SW-SSB is robust against symmetric low-depth local quantum channels. We also show that the symmetry breaking is "spontaneous" in the sense that the effect of a local symmetry-breaking measurement cannot be recovered locally. We argue that a thermal state at a nonzero temperature in the canonical ensemble (with fixed symmetry charge) should have spontaneously broken strong symmetry. Additionally, we study non-thermal scenarios where decoherence induces SW-SSB, leading to phase transitions described by classical statistical models with bond randomness. In particular, the SW-SSB transition of a decohered Ising model can be viewed as the "ungauged" version of the celebrated toric code decodability transition. We confirm that, in the decohered Ising model, the SW-SSB transition defined by the fidelity correlator is the only physical transition in terms of channel recoverability. We also comment on other (inequivalent) definitions of SW-SSB, through correlation functions with higher Rényi indices.
Submitted 3 July, 2024; v1 submitted 6 May, 2024;
originally announced May 2024.
-
CodeIP: A Grammar-Guided Multi-Bit Watermark for Large Language Models of Code
Authors:
Batu Guan,
Yao Wan,
Zhangqian Bi,
Zheng Wang,
Hongyu Zhang,
Yulei Sui,
Pan Zhou,
Lichao Sun
Abstract:
As Large Language Models (LLMs) are increasingly used to automate code generation, it is often desired to know if the code is AI-generated and by which model, especially for purposes like protecting intellectual property (IP) in industry and preventing academic misconduct in education. Incorporating watermarks into machine-generated content is one way to provide code provenance, but existing solutions are restricted to a single bit or lack flexibility. We present CodeIP, a new watermarking technique for LLM-based code generation. CodeIP enables the insertion of multi-bit information while preserving the semantics of the generated code, improving the strength and diversity of the inserted watermark. This is achieved by training a type predictor to predict the grammar type of the next token to enhance the syntactical and semantic correctness of the generated code. Experiments on a real-world dataset across five programming languages showcase the effectiveness of CodeIP.
Submitted 24 April, 2024;
originally announced April 2024.
-
MugenNet: A Novel Combined Convolution Neural Network and Transformer Network with its Application for Colonic Polyp Image Segmentation
Authors:
Chen Peng,
Zhiqin Qian,
Kunyu Wang,
Qi Luo,
Zhuming Bi,
Wenjun Zhang
Abstract:
Biomedical image segmentation is an important part of disease diagnosis. The term "colonic polyps" refers to polypoid lesions that occur on the surface of the colonic mucosa within the intestinal lumen. In clinical practice, early detection of polyps is conducted through colonoscopy examinations and biomedical image processing. Therefore, accurate polyp image segmentation is of great significance in colonoscopy examinations. Convolutional Neural Network (CNN) is a common automatic segmentation method, but its main disadvantage is the long training time. Transformer utilizes a self-attention mechanism, which essentially assigns different importance weights to each piece of information, thus achieving high computational efficiency during segmentation. However, a potential drawback is the risk of information loss. In the study reported in this paper, based on the well-known hybridization principle, we proposed a method to combine CNN and Transformer to retain the strengths of both, and we applied this method to build a system called MugenNet for colonic polyp image segmentation. We conducted a comprehensive experiment to compare MugenNet with other CNN models on five publicly available datasets. An ablation experiment on MugenNet was conducted as well. The experimental results show that MugenNet achieves significantly higher processing speed and accuracy compared with CNN alone. The broader implication of our work is a method to optimally combine two complementary methods of machine learning.
Submitted 31 March, 2024;
originally announced April 2024.
-
Topological Phases and Phase Transitions with Dipolar Symmetry Breaking
Authors:
Amogh Anakru,
Zhen Bi
Abstract:
Systems with dipole moment conservation have been of recent interest, as they realize both novel quantum dynamics and exotic ground state phases. In this work, we study some generic properties of 1-D and 2-D dipole-conserving fermionic models at integer fillings. We find that a dipolar symmetry-breaking phase can result in a mean-field band insulator whose topological indices can strongly affect the low-energy physics of the dipolar Goldstone modes. We study the 2-D topological phase transition of the mean-field ground states in the presence of the Goldstone modes. The critical theory resembles the 2+1d quantum electrodynamics coupled to massless Dirac fermions with some crucial differences and shows a novel quantum critical point featuring a nontrivial dynamical exponent. We also discuss the analogous case of 1-D dipole-conserving models and the role of topological invariants.
Submitted 28 March, 2024;
originally announced March 2024.
-
Locally Purified Density Operators for Symmetry-Protected Topological Phases in Mixed States
Authors:
Yuchen Guo,
Jian-Hao Zhang,
Hao-Ran Zhang,
Shuo Yang,
Zhen Bi
Abstract:
We propose a tensor network approach known as the locally purified density operator (LPDO) to investigate the classification and characterization of symmetry-protected topological (SPT) phases in open quantum systems. We extend the concept of injectivity, originally associated with matrix product states and projected entangled pair states, to LPDOs in $(1+1)D$ and $(2+1)D$ systems, unveiling two distinct types of injectivity conditions inherent in short-range entangled density matrices. Within the LPDO framework, we outline a classification scheme for decohered average symmetry-protected topological (ASPT) phases, consistent with earlier results obtained through spectral sequence techniques. We first illustrate our framework with ASPT phases protected by fermion parity symmetry, then extend the classification of ASPT phases to a general group extension. We demonstrate examples of explicit construction of fixed-point LPDOs for ASPT phases including intrinsic ASPTs in both $(1+1)D$ and $(2+1)D$ systems.
Submitted 16 May, 2024; v1 submitted 25 March, 2024;
originally announced March 2024.
-
Iterative Refinement of Project-Level Code Context for Precise Code Generation with Compiler Feedback
Authors:
Zhangqian Bi,
Yao Wan,
Zheng Wang,
Hongyu Zhang,
Batu Guan,
Fangxin Lu,
Zili Zhang,
Yulei Sui,
Hai Jin,
Xuanhua Shi
Abstract:
Large Language Models (LLMs) have shown remarkable progress in automated code generation. Yet, LLM-generated code may contain errors in API usage, class, data structure, or missing project-specific information. As much of this project-specific context cannot fit into the prompts of LLMs, we must find ways to allow the model to explore the project-level code context. We present CoCoGen, a new code generation approach that uses compiler feedback to improve the LLM-generated code. CoCoGen first leverages static analysis to identify mismatches between the generated code and the project's context. It then iteratively aligns and fixes the identified errors using information extracted from the code repository. We integrate CoCoGen with two representative LLMs, i.e., GPT-3.5-Turbo and Code Llama (13B), and apply it to Python code generation. Experimental results show that CoCoGen significantly improves the vanilla LLMs by over 80% in generating code dependent on the project context and consistently outperforms the existing retrieval-based code generation baselines.
Submitted 10 June, 2024; v1 submitted 25 March, 2024;
originally announced March 2024.
-
AI for Biomedicine in the Era of Large Language Models
Authors:
Zhenyu Bi,
Sajib Acharjee Dip,
Daniel Hajialigol,
Sindhura Kommu,
Hanwen Liu,
Meng Lu,
Xuan Wang
Abstract:
The capabilities of AI for biomedicine span a wide spectrum, from the atomic level, where it solves partial differential equations for quantum systems, to the molecular level, predicting chemical or protein structures, and further extending to societal predictions like infectious disease outbreaks. Recent advancements in large language models, exemplified by models like ChatGPT, have showcased significant prowess in natural language tasks, such as translating languages, constructing chatbots, and answering questions. When we consider biomedical data, we observe a resemblance to natural language in terms of sequences: biomedical literature and health records presented as text, biological sequences or sequencing data arranged in sequences, or sensor data like brain signals as time series. The question arises: Can we harness the potential of recent large language models to drive biomedical knowledge discoveries? In this survey, we will explore the application of large language models to three crucial categories of biomedical data: 1) textual data, 2) biological sequences, and 3) brain signals. Furthermore, we will delve into large language model challenges in biomedical research, including ensuring trustworthiness, achieving personalization, and adapting to multi-modal data representation.
Submitted 22 March, 2024;
originally announced March 2024.
-
EasyInstruct: An Easy-to-use Instruction Processing Framework for Large Language Models
Authors:
Yixin Ou,
Ningyu Zhang,
Honghao Gui,
Ziwen Xu,
Shuofei Qiao,
Yida Xue,
Runnan Fang,
Kangwei Liu,
Lei Li,
Zhen Bi,
Guozhou Zheng,
Huajun Chen
Abstract:
In recent years, instruction tuning has gained increasing attention and emerged as a crucial technique to enhance the capabilities of Large Language Models (LLMs). To construct high-quality instruction datasets, many instruction processing approaches have been proposed, aiming to achieve a delicate balance between data quantity and data quality. Nevertheless, due to inconsistencies that persist among various instruction processing methods, there is no standard open-source instruction processing implementation framework available for the community, which hinders practitioners from further developing and advancing. To facilitate instruction processing research and development, we present EasyInstruct, an easy-to-use instruction processing framework for LLMs, which modularizes instruction generation, selection, and prompting, while also considering their combination and interaction. EasyInstruct is publicly released and actively maintained at https://github.com/zjunlp/EasyInstruct, along with an online demo app and a demo video for quick-start, calling for broader research centered on instruction data and synthetic data.
Submitted 23 June, 2024; v1 submitted 5 February, 2024;
originally announced February 2024.
-
Deep Learning for Code Intelligence: Survey, Benchmark and Toolkit
Authors:
Yao Wan,
Yang He,
Zhangqian Bi,
Jianguo Zhang,
Hongyu Zhang,
Yulei Sui,
Guandong Xu,
Hai Jin,
Philip S. Yu
Abstract:
Code intelligence leverages machine learning techniques to extract knowledge from extensive code corpora, with the aim of developing intelligent tools to improve the quality and productivity of computer programming. There is already a thriving research community focusing on code intelligence, with efforts spanning software engineering, machine learning, data mining, natural language processing, and programming languages. In this paper, we conduct a comprehensive literature review on deep learning for code intelligence, from the aspects of code representation learning, deep learning techniques, and application tasks. We also benchmark several state-of-the-art neural models for code intelligence, and provide an open-source toolkit tailored for the rapid prototyping of deep-learning-based code intelligence models. In particular, we inspect the existing code intelligence models from the perspective of code representation learning, and provide a comprehensive overview of the present state of code intelligence. Furthermore, we publicly release the source code and data resources to provide the community with a ready-to-use benchmark, which can facilitate the evaluation and comparison of existing and future code intelligence models (https://xcodemind.github.io). Finally, we point out several challenging and promising directions for future research.
Submitted 30 December, 2023;
originally announced January 2024.
-
An Empirical Study of Scaling Law for OCR
Authors:
Miao Rang,
Zhenni Bi,
Chuanjian Liu,
Yunhe Wang,
Kai Han
Abstract:
The laws of model size, data volume, computation and model performance have been extensively studied in the field of Natural Language Processing (NLP). However, the scaling laws in Optical Character Recognition (OCR) have not yet been investigated. To address this, we conducted comprehensive studies that involved examining the correlation between performance and the scale of models, data volume and computation in the field of text recognition. Conclusively, the study demonstrates smooth power laws between performance and model size, as well as training data volume, when other influencing factors are held constant. Additionally, we have constructed a large-scale dataset called REBU-Syn, which comprises 6 million real samples and 18 million synthetic samples. Based on our scaling law and new dataset, we have successfully trained a scene text recognition model, achieving a new state-of-the-art on 6 common test benchmarks with a top-1 average accuracy of 97.42%. The models and dataset are publicly available at https://github.com/large-ocr-model/large-ocr-model.github.io.
Submitted 31 January, 2024; v1 submitted 28 December, 2023;
originally announced January 2024.
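The abstract above reports smooth power laws between performance and model size. As a minimal sketch of how a law of the form y = a * x^b might be fitted, the following uses ordinary least squares in log-log space. The function name `fit_power_law` and the data points are synthetic placeholders chosen for illustration; they are not results or code from the paper.

```python
import math

def fit_power_law(xs, ys):
    """Fit y = a * x**b by least squares on (log x, log y).

    Assumes all xs and ys are strictly positive.
    """
    log_x = [math.log(x) for x in xs]
    log_y = [math.log(y) for y in ys]
    n = len(log_x)
    mean_x = sum(log_x) / n
    mean_y = sum(log_y) / n
    # Slope of the regression line in log-log space is the exponent b.
    sxx = sum((u - mean_x) ** 2 for u in log_x)
    sxy = sum((u - mean_x) * (v - mean_y) for u, v in zip(log_x, log_y))
    b = sxy / sxx
    # Intercept in log space gives log(a).
    a = math.exp(mean_y - b * mean_x)
    return a, b

# Synthetic example: error rate falling as a power of model size (made-up numbers).
sizes = [1.0, 10.0, 100.0, 1000.0]
errors = [2.0 * s ** -0.5 for s in sizes]
a, b = fit_power_law(sizes, errors)
```

On noise-free synthetic data such as this, the fit recovers the generating coefficient and exponent up to floating-point error; real measurements would of course scatter around the fitted line.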
-
OceanGPT: A Large Language Model for Ocean Science Tasks
Authors:
Zhen Bi,
Ningyu Zhang,
Yida Xue,
Yixin Ou,
Daxiong Ji,
Guozhou Zheng,
Huajun Chen
Abstract:
Ocean science, which delves into the oceans that are reservoirs of life and biodiversity, is of great significance given that oceans cover over 70% of our planet's surface. Recently, advances in Large Language Models (LLMs) have transformed the paradigm in science. Despite the success in other domains, current LLMs often fall short in catering to the needs of domain experts like oceanographers, and the potential of LLMs for ocean science is under-explored. The intrinsic reasons are the immense and intricate nature of ocean data as well as the necessity for higher granularity and richness in knowledge. To alleviate these issues, we introduce OceanGPT, the first-ever large language model in the ocean domain, which is expert in various ocean science tasks. We also propose OceanGPT, a novel framework to automatically obtain a large volume of ocean domain instruction data, which generates instructions based on multi-agent collaboration. Additionally, we construct the first oceanography benchmark, OceanBench, to evaluate the capabilities of LLMs in the ocean domain. Through comprehensive experiments, OceanGPT not only shows a higher level of knowledge expertise for ocean science tasks but also gains preliminary embodied intelligence capabilities in ocean technology.
Submitted 23 May, 2024; v1 submitted 3 October, 2023;
originally announced October 2023.
-
When Do Program-of-Thoughts Work for Reasoning?
Authors:
Zhen Bi,
Ningyu Zhang,
Yinuo Jiang,
Shumin Deng,
Guozhou Zheng,
Huajun Chen
Abstract:
In the realm of embodied artificial intelligence, the reasoning capabilities of Large Language Models (LLMs) play a pivotal role. Although there are effective methods such as program-of-thought prompting, which uses programming languages to tackle complex reasoning tasks, the specific impact of code data on the improvement of reasoning capabilities remains under-explored. To address this gap, we propose the complexity-impacted reasoning score (CIRS), which combines structural and logical attributes, to measure the correlation between code and reasoning abilities. Specifically, we use the abstract syntax tree to encode the structural information and calculate logical complexity by considering the difficulty and the cyclomatic complexity. Through an empirical analysis, we find that not all code data, at every level of complexity, can be learned or understood by LLMs. An optimal level of complexity is critical to the improvement of reasoning abilities by program-aided prompting. We then design an auto-synthesizing and stratifying algorithm, and apply it to instruction generation for mathematical reasoning and code data filtering for code generation tasks. Extensive results demonstrate the effectiveness of our proposed approach. Code will be integrated into the EasyInstruct framework at https://github.com/zjunlp/EasyInstruct.
Submitted 18 December, 2023; v1 submitted 29 August, 2023;
originally announced August 2023.
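As a rough illustration of the structural and cyclomatic-complexity ingredients mentioned in the abstract above, here is a minimal Python sketch built on the standard `ast` module. The set of node types counted as decision points and the helper names `cyclomatic_complexity` and `ast_depth` are assumptions made for illustration; this is not the CIRS score itself and not the authors' implementation.

```python
import ast

# Node types treated as decision points (an illustrative choice, not the paper's).
_BRANCH_NODES = (
    ast.If, ast.For, ast.While, ast.IfExp,
    ast.ExceptHandler, ast.comprehension,
)

def cyclomatic_complexity(source: str) -> int:
    """Approximate McCabe complexity: 1 + number of decision points."""
    tree = ast.parse(source)
    decisions = 0
    for node in ast.walk(tree):
        if isinstance(node, _BRANCH_NODES):
            decisions += 1
        elif isinstance(node, ast.BoolOp):
            # 'a and b and c' contributes one extra path per additional operand.
            decisions += len(node.values) - 1
    return 1 + decisions

def ast_depth(node: ast.AST) -> int:
    """Depth of the syntax tree, a simple proxy for structural complexity."""
    children = list(ast.iter_child_nodes(node))
    return 1 + max((ast_depth(child) for child in children), default=0)

snippet = (
    "def f(x):\n"
    "    if x > 0 and x < 10:\n"
    "        return 1\n"
    "    for i in range(x):\n"
    "        pass\n"
    "    return 0\n"
)
complexity = cyclomatic_complexity(snippet)  # if + 'and' + for = 3 decisions, so 4
depth = ast_depth(ast.parse(snippet))
```

Counting decision points on the tree keeps the sketch independent of formatting and comments, which is one reason syntax-tree-based measures are convenient for filtering code data.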
-
Revealing unusual bandgap shifts with temperature and bandgap renormalization effect in phase-stabilized metal halide perovskite thin films
Authors:
Haochen Zhang,
Zhixuan Bi,
Zehua Zhai,
Han Gao,
Yuwei Liu,
Meiling Jin,
Meng Ye,
Xuanzhang Li,
Haowen Liu,
Yuegang Zhang,
Xiang Li,
Hairen Tan,
Yong Xu,
Luyi Yang
Abstract:
Hybrid organic-inorganic metal halide perovskites are emerging materials in photovoltaics, whose bandgap is one of the most crucial parameters governing their light harvesting performance. Here we present the temperature and photocarrier density dependence of the bandgap in two phase-stabilized perovskite thin films (MA0.3FA0.7PbI3 and MA0.3FA0.7Pb0.5Sn0.5I3) using photoluminescence and absorption spectroscopy. Contrasting bandgap shifts with temperature are observed between the two perovskites. Using X-ray diffraction and in situ high-pressure photoluminescence spectroscopy, we show that thermal expansion plays only a minor role in the large bandgap blueshift, which is attributed to the enhanced structural stability of our samples. Our first-principles calculations further demonstrate the significant impact of thermally induced lattice distortions on the bandgap widening. We propose that the anomalous trends are caused by the competition between static and dynamic distortions. Additionally, both the bandgap renormalization and band filling effects are directly observed for the first time in fluence-dependent photoluminescence measurements and are employed to estimate the exciton effective mass. Our results provide new insights into the basic understanding of thermal and charge-accumulation effects on the band structure of hybrid perovskite thin films.
Submitted 28 November, 2023; v1 submitted 21 August, 2023;
originally announced August 2023.
-
Maximally Localized Wannier Orbitals, Interaction Models and Fractional Quantum Anomalous Hall Effect in Twisted Bilayer MoTe2
Authors:
Cheng Xu,
Jiangxu Li,
Yong Xu,
Zhen Bi,
Yang Zhang
Abstract:
We investigate the moiré band structures and the strong correlation effects in twisted bilayer MoTe$_2$ for a wide range of twist angles, employing a combination of various techniques. Using large-scale first principles calculations, we pinpoint realistic continuum modeling parameters, subsequently deriving the maximally localized Wannier functions for the top three moiré bands. Simplifying our model with reasonable assumptions, we obtain a minimal two-band model, encompassing Coulomb repulsion, correlated hopping, and spin exchange. Our minimal interaction models pave the way for further exploration of the rich many-body physics in twisted MoTe$_2$. Furthermore, we explore the phase diagrams of the system through Hartree-Fock approximation and exact diagonalization. Our two-band exact diagonalization analysis underscores significant band-mixing effects in this system, which enlarge the optimal twist angle for fractional quantum anomalous Hall states.
Submitted 14 January, 2024; v1 submitted 18 August, 2023;
originally announced August 2023.
-
Spin Coherence and Spin Relaxation in Hybrid Organic-Inorganic Lead and Mixed Lead-Tin Perovskites
Authors:
Haochen Zhang,
Zehua Zhai,
Zhixuan Bi,
Han Gao,
Meng Ye,
Yong Xu,
Hairen Tan,
Luyi Yang
Abstract:
Metal halide perovskites make up a promising class of materials for semiconductor spintronics. Here we report a systematic investigation of coherent spin precession, spin dephasing and spin relaxation of electrons and holes in two hybrid organic-inorganic perovskites MA0.3FA0.7PbI3 and MA0.3FA0.7Pb0.5Sn0.5I3 using time-resolved Faraday rotation spectroscopy. With applied in-plane magnetic fields, we observe robust Larmor spin precession of electrons and holes that persists for hundreds of picoseconds. The spin dephasing and relaxation processes are likely to be sensitive to the defect levels. Temperature-dependent measurements give further insights into the spin relaxation channels. The extracted electron Landé g-factors (3.75 and 4.36) are the biggest among the reported values in inorganic or hybrid perovskites. Both the electron and hole g-factors shift dramatically with temperature, which we propose to originate from thermal lattice vibration effects on the band structure. These results lay the foundation for further design and use of lead- and tin-based perovskites for spintronic applications.
Submitted 1 September, 2023; v1 submitted 6 August, 2023;
originally announced August 2023.
-
Fractonic Higher-Order Topological Phases in Open Quantum Systems
Authors:
Jian-Hao Zhang,
Ke Ding,
Shuo Yang,
Zhen Bi
Abstract:
In this work, we study the generalization of decohered average symmetry-protected topological phases to open quantum systems with a combination of subsystem symmetries and global symmetries. In particular, we provide examples of two types of intrinsic average higher-order topological phases with average subsystem symmetries. A classification scheme for these phases based on generalized anomaly cancellation criteria of average symmetry is also discussed.
Submitted 24 October, 2023; v1 submitted 11 July, 2023;
originally announced July 2023.
-
Topological Phases with Average Symmetries: the Decohered, the Disordered, and the Intrinsic
Authors:
Ruochen Ma,
Jian-Hao Zhang,
Zhen Bi,
Meng Cheng,
Chong Wang
Abstract:
Global symmetries greatly enrich the landscape of topological quantum phases, playing an essential role from topological insulators to the fractional quantum Hall effect. Topological phases in mixed quantum states, originating from \textit{decoherence} in open quantum systems or \textit{disorders} in imperfect crystalline solids, have recently garnered significant interest. Unlike pure states, mixed quantum states can exhibit \textit{average symmetries} -- symmetries that keep the total ensemble invariant but not each individual state. In this work, we present a systematic classification and characterization of average symmetry-protected topological (ASPT) phases applicable to generic symmetry groups, encompassing both average and exact symmetries, for bosonic and fermionic systems. Moreover, we formulate the theory of average symmetry-enriched topological (ASET) orders in disordered bosonic systems. Our systematic approach helps clarify nuanced issues in previous literature and uncovers compelling new physics. Notably, we discover that (1) the definition and classification of ASPT phases in decohered and disordered systems exhibit subtle differences; (2) despite these differences, ASPT phases in both settings can be classified and characterized under a unified framework of defect decoration and spectral sequence; (3) this systematic classification uncovers a plethora of ASPT phases that are \textit{intrinsically mixed}, implying they can exclusively manifest in decohered or disordered systems where part of the symmetry is average; (4) similarly for ASET, we find intrinsically disordered phases exhibiting exotic anyon behaviors -- the ground states of such phases necessarily contain localized anyons, with gapless (yet still localized) excitation spectra.
Submitted 19 May, 2024; v1 submitted 25 May, 2023;
originally announced May 2023.
-
The Lobster Eye Imager for Astronomy Onboard the SATech-01 Satellite
Authors:
Z. X. Ling,
X. J. Sun,
C. Zhang,
S. L. Sun,
G. Jin,
S. N. Zhang,
X. F. Zhang,
J. B. Chang,
F. S. Chen,
Y. F. Chen,
Z. W. Cheng,
W. Fu,
Y. X. Han,
H. Li,
J. F. Li,
Y. Li,
Z. D. Li,
P. R. Liu,
Y. H. Lv,
X. H. Ma,
Y. J. Tang,
C. B. Wang,
R. J. Xie,
Y. L. Xue,
A. L. Yan
, et al. (101 additional authors not shown)
Abstract:
The Lobster Eye Imager for Astronomy (LEIA), a pathfinder of the Wide-field X-ray Telescope of the Einstein Probe (EP) mission, was successfully launched onboard the SATech-01 satellite of the Chinese Academy of Sciences on 27 July 2022. In this paper, we introduce the design and on-ground test results of the LEIA instrument. Using state-of-the-art Micro-Pore Optics (MPO), a wide field-of-view (FoV) of 346 square degrees (18.6 degrees × 18.6 degrees) of the X-ray imager is realized. An optical assembly composed of 36 MPO chips is used to focus incident X-ray photons, and four large-format complementary metal-oxide semiconductor (CMOS) sensors, each of 6 cm × 6 cm, are used as the focal plane detectors. The instrument has an angular resolution of 4-8 arcmin (in FWHM) for the central focal spot of the point spread function, and an effective area of 2-3 cm$^2$ at 1 keV in essentially all the directions within the field of view. The detection passband is 0.5-4 keV in the soft X-rays and the sensitivity is (2-3) × 10$^{-11}$ erg s$^{-1}$ cm$^{-2}$ (about 1 mini-Crab) for a 1,000-second observation. The total weight of LEIA is 56 kg and the power is 85 W. The satellite, with a design lifetime of 2 years, operates in a Sun-synchronous orbit of 500 km with an orbital period of 95 minutes. LEIA is paving the way for future missions by verifying in flight the technologies of both novel focusing imaging optics and CMOS sensors for X-ray observation, and by optimizing the working setups of the instrumental parameters. In addition, LEIA is able to carry out scientific observations to find new transients and to monitor known sources in the soft X-ray band, albeit with limited useful observing time available.
Submitted 24 May, 2023;
originally announced May 2023.
-
Triggering Boundary Phase Transitions through Bulk Measurements in 2D Cluster States
Authors:
Yuchen Guo,
Jian-Hao Zhang,
Zhen Bi,
Shuo Yang
Abstract:
We investigate the phase diagram at the boundary of an infinite two-dimensional cluster state subject to bulk measurements using tensor network methods. The state is subjected to uniform measurements $M = \cosθZ+\sinθX$ on the lower boundary qubits and in all bulk qubits. Our results show that the boundary of the system exhibits volume-law entanglement at the measurement angle $θ= π/2$ and area-law entanglement for any $θ< π/2$. Within the area-law phase, a phase transition occurs at $θ_c=1.371$. The phase with $θ\in(θ_c,π/2)$ is characterized by a noninjective matrix product state, which cannot be realized as the unique ground state of a one-dimensional local, gapped Hamiltonian. Instead, it resembles a cat state with spontaneous symmetry breaking. These findings demonstrate that the phase diagram of the boundary of a two-dimensional system can be more intricate than that of a standard one-dimensional system.
Submitted 24 October, 2023; v1 submitted 23 May, 2023;
originally announced May 2023.
-
$\mathbb Z_2$-Nontrivial Moiré Minibands and Interaction-Driven Quantum Anomalous Hall Insulators in Topological Insulator Based Moiré Heterostructures
Authors:
Kaijie Yang,
Zian Xu,
Yanjie Feng,
Frank Schindler,
Yuanfeng Xu,
Zhen Bi,
B. Andrei Bernevig,
Peizhe Tang,
Chao-Xing Liu
Abstract:
We study the electronic band structure and topological properties of a topological insulator thin film under a moiré superlattice potential to search for two-dimensional (2D) $\mathbb Z_2$ non-trivial isolated mini-bands. To model this system, we assume the Fermi energy lies inside the bulk band gap and thus consider an effective model Hamiltonian with only two surface states that are located at the top and bottom surfaces and strongly hybridized with each other. The moiré potential is generated by another layer of 2D insulating material on top of the topological insulator film. In this model, the lowest conduction (highest valence) mini-bands can be $\mathbb Z_2$ non-trivial when the minima (maxima) of the moiré potential approximately form a hexagonal lattice with six-fold rotation symmetry. For the non-trivial conduction mini-band cases, the two lowest Kramers' pairs of conduction mini-bands both have a non-trivial $\mathbb Z_2$ invariant in the presence of inversion, while applying external gate voltages to break inversion leaves only the lowest Kramers' pair of mini-bands topologically non-trivial. The Coulomb interaction can drive the lowest conduction Kramers' mini-bands into the quantum anomalous Hall state when they are half-filled, which is further stabilized by breaking inversion symmetry. We propose monolayer Sb$_{2}$ on top of Sb$_2$Te$_3$ thin films to realize our model, based on results from first-principles calculations.
Submitted 19 April, 2023;
originally announced April 2023.
-
CodeKGC: Code Language Model for Generative Knowledge Graph Construction
Authors:
Zhen Bi,
Jing Chen,
Yinuo Jiang,
Feiyu Xiong,
Wei Guo,
Huajun Chen,
Ningyu Zhang
Abstract:
Current generative knowledge graph construction approaches usually fail to capture structural knowledge by simply flattening natural language into serialized texts or a specification language. However, large generative language models trained on structured data such as code have demonstrated impressive capability in understanding natural language for structural prediction and reasoning tasks. Intuitively, we address the task of generative knowledge graph construction with a code language model: given a code-format natural language input, the target is to generate triples, which can be represented as code completion tasks. Specifically, we develop schema-aware prompts that effectively utilize the semantic structure within the knowledge graph. As code inherently possesses structure, such as class and function definitions, it serves as a useful model for prior semantic structural knowledge. Furthermore, we employ a rationale-enhanced generation method to boost the performance. Rationales provide intermediate steps, thereby improving knowledge extraction abilities. Experimental results indicate that the proposed approach can obtain better performance on benchmark datasets compared with baselines. Code and datasets are available at https://github.com/zjunlp/DeepKE/tree/main/example/llm.
Submitted 18 January, 2024; v1 submitted 18 April, 2023;
originally announced April 2023.
-
Non-Fermi Liquids from Dipolar Symmetry Breaking
Authors:
Amogh Anakru,
Zhen Bi
Abstract:
The emergence of fractonic topological phases and novel universality classes for quantum dynamics highlights the importance of dipolar symmetry in condensed matter systems. In this work, we study the properties of symmetry-breaking phases of the dipolar symmetries in fermionic models in various spatial dimensions. In such systems, fermions obtain energy dispersion through dipole condensation. Due to the nontrivial commutation between the translation symmetry and dipolar symmetry, the Goldstone modes of the dipolar condensate are strongly coupled to the dispersive fermions and naturally give rise to non-Fermi liquids at low energies. The IR description of the dipolar symmetry-breaking phase is analogous to the well-known theory of a Fermi surface coupled to an emergent U(1) gauge field. We also discuss the crossover behavior when the dipolar symmetry is slightly broken and the cases with anisotropic dipolar conservation.
Submitted 30 October, 2023; v1 submitted 3 April, 2023;
originally announced April 2023.
-
Dynamic models for Planar Peristaltic Locomotion of a Metameric Earthworm-like Robot
Authors:
Qinyan Zhou,
Hongbin Fang,
Zhihai Bi,
Jian Xu
Abstract:
The development of versatile robots capable of traversing challenging and irregular environments is of increasing interest in the field of robotics, and metameric robots have been identified as a promising solution due to their slender, deformable bodies. Inspired by the effective locomotion of earthworms, earthworm-like robots capable of both rectilinear and planar locomotion have been designed and prototyped. While much research has focused on developing kinematic models to describe the planar locomotion of earthworm-like robots, the authors argue that the development of dynamic models is critical to improving the accuracy and efficiency of these robots. A comprehensive analysis of the dynamics of a metameric earthworm-like robot capable of planar motion is presented in this work. The model takes into account the complex interactions between the robot's deformable body and the forces acting on it and draws on the methods previously used to develop mathematical models of snake-like robots. The proposed model represents a significant advancement in the field of metameric robotics and has the potential to enhance the performance of earthworm-like robots in a variety of challenging environments, such as underground pipes and tunnels, and serves as a foundation for future research into the dynamics of soft-bodied robots.
Submitted 21 March, 2023;
originally announced March 2023.
-
Cognition of time and thinkings beyond
Authors:
Zedong Bi
Abstract:
A pervasive research protocol of cognitive neuroscience is to train subjects to perform deliberately designed experiments and record brain activity simultaneously, aiming to understand the brain mechanism underlying cognition. However, how the results of this protocol can be applied in technology is seldom discussed. Here, I review the studies on time processing of the brain as examples of this protocol, as well as two main application areas of neuroscience (neuroengineering and brain-inspired artificial intelligence). Time processing is an indispensable dimension of cognition; time is also an indispensable dimension of any real-world signal to be processed in technology. So one may expect that the studies of time processing in cognition profoundly influence brain-related technology. Surprisingly, I found that the results from cognitive studies on timing processing are hardly helpful in solving practical problems. This awkward situation may be due to the lack of generalizability of the results of cognitive studies, which are under well-controlled laboratory conditions, to real-life situations. This lack of generalizability may be rooted in the fundamental unknowability of the world (including cognition). Overall, this paper questions and criticizes the usefulness and prospect of the above-mentioned research protocol of cognitive neuroscience. I then give three suggestions for future research. First, to improve the generalizability of research, it is better to study brain activity under real-life conditions instead of in well-controlled laboratory experiments. Second, to overcome the unknowability of the world, we can engineer an easily accessible surrogate of the object under investigation, so that we can predict the behavior of the object by experimenting on the surrogate. Third, I call for technology-oriented research, with the aim of technology creation instead of knowledge discovery.
Submitted 7 March, 2023;
originally announced March 2023.
-
First wide field-of-view X-ray observations by a lobster eye focusing telescope in orbit
Authors:
C. Zhang,
Z. X. Ling,
X. J. Sun,
S. L. Sun,
Y. Liu,
Z. D. Li,
Y. L. Xue,
Y. F. Chen,
Y. F. Dai,
Z. Q. Jia,
H. Y. Liu,
X. F. Zhang,
Y. H. Zhang,
S. N. Zhang,
F. S. Chen,
Z. W. Cheng,
W. Fu,
Y. X. Han,
H. Li,
J. F. Li,
Y. Li,
P. R. Liu,
X. H. Ma,
Y. J. Tang,
C. B. Wang
, et al. (53 additional authors not shown)
Abstract:
As a novel X-ray focusing technology, lobster eye micro-pore optics (MPO) feature both a wide observing field of view and true imaging capability, promising sky monitoring with significantly improved sensitivity and spatial resolution in soft X-rays. Since first proposed by Angel (1979), the optics have been extensively studied, developed and trialed over the past decades. In this Letter, we report on the first-light results from a flight experiment of the Lobster Eye Imager for Astronomy ($LEIA$), a pathfinder of the wide-field X-ray telescope of the Einstein Probe mission. The piggyback imager, launched in July 2022, has a mostly un-vignetted field of view of $18.6^\circ \times 18.6^\circ $. Its spatial resolution is in the range of 4$-$7 arcmin in FWHM and the focal spot effective area is 2$-$3 cm$^2$, both showing only mild fluctuations across the field of view. We present images of the Galactic center region, Sco X-1 and the diffuse Cygnus Loop nebula taken in snapshot observations over 0.5$-$4 keV. These are truly wide-field X-ray images of celestial bodies observed, for the first time, by a focusing imaging telescope. Initial analyses of the in-flight data show excellent agreement between the observed images and the on-ground calibration and simulations. The instrument and its characterization are briefly described, as well as the flight experiment. The results provide a solid basis for the development of the present and proposed wide-field X-ray missions using lobster eye MPO.
Submitted 17 November, 2022;
originally announced November 2022.
-
Strange Correlation Function for Average Symmetry-Protected Topological Phases
Authors:
Jian-Hao Zhang,
Yang Qi,
Zhen Bi
Abstract:
Average symmetry-protected topological (ASPT) phase is a generalization of symmetry-protected topological phases to disordered systems or open quantum systems. We devise a "strange correlator" in one and two dimensions to detect nontrivial ASPT states. We demonstrate that for a nontrivial ASPT phase this strange correlator exhibits long-range or power-law behavior. We explore the connection between the strange correlators and correlation functions in two-dimensional loop models with quantum corrections, leading to the exact scaling exponents of the strange correlators.
Submitted 9 April, 2024; v1 submitted 31 October, 2022;
originally announced October 2022.
-
Classification and construction of interacting fractonic higher-order topological phases
Authors:
Jian-Hao Zhang,
Meng Cheng,
Zhen Bi
Abstract:
The notion of higher-order topological phases can have interesting generalizations to systems with subsystem symmetries that exhibit fractonic dynamics for charged excitations. In this work, we systematically study the higher-order topological phases protected by a combination of subsystem symmetries and ordinary global symmetries in two and three-dimensional interacting boson systems, with some interacting fermionic examples.
Submitted 27 October, 2022;
originally announced October 2022.
-
Tele-Knowledge Pre-training for Fault Analysis
Authors:
Zhuo Chen,
Wen Zhang,
Yufeng Huang,
Mingyang Chen,
Yuxia Geng,
Hongtao Yu,
Zhen Bi,
Yichi Zhang,
Zhen Yao,
Wenting Song,
Xinliang Wu,
Yi Yang,
Mingyi Chen,
Zhaoyang Lian,
Yingying Li,
Lei Cheng,
Huajun Chen
Abstract:
In this work, we share our experience on tele-knowledge pre-training for fault analysis, a crucial task in telecommunication applications that requires a wide range of knowledge normally found in both machine log data and product documents. To organize this knowledge from experts uniformly, we propose to create a Tele-KG (tele-knowledge graph). Using this valuable data, we further propose a tele-domain language pre-training model TeleBERT and its knowledge-enhanced version, a tele-knowledge re-training model KTeleBERT, which includes effective prompt hints, adaptive numerical data encoding, and two knowledge injection paradigms. Concretely, our proposal includes two stages: first, pre-training TeleBERT on 20 million tele-related corpora, and then re-training it on 1 million causal and machine-related corpora to obtain KTeleBERT. Our evaluation on multiple tasks related to fault analysis in tele-applications, including root-cause analysis, event association prediction, and fault chain tracing, shows that pre-training a language model with tele-domain data is beneficial for downstream tasks. Moreover, the KTeleBERT re-training further improves the performance of task models, highlighting the effectiveness of incorporating diverse tele-knowledge into the model.
Submitted 17 February, 2023; v1 submitted 20 October, 2022;
originally announced October 2022.
-
Decoding Measurement-Prepared Quantum Phases and Transitions: from Ising model to gauge theory, and beyond
Authors:
Jong Yeon Lee,
Wenjie Ji,
Zhen Bi,
Matthew P. A. Fisher
Abstract:
Measurements allow efficient preparation of interesting quantum many-body states with long-range entanglement, conditioned on additional transformations based on measurement outcomes. Here, we demonstrate that the so-called conformal quantum critical points (CQCP) can be obtained by performing general single-site measurements in an appropriate basis on the cluster states in $d\geq2$. The equal-time correlators of the said states are described by correlation functions of certain $d$-dimensional classical models at finite temperatures and feature spatial conformal invariance. This establishes an exact correspondence between the measurement-prepared critical states and conformal field theories of a range of critical spin models, including familiar Ising models and gauge theories. Furthermore, by mapping the long-range entanglement structure of measured quantum states into the correlations of the corresponding thermal spin model, we rigorously establish the stability condition of the long-range entanglement in the measurement-prepared quantum states deviating from the ideal setting. Most importantly, we describe protocols to decode the resulting quantum phases and transitions without post-selection, thus transferring the exponential measurement complexity to a polynomial classical computation. Therefore, our findings suggest a novel mechanism in which a quantum critical wavefunction emerges, providing new practical ways to study quantum phases and conformal quantum critical points.
Submitted 6 September, 2022; v1 submitted 24 August, 2022;
originally announced August 2022.
-
Multi-modal Protein Knowledge Graph Construction and Applications
Authors:
Siyuan Cheng,
Xiaozhuan Liang,
Zhen Bi,
Huajun Chen,
Ningyu Zhang
Abstract:
Existing data-centric methods for protein science generally cannot sufficiently capture and leverage biology knowledge, which may be crucial for many protein tasks. To facilitate research in this field, we create ProteinKG65, a knowledge graph for protein science. Using gene ontology and Uniprot knowledge base as a basis, we transform and integrate various kinds of knowledge with aligned descriptions and protein sequences, respectively, to GO terms and protein entities. ProteinKG65 is mainly dedicated to providing a specialized protein knowledge graph, bringing the knowledge of Gene Ontology to protein function and structure prediction. We also illustrate the potential applications of ProteinKG65 with a prototype. Our dataset can be downloaded at https://w3id.org/proteinkg65.
Submitted 14 November, 2022; v1 submitted 27 May, 2022;
originally announced July 2022.
-
Relphormer: Relational Graph Transformer for Knowledge Graph Representations
Authors:
Zhen Bi,
Siyuan Cheng,
Jing Chen,
Xiaozhuan Liang,
Feiyu Xiong,
Ningyu Zhang
Abstract:
Transformers have achieved remarkable performance in widespread fields, including natural language processing, computer vision and graph mining. However, vanilla Transformer architectures have not yielded promising improvements in Knowledge Graph (KG) representations, where the translational distance paradigm dominates. Note that vanilla Transformer architectures struggle to capture the intrinsically heterogeneous structural and semantic information of knowledge graphs. To this end, we propose a new variant of Transformer for knowledge graph representations dubbed Relphormer. Specifically, we introduce Triple2Seq, which can dynamically sample contextualized sub-graph sequences as the input to alleviate the heterogeneity issue. We propose a novel structure-enhanced self-attention mechanism to encode the relational information and keep the semantic information within entities and relations. Moreover, we utilize masked knowledge modeling for general knowledge graph representation learning, which can be applied to various KG-based tasks including knowledge graph completion, question answering, and recommendation. Experimental results on six datasets show that Relphormer can obtain better performance compared with baselines. Code is available at https://github.com/zjunlp/Relphormer.
Submitted 21 November, 2023; v1 submitted 22 May, 2022;
originally announced May 2022.
-
Molecular-scale Integration of Multi-modal Sensing and Neuromorphic Computing with Organic Electrochemical Transistors
Authors:
Shijie Wang,
Xi Chen,
Chao Zhao,
Yuxin Kong,
Baojun Lin,
Yongyi Wu,
Zhaozhao Bi,
Ziyi Xuan,
Tao Li,
Yuxiang Li,
Wei Zhang,
En Ma,
Zhongrui Wang,
Wei Ma
Abstract:
Bionic learning with fused sensing, memory and processing functions outperforms artificial neural networks running on silicon chips in terms of efficiency and footprint. However, digital hardware implementation of bionic learning suffers from device heterogeneity in sensors and processing cores, which incurs large hardware, energy and time overheads. Here, we present a universal solution to simultaneously perform multi-modal sensing, memory and processing using organic electrochemical transistors with designed architecture and tailored channel morphology, selective ion injection into the crystalline/amorphous regions. The resultant device works as either a volatile receptor that shows multi-modal sensing, or a non-volatile synapse that features record-high 10-bit analog states, low switching stochasticity and good retention without the integration of any extra devices. Homogeneous integration of such devices enables bionic learning functions such as conditioned reflex and real-time cardiac disease diagnosis via reservoir computing, illustrating the promise for future smart edge health informatics.
Submitted 19 February, 2022; v1 submitted 9 February, 2022;
originally announced February 2022.
-
Interaction Enabled Fractonic Higher-Order Topological Phases
Authors:
Julian May-Mann,
Yizhi You,
Taylor L. Hughes,
Zhen Bi
Abstract:
In this work, we present a collection of three-dimensional higher-order symmetry protected topological phases (HOSPTs) with gapless hinge modes that exist only in strongly interacting systems subject to subsystem symmetry constraints. We use a coupled wire construction to generate three families of microscopic lattice models: insulators with helical hinge modes, superconductors with chiral Majorana hinge modes, and fractionalized insulators with helical hinge modes that carry fractional charge. In particular, these HOSPTs do not require spatial symmetry protection, but are instead protected by subsystem symmetries, and support "fractonic" quasiparticle excitations that move within only a low-dimensional sub-manifold of the system. We analyze the anomaly structure for the boundary theory and the entanglement Hamiltonian, and show that the side surfaces of these HOSPTs, despite being partially gapped, exhibit symmetry anomalies, and can only be realized as the boundary of three-dimensional HOSPT phases.
Submitted 2 February, 2022;
originally announced February 2022.
-
OntoProtein: Protein Pretraining With Gene Ontology Embedding
Authors:
Ningyu Zhang,
Zhen Bi,
Xiaozhuan Liang,
Siyuan Cheng,
Haosen Hong,
Shumin Deng,
Jiazhang Lian,
Qiang Zhang,
Huajun Chen
Abstract:
Self-supervised protein language models have proved their effectiveness in learning protein representations. With increasing computational power, current protein language models pre-trained with millions of diverse sequences can advance the parameter scale from million-level to billion-level and achieve remarkable improvement. However, those prevailing approaches rarely consider incorporating knowledge graphs (KGs), which can provide rich structured knowledge facts for better protein representations. We argue that informative biology knowledge in KGs can enhance protein representation with external knowledge. In this work, we propose OntoProtein, the first general framework that incorporates the structure of GO (Gene Ontology) into protein pre-training models. We construct a novel large-scale knowledge graph that consists of GO and its related proteins, in which gene annotation texts or protein sequences describe all nodes in the graph. We propose novel contrastive learning with knowledge-aware negative sampling to jointly optimize the knowledge graph and protein embedding during pre-training. Experimental results show that OntoProtein can surpass state-of-the-art methods with pre-trained protein language models on the TAPE benchmark and yield better performance compared with baselines in protein-protein interaction and protein function prediction. Code and datasets are available at https://github.com/zjunlp/OntoProtein.
Submitted 3 June, 2022; v1 submitted 23 January, 2022;
originally announced January 2022.
-
Benfordness of the Generalized Gamma Distribution
Authors:
Zelong Bi,
Irfan Durmić,
Steven J. Miller
Abstract:
The generalized gamma distribution shows up in many problems related to engineering, hydrology as well as survival analysis. Earlier work has been done that estimated the deviation of the exponential and the Weibull distribution from Benford's Law. We give a mathematical explanation for the Benfordness of the generalized gamma distribution and present a measure for the deviation of the generalized gamma distribution from the Benford distribution.
Submitted 25 January, 2022;
originally announced January 2022.
-
Improving Knowledge Graph Representation Learning by Structure Contextual Pre-training
Authors:
Ganqiang Ye,
Wen Zhang,
Zhen Bi,
Chi Man Wong,
Chen Hui,
Huajun Chen
Abstract:
Representation learning models for Knowledge Graphs (KG) have proven to be effective in encoding structural information and performing reasoning over KGs. In this paper, we propose a novel pre-training-then-fine-tuning framework for knowledge graph representation learning, in which a KG model is firstly pre-trained with triple classification task, followed by discriminative fine-tuning on specific downstream tasks such as entity type prediction and entity alignment. Drawing on the general ideas of learning deep contextualized word representations in typical pre-trained language models, we propose SCoP to learn pre-trained KG representations with structural and contextual triples of the target triple encoded. Experimental results demonstrate that fine-tuning SCoP not only outperforms results of baselines on a portfolio of downstream tasks but also avoids tedious task-specific model design and parameter training.
Submitted 7 December, 2021;
originally announced December 2021.
-
Learning to Ask for Data-Efficient Event Argument Extraction
Authors:
Hongbin Ye,
Ningyu Zhang,
Zhen Bi,
Shumin Deng,
Chuanqi Tan,
Hui Chen,
Fei Huang,
Huajun Chen
Abstract:
Event argument extraction (EAE) is an important task for information extraction to discover specific argument roles. In this study, we cast EAE as a question-based cloze task and empirically analyze fixed discrete token template performance. As generating human-annotated question templates is often time-consuming and labor-intensive, we further propose a novel approach called "Learning to Ask," which can learn optimized question templates for EAE without human annotations. Experiments using the ACE-2005 dataset demonstrate that our method based on optimized questions achieves state-of-the-art performance in both the few-shot and supervised settings.
Submitted 1 October, 2021;
originally announced October 2021.
-
Differentiable Prompt Makes Pre-trained Language Models Better Few-shot Learners
Authors:
Ningyu Zhang,
Luoqiu Li,
Xiang Chen,
Shumin Deng,
Zhen Bi,
Chuanqi Tan,
Fei Huang,
Huajun Chen
Abstract:
Large-scale pre-trained language models have contributed significantly to natural language processing by demonstrating remarkable abilities as few-shot learners. However, their effectiveness depends mainly on scaling the model parameters and prompt design, hindering their implementation in most real-world applications. This study proposes a novel pluggable, extensible, and efficient approach named DifferentiAble pRompT (DART), which can convert small language models into better few-shot learners without any prompt engineering. The main principle behind this approach involves reformulating potential natural language processing tasks into the task of a pre-trained language model and differentially optimizing the prompt template as well as the target label with backpropagation. Furthermore, the proposed approach can be: (i) Plugged to any pre-trained language models; (ii) Extended to widespread classification tasks. A comprehensive evaluation of standard NLP tasks demonstrates that the proposed approach achieves a better few-shot performance. Code is available in https://github.com/zjunlp/DART.
Submitted 4 May, 2022; v1 submitted 30 August, 2021;
originally announced August 2021.
-
CBLUE: A Chinese Biomedical Language Understanding Evaluation Benchmark
Authors:
Ningyu Zhang,
Mosha Chen,
Zhen Bi,
Xiaozhuan Liang,
Lei Li,
Xin Shang,
Kangping Yin,
Chuanqi Tan,
Jian Xu,
Fei Huang,
Luo Si,
Yuan Ni,
Guotong Xie,
Zhifang Sui,
Baobao Chang,
Hui Zong,
Zheng Yuan,
Linfeng Li,
Jun Yan,
Hongying Zan,
Kunli Zhang,
Buzhou Tang,
Qingcai Chen
Abstract:
Artificial Intelligence (AI), along with the recent progress in biomedical language understanding, is gradually changing medical practice. With the development of biomedical language understanding benchmarks, AI applications are widely used in the medical field. However, most benchmarks are limited to English, which makes it challenging to replicate many of the successes in English for other languages. To facilitate research in this direction, we collect real-world biomedical data and present the first Chinese Biomedical Language Understanding Evaluation (CBLUE) benchmark: a collection of natural language understanding tasks including named entity recognition, information extraction, clinical diagnosis normalization, single-sentence/sentence-pair classification, and an associated online platform for model evaluation, comparison, and analysis. To establish evaluation on these tasks, we report empirical results with the current 11 pre-trained Chinese models, and experimental results show that state-of-the-art neural models perform far worse than the human ceiling. Our benchmark is released at https://tianchi.aliyun.com/dataset/dataDetail?dataId=95414&lang=en-us.
Submitted 7 March, 2022; v1 submitted 15 June, 2021;
originally announced June 2021.
-
UCPhrase: Unsupervised Context-aware Quality Phrase Tagging
Authors:
Xiaotao Gu,
Zihan Wang,
Zhenyu Bi,
Yu Meng,
Liyuan Liu,
Jiawei Han,
Jingbo Shang
Abstract:
Identifying and understanding quality phrases from context is a fundamental task in text mining. The most challenging part of this task arguably lies in uncommon, emerging, and domain-specific phrases. The infrequent nature of these phrases significantly hurts the performance of phrase mining methods that rely on sufficient phrase occurrences in the input corpus. Context-aware tagging models, though not restricted by frequency, heavily rely on domain experts for either massive sentence-level gold labels or handcrafted gazetteers. In this work, we propose UCPhrase, a novel unsupervised context-aware quality phrase tagger. Specifically, we induce high-quality phrase spans as silver labels from consistently co-occurring word sequences within each document. Compared with typical context-agnostic distant supervision based on existing knowledge bases (KBs), our silver labels root deeply in the input domain and context, thus having unique advantages in preserving contextual completeness and capturing emerging, out-of-KB phrases. Training a conventional neural tagger based on silver labels usually faces the risk of overfitting phrase surface names. Alternatively, we observe that the contextualized attention maps generated from a transformer-based neural language model effectively reveal the connections between words in a surface-agnostic way. Therefore, we pair such attention maps with the silver labels to train a lightweight span prediction model, which can be applied to new input to recognize (unseen) quality phrases regardless of their surface names or frequency. Thorough experiments on various tasks and datasets, including corpus-level phrase ranking, document-level keyphrase extraction, and sentence-level phrase tagging, demonstrate the superiority of our design over state-of-the-art pre-trained, unsupervised, and distantly supervised methods.
Submitted 28 May, 2021;
originally announced May 2021.
-
Interventional Aspect-Based Sentiment Analysis
Authors:
Zhen Bi,
Ningyu Zhang,
Ganqiang Ye,
Haiyang Yu,
Xi Chen,
Huajun Chen
Abstract:
Recent neural-based aspect-based sentiment analysis approaches, though achieving promising improvement on benchmark datasets, have been reported to suffer from poor robustness when encountering confounders such as non-target aspects. In this paper, we take a causal view to address this issue. We propose a simple yet effective method, namely, Sentiment Adjustment (SENTA), by applying a backdoor adjustment to disentangle those confounding factors. Experimental results on the Aspect Robustness Test Set (ARTS) dataset demonstrate that our approach improves the performance while maintaining accuracy in the original test set.
Submitted 20 April, 2021;
originally announced April 2021.
-
Disentangled Contrastive Learning for Learning Robust Textual Representations
Authors:
Xiang Chen,
Xin Xie,
Zhen Bi,
Hongbin Ye,
Shumin Deng,
Ningyu Zhang,
Huajun Chen
Abstract:
Although the self-supervised pre-training of transformer models has resulted in the revolutionizing of natural language processing (NLP) applications and the achievement of state-of-the-art results with regard to various benchmarks, this process is still vulnerable to small and imperceptible permutations originating from legitimate inputs. Intuitively, the representations should be similar in the feature space with subtle input permutations, while large variations occur with different meanings. This motivates us to investigate the learning of robust textual representation in a contrastive manner. However, it is non-trivial to obtain opposing semantic instances for textual samples. In this study, we propose a disentangled contrastive learning method that separately optimizes the uniformity and alignment of representations without negative sampling. Specifically, we introduce the concept of momentum representation consistency to align features and leverage power normalization while conforming the uniformity. Our experimental results for the NLP benchmarks demonstrate that our approach can obtain better results compared with the baselines, as well as achieve promising improvements with invariance tests and adversarial attacks. The code is available in https://github.com/zxlzr/DCL.
Submitted 22 August, 2021; v1 submitted 10 April, 2021;
originally announced April 2021.
-
Text-guided Legal Knowledge Graph Reasoning
Authors:
Luoqiu Li,
Zhen Bi,
Hongbin Ye,
Shumin Deng,
Hui Chen,
Huaixiao Tou
Abstract:
Recent years have witnessed the prosperity of legal artificial intelligence with the development of technologies. In this paper, we propose a novel legal application of legal provision prediction (LPP), which aims to predict the related legal provisions of affairs. We formulate this task as a challenging knowledge graph completion problem, which requires not only text understanding but also graph reasoning. To this end, we propose a novel text-guided graph reasoning approach. We collect a large amount of real-world legal provision data from the Guangdong government service website and construct a legal dataset called LegalLPP. Extensive experimental results on the dataset show that our approach achieves better performance compared with baselines. The code and dataset are available at https://github.com/zxlzr/LegalPP for reproducibility.
Submitted 22 August, 2021; v1 submitted 6 April, 2021;
originally announced April 2021.
-
Normal vs. Adversarial: Salience-based Analysis of Adversarial Samples for Relation Extraction
Authors:
Luoqiu Li,
Xiang Chen,
Zhen Bi,
Xin Xie,
Shumin Deng,
Ningyu Zhang,
Chuanqi Tan,
Mosha Chen,
Huajun Chen
Abstract:
Recent neural-based relation extraction approaches, though achieving promising improvement on benchmark datasets, have been reported to be vulnerable to adversarial attacks. Thus far, efforts have mostly focused on generating adversarial samples or defending against adversarial attacks, but little is known about the difference between normal and adversarial samples. In this work, we take the first step to leverage the salience-based method to analyze those adversarial samples. We observe that salience tokens have a direct correlation with adversarial perturbations. We further find that the adversarial perturbations are either tokens not existing in the training set or superficial cues associated with relation labels. To some extent, our approach unveils the characteristics of adversarial samples. We release an open-source testbed, "DiagnoseAdv", at https://github.com/zjunlp/DiagnoseAdv.
Submitted 25 November, 2021; v1 submitted 1 April, 2021;
originally announced April 2021.
-
Yang-Lee edge singularity triggered entanglement transition
Authors:
Shao-Kai Jian,
Zhi-Cheng Yang,
Zhen Bi,
Xiao Chen
Abstract:
We show that a class of $\mathcal{PT}$ symmetric non-Hermitian Hamiltonians realizing the Yang-Lee edge singularity exhibits an entanglement transition in the long-time steady state evolved under the Hamiltonian. Such a transition is induced by a level crossing triggered by the critical point associated with the Yang-Lee singularity and hence is first-order in nature. At the transition, the entanglement entropy of the steady state jumps discontinuously from a volume-law to an area-law scaling. We exemplify this mechanism using a one-dimensional transverse field Ising model with additional imaginary fields, as well as the spin-1 Blume-Capel model and the three-state Potts model. We further make a connection to the forced-measurement induced entanglement transition in a Floquet non-unitary circuit subject to continuous measurements followed by post-selections. Our results demonstrate a new mechanism for entanglement transitions in non-Hermitian systems harboring a critical point.
Submitted 11 October, 2021; v1 submitted 11 January, 2021;
originally announced January 2021.
-
On Robustness and Bias Analysis of BERT-based Relation Extraction
Authors:
Luoqiu Li,
Xiang Chen,
Hongbin Ye,
Zhen Bi,
Shumin Deng,
Ningyu Zhang,
Huajun Chen
Abstract:
Fine-tuning pre-trained models has achieved impressive performance on standard natural language processing benchmarks. However, the resultant model generalizability remains poorly understood. We do not know, for example, how excellent performance can lead to the perfection of generalization models. In this study, we analyze a fine-tuned BERT model from different perspectives using relation extraction. We also characterize the differences in generalization techniques according to our proposed improvements. From empirical experimentation, we find that BERT suffers a bottleneck in terms of robustness by way of randomizations, adversarial and counterfactual tests, and biases (i.e., selection and semantic). These findings highlight opportunities for future improvements. Our open-sourced testbed DiagnoseRE is available at https://github.com/zjunlp/DiagnoseRE.
Submitted 25 December, 2021; v1 submitted 14 September, 2020;
originally announced September 2020.
-
Lattice Analysis of $SU(2)$ with 1 Adjoint Dirac Flavor
Authors:
Zhen Bi,
Anthony Grebe,
Gurtej Kanwar,
Patrick Ledwith,
David Murphy,
Michael L. Wagman
Abstract:
Recently $SU(2)$ Yang-Mills theory with one massless adjoint Dirac quark flavor emerges as a novel critical theory that can describe the evolution between a trivial insulator and a topological insulator in AIII class in $3+1$ dimensions. There are several classes of conjectured infrared dynamics for this theory. One possibility is that the theory undergoes spontaneous chiral symmetry breaking, with two massless Goldstone bosons (the scalar diquark and its antiparticle) in the infrared. Another scenario, which is suggested by previous lattice studies by Athenodorou et al., is that the IR sector of the theory is a strongly interacting conformal field theory as the quark mass vanishes. The most recent theoretical proposals argue for a case that in the infrared a composite fermion composed of two quarks and an antiquark becomes massless and non-interacting as the quark mass goes to zero, while other sectors are decoupled from this low-energy fermion. This work expands upon previous studies by including the composite fermion to investigate which of these three potential scenarios captures the infrared behavior of this theory.
Submitted 25 December, 2019;
originally announced December 2019.