Showing 1–50 of 91 results for author: Fei, Y

  1. arXiv:2406.13743  [pdf, other]

    cs.CV cs.AI cs.CL cs.LG cs.MM

    GenAI-Bench: Evaluating and Improving Compositional Text-to-Visual Generation

    Authors: Baiqi Li, Zhiqiu Lin, Deepak Pathak, Jiayao Li, Yixin Fei, Kewen Wu, Tiffany Ling, Xide Xia, Pengchuan Zhang, Graham Neubig, Deva Ramanan

    Abstract: While text-to-visual models now produce photo-realistic images and videos, they struggle with compositional text prompts involving attributes, relationships, and higher-order reasoning such as logic and comparison. In this work, we conduct an extensive human study on GenAI-Bench to evaluate the performance of leading image and video generation models in various aspects of compositional text-to-vis…

    Submitted 21 June, 2024; v1 submitted 19 June, 2024; originally announced June 2024.

    Comments: We open-source our dataset, model, and code at: https://linzhiqiu.github.io/papers/genai_bench ; Project page: https://linzhiqiu.github.io/papers/genai_bench ; GenAI-Bench was first introduced in arxiv:2404.01291. This article extends it with an additional GenAI-Rank benchmark.

  2. arXiv:2405.20681  [pdf, other]

    cs.CR cs.AI

    No Free Lunch Theorem for Privacy-Preserving LLM Inference

    Authors: Xiaojin Zhang, Yulin Fei, Yan Kang, Wei Chen, Lixin Fan, Hai Jin, Qiang Yang

    Abstract: Individuals and businesses have benefited significantly from Large Language Models (LLMs) including PaLM, Gemini and ChatGPT in various ways. For example, LLMs enhance productivity, reduce costs, and enable us to focus on more valuable tasks. Furthermore, LLMs possess the capacity to sift through extensive datasets, uncover underlying patterns, and furnish critical insights that propel the fron…

    Submitted 31 May, 2024; originally announced May 2024.

  3. arXiv:2405.13930  [pdf, other]

    cond-mat.mtrl-sci cs.RO cs.SE

    AlabOS: A Python-based Reconfigurable Workflow Management Framework for Autonomous Laboratories

    Authors: Yuxing Fei, Bernardus Rendy, Rishi Kumar, Olympia Dartsi, Hrushikesh P. Sahasrabuddhe, Matthew J. McDermott, Zheren Wang, Nathan J. Szymanski, Lauren N. Walters, David Milsted, Yan Zeng, Anubhav Jain, Gerbrand Ceder

    Abstract: The recent advent of autonomous laboratories, coupled with algorithms for high-throughput screening and active learning, promises to accelerate materials discovery and innovation. As these autonomous systems grow in complexity, the demand for robust and efficient workflow management software becomes increasingly critical. In this paper, we introduce AlabOS, a general-purpose software framework for…

    Submitted 22 May, 2024; originally announced May 2024.

    Comments: 30 pages, 5 figures

  4. arXiv:2405.02724  [pdf, ps, other]

    cs.LG cs.GT

    Taming Equilibrium Bias in Risk-Sensitive Multi-Agent Reinforcement Learning

    Authors: Yingjie Fei, Ruitu Xu

    Abstract: We study risk-sensitive multi-agent reinforcement learning under general-sum Markov games, where agents optimize the entropic risk measure of rewards with possibly diverse risk preferences. We show that using the regret naively adapted from existing literature as a performance metric could induce policies with equilibrium bias that favor the most risk-sensitive agents and overlook the other agents…

    Submitted 4 May, 2024; originally announced May 2024.

    Comments: 29 pages

  5. arXiv:2404.19295  [pdf, ps, other]

    cond-mat.quant-gas

    Collisional dynamics of symmetric two-dimensional quantum droplets

    Authors: Yanming Hu, Yifan Fei, Xiao-Long Chen, Yunbo Zhang

    Abstract: The collisional dynamics of two symmetric droplets with equal intraspecies scattering lengths and particle number density for each component is studied by solving the corresponding extended Gross-Pitaevskii equation in two dimensions by including a logarithmic correction term in the usual contact interaction. We find the merging droplet after collision experiences a quadrupole oscillation in its s…

    Submitted 30 April, 2024; originally announced April 2024.

    Comments: 7 pages, 4 figures

    Journal ref: Front. Phys. 17, 61505 (2022)

  6. arXiv:2404.11103  [pdf, ps, other]

    cs.DS

    Distribution-Free Testing of Decision Lists with a Sublinear Number of Queries

    Authors: Xi Chen, Yumou Fei, Shyamal Patel

    Abstract: We give a distribution-free testing algorithm for decision lists with $\tilde{O}(n^{11/12}/\varepsilon^3)$ queries. This is the first sublinear algorithm for this problem, which shows that, unlike halfspaces, testing is strictly easier than learning for decision lists. Complementing the algorithm, we show that any distribution-free tester for decision lists must make $\tilde{\Omega}(\sqrt{n})$ queries, o…

    Submitted 17 April, 2024; originally announced April 2024.

    Comments: To appear in STOC 2024

  7. arXiv:2404.10253  [pdf, other]

    cs.DC

    Kilometer-Level Coupled Modeling Using 40 Million Cores: An Eight-Year Journey of Model Development

    Authors: Xiaohui Duan, Yuxuan Li, Zhao Liu, Bin Yang, Juepeng Zheng, Haohuan Fu, Shaoqing Zhang, Shiming Xu, Yang Gao, Wei Xue, Di Wei, Xiaojing Lv, Lifeng Yan, Haopeng Huang, Haitian Lu, Lingfeng Wan, Haoran Lin, Qixin Chang, Chenlin Li, Quanjie He, Zeyu Song, Xuantong Wang, Yangyang Yu, Xilong Fan, Zhaopeng Qu, et al. (16 additional authors not shown)

    Abstract: With current and future leading systems adopting heterogeneous architectures, adapting existing models for heterogeneous supercomputers is of urgent need for improving model resolution and reducing modeling uncertainty. This paper presents our three-week effort on porting a complex earth system model, CESM 2.2, to a 40-million-core Sunway supercomputer. Taking a non-intrusive approach that tries t…

    Submitted 15 April, 2024; originally announced April 2024.

    Comments: 18 pages, 13 figures

  8. arXiv:2404.01563  [pdf]

    eess.IV cs.CV

    Two-Phase Multi-Dose-Level PET Image Reconstruction with Dose Level Awareness

    Authors: Yuchen Fei, Yanmei Luo, Yan Wang, Jiaqi Cui, Yuanyuan Xu, Jiliu Zhou, Dinggang Shen

    Abstract: To obtain high-quality positron emission tomography (PET) while minimizing radiation exposure, a range of methods have been designed to reconstruct standard-dose PET (SPET) from corresponding low-dose PET (LPET) images. However, most current methods merely learn the mapping between single-dose-level LPET and SPET images, but omit the dose disparity of LPET images in clinical scenarios. In this pap…

    Submitted 10 April, 2024; v1 submitted 1 April, 2024; originally announced April 2024.

    Comments: Accepted by ISBI2024

  9. arXiv:2403.16591  [pdf, other]

    cs.LG cs.AI cs.CR

    Deciphering the Interplay between Local Differential Privacy, Average Bayesian Privacy, and Maximum Bayesian Privacy

    Authors: Xiaojin Zhang, Yulin Fei, Wei Chen

    Abstract: The swift evolution of machine learning has led to the emergence of various definitions of privacy due to the threats it poses to privacy, including the concept of local differential privacy (LDP). Although widely embraced and utilized across numerous domains, this conventional approach to measure privacy still exhibits certain limitations, spanning from failure to prevent inferential disclosure to la…

    Submitted 2 April, 2024; v1 submitted 25 March, 2024; originally announced March 2024.

  10. arXiv:2402.19007  [pdf, other]

    cs.CV cs.RO

    DOZE: A Dataset for Open-Vocabulary Zero-Shot Object Navigation in Dynamic Environments

    Authors: Ji Ma, Hongming Dai, Yao Mu, Pengying Wu, Hao Wang, Xiaowei Chi, Yang Fei, Shanghang Zhang, Chang Liu

    Abstract: Zero-Shot Object Navigation (ZSON) requires agents to autonomously locate and approach unseen objects in unfamiliar environments and has emerged as a particularly challenging task within the domain of Embodied AI. Existing datasets for developing ZSON algorithms lack consideration of dynamic obstacles, object attribute diversity, and scene texts, thus exhibiting noticeable discrepancies from real-…

    Submitted 8 July, 2024; v1 submitted 29 February, 2024; originally announced February 2024.

    Comments: This version of the paper has been accepted for publication in IEEE Robotics and Automation Letters (RA-L)

  11. arXiv:2402.18879  [pdf]

    cs.CV

    Dose Prediction Driven Radiotherapy Paramters Regression via Intra- and Inter-Relation Modeling

    Authors: Jiaqi Cui, Yuanyuan Xu, Jianghong Xiao, Yuchen Fei, Jiliu Zhou, Xingcheng Peng, Yan Wang

    Abstract: Deep learning has facilitated the automation of radiotherapy by predicting accurate dose distribution maps. However, existing methods fail to derive the desirable radiotherapy parameters that can be directly input into the treatment planning system (TPS), impeding the full automation of radiotherapy. To enable more thorough automatic radiotherapy, in this paper, we propose a novel two-stage framew…

    Submitted 29 February, 2024; originally announced February 2024.

    Comments: Accepted by ISBI 2024

  12. arXiv:2402.18679  [pdf, other]

    cs.AI cs.LG

    Data Interpreter: An LLM Agent For Data Science

    Authors: Sirui Hong, Yizhang Lin, Bang Liu, Bangbang Liu, Binhao Wu, Danyang Li, Jiaqi Chen, Jiayi Zhang, Jinlin Wang, Li Zhang, Lingyao Zhang, Min Yang, Mingchen Zhuge, Taicheng Guo, Tuo Zhou, Wei Tao, Wenyi Wang, Xiangru Tang, Xiangtao Lu, Xiawu Zheng, Xinbing Liang, Yaying Fei, Yuheng Cheng, Zongze Xu, Chenglin Wu

    Abstract: Large Language Model (LLM)-based agents have demonstrated remarkable effectiveness. However, their performance can be compromised in data science scenarios that require real-time data adjustment, expertise in optimization due to complex dependencies among various tasks, and the ability to identify logical errors for precise reasoning. In this study, we introduce the Data Interpreter, a solution de…

    Submitted 12 March, 2024; v1 submitted 28 February, 2024; originally announced February 2024.

  13. arXiv:2402.15811  [pdf, ps, other]

    cond-mat.quant-gas

    Collective excitations in two-dimensional harmonically trapped quantum droplets

    Authors: Yifan Fei, Xucong Du, Xiao-Long Chen, Yunbo Zhang

    Abstract: The collective excitation modes in quantum droplets trapped in a two-dimensional harmonic potential in the context of symmetric weakly interacting binary bosonic mixtures are studied. By utilizing the linearization technique, the time-dependent extended Gross-Pitaevskii equation, and a sum-rule approach with a variational approximation, the ground state properties and collective excitations of suc…

    Submitted 24 February, 2024; originally announced February 2024.

    Comments: 12 pages, 6 figures

    Journal ref: Phys. Rev. A 109, 053309 (2024) - Published 13 May 2024

  14. arXiv:2402.07866  [pdf, other]

    quant-ph

    Virtual Channel Purification

    Authors: Zhenhuan Liu, Xingjian Zhang, Yue-Yang Fei, Zhenyu Cai

    Abstract: Quantum error mitigation is a key approach for extracting target state properties on state-of-the-art noisy machines and early fault-tolerant devices. Using the ideas from flag fault tolerance and virtual state purification, we develop the virtual channel purification (VCP) protocol, which consumes similar qubit and gate resources as virtual state purification but offers up to exponentially strong…

    Submitted 12 February, 2024; originally announced February 2024.

  15. arXiv:2311.05340  [pdf, other]

    math.CO

    Quotients of Special Classes of Positroids

    Authors: Zhixing Chen, Yumou Fei, Jiyang Gao, Yuxuan Sun, Yuchong Zhang

    Abstract: In this paper, we give a complete characterization of rank $k-1$ positroids that are quotients of the uniform matroid $U_{k,n}$, completing a partial result by Benedetti-Chavez-Jiménez. Furthermore, we show that any pair of concordant positroids with adjacent ranks are related by a cyclic shift on their decorated permutations. We also use the concept of conecklaces to give a full characterization o…

    Submitted 7 January, 2024; v1 submitted 9 November, 2023; originally announced November 2023.

    Comments: 32 pages, 10 figures; This research was carried out as part of the PACE program in the summer of 2023 at Peking University, Beijing; Comments very welcome

    MSC Class: 05B35

  16. arXiv:2310.19651  [pdf, other]

    cs.CL

    Dynamics of Instruction Tuning: Each Ability of Large Language Models Has Its Own Growth Pace

    Authors: Chiyu Song, Zhanchao Zhou, Jianhao Yan, Yuejiao Fei, Zhenzhong Lan, Yue Zhang

    Abstract: Instruction tuning is a burgeoning method to elicit the general intelligence of Large Language Models (LLMs). However, the creation of instruction data is still largely heuristic, leading to significant variation in quantity and quality across existing datasets. While some research advocates for expanding the number of instructions, others suggest that a small set of well-chosen examples is adequa…

    Submitted 22 February, 2024; v1 submitted 30 October, 2023; originally announced October 2023.

  17. arXiv:2310.17976  [pdf, other]

    cs.CL

    InCharacter: Evaluating Personality Fidelity in Role-Playing Agents through Psychological Interviews

    Authors: Xintao Wang, Yunze Xiao, Jen-tse Huang, Siyu Yuan, Rui Xu, Haoran Guo, Quan Tu, Yaying Fei, Ziang Leng, Wei Wang, Jiangjie Chen, Cheng Li, Yanghua Xiao

    Abstract: Role-playing agents (RPAs), powered by large language models, have emerged as a flourishing field of applications. However, a key challenge lies in assessing whether RPAs accurately reproduce the personas of target characters, namely their character fidelity. Existing methods mainly focus on the knowledge and linguistic patterns of characters. This paper, instead, introduces a novel perspective to…

    Submitted 7 June, 2024; v1 submitted 27 October, 2023; originally announced October 2023.

    Comments: ACL 2024

  18. arXiv:2310.14491  [pdf, other]

    cs.CL

    Towards a Mechanistic Interpretation of Multi-Step Reasoning Capabilities of Language Models

    Authors: Yifan Hou, Jiaoda Li, Yu Fei, Alessandro Stolfo, Wangchunshu Zhou, Guangtao Zeng, Antoine Bosselut, Mrinmaya Sachan

    Abstract: Recent work has shown that language models (LMs) have strong multi-step (i.e., procedural) reasoning capabilities. However, it is unclear whether LMs perform these tasks by cheating with answers memorized from the pretraining corpus or via a multi-step reasoning mechanism. In this paper, we try to answer this question by exploring a mechanistic interpretation of LMs for multi-step reasoning tasks. C…

    Submitted 22 October, 2023; originally announced October 2023.

    Comments: This work is published in EMNLP 2023

  19. arXiv:2310.10441  [pdf, other]

    cs.DS math.PR math.ST stat.ML

    Efficiently matching random inhomogeneous graphs via degree profiles

    Authors: Jian Ding, Yumou Fei, Yuanzheng Wang

    Abstract: In this paper, we study the problem of recovering the latent vertex correspondence between two correlated random graphs with vastly inhomogeneous and unknown edge probabilities between different pairs of vertices. Inspired by and extending the matching algorithm via degree profiles by Ding, Ma, Wu and Xu (2021), we obtain an efficient matching algorithm as long as the minimal average degree is at…

    Submitted 16 October, 2023; originally announced October 2023.

    Comments: 44 pages, 3 figures

  20. arXiv:2309.05245  [pdf, ps, other]

    cond-mat.quant-gas quant-ph

    Ground-state Properties and Bogoliubov Modes of a Harmonically Trapped One-Dimensional Quantum Droplet

    Authors: Xucong Du, Yifan Fei, Xiao-Long Chen, Yunbo Zhang

    Abstract: We study the stationary and excitation properties of a one-dimensional quantum droplet in the two-component Bose mixture trapped in a harmonic potential. By constructing the energy functional for the inhomogeneous mixture, we elaborate the extended Gross-Pitaevskii equation applicable to both symmetric and asymmetric mixtures into a universal form, and the equations in two different dimensionl…

    Submitted 11 September, 2023; originally announced September 2023.

    Comments: 11 pages, 7 figures

    Journal ref: Phys. Rev. A 108, 033312 (2023), Published 18 September 2023

  21. arXiv:2309.04735  [pdf, other]

    cs.CC

    Two-State Spin Systems with Negative Interactions

    Authors: Yumou Fei, Leslie Ann Goldberg, Pinyan Lu

    Abstract: We study the approximability of computing the partition functions of two-state spin systems. The problem is parameterized by a $2\times 2$ symmetric matrix. Previous results on this problem were restricted either to the case where the matrix has non-negative entries, or to the case where the diagonal entries are equal, i.e. Ising models. In this paper, we study the generalization to arbitrary…

    Submitted 21 November, 2023; v1 submitted 9 September, 2023; originally announced September 2023.

  22. arXiv:2309.04389  [pdf, other]

    cs.CL cs.CE

    CSPRD: A Financial Policy Retrieval Dataset for Chinese Stock Market

    Authors: Jinyuan Wang, Hai Zhao, Zhong Wang, Zeyang Zhu, Jinhao Xie, Yong Yu, Yongjian Fei, Yue Huang, Dawei Cheng

    Abstract: In recent years, great advances in pre-trained language models (PLMs) have sparked considerable research focus and achieved promising performance on the approach of dense passage retrieval, which aims at retrieving relevant passages from a massive corpus for given questions. However, most existing datasets mainly benchmark the models with factoid queries of general commonsense, while specialised…

    Submitted 11 September, 2023; v1 submitted 8 September, 2023; originally announced September 2023.

  23. arXiv:2308.09597  [pdf, other]

    cs.CL cs.HC

    ChatHaruhi: Reviving Anime Character in Reality via Large Language Model

    Authors: Cheng Li, Ziang Leng, Chenxi Yan, Junyi Shen, Hao Wang, Weishi MI, Yaying Fei, Xiaoyang Feng, Song Yan, HaoSheng Wang, Linkang Zhan, Yaokai Jia, Pingyu Wu, Haozhen Sun

    Abstract: Role-playing chatbots built on large language models have drawn interest, but better techniques are needed to enable mimicking specific fictional characters. We propose an algorithm that controls language models via an improved prompt and memories of the character extracted from scripts. We construct ChatHaruhi, a dataset covering 32 Chinese / English TV / anime characters with over 54k simulated…

    Submitted 18 August, 2023; originally announced August 2023.

    Comments: v1 - First version of technique report

  24. arXiv:2308.04223  [pdf, other]

    eess.SY cs.NE

    Real-Time Progressive Learning: Accumulate Knowledge from Control with Neural-Network-Based Selective Memory

    Authors: Yiming Fei, Jiangang Li, Yanan Li

    Abstract: Memory, as the basis of learning, determines the storage, update and forgetting of knowledge and further determines the efficiency of learning. Featured with the mechanism of memory, a radial basis function neural network based learning control scheme named real-time progressive learning (RTPL) is proposed to learn the unknown dynamics of the system with guaranteed stability and closed-loop perfor…

    Submitted 24 November, 2023; v1 submitted 8 August, 2023; originally announced August 2023.

    Comments: 15 pages, 16 figures

    MSC Class: 93-10

  25. arXiv:2308.01469  [pdf, other]

    cs.LG cs.AI cs.CR

    VertexSerum: Poisoning Graph Neural Networks for Link Inference

    Authors: Ruyi Ding, Shijin Duan, Xiaolin Xu, Yunsi Fei

    Abstract: Graph neural networks (GNNs) have brought superb performance to various applications utilizing graph structural data, such as social analysis and fraud detection. The graph links, e.g., social relationships and transaction history, are sensitive and valuable information, which raises privacy concerns when using GNNs. To exploit these vulnerabilities, we propose VertexSerum, a novel graph poisoning…

    Submitted 2 August, 2023; originally announced August 2023.

  26. Iterative self-transfer learning: A general methodology for response time-history prediction based on small dataset

    Authors: Yongjia Xu, Xinzheng Lu, Yifan Fei, Yuli Huang

    Abstract: There are numerous advantages of deep neural network surrogate modeling for response time-history prediction. However, due to the high cost of refined numerical simulations and actual experiments, the lack of data has become an unavoidable bottleneck in practical applications. An iterative self-transfer learning method for training neural networks based on small datasets is proposed in this study.…

    Submitted 14 June, 2023; originally announced June 2023.

    Comments: 14 pages, 8 figures; Published on Journal of Computational Design and Engineering, 9(5), 2089-2102

    Journal ref: Journal of Computational Design and Engineering, 9(5), 2089-2102 (2022)

  27. arXiv:2305.19148  [pdf, other]

    cs.CL cs.AI cs.LG

    Mitigating Label Biases for In-context Learning

    Authors: Yu Fei, Yifan Hou, Zeming Chen, Antoine Bosselut

    Abstract: Various design settings for in-context learning (ICL), such as the choice and order of the in-context examples, can bias a model toward a particular prediction without being reflective of an understanding of the task. While many studies discuss these design choices, there have been few systematic investigations into categorizing them and mitigating their impact. In this work, we define a typology…

    Submitted 4 August, 2023; v1 submitted 28 May, 2023; originally announced May 2023.

    Comments: Accepted to ACL 2023

  28. arXiv:2305.16444  [pdf, other]

    cs.CL

    Don't Retrain, Just Rewrite: Countering Adversarial Perturbations by Rewriting Text

    Authors: Ashim Gupta, Carter Wood Blum, Temma Choji, Yingjie Fei, Shalin Shah, Alakananda Vempala, Vivek Srikumar

    Abstract: Can language models transform inputs to protect text classifiers against adversarial attacks? In this work, we present ATINTER, a model that intercepts and learns to rewrite adversarial inputs to make them non-adversarial for a downstream text classifier. Our experiments on four datasets and five attack mechanisms reveal that ATINTER is effective at providing better adversarial robustness than exi…

    Submitted 25 May, 2023; originally announced May 2023.

    Comments: Accepted to ACL 2023

  29. arXiv:2305.15676  [pdf, other]

    cs.CL

    Enhancing Grammatical Error Correction Systems with Explanations

    Authors: Yuejiao Fei, Leyang Cui, Sen Yang, Wai Lam, Zhenzhong Lan, Shuming Shi

    Abstract: Grammatical error correction systems improve written communication by detecting and correcting language mistakes. To help language learners better understand why the GEC system makes a certain correction, the causes of errors (evidence words) and the corresponding error types are two key factors. To enhance GEC systems with explanations, we introduce EXPECT, a large dataset annotated with evidence…

    Submitted 10 June, 2023; v1 submitted 24 May, 2023; originally announced May 2023.

    Comments: 9 pages, 7 figures, accepted to the main conference of ACL 2023

  30. arXiv:2303.15571  [pdf, other]

    cs.CR cs.AI

    EMShepherd: Detecting Adversarial Samples via Side-channel Leakage

    Authors: Ruyi Ding, Cheng Gongye, Siyue Wang, Aidong Ding, Yunsi Fei

    Abstract: Deep Neural Networks (DNNs) are vulnerable to adversarial perturbations, small changes crafted deliberately on the input to mislead the model into wrong predictions. Adversarial attacks have disastrous consequences for deep learning-empowered critical applications. Existing defense and detection techniques both require extensive knowledge of the model, testing inputs, and even execution details. They…

    Submitted 27 March, 2023; originally announced March 2023.

  31. Determination of Molecular Energies via Quantum Imaginary Time Evolution in a Superconducting Qubit System

    Authors: Zhiwen Zong, Sainan Huai, Tianqi Cai, Wenyan Jin, Ze Zhan, Zhenxing Zhang, Kunliang Bu, Liyang Sui, Ying Fei, Yicong Zheng, Shengyu Zhang, Jianlan Wu, Yi Yin

    Abstract: As a valid tool for solving ground state problems, imaginary time evolution (ITE) is widely used in physical and chemical simulations. Different ITE-based algorithms in their quantum counterpart have recently been proposed and applied to some real systems. We experimentally realize the variational-based quantum imaginary time evolution (QITE) algorithm to simulate the ground state energy of hydrog…

    Submitted 2 March, 2023; originally announced March 2023.

    Comments: 11 pages, 5 figures

  32. arXiv:2302.08210  [pdf, other]

    cs.LG

    A Survey of Geometric Optimization for Deep Learning: From Euclidean Space to Riemannian Manifold

    Authors: Yanhong Fei, Xian Wei, Yingjie Liu, Zhengyu Li, Mingsong Chen

    Abstract: Although Deep Learning (DL) has achieved success in complex Artificial Intelligence (AI) tasks, it suffers from various notorious problems (e.g., feature redundancy, and vanishing or exploding gradients), since updating parameters in Euclidean space cannot fully exploit the geometric structure of the solution space. As a promising alternative solution, Riemannian-based DL uses geometric optimizati…

    Submitted 16 February, 2023; originally announced February 2023.

    Comments: 41 pages

  33. arXiv:2301.01286  [pdf, other]

    cs.LG eess.IV

    Pseudo-Inverted Bottleneck Convolution for DARTS Search Space

    Authors: Arash Ahmadian, Louis S. P. Liu, Yue Fei, Konstantinos N. Plataniotis, Mahdi S. Hosseini

    Abstract: Differentiable Architecture Search (DARTS) has attracted considerable attention as a gradient-based neural architecture search method. Since the introduction of DARTS, there has been little work done on adapting the action space based on state-of-the-art architecture design principles for CNNs. In this work, we aim to address this gap by incrementally augmenting the DARTS search space with micro-desig…

    Submitted 18 March, 2023; v1 submitted 31 December, 2022; originally announced January 2023.

    Comments: 5 pages

  34. arXiv:2212.12372  [pdf, other]

    quant-ph

    Factoring integers with sublinear resources on a superconducting quantum processor

    Authors: Bao Yan, Ziqi Tan, Shijie Wei, Haocong Jiang, Weilong Wang, Hong Wang, Lan Luo, Qianheng Duan, Yiting Liu, Wenhao Shi, Yangyang Fei, Xiangdong Meng, Yu Han, Zheng Shan, Jiachen Chen, Xuhao Zhu, Chuanyu Zhang, Feitong Jin, Hekang Li, Chao Song, Zhen Wang, Zhi Ma, H. Wang, Gui-Lu Long

    Abstract: Shor's algorithm has seriously challenged information security based on public key cryptosystems. However, to break the widely used RSA-2048 scheme, one needs millions of physical qubits, which is far beyond current technical capabilities. Here, we report a universal quantum algorithm for integer factorization by combining the classical lattice reduction with a quantum approximate optimization alg…

    Submitted 23 December, 2022; originally announced December 2022.

    Comments: 32 pages, 12 figures

  35. arXiv:2211.07909  [pdf, other]

    eess.SY cs.LG cs.NE

    Selective Memory Recursive Least Squares: Recast Forgetting into Memory in RBF Neural Network Based Real-Time Learning

    Authors: Yiming Fei, Jiangang Li, Yanan Li

    Abstract: In radial basis function neural network (RBFNN) based real-time learning tasks, forgetting mechanisms are widely used such that the neural network can keep its sensitivity to new data. However, with forgetting mechanisms, some useful knowledge will get lost simply because it was learned a long time ago, which we refer to as the passive knowledge forgetting phenomenon. To address this problem, th…

    Submitted 8 August, 2023; v1 submitted 15 November, 2022; originally announced November 2022.

    Comments: 12 pages, 15 figures

    MSC Class: 93-10

  36. arXiv:2210.16637  [pdf, other]

    cs.CL cs.LG

    Beyond Prompting: Making Pre-trained Language Models Better Zero-shot Learners by Clustering Representations

    Authors: Yu Fei, Ping Nie, Zhao Meng, Roger Wattenhofer, Mrinmaya Sachan

    Abstract: Recent work has demonstrated that pre-trained language models (PLMs) are zero-shot learners. However, most existing zero-shot methods involve heavy human engineering or complicated self-training pipelines, hindering their application to new situations. In this work, we show that zero-shot text classification can be improved simply by clustering texts in the embedding spaces of PLMs. Specifically,…

    Submitted 23 November, 2022; v1 submitted 29 October, 2022; originally announced October 2022.

    Comments: Accepted to EMNLP 2022

  37. arXiv:2210.09849  [pdf, other]

    eess.SP

    Scalable Framework For Deep Learning based CSI Feedback

    Authors: Liqiang Jin, Qiuping Huang, Qiubin Gao, Yongqiang Fei, Shaohui Sun

    Abstract: Deep learning (DL) based channel state information (CSI) feedback in multiple-input multiple-output (MIMO) systems has recently attracted a lot of attention from both academia and industry. From a practical point of view, it is a huge burden to train, transfer and deploy a DL model for each parameter configuration of the base station (BS). In this paper, we propose a scalable and flexible framewor…

    Submitted 18 October, 2022; originally announced October 2022.

    Comments: 6 pages, 3 figures

  38. arXiv:2208.09896  [pdf, other]

    cs.CV cs.AI

    SIM2E: Benchmarking the Group Equivariant Capability of Correspondence Matching Algorithms

    Authors: Shuai Su, Zhongkai Zhao, Yixin Fei, Shuda Li, Qijun Chen, Rui Fan

    Abstract: Correspondence matching is a fundamental problem in computer vision and robotics applications. Solving correspondence matching problems using neural networks has been on the rise recently. Rotation-equivariance and scale-equivariance are both critical in correspondence matching applications. Classical correspondence matching approaches are designed to withstand scaling and rotation transformations…

    Submitted 21 August, 2022; originally announced August 2022.

    Comments: ECCV2022 Workshop Paper

  39. arXiv:2208.01898  [pdf, other]

    cs.CV

    XCon: Learning with Experts for Fine-grained Category Discovery

    Authors: Yixin Fei, Zhongkai Zhao, Siwei Yang, Bingchen Zhao

    Abstract: We address the problem of generalized category discovery (GCD) in this paper, i.e. clustering the unlabeled images leveraging the information from a set of seen classes, where the unlabeled images could contain both seen classes and unseen classes. The seen classes can be seen as an implicit criterion of classes, which makes this setting different from unsupervised clustering where the cluster cri…

    Submitted 3 August, 2022; originally announced August 2022.

  40. Hysteretic Behavior Simulation Based on Pyramid Neural Network: Principle, Network Architecture, Case Study and Explanation

    Authors: Yongjia Xu, Xinzheng Lu, Yifan Fei, Yuli Huang

    Abstract: An accurate and efficient simulation of the hysteretic behavior of materials and components is essential for structural analysis. The surrogate model based on neural networks shows significant potential in balancing efficiency and accuracy. However, its serial information flow and prediction based on single-level features adversely affect the network performance. Therefore, a weighted stacked pyra…

    Submitted 19 June, 2023; v1 submitted 29 April, 2022; originally announced June 2022.

    Comments: 41 pages, 14 figures

    Journal ref: Advances in Structural Engineering. 2023, 1-16

  41. arXiv:2205.00140  [pdf, ps, other]

    cs.GT cs.DS econ.TH

    Improved Approximation to First-Best Gains-from-Trade

    Authors: Yumou Fei

    Abstract: We study the two-agent single-item bilateral trade. Ideally, the trade should happen whenever the buyer's value for the item exceeds the seller's cost. However, the classical result of Myerson and Satterthwaite showed that no mechanism can achieve this without violating one of the Bayesian incentive compatibility, individual rationality and weakly balanced budget conditions. This motivates the stu…

    Submitted 29 April, 2022; originally announced May 2022.

  42. arXiv:2203.12046  [pdf, other]

    cs.CR cs.AR

    NNReArch: A Tensor Program Scheduling Framework Against Neural Network Architecture Reverse Engineering

    Authors: Yukui Luo, Shijin Duan, Cheng Gongye, Yunsi Fei, Xiaolin Xu

    Abstract: Architecture reverse engineering has become an emerging attack against deep neural network (DNN) implementations. Several prior works have utilized side-channel leakage to recover the model architecture while the target is executing on a hardware acceleration platform. In this work, we target an open-source deep-learning accelerator, Versatile Tensor Accelerator (VTA), and utilize electromagnetic…

    Submitted 22 March, 2022; originally announced March 2022.

    Comments: Accepted by FCCM 2022

  43. Loss-tolerant all-photonic quantum repeater with generalized Shor code

    Authors: Rui Zhang, Li-Zheng Liu, Zheng-Da Li, Yue-Yang Fei, Xu-Fei Yin, Li Li, Nai-Le Liu, Yingqiu Mao, Yu-Ao Chen, Jian-Wei Pan

    Abstract: The all-photonic quantum repeater (APQR) is a promising repeater scheme to realize long-distance quantum communication. For a practical APQR, an indispensable requirement is the robustness of the repeater graph state (RGS) against photon loss. We propose a new loss-tolerant scheme by applying the generalized Shor code to RGS, which can be experimentally demonstrated with current technology. Experi…

    Submitted 15 March, 2022; originally announced March 2022.

    Comments: 8 pages, 5 figures

    Journal ref: Optica 9, 152-158 (2022)

  44. Efficient Bipartite Entanglement Detection Scheme with a Quantum Adversarial Solver

    Authors: Xu-Fei Yin, Yuxuan Du, Yue-Yang Fei, Rui Zhang, Li-Zheng Liu, Yingqiu Mao, Tongliang Liu, Min-Hsiu Hsieh, Li Li, Nai-Le Liu, Dacheng Tao, Yu-Ao Chen, Jian-Wei Pan

    Abstract: The recognition of entanglement states is a notoriously difficult problem when no prior information is available. Here, we propose an efficient quantum adversarial bipartite entanglement detection scheme to address this issue. Our proposal reformulates the bipartite entanglement detection as a two-player zero-sum game completed by parameterized quantum circuits, where a two-outcome measurement can…

    Submitted 15 March, 2022; originally announced March 2022.

    Comments: 7 pages, 3 figures

    Journal ref: Phys. Rev. Lett. 128, 110501 (2022)

  45. arXiv:2203.03110  [pdf, ps, other]

    cs.LG math.OC stat.ML

    Cascaded Gaps: Towards Gap-Dependent Regret for Risk-Sensitive Reinforcement Learning

    Authors: Yingjie Fei, Ruitu Xu

    Abstract: In this paper, we study gap-dependent regret guarantees for risk-sensitive reinforcement learning based on the entropic risk measure. We propose a novel definition of sub-optimality gaps, which we call cascaded gaps, and we discuss their key components that adapt to the underlying structures of the problem. Based on the cascaded gaps, we derive non-asymptotic and logarithmic regret bounds for two…

    Submitted 6 March, 2022; originally announced March 2022.

  46. arXiv:2201.12133  [pdf, other]

    cs.CV cs.LG

    O-ViT: Orthogonal Vision Transformer

    Authors: Yanhong Fei, Yingjie Liu, Xian Wei, Mingsong Chen

    Abstract: Inspired by the tremendous success of the self-attention mechanism in natural language processing, the Vision Transformer (ViT) creatively applies it to image patch sequences and achieves incredible performance. However, the scaled dot-product self-attention of ViT brings about scale ambiguity to the structure of the original feature space. To address this problem, we propose a novel method named…

    Submitted 16 February, 2022; v1 submitted 28 January, 2022; originally announced January 2022.

  47. arXiv:2201.09329  [pdf, other]

    cs.LG cond-mat.mtrl-sci

    ULSA: Unified Language of Synthesis Actions for Representation of Synthesis Protocols

    Authors: Zheren Wang, Kevin Cruse, Yuxing Fei, Ann Chia, Yan Zeng, Haoyan Huo, Tanjin He, Bowen Deng, Olga Kononova, Gerbrand Ceder

    Abstract: Applying AI power to predict syntheses of novel materials requires high-quality, large-scale datasets. Extraction of synthesis information from scientific publications is still challenging, especially for extracting synthesis actions, because of the lack of a comprehensive labeled dataset using a solid, robust, and well-established ontology for describing synthesis procedures. In this work, we pro…

    Submitted 23 January, 2022; originally announced January 2022.

  48. arXiv:2111.10874  [pdf, other]

    cond-mat.mtrl-sci

    Dataset of Solution-based Inorganic Materials Synthesis Recipes Extracted from the Scientific Literature

    Authors: Zheren Wang, Olga Kononova, Kevin Cruse, Tanjin He, Haoyan Huo, Yuxing Fei, Yan Zeng, Yingzhi Sun, Zijian Cai, Wenhao Sun, Gerbrand Ceder

    Abstract: The development of a materials synthesis route is usually based on heuristics and experience. A possible new approach would be to apply data-driven approaches to learn the patterns of synthesis from past experience and use them to predict the syntheses of novel materials. However, this route is impeded by the lack of a large-scale database of synthesis formulations. In this work, we applied advanc…

    Submitted 21 November, 2021; originally announced November 2021.

  49. arXiv:2111.03947  [pdf, other]

    cs.LG math.OC stat.ML

    Exponential Bellman Equation and Improved Regret Bounds for Risk-Sensitive Reinforcement Learning

    Authors: Yingjie Fei, Zhuoran Yang, Yudong Chen, Zhaoran Wang

    Abstract: We study risk-sensitive reinforcement learning (RL) based on the entropic risk measure. Although existing works have established non-asymptotic regret guarantees for this problem, they leave open an exponential gap between the upper and lower bounds. We identify the deficiencies in existing algorithms and their analysis that result in such a gap. To remedy these deficiencies, we investigate a simp…

    Submitted 6 November, 2021; originally announced November 2021.

  50. arXiv:2111.00699  [pdf, other]

    cs.GR cs.AR

    Principles towards Real-Time Simulation of Material Point Method on Modern GPUs

    Authors: Yun Fei, Yuhan Huang, Ming Gao

    Abstract: Physics-based simulation has been actively employed in generating offline visual effects in the film and animation industry. However, the computations required for high-quality scenarios are generally immense, deterring its adoption in real-time applications, e.g., virtual production, avatar live-streaming, and cloud gaming. We summarize the principles that can accelerate the computation pipeline…

    Submitted 1 November, 2021; originally announced November 2021.

    ACM Class: I.3.1; I.3.7