
Showing 1–50 of 62 results for author: Qi, F

  1. arXiv:2404.08001  [pdf, other]

    hep-ph cs.AI cs.CL cs.LG hep-ex physics.comp-ph

    Xiwu: A Basis Flexible and Learnable LLM for High Energy Physics

    Authors: Zhengde Zhang, Yiyu Zhang, Haodong Yao, Jianwen Luo, Rui Zhao, Bo Huang, Jiameng Zhao, Yipu Liao, Ke Li, Lina Zhao, Jun Cao, Fazhi Qi, Changzheng Yuan

    Abstract: Large Language Models (LLMs) are undergoing a period of rapid updates and changes, with state-of-the-art (SOTA) model frequently being replaced. When applying LLMs to a specific scientific field, it's challenging to acquire unique domain knowledge while keeping the model itself advanced. To address this challenge, a sophisticated large language model system named as Xiwu has been developed, allowi…

    Submitted 8 April, 2024; originally announced April 2024.

    Comments: 15 pages, 8 figures

    ACM Class: I.2.7

  2. arXiv:2401.15118  [pdf]

    cs.CV cs.AI

    GeoDecoder: Empowering Multimodal Map Understanding

    Authors: Feng Qi, Mian Dai, Zixian Zheng, Chao Wang

    Abstract: This paper presents GeoDecoder, a dedicated multimodal model designed for processing geospatial information in maps. Built on the BeitGPT architecture, GeoDecoder incorporates specialized expert modules for image and text processing. On the image side, GeoDecoder utilizes GaoDe Amap as the underlying base map, which inherently encompasses essential details about road and building shapes, relative…

    Submitted 18 February, 2024; v1 submitted 25 January, 2024; originally announced January 2024.

  3. arXiv:2311.06078  [pdf, other]

    cs.DC

    The First Verification Test of Space-Ground Collaborative Intelligence via Cloud-Native Satellites

    Authors: Shangguang Wang, Qiyang Zhang, Ruolin Xing, Fei Qi, Mengwei Xu

    Abstract: Recent advancements in satellite technologies and the declining cost of access to space have led to the emergence of large satellite constellations in Low Earth Orbit. However, these constellations often rely on bent-pipe architecture, resulting in high communication costs. Existing onboard inference architectures suffer from limitations in terms of low accuracy and inflexibility in the deployment…

    Submitted 10 November, 2023; originally announced November 2023.

    Comments: Accepted by China Communications

    Report number: CNCOMM-2022-0422

  4. arXiv:2305.06849  [pdf, other]

    cs.CL cs.AI cs.IR

    WebCPM: Interactive Web Search for Chinese Long-form Question Answering

    Authors: Yujia Qin, Zihan Cai, Dian Jin, Lan Yan, Shihao Liang, Kunlun Zhu, Yankai Lin, Xu Han, Ning Ding, Huadong Wang, Ruobing Xie, Fanchao Qi, Zhiyuan Liu, Maosong Sun, Jie Zhou

    Abstract: Long-form question answering (LFQA) aims at answering complex, open-ended questions with detailed, paragraph-length responses. The de facto paradigm of LFQA necessitates two procedures: information retrieval, which searches for relevant supporting facts, and information synthesis, which integrates these facts into a coherent answer. In this paper, we introduce WebCPM, the first Chinese LFQA datase…

    Submitted 23 May, 2023; v1 submitted 11 May, 2023; originally announced May 2023.

    Comments: ACL 2023, main conference

  5. arXiv:2302.01569  [pdf, other]

    cs.LG

    Uniform tensor clustering by jointly exploring sample affinities of various orders

    Authors: Hongmin Cai, Fei Qi, Junyu Li, Yu Hu, Yue Zhang, Yiu-ming Cheung, Bin Hu

    Abstract: Conventional clustering methods based on pairwise affinity usually suffer from the concentration effect while processing huge dimensional features yet low sample sizes data, resulting in inaccuracy to encode the sample proximity and suboptimal performance in clustering. To address this issue, we propose a unified tensor clustering method (UTC) that characterizes sample proximity using multiple sam…

    Submitted 3 February, 2023; originally announced February 2023.

  6. arXiv:2301.10443  [pdf, other]

    cs.LG cs.AI

    Learning to Rank Normalized Entropy Curves with Differentiable Window Transformation

    Authors: Hanyang Liu, Shuai Yang, Feng Qi, Shuaiwen Wang

    Abstract: Recent automated machine learning systems often use learning curves ranking models to inform decisions about when to stop unpromising trials and identify better model configurations. In this paper, we present a novel learning curve ranking model specifically tailored for ranking normalized entropy (NE) learning curves, which are commonly used in online advertising and recommendation systems. Our p…

    Submitted 25 January, 2023; originally announced January 2023.

    Comments: 13 pages

  7. arXiv:2210.16986  [pdf, other]

    cs.DS

    A Practical Distributed ADMM Solver for Billion-Scale Generalized Assignment Problems

    Authors: Jun Zhou, Feng Qi, Zhigang Hua, Daohong Jian, Ziqi Liu, Hua Wu, Xingwen Zhang, Shuang Yang

    Abstract: Assigning items to owners is a common problem found in various real-world applications, for example, audience-channel matching in marketing campaigns, borrower-lender matching in loan management, and shopper-merchant matching in e-commerce. Given an objective and multiple constraints, an assignment problem can be formulated as a constrained optimization problem. Such assignment problems are usuall…

    Submitted 24 October, 2022; originally announced October 2022.

  8. arXiv:2210.10683  [pdf, other]

    cs.CL cs.CR cs.LG

    Why Should Adversarial Perturbations be Imperceptible? Rethink the Research Paradigm in Adversarial NLP

    Authors: Yangyi Chen, Hongcheng Gao, Ganqu Cui, Fanchao Qi, Longtao Huang, Zhiyuan Liu, Maosong Sun

    Abstract: Textual adversarial samples play important roles in multiple subfields of NLP research, including security, evaluation, explainability, and data augmentation. However, most work mixes all these roles, obscuring the problem definitions and research goals of the security role that aims to reveal the practical concerns of NLP models. In this paper, we rethink the research paradigm of textual adversar…

    Submitted 19 October, 2022; originally announced October 2022.

    Comments: Accepted to EMNLP 2022, main conference

  9. arXiv:2209.13899  [pdf, other]

    cs.CV

    Strong Instance Segmentation Pipeline for MMSports Challenge

    Authors: Bo Yan, Fengliang Qi, Zhuang Li, Yadong Li, Hongbin Wang

    Abstract: The goal of ACM MMSports2022 DeepSportRadar Instance Segmentation Challenge is to tackle the segmentation of individual humans including players, coaches and referees on a basketball court. And the main characteristics of this challenge are there is a high level of occlusions between players and the amount of data is quite limited. In order to address these problems, we designed a strong instance…

    Submitted 28 September, 2022; originally announced September 2022.

    Comments: The first place solution for ACM MMSports2022 DeepSportRadar Instance Segmentation Challenge

  10. arXiv:2209.03785  [pdf, other]

    eess.SP cs.AI cs.HC cs.LG

    A Novel Semi-supervised Meta Learning Method for Subject-transfer Brain-computer Interface

    Authors: Jingcong Li, Fei Wang, Haiyun Huang, Feifei Qi, Jiahui Pan

    Abstract: Brain-computer interface (BCI) provides a direct communication pathway between human brain and external devices. Before a new subject could use BCI, a calibration procedure is usually required. Because the inter- and intra-subject variances are so large that the models trained by the existing subjects perform poorly on new subjects. Therefore, effective subject-transfer and calibration method is e…

    Submitted 7 September, 2022; originally announced September 2022.

  11. arXiv:2208.13770  [pdf, other]

    cs.CE

    Local Verlet buffer approach for broad-phase interaction detection in Discrete Element Method

    Authors: Abdoul Wahid Mainassara Checkaraou, Xavier Besseron, Alban Rousset, Fenglei Qi, Bernhard Peters

    Abstract: The Extended Discrete Element Method (XDEM) is an innovative numerical simulation technique that extends the dynamics of granular materials known as Discrete Element Method (DEM) by additional properties such as the thermodynamic state, stress/strain for each particle. Such DEM simulations used by industries to set up their experimental processes are complex and heavy in computation time. At e…

    Submitted 25 August, 2022; originally announced August 2022.

  12. arXiv:2206.12046  [pdf, other]

    cs.CV cs.LG eess.IV

    Bilateral Network with Channel Splitting Network and Transformer for Thermal Image Super-Resolution

    Authors: Bo Yan, Leilei Cao, Fengliang Qi, Hongbin Wang

    Abstract: In recent years, the Thermal Image Super-Resolution (TISR) problem has become an attractive research topic. TISR would be used in a wide range of fields, including military, medical, agricultural and animal ecology. Due to the success of PBVS-2020 and PBVS-2021 workshop challenge, the result of TISR keeps improving and attracts more researchers to sign up for PBVS-2022 challenge. In this paper,…

    Submitted 23 June, 2022; originally announced June 2022.

    Comments: The second place solution for CVPR2022 PBVS-TISR challenge

  13. arXiv:2206.12035  [pdf, other]

    cs.CV cs.MM

    The Second Place Solution for The 4th Large-scale Video Object Segmentation Challenge--Track 3: Referring Video Object Segmentation

    Authors: Leilei Cao, Zhuang Li, Bo Yan, Feng Zhang, Fengliang Qi, Yuchen Hu, Hongbin Wang

    Abstract: The referring video object segmentation task (RVOS) aims to segment object instances in a given video referred by a language expression in all video frames. Due to the requirement of understanding cross-modal semantics within individual instances, this task is more challenging than the traditional semi-supervised video object segmentation where the ground truth object masks in the first frame are…

    Submitted 23 June, 2022; originally announced June 2022.

    Comments: 4 pages, 2 figures

  14. arXiv:2203.07426  [pdf, other]

    cs.CL cs.AI

    Sememe Prediction for BabelNet Synsets using Multilingual and Multimodal Information

    Authors: Fanchao Qi, Chuancheng Lv, Zhiyuan Liu, Xiaojun Meng, Maosong Sun, Hai-Tao Zheng

    Abstract: In linguistics, a sememe is defined as the minimum semantic unit of languages. Sememe knowledge bases (KBs), which are built by manually annotating words with sememes, have been successfully applied to various NLP tasks. However, existing sememe KBs only cover a few languages, which hinders the wide utilization of sememes. To address this issue, the task of sememe prediction for BabelNet synsets (…

    Submitted 14 March, 2022; originally announced March 2022.

    Comments: Accepted by Findings of ACL 2022 as a long paper. Camera-ready version

  15. arXiv:2202.13145  [pdf, other]

    cs.CL cs.AI cs.IR

    QuoteR: A Benchmark of Quote Recommendation for Writing

    Authors: Fanchao Qi, Yanhui Yang, Jing Yi, Zhili Cheng, Zhiyuan Liu, Maosong Sun

    Abstract: It is very common to use quotations (quotes) to make our writings more elegant or convincing. To help people find appropriate quotes efficiently, the task of quote recommendation is presented, aiming to recommend quotes that fit the current context of writing. There have been various quote recommendation approaches, but they are evaluated on different unpublished datasets. To facilitate the resear…

    Submitted 14 March, 2022; v1 submitted 26 February, 2022; originally announced February 2022.

    Comments: Accepted by the main conference of ACL 2022 as a long paper. The camera-ready version

  16. arXiv:2112.13610  [pdf, other]

    cs.CL

    CUGE: A Chinese Language Understanding and Generation Evaluation Benchmark

    Authors: Yuan Yao, Qingxiu Dong, Jian Guan, Boxi Cao, Zhengyan Zhang, Chaojun Xiao, Xiaozhi Wang, Fanchao Qi, Junwei Bao, Jinran Nie, Zheni Zeng, Yuxian Gu, Kun Zhou, Xuancheng Huang, Wenhao Li, Shuhuai Ren, Jinliang Lu, Chengqiang Xu, Huadong Wang, Guoyang Zeng, Zile Zhou, Jiajun Zhang, Juanzi Li, Minlie Huang, Rui Yan, et al. (10 additional authors not shown)

    Abstract: Realizing general-purpose language intelligence has been a longstanding goal for natural language processing, where standard evaluation benchmarks play a fundamental and guiding role. We argue that for general-purpose language intelligence evaluation, the benchmark itself needs to be comprehensive and systematic. To this end, we propose CUGE, a Chinese Language Understanding and Generation Evaluat…

    Submitted 14 June, 2022; v1 submitted 27 December, 2021; originally announced December 2021.

    Comments: We add two new datasets, including grammatical error correction dataset YACLC from Beijing Language and Culture University, and reading comprehension dataset GCRC from Shanxi University, and also improve the description consistency of all datasets

  17. arXiv:2112.01072  [pdf, other]

    cs.CV

    The Second Place Solution for ICCV2021 VIPriors Instance Segmentation Challenge

    Authors: Bo Yan, Fengliang Qi, Leilei Cao, Hongbin Wang

    Abstract: The Visual Inductive Priors (VIPriors) for Data-Efficient Computer Vision challenges ask competitors to train models from scratch in a data-deficient setting. In this paper, we introduce the technical details of our submission to the ICCV2021 VIPriors instance segmentation challenge. Firstly, we designed an effective data augmentation method to improve the problem of data-deficient. Secondly, we co…

    Submitted 2 December, 2021; originally announced December 2021.

  18. arXiv:2112.01059  [pdf, other]

    cs.CV

    Stronger Baseline for Person Re-Identification

    Authors: Fengliang Qi, Bo Yan, Leilei Cao, Hongbin Wang

    Abstract: Person re-identification (re-ID) aims to identify the same person of interest across non-overlapping capturing cameras, which plays an important role in visual surveillance applications and computer vision research areas. Fitting a robust appearance-based representation extractor with limited collected training data is crucial for person re-ID due to the high expense of annotating the identity of…

    Submitted 2 December, 2021; originally announced December 2021.

    Comments: The third-place solution for ICCV2021 VIPriors Re-identification Challenge

  19. arXiv:2110.08247  [pdf, other]

    cs.CR cs.AI cs.CL

    Textual Backdoor Attacks Can Be More Harmful via Two Simple Tricks

    Authors: Yangyi Chen, Fanchao Qi, Hongcheng Gao, Zhiyuan Liu, Maosong Sun

    Abstract: Backdoor attacks are a kind of emergent security threat in deep learning. After being injected with a backdoor, a deep neural model will behave normally on standard inputs but give adversary-specified predictions once the input contains specific backdoor triggers. In this paper, we find two simple tricks that can make existing textual backdoor attacks much more harmful. The first trick is to add a…

    Submitted 19 October, 2022; v1 submitted 15 October, 2021; originally announced October 2021.

    Comments: Accepted to EMNLP 2022, main conference

  20. arXiv:2110.07139  [pdf, other]

    cs.CL cs.AI cs.CR

    Mind the Style of Text! Adversarial and Backdoor Attacks Based on Text Style Transfer

    Authors: Fanchao Qi, Yangyi Chen, Xurui Zhang, Mukai Li, Zhiyuan Liu, Maosong Sun

    Abstract: Adversarial attacks and backdoor attacks are two common security threats that hang over deep learning. Both of them harness task-irrelevant features of data in their implementation. Text style is a feature that is naturally irrelevant to most NLP tasks, and thus suitable for adversarial and backdoor attacks. In this paper, we make the first attempt to conduct adversarial and backdoor attacks based…

    Submitted 13 October, 2021; originally announced October 2021.

    Comments: Accepted by the main conference of EMNLP 2021 as a long paper. The camera-ready version

  21. arXiv:2108.12100  [pdf, other]

    cs.LG

    A framework for massive scale personalized promotion

    Authors: Yitao Shen, Yue Wang, Xingyu Lu, Feng Qi, Jia Yan, Yixiang Mu, Yao Yang, YiFan Peng, Jinjie Gu

    Abstract: Technology companies building consumer-facing platforms may have access to massive-scale user population. In recent years, promotion with quantifiable incentive has become a popular approach for increasing active users on such platforms. On one hand, increased user activities can introduce network effect, bring in advertisement audience, and produce other benefits. On the other hand, massive-scale…

    Submitted 26 August, 2021; originally announced August 2021.

  22. arXiv:2106.10715  [pdf, other]

    cs.CL

    CPM-2: Large-scale Cost-effective Pre-trained Language Models

    Authors: Zhengyan Zhang, Yuxian Gu, Xu Han, Shengqi Chen, Chaojun Xiao, Zhenbo Sun, Yuan Yao, Fanchao Qi, Jian Guan, Pei Ke, Yanzheng Cai, Guoyang Zeng, Zhixing Tan, Zhiyuan Liu, Minlie Huang, Wentao Han, Yang Liu, Xiaoyan Zhu, Maosong Sun

    Abstract: In recent years, the size of pre-trained language models (PLMs) has grown by leaps and bounds. However, efficiency issues of these large-scale PLMs limit their utilization in real-world scenarios. We present a suite of cost-effective techniques for the use of PLMs to deal with the efficiency issues of pre-training, fine-tuning, and inference. (1) We introduce knowledge inheritance to accelerate th…

    Submitted 24 June, 2021; v1 submitted 20 June, 2021; originally announced June 2021.

  23. arXiv:2106.06361  [pdf, other]

    cs.CL cs.CR

    Turn the Combination Lock: Learnable Textual Backdoor Attacks via Word Substitution

    Authors: Fanchao Qi, Yuan Yao, Sophia Xu, Zhiyuan Liu, Maosong Sun

    Abstract: Recent studies show that neural natural language processing (NLP) models are vulnerable to backdoor attacks. Injected with backdoors, models perform normally on benign examples but produce attacker-specified predictions when the backdoor is activated, presenting serious security threats to real-world applications. Since existing textual backdoor attacks pay little attention to the invisibility of…

    Submitted 11 June, 2021; originally announced June 2021.

    Comments: Accepted by the main conference of ACL-IJCNLP as a long paper. Camera-ready version

  24. arXiv:2106.04927  [pdf, other]

    cs.LG math.CO

    A Bi-Level Framework for Learning to Solve Combinatorial Optimization on Graphs

    Authors: Runzhong Wang, Zhigang Hua, Gan Liu, Jiayi Zhang, Junchi Yan, Feng Qi, Shuang Yang, Jun Zhou, Xiaokang Yang

    Abstract: Combinatorial Optimization (CO) has been a long-standing challenging research topic featured by its NP-hard nature. Traditionally such problems are approximately solved with heuristic algorithms which are usually fast but may sacrifice the solution quality. Currently, machine learning for combinatorial optimization (MLCO) has become a trending research topic, but most existing MLCO methods treat C…

    Submitted 25 October, 2021; v1 submitted 9 June, 2021; originally announced June 2021.

    Comments: NeurIPS 2021. Code at https://github.com/Thinklab-SJTU/PPO-BiHyb

  25. arXiv:2106.01979  [pdf, other]

    cs.CL

    CCPM: A Chinese Classical Poetry Matching Dataset

    Authors: Wenhao Li, Fanchao Qi, Maosong Sun, Xiaoyuan Yi, Jiarui Zhang

    Abstract: Poetry is one of the most important art forms of human languages. Recently many studies have focused on incorporating some linguistic features of poetry, such as style and sentiment, into its understanding or generation system. However, there is no focus on understanding or evaluating the semantics of poetry. Therefore, we propose a novel task to assess a model's semantic understanding of poetry b…

    Submitted 3 June, 2021; originally announced June 2021.

  26. arXiv:2106.00400  [pdf, other]

    cs.CL

    Sub-Character Tokenization for Chinese Pretrained Language Models

    Authors: Chenglei Si, Zhengyan Zhang, Yingfa Chen, Fanchao Qi, Xiaozhi Wang, Zhiyuan Liu, Yasheng Wang, Qun Liu, Maosong Sun

    Abstract: Tokenization is fundamental to pretrained language models (PLMs). Existing tokenization methods for Chinese PLMs typically treat each character as an indivisible token. However, they ignore the unique feature of the Chinese writing system where additional linguistic information exists below the character level, i.e., at the sub-character level. To utilize such information, we propose sub-character…

    Submitted 14 February, 2023; v1 submitted 1 June, 2021; originally announced June 2021.

    Comments: Accepted at TACL

  27. arXiv:2105.12585  [pdf, other]

    cs.CL cs.AI

    Automatic Construction of Sememe Knowledge Bases via Dictionaries

    Authors: Fanchao Qi, Yangyi Chen, Fengyu Wang, Zhiyuan Liu, Xiao Chen, Maosong Sun

    Abstract: A sememe is defined as the minimum semantic unit in linguistics. Sememe knowledge bases (SKBs), which comprise words annotated with sememes, enable sememes to be applied to natural language processing. So far a large body of research has showcased the unique advantages and effectiveness of SKBs in various tasks. However, most languages have no SKBs, and manual construction of SKBs is time-consumin…

    Submitted 3 June, 2021; v1 submitted 26 May, 2021; originally announced May 2021.

    Comments: Accepted by Findings of ACL at ACL-IJCNLP 2021. Camera-ready version

  28. arXiv:2105.12400  [pdf, other]

    cs.CL cs.CR

    Hidden Killer: Invisible Textual Backdoor Attacks with Syntactic Trigger

    Authors: Fanchao Qi, Mukai Li, Yangyi Chen, Zhengyan Zhang, Zhiyuan Liu, Yasheng Wang, Maosong Sun

    Abstract: Backdoor attacks are a kind of insidious security threat against machine learning models. After being injected with a backdoor in training, the victim model will produce adversary-specified outputs on the inputs embedded with predesigned triggers but behave properly on normal inputs during inference. As a sort of emergent attack, backdoor attacks in natural language processing (NLP) are investigat…

    Submitted 3 June, 2021; v1 submitted 26 May, 2021; originally announced May 2021.

    Comments: Accepted by ACL-IJCNLP 2021 as a long paper. Camera-ready version

  29. arXiv:2104.05458  [pdf, other]

    cs.CV

    PGNet: Real-time Arbitrarily-Shaped Text Spotting with Point Gathering Network

    Authors: Pengfei Wang, Chengquan Zhang, Fei Qi, Shanshan Liu, Xiaoqiang Zhang, Pengyuan Lyu, Junyu Han, Jingtuo Liu, Errui Ding, Guangming Shi

    Abstract: The reading of arbitrarily-shaped text has received increasing research attention. However, existing text spotters are mostly built on two-stage frameworks or character-based methods, which suffer from either Non-Maximum Suppression (NMS), Region-of-Interest (RoI) operations, or character-level annotations. In this paper, to address the above problems, we propose a novel fully convolutional Point…

    Submitted 12 April, 2021; originally announced April 2021.

    Comments: 10 pages, 8 figures, AAAI 2021

  30. arXiv:2103.03412  [pdf, other]

    cs.LG cs.AI

    Learning to Schedule DAG Tasks

    Authors: Zhigang Hua, Feng Qi, Gan Liu, Shuang Yang

    Abstract: Scheduling computational tasks represented by directed acyclic graphs (DAGs) is challenging because of its complexity. Conventional scheduling algorithms rely heavily on simple heuristics such as shortest job first (SJF) and critical path (CP), and are often lacking in scheduling quality. In this paper, we present a novel learning-based approach to scheduling DAG tasks. The algorithm employs a rei…

    Submitted 4 March, 2021; originally announced March 2021.

  31. arXiv:2103.02488  [pdf, other]

    cs.CV

    Non-local Channel Aggregation Network for Single Image Rain Removal

    Authors: Zhipeng Su, Yixiong Zhang, Xiao-Ping Zhang, Feng Qi

    Abstract: Rain streaks showing in images or videos would severely degrade the performance of computer vision applications. Thus, it is of vital importance to remove rain streaks and facilitate our vision systems. While recent convolutional neural network based methods have shown promising results in single image rain removal (SIRR), they fail to effectively capture long-range location dependencies or aggrega…

    Submitted 3 March, 2021; originally announced March 2021.

  32. Red Alarm for Pre-trained Models: Universal Vulnerability to Neuron-Level Backdoor Attacks

    Authors: Zhengyan Zhang, Guangxuan Xiao, Yongwei Li, Tian Lv, Fanchao Qi, Zhiyuan Liu, Yasheng Wang, Xin Jiang, Maosong Sun

    Abstract: Pre-trained models (PTMs) have been widely used in various downstream tasks. The parameters of PTMs are distributed on the Internet and may suffer backdoor attacks. In this work, we demonstrate the universal vulnerability of PTMs, where fine-tuned PTMs can be easily controlled by backdoor attacks in arbitrary downstream tasks. Specifically, attackers can add a simple pre-training task, which restr…

    Submitted 20 October, 2023; v1 submitted 18 January, 2021; originally announced January 2021.

    Comments: Published in Machine Intelligence Research (https://link.springer.com/article/10.1007/s11633-022-1377-5)

  33. arXiv:2012.15699  [pdf, other]

    cs.CL

    Better Robustness by More Coverage: Adversarial Training with Mixup Augmentation for Robust Fine-tuning

    Authors: Chenglei Si, Zhengyan Zhang, Fanchao Qi, Zhiyuan Liu, Yasheng Wang, Qun Liu, Maosong Sun

    Abstract: Pretrained language models (PLMs) perform poorly under adversarial attacks. To improve the adversarial robustness, adversarial data augmentation (ADA) has been widely adopted to cover more search space of adversarial attacks by adding textual adversarial examples during training. However, the number of adversarial examples for text augmentation is still extremely insufficient due to the exponentia…

    Submitted 5 June, 2021; v1 submitted 31 December, 2020; originally announced December 2020.

    Comments: ACL 2021 (Findings)

  34. arXiv:2012.00413  [pdf, other]

    cs.CL

    CPM: A Large-scale Generative Chinese Pre-trained Language Model

    Authors: Zhengyan Zhang, Xu Han, Hao Zhou, Pei Ke, Yuxian Gu, Deming Ye, Yujia Qin, Yusheng Su, Haozhe Ji, Jian Guan, Fanchao Qi, Xiaozhi Wang, Yanan Zheng, Guoyang Zeng, Huanqi Cao, Shengqi Chen, Daixuan Li, Zhenbo Sun, Zhiyuan Liu, Minlie Huang, Wentao Han, Jie Tang, Juanzi Li, Xiaoyan Zhu, Maosong Sun

    Abstract: Pre-trained Language Models (PLMs) have proven to be beneficial for various downstream NLP tasks. Recently, GPT-3, with 175 billion parameters and 570GB training data, drew a lot of attention due to the capacity of few-shot (even zero-shot) learning. However, applying GPT-3 to address Chinese NLP tasks is still challenging, as the training corpus of GPT-3 is primarily English, and the parameters a…

    Submitted 1 December, 2020; originally announced December 2020.

  35. arXiv:2011.11400  [pdf]

    cs.AI cs.CL q-bio.NC

    Language guided machine action

    Authors: Feng Qi

    Abstract: Here we build a hierarchical modular network called Language guided machine action (LGMA), whose modules process information stream mimicking human cortical network that allows to achieve multiple general tasks such as language guided action, intention decomposition and mental simulation before action execution etc. LGMA contains 3 main systems: (1) primary sensory system that multimodal sensory i…

    Submitted 23 November, 2020; originally announced November 2020.

    Comments: 10 pages, 4 figures

  36. arXiv:2011.10369  [pdf, other]

    cs.CL cs.CY

    ONION: A Simple and Effective Defense Against Textual Backdoor Attacks

    Authors: Fanchao Qi, Yangyi Chen, Mukai Li, Yuan Yao, Zhiyuan Liu, Maosong Sun

    Abstract: Backdoor attacks are a kind of emergent training-time threat to deep neural networks (DNNs). They can manipulate the output of DNNs and possess high insidiousness. In the field of natural language processing, some attack methods have been proposed and achieve very high attack success rates on multiple popular models. Nevertheless, there are few studies on defending against textual backdoor attacks…

    Submitted 3 November, 2021; v1 submitted 20 November, 2020; originally announced November 2020.

    Comments: Accepted by the main conference of EMNLP 2021 as a short paper. The camera-ready version

  37. arXiv:2011.03770  [pdf, other]

    cs.CL

    Know What You Don't Need: Single-Shot Meta-Pruning for Attention Heads

    Authors: Zhengyan Zhang, Fanchao Qi, Zhiyuan Liu, Qun Liu, Maosong Sun

    Abstract: Deep pre-trained Transformer models have achieved state-of-the-art results over a variety of natural language processing (NLP) tasks. By learning rich language knowledge with millions of parameters, these models are usually overparameterized and significantly increase the computational overhead in applications. It is intuitive to address this issue by model compression. In this work, we propose a…

    Submitted 7 November, 2020; originally announced November 2020.

  38. arXiv:2010.12876  [pdf, other]

    eess.IV cs.LG eess.SP

    Electromagnetic Source Imaging via a Data-Synthesis-Based Convolutional Encoder-Decoder Network

    Authors: Gexin Huang, Jiawen Liang, Ke Liu, Chang Cai, ZhengHui Gu, Feifei Qi, Yuan Qing Li, Zhu Liang Yu, Wei Wu

    Abstract: Electromagnetic source imaging (ESI) requires solving a highly ill-posed inverse problem. To seek a unique solution, traditional ESI methods impose various forms of priors that may not accurately reflect the actual source properties, which may hinder their broad applications. To overcome this limitation, in this paper a novel data-synthesized spatio-temporally convolutional encoder-decoder network…

    Submitted 13 July, 2022; v1 submitted 24 October, 2020; originally announced October 2020.

    Comments: 15 pages, 14 figures, and journal

  39. arXiv:2009.09192  [pdf, other]

    cs.CL cs.AI cs.CR

    Learning to Attack: Towards Textual Adversarial Attacking in Real-world Situations

    Authors: Yuan Zang, Bairu Hou, Fanchao Qi, Zhiyuan Liu, Xiaojun Meng, Maosong Sun

    Abstract: Adversarial attacking aims to fool deep neural networks with adversarial examples. In the field of natural language processing, various textual adversarial attack models have been proposed, varying in the accessibility to the victim model. Among them, the attack models that only require the output of the victim model are more fit for real-world situations of adversarial attacking. However, to achi…

    Submitted 19 September, 2020; originally announced September 2020.

    Comments: work in progress, 10 pages, 6 figures

  40. OpenAttack: An Open-source Textual Adversarial Attack Toolkit

    Authors: Guoyang Zeng, Fanchao Qi, Qianrui Zhou, Tingji Zhang, Zixian Ma, Bairu Hou, Yuan Zang, Zhiyuan Liu, Maosong Sun

    Abstract: Textual adversarial attacking has received wide and increasing attention in recent years. Various attack models have been proposed, which are enormously distinct and implemented with different programming frameworks and settings. These facts hinder quick utilization and fair comparison of attack models. In this paper, we present an open-source textual adversarial attack toolkit named OpenAttack to…

    Submitted 24 September, 2021; v1 submitted 19 September, 2020; originally announced September 2020.

    Comments: ACL-IJCNLP 2021 Demo. 9 pages, 3 figures

  41. Country Image in COVID-19 Pandemic: A Case Study of China

    Authors: Huimin Chen, Zeyu Zhu, Fanchao Qi, Yining Ye, Zhiyuan Liu, Maosong Sun, Jianbin Jin

    Abstract: Country image has a profound influence on international relations and economic development. In the worldwide outbreak of COVID-19, countries and their people display different reactions, resulting in diverse perceived images among foreign public. Therefore, in this study, we take China as a specific and typical case and investigate its image with aspect-based sentiment analysis on a large-scale Tw…

    Submitted 12 September, 2020; originally announced September 2020.

  42. Hardware Accelerator for Adversarial Attacks on Deep Learning Neural Networks

    Authors: Haoqiang Guo, Lu Peng, Jian Zhang, Fang Qi, Lide Duan

    Abstract: Recent studies identify that Deep learning Neural Networks (DNNs) are vulnerable to subtle perturbations, which are not perceptible to human visual system but can fool the DNN models and lead to wrong outputs. A class of adversarial attack network algorithms has been proposed to generate robust physical perturbations under different circumstances. These algorithms are the first efforts to move for…

    Submitted 3 August, 2020; originally announced August 2020.

    Comments: IGSC'2019 (https://shirazi21.wixsite.com/igsc2019archive) Best paper award

    MSC Class: 68-06; ACM Class: C.3

    Journal ref: 2019 Tenth International Green and Sustainable Computing Conference (IGSC)

  43. arXiv:2006.11413  [pdf]

    cs.CV cs.AI

    Visualizing and Understanding Vision System

    Authors: Feng Qi, Guanjun Jiang

    Abstract: How the human vision system addresses the object identity-preserving recognition problem is largely unknown. Here, we use a vision recognition-reconstruction network (RRN) to investigate the development, recognition, learning and forgetting mechanisms, and achieve similar characteristics to electrophysiological measurements in monkeys. First, in network development study, the RRN also experiences…

    Submitted 11 June, 2020; originally announced June 2020.

  44. arXiv:2005.09175  [pdf]

    q-bio.NC cs.AI cs.LG

    Human-like general language processing

    Authors: Feng Qi, Guanjun Jiang

    Abstract: Using language makes human beings surpass animals in wisdom. To let machines understand, learn, and use language flexibly, we propose a human-like general language processing (HGLP) architecture, which contains sensorimotor, association, and cognitive systems. The HGLP network learns from easy to hard like a child, understands word meaning by coactivating multimodal neurons, comprehends and genera…

    Submitted 29 May, 2020; v1 submitted 18 May, 2020; originally announced May 2020.

  45. arXiv:2002.00352  [pdf, other]

    cs.DC cs.AI cs.DS

    Solving Billion-Scale Knapsack Problems

    Authors: Xingwen Zhang, Feng Qi, Zhigang Hua, Shuang Yang

    Abstract: Knapsack problems (KPs) are common in industry, but solving KPs is known to be NP-hard and has been tractable only at a relatively small scale. This paper examines KPs in a slightly generalized form and shows that they can be solved nearly optimally at scale via distributed algorithms. The proposed approach can be implemented fairly easily with off-the-shelf distributed computing frameworks (e.g.…

    Submitted 2 February, 2020; originally announced February 2020.

  46. arXiv:2001.05954  [pdf, other]

    cs.CL

    Lexical Sememe Prediction using Dictionary Definitions by Capturing Local Semantic Correspondence

    Authors: Jiaju Du, Fanchao Qi, Maosong Sun, Zhiyuan Liu

    Abstract: Sememes, defined as the minimum semantic units of human languages in linguistics, have been proven useful in many NLP tasks. Since manual construction and update of sememe knowledge bases (KBs) are costly, the task of automatic sememe prediction has been proposed to assist sememe annotation. In this paper, we explore the approach of applying dictionary definitions to predicting sememes for unannot…

    Submitted 16 January, 2020; originally announced January 2020.

    Comments: Accepted by Journal of Chinese Information Processing

  47. arXiv:1912.08441  [pdf, other]

    cs.CL cs.AI cs.IR

    Multi-channel Reverse Dictionary Model

    Authors: Lei Zhang, Fanchao Qi, Zhiyuan Liu, Yasheng Wang, Qun Liu, Maosong Sun

    Abstract: A reverse dictionary takes the description of a target word as input and outputs the target word together with other words that match the description. Existing reverse dictionary methods cannot deal with highly variable input queries and low-frequency target words successfully. Inspired by the description-to-word inference process of humans, we propose the multi-channel reverse dictionary model, w…

    Submitted 18 December, 2019; v1 submitted 18 December, 2019; originally announced December 2019.

    Comments: Accepted by AAAI Conference on Artificial Intelligence 2020

  48. arXiv:1912.01795  [pdf, other]

    cs.CL cs.AI

    Towards Building a Multilingual Sememe Knowledge Base: Predicting Sememes for BabelNet Synsets

    Authors: Fanchao Qi, Liang Chang, Maosong Sun, Sicong Ouyang, Zhiyuan Liu

    Abstract: A sememe is defined as the minimum semantic unit of human languages. Sememe knowledge bases (KBs), which contain words annotated with sememes, have been successfully applied to many NLP tasks. However, existing sememe KBs are built on only a few languages, which hinders their widespread utilization. To address the issue, we propose to build a unified sememe KB for multiple languages based on Babel…

    Submitted 3 December, 2019; originally announced December 2019.

    Comments: Accepted by AAAI Conference on Artificial Intelligence 2020 for oral presentation

  49. Word-level Textual Adversarial Attacking as Combinatorial Optimization

    Authors: Yuan Zang, Fanchao Qi, Chenghao Yang, Zhiyuan Liu, Meng Zhang, Qun Liu, Maosong Sun

    Abstract: Adversarial attacks are carried out to reveal the vulnerability of deep neural networks. Textual adversarial attacking is challenging because text is discrete and a small perturbation can bring significant change to the original input. Word-level attacking, which can be regarded as a combinatorial optimization problem, is a well-studied class of textual attack methods. However, existing word-level…

    Submitted 9 December, 2020; v1 submitted 27 October, 2019; originally announced October 2019.

    Comments: Accepted at ACL 2020 as a long paper (a typo is corrected as compared with the official conference camera-ready version). 16 pages, 3 figures

  50. arXiv:1910.08910  [pdf, other]

    cs.CL cs.LG eess.AS

    Improving Sequence Modeling Ability of Recurrent Neural Networks via Sememes

    Authors: Yujia Qin, Fanchao Qi, Sicong Ouyang, Zhiyuan Liu, Cheng Yang, Yasheng Wang, Qun Liu, Maosong Sun

    Abstract: Sememes, the minimum semantic units of human languages, have been successfully utilized in various natural language processing applications. However, most existing studies exploit sememes in specific tasks and few efforts are made to utilize sememes more fundamentally. In this paper, we propose to incorporate sememes into recurrent neural networks (RNNs) to improve their sequence modeling ability,…

    Submitted 19 August, 2020; v1 submitted 20 October, 2019; originally announced October 2019.

    Comments: Published in IEEE/ACM Transactions on Audio, Speech, and Language Processing (TASLP). 10 pages, 2 figures