Showing 1–26 of 26 results for author: Lian, Y

  1. arXiv:2402.13035  [pdf, other]

    cs.CL cs.AI

    Learning to Check: Unleashing Potentials for Self-Correction in Large Language Models

    Authors: Che Zhang, Zhenyang Xiao, Chengcheng Han, Yixin Lian, Yuejian Fang

    Abstract: Self-correction has achieved impressive results in enhancing the style and security of the generated output from large language models (LLMs). However, recent studies suggest that self-correction might be limited or even counterproductive in reasoning tasks due to LLMs' difficulties in identifying logical mistakes. In this paper, we aim to enhance the self-checking capabilities of LLMs by constr…

    Submitted 17 June, 2024; v1 submitted 20 February, 2024; originally announced February 2024.

  2. arXiv:2402.11903  [pdf, other]

    cs.CL cs.AI

    DiLA: Enhancing LLM Tool Learning with Differential Logic Layer

    Authors: Yu Zhang, Hui-Ling Zhen, Zehua Pei, Yingzhao Lian, Lihao Yin, Mingxuan Yuan, Bei Yu

    Abstract: Considering the challenges faced by large language models (LLMs) in logical reasoning and planning, prior efforts have sought to augment LLMs with access to external solvers. While progress has been made on simple reasoning problems, solving classical constraint satisfaction problems, such as the Boolean Satisfiability Problem (SAT) and Graph Coloring Problem (GCP), remains difficult for off-the-s…

    Submitted 18 June, 2024; v1 submitted 19 February, 2024; originally announced February 2024.

    Comments: arXiv admin note: text overlap with arXiv:2305.12295 by other authors

  3. arXiv:2311.16442  [pdf, other]

    cs.LG cs.DC

    Fast and Efficient 2-bit LLM Inference on GPU: 2/4/16-bit in a Weight Matrix with Asynchronous Dequantization

    Authors: Jinhao Li, Jiaming Xu, Shiyao Li, Shan Huang, Jun Liu, Yaoxiu Lian, Guohao Dai

    Abstract: Large language models (LLMs) have demonstrated impressive abilities in various domains while the inference cost is expensive. Many previous studies exploit quantization methods to reduce LLM inference cost by reducing latency and memory consumption. Applying 2-bit single-precision weight quantization brings >3% accuracy loss, so the state-of-the-art methods use mixed-precision methods for LLMs (e.…

    Submitted 1 July, 2024; v1 submitted 27 November, 2023; originally announced November 2023.

  4. arXiv:2310.05074  [pdf, other]

    cs.CL cs.AI

    DialCoT Meets PPO: Decomposing and Exploring Reasoning Paths in Smaller Language Models

    Authors: Chengcheng Han, Xiaowei Du, Che Zhang, Yixin Lian, Xiang Li, Ming Gao, Baoyuan Wang

    Abstract: Chain-of-Thought (CoT) prompting has proven to be effective in enhancing the reasoning capabilities of Large Language Models (LLMs) with at least 100 billion parameters. However, it is ineffective or even detrimental when applied to reasoning tasks in Smaller Language Models (SLMs) with less than 10 billion parameters. To address this limitation, we introduce Dialogue-guided Chain-of-Thought (Dial…

    Submitted 23 October, 2023; v1 submitted 8 October, 2023; originally announced October 2023.

    Comments: Accepted to EMNLP 2023

  5. arXiv:2310.02629  [pdf, other]

    cs.SD eess.AS

    BA-MoE: Boundary-Aware Mixture-of-Experts Adapter for Code-Switching Speech Recognition

    Authors: Peikun Chen, Fan Yu, Yuhao Lian, Hongfei Xue, Xucheng Wan, Naijun Zheng, Huan Zhou, Lei Xie

    Abstract: Mixture-of-experts based models, which use language experts to extract language-specific representations effectively, have been well applied in code-switching automatic speech recognition. However, there is still substantial space to improve as similar pronunciation across languages may result in ineffective multi-language modeling and inaccurate language boundary estimation. To eliminate these dr…

    Submitted 7 October, 2023; v1 submitted 4 October, 2023; originally announced October 2023.

    Comments: Accepted by ASRU2023

  6. arXiv:2306.08401  [pdf, other]

    cs.CL

    LiveChat: A Large-Scale Personalized Dialogue Dataset Automatically Constructed from Live Streaming

    Authors: Jingsheng Gao, Yixin Lian, Ziyi Zhou, Yuzhuo Fu, Baoyuan Wang

    Abstract: Open-domain dialogue systems have made promising progress in recent years. While the state-of-the-art dialogue agents are built upon large-scale text-based social media data and large pre-trained models, there is no guarantee these agents could also perform well in fast-growing scenarios, such as live streaming, due to the bounded transferability of pre-trained models and biased distributions of p…

    Submitted 14 June, 2023; originally announced June 2023.

    Comments: ACL 2023 Main Conference

  7. arXiv:2305.16885  [pdf, other]

    cs.CL

    Hierarchical Verbalizer for Few-Shot Hierarchical Text Classification

    Authors: Ke Ji, Yixin Lian, Jingsheng Gao, Baoyuan Wang

    Abstract: Due to the complex label hierarchy and intensive labeling cost in practice, the hierarchical text classification (HTC) suffers a poor performance especially when low-resource or few-shot settings are considered. Recently, there is a growing trend of applying prompts on pre-trained language models (PLMs), which has exhibited effectiveness in the few-shot flat text classification tasks. However, lim…

    Submitted 26 May, 2023; originally announced May 2023.

    Comments: 14 pages, 8 figures, Accepted by ACL 2023

  8. arXiv:2303.11138  [pdf, other]

    stat.ML cs.LG eess.SY math.OC

    Fault Detection via Occupation Kernel Principal Component Analysis

    Authors: Zachary Morrison, Benjamin P. Russo, Yingzhao Lian, Rushikesh Kamalapurkar

    Abstract: The reliable operation of automatic systems is heavily dependent on the ability to detect faults in the underlying dynamical system. While traditional model-based methods have been widely used for fault detection, data-driven approaches have garnered increasing attention due to their ease of deployment and minimal need for expert knowledge. In this paper, we present a novel principal component ana…

    Submitted 26 June, 2023; v1 submitted 20 March, 2023; originally announced March 2023.

  9. arXiv:2301.13083  [pdf, other]

    cs.CL cs.AI

    Communication Drives the Emergence of Language Universals in Neural Agents: Evidence from the Word-order/Case-marking Trade-off

    Authors: Yuchen Lian, Arianna Bisazza, Tessa Verhoef

    Abstract: Artificial learners often behave differently from human learners in the context of neural agent-based simulations of language emergence and change. A common explanation is the lack of appropriate cognitive biases in these learners. However, it has also been proposed that more naturalistic settings of language learning and use could lead to more human-like results. We investigate this latter accoun…

    Submitted 31 May, 2023; v1 submitted 30 January, 2023; originally announced January 2023.

    Comments: Accepted to TACL, pre-MIT Press publication version

  10. arXiv:2301.05834  [pdf, ps, other]

    math.CO cs.IT

    On lattice tilings of $\mathbb{Z}^{n}$ by limited magnitude error balls $\mathcal{B}(n,2,1,1)$

    Authors: Tao Zhang, Yanlu Lian, Gennian Ge

    Abstract: Limited magnitude error model has applications in flash memory. In this model, a perfect code is equivalent to a tiling of $\mathbb{Z}^n$ by limited magnitude error balls. In this paper, we give a complete classification of lattice tilings of $\mathbb{Z}^n$ by limited magnitude error balls $\mathcal{B}(n,2,1,1)$.

    Submitted 14 January, 2023; originally announced January 2023.

    Comments: 15 pages

  11. arXiv:2205.15703  [pdf, other]

    eess.SY cs.LG

    Lessons Learned from Data-Driven Building Control Experiments: Contrasting Gaussian Process-based MPC, Bilevel DeePC, and Deep Reinforcement Learning

    Authors: Loris Di Natale, Yingzhao Lian, Emilio T. Maddalena, Jicheng Shi, Colin N. Jones

    Abstract: This manuscript offers the perspective of experimentalists on a number of modern data-driven techniques: model predictive control relying on Gaussian processes, adaptive data-driven control based on behavioral theory, and deep reinforcement learning. These techniques are compared in terms of data requirements, ease of use, computational burden, and robustness in the context of real-world applicati…

    Submitted 31 May, 2022; originally announced May 2022.

  12. arXiv:2204.00376  [pdf, other]

    cs.CV

    Few-shot One-class Domain Adaptation Based on Frequency for Iris Presentation Attack Detection

    Authors: Yachun Li, Ying Lian, Jingjing Wang, Yuhui Chen, Chunmao Wang, Shiliang Pu

    Abstract: Iris presentation attack detection (PAD) has achieved remarkable success to ensure the reliability and security of iris recognition systems. Most existing methods exploit discriminative features in the spatial domain and report outstanding performance under intra-dataset settings. However, the degradation of performance is inevitable under cross-dataset settings, suffering from domain shift. In co…

    Submitted 1 April, 2022; originally announced April 2022.

    Comments: Camera Ready, ICASSP 2022

  13. arXiv:2105.12371  [pdf, other]

    cs.IR cs.LG

    Quotient Space-Based Keyword Retrieval in Sponsored Search

    Authors: Yijiang Lian, Shuang Li, Chaobing Feng, YanFeng Zhu

    Abstract: Synonymous keyword retrieval has become an important problem for sponsored search ever since major search engines relax the exact match product's matching requirement to a synonymous level. Since the synonymous relations between queries and keywords are quite scarce, the traditional information retrieval framework is inefficient in this scenario. In this paper, we propose a novel quotient space-ba…

    Submitted 26 May, 2021; originally announced May 2021.

  14. arXiv:2104.07637  [pdf, other]

    cs.CL cs.AI cs.LG

    The Effect of Efficient Messaging and Input Variability on Neural-Agent Iterated Language Learning

    Authors: Yuchen Lian, Arianna Bisazza, Tessa Verhoef

    Abstract: Natural languages display a trade-off among different strategies to convey syntactic structure, such as word order or inflection. This trade-off, however, has not appeared in recent simulations of iterated language learning with neural network agents (Chaabouni et al., 2019b). We re-evaluate this result in light of three factors that play an important role in comparable experiments from the Langua…

    Submitted 10 September, 2021; v1 submitted 15 April, 2021; originally announced April 2021.

    Comments: To appear at EMNLP 2021

  15. arXiv:2102.10560  [pdf, other]

    cs.IR

    A Concept Knowledge-Driven Keywords Retrieval Framework for Sponsored Search

    Authors: Yijiang Lian, Yubo Liu, Zhicong Ye, Liang Yuan, Yanfeng Zhu, Min Zhao, Jianyi Cheng, Xinwei Feng

    Abstract: In sponsored search, retrieving synonymous keywords for exact match type is important for accurately targeted advertising. Data-driven deep learning-based method has been proposed to tackle this problem. An apparent disadvantage of this method is its poor generalization performance on entity-level long-tail instances, even though they might share similar concept-level patterns with frequent instan…

    Submitted 21 February, 2021; originally announced February 2021.

  16. arXiv:2101.02392  [pdf, other]

    cs.LG cs.CR

    Detecting Log Anomalies with Multi-Head Attention (LAMA)

    Authors: Yicheng Guo, Yujin Wen, Congwei Jiang, Yixin Lian, Yi Wan

    Abstract: Anomaly detection is a crucial and challenging subject that has been studied within diverse research areas. In this work, we explore the task of log anomaly detection (especially computer system logs and user behavior logs) by analyzing logs' sequential information. We propose LAMA, a multi-head attention based sequential model to process log streams as template activity (event) sequences. A nex…

    Submitted 7 January, 2021; originally announced January 2021.

  17. arXiv:2010.05594  [pdf, other]

    cs.CL

    MultiWOZ 2.3: A multi-domain task-oriented dialogue dataset enhanced with annotation corrections and co-reference annotation

    Authors: Ting Han, Ximing Liu, Ryuichi Takanobu, Yixin Lian, Chongxuan Huang, Dazhen Wan, Wei Peng, Minlie Huang

    Abstract: Task-oriented dialogue systems have made unprecedented progress with multiple state-of-the-art (SOTA) models underpinned by a number of publicly available MultiWOZ datasets. Dialogue state annotations are error-prone, leading to sub-optimal performance. Various efforts have been put in rectifying the annotation errors presented in the original MultiWOZ dataset. In this paper, we introduce MultiWOZ…

    Submitted 14 June, 2021; v1 submitted 12 October, 2020; originally announced October 2020.

  18. arXiv:2008.02014  [pdf, other]

    cs.LG cs.IR stat.ML

    Optimizing AD Pruning of Sponsored Search with Reinforcement Learning

    Authors: Yijiang Lian, Zhijie Chen, Xin Pei, Shuang Li, Yifei Wang, Yuefeng Qiu, Zhiheng Zhang, Zhipeng Tao, Liang Yuan, Hanju Guan, Kefeng Zhang, Zhigang Li, Xiaochun Liu

    Abstract: Industrial sponsored search system (SSS) can be logically divided into three modules: keywords matching, ad retrieving, and ranking. During ad retrieving, the ad candidates grow exponentially. A query with high commercial value might retrieve a great deal of ad candidates such that the ranking module could not afford. Due to limited latency and computing resources, the candidates have to be pruned…

    Submitted 5 August, 2020; originally announced August 2020.

  19. arXiv:2008.01969  [pdf, other]

    cs.IR

    Retrieve Synonymous keywords for Frequent Queries in Sponsored Search in a Data Augmentation Way

    Authors: Yijiang Lian, Zhenjun You, Fan Wu, Wenqiang Liu, Jing Jia

    Abstract: In sponsored search, retrieving synonymous keywords is of great importance for accurately targeted advertising. The semantic gap between queries and keywords and the extremely high precision requirements (>= 95\%) are two major challenges to this task. To the best of our knowledge, the problem has not been openly discussed. In an industrial sponsored search system, the retrieved keywords for frequ…

    Submitted 5 August, 2020; originally announced August 2020.

  20. arXiv:1902.00592  [pdf, other]

    cs.IR

    An end-to-end Generative Retrieval Method for Sponsored Search Engine --Decoding Efficiently into a Closed Target Domain

    Authors: Yijiang Lian, Zhijie Chen, Jinlong Hu, Kefeng Zhang, Chunwei Yan, Muchenxuan Tong, Wenying Han, Hanju Guan, Ying Li, Ying Cao, Yang Yu, Zhigang Li, Xiaochun Liu, Yue Wang

    Abstract: In this paper, we present a generative retrieval method for sponsored search engine, which uses neural machine translation (NMT) to generate keywords directly from query. This method is completely end-to-end, which skips query rewriting and relevance judging phases in traditional retrieval systems. Different from standard machine translation, the target space in the retrieval setting is a constrai…

    Submitted 18 March, 2019; v1 submitted 1 February, 2019; originally announced February 2019.

    Comments: 8 pages, 8 figures, conference

  21. arXiv:1812.06585  [pdf, other]

    cs.NE cs.AI

    Generalizable Meta-Heuristic based on Temporal Estimation of Rewards for Large Scale Blackbox Optimization

    Authors: Mingde Zhao, Hongwei Ge, Yi Lian, Kai Zhang

    Abstract: The generalization abilities of heuristic optimizers may deteriorate with the increment of the search space dimensionality. To achieve generalized performance across Large Scale Blackbox Optimization (LSBO) tasks, it is possible to ensemble several heuristics and devise a meta-heuristic to control their initiation. This paper first proposes a methodology of transforming LSBO problems into online de…

    Submitted 18 September, 2019; v1 submitted 16 December, 2018; originally announced December 2018.

    Comments: 7 pages of contents, 1 page of references, 2 pages for appendix

  22. arXiv:1810.09517  [pdf]

    physics.med-ph cs.ET eess.SP

    A High Accuracy and High Sensitivity System Architecture for Electrical Impedance Tomography System

    Authors: Hui Li, Boxiao Liu, Yongfu Li, Guoxing Wang, Yong Lian

    Abstract: A high accuracy and high sensitivity system architecture is proposed for the read-out circuit of electrical impedance tomography system-on-chip. The switched ratiometric technique is applied in the proposed architecture. The proposed system architecture minimizes the device noise by processing signals from both read-out electrodes and the stimulus. The quantized signals are post-processed in the d…

    Submitted 2 October, 2018; originally announced October 2018.

  23. arXiv:1808.03679  [pdf]

    physics.ins-det cs.LG stat.ML

    Machine Learning Promoting Extreme Simplification of Spectroscopy Equipment

    Authors: Jianchao Lee, Qiannan Duan, Sifan Bi, Ruen Luo, Yachao Lian, Hanqiang Liu, Ruixing Tian, Jiayuan Chen, Guodong Ma, Jinhong Gao, Zhaoyi Xu

    Abstract: The spectroscopy measurement is one of the main pathways for exploring and understanding nature. Today, it seems that racing artificial intelligence will remould its styles. The algorithms contained in huge neural networks are capable of substituting many of the expensive and complex components of spectrum instruments. In this work, we presented a smart machine learning strategy on the measurement of…

    Submitted 13 September, 2019; v1 submitted 5 August, 2018; originally announced August 2018.

    Comments: This is the second version. On pages 7 through 8, we have added a new case about the spectral properties of mixtures. Specifically, paragraph 1 on page 8 and Fig.7 is added

  24. A Smart Cushion for Real-Time Heart Rate Monitoring

    Authors: Chacko John Deepu, Zhihao Chen, Ju Teng Teo, Soon Huat Ng, Xiefeng Yang, Yong Lian

    Abstract: This paper presents a smart cushion for real-time heart rate monitoring. The cushion comprises an integrated micro-bending fiber sensor, which records the BCG (Ballistocardiogram) signal without direct skin-electrode contact, and an optical transceiver that does signal amplification, digitization, and pre-filtering. To remove the artifacts and extract heart rate from BCG signal, a computational…

    Submitted 29 September, 2014; originally announced September 2014.

    Comments: 2012 IEEE Biomedical Circuits and Systems Conference

  25. An ECG-on-Chip for Wearable Cardiac Monitoring Devices

    Authors: C. J. Deepu, X. Y. Xu, X. D. Zou, L. B. Yao, Y. Lian

    Abstract: This paper describes a highly integrated, low power chip solution for ECG signal processing in wearable devices. The chip contains an instrumentation amplifier with programmable gain, a band-pass filter, a 12-bit SAR ADC, a novel QRS detector, 8K on-chip SRAM, and relevant control circuitry and CPU interfaces. The analog front end circuits accurately sense and digitize the raw ECG signal, which…

    Submitted 29 September, 2014; originally announced September 2014.

    Journal ref: 5th IEEE International Symposium on Electronic Design Test and Applications 2010

  26. An ECG-on-Chip with 535-nW/Channel Integrated Lossless Data Compressor for Wireless Sensors

    Authors: C. J. Deepu, X. Zhang, W. -S. Liew, D. L. T. Wong, Y. Lian

    Abstract: This paper presents a low-power ECG recording system-on-chip (SoC) with on-chip low-complexity lossless ECG compression for data reduction in wireless/ambulatory ECG sensor devices. The chip uses a linear slope predictor for data compression, and incorporates a novel low-complexity dynamic coding-packaging scheme to frame the prediction error into fixed-length 16-bit format. The proposed technique…

    Submitted 29 September, 2014; originally announced September 2014.

    Journal ref: IEEE Journal of Solid-State Circuits, Nov 2014