Showing 1–46 of 46 results for author: Chai, L

  1. arXiv:2407.03040  [pdf, other]

    cs.CL cs.AI

    Raw Text is All you Need: Knowledge-intensive Multi-turn Instruction Tuning for Large Language Model

    Authors: Xia Hou, Qifeng Li, Jian Yang, Tongliang Li, Linzheng Chai, Xianjie Wu, Hangyuan Ji, Zhoujun Li, Jixuan Nie, Jingbo Dun, Wenfeng Song

    Abstract: Instruction tuning as an effective technique aligns the outputs of large language models (LLMs) with human preference. But how to generate the seasonal multi-turn dialogues from raw documents for instruction tuning still requires further exploration. In this paper, we present a novel framework named R2S that leverages the CoD-Chain of Dialogue logic to guide large language models (LLMs) in generat…

    Submitted 3 July, 2024; originally announced July 2024.

    Comments: 11 pages, 3 figures

    MSC Class: 68T50 ACM Class: I.2.7

  2. arXiv:2406.16441  [pdf, other]

    cs.CL

    UniCoder: Scaling Code Large Language Model via Universal Code

    Authors: Tao Sun, Linzheng Chai, Jian Yang, Yuwei Yin, Hongcheng Guo, Jiaheng Liu, Bing Wang, Liqun Yang, Zhoujun Li

    Abstract: Intermediate reasoning or acting steps have successfully improved large language models (LLMs) for handling various downstream natural language processing (NLP) tasks. When applying LLMs for code generation, recent works mainly focus on directing the models to articulate intermediate natural-language reasoning steps, as in chain-of-thought (CoT) prompting, and then output code with the natural lan…

    Submitted 24 June, 2024; originally announced June 2024.

    Comments: Accepted by ACL 2024 (Main)

  3. arXiv:2406.07436  [pdf, other]

    cs.PL

    McEval: Massively Multilingual Code Evaluation

    Authors: Linzheng Chai, Shukai Liu, Jian Yang, Yuwei Yin, Ke Jin, Jiaheng Liu, Tao Sun, Ge Zhang, Changyu Ren, Hongcheng Guo, Zekun Wang, Boyang Wang, Xianjie Wu, Bing Wang, Tongliang Li, Liqun Yang, Sufeng Duan, Zhoujun Li

    Abstract: Code large language models (LLMs) have shown remarkable advances in code understanding, completion, and generation tasks. Programming benchmarks, comprised of a selection of code challenges and corresponding test cases, serve as a standard to evaluate the capability of different LLMs in such tasks. However, most existing benchmarks primarily focus on Python and are still restricted to a limited nu…

    Submitted 11 June, 2024; originally announced June 2024.

    Comments: 22 pages

  4. arXiv:2405.10014  [pdf, other]

    cs.CV eess.IV

    Frequency-Domain Refinement with Multiscale Diffusion for Super Resolution

    Authors: Xingjian Wang, Li Chai, Jiming Chen

    Abstract: The performance of single image super-resolution depends heavily on how to generate and complement high-frequency details to low-resolution images. Recently, diffusion-based models exhibit great potential in generating high-quality images for super-resolution tasks. However, existing models encounter difficulties in directly predicting high-frequency information of wide bandwidth by solely utilizi…

    Submitted 16 May, 2024; originally announced May 2024.

  5. arXiv:2405.03446  [pdf, other]

    cs.CR

    SEvenLLM: Benchmarking, Eliciting, and Enhancing Abilities of Large Language Models in Cyber Threat Intelligence

    Authors: Hangyuan Ji, Jian Yang, Linzheng Chai, Chaoren Wei, Liqun Yang, Yunlong Duan, Yunli Wang, Tianzhen Sun, Hongcheng Guo, Tongliang Li, Changyu Ren, Zhoujun Li

    Abstract: To address the increasing complexity and frequency of cybersecurity incidents emphasized by the recent cybersecurity threat reports with over 10 billion instances, cyber threat intelligence (CTI) plays a critical role in the modern cybersecurity landscape by offering the insights required to understand and combat the constantly evolving nature of cyber threats. Inspired by the powerful capability…

    Submitted 3 June, 2024; v1 submitted 6 May, 2024; originally announced May 2024.

  6. arXiv:2401.07037  [pdf, other]

    cs.CL cs.AI

    xCoT: Cross-lingual Instruction Tuning for Cross-lingual Chain-of-Thought Reasoning

    Authors: Linzheng Chai, Jian Yang, Tao Sun, Hongcheng Guo, Jiaheng Liu, Bing Wang, Xiannian Liang, Jiaqi Bai, Tongliang Li, Qiyao Peng, Zhoujun Li

    Abstract: Chain-of-thought (CoT) has emerged as a powerful technique to elicit reasoning in large language models and improve a variety of downstream tasks. CoT mainly demonstrates excellent performance in English, but its usage in low-resource languages is constrained due to poor language generalization. To bridge the gap among different languages, we propose a cross-lingual instruction fine-tuning framewo…

    Submitted 13 January, 2024; originally announced January 2024.

    Comments: 11 pages

  7. arXiv:2312.17016  [pdf, other]

    cs.CV cs.AI

    On the Promises and Challenges of Multimodal Foundation Models for Geographical, Environmental, Agricultural, and Urban Planning Applications

    Authors: Chenjiao Tan, Qian Cao, Yiwei Li, Jielu Zhang, Xiao Yang, Huaqin Zhao, Zihao Wu, Zhengliang Liu, Hao Yang, Nemin Wu, Tao Tang, Xinyue Ye, Lilong Chai, Ninghao Liu, Changying Li, Lan Mu, Tianming Liu, Gengchen Mai

    Abstract: The advent of large language models (LLMs) has heightened interest in their potential for multimodal applications that integrate language and vision. This paper explores the capabilities of GPT-4V in the realms of geography, environmental science, agriculture, and urban planning by evaluating its performance across a variety of tasks. Data sources comprise satellite imagery, aerial photos, ground-…

    Submitted 23 December, 2023; originally announced December 2023.

    Comments: 110 Pages; 61 Figures

    ACM Class: I.2.7; I.2.10; I.4.6; I.4.8; J.2

  8. arXiv:2312.11242  [pdf, other]

    cs.CL

    MAC-SQL: A Multi-Agent Collaborative Framework for Text-to-SQL

    Authors: Bing Wang, Changyu Ren, Jian Yang, Xinnian Liang, Jiaqi Bai, Linzheng Chai, Zhao Yan, Qian-Wen Zhang, Di Yin, Xing Sun, Zhoujun Li

    Abstract: Recent LLM-based Text-to-SQL methods usually suffer from significant performance degradation on "huge" databases and complex user questions that require multi-step reasoning. Moreover, most existing methods neglect the crucial significance of LLMs utilizing external tools and model collaboration. To address these challenges, we introduce MAC-SQL, a novel LLM-based multi-agent collaborative framewo…

    Submitted 16 June, 2024; v1 submitted 18 December, 2023; originally announced December 2023.

    Comments: under preview

  9. arXiv:2310.05316  [pdf, other]

    cs.CV

    Understanding the Feature Norm for Out-of-Distribution Detection

    Authors: Jaewoo Park, Jacky Chen Long Chai, Jaeho Yoon, Andrew Beng Jin Teoh

    Abstract: A neural network trained on a classification dataset often exhibits a higher vector norm of hidden layer features for in-distribution (ID) samples, while producing relatively lower norm values on unseen instances from out-of-distribution (OOD). Despite this intriguing phenomenon being utilized in many applications, the underlying cause has not been thoroughly investigated. In this study, we demyst…

    Submitted 8 October, 2023; originally announced October 2023.

    Comments: Accepted to ICCV2023

  10. arXiv:2309.09298  [pdf, other]

    cs.CL

    OWL: A Large Language Model for IT Operations

    Authors: Hongcheng Guo, Jian Yang, Jiaheng Liu, Liqun Yang, Linzheng Chai, Jiaqi Bai, Junran Peng, Xiaorong Hu, Chao Chen, Dongfeng Zhang, Xu Shi, Tieqiao Zheng, Liangfan Zheng, Bo Zhang, Ke Xu, Zhoujun Li

    Abstract: With the rapid development of IT operations, it has become increasingly crucial to efficiently manage and analyze large volumes of data for practical applications. The techniques of Natural Language Processing (NLP) have shown remarkable capabilities for various tasks, including named entity recognition, machine translation and dialogue systems. Recently, Large Language Models (LLMs) have achieved…

    Submitted 17 September, 2023; originally announced September 2023.

    Comments: 31 pages

  11. arXiv:2309.08165  [pdf, other]

    cs.LG cs.AI stat.ME

    To Predict or to Reject: Causal Effect Estimation with Uncertainty on Networked Data

    Authors: Hechuan Wen, Tong Chen, Li Kheng Chai, Shazia Sadiq, Kai Zheng, Hongzhi Yin

    Abstract: Due to the imbalanced nature of networked observational data, the causal effect predictions for some individuals can severely violate the positivity/overlap assumption, rendering unreliable estimations. Nevertheless, this potential risk of individual-level treatment effect estimation on networked data has been largely under-explored. To create a more trustworthy causal effect estimator, we propose…

    Submitted 15 September, 2023; originally announced September 2023.

    Comments: Accepted by ICDM'23

  12. arXiv:2309.07438  [pdf, other]

    cs.AI cs.NI

    Towards Artificial General Intelligence (AGI) in the Internet of Things (IoT): Opportunities and Challenges

    Authors: Fei Dou, Jin Ye, Geng Yuan, Qin Lu, Wei Niu, Haijian Sun, Le Guan, Guoyu Lu, Gengchen Mai, Ninghao Liu, Jin Lu, Zhengliang Liu, Zihao Wu, Chenjiao Tan, Shaochen Xu, Xianqiao Wang, Guoming Li, Lilong Chai, Sheng Li, Jin Sun, Hongyue Sun, Yunli Shao, Changying Li, Tianming Liu, Wenzhan Song

    Abstract: Artificial General Intelligence (AGI), possessing the capacity to comprehend, learn, and execute tasks with human cognitive abilities, engenders significant anticipation and intrigue across scientific, commercial, and societal arenas. This fascination extends particularly to the Internet of Things (IoT), a landscape characterized by the interconnection of countless devices, sensors, and systems, c…

    Submitted 14 September, 2023; originally announced September 2023.

  13. arXiv:2308.06552  [pdf, other]

    cs.CL

    MT4CrossOIE: Multi-stage Tuning for Cross-lingual Open Information Extraction

    Authors: Tongliang Li, Zixiang Wang, Linzheng Chai, Jian Yang, Jiaqi Bai, Yuwei Yin, Jiaheng Liu, Hongcheng Guo, Liqun Yang, Hebboul Zine el-abidine, Zhoujun Li

    Abstract: Cross-lingual open information extraction aims to extract structured information from raw text across multiple languages. Previous work uses a shared cross-lingual pre-trained model to handle the different languages but underuses the potential of the language-specific representation. In this paper, we propose an effective multi-stage tuning framework called MT4CrossOIE, designed for enhancing cross…

    Submitted 20 September, 2023; v1 submitted 12 August, 2023; originally announced August 2023.

    Comments: 10 pages

  14. arXiv:2308.01042  [pdf, other]

    cs.CV

    WCCNet: Wavelet-integrated CNN with Crossmodal Rearranging Fusion for Fast Multispectral Pedestrian Detection

    Authors: Xingjian Wang, Li Chai, Jiming Chen, Zhiguo Shi

    Abstract: Multispectral pedestrian detection achieves better visibility in challenging conditions and thus has a broad application in various tasks, for which both the accuracy and computational cost are of paramount importance. Most existing approaches treat RGB and infrared modalities equally, typically adopting two symmetrical CNN backbones for multimodal feature extraction, which ignores the substantial…

    Submitted 2 August, 2023; originally announced August 2023.

    Comments: Submitted to TPAMI

  15. arXiv:2306.13271  [pdf, other]

    cs.LG stat.ML

    Variational Counterfactual Prediction under Runtime Domain Corruption

    Authors: Hechuan Wen, Tong Chen, Li Kheng Chai, Shazia Sadiq, Junbin Gao, Hongzhi Yin

    Abstract: To date, various neural methods have been proposed for causal effect estimation based on observational data, where a default assumption is the same distribution and availability of variables at both training and inference (i.e., runtime) stages. However, distribution shift (i.e., domain shift) could happen during runtime, and bigger challenges arise from the impaired accessibility of variables. Th…

    Submitted 22 June, 2023; originally announced June 2023.

  16. arXiv:2306.09344  [pdf, other]

    cs.CV cs.LG

    DreamSim: Learning New Dimensions of Human Visual Similarity using Synthetic Data

    Authors: Stephanie Fu, Netanel Tamir, Shobhita Sundaram, Lucy Chai, Richard Zhang, Tali Dekel, Phillip Isola

    Abstract: Current perceptual similarity metrics operate at the level of pixels and patches. These metrics compare images in terms of their low-level colors and textures, but fail to capture mid-level similarities and differences in image layout, object pose, and semantic content. In this paper, we develop a perceptual metric that assesses images holistically. Our first step is to collect a new dataset of hu…

    Submitted 8 December, 2023; v1 submitted 15 June, 2023; originally announced June 2023.

    Comments: Website: https://dreamsim-nights.github.io/ Code: https://github.com/ssundaram21/dreamsim

  17. arXiv:2305.10254  [pdf, other]

    cs.CV cs.AI

    SAM for Poultry Science

    Authors: Xiao Yang, Haixing Dai, Zihao Wu, Ramesh Bist, Sachin Subedi, Jin Sun, Guoyu Lu, Changying Li, Tianming Liu, Lilong Chai

    Abstract: In recent years, the agricultural industry has witnessed significant advancements in artificial intelligence (AI), particularly with the development of large-scale foundational models. Among these foundation models, the Segment Anything Model (SAM), introduced by Meta AI Research, stands out as a groundbreaking solution for object segmentation tasks. While SAM has shown success in various agricult…

    Submitted 17 May, 2023; originally announced May 2023.

  18. arXiv:2305.06655  [pdf, other]

    cs.CL

    QURG: Question Rewriting Guided Context-Dependent Text-to-SQL Semantic Parsing

    Authors: Linzheng Chai, Dongling Xiao, Jian Yang, Liqun Yang, Qian-Wen Zhang, Yunbo Cao, Zhoujun Li, Zhao Yan

    Abstract: Context-dependent Text-to-SQL aims to translate multi-turn natural language questions into SQL queries. Despite various methods have exploited context-dependence information implicitly for contextual SQL parsing, there are few attempts to explicitly address the dependencies between current question and question context. This paper presents QURG, a novel Question Rewriting Guided approach to help t…

    Submitted 16 May, 2023; v1 submitted 11 May, 2023; originally announced May 2023.

  19. arXiv:2304.10066  [pdf, other]

    cs.CV

    Recognizability Embedding Enhancement for Very Low-Resolution Face Recognition and Quality Estimation

    Authors: Jacky Chen Long Chai, Tiong-Sik Ng, Cheng-Yaw Low, Jaewoo Park, Andrew Beng Jin Teoh

    Abstract: Very low-resolution face recognition (VLRFR) poses unique challenges, such as tiny regions of interest and poor resolution due to extreme standoff distance or wide viewing angle of the acquisition devices. In this paper, we study principled approaches to elevate the recognizability of a face in the embedding space instead of the visual quality. We first formulate a robust learning-based face recog…

    Submitted 19 April, 2023; originally announced April 2023.

    Comments: Accepted to CVPR23

  20. arXiv:2304.06136  [pdf, other]

    cs.AI cs.CY

    AGI for Agriculture

    Authors: Guoyu Lu, Sheng Li, Gengchen Mai, Jin Sun, Dajiang Zhu, Lilong Chai, Haijian Sun, Xianqiao Wang, Haixing Dai, Ninghao Liu, Rui Xu, Daniel Petti, Changying Li, Tianming Liu

    Abstract: Artificial General Intelligence (AGI) is poised to revolutionize a variety of sectors, including healthcare, finance, transportation, and education. Within healthcare, AGI is being utilized to analyze clinical medical notes, recognize patterns in patient data, and aid in patient management. Agriculture is another critical sector that impacts the lives of individuals worldwide. It serves as a found…

    Submitted 12 April, 2023; originally announced April 2023.

  21. arXiv:2303.13515  [pdf, other]

    cs.CV cs.LG

    Persistent Nature: A Generative Model of Unbounded 3D Worlds

    Authors: Lucy Chai, Richard Tucker, Zhengqi Li, Phillip Isola, Noah Snavely

    Abstract: Despite increasingly realistic image quality, recent 3D image generative models often operate on 3D volumes of fixed extent with limited camera motions. We investigate the task of unconditionally synthesizing unbounded nature scenes, enabling arbitrarily large camera motion while maintaining a persistent 3D world model. Our scene representation consists of an extendable, planar scene layout grid,…

    Submitted 23 March, 2023; originally announced March 2023.

    Comments: CVPR camera ready version, project page: https://chail.github.io/persistent-nature/

  22. arXiv:2303.05740  [pdf, other]

    cs.DC

    A Novel Bilateral Energy Trading Mechanism for Electricity Markets with Numerous Prosumers

    Authors: Bing Liu, Furan Xie, Li Chai

    Abstract: With the rapid development of distributed energy resources, increasing number of residential and commercial users have been switched from pure electricity consumers to prosumers that can both consume and produce energy. To properly manage these emerging prosumers, a peer-to-peer electricity market has been explored and extensively studied. In such an electricity market, each prosumer trades energy…

    Submitted 13 September, 2023; v1 submitted 10 March, 2023; originally announced March 2023.

  23. arXiv:2209.13032  [pdf, other]

    cs.CV

    Totems: Physical Objects for Verifying Visual Integrity

    Authors: Jingwei Ma, Lucy Chai, Minyoung Huh, Tongzhou Wang, Ser-Nam Lim, Phillip Isola, Antonio Torralba

    Abstract: We introduce a new approach to image forensics: placing physical refractive objects, which we call totems, into a scene so as to protect any photograph taken of that scene. Totems bend and redirect light rays, thus providing multiple, albeit distorted, views of the scene within a single image. A defender can use these distorted totem pixels to detect if an image has been manipulated. Our approach…

    Submitted 26 September, 2022; originally announced September 2022.

    Comments: ECCV 2022 camera ready version; project page https://jingweim.github.io/totems/

  24. arXiv:2205.14659  [pdf, other]

    cs.CV cs.AI

    Glance to Count: Learning to Rank with Anchors for Weakly-supervised Crowd Counting

    Authors: Zheng Xiong, Liangyu Chai, Wenxi Liu, Yongtuo Liu, Sucheng Ren, Shengfeng He

    Abstract: Crowd image is arguably one of the most laborious data to annotate. In this paper, we devote to reduce the massive demand of densely labeled crowd data, and propose a novel weakly-supervised setting, in which we leverage the binary ranking of two images with high-contrast crowd counts as training guidance. To enable training under this new setting, we convert the crowd count regression problem to…

    Submitted 29 May, 2022; originally announced May 2022.

  25. arXiv:2205.07686  [pdf, other]

    cs.CL

    CQR-SQL: Conversational Question Reformulation Enhanced Context-Dependent Text-to-SQL Parsers

    Authors: Dongling Xiao, Linzheng Chai, Qian-Wen Zhang, Zhao Yan, Zhoujun Li, Yunbo Cao

    Abstract: Context-dependent text-to-SQL is the task of translating multi-turn questions into database-related SQL queries. Existing methods typically focus on making full use of history context or previously predicted SQL for currently SQL parsing, while neglecting to explicitly comprehend the schema and conversational dependency, such as co-reference, ellipsis and user focus change. In this paper, we propo…

    Submitted 24 October, 2022; v1 submitted 16 May, 2022; originally announced May 2022.

    Comments: Accepted at EMNLP 2022 (findings)

  26. arXiv:2204.07156  [pdf, other]

    cs.CV cs.LG

    Any-resolution Training for High-resolution Image Synthesis

    Authors: Lucy Chai, Michael Gharbi, Eli Shechtman, Phillip Isola, Richard Zhang

    Abstract: Generative models operate at fixed resolution, even though natural images come in a variety of sizes. As high-resolution details are downsampled away and low-resolution images are discarded altogether, precious supervision is lost. We argue that every pixel matters and create datasets with variable-size images, collected at their native resolutions. To take advantage of varied-size data, we introd…

    Submitted 4 August, 2022; v1 submitted 14 April, 2022; originally announced April 2022.

    Comments: ECCV 2022 camera ready version; project page https://chail.github.io/anyres-gan/

  27. arXiv:2203.14485  [pdf, other]

    cs.CV eess.SY

    Optimization of Directional Landmark Deployment for Visual Observer on SE(3)

    Authors: Zike Lei, Xi Chen, Ying Tan, Xiang Chen, Li Chai

    Abstract: An optimization method is proposed in this paper for novel deployment of given number of directional landmarks (location and pose) within a given region in the 3-D task space. This new deployment technique is built on the geometric models of both landmarks and the monocular camera. In particular, a new concept of Multiple Coverage Probability (MCP) is defined to characterize the probability of at…

    Submitted 28 March, 2022; originally announced March 2022.

  28. arXiv:2203.08632  [pdf, other]

    cs.CV

    Coverage Optimization of Camera Network for Continuous Deformable Object

    Authors: Chang Li, Xi Chen, Li Chai

    Abstract: In this paper, a deformable object is considered for cameras deployment with the aim of visual coverage. The object contour is discretized into sampled points as meshes, and the deformation is represented as continuous trajectories for the sampled points. To reduce the computational complexity, some feature points are carefully selected representing the continuous deformation process, and the visu…

    Submitted 16 March, 2022; originally announced March 2022.

  29. arXiv:2203.05208  [pdf, other]

    cs.CV eess.IV

    Transferring Dual Stochastic Graph Convolutional Network for Facial Micro-expression Recognition

    Authors: Hui Tang, Li Chai, Wanli Lu

    Abstract: Micro-expression recognition has drawn increasing attention due to its wide application in lie detection, criminal detection and psychological consultation. To improve the recognition performance of the small micro-expression data, this paper presents a transferring dual stochastic Graph Convolutional Network (TDSGCN) model. We propose a stochastic graph construction method and dual graph convolut…

    Submitted 10 March, 2022; originally announced March 2022.

  30. arXiv:2202.05530  [pdf, ps, other]

    cs.IT

    An Improved EPA based Receiver Design for Uplink LDPC Coded SCMA System

    Authors: Lingyun Chai, Zilong Liu, Pei Xiao, Amine Maaref, Lin Bai

    Abstract: Sparse code multiple access (SCMA) is an emerging paradigm for efficient enabling of massive connectivity in future machine-type communications (MTC). In this letter, we conceive the uplink transmissions of the low-density parity check (LDPC) coded SCMA system. Traditional receiver design of LDPC-SCMA system, which is based on message passing algorithm (MPA) for multiuser detection followed by ind…

    Submitted 11 February, 2022; originally announced February 2022.

  31. arXiv:2108.02970  [pdf, other]

    cs.CV

    Reducing Spatial Labeling Redundancy for Semi-supervised Crowd Counting

    Authors: Yongtuo Liu, Sucheng Ren, Liangyu Chai, Hanjie Wu, Jing Qin, Dan Xu, Shengfeng He

    Abstract: Labeling is onerous for crowd counting as it should annotate each individual in crowd images. Recently, several methods have been proposed for semi-supervised crowd counting to reduce the labeling efforts. Given a limited labeling budget, they typically select a few crowd images and densely label all individuals in each of them. Despite the promising results, we argue the None-or-All labeling stra…

    Submitted 6 August, 2021; originally announced August 2021.

    Comments: 8 pages, 6 figures

  32. Heterogeneous Multi-sensor Fusion with Random Finite Set Multi-object Densities

    Authors: Wei Yi, Lei Chai

    Abstract: This paper addresses the density based multi-sensor cooperative fusion using random finite set (RFS) type multi-object densities (MODs). Existing fusion methods use scalar weights to characterize the relative information confidence among the local MODs, and in this way the portion of contribution of each local MOD to the fused global MOD can be tuned via adjusting these weights. Our analysis shows…

    Submitted 15 June, 2021; originally announced June 2021.

  33. arXiv:2104.14551  [pdf, other]

    cs.CV cs.LG

    Ensembling with Deep Generative Views

    Authors: Lucy Chai, Jun-Yan Zhu, Eli Shechtman, Phillip Isola, Richard Zhang

    Abstract: Recent generative models can synthesize "views" of artificial images that mimic real-world variations, such as changes in color or pose, simply by learning from unlabeled image collections. Here, we investigate whether such views can be applied to real images to benefit downstream analysis tasks such as image classification. Using a pretrained generator, we first find the latent code corresponding…

    Submitted 29 April, 2021; originally announced April 2021.

    Comments: CVPR 2021 camera ready version; code available at https://github.com/chail/gan-ensembling

  34. arXiv:2103.10426  [pdf, other]

    cs.CV cs.LG

    Using latent space regression to analyze and leverage compositionality in GANs

    Authors: Lucy Chai, Jonas Wulff, Phillip Isola

    Abstract: In recent years, Generative Adversarial Networks have become ubiquitous in both research and public perception, but how GANs convert an unstructured latent code to a high quality output is still an open question. In this work, we investigate regression into the latent space as a probe to understand the compositional properties of GANs. We find that combining the regressor and a pretrained generato…

    Submitted 3 June, 2021; v1 submitted 18 March, 2021; originally announced March 2021.

    Comments: Update to ICLR 2021 camera ready version

  35. arXiv:2103.05576  [pdf, other]

    eess.SY cs.MA

    Distributed Frequency Restoration and SoC Balancing Control for AC Microgrids

    Authors: Chang Yu, Xiaoqing Lu, Jingang Lai, Li Chai

    Abstract: This paper develops an improved distributed finite-time control algorithm for multiagent-based ac microgrids with battery energy storage systems (BESSs) utilizing a low-width communication network. The proposed control algorithm can simultaneously coordinate BESSs to eliminate any deviation from the nominal frequency as well as solving the state of charge (SoC) balancing problem. The stability of…

    Submitted 9 March, 2021; originally announced March 2021.

  36. arXiv:2103.02813  [pdf, other]

    eess.IV cs.CV

    PET Image Reconstruction with Multiple Kernels and Multiple Kernel Space Regularizers

    Authors: Shiyao Guo, Yuxia Sheng, Shenpeng Li, Li Chai, Jingxin Zhang

    Abstract: Kernelized maximum-likelihood (ML) expectation maximization (EM) methods have recently gained prominence in PET image reconstruction, outperforming many previous state-of-the-art methods. But they are not immune to the problems of non-kernelized MLEM methods in potentially large reconstruction error and high sensitivity to iteration number. This paper demonstrates these problems by theoretical rea…

    Submitted 3 March, 2021; originally announced March 2021.

    Comments: 21 pages, 9 figures

  37. Periocular Embedding Learning with Consistent Knowledge Distillation from Face

    Authors: Yoon Gyo Jung, Jaewoo Park, Cheng Yaw Low, Jacky Chen Long Chai, Leslie Ching Ow Tiong, Andrew Beng Jin Teoh

    Abstract: Periocular biometric, the peripheral area of the ocular, is a collaborative alternative to the face, especially when the face is occluded or masked. However, in practice, sole periocular biometric capture the least salient facial features, thereby lacking discriminative information, particularly in wild environments. To address these problems, we transfer discriminatory information from the face t…

    Submitted 28 January, 2024; v1 submitted 12 December, 2020; originally announced December 2020.

    Comments: Accepted to Neurocomputing

  38. arXiv:2011.01447  [pdf, other]

    cs.SD cs.AI cs.LG cs.NE eess.AS

    A Two-Stage Approach to Device-Robust Acoustic Scene Classification

    Authors: Hu Hu, Chao-Han Huck Yang, Xianjun Xia, Xue Bai, Xin Tang, Yajian Wang, Shutong Niu, Li Chai, Juanjuan Li, Hongning Zhu, Feng Bao, Yuanjun Zhao, Sabato Marco Siniscalchi, Yannan Wang, Jun Du, Chin-Hui Lee

    Abstract: To improve device robustness, a highly desirable key feature of a competitive data-driven acoustic scene classification (ASC) system, a novel two-stage system based on fully convolutional neural networks (CNNs) is proposed. Our two-stage system leverages on an ad-hoc score combination based on two CNN classifiers: (i) the first CNN classifies acoustic inputs into one of three broad classes, and (i…

    Submitted 2 November, 2020; originally announced November 2020.

    Comments: Submitted to ICASSP 2021. Code available: https://github.com/MihawkHu/DCASE2020_task1

    Report number: 845--849

    Journal ref: ICASSP 2021-2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

  39. arXiv:2008.10588  [pdf, other]

    cs.CV

    What makes fake images detectable? Understanding properties that generalize

    Authors: Lucy Chai, David Bau, Ser-Nam Lim, Phillip Isola

    Abstract: The quality of image generation and manipulation is reaching impressive levels, making it increasingly difficult for a human to distinguish between what is real and what is fake. However, deep networks can still pick up on the subtle artifacts in these doctored images. We seek to understand what properties of fake images make them detectable and identify what generalizes across different model arc…

    Submitted 24 August, 2020; originally announced August 2020.

  40. arXiv:2007.08389  [pdf, other]

    eess.AS cs.LG cs.SD

    Device-Robust Acoustic Scene Classification Based on Two-Stage Categorization and Data Augmentation

    Authors: Hu Hu, Chao-Han Huck Yang, Xianjun Xia, Xue Bai, Xin Tang, Yajian Wang, Shutong Niu, Li Chai, Juanjuan Li, Hongning Zhu, Feng Bao, Yuanjun Zhao, Sabato Marco Siniscalchi, Yannan Wang, Jun Du, Chin-Hui Lee

    Abstract: In this technical report, we present a joint effort of four groups, namely GT, USTC, Tencent, and UKE, to tackle Task 1 - Acoustic Scene Classification (ASC) in the DCASE 2020 Challenge. Task 1 comprises two different sub-tasks: (i) Task 1a focuses on ASC of audio signals recorded with multiple (real and simulated) devices into ten different fine-grained classes, and (ii) Task 1b concerns with cla…

    Submitted 26 August, 2020; v1 submitted 16 July, 2020; originally announced July 2020.

    Comments: Revised Technical Report. Proposed systems attain 2nds in both Task-1a and Task-1b in the official DCASE challenge 2020

  41. arXiv:1907.07171  [pdf, other]

    cs.CV cs.LG

    On the "steerability" of generative adversarial networks

    Authors: Ali Jahanian, Lucy Chai, Phillip Isola

    Abstract: An open secret in contemporary machine learning is that many models work beautifully on standard benchmarks but fail to generalize outside the lab. This has been attributed to biased training data, which provide poor coverage over real world events. Generative models are no exception, but recent advances in generative adversarial networks (GANs) suggest otherwise - these models can now synthesize…

    Submitted 16 February, 2020; v1 submitted 16 July, 2019; originally announced July 2019.

  42. arXiv:1903.01712  [pdf, other]

    cs.CV cs.AI

    Deep Learning Based Motion Planning For Autonomous Vehicle Using Spatiotemporal LSTM Network

    Authors: Zhengwei Bai, Baigen Cai, Wei Shangguan, Linguo Chai

    Abstract: Motion Planning, as a fundamental technology of automatic navigation for the autonomous vehicle, is still an open challenging issue in the real-life traffic situation and is mostly applied by the model-based approaches. However, due to the complexity of the traffic situations and the uncertainty of the edge cases, it is hard to devise a general motion planning system for the autonomous vehicle. In…

    Submitted 5 March, 2019; originally announced March 2019.

    Comments: 5 pages, 8 figures, Accepted to 2018 Chinese Automation Congress (CAC)

  43. arXiv:1902.05772  [pdf, other]

    cs.LG cs.AI

    Deep Reinforcement Learning Based High-level Driving Behavior Decision-making Model in Heterogeneous Traffic

    Authors: Zhengwei Bai, Baigen Cai, Wei Shangguan, Linguo Chai

    Abstract: High-level driving behavior decision-making is an open-challenging problem for connected vehicle technology, especially in heterogeneous traffic scenarios. In this paper, a deep reinforcement learning based high-level driving behavior decision-making approach is proposed for connected vehicle in heterogeneous traffic situations. The model is composed of three main parts: a data preprocessor that m…

    Submitted 26 February, 2019; v1 submitted 15 February, 2019; originally announced February 2019.

    Comments: 7 pages, 7 figures, 6 tables

  44. arXiv:1811.11517  [pdf, other]

    eess.AS cs.SD

    Acoustics-guided evaluation (AGE): a new measure for estimating performance of speech enhancement algorithms for robust ASR

    Authors: Li Chai, Jun Du, Chin-Hui Lee

    Abstract: One challenging problem of robust automatic speech recognition (ASR) is how to measure the goodness of a speech enhancement algorithm (SEA) without calculating the word error rate (WER) due to the high costs of manual transcriptions, language modeling and decoding process. Traditional measures like PESQ and STOI for evaluating the speech quality and intelligibility were verified to have relatively…

    Submitted 28 November, 2018; originally announced November 2018.

    Comments: Submitted to ICASSP 2019

  45. arXiv:1810.10534  [pdf, other]

    physics.soc-ph cs.DL

    Evolution of semantic networks in biomedical texts

    Authors: Lucy R. Chai, Danielle S. Bassett

    Abstract: Language is hierarchically organized: words are built into phrases, sentences, and paragraphs to represent complex ideas. Here we ask whether the organization of language in written text displays the fractal hierarchical architecture common in systems optimized for efficient information transmission. We test the hypothesis that the expositional structure of scientific research articles displays Re…

    Submitted 24 October, 2018; originally announced October 2018.

    Comments: 4 figures in the main text; 9 figures in the supplement

  46. arXiv:1703.02685  [pdf, ps, other]

    eess.SY cs.MA

    New results on multi-agent system consensus: A graph signal processing perspective

    Authors: Jing-Wen Yi, Li Chai

    Abstract: This paper revisits the problem of multi-agent consensus from a graph signal processing perspective. By defining the graph filter from the consensus protocol, we establish the direct relation between average consensus of multi-agent systems and filtering of graph signals. This relation not only provides new insights of the average consensus, it also turns out to be a powerful tool to design effect…

    Submitted 7 March, 2017; originally announced March 2017.

    Comments: 5 pages, 4 figures