
Showing 1–50 of 78 results for author: Lei, C

  1. arXiv:2407.10279  [pdf, other]

    cs.AI cs.GT cs.MA

    AlphaDou: High-Performance End-to-End Doudizhu AI Integrating Bidding

    Authors: Chang Lei, Huan Lei

    Abstract: Artificial intelligence for card games has long been a popular topic in AI research. In recent years, complex card games like Mahjong and Texas Hold'em have been solved, with corresponding AI programs reaching the level of human experts. However, the game of Dou Di Zhu presents significant challenges due to its vast state/action space and unique characteristics involving reasoning about competitio…

    Submitted 14 July, 2024; originally announced July 2024.

  2. arXiv:2406.18129  [pdf, other]

    cs.CV cs.LG

    CTS: Sim-to-Real Unsupervised Domain Adaptation on 3D Detection

    Authors: Meiying Zhang, Weiyuan Peng, Guangyao Ding, Chenyang Lei, Chunlin Ji, Qi Hao

    Abstract: Simulation data can be accurately labeled and have been expected to improve the performance of data-driven algorithms, including object detection. However, due to the various domain inconsistencies from simulation to reality (sim-to-real), cross-domain object detection algorithms usually suffer from dramatic performance drops. While numerous unsupervised domain adaptation (UDA) methods have been d…

    Submitted 26 June, 2024; originally announced June 2024.

  3. arXiv:2406.09534  [pdf, other]

    cs.DB cs.LG

    FeatNavigator: Automatic Feature Augmentation on Tabular Data

    Authors: Jiaming Liang, Chuan Lei, Xiao Qin, Jiani Zhang, Asterios Katsifodimos, Christos Faloutsos, Huzefa Rangwala

    Abstract: Data-centric AI focuses on understanding and utilizing high-quality, relevant data in training machine learning (ML) models, thereby increasing the likelihood of producing accurate and useful results. Automatic feature augmentation, aiming to augment the initial base table with useful features from other tables, is critical in data preparation as it improves model performance, robustness, and gene…

    Submitted 13 June, 2024; originally announced June 2024.

    Comments: 15 pages, 41 figures

  4. arXiv:2406.03882  [pdf, other]

    cs.CL cs.SD eess.AS

    Spontaneous Speech-Based Suicide Risk Detection Using Whisper and Large Language Models

    Authors: Ziyun Cui, Chang Lei, Wen Wu, Yinan Duan, Diyang Qu, Ji Wu, Runsen Chen, Chao Zhang

    Abstract: The early detection of suicide risk is important since it enables the intervention to prevent potential suicide attempts. This paper studies the automatic detection of suicide risk based on spontaneous speech from adolescents, and collects a Mandarin dataset with 15 hours of suicide speech from more than a thousand adolescents aged from ten to eighteen for our experiments. To leverage the diverse…

    Submitted 9 July, 2024; v1 submitted 6 June, 2024; originally announced June 2024.

    Comments: Accepted by Interspeech 2024

  5. arXiv:2406.03461  [pdf, other]

    cs.CV eess.IV

    Polarization Wavefront Lidar: Learning Large Scene Reconstruction from Polarized Wavefronts

    Authors: Dominik Scheuble, Chenyang Lei, Seung-Hwan Baek, Mario Bijelic, Felix Heide

    Abstract: Lidar has become a cornerstone sensing modality for 3D vision, especially for large outdoor scenarios and autonomous driving. Conventional lidar sensors are capable of providing centimeter-accurate distance information by emitting laser pulses into a scene and measuring the time-of-flight (ToF) of the reflection. However, the polarization of the received light that depends on the surface orientati…

    Submitted 11 June, 2024; v1 submitted 5 June, 2024; originally announced June 2024.

    Comments: Accepted at CVPR 2024; Project Website: https://light.princeton.edu/publication/pollidar

  6. arXiv:2406.01555  [pdf, other]

    cs.CV

    Towards Flexible Interactive Reflection Removal with Human Guidance

    Authors: Xiao Chen, Xudong Jiang, Yunkang Tao, Zhen Lei, Qing Li, Chenyang Lei, Zhaoxiang Zhang

    Abstract: Single image reflection removal is inherently ambiguous, as both the reflection and transmission components requiring separation may follow natural image statistics. Existing methods attempt to address the issue by using various types of low-level and physics-based cues as sources of reflection signals. However, these cues are not universally applicable, since they are only observable in specific…

    Submitted 3 June, 2024; originally announced June 2024.

  7. arXiv:2405.17705  [pdf, other]

    cs.CV

    DC-Gaussian: Improving 3D Gaussian Splatting for Reflective Dash Cam Videos

    Authors: Linhan Wang, Kai Cheng, Shuo Lei, Shengkun Wang, Wei Yin, Chenyang Lei, Xiaoxiao Long, Chang-Tien Lu

    Abstract: We present DC-Gaussian, a new method for generating novel views from in-vehicle dash cam videos. While neural rendering techniques have made significant strides in driving scenarios, existing methods are primarily designed for videos collected by autonomous vehicles. However, these videos are limited in both quantity and diversity compared to dash cam videos, which are more widely used across vari…

    Submitted 29 May, 2024; v1 submitted 27 May, 2024; originally announced May 2024.

    Comments: 9 pages, 7 figures; project page: https://linhanwang.github.io/dcgaussian/

  8. arXiv:2404.18209  [pdf, other]

    cs.LG cs.DB

    4DBInfer: A 4D Benchmarking Toolbox for Graph-Centric Predictive Modeling on Relational DBs

    Authors: Minjie Wang, Quan Gan, David Wipf, Zhenkun Cai, Ning Li, Jianheng Tang, Yanlin Zhang, Zizhao Zhang, Zunyao Mao, Yakun Song, Yanbo Wang, Jiahang Li, Han Zhang, Guang Yang, Xiao Qin, Chuan Lei, Muhan Zhang, Weinan Zhang, Christos Faloutsos, Zheng Zhang

    Abstract: Although RDBs store vast amounts of rich, informative data spread across interconnected tables, the progress of predictive machine learning models as applied to such tasks arguably falls well behind advances in other domains such as computer vision or natural language processing. This deficit stems, at least in part, from the lack of established/public RDB benchmarks as needed for training and eva…

    Submitted 28 April, 2024; originally announced April 2024.

    Comments: Under review

  9. arXiv:2404.05661  [pdf, other]

    cs.CV

    Automatic Controllable Colorization via Imagination

    Authors: Xiaoyan Cong, Yue Wu, Qifeng Chen, Chenyang Lei

    Abstract: We propose a framework for automatic colorization that allows for iterative editing and modifications. The core of our framework lies in an imagination module: by understanding the content within a grayscale image, we utilize a pre-trained image generation model to generate multiple images that contain the same content. These images serve as references for coloring, mimicking the process of human…

    Submitted 8 April, 2024; originally announced April 2024.

    Comments: CVPR 2024. Project page: https://xy-cong.github.io/imagine-colorization

  10. arXiv:2404.04318  [pdf, other]

    cs.CV cs.AI

    Robust Depth Enhancement via Polarization Prompt Fusion Tuning

    Authors: Kei Ikemura, Yiming Huang, Felix Heide, Zhaoxiang Zhang, Qifeng Chen, Chenyang Lei

    Abstract: Existing depth sensors are imperfect and may provide inaccurate depth values in challenging scenarios, such as in the presence of transparent or reflective objects. In this work, we present a general framework that leverages polarization imaging to improve inaccurate depth measurements from various depth sensors. Previous polarization-based depth enhancement methods focus on utilizing pure physics…

    Submitted 5 April, 2024; originally announced April 2024.

    Comments: CVPR 2024. Project page: https://lastbasket.github.io/PPFT/. The first two authors contribute equally

  11. arXiv:2403.12372  [pdf, other]

    cs.LG

    Learning Transferable Time Series Classifier with Cross-Domain Pre-training from Language Model

    Authors: Mingyue Cheng, Xiaoyu Tao, Qi Liu, Hao Zhang, Yiheng Chen, Chenyi Lei

    Abstract: Advancements in self-supervised pre-training (SSL) have significantly advanced the field of learning transferable time series representations, which can be very useful in enhancing the downstream task. Despite being effective, most existing works struggle to achieve cross-domain SSL pre-training, missing valuable opportunities to integrate patterns and features from different domains. The main cha…

    Submitted 18 March, 2024; originally announced March 2024.

  12. arXiv:2403.07653  [pdf, other]

    cs.DB

    OmniMatch: Effective Self-Supervised Any-Join Discovery in Tabular Data Repositories

    Authors: Christos Koutras, Jiani Zhang, Xiao Qin, Chuan Lei, Vasileios Ioannidis, Christos Faloutsos, George Karypis, Asterios Katsifodimos

    Abstract: How can we discover join relationships among columns of tabular data in a data repository? Can this be done effectively when metadata is missing? Traditional column matching works mainly rely on similarity measures based on exact value overlaps, hence missing important semantics or failing to handle noise in the data. At the same time, recent dataset discovery methods focusing on deep table repres…

    Submitted 12 March, 2024; originally announced March 2024.

  13. arXiv:2402.14361  [pdf, other]

    cs.LG

    OpenTab: Advancing Large Language Models as Open-domain Table Reasoners

    Authors: Kezhi Kong, Jiani Zhang, Zhengyuan Shen, Balasubramaniam Srinivasan, Chuan Lei, Christos Faloutsos, Huzefa Rangwala, George Karypis

    Abstract: Large Language Models (LLMs) trained on large volumes of data excel at various natural language tasks, but they cannot handle tasks requiring knowledge that has not been trained on previously. One solution is to use a retriever that fetches relevant information to expand LLM's knowledge scope. However, existing textual-oriented retrieval-based LLMs are not ideal on structured table data due to div…

    Submitted 12 April, 2024; v1 submitted 22 February, 2024; originally announced February 2024.

    Comments: Accepted by ICLR 2024

  14. arXiv:2402.04722  [pdf, ps, other]

    cs.CY cs.SE

    Ten simple rules for teaching sustainable software engineering

    Authors: Kit Gallagher, Richard Creswell, Ben Lambert, Martin Robinson, Chon Lok Lei, Gary R. Mirams, David J. Gavaghan

    Abstract: Computational methods and associated software implementations are central to every field of scientific investigation. Modern biological research, particularly within systems biology, has relied heavily on the development of software tools to process and organize increasingly large datasets, simulate complex mechanistic models, provide tools for the analysis and management of data, and visualize an…

    Submitted 7 February, 2024; originally announced February 2024.

    Comments: Prepared for submission to PLOS Computational Biology's 10 Simple Rules collection

  15. arXiv:2401.11098  [pdf, other]

    quant-ph cs.LG

    Neural auto-designer for enhanced quantum kernels

    Authors: Cong Lei, Yuxuan Du, Peng Mi, Jun Yu, Tongliang Liu

    Abstract: Quantum kernels hold great promise for offering computational advantages over classical learners, with the effectiveness of these kernels closely tied to the design of the quantum feature map. However, the challenge of designing effective quantum feature maps for real-world datasets, particularly in the absence of sufficient prior information, remains a significant obstacle. In this study, we pres…

    Submitted 19 January, 2024; originally announced January 2024.

    Comments: 24 pages, 14 figures, 9 tables, ICLR 2024

  16. arXiv:2401.07426  [pdf, other]

    cs.AI

    Generalized Planning for the Abstraction and Reasoning Corpus

    Authors: Chao Lei, Nir Lipovetzky, Krista A. Ehinger

    Abstract: The Abstraction and Reasoning Corpus (ARC) is a general artificial intelligence benchmark that poses difficulties for pure machine learning methods due to its requirement for fluid intelligence with a focus on reasoning and abstraction. In this work, we introduce an ARC solver, Generalized Planning for Abstract Reasoning (GPAR). It casts an ARC problem as a generalized planning (GP) problem, where…

    Submitted 14 January, 2024; originally announced January 2024.

    Comments: Accepted at AAAI 2024 (extended version)

  17. arXiv:2312.16245  [pdf, other]

    cs.CV

    iKUN: Speak to Trackers without Retraining

    Authors: Yunhao Du, Cheng Lei, Zhicheng Zhao, Fei Su

    Abstract: Referring multi-object tracking (RMOT) aims to track multiple objects based on input textual descriptions. Previous works realize it by simply integrating an extra textual module into the multi-object tracker. However, they typically need to retrain the entire framework and have difficulties in optimization. In this work, we propose an insertable Knowledge Unification Network, termed iKUN, to enab…

    Submitted 11 March, 2024; v1 submitted 25 December, 2023; originally announced December 2023.

    Comments: CVPR 2024 camera-ready

  18. arXiv:2312.15252  [pdf, other]

    q-bio.BM cs.LG

    DTIAM: A unified framework for predicting drug-target interactions, binding affinities and activation/inhibition mechanisms

    Authors: Zhangli Lu, Chuqi Lei, Kaili Wang, Libo Qin, Jing Tang, Min Li

    Abstract: Accurate and robust prediction of drug-target interactions (DTIs) plays a vital role in drug discovery. Despite extensive efforts have been invested in predicting novel DTIs, existing approaches still suffer from insufficient labeled data and cold start problems. More importantly, there is currently a lack of studies focusing on elucidating the mechanism of action (MoA) between drugs and targets.…

    Submitted 23 December, 2023; originally announced December 2023.

  19. arXiv:2312.15139  [pdf, other]

    cs.CV

    Automatic Tooth Arrangement with Joint Features of Point and Mesh Representations via Diffusion Probabilistic Models

    Authors: Changsong Lei, Mengfei Xia, Shaofeng Wang, Yaqian Liang, Ran Yi, Yuhui Wen, Yongjin Liu

    Abstract: Tooth arrangement is a crucial step in orthodontics treatment, in which aligning teeth could improve overall well-being, enhance facial aesthetics, and boost self-confidence. To improve the efficiency of tooth arrangement and minimize errors associated with unreasonable designs by inexperienced practitioners, some deep learning-based tooth arrangement methods have been proposed. Currently, most ex…

    Submitted 22 December, 2023; originally announced December 2023.

  20. arXiv:2312.14235  [pdf, other]

    cs.CV

    Neural Spline Fields for Burst Image Fusion and Layer Separation

    Authors: Ilya Chugunov, David Shustin, Ruyu Yan, Chenyang Lei, Felix Heide

    Abstract: Each photo in an image burst can be considered a sample of a complex 3D scene: the product of parallax, diffuse and specular materials, scene motion, and illuminant variation. While decomposing all of these effects from a stack of misaligned images is a highly ill-conditioned task, the conventional align-and-merge burst pipeline takes the other extreme: blending them into a single image. In this w…

    Submitted 21 December, 2023; originally announced December 2023.

    Comments: project website: https://light.princeton.edu/publication/nsf

  21. arXiv:2312.08571  [pdf, other]

    cs.SD cs.AI eess.AS

    PhasePerturbation: Speech Data Augmentation via Phase Perturbation for Automatic Speech Recognition

    Authors: Chengxi Lei, Satwinder Singh, Feng Hou, Xiaoyun Jia, Ruili Wang

    Abstract: Most of the current speech data augmentation methods operate on either the raw waveform or the amplitude spectrum of speech. In this paper, we propose a novel speech data augmentation method called PhasePerturbation that operates dynamically on the phase spectrum of speech. Instead of statically rotating a phase by a constant degree, PhasePerturbation utilizes three dynamic phase spectrum operatio…

    Submitted 13 December, 2023; originally announced December 2023.

  22. arXiv:2312.07254  [pdf, other]

    cs.CL

    The GUA-Speech System Description for CNVSRC Challenge 2023

    Authors: Shengqiang Li, Chao Lei, Baozhong Ma, Binbin Zhang, Fuping Pan

    Abstract: This study describes our system for Task 1 Single-speaker Visual Speech Recognition (VSR) fixed track in the Chinese Continuous Visual Speech Recognition Challenge (CNVSRC) 2023. Specifically, we use intermediate connectionist temporal classification (Inter CTC) residual modules to relax the conditional independence assumption of CTC in our model. Then we use a bi-transformer decoder to enable the…

    Submitted 12 December, 2023; originally announced December 2023.

    Comments: CNVSRC 2023 Challenge

  23. arXiv:2311.15571  [pdf, other]

    cs.CV

    Video-based Visible-Infrared Person Re-Identification with Auxiliary Samples

    Authors: Yunhao Du, Cheng Lei, Zhicheng Zhao, Yuan Dong, Fei Su

    Abstract: Visible-infrared person re-identification (VI-ReID) aims to match persons captured by visible and infrared cameras, allowing person retrieval and tracking in 24-hour surveillance systems. Previous methods focus on learning from cross-modality person images in different cameras. However, temporal information and single-camera samples tend to be neglected. To crack this nut, in this paper, we first…

    Submitted 27 November, 2023; originally announced November 2023.

    Comments: Accepted by Transactions on Information Forensics & Security 2023

  24. arXiv:2310.09469  [pdf, other]

    cs.CV

    Towards More Accurate Diffusion Model Acceleration with A Timestep Aligner

    Authors: Mengfei Xia, Yujun Shen, Changsong Lei, Yu Zhou, Ran Yi, Deli Zhao, Wenping Wang, Yong-jin Liu

    Abstract: A diffusion model, which is formulated to produce an image using thousands of denoising steps, usually suffers from a slow inference speed. Existing acceleration algorithms simplify the sampling by skipping most steps yet exhibit considerable performance degradation. By viewing the generation of diffusion models as a discretized integrating process, we argue that the quality drop is partly caused…

    Submitted 13 October, 2023; originally announced October 2023.

  25. arXiv:2309.04669  [pdf, other]

    cs.CV

    Unified Language-Vision Pretraining in LLM with Dynamic Discrete Visual Tokenization

    Authors: Yang Jin, Kun Xu, Kun Xu, Liwei Chen, Chao Liao, Jianchao Tan, Quzhe Huang, Bin Chen, Chenyi Lei, An Liu, Chengru Song, Xiaoqiang Lei, Di Zhang, Wenwu Ou, Kun Gai, Yadong Mu

    Abstract: Recently, the remarkable advance of the Large Language Model (LLM) has inspired researchers to transfer its extraordinary reasoning capability to both vision and language data. However, the prevailing approaches primarily regard the visual input as a prompt and focus exclusively on optimizing the text generation process conditioned upon vision content by a frozen LLM. Such an inequitable treatment…

    Submitted 22 March, 2024; v1 submitted 8 September, 2023; originally announced September 2023.

    Comments: ICLR 2024

  26. arXiv:2308.13241  [pdf, other]

    cs.RO cond-mat.mtrl-sci physics.optics

    WSTac: Interactive Surface Perception based on Whisker-Inspired and Self-Illuminated Vision-Based Tactile Sensor

    Authors: Kai Chong Lei, Kit Wa Sou, Wang Sing Chan, Jiayi Yan, Siqi Ping, Dengfeng Peng, Wenbo Ding, Xiao-Ping Zhang

    Abstract: Modern Visual-Based Tactile Sensors (VBTSs) use cost-effective cameras to track elastomer deformation, but struggle with ambient light interference. Solutions typically involve using internal LEDs and blocking external light, thus adding complexity. Creating a VBTS resistant to ambient light with just a camera and an elastomer remains a challenge. In this work, we introduce WStac, a self-illuminat…

    Submitted 25 August, 2023; originally announced August 2023.

  27. arXiv:2308.02797  [pdf, other]

    physics.optics cs.CV

    Thin On-Sensor Nanophotonic Array Cameras

    Authors: Praneeth Chakravarthula, Jipeng Sun, Xiao Li, Chenyang Lei, Gene Chou, Mario Bijelic, Johannes Froesch, Arka Majumdar, Felix Heide

    Abstract: Today's commodity camera systems rely on compound optics to map light originating from the scene to positions on the sensor where it gets recorded as an image. To record images without optical aberrations, i.e., deviations from Gauss' linear model of optics, typical lens systems introduce increasingly complex stacks of optical elements which are responsible for the height of existing commodity cam…

    Submitted 5 August, 2023; originally announced August 2023.

    Comments: 18 pages, 12 figures, to be published in ACM Transactions on Graphics

    ACM Class: I.4.0

  28. arXiv:2307.00735  [pdf, other]

    cs.AI

    Novelty and Lifted Helpful Actions in Generalized Planning

    Authors: Chao Lei, Nir Lipovetzky, Krista A. Ehinger

    Abstract: It has been shown recently that successful techniques in classical planning, such as goal-oriented heuristics and landmarks, can improve the ability to compute planning programs for generalized planning (GP) problems. In this work, we introduce the notion of action novelty rank, which computes novelty with respect to a planning program, and propose novelty-based generalized planning solvers, which…

    Submitted 2 July, 2023; originally announced July 2023.

    Comments: Accepted at SoCS 2023 (extended version)

  29. arXiv:2305.09152  [pdf]

    cs.CR quant-ph

    Security Enhancement of Quantum Noise Stream Cipher Based on Probabilistic Constellation Shaping

    Authors: Sheng Liu, Shuang Wei, Wei Wang, Chao Lei, Tianhe Liu, Yajie Li, Yunbo Li, Dawei Ge, Dong Wang, Yongli Zhao, Dechao Zhang, Han Li, Jie Zhang

    Abstract: We propose a QNSC pre-coding scheme based on probabilistic shaping of the basis, to reduce the probability of ciphertext bits that are easier to be intercepted. Experiment results show this scheme can improve the security performance by 100% in terms of Eve's cipher text BER.

    Submitted 16 May, 2023; originally announced May 2023.

  30. arXiv:2304.10226  [pdf, other]

    cs.CV cs.LG

    Domain Generalization for Mammographic Image Analysis with Contrastive Learning

    Authors: Zheren Li, Zhiming Cui, Lichi Zhang, Sheng Wang, Chenjin Lei, Xi Ouyang, Dongdong Chen, Xiangyu Zhao, Yajia Gu, Zaiyi Liu, Chunling Liu, Dinggang Shen, Jie-Zhi Cheng

    Abstract: The deep learning technique has been shown to be effectively addressed several image analysis tasks in the computer-aided diagnosis scheme for mammography. The training of an efficacious deep learning model requires large data with diverse styles and qualities. The diversity of data often comes from the use of various scanners of vendors. But, in practice, it is impractical to collect a sufficient…

    Submitted 7 September, 2023; v1 submitted 20 April, 2023; originally announced April 2023.

    Comments: arXiv admin note: text overlap with arXiv:2111.10827

  31. arXiv:2304.00219  [pdf, other]

    cs.LG

    ConvBLS: An Effective and Efficient Incremental Convolutional Broad Learning System for Image Classification

    Authors: Chunyu Lei, C. L. Philip Chen, Jifeng Guo, Tong Zhang

    Abstract: Deep learning generally suffers from enormous computational resources and time-consuming training processes. Broad Learning System (BLS) and its convolutional variants have been proposed to mitigate these issues and have achieved superb performance in image classification. However, the existing convolutional-based broad learning system (C-BLS) either lacks an efficient training method and incremen…

    Submitted 1 April, 2023; originally announced April 2023.

  32. arXiv:2303.15494  [pdf, other]

    cs.CV

    Semantic-visual Guided Transformer for Few-shot Class-incremental Learning

    Authors: Wenhao Qiu, Sichao Fu, Jingyi Zhang, Chengxiang Lei, Qinmu Peng

    Abstract: Few-shot class-incremental learning (FSCIL) has recently attracted extensive attention in various areas. Existing FSCIL methods highly depend on the robustness of the feature backbone pre-trained on base classes. In recent years, different Transformer variants have obtained significant processes in the feature representation learning of massive fields. Nevertheless, the progress of the Transformer…

    Submitted 27 March, 2023; originally announced March 2023.

    Comments: Accepted by IEEE International Conference on Multimedia and Expo (ICME 2023)

  33. arXiv:2303.12274  [pdf, other]

    cs.RO cs.CV

    A Hierarchical Hybrid Learning Framework for Multi-agent Trajectory Prediction

    Authors: Yujun Jiao, Mingze Miao, Zhishuai Yin, Chunyuan Lei, Xu Zhu, Linzhen Nie, Bo Tao

    Abstract: Accurate and robust trajectory prediction of neighboring agents is critical for autonomous vehicles traversing in complex scenes. Most methods proposed in recent years are deep learning-based due to their strength in encoding complex interactions. However, unplausible predictions are often generated since they rely heavily on past observations and cannot effectively capture the transient and conti…

    Submitted 24 March, 2023; v1 submitted 21 March, 2023; originally announced March 2023.

  34. arXiv:2303.09535  [pdf, other]

    cs.CV

    FateZero: Fusing Attentions for Zero-shot Text-based Video Editing

    Authors: Chenyang Qi, Xiaodong Cun, Yong Zhang, Chenyang Lei, Xintao Wang, Ying Shan, Qifeng Chen

    Abstract: The diffusion-based generative models have achieved remarkable success in text-based image generation. However, since it contains enormous randomness in generation progress, it is still challenging to apply such models for real-world visual content editing, especially in videos. In this paper, we propose FateZero, a zero-shot text-based editing method on real-world videos without per-prompt traini…

    Submitted 11 October, 2023; v1 submitted 16 March, 2023; originally announced March 2023.

    Comments: Accepted to ICCV 2023 as an Oral Presentation. Project page: https://fate-zero-edit.github.io ; GitHub repository: https://github.com/ChenyangQiQi/FateZero

  35. arXiv:2303.08120  [pdf, other]

    cs.CV

    Blind Video Deflickering by Neural Filtering with a Flawed Atlas

    Authors: Chenyang Lei, Xuanchi Ren, Zhaoxiang Zhang, Qifeng Chen

    Abstract: Many videos contain flickering artifacts. Common causes of flicker include video processing algorithms, video generation algorithms, and capturing videos under specific situations. Prior work usually requires specific guidance such as the flickering frequency, manual annotations, or extra consistent videos to remove the flicker. In this work, we propose a general flicker removal framework that onl…

    Submitted 14 March, 2023; originally announced March 2023.

    Comments: To appear in CVPR 2023. Code: github.com/ChenyangLEI/All-In-One-Deflicker ; Website: chenyanglei.github.io/deflicker

  36. arXiv:2302.14438  [pdf, other]

    cs.IR cs.AI cs.LG

    Self-Supervised Interest Transfer Network via Prototypical Contrastive Learning for Recommendation

    Authors: Guoqiang Sun, Yibin Shen, Sijin Zhou, Xiang Chen, Hongyan Liu, Chunming Wu, Chenyi Lei, Xianhui Wei, Fei Fang

    Abstract: Cross-domain recommendation has attracted increasing attention from industry and academia recently. However, most existing methods do not exploit the interest invariance between domains, which would yield sub-optimal solutions. In this paper, we propose a cross-domain recommendation method: Self-supervised Interest Transfer Network (SITN), which can effectively transfer invariant knowledge between…

    Submitted 28 February, 2023; originally announced February 2023.

    Comments: 9 pages, 3 figures, accepted by AAAI 2023

  37. arXiv:2302.08250  [pdf, other]

    cs.LG

    Self-supervised Guided Hypergraph Feature Propagation for Semi-supervised Classification with Missing Node Features

    Authors: Chengxiang Lei, Sichao Fu, Yuetian Wang, Wenhao Qiu, Yachen Hu, Qinmu Peng, Xinge You

    Abstract: Graph neural networks (GNNs) with missing node features have recently received increasing interest. Such missing node features seriously hurt the performance of the existing GNNs. Some recent methods have been proposed to reconstruct the missing node features by the information propagation among nodes with known and unknown attributes. Although these methods have achieved superior performance, how…

    Submitted 16 February, 2023; originally announced February 2023.

    Comments: Accepted by 48th IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2023)

  38. Natural Language Interfaces to Data

    Authors: Abdul Quamar, Vasilis Efthymiou, Chuan Lei, Fatma Özcan

    Abstract: Recent advances in NLU and NLP have resulted in renewed interest in natural language interfaces to data, which provide an easy mechanism for non-technical users to access and query the data. While early systems evolved from keyword search and focused on simple factual queries, the complexity of both the input sentences as well as the generated SQL queries has evolved over time. More recently, ther…

    Submitted 26 December, 2022; originally announced December 2022.

    Comments: The full version of this manuscript, as published by Foundations and Trends in Databases, is available at http://dx.doi.org/10.1561/1900000078

    Journal ref: Foundations and Trends in Databases 2022 Vol. 11: No. 4, pp 319-414

  39. arXiv:2212.08663  [pdf, other]

    cs.CV

    Randomized Quantization: A Generic Augmentation for Data Agnostic Self-supervised Learning

    Authors: Huimin Wu, Chenyang Lei, Xiao Sun, Peng-Shuai Wang, Qifeng Chen, Kwang-Ting Cheng, Stephen Lin, Zhirong Wu

    Abstract: Self-supervised representation learning follows a paradigm of withholding some part of the data and tasking the network to predict it from the remaining part. Among many techniques, data augmentation lies at the core for creating the information gap. Towards this end, masking has emerged as a generic and powerful tool where content is withheld along the sequential dimension, e.g., spatial in image…

    Submitted 23 August, 2023; v1 submitted 19 December, 2022; originally announced December 2022.

    Comments: Accepted by ICCV 2023. The code is available at https://github.com/microsoft/random_quantize

  40. arXiv:2211.15662  [pdf, other]

    cs.CV

    High-fidelity 3D GAN Inversion by Pseudo-multi-view Optimization

    Authors: Jiaxin Xie, Hao Ouyang, Jingtan Piao, Chenyang Lei, Qifeng Chen

    Abstract: We present a high-fidelity 3D generative adversarial network (GAN) inversion framework that can synthesize photo-realistic novel views while preserving specific details of the input image. High-fidelity 3D GAN inversion is inherently challenging due to the geometry-texture trade-off in 3D inversion, where overfitting to a single view input image often damages the estimated geometry during the late…

    Submitted 28 November, 2022; v1 submitted 28 November, 2022; originally announced November 2022.

    Comments: Project website: https://ken-ouyang.github.io/HFGI3D/index.html ; Github link: https://github.com/jiaxinxie97/HFGI3D

  41. arXiv:2211.15133  [pdf, other]

    cs.CV cs.AI

    SI-GAT: A method based on improved Graph Attention Network for sonar image classification

    Authors: Can Lei, Huigang Wang, Juan Lei

    Abstract: The existing sonar image classification methods based on deep learning are often analyzed in Euclidean space, only considering the local image features. For this reason, this paper presents a sonar classification method based on improved Graph Attention Network (GAT), namely SI-GAT, which is applicable to multiple types imaging sonar. This method quantifies the correlation relationship between nod…

    Submitted 28 November, 2022; originally announced November 2022.

    Comments: 5 pages, 4 figures

  42. arXiv:2211.02914  [pdf, other]

    cs.CV

    Robust Reflection Removal with Flash-only Cues in the Wild

    Authors: Chenyang Lei, Xudong Jiang, Qifeng Chen

    Abstract: We propose a simple yet effective reflection-free cue for robust reflection removal from a pair of flash and ambient (no-flash) images. The reflection-free cue exploits a flash-only image obtained by subtracting the ambient image from the corresponding flash image in raw data space. The flash-only image is equivalent to an image taken in a dark environment with only a flash on. This flash-only ima…

    Submitted 13 November, 2023; v1 submitted 5 November, 2022; originally announced November 2022.

    Comments: Extension of CVPR 2021 paper [arXiv:2103.04273], submitted to TPAMI. Our source code and dataset are publicly available at http://github.com/ChenyangLEI/flash-reflection-removal

  43. arXiv:2208.11457  [pdf, other]

    cs.IR cs.AI cs.LG

    Scenario-Adaptive and Self-Supervised Model for Multi-Scenario Personalized Recommendation

    Authors: Yuanliang Zhang, Xiaofeng Wang, Jinxin Hu, Ke Gao, Chenyi Lei, Fei Fang

    Abstract: Multi-scenario recommendation is dedicated to retrieve relevant items for users in multiple scenarios, which is ubiquitous in industrial recommendation systems. These scenarios enjoy portions of overlaps in users and items, while the distribution of different scenarios is different. The key point of multi-scenario modeling is to efficiently maximize the use of whole-scenario information and granul…

    Submitted 24 August, 2022; originally announced August 2022.

    Comments: Accepted by CIKM 2022

  44. arXiv:2208.04314  [pdf]

    q-bio.QM cs.LG

    TripHLApan: predicting HLA molecules binding peptides based on triple coding matrix and transfer learning

    Authors: Meng Wang, Chuqi Lei, Jianxin Wang, Yaohang Li, Min Li

    Abstract: Human leukocyte antigen (HLA) is an important molecule family in the field of human immunity, which recognizes foreign threats and triggers immune responses by presenting peptides to T cells. In recent years, the synthesis of tumor vaccines to induce specific immune responses has become the forefront of cancer treatment. Computationally modeling the binding patterns between peptide and HLA can gre…

    Submitted 6 August, 2022; originally announced August 2022.

    Comments: 25 pages, 7 figures

  45. arXiv:2205.14837  [pdf, other]

    cs.IR cs.AI

    Enhancing Sequential Recommendation with Graph Contrastive Learning

    Authors: Yixin Zhang, Yong Liu, Yonghui Xu, Hao Xiong, Chenyi Lei, Wei He, Lizhen Cui, Chunyan Miao

    Abstract: The sequential recommendation systems capture users' dynamic behavior patterns to predict their next interaction behaviors. Most existing sequential recommendation methods only exploit the local context information of an individual interaction sequence and learn model parameters solely based on the item prediction loss. Thus, they usually fail to learn appropriate sequence representations. This pa…

    Submitted 6 June, 2022; v1 submitted 29 May, 2022; originally announced May 2022.

    Comments: 8 pages, 3 figures, Accepted by IJCAI 2022

  46. arXiv:2201.11632  [pdf, other]

    cs.CV cs.AI

    Deep Video Prior for Video Consistency and Propagation

    Authors: Chenyang Lei, Yazhou Xing, Hao Ouyang, Qifeng Chen

    Abstract: Applying an image processing algorithm independently to each video frame often leads to temporal inconsistency in the resulting video. To address this issue, we present a novel and general approach for blind video temporal consistency. Our method is only trained on a pair of original and processed videos directly instead of a large dataset. Unlike most previous methods that enforce temporal consis…

    Submitted 27 January, 2022; originally announced January 2022.

    Comments: Accepted by TPAMI in Dec 2021; extension of NeurIPS2020 Blind Video Temporal Consistency via Deep Video Prior. arXiv admin note: substantial text overlap with arXiv:2010.11838

  47. arXiv:2201.02791  [pdf, other]

    cs.LG cs.AI

    Scaling Knowledge Graph Embedding Models

    Authors: Nasrullah Sheikh, Xiao Qin, Berthold Reinwald, Chuan Lei

    Abstract: Developing scalable solutions for training Graph Neural Networks (GNNs) for link prediction tasks is challenging due to the high data dependencies which entail high computational cost and huge memory footprint. We propose a new method for scaling training of knowledge graph embedding models for link prediction to address these challenges. Towards this end, we propose the following algorithmic stra…

    Submitted 8 January, 2022; originally announced January 2022.

  48. arXiv:2112.11377  [pdf, other]

    cs.CV

    Shape from Polarization for Complex Scenes in the Wild

    Authors: Chenyang Lei, Chenyang Qi, Jiaxin Xie, Na Fan, Vladlen Koltun, Qifeng Chen

    Abstract: We present a new data-driven approach with physics-based priors to scene-level normal estimation from a single polarization image. Existing shape from polarization (SfP) works mainly focus on estimating the normal of a single object rather than complex scenes in the wild. A key barrier to high-quality scene-level SfP is the lack of real-world SfP data in complex scenes. Hence, we contribute the fi…

    Submitted 20 April, 2022; v1 submitted 21 December, 2021; originally announced December 2021.

    Comments: Accepted to CVPR 2022; Github link: https://github.com/ChenyangLEI/sfp-wild ; Project website: https://chenyanglei.github.io/sfpwild/index.html

  49. arXiv:2112.09298  [pdf, other]

    cs.CV

    Human-vehicle Cooperative Visual Perception for Autonomous Driving under Complex Road and Traffic Scenarios

    Authors: Yiyue Zhao, Cailin Lei, Yu Shen, Yuchuan Du, Qijun Chen

    Abstract: Human-vehicle cooperative driving has become the critical technology of autonomous driving, which reduces the workload of human drivers. However, the complex and uncertain road environments bring great challenges to the visual perception of cooperative systems. And the perception characteristics of autonomous driving differ from manual driving a lot. To enhance the visual perception capability of…

    Submitted 21 April, 2022; v1 submitted 16 December, 2021; originally announced December 2021.

  50. arXiv:2108.09195  [pdf, other]

    cs.CV

    Towards Photorealistic Colorization by Imagination

    Authors: Chenyang Lei, Yue Wu, Qifeng Chen

    Abstract: We present a novel approach to automatic image colorization by imitating the imagination process of human experts. Our imagination module is designed to generate color images that are context-correlated with black-and-white photos. Given a black-and-white image, our imagination module firstly extracts the context information, which is then used to synthesize colorful and diverse images using a con…

    Submitted 20 August, 2021; originally announced August 2021.

    Comments: NeurIPS 2021 submission