Showing 1–50 of 87 results for author: Yi, Z

  1. arXiv:2407.01573  [pdf, other]

    cs.RO cs.LG eess.SY math.OC

    Model-Based Diffusion for Trajectory Optimization

    Authors: Chaoyi Pan, Zeji Yi, Guanya Shi, Guannan Qu

    Abstract: Recent advances in diffusion models have demonstrated their strong capabilities in generating high-fidelity samples from complex distributions through an iterative refinement process. Despite the empirical success of diffusion models in motion planning and control, the model-free nature of these methods does not leverage readily available model information and limits their generalization to new sc…

    Submitted 28 May, 2024; originally announced July 2024.

    Comments: Website: https://lecar-lab.github.io/mbd/

  2. arXiv:2406.13532  [pdf, other]

    cs.CV

    SALI: Short-term Alignment and Long-term Interaction Network for Colonoscopy Video Polyp Segmentation

    Authors: Qiang Hu, Zhenyu Yi, Ying Zhou, Fang Peng, Mei Liu, Qiang Li, Zhiwei Wang

    Abstract: Colonoscopy videos provide richer information in polyp segmentation for rectal cancer diagnosis. However, the endoscope's fast moving and close-up observing make the current methods suffer from large spatial incoherence and continuous low-quality frames, and thus yield limited segmentation accuracy. In this context, we focus on robust video polyp segmentation by enhancing the adjacent feature cons…

    Submitted 19 June, 2024; originally announced June 2024.

    Comments: Accepted to MICCAI 2024. Code and models: https://github.com/Scatteredrain/SALI

  3. arXiv:2406.13392  [pdf, other]

    cs.CV

    Strengthening Layer Interaction via Dynamic Layer Attention

    Authors: Kaishen Wang, Xun Xia, Jian Liu, Zhang Yi, Tao He

    Abstract: In recent years, employing layer attention to enhance interaction among hierarchical layers has proven to be a significant advancement in building network structures. In this paper, we delve into the distinction between layer attention and the general attention mechanism, noting that existing layer attention methods achieve layer interaction on fixed feature maps in a static manner. These static l…

    Submitted 19 June, 2024; originally announced June 2024.

    Comments: Accepted by IJCAI2024

  4. arXiv:2405.10890  [pdf, other]

    astro-ph.IM astro-ph.GA cs.AI

    A Versatile Framework for Analyzing Galaxy Image Data by Implanting Human-in-the-loop on a Large Vision Model

    Authors: Mingxiang Fu, Yu Song, Jiameng Lv, Liang Cao, Peng Jia, Nan Li, Xiangru Li, Jifeng Liu, A-Li Luo, Bo Qiu, Shiyin Shen, Liangping Tu, Lili Wang, Shoulin Wei, Haifeng Yang, Zhenping Yi, Zhiqiang Zou

    Abstract: The exponential growth of astronomical datasets provides an unprecedented opportunity for humans to gain insight into the Universe. However, effectively analyzing this vast amount of data poses a significant challenge. Astronomers are turning to deep learning techniques to address this, but the methods are limited by their specific training sets, leading to considerable duplicate workloads too. He…

    Submitted 17 May, 2024; originally announced May 2024.

    Comments: 26 pages, 10 figures, to be published on Chinese Physics C

  5. arXiv:2405.09459  [pdf, other]

    cs.CV cs.AI

    Fourier Boundary Features Network with Wider Catchers for Glass Segmentation

    Authors: Xiaolin Qin, Jiacen Liu, Qianlei Wang, Shaolin Zhang, Fei Zhu, Zhang Yi

    Abstract: Glass largely blurs the boundary between the real world and the reflection. The special transmittance and reflectance quality have confused the semantic tasks related to machine vision. Therefore, how to clear the boundary built by glass, and avoid over-capturing features as false positive information in deep structure, matters for constraining the segmentation of reflection surface and penetratin…

    Submitted 15 May, 2024; originally announced May 2024.

  6. arXiv:2404.18598  [pdf, other]

    cs.CV cs.GR

    Anywhere: A Multi-Agent Framework for Reliable and Diverse Foreground-Conditioned Image Inpainting

    Authors: Tianyidan Xie, Rui Ma, Qian Wang, Xiaoqian Ye, Feixuan Liu, Ying Tai, Zhenyu Zhang, Zili Yi

    Abstract: Recent advancements in image inpainting, particularly through diffusion modeling, have yielded promising outcomes. However, when tested in scenarios involving the completion of images based on the foreground objects, current methods that aim to inpaint an image in an end-to-end manner encounter challenges such as "over-imagination", inconsistency between foreground and background, and limited dive…

    Submitted 29 April, 2024; originally announced April 2024.

    Comments: 16 pages, 9 figures, project page: https://anywheremultiagent.github.io

  7. arXiv:2404.15638  [pdf, other]

    cs.CV cs.AI

    PriorNet: A Novel Lightweight Network with Multidimensional Interactive Attention for Efficient Image Dehazing

    Authors: Yutong Chen, Zhang Wen, Chao Wang, Lei Gong, Zhongchao Yi

    Abstract: Hazy images degrade visual quality, and dehazing is a crucial prerequisite for subsequent processing tasks. Most current dehazing methods rely on neural networks and face challenges such as high computational parameter pressure and weak generalization capabilities. This paper introduces PriorNet--a novel, lightweight, and highly applicable dehazing network designed to significantly improve the cla…

    Submitted 24 April, 2024; originally announced April 2024.

    Comments: 8 pages, 4 figures

  8. arXiv:2404.03326  [pdf, other]

    cs.IR

    A Directional Diffusion Graph Transformer for Recommendation

    Authors: Zixuan Yi, Xi Wang, Iadh Ounis

    Abstract: In real-world recommender systems, implicitly collected user feedback, while abundant, often includes noisy false-positive and false-negative interactions. The possible misinterpretations of the user-item interactions pose a significant challenge for traditional graph neural recommenders. These approaches aggregate the users' or items' neighbours based on implicit user-item interactions in order t…

    Submitted 4 April, 2024; originally announced April 2024.

  9. arXiv:2404.02505  [pdf, other]

    cs.CL cs.AI

    Dynamic Demonstration Retrieval and Cognitive Understanding for Emotional Support Conversation

    Authors: Zhe Xu, Daoyuan Chen, Jiayi Kuang, Zihao Yi, Yaliang Li, Ying Shen

    Abstract: Emotional Support Conversation (ESC) systems are pivotal in providing empathetic interactions, aiding users through negative emotional states by understanding and addressing their unique experiences. In this paper, we tackle two key challenges in ESC: enhancing contextually relevant and empathetic response generation through dynamic demonstration retrieval, and advancing cognitive understanding to…

    Submitted 3 April, 2024; originally announced April 2024.

    Comments: Accepted by SIGIR 2024

    MSC Class: 68T50; ACM Class: I.2.7

  10. arXiv:2404.01188  [pdf, other]

    cs.CV

    MonoBox: Tightness-free Box-supervised Polyp Segmentation using Monotonicity Constraint

    Authors: Qiang Hu, Zhenyu Yi, Ying Zhou, Ting Li, Fan Huang, Mei Liu, Qiang Li, Zhiwei Wang

    Abstract: We propose MonoBox, an innovative box-supervised segmentation method constrained by monotonicity to liberate its training from the user-unfriendly box-tightness assumption. In contrast to conventional box-supervised segmentation, where the box edges must precisely touch the target boundaries, MonoBox leverages imprecisely-annotated boxes to achieve robust pixel-wise segmentation. The 'linchpin' is…

    Submitted 24 June, 2024; v1 submitted 1 April, 2024; originally announced April 2024.

  11. arXiv:2403.15981  [pdf, other]

    cs.CV

    Exploring Accurate 3D Phenotyping in Greenhouse through Neural Radiance Fields

    Authors: Junhong Zhao, Wei Ying, Yaoqiang Pan, Zhenfeng Yi, Chao Chen, Kewei Hu, Hanwen Kang

    Abstract: Accurate collection of plant phenotyping is critical to optimising sustainable farming practices in precision agriculture. Traditional phenotyping in controlled laboratory environments, while valuable, falls short in understanding plant growth under real-world conditions. Emerging sensor and digital technologies offer a promising approach for direct phenotyping of plants in farm environments. This…

    Submitted 28 March, 2024; v1 submitted 23 March, 2024; originally announced March 2024.

  12. arXiv:2403.11053  [pdf, other]

    cs.CV

    OSTAF: A One-Shot Tuning Method for Improved Attribute-Focused T2I Personalization

    Authors: Ye Wang, Zili Yi, Rui Ma

    Abstract: Personalized text-to-image (T2I) models not only produce lifelike and varied visuals but also allow users to tailor the images to fit their personal taste. These personalization techniques can grasp the essence of a concept through a collection of images, or adjust a pre-trained text-to-image model with a specific image input for subject-driven or attribute-aware guidance. Yet, accurately capturin…

    Submitted 16 March, 2024; originally announced March 2024.

  13. arXiv:2403.10166  [pdf, other]

    cs.CV

    SemanticHuman-HD: High-Resolution Semantic Disentangled 3D Human Generation

    Authors: Peng Zheng, Tao Liu, Zili Yi, Rui Ma

    Abstract: With the development of neural radiance fields and generative models, numerous methods have been proposed for learning 3D human generation from 2D images. These methods allow control over the pose of the generated 3D human and enable rendering from different viewpoints. However, none of these methods explore semantic disentanglement in human image synthesis, i.e., they can not disentangle the gene…

    Submitted 15 March, 2024; originally announced March 2024.

    Comments: 26 pages, 14 figures

    ACM Class: I.2.10

  14. arXiv:2403.10012  [pdf, other]

    cs.CV cs.RO eess.IV physics.optics

    Real-World Computational Aberration Correction via Quantized Domain-Mixing Representation

    Authors: Qi Jiang, Zhonghua Yi, Shaohua Gao, Yao Gao, Xiaolong Qian, Hao Shi, Lei Sun, Zhijie Xu, Kailun Yang, Kaiwei Wang

    Abstract: Relying on paired synthetic data, existing learning-based Computational Aberration Correction (CAC) methods are confronted with the intricate and multifaceted synthetic-to-real domain gap, which leads to suboptimal performance in real-world applications. In this paper, in contrast to improving the simulation pipeline, we deliver a novel insight into real-world CAC from the perspective of Unsupervi…

    Submitted 15 March, 2024; originally announced March 2024.

    Comments: Codes and datasets will be made publicly available at https://github.com/zju-jiangqi/QDMR

  15. arXiv:2402.18013  [pdf, other]

    cs.CL cs.AI

    A Survey on Recent Advances in LLM-Based Multi-turn Dialogue Systems

    Authors: Zihao Yi, Jiarui Ouyang, Yuwen Liu, Tianhao Liao, Zhe Xu, Ying Shen

    Abstract: This survey provides a comprehensive review of research on multi-turn dialogue systems, with a particular focus on multi-turn dialogue systems based on large language models (LLMs). This paper aims to (a) give a summary of existing LLMs and approaches for adapting LLMs to downstream tasks; (b) elaborate recent advances in multi-turn dialogue systems, covering both LLM-based open-domain dialogue (O…

    Submitted 27 February, 2024; originally announced February 2024.

    Comments: 35 pages, 10 figures, ACM Computing Surveys

  16. arXiv:2401.08649  [pdf, other]

    cs.NE cs.LG

    Deep Pulse-Coupled Neural Networks

    Authors: Zexiang Yi, Jing Lian, Yunliang Qi, Zhaofei Yu, Huajin Tang, Yide Ma, Jizhao Liu

    Abstract: Spiking Neural Networks (SNNs) capture the information processing mechanism of the brain by taking advantage of spiking neurons, such as the Leaky Integrate-and-Fire (LIF) model neuron, which incorporates temporal dynamics and transmits information via discrete and asynchronous spikes. However, the simplified biological properties of LIF ignore the neuronal coupling and dendritic structure of real…

    Submitted 24 December, 2023; originally announced January 2024.

  17. arXiv:2401.07369  [pdf, other]

    cs.LG cs.RO

    CoVO-MPC: Theoretical Analysis of Sampling-based MPC and Optimal Covariance Design

    Authors: Zeji Yi, Chaoyi Pan, Guanqi He, Guannan Qu, Guanya Shi

    Abstract: Sampling-based Model Predictive Control (MPC) has been a practical and effective approach in many domains, notably model-based reinforcement learning, thanks to its flexibility and parallelizability. Despite its appealing empirical performance, the theoretical understanding, particularly in terms of convergence analysis and hyperparameter tuning, remains absent. In this paper, we characterize the…

    Submitted 14 January, 2024; originally announced January 2024.

    Comments: 32 pages, 4 figures

  18. arXiv:2401.04900  [pdf, other]

    astro-ph.SR astro-ph.IM cs.LG stat.ML

    SPT: Spectral Transformer for Red Giant Stars Age and Mass Estimation

    Authors: Mengmeng Zhang, Fan Wu, Yude Bu, Shanshan Li, Zhenping Yi, Meng Liu, Xiaoming Kong

    Abstract: The age and mass of red giants are essential for understanding the structure and evolution of the Milky Way. Traditional isochrone methods for these estimations are inherently limited due to overlapping isochrones in the Hertzsprung-Russell diagram, while asteroseismology, though more precise, requires high-precision, long-term observations. In response to these challenges, we developed a novel fr…

    Submitted 9 January, 2024; originally announced January 2024.

    Comments: Accepted by A&A

  19. arXiv:2401.02650  [pdf, other]

    cs.LG stat.ML

    Improving sample efficiency of high dimensional Bayesian optimization with MCMC

    Authors: Zeji Yi, Yunyue Wei, Chu Xin Cheng, Kaibo He, Yanan Sui

    Abstract: Sequential optimization methods are often confronted with the curse of dimensionality in high-dimensional spaces. Current approaches under the Gaussian process framework are still burdened by the computational complexity of tracking Gaussian process posteriors and need to partition the optimization problem into small regions to ensure exploration or assume an underlying low-dimensional structure.…

    Submitted 5 January, 2024; originally announced January 2024.

  20. arXiv:2311.17853  [pdf, other]

    cs.LG

    On the Adversarial Robustness of Graph Contrastive Learning Methods

    Authors: Filippo Guerranti, Zinuo Yi, Anna Starovoit, Rafiq Kamel, Simon Geisler, Stephan Günnemann

    Abstract: Contrastive learning (CL) has emerged as a powerful framework for learning representations of images and text in a self-supervised manner while enhancing model robustness against adversarial attacks. More recently, researchers have extended the principles of contrastive learning to graph-structured data, giving birth to the field of graph contrastive learning (GCL). However, whether GCL methods ca…

    Submitted 30 November, 2023; v1 submitted 29 November, 2023; originally announced November 2023.

    Comments: Accepted at NeurIPS 2023 New Frontiers in Graph Learning Workshop (NeurIPS GLFrontiers 2023)

  21. arXiv:2311.10601  [pdf, other]

    cs.CV eess.SP

    Multimodal Indoor Localization Using Crowdsourced Radio Maps

    Authors: Zhaoguang Yi, Xiangyu Wen, Qiyue Xia, Peize Li, Francisco Zampella, Firas Alsehly, Chris Xiaoxuan Lu

    Abstract: Indoor Positioning Systems (IPS) traditionally rely on odometry and building infrastructures like WiFi, often supplemented by building floor plans for increased accuracy. However, the limitation of floor plans in terms of availability and timeliness of updates challenges their wide applicability. In contrast, the proliferation of smartphones and WiFi-enabled robots has made crowdsourced radio maps…

    Submitted 12 March, 2024; v1 submitted 17 November, 2023; originally announced November 2023.

    Comments: 7 pages, 4 figures; ICRA'24 https://youtu.be/NTTKwJBFN5w

  22. arXiv:2311.04150   

    cs.HC

    What Makes a Fantastic Passenger-Car Driver in Urban Contexts?

    Authors: Yueteng Yu, Zhijie Yi, Xinyu Yang, Mengdi Chu, Junrong Lu, Xiang Chang, Yiyao Liu, Jingli Qin, Ye Jin, Jialin Song, Xingrui Gu, Jirui Yuan, Guyue Zhou, Jiangtao Gong

    Abstract: The accurate evaluation of the quality of driving behavior is crucial for optimizing and implementing autonomous driving technology in practice. However, there is no comprehensive understanding of good driving behaviors currently. In this paper, we sought to understand driving behaviors from the perspectives of both drivers and passengers. We invited 10 expert drivers and 14 novice drivers to comp…

    Submitted 12 November, 2023; v1 submitted 7 November, 2023; originally announced November 2023.

    Comments: Part of the content of the paper will be modified. One of the authors has recommended its withdrawal due to personal reasons

  23. arXiv:2310.20343  [pdf, other]

    cs.IR cs.MM

    Large Multi-modal Encoders for Recommendation

    Authors: Zixuan Yi, Zijun Long, Iadh Ounis, Craig Macdonald, Richard Mccreadie

    Abstract: In recent years, the rapid growth of online multimedia services, such as e-commerce platforms, has necessitated the development of personalised recommendation approaches that can encode diverse content about each item. Indeed, modern multi-modal recommender systems exploit diverse features obtained from raw images and item descriptions to enhance the recommendation performance. However, the existi…

    Submitted 3 November, 2023; v1 submitted 31 October, 2023; originally announced October 2023.

  24. arXiv:2309.13037  [pdf, other]

    cs.RO

    GELLO: A General, Low-Cost, and Intuitive Teleoperation Framework for Robot Manipulators

    Authors: Philipp Wu, Yide Shentu, Zhongke Yi, Xingyu Lin, Pieter Abbeel

    Abstract: Imitation learning from human demonstrations is a powerful framework to teach robots new skills. However, the performance of the learned policies is bottlenecked by the quality, scale, and variety of the demonstration data. In this paper, we aim to lower the barrier to collecting large and high-quality human demonstration data by proposing GELLO, a general framework for building low-cost and intui…

    Submitted 22 September, 2023; originally announced September 2023.

  25. arXiv:2309.08642  [pdf, other]

    eess.SY cs.AI cs.LG stat.ME

    A Stochastic Online Forecast-and-Optimize Framework for Real-Time Energy Dispatch in Virtual Power Plants under Uncertainty

    Authors: Wei Jiang, Zhongkai Yi, Li Wang, Hanwei Zhang, Jihai Zhang, Fangquan Lin, Cheng Yang

    Abstract: Aggregating distributed energy resources in power systems significantly increases uncertainties, in particular caused by the fluctuation of renewable energy generation. This issue has driven the necessity of widely exploiting advanced predictive control techniques under uncertainty to ensure long-term economics and decarbonization. In this paper, we propose a real-time uncertainty-aware energy dis…

    Submitted 14 September, 2023; originally announced September 2023.

    Comments: Preprint. Accepted by CIKM 23

  26. arXiv:2308.10685  [pdf, other]

    cs.IR

    Contrastive Graph Prompt-tuning for Cross-domain Recommendation

    Authors: Zixuan Yi, Iadh Ounis, Craig Macdonald

    Abstract: Recommender systems are frequently challenged by the data sparsity problem. One approach to mitigate this issue is through cross-domain recommendation techniques. In a cross-domain context, sharing knowledge between domains can enhance the effectiveness in the target domain. Recent cross-domain methods have employed a pre-training approach, but we argue that these methods often result in suboptima…

    Submitted 3 November, 2023; v1 submitted 21 August, 2023; originally announced August 2023.

  27. arXiv:2308.08137  [pdf, other]

    cs.CV cs.AI

    SYENet: A Simple Yet Effective Network for Multiple Low-Level Vision Tasks with Real-time Performance on Mobile Device

    Authors: Weiran Gou, Ziyao Yi, Yan Xiang, Shaoqing Li, Zibin Liu, Dehui Kong, Ke Xu

    Abstract: With the rapid development of AI hardware accelerators, applying deep learning-based algorithms to solve various low-level vision tasks on mobile devices has gradually become possible. However, two main problems still need to be solved: task-specific algorithms make it difficult to integrate them into a single neural network architecture, and large amounts of parameters make it difficult to achiev…

    Submitted 16 August, 2023; originally announced August 2023.

  28. arXiv:2308.07104  [pdf, other]

    cs.CV cs.RO eess.IV

    FocusFlow: Boosting Key-Points Optical Flow Estimation for Autonomous Driving

    Authors: Zhonghua Yi, Hao Shi, Kailun Yang, Qi Jiang, Yaozu Ye, Ze Wang, Huajian Ni, Kaiwei Wang

    Abstract: Key-point-based scene understanding is fundamental for autonomous driving applications. At the same time, optical flow plays an important role in many vision tasks. However, due to the implicit bias of equal attention on all points, classic data-driven optical flow estimation methods yield less satisfactory performance on key points, limiting their implementations in key-point-critical safety-rele…

    Submitted 22 September, 2023; v1 submitted 14 August, 2023; originally announced August 2023.

    Comments: Accepted to IEEE Transactions on Intelligent Vehicles (T-IV). The source code of FocusFlow will be available at https://github.com/ZhonghuaYi/FocusFlow_official

  29. arXiv:2306.14437  [pdf, other]

    cs.RO cs.AI

    A Self-supervised Contrastive Learning Method for Grasp Outcomes Prediction

    Authors: Chengliang Liu, Binhua Huang, Yiwen Liu, Yuanzhe Su, Ke Mai, Yupo Zhang, Zhengkun Yi, Xinyu Wu

    Abstract: In this paper, we investigate the effectiveness of contrastive learning methods for predicting grasp outcomes in an unsupervised manner. By utilizing a publicly available dataset, we demonstrate that contrastive learning methods perform well on the task of grasp outcomes prediction. Specifically, the dynamic-dictionary-based method with the momentum updating technique achieves a satisfactory accur…

    Submitted 21 September, 2023; v1 submitted 26 June, 2023; originally announced June 2023.

    Comments: Manuscript accepted to RCAR 2023

  30. arXiv:2306.12992  [pdf, other]

    cs.CV eess.IV physics.optics

    Minimalist and High-Quality Panoramic Imaging with PSF-aware Transformers

    Authors: Qi Jiang, Shaohua Gao, Yao Gao, Kailun Yang, Zhonghua Yi, Hao Shi, Lei Sun, Kaiwei Wang

    Abstract: High-quality panoramic images with a Field of View (FoV) of 360° are essential for contemporary panoramic computer vision tasks. However, conventional imaging systems come with sophisticated lens designs and heavy optical components. This disqualifies their usage in many mobile and wearable applications where thin and portable, minimalist imaging systems are desired. In this paper, we propose a Pa…

    Submitted 4 July, 2024; v1 submitted 22 June, 2023; originally announced June 2023.

    Comments: Accepted to IEEE Transactions on Image Processing (TIP). The dataset and code will be available at https://github.com/zju-jiangqi/PCIE-PART

  31. Adaptive Learning based Upper-Limb Rehabilitation Training System with Collaborative Robot

    Authors: Jun Hong Lim, Kaibo He, Zeji Yi, Chen Hou, Chen Zhang, Yanan Sui, Luming Li

    Abstract: Rehabilitation training for patients with motor disabilities usually requires specialized devices in rehabilitation centers. Home-based multi-purpose training would significantly increase treatment accessibility and reduce medical costs. While it is unlikely to equip a set of rehabilitation robots at home, we investigate the feasibility to use the general-purpose collaborative robot for rehabilita…

    Submitted 12 July, 2023; v1 submitted 17 May, 2023; originally announced May 2023.

    Journal ref: EMBC2023

  32. arXiv:2303.07625  [pdf, other]

    cs.CV

    PlanarTrack: A Large-scale Challenging Benchmark for Planar Object Tracking

    Authors: Xinran Liu, Xiaoqiong Liu, Ziruo Yi, Xin Zhou, Thanh Le, Libo Zhang, Yan Huang, Qing Yang, Heng Fan

    Abstract: Planar object tracking is a critical computer vision problem and has drawn increasing interest owing to its key roles in robotics, augmented reality, etc. Despite rapid progress, its further development, especially in the deep learning era, is largely hindered due to the lack of large-scale challenging benchmarks. Addressing this, we introduce PlanarTrack, a large-scale challenging planar tracking…

    Submitted 14 March, 2023; originally announced March 2023.

    Comments: Tech. Report

  33. arXiv:2301.13402  [pdf, other]

    cs.CV eess.IV

    ReGANIE: Rectifying GAN Inversion Errors for Accurate Real Image Editing

    Authors: Bingchuan Li, Tianxiang Ma, Peng Zhang, Miao Hua, Wei Liu, Qian He, Zili Yi

    Abstract: The StyleGAN family succeed in high-fidelity image generation and allow for flexible and plausible editing of generated images by manipulating the semantic-rich latent style space. However, projecting a real image into its latent space encounters an inherent trade-off between inversion quality and editability. Existing encoder-based or optimization-based StyleGAN inversion methods attempt to mitiga…

    Submitted 30 January, 2023; originally announced January 2023.

  34. arXiv:2211.03885  [pdf, other]

    cs.CV eess.IV

    Learned Smartphone ISP on Mobile GPUs with Deep Learning, Mobile AI & AIM 2022 Challenge: Report

    Authors: Andrey Ignatov, Radu Timofte, Shuai Liu, Chaoyu Feng, Furui Bai, Xiaotao Wang, Lei Lei, Ziyao Yi, Yan Xiang, Zibin Liu, Shaoqing Li, Keming Shi, Dehui Kong, Ke Xu, Minsu Kwon, Yaqi Wu, Jiesi Zheng, Zhihao Fan, Xun Wu, Feng Zhang, Albert No, Minhyeok Cho, Zewen Chen, Xiaze Zhang, Ran Li, et al. (13 additional authors not shown)

    Abstract: The role of mobile cameras increased dramatically over the past few years, leading to more and more research in automatic image quality enhancement and RAW photo processing. In this Mobile AI challenge, the target was to develop an efficient end-to-end AI-based image signal processing (ISP) pipeline replacing the standard mobile ISPs that can run on modern smartphone GPUs using TensorFlow Lite. Th…

    Submitted 7 November, 2022; originally announced November 2022.

  35. arXiv:2208.14449  [pdf]

    eess.IV cs.CV cs.LG

    A Learning-Based 3D EIT Image Reconstruction Method

    Authors: Zhaoguang Yi, Zhou Chen, Yunjie Yang

    Abstract: Deep learning has been widely employed to solve the Electrical Impedance Tomography (EIT) image reconstruction problem. Most existing physical model-based and learning-based approaches focus on 2D EIT image reconstruction. However, when they are directly extended to the 3D domain, the reconstruction performance in terms of image quality and noise robustness is hardly guaranteed mainly due to the s…

    Submitted 30 August, 2022; originally announced August 2022.

    Journal ref: Proceedings of the International Conference of Bioelectromagnetism, Electrical Bioimpedance, and Electrical Impedance Tomography. June 28 to July 1, 2022 Kyung Hee University, Seoul, Korea

  36. arXiv:2207.06841  [pdf, ps, other]

    cs.LG cs.CV

    Deep Dictionary Learning with An Intra-class Constraint

    Authors: Xia Yuan, Jianping Gou, Baosheng Yu, Jiali Yu, Zhang Yi

    Abstract: In recent years, deep dictionary learning (DDL) has attracted a great amount of attention due to its effectiveness for representation learning and visual recognition. However, most existing methods focus on unsupervised deep dictionary learning, failing to further explore the category information. To make full use of the category information of different samples, we propose a novel deep dictionary…

    Submitted 14 July, 2022; originally announced July 2022.

    Comments: 6 pages, 3 figures, 2 tables. It has been accepted in ICME2022

  37. arXiv:2207.04660  [pdf, other]

    cs.CL cs.IR

    SummScore: A Comprehensive Evaluation Metric for Summary Quality Based on Cross-Encoder

    Authors: Wuhang Lin, Shasha Li, Chen Zhang, Bin Ji, Jie Yu, Jun Ma, Zibo Yi

    Abstract: Text summarization models are often trained to produce summaries that meet human quality requirements. However, the existing evaluation metrics for summary text are only rough proxies for summary quality, suffering from low correlation with human scoring and inhibition of summary diversity. To solve these problems, we propose SummScore, a comprehensive metric for summary quality evaluation based o…

    Submitted 11 July, 2022; originally announced July 2022.

    Comments: Accepted to APWeb-WAIM2022

  38. arXiv:2207.04656  [pdf, other]

    cs.IR cs.CL

    Topic-Grained Text Representation-based Model for Document Retrieval

    Authors: Mengxue Du, Shasha Li, Jie Yu, Jun Ma, Bin Ji, Huijun Liu, Wuhang Lin, Zibo Yi

    Abstract: Document retrieval enables users to find their required documents accurately and quickly. To satisfy the requirement of retrieval efficiency, prevalent deep neural methods adopt a representation-based matching paradigm, which saves online matching time by pre-storing document representations offline. However, the above paradigm consumes vast local storage space, especially when storing the documen…

    Submitted 11 July, 2022; originally announced July 2022.

    Comments: Accepted to ICANN2022

  39. arXiv:2204.05084  [pdf, other]

    cs.CV

    XMP-Font: Self-Supervised Cross-Modality Pre-training for Few-Shot Font Generation

    Authors: Wei Liu, Fangyue Liu, Fei Ding, Qian He, Zili Yi

    Abstract: Generating a new font library is a very labor-intensive and time-consuming job for glyph-rich scripts. Few-shot font generation is thus required, as it requires only a few glyph references without fine-tuning during test. Existing methods follow the style-content disentanglement paradigm and expect novel fonts to be produced by combining the style codes of the reference glyphs and the content repr…

    Submitted 5 May, 2022; v1 submitted 11 April, 2022; originally announced April 2022.

    Comments: Accepted by CVPR2022

  40. arXiv:2203.04564  [pdf, other]

    cs.CV

    Region-Aware Face Swapping

    Authors: Chao Xu, Jiangning Zhang, Miao Hua, Qian He, Zili Yi, Yong Liu

    Abstract: This paper presents a novel Region-Aware Face Swapping (RAFSwap) network to achieve identity-consistent harmonious high-resolution face generation in a local-global manner: 1) Local Facial Region-Aware (FRA) branch augments local identity-relevant features by introducing the Transformer to effectively model misaligned cross-scale semantic interaction. 2) Global Source Feature-Ada…

    Submitted 17 March, 2022; v1 submitted 9 March, 2022; originally announced March 2022.

  41. arXiv:2203.00836  [pdf, other]

    cs.LG q-bio.BM

    CandidateDrug4Cancer: An Open Molecular Graph Learning Benchmark on Drug Discovery for Cancer

    Authors: Xianbin Ye, Ziliang Li, Fei Ma, Zongbi Yi, Pengyong Li, Jun Wang, Peng Gao, Yixuan Qiao, Guotong Xie

    Abstract: Anti-cancer drug discoveries have been serendipitous, we sought to present the Open Molecular Graph Learning Benchmark, named CandidateDrug4Cancer, a challenging and realistic benchmark dataset to facilitate scalable, robust, and reproducible graph machine learning research for anti-cancer drug discovery. CandidateDrug4Cancer dataset encompasses multiple most-mentioned 29 targets for cancer, cover…

    Submitted 21 August, 2022; v1 submitted 1 March, 2022; originally announced March 2022.

    Comments: Accepted by Workshop on Graph Learning Benchmarks, The Web Conference 2021

  42. arXiv:2203.00386  [pdf, other]

    cs.CV

    CLIP-GEN: Language-Free Training of a Text-to-Image Generator with CLIP

    Authors: Zihao Wang, Wei Liu, Qian He, Xinglong Wu, Zili Yi

    Abstract: Training a text-to-image generator in the general domain (e.g., Dall.e, CogView) requires huge amounts of paired text-image data, which is too expensive to collect. In this paper, we propose a self-supervised scheme named CLIP-GEN for general text-to-image generation with the language-image priors extracted with a pre-trained CLIP model. In our approach, we only require a set of unlabeled image…

    Submitted 1 March, 2022; originally announced March 2022.

  43. arXiv:2202.13804  [pdf, other]

    eess.IV cs.CV

    RestainNet: a self-supervised digital re-stainer for stain normalization

    Authors: Bingchao Zhao, Jiatai Lin, Changhong Liang, Zongjian Yi, Xin Chen, Bingbing Li, Weihao Qiu, Danyi Li, Li Liang, Chu Han, Zaiyi Liu

    Abstract: Color inconsistency is an inevitable challenge in computational pathology, which generally happens because of stain intensity variations or sections scanned by different scanners. It harms the pathological image analysis methods, especially the learning-based models. A series of approaches have been proposed for stain normalization. However, most of them lack flexibility in practice. In this p…

    Submitted 28 February, 2022; originally announced February 2022.

  44. arXiv:2201.11290  [pdf]

    cs.LG

    Stock2Vec: An Embedding to Improve Predictive Models for Companies

    Authors: Ziruo Yi, Ting Xiao, Kaz-Onyeakazi Ijeoma, Ratnam Cheran, Yuvraj Baweja, Phillip Nelson

    Abstract: Building predictive models for companies often relies on inference using historical data of companies in the same industry sector. However, companies are similar across a variety of dimensions that should be leveraged in relevant prediction problems. This is particularly true for large, complex organizations which may not be well defined by a single industry and have no clear peers. To enable pred…

    Submitted 26 January, 2022; originally announced January 2022.

  45. arXiv:2112.11224  [pdf, other]

    cs.CV eess.SP

    Attention-Based Sensor Fusion for Human Activity Recognition Using IMU Signals

    Authors: Wenjin Tao, Haodong Chen, Md Moniruzzaman, Ming C. Leu, Zhaozheng Yi, Ruwen Qin

    Abstract: Human Activity Recognition (HAR) using wearable devices such as smart watches embedded with Inertial Measurement Unit (IMU) sensors has various applications relevant to our daily life, such as workout tracking and health monitoring. In this paper, we propose a novel attention-based approach to human activity recognition using multiple IMU sensors worn at different body locations. Firstly, a sensor…

    Submitted 20 December, 2021; originally announced December 2021.

  46. arXiv:2112.05409  [pdf, other]

    cs.LG

    Batch Label Inference and Replacement Attacks in Black-Boxed Vertical Federated Learning

    Authors: Yang Liu, Tianyuan Zou, Yan Kang, Wenhan Liu, Yuanqin He, Zhihao Yi, Qiang Yang

    Abstract: In a vertical federated learning (VFL) scenario where features and model are split into different parties, communications of sample-specific updates are required for correct gradient calculations but can be used to deduce important sample-level label information. An immediate defense strategy is to protect sample-level messages communicated with Homomorphic Encryption (HE), and in this way only th…

    Submitted 11 February, 2022; v1 submitted 10 December, 2021; originally announced December 2021.

    Comments: 13 pages, 9 figures, 3 tables, related previous work see arXiv:2007.03608

  47. arXiv:2111.03574  [pdf, other]

    cs.CV

    Spatial-Temporal Residual Aggregation for High Resolution Video Inpainting

    Authors: Vishnu Sanjay Ramiya Srinivasan, Rui Ma, Qiang Tang, Zili Yi, Zhan Xu

    Abstract: Recent learning-based inpainting algorithms have achieved compelling results for completing missing regions after removing undesired objects in videos. To maintain the temporal consistency among the frames, 3D spatial and temporal operations are often heavily used in the deep networks. However, these methods usually suffer from memory constraints and can only handle low resolution videos. We propo…

    Submitted 5 November, 2021; originally announced November 2021.

    Comments: Accepted by BMVC 2021. Project page: https://github.com/Ascend-Research/STRA_Net

  48. arXiv:2109.10760  [pdf, other]

    cs.CV

    FaceEraser: Removing Facial Parts for Augmented Reality

    Authors: Miao Hua, Lijie Liu, Ziyang Cheng, Qian He, Bingchuan Li, Zili Yi

    Abstract: Our task is to remove all facial parts (e.g., eyebrows, eyes, mouth and nose), and then impose visual elements onto the "blank" face for augmented reality. Conventional object removal methods rely on image inpainting techniques (e.g., EdgeConnect, HiFill) that are trained in a self-supervised manner with randomly manipulated image pairs. Specifically, given a set of natural images, randomly mask…

    Submitted 22 October, 2021; v1 submitted 22 September, 2021; originally announced September 2021.

    Comments: 18 pages, 15 figures. ICCV 2021, Fifth Workshop on Computer Vision for AR/VR

  49. arXiv:2109.10737  [pdf, other]

    cs.CV

    DyStyle: Dynamic Neural Network for Multi-Attribute-Conditioned Style Editing

    Authors: Bingchuan Li, Shaofei Cai, Wei Liu, Peng Zhang, Qian He, Miao Hua, Zili Yi

    Abstract: The semantic controllability of StyleGAN is enhanced by unremitting research. Although the existing weak supervision methods work well in manipulating the style codes along one attribute, the accuracy of manipulating multiple attributes is neglected. Multi-attribute representations are prone to entanglement in the StyleGAN latent space, while sequential editing leads to error accumulation. To addr…

    Submitted 28 September, 2022; v1 submitted 22 September, 2021; originally announced September 2021.

    Comments: Accepted to WACV 2023, 19 pages, 20 figures

  50. arXiv:2106.00329  [pdf, other]

    cs.CV

    Consistent Two-Flow Network for Tele-Registration of Point Clouds

    Authors: Zihao Yan, Zimu Yi, Ruizhen Hu, Niloy J. Mitra, Daniel Cohen-Or, Hui Huang

    Abstract: Rigid registration of partial observations is a fundamental problem in various applied fields. In computer graphics, special attention has been given to the registration between two partial point clouds generated by scanning devices. State-of-the-art registration techniques still struggle when the overlap region between the two point clouds is small, and completely fail if there is no overlap betw…

    Submitted 10 October, 2021; v1 submitted 1 June, 2021; originally announced June 2021.

    Comments: Accepted to IEEE TVCG 2021, project page at https://vcc.tech/research/2021/CTFNet