Showing 1–50 of 63 results for author: An, W

  1. arXiv:2406.14482  [pdf, other]

    cs.CV

    Visible-Thermal Tiny Object Detection: A Benchmark Dataset and Baselines

    Authors: Xinyi Ying, Chao Xiao, Ruojing Li, Xu He, Boyang Li, Zhaoxu Li, Yingqian Wang, Mingyuan Hu, Qingyu Xu, Zaiping Lin, Miao Li, Shilin Zhou, Wei An, Weidong Sheng, Li Liu

    Abstract: Small object detection (SOD) has been a longstanding yet challenging task for decades, with numerous datasets and algorithms being developed. However, they mainly focus on either visible or thermal modality, while visible-thermal (RGBT) bimodality is rarely explored. Although some RGBT datasets have been developed recently, the insufficient quantity, limited category, misaligned images and large t…

    Submitted 20 June, 2024; originally announced June 2024.

  2. arXiv:2406.12718  [pdf, other]

    cs.CV cs.AI cs.CL

    AGLA: Mitigating Object Hallucinations in Large Vision-Language Models with Assembly of Global and Local Attention

    Authors: Wenbin An, Feng Tian, Sicong Leng, Jiahao Nie, Haonan Lin, QianYing Wang, Guang Dai, Ping Chen, Shijian Lu

    Abstract: Despite their great success across various multimodal tasks, Large Vision-Language Models (LVLMs) are facing a prevalent problem with object hallucinations, where the generated textual responses are inconsistent with ground-truth objects in the given image. This paper investigates various LVLMs and pinpoints attention deficiency toward discriminative local image features as one root cause of objec…

    Submitted 21 June, 2024; v1 submitted 18 June, 2024; originally announced June 2024.

  3. arXiv:2406.09121  [pdf, other]

    cs.CV

    MMRel: A Relation Understanding Dataset and Benchmark in the MLLM Era

    Authors: Jiahao Nie, Gongjie Zhang, Wenbin An, Yap-Peng Tan, Alex C. Kot, Shijian Lu

    Abstract: Despite the recent advancements in Multi-modal Large Language Models (MLLMs), understanding inter-object relations, i.e., interactions or associations between distinct objects, remains a major challenge for such models. This issue significantly hinders their advanced reasoning capabilities and is primarily due to the lack of large-scale, high-quality, and diverse multi-modal data essential for tra…

    Submitted 13 June, 2024; originally announced June 2024.

  4. arXiv:2405.18679  [pdf, other]

    cs.CV

    Vim-F: Visual State Space Model Benefiting from Learning in the Frequency Domain

    Authors: Juntao Zhang, Kun Bian, Peng Cheng, Wenbo An, Jianning Liu, Jun Zhou

    Abstract: In recent years, State Space Models (SSMs) with efficient hardware-aware designs, known as the Mamba deep learning models, have made significant progress in modeling long sequences such as language understanding. Therefore, building efficient and general-purpose visual backbones based on SSMs is a promising direction. Compared to traditional convolutional neural networks (CNNs) and Vision Transfor…

    Submitted 28 May, 2024; originally announced May 2024.

  5. arXiv:2405.10148  [pdf, other]

    cs.CV

    SpecDETR: A Transformer-based Hyperspectral Point Object Detection Network

    Authors: Zhaoxu Li, Wei An, Gaowei Guo, Longguang Wang, Yingqian Wang, Zaiping Lin

    Abstract: Hyperspectral target detection (HTD) aims to identify specific materials based on spectral information in hyperspectral imagery and can detect point targets, some of which occupy an area smaller than one pixel. However, existing HTD methods are developed based on per-pixel binary classification, which limits the feature representation capability for point targets. In this paper, we rethink the hype…

    Submitted 16 May, 2024; originally announced May 2024.

  6. arXiv:2405.04434  [pdf, other]

    cs.CL cs.AI

    DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model

    Authors: DeepSeek-AI, Aixin Liu, Bei Feng, Bin Wang, Bingxuan Wang, Bo Liu, Chenggang Zhao, Chengqi Dengr, Chong Ruan, Damai Dai, Daya Guo, Dejian Yang, Deli Chen, Dongjie Ji, Erhang Li, Fangyun Lin, Fuli Luo, Guangbo Hao, Guanting Chen, Guowei Li, H. Zhang, Hanwei Xu, Hao Yang, Haowei Zhang, Honghui Ding, et al. (132 additional authors not shown)

    Abstract: We present DeepSeek-V2, a strong Mixture-of-Experts (MoE) language model characterized by economical training and efficient inference. It comprises 236B total parameters, of which 21B are activated for each token, and supports a context length of 128K tokens. DeepSeek-V2 adopts innovative architectures including Multi-head Latent Attention (MLA) and DeepSeekMoE. MLA guarantees efficient inference…

    Submitted 19 June, 2024; v1 submitted 7 May, 2024; originally announced May 2024.

  7. arXiv:2403.19235  [pdf, other]

    cs.CV

    DreamSalon: A Staged Diffusion Framework for Preserving Identity-Context in Editable Face Generation

    Authors: Haonan Lin, Mengmeng Wang, Yan Chen, Wenbin An, Yuzhe Yao, Guang Dai, Qianying Wang, Yong Liu, Jingdong Wang

    Abstract: While large-scale pre-trained text-to-image models can synthesize diverse and high-quality human-centered images, novel challenges arise with a nuanced task of "identity fine editing": precisely modifying specific features of a subject while maintaining its inherent identity and context. Existing personalization methods either require time-consuming optimization or learning additional encoders, ad…

    Submitted 28 March, 2024; originally announced March 2024.

  8. arXiv:2312.16467  [pdf, other]

    cs.CL cs.LG

    Transfer and Alignment Network for Generalized Category Discovery

    Authors: Wenbin An, Feng Tian, Wenkai Shi, Yan Chen, Yaqiang Wu, Qianying Wang, Ping Chen

    Abstract: Generalized Category Discovery is a crucial real-world task. Despite the improved performance on known categories, current methods perform poorly on novel categories. We attribute the poor performance to two reasons: biased knowledge transfer between labeled and unlabeled data and noisy representation learning on the unlabeled data. To mitigate these two issues, we propose a Transfer and Alignment…

    Submitted 27 December, 2023; originally announced December 2023.

    Comments: Accepted by AAAI 2024

  9. arXiv:2312.16057  [pdf, other]

    cs.IT eess.SP

    Semantic Importance-Aware Based for Multi-User Communication Over MIMO Fading Channels

    Authors: Haotai Liang, Zhicheng Bao, Wannian An, Chen Dong, Xiaodong Xu

    Abstract: Semantic communication, as a novel communication paradigm, has attracted the interest of many scholars, with multi-user, multi-input multi-output (MIMO) scenarios being one of the critical contexts. This paper presents a semantic importance-aware based communication system (SIA-SC) over MIMO Rayleigh fading channels. Combining the semantic symbols' inequality and the equivalent subchannels of MIMO…

    Submitted 26 December, 2023; originally announced December 2023.

  10. arXiv:2312.10897  [pdf, other]

    cs.CL cs.AI cs.LG

    Generalized Category Discovery with Large Language Models in the Loop

    Authors: Wenbin An, Wenkai Shi, Feng Tian, Haonan Lin, QianYing Wang, Yaqiang Wu, Mingxiang Cai, Luyan Wang, Yan Chen, Haiping Zhu, Ping Chen

    Abstract: Generalized Category Discovery (GCD) is a crucial task that aims to recognize both known and novel categories from a set of unlabeled data by utilizing a few labeled data with only known categories. Due to the lack of supervision and category information, current methods usually perform poorly on novel categories and struggle to reveal semantic meanings of the discovered clusters, which limits the…

    Submitted 26 May, 2024; v1 submitted 17 December, 2023; originally announced December 2023.

    Comments: Accepted by ACL 2024 Findings, code and data are available at https://github.com/Lackel/LOOP

  11. arXiv:2311.15593  [pdf, other]

    cs.IT cs.PF eess.SP

    Performance Analysis of MDMA-Based Cooperative MRC Networks with Relays in Dissimilar Rayleigh Fading Channels

    Authors: Lei Teng, Wannian An, Chen Dong, Xiaoqi Qin, Xiaodong Xu

    Abstract: Multiple access technology is a key technology in various generations of wireless communication systems. As a potential multiple access technology for the next generation wireless communication systems, model division multiple access (MDMA) technology improves spectrum efficiency and feasibility regions. This implies that the MDMA scheme can achieve greater performance gains compared to traditiona…

    Submitted 27 November, 2023; originally announced November 2023.

    Comments: 6 pages, 4 figures, conference

  12. arXiv:2311.10492  [pdf, other]

    cs.CV

    A Relay System for Semantic Image Transmission based on Shared Feature Extraction and Hyperprior Entropy Compression

    Authors: Wannian An, Zhicheng Bao, Haotai Liang, Chen Dong, Xiaodong

    Abstract: Nowadays, the need for high-quality image reconstruction and restoration is more and more urgent. However, most image transmission systems may suffer from image quality degradation or transmission interruption in the face of interference such as channel noise and link fading. To solve this problem, a relay communication network for semantic image transmission based on shared feature extraction and…

    Submitted 17 November, 2023; originally announced November 2023.

  13. arXiv:2311.09932  [pdf, other]

    cs.CY

    The Communication GSC System with Energy Harvesting Nodes aided by Opportunistic Routing

    Authors: Hanyu Liu, Lei Teng, Wannian An, Xiaoqi Qin, Chen Dong, Xiaodong Xu

    Abstract: In this paper, a cooperative communication network based on energy-harvesting (EH) decode-and-forward (DF) relays is proposed. For relay nodes, there is a harvest-storage-use (HSU) structure in this system, and energy can be obtained from the surrounding environment through energy buffering. In order to improve the performance of the communication system, the opportunistic routing algorithm and the…

    Submitted 16 November, 2023; originally announced November 2023.

  14. arXiv:2310.15836  [pdf, other]

    cs.CL cs.AI cs.LG

    A Diffusion Weighted Graph Framework for New Intent Discovery

    Authors: Wenkai Shi, Wenbin An, Feng Tian, Qinghua Zheng, QianYing Wang, Ping Chen

    Abstract: New Intent Discovery (NID) aims to recognize both new and known intents from unlabeled data with the aid of limited labeled data containing only known intents. Without considering structure relationships between samples, previous methods generate noisy supervisory signals which cannot strike a balance between quantity and quality, hindering the formation of new intent clusters and effective transf…

    Submitted 24 October, 2023; originally announced October 2023.

    Comments: EMNLP 2023 Main

  15. arXiv:2310.10151  [pdf, other]

    cs.LG cs.CL cs.IR

    DNA: Denoised Neighborhood Aggregation for Fine-grained Category Discovery

    Authors: Wenbin An, Feng Tian, Wenkai Shi, Yan Chen, Qinghua Zheng, QianYing Wang, Ping Chen

    Abstract: Discovering fine-grained categories from coarsely labeled data is a practical and challenging task, which can bridge the gap between the demand for fine-grained analysis and the high annotation cost. Previous works mainly focus on instance-level discrimination to learn low-level features, but ignore semantic similarities between data, which may prevent these models learning compact cluster represe…

    Submitted 16 October, 2023; originally announced October 2023.

    Comments: Accepted by EMNLP 2023 Main

  16. arXiv:2308.09084  [pdf, other]

    cs.CV cs.LG

    MovePose: A High-performance Human Pose Estimation Algorithm on Mobile and Edge Devices

    Authors: Dongyang Yu, Haoyue Zhang, Ruisheng Zhao, Guoqi Chen, Wangpeng An, Yanhong Yang

    Abstract: We present MovePose, an optimized lightweight convolutional neural network designed specifically for real-time body pose estimation on CPU-based mobile devices. The current solutions do not provide satisfactory accuracy and speed for human posture estimation, and MovePose addresses this gap. It aims to maintain real-time performance while improving the accuracy of human posture estimation for mobi…

    Submitted 19 April, 2024; v1 submitted 17 August, 2023; originally announced August 2023.

  17. arXiv:2308.04126  [pdf, other]

    cs.CV cs.LG cs.MM cs.SD eess.AS

    OmniDataComposer: A Unified Data Structure for Multimodal Data Fusion and Infinite Data Generation

    Authors: Dongyang Yu, Shihao Wang, Yuan Fang, Wangpeng An

    Abstract: This paper presents OmniDataComposer, an innovative approach for multimodal data fusion and unlimited data generation with an intent to refine and uncomplicate interplay among diverse data modalities. Coming to the core breakthrough, it introduces a cohesive data structure proficient in processing and merging multimodal data inputs, which include video, audio, and text. Our crafted algorithm lev…

    Submitted 17 August, 2023; v1 submitted 8 August, 2023; originally announced August 2023.

  18. arXiv:2307.01004  [pdf, other]

    cs.CV cs.LG

    Joint Coordinate Regression and Association For Multi-Person Pose Estimation, A Pure Neural Network Approach

    Authors: Dongyang Yu, Yunshi Xie, Wangpeng An, Li Zhang, Yufeng Yao

    Abstract: We introduce a novel one-stage end-to-end multi-person 2D pose estimation algorithm, known as Joint Coordinate Regression and Association (JCRA), that produces human pose joints and associations without requiring any post-processing. The proposed algorithm is fast, accurate, effective, and simple. The one-stage end-to-end network architecture significantly improves the inference speed of JCRA. Mea…

    Submitted 19 April, 2024; v1 submitted 3 July, 2023; originally announced July 2023.

    Comments: This paper has been accepted by MMAsia 2023 and is an oral presentation

  19. arXiv:2304.04442  [pdf, other]

    cs.CV

    Monte Carlo Linear Clustering with Single-Point Supervision is Enough for Infrared Small Target Detection

    Authors: Boyang Li, Yingqian Wang, Longguang Wang, Fei Zhang, Ting Liu, Zaiping Lin, Wei An, Yulan Guo

    Abstract: Single-frame infrared small target (SIRST) detection aims at separating small targets from clutter backgrounds on infrared images. Recently, deep learning based methods have achieved promising performance on SIRST detection, but at the cost of a large amount of training data with expensive pixel-level annotations. To reduce the annotation burden, we propose the first method to achieve SIRST detect…

    Submitted 10 April, 2023; originally announced April 2023.

  20. You Only Train Once: Learning a General Anomaly Enhancement Network with Random Masks for Hyperspectral Anomaly Detection

    Authors: Zhaoxu Li, Yingqian Wang, Chao Xiao, Qiang Ling, Zaiping Lin, Wei An

    Abstract: In this paper, we introduce a new approach to address the challenge of generalization in hyperspectral anomaly detection (AD). Our method eliminates the need for adjusting parameters or retraining on new test scenes as required by most existing methods. Employing an image-level training paradigm, we achieve a general anomaly enhancement network for hyperspectral AD that only needs to be trained on…

    Submitted 31 March, 2023; originally announced March 2023.

    Journal ref: TGRS 2023

  21. arXiv:2303.11055  [pdf, other]

    eess.IV cs.CV

    Parameter-Free Channel Attention for Image Classification and Super-Resolution

    Authors: Yuxuan Shi, Lingxiao Yang, Wangpeng An, Xiantong Zhen, Liuqing Wang

    Abstract: The channel attention mechanism is a useful technique widely employed in deep convolutional neural networks to boost the performance for image processing tasks, e.g., image classification and image super-resolution. It is usually designed as a parameterized sub-network and embedded into the convolutional layers of the network to learn more powerful feature representations. However, current channel a…

    Submitted 20 March, 2023; originally announced March 2023.

  22. arXiv:2212.09219  [pdf, other]

    cs.DB

    Modeling and Performance Analysis of Single-Server Database Over Quasi-static Rayleigh Fading Channel

    Authors: Mengying Chen, Wannian An, Yang Liu, Chen Dong, Xiaodong Xu, Boxiao Han, Ping Zhang

    Abstract: Cloud database is the key technology in cloud computing. The effective and efficient service quality of the cloud database is inseparable from communication technology, just as improving communication quality will reduce the concurrency phenomenon in the ticketing system. In order to visually observe the impact of communication on the cloud database, we propose a Communication-Database (C-D) Model…

    Submitted 17 January, 2023; v1 submitted 18 December, 2022; originally announced December 2022.

  23. arXiv:2211.15115  [pdf, other]

    cs.CL cs.AI cs.CV

    Generalized Category Discovery with Decoupled Prototypical Network

    Authors: Wenbin An, Feng Tian, Qinghua Zheng, Wei Ding, QianYing Wang, Ping Chen

    Abstract: Generalized Category Discovery (GCD) aims to recognize both known and novel categories from a set of unlabeled data, based on another dataset labeled with only known categories. Without considering differences between known and novel categories, current methods learn about them in a coupled manner, which can hurt the model's generalization and discriminative ability. Furthermore, the coupled training…

    Submitted 15 March, 2023; v1 submitted 28 November, 2022; originally announced November 2022.

    Comments: Accepted by AAAI 2023

  24. arXiv:2210.07733  [pdf, other]

    cs.CL cs.AI cs.LG

    Fine-grained Category Discovery under Coarse-grained supervision with Hierarchical Weighted Self-contrastive Learning

    Authors: Wenbin An, Feng Tian, Ping Chen, Siliang Tang, Qinghua Zheng, QianYing Wang

    Abstract: Novel category discovery aims at adapting models trained on known categories to novel categories. Previous works only focus on the scenario where known and novel categories are of the same granularity. In this paper, we investigate a new practical scenario called Fine-grained Category Discovery under Coarse-grained supervision (FCDC). FCDC aims at discovering fine-grained categories with only coar…

    Submitted 14 October, 2022; originally announced October 2022.

    Comments: Accepted by EMNLP 2022

  25. MTU-Net: Multi-level TransUNet for Space-based Infrared Tiny Ship Detection

    Authors: Tianhao Wu, Boyang Li, Yihang Luo, Yingqian Wang, Chao Xiao, Ting Liu, Jungang Yang, Wei An, Yulan Guo

    Abstract: Space-based infrared tiny ship detection aims at separating tiny ships from the images captured by earth orbiting satellites. Due to the extremely large image coverage area (e.g., thousands of square kilometers), candidate targets in these images are much smaller, dimmer, and more changeable than those targets observed by aerial-based and land-based imaging devices. Existing short imaging distance-based i…

    Submitted 27 September, 2022; originally announced September 2022.

  26. arXiv:2207.11991  [pdf, other]

    cs.IT

    Soft decoding without soft demapping with ORBGRAND

    Authors: Wei An, Muriel Medard, Ken R. Duffy

    Abstract: For spectral efficiency, higher order modulation symbols confer information on more than one bit. As soft detection forward error correction decoders assume the availability of information at binary granularity, however, soft demappers are required to compute per-bit reliabilities from complex-valued signals. Here we show that the recently introduced universal soft detection decoder ORBGRAND can b…

    Submitted 25 July, 2022; originally announced July 2022.

    Journal ref: 2023 IEEE International Symposium on Information Theory (ISIT)

  27. arXiv:2206.06214  [pdf, other]

    cs.CV eess.IV

    Real-World Light Field Image Super-Resolution via Degradation Modulation

    Authors: Yingqian Wang, Zhengyu Liang, Longguang Wang, Jungang Yang, Wei An, Yulan Guo

    Abstract: Recent years have witnessed the great advances of deep neural networks (DNNs) in light field (LF) image super-resolution (SR). However, existing DNN-based LF image SR methods are developed on a single fixed degradation (e.g., bicubic downsampling), and thus cannot be applied to super-resolve real LF images with diverse degradation. In this paper, we propose a simple yet effective method for real-w…

    Submitted 30 November, 2023; v1 submitted 13 June, 2022; originally announced June 2022.

    Comments: 15 pages, 10 figures

  28. arXiv:2206.05407  [pdf, other]

    cs.IT

    Opportunistic Routing aided Cooperative Communication MRC Network with Energy-Harvesting Nodes

    Authors: Lei Teng, Wannian An, Chen Dong, Xiaodong Xu, Boxiao Han

    Abstract: In this paper, we consider a multi-hop cooperative network founded on two energy-harvesting (EH) decode-and-forward (DF) relays which are provided with harvest-store-use (HSU) architecture to harvest energy from the ambience using the energy buffers. For the sake of boosting the data delivery in this network, maximal ratio combining (MRC) at destination to combine the signals received from source…

    Submitted 2 February, 2023; v1 submitted 10 June, 2022; originally announced June 2022.

    Comments: arXiv admin note: text overlap with arXiv:2205.06482

  29. arXiv:2205.06482  [pdf, other]

    cs.IT

    Opportunistic Routing Aided Cooperative Communication Network with Energy-Harvesting

    Authors: Wannian An, Chen Dong, Xiaodong Xu, Chao Xu, Shujun Han, Lei Teng

    Abstract: In this paper, a cooperative communication network based on energy-harvesting (EH) decode-and-forward (DF) relays that harvest energy from the ambience using buffers with harvest-store-use (HSU) architecture is considered. An opportunistic routing (OR) protocol, which selects the transmission path of a packet based on the node transmission priority, is proposed to improve data delivery in this netwo…

    Submitted 11 June, 2022; v1 submitted 13 May, 2022; originally announced May 2022.

  30. arXiv:2203.01576  [pdf, other]

    cs.CV

    Occlusion-Aware Cost Constructor for Light Field Depth Estimation

    Authors: Yingqian Wang, Longguang Wang, Zhengyu Liang, Jungang Yang, Wei An, Yulan Guo

    Abstract: Matching cost construction is a key step in light field (LF) depth estimation, but was rarely studied in the deep learning era. Recent deep learning-based LF depth estimation methods construct matching cost by sequentially shifting each sub-aperture image (SAI) with a series of predefined offsets, which is complex and time-consuming. In this paper, we propose a simple and fast cost constructor to…

    Submitted 3 March, 2022; originally announced March 2022.

    Comments: Accepted to CVPR 2022

  31. Ordered Reliability Bits Guessing Random Additive Noise Decoding

    Authors: Ken R. Duffy, Wei An, Muriel Medard

    Abstract: Error correction techniques traditionally focus on the co-design of restricted code-structures in tandem with code-specific decoders that are computationally efficient when decoding long codes in hardware. Modern applications are, however, driving demand for ultra-reliable low-latency communications (URLLC), rekindling interest in the performance of shorter, higher-rate error correcting codes, and…

    Submitted 29 August, 2022; v1 submitted 28 February, 2022; originally announced February 2022.

    MSC Class: 94A15; 68P30

  32. Disentangling Light Fields for Super-Resolution and Disparity Estimation

    Authors: Yingqian Wang, Longguang Wang, Gaochang Wu, Jungang Yang, Wei An, Jingyi Yu, Yulan Guo

    Abstract: Light field (LF) cameras record both intensity and directions of light rays, and encode 3D scenes into 4D LF images. Recently, many convolutional neural networks (CNNs) have been proposed for various LF image processing tasks. However, it is challenging for CNNs to effectively process LF images since the spatial and angular information are highly inter-twined with varying disparities. In this pape…

    Submitted 22 July, 2023; v1 submitted 21 February, 2022; originally announced February 2022.

    Comments: We have corrected a mistake in Table 1 and updated Fig. 6 by using HR GT depth maps for evaluation

  33. Detecting and Tracking Small and Dense Moving Objects in Satellite Videos: A Benchmark

    Authors: Qian Yin, Qingyong Hu, Hao Liu, Feng Zhang, Yingqian Wang, Zaiping Lin, Wei An, Yulan Guo

    Abstract: Satellite video cameras can provide continuous observation for a large-scale area, which is important for many remote sensing applications. However, achieving moving object detection and tracking in satellite videos remains challenging due to the insufficient appearance information of objects and lack of high-quality datasets. In this paper, we first build a large-scale satellite video dataset wit…

    Submitted 25 November, 2021; originally announced November 2021.

    Comments: This paper has been accepted by IEEE Transactions on Geoscience and Remote Sensing. Qian Yin and Qingyong Hu have equal contributions to this work and are co-first authors. The dataset is available at https://github.com/QingyongHu/VISO

  34. Dense Dual-Attention Network for Light Field Image Super-Resolution

    Authors: Yu Mo, Yingqian Wang, Chao Xiao, Jungang Yang, Wei An

    Abstract: Light field (LF) images can be used to improve the performance of image super-resolution (SR) because both angular and spatial information is available. It is challenging to incorporate distinctive information from different views for LF image SR. Moreover, the long-term information from the previous layers can be weakened as the depth of network increases. In this paper, we propose a dense dual-a…

    Submitted 22 October, 2021; originally announced October 2021.

    Comments: Accepted by IEEE Transactions on Circuits and Systems for Video Technology

  35. Selective Light Field Refocusing for Camera Arrays Using Bokeh Rendering and Superresolution

    Authors: Yingqian Wang, Jungang Yang, Yulan Guo, Chao Xiao, Wei An

    Abstract: Camera arrays provide spatial and angular information within a single snapshot. With refocusing methods, focal planes can be altered after exposure. In this letter, we propose a light field refocusing method to improve the imaging quality of camera arrays. In our method, the disparity is first estimated. Then, the unfocused region (bokeh) is rendered by using a depth-based anisotropic filter. Fina…

    Submitted 9 August, 2021; originally announced August 2021.

    Journal ref: IEEE Signal Processing Letters, volume: 26, issue: 1, pages: 204-208, 2019

  36. arXiv:2106.00487  [pdf, other]

    cs.CV

    Dense Nested Attention Network for Infrared Small Target Detection

    Authors: Boyang Li, Chao Xiao, Longguang Wang, Yingqian Wang, Zaiping Lin, Miao Li, Wei An, Yulan Guo

    Abstract: Single-frame infrared small target (SIRST) detection aims at separating small targets from clutter backgrounds. With the advances of deep learning, CNN-based methods have yielded promising results in generic object detection due to their powerful modeling capability. However, existing CNN-based methods cannot be directly applied for infrared small targets since pooling layers in their networks cou…

    Submitted 15 August, 2022; v1 submitted 1 June, 2021; originally announced June 2021.

    Comments: Accepted by IEEE Transactions on Image Processing (TIP)

  37. Non-Convex Tensor Low-Rank Approximation for Infrared Small Target Detection

    Authors: Ting Liu, Jungang Yang, Boyang Li, Chao Xiao, Yang Sun, Yingqian Wang, Wei An

    Abstract: Infrared small target detection is an important fundamental task in the infrared system. Therefore, many infrared small target detection methods have been proposed, in which the low-rank model has been used as a powerful tool. However, most low-rank-based methods assign the same weights for different singular values, which will lead to inaccurate background estimation. Considering that different s…

    Submitted 20 November, 2021; v1 submitted 31 May, 2021; originally announced May 2021.

    Comments: This paper is accepted by IEEE Transactions on Geoscience and Remote Sensing

  38. arXiv:2105.10843  [pdf, other]

    cs.CV

    Exploring Robustness of Unsupervised Domain Adaptation in Semantic Segmentation

    Authors: Jinyu Yang, Chunyuan Li, Weizhi An, Hehuan Ma, Yuzhi Guo, Yu Rong, Peilin Zhao, Junzhou Huang

    Abstract: Recent studies imply that deep neural networks are vulnerable to adversarial examples -- inputs with a slight but intentional perturbation are incorrectly classified by the network. Such vulnerability makes it risky for some security-related applications (e.g., semantic segmentation in autonomous cars) and triggers tremendous concerns on the model reliability. For the first time, we comprehensivel…

    Submitted 25 July, 2021; v1 submitted 22 May, 2021; originally announced May 2021.

    Comments: ICCV 2021 (Oral)

  39. CRC Codes as Error Correction Codes

    Authors: Wei An, Muriel Médard, Ken R. Duffy

    Abstract: CRC codes have long since been adopted in a vast range of applications. The established notion that they are suitable primarily for error detection can be set aside through use of the recently proposed Guessing Random Additive Noise Decoding (GRAND). Hard-detection (GRAND-SOS) and soft-detection (ORBGRAND) variants can decode any short, high-rate block code, making them suitable for error correcti…

    Submitted 28 April, 2021; originally announced April 2021.

    Comments: This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible

    Journal ref: IEEE ICC 2021

  40. arXiv:2104.00416  [pdf, other]

    cs.CV

    Unsupervised Degradation Representation Learning for Blind Super-Resolution

    Authors: Longguang Wang, Yingqian Wang, Xiaoyu Dong, Qingyu Xu, Jungang Yang, Wei An, Yulan Guo

    Abstract: Most existing CNN-based super-resolution (SR) methods are developed based on an assumption that the degradation is fixed and known (e.g., bicubic downsampling). However, these methods suffer a severe performance drop when the real degradation is different from their assumption. To handle various unknown degradations in real-world applications, previous methods rely on degradation estimation to rec…

    Submitted 1 April, 2021; originally announced April 2021.

    Comments: Accepted by CVPR 2021

  41. arXiv:2011.03802  [pdf, other]

    cs.CV

    Symmetric Parallax Attention for Stereo Image Super-Resolution

    Authors: Yingqian Wang, Xinyi Ying, Longguang Wang, Jungang Yang, Wei An, Yulan Guo

    Abstract: Although recent years have witnessed the great advances in stereo image super-resolution (SR), the beneficial information provided by binocular systems has not been fully used. Since stereo images are highly symmetric under epipolar constraint, in this paper, we improve the performance of stereo image SR by exploiting symmetry cues in stereo image pairs. Specifically, we propose a symmetric bi-dir…

    Submitted 20 April, 2021; v1 submitted 7 November, 2020; originally announced November 2020.

    Comments: Accepted to NTIRE workshop at CVPR 2021. The first two authors contribute equally to this work

  42. Keep the bursts and ditch the interleavers

    Authors: Wei An, Muriel Médard, Ken R. Duffy

    Abstract: To facilitate applications in IoT, 5G, and beyond, there is an engineering need to enable high-rate, low-latency communications. Errors in physical channels typically arrive in clumps, but most decoders are designed assuming that channels are memoryless. As a result, communication networks rely on interleaving over tens of thousands of bits so that channel conditions match decoder assumptions. Eve…

    Submitted 6 November, 2020; originally announced November 2020.

    Comments: 6 pages

    Journal ref: 2020 IEEE Global Communications Conference

  43. arXiv:2009.12516  [pdf, other]

    cs.CV

    Dense-View GEIs Set: View Space Covering for Gait Recognition based on Dense-View GAN

    Authors: Rijun Liao, Weizhi An, Shiqi Yu, Zhu Li, Yongzhen Huang

    Abstract: Gait recognition has proven to be effective for long-distance human recognition. But view variance of gait features would change human appearance greatly and reduce its performance. Most existing gait datasets usually collect data with a dozen different angles, or even more few. Limited view angles would prevent learning better view invariant feature. It can further improve robustness of gait reco…

    Submitted 26 September, 2020; originally announced September 2020.

    Comments: Accepted for presentation at IJCB'2020

  44. arXiv:2009.08250  [pdf, other]

    cs.CV

    Parallax Attention for Unsupervised Stereo Correspondence Learning

    Authors: Longguang Wang, Yulan Guo, Yingqian Wang, Zhengfa Liang, Zaiping Lin, Jungang Yang, Wei An

    Abstract: Stereo image pairs encode 3D scene cues into stereo correspondences between the left and right images. To exploit 3D cues within stereo images, recent CNN based methods commonly use cost volume techniques to capture stereo correspondence over large disparities. However, since disparities can vary significantly for stereo cameras with different baselines, focal lengths and resolutions, the fixed ma…

    Submitted 12 October, 2021; v1 submitted 15 September, 2020; originally announced September 2020.

    Comments: Accepted by IEEE TPAMI 2020. arXiv admin note: text overlap with arXiv:1903.05784

  45. Light Field Image Super-Resolution Using Deformable Convolution

    Authors: Yingqian Wang, Jungang Yang, Longguang Wang, Xinyi Ying, Tianhao Wu, Wei An, Yulan Guo

    Abstract: Light field (LF) cameras can record scenes from multiple perspectives, and thus introduce beneficial angular information for image super-resolution (SR). However, it is challenging to incorporate angular information due to disparities among LF images. In this paper, we propose a deformable convolution network (i.e., LF-DFnet) to handle the disparity problem for LF image SR. Specifically, we design…

    Submitted 25 November, 2020; v1 submitted 7 July, 2020; originally announced July 2020.

    Comments: Accepted by IEEE Transactions on Image Processing

  46. arXiv:2006.09603  [pdf, other]

    cs.CV

    Exploring Sparsity in Image Super-Resolution for Efficient Inference

    Authors: Longguang Wang, Xiaoyu Dong, Yingqian Wang, Xinyi Ying, Zaiping Lin, Wei An, Yulan Guo

    Abstract: Current CNN-based super-resolution (SR) methods process all locations equally with computational resources being uniformly assigned in space. However, since missing details in low-resolution (LR) images mainly exist in regions of edges and textures, less computational resources are required for those flat regions. Therefore, existing CNN-based methods involve redundant computation in flat regions,…

    Submitted 1 April, 2021; v1 submitted 16 June, 2020; originally announced June 2020.

    Comments: Accepted by CVPR 2021

  47. arXiv:2004.03791  [pdf, other]

    cs.CV

    Learning A Single Network for Scale-Arbitrary Super-Resolution

    Authors: Longguang Wang, Yingqian Wang, Zaiping Lin, Jungang Yang, Wei An, Yulan Guo

    Abstract: Recently, the performance of single image super-resolution (SR) has been significantly improved with powerful networks. However, these networks are developed for image SR with a single specific integer scale (e.g., x2, x3, x4), and cannot be used for non-integer and asymmetric SR. In this paper, we propose to learn a scale-arbitrary image SR network from scale-specific networks. Specifically, we pro…

    Submitted 23 July, 2021; v1 submitted 7 April, 2020; originally announced April 2020.

    Comments: Accepted by ICCV 2021

  48. Deformable 3D Convolution for Video Super-Resolution

    Authors: Xinyi Ying, Longguang Wang, Yingqian Wang, Weidong Sheng, Wei An, Yulan Guo

    Abstract: The spatio-temporal information among video sequences is significant for video super-resolution (SR). However, the spatio-temporal information cannot be fully used by existing video SR methods since spatial feature extraction and temporal motion compensation are usually performed sequentially. In this paper, we propose a deformable 3D convolution network (D3Dnet) to incorporate spatio-temporal inf…

    Submitted 15 August, 2020; v1 submitted 6 April, 2020; originally announced April 2020.

    Comments: Accepted by IEEE Signal Processing Letters

  49. arXiv:2003.04614  [pdf, other]

    cs.CV

    Label-Driven Reconstruction for Domain Adaptation in Semantic Segmentation

    Authors: Jinyu Yang, Weizhi An, Sheng Wang, Xinliang Zhu, Chaochao Yan, Junzhou Huang

    Abstract: Unsupervised domain adaptation enables to alleviate the need for pixel-wise annotation in the semantic segmentation. One of the most common strategies is to translate images from the source domain to the target domain and then align their marginal distributions in the feature space using adversarial learning. However, source-to-target translation enlarges the bias in translated images and introduc…

    Submitted 23 August, 2020; v1 submitted 10 March, 2020; originally announced March 2020.

    Comments: ECCV 2020

  50. arXiv:2003.04010  [pdf, other]

    cs.CV

    Context-Aware Domain Adaptation in Semantic Segmentation

    Authors: Jinyu Yang, Weizhi An, Chaochao Yan, Peilin Zhao, Junzhou Huang

    Abstract: In this paper, we consider the problem of unsupervised domain adaptation in the semantic segmentation. There are two primary issues in this field, i.e., what and how to transfer domain knowledge across two domains. Existing methods mainly focus on adapting domain-invariant features (what to transfer) through adversarial learning (how to transfer). Context dependency is essential for semantic segme…

    Submitted 9 March, 2020; originally announced March 2020.

    Comments: 10 pages, 6 figures, 5 tables