Showing 1–33 of 33 results for author: Fei, J

  1. arXiv:2405.18937  [pdf, other]

    cs.CV cs.CL

    Kestrel: Point Grounding Multimodal LLM for Part-Aware 3D Vision-Language Understanding

    Authors: Junjie Fei, Mahmoud Ahmed, Jian Ding, Eslam Mohamed Bakr, Mohamed Elhoseiny

    Abstract: While 3D MLLMs have achieved significant progress, they are restricted to object and scene understanding and struggle to understand 3D spatial structures at the part level. In this paper, we introduce Kestrel, representing a novel approach that empowers 3D MLLMs with part-aware understanding, enabling better interpretation and segmentation grounding of 3D objects at the part level. Despite its sig…

    Submitted 29 May, 2024; originally announced May 2024.

  2. arXiv:2311.05478  [pdf, other]

    cs.CV eess.IV

    Robust Retraining-free GAN Fingerprinting via Personalized Normalization

    Authors: Jianwei Fei, Zhihua Xia, Benedetta Tondi, Mauro Barni

    Abstract: In recent years, there has been significant growth in the commercial applications of generative models, licensed and distributed by model developers to users, who in turn use them to offer services. In this scenario, there is a need to track and identify the responsible user in the presence of a violation of the license agreement or any kind of malicious usage. Although there are methods enabling…

    Submitted 9 November, 2023; originally announced November 2023.

  3. arXiv:2310.16919  [pdf, other]

    cs.CV cs.AI

    Wide Flat Minimum Watermarking for Robust Ownership Verification of GANs

    Authors: Jianwei Fei, Zhihua Xia, Benedetta Tondi, Mauro Barni

    Abstract: We propose a novel multi-bit box-free watermarking method for the protection of Intellectual Property Rights (IPR) of GANs with improved robustness against white-box attacks like fine-tuning, pruning, quantization, and surrogate model attacks. The watermark is embedded by adding an extra watermarking loss term during GAN training, ensuring that the images generated by the GAN contain an invisible…

    Submitted 25 October, 2023; originally announced October 2023.

  4. arXiv:2307.16525  [pdf, other]

    cs.CV cs.CL

    Transferable Decoding with Visual Entities for Zero-Shot Image Captioning

    Authors: Junjie Fei, Teng Wang, Jinrui Zhang, Zhenyu He, Chengjie Wang, Feng Zheng

    Abstract: Image-to-text generation aims to describe images using natural language. Recently, zero-shot image captioning based on pre-trained vision-language models (VLMs) and large language models (LLMs) has made significant progress. However, we have observed and empirically demonstrated that these methods are susceptible to modality bias induced by LLMs and tend to generate descriptions containing objects…

    Submitted 31 July, 2023; originally announced July 2023.

    Comments: Accepted by ICCV 2023

  5. arXiv:2306.02061  [pdf, other]

    cs.CV

    Balancing Logit Variation for Long-tailed Semantic Segmentation

    Authors: Yuchao Wang, Jingjing Fei, Haochen Wang, Wei Li, Tianpeng Bao, Liwei Wu, Rui Zhao, Yujun Shen

    Abstract: Semantic segmentation usually suffers from a long-tail data distribution. Due to the imbalanced number of samples across categories, the features of those tail classes may get squeezed into a narrow area in the feature space. Towards a balanced feature distribution, we introduce category-wise variation into the network predictions in the training phase such that an instance is no longer projected…

    Submitted 3 June, 2023; originally announced June 2023.

  6. arXiv:2305.13752  [pdf, other]

    cs.CV

    Pulling Target to Source: A New Perspective on Domain Adaptive Semantic Segmentation

    Authors: Haochen Wang, Yujun Shen, Jingjing Fei, Wei Li, Liwei Wu, Yuxi Wang, Zhaoxiang Zhang

    Abstract: Domain adaptive semantic segmentation aims to transfer knowledge from a labeled source domain to an unlabeled target domain. However, existing methods primarily focus on directly learning qualified target features, making it challenging to guarantee their discrimination in the absence of target labels. This work provides a new perspective. We observe that the features learned with source data mana…

    Submitted 23 May, 2023; originally announced May 2023.

  7. arXiv:2305.02677  [pdf, other]

    cs.CV

    Caption Anything: Interactive Image Description with Diverse Multimodal Controls

    Authors: Teng Wang, Jinrui Zhang, Junjie Fei, Hao Zheng, Yunlong Tang, Zhe Li, Mingqi Gao, Shanshan Zhao

    Abstract: Controllable image captioning is an emerging multimodal topic that aims to describe the image with natural language following human purpose, e.g., looking at the specified regions or telling in a particular text style. State-of-the-art methods are trained on annotated pairs of input controls and output captions. However, the scarcity of such well-annotated multimodal data largely limits…

    Submitted 6 July, 2023; v1 submitted 4 May, 2023; originally announced May 2023.

    Comments: Tech-report

  8. arXiv:2301.12178  [pdf, other]

    cs.AI

    MVKT-ECG: Efficient Single-lead ECG Classification on Multi-Label Arrhythmia by Multi-View Knowledge Transferring

    Authors: Yuzhen Qin, Li Sun, Hui Chen, Wei-qiang Zhang, Wenming Yang, Jintao Fei, Guijin Wang

    Abstract: The widespread emergence of smart devices for ECG has sparked demand for intelligent single-lead ECG-based diagnostic systems. However, it is challenging to develop a single-lead-based ECG interpretation model for multiple diseases diagnosis due to the lack of some key disease information. In this work, we propose inter-lead Multi-View Knowledge Transferring of ECG (MVKT-ECG) to boost single-lead…

    Submitted 28 January, 2023; originally announced January 2023.

  9. arXiv:2212.14309  [pdf, other]

    cs.CV

    Learning to mask: Towards generalized face forgery detection

    Authors: Jianwei Fei, Yunshu Dai, Huaming Wang, Zhihua Xia

    Abstract: Generalizability to unseen forgery types is crucial for face forgery detectors. Recent works have made significant progress in terms of generalization by synthetic forgery data augmentation. In this work, we explore another path for improving the generalization. Our goal is to reduce the features that are easy to learn in the training phase, so as to reduce the risk of overfitting on specific forg…

    Submitted 29 December, 2022; originally announced December 2022.

  10. arXiv:2212.13466  [pdf, other]

    cs.CV

    General GAN-generated image detection by data augmentation in fingerprint domain

    Authors: Huaming Wang, Jianwei Fei, Yunshu Dai, Lingyun Leng, Zhihua Xia

    Abstract: In this work, we investigate improving the generalizability of GAN-generated image detectors by performing data augmentation in the fingerprint domain. Specifically, we first separate the fingerprints and contents of the GAN-generated images using an autoencoder based GAN fingerprint extractor, followed by random perturbations of the fingerprints. Then the original fingerprints are substituted wit…

    Submitted 9 April, 2023; v1 submitted 27 December, 2022; originally announced December 2022.

  11. arXiv:2211.13968  [pdf, other]

    cs.CV

    MIAD: A Maintenance Inspection Dataset for Unsupervised Anomaly Detection

    Authors: Tianpeng Bao, Jiadong Chen, Wei Li, Xiang Wang, Jingjing Fei, Liwei Wu, Rui Zhao, Ye Zheng

    Abstract: Visual anomaly detection plays a crucial role in not only manufacturing inspection to find defects of products during manufacturing processes, but also maintenance inspection to keep equipment in optimum working condition particularly outdoors. Due to the scarcity of the defective samples, unsupervised anomaly detection has attracted great attention in recent years. However, existing datasets for…

    Submitted 28 November, 2022; v1 submitted 25 November, 2022; originally announced November 2022.

  12. arXiv:2211.07052  [pdf, other]

    cs.LG

    Treatment-RSPN: Recurrent Sum-Product Networks for Sequential Treatment Regimes

    Authors: Adam Dejl, Harsh Deep, Jonathan Fei, Ardavan Saeedi, Li-wei H. Lehman

    Abstract: Sum-product networks (SPNs) have recently emerged as a novel deep learning architecture enabling highly efficient probabilistic inference. Since their introduction, SPNs have been applied to a wide range of data modalities and extended to time-sequence data. In this paper, we propose a general framework for modelling sequential treatment decision-making behaviour and treatment response using recur…

    Submitted 13 November, 2022; originally announced November 2022.

    Comments: Extended Abstract presented at Machine Learning for Health (ML4H) symposium 2022, November 28th, 2022, New Orleans, United States & Virtual, http://www.ml4h.cc, 14 pages

    ACM Class: G.3; I.2

  13. arXiv:2209.15490  [pdf, other]

    cs.CV

    Learning Second Order Local Anomaly for General Face Forgery Detection

    Authors: Jianwei Fei, Yunshu Dai, Peipeng Yu, Tianrun Shen, Zhihua Xia, Jian Weng

    Abstract: In this work, we propose a novel method to improve the generalization ability of CNN-based face forgery detectors. Our method considers the feature anomalies of forged faces caused by the prevalent blending operations in face forgery algorithms. Specifically, we propose a weakly supervised Second Order Local Anomaly (SOLA) learning module to mine anomalies in local regions using deep feature maps.…

    Submitted 30 September, 2022; originally announced September 2022.

  14. arXiv:2209.09434  [pdf, other]

    cs.AR

    BP-Im2col: Implicit Im2col Supporting AI Backpropagation on Systolic Arrays

    Authors: Jianchao Yang, Mei Wen, Junzhong Shen, Yasong Cao, Minjin Tang, Renyu Yang, Jiawei Fei, Chunyuan Zhang

    Abstract: State-of-the-art systolic array-based accelerators adopt the traditional im2col algorithm to accelerate the inference of convolutional layers. However, traditional im2col cannot efficiently support AI backpropagation. Backpropagation in convolutional layers involves performing transposed convolution and dilated convolution, which usually introduces plenty of zero-spaces into the feature map or ker…

    Submitted 19 September, 2022; originally announced September 2022.

    Comments: Accepted in ICCD 2022, The 40th IEEE International Conference on Computer Design

  15. arXiv:2209.07237  [pdf, other]

    cs.CV

    Robust Implementation of Foreground Extraction and Vessel Segmentation for X-ray Coronary Angiography Image Sequence

    Authors: Zeyu Fu, Zhuang Fu, Chenzhuo Lu, Jun Yan, Jian Fei, Hui Han

    Abstract: The extraction of contrast-filled vessels from X-ray coronary angiography (XCA) image sequences has important clinical significance for intuitive diagnosis and therapy. In this study, the XCA image sequence is regarded as a 3D tensor input, the vessel layer is regarded as a sparse tensor, and the background layer is regarded as a low-rank tensor. Using tensor nuclear norm (TNN) minimization, a no…

    Submitted 27 February, 2023; v1 submitted 15 September, 2022; originally announced September 2022.

    Comments: 34 pages, 14 figures, 5 tables

  16. arXiv:2209.06993  [pdf, other]

    cs.CV

    Learning from Future: A Novel Self-Training Framework for Semantic Segmentation

    Authors: Ye Du, Yujun Shen, Haochen Wang, Jingjing Fei, Wei Li, Liwei Wu, Rui Zhao, Zehua Fu, Qingjie Liu

    Abstract: Self-training has shown great potential in semi-supervised learning. Its core idea is to use the model learned on labeled data to generate pseudo-labels for unlabeled samples, and in turn teach itself. To obtain valid supervision, active attempts typically employ a momentum teacher for pseudo-label prediction yet observe the confirmation bias issue, where the incorrect predictions may provide wron…

    Submitted 18 September, 2022; v1 submitted 14 September, 2022; originally announced September 2022.

    Comments: Accepted to NeurIPS 2022

  17. arXiv:2209.03466  [pdf, other]

    cs.CV cs.AI

    Supervised GAN Watermarking for Intellectual Property Protection

    Authors: Jianwei Fei, Zhihua Xia, Benedetta Tondi, Mauro Barni

    Abstract: We propose a watermarking method for protecting the Intellectual Property (IP) of Generative Adversarial Networks (GANs). The aim is to watermark the GAN model so that any image generated by the GAN contains an invisible watermark (signature), whose presence inside the image can be checked at a later stage for ownership verification. To achieve this goal, a pre-trained CNN watermarking decoding bl…

    Submitted 7 September, 2022; originally announced September 2022.

  18. arXiv:2206.11476  [pdf, other]

    cs.CV

    Dynamic Scene Deblurring Based on Continuous Cross-Layer Attention Transmission

    Authors: Xia Hua, Mingxin Li, Junxiong Fei, Yu Shi, JianGuo Liu, Hanyu Hong

    Abstract: The deep convolutional neural networks (CNNs) using attention mechanism have achieved great success for dynamic scene deblurring. In most of these networks, only the features refined by the attention maps can be passed to the next layer and the attention maps of different layers are separated from each other, which does not make full use of the attention information from different layers in the CN…

    Submitted 28 January, 2023; v1 submitted 23 June, 2022; originally announced June 2022.

  19. DuMLP-Pin: A Dual-MLP-dot-product Permutation-invariant Network for Set Feature Extraction

    Authors: Jiajun Fei, Ziyu Zhu, Wenlei Liu, Zhidong Deng, Mingyang Li, Huanjun Deng, Shuo Zhang

    Abstract: Existing permutation-invariant methods can be divided into two categories according to the aggregation scope, i.e., global aggregation and local aggregation. Although the global aggregation methods, e.g., PointNet and Deep Sets, get involved in simpler structures, their performance is poorer than the local aggregation ones like PointNet++ and Point Transformer. It remains an open problem whether there exi…

    Submitted 30 August, 2022; v1 submitted 8 March, 2022; originally announced March 2022.

    Comments: 16 pages, accepted by AAAI 2022 (https://ojs.aaai.org/index.php/AAAI/article/view/19939), with technical appendix

  20. arXiv:2203.03884  [pdf, other]

    cs.CV

    Semi-Supervised Semantic Segmentation Using Unreliable Pseudo-Labels

    Authors: Yuchao Wang, Haochen Wang, Yujun Shen, Jingjing Fei, Wei Li, Guoqiang Jin, Liwei Wu, Rui Zhao, Xinyi Le

    Abstract: The crux of semi-supervised semantic segmentation is to assign adequate pseudo-labels to the pixels of unlabeled images. A common practice is to select the highly confident predictions as the pseudo ground-truth, but it leads to a problem that most pixels may be left unused due to their unreliability. We argue that every pixel matters to the model training, even if its prediction is ambiguous. Intuit…

    Submitted 14 March, 2022; v1 submitted 8 March, 2022; originally announced March 2022.

    Comments: Accepted to CVPR 2022. Project: https://haochen-wang409.github.io/U2PL/

  21. arXiv:2202.13067  [pdf, other]

    cs.CV

    A Robust Document Image Watermarking Scheme using Deep Neural Network

    Authors: Sulong Ge, Zhihua Xia, Jianwei Fei, Xingming Sun, Jian Weng

    Abstract: Watermarking is an important copyright protection technology which generally embeds the identity information into the carrier imperceptibly. Then the identity can be extracted to prove the copyright from the watermarked carrier even after suffering various attacks. Most of the existing watermarking technologies take natural images as carriers. Different from the natural images, document images…

    Submitted 26 February, 2022; originally announced February 2022.

  22. arXiv:2112.06095  [pdf, other]

    cs.NI cs.DC

    Unlocking the Power of Inline Floating-Point Operations on Programmable Switches

    Authors: Yifan Yuan, Omar Alama, Amedeo Sapio, Jiawei Fei, Jacob Nelson, Dan R. K. Ports, Marco Canini, Nam Sung Kim

    Abstract: The advent of switches with programmable dataplanes has enabled the rapid development of new network functionality, as well as providing a platform for acceleration of a broad range of application-level functionality. However, existing switch hardware was not designed with application acceleration in mind, and thus applications requiring operations or datatypes not used in traditional network prot…

    Submitted 11 December, 2021; originally announced December 2021.

    Comments: This paper has been accepted by NSDI'22. This arxiv paper is not the final camera-ready version

  23. arXiv:2108.08166  [pdf, other]

    cs.CV cs.RO

    Deployment of Deep Neural Networks for Object Detection on Edge AI Devices with Runtime Optimization

    Authors: Lukas Stäcker, Juncong Fei, Philipp Heidenreich, Frank Bonarens, Jason Rambach, Didier Stricker, Christoph Stiller

    Abstract: Deep neural networks have proven increasingly important for automotive scene understanding with new algorithms offering constant improvements of the detection performance. However, there is little emphasis on experiences and needs for deployment in embedded environments. We therefore perform a case study of the deployment of two representative object detection networks on an edge AI platform. In p…

    Submitted 18 August, 2021; originally announced August 2021.

    Comments: To present in ICCV 2021 (ERCVAD Workshop)

  24. arXiv:2107.00346  [pdf, other]

    cs.CV cs.RO

    MASS: Multi-Attentional Semantic Segmentation of LiDAR Data for Dense Top-View Understanding

    Authors: Kunyu Peng, Juncong Fei, Kailun Yang, Alina Roitberg, Jiaming Zhang, Frank Bieder, Philipp Heidenreich, Christoph Stiller, Rainer Stiefelhagen

    Abstract: At the heart of all automated driving systems is the ability to sense the surroundings, e.g., through semantic segmentation of LiDAR sequences, which has experienced remarkable progress due to the release of large datasets such as SemanticKITTI and nuScenes-LidarSeg. While most previous works focus on sparse segmentation of the LiDAR input, dense output masks provide self-driving cars with almost co…

    Submitted 20 January, 2022; v1 submitted 1 July, 2021; originally announced July 2021.

    Comments: Accepted to IEEE Transactions on Intelligent Transportation Systems (T-ITS). Code is publicly available at https://github.com/KPeng9510/MASS

  25. arXiv:2105.04169  [pdf, other]

    cs.CV cs.RO

    PillarSegNet: Pillar-based Semantic Grid Map Estimation using Sparse LiDAR Data

    Authors: Juncong Fei, Kunyu Peng, Philipp Heidenreich, Frank Bieder, Christoph Stiller

    Abstract: Semantic understanding of the surrounding environment is essential for automated vehicles. The recent publication of the SemanticKITTI dataset stimulates the research on semantic segmentation of LiDAR point clouds in urban scenarios. While most existing approaches predict sparse pointwise semantic classes for the sparse input LiDAR scan, we propose PillarSegNet to be able to output a dense semanti…

    Submitted 5 July, 2021; v1 submitted 10 May, 2021; originally announced May 2021.

    Comments: Accepted to present in the 2021 IEEE Intelligent Vehicles Symposium (IV21)

  26. SemanticVoxels: Sequential Fusion for 3D Pedestrian Detection using LiDAR Point Cloud and Semantic Segmentation

    Authors: Juncong Fei, Wenbo Chen, Philipp Heidenreich, Sascha Wirges, Christoph Stiller

    Abstract: 3D pedestrian detection is a challenging task in automated driving because pedestrians are relatively small, frequently occluded and easily confused with narrow vertical objects. LiDAR and camera are two commonly used sensor modalities for this task, which should provide complementary information. Unexpectedly, LiDAR-only detection methods tend to outperform multisensor fusion methods in public be…

    Submitted 25 September, 2020; originally announced September 2020.

    Comments: Accepted to present in the 2020 IEEE International Conference on Multisensor Fusion and Integration (MFI 2020)

  27. arXiv:2007.13902  [pdf, other]

    cs.CY cs.LG econ.GN stat.AP

    Leveraging the Power of Place: A Data-Driven Decision Helper to Improve the Location Decisions of Economic Immigrants

    Authors: Jeremy Ferwerda, Nicholas Adams-Cohen, Kirk Bansak, Jennifer Fei, Duncan Lawrence, Jeremy M. Weinstein, Jens Hainmueller

    Abstract: A growing number of countries have established programs to attract immigrants who can contribute to their economy. Research suggests that an immigrant's initial arrival location plays a key role in shaping their economic success. Yet immigrants currently lack access to personalized information that would help them identify optimal destinations. Instead, they often rely on availability heuristics,…

    Submitted 27 July, 2020; originally announced July 2020.

    Comments: 51 pages (including appendix), 13 figures. Immigration Policy Lab (IPL) Working Paper Series, Working Paper No. 20-06

  28. arXiv:2002.11573  [pdf, other]

    cs.RO cs.AI

    Efficient reinforcement learning control for continuum robots based on Inexplicit Prior Knowledge

    Authors: Junjia Liu, Jiaying Shou, Zhuang Fu, Hangfei Zhou, Rongli Xie, Jun Zhang, Jian Fei, Yanna Zhao

    Abstract: Compared to rigid robots that are generally studied in reinforcement learning, the physical characteristics of some sophisticated robots such as soft or continuum robots are highly complicated. Moreover, recent reinforcement learning methods are data-inefficient and cannot be directly deployed to the robot without simulation. In this paper, we propose an efficient reinforcement learning method ba…

    Submitted 2 October, 2020; v1 submitted 26 February, 2020; originally announced February 2020.

    Comments: 11 pages, 12 figures

  29. arXiv:1906.03647  [pdf, other]

    stat.ML cs.LG

    A Variant of Gaussian Process Dynamical Systems

    Authors: Jing Zhao, Jingjing Fei, Shiliang Sun

    Abstract: In order to better model high-dimensional sequential data, we propose a collaborative multi-output Gaussian process dynamical system (CGPDS), which is a novel variant of GPDSs. The proposed model assumes that the output on each dimension is controlled by a shared global latent process and a private local latent process. Thus, the dependence among different dimensions of the sequences can be captur…

    Submitted 9 June, 2019; originally announced June 2019.

    Comments: Technical Report, East China Normal University, November 2018

  30. arXiv:1905.05761  [pdf, ps, other]

    cs.LG stat.ML

    Online Anomaly Detection with Sparse Gaussian Processes

    Authors: Jingjing Fei, Shiliang Sun

    Abstract: Online anomaly detection of time-series data is an important and challenging task in machine learning. Gaussian processes (GPs) are powerful and flexible models for modeling time-series data. However, the high time complexity of GPs limits their applications in online anomaly detection. Attributed to some internal or external changes, concept drift usually occurs in time-series data, where the cha…

    Submitted 14 May, 2019; originally announced May 2019.

  31. arXiv:1901.05571  [pdf, other]

    cs.NI

    Metaflow: A DAG-Based Network Abstraction for Distributed Applications

    Authors: Jiawei Fei, Yang Shi, Qun Huang, Mei Wen

    Abstract: In the past decade, an increasing number of network scheduling techniques have been proposed to boost the distributed application performance. Flow-level metrics, such as flow completion time (FCT), are based on the abstraction of flows yet they cannot capture the semantics of communication in a cluster application. Being aware of this problem, coflow is proposed as a new network abstraction. However, it is…

    Submitted 16 January, 2019; originally announced January 2019.

  32. arXiv:1311.3105  [pdf]

    cs.NI

    k-DAG Based Lifetime Aware Data Collection in Wireless Sensor Networks

    Authors: Jingjing Fei, Hui Wu, Yongxin Wang

    Abstract: Wireless Sensor Networks need to be organized for efficient data collection and lifetime maximization. In this paper, we propose a novel routing structure, namely k-DAG, to balance the load of the base station's neighbours while providing the worst-case latency guarantee for data collection, and a distributed algorithm for constructing a k-DAG based on a SPD (Shortest Path DAG). In a k-DAG, the le…

    Submitted 13 November, 2013; originally announced November 2013.

    Comments: 17 pages, 10 figures

    Journal ref: International Journal of Wireless & Mobile Networks (IJWMN) Vol. 5, No. 5, October 2013, pp.17-33

  33. arXiv:1309.2139  [pdf]

    cs.NI

    Frequency and time domain packet scheduling based on channel prediction with imperfect CQI in LTE

    Authors: Yongxin Wang, Kumbesan Sandrasegaran, Xinning Zhu, Jingjing Fei, Xiaoying Kong, Cheng-Chung Lin

    Abstract: Channel-dependent scheduling of transmission of data packets in a wireless system is based on measurement and feedback of the channel quality. To alleviate the performance degradation due to simultaneous multiple imperfect channel quality information (CQI), a simple and efficient packet scheduling (PS) algorithm is developed in downlink LTE system for real time traffic. A frequency domain channel…

    Submitted 9 September, 2013; originally announced September 2013.