Showing 1–27 of 27 results for author: Cen, J

  1. arXiv:2406.13408  [pdf, other]

    cs.CL

    SQLFixAgent: Towards Semantic-Accurate SQL Generation via Multi-Agent Collaboration

    Authors: Jipeng Cen, Jiaxin Liu, Zhixu Li, Jingjing Wang

    Abstract: While fine-tuned large language models (LLMs) excel in generating grammatically valid SQL in Text-to-SQL parsing, they often struggle to ensure semantic accuracy in queries, leading to user confusion and diminished system usability. To tackle this challenge, we introduce SQLFixAgent, an innovative multi-agent collaborative framework designed for detecting and repairing erroneous SQL. Our framework…

    Submitted 19 June, 2024; originally announced June 2024.

  2. arXiv:2405.10305  [pdf, other]

    cs.CV cs.AI

    4D Panoptic Scene Graph Generation

    Authors: Jingkang Yang, Jun Cen, Wenxuan Peng, Shuai Liu, Fangzhou Hong, Xiangtai Li, Kaiyang Zhou, Qifeng Chen, Ziwei Liu

    Abstract: We are living in a three-dimensional space while moving forward through a fourth dimension: time. To allow artificial intelligence to develop a comprehensive understanding of such a 4D environment, we introduce 4D Panoptic Scene Graph (PSG-4D), a new representation that bridges the raw visual data perceived in a dynamic 4D world and high-level visual understanding. Specifically, PSG-4D abstracts r…

    Submitted 16 May, 2024; originally announced May 2024.

    Comments: Accepted at NeurIPS 2023. Code: https://github.com/Jingkang50/PSG4D Previous Series: PSG https://github.com/Jingkang50/OpenPSG and PVSG https://github.com/Jingkang50/OpenPVSG

  3. arXiv:2403.17010  [pdf, other]

    cs.CV cs.LG cs.RO

    Calib3D: Calibrating Model Preferences for Reliable 3D Scene Understanding

    Authors: Lingdong Kong, Xiang Xu, Jun Cen, Wenwei Zhang, Liang Pan, Kai Chen, Ziwei Liu

    Abstract: Safety-critical 3D scene understanding tasks necessitate not only accurate but also confident predictions from 3D perception models. This study introduces Calib3D, a pioneering effort to benchmark and scrutinize the reliability of 3D scene understanding models from an uncertainty estimation viewpoint. We comprehensively evaluate 28 state-of-the-art models across 10 diverse 3D datasets, uncovering…

    Submitted 25 March, 2024; originally announced March 2024.

    Comments: Preprint; 37 pages, 8 figures, 11 tables; Code at https://github.com/ldkong1205/Calib3D

  4. arXiv:2403.13261  [pdf, other]

    cs.CV

    Self-Supervised Class-Agnostic Motion Prediction with Spatial and Temporal Consistency Regularizations

    Authors: Kewei Wang, Yizheng Wu, Jun Cen, Zhiyu Pan, Xingyi Li, Zhe Wang, Zhiguo Cao, Guosheng Lin

    Abstract: The perception of motion behavior in a dynamic environment holds significant importance for autonomous driving systems, wherein class-agnostic motion prediction methods directly predict the motion of the entire point cloud. While most existing methods rely on fully-supervised learning, the manual labeling of point cloud data is laborious and time-consuming. Therefore, several annotation-efficient…

    Submitted 21 March, 2024; v1 submitted 19 March, 2024; originally announced March 2024.

    Comments: Accepted by CVPR2024

  5. arXiv:2403.08568  [pdf, other]

    cs.CV cs.LG

    Consistent Prompting for Rehearsal-Free Continual Learning

    Authors: Zhanxin Gao, Jun Cen, Xiaobin Chang

    Abstract: Continual learning empowers models to adapt autonomously to the ever-changing environment or data streams without forgetting old knowledge. Prompt-based approaches are built on frozen pre-trained models to learn the task-specific prompts and classifiers efficiently. Existing prompt-based methods are inconsistent between training and testing, limiting their effectiveness. Two types of inconsistency…

    Submitted 14 March, 2024; v1 submitted 13 March, 2024; originally announced March 2024.

    Comments: Accepted by CVPR2024

  6. arXiv:2403.00485  [pdf, other]

    cs.LG

    A Survey of Geometric Graph Neural Networks: Data Structures, Models and Applications

    Authors: Jiaqi Han, Jiacheng Cen, Liming Wu, Zongzhao Li, Xiangzhe Kong, Rui Jiao, Ziyang Yu, Tingyang Xu, Fandi Wu, Zihe Wang, Hongteng Xu, Zhewei Wei, Yang Liu, Yu Rong, Wenbing Huang

    Abstract: A geometric graph is a special kind of graph with geometric features, which is vital for modeling many scientific problems. Unlike generic graphs, geometric graphs often exhibit physical symmetries of translations, rotations, and reflections, so they are processed ineffectively by current Graph Neural Networks (GNNs). To tackle this issue, researchers proposed a variety of Geometric Graph Neural Network…

    Submitted 1 March, 2024; originally announced March 2024.

  7. arXiv:2402.10534  [pdf, other]

    cs.CV

    Using Left and Right Brains Together: Towards Vision and Language Planning

    Authors: Jun Cen, Chenfei Wu, Xiao Liu, Shengming Yin, Yixuan Pei, Jinglong Yang, Qifeng Chen, Nan Duan, Jianguo Zhang

    Abstract: Large Language Models (LLMs) and Large Multi-modality Models (LMMs) have demonstrated remarkable decision-making capabilities on a variety of tasks. However, they inherently operate planning within the language space, lacking the vision and spatial imagination ability. In contrast, humans utilize both left and right hemispheres of the brain for language and visual planning during the thinking pro…

    Submitted 16 February, 2024; originally announced February 2024.

    Comments: 19 pages, 13 figures

  8. arXiv:2312.00860  [pdf, other]

    cs.CV

    Segment Any 3D Gaussians

    Authors: Jiazhong Cen, Jiemin Fang, Chen Yang, Lingxi Xie, Xiaopeng Zhang, Wei Shen, Qi Tian

    Abstract: This paper presents SAGA (Segment Any 3D GAussians), a highly efficient 3D promptable segmentation method based on 3D Gaussian Splatting (3D-GS). Given 2D visual prompts as input, SAGA can segment the corresponding 3D target represented by 3D Gaussians within 4 ms. This is achieved by attaching a scale-gated affinity feature to each 3D Gaussian to endow it with a new property towards multi-granularity…

    Submitted 27 May, 2024; v1 submitted 1 December, 2023; originally announced December 2023.

    Comments: Work in progress. Project page: https://jumpat.github.io/SAGA

  9. arXiv:2311.12076  [pdf, other]

    cs.CV

    Towards Few-shot Out-of-Distribution Detection

    Authors: Jiuqing Dong, Yongbin Gao, Heng Zhou, Jun Cen, Yifan Yao, Sook Yoon, Park Dong Sun

    Abstract: Out-of-distribution (OOD) detection is critical for ensuring the reliability of open-world intelligent systems. Despite the notable advancements in existing OOD detection methodologies, our study identifies a significant performance drop under the scarcity of training samples. In this context, we introduce a novel few-shot OOD detection benchmark, carefully constructed to address this gap. Our emp…

    Submitted 30 January, 2024; v1 submitted 19 November, 2023; originally announced November 2023.

  10. arXiv:2307.04091  [pdf, other]

    cs.CV

    CMDFusion: Bidirectional Fusion Network with Cross-modality Knowledge Distillation for LIDAR Semantic Segmentation

    Authors: Jun Cen, Shiwei Zhang, Yixuan Pei, Kun Li, Hang Zheng, Maochun Luo, Yingya Zhang, Qifeng Chen

    Abstract: 2D RGB images and 3D LIDAR point clouds provide complementary knowledge for the perception system of autonomous vehicles. Several 2D and 3D fusion methods have been explored for the LIDAR semantic segmentation task, but they suffer from different problems. 2D-to-3D fusion methods require strictly paired data during inference, which may not be available in real-world scenarios, while 3D-to-2D fusio…

    Submitted 9 July, 2023; originally announced July 2023.

  11. arXiv:2306.09347  [pdf, other]

    cs.CV cs.LG cs.RO

    Segment Any Point Cloud Sequences by Distilling Vision Foundation Models

    Authors: Youquan Liu, Lingdong Kong, Jun Cen, Runnan Chen, Wenwei Zhang, Liang Pan, Kai Chen, Ziwei Liu

    Abstract: Recent advancements in vision foundation models (VFMs) have opened up new possibilities for versatile and efficient visual perception. In this work, we introduce Seal, a novel framework that harnesses VFMs for segmenting diverse automotive point cloud sequences. Seal exhibits three appealing properties: i) Scalability: VFMs are directly distilled into point clouds, obviating the need for annotatio…

    Submitted 24 October, 2023; v1 submitted 15 June, 2023; originally announced June 2023.

    Comments: NeurIPS 2023 (Spotlight); 37 pages, 16 figures, 15 tables; Code at https://github.com/youquanl/Segment-Any-Point-Cloud

  12. arXiv:2305.14207  [pdf, other]

    cs.CV

    SAD: Segment Any RGBD

    Authors: Jun Cen, Yizheng Wu, Kewei Wang, Xingyi Li, Jingkang Yang, Yixuan Pei, Lingdong Kong, Ziwei Liu, Qifeng Chen

    Abstract: The Segment Anything Model (SAM) has demonstrated its effectiveness in segmenting any part of 2D RGB images. However, SAM exhibits a stronger emphasis on texture information while paying less attention to geometry information when segmenting RGB images. To address this limitation, we propose the Segment Any RGBD (SAD) model, which is specifically designed to extract geometry information directly f…

    Submitted 23 May, 2023; originally announced May 2023.

    Comments: Technical report of Segment Any RGBD. Project url: https://github.com/Jun-CEN/SegmentAnyRGBD

  13. arXiv:2304.12308  [pdf, other]

    cs.CV

    Segment Anything in 3D with Radiance Fields

    Authors: Jiazhong Cen, Jiemin Fang, Zanwei Zhou, Chen Yang, Lingxi Xie, Xiaopeng Zhang, Wei Shen, Qi Tian

    Abstract: The Segment Anything Model (SAM) emerges as a powerful vision foundation model to generate high-quality 2D segmentation results. This paper aims to generalize SAM to segment 3D objects. Rather than replicating the data acquisition and annotation procedure which is costly in 3D, we design an efficient solution, leveraging the radiance field as a cheap and off-the-shelf prior that connects multi-vie…

    Submitted 15 April, 2024; v1 submitted 24 April, 2023; originally announced April 2023.

    Comments: Extension version of SA3D (NeurIPS 2023). Project page: https://jumpat.github.io/SA3D/

  14. arXiv:2303.15467  [pdf, other]

    cs.CV

    Enlarging Instance-specific and Class-specific Information for Open-set Action Recognition

    Authors: Jun Cen, Shiwei Zhang, Xiang Wang, Yixuan Pei, Zhiwu Qing, Yingya Zhang, Qifeng Chen

    Abstract: Open-set action recognition is to reject unknown human action cases which are out of the distribution of the training set. Existing methods mainly focus on learning better uncertainty scores but dismiss the importance of feature representations. We find that features with richer semantic diversity can significantly improve the open-set performance under the same uncertainty scores. In this paper,…

    Submitted 25 March, 2023; originally announced March 2023.

    Comments: To appear at CVPR2023

  15. arXiv:2303.02982  [pdf, other]

    cs.CV

    CLIP-guided Prototype Modulating for Few-shot Action Recognition

    Authors: Xiang Wang, Shiwei Zhang, Jun Cen, Changxin Gao, Yingya Zhang, Deli Zhao, Nong Sang

    Abstract: Learning from large-scale contrastive language-image pre-training like CLIP has shown remarkable success in a wide range of downstream tasks recently, but it is still under-explored on the challenging few-shot action recognition (FSAR) task. In this work, we aim to transfer the powerful multimodal knowledge of CLIP to alleviate the inaccurate prototype estimation issue due to data scarcity, which…

    Submitted 6 March, 2023; originally announced March 2023.

    Comments: This work has been submitted to the Springer for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible

  16. arXiv:2302.04002  [pdf, other]

    cs.CV

    The Devil is in the Wrongly-classified Samples: Towards Unified Open-set Recognition

    Authors: Jun Cen, Di Luan, Shiwei Zhang, Yixuan Pei, Yingya Zhang, Deli Zhao, Shaojie Shen, Qifeng Chen

    Abstract: Open-set Recognition (OSR) aims to identify test samples whose classes are not seen during the training process. Recently, Unified Open-set Recognition (UOSR) has been proposed to reject not only unknown samples but also known but wrongly classified samples, which tends to be more practical in real-world applications. UOSR has drawn little attention since it was proposed, but we find sometimes it i…

    Submitted 8 February, 2023; originally announced February 2023.

    Comments: Accepted by ICLR 2023

  17. arXiv:2211.00833  [pdf, other]

    cs.CV

    Learning a Condensed Frame for Memory-Efficient Video Class-Incremental Learning

    Authors: Yixuan Pei, Zhiwu Qing, Jun Cen, Xiang Wang, Shiwei Zhang, Yaxiong Wang, Mingqian Tang, Nong Sang, Xueming Qian

    Abstract: Recent incremental learning for action recognition usually stores representative videos to mitigate catastrophic forgetting. However, only a few bulky videos can be stored due to the limited memory. To address this problem, we propose FrameMaker, a memory-efficient video class-incremental learning approach that learns to produce a condensed frame for each selected video. Specifically, FrameMaker i…

    Submitted 1 November, 2022; originally announced November 2022.

    Comments: NeurIPS 2022

  18. arXiv:2207.01755  [pdf, ps, other]

    cs.CV

    Attention Guided Network for Salient Object Detection in Optical Remote Sensing Images

    Authors: Yuhan Lin, Han Sun, Ningzhong Liu, Yetong Bian, Jun Cen, Huiyu Zhou

    Abstract: Due to the extreme complexity of scale and shape as well as the uncertainty of the predicted location, salient object detection in optical remote sensing images (RSI-SOD) is a very difficult task. The existing SOD methods can satisfy the detection performance for natural scene images, but they are not well adapted to RSI-SOD due to the above-mentioned image characteristics in remote sensing images…

    Submitted 4 July, 2022; originally announced July 2022.

    Comments: Accepted by ICANN2022. The code is available at https://github.com/NuaaYH/AGNet

  19. arXiv:2207.01452  [pdf, other]

    cs.CV cs.RO

    Open-world Semantic Segmentation for LIDAR Point Clouds

    Authors: Jun Cen, Peng Yun, Shiwei Zhang, Junhao Cai, Di Luan, Michael Yu Wang, Ming Liu, Mingqian Tang

    Abstract: Current methods for LIDAR semantic segmentation are not robust enough for real-world applications, e.g., autonomous driving, since they are closed-set and static. The closed-set assumption makes the network only able to output labels of trained classes, even for objects never seen before, while a static network cannot update its knowledge base according to what it has seen. Therefore, in this work, w…

    Submitted 4 July, 2022; originally announced July 2022.

    Comments: Accepted by ECCV 2022. arXiv admin note: text overlap with arXiv:2011.10033, arXiv:2109.05441 by other authors

  20. arXiv:2207.01223  [pdf, other]

    cs.CV

    A Survey on Label-efficient Deep Image Segmentation: Bridging the Gap between Weak Supervision and Dense Prediction

    Authors: Wei Shen, Zelin Peng, Xuehui Wang, Huayu Wang, Jiazhong Cen, Dongsheng Jiang, Lingxi Xie, Xiaokang Yang, Qi Tian

    Abstract: The rapid development of deep learning has made great progress in image segmentation, one of the fundamental tasks of computer vision. However, the current segmentation algorithms mostly rely on the availability of pixel-level annotations, which are often expensive, tedious, and laborious. To alleviate this burden, the past years have witnessed increasing attention to building label-efficient…

    Submitted 15 February, 2023; v1 submitted 4 July, 2022; originally announced July 2022.

    Comments: Accepted to IEEE TPAMI

  21. arXiv:2205.08959  [pdf, other]

    cs.CV

    A lightweight multi-scale context network for salient object detection in optical remote sensing images

    Authors: Yuhan Lin, Han Sun, Ningzhong Liu, Yetong Bian, Jun Cen, Huiyu Zhou

    Abstract: Due to the more dramatic multi-scale variations and more complicated foregrounds and backgrounds in optical remote sensing images (RSIs), the salient object detection (SOD) for optical RSIs becomes a huge challenge. However, different from natural scene images (NSIs), the discussion on the optical RSI SOD task still remains scarce. In this paper, we propose a multi-scale context network, namely MS…

    Submitted 18 May, 2022; originally announced May 2022.

    Comments: Accepted by ICPR2022. Source code: https://github.com/NuaaYH/MSCNet

  22. arXiv:2112.01135  [pdf, other]

    cs.CV

    Open-set 3D Object Detection

    Authors: Jun Cen, Peng Yun, Junhao Cai, Michael Yu Wang, Ming Liu

    Abstract: 3D object detection has been widely studied in recent years, especially for robot perception systems. However, existing 3D object detection is under a closed-set condition, meaning that the network can only output boxes of trained classes. Unfortunately, this closed-set condition is not robust enough for practical use, as it will identify unknown objects as known by mistake. Therefore, in this pap…

    Submitted 2 December, 2021; originally announced December 2021.

    Comments: Received by 3DV 2021

  23. arXiv:2111.15463  [pdf, other]

    cs.CV

    Consensus Synergizes with Memory: A Simple Approach for Anomaly Segmentation in Urban Scenes

    Authors: Jiazhong Cen, Zenkun Jiang, Lingxi Xie, Qi Tian, Xiaokang Yang, Wei Shen

    Abstract: Anomaly segmentation is a crucial task for safety-critical applications, such as autonomous driving in urban scenes, where the goal is to detect out-of-distribution (OOD) objects with categories which are unseen during training. The core challenge of this task is how to distinguish hard in-distribution samples from OOD samples, which has not been explicitly discussed yet. In this paper, we propose…

    Submitted 24 November, 2021; originally announced November 2021.

  24. arXiv:2108.04562  [pdf, other]

    cs.CV

    Deep Metric Learning for Open World Semantic Segmentation

    Authors: Jun Cen, Peng Yun, Junhao Cai, Michael Yu Wang, Ming Liu

    Abstract: Classical closed-set semantic segmentation networks have limited ability to detect out-of-distribution (OOD) objects, which is important for safety-critical applications such as autonomous driving. Incrementally learning these OOD objects with few annotations is an ideal way to enlarge the knowledge base of the deep learning models. In this paper, we propose an open world semantic segmentation syst…

    Submitted 10 August, 2021; originally announced August 2021.

    Comments: Accepted by ICCV2021

  25. MPI: Multi-receptive and Parallel Integration for Salient Object Detection

    Authors: Han Sun, Jun Cen, Ningzhong Liu, Dong Liang, Huiyu Zhou

    Abstract: The semantic representation of deep features is essential for image context understanding, and effective fusion of features with different semantic representations can significantly improve the model's performance on salient object detection. In this paper, a novel method called MPI is proposed for salient object detection. Firstly, a multi-receptive enhancement module (MRE) is designed to effecti…

    Submitted 8 August, 2021; originally announced August 2021.

    Comments: 10 pages, 10 figures, accepted by IET Image Processing, code: https://github.com/NuaaCJ/MPI

    Journal ref: IET Image Processing, 2021

  26. arXiv:2108.00397  [pdf, other]

    cs.CV cs.RO

    BORM: Bayesian Object Relation Model for Indoor Scene Recognition

    Authors: Liguang Zhou, Jun Cen, Xingchao Wang, Zhenglong Sun, Tin Lun Lam, Yangsheng Xu

    Abstract: Scene recognition is a fundamental task in robotic perception. For human beings, scene recognition is reasonable because they have abundant object knowledge of the real world. The idea of transferring prior object knowledge from humans to scene recognition is significant but still less exploited. In this paper, we propose to utilize meaningful object representations for indoor scene representation…

    Submitted 1 August, 2021; originally announced August 2021.

    Comments: 8 pages, 5 figures, conference, Accepted by IROS2021

    Journal ref: IROS2021

  27. arXiv:2101.10799  [pdf, other]

    eess.IV cs.CV cs.LG

    ImageCHD: A 3D Computed Tomography Image Dataset for Classification of Congenital Heart Disease

    Authors: Xiaowei Xu, Tianchen Wang, Jian Zhuang, Haiyun Yuan, Meiping Huang, Jianzheng Cen, Qianjun Jia, Yuhao Dong, Yiyu Shi

    Abstract: Congenital heart disease (CHD) is the most common type of birth defect, occurring in 1 of every 110 births in the United States. CHD usually comes with severe variations in heart structure and great artery connections that can be classified into many types. Thus, highly specialized domain knowledge and a time-consuming human process are needed to analyze the associated medical images. On the other…

    Submitted 11 May, 2021; v1 submitted 26 January, 2021; originally announced January 2021.

    Comments: 11 pages, 6 figures, 2 tables, published at MICCAI 2020. The diagnosis info of the dataset is updated (thanks to the help of Kadirbarut from Bilgiuzayi)