Showing 1–50 of 71 results for author: Jia, P

  1. arXiv:2407.05256  [pdf, other]

    cs.CV cs.AI

    Unlocking Textual and Visual Wisdom: Open-Vocabulary 3D Object Detection Enhanced by Comprehensive Guidance from Text and Image

    Authors: Pengkun Jiao, Na Zhao, Jingjing Chen, Yu-Gang Jiang

    Abstract: Open-vocabulary 3D object detection (OV-3DDet) aims to localize and recognize both seen and previously unseen object categories within any new 3D scene. While language and vision foundation models have achieved success in handling various open-vocabulary tasks with abundant training data, OV-3DDet faces a significant challenge due to the limited availability of training data. Although some pioneer…

    Submitted 7 July, 2024; originally announced July 2024.

  2. arXiv:2406.06216  [pdf, other]

    cs.CV

    Lighting Every Darkness with 3DGS: Fast Training and Real-Time Rendering for HDR View Synthesis

    Authors: Xin Jin, Pengyi Jiao, Zheng-Peng Duan, Xingchao Yang, Chun-Le Guo, Bo Ren, Chongyi Li

    Abstract: Volumetric rendering based methods, like NeRF, excel in HDR view synthesis from RAW images, especially for nighttime scenes. While, they suffer from long training times and cannot perform real-time rendering due to dense sampling requirements. The advent of 3D Gaussian Splatting (3DGS) enables real-time rendering and faster training. However, implementing RAW image-based view synthesis directly usi…

    Submitted 10 June, 2024; originally announced June 2024.

  3. arXiv:2406.02147  [pdf, other]

    cs.CV

    UA-Track: Uncertainty-Aware End-to-End 3D Multi-Object Tracking

    Authors: Lijun Zhou, Tao Tang, Pengkun Hao, Zihang He, Kalok Ho, Shuo Gu, Wenbo Hou, Zhihui Hao, Haiyang Sun, Kun Zhan, Peng Jia, Xianpeng Lang, Xiaodan Liang

    Abstract: 3D multiple object tracking (MOT) plays a crucial role in autonomous driving perception. Recent end-to-end query-based trackers simultaneously detect and track objects, which have shown promising potential for the 3D MOT task. However, existing methods overlook the uncertainty issue, which refers to the lack of precise confidence about the state and location of tracked objects. Uncertainty arises…

    Submitted 4 June, 2024; originally announced June 2024.

  4. arXiv:2406.01349  [pdf, other]

    cs.CV

    Unleashing Generalization of End-to-End Autonomous Driving with Controllable Long Video Generation

    Authors: Enhui Ma, Lijun Zhou, Tao Tang, Zhan Zhang, Dong Han, Junpeng Jiang, Kun Zhan, Peng Jia, Xianpeng Lang, Haiyang Sun, Di Lin, Kaicheng Yu

    Abstract: Using generative models to synthesize new data has become a de-facto standard in autonomous driving to address the data scarcity issue. Though existing approaches are able to boost perception models, we discover that these approaches fail to improve the performance of planning of end-to-end autonomous driving models as the generated videos are usually less than 8 frames and the spatial and tempora…

    Submitted 6 June, 2024; v1 submitted 3 June, 2024; originally announced June 2024.

    Comments: Project Page: https://westlake-autolab.github.io/delphi.github.io/, 8 figures

  5. arXiv:2405.14702  [pdf, other]

    cs.CV cs.AI

    G3: An Effective and Adaptive Framework for Worldwide Geolocalization Using Large Multi-Modality Models

    Authors: Pengyue Jia, Yiding Liu, Xiaopeng Li, Xiangyu Zhao, Yuhao Wang, Yantong Du, Xiao Han, Xuetao Wei, Shuaiqiang Wang, Dawei Yin

    Abstract: Worldwide geolocalization aims to locate the precise location at the coordinate level of photos taken anywhere on the Earth. It is very challenging due to 1) the difficulty of capturing subtle location-aware visual semantics, and 2) the heterogeneous geographical distribution of image data. As a result, existing studies have clear limitations when scaled to a worldwide context. They may easily con…

    Submitted 23 May, 2024; originally announced May 2024.

  6. arXiv:2405.10890  [pdf, other]

    astro-ph.IM astro-ph.GA cs.AI

    A Versatile Framework for Analyzing Galaxy Image Data by Implanting Human-in-the-loop on a Large Vision Model

    Authors: Mingxiang Fu, Yu Song, Jiameng Lv, Liang Cao, Peng Jia, Nan Li, Xiangru Li, Jifeng Liu, A-Li Luo, Bo Qiu, Shiyin Shen, Liangping Tu, Lili Wang, Shoulin Wei, Haifeng Yang, Zhenping Yi, Zhiqiang Zou

    Abstract: The exponential growth of astronomical datasets provides an unprecedented opportunity for humans to gain insight into the Universe. However, effectively analyzing this vast amount of data poses a significant challenge. Astronomers are turning to deep learning techniques to address this, but the methods are limited by their specific training sets, leading to considerable duplicate workloads too. He…

    Submitted 17 May, 2024; originally announced May 2024.

    Comments: 26 pages, 10 figures, to be published on Chinese Physics C

  7. arXiv:2405.03408  [pdf, other]

    astro-ph.IM astro-ph.SR cs.CV

    An Image Quality Evaluation and Masking Algorithm Based On Pre-trained Deep Neural Networks

    Authors: Peng Jia, Yu Song, Jiameng Lv, Runyu Ning

    Abstract: With the growing amount of astronomical data, there is an increasing need for automated data processing pipelines, which can extract scientific information from observation data without human interventions. A critical aspect of these pipelines is the image quality evaluation and masking algorithm, which evaluates image qualities based on various factors such as cloud coverage, sky brightness, scat…

    Submitted 6 May, 2024; originally announced May 2024.

    Comments: Accepted by the AJ. The code could be downloaded from: https://nadc.china-vo.org/res/r101415/ with DOI of: 10.12149/101415

  8. arXiv:2405.02008  [pdf, other]

    cs.CV

    DiffMap: Enhancing Map Segmentation with Map Prior Using Diffusion Model

    Authors: Peijin Jia, Tuopu Wen, Ziang Luo, Mengmeng Yang, Kun Jiang, Zhiquan Lei, Xuewei Tang, Ziyuan Liu, Le Cui, Kehua Sheng, Bo Zhang, Diange Yang

    Abstract: Constructing high-definition (HD) maps is a crucial requirement for enabling autonomous driving. In recent years, several map segmentation algorithms have been developed to address this need, leveraging advancements in Bird's-Eye View (BEV) perception. However, existing models still encounter challenges in producing realistic and consistent semantic map layouts. One prominent issue is the limited…

    Submitted 3 May, 2024; originally announced May 2024.

  9. arXiv:2404.17820  [pdf, other]

    cs.RO cs.AI cs.LG

    Motion planning for off-road autonomous driving based on human-like cognition and weight adaptation

    Authors: Yuchun Wang, Cheng Gong, Jianwei Gong, Peng Jia

    Abstract: Driving in an off-road environment is challenging for autonomous vehicles due to the complex and varied terrain. To ensure stable and efficient travel, the vehicle requires consideration and balancing of environmental factors, such as undulations, roughness, and obstacles, to generate optimal trajectories that can adapt to changing scenarios. However, traditional motion planners often utilize a fi…

    Submitted 27 April, 2024; originally announced April 2024.

    Journal ref: Journal of Field Robotics,2024,1-22

  10. arXiv:2404.01780  [pdf, other]

    astro-ph.IM astro-ph.GA cs.CV

    CSST Strong Lensing Preparation: a Framework for Detecting Strong Lenses in the Multi-color Imaging Survey by the China Survey Space Telescope (CSST)

    Authors: Xu Li, Ruiqi Sun, Jiameng Lv, Peng Jia, Nan Li, Chengliang Wei, Zou Hu, Xinzhong Er, Yun Chen, Zhang Ban, Yuedong Fang, Qi Guo, Dezi Liu, Guoliang Li, Lin Lin, Ming Li, Ran Li, Xiaobo Li, Yu Luo, Xianmin Meng, Jundan Nie, Zhaoxiang Qi, Yisheng Qiu, Li Shao, Hao Tian, et al. (7 additional authors not shown)

    Abstract: Strong gravitational lensing is a powerful tool for investigating dark matter and dark energy properties. With the advent of large-scale sky surveys, we can discover strong lensing systems on an unprecedented scale, which requires efficient tools to extract them from billions of astronomical objects. The existing mainstream lens-finding tools are based on machine learning algorithms and applied to…

    Submitted 2 April, 2024; originally announced April 2024.

    Comments: The paper is accepted by the AJ. The complete code could be downloaded with DOI of: 10.12149/101393. Comments are welcome

  11. arXiv:2403.19589  [pdf, other]

    cs.CV

    TOD3Cap: Towards 3D Dense Captioning in Outdoor Scenes

    Authors: Bu Jin, Yupeng Zheng, Pengfei Li, Weize Li, Yuhang Zheng, Sujie Hu, Xinyu Liu, Jinwei Zhu, Zhijie Yan, Haiyang Sun, Kun Zhan, Peng Jia, Xiaoxiao Long, Yilun Chen, Hao Zhao

    Abstract: 3D dense captioning stands as a cornerstone in achieving a comprehensive understanding of 3D scenes through natural language. It has recently witnessed remarkable achievements, particularly in indoor settings. However, the exploration of 3D dense captioning in outdoor scenes is hindered by two major challenges: 1) the domain gap between indoor and outdoor scenes, such as dynamics and sparse visual…

    Submitted 5 June, 2024; v1 submitted 28 March, 2024; originally announced March 2024.

    Comments: Code, data, and models are publicly available at https://github.com/jxbbb/TOD3Cap

  12. arXiv:2403.17297  [pdf, other]

    cs.CL cs.AI

    InternLM2 Technical Report

    Authors: Zheng Cai, Maosong Cao, Haojiong Chen, Kai Chen, Keyu Chen, Xin Chen, Xun Chen, Zehui Chen, Zhi Chen, Pei Chu, Xiaoyi Dong, Haodong Duan, Qi Fan, Zhaoye Fei, Yang Gao, Jiaye Ge, Chenya Gu, Yuzhe Gu, Tao Gui, Aijia Guo, Qipeng Guo, Conghui He, Yingfan Hu, Ting Huang, Tao Jiang, et al. (75 additional authors not shown)

    Abstract: The evolution of Large Language Models (LLMs) like ChatGPT and GPT-4 has sparked discussions on the advent of Artificial General Intelligence (AGI). However, replicating such advancements in open-source models has been challenging. This paper introduces InternLM2, an open-source LLM that outperforms its predecessors in comprehensive evaluations across 6 dimensions and 30 benchmarks, long-context m…

    Submitted 25 March, 2024; originally announced March 2024.

  13. arXiv:2403.12660  [pdf, other]

    cs.IR cs.AI

    ERASE: Benchmarking Feature Selection Methods for Deep Recommender Systems

    Authors: Pengyue Jia, Yejing Wang, Zhaocheng Du, Xiangyu Zhao, Yichao Wang, Bo Chen, Wanyu Wang, Huifeng Guo, Ruiming Tang

    Abstract: Deep Recommender Systems (DRS) are increasingly dependent on a large number of feature fields for more precise recommendations. Effective feature selection methods are consequently becoming critical for further enhancing the accuracy and optimizing storage efficiencies to align with the deployment demands. This research area, particularly in the context of DRS, is nascent and faces three core chal…

    Submitted 19 June, 2024; v1 submitted 19 March, 2024; originally announced March 2024.

    Comments: Accepted to KDD 2024

  14. arXiv:2403.12398  [pdf, other]

    cs.NI

    Hierarchical Digital Twin for Efficient 6G Network Orchestration via Adaptive Attribute Selection and Scalable Network Modeling

    Authors: Pengyi Jia, Xianbin Wang, Xuemin Shen

    Abstract: Achieving a holistic and long-term understanding through accurate network modeling is essential for orchestrating future networks with increasing service diversity and infrastructure complexities. However, due to unselective data collection and uniform processing, traditional modeling approaches undermine the efficacy and timeliness of network orchestration. Additionally, temporal disparities aris…

    Submitted 18 March, 2024; originally announced March 2024.

  15. arXiv:2403.10206  [pdf, other]

    astro-ph.IM astro-ph.SR cs.CV physics.ins-det physics.optics

    A Data-Driven Approach for Mitigating Dark Current Noise and Bad Pixels in Complementary Metal Oxide Semiconductor Cameras for Space-based Telescopes

    Authors: Peng Jia, Chao Lv, Yushan Li, Yongyang Sun, Shu Niu, Zhuoxiao Wang

    Abstract: In recent years, there has been a gradual increase in the performance of Complementary Metal Oxide Semiconductor (CMOS) cameras. These cameras have gained popularity as a viable alternative to charge-coupled device (CCD) cameras in a wide range of applications. One particular application is the CMOS camera installed in small space telescopes. However, the limited power and spatial resources availa…

    Submitted 15 March, 2024; originally announced March 2024.

    Comments: Accepted by the AJ, comments are welcome. The complete code could be downloaded from: DOI: 10.12149/101387

  16. arXiv:2402.17487  [pdf, other]

    cs.CV cs.LG eess.IV

    Bit Rate Matching Algorithm Optimization in JPEG-AI Verification Model

    Authors: Panqi Jia, A. Burakhan Koyuncu, Jue Mao, Ze Cui, Yi Ma, Tiansheng Guo, Timofey Solovyev, Alexander Karabutov, Yin Zhao, Jing Wang, Elena Alshina, Andre Kaup

    Abstract: The research on neural network (NN) based image compression has shown superior performance compared to classical compression frameworks. Unlike the hand-engineered transforms in the classical frameworks, NN-based models learn the non-linear transforms providing more compact bit representations, and achieve faster coding speed on parallel devices over their classical counterparts. Those properties…

    Submitted 27 February, 2024; originally announced February 2024.

    Comments: Accepted at (IEEE) PCS 2024; 6 pages

  17. arXiv:2402.17470  [pdf, other]

    cs.CV cs.LG eess.IV

    Bit Distribution Study and Implementation of Spatial Quality Map in the JPEG-AI Standardization

    Authors: Panqi Jia, Jue Mao, Esin Koyuncu, A. Burakhan Koyuncu, Timofey Solovyev, Alexander Karabutov, Yin Zhao, Elena Alshina, Andre Kaup

    Abstract: Currently, there is a high demand for neural network-based image compression codecs. These codecs employ non-linear transforms to create compact bit representations and facilitate faster coding speeds on devices compared to the hand-crafted transforms used in classical frameworks. The scientific and industrial communities are highly interested in these properties, leading to the standardization ef…

    Submitted 27 February, 2024; originally announced February 2024.

    Comments: 5 pages, 3 figures, 4 tables

  18. arXiv:2402.12289  [pdf, other]

    cs.CV

    DriveVLM: The Convergence of Autonomous Driving and Large Vision-Language Models

    Authors: Xiaoyu Tian, Junru Gu, Bailin Li, Yicheng Liu, Yang Wang, Zhiyong Zhao, Kun Zhan, Peng Jia, Xianpeng Lang, Hang Zhao

    Abstract: A primary hurdle of autonomous driving in urban environments is understanding complex and long-tail scenarios, such as challenging road conditions and delicate human behaviors. We introduce DriveVLM, an autonomous driving system leveraging Vision-Language Models (VLMs) for enhanced scene understanding and planning capabilities. DriveVLM integrates a unique combination of reasoning modules for scen…

    Submitted 25 June, 2024; v1 submitted 19 February, 2024; originally announced February 2024.

    Comments: Project Page: https://tsinghua-mars-lab.github.io/DriveVLM/

  19. arXiv:2402.09325  [pdf, other]

    cs.CV cs.RO

    PC-NeRF: Parent-Child Neural Radiance Fields Using Sparse LiDAR Frames in Autonomous Driving Environments

    Authors: Xiuzhong Hu, Guangming Xiong, Zheng Zang, Peng Jia, Yuxuan Han, Junyi Ma

    Abstract: Large-scale 3D scene reconstruction and novel view synthesis are vital for autonomous vehicles, especially utilizing temporally sparse LiDAR frames. However, conventional explicit representations remain a significant bottleneck towards representing the reconstructed and synthetic scenes at unlimited resolution. Although the recently developed neural radiance fields (NeRF) have shown compelling res…

    Submitted 14 February, 2024; originally announced February 2024.

    Comments: arXiv admin note: substantial text overlap with arXiv:2310.00874

  20. arXiv:2401.01065  [pdf, other]

    cs.CV cs.AI

    BEV-TSR: Text-Scene Retrieval in BEV Space for Autonomous Driving

    Authors: Tao Tang, Dafeng Wei, Zhengyu Jia, Tian Gao, Changwei Cai, Chengkai Hou, Peng Jia, Kun Zhan, Haiyang Sun, Jingchen Fan, Yixing Zhao, Fu Liu, Xiaodan Liang, Xianpeng Lang, Yang Wang

    Abstract: The rapid development of the autonomous driving industry has led to a significant accumulation of autonomous driving data. Consequently, there comes a growing demand for retrieving data to provide specialized optimization. However, directly applying previous image retrieval methods faces several challenges, such as the lack of global feature representation and inadequate text retrieval ability for…

    Submitted 18 June, 2024; v1 submitted 2 January, 2024; originally announced January 2024.

  21. arXiv:2312.16108  [pdf, other]

    cs.CV

    LaneSegNet: Map Learning with Lane Segment Perception for Autonomous Driving

    Authors: Tianyu Li, Peijin Jia, Bangjun Wang, Li Chen, Kun Jiang, Junchi Yan, Hongyang Li

    Abstract: A map, as crucial information for downstream applications of an autonomous driving system, is usually represented in lanelines or centerlines. However, existing literature on map learning primarily focuses on either detecting geometry-based lanelines or perceiving topology relationships of centerlines. Both of these methods ignore the intrinsic relationship of lanelines and centerlines, that lanel…

    Submitted 26 February, 2024; v1 submitted 26 December, 2023; originally announced December 2023.

    Comments: Accepted in ICLR 2024

  22. arXiv:2312.15450  [pdf, other]

    cs.IR

    Agent4Ranking: Semantic Robust Ranking via Personalized Query Rewriting Using Multi-agent LLM

    Authors: Xiaopeng Li, Lixin Su, Pengyue Jia, Xiangyu Zhao, Suqi Cheng, Junfeng Wang, Dawei Yin

    Abstract: Search engines are crucial as they provide an efficient and easy way to access vast amounts of information on the internet for diverse information needs. User queries, even with a specific need, can differ significantly. Prior research has explored the resilience of ranking models against typical query variations like paraphrasing, misspellings, and order changes. Yet, these works overlook how div…

    Submitted 24 December, 2023; originally announced December 2023.

  23. arXiv:2311.18214  [pdf, other]

    astro-ph.IM astro-ph.GA astro-ph.SR cs.CV physics.optics

    Perception of Misalignment States for Sky Survey Telescopes with the Digital Twin and the Deep Neural Networks

    Authors: Miao Zhang, Peng Jia, Zhengyang Li, Wennan Xiang, Jiameng Lv, Rui Sun

    Abstract: Sky survey telescopes play a critical role in modern astronomy, but misalignment of their optical elements can introduce significant variations in point spread functions, leading to reduced data quality. To address this, we need a method to obtain misalignment states, aiding in the reconstruction of accurate point spread functions for data processing methods or facilitating adjustments of optical…

    Submitted 29 November, 2023; originally announced November 2023.

    Comments: The aforementioned submission has been accepted by Optics Express. We kindly request any feedback or comments to be directed to the corresponding author, Peng Jia (robinmartin20@gmail.com), or the second corresponding author, Zhengyang Li (lizy@niaot.ac.cn). Please note that Zhengyang is currently stationed in the South Antarctica and will not be available until after February 1st, 2024

  24. arXiv:2311.16974  [pdf, other]

    cs.CV

    COLE: A Hierarchical Generation Framework for Multi-Layered and Editable Graphic Design

    Authors: Peidong Jia, Chenxuan Li, Yuhui Yuan, Zeyu Liu, Yichao Shen, Bohan Chen, Xingru Chen, Yinglin Zheng, Dong Chen, Ji Li, Xiaodong Xie, Shanghang Zhang, Baining Guo

    Abstract: Graphic design, which has been evolving since the 15th century, plays a crucial role in advertising. The creation of high-quality designs demands design-oriented planning, reasoning, and layer-wise generation. Unlike the recent CanvaGPT, which integrates GPT-4 with existing design templates to build a custom GPT, this paper introduces the COLE system - a hierarchical generation framework designed…

    Submitted 18 March, 2024; v1 submitted 28 November, 2023; originally announced November 2023.

    Comments: Technical report. Project page: https://graphic-design-generation-github-io.vercel.app/

  25. arXiv:2311.15643  [pdf, other]

    cs.RO

    A Survey on Monocular Re-Localization: From the Perspective of Scene Map Representation

    Authors: Jinyu Miao, Kun Jiang, Tuopu Wen, Yunlong Wang, Peijing Jia, Xuhe Zhao, Qian Cheng, Zhongyang Xiao, Jin Huang, Zhihua Zhong, Diange Yang

    Abstract: Monocular Re-Localization (MRL) is a critical component in autonomous applications, estimating 6 degree-of-freedom ego poses w.r.t. the scene map based on monocular images. In recent decades, significant progress has been made in the development of MRL techniques. Numerous algorithms have accomplished extraordinary success in terms of localization accuracy and robustness. In MRL, scene maps are re…

    Submitted 12 January, 2024; v1 submitted 27 November, 2023; originally announced November 2023.

    Comments: 33 pages, 10 tables, 16 figures, under review

  26. arXiv:2311.10380  [pdf, other]

    cs.CV

    MSE-Nets: Multi-annotated Semi-supervised Ensemble Networks for Improving Segmentation of Medical Image with Ambiguous Boundaries

    Authors: Shuai Wang, Tengjin Weng, Jingyi Wang, Yang Shen, Zhidong Zhao, Yixiu Liu, Pengfei Jiao, Zhiming Cheng, Yaqi Wang

    Abstract: Medical image segmentation annotations exhibit variations among experts due to the ambiguous boundaries of segmented objects and backgrounds in medical images. Although using multiple annotations for each image in the fully-supervised has been extensively studied for training deep models, obtaining a large amount of multi-annotated data is challenging due to the substantial time and manpower costs…

    Submitted 17 November, 2023; originally announced November 2023.

  27. Temporal Graph Representation Learning with Adaptive Augmentation Contrastive

    Authors: Hongjiang Chen, Pengfei Jiao, Huijun Tang, Huaming Wu

    Abstract: Temporal graph representation learning aims to generate low-dimensional dynamic node embeddings to capture temporal information as well as structural and property information. Current representation learning methods for temporal networks often focus on capturing fine-grained information, which may lead to the model capturing random noise instead of essential semantic information. While graph contr…

    Submitted 7 November, 2023; originally announced November 2023.

    Journal ref: Joint European Conference on Machine Learning and Knowledge Discovery in Databases, pp. 683-699. 2023

  28. arXiv:2311.00186  [pdf, other]

    astro-ph.IM astro-ph.GA astro-ph.SR cs.CV

    Image Restoration with Point Spread Function Regularization and Active Learning

    Authors: Peng Jia, Jiameng Lv, Runyu Ning, Yu Song, Nan Li, Kaifan Ji, Chenzhou Cui, Shanshan Li

    Abstract: Large-scale astronomical surveys can capture numerous images of celestial objects, including galaxies and nebulae. Analysing and processing these images can reveal intricate internal structures of these objects, allowing researchers to conduct comprehensive studies on their morphology, evolution, and physical properties. However, varying noise levels and point spread functions can hamper the accur…

    Submitted 31 October, 2023; originally announced November 2023.

    Comments: To be published in the MNRAS

  29. arXiv:2310.19056  [pdf, other]

    cs.IR cs.AI cs.CL

    MILL: Mutual Verification with Large Language Models for Zero-Shot Query Expansion

    Authors: Pengyue Jia, Yiding Liu, Xiangyu Zhao, Xiaopeng Li, Changying Hao, Shuaiqiang Wang, Dawei Yin

    Abstract: Query expansion, pivotal in search engines, enhances the representation of user information needs with additional terms. While existing methods expand queries using retrieved or generated contextual documents, each approach has notable limitations. Retrieval-based methods often fail to accurately capture search intent, particularly with brief or ambiguous queries. Generation-based methods, utilizi…

    Submitted 28 March, 2024; v1 submitted 29 October, 2023; originally announced October 2023.

    Comments: Accepted to NAACL 2024

  30. arXiv:2310.00874  [pdf, other]

    cs.CV cs.RO

    PC-NeRF: Parent-Child Neural Radiance Fields under Partial Sensor Data Loss in Autonomous Driving Environments

    Authors: Xiuzhong Hu, Guangming Xiong, Zheng Zang, Peng Jia, Yuxuan Han, Junyi Ma

    Abstract: Reconstructing large-scale 3D scenes is essential for autonomous vehicles, especially when partial sensor data is lost. Although the recently developed neural radiance fields (NeRF) have shown compelling results in implicit representations, the large-scale 3D scene reconstruction using partially lost LiDAR point cloud data still needs to be explored. To bridge this gap, we propose a novel 3D scene…

    Submitted 1 October, 2023; originally announced October 2023.

  31. arXiv:2309.03539  [pdf, other]

    cs.CV

    YOLO series target detection algorithms for underwater environments

    Authors: Chenjie Zhang, Pengcheng Jiao

    Abstract: You Only Look Once (YOLO) algorithm is a representative target detection algorithm emerging in 2016, which is known for its balance of computing speed and accuracy, and now plays an important role in various fields of human production and life. However, there are still many limitations in the application of YOLO algorithm in underwater environments due to problems such as dim light and turbid wate…

    Submitted 7 September, 2023; originally announced September 2023.

  32. arXiv:2307.16289  [pdf]

    cs.CV cs.AI

    Implementing Edge Based Object Detection For Microplastic Debris

    Authors: Amardeep Singh, Prof. Charles Jia, Prof. Donald Kirk

    Abstract: Plastic has imbibed itself as an indispensable part of our day to day activities, becoming a source of problems due to its non-biodegradable nature and cheaper production prices. With these problems, comes the challenge of mitigating and responding to the aftereffects of disposal or the lack of proper disposal which leads to waste concentrating in locations and disturbing ecosystems for both plant…

    Submitted 30 July, 2023; originally announced July 2023.

  33. arXiv:2307.00313  [pdf, other]

    cs.CV

    PM-DETR: Domain Adaptive Prompt Memory for Object Detection with Transformers

    Authors: Peidong Jia, Jiaming Liu, Senqiao Yang, Jiarui Wu, Xiaodong Xie, Shanghang Zhang

    Abstract: The Transformer-based detectors (i.e., DETR) have demonstrated impressive performance on end-to-end object detection. However, transferring DETR to different data distributions may lead to a significant performance degradation. Existing adaptation techniques focus on model-based approaches, which aim to leverage feature alignment to narrow the distribution shift between different domains. In this…

    Submitted 1 July, 2023; originally announced July 2023.

    Comments: cs.cv

    MSC Class: 68T07 ACM Class: I.5.1

  34. arXiv:2306.14287  [pdf, other]

    eess.IV cs.CV cs.LG

    Efficient Contextformer: Spatio-Channel Window Attention for Fast Context Modeling in Learned Image Compression

    Authors: A. Burakhan Koyuncu, Panqi Jia, Atanas Boev, Elena Alshina, Eckehard Steinbach

    Abstract: Entropy estimation is essential for the performance of learned image compression. It has been demonstrated that a transformer-based entropy model is of critical importance for achieving a high compression ratio, however, at the expense of a significant computational effort. In this work, we introduce the Efficient Contextformer (eContextformer) - a computationally efficient transformer-based autor…

    Submitted 27 February, 2024; v1 submitted 25 June, 2023; originally announced June 2023.

    Comments: Accepted for IEEE TCSVT (14 pages, 10 figures, 9 tables)

  35. arXiv:2306.04344  [pdf, other]

    cs.CV

    ViDA: Homeostatic Visual Domain Adapter for Continual Test Time Adaptation

    Authors: Jiaming Liu, Senqiao Yang, Peidong Jia, Renrui Zhang, Ming Lu, Yandong Guo, Wei Xue, Shanghang Zhang

    Abstract: Since real-world machine systems are running in non-stationary environments, Continual Test-Time Adaptation (CTTA) task is proposed to adapt the pre-trained model to continually changing target domains. Recently, existing methods mainly focus on model-based adaptation, which aims to leverage a self-training manner to extract the target domain knowledge. However, pseudo labels can be noisy and the…

    Submitted 27 March, 2024; v1 submitted 7 June, 2023; originally announced June 2023.

    Comments: Accepted by ICLR2024

  36. arXiv:2304.10440  [pdf, other]

    cs.CV

    OpenLane-V2: A Topology Reasoning Benchmark for Unified 3D HD Mapping

    Authors: Huijie Wang, Tianyu Li, Yang Li, Li Chen, Chonghao Sima, Zhenbo Liu, Bangjun Wang, Peijin Jia, Yuting Wang, Shengyin Jiang, Feng Wen, Hang Xu, Ping Luo, Junchi Yan, Wei Zhang, Hongyang Li

    Abstract: Accurately depicting the complex traffic scene is a vital component for autonomous vehicles to execute correct judgments. However, existing benchmarks tend to oversimplify the scene by solely focusing on lane perception tasks. Observing that human drivers rely on both lanes and traffic signals to operate their vehicles safely, we present OpenLane-V2, the first dataset on topology reasoning for tra…

    Submitted 28 October, 2023; v1 submitted 20 April, 2023; originally announced April 2023.

    Comments: Accepted by NeurIPS 2023 Track on Datasets and Benchmarks | OpenLane-V2 Dataset: https://github.com/OpenDriveLab/OpenLane-V2

  37. Bioinspired soft robotics: How do we learn from creatures?

    Authors: Yang Yang, Zhiguo He, Pengcheng Jiao, Hongliang Ren

    Abstract: Soft robotics has opened a unique path to flexibility and environmental adaptability, learning from nature and reproducing biological behaviors. Nature implies answers for how to apply robots to real life. To find out how we learn from creatures to design and apply soft robots, in this Review, we propose a classification method to summarize soft robots based on different functions of biological sy…

    Submitted 24 February, 2023; originally announced February 2023.

    Comments: 15 pages, 6 figures, journal article IEEE Reviews in biomedical engineering(2022), Early Access

  38. arXiv:2302.11747  [pdf, other]

    cs.RO

    Amos-SLAM: An Anti-Dynamics Two-stage SLAM Approach

    Authors: Yaoming Zhuang, Pengrun Jia, Zheng Liu, Li Li, Chengdong Wu, Wei cui, Zhanlin Liu

    Abstract: The traditional Simultaneous Localization And Mapping (SLAM) systems rely on the assumption of a static environment and fail to accurately estimate the system's location when dynamic objects are present in the background. While learning-based dynamic SLAM systems have difficulties in handling unknown moving objects, geometry-based methods have limited success in addressing the residual effects of…

    Submitted 22 February, 2023; originally announced February 2023.

  39. Copebot: Underwater soft robot with copepod-like locomotion

    Authors: Zhiguo He, Yang Yang, Pengcheng Jiao, Haipeng Wang, Guanzheng Lin, Thomas Pähtz

    Abstract: It has been a great challenge to develop robots that are able to perform complex movement patterns with high speed and, simultaneously, high accuracy. Copepods are animals found in freshwater and saltwater habitats that can have extremely fast escape responses when a predator is sensed by performing explosive curved jumps. Here, we present a design and build prototypes of a combustion-driven under…

    Submitted 29 April, 2023; v1 submitted 16 February, 2023; originally announced February 2023.

    Journal ref: Soft Robotics 10 (2), 314-325 (2023)

  40. arXiv:2212.05497  [pdf]

    astro-ph.IM astro-ph.HE cs.CV

    Target Detection Framework for Lobster Eye X-Ray Telescopes with Machine Learning Algorithms

    Authors: Peng Jia, Wenbo Liu, Yuan Liu, Haiwu Pan

    Abstract: Lobster eye telescopes are ideal monitors to detect X-ray transients, because they can observe celestial objects over a wide field of view in the X-ray band. However, images obtained by lobster eye telescopes are modified by their unique point spread functions, making it hard to design a high-efficiency target detection algorithm. In this paper, we integrate several machine learning algorithms to bu…

    Submitted 11 December, 2022; originally announced December 2022.

    Comments: Accepted by ApJS. Full source code can be downloaded from the China-VO with DOI https://doi.org/10.12149/101175. A Docker version of the code can be obtained upon request from the corresponding author

  41. arXiv:2211.05972  [pdf, other]

    astro-ph.IM astro-ph.CO astro-ph.GA cs.CV

    Detection of Strongly Lensed Arcs in Galaxy Clusters with Transformers

    Authors: Peng Jia, Ruiqi Sun, Nan Li, Yu Song, Runyu Ning, Hongyan Wei, Rui Luo

    Abstract: Strong lensing in galaxy clusters probes properties of dense cores of dark matter halos in mass, studies the distant universe at flux levels and spatial resolutions otherwise unavailable, and constrains cosmological models independently. The next-generation large scale sky imaging surveys are expected to discover thousands of cluster-scale strong lenses, which would lead to unprecedented opportuni…

    Submitted 10 November, 2022; originally announced November 2022.

    Comments: Submitted to the Astronomical Journal. Source code can be obtained from PaperData, sponsored by the China-VO group, with DOI 10.12149/101172. Cloud computing resources will be made available upon request

  42. arXiv:2210.07482  [pdf, other]

    cs.CR cs.PL cs.SE

    Cargo Ecosystem Dependency-Vulnerability Knowledge Graph Construction and Vulnerability Propagation Study

    Authors: Peiyang Jia, Chengwei Liu, Hongyu Sun, Chengyi Sun, Mianxue Gu, Yang Liu, Yuqing Zhang

    Abstract: Currently, little is known about the structure of the Cargo ecosystem and the potential for vulnerability propagation. Many empirical studies generalize third-party dependency governance strategies from a single software ecosystem to other ecosystems but ignore the differences in the technical structures of different software ecosystems, making it difficult to directly generalize security governan…

    Submitted 13 October, 2022; originally announced October 2022.

  43. arXiv:2205.10199  [pdf, other]

    cs.CV cs.AI

    A Novel Underwater Image Enhancement and Improved Underwater Biological Detection Pipeline

    Authors: Zheng Liu, Yaoming Zhuang, Pengrun Jia, Chengdong Wu, Hongli Xu, Zhanlin Liu

    Abstract: For aquaculture resource evaluation and ecological environment monitoring, automatic detection and identification of marine organisms is critical. However, due to the low quality of underwater images and the characteristics of underwater organisms, a lack of abundant features may impede traditional hand-designed feature extraction approaches or CNN-based object detection algorithms, particularly…

    Submitted 20 May, 2022; originally announced May 2022.

    Comments: 14 pages, 14 figures

  44. arXiv:2202.06257  [pdf]

    cs.LG

    Fine-Grained Population Mobility Data-Based Community-Level COVID-19 Prediction Model

    Authors: Pengyue Jia, Ling Chen, Dandan Lyu

    Abstract: Predicting the number of infections in the anti-epidemic process is extremely beneficial to the government in developing anti-epidemic strategies, especially in fine-grained geographic units. Previous works focus on low spatial resolution prediction, e.g., county-level, and preprocess data to the same geographic level, which loses some useful information. In this paper, we propose a fine-grained p…

    Submitted 15 July, 2022; v1 submitted 13 February, 2022; originally announced February 2022.

    Comments: Accepted by Cybernetics and Systems

  45. arXiv:2201.06972  [pdf, other]

    cs.SI cs.AI

    Representation Learning on Heterostructures via Heterogeneous Anonymous Walks

    Authors: Xuan Guo, Pengfei Jiao, Ting Pan, Wang Zhang, Mengyu Jia, Danyang Shi, Wenjun Wang

    Abstract: Capturing structural similarity has been a hot topic in the field of network embedding recently due to its great help in understanding node functions and behaviors. However, existing works have paid much attention to learning structures on homogeneous networks, while the related study on heterogeneous networks is still a void. In this paper, we try to take the first step for representation…

    Submitted 18 January, 2022; originally announced January 2022.

    Comments: 13 pages, 6 figures, 5 tables

    MSC Class: 68T30; ACM Class: J.4.3

  46. arXiv:2112.13507  [pdf, other]

    cs.LG cs.SI

    Block Modeling-Guided Graph Convolutional Neural Networks

    Authors: Dongxiao He, Chundong Liang, Huixin Liu, Mingxiang Wen, Pengfei Jiao, Zhiyong Feng

    Abstract: The Graph Convolutional Network (GCN) has shown remarkable potential in exploring graph representations. However, the GCN aggregating mechanism fails to generalize to networks with heterophily, where most nodes have neighbors from different classes, which commonly exist in the real world. In order to make the propagation and aggregation mechanism of GCN suitable for both homophily and heterophily…

    Submitted 27 December, 2021; v1 submitted 26 December, 2021; originally announced December 2021.

    Comments: Accepted by Thirty-Sixth AAAI Conference on Artificial Intelligence (AAAI-22)

  47. Reinforcement Learning for Few-Shot Text Generation Adaptation

    Authors: Pengsen Cheng, Jinqiao Dai, Jiamiao Liu, Jiayong Liu, Peng Jia

    Abstract: Controlling a generative model to adapt to a new domain with limited samples is a difficult challenge that is receiving increasing attention. Recently, methods based on meta-learning have shown promising results for few-shot domain adaptation. However, meta-learning-based methods usually suffer from the problem of overfitting, which results in a lack of diversity in the generated texts. To avoid…

    Submitted 7 December, 2022; v1 submitted 22 November, 2021; originally announced November 2021.

    Journal ref: Neurocomputing, 2023, 126689

  48. arXiv:2111.08264  [pdf, other]

    cs.SI cs.CC

    Analysis of 5G academic Network based on graph representation learning method

    Authors: Xiaoming Li, Guangquan Xu, Wei Yu, Pengfei Jiao, Xiangyu Song

    Abstract: With the rapid development of 5th Generation Mobile Communication Technology (5G), the diverse forms of collaboration and extensive data in academic social networks constructed by 5G papers make the management and analysis of academic social networks increasingly challenging. Despite the particular success achieved by representation learning in analyzing academic and social networks, most present…

    Submitted 16 November, 2021; originally announced November 2021.

    Comments: 38 pages, 9 figures

  49. arXiv:2107.08379  [pdf, other]

    cs.SI cs.AI

    A Survey on Role-Oriented Network Embedding

    Authors: Pengfei Jiao, Xuan Guo, Ting Pan, Wang Zhang, Yulong Pei

    Abstract: Recently, Network Embedding (NE) has become one of the most attractive research topics in machine learning and data mining. NE approaches have achieved promising performance in various graph mining tasks, including link prediction and node clustering and classification. A wide variety of NE methods focus on the proximity of networks. They learn community-oriented embedding for each node, where t…

    Submitted 18 July, 2021; originally announced July 2021.

    Comments: 20 pages, 9 figures, 5 tables

    ACM Class: J.4.3

  50. arXiv:2107.07957  [pdf, other]

    cs.CL cs.AI

    Automatic Task Requirements Writing Evaluation via Machine Reading Comprehension

    Authors: Shiting Xu, Guowei Xu, Peilei Jia, Wenbiao Ding, Zhongqin Wu, Zitao Liu

    Abstract: Task requirements (TRs) writing is an important question type in Key English Test and Preliminary English Test. A TR writing question may include multiple requirements and a high-quality essay must respond to each requirement thoroughly and accurately. However, the limited teacher resources prevent students from getting detailed grading instantly. The majority of existing automatic essay scoring s…

    Submitted 15 July, 2021; originally announced July 2021.

    Comments: AIED'21: The 22nd International Conference on Artificial Intelligence in Education, 2021