Showing 1–50 of 106 results for author: Long, X

  1. arXiv:2407.09833  [pdf, other]

    cs.CV

    LiveHPS++: Robust and Coherent Motion Capture in Dynamic Free Environment

    Authors: Yiming Ren, Xiao Han, Yichen Yao, Xiaoxiao Long, Yujing Sun, Yuexin Ma

    Abstract: LiDAR-based human motion capture has garnered significant interest in recent years for its practicability in large-scale and unconstrained environments. However, most methods rely on cleanly segmented human point clouds as input, and the accuracy and smoothness of their motion results are compromised when faced with noisy data, rendering them unsuitable for practical applications. To address these lim…

    Submitted 13 July, 2024; originally announced July 2024.

    Comments: Accepted by ECCV 2024

  2. arXiv:2407.08571  [pdf, other]

    cs.AI cs.IR cs.IT cs.LG stat.ML

    Multi-Group Proportional Representation

    Authors: Alex Oesterling, Claudio Mayrink Verdun, Carol Xuan Long, Alex Glynn, Lucas Monteiro Paes, Sajani Vithana, Martina Cardone, Flavio P. Calmon

    Abstract: Image search and retrieval tasks can perpetuate harmful stereotypes, erase cultural identities, and amplify social disparities. Current approaches to mitigate these representational harms balance the number of retrieved items across population groups defined by a small number of (often binary) attributes. However, most existing methods overlook intersectional groups determined by combinations of g…

    Submitted 11 July, 2024; originally announced July 2024.

    Comments: 35 pages, 24 figures. Under review

  3. arXiv:2407.00492  [pdf, ps, other]

    cs.LG stat.CO stat.ML

    Fast Gibbs sampling for the local and global trend Bayesian exponential smoothing model

    Authors: Xueying Long, Daniel F. Schmidt, Christoph Bergmeir, Slawek Smyl

    Abstract: In Smyl et al. [Local and global trend Bayesian exponential smoothing models. International Journal of Forecasting, 2024.], a generalised exponential smoothing model was proposed that is able to capture strong trends and volatility in time series. This method achieved state-of-the-art performance in many forecasting tasks, but its fitting procedure, which is based on the NUTS sampler, is very comp…

    Submitted 29 June, 2024; originally announced July 2024.

  4. arXiv:2406.09867  [pdf, other]

    cs.CV

    Rethinking the Evaluation of Out-of-Distribution Detection: A Sorites Paradox

    Authors: Xingming Long, Jie Zhang, Shiguang Shan, Xilin Chen

    Abstract: Most existing out-of-distribution (OOD) detection benchmarks classify samples with novel labels as the OOD data. However, some marginal OOD samples actually have close semantic contents to the in-distribution (ID) sample, which makes determining the OOD sample a Sorites Paradox. In this paper, we construct a benchmark named Incremental Shift OOD (IS-OOD) to address the issue, in which we divide th…

    Submitted 14 June, 2024; originally announced June 2024.

    Comments: v1

  5. arXiv:2406.07648  [pdf, other]

    cs.CV

    M-LRM: Multi-view Large Reconstruction Model

    Authors: Mengfei Li, Xiaoxiao Long, Yixun Liang, Weiyu Li, Yuan Liu, Peng Li, Xiaowei Chi, Xingqun Qi, Wei Xue, Wenhan Luo, Qifeng Liu, Yike Guo

    Abstract: Despite recent advancements in the Large Reconstruction Model (LRM) demonstrating impressive results, when extending its input from single image to multiple images, it exhibits inefficiencies, subpar geometric and texture quality, as well as slower convergence speed than expected. This is attributed to the fact that LRM formulates 3D reconstruction as a naive images-to-3D translation problem, ignoring the…

    Submitted 11 June, 2024; originally announced June 2024.

  6. arXiv:2406.01467  [pdf, other]

    cs.GR cs.CV

    RaDe-GS: Rasterizing Depth in Gaussian Splatting

    Authors: Baowen Zhang, Chuan Fang, Rakesh Shrestha, Yixun Liang, Xiaoxiao Long, Ping Tan

    Abstract: Gaussian Splatting (GS) has proven to be highly effective in novel view synthesis, achieving high-quality and real-time rendering. However, its potential for reconstructing detailed 3D shapes has not been fully explored. Existing methods often suffer from limited shape accuracy due to the discrete and unstructured nature of Gaussian splats, which complicates the shape extraction. While recent tech…

    Submitted 24 June, 2024; v1 submitted 3 June, 2024; originally announced June 2024.

  7. arXiv:2405.20327  [pdf, other]

    cs.CV

    GECO: Generative Image-to-3D within a SECOnd

    Authors: Chen Wang, Jiatao Gu, Xiaoxiao Long, Yuan Liu, Lingjie Liu

    Abstract: 3D generation has seen remarkable progress in recent years. Existing techniques, such as score distillation methods, produce notable results but require extensive per-scene optimization, impacting time efficiency. Alternatively, reconstruction-based approaches prioritize efficiency but compromise quality due to their limited handling of uncertainty. We introduce GECO, a novel method for high-quali…

    Submitted 30 May, 2024; originally announced May 2024.

    Comments: Project Page: https://cwchenwang.github.io/geco

  8. arXiv:2405.17705  [pdf, other]

    cs.CV

    DC-Gaussian: Improving 3D Gaussian Splatting for Reflective Dash Cam Videos

    Authors: Linhan Wang, Kai Cheng, Shuo Lei, Shengkun Wang, Wei Yin, Chenyang Lei, Xiaoxiao Long, Chang-Tien Lu

    Abstract: We present DC-Gaussian, a new method for generating novel views from in-vehicle dash cam videos. While neural rendering techniques have made significant strides in driving scenarios, existing methods are primarily designed for videos collected by autonomous vehicles. However, these videos are limited in both quantity and diversity compared to dash cam videos, which are more widely used across vari…

    Submitted 29 May, 2024; v1 submitted 27 May, 2024; originally announced May 2024.

    Comments: 9 pages, 7 figures; project page: https://linhanwang.github.io/dcgaussian/

  9. arXiv:2405.16888  [pdf, other]

    cs.GR cs.CV

    Part123: Part-aware 3D Reconstruction from a Single-view Image

    Authors: Anran Liu, Cheng Lin, Yuan Liu, Xiaoxiao Long, Zhiyang Dou, Hao-Xiang Guo, Ping Luo, Wenping Wang

    Abstract: Recently, the emergence of diffusion models has opened up new opportunities for single-view reconstruction. However, all the existing methods represent the target object as a closed mesh devoid of any structural information, thus neglecting the part-based structure, which is crucial for many downstream applications, of the reconstructed shape. Moreover, the generated meshes usually suffer from lar…

    Submitted 27 May, 2024; originally announced May 2024.

    Comments: Accepted to SIGGRAPH 2024 (conference track), webpage: https://liuar0512.github.io/part123_official_page/

  10. arXiv:2405.14979  [pdf, other]

    cs.GR cs.CV

    CraftsMan: High-fidelity Mesh Generation with 3D Native Generation and Interactive Geometry Refiner

    Authors: Weiyu Li, Jiarui Liu, Rui Chen, Yixun Liang, Xuelin Chen, Ping Tan, Xiaoxiao Long

    Abstract: We present a novel generative 3D modeling system, coined CraftsMan, which can generate high-fidelity 3D geometries with highly varied shapes, regular mesh topologies, and detailed surfaces, and, notably, allows for refining the geometry in an interactive manner. Despite the significant advancements in 3D generation, existing methods still struggle with lengthy optimization processes, irregular mes…

    Submitted 23 May, 2024; originally announced May 2024.

    Comments: HomePage: https://craftsman3d.github.io/, Code: https://github.com/wyysf-98/CraftsMan

  11. arXiv:2405.11616  [pdf, other]

    cs.CV

    Era3D: High-Resolution Multiview Diffusion using Efficient Row-wise Attention

    Authors: Peng Li, Yuan Liu, Xiaoxiao Long, Feihu Zhang, Cheng Lin, Mengfei Li, Xingqun Qi, Shanghang Zhang, Wenhan Luo, Ping Tan, Wenping Wang, Qifeng Liu, Yike Guo

    Abstract: In this paper, we introduce Era3D, a novel multiview diffusion method that generates high-resolution multiview images from a single-view image. Despite significant advancements in multiview generation, existing methods still suffer from camera prior mismatch, inefficacy, and low resolution, resulting in poor-quality multiview images. Specifically, these methods assume that the input images should…

    Submitted 29 May, 2024; v1 submitted 19 May, 2024; originally announced May 2024.

  12. arXiv:2405.04589  [pdf, other]

    cs.CV cs.RO

    A Novel Wide-Area Multiobject Detection System with High-Probability Region Searching

    Authors: Xianlei Long, Hui Zhao, Chao Chen, Fuqiang Gu, Qingyi Gu

    Abstract: In recent years, wide-area visual surveillance systems have been widely applied in various industrial and transportation scenarios. These systems, however, face significant challenges when implementing multi-object detection due to conflicts arising from the need for high-resolution imaging, efficient object searching, and accurate localization. To address these challenges, this paper presents a h…

    Submitted 7 May, 2024; originally announced May 2024.

    Comments: Accepted by ICRA 2024

    Journal ref: 2024 IEEE International Conference on Robotics and Automation (ICRA)

  13. arXiv:2404.15506  [pdf, other]

    cs.CV

    Metric3D v2: A Versatile Monocular Geometric Foundation Model for Zero-shot Metric Depth and Surface Normal Estimation

    Authors: Mu Hu, Wei Yin, Chi Zhang, Zhipeng Cai, Xiaoxiao Long, Hao Chen, Kaixuan Wang, Gang Yu, Chunhua Shen, Shaojie Shen

    Abstract: We introduce Metric3D v2, a geometric foundation model for zero-shot metric depth and surface normal estimation from a single image, which is crucial for metric 3D recovery. While depth and normal are geometrically related and highly complementary, they present distinct challenges. SoTA monocular depth methods achieve zero-shot generalization by learning affine-invariant depths, which cannot recov…

    Submitted 21 March, 2024; originally announced April 2024.

    Comments: Our project page is at https://JUGGHM.github.io/Metric3Dv2. arXiv admin note: substantial text overlap with arXiv:2307.10984

  14. arXiv:2404.12149  [pdf, other]

    cs.AI

    AccidentBlip2: Accident Detection With Multi-View MotionBlip2

    Authors: Yihua Shao, Hongyi Cai, Xinwei Long, Weiyi Lang, Zhe Wang, Haoran Wu, Yan Wang, Jiayi Yin, Yang Yang, Yisheng Lv, Zhen Lei

    Abstract: Intelligent vehicles have demonstrated excellent capabilities in many transportation scenarios. The inference capabilities of neural networks using cameras limit the accuracy of accident detection in complex transportation systems. This paper presents AccidentBlip2, a pure vision-based multi-modal large model Blip2 for accident detection. Our method first processes the multi-view images through Vi…

    Submitted 7 May, 2024; v1 submitted 18 April, 2024; originally announced April 2024.

  15. arXiv:2404.12000  [pdf, other]

    cs.SE

    How far are AI-powered programming assistants from meeting developers' needs?

    Authors: Xin Tan, Xiao Long, Xianjun Ni, Yinghao Zhu, Jing Jiang, Li Zhang

    Abstract: Recent In-IDE AI coding assistant tools (ACATs) like GitHub Copilot have significantly impacted developers' coding habits. While some studies have examined their effectiveness, in-depth investigation into the actual assistance process is lacking. To bridge this gap, we simulate real development scenarios encompassing three typical types of software development tasks and recruit 27 computer scienc…

    Submitted 24 April, 2024; v1 submitted 18 April, 2024; originally announced April 2024.

  16. arXiv:2404.06395  [pdf, other]

    cs.CL cs.LG

    MiniCPM: Unveiling the Potential of Small Language Models with Scalable Training Strategies

    Authors: Shengding Hu, Yuge Tu, Xu Han, Chaoqun He, Ganqu Cui, Xiang Long, Zhi Zheng, Yewei Fang, Yuxiang Huang, Weilin Zhao, Xinrong Zhang, Zheng Leng Thai, Kaihuo Zhang, Chongyi Wang, Yuan Yao, Chenyang Zhao, Jie Zhou, Jie Cai, Zhongwu Zhai, Ning Ding, Chao Jia, Guoyang Zeng, Dahai Li, Zhiyuan Liu, Maosong Sun

    Abstract: The burgeoning interest in developing Large Language Models (LLMs) with up to a trillion parameters has been met with concerns regarding resource efficiency and practical expense, particularly given the immense cost of experimentation. This scenario underscores the importance of exploring the potential of Small Language Models (SLMs) as a resource-efficient alternative. In this context, we introduce…

    Submitted 3 June, 2024; v1 submitted 9 April, 2024; originally announced April 2024.

    Comments: revise according to peer review

  17. arXiv:2403.19589  [pdf, other]

    cs.CV

    TOD3Cap: Towards 3D Dense Captioning in Outdoor Scenes

    Authors: Bu Jin, Yupeng Zheng, Pengfei Li, Weize Li, Yuhang Zheng, Sujie Hu, Xinyu Liu, Jinwei Zhu, Zhijie Yan, Haiyang Sun, Kun Zhan, Peng Jia, Xiaoxiao Long, Yilun Chen, Hao Zhao

    Abstract: 3D dense captioning stands as a cornerstone in achieving a comprehensive understanding of 3D scenes through natural language. It has recently witnessed remarkable achievements, particularly in indoor settings. However, the exploration of 3D dense captioning in outdoor scenes is hindered by two major challenges: 1) the domain gap between indoor and outdoor scenes, such as dynamics and sparse visual…

    Submitted 5 June, 2024; v1 submitted 28 March, 2024; originally announced March 2024.

    Comments: Code, data, and models are publicly available at https://github.com/jxbbb/TOD3Cap

  18. arXiv:2403.13307  [pdf, other]

    cs.CV

    LaserHuman: Language-guided Scene-aware Human Motion Generation in Free Environment

    Authors: Peishan Cong, Ziyi Wang, Zhiyang Dou, Yiming Ren, Wei Yin, Kai Cheng, Yujing Sun, Xiaoxiao Long, Xinge Zhu, Yuexin Ma

    Abstract: Language-guided scene-aware human motion generation has great significance for entertainment and robotics. In response to the limitations of existing datasets, we introduce LaserHuman, a pioneering dataset engineered to revolutionize Scene-Text-to-Motion research. LaserHuman stands out with its inclusion of genuine human motions within 3D environments, unbounded free-form natural language descript…

    Submitted 21 March, 2024; v1 submitted 20 March, 2024; originally announced March 2024.

  19. arXiv:2403.12013  [pdf, other]

    cs.CV

    GeoWizard: Unleashing the Diffusion Priors for 3D Geometry Estimation from a Single Image

    Authors: Xiao Fu, Wei Yin, Mu Hu, Kaixuan Wang, Yuexin Ma, Ping Tan, Shaojie Shen, Dahua Lin, Xiaoxiao Long

    Abstract: We introduce GeoWizard, a new generative foundation model designed for estimating geometric attributes, e.g., depth and normals, from single images. While significant research has already been conducted in this area, the progress has been substantially limited by the low diversity and poor quality of publicly available datasets. As a result, the prior works either are constrained to limited scenar…

    Submitted 18 March, 2024; originally announced March 2024.

    Comments: Project page: https://fuxiao0719.github.io/projects/geowizard/

  20. arXiv:2403.09637  [pdf, other]

    cs.RO cs.CV

    GaussianGrasper: 3D Language Gaussian Splatting for Open-vocabulary Robotic Grasping

    Authors: Yuhang Zheng, Xiangyu Chen, Yupeng Zheng, Songen Gu, Runyi Yang, Bu Jin, Pengfei Li, Chengliang Zhong, Zengmao Wang, Lina Liu, Chao Yang, Dawei Wang, Zhen Chen, Xiaoxiao Long, Meiqing Wang

    Abstract: Constructing a 3D scene capable of accommodating open-ended language queries is a pivotal pursuit, particularly within the domain of robotics. Such technology facilitates robots in executing object manipulations based on human language directives. To tackle this challenge, some research efforts have been dedicated to the development of language-embedded implicit fields. However, implicit fields (…

    Submitted 14 March, 2024; originally announced March 2024.

  21. arXiv:2403.08766  [pdf, other]

    cs.CV

    MonoOcc: Digging into Monocular Semantic Occupancy Prediction

    Authors: Yupeng Zheng, Xiang Li, Pengfei Li, Yuhang Zheng, Bu Jin, Chengliang Zhong, Xiaoxiao Long, Hao Zhao, Qichao Zhang

    Abstract: Monocular Semantic Occupancy Prediction aims to infer the complete 3D geometry and semantic information of scenes from only 2D images. It has garnered significant attention, particularly due to its potential to enhance the 3D perception of autonomous vehicles. However, existing methods rely on a complex cascaded framework with relatively limited information to restore 3D scenes, including a depend…

    Submitted 13 March, 2024; originally announced March 2024.

    Comments: Accepted by ICRA 2024

  22. arXiv:2403.05090  [pdf, other]

    cs.RO

    OCEAN: An Openspace Collision-free Trajectory Planner for Autonomous Parking Based on ADMM

    Authors: Dongxu Wang, Yanbin Lu, Weilong Liu, Hao Zuo, Jiade Xin, Xiang Long, Yuncheng Jiang

    Abstract: In this paper, we propose an Openspace Collision-freE trAjectory plaNner (OCEAN) for autonomous parking. OCEAN is an optimization-based trajectory planner accelerated by Alternating Direction Method of Multiplier (ADMM) with enhanced computational efficiency and robustness, and is suitable for all scenes with few dynamic obstacles. Starting from a hierarchical optimization-based collision avoidanc…

    Submitted 8 March, 2024; originally announced March 2024.

    Comments: 8 pages, 5 figures

  23. arXiv:2402.14650  [pdf, other]

    cs.CV

    GaussianPro: 3D Gaussian Splatting with Progressive Propagation

    Authors: Kai Cheng, Xiaoxiao Long, Kaizhi Yang, Yao Yao, Wei Yin, Yuexin Ma, Wenping Wang, Xuejin Chen

    Abstract: The advent of 3D Gaussian Splatting (3DGS) has recently brought about a revolution in the field of neural rendering, facilitating high-quality renderings at real-time speed. However, 3DGS heavily depends on the initialized point cloud produced by Structure-from-Motion (SfM) techniques. When tackling large-scale scenes that unavoidably contain texture-less surfaces, the SfM techniques always f…

    Submitted 22 February, 2024; originally announced February 2024.

    Comments: See the project page for code, data: https://kcheng1021.github.io/gaussianpro.github.io

  24. arXiv:2402.05869  [pdf, other]

    cs.CV

    Adaptive Surface Normal Constraint for Geometric Estimation from Monocular Images

    Authors: Xiaoxiao Long, Yuhang Zheng, Yupeng Zheng, Beiwen Tian, Cheng Lin, Lingjie Liu, Hao Zhao, Guyue Zhou, Wenping Wang

    Abstract: We introduce a novel approach to learn geometries such as depth and surface normal from images while incorporating geometric context. The difficulty of reliably capturing geometric context in existing methods impedes their ability to accurately enforce the consistency between the different geometric properties, thereby leading to a bottleneck of geometric estimation quality. We therefore propose t…

    Submitted 31 March, 2024; v1 submitted 8 February, 2024; originally announced February 2024.

    Comments: Accepted by TPAMI. arXiv admin note: substantial text overlap with arXiv:2103.15483

  25. arXiv:2401.12946  [pdf, other]

    cs.CV cs.CG cs.GR

    Coverage Axis++: Efficient Inner Point Selection for 3D Shape Skeletonization

    Authors: Zimeng Wang, Zhiyang Dou, Rui Xu, Cheng Lin, Yuan Liu, Xiaoxiao Long, Shiqing Xin, Taku Komura, Xiaoming Yuan, Wenping Wang

    Abstract: We introduce Coverage Axis++, a novel and efficient approach to 3D shape skeletonization. The current state-of-the-art approaches for this task often rely on the watertightness of the input or suffer from substantial computational costs, thereby limiting their practicality. To address this challenge, Coverage Axis++ proposes a heuristic algorithm to select skeletal points, offering a high-accuracy…

    Submitted 10 June, 2024; v1 submitted 23 January, 2024; originally announced January 2024.

    Comments: SGP2024. Project Page: https://frank-zy-dou.github.io/projects/CoverageAxis++/index.html

  26. arXiv:2401.09006  [pdf, other]

    cs.CV

    Generalized Face Liveness Detection via De-spoofing Face Generator

    Authors: Xingming Long, Shiguang Shan, Jie Zhang

    Abstract: Previous Face Anti-spoofing (FAS) works face the challenge of generalizing in unseen domains. One of the major problems is that most existing FAS datasets are relatively small and lack data diversity. However, we find that there are numerous real faces that can be easily achieved under various conditions, which are neglected by previous FAS works. In this paper, we conduct an Anomalous cue Guided…

    Submitted 17 January, 2024; originally announced January 2024.

    Comments: v1

  27. arXiv:2401.08206  [pdf, other]

    cs.IR cs.CL

    Generative Multi-Modal Knowledge Retrieval with Large Language Models

    Authors: Xinwei Long, Jiali Zeng, Fandong Meng, Zhiyuan Ma, Kaiyan Zhang, Bowen Zhou, Jie Zhou

    Abstract: Knowledge retrieval with multi-modal queries plays a crucial role in supporting knowledge-intensive multi-modal applications. However, existing methods face challenges in terms of their effectiveness and training efficiency, especially when it comes to training and integrating multiple retrievers to handle multi-modal queries. In this paper, we propose an innovative end-to-end generative framework…

    Submitted 16 January, 2024; originally announced January 2024.

    Comments: Accepted to AAAI 2024

  28. arXiv:2312.12789  [pdf, other]

    eess.IV cs.CV cs.LG

    SLP-Net:An efficient lightweight network for segmentation of skin lesions

    Authors: Bo Yang, Hong Peng, Chenggang Guo, Xiaohui Luo, Jun Wang, Xianzhong Long

    Abstract: Prompt treatment for melanoma is crucial. To assist physicians in identifying lesion areas precisely and quickly, we propose a novel skin lesion segmentation technique, namely SLP-Net, an ultra-lightweight segmentation network based on the spiking neural P (SNP) systems type mechanism. Most existing convolutional neural networks achieve high segmentation accuracy while neglecting the high hard…

    Submitted 4 January, 2024; v1 submitted 20 December, 2023; originally announced December 2023.

  29. arXiv:2311.17977  [pdf, other]

    cs.CV

    GaussianShader: 3D Gaussian Splatting with Shading Functions for Reflective Surfaces

    Authors: Yingwenqi Jiang, Jiadong Tu, Yuan Liu, Xifeng Gao, Xiaoxiao Long, Wenping Wang, Yuexin Ma

    Abstract: The advent of neural 3D Gaussians has recently brought about a revolution in the field of neural rendering, facilitating the generation of high-quality renderings at real-time speeds. However, the explicit and discrete representation encounters challenges when applied to scenes featuring reflective surfaces. In this paper, we present GaussianShader, a novel method that applies a simplified shading…

    Submitted 29 November, 2023; originally announced November 2023.

    Comments: 13 pages, 11 figures, references added

  30. arXiv:2311.17050  [pdf, other]

    cs.CV cs.GR

    Surf-D: Generating High-Quality Surfaces of Arbitrary Topologies Using Diffusion Models

    Authors: Zhengming Yu, Zhiyang Dou, Xiaoxiao Long, Cheng Lin, Zekun Li, Yuan Liu, Norman Müller, Taku Komura, Marc Habermann, Christian Theobalt, Xin Li, Wenping Wang

    Abstract: We present Surf-D, a novel method for generating high-quality 3D shapes as Surfaces with arbitrary topologies using Diffusion models. Previous methods explored shape generation with different representations and they suffer from limited topologies and poor geometry details. To generate high-quality surfaces of arbitrary topologies, we use the Unsigned Distance Field (UDF) as our surface representa…

    Submitted 22 March, 2024; v1 submitted 28 November, 2023; originally announced November 2023.

    Comments: Project Page: https://yzmblog.github.io/projects/SurfD/

  31. arXiv:2311.16945  [pdf, other]

    cs.CV

    UC-NeRF: Neural Radiance Field for Under-Calibrated Multi-view Cameras in Autonomous Driving

    Authors: Kai Cheng, Xiaoxiao Long, Wei Yin, Jin Wang, Zhiqiang Wu, Yuexin Ma, Kaixuan Wang, Xiaozhi Chen, Xuejin Chen

    Abstract: Multi-camera setups find widespread use across various applications, such as autonomous driving, as they greatly expand sensing capabilities. Despite the fast development of Neural radiance field (NeRF) techniques and their wide applications in both indoor and outdoor scenes, applying NeRF to multi-camera systems remains very challenging. This is primarily due to the inherent under-calibration iss…

    Submitted 10 December, 2023; v1 submitted 28 November, 2023; originally announced November 2023.

    Comments: See the project page for code, data: https://kcheng1021.github.io/ucnerf.github.io

  32. arXiv:2311.02995  [pdf, other]

    cs.CV cs.GR

    Zero-Shot Enhancement of Low-Light Image Based on Retinex Decomposition

    Authors: Wenchao Li, Bangshu Xiong, Qiaofeng Ou, Xiaoyun Long, Jinhao Zhu, Jiabao Chen, Shuyuan Wen

    Abstract: Two difficulties make low-light image enhancement a challenging task. First, it needs to consider not only luminance restoration but also image contrast, image denoising and color distortion issues simultaneously. Second, the effectiveness of existing low-light enhancement methods depends on paired or unpaired training data with poor generalization performance. To solve these difficult pr…

    Submitted 6 November, 2023; originally announced November 2023.

    Comments: 16 pages, 66 figures, TCSVT

  33. arXiv:2311.02378  [pdf]

    cs.CR cs.AI eess.SY

    MTS-DVGAN: Anomaly Detection in Cyber-Physical Systems using a Dual Variational Generative Adversarial Network

    Authors: Haili Sun, Yan Huang, Lansheng Han, Cai Fu, Hongle Liu, Xiang Long

    Abstract: Deep generative models are promising in detecting novel cyber-physical attacks, mitigating the vulnerability of Cyber-physical systems (CPSs) without relying on labeled information. Nonetheless, these generative models face challenges in identifying attack behaviors that closely resemble normal data, or deviate from the normal data distribution but are in close proximity to the manifold of the nor…

    Submitted 4 November, 2023; originally announced November 2023.

    Comments: 27 pages, 14 figures, 8 tables. Accepted by Computers & Security

    Journal ref: Computers & Security, 2023, 103570

  34. arXiv:2311.00993  [pdf, other]

    cs.LG

    Scalable Probabilistic Forecasting in Retail with Gradient Boosted Trees: A Practitioner's Approach

    Authors: Xueying Long, Quang Bui, Grady Oktavian, Daniel F. Schmidt, Christoph Bergmeir, Rakshitha Godahewa, Seong Per Lee, Kaifeng Zhao, Paul Condylis

    Abstract: The recent M5 competition has advanced the state-of-the-art in retail forecasting. However, we notice important differences between the competition challenge and the challenges we face in a large e-commerce company. The datasets in our scenario are larger (hundreds of thousands of time series), and e-commerce can afford to have a larger assortment than brick-and-mortar retailers, leading to more i…

    Submitted 2 November, 2023; originally announced November 2023.

  35. arXiv:2311.00562  [pdf, other]

    cs.CV

    MNN: Mixed Nearest-Neighbors for Self-Supervised Learning

    Authors: Xianzhong Long, Chen Peng, Yun Li

    Abstract: In contrastive self-supervised learning, positive samples are typically drawn from the same image but in different augmented views, resulting in a relatively limited source of positive samples. An effective way to alleviate this problem is to incorporate the relationship between samples, which involves including the top-K nearest neighbors of positive samples. However, the problem of false neighbo…

    Submitted 13 November, 2023; v1 submitted 1 November, 2023; originally announced November 2023.

    Comments: 31 pages, 7 figures, source code and pretrained models are available https://github.com/pc-cp/MNN

  36. arXiv:2311.00358  [pdf, other]

    cs.CV cs.AI

    Rethinking Samples Selection for Contrastive Learning: Mining of Potential Samples

    Authors: Hengkui Dong, Xianzhong Long, Yun Li

    Abstract: Contrastive learning predicts whether two images belong to the same category by training a model to make their feature representations as close or as far away as possible. In this paper, we rethink how to mine samples in contrastive learning. Unlike other methods, our approach is more comprehensive, taking into account both positive and negative samples, and mining potential samples from two aspec…

    Submitted 1 November, 2023; originally announced November 2023.

  37. arXiv:2310.15477  [pdf, other]

    cs.CL

    CRaSh: Clustering, Removing, and Sharing Enhance Fine-tuning without Full Large Language Model

    Authors: Kaiyan Zhang, Ning Ding, Biqing Qi, Xuekai Zhu, Xinwei Long, Bowen Zhou

    Abstract: Instruction tuning has recently been recognized as an effective way of aligning Large Language Models (LLMs) to enhance their generalization ability across various tasks. However, when tuning publicly accessible, centralized LLMs with private instruction data, privacy concerns are inevitable. While direct transfer of parameterized modules between models is a plausible approach to address this, its…

    Submitted 23 October, 2023; originally announced October 2023.

    Comments: Accepted to EMNLP 2023 (Main Conference)

  38. arXiv:2310.15008  [pdf, other]

    cs.CV

    Wonder3D: Single Image to 3D using Cross-Domain Diffusion

    Authors: Xiaoxiao Long, Yuan-Chen Guo, Cheng Lin, Yuan Liu, Zhiyang Dou, Lingjie Liu, Yuexin Ma, Song-Hai Zhang, Marc Habermann, Christian Theobalt, Wenping Wang

    Abstract: In this work, we introduce Wonder3D, a novel method for efficiently generating high-fidelity textured meshes from single-view images. Recent methods based on Score Distillation Sampling (SDS) have shown the potential to recover 3D geometry from 2D diffusion priors, but they typically suffer from time-consuming per-shape optimization and inconsistent geometry. In contrast, certain works directly pro…

    Submitted 8 November, 2023; v1 submitted 23 October, 2023; originally announced October 2023.

    Comments: Project page: https://www.xxlong.site/Wonder3D/

  39. DxPU: Large Scale Disaggregated GPU Pools in the Datacenter

    Authors: Bowen He, Xiao Zheng, Yuan Chen, Weinan Li, Yajin Zhou, Xin Long, Pengcheng Zhang, Xiaowei Lu, Linquan Jiang, Qiang Liu, Dennis Cai, Xiantao Zhang

    Abstract: The rapid adoption of AI and convenience offered by cloud services have resulted in the growing demands for GPUs in the cloud. Generally, GPUs are physically attached to host servers as PCIe devices. However, the fixed assembly combination of host servers and GPUs is extremely inefficient in resource utilization, upgrade, and maintenance. Due to these issues, the GPU disaggregation technique has b…

    Submitted 6 October, 2023; originally announced October 2023.

    Comments: 23 pages, 6 figures, published in ACM Transactions on Architecture and Code Optimization

  40. arXiv:2309.13950  [pdf, other]

    cs.LG

    Local and Global Trend Bayesian Exponential Smoothing Models

    Authors: Slawek Smyl, Christoph Bergmeir, Alexander Dokumentov, Xueying Long, Erwin Wibowo, Daniel Schmidt

    Abstract: This paper describes a family of seasonal and non-seasonal time series models that can be viewed as generalisations of additive and multiplicative exponential smoothing models, to model series that grow faster than linear but slower than exponential. Their development is motivated by fast-growing, volatile time series. In particular, our models have a global trend that can smoothly change from add…

    Submitted 21 March, 2024; v1 submitted 25 September, 2023; originally announced September 2023.

    arXiv:2309.03453  [pdf, other]

    cs.CV cs.AI cs.GR

    SyncDreamer: Generating Multiview-consistent Images from a Single-view Image

    Authors: Yuan Liu, Cheng Lin, Zijiao Zeng, Xiaoxiao Long, Lingjie Liu, Taku Komura, Wenping Wang

    Abstract: In this paper, we present a novel diffusion model called SyncDreamer that generates multiview-consistent images from a single-view image. Using pretrained large-scale 2D diffusion models, recent work Zero123 demonstrates the ability to generate plausible novel views from a single-view image of an object. However, maintaining consistency in geometry and colors for the generated images remains a challenge. To a…

    Submitted 15 April, 2024; v1 submitted 6 September, 2023; originally announced September 2023.

    Comments: ICLR 2024 Spotlight. Project page: https://liuyuan-pal.github.io/SyncDreamer/ Code: https://github.com/liuyuan-pal/SyncDreamer

    arXiv:2306.15930  [pdf, other]

    cs.CV

    Multi-network Contrastive Learning Based on Global and Local Representations

    Authors: Weiquan Li, Xianzhong Long, Yun Li

    Abstract: The popularity of self-supervised learning has made it possible to train models without relying on labeled data, which saves expensive annotation costs. However, most existing self-supervised contrastive learning methods often overlook the combination of global and local feature information. This paper proposes a multi-network contrastive learning framework based on global and local representation…

    Submitted 30 July, 2023; v1 submitted 28 June, 2023; originally announced June 2023.

    arXiv:2306.09425  [pdf, other]

    cs.LG cs.CY cs.IT

    Arbitrariness Lies Beyond the Fairness-Accuracy Frontier

    Authors: Carol Xuan Long, Hsiang Hsu, Wael Alghamdi, Flavio P. Calmon

    Abstract: Machine learning tasks may admit multiple competing models that achieve similar performance yet produce conflicting outputs for individual samples -- a phenomenon known as predictive multiplicity. We demonstrate that fairness interventions in machine learning optimized solely for group fairness and accuracy can exacerbate predictive multiplicity. Consequently, state-of-the-art fairness interventio…

    Submitted 15 June, 2023; originally announced June 2023.

    arXiv:2305.17398  [pdf, other]

    cs.CV cs.GR

    NeRO: Neural Geometry and BRDF Reconstruction of Reflective Objects from Multiview Images

    Authors: Yuan Liu, Peng Wang, Cheng Lin, Xiaoxiao Long, Jiepeng Wang, Lingjie Liu, Taku Komura, Wenping Wang

    Abstract: We present a neural rendering-based method called NeRO for reconstructing the geometry and the BRDF of reflective objects from multiview images captured in an unknown environment. Multiview reconstruction of reflective objects is extremely challenging because specular reflections are view-dependent and thus violate the multiview consistency, which is the cornerstone for most multiview reconstructi…

    Submitted 27 May, 2023; originally announced May 2023.

    Comments: Accepted to SIGGRAPH 2023. Project page: https://liuyuan-pal.github.io/NeRO/ Codes: https://github.com/liuyuan-pal/NeRO

    arXiv:2305.15245  [pdf, other]

    cs.NE

    Challenges of ELA-guided Function Evolution using Genetic Programming

    Authors: Fu Xing Long, Diederick Vermetten, Anna V. Kononova, Roman Kalkreuth, Kaifeng Yang, Thomas Bäck, Niki van Stein

    Abstract: Within the optimization community, the question of how to generate new optimization problems has been gaining traction in recent years. Within topics such as instance space analysis (ISA), the generation of new problems can provide new benchmarks which are not yet explored in existing research. Beyond that, this function generation can also be exploited for solving complex real-world optimization…

    Submitted 24 May, 2023; originally announced May 2023.

    arXiv:2305.13888  [pdf, other]

    cs.CL

    PaD: Program-aided Distillation Can Teach Small Models Reasoning Better than Chain-of-thought Fine-tuning

    Authors: Xuekai Zhu, Biqing Qi, Kaiyan Zhang, Xinwei Long, Zhouhan Lin, Bowen Zhou

    Abstract: While large language models (LLMs) excel in various natural language processing tasks, their huge size and the inaccessibility of parameters present challenges for practical deployment. Previous studies try to distill task-specific ability from LLMs to smaller models, using data synthesis and chain-of-thought (CoT) fine-tuning. However, synthetic CoT data often contains faulty reasoning, which det…

    Submitted 20 March, 2024; v1 submitted 23 May, 2023; originally announced May 2023.

    Comments: NAACL 2024 Long Paper; Code and data are available at https://github.com/Xuekai-Zhu/pad

  47. MSVQ: Self-Supervised Learning with Multiple Sample Views and Queues

    Authors: Chen Peng, Xianzhong Long, Yun Li

    Abstract: Self-supervised methods based on contrastive learning have achieved great success in unsupervised visual representation learning. However, most methods under this framework suffer from the problem of false negative samples. Inspired by the mean shift for self-supervised learning, we propose a new simple framework, namely Multiple Sample Views and Queues (MSVQ). We jointly construct three soft labe…

    Submitted 17 November, 2023; v1 submitted 9 May, 2023; originally announced May 2023.

    Comments: Accepted in KBS (Knowledge-Based Systems)

    arXiv:2304.02971  [pdf, other]

    cs.CV cs.LG

    Synthetic Hard Negative Samples for Contrastive Learning

    Authors: Hengkui Dong, Xianzhong Long, Yun Li, Lei Chen

    Abstract: Contrastive learning has emerged as an essential approach for self-supervised learning in visual representation learning. The central objective of contrastive learning is to maximize the similarities between two augmented versions of an image (positive pairs), while minimizing the similarities between different images (negative pairs). Recent studies have demonstrated that harder negative samples,…

    Submitted 17 April, 2023; v1 submitted 6 April, 2023; originally announced April 2023.

    arXiv:2304.01219  [pdf, other]

    math.OC cs.AI cs.LG cs.NE

    DoE2Vec: Deep-learning Based Features for Exploratory Landscape Analysis

    Authors: Bas van Stein, Fu Xing Long, Moritz Frenzel, Peter Krause, Markus Gitterle, Thomas Bäck

    Abstract: We propose DoE2Vec, a variational autoencoder (VAE)-based methodology to learn optimization landscape characteristics for downstream meta-learning tasks, e.g., automated selection of optimization algorithms. Principally, using large training data sets generated with a random function generator, DoE2Vec self-learns an informative latent representation for any design of experiments (DoE). Unlike the…

    Submitted 31 March, 2023; originally announced April 2023.

    arXiv:2303.11219  [pdf, other]

    cs.CV cs.AI

    NeTO: Neural Reconstruction of Transparent Objects with Self-Occlusion Aware Refraction-Tracing

    Authors: Zongcheng Li, Xiaoxiao Long, Yusen Wang, Tuo Cao, Wenping Wang, Fei Luo, Chunxia Xiao

    Abstract: We present a novel method, called NeTO, for capturing 3D geometry of solid transparent objects from 2D images via volume rendering. Reconstructing transparent objects is a very challenging task, which is ill-suited for general-purpose reconstruction techniques due to the specular light transport phenomena. Although existing refraction-tracing based methods, designed specially for this task, achiev…

    Submitted 8 September, 2023; v1 submitted 20 March, 2023; originally announced March 2023.