Showing 1–20 of 20 results for author: Fei, X

  1. arXiv:2405.17459  [pdf]

    cs.LG cs.AI cs.CL cs.CV

    Integrating Medical Imaging and Clinical Reports Using Multimodal Deep Learning for Advanced Disease Analysis

    Authors: Ziyan Yao, Fei Lin, Sheng Chai, Weijie He, Lu Dai, Xinghui Fei

    Abstract: In this paper, an innovative multi-modal deep learning model is proposed to deeply integrate heterogeneous information from medical images and clinical reports. First, for medical images, convolutional neural networks were used to extract high-dimensional features and capture key visual information such as focal details, texture and spatial distribution. Secondly, for clinical report text, a two-w…

    Submitted 22 May, 2024; originally announced May 2024.

  2. arXiv:2404.18065  [pdf, other]

    cs.CV cs.AI

    Grounded Compositional and Diverse Text-to-3D with Pretrained Multi-View Diffusion Model

    Authors: Xiaolong Li, Jiawei Mo, Ying Wang, Chethan Parameshwara, Xiaohan Fei, Ashwin Swaminathan, CJ Taylor, Zhuowen Tu, Paolo Favaro, Stefano Soatto

    Abstract: In this paper, we propose an effective two-stage approach named Grounded-Dreamer to generate 3D assets that can accurately follow complex, compositional text prompts while achieving high fidelity by using a pre-trained multi-view diffusion model. Multi-view diffusion models, such as MVDream, have been shown to generate high-fidelity 3D assets using score distillation sampling (SDS). However, applied na…

    Submitted 28 April, 2024; originally announced April 2024.

    Comments: 9 pages, 10 figures

  3. arXiv:2403.19220  [pdf, other]

    cs.CV

    GeoAuxNet: Towards Universal 3D Representation Learning for Multi-sensor Point Clouds

    Authors: Shengjun Zhang, Xin Fei, Yueqi Duan

    Abstract: Point clouds captured by different sensors such as RGB-D cameras and LiDAR possess non-negligible domain gaps. Most existing methods design different network architectures and train separately on point clouds from various sensors. Typically, point-based methods achieve outstanding performance on evenly distributed dense point clouds from RGB-D cameras, while voxel-based methods are more efficient f…

    Submitted 28 March, 2024; originally announced March 2024.

    Comments: CVPR 2024

  4. arXiv:2403.11024  [pdf]

    cs.CV

    Fast Sparse View Guided NeRF Update for Object Reconfigurations

    Authors: Ziqi Lu, Jianbo Ye, Xiaohan Fei, Xiaolong Li, Jiawei Mo, Ashwin Swaminathan, Stefano Soatto

    Abstract: Neural Radiance Field (NeRF), as an implicit 3D scene representation, lacks inherent ability to accommodate changes made to the initial static scene. If objects are reconfigured, it is difficult to update the NeRF to reflect the new state of the scene without time-consuming data re-capturing and NeRF re-training. To address this limitation, we develop the first update method for NeRFs to physical…

    Submitted 16 March, 2024; originally announced March 2024.

  5. arXiv:2402.18780  [pdf, other]

    cs.CV

    A Quantitative Evaluation of Score Distillation Sampling Based Text-to-3D

    Authors: Xiaohan Fei, Chethan Parameshwara, Jiawei Mo, Xiaolong Li, Ashwin Swaminathan, CJ Taylor, Paolo Favaro, Stefano Soatto

    Abstract: The development of generative models that create 3D content from a text prompt has made considerable strides thanks to the use of the score distillation sampling (SDS) method on pre-trained diffusion models for image generation. However, the SDS method is also the source of several artifacts, such as the Janus problem, the misalignment between the text prompt and the generated 3D model, and 3D mod…

    Submitted 28 February, 2024; originally announced February 2024.

  6. arXiv:2308.02746  [pdf, other]

    cs.CL cs.LG

    Meta-Tsallis-Entropy Minimization: A New Self-Training Approach for Domain Adaptation on Text Classification

    Authors: Menglong Lu, Zhen Huang, Zhiliang Tian, Yunxiang Zhao, Xuanyu Fei, Dongsheng Li

    Abstract: Text classification is a fundamental task for natural language processing, and adapting text classification models across domains has broad applications. Self-training generates pseudo-examples from the model's predictions and iteratively trains on the pseudo-examples, i.e., minimizes the loss on the source domain and the Gibbs entropy on the target domain. However, Gibbs entropy is sensitive to p…

    Submitted 4 August, 2023; originally announced August 2023.

    Comments: This paper was accepted by IJCAI 2023, and the uploaded file includes 9 pages of main content (including two pages of references) plus 10 pages of appendix

  7. arXiv:2307.07756  [pdf, other]

    cs.LG cs.CR cs.SI

    Real-time Traffic Classification for 5G NSA Encrypted Data Flows With Physical Channel Records

    Authors: Xiao Fei, Philippe Martins, Jialiang Lu

    Abstract: The classification of fifth-generation New-Radio (5G-NR) mobile network traffic is an emerging topic in the field of telecommunications. It can be utilized for quality of service (QoS) management and dynamic resource allocation. However, traditional approaches such as Deep Packet Inspection (DPI) cannot be directly applied to encrypted data flows. Therefore, new real-time encrypted traffic classi…

    Submitted 15 July, 2023; originally announced July 2023.

    Comments: 6 pages, 10 figures

  8. arXiv:2307.05717  [pdf, other]

    cs.OH

    Towards Mobility Data Science (Vision Paper)

    Authors: Mohamed Mokbel, Mahmoud Sakr, Li Xiong, Andreas Züfle, Jussara Almeida, Taylor Anderson, Walid Aref, Gennady Andrienko, Natalia Andrienko, Yang Cao, Sanjay Chawla, Reynold Cheng, Panos Chrysanthis, Xiqi Fei, Gabriel Ghinita, Anita Graser, Dimitrios Gunopulos, Christian Jensen, Joon-Seok Kim, Kyoung-Sook Kim, Peer Kröger, John Krumm, Johannes Lauer, Amr Magdy, Mario Nascimento, et al. (23 additional authors not shown)

    Abstract: Mobility data captures the locations of moving objects such as humans, animals, and cars. With the availability of GPS-equipped mobile devices and other inexpensive location-tracking technologies, mobility data is collected ubiquitously. In recent years, the use of mobility data has demonstrated significant impact in various domains including traffic management, urban planning, and health sciences…

    Submitted 7 March, 2024; v1 submitted 21 June, 2023; originally announced July 2023.

    Comments: Updated to reflect the major revision for ACM Transactions on Spatial Algorithms and Systems (TSAS). This version reflects the final version accepted by ACM TSAS

  9. arXiv:2306.03727  [pdf, other]

    cs.CV cs.AI cs.LG cs.RO

    Towards Visual Foundational Models of Physical Scenes

    Authors: Chethan Parameshwara, Alessandro Achille, Matthew Trager, Xiaolong Li, Jiawei Mo, Ashwin Swaminathan, CJ Taylor, Dheera Venkatraman, Xiaohan Fei, Stefano Soatto

    Abstract: We describe a first step towards learning general-purpose visual representations of physical scenes using only image prediction as a training criterion. To do so, we first define "physical scene" and show that, even though different agents may maintain different representations of the same scene, the underlying physical scene that can be inferred is unique. Then, we show that NeRFs cannot represen…

    Submitted 6 June, 2023; originally announced June 2023.

    Comments: TLDR: Physical scenes are equivalence classes of sufficient statistics, and can be inferred uniquely by any agent measuring the same finite data; We formalize and implement an approach to representation learning that overturns "naive realism" in favor of an analytical approach of Russell and Koenderink. NeRFs cannot capture the physical scenes, but combined with Diffusion Models they can

  10. arXiv:2301.13112  [pdf, other]

    stat.ML cs.LG

    Benchmarking optimality of time series classification methods in distinguishing diffusions

    Authors: Zehong Zhang, Fei Lu, Esther Xu Fei, Terry Lyons, Yannis Kevrekidis, Tom Woolf

    Abstract: Statistical optimality benchmarking is crucial for analyzing and designing time series classification (TSC) algorithms. This study proposes to benchmark the optimality of TSC algorithms in distinguishing diffusion processes by the likelihood ratio test (LRT). The LRT is an optimal classifier by the Neyman-Pearson lemma. The LRT benchmarks are computationally efficient because the LRT does not need…

    Submitted 11 April, 2023; v1 submitted 30 January, 2023; originally announced January 2023.

    Comments: 23 pages, 8 figures

    MSC Class: 62M02; 62M10; 62M20

  11. arXiv:2208.12810  [pdf, other]

    eess.IV cs.CV cs.LG

    Riesz-Quincunx-UNet Variational Auto-Encoder for Satellite Image Denoising

    Authors: Duy H. Thai, Xiqi Fei, Minh Tri Le, Andreas Züfle, Konrad Wessels

    Abstract: Multiresolution deep learning approaches, such as the U-Net architecture, have achieved high performance in classifying and segmenting images. However, these approaches do not provide a latent image representation and cannot be used to decompose, denoise, and reconstruct image data. The U-Net and other convolutional neural network (CNN) architectures commonly use pooling to enlarge the receptive…

    Submitted 25 August, 2022; originally announced August 2022.

    Comments: Submitted to IEEE Transactions on Geoscience and Remote Sensing (TGRS)

  12. arXiv:2106.10335  [pdf, other]

    cs.CV

    Single View Physical Distance Estimation using Human Pose

    Authors: Xiaohan Fei, Henry Wang, Xiangyu Zeng, Lin Lee Cheong, Meng Wang, Joseph Tighe

    Abstract: We propose a fully automated system that simultaneously estimates the camera intrinsics, the ground plane, and physical distances between people from a single RGB image or video captured by a camera viewing a 3-D scene from a fixed vantage point. To automate camera calibration and distance estimation, we leverage priors about human pose and develop a novel direct formulation for pose-based auto-ca…

    Submitted 18 June, 2021; originally announced June 2021.

  13. arXiv:2106.03010  [pdf, other]

    cs.CV cs.LG cs.RO

    An Adaptive Framework for Learning Unsupervised Depth Completion

    Authors: Alex Wong, Xiaohan Fei, Byung-Woo Hong, Stefano Soatto

    Abstract: We present a method to infer a dense depth map from a color image and associated sparse depth measurements. Our main contribution lies in the design of an annealing process for determining co-visibility (occlusions, disocclusions) and the degree of regularization to impose on the model. We show that regularization and co-visibility are related via the fitness (residual) of model to data and both c…

    Submitted 24 August, 2021; v1 submitted 5 June, 2021; originally announced June 2021.

  14. arXiv:2103.08893  [pdf, other]

    cs.AI cs.CL

    KGSynNet: A Novel Entity Synonyms Discovery Framework with Knowledge Graph

    Authors: Yiying Yang, Xi Yin, Haiqin Yang, Xingjian Fei, Hao Peng, Kaijie Zhou, Kunfeng Lai, Jianping Shen

    Abstract: Entity synonyms discovery is crucial for entity-leveraging applications. However, existing studies suffer from several critical issues: (1) the input mentions may be out-of-vocabulary (OOV) and may come from a different semantic space of the entities; (2) the connection between mentions and entities may be hidden and cannot be established by surface matching; and (3) some entities rarely appear du…

    Submitted 1 April, 2021; v1 submitted 16 March, 2021; originally announced March 2021.

    Comments: 16 pages, 3 figures, 5 tables, in DASFAA'21

  15. arXiv:1905.08616  [pdf, other]

    cs.CV cs.AI cs.LG stat.ML

    Unsupervised Depth Completion from Visual Inertial Odometry

    Authors: Alex Wong, Xiaohan Fei, Stephanie Tsuei, Stefano Soatto

    Abstract: We describe a method to infer dense depth from camera motion and sparse depth as estimated using a visual-inertial odometry system. Unlike other scenarios using point clouds from lidar or structured light sensors, we have a few hundred to a few thousand points, insufficient to inform the topology of the scene. Our method first constructs a piecewise planar scaffolding of the scene, and then uses it t…

    Submitted 21 July, 2021; v1 submitted 14 May, 2019; originally announced May 2019.

  16. arXiv:1807.11130  [pdf, other]

    cs.CV cs.RO

    Geo-Supervised Visual Depth Prediction

    Authors: Xiaohan Fei, Alex Wong, Stefano Soatto

    Abstract: We propose using global orientation from inertial measurements, and the bias it induces on the shape of objects populating the scene, to inform visual 3D reconstruction. We test the effect of using the resulting prior in depth prediction from a single image, where the normal vectors to surfaces of objects of certain classes tend to align with gravity or be orthogonal to it. Adding such a prior to…

    Submitted 11 June, 2019; v1 submitted 29 July, 2018; originally announced July 2018.

    Comments: ICRA 2019, RA-L 2019

  17. arXiv:1806.08498  [pdf, other]

    cs.CV cs.RO

    Visual-Inertial Object Detection and Mapping

    Authors: Xiaohan Fei, Stefano Soatto

    Abstract: We present a method to populate an unknown environment with models of previously seen objects, placed in a Euclidean reference frame that is inferred causally and on-line using monocular video along with inertial sensors. The system we implement returns a sparse point cloud for the regions of the scene that are visible but not recognized as a previously seen object, and a detailed object model and…

    Submitted 23 October, 2018; v1 submitted 22 June, 2018; originally announced June 2018.

    Journal ref: ECCV 2018

  18. arXiv:1709.05470  [pdf, ps, other]

    cs.CV

    Long-Term Ensemble Learning of Visual Place Classifiers

    Authors: Xiaoxiao Fei, Kanji Tanaka, Yichu Fang, Akitaka Takayama

    Abstract: This paper addresses the problem of cross-season visual place classification (VPC) from a novel perspective of long-term map learning. Our goal is to enable transfer learning efficiently from one season to the next, at a small constant cost, and without wasting the robot's available long-term-memory by memorizing very large amounts of training data. To realize a good tradeoff between generalizatio…

    Submitted 16 September, 2017; originally announced September 2017.

    Comments: 8 pages, 9 figures, technical report

  19. arXiv:1606.03968  [pdf, other]

    cs.CV cs.AI

    Visual-Inertial-Semantic Scene Representation for 3-D Object Detection

    Authors: Jingming Dong, Xiaohan Fei, Stefano Soatto

    Abstract: We describe a system to detect objects in three-dimensional space using video and inertial sensors (accelerometer and gyrometer), ubiquitous in modern mobile platforms from phones to drones. Inertials afford the ability to impose class-specific scale priors for objects, and provide a global orientation reference. A minimal sufficient representation, the posterior of semantic (identity) and syntact…

    Submitted 17 April, 2017; v1 submitted 13 June, 2016; originally announced June 2016.

    Comments: To appear in CVPR 2017

    Report number: CSD160005

  20. arXiv:1511.06489  [pdf, other]

    cs.CV cs.RO

    A Simple Hierarchical Pooling Data Structure for Loop Closure

    Authors: Xiaohan Fei, Konstantine Tsotsos, Stefano Soatto

    Abstract: We propose a data structure obtained by hierarchically averaging bag-of-word descriptors during a sequence of views that achieves average speedups in large-scale loop closure applications ranging from 4 to 20 times on benchmark datasets. Although simple, the method works as well as sophisticated agglomerative schemes at a fraction of the cost with minimal loss of performance.

    Submitted 23 October, 2018; v1 submitted 19 November, 2015; originally announced November 2015.