Showing 1–50 of 68 results for author: Di, X

  1. arXiv:2405.08005  [pdf, other]

    math.OC cs.AI cs.GT cs.LG stat.ML

    Graphon Mean Field Games with a Representative Player: Analysis and Learning Algorithm

    Authors: Fuzhong Zhou, Chenyu Zhang, Xu Chen, Xuan Di

    Abstract: We propose a discrete time graphon game formulation on continuous state and action spaces using a representative player to study stochastic games with heterogeneous interaction among agents. This formulation admits both philosophical and mathematical advantages, compared to a widely adopted formulation using a continuum of players. We prove the existence and uniqueness of the graphon equilibrium w…

    Submitted 4 June, 2024; v1 submitted 8 May, 2024; originally announced May 2024.

    Comments: Published as a conference paper at ICML 2024

  2. arXiv:2405.03718  [pdf, other]

    cs.LG cs.AI cs.GT cs.MA

    A Single Online Agent Can Efficiently Learn Mean Field Games

    Authors: Chenyu Zhang, Xu Chen, Xuan Di

    Abstract: Mean field games (MFGs) are a promising framework for modeling the behavior of large-population systems. However, solving MFGs can be challenging due to the coupling of forward population evolution and backward agent dynamics. Typically, obtaining mean field Nash equilibria (MFNE) involves an iterative approach where the forward and backward processes are solved alternately, known as fixed-point i…

    Submitted 16 July, 2024; v1 submitted 5 May, 2024; originally announced May 2024.

    Comments: Published as a conference paper at ECAI 2024

  3. arXiv:2404.16484  [pdf, other]

    cs.CV eess.IV

    Real-Time 4K Super-Resolution of Compressed AVIF Images. AIS 2024 Challenge Survey

    Authors: Marcos V. Conde, Zhijun Lei, Wen Li, Cosmin Stejerean, Ioannis Katsavounidis, Radu Timofte, Kihwan Yoon, Ganzorig Gankhuyag, Jiangtao Lv, Long Sun, Jinshan Pan, Jiangxin Dong, Jinhui Tang, Zhiyuan Li, Hao Wei, Chenyang Ge, Dongyang Zhang, Tianle Liu, Huaian Chen, Yi Jin, Menghan Zhou, Yiqiang Yan, Si Gao, Biao Wu, Shaoli Liu, et al. (50 additional authors not shown)

    Abstract: This paper introduces a novel benchmark as part of the AIS 2024 Real-Time Image Super-Resolution (RTSR) Challenge, which aims to upscale compressed images from 540p to 4K resolution (4x factor) in real-time on commercial GPUs. For this, we use a diverse test set containing a variety of 4K images ranging from digital art to gaming and photography. The images are compressed using the modern AVIF cod…

    Submitted 25 April, 2024; originally announced April 2024.

    Comments: CVPR 2024, AI for Streaming (AIS) Workshop

  4. arXiv:2404.11458  [pdf, other]

    cs.AI

    Learn to Tour: Operator Design For Solution Feasibility Mapping in Pickup-and-delivery Traveling Salesman Problem

    Authors: Bowen Fang, Xu Chen, Xuan Di

    Abstract: This paper aims to develop a learning method for a special class of traveling salesman problems (TSP), namely, the pickup-and-delivery TSP (PDTSP), which finds the shortest tour along a sequence of one-to-one pickup-and-delivery nodes. One-to-one here means that the transported people or goods are associated with designated pairs of pickup and delivery nodes, in contrast to that indistinguishable…

    Submitted 17 April, 2024; originally announced April 2024.

  5. arXiv:2404.10343  [pdf, other]

    cs.CV eess.IV

    The Ninth NTIRE 2024 Efficient Super-Resolution Challenge Report

    Authors: Bin Ren, Yawei Li, Nancy Mehta, Radu Timofte, Hongyuan Yu, Cheng Wan, Yuxin Hong, Bingnan Han, Zhuoyuan Wu, Yajun Zou, Yuqing Liu, Jizhe Li, Keji He, Chao Fan, Heng Zhang, Xiaolin Zhang, Xuanwu Yin, Kunlong Zuo, Bohao Liao, Peizhe Xia, Long Peng, Zhibo Du, Xin Di, Wangkai Li, Yang Wang, et al. (109 additional authors not shown)

    Abstract: This paper provides a comprehensive review of the NTIRE 2024 challenge, focusing on efficient single-image super-resolution (ESR) solutions and their outcomes. The task of this challenge is to super-resolve an input image with a magnification factor of x4 based on pairs of low and corresponding high-resolution images. The primary objective is to develop networks that optimize various aspects such…

    Submitted 25 June, 2024; v1 submitted 16 April, 2024; originally announced April 2024.

    Comments: The report paper of NTIRE2024 Efficient Super-resolution, accepted by CVPRW2024

  6. arXiv:2404.06892  [pdf, other]

    cs.CV

    SparseAD: Sparse Query-Centric Paradigm for Efficient End-to-End Autonomous Driving

    Authors: Diankun Zhang, Guoan Wang, Runwen Zhu, Jianbo Zhao, Xiwu Chen, Siyu Zhang, Jiahao Gong, Qibin Zhou, Wenyuan Zhang, Ningzi Wang, Feiyang Tan, Hangning Zhou, Ziyao Xu, Haotian Yao, Chi Zhang, Xiaojun Liu, Xiaoguang Di, Bin Li

    Abstract: End-to-End paradigms use a unified framework to implement multi-tasks in an autonomous driving system. Despite simplicity and clarity, the performance of end-to-end autonomous driving methods on sub-tasks is still far behind the single-task methods. Meanwhile, the widely used dense BEV features in previous end-to-end methods make it costly to extend to more modalities or tasks. In this paper, we p…

    Submitted 10 April, 2024; originally announced April 2024.

  7. arXiv:2311.01929  [pdf, other]

    cs.CV

    ProS: Facial Omni-Representation Learning via Prototype-based Self-Distillation

    Authors: Xing Di, Yiyu Zheng, Xiaoming Liu, Yu Cheng

    Abstract: This paper presents a novel approach, called Prototype-based Self-Distillation (ProS), for unsupervised face representation learning. The existing supervised methods heavily rely on a large amount of annotated training facial data, which poses challenges in terms of data collection and privacy concerns. To address these issues, we propose ProS, which leverages a vast collection of unlabeled face i…

    Submitted 7 November, 2023; v1 submitted 3 November, 2023; originally announced November 2023.

    Comments: This paper has been accepted in WACV2024

  8. arXiv:2306.09261  [pdf, other]

    cs.LG

    Mitigating Cold-start Forecasting using Cold Causal Demand Forecasting Model

    Authors: Zahra Fatemi, Minh Huynh, Elena Zheleva, Zamir Syed, Xiaojun Di

    Abstract: Forecasting multivariate time series data, which involves predicting future values of variables over time using historical data, has significant practical applications. Although deep learning-based models have shown promise in this field, they often fail to capture the causal relationship between dependent variables, leading to less accurate forecasts. Additionally, these models cannot handle the…

    Submitted 15 June, 2023; originally announced June 2023.

  9. arXiv:2305.04123  [pdf, other]

    cs.CV

    Transform-Equivariant Consistency Learning for Temporal Sentence Grounding

    Authors: Daizong Liu, Xiaoye Qu, Jianfeng Dong, Pan Zhou, Zichuan Xu, Haozhao Wang, Xing Di, Weining Lu, Yu Cheng

    Abstract: This paper addresses the temporal sentence grounding (TSG). Although existing methods have made decent achievements in this task, they not only severely rely on abundant video-query paired data for training, but also easily fail into the dataset distribution bias. To alleviate these limitations, we introduce a novel Equivariant Consistency Regulation Learning (ECRL) framework to learn more discrim…

    Submitted 6 May, 2023; originally announced May 2023.

  10. arXiv:2304.02978  [pdf, other]

    cs.CV cs.LG eess.IV

    Simplifying Low-Light Image Enhancement Networks with Relative Loss Functions

    Authors: Yu Zhang, Xiaoguang Di, Junde Wu, Rao Fu, Yong Li, Yue Wang, Yanwu Xu, Guohui Yang, Chunhui Wang

    Abstract: Image enhancement is a common technique used to mitigate issues such as severe noise, low brightness, low contrast, and color deviation in low-light images. However, providing an optimal high-light image as a reference for low-light image enhancement tasks is impossible, which makes the learning process more difficult than other image processing tasks. As a result, although several low-light image…

    Submitted 3 August, 2023; v1 submitted 6 April, 2023; originally announced April 2023.

    Comments: 19 pages, 11 figures

    MSC Class: 68Txx; ACM Class: I.4.3

  11. Physics-Informed Deep Learning For Traffic State Estimation: A Survey and the Outlook

    Authors: Xuan Di, Rongye Shi, Zhaobin Mo, Yongjie Fu

    Abstract: For its robust predictive power (compared to pure physics-based models) and sample-efficient training (compared to pure deep learning models), physics-informed deep learning (PIDL), a paradigm hybridizing physics-based models and deep neural networks (DNN), has been booming in science and engineering fields. One key challenge of applying PIDL to various domains and problems lies in the design of a…

    Submitted 1 July, 2023; v1 submitted 3 March, 2023; originally announced March 2023.

  12. arXiv:2301.01871  [pdf, other]

    cs.CV

    Hypotheses Tree Building for One-Shot Temporal Sentence Localization

    Authors: Daizong Liu, Xiang Fang, Pan Zhou, Xing Di, Weining Lu, Yu Cheng

    Abstract: Given an untrimmed video, temporal sentence localization (TSL) aims to localize a specific segment according to a given sentence query. Though respectable works have made decent achievements in this task, they severely rely on dense video frame annotations, which require a tremendous amount of human effort to collect. In this paper, we target another more practical and challenging setting: one-sho…

    Submitted 15 January, 2023; v1 submitted 4 January, 2023; originally announced January 2023.

    Comments: Accepted by AAAI2023

  13. arXiv:2301.00514  [pdf, other]

    cs.CV

    Rethinking the Video Sampling and Reasoning Strategies for Temporal Sentence Grounding

    Authors: Jiahao Zhu, Daizong Liu, Pan Zhou, Xing Di, Yu Cheng, Song Yang, Wenzheng Xu, Zichuan Xu, Yao Wan, Lichao Sun, Zeyu Xiong

    Abstract: Temporal sentence grounding (TSG) aims to identify the temporal boundary of a specific segment from an untrimmed video by a sentence query. All existing works first utilize a sparse sampling strategy to extract a fixed number of video frames and then conduct multi-modal interactions with query sentence for reasoning. However, we argue that these methods have overlooked two indispensable issues: 1)…

    Submitted 1 January, 2023; originally announced January 2023.

    Comments: Accepted by EMNLP Findings, 2022

  14. arXiv:2301.00407  [pdf, other]

    cs.LG cs.PF

    MIGPerf: A Comprehensive Benchmark for Deep Learning Training and Inference Workloads on Multi-Instance GPUs

    Authors: Huaizheng Zhang, Yuanming Li, Wencong Xiao, Yizheng Huang, Xing Di, Jianxiong Yin, Simon See, Yong Luo, Chiew Tong Lau, Yang You

    Abstract: New architecture GPUs like A100 are now equipped with multi-instance GPU (MIG) technology, which allows the GPU to be partitioned into multiple small, isolated instances. This technology provides more flexibility for users to support both deep learning training and inference workloads, but efficiently utilizing it can still be challenging. The vision of this paper is to provide a more comprehensiv…

    Submitted 1 January, 2023; originally announced January 2023.

    Comments: 10 pages, 11 figures

  15. arXiv:2210.10431  [pdf, other]

    cs.CV cs.AI

    Hierarchical Reinforcement Learning for Furniture Layout in Virtual Indoor Scenes

    Authors: Xinhan Di, Pengqian Yu

    Abstract: In real life, the decoration of 3D indoor scenes through designing furniture layout provides a rich experience for people. In this paper, we explore the furniture layout task as a Markov decision process (MDP) in virtual reality, which is solved by hierarchical reinforcement learning (HRL). The goal is to produce a proper two-furniture layout in the virtual reality of the indoor scenes. In particu…

    Submitted 19 October, 2022; originally announced October 2022.

    Comments: Accepted by Reinforcement Learning for Real Life Workshop @ NeurIPS 2022

  16. arXiv:2208.09815  [pdf, other]

    cs.CV

    LWA-HAND: Lightweight Attention Hand for Interacting Hand Reconstruction

    Authors: Xinhan Di, Pengqian Yu

    Abstract: Recent years have witnessed great success for hand reconstruction in real-time applications such as visual reality and augmented reality while interacting with two-hand reconstruction through efficient transformers is left unexplored. In this paper, we propose a method called lightweight attention hand (LWA-HAND) to reconstruct hands in low flops from a single RGB image. To solve the occlusion and…

    Submitted 27 August, 2022; v1 submitted 21 August, 2022; originally announced August 2022.

    Comments: Accepted by ECCV 2022 Computer Vision for Metaverse Workshop (16 pages, 6 figures, 1 table)

  17. Backdoor Attacks on Crowd Counting

    Authors: Yuhua Sun, Tailai Zhang, Xingjun Ma, Pan Zhou, Jian Lou, Zichuan Xu, Xing Di, Yu Cheng, Lichao

    Abstract: Crowd counting is a regression task that estimates the number of people in a scene image, which plays a vital role in a range of safety-critical applications, such as video surveillance, traffic monitoring and flow control. In this paper, we investigate the vulnerability of deep learning based crowd counting models to backdoor attacks, a major security threat to deep learning. A backdoor attack im…

    Submitted 12 July, 2022; originally announced July 2022.

    Comments: To appear in ACMMM 2022. 10 pages, 6 figures and 2 tables

    ACM Class: F.0; I.4.0

  18. arXiv:2206.09349  [pdf, other]

    cs.LG

    Quantifying Uncertainty In Traffic State Estimation Using Generative Adversarial Networks

    Authors: Zhaobin Mo, Yongjie Fu, Xuan Di

    Abstract: This paper aims to quantify uncertainty in traffic state estimation (TSE) using the generative adversarial network based physics-informed deep learning (PIDL). The uncertainty of the focus arises from fundamental diagrams, in other words, the mapping from traffic density to velocity. To quantify uncertainty for the TSE problem is to characterize the robustness of predicted traffic states. Since it…

    Submitted 9 November, 2022; v1 submitted 19 June, 2022; originally announced June 2022.

  19. arXiv:2206.09319  [pdf, other]

    cs.LG

    TrafficFlowGAN: Physics-informed Flow based Generative Adversarial Network for Uncertainty Quantification

    Authors: Zhaobin Mo, Yongjie Fu, Daran Xu, Xuan Di

    Abstract: This paper proposes the TrafficFlowGAN, a physics-informed flow based generative adversarial network (GAN), for uncertainty quantification (UQ) of dynamical systems. TrafficFlowGAN adopts a normalizing flow model as the generator to explicitly estimate the data likelihood. This flow model is trained to maximize the data likelihood and to generate synthetic data that can fool a convolutional discri…

    Submitted 15 October, 2022; v1 submitted 18 June, 2022; originally announced June 2022.

  20. arXiv:2203.00512  [pdf, other]

    eess.SP cs.AI cs.LG

    A Deep Bayesian Neural Network for Cardiac Arrhythmia Classification with Rejection from ECG Recordings

    Authors: Wenrui Zhang, Xinxin Di, Guodong Wei, Shijia Geng, Zhaoji Fu, Shenda Hong

    Abstract: With the development of deep learning-based methods, automated classification of electrocardiograms (ECGs) has recently gained much attention. Although the effectiveness of deep neural networks has been encouraging, the lack of information given by the outputs restricts clinicians' reexamination. If the uncertainty estimation comes along with the classification results, cardiologists can pay more…

    Submitted 25 February, 2022; originally announced March 2022.

  21. arXiv:2201.05307  [pdf, other]

    cs.CV cs.LG

    Unsupervised Temporal Video Grounding with Deep Semantic Clustering

    Authors: Daizong Liu, Xiaoye Qu, Yinzhen Wang, Xing Di, Kai Zou, Yu Cheng, Zichuan Xu, Pan Zhou

    Abstract: Temporal video grounding (TVG) aims to localize a target segment in a video according to a given sentence query. Though respectable works have made decent achievements in this task, they severely rely on abundant video-query paired data, which is expensive and time-consuming to collect in real-world scenarios. In this paper, we explore whether a video grounding model can be learned without any pai…

    Submitted 14 January, 2022; originally announced January 2022.

    Comments: Accepted by AAAI2022

  22. arXiv:2201.00454  [pdf, other]

    cs.CV

    Memory-Guided Semantic Learning Network for Temporal Sentence Grounding

    Authors: Daizong Liu, Xiaoye Qu, Xing Di, Yu Cheng, Zichuan Xu, Pan Zhou

    Abstract: Temporal sentence grounding (TSG) is crucial and fundamental for video understanding. Although the existing methods train well-designed deep networks with a large amount of data, we find that they can easily forget the rarely appeared cases in the training stage due to the off-balance data distribution, which influences the model generalization and leads to undesirable performance. To tackle this…

    Submitted 2 January, 2022; originally announced January 2022.

    Comments: Accepted by AAAI2022

  23. arXiv:2109.12506  [pdf, other]

    cs.CV cs.AR

    A Simple Self-calibration Method for The Internal Time Synchronization of MEMS LiDAR

    Authors: Yu Zhang, Xiaoguang Di, Shiyu Yan, Bin Zhang, Baoling Qi, Chunhui Wang

    Abstract: This paper proposes a simple self-calibration method for the internal time synchronization of MEMS(Micro-electromechanical systems) LiDAR during research and development. Firstly, we introduced the problem of internal time misalignment in MEMS lidar. Then, a robust Minimum Vertical Gradient(MVG) prior is proposed to calibrate the time difference between the laser and MEMS mirror, which can be calc…

    Submitted 26 September, 2021; originally announced September 2021.

    Comments: 9 pages, 8 figures

    ACM Class: I.4.5; J.2

  24. arXiv:2109.09271  [pdf, ps, other]

    eess.IV cs.CV

    DeepStationing: Thoracic Lymph Node Station Parsing in CT Scans using Anatomical Context Encoding and Key Organ Auto-Search

    Authors: Dazhou Guo, Xianghua Ye, Jia Ge, Xing Di, Le Lu, Lingyun Huang, Guotong Xie, Jing Xiao, Zhongjie Liu, Ling Peng, Senxiang Yan, Dakai Jin

    Abstract: Lymph node station (LNS) delineation from computed tomography (CT) scans is an indispensable step in radiation oncology workflow. High inter-user variabilities across oncologists and prohibitive laboring costs motivated the automated approach. Previous works exploit anatomical priors to infer LNS based on predefined ad-hoc margins. However, without voxel-level supervision, the performance is sever…

    Submitted 19 September, 2021; originally announced September 2021.

  25. Heterogeneous Face Frontalization via Domain Agnostic Learning

    Authors: Xing Di, Shuowen Hu, Vishal M. Patel

    Abstract: Recent advances in deep convolutional neural networks (DCNNs) have shown impressive performance improvements on thermal to visible face synthesis and matching problems. However, current DCNN-based synthesis models do not perform well on thermal faces with large pose variations. In order to deal with this problem, heterogeneous face frontalization methods are needed in which a model takes a thermal…

    Submitted 5 December, 2021; v1 submitted 17 July, 2021; originally announced July 2021.

    Comments: FG2021 camera-ready version

  26. A Physics-Informed Deep Learning Paradigm for Traffic State and Fundamental Diagram Estimation

    Authors: Rongye Shi, Zhaobin Mo, Kuang Huang, Xuan Di, Qiang Du

    Abstract: Traffic state estimation (TSE) bifurcates into two categories, model-driven and data-driven (e.g., machine learning, ML), while each suffers from either deficient physics or small data. To mitigate these limitations, recent studies introduced a hybrid paradigm, physics-informed deep learning (PIDL), which contains both model-driven and data-driven components. This paper contributes an improved ver…

    Submitted 21 September, 2021; v1 submitted 6 June, 2021; originally announced June 2021.

    Comments: arXiv admin note: substantial text overlap with arXiv:2101.06580

  27. arXiv:2104.10340  [pdf, other]

    cs.LG cs.AI eess.SY

    CVLight: Decentralized Learning for Adaptive Traffic Signal Control with Connected Vehicles

    Authors: Mobin Zhao, Wangzhi Li, Yongjie Fu, Kangrui Ruan, Xuan Di

    Abstract: This paper develops a decentralized reinforcement learning (RL) scheme for multi-intersection adaptive traffic signal control (TSC), called "CVLight", that leverages data collected from connected vehicles (CVs). The state and reward design facilitates coordination among agents and considers travel delays collected by CVs. A novel algorithm, Asymmetric Advantage Actor-critic (Asym-A2C), is proposed…

    Submitted 30 June, 2022; v1 submitted 20 April, 2021; originally announced April 2021.

    Comments: 29 pages, 14 figures

    Journal ref: Transportation Research Part C: Emerging Technologies, 141 (2022): 103728

  28. Multimodal Face Synthesis from Visual Attributes

    Authors: Xing Di, Vishal M. Patel

    Abstract: Synthesis of face images from visual attributes is an important problem in computer vision and biometrics due to its applications in law enforcement and entertainment. Recent advances in deep generative networks have made it possible to synthesize high-quality face images from visual attributes. However, existing methods are specifically designed for generating unimodal images (i.e visible faces)…

    Submitted 9 April, 2021; originally announced April 2021.

    Comments: IEEE Transactions on Biometrics, Behavior, and Identity Science (T-BIOM) submission

  29. arXiv:2103.00832  [pdf, other]

    cs.CV

    Self-supervised Low Light Image Enhancement and Denoising

    Authors: Yu Zhang, Xiaoguang Di, Bin Zhang, Qingyan Li, Shiyu Yan, Chunhui Wang

    Abstract: This paper proposes a self-supervised low light image enhancement method based on deep learning, which can improve the image contrast and reduce noise at the same time to avoid the blur caused by pre-/post-denoising. The method contains two deep sub-networks, an Image Contrast Enhancement Network (ICE-Net) and a Re-Enhancement and Denoising Network (RED-Net). The ICE-Net takes the low light image…

    Submitted 1 March, 2021; originally announced March 2021.

    Comments: 10 pages, 7 figures

  30. arXiv:2102.09137  [pdf, other]

    cs.CV

    Multi-Agent Reinforcement Learning of 3D Furniture Layout Simulation in Indoor Graphics Scenes

    Authors: Xinhan Di, Pengqian Yu

    Abstract: In the industrial interior design process, professional designers plan the furniture layout to achieve a satisfactory 3D design for selling. In this paper, we explore the interior graphics scenes design task as a Markov decision process (MDP) in 3D simulation, which is solved by multi-agent reinforcement learning. The goal is to produce furniture layout in the 3D simulation of the indoor graphics…

    Submitted 17 February, 2021; originally announced February 2021.

    Comments: 8 pages, 3 figures submit to conference. arXiv admin note: substantial text overlap with arXiv:2101.07462

  31. arXiv:2101.07462  [pdf, other]

    cs.CV

    Deep Reinforcement Learning for Producing Furniture Layout in Indoor Scenes

    Authors: Xinhan Di, Pengqian Yu

    Abstract: In the industrial interior design process, professional designers plan the size and position of furniture in a room to achieve a satisfactory design for selling. In this paper, we explore the interior scene design task as a Markov decision process (MDP), which is solved by deep reinforcement learning. The goal is to produce an accurate position and size of the furniture simultaneously for the indo…

    Submitted 18 January, 2021; originally announced January 2021.

    Comments: computer vision reinforcement learning. arXiv admin note: text overlap with arXiv:2012.08514, arXiv:2012.08131

  32. arXiv:2101.06580  [pdf, other]

    cs.LG

    Physics-Informed Deep Learning for Traffic State Estimation

    Authors: Rongye Shi, Zhaobin Mo, Kuang Huang, Xuan Di, Qiang Du

    Abstract: Traffic state estimation (TSE), which reconstructs the traffic variables (e.g., density) on road segments using partially observed data, plays an important role on efficient traffic control and operation that intelligent transportation systems (ITS) need to provide to people. Over decades, TSE approaches bifurcate into two main categories, model-driven approaches and data-driven approaches. Howeve…

    Submitted 16 January, 2021; originally announced January 2021.

  33. arXiv:2101.02637  [pdf, other]

    cs.CV

    A Large-Scale, Time-Synchronized Visible and Thermal Face Dataset

    Authors: Domenick Poster, Matthew Thielke, Robert Nguyen, Srinivasan Rajaraman, Xing Di, Cedric Nimpa Fondje, Vishal M. Patel, Nathaniel J. Short, Benjamin S. Riggan, Nasser M. Nasrabadi, Shuowen Hu

    Abstract: Thermal face imagery, which captures the naturally emitted heat from the face, is limited in availability compared to face imagery in the visible spectrum. To help address this scarcity of thermal face imagery for research and algorithm development, we present the DEVCOM Army Research Laboratory Visible-Thermal Face Dataset (ARL-VTF). With over 500,000 images from 395 subjects, the ARL-VTF dataset…

    Submitted 7 January, 2021; originally announced January 2021.

  34. A Physics-Informed Deep Learning Paradigm for Car-Following Models

    Authors: Zhaobin Mo, Xuan Di, Rongye Shi

    Abstract: Car-following behavior has been extensively studied using physics-based models, such as the Intelligent Driver Model. These models successfully interpret traffic phenomena observed in the real-world but may not fully capture the complex cognitive process of driving. Deep learning models, on the other hand, have demonstrated their power in capturing observed traffic phenomena but require a large am…

    Submitted 13 July, 2021; v1 submitted 24 December, 2020; originally announced December 2020.

  35. arXiv:2012.08514  [pdf, other]

    cs.CV

    End-to-end Generative Floor-plan and Layout with Attributes and Relation Graph

    Authors: Xinhan Di, Pengqian Yu, Danfeng Yang, Hong Zhu, Changyu Sun, YinDong Liu

    Abstract: In this paper, we propose an end-end model for producing furniture layout for interior scene synthesis from the random vector. This proposed model is aimed to support professional interior designers to produce the interior decoration solutions more quickly. The proposed model combines a conditional floor-plan module of the room, a conditional graphical floor-plan module of the room and a condition…

    Submitted 15 December, 2020; originally announced December 2020.

    Comments: Submitted to CV Conference. arXiv admin note: text overlap with arXiv:2006.13527. text overlap with arXiv:2012.08131

  36. arXiv:2012.08131  [pdf, other]

    cs.CV

    Deep Layout of Custom-size Furniture through Multiple-domain Learning

    Authors: Xinhan Di, Pengqian Yu, Danfeng Yang, Hong Zhu, Changyu Sun, YinDong Liu

    Abstract: In this paper, we propose a multiple-domain model for producing a custom-size furniture layout in the interior scene. This model is aimed to support professional interior designers to produce interior decoration solutions with custom-size furniture more quickly. The proposed model combines a deep layout module, a domain attention module, a dimensional domain transfer module, and a custom-size modu…

    Submitted 15 December, 2020; originally announced December 2020.

    Comments: Submitted to CV Conference. arXiv admin note: text overlap with arXiv:2006.13527

  37. Multi-Agent Reinforcement Learning for Markov Routing Games: A New Modeling Paradigm For Dynamic Traffic Assignment

    Authors: Zhenyu Shou, Xu Chen, Yongjie Fu, Xuan Di

    Abstract: This paper aims to develop a paradigm that models the learning behavior of intelligent agents (including but not limited to autonomous vehicles, connected and automated vehicles, or human-driven vehicles with intelligent navigation systems where human drivers follow the navigation instructions completely) with a utility-optimizing goal and the system's equilibrating processes in a routing game amo…

    Submitted 27 February, 2022; v1 submitted 21 November, 2020; originally announced November 2020.

    Comments: 20 pages, 11 figures, published in Transportation Research Part C 137 (2022) 103560

    Journal ref: Transportation Research Part C: Emerging Technologies 137, 103560 (2022)

  38. arXiv:2008.11434  [pdf, other]

    eess.IV cs.CV

    Better Than Reference In Low Light Image Enhancement: Conditional Re-Enhancement Networks

    Authors: Yu Zhang, Xiaoguang Di, Bin Zhang, Ruihang Ji, Chunhui Wang

    Abstract: Low light images suffer from severe noise, low brightness, low contrast, etc. In previous researches, many image enhancement methods have been proposed, but few methods can deal with these problems simultaneously. In this paper, to solve these problems simultaneously, we propose a low light image enhancement method that can combined with supervised learning and previous HSV (Hue, Saturation, Value…

    Submitted 26 August, 2020; originally announced August 2020.

    Comments: 10 pages, 8 figures

  39. arXiv:2008.01323  [pdf, other]

    cs.CV cs.LG

    Structural Plan of Indoor Scenes with Personalized Preferences

    Authors: Xinhan Di, Pengqian Yu, Hong Zhu, Lei Cai, Qiuyan Sheng, Changyu Sun

    Abstract: In this paper, we propose an assistive model that supports professional interior designers to produce industrial interior decoration solutions and to meet the personalized preferences of the property owners. The proposed model is able to automatically produce the layout of objects of a particular indoor scene according to property owners' preferences. In particular, the model consists of the extra…

    Submitted 5 August, 2020; v1 submitted 4 August, 2020; originally announced August 2020.

    Comments: Accepted by the 8th International Workshop on Assistive Computer Vision and Robotics (ACVR) in Conjunction with ECCV 2020

  40. arXiv:2007.11355  [pdf, other]

    cs.CV

    Leveraging Undiagnosed Data for Glaucoma Classification with Teacher-Student Learning

    Authors: Junde Wu, Shuang Yu, Wenting Chen, Kai Ma, Rao Fu, Hanruo Liu, Xiaoguang Di, Yefeng Zheng

    Abstract: Recently, deep learning has been adopted to the glaucoma classification task with performance comparable to that of human experts. However, a well trained deep learning model demands a large quantity of properly labeled data, which is relatively expensive since the accurate labeling of glaucoma requires years of specialist training. In order to alleviate this problem, we propose a glaucoma classif…

    Submitted 22 July, 2020; originally announced July 2020.

    Journal ref: MICCAI 2020

  41. arXiv:2007.05156  [pdf, other]

    cs.AI cs.RO

    A Survey on Autonomous Vehicle Control in the Era of Mixed-Autonomy: From Physics-Based to AI-Guided Driving Policy Learning

    Authors: Xuan Di, Rongye Shi

    Abstract: This paper serves as an introduction and overview of the potentially useful models and methodologies from artificial intelligence (AI) into the field of transportation engineering for autonomous vehicle (AV) control in the era of mixed autonomy. We will discuss state-of-the-art applications of AI-guided methods, identify opportunities and obstacles, raise open questions, and help suggest the build…

    Submitted 10 July, 2020; originally announced July 2020.

  42. arXiv:2006.13527  [pdf, other]

    cs.CV

    Adversarial Model for Rotated Indoor Scenes Planning

    Authors: Xinhan Di, Pengqian Yu, Hong Zhu, Lei Cai, Qiuyan Sheng, Changyu Sun

    Abstract: In this paper, we propose an adversarial model for producing furniture layouts for interior scene synthesis when the interior room is rotated. The proposed model combines a conditional adversarial network, a rotation module, a mode module, and a rotation discriminator module. Compared with prior work on scene synthesis, our proposed three modules enhance the ability of auto-layout generation…

    Submitted 6 July, 2020; v1 submitted 24 June, 2020; originally announced June 2020.

  43. arXiv:2006.12769  [pdf]

    cs.LG cs.AI stat.ML

    Long-Term Prediction of Lane Change Maneuver Through a Multilayer Perceptron

    Authors: Zhenyu Shou, Ziran Wang, Kyungtae Han, Yongkang Liu, Prashant Tiwari, Xuan Di

    Abstract: Behavior prediction plays an essential role in both autonomous driving systems and Advanced Driver Assistance Systems (ADAS), since it enhances the vehicle's awareness of imminent hazards in the surrounding environment. Many existing lane change prediction models take as input lateral or angle information and make short-term (< 5 seconds) maneuver predictions. In this study, we propose a longer-te…

    Submitted 23 June, 2020; originally announced June 2020.

    Comments: Accepted by 31st IEEE Intelligent Vehicles Symposium

  44. arXiv:2004.10447  [pdf, other]

    eess.IV cs.CV

    Learning an Adaptive Model for Extreme Low-light Raw Image Processing

    Authors: Qingxu Fu, Xiaoguang Di, Yu Zhang

    Abstract: Low-light images suffer from severe noise and low illumination. Current deep learning models that are trained with real-world images have excellent noise reduction, but a ratio parameter must be chosen manually to complete the enhancement pipeline. In this work, we propose an adaptive low-light raw image enhancement network to avoid parameter-handcrafting and to improve image quality. The proposed…

    Submitted 22 April, 2020; originally announced April 2020.

  45. Multi-Scale Thermal to Visible Face Verification via Attribute Guided Synthesis

    Authors: Xing Di, Benjamin S. Riggan, Shuowen Hu, Nathaniel J. Short, Vishal M. Patel

    Abstract: Thermal-to-visible face verification is a challenging problem due to the large domain discrepancy between the modalities. Existing approaches either attempt to synthesize visible faces from thermal faces or learn domain-invariant robust features from these modalities for cross-modal matching. In this paper, we use attributes extracted from visible images to synthesize attribute-preserved visible i…

    Submitted 13 February, 2021; v1 submitted 19 April, 2020; originally announced April 2020.

    Comments: accepted by IEEE Transactions on Biometrics, Behavior, and Identity Science (T-BIOM). arXiv admin note: substantial text overlap with arXiv:1901.00889

  46. arXiv:2003.09855  [pdf, other]

    cs.LG cs.CV cs.NE

    TanhExp: A Smooth Activation Function with High Convergence Speed for Lightweight Neural Networks

    Authors: Xinyu Liu, Xiaoguang Di

    Abstract: Lightweight or mobile neural networks used for real-time computer vision tasks contain fewer parameters than normal networks, which leads to constrained performance. In this work, we propose a novel activation function named Tanh Exponential Activation Function (TanhExp), which can significantly improve the performance of these networks on image classification tasks. The definition of TanhExp is…

    Submitted 9 September, 2020; v1 submitted 22 March, 2020; originally announced March 2020.

    Comments: This paper is a preprint of a paper accepted by IET Computer Vision and is subject to Institution of Engineering and Technology Copyright. When the final version is published, the copy of record will be available at the IET Digital Library

  47. arXiv:2002.11300  [pdf, other]

    cs.CV eess.IV

    Self-supervised Image Enhancement Network: Training with Low Light Images Only

    Authors: Yu Zhang, Xiaoguang Di, Bin Zhang, Chunhui Wang

    Abstract: This paper proposes a self-supervised low light image enhancement method based on deep learning. Inspired by information entropy theory and the Retinex model, we propose a maximum-entropy-based Retinex model. With this model, a very simple network can separate the illumination and reflectance, and the network can be trained with low light images only. We introduce a constraint that the maximum channe…

    Submitted 25 February, 2020; originally announced February 2020.

    Comments: 14 pages, 13 figures

    MSC Class: 68U10; ACM Class: I.4.3

  48. arXiv:2002.06723  [pdf, other]

    cs.LG cs.MA stat.ML

    Reward Design for Driver Repositioning Using Multi-Agent Reinforcement Learning

    Authors: Zhenyu Shou, Xuan Di

    Abstract: A large portion of passenger requests is reportedly unserviced, partially due to vacant for-hire drivers' cruising behavior during the passenger seeking process. This paper aims to model the multi-driver repositioning task through a mean field multi-agent reinforcement learning (MARL) approach that captures competition among multiple agents. Because the direct application of MARL to the multi-driv…

    Submitted 23 August, 2020; v1 submitted 16 February, 2020; originally announced February 2020.

    Comments: 28 pages, 20 figures, published in Transportation Research Part C 119 (2020) 102738

  49. arXiv:2002.05878  [pdf, other]

    cs.CV cs.LG cs.RO

    An LSTM-Based Autonomous Driving Model Using Waymo Open Dataset

    Authors: Zhicheng Gu, Zhihao Li, Xuan Di, Rongye Shi

    Abstract: The Waymo Open Dataset has been released recently, providing a platform to crowdsource some fundamental challenges for automated vehicles (AVs), such as 3D detection and tracking. While the dataset provides a large amount of high-quality and multi-source driving information, people in academia are more interested in the underlying driving policy programmed in Waymo self-driving cars, which is inac…

    Submitted 23 March, 2020; v1 submitted 14 February, 2020; originally announced February 2020.

    Journal ref: Applied Sciences 10(6) 2046, 2020

  50. arXiv:2001.11194  [pdf, other]

    cs.CV cs.LG

    The Direction-Aware, Learnable, Additive Kernels and the Adversarial Network for Deep Floor Plan Recognition

    Authors: Yuli Zhang, Yeyang He, Shaowen Zhu, Xinhan Di

    Abstract: This paper presents a new approach for the recognition of elements in floor plan layouts. Besides elements with common shapes, we aim to recognize elements with irregular shapes such as circular rooms and inclined walls. Furthermore, the reduction of noise in the semantic segmentation of the floor plan is needed. To this end, we propose direction-aware, learnable, additive kernels in the app…

    Submitted 30 January, 2020; originally announced January 2020.

    Comments: deep learning, floor plan, computer vision