-
Advancing UWF-SLO Vessel Segmentation with Source-Free Active Domain Adaptation and a Novel Multi-Center Dataset
Authors:
Hongqiu Wang,
Xiangde Luo,
Wu Chen,
Qingqing Tang,
Mei Xin,
Qiong Wang,
Lei Zhu
Abstract:
Accurate vessel segmentation in Ultra-Wide-Field Scanning Laser Ophthalmoscopy (UWF-SLO) images is crucial for diagnosing retinal diseases. Although recent techniques have shown encouraging outcomes in vessel segmentation, models trained on one medical dataset often underperform on others due to domain shifts. Meanwhile, manually labeling high-resolution UWF-SLO images is an extremely challenging, time-consuming and expensive task. In response, this study introduces a pioneering framework that leverages a patch-based active domain adaptation approach. By using the devised Cascade Uncertainty-Predominance (CUP) selection strategy to actively recommend a few valuable image patches for labeling and model fine-tuning, our method significantly improves the accuracy of UWF-SLO vessel segmentation across diverse medical centers. In addition, we annotate and construct the first Multi-center UWF-SLO Vessel Segmentation (MU-VS) dataset, comprising data from multiple institutions, to promote research on this topic. This dataset serves as a valuable resource for cross-center evaluation, verifying the effectiveness and robustness of our approach. Experimental results demonstrate that our approach surpasses existing domain adaptation and active learning methods, considerably reducing the gap between the Upper and Lower bounds with minimal annotations, highlighting our method's practical clinical value. We will release our dataset and code to facilitate relevant research: https://github.com/whq-xxh/SFADA-UWF-SLO.
Submitted 19 June, 2024;
originally announced June 2024.
-
BFRFormer: Transformer-based generator for Real-World Blind Face Restoration
Authors:
Guojing Ge,
Qi Song,
Guibo Zhu,
Yuting Zhang,
Jinglu Chen,
Miao Xin,
Ming Tang,
Jinqiao Wang
Abstract:
Blind face restoration is a challenging task due to the unknown and complex degradation. Although face prior-based methods and reference-based methods have recently demonstrated high-quality results, the restored images tend to contain over-smoothed results and lose identity-preserved details when the degradation is severe. It is observed that this is attributed to short-range dependencies, the intrinsic limitation of convolutional neural networks. To model long-range dependencies, we propose a Transformer-based blind face restoration method, named BFRFormer, to reconstruct images with more identity-preserved details in an end-to-end manner. In BFRFormer, to remove blocking artifacts, the wavelet discriminator and aggregated attention module are developed, and spectral normalization and balanced consistency regulation are adaptively applied to address the training instability and over-fitting problem, respectively. Extensive experiments show that our method outperforms state-of-the-art methods on a synthetic dataset and four real-world datasets. The source code, Casia-Test dataset, and pre-trained models are released at https://github.com/s8Znk/BFRFormer.
Submitted 28 February, 2024;
originally announced February 2024.
-
We Choose to Go to Space: Agent-driven Human and Multi-Robot Collaboration in Microgravity
Authors:
Miao Xin,
Zhongrui You,
Zihan Zhang,
Taoran Jiang,
Tingjia Xu,
Haotian Liang,
Guojing Ge,
Yuchen Ji,
Shentong Mo,
Jian Cheng
Abstract:
We present SpaceAgents-1, a system for learning human and multi-robot collaboration (HMRC) strategies under microgravity conditions. Future space exploration requires humans to work together with robots. However, acquiring proficient robot skills and adept collaboration under microgravity conditions poses significant challenges within ground laboratories. To address this issue, we develop a microgravity simulation environment and present three typical configurations of intra-cabin robots. We propose a hierarchical heterogeneous multi-agent collaboration architecture: guided by foundation models, a Decision-Making Agent serves as a task planner for human-robot collaboration, while individual Skill-Expert Agents manage the embodied control of robots. This mechanism empowers the SpaceAgents-1 system to execute a range of intricate long-horizon HMRC tasks.
Submitted 22 February, 2024;
originally announced February 2024.
-
A novel diffusion recommendation algorithm based on multi-scale cnn and residual lstm
Authors:
Yong Niu,
Xing Xing,
Zhichun Jia,
Ruidi Liu,
Mindong Xin
Abstract:
Sequential recommendation aims to infer user preferences from historical interaction sequences and predict the next item that users may be interested in. The current mainstream design approach is to represent items as fixed vectors, capturing the underlying relationships between items and user preferences based on the order of interactions. However, relying on a single fixed item embedding may weaken the modeling capability of the system, and the global dynamics and local saliency exhibited by user preferences need to be distinguished. To address these issues, this paper proposes a novel diffusion recommendation algorithm based on multi-scale CNN and residual LSTM (AREAL). We introduce diffusion models into the recommender system, representing items as probability distributions instead of fixed vectors. This approach enables adaptive reflection of multiple aspects of the items and generates item distributions in a denoising manner. We use multi-scale CNN and residual LSTM methods to extract the local and global dependency features of user history interactions, and use an attention mechanism to distinguish weights as the guide features of reverse diffusion recovery. The effectiveness of the proposed method is validated through experiments conducted on two real-world datasets. Specifically, AREAL obtains improvements over the best baselines by 2.63% and 4.25% in terms of HR@20 and 5.05% and 3.94% in terms of NDCG@20 on all datasets.
Submitted 20 December, 2023; v1 submitted 17 December, 2023;
originally announced December 2023.
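The HR@20 and NDCG@20 figures quoted in the abstract above are standard top-k ranking metrics. As a minimal sketch, assuming the common next-item protocol in which each user has exactly one held-out target item (the abstract does not state the evaluation protocol, and the function names here are hypothetical), they can be computed as follows:

```python
import math

def hit_rate_at_k(ranked_items, target, k=20):
    """Return 1.0 if the held-out target is among the top-k ranked items, else 0.0."""
    return 1.0 if target in ranked_items[:k] else 0.0

def ndcg_at_k(ranked_items, target, k=20):
    """NDCG@k with a single relevant item: 1 / log2(rank + 1), with a 1-based rank.

    With one relevant item the ideal DCG is 1, so no further normalization is needed.
    """
    top_k = ranked_items[:k]
    if target not in top_k:
        return 0.0
    rank = top_k.index(target) + 1
    return 1.0 / math.log2(rank + 1)

def evaluate(rankings, targets, k=20):
    """Average HR@k and NDCG@k over users (assumed: one ranking and one target per user)."""
    n = len(targets)
    hr = sum(hit_rate_at_k(r, t, k) for r, t in zip(rankings, targets)) / n
    ndcg = sum(ndcg_at_k(r, t, k) for r, t in zip(rankings, targets)) / n
    return hr, ndcg
```

HR@k only records whether the target was retrieved, while NDCG@k also rewards placing it nearer the top of the list, which is why the two metrics are usually reported together.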
-
Nash or Stackelberg? -- A comparative study for game-theoretic AV decision-making
Authors:
Brady Bateman,
Ming Xin,
H. Eric Tseng,
Mushuang Liu
Abstract:
This paper studies game-theoretic decision-making for autonomous vehicles (AVs). A receding horizon multi-player game is formulated to model the AV decision-making problem. Two classes of games, Nash games and Stackelberg games, are developed. For each of the two classes, two solution settings, pairwise games and multi-player games, are introduced to solve the game in multi-agent scenarios. Comparative studies are conducted via statistical simulations to gain an understanding of the performance of the two classes of games and of the two solution settings. The simulations are conducted in intersection-crossing scenarios, and the game performance is quantified by three metrics: safety, travel efficiency, and computational time.
Submitted 30 October, 2023;
originally announced October 2023.
-
Tree of Uncertain Thoughts Reasoning for Large Language Models
Authors:
Shentong Mo,
Miao Xin
Abstract:
While the recently introduced Tree of Thoughts (ToT) has heralded advancements in allowing Large Language Models (LLMs) to reason through foresight and backtracking for global decision-making, it has overlooked the inherent local uncertainties in intermediate decision points or "thoughts". These local uncertainties, intrinsic to LLMs given their potential for diverse responses, remain a significant concern in the reasoning process. Addressing this pivotal gap, we introduce the Tree of Uncertain Thoughts (TouT) - a reasoning framework tailored for LLMs. Our TouT effectively leverages Monte Carlo Dropout to quantify uncertainty scores associated with LLMs' diverse local responses at these intermediate steps. By marrying this local uncertainty quantification with global search algorithms, TouT enhances the model's precision in response generation. We substantiate our approach with rigorous experiments on two demanding planning tasks: Game of 24 and Mini Crosswords. The empirical evidence underscores TouT's superiority over both ToT and chain-of-thought prompting methods.
Submitted 14 September, 2023;
originally announced September 2023.
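The abstract above describes quantifying the uncertainty of intermediate "thoughts" from diverse stochastic responses and combining it with a global search. The sketch below illustrates that general idea only and is not the paper's algorithm: it assumes each candidate thought has already been scored several times by stochastic forward passes (as Monte Carlo Dropout would produce), uses the variance of those scores as the uncertainty, and keeps the candidates with the best mean score minus a standard-deviation penalty. The function names, the `penalty` weight, and the beam-style selection are all hypothetical.

```python
import statistics

def uncertainty_scores(samples_per_thought):
    """Map each thought to (mean score, variance) over its repeated stochastic evaluations."""
    return {
        thought: (statistics.fmean(samples), statistics.pvariance(samples))
        for thought, samples in samples_per_thought.items()
    }

def select_thoughts(samples_per_thought, beam_width=2, penalty=1.0):
    """Keep the beam_width thoughts with the highest mean minus penalty * std.

    The penalty term is an illustrative assumption: it down-weights thoughts
    whose scores vary widely between stochastic passes.
    """
    stats = uncertainty_scores(samples_per_thought)
    ranked = sorted(
        stats,
        key=lambda t: stats[t][0] - penalty * stats[t][1] ** 0.5,
        reverse=True,
    )
    return ranked[:beam_width]
```

With `penalty=0` the selection reduces to ranking by mean score alone, which makes the effect of the uncertainty term easy to compare against a plain beam search.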
-
Integrating Vehicle Slip and Yaw in Overarching Multi-Tiered Automated Vehicle Steering Control to Balance Path Following Accuracy, Gracefulness, and Safety
Authors:
Ming Xin,
Mark A. Minor
Abstract:
Balancing path following accuracy and error convergence with graceful motion in steering control is challenging due to the competing nature of these requirements, especially across a range of operating speeds and conditions. This paper demonstrates that an integrated multi-tiered steering controller considering the impact of slip on kinematic control, dynamic control, and steering actuator rate commands achieves accurate and graceful path following. This work is founded on multi-tiered sideslip and yaw-based models, which allow derivation of controllers considering error due to sideslip and the mapping between steering commands and graceful lateral motion. Observer based sideslip estimates are combined with heading error in the kinematic controller to provide feedforward slip compensation. Path following error is compensated by a continuous Variable Structure Controller (VSC) using speed-based path manifolds to balance graceful motion and error convergence. Resulting yaw rate commands are used by a backstepping dynamic controller to generate steering rate commands. A High Gain Observer (HGO) estimates sideslip and yaw rate for output feedback control. Stability analysis of the output feedback controller is provided, and peaking is resolved. The work focuses on lateral control alone so that the steering controller can be combined with other speed controllers. Field results provide comparisons to related approaches demonstrating gracefulness and accuracy in different complex scenarios with varied weather conditions and perturbations.
Submitted 12 July, 2022;
originally announced July 2022.
-
Firmware Re-hosting Through Static Binary-level Porting
Authors:
Mingfeng Xin,
Hui Wen,
Liting Deng,
Hong Li,
Qiang Li,
Limin Sun
Abstract:
The rapid growth of the Industrial Internet of Things (IIoT) has brought embedded systems into focus as major targets for both security analysts and malicious adversaries. Due to non-standard hardware and diverse software, embedded devices present unique challenges to security analysts for the accurate analysis of firmware binaries. The diversity in hardware components and the tight coupling between firmware and hardware make it hard to perform dynamic analysis, which requires executing firmware code in virtualized environments. However, emulating the wide range of hardware peripherals forces analysts to frequently modify the emulator in order to execute different firmware in different virtualized environments, greatly limiting security analysis.
In this work, we explore the problem of firmware re-hosting related to the real-time operating system (RTOS). Specifically, developers create a Board Support Package (BSP) and develop device drivers to make that RTOS run on their platform. By providing high-level replacements for BSP routines and device drivers, we can minimally modify the firmware that is to be migrated from its original hardware environment into a virtualized one. We present an approach that can execute firmware at scale by patching it in an automated manner, without modifying existing emulators. Our approach, called static binary-level porting, first identifies the BSP and device drivers in the target firmware, then patches the firmware with pre-built BSP routines and drivers that can be adapted to existing emulators. Finally, we demonstrate the practicality of the proposed method on multiple hardware platforms and firmware samples for security analysis. The results show that the approach is flexible enough to emulate firmware for vulnerability assessment and exploit development.
Submitted 29 July, 2021; v1 submitted 20 July, 2021;
originally announced July 2021.
-
NTIRE 2021 Multi-modal Aerial View Object Classification Challenge
Authors:
Jerrick Liu,
Nathan Inkawhich,
Oliver Nina,
Radu Timofte,
Sahil Jain,
Bob Lee,
Yuru Duan,
Wei Wei,
Lei Zhang,
Songzheng Xu,
Yuxuan Sun,
Jiaqi Tang,
Xueli Geng,
Mengru Ma,
Gongzhe Li,
Xueli Geng,
Huanqia Cai,
Chengxue Cai,
Sol Cummings,
Casian Miron,
Alexandru Pasarica,
Cheng-Yen Yang,
Hung-Min Hsu,
Jiarui Cai,
Jie Mei
, et al. (9 additional authors not shown)
Abstract:
In this paper, we introduce the first Challenge on Multi-modal Aerial View Object Classification (MAVOC) in conjunction with the NTIRE 2021 workshop at CVPR. This challenge is composed of two different tracks using EO and SAR imagery. EO and SAR sensors each have different advantages and drawbacks. The purpose of this competition is to analyze how to use both sets of sensory information in complementary ways. We discuss the top methods submitted for this competition and evaluate their results on our blind test set. Our challenge results show a significant improvement of more than 15% accuracy over our current baselines for each track of the competition.
Submitted 6 April, 2022; v1 submitted 2 July, 2021;
originally announced July 2021.
-
Learning of Long-Horizon Sparse-Reward Robotic Manipulator Tasks with Base Controllers
Authors:
Guangming Wang,
Minjian Xin,
Wenhua Wu,
Zhe Liu,
Hesheng Wang
Abstract:
Deep Reinforcement Learning (DRL) enables robots to perform some intelligent tasks end-to-end. However, there are still many challenges for long-horizon sparse-reward robotic manipulator tasks. On the one hand, a sparse-reward setting makes exploration inefficient. On the other hand, exploration using physical robots is costly and unsafe. In this paper, we propose a method of learning long-horizon sparse-reward tasks utilizing one or more existing traditional controllers, named base controllers in this paper. Built upon Deep Deterministic Policy Gradients (DDPG), our algorithm incorporates the existing base controllers into the stages of exploration, value learning, and policy update. Furthermore, we present a straightforward way of synthesizing different base controllers to integrate their strengths. Through experiments ranging from stacking blocks to cups, it is demonstrated that the learned state-based or image-based policies steadily outperform base controllers. Compared to previous work on learning from demonstrations, our method improves sample efficiency by orders of magnitude and improves performance. Overall, our method bears the potential of leveraging existing industrial robot manipulation systems to build more flexible and intelligent controllers.
Submitted 4 December, 2021; v1 submitted 24 November, 2020;
originally announced November 2020.
-
Flexible and Efficient Long-Range Planning Through Curious Exploration
Authors:
Aidan Curtis,
Minjian Xin,
Dilip Arumugam,
Kevin Feigelis,
Daniel Yamins
Abstract:
Identifying algorithms that flexibly and efficiently discover temporally-extended multi-phase plans is an essential step for the advancement of robotics and model-based reinforcement learning. The core problem of long-range planning is finding an efficient way to search through the tree of possible action sequences. Existing non-learned planning solutions from the Task and Motion Planning (TAMP) literature rely on the existence of logical descriptions for the effects and preconditions for actions. This constraint allows TAMP methods to efficiently reduce the tree search problem but limits their ability to generalize to unseen and complex physical environments. In contrast, deep reinforcement learning (DRL) methods use flexible neural-network-based function approximators to discover policies that generalize naturally to unseen circumstances. However, DRL methods struggle to handle the very sparse reward landscapes inherent to long-range multi-step planning situations. Here, we propose the Curious Sample Planner (CSP), which fuses elements of TAMP and DRL by combining a curiosity-guided sampling strategy with imitation learning to accelerate planning. We show that CSP can efficiently discover interesting and complex temporally-extended plans for solving a wide range of physically realistic 3D tasks. In contrast, standard planning and learning methods often fail to solve these tasks at all or do so only with a huge and highly variable number of training samples. We explore the use of a variety of curiosity metrics with CSP and analyze the types of solutions that CSP discovers. Finally, we show that CSP supports task transfer so that the exploration policies learned during experience with one task can help improve efficiency on related tasks.
Submitted 8 July, 2020; v1 submitted 22 April, 2020;
originally announced April 2020.