Showing 1–50 of 55 results for author: Cha, Y

  1. arXiv:2406.08545  [pdf, other]

    cs.RO cs.AI cs.CV

    RVT-2: Learning Precise Manipulation from Few Demonstrations

    Authors: Ankit Goyal, Valts Blukis, Jie Xu, Yijie Guo, Yu-Wei Chao, Dieter Fox

    Abstract: In this work, we study how to build a robotic system that can solve multiple 3D manipulation tasks given language instructions. To be useful in industrial and household domains, such a system should be capable of learning new tasks with few demonstrations and solving them precisely. Prior works, like PerAct and RVT, have studied this problem, however, they often struggle with tasks requiring high…

    Submitted 12 June, 2024; originally announced June 2024.

    Comments: Accepted to RSS 2024

  2. arXiv:2406.06843  [pdf, other]

    cs.CV

    HO-Cap: A Capture System and Dataset for 3D Reconstruction and Pose Tracking of Hand-Object Interaction

    Authors: Jikai Wang, Qifan Zhang, Yu-Wei Chao, Bowen Wen, Xiaohu Guo, Yu Xiang

    Abstract: We introduce a data capture system and a new dataset named HO-Cap that can be used to study 3D reconstruction and pose tracking of hands and objects in videos. The capture system uses multiple RGB-D cameras and a HoloLens headset for data collection, avoiding the use of expensive 3D scanners or mocap systems. We propose a semi-automatic method to obtain annotations of shape and pose of hands and o…

    Submitted 16 June, 2024; v1 submitted 10 June, 2024; originally announced June 2024.

  3. Understanding the Career Mobility of Blind and Low Vision Software Professionals

    Authors: Yoonha Cha, Victoria Jackson, Isabela Figueira, Stacy M. Branham, André van der Hoek

    Abstract: Context: Scholars in the software engineering (SE) research community have investigated career advancement in the software industry. Research topics have included how individual and external factors can impact career mobility of software professionals, and how gender affects career advancement. However, the community has yet to look at career mobility from the lens of accessibility. Specifically,…

    Submitted 25 April, 2024; originally announced April 2024.

    Comments: 12 pages, 1 table, conference paper, 2024 ACM / IEEE 17th International Conference on Cooperative and Human Aspects of Software Engineering

    ACM Class: D.2.9; H.5.3

  4. arXiv:2404.01842  [pdf, other]

    cs.CV

    Semi-Supervised Domain Adaptation for Wildfire Detection

    Authors: JooYoung Jang, Youngseo Cha, Jisu Kim, SooHyung Lee, Geonu Lee, Minkook Cho, Young Hwang, Nojun Kwak

    Abstract: Recently, both the frequency and intensity of wildfires have increased worldwide, primarily due to climate change. In this paper, we propose a novel protocol for wildfire detection, leveraging semi-supervised Domain Adaptation for object detection, accompanied by a corresponding dataset designed for use by both academics and industries. Our dataset encompasses 30 times more diverse labeled scenes…

    Submitted 2 April, 2024; originally announced April 2024.

    Comments: 16 pages, 5 figures, 22 tables

  5. arXiv:2403.06497  [pdf, other]

    cs.CV cs.MM

    QuantTune: Optimizing Model Quantization with Adaptive Outlier-Driven Fine Tuning

    Authors: Jiun-Man Chen, Yu-Hsuan Chao, Yu-Jie Wang, Ming-Der Shieh, Chih-Chung Hsu, Wei-Fen Lin

    Abstract: Transformer-based models have gained widespread popularity in both the computer vision (CV) and natural language processing (NLP) fields. However, significant challenges arise during post-training linear quantization, leading to noticeable reductions in inference accuracy. Our study focuses on uncovering the underlying causes of these accuracy drops and proposing a quantization-friendly fine-tunin…

    Submitted 11 March, 2024; originally announced March 2024.

  6. arXiv:2312.14401  [pdf, other]

    cs.HC

    Towards an Exploratory Visual Analytics System for Griefer Identification in MOBA Games

    Authors: Zixin Chen, Shiyi Liu, Zhihua Jin, Gaoping Huang, Yang Chao, Zhenchuan Yang, Quan Li, Huamin Qu

    Abstract: Multiplayer Online Battle Arenas (MOBAs) have gained a significant player base worldwide, generating over two billion US dollars in annual game revenue. However, the presence of griefers, who deliberately irritate and harass other players within the game, can have a detrimental impact on players' experience, compromising game fairness and potentially leading to the emergence of gray industries. Un…

    Submitted 21 December, 2023; originally announced December 2023.

    Comments: IEEE VIS 2023 (Poster)

  7. arXiv:2312.04936  [pdf, other]

    cs.RO

    SKT-Hang: Hanging Everyday Objects via Object-Agnostic Semantic Keypoint Trajectory Generation

    Authors: Chia-Liang Kuo, Yu-Wei Chao, Yi-Ting Chen

    Abstract: We study the problem of hanging a wide range of grasped objects on diverse supporting items. Hanging objects is a ubiquitous task that is encountered in numerous aspects of our everyday lives. However, both the objects and supporting items can exhibit substantial variations in their shapes and structures, bringing two challenging issues: (1) determining the task-relevant geometric structures acros…

    Submitted 8 December, 2023; originally announced December 2023.

  8. arXiv:2311.05599  [pdf, other]

    cs.RO cs.AI

    SynH2R: Synthesizing Hand-Object Motions for Learning Human-to-Robot Handovers

    Authors: Sammy Christen, Lan Feng, Wei Yang, Yu-Wei Chao, Otmar Hilliges, Jie Song

    Abstract: Vision-based human-to-robot handover is an important and challenging task in human-robot interaction. Recent work has attempted to train robot policies by interacting with dynamic virtual humans in simulated environments, where the policies can later be transferred to the real world. However, a major bottleneck is the reliance on human motion capture data, which is expensive to acquire and difficu…

    Submitted 9 November, 2023; originally announced November 2023.

  9. arXiv:2310.13969  [pdf, ps, other]

    stat.ML cs.LG

    Distributed Linear Regression with Compositional Covariates

    Authors: Yue Chao, Lei Huang, Xuejun Ma

    Abstract: With the availability of extraordinarily huge data sets, solving the problems of distributed statistical methodology and computing for such data sets has become increasingly crucial in the big data area. In this paper, we focus on the distributed sparse penalized linear log-contrast model in massive compositional data. In particular, two distributed optimization techniques under centralized and de…

    Submitted 21 October, 2023; originally announced October 2023.

    Comments: 35 pages, 2 figures

    MSC Class: 62-08 ACM Class: G.3

  10. arXiv:2308.12599  [pdf, other]

    cs.SD cs.LG eess.AS

    Exploiting Time-Frequency Conformers for Music Audio Enhancement

    Authors: Yunkee Chae, Junghyun Koo, Sungho Lee, Kyogu Lee

    Abstract: With the proliferation of video platforms on the internet, recording musical performances by mobile devices has become commonplace. However, these recordings often suffer from degradation such as noise and reverberation, which negatively impact the listening experience. Consequently, the necessity for music audio enhancement (referred to as music enhancement from this point onward), involving the…

    Submitted 24 August, 2023; originally announced August 2023.

    Comments: Accepted by ACM Multimedia 2023

  11. arXiv:2308.11896  [pdf, other]

    cs.CV

    Age Prediction From Face Images Via Contrastive Learning

    Authors: Yeongnam Chae, Poulami Raha, Mijung Kim, Bjorn Stenger

    Abstract: This paper presents a novel approach for accurately estimating age from face images, which overcomes the challenge of collecting a large dataset of individuals with the same identity at different ages. Instead, we leverage readily available face datasets of different people at different ages and aim to extract age-related features using contrastive learning. Our method emphasizes these relevant fe…

    Submitted 22 August, 2023; originally announced August 2023.

    Comments: MVA2023

  12. arXiv:2308.09383  [pdf, other]

    cs.CV

    Label-Free Event-based Object Recognition via Joint Learning with Image Reconstruction from Events

    Authors: Hoonhee Cho, Hyeonseong Kim, Yujeong Chae, Kuk-Jin Yoon

    Abstract: Recognizing objects from sparse and noisy events becomes extremely difficult when paired images and category labels do not exist. In this paper, we study label-free event-based object recognition where category labels and paired images are not available. To this end, we propose a joint formulation of object recognition and image reconstruction in a complementary manner. Our method first reconstruc…

    Submitted 18 August, 2023; originally announced August 2023.

    Comments: Accepted to ICCV 2023 (Oral)

  13. arXiv:2307.12576  [pdf, other]

    eess.AS cs.IR cs.LG cs.SD

    Self-refining of Pseudo Labels for Music Source Separation with Noisy Labeled Data

    Authors: Junghyun Koo, Yunkee Chae, Chang-Bin Jeon, Kyogu Lee

    Abstract: Music source separation (MSS) faces challenges due to the limited availability of correctly-labeled individual instrument tracks. With the push to acquire larger datasets to improve MSS performance, the inevitability of encountering mislabeled individual instrument tracks becomes a significant challenge to address. This paper introduces an automated technique for refining the labels in a partially…

    Submitted 24 July, 2023; originally announced July 2023.

    Comments: 24th International Society for Music Information Retrieval Conference (ISMIR 2023)

  14. arXiv:2307.09699  [pdf, other]

    cs.HC

    ActorLens: Visual Analytics for High-level Actor Identification in MOBA Games

    Authors: Zhihua Jin, Gaoping Huang, Zixin Chen, Shiyi Liu, Yang Chao, Zhenchuan Yang, Quan Li, Huamin Qu

    Abstract: Multiplayer Online Battle Arenas (MOBAs) have garnered a substantial player base worldwide. Nevertheless, the presence of noxious players, commonly referred to as "actors", can significantly compromise game fairness by exhibiting negative behaviors that diminish their team's competitive edge. Furthermore, high-level actors tend to engage in more egregious conduct to evade detection, thereby causin…

    Submitted 18 July, 2023; originally announced July 2023.

    Comments: 15 pages, 9 figures

  15. arXiv:2307.04577  [pdf, other]

    cs.RO cs.CV cs.LG

    AnyTeleop: A General Vision-Based Dexterous Robot Arm-Hand Teleoperation System

    Authors: Yuzhe Qin, Wei Yang, Binghao Huang, Karl Van Wyk, Hao Su, Xiaolong Wang, Yu-Wei Chao, Dieter Fox

    Abstract: Vision-based teleoperation offers the possibility to endow robots with human-level intelligence to physically interact with the environment, while only requiring low-cost camera sensors. However, current vision-based teleoperation systems are designed and engineered towards a particular robot model and deploy environment, which scales poorly as the pool of the robot models expands and the variety…

    Submitted 16 May, 2024; v1 submitted 10 July, 2023; originally announced July 2023.

    Comments: http://anyteleop.com/ Robotics: Science and Systems 2023

  16. arXiv:2307.03073  [pdf, other]

    cs.CV cs.RO

    Proto-CLIP: Vision-Language Prototypical Network for Few-Shot Learning

    Authors: Jishnu Jaykumar P, Kamalesh Palanisamy, Yu-Wei Chao, Xinya Du, Yu Xiang

    Abstract: We propose a novel framework for few-shot learning by leveraging large-scale vision-language models such as CLIP. Motivated by unimodal prototypical networks for few-shot learning, we introduce Proto-CLIP which utilizes image prototypes and text prototypes for few-shot learning. Specifically, Proto-CLIP adapts the image and text encoder embeddings from CLIP in a joint fashion using few-shot exampl…

    Submitted 14 July, 2024; v1 submitted 6 July, 2023; originally announced July 2023.

    Comments: Accepted at 2024 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)

  17. arXiv:2306.16495  [pdf]

    cs.SI cs.AI cs.IR

    Event Detection from Social Media Stream: Methods, Datasets and Opportunities

    Authors: Quanzhi Li, Yang Chao, Dong Li, Yao Lu, Chi Zhang

    Abstract: Social media streams contain large and diverse amount of information, ranging from daily-life stories to the latest global and local events and news. Twitter, especially, allows a fast spread of events happening real time, and enables individuals and organizations to stay informed of the events happening now. Event detection from social media data poses different challenges from traditional text a…

    Submitted 28 June, 2023; originally announced June 2023.

    Comments: 8 pages

  18. arXiv:2306.14896  [pdf, other]

    cs.RO cs.CV

    RVT: Robotic View Transformer for 3D Object Manipulation

    Authors: Ankit Goyal, Jie Xu, Yijie Guo, Valts Blukis, Yu-Wei Chao, Dieter Fox

    Abstract: For 3D object manipulation, methods that build an explicit 3D representation perform better than those relying only on camera images. But using explicit 3D representations like voxels comes at large computing cost, adversely affecting scalability. In this work, we propose RVT, a multi-view transformer for 3D manipulation that is both scalable and accurate. Some key features of RVT are an attention…

    Submitted 26 June, 2023; originally announced June 2023.

  19. arXiv:2305.13108  [pdf, other]

    eess.AS cs.CL cs.LG cs.SD

    Debiased Automatic Speech Recognition for Dysarthric Speech via Sample Reweighting with Sample Affinity Test

    Authors: Eungbeom Kim, Yunkee Chae, Jaeheon Sim, Kyogu Lee

    Abstract: Automatic speech recognition systems based on deep learning are mainly trained under empirical risk minimization (ERM). Since ERM utilizes the averaged performance on the data samples regardless of a group such as healthy or dysarthric speakers, ASR systems are unaware of the performance disparities across the groups. This results in biased ASR systems whose performance differences among groups ar…

    Submitted 27 June, 2023; v1 submitted 22 May, 2023; originally announced May 2023.

    Comments: Accepted by Interspeech 2023

  20. arXiv:2305.09167  [pdf, other]

    cs.SD cs.CL eess.AS

    Adversarial Speaker Disentanglement Using Unannotated External Data for Self-supervised Representation Based Voice Conversion

    Authors: Xintao Zhao, Shuai Wang, Yang Chao, Zhiyong Wu, Helen Meng

    Abstract: Nowadays, recognition-synthesis-based methods have been quite popular with voice conversion (VC). By introducing linguistics features with good disentangling characters extracted from an automatic speech recognition (ASR) model, the VC performance achieved considerable breakthroughs. Recently, self-supervised learning (SSL) methods trained with a large-scale unannotated speech corpus have been app…

    Submitted 16 May, 2023; originally announced May 2023.

    Comments: Accepted by ICME 2023

  21. arXiv:2304.14496  [pdf, ps, other]

    physics.ins-det cs.LG eess.SP nucl-ex

    Restoring Original Signal From Pile-up Signal using Deep Learning

    Authors: C. H. Kim, S. Ahn, K. Y. Chae, J. Hooker, G. V. Rogachev

    Abstract: Pile-up signals are frequently produced in experimental physics. They create inaccurate physics data with high uncertainty and cause various problems. Therefore, the correction to pile-up signals is crucially required. In this study, we implemented a deep learning method to restore the original signals from the pile-up signals. We showed that a deep learning model could accurately reconstruct the…

    Submitted 24 April, 2023; originally announced April 2023.

  22. arXiv:2303.17592  [pdf, other]

    cs.RO cs.CV cs.LG

    Learning Human-to-Robot Handovers from Point Clouds

    Authors: Sammy Christen, Wei Yang, Claudia Pérez-D'Arpino, Otmar Hilliges, Dieter Fox, Yu-Wei Chao

    Abstract: We propose the first framework to learn control policies for vision-based human-to-robot handovers, a critical task for human-robot interaction. While research in Embodied AI has made significant progress in training robot agents in simulated environments, interacting with humans remains challenging due to the difficulties of simulating humans. Fortunately, recent research has developed realistic…

    Submitted 30 March, 2023; originally announced March 2023.

    Comments: Accepted at CVPR 2023 as highlight. Project page at https://handover-sim2real.github.io

  23. arXiv:2211.07951  [pdf, other]

    cs.SD cs.LG eess.AS

    Show Me the Instruments: Musical Instrument Retrieval from Mixture Audio

    Authors: Kyungsu Kim, Minju Park, Haesun Joung, Yunkee Chae, Yeongbeom Hong, Seonghyeon Go, Kyogu Lee

    Abstract: As digital music production has become mainstream, the selection of appropriate virtual instruments plays a crucial role in determining the quality of music. To search the musical instrument samples or virtual instruments that make one's desired sound, music producers use their ears to listen and compare each instrument sample in their collection, which is time-consuming and inefficient. In this p…

    Submitted 15 November, 2022; originally announced November 2022.

    Comments: 5 pages, 4 figures, submitted to ICASSP 2023

  24. arXiv:2211.01629  [pdf, other]

    cs.CV cs.LG

    Image-based Early Detection System for Wildfires

    Authors: Omkar Ranadive, Jisu Kim, Serin Lee, Youngseo Cha, Heechan Park, Minkook Cho, Young K. Hwang

    Abstract: Wildfires are a disastrous phenomenon which cause damage to land, loss of property, air pollution, and even loss of human life. Due to the warmer and drier conditions created by climate change, more severe and uncontrollable wildfires are expected to occur in the coming years. This could lead to a global wildfire crisis and have dire consequences on our planet. Hence, it has become imperative to u…

    Submitted 3 November, 2022; originally announced November 2022.

    Comments: Published in Tackling Climate Change with Machine Learning workshop, Thirty-sixth Conference on Neural Information Processing Systems (NeurIPS 2022)

  25. arXiv:2210.13638  [pdf, other]

    cs.RO

    Learning Robust Real-World Dexterous Grasping Policies via Implicit Shape Augmentation

    Authors: Zoey Qiuyu Chen, Karl Van Wyk, Yu-Wei Chao, Wei Yang, Arsalan Mousavian, Abhishek Gupta, Dieter Fox

    Abstract: Dexterous robotic hands have the capability to interact with a wide variety of household objects to perform tasks like grasping. However, learning robust real world grasping policies for arbitrary objects has proven challenging due to the difficulty of generating high quality training data. In this work, we propose a learning system (ISAGrasp) for leveraging a small number of human demonstrations…

    Submitted 24 October, 2022; originally announced October 2022.

    Comments: Accepted by CoRL2022

  26. arXiv:2209.14284  [pdf, other]

    cs.CV

    DexTransfer: Real World Multi-fingered Dexterous Grasping with Minimal Human Demonstrations

    Authors: Zoey Qiuyu Chen, Karl Van Wyk, Yu-Wei Chao, Wei Yang, Arsalan Mousavian, Abhishek Gupta, Dieter Fox

    Abstract: Teaching a multi-fingered dexterous robot to grasp objects in the real world has been a challenging problem due to its high dimensional state and action space. We propose a robot-learning system that can take a small number of human demonstrations and learn to grasp unseen object poses given partially occluded observations. Our system leverages a small motion capture dataset and generates a large…

    Submitted 28 September, 2022; originally announced September 2022.

  27. arXiv:2207.03333  [pdf, other]

    cs.CV cs.AI cs.LG cs.RO

    FewSOL: A Dataset for Few-Shot Object Learning in Robotic Environments

    Authors: Jishnu Jaykumar P, Yu-Wei Chao, Yu Xiang

    Abstract: We introduce the Few-Shot Object Learning (FewSOL) dataset for object recognition with a few images per object. We captured 336 real-world objects with 9 RGB-D images per object from different views. Object segmentation masks, object poses and object attributes are provided. In addition, synthetic images generated using 330 3D object models are used to augment the dataset. We investigated (i) few-…

    Submitted 5 March, 2023; v1 submitted 6 July, 2022; originally announced July 2022.

  28. arXiv:2206.11381  [pdf]

    cs.LG cs.AI

    Prevent Car Accidents by Using AI

    Authors: Sri Siddhartha Reddy Gudemupati, Yen Ling Chao, Lakshmi Praneetha Kotikalapudi, Ebrima Ceesay

    Abstract: Transportation facilities are becoming more developed as society develops, and people's travel demand is increasing, but so are the traffic safety issues that arise as a result. And car accidents are a major issue all over the world. The cost of traffic fatalities and driver injuries has a significant impact on society. The use of machine learning techniques in the field of traffic accidents is be…

    Submitted 19 June, 2022; originally announced June 2022.

  29. arXiv:2206.06730  [pdf, other]

    eess.IV cs.CV

    Automated Precision Localization of Peripherally Inserted Central Catheter Tip through Model-Agnostic Multi-Stage Networks

    Authors: Subin Park, Yoon Ki Cha, Soyoung Park, Kyung-Su Kim, Myung Jin Chung

    Abstract: Peripherally inserted central catheters (PICCs) have been widely used as one of the representative central venous lines (CVCs) due to their long-term intravascular access with low infectivity. However, PICCs have a fatal drawback of a high frequency of tip mispositions, increasing the risk of puncture, embolism, and complications such as cardiac arrhythmias. To automatically and precisely detect i…

    Submitted 14 June, 2022; originally announced June 2022.

    Comments: Subin Park and Yoon Ki Cha have contributed equally to this work as the co-first author. Kyung-Su Kim (kskim.doc@gmail.com) and Myung Jin Chung (mj1.chung@samsung.com) have contributed equally to this work as the co-corresponding author

  30. arXiv:2205.09747  [pdf, other]

    cs.RO cs.CV

    HandoverSim: A Simulation Framework and Benchmark for Human-to-Robot Object Handovers

    Authors: Yu-Wei Chao, Chris Paxton, Yu Xiang, Wei Yang, Balakumar Sundaralingam, Tao Chen, Adithyavairavan Murali, Maya Cakmak, Dieter Fox

    Abstract: We introduce a new simulation benchmark "HandoverSim" for human-to-robot object handovers. To simulate the giver's motion, we leverage a recent motion capture dataset of hand grasping of objects. We create training and evaluation environments for the receiver with standardized protocols and metrics. We analyze the performance of a set of baselines and show a correlation with a real-world evaluatio…

    Submitted 19 May, 2022; originally announced May 2022.

    Comments: Accepted to ICRA 2022

  31. arXiv:2204.00134  [pdf, other]

    cs.RO cs.CV

    Model Predictive Control for Fluid Human-to-Robot Handovers

    Authors: Wei Yang, Balakumar Sundaralingam, Chris Paxton, Iretiayo Akinola, Yu-Wei Chao, Maya Cakmak, Dieter Fox

    Abstract: Human-robot handover is a fundamental yet challenging task in human-robot interaction and collaboration. Recently, remarkable progressions have been made in human-to-robot handovers of unknown objects by using learning-based grasp generators. However, how to responsively generate smooth motions to take an object from a human is still an open question. Specifically, planning motions that take human…

    Submitted 31 March, 2022; originally announced April 2022.

    Comments: Accepted to ICRA 2022

  32. arXiv:2202.00732  [pdf, other]

    cs.RO cs.CV

    IFOR: Iterative Flow Minimization for Robotic Object Rearrangement

    Authors: Ankit Goyal, Arsalan Mousavian, Chris Paxton, Yu-Wei Chao, Brian Okorn, Jia Deng, Dieter Fox

    Abstract: Accurate object rearrangement from vision is a crucial problem for a wide variety of real-world robotics applications in unstructured environments. We propose IFOR, Iterative Flow Minimization for Robotic Object Rearrangement, an end-to-end method for the challenging problem of object rearrangement for unknown objects given an RGBD image of the original and final scenes. First, we learn an optical…

    Submitted 1 February, 2022; originally announced February 2022.

  33. arXiv:2112.06179  [pdf, other]

    cs.CV

    BIPS: Bi-modal Indoor Panorama Synthesis via Residual Depth-aided Adversarial Learning

    Authors: Changgyoon Oh, Wonjune Cho, Daehee Park, Yujeong Chae, Lin Wang, Kuk-Jin Yoon

    Abstract: Providing omnidirectional depth along with RGB information is important for numerous applications, eg, VR/AR. However, as omnidirectional RGB-D data is not always available, synthesizing RGB-D panorama data from limited information of a scene can be useful. Therefore, some prior works tried to synthesize RGB panorama images from perspective RGB images; however, they suffer from limited image quali…

    Submitted 12 December, 2021; originally announced December 2021.

  34. arXiv:2111.12341  [pdf, other]

    cs.CV

    EvDistill: Asynchronous Events to End-task Learning via Bidirectional Reconstruction-guided Cross-modal Knowledge Distillation

    Authors: Lin Wang, Yujeong Chae, Sung-Hoon Yoon, Tae-Kyun Kim, Kuk-Jin Yoon

    Abstract: Event cameras sense per-pixel intensity changes and produce asynchronous event streams with high dynamic range and less motion blur, showing advantages over conventional cameras. A hurdle of training event-based models is the lack of large qualitative labeled data. Prior works learning end-tasks mostly rely on labeled or pseudo-labeled datasets obtained from the active pixel sensor (APS) frames; h…

    Submitted 24 November, 2021; originally announced November 2021.

    Comments: CVPR 2021 (updated references in this version)

  35. arXiv:2111.08272  [pdf, other]

    cs.DC

    Task allocation for decentralized training in heterogeneous environment

    Authors: Yongyue Chao, Mingxue Liao, Jiaxin Gao

    Abstract: The demand for large-scale deep learning is increasing, and distributed training is the current mainstream solution. Ring AllReduce is widely used as a data parallel decentralized algorithm. However, in a heterogeneous environment, each worker calculates the same amount of data, so that there is a lot of waiting time loss among different workers, which makes the algorithm unable to adapt well to h…

    Submitted 16 November, 2021; originally announced November 2021.

  36. arXiv:2111.05251  [pdf]

    cs.RO cs.AI cs.HC cs.LG

    Learning Perceptual Concepts by Bootstrapping from Human Queries

    Authors: Andreea Bobu, Chris Paxton, Wei Yang, Balakumar Sundaralingam, Yu-Wei Chao, Maya Cakmak, Dieter Fox

    Abstract: When robots operate in human environments, it's critical that humans can quickly teach them new concepts: object-centric properties of the environment that they care about (e.g. objects near, upright, etc). However, teaching a new perceptual concept from high-dimensional robot sensor data (e.g. point clouds) is demanding, requiring an unrealistic amount of human labels. To address this, we propose…

    Submitted 4 July, 2022; v1 submitted 9 November, 2021; originally announced November 2021.

    Comments: 9 pages, 10 figures

  37. arXiv:2109.13456  [pdf, other]

    cs.CV cs.AI cs.RO

    SiamEvent: Event-based Object Tracking via Edge-aware Similarity Learning with Siamese Networks

    Authors: Yujeong Chae, Lin Wang, Kuk-Jin Yoon

    Abstract: Event cameras are novel sensors that perceive the per-pixel intensity changes and output asynchronous event streams, showing lots of advantages over traditional cameras, such as high dynamic range (HDR) and no motion blur. It has been shown that events alone can be used for object tracking by motion compensation or prediction. However, existing methods assume that the target always moves and is th…

    Submitted 27 September, 2021; originally announced September 2021.

  38. arXiv:2109.01801  [pdf, other]

    cs.CV

    Dual Transfer Learning for Event-based End-task Prediction via Pluggable Event to Image Translation

    Authors: Lin Wang, Yujeong Chae, Kuk-Jin Yoon

    Abstract: Event cameras are novel sensors that perceive the per-pixel intensity changes and output asynchronous event streams with high dynamic range and less motion blur. It has been shown that events alone can be used for end-task learning, e.g., semantic segmentation, based on encoder-decoder-like networks. However, as events are sparse and mostly reflect edge information, it is difficult to recover orig…

    Submitted 24 November, 2021; v1 submitted 4 September, 2021; originally announced September 2021.

    Comments: ICCV 2021 (updated references in this version)

  39. arXiv:2105.10915  [pdf, other]

    stat.ML cs.LG

    GOALS: Gradient-Only Approximations for Line Searches Towards Robust and Consistent Training of Deep Neural Networks

    Authors: Younghwan Chae, Daniel N. Wilke, Dominic Kafka

    Abstract: Mini-batch sub-sampling (MBSS) is favored in deep neural network training to reduce the computational cost. Still, it introduces an inherent sampling error, making the selection of appropriate learning rates challenging. The sampling errors can manifest either as a bias or variances in a line search. Dynamic MBSS re-samples a mini-batch at every function evaluation. Hence, dynamic MBSS results in…

    Submitted 23 May, 2021; originally announced May 2021.

    Comments: 26 pages, 8 figures and 5 tables

    MSC Class: 93E35; 65K05; 90C15; 49M05; 90C20; 90C26 ACM Class: I.2.6; I.2.8

  40. arXiv:2104.04631  [pdf, other]

    cs.CV

    DexYCB: A Benchmark for Capturing Hand Grasping of Objects

    Authors: Yu-Wei Chao, Wei Yang, Yu Xiang, Pavlo Molchanov, Ankur Handa, Jonathan Tremblay, Yashraj S. Narang, Karl Van Wyk, Umar Iqbal, Stan Birchfield, Jan Kautz, Dieter Fox

    Abstract: We introduce DexYCB, a new dataset for capturing hand grasping of objects. We first compare DexYCB with a related one through cross-dataset evaluation. We then present a thorough benchmark of state-of-the-art approaches on three relevant tasks: 2D object and keypoint detection, 6D object pose estimation, and 3D hand pose estimation. Finally, we evaluate a new robotics-relevant task: generating saf…

    Submitted 9 April, 2021; originally announced April 2021.

    Comments: Accepted to CVPR 2021

  41. arXiv:2011.08961  [pdf, other]

    cs.RO cs.CV

    Reactive Human-to-Robot Handovers of Arbitrary Objects

    Authors: Wei Yang, Chris Paxton, Arsalan Mousavian, Yu-Wei Chao, Maya Cakmak, Dieter Fox

    Abstract: Human-robot object handovers have been an actively studied area of robotics over the past decade; however, very few techniques and systems have addressed the challenge of handing over diverse objects with arbitrary appearance, size, shape, and rigidity. In this paper, we present a vision-based system that enables reactive human-to-robot handovers of unknown objects. Our approach combines closed-lo…

    Submitted 3 June, 2021; v1 submitted 17 November, 2020; originally announced November 2020.

    Comments: Accepted to the International Conference on Robotics and Automation (ICRA) 2021

  42. arXiv:2010.13982  [pdf, other]

    cs.CL cs.AI

    Predict and Use Latent Patterns for Short-Text Conversation

    Authors: Hung-Ting Chen, Yu-Chieh Chao, Ta-Hsuan Chao, Wei-Yun Ma

    Abstract: Many neural network models nowadays have achieved promising performances in Chit-chat settings. The majority of them rely on an encoder for understanding the post and a decoder for generating the response. Without given assigned semantics, the models lack the fine-grained control over responses as the semantic mapping between posts and responses is hidden on the fly within the end-to-end manners.…

    Submitted 6 December, 2020; v1 submitted 26 October, 2020; originally announced October 2020.

  43. arXiv:2002.07033  [pdf, other]

    cs.LG cs.AI cs.CY

    Towards an Appropriate Query, Key, and Value Computation for Knowledge Tracing

    Authors: Youngduck Choi, Youngnam Lee, Junghyun Cho, Jineon Baek, Byungsoo Kim, Yeongmin Cha, Dongmin Shin, Chan Bae, Jaewe Heo

    Abstract: Knowledge tracing, the act of modeling a student's knowledge through learning activities, is an extensively studied problem in the field of computer-aided education. Although models with attention mechanism have outperformed traditional approaches such as Bayesian knowledge tracing and collaborative filtering, they share two limitations. Firstly, the models rely on shallow attention layers and fai…

    Submitted 31 January, 2021; v1 submitted 14 February, 2020; originally announced February 2020.

    Comments: L@S 2020

  44. arXiv:1911.05864  [pdf, other]

    cs.RO cs.AI cs.CV

    Motion Reasoning for Goal-Based Imitation Learning

    Authors: De-An Huang, Yu-Wei Chao, Chris Paxton, Xinke Deng, Li Fei-Fei, Juan Carlos Niebles, Animesh Garg, Dieter Fox

    Abstract: We address goal-based imitation learning, where the aim is to output the symbolic goal from a third-person video demonstration. This enables the robot to plan for execution and reproduce the same goal in a completely different environment. The key challenge is that the goal of a video demonstration is often ambiguous at the level of semantic actions. The human demonstrators might unintentionally a…

    Submitted 13 November, 2019; originally announced November 2019.

  45. arXiv:1910.03135  [pdf, other]

    cs.CV cs.LG cs.RO

    DexPilot: Vision Based Teleoperation of Dexterous Robotic Hand-Arm System

    Authors: Ankur Handa, Karl Van Wyk, Wei Yang, Jacky Liang, Yu-Wei Chao, Qian Wan, Stan Birchfield, Nathan Ratliff, Dieter Fox

    Abstract: Teleoperation offers the possibility of imparting robotic systems with sophisticated reasoning skills, intuition, and creativity to perform tasks. However, current teleoperation solutions for high degree-of-actuation (DoA), multi-fingered robots are generally cost-prohibitive, while low-cost offerings usually provide reduced degrees of control. Herein, a low-cost, vision based teleoperation system…

    Submitted 14 October, 2019; v1 submitted 7 October, 2019; originally announced October 2019.

    Comments: 17 pages, first version of DexPilot

  46. arXiv:1909.06893  [pdf, other]

    stat.ML cs.LG

    Empirical study towards understanding line search approximations for training neural networks

    Authors: Younghwan Chae, Daniel N. Wilke

    Abstract: Choosing appropriate step sizes is critical for reducing the computational cost of training large-scale neural network models. Mini-batch sub-sampling (MBSS) is often employed for computational tractability. However, MBSS introduces a sampling error, that can manifest as a bias or variance in a line search. This is because MBSS can be performed statically, where the mini-batch is updated only when…

    Submitted 15 September, 2019; originally announced September 2019.

    Comments: 30 pages, 20 figures

    MSC Class: 90C15; 90C59; 90C30; 90C26; 90C56

  47. arXiv:1909.00952  [pdf, other]

    eess.IV cs.LG cs.MM eess.SY stat.AP stat.ML

    Graph-based Transforms for Video Coding

    Authors: Hilmi E. Egilmez, Yung-Hsuan Chao, Antonio Ortega

    Abstract: In many state-of-the-art compression systems, signal transformation is an integral part of the encoding and decoding process, where transforms provide compact representations for the signals of interest. This paper introduces a class of transforms called graph-based transforms (GBTs) for video compression, and proposes two different techniques to design GBTs. In the first technique, we formulate a…

    Submitted 18 September, 2020; v1 submitted 3 September, 2019; originally announced September 2019.

    Comments: To appear in IEEE Trans. on Image Processing (14 pages)

  48. arXiv:1908.07423  [pdf, other]

    cs.CV

    Learning to Sit: Synthesizing Human-Chair Interactions via Hierarchical Control

    Authors: Yu-Wei Chao, Jimei Yang, Weifeng Chen, Jia Deng

    Abstract: Recent progress on physics-based character animation has shown impressive breakthroughs on human motion synthesis, through imitating motion capture data via deep reinforcement learning. However, results have mostly been demonstrated on imitating a single distinct motion pattern, and do not generalize to interactive tasks that require flexible motion patterns due to varying human-object spatial con…

    Submitted 16 December, 2020; v1 submitted 20 August, 2019; originally announced August 2019.

    Comments: Accepted to AAAI 2021

  49. arXiv:1808.02513  [pdf, other]

    cs.LG stat.ML

    Rethinking Numerical Representations for Deep Neural Networks

    Authors: Parker Hill, Babak Zamirai, Shengshuo Lu, Yu-Wei Chao, Michael Laurenzano, Mehrzad Samadi, Marios Papaefthymiou, Scott Mahlke, Thomas Wenisch, Jia Deng, Lingjia Tang, Jason Mars

    Abstract: With ever-increasing computational demand for deep learning, it is critical to investigate the implications of the numeric representation and precision of DNN model weights and activations on computational efficiency. In this work, we explore unconventional narrow-precision floating-point representations as it relates to inference accuracy and efficiency to steer the improved design of future DNN…

    Submitted 7 August, 2018; originally announced August 2018.

  50. arXiv:1805.01786  [pdf, other]

    cs.DC

    To Centralize or Not to Centralize: A Tale of Swarm Coordination

    Authors: Justin Hu, Ariana Bruno, Drew Zagieboylo, Mark Zhao, Brian Ritchken, Brendon Jackson, Joo Yeon Chae, Francois Mertil, Mateo Espinosa, Christina Delimitrou

    Abstract: Large swarms of autonomous devices are increasing in size and importance. When it comes to controlling the devices of large-scale swarms there are two main lines of thought. Centralized control, where all decisions - and often compute - happen in a centralized back-end cloud system, and distributed control, where edge devices are responsible for selecting and executing tasks with minimal or zero h…

    Submitted 4 May, 2018; originally announced May 2018.