Showing 1–50 of 182 results for author: Choi, W

  1. arXiv:2406.15751

    cs.SD eess.AS

    Improving Unsupervised Clean-to-Rendered Guitar Tone Transformation Using GANs and Integrated Unaligned Clean Data

    Authors: Yu-Hua Chen, Woosung Choi, Wei-Hsiang Liao, Marco Martínez-Ramírez, Kin Wai Cheuk, Yuki Mitsufuji, Jyh-Shing Roger Jang, Yi-Hsuan Yang

    Abstract: Recent years have seen increasing interest in applying deep learning methods to the modeling of guitar amplifiers or effect pedals. Existing methods are mainly based on the supervised approach, requiring temporally-aligned data pairs of unprocessed and rendered audio. However, this approach does not scale well, due to the complicated process involved in creating the data pairs. A very recent work…

    Submitted 22 June, 2024; originally announced June 2024.

    Comments: Accepted to DAFx 2024

  2. arXiv:2406.11313

    cs.CV

    Semi-Supervised Domain Adaptation Using Target-Oriented Domain Augmentation for 3D Object Detection

    Authors: Yecheol Kim, Junho Lee, Changsoo Park, Hyoung won Kim, Inho Lim, Christopher Chang, Jun Won Choi

    Abstract: 3D object detection is crucial for applications like autonomous driving and robotics. However, in real-world environments, variations in sensor data distribution due to sensor upgrades, weather changes, and geographic differences can adversely affect detection performance. Semi-Supervised Domain Adaptation (SSDA) aims to mitigate these challenges by transferring knowledge from a source domain, abu…

    Submitted 17 June, 2024; originally announced June 2024.

    Comments: Accepted to IEEE Transactions on Intelligent Vehicles (T-IV). The code is available at: https://github.com/rasd3/TODA

  3. arXiv:2405.18386

    cs.SD cs.AI cs.LG cs.MM eess.AS

    Instruct-MusicGen: Unlocking Text-to-Music Editing for Music Language Models via Instruction Tuning

    Authors: Yixiao Zhang, Yukara Ikemiya, Woosung Choi, Naoki Murata, Marco A. Martínez-Ramírez, Liwei Lin, Gus Xia, Wei-Hsiang Liao, Yuki Mitsufuji, Simon Dixon

    Abstract: Recent advances in text-to-music editing, which employ text queries to modify music (e.g., by changing its style or adjusting instrumental components), present unique challenges and opportunities for AI-assisted music creation. Previous approaches in this domain have been constrained by the necessity to train specific editing models from scratch, which is both resource-intensive and inefficient; o…

    Submitted 29 May, 2024; v1 submitted 28 May, 2024; originally announced May 2024.

    Comments: Code and demo are available at: https://github.com/ldzhangyx/instruct-musicgen

  4. arXiv:2405.11563

    cs.IT

    User-Centric Association and Feedback Bit Allocation for FDD Cell-Free Massive MIMO

    Authors: Kwangjae Lee, Jung Hoon Lee, Wan Choi

    Abstract: In this paper, we introduce a novel approach to user-centric association and feedback bit allocation for the downlink of a cell-free massive MIMO (CF-mMIMO) system, operating under limited feedback constraints. In CF-mMIMO systems employing frequency division duplexing, each access point (AP) relies on channel information provided by its associated user equipments (UEs) for beamforming design. Sin…

    Submitted 19 May, 2024; originally announced May 2024.

  5. arXiv:2405.01979

    cs.IT eess.SP

    Graph Neural Network based Active and Passive Beamforming for Distributed STAR-RIS-Assisted Multi-User MISO Systems

    Authors: Ha An Le, Trinh Van Chien, Wan Choi

    Abstract: This paper investigates a joint active and passive beamforming design for distributed simultaneous transmitting and reflecting (STAR) reconfigurable intelligent surface (RIS) assisted multi-user (MU) multiple-input single-output (MISO) systems, where the energy splitting (ES) mode is considered for the STAR-RIS. We aim to design the active beamforming vectors at the base station (BS) and the passi…

    Submitted 3 May, 2024; originally announced May 2024.

    Comments: 13 pages, 7 figures

  6. arXiv:2404.18705

    cs.IT eess.SP

    Wireless Information and Energy Transfer in the Era of 6G Communications

    Authors: Constantinos Psomas, Konstantinos Ntougias, Nikita Shanin, Dongfang Xu, Kenneth MacSporran Mayer, Nguyen Minh Tran, Laura Cottatellucci, Kae Won Choi, Dong In Kim, Robert Schober, Ioannis Krikidis

    Abstract: Wireless information and energy transfer (WIET) represents an emerging paradigm which employs controllable transmission of radio-frequency signals for the dual purpose of data communication and wireless charging. As such, WIET is widely regarded as an enabler of envisioned 6G use cases that rely on energy-sustainable Internet-of-Things (IoT) networks, such as smart cities and smart grids. Meeting…

    Submitted 16 May, 2024; v1 submitted 29 April, 2024; originally announced April 2024.

    Comments: Proceedings of the IEEE, 36 pages, 33 figures

  7. arXiv:2404.04842

    cs.IT eess.SP

    Analog-Digital Beam Focusing for Line of Sight Wide-Aperture MIMO with Spherical Wavefronts

    Authors: Jiyoung Yun, Hojun Rho, Wan Choi

    Abstract: Enhancing high-speed wireless communication in the future relies significantly on harnessing high frequency bands effectively. These bands predominantly operate in line-of-sight (LoS) paths, necessitating well-configured antenna arrays and beamforming techniques for optimal spectrum utilization. Maximizing the potential of LoS multiple-input multiple-output (MIMO) systems, which are crucial for ac…

    Submitted 7 April, 2024; originally announced April 2024.

  8. arXiv:2404.01954

    cs.CL cs.AI

    HyperCLOVA X Technical Report

    Authors: Kang Min Yoo, Jaegeun Han, Sookyo In, Heewon Jeon, Jisu Jeong, Jaewook Kang, Hyunwook Kim, Kyung-Min Kim, Munhyong Kim, Sungju Kim, Donghyun Kwak, Hanock Kwak, Se Jung Kwon, Bado Lee, Dongsoo Lee, Gichang Lee, Jooho Lee, Baeseong Park, Seongjin Shin, Joonsang Yu, Seolki Baek, Sumin Byeon, Eungsup Cho, Dooseok Choe, Jeesung Han, et al. (371 additional authors not shown)

    Abstract: We introduce HyperCLOVA X, a family of large language models (LLMs) tailored to the Korean language and culture, along with competitive capabilities in English, math, and coding. HyperCLOVA X was trained on a balanced mix of Korean, English, and code data, followed by instruction-tuning with high-quality human-annotated datasets while abiding by strict safety guidelines reflecting our commitment t…

    Submitted 13 April, 2024; v1 submitted 2 April, 2024; originally announced April 2024.

    Comments: 44 pages; updated authors list and fixed author names

  9. arXiv:2403.19105

    cs.IT eess.SP

    Pilot Signal and Channel Estimator Co-Design for Hybrid-Field XL-MIMO

    Authors: Yoonseong Kang, Hyowoon Seo, Wan Choi

    Abstract: This paper addresses the intricate task of hybrid-field channel estimation in extremely large-scale MIMO (XL-MIMO) systems, critical for the progression of 6G communications. Within these systems, comprising a line-of-sight (LoS) channel component alongside far-field and near-field scattering channel components, our objective is to tackle the channel estimation challenge. We encounter two central…

    Submitted 27 March, 2024; originally announced March 2024.

  10. arXiv:2403.11551

    cs.IT

    New Constructions of Reversible DNA Codes

    Authors: Xueyan Chen, Whan-Hyuk Choi, Hongwei Liu

    Abstract: DNA codes have many applications, such as in data storage, DNA computing, etc. Good DNA codes have large sizes and satisfy certain constraints. In this paper, we present a new construction method for reversible DNA codes. We show that the DNA codes obtained using our construction method can satisfy some desired constraints and the lower bounds of the sizes of some DNA codes are better than th…

    Submitted 18 March, 2024; originally announced March 2024.

  11. arXiv:2403.06880

    cs.LG cs.AI

    Unveiling the Significance of Toddler-Inspired Reward Transition in Goal-Oriented Reinforcement Learning

    Authors: Junseok Park, Yoonsung Kim, Hee Bin Yoo, Min Whoo Lee, Kibeom Kim, Won-Seok Choi, Minsu Lee, Byoung-Tak Zhang

    Abstract: Toddlers evolve from free exploration with sparse feedback to exploiting prior experiences for goal-directed learning with denser rewards. Drawing inspiration from this Toddler-Inspired Reward Transition, we set out to explore the implications of varying reward transitions when incorporated into Reinforcement Learning (RL) tasks. Central to our inquiry is the transition from sparse to potential-ba…

    Submitted 18 March, 2024; v1 submitted 11 March, 2024; originally announced March 2024.

    Comments: Accepted as a full paper at AAAI 2024 (Oral presentation): 7 pages (main paper), 2 pages (references), 17 pages (appendix) each

  12. arXiv:2403.06433

    cs.CV cs.AI

    Fine-Grained Pillar Feature Encoding Via Spatio-Temporal Virtual Grid for 3D Object Detection

    Authors: Konyul Park, Yecheol Kim, Junho Koh, Byungwoo Park, Jun Won Choi

    Abstract: Developing high-performance, real-time architectures for LiDAR-based 3D object detectors is essential for the successful commercialization of autonomous vehicles. Pillar-based methods stand out as a practical choice for onboard deployment due to their computational efficiency. However, despite their efficiency, these methods can sometimes underperform compared to alternative point encoding techniq…

    Submitted 11 March, 2024; originally announced March 2024.

    Comments: ICRA 2024

  13. arXiv:2403.05061

    cs.CV

    RadarDistill: Boosting Radar-based Object Detection Performance via Knowledge Distillation from LiDAR Features

    Authors: Geonho Bang, Kwangjin Choi, Jisong Kim, Dongsuk Kum, Jun Won Choi

    Abstract: The inherently noisy and sparse characteristics of radar data pose challenges in finding effective representations for 3D object detection. In this paper, we propose RadarDistill, a novel knowledge distillation (KD) method, which can improve the representation of radar data by leveraging LiDAR data. RadarDistill successfully transfers desirable characteristics of LiDAR features into radar features u…

    Submitted 4 April, 2024; v1 submitted 8 March, 2024; originally announced March 2024.

    Comments: Accepted to IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2024, 10 pages, 3 figures

  14. arXiv:2403.03468

    cs.CV

    Multi-task Learning for Real-time Autonomous Driving Leveraging Task-adaptive Attention Generator

    Authors: Wonhyeok Choi, Mingyu Shin, Hyukzae Lee, Jaehoon Cho, Jaehyeon Park, Sunghoon Im

    Abstract: Real-time processing is crucial in autonomous driving systems due to the imperative of instantaneous decision-making and rapid response. In real-world scenarios, autonomous vehicles are continuously tasked with interpreting their surroundings, analyzing intricate sensor data, and making decisions within split seconds to ensure safety through numerous computer vision tasks. In this paper, we presen…

    Submitted 6 March, 2024; originally announced March 2024.

    Comments: Accepted at ICRA 2024

  15. arXiv:2403.01663

    cs.CV

    PillarGen: Enhancing Radar Point Cloud Density and Quality via Pillar-based Point Generation Network

    Authors: Jisong Kim, Geonho Bang, Kwangjin Choi, Minjae Seong, Jaechang Yoo, Eunjong Pyo, Jun Won Choi

    Abstract: In this paper, we present a novel point generation model, referred to as Pillar-based Point Generation Network (PillarGen), which facilitates the transformation of point clouds from one domain into another. PillarGen can produce synthetic point clouds with enhanced density and quality based on the provided input point clouds. The PillarGen model performs the following three steps: 1) pillar encodi…

    Submitted 8 March, 2024; v1 submitted 3 March, 2024; originally announced March 2024.

    Comments: Accepted by IEEE International Conference on Robotics and Automation (ICRA 2024), 8 pages, 3 figures

  16. arXiv:2402.15019

    cs.LG cs.AI stat.ML

    Consistency-Guided Temperature Scaling Using Style and Content Information for Out-of-Domain Calibration

    Authors: Wonjeong Choi, Jungwuk Park, Dong-Jun Han, Younghyun Park, Jaekyun Moon

    Abstract: Research interests in the robustness of deep neural networks against domain shifts have been rapidly increasing in recent years. Most existing works, however, focus on improving the accuracy of the model, not the calibration performance which is another important requirement for trustworthy AI systems. Temperature scaling (TS), an accuracy-preserving post-hoc calibration method, has been proven to…

    Submitted 22 February, 2024; originally announced February 2024.

    Comments: Accepted at AAAI-24 (The 38th AAAI Conference on Artificial Intelligence, February 2024)

  17. arXiv:2402.08963

    cs.LG cs.AI

    DUEL: Duplicate Elimination on Active Memory for Self-Supervised Class-Imbalanced Learning

    Authors: Won-Seok Choi, Hyundo Lee, Dong-Sig Han, Junseok Park, Heeyeon Koo, Byoung-Tak Zhang

    Abstract: Recent machine learning algorithms have been developed using well-curated datasets, which often require substantial cost and resources. On the other hand, the direct use of raw data often leads to overfitting towards frequently occurring class information. To address class imbalances cost-efficiently, we propose an active data filtering process during self-supervised pre-training in our novel fram…

    Submitted 14 February, 2024; originally announced February 2024.

    Comments: Accepted as a full paper at AAAI 2024: The 38th Annual AAAI Conference on Artificial Intelligence (Main Tech Track). 7 pages (main paper), 2 pages (references), 11 pages (appendix) each

  18. arXiv:2402.03060

    cs.CR

    UniHENN: Designing More Versatile Homomorphic Encryption-based CNNs without im2col

    Authors: Hyunmin Choi, Jihun Kim, Seungho Kim, Seonhye Park, Jeongyong Park, Wonbin Choi, Hyoungshick Kim

    Abstract: Homomorphic encryption enables computations on encrypted data without decryption, which is crucial for privacy-preserving cloud services. However, deploying convolutional neural networks (CNNs) with homomorphic encryption encounters significant challenges, particularly in converting input data into a two-dimensional matrix for convolution, typically achieved using the im2col technique. While effic…

    Submitted 5 February, 2024; originally announced February 2024.

  19. arXiv:2401.16732

    cs.CR

    Flash: A Hybrid Private Inference Protocol for Deep CNNs with High Accuracy and Low Latency on CPU

    Authors: Hyeri Roh, Jinsu Yeo, Yeongil Ko, Gu-Yeon Wei, David Brooks, Woo-Seok Choi

    Abstract: This paper presents Flash, an optimized private inference (PI) hybrid protocol utilizing both homomorphic encryption (HE) and secure two-party computation (2PC), which can reduce the end-to-end PI latency for deep CNN models to less than 1 minute on a CPU. To this end, first, Flash proposes a low-latency convolution algorithm built upon a fast slot rotation operation and a novel data encoding scheme,…

    Submitted 29 January, 2024; originally announced January 2024.

  20. arXiv:2401.14780

    cs.IT

    Adversarial Attacks and Defenses in 6G Network-Assisted IoT Systems

    Authors: Bui Duc Son, Nguyen Tien Hoa, Trinh Van Chien, Waqas Khalid, Mohamed Amine Ferrag, Wan Choi, Merouane Debbah

    Abstract: The Internet of Things (IoT) and massive IoT systems are key to sixth-generation (6G) networks due to dense connectivity, ultra-reliability, low latency, and high throughput. Artificial intelligence, including deep learning and machine learning, offers solutions for optimizing and deploying cutting-edge technologies for future radio communications. However, these techniques are vulnerable to adver…

    Submitted 28 January, 2024; v1 submitted 26 January, 2024; originally announced January 2024.

    Comments: 17 pages, 5 figures, and 4 tables. Submitted for publication

  21. arXiv:2401.01075

    cs.CV

    Depth-discriminative Metric Learning for Monocular 3D Object Detection

    Authors: Wonhyeok Choi, Mingyu Shin, Sunghoon Im

    Abstract: Monocular 3D object detection poses a significant challenge due to the lack of depth information in RGB images. Many existing methods strive to enhance the object depth estimation performance by allocating additional parameters for object depth estimation, utilizing extra modules or data. In contrast, we introduce a novel metric learning scheme that encourages the model to extract depth-discrimina…

    Submitted 2 January, 2024; originally announced January 2024.

    Comments: Accepted at NeurIPS 2023

  22. arXiv:2401.00365

    cs.LG cs.AI cs.CV

    HQ-VAE: Hierarchical Discrete Representation Learning with Variational Bayes

    Authors: Yuhta Takida, Yukara Ikemiya, Takashi Shibuya, Kazuki Shimada, Woosung Choi, Chieh-Hsin Lai, Naoki Murata, Toshimitsu Uesaka, Kengo Uchida, Wei-Hsiang Liao, Yuki Mitsufuji

    Abstract: Vector quantization (VQ) is a technique to deterministically learn features with discrete codebook representations. It is commonly performed with a variational autoencoding model, VQ-VAE, which can be further extended to hierarchical structures for making high-fidelity reconstructions. However, such hierarchical extensions of VQ-VAE often suffer from the codebook/layer collapse issue, where the co…

    Submitted 28 March, 2024; v1 submitted 30 December, 2023; originally announced January 2024.

    Comments: 34 pages with 17 figures, accepted for TMLR

  23. arXiv:2311.12519

    cs.CR

    Hyena: Optimizing Homomorphically Encrypted Convolution for Private CNN Inference

    Authors: Hyeri Roh, Woo-Seok Choi

    Abstract: Processing convolution layers remains a huge bottleneck for private deep convolutional neural network (CNN) inference for large datasets. To solve this issue, this paper presents a novel homomorphic convolution algorithm that provides speedup as well as communication cost and storage savings. We first note that padded convolution provides the advantage of model storage saving, but it does not support channe…

    Submitted 21 November, 2023; originally announced November 2023.

  24. arXiv:2311.04250

    cs.AI cs.CL cs.LG

    Unifying Structure and Language Semantic for Efficient Contrastive Knowledge Graph Completion with Structured Entity Anchors

    Authors: Sang-Hyun Je, Wontae Choi, Kwangjin Oh

    Abstract: The goal of knowledge graph completion (KGC) is to predict missing links in a KG using trained facts that are already known. Recently, pre-trained language model (PLM) based methods that utilize both textual and structural information have been emerging, but their performance lags behind state-of-the-art (SOTA) structure-based methods, or some methods lose their inductive inference capabilities in the p…

    Submitted 7 November, 2023; originally announced November 2023.

  25. arXiv:2310.12369

    cs.IR

    On Identifying Points of Semantic Shift Across Domains

    Authors: Hyung Wook Choi, Mat Kelly

    Abstract: The semantics used for particular terms in an academic field organically evolve over time. Tracking this evolution through inspection of published literature has either been from the perspective of Linguistic scholars or has concentrated the focus of term evolution within a single domain of study. In this paper, we performed a case study to identify semantic evolution across different domains and…

    Submitted 18 October, 2023; originally announced October 2023.

    Comments: In 17th International Conference on Metadata and Semantics Research, October 2023

  26. arXiv:2309.15717

    eess.AS cs.LG cs.SD

    Timbre-Trap: A Low-Resource Framework for Instrument-Agnostic Music Transcription

    Authors: Frank Cwitkowitz, Kin Wai Cheuk, Woosung Choi, Marco A. Martínez-Ramírez, Keisuke Toyama, Wei-Hsiang Liao, Yuki Mitsufuji

    Abstract: In recent years, research on music transcription has focused mainly on architecture design and instrument-specific data acquisition. With the lack of availability of diverse datasets, progress is often limited to solo-instrument tasks such as piano transcription. Several works have explored multi-instrument transcription as a means to bolster the performance of models on low-resource tasks, but th…

    Submitted 24 January, 2024; v1 submitted 27 September, 2023; originally announced September 2023.

    Comments: Accepted to ICASSP 2024

  27. arXiv:2309.05032

    cs.CV

    Unified Contrastive Fusion Transformer for Multimodal Human Action Recognition

    Authors: Kyoung Ok Yang, Junho Koh, Jun Won Choi

    Abstract: Various types of sensors have been considered to develop human action recognition (HAR) models. Robust HAR performance can be achieved by fusing multimodal data acquired by different sensors. In this paper, we introduce a new multimodal fusion architecture, referred to as Unified Contrastive Fusion Transformer (UCFFormer), designed to integrate data with diverse distributions to enhance HAR perform…

    Submitted 10 September, 2023; originally announced September 2023.

  28. Double RIS-Assisted MIMO Systems Over Spatially Correlated Rician Fading Channels and Finite Scatterers

    Authors: Ha An Le, Trinh Van Chien, Van Duc Nguyen, Wan Choi

    Abstract: This paper investigates double RIS-assisted MIMO communication systems over Rician fading channels with finite scatterers, spatial correlation, and the existence of a double-scattering link between the transceiver. First, the statistical information is derived in closed form for the aggregated channels, unveiling various influences of the system and environment on the average channel power gains. N…

    Submitted 8 September, 2023; originally announced September 2023.

    Comments: 15 pages, 9 figures, accepted by IEEE Transactions on Communications

  29. arXiv:2308.11984

    math.OC cs.DC

    Non-ergodic linear convergence property of the delayed gradient descent under the strongly convexity and the Polyak-Łojasiewicz condition

    Authors: Hyung Jun Choi, Woocheol Choi, Jinmyoung Seok

    Abstract: In this work, we establish the linear convergence estimate for the gradient descent involving the delay $τ\in\mathbb{N}$ when the cost function is $μ$-strongly convex and $L$-smooth. This result improves upon the well-known estimates in Arjevani et al. \cite{ASS} and Stich-Karmireddy \cite{SK} in the sense that it is non-ergodic and is still established in spite of weaker constraint of cost functi…

    Submitted 22 February, 2024; v1 submitted 23 August, 2023; originally announced August 2023.

    Comments: A result for the delayed SGD was added. Accepted in Analysis and Applications

  30. arXiv:2308.10604

    cs.CV cs.AI cs.LG

    BackTrack: Robust template update via Backward Tracking of candidate template

    Authors: Dongwook Lee, Wonjun Choi, Seohyung Lee, ByungIn Yoo, Eunho Yang, Seongju Hwang

    Abstract: Variations of target appearance such as deformations, illumination variance, occlusion, etc., are the major challenges of visual object tracking that negatively impact the performance of a tracker. An effective method to tackle these challenges is template update, which updates the template to reflect the change of appearance in the target object during tracking. However, with template updates, in…

    Submitted 21 August, 2023; originally announced August 2023.

    Comments: 14 pages, 7 figures

  31. arXiv:2308.07843

    cs.LG stat.AP stat.ML

    Dyadic Reinforcement Learning

    Authors: Shuangning Li, Lluis Salvat Niell, Sung Won Choi, Inbal Nahum-Shani, Guy Shani, Susan Murphy

    Abstract: Mobile health aims to enhance health outcomes by delivering interventions to individuals as they go about their daily life. The involvement of care partners and social support networks often proves crucial in helping individuals manage burdensome medical conditions. This presents opportunities in mobile health to design interventions that target the dyadic relationship -- the relationship betwee…

    Submitted 1 November, 2023; v1 submitted 15 August, 2023; originally announced August 2023.

  32. arXiv:2308.06979

    eess.AS cs.SD

    The Sound Demixing Challenge 2023 – Music Demixing Track

    Authors: Giorgio Fabbro, Stefan Uhlich, Chieh-Hsin Lai, Woosung Choi, Marco Martínez-Ramírez, Weihsiang Liao, Igor Gadelha, Geraldo Ramos, Eddie Hsu, Hugo Rodrigues, Fabian-Robert Stöter, Alexandre Défossez, Yi Luo, Jianwei Yu, Dipam Chakraborty, Sharada Mohanty, Roman Solovyev, Alexander Stempkovskiy, Tatiana Habruseva, Nabarun Goswami, Tatsuya Harada, Minseok Kim, Jun Hyung Lee, Yuanliang Dong, Xinran Zhang, et al. (2 additional authors not shown)

    Abstract: This paper summarizes the music demixing (MDX) track of the Sound Demixing Challenge (SDX'23). We provide a summary of the challenge setup and introduce the task of robust music source separation (MSS), i.e., training MSS models in the presence of errors in the training data. We propose a formalization of the errors that can occur in the design of a training dataset for MSS systems and introduce t…

    Submitted 19 April, 2024; v1 submitted 14 August, 2023; originally announced August 2023.

    Comments: Published in Transactions of the International Society for Music Information Retrieval (https://transactions.ismir.net/articles/10.5334/tismir.171)

    Journal ref: Transactions of the International Society for Music Information Retrieval, 7(1), pp.63-84, 2024

  33. arXiv:2308.05862

    eess.IV cs.AI cs.CV

    Unleashing the Strengths of Unlabeled Data in Pan-cancer Abdominal Organ Quantification: the FLARE22 Challenge

    Authors: Jun Ma, Yao Zhang, Song Gu, Cheng Ge, Shihao Ma, Adamo Young, Cheng Zhu, Kangkang Meng, Xin Yang, Ziyan Huang, Fan Zhang, Wentao Liu, YuanKe Pan, Shoujin Huang, Jiacheng Wang, Mingze Sun, Weixin Xu, Dengqiang Jia, Jae Won Choi, Natália Alves, Bram de Wilde, Gregor Koehler, Yajun Wu, Manuel Wiesenfarth, Qiongjie Zhu, et al. (4 additional authors not shown)

    Abstract: Quantitative organ assessment is an essential step in automated abdominal disease diagnosis and treatment planning. Artificial intelligence (AI) has shown great potential to automatize this process. However, most existing AI algorithms rely on many expert annotations and lack a comprehensive evaluation of accuracy and efficiency in real-world multinational settings. To overcome these limitations,…

    Submitted 10 August, 2023; originally announced August 2023.

    Comments: MICCAI FLARE22: https://flare22.grand-challenge.org/

  34. arXiv:2308.00994

    cs.CV cs.LG

    SYNAuG: Exploiting Synthetic Data for Data Imbalance Problems

    Authors: Moon Ye-Bin, Nam Hyeon-Woo, Wonseok Choi, Nayeong Kim, Suha Kwak, Tae-Hyun Oh

    Abstract: Data imbalance in training data often leads to biased predictions from trained models, which in turn causes ethical and social issues. A straightforward solution is to carefully curate training data, but given the enormous scale of modern neural networks, this is prohibitively labor-intensive and thus impractical. Inspired by recent developments in generative models, this paper explores the potent…

    Submitted 25 April, 2024; v1 submitted 2 August, 2023; originally announced August 2023.

    Comments: The paper is under consideration at Pattern Recognition Letters

  35. arXiv:2307.10249

    cs.CV

    RCM-Fusion: Radar-Camera Multi-Level Fusion for 3D Object Detection

    Authors: Jisong Kim, Minjae Seong, Geonho Bang, Dongsuk Kum, Jun Won Choi

    Abstract: While LiDAR sensors have been successfully applied to 3D object detection, the affordability of radar and camera sensors has led to a growing interest in fusing radars and cameras for 3D object detection. However, previous radar-camera fusion models were unable to fully utilize the potential of radar information. In this paper, we propose Radar-Camera Multi-level fusion (RCM-Fusion), which attempt…

    Submitted 16 May, 2024; v1 submitted 17 July, 2023; originally announced July 2023.

    Comments: Accepted by IEEE International Conference on Robotics and Automation (ICRA 2024, Oral presentation), 7 pages, 5 figures

  36. arXiv:2307.07468

    cs.RO

    SGGNet$^2$: Speech-Scene Graph Grounding Network for Speech-guided Navigation

    Authors: Dohyun Kim, Yeseung Kim, Jaehwi Jang, Minjae Song, Woojin Choi, Daehyung Park

    Abstract: Spoken language serves as an accessible and efficient interface, enabling non-experts and disabled users to interact with complex assistant robots. However, accurately grounding language utterances poses a significant challenge due to the acoustic variability in speakers' voices and environmental noise. In this work, we propose a novel speech-scene graph grounding network (SGGNet$^2$) that rob…

    Submitted 14 April, 2024; v1 submitted 14 July, 2023; originally announced July 2023.

    Comments: 7 pages, 6 figures, Published at 2023 IEEE International Conference on Robot and Human Interactive Communication (RO-MAN), [Dohyun Kim, Yeseung Kim, Jaehwi Jang, and Minjae Song] contributed equally to this work

  37. arXiv:2306.06403

    cs.IT cs.LG

    Bayesian Inverse Contextual Reasoning for Heterogeneous Semantics-Native Communication

    Authors: Hyowoon Seo, Yoonseong Kang, Mehdi Bennis, Wan Choi

    Abstract: This work deals with the heterogeneous semantic-native communication (SNC) problem. When agents do not share the same communication context, the effectiveness of contextual reasoning (CR) is compromised, calling for agents to infer other agents' context. This article proposes a novel framework for solving the inverse problem of CR in SNC using two Bayesian inference methods, namely: Bayesian invers…

    Submitted 10 June, 2023; originally announced June 2023.

    Comments: 14 pages, 7 figures, submitted for possible publication

  38. arXiv:2305.07522

    cs.AR cs.AI

    SPADE: Sparse Pillar-based 3D Object Detection Accelerator for Autonomous Driving

    Authors: Minjae Lee, Seongmin Park, Hyungmin Kim, Minyong Yoon, Janghwan Lee, Jun Won Choi, Nam Sung Kim, Mingu Kang, Jungwook Choi

    Abstract: 3D object detection using point cloud (PC) data is essential for perception pipelines of autonomous driving, where efficient encoding is key to meeting stringent resource and latency requirements. PointPillars, a widely adopted bird's-eye view (BEV) encoding, aggregates 3D point cloud data into 2D pillars for fast and accurate 3D object detection. However, the state-of-the-art methods employing Po…

    Submitted 13 January, 2024; v1 submitted 12 May, 2023; originally announced May 2023.

    Comments: 14 pages, 15 figures

  39. arXiv:2304.09047

    cs.LG cs.CE

    Neural Lumped Parameter Differential Equations with Application in Friction-Stir Processing

    Authors: James Koch, WoongJo Choi, Ethan King, David Garcia, Hrishikesh Das, Tianhao Wang, Ken Ross, Keerti Kappagantula

    Abstract: Lumped parameter methods aim to simplify the evolution of spatially-extended or continuous physical systems to that of a "lumped" element representative of the physical scales of the modeled system. For systems where the definition of a lumped element or its associated physics may be unknown, modeling tasks may be restricted to full-fidelity simulations of the physics of a system. In this work, we…

    Submitted 18 April, 2023; originally announced April 2023.

  40. arXiv:2304.08204  [pdf, other]

    cs.CV

    Learning Geometry-aware Representations by Sketching

    Authors: Hyundo Lee, Inwoo Hwang, Hyunsung Go, Won-Seok Choi, Kibeom Kim, Byoung-Tak Zhang

    Abstract: Understanding geometric concepts, such as distance and shape, is essential for understanding the real world and also for many vision tasks. To incorporate such information into a visual representation of a scene, we propose learning to represent the scene by sketching, inspired by human behavior. Our method, coined Learning by Sketching (LBS), learns to convert an image into a set of colored strok…

    Submitted 17 April, 2023; originally announced April 2023.

    Comments: CVPR 2023

  41. arXiv:2304.00670  [pdf, other]

    cs.CV cs.AI cs.RO

    CRN: Camera Radar Net for Accurate, Robust, Efficient 3D Perception

    Authors: Youngseok Kim, Juyeb Shin, Sanmin Kim, In-Jae Lee, Jun Won Choi, Dongsuk Kum

    Abstract: Autonomous driving requires an accurate and fast 3D perception system that includes 3D object detection, tracking, and segmentation. Although recent low-cost camera-based approaches have shown promising results, they are susceptible to poor illumination or bad weather conditions and have a large localization error. Hence, fusing camera with low-cost radar, which provides precise long-range measure…

    Submitted 23 December, 2023; v1 submitted 2 April, 2023; originally announced April 2023.

    Comments: IEEE/CVF International Conference on Computer Vision (ICCV'23). Code is available at https://github.com/youngskkim/CRN

  42. arXiv:2303.06856  [pdf, ps, other]

    cs.CV cs.LG

    Dynamic Neural Network for Multi-Task Learning Searching across Diverse Network Topologies

    Authors: Wonhyeok Choi, Sunghoon Im

    Abstract: In this paper, we present a new MTL framework that searches for structures optimized for multiple tasks with diverse graph topologies and shares features among tasks. We design a restricted DAG-based central network with read-in/read-out layers to build topologically diverse task-adaptive structures while limiting search space and time. We search for a single optimized network that serves as multi…

    Submitted 13 March, 2023; originally announced March 2023.

    Comments: Accepted at CVPR 2023, 13 pages, 10 encapsulated postscript figures

  43. arXiv:2302.08779  [pdf, other]

    math.OC cs.DC eess.SP

    On the convergence result of the gradient-push algorithm on directed graphs with constant stepsize

    Authors: Woocheol Choi, Doheon Kim, Seok-Bae Yun

    Abstract: The gradient-push algorithm has been widely used for decentralized optimization problems when the connectivity network is a directed graph. This paper shows that the gradient-push algorithm with stepsize $α>0$ converges exponentially fast to an $O(α)$-neighborhood of the optimizer under the assumption that each cost is smooth and the total cost is strongly convex. Numerical experiments are provided to s…

    Submitted 17 February, 2023; originally announced February 2023.

    MSC Class: 90C25; 68Q25

  44. arXiv:2212.13773  [pdf, other]

    cs.SE

    A Bayesian Framework for Automated Debugging

    Authors: Sungmin Kang, Wonkeun Choi, Shin Yoo

    Abstract: Debugging takes up a significant portion of developer time. As a result, automated debugging techniques including Fault Localization (FL) and Automated Program Repair (APR) have garnered significant attention due to their potential to aid developers in debugging tasks. Despite intensive research on these subjects, we are unaware of a theoretic framework that highlights the principles behind automa…

    Submitted 29 December, 2022; v1 submitted 28 December, 2022; originally announced December 2022.

  45. arXiv:2212.10854  [pdf, other]

    cs.CR

    Defining C-ITS Environment and Attack Scenarios

    Authors: Yongsik Kim, Jae Woong Choi, Hyo Sun Lee, Jeong Do Yoo, Haerin Kim, Junho Jang, Kibeom Park, Huy Kang Kim

    Abstract: As technology advances, it has become possible to process large volumes of data, and as the elements of a city grow more diverse and complex, cities are becoming smart cities. One of the core systems of smart cities is Cooperative-Intelligent Transport Systems (C-ITS). C-ITS is a system that provides drivers with real-time accident risk information such as surrounding traffic conditions, sudden stops, and fall…

    Submitted 21 December, 2022; originally announced December 2022.

    Comments: in Korean language

  46. arXiv:2212.10320  [pdf]

    cs.AI q-bio.QM

    Construction of extra-large scale screening tools for risks of severe mental illnesses using real world healthcare data

    Authors: Dianbo Liu, Karmel W. Choi, Paulo Lizano, William Yuan, Kun-Hsing Yu, Jordan W. Smoller, Isaac Kohane

    Abstract: Importance: The prevalence of severe mental illnesses (SMIs) in the United States is approximately 3% of the whole population. The ability to conduct risk screening of SMIs at large scale could inform early prevention and treatment. Objective: A scalable machine learning based tool was developed to conduct population-level risk screening for SMIs, including schizophrenia, schizoaffective disorde…

    Submitted 12 January, 2023; v1 submitted 20 December, 2022; originally announced December 2022.

  47. arXiv:2212.00442  [pdf, other]

    cs.CV

    MGTANet: Encoding Sequential LiDAR Points Using Long Short-Term Motion-Guided Temporal Attention for 3D Object Detection

    Authors: Junho Koh, Junhyung Lee, Youngwoo Lee, Jaekyum Kim, Jun Won Choi

    Abstract: Most scanning LiDAR sensors generate a sequence of point clouds in real-time. While conventional 3D object detectors use a set of unordered LiDAR points acquired over a fixed time interval, recent studies have revealed that substantial performance improvement can be achieved by exploiting the spatio-temporal context present in a sequence of LiDAR point sets. In this paper, we propose a novel 3D ob…

    Submitted 21 December, 2022; v1 submitted 1 December, 2022; originally announced December 2022.

    Comments: Thirty-Seventh AAAI Conference on Artificial Intelligence (AAAI'23)

  48. arXiv:2211.13529  [pdf, other]

    cs.CV

    3D Dual-Fusion: Dual-Domain Dual-Query Camera-LiDAR Fusion for 3D Object Detection

    Authors: Yecheol Kim, Konyul Park, Minwook Kim, Dongsuk Kum, Jun Won Choi

    Abstract: Fusing data from cameras and LiDAR sensors is an essential technique to achieve robust 3D object detection. One key challenge in camera-LiDAR fusion involves mitigating the large domain gap between the two sensors in terms of coordinates and data distribution when fusing their features. In this paper, we propose a novel camera-LiDAR fusion architecture called 3D Dual-Fusion, which is designed to…

    Submitted 16 February, 2023; v1 submitted 24 November, 2022; originally announced November 2022.

    Comments: 12 pages, 3 figures

  49. arXiv:2211.08609  [pdf, other]

    cs.CV

    R-Pred: Two-Stage Motion Prediction Via Tube-Query Attention-Based Trajectory Refinement

    Authors: Sehwan Choi, Jungho Kim, Junyong Yun, Jun Won Choi

    Abstract: Predicting the future motion of dynamic agents is of paramount importance to ensuring safety and assessing risks in motion planning for autonomous robots. In this study, we propose a two-stage motion prediction method, called R-Pred, designed to effectively utilize both scene and interaction context using a cascade of the initial trajectory proposal and trajectory refinement networks. The initial…

    Submitted 14 July, 2023; v1 submitted 15 November, 2022; originally announced November 2022.

  50. arXiv:2211.05942  [pdf, other]

    eess.IV cs.CV

    Knowledge Distillation from Cross Teaching Teachers for Efficient Semi-Supervised Abdominal Organ Segmentation in CT

    Authors: Jae Won Choi

    Abstract: For more clinical applications of deep learning models for medical image segmentation, high demands on labeled data and computational resources must be addressed. This study proposes a coarse-to-fine framework with two teacher models and a student model that combines knowledge distillation and cross teaching, a consistency regularization based on pseudo-labels, for efficient semi-supervised learni…

    Submitted 10 November, 2022; originally announced November 2022.