Showing 1–50 of 122 results for author: Lee, J H

  1. arXiv:2407.11534  [pdf, other]

    cs.LG cs.AI

    LRQ: Optimizing Post-Training Quantization for Large Language Models by Learning Low-Rank Weight-Scaling Matrices

    Authors: Jung Hyun Lee, Jeonghoon Kim, June Yong Yang, Se Jung Kwon, Eunho Yang, Kang Min Yoo, Dongsoo Lee

    Abstract: With the commercialization of large language models (LLMs), weight-activation quantization has emerged to compress and accelerate LLMs, achieving high throughput while reducing inference costs. However, existing post-training quantization (PTQ) techniques for quantizing weights and activations of LLMs still suffer from non-negligible accuracy drops, especially on massive multitask language underst…

    Submitted 16 July, 2024; originally announced July 2024.

    Comments: Preprint

  2. arXiv:2406.18699  [pdf, ps, other]

    cs.RO

    From Pixels to Torques with Linear Feedback

    Authors: Jeong Hun Lee, Sam Schoedel, Aditya Bhardwaj, Zachary Manchester

    Abstract: We demonstrate the effectiveness of simple observer-based linear feedback policies for "pixels-to-torques" control of robotic systems using only a robot-facing camera. Specifically, we show that the matrices of an image-based Luenberger observer (linear state estimator) for a "student" output-feedback policy can be learned from demonstration data provided by a "teacher" state-feedback policy via s…

    Submitted 7 July, 2024; v1 submitted 26 June, 2024; originally announced June 2024.

    Comments: Submitted to Workshop on Algorithmic Foundations of Robotics (WAFR) 2024

  3. arXiv:2406.18505  [pdf, other]

    cs.LG cs.AI cs.CL cs.RO

    Mental Modeling of Reinforcement Learning Agents by Language Models

    Authors: Wenhao Lu, Xufeng Zhao, Josua Spisak, Jae Hee Lee, Stefan Wermter

    Abstract: Can emergent language models faithfully model the intelligence of decision-making agents? Though modern language models exhibit already some reasoning ability, and theoretically can potentially express any probable distribution over tokens, it remains underexplored how the world knowledge these pretrained models have memorized can be utilized to comprehend an agent's behaviour in the physical worl…

    Submitted 26 June, 2024; originally announced June 2024.

    Comments: https://lukaswill.github.io/

  4. arXiv:2406.09988  [pdf, other]

    cs.AI cs.CL cs.RO

    Details Make a Difference: Object State-Sensitive Neurorobotic Task Planning

    Authors: Xiaowen Sun, Xufeng Zhao, Jae Hee Lee, Wenhao Lu, Matthias Kerzel, Stefan Wermter

    Abstract: The state of an object reflects its current status or condition and is important for a robot's task planning and manipulation. However, detecting an object's state and generating a state-sensitive plan for robots is challenging. Recently, pre-trained Large Language Models (LLMs) and Vision-Language Models (VLMs) have shown impressive capabilities in generating plans. However, to the best of our kn…

    Submitted 14 June, 2024; originally announced June 2024.

  5. arXiv:2406.02989  [pdf, other]

    cs.RO cs.AI

    Learning Semantic Traversability with Egocentric Video and Automated Annotation Strategy

    Authors: Yunho Kim, Jeong Hyun Lee, Choongin Lee, Juhyeok Mun, Donghoon Youm, Jeongsoo Park, Jemin Hwangbo

    Abstract: For reliable autonomous robot navigation in urban settings, the robot must have the ability to identify semantically traversable terrains in the image based on the semantic understanding of the scene. This reasoning ability is based on semantic traversability, which is frequently achieved using semantic segmentation models fine-tuned on the testing domain. This fine-tuning process often involves m…

    Submitted 5 June, 2024; originally announced June 2024.

    Comments: Submitted to IEEE Robotics and Automation Letters (RA-L), First two authors contributed equally

  6. arXiv:2405.20605  [pdf, other]

    cs.LG cs.AI cs.CV

    Searching for internal symbols underlying deep learning

    Authors: Jung H. Lee, Sujith Vijayan

    Abstract: Deep learning (DL) enables deep neural networks (DNNs) to automatically learn complex tasks or rules from given examples without instructions or guiding principles. As we do not engineer DNNs' functions, it is extremely difficult to diagnose their decisions, and multiple lines of studies proposed to explain principles of DNNs/DL operations. Notably, one line of studies suggests that DNNs may learn…

    Submitted 30 May, 2024; originally announced May 2024.

    Comments: 10 pages, 7 figures, 3 tables and Appendix

  7. arXiv:2405.11563  [pdf, other]

    cs.IT

    User-Centric Association and Feedback Bit Allocation for FDD Cell-Free Massive MIMO

    Authors: Kwangjae Lee, Jung Hoon Lee, Wan Choi

    Abstract: In this paper, we introduce a novel approach to user-centric association and feedback bit allocation for the downlink of a cell-free massive MIMO (CF-mMIMO) system, operating under limited feedback constraints. In CF-mMIMO systems employing frequency division duplexing, each access point (AP) relies on channel information provided by its associated user equipments (UEs) for beamforming design. Sin…

    Submitted 19 May, 2024; originally announced May 2024.

  8. arXiv:2404.01954  [pdf, other]

    cs.CL cs.AI

    HyperCLOVA X Technical Report

    Authors: Kang Min Yoo, Jaegeun Han, Sookyo In, Heewon Jeon, Jisu Jeong, Jaewook Kang, Hyunwook Kim, Kyung-Min Kim, Munhyong Kim, Sungju Kim, Donghyun Kwak, Hanock Kwak, Se Jung Kwon, Bado Lee, Dongsoo Lee, Gichang Lee, Jooho Lee, Baeseong Park, Seongjin Shin, Joonsang Yu, Seolki Baek, Sumin Byeon, Eungsup Cho, Dooseok Choe, Jeesung Han, et al. (371 additional authors not shown)

    Abstract: We introduce HyperCLOVA X, a family of large language models (LLMs) tailored to the Korean language and culture, along with competitive capabilities in English, math, and coding. HyperCLOVA X was trained on a balanced mix of Korean, English, and code data, followed by instruction-tuning with high-quality human-annotated datasets while abiding by strict safety guidelines reflecting our commitment t…

    Submitted 13 April, 2024; v1 submitted 2 April, 2024; originally announced April 2024.

    Comments: 44 pages; updated authors list and fixed author names

  9. arXiv:2403.08244  [pdf]

    cs.CE

    Evaluating the Efficiency and Cost-effectiveness of RPB-based CO2 Capture: A Comprehensive Approach to Simultaneous Design and Operating Condition Optimization

    Authors: Howoun Jung, Nohjin Park, Jay H. Lee

    Abstract: Despite ongoing global initiatives to reduce CO2 emissions, implementing large-scale CO2 capture using amine solvents is fraught with economic uncertainties and technical hurdles. The Rotating Packed Bed (RPB) presents a promising alternative to traditional packed towers, offering compact design and adaptability. Nonetheless, scaling RPB processes to an industrial level is challenging due to the n…

    Submitted 13 March, 2024; originally announced March 2024.

    Comments: 44 pages, 11 figures, 6 tables

  10. arXiv:2403.05136  [pdf, other]

    cs.RO eess.SP

    DeRO: Dead Reckoning Based on Radar Odometry With Accelerometers Aided for Robot Localization

    Authors: Hoang Viet Do, Yong Hun Kim, Joo Han Lee, Min Ho Lee, Jin Woo Song

    Abstract: In this paper, we propose a radar odometry structure that directly utilizes radar velocity measurements for dead reckoning while maintaining its ability to update estimations within the Kalman filter framework. Specifically, we employ the Doppler velocity obtained by a 4D Frequency Modulated Continuous Wave (FMCW) radar in conjunction with gyroscope data to calculate poses. This approach helps mit…

    Submitted 8 March, 2024; originally announced March 2024.

    Comments: 9 pages, 5 figures, 1 table, conference

    ACM Class: I.2.9

  11. arXiv:2402.17517  [pdf, other]

    cs.LG

    Label-Noise Robust Diffusion Models

    Authors: Byeonghu Na, Yeongmin Kim, HeeSun Bae, Jung Hyun Lee, Se Jung Kwon, Wanmo Kang, Il-Chul Moon

    Abstract: Conditional diffusion models have shown remarkable performance in various generative tasks, but training them requires large-scale datasets that often contain noise in conditional inputs, a.k.a. noisy labels. This noise leads to condition mismatch and quality degradation of generated data. This paper proposes Transition-aware weighted Denoising Score Matching (TDSM) for training conditional diffus…

    Submitted 27 February, 2024; originally announced February 2024.

    Comments: Accepted at ICLR 2024

  12. arXiv:2402.06185  [pdf, other]

    cs.CV cs.AI cs.LG

    Development and validation of an artificial intelligence model to accurately predict spinopelvic parameters

    Authors: Edward S. Harake, Joseph R. Linzey, Cheng Jiang, Rushikesh S. Joshi, Mark M. Zaki, Jaes C. Jones, Siri S. Khalsa, John H. Lee, Zachary Wilseck, Jacob R. Joseph, Todd C. Hollon, Paul Park

    Abstract: Objective. Achieving appropriate spinopelvic alignment has been shown to be associated with improved clinical symptoms. However, measurement of spinopelvic radiographic parameters is time-intensive and interobserver reliability is a concern. Automated measurement tools have the promise of rapid and consistent measurements, but existing tools are still limited by some degree of manual user-entry re…

    Submitted 8 February, 2024; originally announced February 2024.

    Comments: 10 pages, 5 figures, to appear in Journal of Neurosurgery: Spine

  13. arXiv:2401.00104  [pdf, other]

    cs.LG cs.AI stat.ME

    Causal State Distillation for Explainable Reinforcement Learning

    Authors: Wenhao Lu, Xufeng Zhao, Thilo Fryen, Jae Hee Lee, Mengdi Li, Sven Magg, Stefan Wermter

    Abstract: Reinforcement learning (RL) is a powerful technique for training intelligent agents, but understanding why these agents make specific decisions can be quite challenging. This lack of transparency in RL models has been a long-standing problem, making it difficult for users to grasp the reasons behind an agent's behaviour. Various approaches have been explored to address this problem, with one promi…

    Submitted 1 April, 2024; v1 submitted 29 December, 2023; originally announced January 2024.

    Comments: https://lukaswill.github.io/; Accepted as oral by CLeaR 2024

  14. arXiv:2312.08888  [pdf, other]

    cs.LG cs.CV

    Read Between the Layers: Leveraging Multi-Layer Representations for Rehearsal-Free Continual Learning with Pre-Trained Models

    Authors: Kyra Ahrens, Hans Hergen Lehmann, Jae Hee Lee, Stefan Wermter

    Abstract: We address the Continual Learning (CL) problem, wherein a model must learn a sequence of tasks from non-stationary distributions while preserving prior knowledge upon encountering new experiences. With the advancement of foundation models, CL research has pivoted from the initial learning-from-scratch paradigm towards utilizing generic features from large-scale pre-training. However, existing appr…

    Submitted 5 July, 2024; v1 submitted 13 December, 2023; originally announced December 2023.

    Comments: Accepted for publication in Transactions of Machine Learning Research (TMLR) journal

  15. arXiv:2311.15356  [pdf, other]

    cs.CV cs.AI

    Having Second Thoughts? Let's hear it

    Authors: Jung H. Lee, Sujith Vijayan

    Abstract: Deep learning models loosely mimic bottom-up signal pathways from low-order sensory areas to high-order cognitive areas. After training, DL models can outperform humans on some domain-specific tasks, but their decision-making process has been known to be easily disrupted. Since the human brain consists of multiple functional areas highly connected to one another and relies on intricate interplays…

    Submitted 31 May, 2024; v1 submitted 26 November, 2023; originally announced November 2023.

    Comments: 10 pages, 6 figures, 3 table and Append/Supplementary materials. Section 3 has been substantially revised

  16. arXiv:2311.10792  [pdf]

    cs.LG cs.AI stat.AP

    Enhancing Data Efficiency and Feature Identification for Lithium-Ion Battery Lifespan Prediction by Deciphering Interpretation of Temporal Patterns and Cyclic Variability Using Attention-Based Models

    Authors: Jaewook Lee, Seongmin Heo, Jay H. Lee

    Abstract: Accurately predicting the lifespan of lithium-ion batteries is crucial for optimizing operational strategies and mitigating risks. While numerous studies have aimed at predicting battery lifespan, few have examined the interpretability of their models or how such insights could improve predictions. Addressing this gap, we introduce three innovative models that integrate shallow attention layers in…

    Submitted 11 April, 2024; v1 submitted 16 November, 2023; originally announced November 2023.

  17. arXiv:2311.08022  [pdf, ps, other]

    cs.AI cs.LG

    Two-Stage Predict+Optimize for Mixed Integer Linear Programs with Unknown Parameters in Constraints

    Authors: Xinyi Hu, Jasper C. H. Lee, Jimmy H. M. Lee

    Abstract: Consider the setting of constrained optimization, with some parameters unknown at solving time and requiring prediction from relevant features. Predict+Optimize is a recent framework for end-to-end training supervised learning models for such predictions, incorporating information about the optimization problem in the training process in order to yield better predictions in terms of the quality of…

    Submitted 14 November, 2023; originally announced November 2023.

  18. MetaMix: Meta-state Precision Searcher for Mixed-precision Activation Quantization

    Authors: Han-Byul Kim, Joo Hyung Lee, Sungjoo Yoo, Hong-Seok Kim

    Abstract: Mixed-precision quantization of efficient networks often suffers from activation instability encountered in the exploration of bit selections. To address this problem, we propose a novel method called MetaMix which consists of bit selection and weight training phases. The bit selection phase iterates two steps, (1) the mixed-precision-aware weight update, and (2) the bit-search training with the fi…

    Submitted 9 April, 2024; v1 submitted 12 November, 2023; originally announced November 2023.

    Comments: Proc. The 38th Annual AAAI Conference on Artificial Intelligence (AAAI)

  19. arXiv:2311.05373  [pdf]

    cs.HC

    What is prompt literacy? An exploratory study of language learners' development of new literacy skill using generative AI

    Authors: Yohan Hwang, Jang Ho Lee, Dongkwang Shin

    Abstract: In the current study, we propose that, in the era of generative AI, there is now a new form of literacy called "prompt literacy," which refers to the ability to generate precise prompts as input for AI systems, interpret the outputs, and iteratively refine prompts to achieve desired results. To explore the emergence and development of this literacy skill, the current study examined 30 EFL students'…

    Submitted 9 November, 2023; originally announced November 2023.

    Comments: 22 pages

  20. Visually Grounded Continual Language Learning with Selective Specialization

    Authors: Kyra Ahrens, Lennart Bengtson, Jae Hee Lee, Stefan Wermter

    Abstract: A desirable trait of an artificial agent acting in the visual world is to continually learn a sequence of language-informed tasks while striking a balance between sufficiently specializing in each task and building a generalized knowledge for transfer. Selective specialization, i.e., a careful selection of model components to specialize in each task, is a strategy to provide control over this trad…

    Submitted 24 October, 2023; originally announced October 2023.

    Comments: Accepted to EMNLP 2023 Findings

    Journal ref: Findings of the Association for Computational Linguistics: EMNLP 2023

  21. arXiv:2310.14119  [pdf]

    cs.RO physics.flu-dyn

    Accelerating Aquatic Soft Robots with Elastic Instability Effects

    Authors: Zechen Xiong, Suyu Luohong, Jeong Hun Lee, Hod Lipson

    Abstract: Sinusoidal undulation has long been considered the most successful swimming pattern for fish and bionic aquatic robots [1]. However, a swimming pattern generated by the hair clip mechanism (HCM, part iii, Figure 1A) [2]~[5] may challenge this knowledge. HCM is an in-plane prestressed bi-stable mechanism that stores elastic energy and releases the stored energy quickly via its snap-through buckling…

    Submitted 15 July, 2024; v1 submitted 21 October, 2023; originally announced October 2023.

  22. arXiv:2310.11884  [pdf, other]

    cs.AI cs.CL cs.CV cs.LG cs.NE

    From Neural Activations to Concepts: A Survey on Explaining Concepts in Neural Networks

    Authors: Jae Hee Lee, Sergio Lanza, Stefan Wermter

    Abstract: In this paper, we review recent approaches for explaining concepts in neural networks. Concepts can act as a natural link between learning and reasoning: once the concepts are identified that a neural learning system uses, one can integrate those concepts with a reasoning system for inference or use a reasoning system to act upon them to improve or enhance the learning system. On the other hand, k…

    Submitted 3 May, 2024; v1 submitted 18 October, 2023; originally announced October 2023.

    Comments: Accepted in Neurosymbolic Artificial Intelligence

  23. arXiv:2309.13339  [pdf, other]

    cs.CL cs.AI cs.LG cs.SC

    Enhancing Zero-Shot Chain-of-Thought Reasoning in Large Language Models through Logic

    Authors: Xufeng Zhao, Mengdi Li, Wenhao Lu, Cornelius Weber, Jae Hee Lee, Kun Chu, Stefan Wermter

    Abstract: Recent advancements in large language models have showcased their remarkable generalizability across various domains. However, their reasoning abilities still have significant room for improvement, especially when confronted with scenarios requiring multi-step reasoning. Although large language models possess extensive knowledge, their reasoning often fails to effectively utilize this knowledge to…

    Submitted 25 March, 2024; v1 submitted 23 September, 2023; originally announced September 2023.

    Comments: Accepted in COLING 2024. Code see https://github.com/xf-zhao/LoT

  24. arXiv:2309.02745  [pdf, other]

    cs.RO

    Learning Vehicle Dynamics from Cropped Image Patches for Robot Navigation in Unpaved Outdoor Terrains

    Authors: Jeong Hyun Lee, Jinhyeok Choi, Simo Ryu, Hyunsik Oh, Suyoung Choi, Jemin Hwangbo

    Abstract: In the realm of autonomous mobile robots, safe navigation through unpaved outdoor environments remains a challenging task. Due to the high-dimensional nature of sensor data, extracting relevant information becomes a complex problem, which hinders adequate perception and path planning. Previous works have shown promising performances in extracting global features from full-sized images. However, th…

    Submitted 6 September, 2023; originally announced September 2023.

    Comments: 8 pages, 10 figures

  25. arXiv:2309.01670  [pdf, other]

    q-bio.GN cs.LG

    Blind Biological Sequence Denoising with Self-Supervised Set Learning

    Authors: Nathan Ng, Ji Won Park, Jae Hyeon Lee, Ryan Lewis Kelly, Stephen Ra, Kyunghyun Cho

    Abstract: Biological sequence analysis relies on the ability to denoise the imprecise output of sequencing platforms. We consider a common setting where a short sequence is read out repeatedly using a high-throughput long-read platform to generate multiple subreads, or noisy observations of the same sequence. Denoising these subreads with alignment-based approaches often fails when too few subreads are avai…

    Submitted 4 September, 2023; originally announced September 2023.

  26. arXiv:2308.06979  [pdf, other]

    eess.AS cs.SD

    The Sound Demixing Challenge 2023 – Music Demixing Track

    Authors: Giorgio Fabbro, Stefan Uhlich, Chieh-Hsin Lai, Woosung Choi, Marco Martínez-Ramírez, Weihsiang Liao, Igor Gadelha, Geraldo Ramos, Eddie Hsu, Hugo Rodrigues, Fabian-Robert Stöter, Alexandre Défossez, Yi Luo, Jianwei Yu, Dipam Chakraborty, Sharada Mohanty, Roman Solovyev, Alexander Stempkovskiy, Tatiana Habruseva, Nabarun Goswami, Tatsuya Harada, Minseok Kim, Jun Hyung Lee, Yuanliang Dong, Xinran Zhang, et al. (2 additional authors not shown)

    Abstract: This paper summarizes the music demixing (MDX) track of the Sound Demixing Challenge (SDX'23). We provide a summary of the challenge setup and introduce the task of robust music source separation (MSS), i.e., training MSS models in the presence of errors in the training data. We propose a formalization of the errors that can occur in the design of a training dataset for MSS systems and introduce t…

    Submitted 19 April, 2024; v1 submitted 14 August, 2023; originally announced August 2023.

    Comments: Published in Transactions of the International Society for Music Information Retrieval (https://transactions.ismir.net/articles/10.5334/tismir.171)

    Journal ref: Transactions of the International Society for Music Information Retrieval, 7(1), pp.63-84, 2024

  27. arXiv:2307.03378  [pdf, other]

    cs.CL

    A Side-by-side Comparison of Transformers for English Implicit Discourse Relation Classification

    Authors: Bruce W. Lee, BongSeok Yang, Jason Hyung-Jong Lee

    Abstract: Though discourse parsing can help multiple NLP fields, there has been no wide language model search done on implicit discourse relation classification. This hinders researchers from fully utilizing public-available models in discourse analysis. This work is a straightforward, fine-tuned discourse performance comparison of seven pre-trained language models. We use PDTB-3, a popular discourse relati…

    Submitted 7 July, 2023; originally announced July 2023.

    Comments: TrustNLP @ ACL 2023

  28. arXiv:2306.11681  [pdf, other]

    cs.LG q-bio.QM

    MoleCLUEs: Molecular Conformers Maximally In-Distribution for Predictive Models

    Authors: Michael Maser, Natasa Tagasovska, Jae Hyeon Lee, Andrew Watkins

    Abstract: Structure-based molecular ML (SBML) models can be highly sensitive to input geometries and give predictions with large variance. We present an approach to mitigate the challenge of selecting conformations for such models by generating conformers that explicitly minimize predictive uncertainty. To achieve this, we compute estimates of aleatoric and epistemic uncertainties that are differentiable w.…

    Submitted 6 November, 2023; v1 submitted 20 June, 2023; originally announced June 2023.

    Comments: NeurIPS 2023 AI for Science Workshop

  29. arXiv:2306.09382  [pdf, ps, other]

    cs.SD cs.LG cs.MM eess.AS

    Sound Demixing Challenge 2023 Music Demixing Track Technical Report: TFC-TDF-UNet v3

    Authors: Minseok Kim, Jun Hyung Lee, Soonyoung Jung

    Abstract: In this report, we present our award-winning solutions for the Music Demixing Track of Sound Demixing Challenge 2023. First, we propose TFC-TDF-UNet v3, a time-efficient music source separation model that achieves state-of-the-art results on the MUSDB benchmark. We then give full details regarding our solutions for each Leaderboard, including a loss masking approach for noise-robust training. Code…

    Submitted 21 July, 2023; v1 submitted 15 June, 2023; originally announced June 2023.

    Comments: 5 pages, 4 tables

  30. arXiv:2306.03747  [pdf, other]

    cs.CV

    Towards Scalable Multi-View Reconstruction of Geometry and Materials

    Authors: Carolin Schmitt, Božidar Antić, Andrei Neculai, Joo Ho Lee, Andreas Geiger

    Abstract: In this paper, we propose a novel method for joint recovery of camera pose, object geometry and spatially-varying Bidirectional Reflectance Distribution Function (svBRDF) of 3D scenes that exceed object-scale and hence cannot be captured with stationary light stages. The input are high-resolution RGB-D images captured by a mobile, hand-held capture system with point lights for active illumination.…

    Submitted 6 June, 2023; originally announced June 2023.

  31. arXiv:2306.00317  [pdf, other]

    cs.LG cs.AI

    FlexRound: Learnable Rounding based on Element-wise Division for Post-Training Quantization

    Authors: Jung Hyun Lee, Jeonghoon Kim, Se Jung Kwon, Dongsoo Lee

    Abstract: Post-training quantization (PTQ) has been gaining popularity for the deployment of deep neural networks on resource-limited devices since unlike quantization-aware training, neither a full training dataset nor end-to-end training is required at all. As PTQ schemes based on reconstructing each layer or block output turn out to be effective to enhance quantized model performance, recent works have d…

    Submitted 16 July, 2024; v1 submitted 31 May, 2023; originally announced June 2023.

    Comments: Accepted to ICML 2023

  32. arXiv:2305.15878  [pdf, other]

    cs.CL cs.LG

    LFTK: Handcrafted Features in Computational Linguistics

    Authors: Bruce W. Lee, Jason Hyung-Jong Lee

    Abstract: Past research has identified a rich set of handcrafted linguistic features that can potentially assist various tasks. However, their extensive number makes it difficult to effectively select and utilize existing handcrafted features. Coupled with the problem of inconsistent implementation across research works, there has been no categorization scheme or generally-accepted feature names. This creat…

    Submitted 1 June, 2023; v1 submitted 25 May, 2023; originally announced May 2023.

    Comments: BEA @ ACL 2023

  33. arXiv:2305.14152  [pdf, other]

    cs.LG cs.AI

    Memory-Efficient Fine-Tuning of Compressed Large Language Models via sub-4-bit Integer Quantization

    Authors: Jeonghoon Kim, Jung Hyun Lee, Sungdong Kim, Joonsuk Park, Kang Min Yoo, Se Jung Kwon, Dongsoo Lee

    Abstract: Large language models (LLMs) face the challenges in fine-tuning and deployment due to their high memory demands and computational costs. While parameter-efficient fine-tuning (PEFT) methods aim to reduce the memory usage of the optimizer state during fine-tuning, the inherent size of pre-trained LLM weights continues to be a pressing concern. Even though quantization techniques are widely proposed…

    Submitted 28 October, 2023; v1 submitted 23 May, 2023; originally announced May 2023.

    Comments: Published at NeurIPS 2023. Camera-ready version

  34. arXiv:2305.10084  [pdf, other]

    cs.CV

    CWD30: A Comprehensive and Holistic Dataset for Crop Weed Recognition in Precision Agriculture

    Authors: Talha Ilyas, Dewa Made Sri Arsa, Khubaib Ahmad, Yong Chae Jeong, Okjae Won, Jong Hoon Lee, Hyongsuk Kim

    Abstract: The growing demand for precision agriculture necessitates efficient and accurate crop-weed recognition and classification systems. Current datasets often lack the sample size, diversity, and hierarchical structure needed to develop robust deep learning models for discriminating crops and weeds in agricultural fields. Moreover, the similar external structure and phenomics of crops and weeds complic…

    Submitted 17 May, 2023; originally announced May 2023.

    Comments: 15 pages, 14 figures, journal research article

  35. arXiv:2304.14082  [pdf, other]

    cs.LG cs.SE

    JaxPruner: A concise library for sparsity research

    Authors: Joo Hyung Lee, Wonpyo Park, Nicole Mitchell, Jonathan Pilault, Johan Obando-Ceron, Han-Byul Kim, Namhoon Lee, Elias Frantar, Yun Long, Amir Yazdanbakhsh, Shivani Agrawal, Suvinay Subramanian, Xin Wang, Sheng-Chun Kao, Xingyao Zhang, Trevor Gale, Aart Bik, Woohyun Han, Milen Ferev, Zhonglin Han, Hong-Seok Kim, Yann Dauphin, Gintare Karolina Dziugaite, Pablo Samuel Castro, Utku Evci

    Abstract: This paper introduces JaxPruner, an open-source JAX-based pruning and sparse training library for machine learning research. JaxPruner aims to accelerate research on sparse neural networks by providing concise implementations of popular pruning and sparse training algorithms with minimal memory and latency overhead. Algorithms implemented in JaxPruner use a common API and work seamlessly with the…

    Submitted 18 December, 2023; v1 submitted 27 April, 2023; originally announced April 2023.

    Comments: Jaxpruner is hosted at http://github.com/google-research/jaxpruner

  36. Knowledge Distillation for Feature Extraction in Underwater VSLAM

    Authors: Jinghe Yang, Mingming Gong, Girish Nair, Jung Hoon Lee, Jason Monty, Ye Pu

    Abstract: In recent years, learning-based feature detection and matching have outperformed manually-designed methods in in-air cases. However, it is challenging to learn the features in the underwater scenario due to the absence of annotated underwater datasets. This paper proposes a cross-modal knowledge distillation framework for training an underwater feature detection and matching network (UFEN). In par…

    Submitted 31 March, 2023; originally announced March 2023.

    Comments: Accepted by IEEE International Conference on Robotics and Automation (ICRA 2023), 6 pages

  37. arXiv:2303.06698  [pdf, ps, other]

    cs.LG cs.AI math.OC

    Branch & Learn with Post-hoc Correction for Predict+Optimize with Unknown Parameters in Constraints

    Authors: Xinyi Hu, Jasper C. H. Lee, Jimmy H. M. Lee

    Abstract: Combining machine learning and constrained optimization, Predict+Optimize tackles optimization problems containing parameters that are unknown at the time of solving. Prior works focus on cases with unknowns only in the objectives. A new framework was recently proposed to cater for unknowns also in constraints by introducing a loss function, called Post-hoc Regret, that takes into account the cost…

    Submitted 12 March, 2023; originally announced March 2023.

  38. arXiv:2302.13139  [pdf, other]

    cs.CL cs.AI

    Prompt-based Learning for Text Readability Assessment

    Authors: Bruce W. Lee, Jason Hyung-Jong Lee

    Abstract: We propose the novel adaptation of a pre-trained seq2seq model for readability assessment. We prove that a seq2seq model - T5 or BART - can be adapted to discern which text is more difficult from two given texts (pairwise). As an exploratory study to prompt-learn a neural network for text readability in a text-to-text manner, we report useful tips for future work in seq2seq training and ranking-ba…

    Submitted 16 June, 2024; v1 submitted 25 February, 2023; originally announced February 2023.

    Comments: EACL 2023

  39. arXiv:2302.07754  [pdf, other]

    cs.LG

    SupSiam: Non-contrastive Auxiliary Loss for Learning from Molecular Conformers

    Authors: Michael Maser, Ji Won Park, Joshua Yao-Yu Lin, Jae Hyeon Lee, Nathan C. Frey, Andrew Watkins

    Abstract: We investigate Siamese networks for learning related embeddings for augmented samples of molecular conformers. We find that a non-contrastive (positive-pair only) auxiliary task aids in supervised training of Euclidean neural networks (E3NNs) and increases manifold smoothness (MS) around point-cloud geometries. We demonstrate this property for multiple drug-activity prediction tasks while maintain…

    Submitted 15 February, 2023; originally announced February 2023.

    Comments: Submitted to the MLDD workshop, ICLR 2023

  40. arXiv:2302.00270  [pdf, other]

    cs.LG cs.AI

    Internally Rewarded Reinforcement Learning

    Authors: Mengdi Li, Xufeng Zhao, Jae Hee Lee, Cornelius Weber, Stefan Wermter

    Abstract: We study a class of reinforcement learning problems where the reward signals for policy learning are generated by an internal reward model that is dependent on and jointly optimized with the policy. This interdependence between the policy and the reward model leads to an unstable learning process because reward signals from an immature reward model are noisy and impede policy learning, and convers…

    Submitted 24 August, 2023; v1 submitted 1 February, 2023; originally announced February 2023.

    Comments: Accepted at ICML 2023. Update: adopt the term "reward model" instead of using "critic" to prevent confusion with the term "critic" in actor-critic algorithms. Project webpage at https://ir-rl.github.io

  41. arXiv:2301.07028  [pdf, other]

    cs.RO

    Aquarium: A Fully Differentiable Fluid-Structure Interaction Solver for Robotics Applications

    Authors: Jeong Hun Lee, Mike Y. Michelis, Robert Katzschmann, Zachary Manchester

    Abstract: We present Aquarium, a differentiable fluid-structure interaction solver for robotics that offers stable simulation, accurately coupled fluid-robot physics in two dimensions, and full differentiability with respect to fluid and robot states and parameters. Aquarium achieves stable simulation with accurate flow physics by directly integrating over the incompressible Navier-Stokes equations using a…

    Submitted 7 March, 2023; v1 submitted 17 January, 2023; originally announced January 2023.

    Comments: 8 pages, 7 figures, accepted to IEEE ICRA 2023

  42. arXiv:2301.03353  [pdf, other]

    cs.CL cs.AI cs.NE cs.RO

    Learning Bidirectional Action-Language Translation with Limited Supervision and Incongruent Input

    Authors: Ozan Özdemir, Matthias Kerzel, Cornelius Weber, Jae Hee Lee, Muhammad Burhan Hafez, Patrick Bruns, Stefan Wermter

    Abstract: Human infant learning happens during exploration of the environment, by interaction with objects, and by listening to and repeating utterances casually, which is analogous to unsupervised learning. Only occasionally, a learning infant would receive a matching verbal description of an action it is committing, which is similar to supervised learning. Such a learning mechanism can be mimicked with de…

    Submitted 22 February, 2023; v1 submitted 9 January, 2023; originally announced January 2023.

    Comments: Published in: Applied Artificial Intelligence, 37:1, 2179167

    Journal ref: Applied Artificial Intelligence Volume 37, 2023 - Issue 1

  43. arXiv:2301.02975  [pdf, other]

    cs.CL cs.LG

    Traditional Readability Formulas Compared for English

    Authors: Bruce W. Lee, Jason Hyung-Jong Lee

    Abstract: Traditional English readability formulas, or equations, were largely developed in the 20th century. Nonetheless, many researchers still rely on them for various NLP applications. This phenomenon is presumably due to the convenience and straightforwardness of readability formulas. In this work, we contribute to the NLP community by 1. introducing New English Readability Formula (NERF), 2. recalibra…

    Submitted 10 January, 2023; v1 submitted 7 January, 2023; originally announced January 2023.

    Comments: Submitted to EMNLP 2022

  44. arXiv:2212.07885  [pdf, other]

    cs.RO

    Data-Efficient Model Learning for Control with Jacobian-Regularized Dynamic-Mode Decomposition

    Authors: Brian E. Jackson, Jeong Hun Lee, Kevin Tracy, Zachary Manchester

    Abstract: We present a data-efficient algorithm for learning models for model-predictive control (MPC). Our approach, Jacobian-Regularized Dynamic-Mode Decomposition (JDMD), offers improved sample efficiency over traditional Koopman approaches based on Dynamic-Mode Decomposition (DMD) by leveraging Jacobian information from an approximate prior model of the system, and improved tracking performance over tra…

    Submitted 28 January, 2023; v1 submitted 25 October, 2022; originally announced December 2022.

    Journal ref: Conference on Robot Learning (CoRL) 2022, Auckland, New Zealand

  45. arXiv:2212.04614  [pdf, other]

    cs.LG

    Is Bio-Inspired Learning Better than Backprop? Benchmarking Bio Learning vs. Backprop

    Authors: Manas Gupta, Sarthak Ketanbhai Modi, Hang Zhang, Joon Hei Lee, Joo Hwee Lim

    Abstract: Bio-inspired learning has been gaining popularity recently given that Backpropagation (BP) is not considered biologically plausible. Many algorithms have been proposed in the literature which are all more biologically plausible than BP. However, apart from overcoming the biological implausibility of BP, a strong motivation for using Bio-inspired algorithms remains lacking. In this study, we undert…

    Submitted 30 August, 2023; v1 submitted 8 December, 2022; originally announced December 2022.

  46. arXiv:2212.04231  [pdf, other]

    cs.CV cs.CL

    Harnessing the Power of Multi-Task Pretraining for Ground-Truth Level Natural Language Explanations

    Authors: Björn Plüster, Jakob Ambsdorf, Lukas Braach, Jae Hee Lee, Stefan Wermter

    Abstract: Natural language explanations promise to offer intuitively understandable explanations of a neural network's decision process in complex vision-language tasks, as pursued in recent VL-NLE models. While current models offer impressive performance on task accuracy and explanation plausibility, they suffer from a range of issues: Some models feature a modular design where the explanation generation m…

    Submitted 29 March, 2023; v1 submitted 8 December, 2022; originally announced December 2022.

    Comments: Minor changes

  47. arXiv:2211.15566  [pdf, other]

    cs.AI

    Neuro-Symbolic Spatio-Temporal Reasoning

    Authors: Jae Hee Lee, Michael Sioutis, Kyra Ahrens, Marjan Alirezaie, Matthias Kerzel, Stefan Wermter

    Abstract: Knowledge about space and time is necessary to solve problems in the physical world: An AI agent situated in the physical world and interacting with objects often needs to reason about positions of and relations between objects; and as soon as the agent plans its actions to solve a task, it needs to consider the temporal aspect (e.g., what actions to perform over time). Spatio-temporal knowledge,…

    Submitted 13 January, 2023; v1 submitted 28 November, 2022; originally announced November 2022.

    Comments: Contribution to the book "A Compendium of Neuro-Symbolic Artificial Intelligence", which is to appear in the first half of 2023

  48. arXiv:2211.13866  [pdf, ps, other]

    stat.ML cs.LG

    Minimal Width for Universal Property of Deep RNN

    Authors: Chang hoon Song, Geonho Hwang, Jun ho Lee, Myungjoo Kang

    Abstract: A recurrent neural network (RNN) is a widely used deep-learning network for dealing with sequential data. Imitating a dynamical system, an infinite-width RNN can approximate any open dynamical system in a compact domain. In general, deep networks with bounded widths are more effective than wide networks in practice; however, the universal approximation theorem for deep narrow structures has yet to…

    Submitted 28 March, 2023; v1 submitted 24 November, 2022; originally announced November 2022.

  49. arXiv:2210.15415  [pdf, other]

    cs.NE

    Exact Gradient Computation for Spiking Neural Networks Through Forward Propagation

    Authors: Jane H. Lee, Saeid Haghighatshoar, Amin Karbasi

    Abstract: Spiking neural networks (SNN) have recently emerged as alternatives to traditional neural networks, owing to energy efficiency benefits and capacity to better capture biological neuronal mechanisms. However, the classic backpropagation algorithm for training traditional networks has been notoriously difficult to apply to SNN due to the hard-thresholding and discontinuities at spike times. Therefor…

    Submitted 10 March, 2023; v1 submitted 18 October, 2022; originally announced October 2022.

  50. arXiv:2209.03668  [pdf, other]

    cs.AI cs.LG math.OC

    Predict+Optimize for Packing and Covering LPs with Unknown Parameters in Constraints

    Authors: Xinyi Hu, Jasper C. H. Lee, Jimmy H. M. Lee

    Abstract: Predict+Optimize is a recently proposed framework which combines machine learning and constrained optimization, tackling optimization problems that contain parameters that are unknown at solving time. The goal is to predict the unknown parameters and use the estimates to solve for an estimated optimal solution to the optimization problem. However, all prior works have focused on the case where unk…

    Submitted 8 September, 2022; originally announced September 2022.