Showing 1–50 of 299 results for author: Lim, H

  1. arXiv:2407.08906  [pdf, other]

    cs.CV cs.AI cs.GR

    AirSketch: Generative Motion to Sketch

    Authors: Hui Xian Grace Lim, Xuanming Cui, Yogesh S Rawat, Ser-Nam Lim

    Abstract: Illustration is a fundamental mode of human expression and communication. Certain types of motion that accompany speech can provide this illustrative mode of communication. While Augmented and Virtual Reality technologies (AR/VR) have introduced tools for producing drawings with hand motions (air drawing), they typically require costly hardware and additional digital markers, thereby limiting thei…

    Submitted 11 July, 2024; originally announced July 2024.

  2. arXiv:2407.08795  [pdf, ps, other]

    cs.CV

    Feasibility of Neural Radiance Fields for Crime Scene Video Reconstruction

    Authors: Shariq Nadeem Malik, Min Hao Chee, Dayan Mario Anthony Perera, Chern Hong Lim

    Abstract: This paper aims to review and determine the feasibility of using variations of NeRF models in order to reconstruct crime scenes given input videos of the scene. We focus on three main innovations of NeRF when it comes to reconstructing crime scenes: Multi-object Synthesis, Deformable Synthesis, and Lighting. From there, we analyse its innovation progress against the requirements to be met in order…

    Submitted 11 July, 2024; originally announced July 2024.

    Comments: 4 pages, 1 table

  3. arXiv:2407.04903  [pdf, other]

    cs.CL cs.AI cs.CV

    MMSci: A Multimodal Multi-Discipline Dataset for PhD-Level Scientific Comprehension

    Authors: Zekun Li, Xianjun Yang, Kyuri Choi, Wanrong Zhu, Ryan Hsieh, HyeonJung Kim, Jin Hyuk Lim, Sungyoung Ji, Byungju Lee, Xifeng Yan, Linda Ruth Petzold, Stephen D. Wilson, Woosang Lim, William Yang Wang

    Abstract: The rapid advancement of Large Language Models (LLMs) and Large Multimodal Models (LMMs) has heightened the demand for AI-based scientific assistants capable of understanding scientific articles and figures. Despite progress, there remains a significant gap in evaluating models' comprehension of professional, graduate-level, and even PhD-level scientific content. Current datasets and benchmarks pr…

    Submitted 5 July, 2024; originally announced July 2024.

    Comments: Code and data are available at https://github.com/Leezekun/MMSci

  4. arXiv:2406.19634  [pdf, other]

    cs.RO

    CLOi-Mapper: Consistent, Lightweight, Robust, and Incremental Mapper With Embedded Systems for Commercial Robot Services

    Authors: DongKi Noh, Hyungtae Lim, Gyuho Eoh, Duckyu Choi, Jeongsik Choi, Hyunjun Lim, SeungMin Baek, Hyun Myung

    Abstract: In commercial autonomous service robots with several form factors, simultaneous localization and mapping (SLAM) is an essential technology for providing proper services such as cleaning and guidance. Such robots require SLAM algorithms suitable for specific applications and environments. Hence, several SLAM frameworks have been proposed to address various requirements in the past decade. However,…

    Submitted 27 June, 2024; originally announced June 2024.

    Journal ref: IEEE Robotics and Automation Letters, 2024

  5. arXiv:2406.18138  [pdf, other]

    cs.RO

    B-TMS: Bayesian Traversable Terrain Modeling and Segmentation Across 3D LiDAR Scans and Maps for Enhanced Off-Road Navigation

    Authors: Minho Oh, Gunhee Shin, Seoyeon Jang, Seungjae Lee, Dongkyu Lee, Wonho Song, Byeongho Yu, Hyungtae Lim, Jaeyoung Lee, Hyun Myung

    Abstract: Recognizing traversable terrain from 3D point cloud data is critical, as it directly impacts the performance of autonomous navigation in off-road environments. However, existing segmentation algorithms often struggle with challenges related to changes in data distribution, environmental specificity, and sensor variations. Moreover, when encountering sunken areas, their performance is frequently co…

    Submitted 26 June, 2024; originally announced June 2024.

    Comments: Accepted by IEEE IV'24 workshop on Off-road autonomy

  6. arXiv:2406.10809  [pdf, other]

    cs.CL cs.AI

    Post-hoc Utterance Refining Method by Entity Mining for Faithful Knowledge Grounded Conversations

    Authors: Yoonna Jang, Suhyune Son, Jeongwoo Lee, Junyoung Son, Yuna Hur, Jungwoo Lim, Hyeonseok Moon, Kisu Yang, Heuiseok Lim

    Abstract: Despite the striking advances in recent language generation performance, model-generated responses have suffered from the chronic problem of hallucinations that are either untrue or unfaithful to a given source. Especially in the task of knowledge grounded conversation, the models are required to generate informative responses, but hallucinated utterances lead to miscommunication. In particular, e…

    Submitted 16 June, 2024; originally announced June 2024.

    Comments: Accepted at EMNLP 2023

  7. arXiv:2406.03202  [pdf, other]

    cs.CL cs.AI

    ChatLang-8: An LLM-Based Synthetic Data Generation Framework for Grammatical Error Correction

    Authors: Jeiyoon Park, Chanjun Park, Heuiseok Lim

    Abstract: We explore and improve the capabilities of LLMs to generate data for grammatical error correction (GEC). When merely producing parallel sentences, their patterns are too simplistic to be valuable as a corpus. To address this issue, we propose an automated framework that includes a Subject Selector, Grammar Selector, Prompt Manager, and Evaluator. Additionally, we introduce a new dataset for GEC ta…

    Submitted 11 June, 2024; v1 submitted 5 June, 2024; originally announced June 2024.

    Comments: preprint

  8. arXiv:2406.02331  [pdf, other]

    cs.CL

    Translation Deserves Better: Analyzing Translation Artifacts in Cross-lingual Visual Question Answering

    Authors: ChaeHun Park, Koanho Lee, Hyesu Lim, Jaeseok Kim, Junmo Park, Yu-Jung Heo, Du-Seong Chang, Jaegul Choo

    Abstract: Building a reliable visual question answering (VQA) system across different languages is a challenging problem, primarily due to the lack of abundant samples for training. To address this challenge, recent studies have employed machine translation systems for the cross-lingual VQA task. This involves translating the evaluation samples into a source language (usually English) and using monolingual…

    Submitted 4 June, 2024; originally announced June 2024.

    Comments: ACL 2024 Findings Accepted

  9. arXiv:2405.19778  [pdf, other]

    cs.CL cs.AI

    Enhancing Consistency and Role-Specific Knowledge Capturing by Rebuilding Fictional Character's Persona

    Authors: Jeiyoon Park, Chanjun Park, Heuiseok Lim

    Abstract: With the recent introduction of Assistants API, it is expected that document-based language models will be actively used in various domains, especially Role-playing. However, a key challenge lies in utilizing protagonist's persona: Assistants API often fails to achieve with its search because the information extraction part is different each time and it often omits important information such as pr…

    Submitted 4 June, 2024; v1 submitted 30 May, 2024; originally announced May 2024.

    Comments: preprint

  10. arXiv:2405.19691  [pdf, other]

    cs.HC

    Designing Prompt Analytics Dashboards to Analyze Student-ChatGPT Interactions in EFL Writing

    Authors: Minsun Kim, SeonGyeom Kim, Suyoun Lee, Yoosang Yoon, Junho Myung, Haneul Yoo, Hyungseung Lim, Jieun Han, Yoonsu Kim, So-Yeon Ahn, Juho Kim, Alice Oh, Hwajung Hong, Tak Yeon Lee

    Abstract: While ChatGPT has significantly impacted education by offering personalized resources for students, its integration into educational settings poses unprecedented risks, such as inaccuracies and biases in AI-generated content, plagiarism and over-reliance on AI, and privacy and security issues. To help teachers address such risks, we conducted a two-phase iterative design process that comprises sur…

    Submitted 30 May, 2024; originally announced May 2024.

  11. arXiv:2405.16496  [pdf, other]

    cs.CV cs.AI cs.LG

    Exploring a Multimodal Fusion-based Deep Learning Network for Detecting Facial Palsy

    Authors: Nicole Heng Yim Oo, Min Hun Lee, Jeong Hoon Lim

    Abstract: Algorithmic detection of facial palsy offers the potential to improve current practices, which usually involve labor-intensive and subjective assessment by clinicians. In this paper, we present a multimodal fusion-based deep learning model that utilizes unstructured data (i.e. an image frame with facial line segments) and structured data (i.e. features of facial expressions) to detect facial palsy…

    Submitted 26 May, 2024; originally announced May 2024.

  12. arXiv:2405.16305  [pdf, other]

    cs.LG

    Efficiently Parameterized Neural Metriplectic Systems

    Authors: Anthony Gruber, Kookjin Lee, Haksoo Lim, Noseong Park, Nathaniel Trask

    Abstract: Metriplectic systems are learned from data in a way that scales quadratically in both the size of the state and the rank of the metriplectic data. Besides being provably energy conserving and entropy stable, the proposed approach comes with approximation results demonstrating its ability to accurately learn metriplectic dynamics from data as well as an error estimate indicating its potential for g…

    Submitted 28 May, 2024; v1 submitted 25 May, 2024; originally announced May 2024.

  13. arXiv:2405.14078  [pdf, ps, other]

    cs.AI cs.LG cs.MA

    A finite time analysis of distributed Q-learning

    Authors: Han-Dong Lim, Donghwan Lee

    Abstract: Multi-agent reinforcement learning (MARL) has witnessed a remarkable surge in interest, fueled by the empirical success achieved in applications of single-agent reinforcement learning (RL). In this study, we consider a distributed Q-learning scenario, wherein a number of agents cooperatively solve a sequential decision making problem without access to the central reward function which is an averag…

    Submitted 22 May, 2024; originally announced May 2024.

  14. arXiv:2405.12538  [pdf, other]

    cs.CV

    Bridging the Intent Gap: Knowledge-Enhanced Visual Generation

    Authors: Yi Cheng, Ziwei Xu, Dongyun Lin, Harry Cheng, Yongkang Wong, Ying Sun, Joo Hwee Lim, Mohan Kankanhalli

    Abstract: For visual content generation, discrepancies between user intentions and the generated content have been a longstanding problem. This discrepancy arises from two main factors. First, user intentions are inherently complex, with subtle details not fully captured by input prompts. The absence of such details makes it challenging for generative models to accurately reflect the intended meaning, leadi…

    Submitted 21 May, 2024; originally announced May 2024.

  15. arXiv:2405.11176  [pdf, other]

    cs.RO cs.CV

    Outlier-Robust Long-Term Robotic Mapping Leveraging Ground Segmentation

    Authors: Hyungtae Lim

    Abstract: Despite the remarkable advancements in deep learning-based perception technologies and simultaneous localization and mapping (SLAM), one can face the failure of these approaches when robots encounter scenarios outside their modeled experiences (here, the term modeling encompasses both conventional pattern finding and data-driven approaches). In particular, because learning-based methods are prone…

    Submitted 27 May, 2024; v1 submitted 18 May, 2024; originally announced May 2024.

    Comments: RSS Pioneers 2024 Research Statement

  16. arXiv:2405.06665  [pdf, other]

    cs.CL cs.IR cs.LG

    Enhancing Language Models for Financial Relation Extraction with Named Entities and Part-of-Speech

    Authors: Menglin Li, Kwan Hui Lim

    Abstract: The Financial Relation Extraction (FinRE) task involves identifying the entities and their relation, given a piece of financial statement/text. To solve this FinRE problem, we propose a simple but effective strategy that improves the performance of pre-trained language models by augmenting them with Named Entity Recognition (NER) and Part-Of-Speech (POS), as well as different approaches to combine…

    Submitted 2 May, 2024; originally announced May 2024.

    Comments: Accepted to ICLR 2024 Tiny Paper Track

  17. Towards Precise Observations of Neural Model Robustness in Classification

    Authors: Wenchuan Mu, Kwan Hui Lim

    Abstract: In deep learning applications, robustness measures the ability of neural models that handle slight changes in input data, which could lead to potential safety hazards, especially in safety-critical applications. Pre-deployment assessment of model robustness is essential, but existing methods often suffer from either high costs or imprecise results. To enhance safety in real-world scenarios, metric…

    Submitted 25 April, 2024; originally announced April 2024.

  18. arXiv:2404.16411  [pdf, other]

    cs.AI

    Label-Free Topic-Focused Summarization Using Query Augmentation

    Authors: Wenchuan Mu, Kwan Hui Lim

    Abstract: In today's data and information-rich world, summarization techniques are essential in harnessing vast text to extract key information and enhance decision-making and efficiency. In particular, topic-focused summarization is important due to its ability to tailor content to specific aspects of an extended text. However, this usually requires extensive labelled datasets and considerable computationa…

    Submitted 25 April, 2024; originally announced April 2024.

  19. arXiv:2404.16257  [pdf, other]

    cs.CL cs.AI

    Translation of Multifaceted Data without Re-Training of Machine Translation Systems

    Authors: Hyeonseok Moon, Seungyoon Lee, Seongtae Hong, Seungjun Lee, Chanjun Park, Heuiseok Lim

    Abstract: Translating major language resources to build minor language resources becomes a widely-used approach. Particularly in translating complex data points composed of multiple components, it is common to translate each component separately. However, we argue that this practice often overlooks the interrelation between components within the same data point. To address this limitation, we propose a nove…

    Submitted 24 April, 2024; originally announced April 2024.

    Comments: 19 pages

  20. arXiv:2404.12980  [pdf, other]

    cs.HC

    Ring-a-Pose: A Ring for Continuous Hand Pose Tracking

    Authors: Tianhong Catherine Yu, Guilin Hu, Ruidong Zhang, Hyunchul Lim, Saif Mahmud, Chi-Jung Lee, Ke Li, Devansh Agarwal, Shuyang Nie, Jinseok Oh, François Guimbretière, Cheng Zhang

    Abstract: We present Ring-a-Pose, a single untethered ring that tracks continuous 3D hand poses. Located in the center of the hand, the ring emits an inaudible acoustic signal that each hand pose reflects differently. Ring-a-Pose imposes minimal obtrusions on the hand, unlike multi-ring or glove systems. It is not affected by the choice of clothing that may cover wrist-worn systems. In a series of three use…

    Submitted 19 April, 2024; originally announced April 2024.

  21. arXiv:2404.10633  [pdf, other]

    cs.CV

    Contextrast: Contextual Contrastive Learning for Semantic Segmentation

    Authors: Changki Sung, Wanhee Kim, Jungho An, Wooju Lee, Hyungtae Lim, Hyun Myung

    Abstract: Despite great improvements in semantic segmentation, challenges persist because of the lack of local/global contexts and the relationship between them. In this paper, we propose Contextrast, a contrastive learning-based semantic segmentation method that allows to capture local/global contexts and comprehend their relationships. Our proposed method comprises two parts: a) contextual contrastive lea…

    Submitted 16 April, 2024; originally announced April 2024.

  22. arXiv:2404.08662  [pdf, other]

    cs.IR cs.LG cs.SI

    FewUser: Few-Shot Social User Geolocation via Contrastive Learning

    Authors: Menglin Li, Kwan Hui Lim

    Abstract: To address the challenges of scarcity in geotagged data for social user geolocation, we propose FewUser, a novel framework for Few-shot social User geolocation. We incorporate a contrastive learning strategy between users and locations to improve geolocation performance with no or limited training data. FewUser features a user representation module that harnesses a pre-trained language model (PLM)…

    Submitted 28 March, 2024; originally announced April 2024.

    Comments: 17 pages, 3 figures, 8 tables, submitted to ECML-PKDD 2024 for review

  23. arXiv:2404.01954  [pdf, other]

    cs.CL cs.AI

    HyperCLOVA X Technical Report

    Authors: Kang Min Yoo, Jaegeun Han, Sookyo In, Heewon Jeon, Jisu Jeong, Jaewook Kang, Hyunwook Kim, Kyung-Min Kim, Munhyong Kim, Sungju Kim, Donghyun Kwak, Hanock Kwak, Se Jung Kwon, Bado Lee, Dongsoo Lee, Gichang Lee, Jooho Lee, Baeseong Park, Seongjin Shin, Joonsang Yu, Seolki Baek, Sumin Byeon, Eungsup Cho, Dooseok Choe, Jeesung Han, et al. (371 additional authors not shown)

    Abstract: We introduce HyperCLOVA X, a family of large language models (LLMs) tailored to the Korean language and culture, along with competitive capabilities in English, math, and coding. HyperCLOVA X was trained on a balanced mix of Korean, English, and code data, followed by instruction-tuning with high-quality human-annotated datasets while abiding by strict safety guidelines reflecting our commitment t…

    Submitted 13 April, 2024; v1 submitted 2 April, 2024; originally announced April 2024.

    Comments: 44 pages; updated authors list and fixed author names

  24. arXiv:2403.13872  [pdf, other]

    cs.LG cs.SI

    Spatial-Temporal Graph Representation Learning for Tactical Networks Future State Prediction

    Authors: Junhua Liu, Justin Albrethsen, Lincoln Goh, David Yau, Kwan Hui Lim

    Abstract: Resource allocation in tactical ad-hoc networks presents unique challenges due to their dynamic and multi-hop nature. Accurate prediction of future network connectivity is essential for effective resource allocation in such environments. In this paper, we introduce the Spatial-Temporal Graph Encoder-Decoder (STGED) framework for Tactical Communication Networks that leverages both spatial and tempo…

    Submitted 14 July, 2024; v1 submitted 20 March, 2024; originally announced March 2024.

  25. arXiv:2403.11399  [pdf, other]

    cs.CL

    X-LLaVA: Optimizing Bilingual Large Vision-Language Alignment

    Authors: Dongjae Shin, Hyeonseok Lim, Inho Won, Changsu Choi, Minjun Kim, Seungwoo Song, Hangyeol Yoo, Sangmin Kim, Kyungtae Lim

    Abstract: The impressive development of large language models (LLMs) is expanding into the realm of large multimodal models (LMMs), which incorporate multiple types of data beyond text. However, the nature of multimodal models leads to significant expenses in the creation of training data. Furthermore, constructing multilingual data for LMMs presents its own set of challenges due to language diversity and c…

    Submitted 1 April, 2024; v1 submitted 17 March, 2024; originally announced March 2024.

  26. arXiv:2403.10882  [pdf, other]

    cs.CL cs.AI

    Optimizing Language Augmentation for Multilingual Large Language Models: A Case Study on Korean

    Authors: ChangSu Choi, Yongbin Jeong, Seoyoon Park, InHo Won, HyeonSeok Lim, SangMin Kim, Yejee Kang, Chanhyuk Yoon, Jaewan Park, Yiseul Lee, HyeJin Lee, Younggyun Hahm, Hansaem Kim, KyungTae Lim

    Abstract: Large language models (LLMs) use pretraining to predict the subsequent word; however, their expansion requires significant computing resources. Numerous big tech companies and research institutes have developed multilingual LLMs (MLLMs) to meet current demands, overlooking less-resourced languages (LRLs). This study proposed three strategies to enhance the performance of LRLs based on the publicly…

    Submitted 21 March, 2024; v1 submitted 16 March, 2024; originally announced March 2024.

  27. arXiv:2403.09437  [pdf, other]

    cs.CV

    Improving Real-Time Omnidirectional 3D Multi-Person Human Pose Estimation with People Matching and Unsupervised 2D-3D Lifting

    Authors: Pawel Knap, Peter Hardy, Alberto Tamajo, Hwasup Lim, Hansung Kim

    Abstract: Current human pose estimation systems focus on retrieving an accurate 3D global estimate of a single person. Therefore, this paper presents one of the first 3D multi-person human pose estimation systems that is able to work in real-time and is also able to handle basic forms of occlusion. First, we adjust an off-the-shelf 2D detector and an unsupervised 2D-3D lifting model for use with a 360…

    Submitted 14 March, 2024; originally announced March 2024.

  28. arXiv:2403.06394  [pdf, other]

    cs.CV

    FSViewFusion: Few-Shots View Generation of Novel Objects

    Authors: Rukhshanda Hussain, Hui Xian Grace Lim, Borchun Chen, Mubarak Shah, Ser Nam Lim

    Abstract: Novel view synthesis has observed tremendous developments since the arrival of NeRFs. However, NeRF models overfit on a single scene, lacking generalization to out of distribution objects. Recently, diffusion models have exhibited remarkable performance on introducing generalization in view synthesis. Inspired by these advancements, we explore the capabilities of a pretrained stable diffusion mode…

    Submitted 12 March, 2024; v1 submitted 10 March, 2024; originally announced March 2024.

  29. arXiv:2403.05530  [pdf, other]

    cs.CL cs.AI

    Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context

    Authors: Gemini Team, Petko Georgiev, Ving Ian Lei, Ryan Burnell, Libin Bai, Anmol Gulati, Garrett Tanzer, Damien Vincent, Zhufeng Pan, Shibo Wang, Soroosh Mariooryad, Yifan Ding, Xinyang Geng, Fred Alcober, Roy Frostig, Mark Omernick, Lexi Walker, Cosmin Paduraru, Christina Sorokin, Andrea Tacchetti, Colin Gaffney, Samira Daruki, Olcan Sercinoglu, Zach Gleicher, Juliette Love, et al. (1092 additional authors not shown)

    Abstract: In this report, we introduce the Gemini 1.5 family of models, representing the next generation of highly compute-efficient multimodal models capable of recalling and reasoning over fine-grained information from millions of tokens of context, including multiple long documents and hours of video and audio. The family includes two new models: (1) an updated Gemini 1.5 Pro, which exceeds the February…

    Submitted 14 June, 2024; v1 submitted 8 March, 2024; originally announced March 2024.

  30. arXiv:2403.02253  [pdf, other]

    cs.CR cs.AI cs.CL cs.LG

    KnowPhish: Large Language Models Meet Multimodal Knowledge Graphs for Enhancing Reference-Based Phishing Detection

    Authors: Yuexin Li, Chengyu Huang, Shumin Deng, Mei Lin Lock, Tri Cao, Nay Oo, Hoon Wei Lim, Bryan Hooi

    Abstract: Phishing attacks have inflicted substantial losses on individuals and businesses alike, necessitating the development of robust and efficient automated phishing detection approaches. Reference-based phishing detectors (RBPDs), which compare the logos on a target webpage to a known set of logos, have emerged as the state-of-the-art approach. However, a major limitation of existing RBPDs is that the…

    Submitted 15 June, 2024; v1 submitted 4 March, 2024; originally announced March 2024.

    Comments: Accepted by USENIX Security 2024

  31. arXiv:2403.00786  [pdf, other]

    cs.IR cs.SI

    Leveraging Contrastive Learning for Few-shot Geolocation of Social Posts

    Authors: Menglin Li, Kwan Hui Lim

    Abstract: Social geolocation is an important problem of predicting the originating locations of social media posts. However, this task is challenging due to the need for a substantial volume of training data, alongside well-annotated labels. These issues are further exacerbated by new or less popular locations with insufficient labels, further leading to an imbalanced dataset. In this paper, we propose \tex…

    Submitted 19 February, 2024; originally announced March 2024.

    Comments: This paper contains 7-page main content and 2-page references and was submitted to IJCAI2024 for review

  32. arXiv:2402.13562  [pdf, other]

    cs.CL

    Analysis of Multi-Source Language Training in Cross-Lingual Transfer

    Authors: Seong Hoon Lim, Taejun Yun, Jinhyeon Kim, Jihun Choi, Taeuk Kim

    Abstract: The successful adaptation of multilingual language models (LMs) to a specific language-task pair critically depends on the availability of data tailored for that condition. While cross-lingual transfer (XLT) methods have contributed to addressing this data scarcity problem, there still exists ongoing debate about the mechanisms behind their effectiveness. In this work, we focus on one of promising…

    Submitted 4 June, 2024; v1 submitted 21 February, 2024; originally announced February 2024.

    Comments: Accepted to ACL 2024

  33. arXiv:2402.11877  [pdf, other]

    cs.LG cs.AI

    Finite-Time Error Analysis of Online Model-Based Q-Learning with a Relaxed Sampling Model

    Authors: Han-Dong Lim, HyeAnn Lee, Donghwan Lee

    Abstract: Reinforcement learning has witnessed significant advancements, particularly with the emergence of model-based approaches. Among these, $Q$-learning has proven to be a powerful algorithm in model-free settings. However, the extension of $Q$-learning to a model-based framework remains relatively unexplored. In this paper, we delve into the sample complexity of $Q$-learning when integrated with a mod…

    Submitted 19 February, 2024; originally announced February 2024.

  34. Exploring the Effects of Population and Employment Characteristics on Truck Flows: An Analysis of NextGen NHTS Origin-Destination Data

    Authors: Majbah Uddin, Yuandong Liu, Hyeonsup Lim

    Abstract: Truck transportation remains the dominant mode of US freight transportation because of its advantages, such as the flexibility of accessing pickup and drop-off points and faster delivery. Because of the massive freight volume transported by trucks, understanding the effects of population and employment characteristics on truck flows is critical for better transportation planning and investment dec…

    Submitted 2 February, 2024; originally announced February 2024.

    Journal ref: In International Conference on Transportation and Development 2023 (pp. 503-513)

  35. Improving the accuracy of freight mode choice models: A case study using the 2017 CFS PUF data set and ensemble learning techniques

    Authors: Diyi Liu, Hyeonsup Lim, Majbah Uddin, Yuandong Liu, Lee D. Han, Ho-ling Hwang, Shih-Miao Chin

    Abstract: The US Census Bureau has collected two rounds of experimental data from the Commodity Flow Survey, providing shipment-level characteristics of nationwide commodity movements, published in 2012 (i.e., Public Use Microdata) and in 2017 (i.e., Public Use File). With this information, data-driven methods have become increasingly valuable for understanding detailed patterns in freight logistics. In thi…

    Submitted 12 February, 2024; v1 submitted 1 February, 2024; originally announced February 2024.

    Journal ref: Expert Systems with Applications, 240, 122478 (2024)

  36. arXiv:2401.14625  [pdf, ps, other]

    cs.CL

    Toward Practical Automatic Speech Recognition and Post-Processing: a Call for Explainable Error Benchmark Guideline

    Authors: Seonmin Koo, Chanjun Park, Jinsung Kim, Jaehyung Seo, Sugyeong Eo, Hyeonseok Moon, Heuiseok Lim

    Abstract: Automatic speech recognition (ASR) outcomes serve as input for downstream tasks, substantially impacting the satisfaction level of end-users. Hence, the diagnosis and enhancement of the vulnerabilities present in the ASR model bear significant importance. However, traditional evaluation methodologies of ASR systems generate a singular, composite quantitative metric, which fails to provide comprehe…

    Submitted 25 January, 2024; originally announced January 2024.

    Comments: Accepted for Data-centric Machine Learning Research (DMLR) Workshop at ICML 2023

  37. arXiv:2401.14616  [pdf, other]

    cs.CL cs.AI

    Alternative Speech: Complementary Method to Counter-Narrative for Better Discourse

    Authors: Seungyoon Lee, Dahyun Jung, Chanjun Park, Seolhwa Lee, Heuiseok Lim

    Abstract: We introduce the concept of "Alternative Speech" as a new way to directly combat hate speech and complement the limitations of counter-narrative. An alternative speech provides practical alternatives to hate speech in real-world scenarios by offering speech-level corrections to speakers while considering the surrounding context and promoting speakers to reform. Further, an alternative speech can c…

    Submitted 25 January, 2024; originally announced January 2024.

    Comments: Accepted for The First Workshop on Data-Centric AI (DCAI) at ICDM 2023

  38. arXiv:2401.12987  [pdf, other]

    cs.CL cs.LG cs.SD eess.AS

    TelME: Teacher-leading Multimodal Fusion Network for Emotion Recognition in Conversation

    Authors: Taeyang Yun, Hyunkuk Lim, Jeonghwan Lee, Min Song

    Abstract: Emotion Recognition in Conversation (ERC) plays a crucial role in enabling dialogue systems to effectively respond to user requests. The emotions in a conversation can be identified by the representations from various modalities, such as audio, visual, and text. However, due to the weak contribution of non-verbal modalities to recognize emotions, multimodal ERC has always been considered a challen…

    Submitted 31 March, 2024; v1 submitted 16 January, 2024; originally announced January 2024.

    Comments: NAACL 2024 main conference

  39. arXiv:2401.12499  [pdf, ps, other]

    cs.IT eess.SP

    On the Fundamental Tradeoff of Joint Communication and Quickest Change Detection

    Authors: Daewon Seo, Sung Hoon Lim

    Abstract: In this work, we take the initiative in studying the fundamental tradeoff between communication and quickest change detection (QCD) under an integrated sensing and communication setting. We formally establish a joint communication and sensing problem for quickest change detection. Then, by utilizing constant subblock-composition codes and a modified QuSum detection rule, which we call subblock QuS…

    Submitted 23 January, 2024; originally announced January 2024.

  40. arXiv:2312.16839  [pdf, other]

    cs.RO

    Similar but Different: A Survey of Ground Segmentation and Traversability Estimation for Terrestrial Robots

    Authors: Hyungtae Lim, Minho Oh, Seungjae Lee, Seunguk Ahn, Hyun Myung

    Abstract: With the increasing demand for mobile robots and autonomous vehicles, several approaches for long-term robot navigation have been proposed. Among these techniques, ground segmentation and traversability estimation play important roles in perception and path planning, respectively. Even though these two techniques appear similar, their objectives are different. Ground segmentation divides data into…

    Submitted 2 January, 2024; v1 submitted 28 December, 2023; originally announced December 2023.

    Comments: 10 pages, 8 figures

  41. arXiv:2312.12133  [pdf, other]

    cs.CV cs.LG

    Object-Aware Domain Generalization for Object Detection

    Authors: Wooju Lee, Dasol Hong, Hyungtae Lim, Hyun Myung

    Abstract: Single-domain generalization (S-DG) aims to generalize a model to unseen environments with a single-source domain. However, most S-DG approaches have been conducted in the field of classification. When these approaches are applied to object detection, the semantic features of some objects can be damaged, which can lead to imprecise object localization and misclassification. To address these proble…

    Submitted 19 December, 2023; originally announced December 2023.

    Comments: Accepted by AAAI-24. The first two authors contributed equally

  42. arXiv:2312.11805  [pdf, other]

    cs.CL cs.AI cs.CV

    Gemini: A Family of Highly Capable Multimodal Models

    Authors: Gemini Team, Rohan Anil, Sebastian Borgeaud, Jean-Baptiste Alayrac, Jiahui Yu, Radu Soricut, Johan Schalkwyk, Andrew M. Dai, Anja Hauth, Katie Millican, David Silver, Melvin Johnson, Ioannis Antonoglou, Julian Schrittwieser, Amelia Glaese, Jilin Chen, Emily Pitler, Timothy Lillicrap, Angeliki Lazaridou, Orhan Firat, James Molloy, Michael Isard, Paul R. Barham, Tom Hennigan, Benjamin Lee, et al. (1325 additional authors not shown)

    Abstract: This report introduces a new family of multimodal models, Gemini, that exhibit remarkable capabilities across image, audio, video, and text understanding. The Gemini family consists of Ultra, Pro, and Nano sizes, suitable for applications ranging from complex reasoning tasks to on-device memory-constrained use-cases. Evaluation on a broad range of benchmarks shows that our most-capable Gemini Ultr…

    Submitted 17 June, 2024; v1 submitted 18 December, 2023; originally announced December 2023.

  43. arXiv:2311.12355  [pdf, other]

    cs.IR cs.CL cs.LG

    Utilizing Language Models for Tour Itinerary Recommendation

    Authors: Ngai Lam Ho, Kwan Hui Lim

    Abstract: Tour itinerary recommendation involves planning a sequence of relevant Point-of-Interest (POIs), which combines challenges from the fields of both Operations Research (OR) and Recommendation Systems (RS). As an OR problem, there is the need to maximize a certain utility (e.g., popularity of POIs in the tour) while adhering to some constraints (e.g., maximum time for the tour). As a RS problem, it…

    Submitted 21 November, 2023; originally announced November 2023.

    Comments: PMAI23 @IJCAI 2023 2nd International Workshop on Process Management in the AI era

  44. arXiv:2311.11071  [pdf, other]

    cs.IR cs.AI cs.LG cs.SI

    SBTRec- A Transformer Framework for Personalized Tour Recommendation Problem with Sentiment Analysis

    Authors: Ngai Lam Ho, Roy Ka-Wei Lee, Kwan Hui Lim

    Abstract: When traveling to an unfamiliar city for holidays, tourists often rely on guidebooks, travel websites, or recommendation systems to plan their daily itineraries and explore popular points of interest (POIs). However, these approaches may lack optimization in terms of time feasibility, localities, and user preferences. In this paper, we propose the SBTRec algorithm: a BERT-based Trajectory Recommen…

    Submitted 18 November, 2023; originally announced November 2023.

    Report number: 01

  45. arXiv:2311.04522  [pdf, other]

    cs.LG

    Long-term Time Series Forecasting based on Decomposition and Neural Ordinary Differential Equations

    Authors: Seonkyu Lim, Jaehyeon Park, Seojin Kim, Hyowon Wi, Haksoo Lim, Jinsung Jeon, Jeongwhan Choi, Noseong Park

    Abstract: Long-term time series forecasting (LTSF) is a challenging task that has been investigated in various domains such as finance investment, health care, traffic, and weather forecasting. In recent years, Linear-based LTSF models showed better performance, pointing out the problem of Transformer-based approaches causing temporal information loss. However, Linear-based approach has also limitations tha…

    Submitted 10 November, 2023; v1 submitted 8 November, 2023; originally announced November 2023.

    Comments: Accepted at IEEE BigData 2023

  46. arXiv:2311.01723  [pdf, other]

    cs.CV cs.AI

    Towards Calibrated Robust Fine-Tuning of Vision-Language Models

    Authors: Changdae Oh, Hyesu Lim, Mijoo Kim, Dongyoon Han, Sangdoo Yun, Jaegul Choo, Alexander Hauptmann, Zhi-Qi Cheng, Kyungwoo Song

    Abstract: Improving out-of-distribution (OOD) generalization through in-distribution (ID) adaptation is a primary goal of robust fine-tuning methods beyond the naive fine-tuning approach. However, despite decent OOD generalization performance from recent robust fine-tuning methods, OOD confidence calibration for reliable machine learning has not been fully addressed. This work proposes a robust fine-tuning…

    Submitted 27 May, 2024; v1 submitted 3 November, 2023; originally announced November 2023.

    Comments: Presented at the NeurIPS 2023 Workshop on Distribution Shifts (DistShift)

  47. arXiv:2311.00928  [pdf, other]

    cs.RO

    Quatro++: Robust Global Registration Exploiting Ground Segmentation for Loop Closing in LiDAR SLAM

    Authors: Hyungtae Lim, Beomsoo Kim, Daebeom Kim, Eungchang Mason Lee, Hyun Myung

    Abstract: Global registration is a fundamental task that estimates the relative pose between two viewpoints of 3D point clouds. However, there are two issues that degrade the performance of global registration in LiDAR SLAM: one is the sparsity issue and the other is degeneracy. The sparsity issue is caused by the sparse characteristics of the 3D point cloud measurements in a mechanically spinning LiDAR sen…

    Submitted 21 January, 2024; v1 submitted 1 November, 2023; originally announced November 2023.

    Comments: 26 pages, 23 figures

  48. arXiv:2310.19886  [pdf]

    cs.LG cs.IR cs.SI

    BTRec: BERT-Based Trajectory Recommendation for Personalized Tours

    Authors: Ngai Lam Ho, Roy Ka-Wei Lee, Kwan Hui Lim

    Abstract: An essential task for tourists having a pleasant holiday is to have a well-planned itinerary with relevant recommendations, especially when visiting unfamiliar cities. Many tour recommendation tools only take into account a limited number of factors, such as popular Points of Interest (POIs) and routing constraints. Consequently, the solutions they provide may not always align with the individual…

    Submitted 30 October, 2023; originally announced October 2023.

    Comments: RecSys 2023, Workshop on Recommenders in Tourism

  49. arXiv:2310.17166  [pdf, other]

    cs.CL

    X-SNS: Cross-Lingual Transfer Prediction through Sub-Network Similarity

    Authors: Taejun Yun, Jinhyeon Kim, Deokyeong Kang, Seong Hoon Lim, Jihoon Kim, Taeuk Kim

    Abstract: Cross-lingual transfer (XLT) is an emergent ability of multilingual language models that preserves their performance on a task to a significant extent when evaluated in languages that were not included in the fine-tuning process. While English, due to its widespread usage, is typically regarded as the primary language for model adaption in various tasks, recent studies have revealed that the effic…

    Submitted 26 October, 2023; originally announced October 2023.

    Comments: Accepted to EMNLP 2023 (Findings)

  50. arXiv:2310.05191  [pdf, other]

    cs.CL

    FABRIC: Automated Scoring and Feedback Generation for Essays

    Authors: Jieun Han, Haneul Yoo, Junho Myung, Minsun Kim, Hyunseung Lim, Yoonsu Kim, Tak Yeon Lee, Hwajung Hong, Juho Kim, So-Yeon Ahn, Alice Oh

    Abstract: Automated essay scoring (AES) provides a useful tool for students and instructors in writing classes by generating essay scores in real-time. However, previous AES models do not provide more specific rubric-based scores nor feedback on how to improve the essays, which can be even more important than the overall scores for learning. We present FABRIC, a pipeline to help students and instructors in…

    Submitted 8 October, 2023; originally announced October 2023.