Showing 1–50 of 68 results for author: Jo, Y

  1. arXiv:2407.07110  [pdf, other]

    cs.LG cs.AI eess.SP

    Foundation Models for Electrocardiograms

    Authors: Junho Song, Jong-Hwan Jang, Byeong Tak Lee, DongGyun Hong, Joon-myoung Kwon, Yong-Yeon Jo

    Abstract: Foundation models, enhanced by self-supervised learning (SSL) techniques, represent a cutting-edge frontier in biomedical signal analysis, particularly for electrocardiograms (ECGs), crucial for cardiac health monitoring and diagnosis. This study conducts a comprehensive analysis of foundation models for ECGs by employing and refining innovative SSL methodologies - namely, generative and contrasti…

    Submitted 25 June, 2024; originally announced July 2024.

    Comments: 27 pages

  2. arXiv:2406.13144  [pdf, other]

    cs.CL cs.AI

    DialSim: A Real-Time Simulator for Evaluating Long-Term Dialogue Understanding of Conversational Agents

    Authors: Jiho Kim, Woosog Chay, Hyeonji Hwang, Daeun Kyung, Hyunseung Chung, Eunbyeol Cho, Yohan Jo, Edward Choi

    Abstract: Recent advancements in Large Language Models (LLMs) have significantly enhanced the capabilities of conversational agents, making them applicable to various fields (e.g., education). Despite their progress, the evaluation of the agents often overlooks the complexities of real-world conversations, such as real-time interactions, multi-party dialogues, and extended contextual dependencies. To bridge…

    Submitted 18 June, 2024; originally announced June 2024.

  3. arXiv:2406.10996  [pdf, other]

    cs.CL

    THEANINE: Revisiting Memory Management in Long-term Conversations with Timeline-augmented Response Generation

    Authors: Seo Hyun Kim, Kai Tzu-iunn Ong, Taeyoon Kwon, Namyoung Kim, Keummin Ka, SeongHyeon Bae, Yohan Jo, Seung-won Hwang, Dongha Lee, Jinyoung Yeo

    Abstract: Large language models (LLMs) are capable of processing lengthy dialogue histories during prolonged interaction with users without additional memory modules; however, their responses tend to overlook or incorrectly recall information from the past. In this paper, we revisit memory-augmented response generation in the era of LLMs. While prior work focuses on getting rid of outdated memories, we argu…

    Submitted 16 June, 2024; originally announced June 2024.

    Comments: Under Review

  4. arXiv:2406.01020  [pdf, other]

    cs.CV

    CLIP-Guided Attribute Aware Pretraining for Generalizable Image Quality Assessment

    Authors: Daekyu Kwon, Dongyoung Kim, Sehwan Ki, Younghyun Jo, Hyong-Euk Lee, Seon Joo Kim

    Abstract: In no-reference image quality assessment (NR-IQA), the challenge of limited dataset sizes hampers the development of robust and generalizable models. Conventional methods address this issue by utilizing large datasets to extract rich representations for IQA. Also, some approaches propose vision language models (VLM) based IQA, but the domain gap between generic VLM and IQA constrains their scalabi…

    Submitted 3 June, 2024; originally announced June 2024.

  5. arXiv:2405.14082  [pdf, other]

    cs.LG cs.AI

    Exclusively Penalized Q-learning for Offline Reinforcement Learning

    Authors: Junghyuk Yeom, Yonghyeon Jo, Jungmo Kim, Sanghyeon Lee, Seungyul Han

    Abstract: Constraint-based offline reinforcement learning (RL) involves policy constraints or imposing penalties on the value function to mitigate overestimation errors caused by distributional shift. This paper focuses on a limitation in existing offline RL methods with penalized value function, indicating the potential for underestimation bias due to unnecessary bias introduced in the value function. To a…

    Submitted 22 May, 2024; originally announced May 2024.

    Comments: 9 pages technical page followed by references and appendix

  6. arXiv:2405.11162  [pdf, other]

    cs.CL

    LG AI Research & KAIST at EHRSQL 2024: Self-Training Large Language Models with Pseudo-Labeled Unanswerable Questions for a Reliable Text-to-SQL System on EHRs

    Authors: Yongrae Jo, Seongyun Lee, Minju Seo, Sung Ju Hwang, Moontae Lee

    Abstract: Text-to-SQL models are pivotal for making Electronic Health Records (EHRs) accessible to healthcare professionals without SQL knowledge. With the advancements in large language models, these systems have become more adept at translating complex questions into SQL queries. Nonetheless, the critical need for reliability in healthcare necessitates these models to accurately identify unanswerable ques…

    Submitted 17 May, 2024; originally announced May 2024.

    Comments: NAACL 2024 Clinical NLP Workshop

  7. arXiv:2404.09480  [pdf, other]

    cs.CL cs.AI

    Mitigating Hallucination in Abstractive Summarization with Domain-Conditional Mutual Information

    Authors: Kyubyung Chae, Jaepill Choi, Yohan Jo, Taesup Kim

    Abstract: A primary challenge in abstractive summarization is hallucination -- the phenomenon where a model generates plausible text that is absent in the source text. We hypothesize that the domain (or topic) of the source text triggers the model to generate text that is highly probable in the domain, neglecting the details of the source text. To alleviate this model bias, we introduce a decoding strategy…

    Submitted 15 April, 2024; originally announced April 2024.

    Comments: Accepted by Findings of NAACL 2024

  8. arXiv:2404.01954  [pdf, other]

    cs.CL cs.AI

    HyperCLOVA X Technical Report

    Authors: Kang Min Yoo, Jaegeun Han, Sookyo In, Heewon Jeon, Jisu Jeong, Jaewook Kang, Hyunwook Kim, Kyung-Min Kim, Munhyong Kim, Sungju Kim, Donghyun Kwak, Hanock Kwak, Se Jung Kwon, Bado Lee, Dongsoo Lee, Gichang Lee, Jooho Lee, Baeseong Park, Seongjin Shin, Joonsang Yu, Seolki Baek, Sumin Byeon, Eungsup Cho, Dooseok Choe, Jeesung Han, et al. (371 additional authors not shown)

    Abstract: We introduce HyperCLOVA X, a family of large language models (LLMs) tailored to the Korean language and culture, along with competitive capabilities in English, math, and coding. HyperCLOVA X was trained on a balanced mix of Korean, English, and code data, followed by instruction-tuning with high-quality human-annotated datasets while abiding by strict safety guidelines reflecting our commitment t…

    Submitted 13 April, 2024; v1 submitted 2 April, 2024; originally announced April 2024.

    Comments: 44 pages; updated authors list and fixed author names

  9. arXiv:2403.04787  [pdf, other]

    cs.CL cs.AI

    Ever-Evolving Memory by Blending and Refining the Past

    Authors: Seo Hyun Kim, Keummin Ka, Yohan Jo, Seung-won Hwang, Dongha Lee, Jinyoung Yeo

    Abstract: For a human-like chatbot, constructing a long-term memory is crucial. However, current large language models often lack this capability, leading to instances of missing important user information or redundantly asking for the same information, thereby diminishing conversation quality. To effectively construct memory, it is crucial to seamlessly connect past and present information, while also poss…

    Submitted 7 April, 2024; v1 submitted 3 March, 2024; originally announced March 2024.

    Comments: 17 pages, 4 figures, 7 tables

  10. arXiv:2402.11827  [pdf, other]

    cs.IR cs.CL

    Ask Optimal Questions: Aligning Large Language Models with Retriever's Preference in Conversational Search

    Authors: Chanwoong Yoon, Gangwoo Kim, Byeongguk Jeon, Sungdong Kim, Yohan Jo, Jaewoo Kang

    Abstract: Conversational search, unlike single-turn retrieval tasks, requires understanding the current question within a dialogue context. The common approach of rewrite-then-retrieve aims to decontextualize questions to be self-sufficient for off-the-shelf retrievers, but most existing methods produce sub-optimal query rewrites due to the limited ability to incorporate signals from the retrieval results.…

    Submitted 18 February, 2024; originally announced February 2024.

    Comments: 8 pages

  11. arXiv:2401.06400  [pdf, other]

    cs.CL cs.CV

    Generalizing Visual Question Answering from Synthetic to Human-Written Questions via a Chain of QA with a Large Language Model

    Authors: Taehee Kim, Yeongjae Cho, Heejun Shin, Yohan Jo, Dongmyung Shin

    Abstract: Visual question answering (VQA) is a task where an image is given, and a series of questions are asked about the image. To build an efficient VQA algorithm, a large amount of QA data is required which is very expensive. Generating synthetic QA pairs based on templates is a practical way to obtain data. However, VQA models trained on those data do not perform well on complex, human-written question…

    Submitted 16 January, 2024; v1 submitted 12 January, 2024; originally announced January 2024.

  12. arXiv:2312.13822  [pdf, other]

    cs.CV

    Universal Noise Annotation: Unveiling the Impact of Noisy annotation on Object Detection

    Authors: Kwangrok Ryoo, Yeonsik Jo, Seungjun Lee, Mira Kim, Ahra Jo, Seung Hwan Kim, Seungryong Kim, Soonyoung Lee

    Abstract: For object detection task with noisy labels, it is important to consider not only categorization noise, as in image classification, but also localization noise, missing annotations, and bogus bounding boxes. However, previous studies have only addressed certain types of noise (e.g., localization or categorization). In this paper, we propose Universal-Noise Annotation (UNA), a more practical settin…

    Submitted 21 December, 2023; originally announced December 2023.

    Comments: appendix and code : https://github.com/Ryoo72/UNA

  13. arXiv:2312.12661  [pdf, other]

    cs.CV

    Misalign, Contrast then Distill: Rethinking Misalignments in Language-Image Pretraining

    Authors: Bumsoo Kim, Yeonsik Jo, Jinhyung Kim, Seung Hwan Kim

    Abstract: Contrastive Language-Image Pretraining has emerged as a prominent approach for training vision and text encoders with uncurated image-text pairs from the web. To enhance data-efficiency, recent efforts have introduced additional supervision terms that involve random-augmented views of the image. However, since the image augmentation process is unaware of its text counterpart, this procedure could…

    Submitted 19 December, 2023; originally announced December 2023.

    Comments: ICCV 2023

  14. arXiv:2312.12659  [pdf, other]

    cs.CV

    Expediting Contrastive Language-Image Pretraining via Self-distilled Encoders

    Authors: Bumsoo Kim, Jinhyung Kim, Yeonsik Jo, Seung Hwan Kim

    Abstract: Recent advances in vision language pretraining (VLP) have been largely attributed to the large-scale data collected from the web. However, uncurated dataset contains weakly correlated image-text pairs, causing data inefficiency. To address the issue, knowledge distillation have been explored at the expense of extra image and text momentum encoders to generate teaching signals for misaligned image-…

    Submitted 19 December, 2023; originally announced December 2023.

    Comments: AAAI 2024

  15. arXiv:2311.07362  [pdf, other]

    cs.CL cs.CV

    Volcano: Mitigating Multimodal Hallucination through Self-Feedback Guided Revision

    Authors: Seongyun Lee, Sue Hyun Park, Yongrae Jo, Minjoon Seo

    Abstract: Large multimodal models suffer from multimodal hallucination, where they provide incorrect responses misaligned with the given visual information. Recent works have conjectured that one of the reasons behind multimodal hallucination is due to the vision encoder failing to ground on the image properly. To mitigate this issue, we propose a novel approach that leverages self-feedback as visual cues.…

    Submitted 2 April, 2024; v1 submitted 13 November, 2023; originally announced November 2023.

  16. arXiv:2310.20479  [pdf, other]

    cs.CL

    Multi-User MultiWOZ: Task-Oriented Dialogues among Multiple Users

    Authors: Yohan Jo, Xinyan Zhao, Arijit Biswas, Nikoletta Basiou, Vincent Auvray, Nikolaos Malandrakis, Angeliki Metallinou, Alexandros Potamianos

    Abstract: While most task-oriented dialogues assume conversations between the agent and one user at a time, dialogue systems are increasingly expected to communicate with multiple users simultaneously who make decisions collaboratively. To facilitate development of such systems, we release the Multi-User MultiWOZ dataset: task-oriented dialogues among two users and one agent. To collect this dataset, each u…

    Submitted 31 October, 2023; originally announced October 2023.

    Comments: To Appear in EMNLP-Findings 2023

  17. arXiv:2310.17857  [pdf, other]

    cs.CL

    From Values to Opinions: Predicting Human Behaviors and Stances Using Value-Injected Large Language Models

    Authors: Dongjun Kang, Joonsuk Park, Yohan Jo, JinYeong Bak

    Abstract: Being able to predict people's opinions on issues and behaviors in realistic scenarios can be helpful in various domains, such as politics and marketing. However, conducting large-scale surveys like the European Social Survey to solicit people's opinions on individual issues can incur prohibitive costs. Leveraging prior research showing influence of core human values on individual decisions and ac…

    Submitted 26 October, 2023; originally announced October 2023.

    Comments: EMNLP 2023 main paper accepted

  18. arXiv:2310.11220  [pdf, other]

    cs.CL

    KG-GPT: A General Framework for Reasoning on Knowledge Graphs Using Large Language Models

    Authors: Jiho Kim, Yeonsu Kwon, Yohan Jo, Edward Choi

    Abstract: While large language models (LLMs) have made considerable advancements in understanding and generating unstructured text, their application in structured data remains underexplored. Particularly, using LLMs for complex reasoning tasks on knowledge graphs (KGs) remains largely untouched. To address this, we propose KG-GPT, a multi-purpose framework leveraging LLMs for tasks employing KGs. KG-GPT co…

    Submitted 17 October, 2023; originally announced October 2023.

    Comments: Accepted to EMNLP 2023 Findings

  19. arXiv:2309.06006  [pdf, ps, other]

    cs.CV cs.AI

    SoccerNet 2023 Challenges Results

    Authors: Anthony Cioppa, Silvio Giancola, Vladimir Somers, Floriane Magera, Xin Zhou, Hassan Mkhallati, Adrien Deliège, Jan Held, Carlos Hinojosa, Amir M. Mansourian, Pierre Miralles, Olivier Barnich, Christophe De Vleeschouwer, Alexandre Alahi, Bernard Ghanem, Marc Van Droogenbroeck, Abdullah Kamal, Adrien Maglo, Albert Clapés, Amr Abdelaziz, Artur Xarles, Astrid Orcesi, Atom Scott, Bin Liu, Byoungkwon Lim, et al. (77 additional authors not shown)

    Abstract: The SoccerNet 2023 challenges were the third annual video understanding challenges organized by the SoccerNet team. For this third edition, the challenges were composed of seven vision-based tasks split into three main themes. The first theme, broadcast video understanding, is composed of three high-level tasks related to describing events occurring in the video broadcasts: (1) action spotting, fo…

    Submitted 12 September, 2023; originally announced September 2023.

  20. arXiv:2309.00237  [pdf, other]

    cs.CL cs.AI

    Publicly Shareable Clinical Large Language Model Built on Synthetic Clinical Notes

    Authors: Sunjun Kweon, Junu Kim, Jiyoun Kim, Sujeong Im, Eunbyeol Cho, Seongsu Bae, Jungwoo Oh, Gyubok Lee, Jong Hak Moon, Seng Chan You, Seungjin Baek, Chang Hoon Han, Yoon Bin Jung, Yohan Jo, Edward Choi

    Abstract: The development of large language models tailored for handling patients' clinical notes is often hindered by the limited accessibility and usability of these notes due to strict privacy regulations. To address these challenges, we first create synthetic large-scale clinical notes using publicly available case reports extracted from biomedical literature. We then use these synthetic notes to train…

    Submitted 13 June, 2024; v1 submitted 1 September, 2023; originally announced September 2023.

    Comments: ACL 2024 (Findings)

  21. arXiv:2308.12492  [pdf, other]

    cs.LG eess.SP

    Optimizing Neural Network Scale for ECG Classification

    Authors: Byeong Tak Lee, Yong-Yeon Jo, Joon-Myoung Kwon

    Abstract: We study scaling convolutional neural networks (CNNs), specifically targeting Residual neural networks (ResNet), for analyzing electrocardiograms (ECGs). Although ECG signals are time-series data, CNN-based models have been shown to outperform other neural networks with different architectures in ECG analysis. However, most previous studies in ECG analysis have overlooked the importance of network…

    Submitted 23 August, 2023; originally announced August 2023.

    Comments: 30 pages

  22. arXiv:2308.11272  [pdf, other]

    cs.LG

    FoX: Formation-aware exploration in multi-agent reinforcement learning

    Authors: Yonghyeon Jo, Sunwoo Lee, Junghyuk Yeom, Seungyul Han

    Abstract: Recently, deep multi-agent reinforcement learning (MARL) has gained significant popularity due to its success in various cooperative multi-agent tasks. However, exploration still remains a challenging problem in MARL due to the partial observability of the agents and the exploration space that can grow exponentially as the number of agents increases. Firstly, in order to address the scalability is…

    Submitted 13 January, 2024; v1 submitted 22 August, 2023; originally announced August 2023.

    Comments: 8 pages main, 5 pages appendix with reference. 10 figures, accepted by AAAI 2024

    MSC Class: Machine Learning (ML) - ML: Reinforcement Learning; Secondary Subject Areas: Multiagent Systems (MAS) - MAS: Multiagent Learning

  23. arXiv:2307.10928  [pdf, other]

    cs.CL cs.AI

    FLASK: Fine-grained Language Model Evaluation based on Alignment Skill Sets

    Authors: Seonghyeon Ye, Doyoung Kim, Sungdong Kim, Hyeonbin Hwang, Seungone Kim, Yongrae Jo, James Thorne, Juho Kim, Minjoon Seo

    Abstract: Evaluation of Large Language Models (LLMs) is challenging because instruction-following necessitates alignment with human values and the required set of skills varies depending on the instruction. However, previous studies have mainly focused on coarse-grained evaluation (i.e. overall preference-based evaluation), which limits interpretability since it does not consider the nature of user instruct…

    Submitted 14 April, 2024; v1 submitted 20 July, 2023; originally announced July 2023.

    Comments: ICLR 2024 Spotlight

  24. arXiv:2307.02682  [pdf, other]

    cs.CV cs.CL

    Zero-Shot Dense Video Captioning by Jointly Optimizing Text and Moment

    Authors: Yongrae Jo, Seongyun Lee, Aiden SJ Lee, Hyunji Lee, Hanseok Oh, Minjoon Seo

    Abstract: Dense video captioning, a task of localizing meaningful moments and generating relevant captions for videos, often requires a large, expensive corpus of annotated video segments paired with text. In an effort to minimize the annotation cost, we propose ZeroTA, a novel method for dense video captioning in a zero-shot manner. Our method does not require any videos or annotations for training; instea…

    Submitted 11 July, 2023; v1 submitted 5 July, 2023; originally announced July 2023.

  25. arXiv:2305.07288  [pdf, other]

    cs.CL

    Open-WikiTable: Dataset for Open Domain Question Answering with Complex Reasoning over Table

    Authors: Sunjun Kweon, Yeonsu Kwon, Seonhee Cho, Yohan Jo, Edward Choi

    Abstract: Despite recent interest in open domain question answering (ODQA) over tables, many studies still rely on datasets that are not truly optimal for the task with respect to utilizing structural nature of table. These datasets assume answers reside as a single cell value and do not necessitate exploring over multiple cells such as aggregation, comparison, and sorting. Thus, we release Open-WikiTable,…

    Submitted 12 May, 2023; originally announced May 2023.

    Comments: ACL 2023 (Findings)

  26. arXiv:2305.06590  [pdf, other]

    cs.CL cs.AI

    FactKG: Fact Verification via Reasoning on Knowledge Graphs

    Authors: Jiho Kim, Sungjin Park, Yeonsu Kwon, Yohan Jo, James Thorne, Edward Choi

    Abstract: In real world applications, knowledge graphs (KG) are widely used in various domains (e.g. medical applications and dialogue agents). However, for fact verification, KGs have not been adequately utilized as a knowledge source. KGs can be a valuable knowledge source in fact verification due to their reliability and broad applicability. A KG consists of nodes and edges which makes it clear how conce…

    Submitted 18 May, 2023; v1 submitted 11 May, 2023; originally announced May 2023.

    Comments: Accepted to ACL 2023

  27. arXiv:2304.02096  [pdf, other]

    astro-ph.CO astro-ph.GA cs.LG

    The CAMELS project: Expanding the galaxy formation model space with new ASTRID and 28-parameter TNG and SIMBA suites

    Authors: Yueying Ni, Shy Genel, Daniel Anglés-Alcázar, Francisco Villaescusa-Navarro, Yongseok Jo, Simeon Bird, Tiziana Di Matteo, Rupert Croft, Nianyi Chen, Natalí S. M. de Santi, Matthew Gebhardt, Helen Shao, Shivam Pandey, Lars Hernquist, Romeel Dave

    Abstract: We present CAMELS-ASTRID, the third suite of hydrodynamical simulations in the Cosmology and Astrophysics with MachinE Learning (CAMELS) project, along with new simulation sets that extend the model parameter space based on the previous frameworks of CAMELS-TNG and CAMELS-SIMBA, to provide broader training sets and testing grounds for machine-learning algorithms designed for cosmological studies.…

    Submitted 4 April, 2023; originally announced April 2023.

  28. arXiv:2302.14260  [pdf, other]

    cs.LG

    A Closer Look at the Intervention Procedure of Concept Bottleneck Models

    Authors: Sungbin Shin, Yohan Jo, Sungsoo Ahn, Namhoon Lee

    Abstract: Concept bottleneck models (CBMs) are a class of interpretable neural network models that predict the target response of a given input based on its high-level concepts. Unlike the standard end-to-end models, CBMs enable domain experts to intervene on the predicted concepts and rectify any mistakes at test time, so that more accurate task predictions can be made at the end. While such intervenabilit…

    Submitted 2 July, 2023; v1 submitted 27 February, 2023; originally announced February 2023.

    Comments: ICML 2023

  29. arXiv:2210.03029  [pdf, other]

    cs.CL cs.AI

    Efficiently Enhancing Zero-Shot Performance of Instruction Following Model via Retrieval of Soft Prompt

    Authors: Seonghyeon Ye, Joel Jang, Doyoung Kim, Yongrae Jo, Minjoon Seo

    Abstract: Enhancing the zero-shot performance of instruction-following models requires heavy computation, either by scaling the total number of training datasets or the model size. In this work, we explore how retrieval of soft prompts obtained through prompt tuning can efficiently assist hard prompts in zero-shot task generalization. Specifically, we train soft prompt embeddings for each prompt through pro…

    Submitted 16 October, 2023; v1 submitted 6 October, 2022; originally announced October 2022.

    Comments: EMNLP 2023 Findings

  30. arXiv:2207.11728  [pdf, other]

    eess.SP cs.AR

    A Custom IC Layout Generation Engine Based on Dynamic Templates and Grids

    Authors: Taeho Shin, Dongjun Lee, Dongwhee Kim, Gaeryun Sung, Wookjin Shin, Yunseong Jo, Hyungjoo Park, Jaeduk Han

    Abstract: This paper presents an automatic layout generation framework in advanced CMOS technologies. The framework extends the template-and-grid-based layout generation methodology with the following additional techniques applied to produce optimal layouts more effectively. First, layout templates and grids are dynamically created and adjusted during runtime to serve various structural, functional, and des…

    Submitted 24 July, 2022; originally announced July 2022.

    Comments: 10 pages, 6 figures

  31. arXiv:2206.13404  [pdf, other]

    eess.AS cs.AI cs.SD

    Avocodo: Generative Adversarial Network for Artifact-free Vocoder

    Authors: Taejun Bak, Junmo Lee, Hanbin Bae, Jinhyeok Yang, Jae-Sung Bae, Young-Sun Joo

    Abstract: Neural vocoders based on the generative adversarial neural network (GAN) have been widely used due to their fast inference speed and lightweight networks while generating high-quality speech waveforms. Since the perceptually important speech components are primarily concentrated in the low-frequency bands, most GAN-based vocoders perform multi-scale analysis that evaluates downsampled speech wavef…

    Submitted 3 January, 2023; v1 submitted 27 June, 2022; originally announced June 2022.

    Comments: Accepted for publication in the 37th AAAI conference on artificial intelligence (AAAI 2023)

  32. arXiv:2206.11349  [pdf, other]

    cs.LG cs.AI cs.CL

    Prompt Injection: Parameterization of Fixed Inputs

    Authors: Eunbi Choi, Yongrae Jo, Joel Jang, Minjoon Seo

    Abstract: Recent works have shown that attaching prompts to the input is effective at conditioning Language Models (LM) to perform specific tasks. However, prompts are always included in the input text during inference, thus incurring substantial computational and memory overhead. Also, there is currently no straightforward method of utilizing prompts that are longer than the maximum input length of the LMs…

    Submitted 15 July, 2022; v1 submitted 31 May, 2022; originally announced June 2022.

    Comments: PING results in Table 2 updated (bug fixed)

  33. arXiv:2204.05753  [pdf, other]

    eess.AS cs.AI cs.SD

    Enhancement of Pitch Controllability using Timbre-Preserving Pitch Augmentation in FastPitch

    Authors: Hanbin Bae, Young-Sun Joo

    Abstract: The recently developed pitch-controllable text-to-speech (TTS) model, i.e. FastPitch, was conditioned for the pitch contours. However, the quality of the synthesized speech degraded considerably for pitch values that deviated significantly from the average pitch; i.e. the ability to control pitch was limited. To address this issue, we propose two algorithms to improve the robustness of FastPitch.…

    Submitted 12 April, 2022; originally announced April 2022.

    Comments: Submitted to INTERSPEECH 2022

  34. arXiv:2204.04004  [pdf, other]

    eess.AS cs.SD

    Hierarchical and Multi-Scale Variational Autoencoder for Diverse and Natural Non-Autoregressive Text-to-Speech

    Authors: Jae-Sung Bae, Jinhyeok Yang, Tae-Jun Bak, Young-Sun Joo

    Abstract: This paper proposes a hierarchical and multi-scale variational autoencoder-based non-autoregressive text-to-speech model (HiMuV-TTS) to generate natural speech with diverse speaking styles. Recent advances in non-autoregressive TTS (NAR-TTS) models have significantly improved the inference speed and robustness of synthesized speech. However, the diversity of speaking styles and naturalness are nee…

    Submitted 15 August, 2022; v1 submitted 8 April, 2022; originally announced April 2022.

    Comments: Accepted to INTERSPEECH 2022

  35. arXiv:2201.01300  [pdf, other]

    astro-ph.CO astro-ph.GA astro-ph.IM cs.AI cs.LG

    The CAMELS project: public data release

    Authors: Francisco Villaescusa-Navarro, Shy Genel, Daniel Anglés-Alcázar, Lucia A. Perez, Pablo Villanueva-Domingo, Digvijay Wadekar, Helen Shao, Faizan G. Mohammad, Sultan Hassan, Emily Moser, Erwin T. Lau, Luis Fernando Machado Poletti Valle, Andrina Nicola, Leander Thiele, Yongseok Jo, Oliver H. E. Philcox, Benjamin D. Oppenheimer, Megan Tillman, ChangHoon Hahn, Neerav Kaushal, Alice Pisani, Matthew Gebhardt, Ana Maria Delgado, Joyce Caliendo, Christina Kreisch, et al. (22 additional authors not shown)

    Abstract: The Cosmology and Astrophysics with MachinE Learning Simulations (CAMELS) project was developed to combine cosmology with astrophysics through thousands of cosmological hydrodynamic simulations and machine learning. CAMELS contains 4,233 cosmological simulations, 2,049 N-body and 2,184 state-of-the-art hydrodynamic simulations that sample a vast volume in parameter space. In this paper we present…

    Submitted 4 January, 2022; originally announced January 2022.

    Comments: 18 pages, 3 figures. More than 350 Tb of data from thousands of simulations publicly available at https://www.camel-simulations.org

  36. arXiv:2109.10915  [pdf, other]

    cs.LG astro-ph.CO astro-ph.GA astro-ph.IM cs.CV

    The CAMELS Multifield Dataset: Learning the Universe's Fundamental Parameters with Artificial Intelligence

    Authors: Francisco Villaescusa-Navarro, Shy Genel, Daniel Angles-Alcazar, Leander Thiele, Romeel Dave, Desika Narayanan, Andrina Nicola, Yin Li, Pablo Villanueva-Domingo, Benjamin Wandelt, David N. Spergel, Rachel S. Somerville, Jose Manuel Zorrilla Matilla, Faizan G. Mohammad, Sultan Hassan, Helen Shao, Digvijay Wadekar, Michael Eickenberg, Kaze W. K. Wong, Gabriella Contardo, Yongseok Jo, Emily Moser, Erwin T. Lau, Luis Fernando Machado Poletti Valle, Lucia A. Perez, et al. (3 additional authors not shown)

    Abstract: We present the Cosmology and Astrophysics with MachinE Learning Simulations (CAMELS) Multifield Dataset, CMD, a collection of hundreds of thousands of 2D maps and 3D grids containing many different properties of cosmic gas, dark matter, and stars from 2,000 distinct simulated universes at several cosmic times. The 2D maps and 3D grids represent cosmic regions that span $\sim$100 million light year…

    Submitted 22 September, 2021; originally announced September 2021.

    Comments: 17 pages, 1 figure. Third paper of a series of four. Hundreds of thousands of labeled 2D maps and 3D grids from thousands of simulated universes publicly available at https://camels-multifield-dataset.readthedocs.io

  37. arXiv:2109.09057  [pdf, other]

    cs.CL

    Knowledge-Enhanced Evidence Retrieval for Counterargument Generation

    Authors: Yohan Jo, Haneul Yoo, JinYeong Bak, Alice Oh, Chris Reed, Eduard Hovy

    Abstract: Finding counterevidence to statements is key to many tasks, including counterargument generation. We build a system that, given a statement, retrieves counterevidence from diverse sources on the Web. At the core of this system is a natural language inference (NLI) model that determines whether a candidate sentence is valid counterevidence or not. Most NLI models to date, however, lack proper reaso…

    Submitted 19 September, 2021; originally announced September 2021.

    Comments: To appear in Findings of EMNLP 2021

  38. arXiv:2108.12841  [pdf, other]

    eess.IV cs.CV

    Rethinking Deep Image Prior for Denoising

    Authors: Yeonsik Jo, Se Young Chun, Jonghyun Choi

    Abstract: Deep image prior (DIP) serves as a good inductive bias for diverse inverse problems. Among them, denoising is known to be particularly challenging for the DIP due to noise fitting with the requirement of an early stopping. To address the issue, we first analyze the DIP by the notion of effective degrees of freedom (DF) to monitor the optimization progress and propose a principled stopping criterio…

    Submitted 29 August, 2021; originally announced August 2021.

    Comments: ICCV 2021

  39. arXiv:2107.08823  [pdf, other]

    cs.LG cs.AI

    One-Class Classification for Wafer Map using Adversarial Autoencoder with DSVDD Prior

    Authors: Ha Young Jo, Seong-Whan Lee

    Abstract: Recently, semiconductors' demand has exploded in virtual reality, smartphones, wearable devices, the internet of things, robotics, and automobiles. Semiconductor manufacturers want to make semiconductors with high yields. To do this, manufacturers conduct many quality assurance activities. Wafer map pattern classification is a typical way of quality assurance. The defect pattern on the wafer map c…

    Submitted 15 July, 2021; originally announced July 2021.

  40. arXiv:2105.07571  [pdf, other]

    cs.CL

    Classifying Argumentative Relations Using Logical Mechanisms and Argumentation Schemes

    Authors: Yohan Jo, Seojin Bang, Chris Reed, Eduard Hovy

    Abstract: While argument mining has achieved significant success in classifying argumentative relations between statements (support, attack, and neutral), we have a limited computational understanding of logical mechanisms that constitute those relations. Most recent studies rely on black-box models, which are not as linguistically insightful as desired. On the other hand, earlier studies use rather simple…

    Submitted 16 May, 2021; originally announced May 2021.

    Comments: To Appear in TACL 2021

  41. arXiv:2104.13119  [pdf, other]

    cs.CV

    Fisheye Lens Camera based Autonomous Valet Parking System

    Authors: Young Gon Jo, Seok Hyeon Hong, Sung Soo Hwang, Jeong Mok Ha

    Abstract: This paper proposes an efficient autonomous valet parking system utilizing only cameras which are the most widely used sensor. To capture more information instantaneously and respond rapidly to changes in the surrounding environment, fisheye cameras which have a wider angle of view compared to pinhole cameras are used. Accordingly, visual simultaneous localization and mapping is used to identify t…

    Submitted 27 April, 2021; originally announced April 2021.

    Comments: 8 pages, 17 figures, 4 tables

  42. arXiv:2103.03049  [pdf, other]

    eess.AS cs.LG cs.SD

    A Neural Text-to-Speech Model Utilizing Broadcast Data Mixed with Background Music

    Authors: Hanbin Bae, Jae-Sung Bae, Young-Sun Joo, Young-Ik Kim, Hoon-Young Cho

    Abstract: Recently, it has become easier to obtain speech data from various media such as the internet or YouTube, but directly utilizing them to train a neural text-to-speech (TTS) model is difficult. The proportion of clean speech is insufficient and the remainder includes background music. Even with the global style token (GST). Therefore, we propose the following method to successfully train an end-to-e…

    Submitted 4 March, 2021; originally announced March 2021.

    Comments: Accepted at ICASSP 2021

  43. arXiv:2103.00006  [pdf, other]

    eess.SP cs.LG

    Towards Synthesizing Twelve-Lead Electrocardiograms from Two Asynchronous Leads

    Authors: Yong-Yeon Jo, Young Sang Choi, Jong-Hwan Jang, Joon-Myoung Kwon

    Abstract: The electrocardiogram (ECG) records electrical signals in a non-invasive way to observe the condition of the heart, typically looking at the heart from 12 different directions. Several types of cardiac disease are diagnosed by using 12-lead ECGs. Recently, various wearable devices have enabled immediate access to the ECG without the use of unwieldy equipment. However, they only provide ECGs with…

    Submitted 25 June, 2024; v1 submitted 28 February, 2021; originally announced March 2021.

  44. arXiv:2010.02660  [pdf, other]

    cs.CL

    Detecting Attackable Sentences in Arguments

    Authors: Yohan Jo, Seojin Bang, Emaad Manzoor, Eduard Hovy, Chris Reed

    Abstract: Finding attackable sentences in an argument is the first step toward successful refutation in argumentation. We present a first large-scale analysis of sentence attackability in online arguments. We analyze driving reasons for attacks in argumentation and identify relevant characteristics of sentences. We demonstrate that a sentence's attackability is associated with many of these characteristics…

    Submitted 6 October, 2020; originally announced October 2020.

    Comments: EMNLP 2020

  45. arXiv:2010.02654  [pdf, other]

    cs.CL

    Extracting Implicitly Asserted Propositions in Argumentation

    Authors: Yohan Jo, Jacky Visser, Chris Reed, Eduard Hovy

    Abstract: Argumentation accommodates various rhetorical devices, such as questions, reported speech, and imperatives. These rhetorical tools usually assert argumentatively relevant propositions rather implicitly, so understanding their true meaning is key to understanding certain arguments properly. However, most argument mining systems and computational linguistics research have paid little attention to im…

    Submitted 6 October, 2020; originally announced October 2020.

    Comments: EMNLP 2020

  46. arXiv:2009.04070  [pdf, other]

    cs.SD cs.CL cs.LG eess.AS

    Exploiting Multi-Modal Features From Pre-trained Networks for Alzheimer's Dementia Recognition

    Authors: Junghyun Koo, Jie Hwan Lee, Jaewoo Pyo, Yujin Jo, Kyogu Lee

    Abstract: Collecting and accessing a large amount of medical data is very time-consuming and laborious, not only because it is difficult to find specific patients but also because it is required to resolve the confidentiality of a patient's medical records. On the other hand, there are deep learning models, trained on easily collectible, large-scale datasets such as YouTube or Wikipedia, offering useful rep…

    Submitted 2 March, 2021; v1 submitted 8 September, 2020; originally announced September 2020.

    Comments: In the Proceedings of INTERSPEECH 2020

  47. arXiv:2007.15281  [pdf, other]

    eess.AS cs.SD

    Speaking Speed Control of End-to-End Speech Synthesis using Sentence-Level Conditioning

    Authors: Jae-Sung Bae, Hanbin Bae, Young-Sun Joo, Junmo Lee, Gyeong-Hoon Lee, Hoon-Young Cho

    Abstract: This paper proposes a controllable end-to-end text-to-speech (TTS) system to control the speaking speed (speed-controllable TTS; SCTTS) of synthesized speech with sentence-level speaking-rate value as an additional input. The speaking-rate value, the ratio of the number of input phonemes to the length of input speech, is adopted in the proposed system to control the speaking speed. Furthermore, th…

    Submitted 13 August, 2020; v1 submitted 30 July, 2020; originally announced July 2020.

    Comments: Accepted to INTERSPEECH 2020

  48. arXiv:2005.14629  [pdf]

    cs.RO eess.SP

    Stealth UAV through Coanda Effect

    Authors: Dongyoon Shin, Hyeji Kim, Jihyuk Gong, Uijeong Jeong, Yeeun Jo, Eric Matson

    Abstract: This paper uses the Coanda effect to reduce motors, the source of noise, and finds low-noise materials with sufficient lift force so that it can achieve acoustically stealthy UAVs. According to NASA research [1], the noise of UAVs is more audible to people. But there must be some moments when we need to operate the drones quietly, so how can we reduce the noise? In previous research, there have also been…

    Submitted 29 April, 2020; originally announced May 2020.

    Comments: 8 pages, 18 Figures, Accepted in The Fourth IEEE International Conference on Robotics Computing

  49. arXiv:2005.01056  [pdf, other]

    eess.IV cs.CV

    NTIRE 2020 Challenge on Perceptual Extreme Super-Resolution: Methods and Results

    Authors: Kai Zhang, Shuhang Gu, Radu Timofte, Taizhang Shang, Qiuju Dai, Shengchen Zhu, Tong Yang, Yandong Guo, Younghyun Jo, Sejong Yang, Seon Joo Kim, Lin Zha, Jiande Jiang, Xinbo Gao, Wen Lu, Jing Liu, Kwangjin Yoon, Taegyun Jeon, Kazutoshi Akita, Takeru Ooba, Norimichi Ukita, Zhipeng Luo, Yuehan Yao, Zhenyu Xu, Dongliang He, et al. (38 additional authors not shown)

    Abstract: This paper reviews the NTIRE 2020 challenge on perceptual extreme super-resolution with focus on proposed solutions and results. The challenge task was to super-resolve an input image with a magnification factor of 16 based on a set of prior examples of low and corresponding high resolution images. The goal is to obtain a network design capable of producing high resolution results with the best percept…

    Submitted 3 May, 2020; originally announced May 2020.

    Comments: CVPRW 2020

  50. arXiv:2004.02432  [pdf, other]

    cs.CV

    Deep Space-Time Video Upsampling Networks

    Authors: Jaeyeon Kang, Younghyun Jo, Seoung Wug Oh, Peter Vajda, Seon Joo Kim

    Abstract: Video super-resolution (VSR) and frame interpolation (FI) are traditional computer vision problems, and their performance has been improving recently through the incorporation of deep learning. In this paper, we investigate the problem of jointly upsampling videos both in space and time, which is becoming more important with advances in display systems. One solution for this is to run VSR and FI, one by one, i…

    Submitted 9 August, 2020; v1 submitted 6 April, 2020; originally announced April 2020.

    Comments: ECCV2020 accepted