Showing 1–42 of 42 results for author: Kang, P

  1. arXiv:2407.03644  [pdf, other]

    cs.HC

    On-Device Training Empowered Transfer Learning For Human Activity Recognition

    Authors: Pixi Kang, Julian Moosmann, Sizhen Bian, Michele Magno

    Abstract: Human Activity Recognition (HAR) is an attractive topic for perceiving human behavior and supplying assistive services. Besides the classical inertial unit and vision-based HAR methods, new sensing technologies, such as ultrasound and body-area electric fields, have emerged in HAR to enhance user experience and accommodate new application scenarios. As those sensors are often paired with AI for HAR,…

    Submitted 4 July, 2024; originally announced July 2024.

  2. arXiv:2406.12356  [pdf, other]

    cs.IR

    A Gradient Accumulation Method for Dense Retriever under Memory Constraint

    Authors: Jaehee Kim, Yukyung Lee, Pilsung Kang

    Abstract: InfoNCE loss is commonly used to train dense retrievers in information retrieval tasks. It is well known that a large batch is essential to stable and effective training with InfoNCE loss, which requires significant hardware resources. Because of this dependence on large batches, dense retrievers face a bottleneck in both application and research. Recently, memory reduction methods have been broadly adopted to res…

    Submitted 18 June, 2024; v1 submitted 18 June, 2024; originally announced June 2024.

  3. arXiv:2405.01028  [pdf, other]

    cs.CV

    Technical Report of NICE Challenge at CVPR 2024: Caption Re-ranking Evaluation Using Ensembled CLIP and Consensus Scores

    Authors: Kiyoon Jeong, Woojun Lee, Woongchan Nam, Minjeong Ma, Pilsung Kang

    Abstract: This report presents the ECO (Ensembled Clip score and cOnsensus score) pipeline from team DSBA LAB, which is a new framework used to evaluate and rank captions for a given image. ECO selects the most accurate caption describing the image. It is made possible by combining an Ensembled CLIP score, which considers the semantic alignment between the image and captions, with a Consensus score that account…

    Submitted 13 June, 2024; v1 submitted 2 May, 2024; originally announced May 2024.

  4. arXiv:2404.13919  [pdf, other]

    cs.CL cs.AI cs.HC

    Navigating the Path of Writing: Outline-guided Text Generation with Large Language Models

    Authors: Yukyung Lee, Soonwon Ka, Bokyung Son, Pilsung Kang, Jaewook Kang

    Abstract: Large Language Models (LLMs) have significantly impacted the writing process, enabling collaborative content creation and enhancing productivity. However, generating high-quality, user-aligned text remains challenging. In this paper, we propose Writing Path, a framework that uses explicit outlines to guide LLMs in generating goal-oriented, high-quality pieces of writing. Our approach draws inspira…

    Submitted 22 April, 2024; originally announced April 2024.

    Comments: under review

  5. arXiv:2403.18771  [pdf, other]

    cs.CL

    CheckEval: Robust Evaluation Framework using Large Language Model via Checklist

    Authors: Yukyung Lee, Joonghoon Kim, Jaehee Kim, Hyowon Cho, Pilsung Kang

    Abstract: We introduce CheckEval, a novel evaluation framework using Large Language Models, addressing the challenges of ambiguity and inconsistency in current evaluation methods. CheckEval addresses these challenges by dividing evaluation criteria into detailed sub-aspects and constructing a checklist of Boolean questions for each, simplifying the evaluation. This approach not only renders the process more…

    Submitted 27 March, 2024; originally announced March 2024.

    Comments: HEAL at CHI 2024

  6. arXiv:2403.07036  [pdf, other]

    cs.LG cs.CV cs.DC

    A Converting Autoencoder Toward Low-latency and Energy-efficient DNN Inference at the Edge

    Authors: Hasanul Mahmud, Peng Kang, Kevin Desai, Palden Lama, Sushil Prasad

    Abstract: Reducing inference time and energy usage while maintaining prediction accuracy has become a significant concern for deep neural network (DNN) inference on resource-constrained edge devices. To address this problem, we propose a novel approach based on a "converting" autoencoder and lightweight DNNs. This improves upon recent work such as early-exiting frameworks and DNN partitioning. Early-exiting f…

    Submitted 11 March, 2024; originally announced March 2024.

    Comments: 8 Pages, 8 Figures

  7. arXiv:2312.16071  [pdf, other]

    cs.NE cs.AI cs.GR cs.LG

    Event-based Shape from Polarization with Spiking Neural Networks

    Authors: Peng Kang, Srutarshi Banerjee, Henry Chopp, Aggelos Katsaggelos, Oliver Cossairt

    Abstract: Recent advances in event-based shape determination from polarization offer a transformative approach that tackles the trade-off between speed and accuracy in capturing surface geometries. In this paper, we investigate event-based shape from polarization using Spiking Neural Networks (SNNs), introducing the Single-Timestep and Multi-Timestep Spiking UNets for effective and efficient surface normal…

    Submitted 26 December, 2023; originally announced December 2023.

    Comments: 25 pages

  8. arXiv:2312.04982  [pdf, other]

    cs.CL

    Boosting Prompt-Based Self-Training With Mapping-Free Automatic Verbalizer for Multi-Class Classification

    Authors: Yookyung Kho, Jaehee Kim, Pilsung Kang

    Abstract: Recently, prompt-based fine-tuning has garnered considerable interest as a core technique for few-shot text classification tasks. This approach reformulates the fine-tuning objective to align with the Masked Language Modeling (MLM) objective. Leveraging unlabeled data, prompt-based self-training has shown greater effectiveness in binary and three-class classification. However, prompt-based self-tra…

    Submitted 8 December, 2023; originally announced December 2023.

    Comments: EMNLP 2023 findings

  9. arXiv:2311.05160  [pdf, other]

    cs.LG cs.AI

    RAPID: Training-free Retrieval-based Log Anomaly Detection with PLM considering Token-level information

    Authors: Gunho No, Yukyung Lee, Hyeongwon Kang, Pilsung Kang

    Abstract: As the IT industry advances, system log data becomes increasingly crucial. Many computer systems rely on log texts for management due to restricted access to source code. The need for log anomaly detection is growing, especially in real-world applications, but identifying anomalies in rapidly accumulating logs remains a challenging task. Traditional deep learning-based anomaly detection models req…

    Submitted 9 November, 2023; originally announced November 2023.

  10. arXiv:2311.03754  [pdf, other]

    cs.CL

    Which is better? Exploring Prompting Strategy For LLM-based Metrics

    Authors: Joonghoon Kim, Saeran Park, Kiyoon Jeong, Sangmin Lee, Seung Hun Han, Jiyoon Lee, Pilsung Kang

    Abstract: This paper describes the DSBA submissions to the Prompting Large Language Models as Explainable Metrics shared task, where systems were submitted to two tracks: small and large summarization tracks. With advanced Large Language Models (LLMs) such as GPT-4, evaluating the quality of Natural Language Generation (NLG) has become increasingly paramount. Traditional similarity-based metrics such as BLE…

    Submitted 7 November, 2023; originally announced November 2023.

    Comments: Eval4NLP 2023 shared task winner on both Small and Large model Track for Summarization

  11. arXiv:2306.02043  [pdf, other]

    cs.AI

    Painsight: An Extendable Opinion Mining Framework for Detecting Pain Points Based on Online Customer Reviews

    Authors: Yukyung Lee, Jaehee Kim, Doyoon Kim, Yookyung Kho, Younsun Kim, Pilsung Kang

    Abstract: As the e-commerce market continues to expand and online transactions proliferate, customer reviews have emerged as a critical element in shaping the purchasing decisions of prospective buyers. Previous studies have endeavored to identify key aspects of customer reviews through the development of sentiment analysis models and topic models. However, extracting specific dissatisfaction factors remain…

    Submitted 3 June, 2023; originally announced June 2023.

    Comments: WASSA at ACL 2023

  12. arXiv:2303.01034  [pdf, other]

    cs.LG cs.AI

    Multi-Task Self-Supervised Time-Series Representation Learning

    Authors: Heejeong Choi, Pilsung Kang

    Abstract: Time-series representation learning can extract representations from data with temporal dynamics and sparse labels. When labeled data are sparse but unlabeled data are abundant, contrastive learning, i.e., a framework to learn a latent space where similar samples are close to each other while dissimilar ones are far from each other, has shown outstanding performance. This strategy can encourage va…

    Submitted 2 March, 2023; originally announced March 2023.

  13. arXiv:2210.04277  [pdf, other]

    cs.NE cs.AI cs.LG cs.RO

    Boost Event-Driven Tactile Learning with Location Spiking Neurons

    Authors: Peng Kang, Srutarshi Banerjee, Henry Chopp, Aggelos Katsaggelos, Oliver Cossairt

    Abstract: Tactile sensing is essential for a variety of daily tasks. Recent advances in event-driven tactile sensors and Spiking Neural Networks (SNNs) spur research in related fields. However, SNN-enabled event-driven tactile learning is still in its infancy due to the limited representation abilities of existing spiking neurons and high spatio-temporal complexity in the event-driven tactile data.…

    Submitted 19 December, 2022; v1 submitted 9 October, 2022; originally announced October 2022.

    Comments: Under review. Please note that this paper is a journal extension of our previous conference paper: arXiv:2209.01080. Please check what we added in the introduction part

  14. arXiv:2209.01080  [pdf, other]

    cs.NE cs.AI cs.LG cs.RO

    Event-Driven Tactile Learning with Location Spiking Neurons

    Authors: Peng Kang, Srutarshi Banerjee, Henry Chopp, Aggelos Katsaggelos, Oliver Cossairt

    Abstract: The sense of touch is essential for a variety of daily tasks. New advances in event-based tactile sensors and Spiking Neural Networks (SNNs) spur research in event-driven tactile learning. However, SNN-enabled event-driven tactile learning is still in its infancy due to the limited representative abilities of existing spiking neurons and high spatio-temporal complexity in the data. In this pap…

    Submitted 23 July, 2022; originally announced September 2022.

    Comments: accepted by IJCNN 2022 (oral), the source code is available at https://github.com/pkang2017/TactileLocNeurons

  15. arXiv:2208.13022  [pdf, other]

    cs.IT

    Parity-Check Matrix Partitioning for Efficient Layered Decoding of QC-LDPC Codes

    Authors: Teng Lu, Xuan He, Peng Kang, Jiongyue Xing, Xiaohu Tang

    Abstract: In this paper, we consider how to partition the parity-check matrices (PCMs) to reduce the hardware complexity and computation delay for the row layered decoding of quasi-cyclic low-density parity-check (QC-LDPC) codes. First, we formulate the PCM partitioning as an optimization problem, which aims to minimize the maximum column weight of each layer while maintaining a block cyclic shift proper…

    Submitted 27 August, 2022; originally announced August 2022.

  16. arXiv:2207.03858  [pdf, other]

    cs.CL

    DSTEA: Improving Dialogue State Tracking via Entity Adaptive Pre-training

    Authors: Yukyung Lee, Takyoung Kim, Hoonsang Yoon, Pilsung Kang, Junseong Bang, Misuk Kim

    Abstract: Dialogue State Tracking (DST) is critical for comprehensively interpreting user and system utterances, thereby forming the cornerstone of efficient dialogue systems. Although past research has focused on enhancing DST performance through alterations to the model structure or integrating additional features like graph relations, these approaches often require additional pre-training with external dialogue c…

    Submitted 23 July, 2023; v1 submitted 8 July, 2022; originally announced July 2022.

    Journal ref: KnowledgeNLP@KDD2023

  17. arXiv:2206.09178  [pdf, other]

    cs.CV cs.AI

    REVECA -- Rich Encoder-decoder framework for Video Event CAptioner

    Authors: Jaehyuk Heo, YongGi Jeong, Sunwoo Kim, Jaehee Kim, Pilsung Kang

    Abstract: We describe an approach used in the Generic Boundary Event Captioning challenge at the Long-Form Video Understanding Workshop held at CVPR 2022. We designed a Rich Encoder-decoder framework for Video Event CAptioner (REVECA) that utilizes spatial and temporal information from the video to generate a caption for the corresponding event boundary. REVECA uses frame position embedding to incorpora…

    Submitted 18 June, 2022; originally announced June 2022.

    Comments: The IEEE/CVF Computer Vision and Pattern Recognition Conference (CVPR). LOng-form VidEo Understanding (LOVEU) workshop

  18. arXiv:2203.10808  [pdf, other]

    cs.CV cs.LG

    AnoViT: Unsupervised Anomaly Detection and Localization with Vision Transformer-based Encoder-Decoder

    Authors: Yunseung Lee, Pilsung Kang

    Abstract: Image anomaly detection problems aim to determine whether an image is abnormal, and to detect anomalous areas. These methods are actively used in various fields such as manufacturing, medical care, and intelligent information. Encoder-decoder structures have been widely used in the field of anomaly detection because they can easily learn normal patterns in an unsupervised learning environment and…

    Submitted 21 March, 2022; originally announced March 2022.

  19. arXiv:2203.03123  [pdf, other]

    cs.CL cs.AI

    Mismatch between Multi-turn Dialogue and its Evaluation Metric in Dialogue State Tracking

    Authors: Takyoung Kim, Hoonsang Yoon, Yukyung Lee, Pilsung Kang, Misuk Kim

    Abstract: Dialogue state tracking (DST) aims to extract essential information from multi-turn dialogue situations and take appropriate actions. A belief state, one of the core pieces of information, refers to the subject and its specific content, and appears in the form of domain-slot-value. The trained model predicts "accumulated" belief states in every turn, and joint goal accuracy and slot accuracy are m…

    Submitted 31 March, 2022; v1 submitted 6 March, 2022; originally announced March 2022.

    Comments: ACL 2022 (short)

  20. arXiv:2202.10001  [pdf, other]

    cs.LG cs.AI

    Recurrent Auto-Encoder With Multi-Resolution Ensemble and Predictive Coding for Multivariate Time-Series Anomaly Detection

    Authors: Heejeong Choi, Subin Kim, Pilsung Kang

    Abstract: As large-scale time-series data can easily be found in real-world applications, multivariate time-series anomaly detection has played an essential role in diverse industries. It enables productivity improvement and maintenance cost reduction by preventing malfunctions and detecting anomalies based on time-series data. However, multivariate time-series anomaly detection is challenging because real-…

    Submitted 21 February, 2022; originally announced February 2022.

  21. arXiv:2201.06071  [pdf, ps, other]

    cs.IT

    Memory Efficient Mutual Information-Maximizing Quantized Min-Sum Decoding for Rate-Compatible LDPC Codes

    Authors: Peng Kang, Kui Cai, Xuan He, Jinhong Yuan

    Abstract: In this letter, we propose a two-stage design method to construct a memory efficient mutual information-maximizing quantized min-sum (MIM-QMS) decoder for rate-compatible low-density parity-check (LDPC) codes. We first develop a modified density evolution to design a unique set of lookup tables (LUTs) that can be used for rate-compatible LDPC codes. The constructed LUTs are optimized based on their…

    Submitted 16 January, 2022; originally announced January 2022.

    Comments: This paper is an extended version of the manuscript submitted to IEEE Communications Letters

  22. arXiv:2111.09564  [pdf, other]

    cs.LG cs.CL

    LAnoBERT: System Log Anomaly Detection based on BERT Masked Language Model

    Authors: Yukyung Lee, Jina Kim, Pilsung Kang

    Abstract: The system log generated in a computer system refers to large-scale data that are collected simultaneously and used as the basic data for determining errors, intrusion and abnormal behaviors. The aim of system log anomaly detection is to promptly identify anomalies while minimizing human intervention, which is a critical problem in the industry. Previous studies performed anomaly detection through…

    Submitted 23 July, 2023; v1 submitted 18 November, 2021; originally announced November 2021.

  23. arXiv:2110.05172  [pdf, other]

    cs.CL

    K-Wav2vec 2.0: Automatic Speech Recognition based on Joint Decoding of Graphemes and Syllables

    Authors: Jounghee Kim, Pilsung Kang

    Abstract: Wav2vec 2.0 is an end-to-end framework of self-supervised learning for speech representation that is successful in automatic speech recognition (ASR), but most of the work on the topic has been developed with a single language: English. Therefore, it is unclear whether the self-supervised framework is effective in recognizing other languages with different writing systems, such as Korean which use…

    Submitted 11 October, 2021; originally announced October 2021.

    Comments: 13 pages, 4 figures

  24. arXiv:2108.12637  [pdf, other]

    cs.CL

    Oh My Mistake!: Toward Realistic Dialogue State Tracking including Turnback Utterances

    Authors: Takyoung Kim, Yukyung Lee, Hoonsang Yoon, Pilsung Kang, Junseong Bang, Misuk Kim

    Abstract: The primary purpose of dialogue state tracking (DST), a critical component of an end-to-end conversational system, is to build a model that responds well to real-world situations. Although we often change our minds from time to time during ordinary conversations, current benchmark datasets do not adequately reflect such occurrences and instead consist of over-simplified conversations, in which no…

    Submitted 12 October, 2022; v1 submitted 28 August, 2021; originally announced August 2021.

    Comments: SereTOD Workshop at EMNLP 2022

  25. arXiv:2107.10474  [pdf, other]

    cs.CL cs.LG

    Back-Translated Task Adaptive Pretraining: Improving Accuracy and Robustness on Text Classification

    Authors: Junghoon Lee, Jounghee Kim, Pilsung Kang

    Abstract: Pretraining language models (LMs) on a large text corpus and fine-tuning them on a downstream task has become a de facto training strategy for several natural language processing (NLP) tasks. Recently, an adaptive pretraining method retraining the pretrained language model with task-relevant data has shown significant performance improvements. However, current adap…

    Submitted 22 July, 2021; originally announced July 2021.

  26. arXiv:2105.02942  [pdf, other]

    cs.LG cs.CV

    Understanding Catastrophic Overfitting in Adversarial Training

    Authors: Peilin Kang, Seyed-Mohsen Moosavi-Dezfooli

    Abstract: Recently, FGSM adversarial training has been found to train a robust model that is comparable to one trained by PGD but an order of magnitude faster. However, there is a failure mode called catastrophic overfitting (CO), in which the classifier suddenly loses its robustness during training and hardly recovers by itself. In this paper, we find CO is not only limited to FGSM, but also happe…

    Submitted 6 May, 2021; originally announced May 2021.

  27. arXiv:2104.11421  [pdf, other]

    cs.LG eess.IV eess.SP

    A Framework for Recognizing and Estimating Human Concentration Levels

    Authors: Woodo Lee, Jakyung Koo, Nokyung Park, Pilgu Kang, Jeakwon Shim

    Abstract: One of the major tasks in online education is to estimate the concentration levels of each student. Previous studies have a limitation of classifying the levels using discrete states only. The purpose of this paper is to estimate the subtle levels as specified states by using the minimum amount of body movement data. This is done by a framework composed of a Deep Neural Network and Kalman Filter.…

    Submitted 23 April, 2021; originally announced April 2021.

  28. arXiv:2012.08888  [pdf, ps, other]

    cs.AI

    Solving the Travelling Thief Problem based on Item Selection Weight and Reverse Order Allocation

    Authors: Lei Yang, Zitong Zhang, Xiaotian Jia, Peipei Kang, Wensheng Zhang, Dongya Wang

    Abstract: The Travelling Thief Problem (TTP) is a challenging combinatorial optimization problem that attracts many scholars. The TTP interconnects two well-known NP-hard problems: the Travelling Salesman Problem (TSP) and the 0-1 Knapsack Problem (KP). An increasing number of algorithms have been proposed for solving this novel problem that combines two interdependent sub-problems. In this paper, TTP is investigated…

    Submitted 16 December, 2020; originally announced December 2020.

  29. arXiv:2011.13147  [pdf, other]

    cs.IT

    Generalized Mutual Information-Maximizing Quantized Decoding of LDPC Codes with Layered Scheduling

    Authors: Peng Kang, Kui Cai, Xuan He, Shuangyang Li, Jinhong Yuan

    Abstract: In this paper, we propose a framework of the mutual information-maximizing (MIM) quantized decoding for low-density parity-check (LDPC) codes by using simple mappings and fixed-point additions. Our decoding method is generic in the sense that it can be applied to LDPC codes with arbitrary degree distributions, and can be implemented based on either the belief propagation (BP) algorithm or the min-…

    Submitted 12 February, 2022; v1 submitted 26 November, 2020; originally announced November 2020.

    Comments: This paper is an extended version of the manuscript submitted to the IEEE Transactions on Vehicular Technology

  30. Multi$^2$OIE: Multilingual Open Information Extraction Based on Multi-Head Attention with BERT

    Authors: Youngbin Ro, Yukyung Lee, Pilsung Kang

    Abstract: In this paper, we propose Multi$^2$OIE, which performs open information extraction (open IE) by combining BERT with multi-head attention. Our model is a sequence-labeling system with an efficient and effective argument extraction method. We use a query, key, and value setting inspired by the Multimodal Transformer to replace the previously used bidirectional long short-term memory architecture wit…

    Submitted 7 October, 2020; v1 submitted 17 September, 2020; originally announced September 2020.

    Comments: 11 pages, Findings of EMNLP 2020

  31. arXiv:1906.09217  [pdf, other]

    cs.IR

    Hierarchical Gating Networks for Sequential Recommendation

    Authors: Chen Ma, Peng Kang, Xue Liu

    Abstract: The chronological order of user-item interactions is a key feature in many recommender systems, where the items that users will interact with may largely depend on the items they accessed recently. However, with the tremendous increase of users and items, sequential recommender systems still face several challenging problems: (1) the difficulty of modeling the long-term user interests from s…

    Submitted 21 June, 2019; originally announced June 2019.

    Comments: Accepted by the 25th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD 2019 Research Track), code available: https://github.com/allenjack/HGN

  32. arXiv:1904.06666  [pdf, other]

    cs.IT

    Mutual Information-Maximizing Quantized Belief Propagation Decoding of Regular LDPC Codes

    Authors: Xuan He, Kui Cai, Zhen Mei, Peng Kang, Xiaohu Tang

    Abstract: In this paper, we propose a class of finite alphabet iterative decoders (FAIDs), called mutual information-maximizing quantized belief propagation (MIM-QBP) decoders, for decoding regular low-density parity-check (LDPC) codes. Our decoder follows the reconstruction-calculation-quantization (RCQ) decoding architecture that is widely used in FAIDs. We present the first complete and systematic design fr…

    Submitted 16 December, 2022; v1 submitted 14 April, 2019; originally announced April 2019.

  33. arXiv:1901.05219  [pdf, other]

    cs.CL

    Sentence transition matrix: An efficient approach that preserves sentence semantics

    Authors: Myeongjun Jang, Pilsung Kang

    Abstract: Sentence embedding is a significant research topic in the field of natural language processing (NLP). Generating sentence embedding vectors reflecting the intrinsic meaning of a sentence is a key factor to achieve an enhanced performance in various NLP tasks such as sentence classification and document summarization. Therefore, various sentence embedding models based on supervised and unsupervised…

    Submitted 16 January, 2019; originally announced January 2019.

    Comments: 11 pages

  34. arXiv:1901.04268  [pdf, other]

    cs.IR cs.CV cs.MM

    Learning Shared Semantic Space with Correlation Alignment for Cross-modal Event Retrieval

    Authors: Zhenguo Yang, Zehang Lin, Peipei Kang, Jianming Lv, Qing Li, Wenyin Liu

    Abstract: In this paper, we propose to learn shared semantic space with correlation alignment (${S}^{3}CA$) for multimodal data representations, which aligns nonlinear correlations of multimodal data distributions in deep neural networks designed for heterogeneous data. In the context of cross-modal (event) retrieval, we design a neural network with convolutional layers and fully-connected layers to extract…

    Submitted 21 May, 2019; v1 submitted 14 January, 2019; originally announced January 2019.

    Comments: 22 pages, submitted to ACM Transactions on Multimedia Computing Communications and Applications(ACM TOMM)

  35. arXiv:1812.02869  [pdf, other]

    cs.IR

    Gated Attentive-Autoencoder for Content-Aware Recommendation

    Authors: Chen Ma, Peng Kang, Bin Wu, Qinglong Wang, Xue Liu

    Abstract: The rapid growth of Internet services and mobile devices provides an excellent opportunity to satisfy the strong demand for personalized item or product recommendation. However, with the tremendous increase of users and items, personalized recommender systems still face several challenging problems: (1) the difficulty of exploiting sparse implicit feedback; (2) the difficulty of combining hetero…

    Submitted 6 December, 2018; originally announced December 2018.

    Comments: Accepted by the 12th ACM International Conference on Web Search and Data Mining (WSDM 2019), code available: https://github.com/allenjack/GATE

  36. arXiv:1810.13111  [pdf, ps, other]

    cs.IT

    Enhanced Quasi-Maximum Likelihood Decoding of Short LDPC Codes based on Saturation

    Authors: Peng Kang, Yixuan Xie, Lei Yang, Chen Zheng, Jinhong Yuan, Yuejun Wei

    Abstract: In this paper, we propose an enhanced quasi-maximum likelihood (EQML) decoder for LDPC codes with short block lengths. After the failure of the conventional belief propagation (BP) decoding, the proposed EQML decoder selects unreliable variable nodes (VNs) and saturates their associated channel output values to generate a list of decoder input sequences. Each decoder input sequence in the list is…

    Submitted 30 January, 2019; v1 submitted 31 October, 2018; originally announced October 2018.

  37. arXiv:1808.05505  [pdf, other]

    cs.CL

    Paraphrase Thought: Sentence Embedding Module Imitating Human Language Recognition

    Authors: Myeongjun Jang, Pilsung Kang

    Abstract: Sentence embedding is an important research topic in natural language processing. It is essential to generate a good embedding vector that fully reflects the semantic meaning of a sentence in order to achieve an enhanced performance for various natural language processing tasks, such as machine translation and document classification. Thus far, various sentence embedding models have been proposed,…

    Submitted 14 October, 2018; v1 submitted 16 August, 2018; originally announced August 2018.

    Comments: 10 pages

  38. arXiv:1806.06190   

    cs.CR

    Recurrent neural network-based user authentication for freely typed keystroke data

    Authors: Junhong Kim, Pilsung Kang

    Abstract: Keystroke dynamics-based user authentication (KDA) based on long and freely typed text is an enhanced user authentication method that can not only identify the validity of current users during login but also continuously monitor the consistency of typing behavior after the login process. Previous long and freely typed text-based KDA methods had difficulty incorporating the key sequence informatio…

    Submitted 25 April, 2019; v1 submitted 16 June, 2018; originally announced June 2018.

    Comments: We found a code error

    MSC Class: 68T99

  39. arXiv:1805.02902  [pdf, ps, other]

    cs.IT

    Reliability-Based Windowed Decoding for Spatially-Coupled LDPC Codes

    Authors: Peng Kang, Yixuan Xie, Lei Yang, Jinhong Yuan

    Abstract: In this letter, we propose a reliability-based windowed decoding scheme for spatially-coupled (SC) low-density parity-check (LDPC) codes. To mitigate the error propagation along the sliding windowed decoder of the SC LDPC codes, a partial message reservation (PMR) method is proposed where only the reliable messages generated in the previous decoding window are reserved for the next decoding window…

    Submitted 8 May, 2018; v1 submitted 8 May, 2018; originally announced May 2018.

  40. arXiv:1802.03238  [pdf, other]

    cs.CL

    Recurrent Neural Network-Based Semantic Variational Autoencoder for Sequence-to-Sequence Learning

    Authors: Myeongjun Jang, Seungwan Seo, Pilsung Kang

    Abstract: Sequence-to-sequence (Seq2seq) models have played an important role in the recent success of various natural language processing methods, such as machine translation, text summarization, and speech recognition. However, current Seq2seq models have trouble preserving global latent information from a long sequence of words. Variational autoencoder (VAE) alleviates this problem by learning a continuo…

    Submitted 2 June, 2018; v1 submitted 9 February, 2018; originally announced February 2018.

    Comments: 14 pages

  41. arXiv:1709.09885  [pdf, other]

    cs.CL

    Sentiment Classification with Word Attention based on Weakly Supervised Learning with a Convolutional Neural Network

    Authors: Gichang Lee, Jaeyun Jeong, Seungwan Seo, CzangYeob Kim, Pilsung Kang

    Abstract: In order to maximize the applicability of sentiment analysis results, it is necessary to not only classify the overall sentiment (positive/negative) of a given document but also to identify the main words that contribute to the classification. However, most datasets for sentiment analysis only have the sentiment label for each document or sentence. In other words, there is no information about whi…

    Submitted 28 September, 2017; v1 submitted 28 September, 2017; originally announced September 2017.

    Comments: 16 pages

  42. arXiv:1709.00845  [pdf, other]

    cs.LG

    Semi-supervised Learning with Deep Generative Models for Asset Failure Prediction

    Authors: Andre S. Yoon, Taehoon Lee, Yongsub Lim, Deokwoo Jung, Philgyun Kang, Dongwon Kim, Keuntae Park, Yongjin Choi

    Abstract: This work presents a novel semi-supervised learning approach for data-driven modeling of asset failures when health status is only partially known in historical data. We combine a generative model parameterized by deep neural networks with a non-linear embedding technique. It allows us to build prognostic models with a limited amount of health status information for the precise prediction of futur…

    Submitted 4 September, 2017; originally announced September 2017.

    Comments: 9 pages, 6 figures, 1 table, KDD17 Workshop on Machine Learning for Prognostics and Health Management, August 13-17, 2017, Halifax, Nova Scotia, Canada

    ACM Class: I.2.6; H.2.8