Showing 1–40 of 40 results for author: Gu, G

  1. arXiv:2405.00029  [pdf, ps, other]

    cs.CV cs.IR

    Automatic Creative Selection with Cross-Modal Matching

    Authors: Alex Kim, Jia Huang, Rob Monarch, Jerry Kwac, Anikesh Kamath, Parmeshwar Khurd, Kailash Thiyagarajan, Goodman Gu

    Abstract: Application developers advertise their Apps by creating product pages with App images, and bidding on search terms. It is then crucial for App images to be highly relevant to the search terms. Solutions to this problem require an image-text matching model to predict the quality of the match between the chosen image and the search terms. In this work, we present a novel approach to matching an Ap…

    Submitted 28 February, 2024; originally announced May 2024.

  2. arXiv:2403.07240  [pdf, other]

    cs.CV

    Frequency-Aware Deepfake Detection: Improving Generalizability through Frequency Space Learning

    Authors: Chuangchuang Tan, Yao Zhao, Shikui Wei, Guanghua Gu, Ping Liu, Yunchao Wei

    Abstract: This research addresses the challenge of developing a universal deepfake detector that can effectively identify unseen deepfake images despite limited training data. Existing frequency-based paradigms have relied on frequency-level artifacts introduced during the up-sampling in GAN pipelines to detect forgeries. However, the rapid advancements in synthesis technology have led to specific artifacts…

    Submitted 11 March, 2024; originally announced March 2024.

    Comments: 9 pages, 4 figures, AAAI24

  3. arXiv:2402.10251  [pdf, other]

    q-bio.NC cs.AI cs.LG eess.SP

    Brant-2: Foundation Model for Brain Signals

    Authors: Zhizhang Yuan, Daoze Zhang, Junru Chen, Gefei Gu, Yang Yang

    Abstract: Foundational models benefit from pre-training on large amounts of unlabeled data and enable strong performance in a wide variety of applications with a small amount of labeled data. Such models can be particularly effective in analyzing brain signals, as this field encompasses numerous application scenarios, and it is costly to perform large-scale annotation. In this work, we present the largest f…

    Submitted 28 March, 2024; v1 submitted 15 February, 2024; originally announced February 2024.

    Comments: 14 pages, 7 figures

  4. arXiv:2312.10461  [pdf, other]

    cs.CV

    Rethinking the Up-Sampling Operations in CNN-based Generative Network for Generalizable Deepfake Detection

    Authors: Chuangchuang Tan, Huan Liu, Yao Zhao, Shikui Wei, Guanghua Gu, Ping Liu, Yunchao Wei

    Abstract: Recently, the proliferation of highly realistic synthetic images, facilitated through a variety of GANs and Diffusions, has significantly heightened the susceptibility to misuse. While the primary focus of deepfake detection has traditionally centered on the design of detection algorithms, an investigative inquiry into the generator architectures has remained conspicuously absent in recent years.…

    Submitted 20 December, 2023; v1 submitted 16 December, 2023; originally announced December 2023.

    Comments: 10 pages, 4 figures

  5. arXiv:2312.01998  [pdf, other]

    cs.CV cs.IR

    Language-only Efficient Training of Zero-shot Composed Image Retrieval

    Authors: Geonmo Gu, Sanghyuk Chun, Wonjae Kim, Yoohoon Kang, Sangdoo Yun

    Abstract: The composed image retrieval (CIR) task takes a composed query of image and text, aiming to search relative images for both conditions. Conventional CIR approaches need a training dataset composed of triplets of query image, query text, and target image, which is very expensive to collect. Several recent works have worked on the zero-shot (ZS) CIR paradigm to tackle the issue without using pre-collect…

    Submitted 31 March, 2024; v1 submitted 4 December, 2023; originally announced December 2023.

    Comments: CVPR 2024 camera-ready; First two authors contributed equally; 17 pages, 3.1MB

  6. arXiv:2312.01725  [pdf, other]

    cs.CV

    StableVITON: Learning Semantic Correspondence with Latent Diffusion Model for Virtual Try-On

    Authors: Jeongho Kim, Gyojung Gu, Minho Park, Sunghyun Park, Jaegul Choo

    Abstract: Given a clothing image and a person image, an image-based virtual try-on aims to generate a customized image that appears natural and accurately reflects the characteristics of the clothing image. In this work, we aim to expand the applicability of the pre-trained diffusion model so that it can be utilized independently for the virtual try-on task. The main challenge is to preserve the clothing det…

    Submitted 4 December, 2023; originally announced December 2023.

    Comments: 17 pages

  7. arXiv:2311.14246  [pdf, other]

    cs.CR

    Constant-Time Wasmtime, for Real This Time: End-to-End Verified Zero-Overhead Constant-Time Programming for the Web and Beyond

    Authors: Garrett Gu, Hovav Shacham

    Abstract: We claim that existing techniques and tools for generating and verifying constant-time code are incomplete, since they rely on assumptions that compiler optimization passes do not break constant-timeness or that certain operations execute in constant time on the hardware. We present the first end-to-end constant-time-aware compilation process that preserves constant-time semantics at every step fr…

    Submitted 23 November, 2023; originally announced November 2023.

  8. arXiv:2311.02396  [pdf, other]

    cs.RO

    Precise Robotic Needle-Threading with Tactile Perception and Reinforcement Learning

    Authors: Zhenjun Yu, Wenqiang Xu, Siqiong Yao, Jieji Ren, Tutian Tang, Yutong Li, Guoying Gu, Cewu Lu

    Abstract: This work presents a novel tactile perception-based method, named T-NT, for performing the needle-threading task, an application of deformable linear object (DLO) manipulation. This task is divided into two main stages: Tail-end Finding and Tail-end Insertion. In the first stage, the agent traces the contour of the thread twice using vision-based tactile sensors mounted on the gripper fingers. The…

    Submitted 4 November, 2023; originally announced November 2023.

  9. arXiv:2310.01740  [pdf, other]

    cs.RO

    Control of Soft Pneumatic Actuators with Approximated Dynamical Modeling

    Authors: Wu-Te Yang, Burak Kurkcu, Motohiro Hirao, Lingfeng Sun, Xinghao Zhu, Zhizhou Zhang, Grace X. Gu, Masayoshi Tomizuka

    Abstract: This paper introduces a full system modeling strategy for a syringe pump and soft pneumatic actuators (SPAs). The soft actuator is conceptualized as a beam structure, utilizing a second-order bending model. The equation of natural frequency is derived from Euler's bending theory, while the damping ratio is estimated by fitting step responses of soft pneumatic actuators. Evaluation of model uncertai…

    Submitted 19 October, 2023; v1 submitted 2 October, 2023; originally announced October 2023.

    Comments: 8 pages, 10 figures, accepted by 2023 IEEE ROBIO conference

  10. arXiv:2303.11916  [pdf, other]

    cs.CV cs.IR

    CompoDiff: Versatile Composed Image Retrieval With Latent Diffusion

    Authors: Geonmo Gu, Sanghyuk Chun, Wonjae Kim, HeeJae Jun, Yoohoon Kang, Sangdoo Yun

    Abstract: This paper proposes a novel diffusion-based model, CompoDiff, for solving zero-shot Composed Image Retrieval (ZS-CIR) with latent diffusion. This paper also introduces a new synthetic dataset, named SynthTriplets18M, with 18.8 million reference images, conditions, and corresponding target image triplets to train CIR models. CompoDiff and SynthTriplets18M tackle the shortages of the previous CIR ap…

    Submitted 16 July, 2024; v1 submitted 21 March, 2023; originally announced March 2023.

    Comments: TMLR camera-ready; First two authors contributed equally; TMLR Expert Certification; 30 pages, 5.9MB

  11. arXiv:2212.04114  [pdf, other]

    cs.CV

    Group Generalized Mean Pooling for Vision Transformer

    Authors: Byungsoo Ko, Han-Gyu Kim, Byeongho Heo, Sangdoo Yun, Sanghyuk Chun, Geonmo Gu, Wonjae Kim

    Abstract: Vision Transformer (ViT) extracts the final representation from either class token or an average of all patch tokens, following the architecture of Transformer in Natural Language Processing (NLP) or Convolutional Neural Networks (CNNs) in computer vision. However, studies for the best way of aggregating the patch tokens are still limited to average pooling, while widely-used pooling strategies, s…

    Submitted 8 December, 2022; originally announced December 2022.

  12. arXiv:2210.02254  [pdf, other]

    cs.CV

    Granularity-aware Adaptation for Image Retrieval over Multiple Tasks

    Authors: Jon Almazán, Byungsoo Ko, Geonmo Gu, Diane Larlus, Yannis Kalantidis

    Abstract: Strong image search models can be learned for a specific domain, i.e., a set of labels, provided that some labeled images of that domain are available. A practical visual search model, however, should be versatile enough to solve multiple retrieval tasks simultaneously, even if those cover very different specialized domains. Additionally, it should be able to benefit from even unlabeled images from th…

    Submitted 5 October, 2022; originally announced October 2022.

    Comments: ECCV 2022

  13. arXiv:2206.14180  [pdf, other]

    cs.CV cs.AI

    High-Resolution Virtual Try-On with Misalignment and Occlusion-Handled Conditions

    Authors: Sangyun Lee, Gyojung Gu, Sunghyun Park, Seunghwan Choi, Jaegul Choo

    Abstract: Image-based virtual try-on aims to synthesize an image of a person wearing a given clothing item. To solve the task, the existing methods warp the clothing item to fit the person's body and generate the segmentation map of the person wearing the item before fusing the item with the person. However, when the warping and the segmentation generation stages operate individually without information exc…

    Submitted 20 July, 2022; v1 submitted 28 June, 2022; originally announced June 2022.

    Comments: Accepted to ECCV 2022

  14. arXiv:2206.08585  [pdf, other]

    cs.CV

    HairFIT: Pose-Invariant Hairstyle Transfer via Flow-based Hair Alignment and Semantic-Region-Aware Inpainting

    Authors: Chaeyeon Chung, Taewoo Kim, Hyelin Nam, Seunghwan Choi, Gyojung Gu, Sunghyun Park, Jaegul Choo

    Abstract: Hairstyle transfer is the task of modifying a source hairstyle to a target one. Although recent hairstyle transfer models can reflect the delicate features of hairstyles, they still have two major limitations. First, the existing methods fail to transfer hairstyles when a source and a target image have different poses (e.g., viewing direction or face size), which is prevalent in the real world. Al…

    Submitted 17 June, 2022; originally announced June 2022.

    Comments: BMVC 2021 Oral Presentation

  15. arXiv:2203.14463  [pdf, other]

    cs.CV cs.CL

    Large-scale Bilingual Language-Image Contrastive Learning

    Authors: Byungsoo Ko, Geonmo Gu

    Abstract: This paper is a technical report to share our experience and findings in building a Korean and English bilingual multimodal model. While many of the multimodal datasets focus on English and multilingual multimodal research uses machine-translated texts, employing such machine-translated texts is limited to describing unique expressions, cultural information, and proper nouns in languages other than En…

    Submitted 14 April, 2022; v1 submitted 27 March, 2022; originally announced March 2022.

    Comments: Accepted by ICLRW2022

  16. arXiv:2112.08816  [pdf, other]

    cs.CV cs.IR

    Deep Hash Distillation for Image Retrieval

    Authors: Young Kyun Jang, Geonmo Gu, Byungsoo Ko, Isaac Kang, Nam Ik Cho

    Abstract: In hash-based image retrieval systems, degraded or transformed inputs usually generate different codes from the original, deteriorating the retrieval accuracy. To mitigate this issue, data augmentation can be applied during training. However, even if augmented samples of an image are similar in real feature space, the quantization can scatter them far away in Hamming space. This results in represe…

    Submitted 13 July, 2022; v1 submitted 16 December, 2021; originally announced December 2021.

    Comments: ECCV2022

  17. arXiv:2106.00186  [pdf, other]

    cs.CV cs.LG

    Towards Light-weight and Real-time Line Segment Detection

    Authors: Geonmo Gu, Byungsoo Ko, SeoungHyun Go, Sung-Hyun Lee, Jingeun Lee, Minchul Shin

    Abstract: Previous deep learning-based line segment detection (LSD) suffers from the immense model size and high computational cost for line prediction. This constrains them from real-time inference on computationally restricted environments. In this paper, we propose a real-time and light-weight line segment detector for resource-constrained environments named Mobile LSD (M-LSD). We design an extremely eff…

    Submitted 26 April, 2022; v1 submitted 31 May, 2021; originally announced June 2021.

    Comments: Accepted by AAAI2022

  18. arXiv:2104.03015  [pdf, other]

    cs.CV

    RTIC: Residual Learning for Text and Image Composition using Graph Convolutional Network

    Authors: Minchul Shin, Yoonjae Cho, Byungsoo Ko, Geonmo Gu

    Abstract: In this paper, we study the compositional learning of images and texts for image retrieval. The query is given in the form of an image and text that describes the desired modifications to the image; the goal is to retrieve the target image that satisfies the given modifications and resembles the query by composing information in both the text and image modalities. To remedy this, we propose a nove…

    Submitted 25 October, 2021; v1 submitted 7 April, 2021; originally announced April 2021.

  19. arXiv:2103.16940  [pdf, other]

    cs.CV cs.IR cs.LG

    Learning with Memory-based Virtual Classes for Deep Metric Learning

    Authors: Byungsoo Ko, Geonmo Gu, Han-Gyu Kim

    Abstract: The core of deep metric learning (DML) involves learning visual similarities in high-dimensional embedding space. One of the main challenges is to generalize from seen classes of training data to unseen classes of test data. Recent works have focused on exploiting past embeddings to increase the number of instances for the seen classes. Such methods achieve performance improvement via augmentation…

    Submitted 8 October, 2021; v1 submitted 31 March, 2021; originally announced March 2021.

    Comments: Accepted by ICCV2021

  20. arXiv:2103.15454  [pdf, other]

    cs.CV cs.IR cs.LG

    Proxy Synthesis: Learning with Synthetic Classes for Deep Metric Learning

    Authors: Geonmo Gu, Byungsoo Ko, Han-Gyu Kim

    Abstract: One of the main purposes of deep metric learning is to construct an embedding space that has well-generalized embeddings on both seen (training) classes and unseen (test) classes. Most existing works have tried to achieve this using different types of metric objectives and hard sample mining strategies with given training data. However, learning with only the training data can be overfitted to the…

    Submitted 29 March, 2021; originally announced March 2021.

    Comments: Accepted by AAAI2021

  21. arXiv:2103.11526  [pdf, other]

    cs.LG

    ExAD: An Ensemble Approach for Explanation-based Adversarial Detection

    Authors: Raj Vardhan, Ninghao Liu, Phakpoom Chinprutthiwong, Weijie Fu, Zhenyu Hu, Xia Ben Hu, Guofei Gu

    Abstract: Recent research has shown Deep Neural Networks (DNNs) to be vulnerable to adversarial examples that induce desired misclassifications in the models. Such risks impede the application of machine learning in security-sensitive domains. Several defense methods have been proposed against adversarial attacks to detect adversarial examples at test time or to make machine learning models more robust. How…

    Submitted 21 March, 2021; originally announced March 2021.

    Comments: 15 pages, 10 figures

  22. K-Hairstyle: A Large-scale Korean Hairstyle Dataset for Virtual Hair Editing and Hairstyle Classification

    Authors: Taewoo Kim, Chaeyeon Chung, Sunghyun Park, Gyojung Gu, Keonmin Nam, Wonzo Choe, Jaesung Lee, Jaegul Choo

    Abstract: The hair and beauty industry is a fast-growing industry. This led to the development of various applications, such as virtual hair dyeing or hairstyle transfer, to satisfy the customer's needs. Although several hairstyle datasets are available for these applications, they often consist of a relatively small number of images with low resolution, thus limiting their performance on high-quality hair…

    Submitted 9 October, 2021; v1 submitted 11 February, 2021; originally announced February 2021.

    Comments: ICIP 2021 final version

  23. arXiv:2101.04773  [pdf, other]

    cs.SD cs.AI cs.CR eess.AS

    Practical Speech Re-use Prevention in Voice-driven Services

    Authors: Yangyong Zhang, Maliheh Shirvanian, Sunpreet S. Arora, Jianwei Huang, Guofei Gu

    Abstract: Voice-driven services (VDS) are being used in a variety of applications ranging from smart home control to payments using digital assistants. The input to such services is often captured via an open voice channel, e.g., using a microphone, in an unsupervised setting. One of the key operational security requirements in such a setting is the freshness of the input speech. We present AEOLUS, a security…

    Submitted 12 January, 2021; originally announced January 2021.

  24. arXiv:2012.03283  [pdf, other]

    cs.CR cs.CY

    On the Privacy and Integrity Risks of Contact-Tracing Applications

    Authors: Jianwei Huang, Vinod Yegneswaran, Phillip Porras, Guofei Gu

    Abstract: Smartphone-based contact-tracing applications are at the epicenter of the global fight against the Covid-19 pandemic. While governments and healthcare agencies are eager to mandate the deployment of such applications en masse, they face increasing scrutiny from the popular press, security companies, and human rights watch agencies that fear the exploitation of these technologies as surveillance to…

    Submitted 8 December, 2020; v1 submitted 6 December, 2020; originally announced December 2020.

  25. arXiv:2011.12492  [pdf]

    cs.CV

    Multi-feature driven active contour segmentation model for infrared image with intensity inhomogeneity

    Authors: Qinyan Huang, Weiwen Zhou, Minjie Wan, Xin Chen, Qian Chen, Guohua Gu

    Abstract: Infrared (IR) image segmentation is essential in many urban defence applications, such as pedestrian surveillance, vehicle counting, security monitoring, etc. Active contour model (ACM) is one of the most widely used image segmentation tools at present, but the existing methods only utilize the local or global single feature information of image to minimize the energy function, which is easy to ca…

    Submitted 24 November, 2020; originally announced November 2020.

  26. arXiv:2010.11724  [pdf, other]

    cs.CV

    LID 2020: The Learning from Imperfect Data Challenge Results

    Authors: Yunchao Wei, Shuai Zheng, Ming-Ming Cheng, Hang Zhao, Liwei Wang, Errui Ding, Yi Yang, Antonio Torralba, Ting Liu, Guolei Sun, Wenguan Wang, Luc Van Gool, Wonho Bae, Junhyug Noh, Jinhwan Seo, Gunhee Kim, Hao Zhao, Ming Lu, Anbang Yao, Yiwen Guo, Yurong Chen, Li Zhang, Chuangchuang Tan, Tao Ruan, Guanghua Gu, et al. (10 additional authors not shown)

    Abstract: Learning from imperfect data becomes an issue in many industrial applications after the research community has made profound progress in supervised learning from perfectly annotated datasets. The purpose of the Learning from Imperfect Data (LID) workshop is to inspire and facilitate the research in developing novel approaches that would harness the imperfect data and improve the data-efficiency du…

    Submitted 17 October, 2020; originally announced October 2020.

    Comments: Summary of the 2nd Learning from Imperfect Data Workshop in conjunction with CVPR 2020

  27. arXiv:2010.05260  [pdf]

    cs.CV

    Infrared target tracking based on proximal robust principal component analysis method

    Authors: Chao Ma, Guohua Gu, Xin Miao, Minjie Wan, Weixian Qian, Kan Ren, Qian Chen

    Abstract: Infrared target tracking plays an important role in both civil and military fields. The main challenges in designing a robust and high-precision tracker for infrared sequences include overlap, occlusion and appearance change. To this end, this paper proposes an infrared target tracker based on proximal robust principal component analysis method. Firstly, the observation matrix is decomposed into a…

    Submitted 11 October, 2020; originally announced October 2020.

  28. arXiv:2007.14995  [pdf, other]

    cs.CR

    Return-Oriented Programming in RISC-V

    Authors: Garrett Gu, Hovav Shacham

    Abstract: RISC-V is an open-source hardware ISA based on the RISC design principles, and has been the subject of some novel ROP mitigation technique proposals due to its open-source nature. However, very little work has actually evaluated whether such an attack is feasible assuming a typical RISC-V implementation. We show that RISC-V ROP can be used to perform Turing complete calculation and arbitrary funct…

    Submitted 29 July, 2020; originally announced July 2020.

  29. arXiv:2003.02546  [pdf, other]

    cs.CV cs.IR cs.LG

    Embedding Expansion: Augmentation in Embedding Space for Deep Metric Learning

    Authors: Byungsoo Ko, Geonmo Gu

    Abstract: Learning the distance metric between pairs of samples has been studied for image retrieval and clustering. With the remarkable success of pair-based metric learning losses, recent works have proposed the use of generated synthetic points on metric learning losses for augmentation and generalization. However, these methods require additional generative networks along with the main network, which ca…

    Submitted 23 April, 2020; v1 submitted 5 March, 2020; originally announced March 2020.

    Comments: Accepted by CVPR 2020

  30. arXiv:2001.11658  [pdf, other]

    cs.CV cs.IR

    Symmetrical Synthesis for Deep Metric Learning

    Authors: Geonmo Gu, Byungsoo Ko

    Abstract: Deep metric learning aims to learn embeddings that contain semantic similarity information among data points. To learn better embeddings, methods to generate synthetic hard samples have been proposed. Existing methods of synthetic hard sample generation are adopting autoencoders or generative adversarial networks, but this leads to more hyper-parameters, harder optimization, and slower training sp…

    Submitted 23 April, 2020; v1 submitted 30 January, 2020; originally announced January 2020.

    Comments: Accepted by AAAI 2020

  31. arXiv:2001.06268  [pdf, ps, other]

    cs.CV

    Compounding the Performance Improvements of Assembled Techniques in a Convolutional Neural Network

    Authors: Jungkyu Lee, Taeryun Won, Tae Kwan Lee, Hyemin Lee, Geonmo Gu, Kiho Hong

    Abstract: Recent studies in image classification have demonstrated a variety of techniques for improving the performance of Convolutional Neural Networks (CNNs). However, attempts to combine existing techniques to create a practical model are still uncommon. In this study, we carry out extensive experiments to validate that carefully assembling these techniques and applying them to basic CNN models (e.g. Re…

    Submitted 13 March, 2020; v1 submitted 17 January, 2020; originally announced January 2020.

    Comments: 9 pages, 2 figures, 18 tables

  32. arXiv:1911.01644  [pdf, other]

    cs.DS

    Fast Multiple Pattern Cartesian Tree Matching

    Authors: Geonmo Gu, Siwoo Song, Simone Faro, Thierry Lecroq, Kunsoo Park

    Abstract: Cartesian tree matching is the problem of finding all substrings in a given text which have the same Cartesian trees as that of a given pattern. In this paper, we deal with Cartesian tree matching for the case of multiple patterns. We present two fingerprinting methods, i.e., the parent-distance encoding and the binary encoding. By combining an efficient fingerprinting method and a conventional mu…

    Submitted 5 November, 2019; originally announced November 2019.

    Comments: Submitted to WALCOM 2020

  33. arXiv:1907.11854  [pdf, other]

    cs.CV cs.IR cs.LG

    A Benchmark on Tricks for Large-scale Image Retrieval

    Authors: Byungsoo Ko, Minchul Shin, Geonmo Gu, HeeJae Jun, Tae Kwan Lee, Youngjoon Kim

    Abstract: Many studies have been performed on metric learning, which has become a key ingredient in top-performing methods of instance-level image retrieval. Meanwhile, less attention has been paid to pre-processing and post-processing tricks that can significantly boost performance. Furthermore, we found that most previous studies used small scale datasets to simplify processing. Because the behavior of a…

    Submitted 23 April, 2020; v1 submitted 27 July, 2019; originally announced July 2019.

  34. arXiv:1810.05399  [pdf]

    cs.CV

    Thermal Infrared Colorization via Conditional Generative Adversarial Network

    Authors: Xiaodong Kuang, Xiubao Sui, Chengwei Liu, Yuan Liu, Qian Chen, Guohua Gu

    Abstract: Transforming a thermal infrared image into a realistic RGB image is a challenging task. In this paper we propose a deep learning method to bridge this gap. We propose learning the transformation mapping using a coarse-to-fine generator that preserves the details. Since the standard mean squared loss cannot penalize the distance between colorized and ground truth images well, we propose a composite…

    Submitted 4 November, 2018; v1 submitted 12 October, 2018; originally announced October 2018.

  35. arXiv:1808.05336  [pdf, other]

    cs.RO cs.AI cs.CV

    Simultaneous Localization And Mapping with depth Prediction using Capsule Networks for UAVs

    Authors: Sunil Prakash, Gaelan Gu

    Abstract: In this paper, we propose a novel implementation of a simultaneous localization and mapping (SLAM) system based on a monocular camera from an unmanned aerial vehicle (UAV) using Depth prediction performed with Capsule Networks (CapsNet), which possess improvements over the drawbacks of the more widely-used Convolutional Neural Networks (CNN). An Extended Kalman Filter will assist in estimating th…

    Submitted 15 August, 2018; originally announced August 2018.

  36. arXiv:1804.09021  [pdf, other]

    cs.CL cs.AI

    Label-aware Double Transfer Learning for Cross-Specialty Medical Named Entity Recognition

    Authors: Zhenghui Wang, Yanru Qu, Liheng Chen, Jian Shen, Weinan Zhang, Shaodian Zhang, Yimei Gao, Gen Gu, Ken Chen, Yong Yu

    Abstract: We study the problem of named entity recognition (NER) from electronic medical records, which is one of the most fundamental and critical problems for medical text mining. Medical records which are written by clinicians from different specialties usually contain quite different terminologies and writing styles. The difference of specialties and the cost of human annotation make it particularly di…

    Submitted 28 April, 2018; v1 submitted 24 April, 2018; originally announced April 2018.

    Comments: NAACL HLT 2018

  37. arXiv:1712.03534  [pdf, other]

    cs.CV

    Dynamics Transfer GAN: Generating Video by Transferring Arbitrary Temporal Dynamics from a Source Video to a Single Target Image

    Authors: Wissam J. Baddar, Geonmo Gu, Sangmin Lee, Yong Man Ro

    Abstract: In this paper, we propose Dynamics Transfer GAN, a new method for generating video sequences based on generative adversarial learning. The spatial constructs of a generated video sequence are acquired from the target image. The dynamics of the generated video sequence are imported from a source video sequence, with arbitrary motion, and imposed onto the target image. To preserve the spatial constr…

    Submitted 10 December, 2017; originally announced December 2017.

  38. arXiv:1711.10267  [pdf, other]

    cs.CV

    Differential Generative Adversarial Networks: Synthesizing Non-linear Facial Variations with Limited Number of Training Data

    Authors: Geonmo Gu, Seong Tae Kim, Kihyun Kim, Wissam J. Baddar, Yong Man Ro

    Abstract: In face-related applications with a publicly available dataset, synthesizing non-linear facial variations (e.g., facial expression, head-pose, illumination, etc.) through a generative model is helpful in addressing the lack of training data. In reality, however, there is insufficient data to even train the generative model for face synthesis. In this paper, we propose Differential Generative Adversa…

    Submitted 28 December, 2017; v1 submitted 28 November, 2017; originally announced November 2017.

    Comments: 20 pages

  39. arXiv:1709.05961  [pdf, other]

    cs.CV

    Adaptive compressed 3D imaging based on wavelet trees and Hadamard multiplexing with a single photon counting detector

    Authors: Huidong Dai, Weiji He, Guohua Gu, Ling Ye, Tianyi Mao, Qian Chen

    Abstract: Photon counting 3D imaging makes it possible to obtain 3D images with single-photon sensitivity and sub-ns temporal resolution. However, it is challenging to scale to high spatial resolution. In this work, we demonstrate a photon counting 3D imaging technique with short-pulsed structured illumination and a single-pixel photon counting detector. The proposed multi-resolution photon counting 3D imaging techniq…

    Submitted 14 September, 2017; originally announced September 2017.

    Comments: 11 pages, 5 figures, 1 table

  40. arXiv:1506.05203  [pdf, other]

    cs.DS

    Fast Multiple Order-Preserving Matching Algorithms

    Authors: Myoungji Han, Munseong Kang, Sukhyeun Cho, Geonmo Gu, Jeong Seop Sim, Kunsoo Park

    Abstract: Given a text $T$ and a pattern $P$, the order-preserving matching problem is to find all substrings in $T$ which have the same relative orders as $P$. Order-preserving matching has been an active research area since it was introduced by Kubica et al. \cite{kubica2013linear} and Kim et al. \cite{kim2014order}. In this paper we present two algorithms for the multiple order-preserving matching proble…

    Submitted 17 June, 2015; originally announced June 2015.

    Comments: 15 pages, 8 figures, submitted to IWOCA 2015