
Showing 1–50 of 59 results for author: Ju, Y

  1. arXiv:2406.15638  [pdf, other]

    cs.NI cs.LG

    Root Cause Analysis of Anomalies in 5G RAN Using Graph Neural Network and Transformer

    Authors: Antor Hasan, Conrado Boeira, Khaleda Papry, Yue Ju, Zhongwen Zhu, Israat Haque

    Abstract: The emergence of 5G technology marks a significant milestone in developing telecommunication networks, enabling exciting new applications such as augmented reality and self-driving vehicles. However, these improvements bring an increased management complexity and a special concern in dealing with failures, as the applications 5G intends to support heavily rely on high network performance and low l…

    Submitted 21 June, 2024; originally announced June 2024.

  2. arXiv:2406.03287  [pdf, other]

    cs.NE cs.CL cs.LG

    SpikeLM: Towards General Spike-Driven Language Modeling via Elastic Bi-Spiking Mechanisms

    Authors: Xingrun Xing, Zheng Zhang, Ziyi Ni, Shitao Xiao, Yiming Ju, Siqi Fan, Yequan Wang, Jiajun Zhang, Guoqi Li

    Abstract: Towards energy-efficient artificial intelligence similar to the human brain, the bio-inspired spiking neural networks (SNNs) have advantages of biological plausibility, event-driven sparsity, and binary activation. Recently, large-scale language models exhibit promising generalization capability, making it a valuable issue to explore more general spike-driven models. However, the binary spikes in…

    Submitted 5 June, 2024; originally announced June 2024.

  3. arXiv:2405.10272  [pdf, other]

    cs.CV cs.AI cs.SD eess.AS eess.IV

    Faces that Speak: Jointly Synthesising Talking Face and Speech from Text

    Authors: Youngjoon Jang, Ji-Hoon Kim, Junseok Ahn, Doyeop Kwak, Hong-Sun Yang, Yoon-Cheol Ju, Il-Hwan Kim, Byeong-Yeol Kim, Joon Son Chung

    Abstract: The goal of this work is to simultaneously generate natural talking faces and speech outputs from text. We achieve this by integrating Talking Face Generation (TFG) and Text-to-Speech (TTS) systems into a unified framework. We address the main challenges of each task: (1) generating a range of head poses representative of real-world scenarios, and (2) ensuring voice consistency despite variations…

    Submitted 16 May, 2024; originally announced May 2024.

    Comments: CVPR 2024

  4. arXiv:2405.03775  [pdf, other]

    cs.CR

    Secure Inference for Vertically Partitioned Data Using Multiparty Homomorphic Encryption

    Authors: Shuangyi Chen, Yue Ju, Zhongwen Zhu, Ashish Khisti

    Abstract: We propose a secure inference protocol for a distributed setting involving a single server node and multiple client nodes. We assume that the observed data vector is partitioned across multiple client nodes while the deep learning model is located at the server node. Each client node is required to encrypt its portion of the data vector and transmit the resulting ciphertext to the server node. The…

    Submitted 6 May, 2024; originally announced May 2024.

  5. arXiv:2404.18033  [pdf, other]

    cs.CV

    Exposing Text-Image Inconsistency Using Diffusion Models

    Authors: Mingzhen Huang, Shan Jia, Zhou Zhou, Yan Ju, Jialing Cai, Siwei Lyu

    Abstract: In the battle against widespread online misinformation, a growing problem is text-image inconsistency, where images are misleadingly paired with texts with different intent or meaning. Existing classification-based methods for text-image inconsistency can identify contextual inconsistencies but fail to provide explainable justifications for their decisions that humans can understand. Although more…

    Submitted 27 April, 2024; originally announced April 2024.

  6. arXiv:2404.13146  [pdf, other]

    cs.CR cs.CV

    DeepFake-O-Meter v2.0: An Open Platform for DeepFake Detection

    Authors: Yan Ju, Chengzhe Sun, Shan Jia, Shuwei Hou, Zhaofeng Si, Soumyya Kanti Datta, Lipeng Ke, Riky Zhou, Anita Nikolich, Siwei Lyu

    Abstract: Deepfakes, as AI-generated media, have increasingly threatened media integrity and personal privacy with realistic yet fake digital content. In this work, we introduce an open-source and user-friendly online platform, DeepFake-O-Meter v2.0, that integrates state-of-the-art methods for detecting Deepfake images, videos, and audio. Built upon DeepFake-O-Meter v1.0, we have made significant upgrades…

    Submitted 27 June, 2024; v1 submitted 19 April, 2024; originally announced April 2024.

  7. arXiv:2404.10643  [pdf, other]

    cs.NI eess.SP

    A Calibrated and Automated Simulator for Innovations in 5G

    Authors: Conrado Boeira, Antor Hasan, Khaleda Papry, Yue Ju, Zhongwen Zhu, Israat Haque

    Abstract: The rise of 5G deployments has created the environment for many emerging technologies to flourish. Self-driving vehicles, Augmented and Virtual Reality, and remote operations are examples of applications that leverage 5G networks' support for extremely low latency, high bandwidth, and increased throughput. However, the complex architecture of 5G hinders innovation due to the lack of accessibility…

    Submitted 16 April, 2024; originally announced April 2024.

  8. RMAFF-PSN: A Residual Multi-Scale Attention Feature Fusion Photometric Stereo Network

    Authors: Kai Luo, Yakun Ju, Lin Qi, Kaixuan Wang, Junyu Dong

    Abstract: Predicting accurate normal maps of objects from two-dimensional images in regions of complex structure and spatial material variations is challenging using photometric stereo methods due to the influence of surface reflection properties caused by variations in object geometry and surface materials. To address this issue, we propose a photometric stereo network called a RMAFF-PSN that uses residual…

    Submitted 14 April, 2024; v1 submitted 11 April, 2024; originally announced April 2024.

    Comments: 17 pages,12 figures

    Journal ref: Photonics 2023,10(5),548

  9. arXiv:2403.14077  [pdf, other]

    cs.AI cs.CR

    Can ChatGPT Detect DeepFakes? A Study of Using Multimodal Large Language Models for Media Forensics

    Authors: Shan Jia, Reilin Lyu, Kangran Zhao, Yize Chen, Zhiyuan Yan, Yan Ju, Chuanbo Hu, Xin Li, Baoyuan Wu, Siwei Lyu

    Abstract: DeepFakes, which refer to AI-generated media content, have become an increasing concern due to their use as a means for disinformation. Detecting DeepFakes is currently solved with programmed machine learning algorithms. In this work, we investigate the capabilities of multimodal large language models (LLMs) in DeepFake detection. We conducted qualitative and quantitative experiments to demonstrat…

    Submitted 11 June, 2024; v1 submitted 20 March, 2024; originally announced March 2024.

  10. arXiv:2402.17229  [pdf, other]

    cs.CV cs.CY cs.LG

    Preserving Fairness Generalization in Deepfake Detection

    Authors: Li Lin, Xinan He, Yan Ju, Xin Wang, Feng Ding, Shu Hu

    Abstract: Although effective deepfake detection models have been developed in recent years, recent studies have revealed that these models can result in unfair performance disparities among demographic groups, such as race and gender. This can lead to particular groups facing unfair targeting or exclusion from detection, potentially allowing misclassified deepfakes to manipulate public opinion and undermine…

    Submitted 27 February, 2024; originally announced February 2024.

    Comments: Accepted by The IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2024)

  11. Explaining generative diffusion models via visual analysis for interpretable decision-making process

    Authors: Ji-Hoon Park, Yeong-Joon Ju, Seong-Whan Lee

    Abstract: Diffusion models have demonstrated remarkable performance in generation tasks. Nevertheless, explaining the diffusion process remains challenging due to it being a sequence of denoising noisy images that are difficult for experts to interpret. To address this issue, we propose the three research questions to interpret the diffusion process from the perspective of the visual concepts generated by t…

    Submitted 15 February, 2024; originally announced February 2024.

    Comments: 22 pages, published in Expert Systems with Applications

    MSC Class: 68T01

    Journal ref: Expert Systems with Applications 248 (2024) 123231

  12. arXiv:2401.07532  [pdf, other]

    cs.SD cs.AI eess.AS

    Multi-view MidiVAE: Fusing Track- and Bar-view Representations for Long Multi-track Symbolic Music Generation

    Authors: Zhiwei Lin, Jun Chen, Boshi Tang, Binzhu Sha, Jing Yang, Yaolong Ju, Fan Fan, Shiyin Kang, Zhiyong Wu, Helen Meng

    Abstract: Variational Autoencoders (VAEs) constitute a crucial component of neural symbolic music generation, among which some works have yielded outstanding results and attracted considerable attention. Nevertheless, previous VAEs still encounter issues with overly long feature sequences and generated results lack contextual coherence, thus the challenge of modeling long multi-track symbolic music still re…

    Submitted 15 January, 2024; originally announced January 2024.

    Comments: Accepted by ICASSP 2024

  13. arXiv:2401.07487  [pdf, other]

    cs.RO cs.CV

    Robo-ABC: Affordance Generalization Beyond Categories via Semantic Correspondence for Robot Manipulation

    Authors: Yuanchen Ju, Kaizhe Hu, Guowei Zhang, Gu Zhang, Mingrun Jiang, Huazhe Xu

    Abstract: Enabling robotic manipulation that generalizes to out-of-distribution scenes is a crucial step toward open-world embodied intelligence. For human beings, this ability is rooted in the understanding of semantic correspondence among objects, which naturally transfers the interaction experience of familiar objects to novel ones. Although robots lack such a reservoir of interaction experience, the vas…

    Submitted 15 January, 2024; originally announced January 2024.

  14. arXiv:2311.07126  [pdf, other]

    cs.LG

    How to Do Machine Learning with Small Data? -- A Review from an Industrial Perspective

    Authors: Ivan Kraljevski, Yong Chul Ju, Dmitrij Ivanov, Constanze Tschöpe, Matthias Wolff

    Abstract: Artificial intelligence experienced a technological breakthrough in science, industry, and everyday life in the recent few decades. The advancements can be credited to the ever-increasing availability and miniaturization of computational resources that resulted in exponential data growth. However, because of the insufficient amount of data in some cases, employing machine learning in solving compl…

    Submitted 13 November, 2023; originally announced November 2023.

  15. arXiv:2310.07184  [pdf, other]

    cs.CV cs.LG

    NeuroInspect: Interpretable Neuron-based Debugging Framework through Class-conditional Visualizations

    Authors: Yeong-Joon Ju, Ji-Hoon Park, Seong-Whan Lee

    Abstract: Despite deep learning (DL) has achieved remarkable progress in various domains, the DL models are still prone to making mistakes. This issue necessitates effective debugging tools for DL practitioners to interpret the decision-making process within the networks. However, existing debugging methods often demand extra data or adjustments to the decision process, limiting their applicability. To tack…

    Submitted 17 October, 2023; v1 submitted 11 October, 2023; originally announced October 2023.

  16. arXiv:2309.16535  [pdf, other]

    cs.CL cs.AI

    KLoB: a Benchmark for Assessing Knowledge Locating Methods in Language Models

    Authors: Yiming Ju, Zheng Zhang

    Abstract: Recently, Locate-Then-Edit paradigm has emerged as one of the main approaches in changing factual knowledge stored in the Language models. However, there is a lack of research on whether present locating methods can pinpoint the exact parameters embedding the desired knowledge. Moreover, although many researchers have questioned the validity of locality hypothesis of factual knowledge, no method i…

    Submitted 28 September, 2023; originally announced September 2023.

  17. arXiv:2309.16388  [pdf, other]

    cs.CV

    Exposing Image Splicing Traces in Scientific Publications via Uncertainty-guided Refinement

    Authors: Xun Lin, Wenzhong Tang, Haoran Wang, Yizhong Liu, Yakun Ju, Shuai Wang, Zitong Yu

    Abstract: Recently, a surge in scientific publications suspected of image manipulation has led to numerous retractions, bringing the issue of image integrity into sharp focus. Although research on forensic detectors for image plagiarism and image synthesis exists, the detection of image splicing traces in scientific publications remains unexplored. Compared to image duplication and synthesis, image splicing…

    Submitted 18 April, 2024; v1 submitted 28 September, 2023; originally announced September 2023.

  18. arXiv:2309.07293  [pdf]

    cs.CV eess.IV

    GAN-based Algorithm for Efficient Image Inpainting

    Authors: Zhengyang Han, Zehao Jiang, Yuan Ju

    Abstract: Global pandemic due to the spread of COVID-19 has post challenges in a new dimension on facial recognition, where people start to wear masks. Under such condition, the authors consider utilizing machine learning in image inpainting to tackle the problem, by complete the possible face that is originally covered in mask. In particular, autoencoder has great potential on retaining important, general…

    Submitted 13 September, 2023; originally announced September 2023.

    Comments: 6 pages, 3 figures

    MSC Class: 68U10

    Journal ref: The 3rd International Conference on Artificial Intelligence and Computer Engineering(ICAICE 2022)

  19. arXiv:2308.16584  [pdf, other]

    cs.CL

    Unsupervised Text Style Transfer with Deep Generative Models

    Authors: Zhongtao Jiang, Yuanzhe Zhang, Yiming Ju, Kang Liu

    Abstract: We present a general framework for unsupervised text style transfer with deep generative models. The framework models each sentence-label pair in the non-parallel corpus as partially observed from a complete quadruplet which additionally contains two latent codes representing the content and style, respectively. These codes are learned by exploiting dependencies inside the observed data. Then a se…

    Submitted 31 August, 2023; originally announced August 2023.

  20. EdgeMatrix: A Resource-Redefined Scheduling Framework for SLA-Guaranteed Multi-Tier Edge-Cloud Computing Systems

    Authors: Shihao Shen, Yuanming Ren, Yanli Ju, Xiaofei Wang, Wenyu Wang, Victor C. M. Leung

    Abstract: With the development of networking technology, the computing system has evolved towards the multi-tier paradigm gradually. However, challenges, such as multi-resource heterogeneity of devices, resource competition of services, and networked system dynamics, make it difficult to guarantee service-level agreement (SLA) for the applications. In this paper, we propose a multi-tier edge-cloud computing…

    Submitted 1 August, 2023; originally announced August 2023.

    Comments: JSAC. arXiv admin note: substantial text overlap with arXiv:2203.10470

  21. arXiv:2306.16857  [pdf, other]

    cs.RO cs.LG

    ArrayBot: Reinforcement Learning for Generalizable Distributed Manipulation through Touch

    Authors: Zhengrong Xue, Han Zhang, Jingwen Cheng, Zhengmao He, Yuanchen Ju, Changyi Lin, Gu Zhang, Huazhe Xu

    Abstract: We present ArrayBot, a distributed manipulation system consisting of a $16 \times 16$ array of vertically sliding pillars integrated with tactile sensors, which can simultaneously support, perceive, and manipulate the tabletop objects. Towards generalizable distributed manipulation, we leverage reinforcement learning (RL) algorithms for the automatic discovery of control policies. In the face of t…

    Submitted 29 June, 2023; originally announced June 2023.

  22. arXiv:2306.16635  [pdf, other]

    cs.CV cs.CY cs.LG

    Improving Fairness in Deepfake Detection

    Authors: Yan Ju, Shu Hu, Shan Jia, George H. Chen, Siwei Lyu

    Abstract: Despite the development of effective deepfake detectors in recent years, recent studies have demonstrated that biases in the data used to train these detectors can lead to disparities in detection accuracy across different races and genders. This can result in different groups being unfairly targeted or excluded from detection, allowing undetected deepfakes to manipulate public opinion and erode t…

    Submitted 8 November, 2023; v1 submitted 28 June, 2023; originally announced June 2023.

  23. arXiv:2306.16250  [pdf, other]

    cs.SD eess.AS

    MC-SpEx: Towards Effective Speaker Extraction with Multi-Scale Interfusion and Conditional Speaker Modulation

    Authors: Jun Chen, Wei Rao, Zilin Wang, Jiuxin Lin, Yukai Ju, Shulin He, Yannan Wang, Zhiyong Wu

    Abstract: The previous SpEx+ has yielded outstanding performance in speaker extraction and attracted much attention. However, it still encounters inadequate utilization of multi-scale information and speaker embedding. To this end, this paper proposes a new effective speaker extraction system with multi-scale interfusion and conditional speaker modulation (ConSM), which is called MC-SpEx. First of all, we d…

    Submitted 28 June, 2023; originally announced June 2023.

    Comments: Accepted by InterSpeech 2023

  24. arXiv:2304.06870  [pdf, other]

    cs.CV

    AutoSplice: A Text-prompt Manipulated Image Dataset for Media Forensics

    Authors: Shan Jia, Mingzhen Huang, Zhou Zhou, Yan Ju, Jialing Cai, Siwei Lyu

    Abstract: Recent advancements in language-image models have led to the development of highly realistic images that can be generated from textual descriptions. However, the increased visual quality of these generated images poses a potential threat to the field of media forensics. This paper aims to investigate the level of challenge that language-image generation models pose to media forensics. To achieve t…

    Submitted 13 April, 2023; originally announced April 2023.

  25. arXiv:2303.07704  [pdf, other]

    eess.AS cs.SD

    TEA-PSE 3.0: Tencent-Ethereal-Audio-Lab Personalized Speech Enhancement System For ICASSP 2023 DNS Challenge

    Authors: Yukai Ju, Jun Chen, Shimin Zhang, Shulin He, Wei Rao, Weixin Zhu, Yannan Wang, Tao Yu, Shidong Shang

    Abstract: This paper introduces the Unbeatable Team's submission to the ICASSP 2023 Deep Noise Suppression (DNS) Challenge. We expand our previous work, TEA-PSE, to its upgraded version -- TEA-PSE 3.0. Specifically, TEA-PSE 3.0 incorporates a residual LSTM after squeezed temporal convolution network (S-TCN) to enhance sequence modeling capabilities. Additionally, the local-global representation (LGR) struct…

    Submitted 14 March, 2023; originally announced March 2023.

    Comments: Accepted by ICASSP 2023

  26. arXiv:2302.14370  [pdf, other]

    cs.SD cs.AI eess.AS eess.SP

    CrossSpeech: Speaker-independent Acoustic Representation for Cross-lingual Speech Synthesis

    Authors: Ji-Hoon Kim, Hong-Sun Yang, Yoon-Cheol Ju, Il-Hwan Kim, Byeong-Yeol Kim

    Abstract: While recent text-to-speech (TTS) systems have made remarkable strides toward human-level quality, the performance of cross-lingual TTS lags behind that of intra-lingual TTS. This gap is mainly rooted from the speaker-language entanglement problem in cross-lingual TTS. In this paper, we propose CrossSpeech which improves the quality of cross-lingual speech by effectively disentangling speaker and…

    Submitted 12 June, 2023; v1 submitted 28 February, 2023; originally announced February 2023.

    Comments: Accepted to ICASSP 2023

  27. arXiv:2212.08414  [pdf, other]

    cs.CV cs.AI

    Deep Learning Methods for Calibrated Photometric Stereo and Beyond

    Authors: Yakun Ju, Kin-Man Lam, Wuyuan Xie, Huiyu Zhou, Junyu Dong, Boxin Shi

    Abstract: Photometric stereo recovers the surface normals of an object from multiple images with varying shading cues, i.e., modeling the relationship between surface orientation and intensity at each pixel. Photometric stereo prevails in superior per-pixel resolution and fine reconstruction details. However, it is a complicated problem because of the non-linear relationship caused by non-Lambertian surface…

    Submitted 1 February, 2024; v1 submitted 16 December, 2022; originally announced December 2022.

    Comments: 19 pages, 11 figures, 4 tables

  28. Personalized local heating neutralizing individual, spatial and temporal thermo-physiological variances in extreme cold environments

    Authors: Yi Ju, Xinyuan Ju, Hui Zhang, Bin Cao, Bin Liu, Yingxin Zhu

    Abstract: In this paper, we investigate the feasibility, robustness and optimization of introducing personal comfort systems (PCS), apparatuses that promises in energy saving and comfort improvement, into a broader range of environments. We report a series of laboratory experiments systematically examining the effect of personalized heating in neutralizing individual, spatial and temporal variations of ther…

    Submitted 27 December, 2022; v1 submitted 11 December, 2022; originally announced December 2022.

    Journal ref: Building and Environment, 109950 (2022)

  29. arXiv:2212.00482  [pdf, other]

    cs.CL

    IRRGN: An Implicit Relational Reasoning Graph Network for Multi-turn Response Selection

    Authors: Jingcheng Deng, Hengwei Dai, Xuewei Guo, Yuanchen Ju, Wei Peng

    Abstract: The task of response selection in multi-turn dialogue is to find the best option from all candidates. In order to improve the reasoning ability of the model, previous studies pay more attention to using explicit algorithms to model the dependencies between utterances, which are deterministic, limited and inflexible. In addition, few studies consider differences between the options before and after…

    Submitted 23 October, 2023; v1 submitted 1 December, 2022; originally announced December 2022.

    Comments: Accepted by EMNLP 2022

  30. arXiv:2211.08615  [pdf, other]

    cs.CV

    GLFF: Global and Local Feature Fusion for AI-synthesized Image Detection

    Authors: Yan Ju, Shan Jia, Jialing Cai, Haiying Guan, Siwei Lyu

    Abstract: With the rapid development of deep generative models (such as Generative Adversarial Networks and Diffusion models), AI-synthesized images are now of such high quality that humans can hardly distinguish them from pristine ones. Although existing detection methods have shown high performance in specific evaluation settings, e.g., on images from seen models or on images without real-world post-proce…

    Submitted 4 September, 2023; v1 submitted 15 November, 2022; originally announced November 2022.

    Comments: 13 pages, 6 figures, 8 tables

  31. arXiv:2210.15853  [pdf, other]

    cs.SD eess.AS

    Speech Enhancement with Intelligent Neural Homomorphic Synthesis

    Authors: Shulin He, Wei Rao, Jinjiang Liu, Jun Chen, Yukai Ju, Xueliang Zhang, Yannan Wang, Shidong Shang

    Abstract: Most neural network speech enhancement models ignore speech production mathematical models by directly mapping Fourier transform spectrums or waveforms. In this work, we propose a neural source filter network for speech enhancement. Specifically, we use homomorphic signal processing and cepstral analysis to obtain noisy speech's excitation and vocal tract. Unlike traditional signal processing, we…

    Submitted 27 October, 2022; originally announced October 2022.

    Comments: Submitted to ICASSP 2023

  32. arXiv:2210.15849  [pdf, ps, other]

    cs.SD eess.AS

    Hierarchical speaker representation for target speaker extraction

    Authors: Shulin He, Huaiwen Zhang, Wei Rao, Kanghao Zhang, Yukai Ju, Yang Yang, Xueliang Zhang

    Abstract: Target speaker extraction aims to isolate a specific speaker's voice from a composite of multiple sound sources, guided by an enrollment utterance or called anchor. Current methods predominantly derive speaker embeddings from the anchor and integrate them into the separation network to separate the voice of the target speaker. However, the representation of the speaker embedding is too simplistic,…

    Submitted 4 January, 2024; v1 submitted 27 October, 2022; originally announced October 2022.

    Comments: Accepted to ICASSP 2024

  33. arXiv:2210.13270  [pdf, other]

    cs.CL cs.LG

    Generating Hierarchical Explanations on Text Classification Without Connecting Rules

    Authors: Yiming Ju, Yuanzhe Zhang, Kang Liu, Jun Zhao

    Abstract: The opaqueness of deep NLP models has motivated the development of methods for interpreting how deep models predict. Recently, work has introduced hierarchical attribution, which produces a hierarchical clustering of words, along with an attribution score for each cluster. However, existing work on hierarchical attribution all follows the connecting rule, limiting the cluster to a continuous span…

    Submitted 24 October, 2022; originally announced October 2022.

  34. arXiv:2210.03027  [pdf, other]

    cs.SD eess.AS

    AnimeTAB: A new guitar tablature dataset of anime and game music

    Authors: Yuecheng Zhou, Yaolong Ju, Lingyun Xie

    Abstract: While guitar tablature has become a popular topic in MIR research, there exists no such a guitar tablature dataset that focuses on the soundtracks of anime and video games, which have a surprisingly broad and growing audience among the youths. In this paper, we present AnimeTAB, a fingerstyle guitar tablature dataset in MusicXML format, which provides more high-quality guitar tablature for both re…

    Submitted 6 October, 2022; originally announced October 2022.

  35. arXiv:2206.14195  [pdf, other]

    cs.CV cs.RO

    Pedestrian 3D Bounding Box Prediction

    Authors: Saeed Saadatnejad, Yi Zhou Ju, Alexandre Alahi

    Abstract: Safety is still the main issue of autonomous driving, and in order to be globally deployed, they need to predict pedestrians' motions sufficiently in advance. While there is a lot of research on coarse-grained (human center prediction) and fine-grained predictions (human body keypoints prediction), we focus on 3D bounding boxes, which are reasonable estimates of humans without modeling complex mot…

    Submitted 28 June, 2022; originally announced June 2022.

    Comments: Accepted and published in hEART2022 (the 10th Symposium of the European Association for Research in Transportation): http://www.heart-web.org/

  36. arXiv:2205.15195  [pdf, other]

    cs.SD eess.AS

    Personalized Acoustic Echo Cancellation for Full-duplex Communications

    Authors: Shimin Zhang, Ziteng Wang, Yukai Ju, Yihui Fu, Yueyue Na, Qiang Fu, Lei Xie

    Abstract: Deep neural networks (DNNs) have shown promising results for acoustic echo cancellation (AEC). But the DNN-based AEC models let through all near-end speakers including the interfering speech. In light of recent studies on personalized speech enhancement, we investigate the feasibility of personalized acoustic echo cancellation (PAEC) in this paper for full-duplex communications, where background n…

    Submitted 29 June, 2022; v1 submitted 30 May, 2022; originally announced May 2022.

    Comments: submitted to INTERSPEECH 22

  37. arXiv:2204.11475  [pdf]

    cs.RO

    Adaptive actuation of magnetic soft robots using deep reinforcement learning

    Authors: Jianpeng Yao, Quanliang Cao, Yuwei Ju, Yuxuan Sun, Ruiqi Liu, Xiaotao Han, Liang Li

    Abstract: Magnetic soft robots have attracted growing interest due to their unique advantages in terms of untethered actuation and excellent controllability. However, finding the required magnetization patterns or magnetic fields to achieve the desired functions of these robots is quite challenging in many cases. No unified framework for design has been proposed yet, and existing methods mainly rely on manu…

    Submitted 25 April, 2022; originally announced April 2022.

  38. arXiv:2203.13964  [pdf, other]

    cs.CV

    Fusing Global and Local Features for Generalized AI-Synthesized Image Detection

    Authors: Yan Ju, Shan Jia, Lipeng Ke, Hongfei Xue, Koki Nagano, Siwei Lyu

    Abstract: With the development of the Generative Adversarial Networks (GANs) and DeepFakes, AI-synthesized images are now of such high quality that humans can hardly distinguish them from real images. It is imperative for media forensics to develop detectors to expose them accurately. Existing detection methods have shown high performance in generated images detection, but they tend to generalize poorly in…

    Submitted 22 November, 2022; v1 submitted 25 March, 2022; originally announced March 2022.

    Comments: 5 pages, 3 figures, 2 tables

  39. arXiv:2203.10470  [pdf, other]

    cs.NI cs.DC

    EdgeMatrix: A Resources Redefined Edge-Cloud System for Prioritized Services

    Authors: Yuanming Ren, Shihao Shen, Yanli Ju, Xiaofei Wang, Wenyu Wang, Victor C. M. Leung

    Abstract: The edge-cloud system has the potential to combine the advantages of heterogeneous devices and truly realize ubiquitous computing. However, for service providers to guarantee the Service-Level-Agreement (SLA) priorities, the complex networked environment brings inherent challenges such as multi-resource heterogeneity, resource competition, and networked system dynamics. In this paper, we design a…

    Submitted 20 March, 2022; originally announced March 2022.

  40. arXiv:2201.02025  [pdf, other]

    cs.LG math.OC

    A deep learning-based model reduction (DeePMR) method for simplifying chemical kinetics

    Authors: Zhiwei Wang, Yaoyu Zhang, Enhan Zhao, Yiguang Ju, Weinan E, Zhi-Qin John Xu, Tianhan Zhang

    Abstract: A deep learning-based model reduction (DeePMR) method for simplifying chemical kinetics is proposed and validated using high-temperature auto-ignitions, perfectly stirred reactors (PSR), and one-dimensional freely propagating flames of n-heptane/air mixtures. The mechanism reduction is modeled as an optimization problem on Boolean space, where a Boolean vector, each entry corresponding to a specie…

    Submitted 8 September, 2022; v1 submitted 6 January, 2022; originally announced January 2022.

  41. arXiv:2112.15087  [pdf, other]

    cs.LG cs.AI

    ChunkFormer: Learning Long Time Series with Multi-stage Chunked Transformer

    Authors: Yue Ju, Alka Isac, Yimin Nie

    Abstract: The analysis of long sequence data remains challenging in many real-world applications. We propose a novel architecture, ChunkFormer, that improves the existing Transformer framework to handle the challenges while dealing with long time series. Original Transformer-based models adopt an attention mechanism to discover global information along a sequence to leverage the contextual data. Long sequen…

    Submitted 30 December, 2021; originally announced December 2021.

    Comments: 7 pages, 4 figures

  42. arXiv:2110.07840  [pdf, other]

    cs.CL cs.SD eess.AS

    ESPnet2-TTS: Extending the Edge of TTS Research

    Authors: Tomoki Hayashi, Ryuichi Yamamoto, Takenori Yoshimura, Peter Wu, Jiatong Shi, Takaaki Saeki, Yooncheol Ju, Yusuke Yasuda, Shinnosuke Takamichi, Shinji Watanabe

    Abstract: This paper describes ESPnet2-TTS, an end-to-end text-to-speech (E2E-TTS) toolkit. ESPnet2-TTS extends our earlier version, ESPnet-TTS, by adding many new features, including: on-the-fly flexible pre-processing, joint training with neural vocoders, and state-of-the-art TTS models with extensions like full-band E2E text-to-waveform modeling, which simplify the training pipeline and further enhance T…

    Submitted 14 October, 2021; originally announced October 2021.

    Comments: Submitted to ICASSP2022. Demo HP: https://espnet.github.io/icassp2022-tts/

  43. arXiv:2109.05463  [pdf, other]

    cs.LG cs.AI cs.CL

    The Logic Traps in Evaluating Post-hoc Interpretations

    Authors: Yiming Ju, Yuanzhe Zhang, Zhao Yang, Zhongtao Jiang, Kang Liu, Jun Zhao

    Abstract: Post-hoc interpretation aims to explain a trained model and reveal how the model arrives at a decision. Though research on post-hoc interpretations has developed rapidly, one growing pain in this field is the difficulty in evaluating interpretations. There are some crucial logic traps behind existing evaluation methods, which are ignored by most works. In this opinion piece, we summarize four kind…

    Submitted 12 September, 2021; originally announced September 2021.

  44. Incorporating Lambertian Priors into Surface Normals Measurement

    Authors: Yakun Ju, Muwei Jian, Shaoxiang Guo, Yingyu Wang, Huiyu Zhou, Junyu Dong

    Abstract: The goal of photometric stereo is to measure the precise surface normal of a 3D object from observations with various shading cues. However, non-Lambertian surfaces influence the measurement accuracy due to irregular shading cues. Despite deep neural networks have been employed to simulate the performance of non-Lambertian surfaces, the error in specularities, shadows, and crinkle regions is hard…

    Submitted 15 July, 2021; originally announced July 2021.

  45. arXiv:2106.08602  [pdf, ps, other]

    math.CO cs.DM

    Colouring graphs with no induced six-vertex path or diamond

    Authors: Jan Goedgebeur, Shenwei Huang, Yiao Ju, Owen Merkel

    Abstract: The diamond is the graph obtained by removing an edge from the complete graph on 4 vertices. A graph is ($P_6$, diamond)-free if it contains no induced subgraph isomorphic to a six-vertex path or a diamond. In this paper we show that the chromatic number of a ($P_6$, diamond)-free graph $G$ is no larger than the maximum of 6 and the clique number of $G$. We do this by reducing the problem to imper…

    Submitted 16 June, 2021; originally announced June 2021.

    Comments: 29 pages

  46. arXiv:2104.14281  [pdf]

    cs.CY cs.LG stat.AP

    Leveraging Online Shopping Behaviors as a Proxy for Personal Lifestyle Choices: New Insights into Chronic Disease Prevention Literacy

    Authors: Yongzhen Wang, Xiaozhong Liu, Katy Börner, Jun Lin, Yingnan Ju, Changlong Sun, Luo Si

    Abstract: Objective: Ubiquitous internet access is reshaping the way we live, but it is accompanied by unprecedented challenges in preventing chronic diseases that are usually planted by long exposure to unhealthy lifestyles. This paper proposes leveraging online shopping behaviors as a proxy for personal lifestyle choices to improve chronic disease prevention literacy, targeted for times when e-commerce us…

    Submitted 9 March, 2022; v1 submitted 29 April, 2021; originally announced April 2021.

    Comments: 58 pages with appendices, 5 figures, 17 tables

  47. arXiv:2012.12654  [pdf]

    physics.chem-ph cs.LG math.NA

    A deep learning-based ODE solver for chemical kinetics

    Authors: Tianhan Zhang, Yaoyu Zhang, Weinan E, Yiguang Ju

    Abstract: Developing efficient and accurate algorithms for chemistry integration is a challenging task due to its strong stiffness and high dimensionality. The current work presents a deep learning-based numerical method called DeepCombustion0.0 to solve stiff ordinary differential equation systems. The homogeneous autoignition of DME/air mixture, including 54 species, is adopted as an example to illustrate…

    Submitted 23 November, 2020; originally announced December 2020.

  48. arXiv:2010.00505  [pdf, other]

    cs.CV

    An Ultra Lightweight CNN for Low Resource Circuit Component Recognition

    Authors: Yingnan Ju, Yue Chen

    Abstract: In this paper, we present an ultra lightweight system that can effectively recognize different circuit components in an image with very limited training data. Along with the system, we also release the data set we created for the task. A two-stage approach is employed by our system. Selective search was applied to find the location of each circuit component. Based on its result, we crop the origin…

    Submitted 1 October, 2020; originally announced October 2020.

  49. arXiv:2007.14474  [pdf]

    q-bio.QM cs.CL cs.DL

    Construction and Usage of a Human Body Common Coordinate Framework Comprising Clinical, Semantic, and Spatial Ontologies

    Authors: Katy Börner, Ellen M. Quardokus, Bruce W. Herr II, Leonard E. Cross, Elizabeth G. Record, Yingnan Ju, Andreas D. Bueckle, James P. Sluka, Jonathan C. Silverstein, Kristen M. Browne, Sanjay Jain, Clive H. Wasserfall, Marda L. Jorgensen, Jeffrey M. Spraggins, Nathan H. Patterson, Mark A. Musen, Griffin M. Weber

    Abstract: The National Institutes of Health's (NIH) Human Biomolecular Atlas Program (HuBMAP) aims to create a comprehensive high-resolution atlas of all the cells in the healthy human body. Multiple laboratories across the United States are collecting tissue specimens from different organs of donors who vary in sex, age, and body size. Integrating and harmonizing the data derived from these samples and 'ma…

    Submitted 28 July, 2020; originally announced July 2020.

    Comments: 24 pages with SI, 6 figures, 5 tables

  50. arXiv:2006.11440  [pdf, other]

    stat.ML cs.LG

    Local Convolutions Cause an Implicit Bias towards High Frequency Adversarial Examples

    Authors: Josue Ortega Caro, Yilong Ju, Ryan Pyle, Sourav Dey, Wieland Brendel, Fabio Anselmi, Ankit Patel

    Abstract: Adversarial Attacks are still a significant challenge for neural networks. Recent work has shown that adversarial perturbations typically contain high-frequency features, but the root cause of this phenomenon remains unknown. Inspired by theoretical work on linear full-width convolutional models, we hypothesize that the local (i.e. bounded-width) convolutional operations commonly used in current n…

    Submitted 8 March, 2023; v1 submitted 19 June, 2020; originally announced June 2020.

    Comments: 23 pages, 11 figures, 12 Tables