Showing 1–50 of 60 results for author: Ni, H

  1. arXiv:2407.01204  [pdf, other]

    cs.CR cs.PL

    SCIF: A Language for Compositional Smart Contract Security

    Authors: Siqiu Yao, Haobin Ni, Andrew C. Myers, Ethan Cecchetti

    Abstract: Securing smart contracts remains a fundamental challenge. At its core, it is about building software that is secure in composition with untrusted code, a challenge that extends far beyond blockchains. We introduce SCIF, a language for building smart contracts that are compositionally secure. SCIF is based on the fundamentally compositional principle of secure information flow, but extends this cor… ▽ More

    Submitted 1 July, 2024; originally announced July 2024.

  2. arXiv:2406.17443  [pdf, other]

    cs.CV

    Using joint angles based on the international biomechanical standards for human action recognition and related tasks

    Authors: Kevin Schlegel, Lei Jiang, Hao Ni

    Abstract: Keypoint data has received a considerable amount of attention in machine learning for tasks like action detection and recognition. However, human experts in movement such as doctors, physiotherapists, sports scientists and coaches use a notion of joint angles standardised by the International Society of Biomechanics to precisely and efficiently communicate static body poses and movements. In this… ▽ More

    Submitted 25 June, 2024; originally announced June 2024.

  3. arXiv:2406.12199  [pdf, other]

    cs.LG cs.AI

    Time Series Modeling for Heart Rate Prediction: From ARIMA to Transformers

    Authors: Haowei Ni, Shuchen Meng, Xieming Geng, Panfeng Li, Zhuoying Li, Xupeng Chen, Xiaotong Wang, Shiyao Zhang

    Abstract: Cardiovascular disease (CVD) is a leading cause of death globally, necessitating precise forecasting models for monitoring vital signs like heart rate, blood pressure, and ECG. Traditional models, such as ARIMA and Prophet, are limited by their need for manual parameter tuning and challenges in handling noisy, sparse, and highly variable medical data. This study investigates advanced deep learning… ▽ More

    Submitted 27 June, 2024; v1 submitted 17 June, 2024; originally announced June 2024.

    Comments: Accepted by 2024 6th International Conference on Electronic Engineering and Informatics

  4. arXiv:2406.03702  [pdf, other]

    cs.CV

    DSNet: A Novel Way to Use Atrous Convolutions in Semantic Segmentation

    Authors: Zilu Guo, Liuyang Bian, Xuan Huang, Hu Wei, Jingyu Li, Huasheng Ni

    Abstract: Atrous convolutions are employed as a method to increase the receptive field in semantic segmentation tasks. However, in previous work on semantic segmentation, they were rarely employed in the shallow layers of the model. We revisit the design of atrous convolutions in modern convolutional neural networks (CNNs), and demonstrate that the concept of using large kernels to apply atrous convolutions c… ▽ More

    Submitted 5 June, 2024; originally announced June 2024.

  5. arXiv:2405.17191  [pdf, other]

    cs.CV math.PR

    MCGAN: Enhancing GAN Training with Regression-Based Generator Loss

    Authors: Baoren Xiao, Hao Ni, Weixin Yang

    Abstract: Generative adversarial networks (GANs) have emerged as a powerful tool for generating high-fidelity data. However, the main bottleneck of existing approaches is the lack of supervision on the generator training, which often results in undamped oscillation and unsatisfactory performance. To address this issue, we propose an algorithm called Monte Carlo GAN (MCGAN). This approach, utilizing an innov… ▽ More

    Submitted 27 May, 2024; originally announced May 2024.

  6. arXiv:2405.14913  [pdf, other]

    stat.ME cs.LG math.PR stat.ML

    High Rank Path Development: an approach of learning the filtration of stochastic processes

    Authors: Jiajie Tao, Hao Ni, Chong Liu

    Abstract: Since the weak convergence for stochastic processes does not account for the growth of information over time, which is represented by the underlying filtration, a slightly erroneous stochastic model in weak topology may cause huge loss in multi-period decision-making problems. To address such discontinuities, Aldous introduced the extended weak convergence, which can fully characterise all essentia… ▽ More

    Submitted 23 May, 2024; originally announced May 2024.

  7. arXiv:2405.13538  [pdf, other]

    cs.CV

    Ultra-Fast Adaptive Track Detection Network

    Authors: Hai Ni, Rui Wang, Scarlett Liu

    Abstract: Railway detection is critical for the automation of railway systems. Existing models often prioritize either speed or accuracy, but achieving both remains a challenge. To address the limitations of presetting anchor groups that struggle with varying track proportions from different camera angles, an ultra-fast adaptive track detection network is proposed in this paper. This network comprises a bac… ▽ More

    Submitted 22 May, 2024; originally announced May 2024.

  8. arXiv:2404.16306  [pdf, other]

    cs.CV

    TI2V-Zero: Zero-Shot Image Conditioning for Text-to-Video Diffusion Models

    Authors: Haomiao Ni, Bernhard Egger, Suhas Lohit, Anoop Cherian, Ye Wang, Toshiaki Koike-Akino, Sharon X. Huang, Tim K. Marks

    Abstract: Text-conditioned image-to-video generation (TI2V) aims to synthesize a realistic video starting from a given image (e.g., a woman's photo) and a text description (e.g., "a woman is drinking water."). Existing TI2V frameworks often require costly training on video-text datasets and specific model designs for text and image conditioning. In this paper, we propose TI2V-Zero, a zero-shot, tuning-free… ▽ More

    Submitted 24 April, 2024; originally announced April 2024.

    Comments: CVPR 2024

  9. arXiv:2403.15212  [pdf, other]

    cs.CV

    GCN-DevLSTM: Path Development for Skeleton-Based Action Recognition

    Authors: Lei Jiang, Weixin Yang, Xin Zhang, Hao Ni

    Abstract: Skeleton-based action recognition (SAR) in videos is an important but challenging task in computer vision. The recent state-of-the-art (SOTA) models for SAR are primarily based on graph convolutional neural networks (GCNs), which are powerful in extracting the spatial information of skeleton data. However, it is not yet clear whether such GCN-based models can effectively capture the temporal dynamics of… ▽ More

    Submitted 26 May, 2024; v1 submitted 22 March, 2024; originally announced March 2024.

  10. arXiv:2402.16001  [pdf]

    cs.CV

    Cross-Resolution Land Cover Classification Using Outdated Products and Transformers

    Authors: Huan Ni, Yubin Zhao, Haiyan Guan, Cheng Jiang, Yongshi Jie, Xing Wang, Yiyang Shen

    Abstract: Large-scale high-resolution land cover classification is a prerequisite for constructing Earth system models and addressing ecological and resource issues. Advancements in satellite sensor technology have led to an improvement in spatial resolution and wider coverage areas. Nevertheless, the lack of high-resolution labeled data is still a challenge, hindering the large-scale application of land cov… ▽ More

    Submitted 5 March, 2024; v1 submitted 25 February, 2024; originally announced February 2024.

  11. arXiv:2402.01749  [pdf, other]

    cs.CY cs.AI cs.LG

    Towards Urban General Intelligence: A Review and Outlook of Urban Foundation Models

    Authors: Weijia Zhang, Jindong Han, Zhao Xu, Hang Ni, Hao Liu, Hui Xiong

    Abstract: Machine learning techniques are now integral to the advancement of intelligent urban services, playing a crucial role in elevating the efficiency, sustainability, and livability of urban environments. The recent emergence of foundation models such as ChatGPT marks a revolutionary shift in the fields of machine learning and artificial intelligence. Their unparalleled capabilities in contextual unde… ▽ More

    Submitted 29 January, 2024; originally announced February 2024.

  12. arXiv:2312.02535  [pdf, other]

    cs.CV

    Towards Open-set Gesture Recognition via Feature Activation Enhancement and Orthogonal Prototype Learning

    Authors: Chen Liu, Can Han, Chengfeng Zhou, Crystal Cai, Suncheng Xiang, Hualiang Ni, Dahong Qian

    Abstract: Gesture recognition is a foundational task in human-machine interaction (HMI). While there has been significant progress in gesture recognition based on surface electromyography (sEMG), accurate recognition of predefined gestures only within a closed set is still inadequate in practice. It is essential to effectively discern and reject unknown gestures of disinterest in a robust system. Numerous m… ▽ More

    Submitted 5 December, 2023; originally announced December 2023.

  13. arXiv:2311.09642  [pdf, other]

    eess.IV cs.CV

    Weakly Supervised Anomaly Detection for Chest X-Ray Image

    Authors: Haoqi Ni, Ximiao Zhang, Min Xu, Ning Lang, Xiuzhuang Zhou

    Abstract: Chest X-Ray (CXR) examination is a common method for assessing thoracic diseases in clinical applications. While recent advances in deep learning have enhanced the significance of visual analysis for CXR anomaly detection, current methods often miss key cues in anomaly images crucial for identifying disease regions, as they predominantly rely on unsupervised training with normal images. This lette… ▽ More

    Submitted 18 November, 2023; v1 submitted 16 November, 2023; originally announced November 2023.

  14. arXiv:2311.04591  [pdf, other]

    cs.CV cs.MM cs.RO eess.IV

    Rethinking Event-based Human Pose Estimation with 3D Event Representations

    Authors: Xiaoting Yin, Hao Shi, Jiaan Chen, Ze Wang, Yaozu Ye, Huajian Ni, Kailun Yang, Kaiwei Wang

    Abstract: Human pose estimation is a fundamental and appealing task in computer vision. Traditional frame-based cameras and videos are commonly applied, yet they become less reliable in scenarios under high dynamic range or heavy motion blur. In contrast, event cameras offer a robust solution for navigating these challenging contexts. Predominant methodologies incorporate event cameras into learning framew… ▽ More

    Submitted 1 December, 2023; v1 submitted 8 November, 2023; originally announced November 2023.

    Comments: Extended version of arXiv:2206.04511. The code and dataset are available at https://github.com/MasterHow/EventPointPose

  15. arXiv:2311.02549  [pdf, other]

    cs.CV

    3D-Aware Talking-Head Video Motion Transfer

    Authors: Haomiao Ni, Jiachen Liu, Yuan Xue, Sharon X. Huang

    Abstract: Motion transfer of talking-head videos involves generating a new video with the appearance of a subject video and the motion pattern of a driving video. Current methodologies primarily depend on a limited number of subject images and 2D representations, thereby neglecting to fully utilize the multi-view appearance features inherent in the subject video. In this paper, we propose a novel 3D-aware t… ▽ More

    Submitted 4 November, 2023; originally announced November 2023.

    Comments: WACV 2024

  16. arXiv:2310.02815  [pdf, other]

    cs.CV cs.RO eess.IV

    CoBEV: Elevating Roadside 3D Object Detection with Depth and Height Complementarity

    Authors: Hao Shi, Chengshan Pang, Jiaming Zhang, Kailun Yang, Yuhao Wu, Huajian Ni, Yining Lin, Rainer Stiefelhagen, Kaiwei Wang

    Abstract: Roadside camera-driven 3D object detection is a crucial task in intelligent transportation systems, which extends the perception range beyond the limitations of vision-centric vehicles and enhances road safety. While previous studies have limitations in using only depth or height information, we find both depth and height matter and they are in fact complementary. The depth feature encompasses pre… ▽ More

    Submitted 17 October, 2023; v1 submitted 4 October, 2023; originally announced October 2023.

    Comments: The source code will be made publicly available at https://github.com/MasterHow/CoBEV

  17. arXiv:2309.06496  [pdf, other]

    cs.CR

    Level Up: Private Non-Interactive Decision Tree Evaluation using Levelled Homomorphic Encryption

    Authors: Rasoul Akhavan Mahdavi, Haoyan Ni, Dimitry Linkov, Florian Kerschbaum

    Abstract: As machine learning as a service continues gaining popularity, concerns about privacy and intellectual property arise. Users often hesitate to disclose their private information to obtain a service, while service providers aim to protect their proprietary models. Decision trees, a widely used machine learning model, are favoured for their simplicity, interpretability, and ease of training. In this… ▽ More

    Submitted 12 September, 2023; originally announced September 2023.

  18. arXiv:2308.07104  [pdf, other]

    cs.CV cs.RO eess.IV

    FocusFlow: Boosting Key-Points Optical Flow Estimation for Autonomous Driving

    Authors: Zhonghua Yi, Hao Shi, Kailun Yang, Qi Jiang, Yaozu Ye, Ze Wang, Huajian Ni, Kaiwei Wang

    Abstract: Key-point-based scene understanding is fundamental for autonomous driving applications. At the same time, optical flow plays an important role in many vision tasks. However, due to the implicit bias of equal attention on all points, classic data-driven optical flow estimation methods yield less satisfactory performance on key points, limiting their implementations in key-point-critical safety-rele… ▽ More

    Submitted 22 September, 2023; v1 submitted 14 August, 2023; originally announced August 2023.

    Comments: Accepted to IEEE Transactions on Intelligent Vehicles (T-IV). The source code of FocusFlow will be available at https://github.com/ZhonghuaYi/FocusFlow_official

  19. arXiv:2308.04020  [pdf, other]

    cs.CV

    Synthetic Augmentation with Large-scale Unconditional Pre-training

    Authors: Jiarong Ye, Haomiao Ni, Peng Jin, Sharon X. Huang, Yuan Xue

    Abstract: Deep learning based medical image recognition systems often require a substantial amount of training data with expert annotations, which can be expensive and time-consuming to obtain. Recently, synthetic augmentation techniques have been proposed to mitigate the issue by generating realistic images conditioned on class labels. However, the effectiveness of these methods heavily depends on the repr… ▽ More

    Submitted 7 August, 2023; originally announced August 2023.

    Comments: MICCAI 2023

  20. arXiv:2308.03322  [pdf, other]

    cs.CV cs.AI

    Part-Aware Transformer for Generalizable Person Re-identification

    Authors: Hao Ni, Yuke Li, Lianli Gao, Heng Tao Shen, Jingkuan Song

    Abstract: Domain generalization person re-identification (DG-ReID) aims to train a model on source domains and generalize well on unseen domains. Vision Transformer usually yields better generalization ability than common CNN networks under distribution shifts. However, Transformer-based ReID models inevitably over-fit to domain-specific biases due to the supervised learning strategy on the source domain. W… ▽ More

    Submitted 18 September, 2023; v1 submitted 7 August, 2023; originally announced August 2023.

  21. arXiv:2308.02452  [pdf, other]

    stat.ML cs.LG math.NA math.PR

    Generative Modelling of Lévy Area for High Order SDE Simulation

    Authors: Andraž Jelinčič, Jiajie Tao, William F. Turner, Thomas Cass, James Foster, Hao Ni

    Abstract: It is well known that, when numerically simulating solutions to SDEs, achieving a strong convergence rate better than $O(\sqrt{h})$ (where $h$ is the step size) requires the use of certain iterated integrals of Brownian motion, commonly referred to as its "Lévy areas". However, these stochastic integrals are difficult to simulate due to their non-Gaussian nature and for a $d$-dimensional Brownian motion… ▽ More

    Submitted 4 August, 2023; originally announced August 2023.

    MSC Class: 65C30

  22. arXiv:2307.15484  [pdf, other]

    cs.SD cs.AI cs.CL eess.AS

    Minimally-Supervised Speech Synthesis with Conditional Diffusion Model and Language Model: A Comparative Study of Semantic Coding

    Authors: Chunyu Qiang, Hao Li, Hao Ni, He Qu, Ruibo Fu, Tao Wang, Longbiao Wang, Jianwu Dang

    Abstract: Recently, there has been a growing interest in text-to-speech (TTS) methods that can be trained with minimal supervision by combining two types of discrete speech representations and using two sequence-to-sequence tasks to decouple TTS. However, existing methods suffer from three problems: the high dimensionality and waveform distortion of discrete speech representations, the prosodic averaging pr… ▽ More

    Submitted 18 December, 2023; v1 submitted 28 July, 2023; originally announced July 2023.

    Comments: Accepted by ICASSP 2024

  23. arXiv:2307.04690  [pdf, ps, other]

    quant-ph cs.IT math.NA

    Heisenberg-limited Hamiltonian learning for interacting bosons

    Authors: Haoya Li, Yu Tong, Hongkang Ni, Tuvia Gefen, Lexing Ying

    Abstract: We develop a protocol for learning a class of interacting bosonic Hamiltonians from dynamics with Heisenberg-limited scaling. For Hamiltonians with an underlying bounded-degree graph structure, we can learn all parameters with root mean squared error $ε$ using $\mathcal{O}(1/ε)$ total evolution time, which is independent of the system size, in a way that is robust against state-preparation and mea… ▽ More

    Submitted 10 July, 2023; originally announced July 2023.

    Comments: 14 pages with 21-page appendix

  24. arXiv:2306.01123  [pdf, other]

    cs.LG math.PR

    A Neural RDE-based model for solving path-dependent PDEs

    Authors: Bowen Fang, Hao Ni, Yue Wu

    Abstract: The concept of the path-dependent partial differential equation (PPDE) was first introduced in the context of path-dependent derivatives in financial markets. Its semilinear form was later identified as a non-Markovian backward stochastic differential equation (BSDE). Compared to the classical PDE, the solution of a PPDE involves an infinite-dimensional spatial variable, making it challenging to a… ▽ More

    Submitted 1 June, 2023; originally announced June 2023.

    MSC Class: 68T07; 60L90; 60H30

  25. arXiv:2305.12511  [pdf, other]

    cs.LG

    PCF-GAN: generating sequential data via the characteristic function of measures on the path space

    Authors: Hang Lou, Siran Li, Hao Ni

    Abstract: Generating high-fidelity time series data using generative adversarial networks (GANs) remains a challenging task, as it is difficult to capture the temporal dependence of joint probability distributions induced by time-series data. Towards this goal, a key step is the development of an effective discriminator to distinguish between time series distributions. We propose the so-called PCF-GAN, a no… ▽ More

    Submitted 6 April, 2024; v1 submitted 21 May, 2023; originally announced May 2023.

    Journal ref: Advances in Neural Information Processing Systems 36 (2024)

  26. arXiv:2305.11049  [pdf, other]

    eess.IV cs.CV cs.LG

    NODE-ImgNet: a PDE-informed effective and robust model for image denoising

    Authors: Xinheng Xie, Yue Wu, Hao Ni, Cuiyu He

    Abstract: Inspired by the traditional partial differential equation (PDE) approach for image denoising, we propose a novel neural network architecture, referred to as NODE-ImgNet, that combines neural ordinary differential equations (NODEs) with convolutional neural network (CNN) blocks. NODE-ImgNet is intrinsically a PDE model, where the dynamic system is learned implicitly without the explicit specification… ▽ More

    Submitted 6 November, 2023; v1 submitted 18 May, 2023; originally announced May 2023.

  27. arXiv:2305.02901  [pdf, other]

    cs.LG cs.AI cs.CR

    Single Node Injection Label Specificity Attack on Graph Neural Networks via Reinforcement Learning

    Authors: Dayuan Chen, Jian Zhang, Yuqian Lv, Jinhuan Wang, Hongjie Ni, Shanqing Yu, Zhen Wang, Qi Xuan

    Abstract: Graph neural networks (GNNs) have achieved remarkable success in various real-world applications. However, recent studies highlight the vulnerability of GNNs to malicious perturbations. Previous adversaries primarily focus on graph modifications or node injections to existing graphs, yielding promising results but with notable limitations. Graph modification attack (GMA) requires manipulation of t… ▽ More

    Submitted 4 May, 2023; originally announced May 2023.

  28. arXiv:2304.12536  [pdf, other]

    cs.CV

    Exploring Compositional Visual Generation with Latent Classifier Guidance

    Authors: Changhao Shi, Haomiao Ni, Kai Li, Shaobo Han, Mingfu Liang, Martin Renqiang Min

    Abstract: Diffusion probabilistic models have achieved enormous success in the field of image generation and manipulation. In this paper, we explore a novel paradigm of using the diffusion model and classifier guidance in the latent semantic space for compositional visual tasks. Specifically, we train latent diffusion models and auxiliary latent classifiers to facilitate non-linear navigation of latent repr… ▽ More

    Submitted 24 May, 2023; v1 submitted 24 April, 2023; originally announced April 2023.

    Comments: Accepted to CVPR Workshop 2023

  29. Trees and Turtles: Modular Abstractions for State Machine Replication Protocols

    Authors: Natalie Neamtu, Haobin Ni, Robbert van Renesse

    Abstract: We present two abstractions for designing modular state machine replication (SMR) protocols: trees and turtles. A tree captures the set of possible state machine histories, while a turtle represents a subprotocol that tries to find agreement in this tree. We showcase the applicability of these abstractions by constructing crash-tolerant SMR protocols out of abstract tree turtles and providing exam… ▽ More

    Submitted 6 May, 2023; v1 submitted 16 April, 2023; originally announced April 2023.

    Comments: Full version of the paper published in PaPoC '23, including full proofs and discussion of BFT protocols

  30. arXiv:2303.13842  [pdf, other]

    cs.CV cs.RO eess.IV

    FishDreamer: Towards Fisheye Semantic Completion via Unified Image Outpainting and Segmentation

    Authors: Hao Shi, Yu Li, Kailun Yang, Jiaming Zhang, Kunyu Peng, Alina Roitberg, Yaozu Ye, Huajian Ni, Kaiwei Wang, Rainer Stiefelhagen

    Abstract: This paper raises the new task of Fisheye Semantic Completion (FSC), where dense texture, structure, and semantics of a fisheye image are inferred even beyond the sensor field-of-view (FoV). Fisheye cameras have a larger FoV than ordinary pinhole cameras, yet their special imaging model naturally leads to a blind area at the edge of the image plane. This is suboptimal for safety-critical applic… ▽ More

    Submitted 20 April, 2023; v1 submitted 24 March, 2023; originally announced March 2023.

    Comments: Accepted to CVPR OmniCV 2023. Code and datasets will be available at https://github.com/MasterHow/FishDreamer

  31. arXiv:2303.13744  [pdf, other]

    cs.CV

    Conditional Image-to-Video Generation with Latent Flow Diffusion Models

    Authors: Haomiao Ni, Changhao Shi, Kai Li, Sharon X. Huang, Martin Renqiang Min

    Abstract: Conditional image-to-video (cI2V) generation aims to synthesize a new plausible video starting from an image (e.g., a person's face) and a condition (e.g., an action class label like smile). The key challenge of the cI2V task lies in the simultaneous generation of realistic spatial appearance and temporal dynamics corresponding to the given image and condition. In this paper, we propose an approac… ▽ More

    Submitted 23 March, 2023; originally announced March 2023.

    Comments: CVPR 2023

  32. arXiv:2303.00946  [pdf, ps, other]

    math.NA cs.IT eess.SP math.ST

    A note on spike localization for line spectrum estimation

    Authors: Haoya Li, Hongkang Ni, Lexing Ying

    Abstract: This note considers the problem of approximating the locations of dominant spikes for a probability measure from noisy spectrum measurements under the condition of residue signal, significant noise level, and no minimum spectrum separation. We show that the simple procedure of thresholding the smoothed inverse Fourier transform allows for approximating the spike locations rather accurately.

    Submitted 13 March, 2023; v1 submitted 1 March, 2023; originally announced March 2023.

    MSC Class: 94A08; 94A12; 81P60

  33. arXiv:2302.04965  [pdf, other]

    cs.HC

    Guttation Monitor: Wearable Guttation Sensor for Plant Condition Monitoring and Diagnosis

    Authors: Qiuyu Lu, Lydia Yang, Aditi Maheshwari, Hengrong Ni, Tianyu Yu, Jianzhe Gu, Advait Wadhwani, Andreea Danielescu, Lining Yao

    Abstract: Plant life plays a critical role in the ecosystem. However, it is difficult for humans to perceive plants' reactions because the biopotential and biochemical responses are invisible to humans. Guttation droplets contain various chemicals which can reflect plant physiology and environmental conditions in real-time. Traditionally, these droplets are collected manually and analyzed in the lab with ex… ▽ More

    Submitted 9 February, 2023; originally announced February 2023.

    Comments: 15 pages, 13 figures

  34. arXiv:2301.07568  [pdf]

    q-bio.BM cs.LG

    Beating the Best: Improving on AlphaFold2 at Protein Structure Prediction

    Authors: Abbi Abdel-Rehim, Oghenejokpeme Orhobor, Hang Lou, Hao Ni, Ross D. King

    Abstract: The goal of the Protein Structure Prediction (PSP) problem is to predict a protein's 3D structure (conformation) from its amino acid sequence. The problem has been a 'holy grail' of science since the Nobel Prize-winning work of Anfinsen demonstrated that protein conformation was determined by sequence. A recent and important step towards this goal was the development of AlphaFold2, currently the best… ▽ More

    Submitted 23 January, 2023; v1 submitted 18 January, 2023; originally announced January 2023.

    Comments: 12 pages

    MSC Class: 92-08; ACM Class: J.3

  35. arXiv:2211.11257  [pdf, other]

    cs.CV eess.IV physics.optics

    Computational Imaging for Machine Perception: Transferring Semantic Segmentation beyond Aberrations

    Authors: Qi Jiang, Hao Shi, Shaohua Gao, Jiaming Zhang, Kailun Yang, Lei Sun, Huajian Ni, Kaiwei Wang

    Abstract: Semantic scene understanding with Minimalist Optical Systems (MOS) in mobile and wearable applications remains a challenge due to the corrupted imaging quality induced by optical aberrations. However, previous works only focus on improving the subjective imaging quality through the Computational Imaging (CI) technique, ignoring the feasibility of advancing semantic segmentation. In this paper, we… ▽ More

    Submitted 14 March, 2024; v1 submitted 21 November, 2022; originally announced November 2022.

    Comments: Accepted to IEEE Transactions on Computational Imaging (TCI). The project page is at https://github.com/zju-jiangqi/CIADA

  36. Semi-supervised Body Parsing and Pose Estimation for Enhancing Infant General Movement Assessment

    Authors: Haomiao Ni, Yuan Xue, Liya Ma, Qian Zhang, Xiaoye Li, Xiaolei Huang

    Abstract: General movement assessment (GMA) of infant movement videos (IMVs) is an effective method for early detection of cerebral palsy (CP) in infants. We demonstrate in this paper that end-to-end trainable neural networks for image sequence recognition can be applied to achieve good results in GMA, and more importantly, augmenting raw video with infant body parsing and pose estimation information can si… ▽ More

    Submitted 14 October, 2022; originally announced October 2022.

    Comments: Elsevier Medical Image Analysis 2022

    Journal ref: Medical Image Analysis 83 (2023):102654

  37. arXiv:2210.01559  [pdf, other]

    cs.CV

    Cross-identity Video Motion Retargeting with Joint Transformation and Synthesis

    Authors: Haomiao Ni, Yihao Liu, Sharon X. Huang, Yuan Xue

    Abstract: In this paper, we propose a novel dual-branch Transformation-Synthesis network (TS-Net), for video motion retargeting. Given one subject video and one driving video, TS-Net can produce a new plausible video with the subject appearance of the subject video and motion pattern of the driving video. TS-Net consists of a warp-based transformation branch and a warp-free synthesis branch. The novel desig… ▽ More

    Submitted 1 October, 2022; originally announced October 2022.

    Comments: WACV 2023

  38. arXiv:2209.08326  [pdf, other]

    eess.AS cs.CL

    Parameter-Efficient Conformers via Sharing Sparsely-Gated Experts for End-to-End Speech Recognition

    Authors: Ye Bai, Jie Li, Wenjing Han, Hao Ni, Kaituo Xu, Zhuo Zhang, Cheng Yi, Xiaorui Wang

    Abstract: While transformers and their variant conformers show promising performance in speech recognition, the parameterized property leads to much memory cost during training and inference. Some works use cross-layer weight-sharing to reduce the parameters of the model. However, the inevitable loss of capacity harms the model performance. To address this issue, this paper proposes a parameter-efficient co… ▽ More

    Submitted 17 September, 2022; originally announced September 2022.

    Comments: Accepted at INTERSPEECH 2022

  39. arXiv:2206.15445  [pdf, other]

    eess.IV cs.CV

    Asymmetry Disentanglement Network for Interpretable Acute Ischemic Stroke Infarct Segmentation in Non-Contrast CT Scans

    Authors: Haomiao Ni, Yuan Xue, Kelvin Wong, John Volpi, Stephen T. C. Wong, James Z. Wang, Xiaolei Huang

    Abstract: Accurate infarct segmentation in non-contrast CT (NCCT) images is a crucial step toward computer-aided acute ischemic stroke (AIS) assessment. In clinical practice, bilateral symmetric comparison of brain hemispheres is usually used to locate pathological abnormalities. Recent research has explored asymmetries to assist with AIS segmentation. However, most previous symmetry-based work mixed differ… ▽ More

    Submitted 30 June, 2022; originally announced June 2022.

    Comments: MICCAI 2022

  40. arXiv:2204.00740  [pdf, other]

    cs.LG stat.ML

    Path Development Network with Finite-dimensional Lie Group Representation

    Authors: Hang Lou, Siran Li, Hao Ni

    Abstract: The path signature, a mathematically principled and universal feature of sequential data, leads to a performance boost of deep learning-based models in various sequential data tasks as a complementary feature. However, it suffers from the curse of dimensionality when the path dimension is high. To tackle this problem, we propose a novel, trainable path development layer, which exploits representat… ▽ More

    Submitted 1 April, 2022; originally announced April 2022.

    MSC Class: 60L10; ACM Class: G.3; I.5.1

  41. arXiv:2112.01166  [pdf, other]

    q-fin.ST cs.LG

    Forex Trading Volatility Prediction using Neural Network Models

    Authors: Shujian Liao, Jian Chen, Hao Ni

    Abstract: In this paper, we investigate the problem of predicting the future volatility of Forex currency pairs using deep learning techniques. We show step-by-step how to construct the deep-learning network guided by the empirical patterns of the intra-day volatility. The numerical results show that the multiscale Long Short-Term Memory (LSTM) model with the input of multi-currency pairs consi… ▽ More

    Submitted 3 December, 2021; v1 submitted 2 December, 2021; originally announced December 2021.

  42. arXiv:2111.01207  [pdf, other]

    cs.LG

    Sig-Wasserstein GANs for Time Series Generation

    Authors: Hao Ni, Lukasz Szpruch, Marc Sabate-Vidales, Baoren Xiao, Magnus Wiese, Shujian Liao

    Abstract: Synthetic data is an emerging technology that can significantly accelerate the development and deployment of AI machine learning pipelines. In this work, we develop high-fidelity time-series generators, the SigWGAN, by combining continuous-time stochastic models with the newly proposed signature $W_1$ metric. The former are the Logsig-RNN models based on the stochastic differential equations, wher… ▽ More

    Submitted 1 November, 2021; originally announced November 2021.

    Comments: This paper is accepted by the 2nd ACM International Conference on AI in Finance 2021

    MSC Class: 60L10 ACM Class: I.6; G.3

  43. arXiv:2110.13008  [pdf, other]

    cs.CV cs.LG

    Logsig-RNN: a novel network for robust and efficient skeleton-based action recognition

    Authors: Shujian Liao, Terry Lyons, Weixin Yang, Kevin Schlegel, Hao Ni

    Abstract: This paper contributes to the challenge of skeleton-based human action recognition in videos. The key step is to develop a generic network architecture to extract discriminative features for the spatio-temporal skeleton data. In this paper, we propose a novel module, namely Logsig-RNN, which is the combination of the log-signature layer and recurrent type neural networks (RNNs). The former one com…

    Submitted 1 November, 2021; v1 submitted 25 October, 2021; originally announced October 2021.

    Comments: This paper is accepted by British Machine Vision Conference 2021

  44. arXiv:2109.12065  [pdf, other]

    cs.CV cs.AI

    DeepStroke: An Efficient Stroke Screening Framework for Emergency Rooms with Multimodal Adversarial Deep Learning

    Authors: Tongan Cai, Haomiao Ni, Mingli Yu, Xiaolei Huang, Kelvin Wong, John Volpi, James Z. Wang, Stephen T. C. Wong

    Abstract: In an emergency room (ER) setting, stroke triage or screening is a common challenge. A quick CT is usually done instead of MRI due to MRI's slow throughput and high cost. Clinical tests are commonly referred to during the process, but the misdiagnosis rate remains high. We propose a novel multimodal deep learning framework, DeepStroke, to achieve computer-aided stroke presence assessment by recogn…

    Submitted 27 June, 2022; v1 submitted 24 September, 2021; originally announced September 2021.

  45. Compositional Security for Reentrant Applications

    Authors: Ethan Cecchetti, Siqiu Yao, Haobin Ni, Andrew C. Myers

    Abstract: The disastrous vulnerabilities in smart contracts sharply remind us of our ignorance: we do not know how to write code that is secure in composition with malicious code. Information flow control has long been proposed as a way to achieve compositional security, offering strong guarantees even when combining software from different trust domains. Unfortunately, this appealing story breaks down in t…

    Submitted 27 March, 2021; v1 submitted 15 March, 2021; originally announced March 2021.

    ACM Class: D.4.6

    Journal ref: Proceedings of the 2021 IEEE Symposium on Security and Privacy (SP)

  46. arXiv:2010.08433  [pdf, other]

    cs.CL cs.IR

    An efficient representation of chronological events in medical texts

    Authors: Andrey Kormilitzin, Nemanja Vaci, Qiang Liu, Hao Ni, Goran Nenadic, Alejo Nevado-Holgado

    Abstract: In this work we addressed the problem of capturing sequential information contained in longitudinal electronic health records (EHRs). Clinical notes, which are a particular type of EHR data, are a rich source of information, and practitioners often develop clever solutions to maximise the sequential information contained in free-texts. We proposed a systematic methodology for learning from chron…

    Submitted 24 October, 2020; v1 submitted 16 October, 2020; originally announced October 2020.

    Comments: 4 pages, 2 figures, 7 tables

  47. arXiv:2007.08646  [pdf, other]

    cs.CV

    SiamParseNet: Joint Body Parsing and Label Propagation in Infant Movement Videos

    Authors: Haomiao Ni, Yuan Xue, Qian Zhang, Xiaolei Huang

    Abstract: General movement assessment (GMA) of infant movement videos (IMVs) is an effective method for the early detection of cerebral palsy (CP) in infants. Automated body parsing is a crucial step towards computer-aided GMA, in which infant body parts are segmented and tracked over time for movement analysis. However, acquiring fully annotated data for video-based body parsing is particularly expensive d…

    Submitted 16 July, 2020; originally announced July 2020.

    Comments: MICCAI 2020

  48. arXiv:2006.05421  [pdf, other]

    cs.LG stat.ML

    Conditional Sig-Wasserstein GANs for Time Series Generation

    Authors: Shujian Liao, Hao Ni, Lukasz Szpruch, Magnus Wiese, Marc Sabate-Vidales, Baoren Xiao

    Abstract: Generative adversarial networks (GANs) have been extremely successful in generating samples from seemingly high dimensional probability measures. However, these methods struggle to capture the temporal dependence of joint probability distributions induced by time-series data. Furthermore, long time-series data streams hugely increase the dimension of the target space, which may render generative…

    Submitted 11 October, 2023; v1 submitted 9 June, 2020; originally announced June 2020.

    Comments: This paper has been accepted for Mathematical Finance Special Issue on Machine Learning in Finance

  49. arXiv:2004.04006  [pdf, other]

    cs.LG eess.SP stat.ML

    Signature features with the visibility transformation

    Authors: Yue Wu, Hao Ni, Terence J. Lyons, Robin L. Hudson

    Abstract: In this paper we put the visibility transformation on a clear theoretical footing and show that this transform is able to embed the effect of the absolute position of the data stream into signature features in a unified and efficient way. The generated feature set is particularly useful in pattern recognition tasks, for its simplifying role in allowing the signature feature set to accommodate nonl…

    Submitted 8 October, 2020; v1 submitted 8 April, 2020; originally announced April 2020.

    MSC Class: 60L10

  50. arXiv:2002.00440  [pdf]

    eess.IV cs.CV

    Simultaneous Left Atrium Anatomy and Scar Segmentations via Deep Learning in Multiview Information with Attention

    Authors: Guang Yang, Jun Chen, Zhifan Gao, Shuo Li, Hao Ni, Elsa Angelini, Tom Wong, Raad Mohiaddin, Eva Nyktari, Ricardo Wage, Lei Xu, Yanping Zhang, Xiuquan Du, Heye Zhang, David Firmin, Jennifer Keegan

    Abstract: Three-dimensional late gadolinium enhanced (LGE) cardiac MR (CMR) of left atrial scar in patients with atrial fibrillation (AF) has recently emerged as a promising technique to stratify patients, to guide ablation therapy and to predict treatment success. This requires a segmentation of the high intensity scar tissue and also a segmentation of the left atrium (LA) anatomy, the latter usually being…

    Submitted 2 February, 2020; originally announced February 2020.

    Comments: 34 pages, 10 figures, 7 tables, accepted by Future Generation Computer Systems journal