Vertical Federated Learning for Effectiveness, Security, Applicability: A Survey

Authors: Mang Ye, Wei Shen, Bo Du, Eduard Snezhko, Vassili Kovalev, Pong C. Yuen

Abstract: Vertical Federated Learning (VFL) is a privacy-preserving distributed learning paradigm where different parties collaboratively learn models using partitioned features of shared samples, without leaking private data. Recent research has shown promising results addressing various challenges in VFL, highlighting its potential for practical applications in cross-domain collaboration. However, the corresponding research is scattered and lacks organization. To advance VFL research, this survey offers a systematic overview of recent developments. First, we provide a history and background introduction, along with a summary of the general training protocol of VFL. We then revisit the taxonomy in recent reviews and analyze limitations in-depth. For a comprehensive and structured discussion, we synthesize recent research from three fundamental perspectives: effectiveness, security, and applicability. Finally, we discuss several critical future research directions in VFL, which will facilitate the developments in this field. We provide a collection of research lists and periodically update them at https://github.com/shentt67/VFL_Survey.

Submitted 4 June, 2024; v1 submitted 25 May, 2024; originally announced May 2024.

Comments: 31 pages, 9 figures, 10 tables

arXiv:2404.17830 [pdf, other]

Dynamic Against Dynamic: An Open-set Self-learning Framework

Authors: Haifeng Yang, Chuanxing Geng, Pong C. Yuen, Songcan Chen

Abstract: In open-set recognition, existing methods generally learn statically fixed decision boundaries using known classes to reject unknown classes. Though they have achieved promising results, such decision boundaries are evidently insufficient for universal unknown classes in dynamic and open scenarios as they can potentially appear at any position in the feature space. Moreover, these methods just simply reject unknown class samples during testing without any effective utilization for them. In fact, such samples completely can constitute the true instantiated representation of the unknown classes to further enhance the model's performance. To address these issues, this paper proposes a novel dynamic against dynamic idea, i.e., dynamic method against dynamic changing open-set world, where an open-set self-learning (OSSL) framework is correspondingly developed. OSSL starts with a good closed-set classifier trained by known classes and utilizes available test samples for model adaptation during testing, thus gaining the adaptability to changing data distributions. In particular, a novel self-matching module is designed for OSSL, which can achieve the adaptation in automatically identifying known class samples while rejecting unknown class samples which are further utilized to enhance the discriminability of the model as the instantiated representation of unknown classes. Our method establishes new performance milestones respectively in almost all standard and cross-data benchmarks.

Submitted 2 May, 2024; v1 submitted 27 April, 2024; originally announced April 2024.

Comments: The first two authors contributed equally to this work. Accepted at IJCAI2024

arXiv:2307.10616 [pdf, other]

Heterogeneous Federated Learning: State-of-the-art and Research Challenges

Authors: Mang Ye, Xiuwen Fang, Bo Du, Pong C. Yuen, Dacheng Tao

Abstract: Federated learning (FL) has drawn increasing attention owing to its potential use in large-scale industrial applications. Existing federated learning works mainly focus on model homogeneous settings. However, practical federated learning typically faces the heterogeneity of data distributions, model architectures, network environments, and hardware devices among participant clients. Heterogeneous Federated Learning (HFL) is much more challenging, and corresponding solutions are diverse and complex. Therefore, a systematic survey on this topic about the research challenges and state-of-the-art is essential. In this survey, we firstly summarize the various research challenges in HFL from five aspects: statistical heterogeneity, model heterogeneity, communication heterogeneity, device heterogeneity, and additional challenges. In addition, recent advances in HFL are reviewed and a new taxonomy of existing HFL methods is proposed with an in-depth analysis of their pros and cons. We classify existing methods from three different levels according to the HFL procedure: data-level, model-level, and server-level. Finally, several critical and promising future research directions in HFL are discussed, which may facilitate further developments in this field. A periodically updated collection on HFL is available at https://github.com/marswhu/HFL_Survey.

Submitted 8 September, 2023; v1 submitted 20 July, 2023; originally announced July 2023.

Comments: 42 pages, 11 figures, and 4 tables

arXiv:2202.05953 [pdf, other]

Open-set Adversarial Defense with Clean-Adversarial Mutual Learning

Authors: Rui Shao, Pramuditha Perera, Pong C. Yuen, Vishal M. Patel

Abstract: Open-set recognition and adversarial defense study two key aspects of deep learning that are vital for real-world deployment. The objective of open-set recognition is to identify samples from open-set classes during testing, while adversarial defense aims to robustify the network against images perturbed by imperceptible adversarial noise. This paper demonstrates that open-set recognition systems are vulnerable to adversarial samples. Furthermore, this paper shows that adversarial defense mechanisms trained on known classes are unable to generalize well to open-set samples. Motivated by these observations, we emphasize the necessity of an Open-Set Adversarial Defense (OSAD) mechanism. This paper proposes an Open-Set Defense Network with Clean-Adversarial Mutual Learning (OSDN-CAML) as a solution to the OSAD problem. The proposed network designs an encoder with dual-attentive feature-denoising layers coupled with a classifier to learn a noise-free latent feature representation, which adaptively removes adversarial noise guided by channel and spatial-wise attentive filters. Several techniques are exploited to learn a noise-free and informative latent feature space with the aim of improving the performance of adversarial defense and open-set recognition. First, we incorporate a decoder to ensure that clean images can be well reconstructed from the obtained latent features. Then, self-supervision is used to ensure that the latent features are informative enough to carry out an auxiliary task. Finally, to exploit more complementary knowledge from clean image classification to facilitate feature denoising and search for a more generalized local minimum for open-set recognition, we further propose clean-adversarial mutual learning, where a peer network (classifying clean images) is further introduced to mutually learn with the classifier (classifying adversarial images).

Submitted 11 February, 2022; originally announced February 2022.

Comments: Accepted by International Journal of Computer Vision (IJCV) 2022. Code will be available at https://github.com/rshaojimmy/ECCV2020-OSAD. arXiv admin note: text overlap with arXiv:2009.00814

arXiv:2110.12613 [pdf, other]

Federated Test-Time Adaptive Face Presentation Attack Detection with Dual-Phase Privacy Preservation

Authors: Rui Shao, Bochao Zhang, Pong C. Yuen, Vishal M. Patel

Abstract: Face presentation attack detection (fPAD) plays a critical role in the modern face recognition pipeline. The generalization ability of face presentation attack detection models to unseen attacks has become a key issue for real-world deployment, which can be improved when models are trained with face images from different input distributions and different types of spoof attacks. In reality, due to legal and privacy issues, training data (both real face images and spoof images) are not allowed to be directly shared between different data sources. In this paper, to circumvent this challenge, we propose a Federated Test-Time Adaptive Face Presentation Attack Detection with Dual-Phase Privacy Preservation framework, with the aim of enhancing the generalization ability of fPAD models in both training and testing phase while preserving data privacy. In the training phase, the proposed framework exploits the federated learning technique, which simultaneously takes advantage of rich fPAD information available at different data sources by aggregating model updates from them without accessing their private data. To further boost the generalization ability, in the testing phase, we explore test-time adaptation by minimizing the entropy of fPAD model prediction on the testing data, which alleviates the domain gap between training and testing data and thus reduces the generalization error of a fPAD model. We introduce the experimental setting to evaluate the proposed framework and carry out extensive experiments to provide various insights about the proposed method for fPAD.

Submitted 24 October, 2021; originally announced October 2021.

Comments: Accepted by FG 2021. arXiv admin note: substantial text overlap with arXiv:2104.06595, arXiv:2005.14638

arXiv:2104.06595 [pdf, other]

Federated Generalized Face Presentation Attack Detection

Authors: Rui Shao, Pramuditha Perera, Pong C. Yuen, Vishal M. Patel

Abstract: Face presentation attack detection plays a critical role in the modern face recognition pipeline. A face presentation attack detection model with good generalization can be obtained when it is trained with face images from different input distributions and different types of spoof attacks. In reality, training data (both real face images and spoof images) are not directly shared between data owners due to legal and privacy issues. In this paper, with the motivation of circumventing this challenge, we propose a Federated Face Presentation Attack Detection (FedPAD) framework that simultaneously takes advantage of rich fPAD information available at different data owners while preserving data privacy. In the proposed framework, each data center locally trains its own fPAD model. A server learns a global fPAD model by iteratively aggregating model updates from all data centers without accessing private data in each of them. To equip the aggregated fPAD model in the server with better generalization ability to unseen attacks from users, following the basic idea of FedPAD, we further propose a Federated Generalized Face Presentation Attack Detection (FedGPAD) framework. A federated domain disentanglement strategy is introduced in FedGPAD, which treats each data center as one domain and decomposes the fPAD model into domain-invariant and domain-specific parts in each data center. Two parts disentangle the domain-invariant and domain-specific features from images in each local data center, respectively. A server learns a global fPAD model by only aggregating domain-invariant parts of the fPAD models from data centers and thus a more generalized fPAD model can be aggregated in server. We introduce the experimental setting to evaluate the proposed FedPAD and FedGPAD frameworks and carry out extensive experiments to provide various insights about federated learning for fPAD.

Submitted 30 April, 2022; v1 submitted 13 April, 2021; originally announced April 2021.

Comments: Accepted by IEEE Transactions on Neural Networks and Learning Systems (TNNLS) 2022. arXiv admin note: substantial text overlap with arXiv:2005.14638

arXiv:2009.00814 [pdf, other]

Open-set Adversarial Defense

Authors: Rui Shao, Pramuditha Perera, Pong C. Yuen, Vishal M. Patel

Abstract: Open-set recognition and adversarial defense study two key aspects of deep learning that are vital for real-world deployment. The objective of open-set recognition is to identify samples from open-set classes during testing, while adversarial defense aims to defend the network against images with imperceptible adversarial perturbations. In this paper, we show that open-set recognition systems are vulnerable to adversarial attacks. Furthermore, we show that adversarial defense mechanisms trained on known classes do not generalize well to open-set samples. Motivated by this observation, we emphasize the need of an Open-Set Adversarial Defense (OSAD) mechanism. This paper proposes an Open-Set Defense Network (OSDN) as a solution to the OSAD problem. The proposed network uses an encoder with feature-denoising layers coupled with a classifier to learn a noise-free latent feature representation. Two techniques are employed to obtain an informative latent feature space with the objective of improving open-set performance. First, a decoder is used to ensure that clean images can be reconstructed from the obtained latent features. Then, self-supervision is used to ensure that the latent features are informative enough to carry out an auxiliary task. We introduce a testing protocol to evaluate OSAD performance and show the effectiveness of the proposed method in multiple object classification datasets. The implementation code of the proposed method is available at: https://github.com/rshaojimmy/ECCV2020-OSAD.

Submitted 2 September, 2020; originally announced September 2020.

Comments: Accepted by ECCV 2020

arXiv:2008.02129 [pdf, other]

Self-supervised Temporal Discriminative Learning for Video Representation Learning

Authors: Jinpeng Wang, Yiqi Lin, Andy J. Ma, Pong C. Yuen

Abstract: Temporal cues in videos provide important information for recognizing actions accurately. However, temporal-discriminative features can hardly be extracted without using an annotated large-scale video action dataset for training. This paper proposes a novel Video-based Temporal-Discriminative Learning (VTDL) framework in self-supervised manner. Without labelled data for network pretraining, temporal triplet is generated for each anchor video by using segment of the same or different time interval so as to enhance the capacity for temporal feature representation. Measuring temporal information by time derivative, Temporal Consistent Augmentation (TCA) is designed to ensure that the time derivative (in any order) of the augmented positive is invariant except for a scaling constant. Finally, temporal-discriminative features are learnt by minimizing the distance between each anchor and its augmented positive, while the distance between each anchor and its augmented negative as well as other videos saved in the memory bank is maximized to enrich the representation diversity. In the downstream action recognition task, the proposed method significantly outperforms existing related works. Surprisingly, the proposed self-supervised approach is better than fully-supervised methods on UCF101 and HMDB51 when a small-scale video dataset (with only thousands of videos) is used for pre-training. The code has been made publicly available on https://github.com/FingerRec/Self-Supervised-Temporal-Discriminative-Representation-Learning-for-Video-Action-Recognition.

Submitted 5 August, 2020; originally announced August 2020.

Comments: 10 pages

arXiv:2005.14638 [pdf, other]

Federated Face Presentation Attack Detection

Authors: Rui Shao, Pramuditha Perera, Pong C. Yuen, Vishal M. Patel

Abstract: Face presentation attack detection (fPAD) plays a critical role in the modern face recognition pipeline. A face presentation attack detection model with good generalization can be obtained when it is trained with face images from different input distributions and different types of spoof attacks. In reality, training data (both real face images and spoof images) are not directly shared between data owners due to legal and privacy issues. In this paper, with the motivation of circumventing this challenge, we propose Federated Face Presentation Attack Detection (FedPAD) framework. FedPAD simultaneously takes advantage of rich fPAD information available at different data owners while preserving data privacy. In the proposed framework, each data owner (referred to as data centers) locally trains its own fPAD model. A server learns a global fPAD model by iteratively aggregating model updates from all data centers without accessing private data in each of them. Once the learned global model converges, it is used for fPAD inference. We introduce the experimental setting to evaluate the proposed FedPAD framework and carry out extensive experiments to provide various insights about federated learning for fPAD.

Submitted 28 September, 2020; v1 submitted 29 May, 2020; originally announced May 2020.

arXiv:1911.10771 [pdf, other]

Regularized Fine-grained Meta Face Anti-spoofing

Authors: Rui Shao, Xiangyuan Lan, Pong C. Yuen

Abstract: Face presentation attacks have become an increasingly critical concern when face recognition is widely applied. Many face anti-spoofing methods have been proposed, but most of them ignore the generalization ability to unseen attacks. To overcome the limitation, this work casts face anti-spoofing as a domain generalization (DG) problem, and attempts to address this problem by developing a new meta-learning framework called Regularized Fine-grained Meta-learning. To let our face anti-spoofing model generalize well to unseen attacks, the proposed framework trains our model to perform well in the simulated domain shift scenarios, which is achieved by finding generalized learning directions in the meta-learning process. Specifically, the proposed framework incorporates the domain knowledge of face anti-spoofing as the regularization so that meta-learning is conducted in the feature space regularized by the supervision of domain knowledge. This enables our model more likely to find generalized learning directions with the regularized meta-learning for face anti-spoofing task. Besides, to further enhance the generalization ability of our model, the proposed framework adopts a fine-grained learning strategy that simultaneously conducts meta-learning in a variety of domain shift scenarios in each iteration. Extensive experiments on four public datasets validate the effectiveness of the proposed method.

Submitted 25 November, 2019; originally announced November 2019.

Comments: Accepted by AAAI 2020. Codes are available at https://github.com/rshaojimmy/AAAI2020-RFMetaFAS

arXiv:1904.03436 [pdf, other]

Unsupervised Embedding Learning via Invariant and Spreading Instance Feature

Authors: Mang Ye, Xu Zhang, Pong C. Yuen, Shih-Fu Chang

Abstract: This paper studies the unsupervised embedding learning problem, which requires an effective similarity measurement between samples in low-dimensional embedding space. Motivated by the positive concentrated and negative separated properties observed from category-wise supervised learning, we propose to utilize the instance-wise supervision to approximate these properties, which aims at learning data augmentation invariant and instance spread-out features. To achieve this goal, we propose a novel instance based softmax embedding method, which directly optimizes the `real' instance features on top of the softmax function. It achieves significantly faster learning speed and higher accuracy than all existing methods. The proposed method performs well for both seen and unseen testing categories with cosine similarity. It also achieves competitive performance even without pre-trained network over samples from fine-grained categories.

Submitted 6 April, 2019; originally announced April 2019.

Comments: CVPR 2019

arXiv:1711.01587 [pdf, ps, other]

doi 10.1109/TPAMI.2017.2727048

Inference-Based Similarity Search in Randomized Montgomery Domains for Privacy-Preserving Biometric Identification

Authors: Yi Wang, Jianwu Wan, Jun Guo, Yiu-Ming Cheung, Pong C Yuen

Abstract: Similarity search is essential to many important applications and often involves searching at scale on high-dimensional data based on their similarity to a query. In biometric applications, recent vulnerability studies have shown that adversarial machine learning can compromise biometric recognition systems by exploiting the biometric similarity information. Existing methods for biometric privacy protection are in general based on pairwise matching of secured biometric templates and have inherent limitations in search efficiency and scalability. In this paper, we propose an inference-based framework for privacy-preserving similarity search in Hamming space. Our approach builds on an obfuscated distance measure that can conceal Hamming distance in a dynamic interval. Such a mechanism enables us to systematically design statistically reliable methods for retrieving most likely candidates without knowing the exact distance values. We further propose to apply Montgomery multiplication for generating search indexes that can withstand adversarial similarity analysis, and show that information leakage in randomized Montgomery domains can be made negligibly small. Our experiments on public biometric datasets demonstrate that the inference-based approach can achieve a search accuracy close to the best performance possible with secure computation methods, but the associated cost is reduced by orders of magnitude compared to cryptographic primitives.

Submitted 5 November, 2017; originally announced November 2017.

Comments: 14 pages, 10 figures, 2 tables, regular paper

arXiv:1709.09297 [pdf, ps, other]

Dynamic Label Graph Matching for Unsupervised Video Re-Identification

Authors: Mang Ye, Andy J Ma, Liang Zheng, Jiawei Li, P C Yuen

Abstract: Label estimation is an important component in an unsupervised person re-identification (re-ID) system. This paper focuses on cross-camera label estimation, which can be subsequently used in feature learning to learn robust re-ID models. Specifically, we propose to construct a graph for samples in each camera, and then graph matching scheme is introduced for cross-camera labeling association. While labels directly output from existing graph matching methods may be noisy and inaccurate due to significant cross-camera variations, this paper proposes a dynamic graph matching (DGM) method. DGM iteratively updates the image graph and the label estimation process by learning a better feature space with intermediate estimated labels. DGM is advantageous in two aspects: 1) the accuracy of estimated labels is improved significantly with the iterations; 2) DGM is robust to noisy initial training data. Extensive experiments conducted on three benchmarks including the large-scale MARS dataset show that DGM yields competitive performance to fully supervised baselines, and outperforms competing unsupervised learning methods.

Submitted 26 September, 2017; originally announced September 2017.

Comments: Accepted by ICCV 2017. Revised our IDE results on MARS dataset under standard evaluation protocol

arXiv:1703.00832 [pdf, other]

doi 10.1109/TPAMI.2018.2827389

On the Reconstruction of Face Images from Deep Face Templates

Authors: Guangcan Mai, Kai Cao, Pong C. Yuen, Anil K. Jain

Abstract: State-of-the-art face recognition systems are based on deep (convolutional) neural networks. Therefore, it is imperative to determine to what extent face templates derived from deep networks can be inverted to obtain the original face image. In this paper, we study the vulnerabilities of a state-of-the-art face recognition system based on template reconstruction attack. We propose a neighborly de-convolutional neural network (NbNet) to reconstruct face images from their deep templates. In our experiments, we assumed that no knowledge about the target subject and the deep network are available. To train the NbNet reconstruction models, we augmented two benchmark face datasets (VGG-Face and Multi-PIE) with a large collection of images synthesized using a face generator. The proposed reconstruction was evaluated using type-I (comparing the reconstructed images against the original face images used to generate the deep template) and type-II (comparing the reconstructed images against a different face image of the same subject) attacks. Given the images reconstructed from NbNets, we show that for verification, we achieve TAR of 95.20% (58.05%) on LFW under type-I (type-II) attacks @ FAR of 0.1%. Besides, 96.58% (92.84%) of the images reconstruction from templates of partition fa (fb) can be identified from partition fa in color FERET. Our study demonstrates the need to secure deep templates in face recognition systems.

Submitted 28 April, 2018; v1 submitted 2 March, 2017; originally announced March 2017.

Comments: To appear in TPAMI, IEEE Transactions on Pattern Analysis and Machine Intelligence, 2018

arXiv:1611.00800 [pdf]

Temporal Matrix Completion with Locally Linear Latent Factors for Medical Applications

Authors: Frodo Kin Sun Chan, Andy J Ma, Pong C Yuen, Terry Cheuk-Fung Yip, Yee-Kit Tse, Vincent Wai-Sun Wong, Grace Lai-Hung Wong

Abstract: Regular medical records are useful for medical practitioners to analyze and monitor patient health status especially for those with chronic disease, but such records are usually incomplete due to unpunctuality and absence of patients. In order to resolve the missing data problem over time, tensor-based model is suggested for missing data imputation in recent papers because this approach makes use of low rank tensor assumption for highly correlated data. However, when the time intervals between records are long, the data correlation is not high along temporal direction and such assumption is not valid. To address this problem, we propose to decompose a matrix with missing data into its latent factors. Then, the locally linear constraint is imposed on these factors for matrix completion in this paper. By using a publicly available dataset and two medical datasets collected from hospital, experimental results show that the proposed algorithm achieves the best performance by comparing with the existing methods.

Submitted 31 October, 2016; originally announced November 2016.

arXiv:1512.05990 [pdf, other]

Deformable Distributed Multiple Detector Fusion for Multi-Person Tracking

Authors: Andy J Ma, Pong C Yuen, Suchi Saria

Abstract: This paper addresses fully automated multi-person tracking in complex environments with challenging occlusion and extensive pose variations. Our solution combines multiple detectors for a set of different regions of interest (e.g., full-body and head) for multi-person tracking. The use of multiple detectors leads to fewer missed detections as it is able to exploit the complementary strengths of the individual detectors. While the number of false positives may increase with the increased number of bounding boxes detected from multiple detectors, we propose to group the detection outputs by bounding box location and depth information. For robustness to significant pose variations, deformable spatial relationships between detectors are learnt in our multi-person tracking system. On RGBD data from a live Intensive Care Unit (ICU), we show that the proposed method significantly improves multi-person tracking performance over state-of-the-art methods.

Submitted 18 December, 2015; originally announced December 2015.

arXiv:1205.0088

ProPPA: A Fast Algorithm for $\ell_1$ Minimization and Low-Rank Matrix Completion

Authors: Ranch Y. Q. Lai, Pong C. Yuen

Abstract: We propose a Projected Proximal Point Algorithm (ProPPA) for solving a class of optimization problems. The algorithm iteratively computes the proximal point of the last estimated solution projected into an affine space which itself is parallel to and approaching the feasible set. We provide a convergence analysis theoretically supporting the general algorithm, and then apply it to solving $\ell_1$-minimization problems and the matrix completion problem. These problems arise in many applications including machine learning, image and signal processing. We compare our algorithm with the existing state-of-the-art algorithms. Experimental results on solving these problems show that our algorithm is very efficient and competitive.

Submitted 19 May, 2012; v1 submitted 1 May, 2012; originally announced May 2012.

Comments: update needed

arXiv:1201.1409 [pdf, other]

Interactive Character Posing by Sparse Coding

Authors: Ranch Y. Q. Lai, Pong C. Yuen, K. W. Lee, J. H. Lai

Abstract: Character posing is of interest in computer animation. It is difficult due to its dependence on inverse kinematics (IK) techniques and the articulated property of human characters. To solve the IK problem, classical methods that rely on numerical solutions often suffer from the under-determination problem and cannot guarantee naturalness. Existing data-driven methods address this problem by learning from motion capture data. When facing a large variety of poses, however, these methods may not be able to capture the pose styles or be applicable in a real-time environment. Inspired by the low-rank motion de-noising and completion model in \cite{lai2011motion}, we propose a novel model for character posing based on sparse coding. Unlike conventional approaches, our model directly captures the pose styles in Euclidean space to provide intuitive training error measurements and facilitate pose synthesis. A pose dictionary is learned in the training stage, and based on it natural poses are synthesized to satisfy users' constraints. We compare our model with existing models for tasks of pose de-noising and completion. Experiments show our model obtains lower de-noising and completion error. We also provide user interface (UI) examples illustrating that our model is effective for interactive character posing.

Submitted 6 January, 2012; originally announced January 2012.

Comments: Submitted to Computer Graphics Forum

ACM Class: I.7

Showing 1–18 of 18 results for author: Yuen, P C