-
Advancing Anomaly Detection in Computational Workflows with Active Learning
Authors:
Krishnan Raghavan,
George Papadimitriou,
Hongwei Jin,
Anirban Mandal,
Mariam Kiran,
Prasanna Balaprakash,
Ewa Deelman
Abstract:
A computational workflow, also known as a workflow, consists of tasks that are executed in a certain order to attain a specific computational campaign. Computational workflows are commonly employed in science domains, such as physics, chemistry, and genomics, to complete large-scale experiments in distributed and heterogeneous computing environments. However, running computations at such a large scale makes the workflow applications prone to failures and performance degradation, which can slow down, stall, and ultimately lead to workflow failure. Learning how these workflows behave under normal and anomalous conditions can help us identify the causes of degraded performance and subsequently trigger appropriate actions to resolve them. However, learning in such circumstances is a challenging task because of the large volume of high-quality historical data needed to train accurate and reliable models. Generating such datasets not only takes a lot of time and effort but also requires a lot of resources to be devoted to data generation for training purposes. Active learning is a promising approach to this problem. It is an approach where the data is generated as required by the machine learning model, and thus it can potentially reduce the training data needed to derive accurate models. In this work, we present an active learning approach that is supported by an experimental framework, Poseidon-X, that utilizes a modern workflow management system and two cloud testbeds. We evaluate our approach using three computational workflows. For one workflow we run an end-to-end live active learning experiment; for the other two we evaluate our active learning algorithms using pre-captured data traces provided by the Flow-Bench benchmark. Our findings indicate that active learning not only saves resources but also improves the accuracy of anomaly detection.
Submitted 9 May, 2024;
originally announced May 2024.
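The abstract above describes active learning as generating data only where the model needs it. One common way to decide where that is, shown here as a minimal sketch, is uncertainty sampling: request labels for the candidates whose predicted anomaly probability is closest to 0.5. This is an assumed, generic query strategy for illustration; the abstract does not say which strategy Poseidon-X uses, and the function name `select_queries` is hypothetical.

```python
def select_queries(anomaly_probs, k):
    """Return indices of the k candidates the model is least sure about.

    anomaly_probs: predicted probability of "anomalous" for each candidate.
    Uncertainty is measured as distance from 0.5 (smaller = more uncertain).
    This is a generic uncertainty-sampling sketch, not the paper's method.
    """
    ranked = sorted(
        range(len(anomaly_probs)),
        key=lambda i: abs(anomaly_probs[i] - 0.5),
    )
    return ranked[:k]


# Candidates 1 and 3 sit nearest the decision boundary, so they are queried first.
queries = select_queries([0.95, 0.52, 0.10, 0.45], k=2)
```

In a full loop, the selected candidates would be executed or labeled, added to the training set, and the model retrained before the next round of selection.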
-
ABACUS: An Impairment Aware Joint Optimal Dynamic RMLSA in Elastic Optical Networks
Authors:
M Jyothi Kiran,
Venkatesh Chebolu,
Goutam Das,
Raja Datta
Abstract:
The challenge of optimal Routing and Spectrum Assignment (RSA) is significant in Elastic Optical Networks. Integrating adaptive modulation formats into the RSA problem - Routing, Modulation Level, and Spectrum Assignment - broadens allocation options and increases complexity. The conventional RSA approach entails predetermining fixed paths and then allocating spectrum within them separately. However, expanding the path set for optimality may not be advisable due to the substantial increase in paths with network size expansion. This paper delves into a novel approach called RMLSA, which proposes a comprehensive solution addressing both route determination and spectrum assignment simultaneously. An objective function named ABACUS, Adaptive Balance of Average Clustering and Utilization of Spectrum, is chosen for its capability to adjust and assign significance to average clustering and spectrum utilization. Our approach involves formulating an Integer Linear Programming model with a straightforward relationship between path and spectrum constraints. The model also integrates Physical Layer Impairments to ensure end-to-end Quality of Transmission for requested connections while maintaining existing ones. We demonstrate that ILP can offer an optimal solution for a dynamic traffic scenario within a reasonable time complexity. To achieve this goal, we adopt a structured formulation approach where essential information is determined beforehand, thus minimizing the need for online computations.
Submitted 20 April, 2024;
originally announced April 2024.
-
5G Network Security Practices: An Overview and Survey
Authors:
Fatema Bannat Wala,
Mariam Kiran
Abstract:
This document provides an overview of 5G network security, describing various components of the 5G core network architecture and what kind of security services are offered by these 5G components. It also explores the potential security risks and vulnerabilities presented by the security architecture in 5G and recommends some of the best practices for the 5G network admins to consider while deploying a secure 5G network, based on the surveyed documents from the European government's efforts in commercializing the IoT devices and securing supply chain over 5G networks.
Submitted 25 January, 2024;
originally announced January 2024.
-
Balanced Butterfly Counting in Bipartite-Network
Authors:
Apurba Das,
Aman Abidi,
Ajinkya Shingane,
Mekala Kiran
Abstract:
Bipartite graphs offer a powerful framework for modeling complex relationships between two distinct types of vertices, incorporating probabilistic, temporal, and rating-based information. While the research community has extensively explored various types of bipartite relationships, there has been a notable gap in studying Signed Bipartite Graphs, which capture liking / disliking interactions in real-world networks such as customer-rating-product and senator-vote-bill. Balance butterflies, representing 2 x 2 bicliques, provide crucial insights into antagonistic groups, balance theory, and fraud detection by leveraging the signed information. However, such applications require counting balance butterflies which remains unexplored. In this paper, we propose a new problem: counting balance butterflies in a signed bipartite graph. To address this problem, we adopt state-of-the-art algorithms for butterfly counting, establishing a smart baseline that reduces the time complexity for solving our specific problem. We further introduce a novel bucket approach specifically designed to count balanced butterflies efficiently. We propose a parallelized version of the bucketing approach to enhance performance. Extensive experimental studies on nine real-world datasets demonstrate that our proposed bucket-based algorithm is up to 120x faster over the baseline, and the parallel implementation of the bucket-based algorithm is up to 45x faster over the single core execution. Moreover, a real-world case study showcases the practical application and relevance of counting balanced butterflies.
Submitted 9 August, 2023;
originally announced August 2023.
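To make the counting problem in the entry above concrete, here is a minimal sketch of a plain wedge-based counter. It assumes the usual balance-theory definition, under which a butterfly (2 x 2 biclique) is balanced when it contains an even number of negative edges. It is not the authors' bucket algorithm or their baseline; the function name and the edge format `(u, v, sign)` are illustrative assumptions.

```python
from collections import defaultdict
from itertools import combinations
from math import comb


def count_balanced_butterflies(edges):
    """Count balanced butterflies in a signed bipartite graph.

    edges: iterable of (u, v, sign), u on the left side, v on the right,
    sign in {+1, -1}. Assumed definition: a butterfly is balanced when the
    product of its four edge signs is positive.
    """
    adj = defaultdict(dict)
    for u, v, sign in edges:
        adj[u][v] = sign

    total = 0
    for u1, u2 in combinations(list(adj), 2):
        common = adj[u1].keys() & adj[u2].keys()
        # A wedge u1 - v - u2 has sign = product of its two edge signs.
        pos = sum(1 for v in common if adj[u1][v] * adj[u2][v] > 0)
        neg = len(common) - pos
        # Two wedges form a balanced butterfly only if their signs match.
        total += comb(pos, 2) + comb(neg, 2)
    return total


all_positive = [("a", "x", 1), ("a", "y", 1), ("b", "x", 1), ("b", "y", 1)]
one_negative = [("a", "x", -1), ("a", "y", 1), ("b", "x", 1), ("b", "y", 1)]
```

Grouping wedges by sign avoids enumerating every butterfly explicitly: for each pair of left vertices, only the counts of positive and negative wedges are needed.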
-
The LBNL Superfacility Project Report
Authors:
Deborah Bard,
Cory Snavely,
Lisa Gerhardt,
Jason Lee,
Becci Totzke,
Katie Antypas,
William Arndt,
Johannes Blaschke,
Suren Byna,
Ravi Cheema,
Shreyas Cholia,
Mark Day,
Bjoern Enders,
Aditi Gaur,
Annette Greiner,
Taylor Groves,
Mariam Kiran,
Quincey Koziol,
Tom Lehman,
Kelly Rowland,
Chris Samuel,
Ashwin Selvarajan,
Alex Sim,
David Skinner,
Laurie Stephey
, et al. (2 additional authors not shown)
Abstract:
The Superfacility model is designed to leverage HPC for experimental science. It is more than simply a model of connected experiment, network, and HPC facilities; it encompasses the full ecosystem of infrastructure, software, tools, and expertise needed to make connected facilities easy to use. The three-year Lawrence Berkeley National Laboratory (LBNL) Superfacility project was initiated in 2019 to coordinate work being performed at LBNL to support this model, and to provide a coherent and comprehensive set of science requirements to drive existing and new work.
A key component of the project was the in-depth engagements with eight science teams that represent challenging use cases across the DOE Office of Science. By the close of the project, we met our project goal by enabling our science application engagements to demonstrate automated pipelines that analyze data from remote facilities at large scale, without routine human intervention. In several cases, we have gone beyond demonstrations and now provide production-level services. To achieve this goal, the Superfacility team developed tools, infrastructure, and policies for near-real-time computing support, dynamic high-performance networking, data management and movement tools, API-driven automation, HPC-scale notebooks via Jupyter, authentication using Federated Identity and container-based edge services supported.
The lessons we learned during this project provide a valuable model for future large, complex, cross-disciplinary collaborations. There is a pressing need for a coherent computing infrastructure across national facilities, and LBNL's Superfacility project is a unique model for success in tackling the challenges that will be faced in hardware, software, policies, and services across multiple science domains.
Submitted 27 June, 2022; v1 submitted 23 June, 2022;
originally announced June 2022.
-
Dynamic Template Selection Through Change Detection for Adaptive Siamese Tracking
Authors:
Madhu Kiran,
Le Thanh Nguyen-Meidine,
Rajat Sahay,
Rafael Menelau Oliveira E Cruz,
Louis-Antoine Blais-Morin,
Eric Granger
Abstract:
Deep Siamese trackers have gained much attention in recent years since they can track visual objects at high speeds. Additionally, adaptive tracking methods, where target samples collected by the tracker are employed for online learning, have achieved state-of-the-art accuracy. However, single object tracking (SOT) remains a challenging task in real-world applications due to changes and deformations in a target object's appearance. Learning on all the collected samples may lead to catastrophic forgetting, and thereby corrupt the tracking model.
In this paper, SOT is formulated as an online incremental learning problem. A new method is proposed for dynamic sample selection and memory replay, preventing template corruption. In particular, we propose a change detection mechanism to detect gradual changes in object appearance and select the corresponding samples for online adaption. In addition, an entropy-based sample selection strategy is introduced to maintain a diversified auxiliary buffer for memory replay. Our proposed method can be integrated into any object tracking algorithm that leverages online learning for model adaptation.
Extensive experiments conducted on the OTB-100, LaSOT, UAV123, and TrackingNet datasets highlight the cost-effectiveness of our method, along with the contribution of its key components. Results indicate that integrating our proposed method into state-of-art adaptive Siamese trackers can increase the potential benefits of a template update strategy, and significantly improve performance.
Submitted 7 March, 2022;
originally announced March 2022.
-
Generative Target Update for Adaptive Siamese Tracking
Authors:
Madhu Kiran,
Le Thanh Nguyen-Meidine,
Rajat Sahay,
Rafael Menelau Oliveira E Cruz,
Louis-Antoine Blais-Morin,
Eric Granger
Abstract:
Siamese trackers perform similarity matching with templates (i.e., target models) to recursively localize objects within a search region. Several strategies have been proposed in the literature to update a template based on the tracker output, typically extracted from the target search region in the current frame, and thereby mitigate the effects of target drift. However, this may lead to corrupted templates, limiting the potential benefits of a template update strategy.
This paper proposes a model adaptation method for Siamese trackers that uses a generative model to produce a synthetic template from the object search regions of several previous frames, rather than directly using the tracker output. Since the search region encompasses the target, attention from the search region is used for robust model adaptation. In particular, our approach relies on an auto-encoder trained through adversarial learning to detect changes in a target object's appearance and predict a future target template, using a set of target templates localized from tracker outputs at previous frames. To prevent template corruption during the update, the proposed tracker also performs change detection using the generative model to suspend updates until the tracker stabilizes, and robust matching can resume through dynamic template fusion.
Extensive experiments conducted on VOT-16, VOT-17, OTB-50, and OTB-100 datasets highlight the effectiveness of our method, along with the impact of its key components. Results indicate that our proposed approach can outperform state-of-art trackers, and its overall robustness allows tracking for a longer time before failure.
Submitted 20 February, 2022;
originally announced February 2022.
-
Hyperparameter Tuning for Deep Reinforcement Learning Applications
Authors:
Mariam Kiran,
Melis Ozyildirim
Abstract:
Reinforcement learning (RL) applications, where an agent can simply learn optimal behaviors by interacting with the environment, are quickly gaining tremendous success in a wide variety of applications from controlling simple pendulums to complex data centers. However, setting the right hyperparameters can have a huge impact on the deployed solution performance and reliability in the inference models, produced via RL, used for decision-making. Hyperparameter search itself is a laborious process that requires many iterations and is computationally expensive to find the best settings that produce the best neural network architectures. In comparison to other neural network architectures, deep RL has not witnessed much hyperparameter tuning, due to its algorithm complexity and the simulation platforms needed. In this paper, we propose a distributed variable-length genetic algorithm framework to systematically tune hyperparameters for various RL applications, improving training time and robustness of the architecture, via evolution. We demonstrate the scalability of our approach on many RL problems (from simple gyms to complex applications) and compare it with a Bayesian approach. Our results show that with more generations, we find optimal solutions that require fewer training episodes and are computationally cheap while being more robust for deployment. Our results are imperative to advance deep reinforcement learning controllers for real-world problems.
Submitted 26 January, 2022;
originally announced January 2022.
-
MAMRL: Exploiting Multi-agent Meta Reinforcement Learning in WAN Traffic Engineering
Authors:
Shan Sun,
Mariam Kiran,
Wei Ren
Abstract:
Traffic optimization challenges, such as load balancing, flow scheduling, and improving packet delivery time, are difficult online decision-making problems in wide area networks (WAN). Complex heuristics are needed for instance to find optimal paths that improve packet delivery time and minimize interruptions which may be caused by link failures or congestion. The recent success of reinforcement learning (RL) algorithms can provide useful solutions to build better robust systems that learn from experience in model-free settings.
In this work, we consider a path optimization problem, specifically for packet routing, in large complex networks. We develop and evaluate a model-free approach, applying multi-agent meta reinforcement learning (MAMRL) that can determine the next-hop of each packet to get it delivered to its destination with minimum time overall. Specifically, we propose to leverage and compare deep policy optimization RL algorithms for enabling distributed model-free control in communication networks and present a novel meta-learning-based framework, MAMRL, for enabling quick adaptation to topology changes. To evaluate the proposed framework, we simulate with various WAN topologies. Our extensive packet-level simulation results show that compared to classical shortest path and traditional reinforcement learning approaches, MAMRL significantly reduces the average packet delivery time even when network demand increases; and compared to a non-meta deep policy optimization algorithm, our results show the reduction of packet loss in much fewer episodes when link failures occur while offering comparable average packet delivery time.
Submitted 29 November, 2021;
originally announced November 2021.
-
HYPPO: A Surrogate-Based Multi-Level Parallelism Tool for Hyperparameter Optimization
Authors:
Vincent Dumont,
Casey Garner,
Anuradha Trivedi,
Chelsea Jones,
Vidya Ganapati,
Juliane Mueller,
Talita Perciano,
Mariam Kiran,
Marc Day
Abstract:
We present a new software, HYPPO, that enables the automatic tuning of hyperparameters of various deep learning (DL) models. Unlike other hyperparameter optimization (HPO) methods, HYPPO uses adaptive surrogate models and directly accounts for uncertainty in model predictions to find accurate and reliable models that make robust predictions. Using asynchronous nested parallelism, we are able to significantly alleviate the computational burden of training complex architectures and quantifying the uncertainty. HYPPO is implemented in Python and can be used with both TensorFlow and PyTorch libraries. We demonstrate various software features on time-series prediction and image classification problems as well as a scientific application in computed tomography image reconstruction. Finally, we show that (1) we can reduce by an order of magnitude the number of evaluations necessary to find the most optimal region in the hyperparameter space and (2) we can reduce by two orders of magnitude the throughput for such HPO process to complete.
Submitted 4 October, 2021;
originally announced October 2021.
-
NetGraf: A Collaborative Network Monitoring Stack for Network Experimental Testbeds
Authors:
Divneet Kaur,
Bashir Mohammed,
Mariam Kiran
Abstract:
Network performance monitoring collects heterogeneous data such as network flow data to give an overview of network performance, and other metrics, necessary for diagnosing and optimizing service quality. However, due to disparate and heterogeneity, to obtain metrics and visualize entire data from several devices, engineers have to log into multiple dashboards. In this paper we present NetGraf, a complete end-to-end network monitoring stack, that uses open-source network monitoring tools and collects, aggregates, and visualizes network measurements on a single easy-to-use real-time Grafana dashboard. We develop a novel NetGraf architecture and can deploy it on any network testbed such as Chameleon Cloud by single easy-to-use script for a full view of network performance in one dashboard. This paper contributes to the theme of automating open-source network monitoring tools software setups and their usability for researchers looking to deploy an end-to-end monitoring stack on their own testbeds.
Submitted 18 March, 2021;
originally announced May 2021.
-
Holistic Guidance for Occluded Person Re-Identification
Authors:
Madhu Kiran,
R Gnana Praveen,
Le Thanh Nguyen-Meidine,
Soufiane Belharbi,
Louis-Antoine Blais-Morin,
Eric Granger
Abstract:
In real-world video surveillance applications, person re-identification (ReID) suffers from the effects of occlusions and detection errors. Despite recent advances, occlusions continue to corrupt the features extracted by state-of-art CNN backbones, and thereby deteriorate the accuracy of ReID systems. To address this issue, methods in the literature use an additional costly process such as pose estimation, where pose maps provide supervision to exclude occluded regions. In contrast, we introduce a novel Holistic Guidance (HG) method that relies only on person identity labels, and on the distribution of pairwise matching distances of datasets to alleviate the problem of occlusion, without requiring additional supervision. Hence, our proposed student-teacher framework is trained to address the occlusion problem by matching the distributions of between- and within-class distances (DCDs) of occluded samples with that of holistic (non-occluded) samples, thereby using the latter as a soft labeled reference to learn well separated DCDs. This approach is supported by our empirical study where the distribution of between- and within-class distances between images have more overlap in occluded than holistic datasets. In particular, features extracted from both datasets are jointly learned using the student model to produce an attention map that allows separating visible regions from occluded ones. In addition to this, a joint generative-discriminative backbone is trained with a denoising autoencoder, allowing the system to self-recover from occlusions. Extensive experiments on several challenging public datasets indicate that the proposed approach can outperform state-of-the-art methods on both occluded and holistic datasets
Submitted 22 July, 2023; v1 submitted 13 April, 2021;
originally announced April 2021.
-
Incremental Multi-Target Domain Adaptation for Object Detection with Efficient Domain Transfer
Authors:
Le Thanh Nguyen-Meidine,
Madhu Kiran,
Marco Pedersoli,
Jose Dolz,
Louis-Antoine Blais-Morin,
Eric Granger
Abstract:
Recent advances in unsupervised domain adaptation have significantly improved the recognition accuracy of CNNs by alleviating the domain shift between (labeled) source and (unlabeled) target data distributions. While the problem of single-target domain adaptation (STDA) for object detection has recently received much attention, multi-target domain adaptation (MTDA) remains largely unexplored, despite its practical relevance in several real-world applications, such as multi-camera video surveillance. Compared to the STDA problem that may involve large domain shifts between complex source and target distributions, MTDA faces additional challenges, most notably the computational requirements and catastrophic forgetting of previously-learned targets, which can depend on the order of target adaptations. STDA for detection can be applied to MTDA by adapting one model per target, or one common model with a mixture of data from target domains. However, these approaches are either costly or inaccurate. The only state-of-art MTDA method specialized for detection learns targets incrementally, one target at a time, and mitigates the loss of knowledge by using a duplicated detection model for knowledge distillation, which is computationally expensive and does not scale well to many domains. In this paper, we introduce an efficient approach for incremental learning that generalizes well to multiple target domains. Our MTDA approach is more suitable for real-world applications since it allows updating the detection model incrementally, without storing data from previous-learned target domains, nor retraining when a new target domain becomes available. Our proposed method, MTDA-DTM, achieved the highest level of detection accuracy compared against state-of-the-art approaches on several MTDA detection benchmarks and Wildtrack, a benchmark for multi-camera pedestrian detection.
Submitted 11 May, 2022; v1 submitted 13 April, 2021;
originally announced April 2021.
-
Mining Scientific Workflows for Anomalous Data Transfers
Authors:
Huy Tu,
George Papadimitriou,
Mariam Kiran,
Cong Wang,
Anirban Mandal,
Ewa Deelman,
Tim Menzies
Abstract:
Modern scientific workflows are data-driven and are often executed on distributed, heterogeneous, high-performance computing infrastructures. Anomalies and failures in the workflow execution cause loss of scientific productivity and inefficient use of the infrastructure. Hence, detecting, diagnosing, and mitigating these anomalies are immensely important for reliable and performant scientific workflows. Since these workflows rely heavily on high-performance network transfers that require strict QoS constraints, accurately detecting anomalous network performance is crucial to ensure reliable and efficient workflow execution. To address this challenge, we have developed X-FLASH, a network anomaly detection tool for faulty TCP workflow transfers. X-FLASH incorporates novel hyperparameter tuning and data mining approaches for improving the performance of the machine learning algorithms to accurately classify the anomalous TCP packets. X-FLASH leverages XGBoost as an ensemble model and couples XGBoost with a sequential optimizer, FLASH, borrowed from search-based Software Engineering to learn the optimal model parameters. X-FLASH found configurations that outperformed the existing approach up to 28%, 29%, and 40% relatively for F-measure, G-score, and recall in less than 30 evaluations. From (1) large improvement and (2) simple tuning, we recommend future research to have additional tuning study as a new standard, at least in the area of scientific workflow anomaly detection.
Submitted 22 March, 2021;
originally announced March 2021.
-
Knowledge Distillation Methods for Efficient Unsupervised Adaptation Across Multiple Domains
Authors:
Le Thanh Nguyen-Meidine,
Atif Belal,
Madhu Kiran,
Jose Dolz,
Louis-Antoine Blais-Morin,
Eric Granger
Abstract:
Beyond the complexity of CNNs that require training on large annotated datasets, the domain shift between design and operational data has limited the adoption of CNNs in many real-world applications. For instance, in person re-identification, videos are captured over a distributed set of cameras with non-overlapping viewpoints. The shift between the source (e.g. lab setting) and target (e.g. cameras) domains may lead to a significant decline in recognition accuracy. Additionally, state-of-the-art CNNs may not be suitable for such real-time applications given their computational requirements. Although several techniques have recently been proposed to address domain shift problems through unsupervised domain adaptation (UDA), or to accelerate/compress CNNs through knowledge distillation (KD), we seek to simultaneously adapt and compress CNNs to generalize well across multiple target domains. In this paper, we propose a progressive KD approach for unsupervised single-target DA (STDA) and multi-target DA (MTDA) of CNNs. Our method for KD-STDA adapts a CNN to a single target domain by distilling from a larger teacher CNN, trained on both target and source domain data in order to maintain its consistency with a common representation. Our proposed approach is compared against state-of-the-art methods for compression and STDA of CNNs on the Office31 and ImageClef-DA image classification datasets. It is also compared against state-of-the-art methods for MTDA on Digits, Office31, and OfficeHome. In both settings -- KD-STDA and KD-MTDA -- results indicate that our approach can achieve the highest level of accuracy across target domains, while requiring a comparable or lower CNN complexity.
Submitted 18 January, 2021;
originally announced January 2021.
-
Dynamic Graph Neural Network for Traffic Forecasting in Wide Area Networks
Authors:
Tanwi Mallick,
Mariam Kiran,
Bashir Mohammed,
Prasanna Balaprakash
Abstract:
Wide area networking infrastructures (WANs), particularly science and research WANs, are the backbone for moving large volumes of scientific data between experimental facilities and data centers. With demands growing at exponential rates, these networks are struggling to cope with large data volumes, real-time responses, and overall network performance. Network operators are increasingly looking for innovative ways to manage the limited underlying network resources. Forecasting network traffic is a critical capability for proactive resource management, congestion mitigation, and dedicated transfer provisioning. To this end, we propose a nonautoregressive graph-based neural network for multistep network traffic forecasting. Specifically, we develop a dynamic variant of diffusion convolutional recurrent neural networks to forecast traffic in research WANs. We evaluate the efficacy of our approach on real traffic from ESnet, the U.S. Department of Energy's dedicated science network. Our results show that compared to classical forecasting methods, our approach explicitly learns the dynamic nature of spatiotemporal traffic patterns, showing significant improvements in forecasting accuracy. Our technique can surpass existing statistical and deep learning approaches by achieving approximately 20% mean absolute percentage error for multiple hours of forecasts despite dynamic network traffic settings.
Submitted 28 August, 2020;
originally announced August 2020.
-
A Flow-Guided Mutual Attention Network for Video-Based Person Re-Identification
Authors:
Madhu Kiran,
Amran Bhuiyan,
Louis-Antoine Blais-Morin,
Mehrsan Javan,
Ismail Ben Ayed,
Eric Granger
Abstract:
Person Re-Identification (ReID) is a challenging problem in many video analytics and surveillance applications, where a person's identity must be associated across a distributed non-overlapping network of cameras. Video-based person ReID has recently gained much interest because it allows capturing discriminant spatio-temporal information from video clips that is unavailable for image-based ReID. Despite recent advances, deep learning (DL) models for video ReID often fail to leverage this information to improve the robustness of feature representations. In this paper, the motion pattern of a person is explored as an additional cue for ReID. In particular, a flow-guided Mutual Attention network is proposed for fusion of image and optical flow sequences using any 2D-CNN backbone, allowing temporal information to be encoded along with spatial appearance information. Our Mutual Attention network relies on the joint spatial attention between image and optical flow feature maps to activate a common set of salient features across them. In addition to flow-guided attention, we introduce a method to aggregate features from longer input streams for better video sequence-level representation. Our extensive experiments on three challenging video ReID datasets indicate that the proposed Mutual Attention network improves recognition accuracy considerably with respect to conventional gated-attention networks and state-of-the-art methods for video-based person ReID.
Submitted 4 October, 2020; v1 submitted 9 August, 2020;
originally announced August 2020.
-
Unsupervised Multi-Target Domain Adaptation Through Knowledge Distillation
Authors:
Le Thanh Nguyen-Meidine,
Atif Belal,
Madhu Kiran,
Jose Dolz,
Louis-Antoine Blais-Morin,
Eric Granger
Abstract:
Unsupervised domain adaptation (UDA) seeks to alleviate the problem of domain shift between the distribution of unlabeled data from the target domain w.r.t. labeled data from the source domain. While the single-target UDA scenario is well studied in the literature, Multi-Target Domain Adaptation (MTDA) remains largely unexplored despite its practical importance, e.g., in multi-camera video-surveillance applications. The MTDA problem can be addressed by adapting one specialized model per target domain, although this solution is too costly in many real-world applications. Blending multiple targets for MTDA has been proposed, yet this solution may lead to a reduction in model specificity and accuracy. In this paper, we propose a novel unsupervised MTDA approach to train a CNN that can generalize well across multiple target domains. Our Multi-Teacher MTDA (MT-MTDA) method relies on multi-teacher knowledge distillation (KD) to iteratively distill target domain knowledge from multiple teachers to a common student. The KD process is performed in a progressive manner, where the student is trained by each teacher on how to perform UDA for a specific target, instead of directly learning domain adapted features. Finally, instead of combining the knowledge from each teacher, MT-MTDA alternates between teachers that distill knowledge, thereby preserving the specificity of each target (teacher) when learning to adapt to the student. MT-MTDA is compared against state-of-the-art methods on several challenging UDA benchmarks, and empirical results show that our proposed model can provide a considerably higher level of accuracy across multiple target domains. Our code is available at: https://github.com/LIVIAETS/MT-MTDA
Submitted 19 November, 2020; v1 submitted 14 July, 2020;
originally announced July 2020.
-
Joint Progressive Knowledge Distillation and Unsupervised Domain Adaptation
Authors:
Le Thanh Nguyen-Meidine,
Eric Granger,
Madhu Kiran,
Jose Dolz,
Louis-Antoine Blais-Morin
Abstract:
Currently, the divergence in distributions of design and operational data, and large computational complexity are limiting factors in the adoption of CNNs in real-world applications. For instance, person re-identification systems typically rely on a distributed set of cameras, where each camera has different capture conditions. This can translate to a considerable shift between source (e.g. lab setting) and target (e.g. operational camera) domains. Given the cost of annotating image data captured for fine-tuning in each target domain, unsupervised domain adaptation (UDA) has become a popular approach to adapt CNNs. Moreover, state-of-the-art deep learning models that provide a high level of accuracy often rely on architectures that are too complex for real-time applications. Although several compression and UDA approaches have recently been proposed to overcome these limitations, they do not allow optimizing a CNN to simultaneously address both. In this paper, we propose an unexplored direction -- the joint optimization of CNNs to provide a compressed model that is adapted to perform well for a given target domain. In particular, the proposed approach performs unsupervised knowledge distillation (KD) from a complex teacher model to a compact student model, by leveraging both source and target data. It also improves upon existing UDA techniques by progressively teaching the student about domain-invariant features, instead of directly adapting a compact model on target domain data. Our method is compared against state-of-the-art compression and UDA techniques, using two popular classification datasets for UDA -- Office31 and ImageClef-DA. In both datasets, results indicate that our method can achieve the highest level of accuracy while requiring a comparable or lower time complexity.
Submitted 15 May, 2020;
originally announced May 2020.
-
Do optimization methods in deep learning applications matter?
Authors:
Buse Melis Ozyildirim,
Mariam Kiran
Abstract:
With advances in deep learning, exponential data growth and increasing model complexity, the development of efficient optimization methods is attracting much research attention. Several implementations favor the use of Conjugate Gradient (CG) and Stochastic Gradient Descent (SGD) as practical and elegant solutions to achieve quick convergence; however, these optimization processes also present many limitations in learning across deep learning applications. Recent research is exploring higher-order optimization functions as better approaches, but these present very complex computational challenges for practical use. Comparing first and higher-order optimization functions, our experiments in this paper reveal that Levenberg-Marquardt (LM) converges significantly better but suffers from very large processing time, increasing the training complexity of both classification and reinforcement learning problems. Our experiments compare off-the-shelf optimization functions (CG, SGD, LM and L-BFGS) in standard CIFAR, MNIST, CartPole and FlappyBird experiments. The paper presents arguments on which optimization functions to use and, further, which functions would benefit from parallelization efforts to improve pretraining time and learning rate convergence.
Submitted 28 February, 2020;
originally announced February 2020.
-
On the Interaction Between Deep Detectors and Siamese Trackers in Video Surveillance
Authors:
Madhu Kiran,
Vivek Tiwari,
Le Thanh Nguyen-Meidine,
Eric Granger
Abstract:
Visual object tracking is an important function in many real-time video surveillance applications, such as localization and spatio-temporal recognition of persons. In real-world applications, an object detector and tracker must interact on a periodic basis to discover new objects, and thereby to initiate tracks. Periodic interactions with the detector can also allow the tracker to validate and/or update its object template with new bounding boxes. However, bounding boxes provided by a state-of-the-art detector are noisy, due to changes in appearance, background and occlusion, which can cause the tracker to drift. Moreover, CNN-based detectors can provide a high level of accuracy at the expense of computational complexity, so interactions should be minimized for real-time applications.
In this paper, a new approach is proposed to manage detector-tracker interactions for trackers from the Siamese-FC family. By integrating a change detection mechanism into a deep Siamese-FC tracker, its template can be adapted in response to changes in a target's appearance that lead to drifts during tracking. An abrupt change detection triggers an update of tracker template using the bounding box produced by the detector, while in the case of a gradual change, the detector is used to update an evolving set of templates for robust matching.
Experiments were performed using state-of-the-art Siamese-FC trackers and the YOLOv3 detector on a subset of videos from the OTB-100 dataset that mimic video surveillance scenarios. Results highlight the importance for reliable VOT of using accurate detectors. They also indicate that our adaptive Siamese trackers are robust to noisy object detections, and can significantly improve the performance of Siamese-FC tracking.
Submitted 31 October, 2019;
originally announced October 2019.
-
Progressive Gradient Pruning for Classification, Detection and Domain Adaptation
Authors:
Le Thanh Nguyen-Meidine,
Eric Granger,
Madhu Kiran,
Louis-Antoine Blais-Morin,
Marco Pedersoli
Abstract:
Although deep neural networks (NNs) have achieved state-of-the-art accuracy in many visual recognition tasks, the growing computational complexity and energy consumption of networks remains an issue, especially for applications on platforms with limited resources and requiring real-time processing. Filter pruning techniques have recently shown promising results for the compression and acceleration of convolutional NNs (CNNs). However, these techniques involve numerous steps and complex optimisations because some only prune after training CNNs, while others prune from scratch during training by integrating sparsity constraints or modifying the loss function. In this paper we propose a new Progressive Gradient Pruning (PGP) technique for iterative filter pruning during training. In contrast to previous progressive pruning techniques, it relies on a novel filter selection criterion that measures the change in filter weights, uses a new hard and soft pruning strategy and effectively adapts momentum tensors during the backward propagation pass. Experimental results obtained after training various CNNs on image data for classification, object detection and domain adaptation benchmarks indicate that the PGP technique can achieve a better trade-off between classification accuracy and network (time and memory) complexity than PSFP and other state-of-the-art filter pruning techniques.
Submitted 25 February, 2020; v1 submitted 20 June, 2019;
originally announced June 2019.
-
A Comparison of CNN-based Face and Head Detectors for Real-Time Video Surveillance Applications
Authors:
Le Thanh Nguyen-Meidine,
Eric Granger,
Madhu Kiran,
Louis-Antoine Blais-Morin
Abstract:
Detecting faces and heads appearing in video feeds are challenging tasks in real-world video surveillance applications due to variations in appearance, occlusions and complex backgrounds. Recently, several CNN architectures have been proposed to increase the accuracy of detectors, although their computational complexity can be an issue, especially for real-time applications, where faces and heads must be detected live using high-resolution cameras. This paper compares the accuracy and complexity of state-of-the-art CNN architectures that are suitable for face and head detection. Single pass and region-based architectures are reviewed and compared empirically to baseline techniques according to accuracy and to time and memory complexity on images from several challenging datasets. The viability of these architectures is analyzed with real-time video surveillance applications in mind. Results suggest that, although CNN architectures can achieve a very high level of accuracy compared to traditional detectors, their computational cost can represent a limitation for many practical real-time applications.
Submitted 10 September, 2018;
originally announced September 2018.
-
Technical Report on Deploying a highly secured OpenStack Cloud Infrastructure using BradStack as a Case Study
Authors:
Bashir Mohammed,
Sibusiso Moyo,
K. M Maiyama,
Sulayman Kinteh,
Al Noaman M. K. Al-Shaidy,
M. A. Kamala,
M. Kiran
Abstract:
Cloud computing has emerged as a popular paradigm and an attractive model for providing a reliable distributed computing model. It is increasingly attracting huge attention both in academic research and industrial initiatives. Cloud deployments are paramount for institutions and organizations of all scales. The availability of a flexible, free open-source cloud platform designed with no proprietary software, and the ability to integrate it with legacy systems and third-party applications, are fundamental. OpenStack is free and open-source software released under the terms of the Apache license, with a fragmented and distributed architecture that makes it highly flexible. This project was initiated and aimed at designing a secured cloud infrastructure called BradStack, which is built on OpenStack in the Computing Laboratory at the University of Bradford. In this report, we present and discuss the steps required in deploying a secured BradStack multi-node cloud infrastructure and conducting penetration testing on OpenStack services to validate the effectiveness of the security controls on the BradStack platform. This report serves as a practical guideline, focusing on security and practical infrastructure-related issues. It also serves as a reference for institutions looking at the possibilities of implementing a secured cloud solution.
Submitted 25 December, 2017;
originally announced December 2017.
-
Converting a Systems Dynamic Model to an Agent-based model for studying the Bicoid morphogen gradient in Drosophila embryo
Authors:
Mariam Kiran,
Wei Liu
Abstract:
The concentration gradient of the Bicoid morphogen, which is established during the early stages of Drosophila melanogaster embryonic development, determines the differential spatial patterns of gene expression and subsequent cell fate determination. This is mainly achieved by diffusion elicited by the different concentrations of the Bicoid protein in the embryo. Such chemical dynamic progress can be simulated by stochastic models, particularly the Gillespie algorithm. However, as with various modelling approaches in biology, each technique involves drawing assumptions and reducing the model complexity, sometimes limiting the model capability. This is mainly due to the complexity of the software modelling approaches to construct these models. Agent-based modelling is a technique which is becoming increasingly popular for modelling the behaviour of individual molecules or cells in computational biology.
This paper attempts to compare these two popular modelling techniques of stochastic and agent-based modelling to show how the model can be studied in detail using the different approaches. This paper presents how to use these techniques with the advantages and disadvantages of using either of these. Through various comparisons, such as computation complexity and results obtained, we show that although the same model is implemented, both approaches can give varying results. The results of the paper show that the stochastic model is able to give smoother results compared to the agent-based model which may need further analysis at a later stage. We discuss the reasons for these results and how these could be rectified in systems biology research.
Submitted 17 December, 2014;
originally announced December 2014.
-
Experimental Report on Setting up a Cloud Computing Environment at the University of Bradford
Authors:
Bashir Mohammed,
Mariam Kiran
Abstract:
Cloud computing is increasingly attracting large attention in computing both in academic research and in industrial initiatives, emerging as a popular paradigm and an attractive model of providing computing, information technology (IT) infrastructure, network and storage to large and small enterprises in both the private and public sectors. This project was initiated and aimed at designing and setting up a basic cloud lab testbed running on OpenStack under VirtualBox for experiments and hosting cloud platforms in the networking laboratory at the University of Bradford. This report presents the methodology of setting up a cloud lab testbed for experiments running on OpenStack. Current resources in the networking lab at the university were used and turned into virtual platforms for cloud computing testing. This report serves as a practical guideline, concentrating on the practical infrastructure-related questions and issues, on setting up a cloud lab for testing and proof of concept. Finally, the report proposes an experimental validation showing the feasibility of migrating to the cloud.
The primary focus of this report is to provide a brief background on different theoretical concepts of cloud computing, particularly virtualisation, and then to elaborate on the practical aspects concerning the setup and implementation of a cloud lab testbed using open-source solutions. This report serves as a reference for institutions looking at the possibilities of implementing cloud solutions, so that they can benefit from the basics and from a view of the different aspects of cloud migration concepts.
Submitted 15 December, 2014;
originally announced December 2014.
-
Using FLAME Toolkit for Agent-Based Simulation: Case Study Sugarscape Model
Authors:
Mariam Kiran
Abstract:
Social scientists have used agent-based models to understand how individuals interact and behave in various political, ecological and economic scenarios. Agent-based models are ideal for understanding such models involving interacting individuals producing emergent phenomena. Sugarscape is one of the most famous examples of a social agent-based model which has been used to show how societies grow in the real world.
This paper builds on the Sugarscape model, using the Flexible Large scale Agent-based modelling Environment (FLAME) to simulate three different scenarios of the experiment, which are based on the Sugar and Citizen locations. FLAME is an agent-based modelling framework which has previously been used to model biological and economic models. The paper includes details on how the model was written and the various parameters set for the simulation. The results of the simulated model are processed for three scenarios and analysed to see what effect the initial starting states of the agents had on the overall result obtained through the model, and the variance in simulation time of processing the model on multicore architectures.
The experiments highlight that there are limitations of the FLAME framework, and of writing simulation models in general, which are highly dependent on the initial starting states of a model. They also raise further potential work which can be built into the Sugarscape model to study other interesting phenomena in social and economic laws.
Submitted 14 August, 2014;
originally announced August 2014.