Showing 1–27 of 27 results for author: Casale, G

  1. arXiv:2302.05630  [pdf, other]

    eess.SY cs.LG

    CILP: Co-simulation based Imitation Learner for Dynamic Resource Provisioning in Cloud Computing Environments

    Authors: Shreshth Tuli, Giuliano Casale, Nicholas R. Jennings

    Abstract: Intelligent Virtual Machine (VM) provisioning is central to cost and resource efficient computation in cloud computing environments. As bootstrapping VMs is time-consuming, a key challenge for latency-critical tasks is to predict future workload demands to provision VMs proactively. However, existing AI-based solutions tend to not holistically consider all crucial aspects such as provisioning over…

    Submitted 16 April, 2023; v1 submitted 11 February, 2023; originally announced February 2023.

    Comments: Accepted in IEEE Transactions on Network and Service Management

  2. arXiv:2212.01302  [pdf, other]

    cs.DC cs.AI

    DeepFT: Fault-Tolerant Edge Computing using a Self-Supervised Deep Surrogate Model

    Authors: Shreshth Tuli, Giuliano Casale, Ludmila Cherkasova, Nicholas R. Jennings

    Abstract: The emergence of latency-critical AI applications has been supported by the evolution of the edge computing paradigm. However, edge solutions are typically resource-constrained, posing reliability challenges due to heightened contention for compute and communication capacities and faulty application behavior in the presence of overload conditions. Although a large amount of generated log data can…

    Submitted 2 December, 2022; originally announced December 2022.

    Comments: Accepted in IEEE INFOCOM 2023

  3. arXiv:2210.04595  [pdf, other]

    cs.DC eess.SY

    SampleHST: Efficient On-the-Fly Selection of Distributed Traces

    Authors: Alim Ul Gias, Yicheng Gao, Matthew Sheldon, José A. Perusquía, Owen O'Brien, Giuliano Casale

    Abstract: Since only a small number of traces generated from distributed tracing helps in troubleshooting, its storage requirement can be significantly reduced by biasing the selection towards anomalous traces. To aid in this scenario, we propose SampleHST, a novel approach to sample on-the-fly from a stream of traces in an unsupervised manner. SampleHST adjusts the storage quota of normal and anomalous tra…

    Submitted 9 September, 2022; originally announced October 2022.

    Comments: 10 pages, 5 figures

  4. arXiv:2208.07658  [pdf, other]

    cs.DC cs.AI cs.PF

    DRAGON: Decentralized Fault Tolerance in Edge Federations

    Authors: Shreshth Tuli, Giuliano Casale, Nicholas R. Jennings

    Abstract: Edge Federation is a new computing paradigm that seamlessly interconnects the resources of multiple edge service providers. A key challenge in such systems is the deployment of latency-critical and AI based resource-intensive applications in constrained devices. To address this challenge, we propose a novel memory-efficient deep learning based model, namely generative optimization networks (GON).…

    Submitted 16 August, 2022; originally announced August 2022.

    Comments: Accepted in IEEE Transactions on Network and Service Management (TNSM)

  5. arXiv:2208.00761  [pdf, other]

    cs.DC

    AI Augmented Edge and Fog Computing: Trends and Challenges

    Authors: Shreshth Tuli, Fatemeh Mirhakimi, Samodha Pallewatta, Syed Zawad, Giuliano Casale, Bahman Javadi, Feng Yan, Rajkumar Buyya, Nicholas R. Jennings

    Abstract: In recent years, the landscape of computing paradigms has witnessed a gradual yet remarkable shift from monolithic computing to distributed and decentralized paradigms such as Internet of Things (IoT), Edge, Fog, Cloud, and Serverless. The frontiers of these computing technologies have been boosted by the shift from manually encoded algorithms to Artificial Intelligence (AI)-driven autonomous systems…

    Submitted 14 April, 2023; v1 submitted 1 August, 2022; originally announced August 2022.

    Comments: Accepted in Elsevier Journal of Network and Computer Applications

  6. arXiv:2205.10642  [pdf, other]

    cs.DC cs.AI

    MetaNet: Automated Dynamic Selection of Scheduling Policies in Cloud Environments

    Authors: Shreshth Tuli, Giuliano Casale, Nicholas R. Jennings

    Abstract: Task scheduling is a well-studied problem in the context of optimizing the Quality of Service (QoS) of cloud computing environments. In order to sustain the rapid growth of computational demands, one of the most important QoS metrics for cloud schedulers is the execution cost. In this regard, several data-driven deep neural networks (DNNs) based schedulers have been proposed in recent years to all…

    Submitted 21 May, 2022; originally announced May 2022.

    Comments: Accepted in IEEE CLOUD 2022

  7. arXiv:2205.10640  [pdf, other]

    cs.DC cs.PF

    Learning to Dynamically Select Cost Optimal Schedulers in Cloud Computing Environments

    Authors: Shreshth Tuli, Giuliano Casale, Nicholas R. Jennings

    Abstract: The operational cost of a cloud computing platform is one of the most significant Quality of Service (QoS) criteria for schedulers, crucial to keep up with the growing computational demands. Several data-driven deep neural network (DNN)-based schedulers have been proposed in recent years that outperform alternative approaches by providing scalable and effective resource management for dynamic work…

    Submitted 21 May, 2022; originally announced May 2022.

    Comments: Accepted as a poster in SIGMETRICS 2022

  8. arXiv:2205.10635  [pdf, other]

    cs.DC cs.AI cs.PF

    SplitPlace: AI Augmented Splitting and Placement of Large-Scale Neural Networks in Mobile Edge Environments

    Authors: Shreshth Tuli, Giuliano Casale, Nicholas R. Jennings

    Abstract: In recent years, deep learning models have become ubiquitous in industry and academia alike. Deep neural networks can solve some of the most complex pattern-recognition problems today, but come with the price of massive compute and memory requirements. This makes the problem of deploying such large-scale neural networks challenging in resource-constrained mobile edge computing platforms, specifica…

    Submitted 21 May, 2022; originally announced May 2022.

    Comments: Accepted in IEEE Transactions on Mobile Computing

  9. arXiv:2205.04575  [pdf, other]

    cs.PF cs.NI

    JCSP: Joint Caching and Service Placement for Edge Computing Systems

    Authors: Yicheng Gao, Giuliano Casale

    Abstract: With constrained resources, what, where, and how to cache at the edge is one of the key challenges for edge computing systems. The cached items include not only the application data contents but also the local caching of edge services that handle incoming requests. However, current systems separate the contents and services without considering the latency interplay of caching and queueing. Therefo…

    Submitted 9 May, 2022; originally announced May 2022.

  10. arXiv:2203.07140  [pdf, other]

    cs.DC cs.LG

    CAROL: Confidence-Aware Resilience Model for Edge Federations

    Authors: Shreshth Tuli, Giuliano Casale, Nicholas R. Jennings

    Abstract: In recent years, the deployment of large-scale Internet of Things (IoT) applications has given rise to edge federations that seamlessly interconnect and leverage resources from multiple edge service providers. The requirement of supporting both latency-sensitive and compute-intensive IoT tasks necessitates service resilience, especially for the broker nodes in typical broker-worker deployment desi…

    Submitted 14 March, 2022; originally announced March 2022.

    Comments: Accepted in DSN 2022

  11. arXiv:2201.07284  [pdf, other]

    cs.LG

    TranAD: Deep Transformer Networks for Anomaly Detection in Multivariate Time Series Data

    Authors: Shreshth Tuli, Giuliano Casale, Nicholas R. Jennings

    Abstract: Efficient anomaly detection and diagnosis in multivariate time-series data is of great importance for modern industrial applications. However, building a system that is able to quickly and accurately pinpoint anomalous observations is a challenging problem. This is due to the lack of anomaly labels, high data volatility and the demands of ultra-low inference times in modern applications. Despite t…

    Submitted 14 May, 2022; v1 submitted 18 January, 2022; originally announced January 2022.

    Comments: Accepted in VLDB 2022

  12. arXiv:2112.08916  [pdf, other]

    cs.DC cs.LG cs.PF

    GOSH: Task Scheduling Using Deep Surrogate Models in Fog Computing Environments

    Authors: Shreshth Tuli, Giuliano Casale, Nicholas R. Jennings

    Abstract: Recently, intelligent scheduling approaches using surrogate models have been proposed to efficiently allocate volatile tasks in heterogeneous fog environments. Advances like deterministic surrogate models, deep neural networks (DNN) and gradient-based optimization allow low energy consumption and response times to be reached. However, deterministic surrogate models, which estimate objective values…

    Submitted 16 December, 2021; originally announced December 2021.

    Comments: Accepted in IEEE Transactions on Parallel and Distributed Systems (Special Issue on PDC for AI), 2022

  13. arXiv:2112.07269  [pdf, other]

    cs.DC cs.AI cs.PF

    MCDS: AI Augmented Workflow Scheduling in Mobile Edge Cloud Computing Systems

    Authors: Shreshth Tuli, Giuliano Casale, Nicholas R. Jennings

    Abstract: Workflow scheduling is a long-studied problem in parallel and distributed computing (PDC), aiming to efficiently utilize compute resources to meet user's service requirements. Recently proposed scheduling methods leverage the low response times of edge computing platforms to optimize application Quality of Service (QoS). However, scheduling workflow applications in mobile edge-cloud systems is cha…

    Submitted 14 December, 2021; originally announced December 2021.

    Comments: Accepted in IEEE Transactions on Parallel and Distributed Systems (Special Issue on PDC for AI), 2022

  14. arXiv:2112.02292  [pdf, other]

    cs.DC cs.LG

    PreGAN: Preemptive Migration Prediction Network for Proactive Fault-Tolerant Edge Computing

    Authors: Shreshth Tuli, Giuliano Casale, Nicholas R. Jennings

    Abstract: Building a fault-tolerant edge system that can quickly react to node overloads or failures is challenging due to the unreliability of edge devices and the strict service deadlines of modern applications. Moreover, unnecessary task migrations can stress the system network, giving rise to the need for a smart and parsimonious failure recovery scheme. Prior approaches often fail to adapt to highly vo…

    Submitted 4 December, 2021; originally announced December 2021.

    Comments: Accepted in Infocom 2022

  15. arXiv:2111.10241  [pdf, other]

    cs.DC cs.PF

    START: Straggler Prediction and Mitigation for Cloud Computing Environments using Encoder LSTM Networks

    Authors: Shreshth Tuli, Sukhpal Singh Gill, Peter Garraghan, Rajkumar Buyya, Giuliano Casale, Nicholas R. Jennings

    Abstract: Modern large-scale computing systems distribute jobs into multiple smaller tasks which execute in parallel to accelerate job completion rates and reduce energy consumption. However, a common performance problem in such systems is dealing with straggler tasks that are slow running instances that increase the overall response time. Such tasks can significantly impact the system's Quality of Service…

    Submitted 19 November, 2021; originally announced November 2021.

    Comments: Accepted in IEEE Transactions on Services Computing, 2021

  16. HUNTER: AI based Holistic Resource Management for Sustainable Cloud Computing

    Authors: Shreshth Tuli, Sukhpal Singh Gill, Minxian Xu, Peter Garraghan, Rami Bahsoon, Schahram Dustdar, Rizos Sakellariou, Omer Rana, Rajkumar Buyya, Giuliano Casale, Nicholas R. Jennings

    Abstract: The worldwide adoption of cloud data centers (CDCs) has given rise to the ubiquitous demand for hosting application services on the cloud. Further, contemporary data-intensive industries have seen a sharp upsurge in the resource requirements of modern applications. This has led to the provisioning of an increased number of cloud servers, giving rise to higher energy consumption and, consequently,…

    Submitted 28 October, 2021; v1 submitted 11 October, 2021; originally announced October 2021.

    Comments: Accepted in Elsevier Journal of Systems and Software, 2021

  17. arXiv:2110.02912  [pdf, other]

    cs.LG

    Generative Optimization Networks for Memory Efficient Data Generation

    Authors: Shreshth Tuli, Shikhar Tuli, Giuliano Casale, Nicholas R. Jennings

    Abstract: In standard generative deep learning models, such as autoencoders or GANs, the size of the parameter set is proportional to the complexity of the generated data distribution. A significant challenge is to deploy resource-hungry deep learning models in devices with limited memory to prevent system upgrade costs. To combat this, we propose a novel framework called generative optimization networks (G…

    Submitted 28 October, 2021; v1 submitted 6 October, 2021; originally announced October 2021.

    Comments: Accepted in NeurIPS 2021 - Workshop on ML for Systems

  18. arXiv:2106.01847  [pdf, other]

    cs.PF

    Towards Cost-Optimal Policies for DAGs to Utilize IaaS Clouds with Online Learning

    Authors: Xiaohu Wu, Han Yu, Giuliano Casale, Guanyu Gao

    Abstract: Premier cloud service providers (CSPs) offer two types of purchase options, namely on-demand and spot instances, with time-varying features in availability and price. Users like startups have to operate on a limited budget and similarly others hope to reduce their costs. While interacting with a CSP, central to their concerns is the process of cost-effectively utilizing different purchase options…

    Submitted 3 June, 2021; originally announced June 2021.

  19. COSCO: Container Orchestration using Co-Simulation and Gradient Based Optimization for Fog Computing Environments

    Authors: Shreshth Tuli, Shivananda Poojara, Satish N. Srirama, Giuliano Casale, Nicholas R. Jennings

    Abstract: Intelligent task placement and management of tasks in large-scale fog platforms is challenging due to the highly volatile nature of modern workload applications and sensitive user requirements of low energy consumption and response time. Container orchestration platforms have emerged to alleviate this problem with prior art either using heuristics to quickly reach scheduling decisions or AI driven…

    Submitted 9 July, 2021; v1 submitted 29 April, 2021; originally announced April 2021.

    Comments: Accepted in IEEE Transactions on Parallel and Distributed Systems, 2021

  20. arXiv:2007.15314  [pdf, other]

    cs.PF cs.GT

    Delay and Price Differentiation in Cloud Computing: A Service Model, Supporting Architectures, and Performance

    Authors: Xiaohu Wu, Francesco De Pellegrini, Giuliano Casale

    Abstract: Many cloud service providers (CSPs) provide on-demand service at a price with a small delay. We propose a QoS-differentiated model where multiple SLAs deliver both on-demand service for latency-critical users and delayed services for delay-tolerant users at lower prices. Two architectures are considered to fulfill SLAs. The first is based on priority queues. The second simply separates servers int…

    Submitted 30 July, 2020; originally announced July 2020.

  21. arXiv:2007.01222  [pdf, other]

    cs.PF

    COCOA: Cold Start Aware Capacity Planning for Function-as-a-Service Platforms

    Authors: Alim Ul Gias, Giuliano Casale

    Abstract: Function-as-a-Service (FaaS) is increasingly popular in the software industry due to the implied cost-savings in event-driven workloads and its synergy with DevOps. To size an on-premise FaaS platform, it is important to estimate the required CPU and memory capacity to serve the expected loads. Given the service-level agreements, it is however challenging to take the cold start issue into account…

    Submitted 2 July, 2020; originally announced July 2020.

    Comments: 8 pages, 9 figures

  22. arXiv:1902.01321  [pdf, other]

    cs.PF cs.DC cs.GT

    A Framework for Allocating Server Time to Spot and On-demand Services in Cloud Computing

    Authors: Xiaohu Wu, Francesco De Pellegrini, Guanyu Gao, Giuliano Casale

    Abstract: Cloud computing delivers value to users by facilitating their access to computing capacity in periods when their need arises. An approach is to provide both on-demand and spot services on shared servers. The former allows users to access servers on demand at a fixed price and users occupy different periods of servers. The latter allows users to bid for the remaining unoccupied periods via dynamic…

    Submitted 1 September, 2019; v1 submitted 4 February, 2019; originally announced February 2019.

  23. arXiv:1807.08673  [pdf, ps, other]

    stat.ME cs.PF stat.CO

    Variational inequalities and mean-field approximations for partially observed systems of queueing networks

    Authors: Iker Perez, Giuliano Casale

    Abstract: Queueing networks are systems of theoretical interest that find widespread use in the performance evaluation of interconnected resources. In comparison to counterpart models in genetics or mathematical biology, the stochastic (jump) processes induced by queueing networks have distinctive coupling and synchronization properties. This has prevented the derivation of variational approximations for co…

    Submitted 27 June, 2019; v1 submitted 23 July, 2018; originally announced July 2018.

  24. arXiv:1711.09123  [pdf, other]

    cs.DC

    A Manifesto for Future Generation Cloud Computing: Research Directions for the Next Decade

    Authors: Rajkumar Buyya, Satish Narayana Srirama, Giuliano Casale, Rodrigo Calheiros, Yogesh Simmhan, Blesson Varghese, Erol Gelenbe, Bahman Javadi, Luis Miguel Vaquero, Marco A. S. Netto, Adel Nadjaran Toosi, Maria Alejandra Rodriguez, Ignacio M. Llorente, Sabrina De Capitani di Vimercati, Pierangela Samarati, Dejan Milojicic, Carlos Varela, Rami Bahsoon, Marcos Dias de Assuncao, Omer Rana, Wanlei Zhou, Hai Jin, Wolfgang Gentzsch, Albert Y. Zomaya, Haiying Shen

    Abstract: The Cloud computing paradigm has revolutionised the computer science horizon during the past decade and has enabled the emergence of computing as the fifth utility. It has captured significant attention of academia, industries, and government bodies. Now, it has emerged as the backbone of modern economy by offering subscription-based services anytime, anywhere following a pay-as-you-go model. This…

    Submitted 24 August, 2018; v1 submitted 24 November, 2017; originally announced November 2017.

    Comments: 51 pages, 3 figures

  25. arXiv:1704.05867  [pdf, ps, other]

    cs.PF math.MG

    A note on integrating products of linear forms over the unit simplex

    Authors: Giuliano Casale

    Abstract: Integrating a product of linear forms over the unit simplex can be done in polynomial time if the number of variables n is fixed (V. Baldoni et al., 2011). In this note, we highlight that this problem is equivalent to obtaining the normalizing constant of state probabilities for a popular class of Markov processes used in queueing network theory. In light of this equivalence, we survey existing co…

    Submitted 8 March, 2023; v1 submitted 19 April, 2017; originally announced April 2017.

    ACM Class: C.4; G.2

  26. arXiv:1606.06543  [pdf, other]

    cs.DC

    An Uncertainty-Aware Approach to Optimal Configuration of Stream Processing Systems

    Authors: Pooyan Jamshidi, Giuliano Casale

    Abstract: Finding optimal configurations for Stream Processing Systems (SPS) is a challenging problem due to the large number of parameters that can influence their performance and the lack of analytical models to anticipate the effect of a change. To tackle this issue, we consider tuning methods where an experimenter is given a limited budget of experiments and needs to carefully allocate this budget to fi…

    Submitted 21 June, 2016; originally announced June 2016.

    Comments: MASCOTS 2016, code is available at https://github.com/dice-project/DICE-Configuration-BO4CO

  27. arXiv:0902.3065  [pdf, ps, other]

    cs.PF

    The Multi-Branched Method of Moments for Queueing Networks

    Authors: Giuliano Casale

    Abstract: We propose a new exact solution algorithm for closed multiclass product-form queueing networks that is several orders of magnitude faster and less memory consuming than established methods for multiclass models, such as the Mean Value Analysis (MVA) algorithm. The technique is an important generalization of the recently proposed Method of Moments (MoM) which, differently from MVA, recursively co…

    Submitted 18 February, 2009; originally announced February 2009.

    ACM Class: C.4