Showing 1–23 of 23 results for author: Fonseca, R

  1. arXiv:2405.06856  [pdf, other]

    cs.DC

    Aladdin: Joint Placement and Scaling for SLO-Aware LLM Serving

    Authors: Chengyi Nie, Rodrigo Fonseca, Zhenhua Liu

    Abstract: The demand for large language model (LLM) inference is gradually dominating the artificial intelligence workloads. Therefore, there is an urgent need for cost-efficient inference serving. Existing work focuses on single-worker optimization and lacks consideration of cluster-level management for both inference queries and computing resources. However, placing requests and managing resources without…

    Submitted 10 May, 2024; originally announced May 2024.

  2. arXiv:2404.19143  [pdf, other]

    cs.DC

    Workload Intelligence: Punching Holes Through the Cloud Abstraction

    Authors: Lexiang Huang, Anjaly Parayil, Jue Zhang, Xiaoting Qin, Chetan Bansal, Jovan Stojkovic, Pantea Zardoshti, Pulkit Misra, Eli Cortez, Raphael Ghelman, Íñigo Goiri, Saravan Rajmohan, Jim Kleewein, Rodrigo Fonseca, Timothy Zhu, Ricardo Bianchini

    Abstract: Today, cloud workloads are essentially opaque to the cloud platform. Typically, the only information the platform receives is the virtual machine (VM) type and possibly a decoration to the type (e.g., the VM is evictable). Similarly, workloads receive little to no information from the platform; generally, workloads might receive telemetry from their VMs or exceptional signals (e.g., shortly before…

    Submitted 29 April, 2024; originally announced April 2024.

  3. arXiv:2403.04123  [pdf, other]

    cs.SE cs.CL cs.LG

    Exploring LLM-based Agents for Root Cause Analysis

    Authors: Devjeet Roy, Xuchao Zhang, Rashi Bhave, Chetan Bansal, Pedro Las-Casas, Rodrigo Fonseca, Saravan Rajmohan

    Abstract: The growing complexity of cloud based software systems has resulted in incident management becoming an integral part of the software development lifecycle. Root cause analysis (RCA), a critical part of the incident management process, is a demanding task for on-call engineers, requiring deep domain knowledge and extensive experience with a team's specific services. Automation of RCA can result in…

    Submitted 6 March, 2024; originally announced March 2024.

  4. arXiv:2403.03377  [pdf, other]

    cs.DC

    Junctiond: Extending FaaS Runtimes with Kernel-Bypass

    Authors: Enrique Saurez, Joshua Fried, Gohar Irfan Chaudhry, Esha Choukse, Íñigo Goiri, Sameh Elnikety, Adam Belay, Rodrigo Fonseca

    Abstract: This report explores the use of kernel-bypass networking in FaaS runtimes and demonstrates how using Junction, a novel kernel-bypass system, as the backend for executing components in faasd can enhance performance and isolation. Junction achieves this by reducing network and compute overheads and minimizing interactions with the host operating system. Junctiond, the integration of Junction with fa…

    Submitted 7 March, 2024; v1 submitted 5 March, 2024; originally announced March 2024.

  5. arXiv:2309.05833  [pdf, other]

    cs.CL cs.AI cs.LG cs.SE

    PACE-LM: Prompting and Augmentation for Calibrated Confidence Estimation with GPT-4 in Cloud Incident Root Cause Analysis

    Authors: Dylan Zhang, Xuchao Zhang, Chetan Bansal, Pedro Las-Casas, Rodrigo Fonseca, Saravan Rajmohan

    Abstract: Major cloud providers have employed advanced AI-based solutions like large language models to aid humans in identifying the root causes of cloud incidents. Despite the growing prevalence of AI-driven assistants in the root cause analysis process, their effectiveness in assisting on-call engineers is constrained by low accuracy due to the intrinsic difficulty of the task, a propensity for LLM-based…

    Submitted 29 September, 2023; v1 submitted 11 September, 2023; originally announced September 2023.

  6. arXiv:2301.04022  [pdf, other]

    cs.LG math.ST

    Distributed Sparse Linear Regression under Communication Constraints

    Authors: Rodney Fonseca, Boaz Nadler

    Abstract: In multiple domains, statistical tasks are performed in distributed settings, with data split among several end machines that are connected to a fusion center. In various applications, the end machines have limited bandwidth and power, and thus a tight communication budget. In this work we focus on distributed learning of a sparse linear regression model, under severe communication constraints. We…

    Submitted 9 January, 2023; originally announced January 2023.

    Comments: 33 pages, 4 figures

    MSC Class: 62J07; 62J05; 68W15

  7. arXiv:2209.14967  [pdf, other]

    stat.ML cs.LG

    Statistical Learning and Inverse Problems: A Stochastic Gradient Approach

    Authors: Yuri R. Fonseca, Yuri F. Saporito

    Abstract: Inverse problems are paramount in Science and Engineering. In this paper, we consider the setup of Statistical Inverse Problem (SIP) and demonstrate how Stochastic Gradient Descent (SGD) algorithms can be used in the linear SIP setting. We provide consistency and finite sample bounds for the excess risk. We also propose a modification for the SGD algorithm where we leverage machine learning method…

    Submitted 27 November, 2022; v1 submitted 29 September, 2022; originally announced September 2022.

  8. arXiv:2106.12485  [pdf, other]

    cs.DC physics.comp-ph physics.plasm-ph

    Particle-In-Cell Simulation using Asynchronous Tasking

    Authors: Nicolas Guidotti, Pedro Ceyrat, João Barreto, José Monteiro, Rodrigo Rodrigues, Ricardo Fonseca, Xavier Martorell, Antonio J. Peña

    Abstract: Recently, task-based programming models have emerged as a prominent alternative among shared-memory parallel programming paradigms. Inherently asynchronous, these models provide native support for dynamic load balancing and incorporate data flow concepts to selectively synchronize the tasks. However, tasking models are yet to be widely adopted by the HPC community and their effective advantages wh…

    Submitted 29 August, 2021; v1 submitted 23 June, 2021; originally announced June 2021.

    Comments: Published on the 27th European Conference on Parallel and Distributed Computing (Euro-Par 2021)

    Journal ref: Euro-Par 2021: Parallel Processing. Lecture Notes in Computer Science, vol 12820, pp. 482-498

  9. arXiv:2105.14845  [pdf, other]

    cs.DC cs.PF

    With Great Freedom Comes Great Opportunity: Rethinking Resource Allocation for Serverless Functions

    Authors: Muhammad Bilal, Marco Canini, Rodrigo Fonseca, Rodrigo Rodrigues

    Abstract: Current serverless offerings give users a limited degree of flexibility for configuring the resources allocated to their function invocations by either coupling memory and CPU resources together or providing no knobs at all. These configuration choices simplify resource allocation decisions on behalf of users, but at the same time, create deployments that are resource inefficient. In this paper,…

    Submitted 31 May, 2021; originally announced May 2021.

  10. arXiv:2104.13869  [pdf, other]

    cs.DC

    Faa$T: A Transparent Auto-Scaling Cache for Serverless Applications

    Authors: Francisco Romero, Gohar Irfan Chaudhry, Íñigo Goiri, Pragna Gopa, Paul Batum, Neeraja J. Yadwadkar, Rodrigo Fonseca, Christos Kozyrakis, Ricardo Bianchini

    Abstract: Function-as-a-Service (FaaS) has become an increasingly popular way for users to deploy their applications without the burden of managing the underlying infrastructure. However, existing FaaS platforms rely on remote storage to maintain state, limiting the set of applications that can be run efficiently. Recent caching work for FaaS platforms has tried to address this problem, but has fallen short…

    Submitted 28 April, 2021; originally announced April 2021.

    Comments: 18 pages, 15 figures

  11. arXiv:2007.00708  [pdf, other]

    cs.LG cs.AI cs.RO math.OC stat.ML

    Learning Search Space Partition for Black-box Optimization using Monte Carlo Tree Search

    Authors: Linnan Wang, Rodrigo Fonseca, Yuandong Tian

    Abstract: High dimensional black-box optimization has broad applications but remains a challenging problem to solve. Given a set of samples $\{\mathbf{x}_i, y_i\}$, building a global model (like Bayesian Optimization (BO)) suffers from the curse of dimensionality in the high-dimensional search space, while a greedy search may lead to sub-optimality. By recursively splitting the search space into regions with high/…

    Submitted 13 March, 2022; v1 submitted 1 July, 2020; originally announced July 2020.

  12. arXiv:2006.06863  [pdf, other]

    cs.LG cs.NE

    Few-shot Neural Architecture Search

    Authors: Yiyang Zhao, Linnan Wang, Yuandong Tian, Rodrigo Fonseca, Tian Guo

    Abstract: Efficient evaluation of a network architecture drawn from a large search space remains a key challenge in Neural Architecture Search (NAS). Vanilla NAS evaluates each architecture by training from scratch, which gives the true performance but is extremely time-consuming. Recently, one-shot NAS substantially reduces the computation cost by training only one supernetwork, a.k.a. supernet, to approxi…

    Submitted 1 August, 2021; v1 submitted 11 June, 2020; originally announced June 2020.

  13. arXiv:2003.03423  [pdf, other]

    cs.DC

    Serverless in the Wild: Characterizing and Optimizing the Serverless Workload at a Large Cloud Provider

    Authors: Mohammad Shahrad, Rodrigo Fonseca, Íñigo Goiri, Gohar Chaudhry, Paul Batum, Jason Cooke, Eduardo Laureano, Colby Tresness, Mark Russinovich, Ricardo Bianchini

    Abstract: Function as a Service (FaaS) has been gaining popularity as a way to deploy computations to serverless backends in the cloud. This paradigm shifts the complexity of allocating and provisioning resources to the cloud provider, which has to provide the illusion of always-available resources (i.e., fast function invocations without cold starts) at the lowest possible resource cost. Doing so requires…

    Submitted 5 June, 2020; v1 submitted 6 March, 2020; originally announced March 2020.

    Comments: 14 pages, 20 figures. Corrected and published in USENIX ATC, July 2020. For accompanying dataset, see https://github.com/Azure/AzurePublicDataset

  14. arXiv:1907.07762  [pdf, other]

    cs.CY eess.SY

    Agro 4.0: A Green Information System for Sustainable Agroecosystem Management

    Authors: Eugênio Pacceli Reis da Fonseca, Evandro Caldeira, Heitor Soares Ramos Filho, Leonardo Barbosa e Oliveira, Adriano César Machado Pereira, Pierre Santos Vilela

    Abstract: Agriculture is one of the most critical activities developed today by humankind and is in constant technical evolution to supply food and other essential products to everlasting and increasing demand. New machines, seeds, and fertilizers were developed to increase the productivity of cultivated areas. It is estimated that by 2050 we will have a population of 9 billion people and the production of…

    Submitted 11 July, 2019; originally announced July 2019.

  15. arXiv:1906.06832  [pdf, other]

    cs.LG cs.CV stat.ML

    Sample-Efficient Neural Architecture Search by Learning Action Space

    Authors: Linnan Wang, Saining Xie, Teng Li, Rodrigo Fonseca, Yuandong Tian

    Abstract: Neural Architecture Search (NAS) has emerged as a promising technique for automatic neural network design. However, existing MCTS based NAS approaches often utilize manually designed action space, which is not directly related to the performance metric to be optimized (e.g., accuracy), leading to sample-inefficient explorations of architectures. To improve the sample efficiency, this paper propose…

    Submitted 31 March, 2021; v1 submitted 16 June, 2019; originally announced June 2019.

    Comments: Accepted at TPAMI-2021

  16. arXiv:1903.11059  [pdf, other]

    cs.CV

    AlphaX: eXploring Neural Architectures with Deep Neural Networks and Monte Carlo Tree Search

    Authors: Linnan Wang, Yiyang Zhao, Yuu Jinnai, Yuandong Tian, Rodrigo Fonseca

    Abstract: Neural Architecture Search (NAS) has shown great success in automating the design of neural networks, but the prohibitive amount of computations behind current NAS methods requires further investigations in improving the sample efficiency and the network evaluation cost to get better results in a shorter time. In this paper, we present a novel scalable Monte Carlo Tree Search (MCTS) based NAS agen…

    Submitted 1 October, 2019; v1 submitted 26 March, 2019; originally announced March 2019.

    Comments: another search algorithm for NAS. arXiv admin note: substantial text overlap with arXiv:1805.07440

  17. arXiv:1811.08596  [pdf, other]

    cs.DC

    SuperNeurons: FFT-based Gradient Sparsification in the Distributed Training of Deep Neural Networks

    Authors: Linnan Wang, Wei Wu, Junyu Zhang, Hang Liu, George Bosilca, Maurice Herlihy, Rodrigo Fonseca

    Abstract: The performance and efficiency of distributed training of Deep Neural Networks highly depend on the performance of gradient averaging among all participating nodes, which is bounded by the communication between nodes. There are two major strategies to reduce communication overhead: one is to hide communication by overlapping it with computation, and the other is to reduce message sizes. The first…

    Submitted 8 February, 2021; v1 submitted 20 November, 2018; originally announced November 2018.

    Comments: 10 pages

  18. arXiv:1808.03322  [pdf, other]

    cs.CR cs.RO

    Scanning the Internet for ROS: A View of Security in Robotics Research

    Authors: Nicholas DeMarinis, Stefanie Tellex, Vasileios Kemerlis, George Konidaris, Rodrigo Fonseca

    Abstract: Because robots can directly perceive and affect the physical world, security issues take on particular importance. In this paper, we describe the results of our work on scanning the entire IPv4 address space of the Internet for instances of the Robot Operating System (ROS), a widely used robotics platform for research. Our results identified that a number of hosts supporting ROS are exposed to the…

    Submitted 23 July, 2018; originally announced August 2018.

    Comments: 10 pages

  19. arXiv:1805.07440  [pdf, other]

    cs.LG cs.AI cs.CV cs.DC stat.ML

    Neural Architecture Search using Deep Neural Networks and Monte Carlo Tree Search

    Authors: Linnan Wang, Yiyang Zhao, Yuu Jinnai, Yuandong Tian, Rodrigo Fonseca

    Abstract: Neural Architecture Search (NAS) has shown great success in automating the design of neural networks, but the prohibitive amount of computations behind current NAS methods requires further investigations in improving the sample efficiency and the network evaluation cost to get better results in a shorter time. In this paper, we present a novel scalable Monte Carlo Tree Search (MCTS) based NAS agen…

    Submitted 21 November, 2019; v1 submitted 18 May, 2018; originally announced May 2018.

    Comments: To appear in the Thirty-Fourth AAAI conference on Artificial Intelligence (AAAI-2020)

  20. FITing-Tree: A Data-aware Index Structure

    Authors: Alex Galakatos, Michael Markovitch, Carsten Binnig, Rodrigo Fonseca, Tim Kraska

    Abstract: Index structures are one of the most important tools that DBAs leverage to improve the performance of analytics and transactional workloads. However, building several indexes over large datasets can often become prohibitive and consume valuable system resources. In fact, a recent study showed that indexes created as part of the TPC-C benchmark can account for 55% of the total memory available in a…

    Submitted 25 March, 2020; v1 submitted 30 January, 2018; originally announced January 2018.

    Comments: 18 pages

    Journal ref: SIGMOD (2019) 1189-1206

  21. MilkQA: a Dataset of Consumer Questions for the Task of Answer Selection

    Authors: Marcelo Criscuolo, Erick Rocha Fonseca, Sandra Maria Aluísio, Ana Carolina Sperança-Criscuolo

    Abstract: We introduce MilkQA, a question answering dataset from the dairy domain dedicated to the study of consumer questions. The dataset contains 2,657 pairs of questions and answers, written in the Portuguese language and originally collected by the Brazilian Agricultural Research Corporation (Embrapa). All questions were motivated by real situations and written by thousands of authors with very differe…

    Submitted 10 January, 2018; originally announced January 2018.

    Comments: 6 pages

    Journal ref: Intelligent Systems (BRACIS), 2017 Brazilian Conference on

  22. arXiv:1607.07483  [pdf, other]

    cs.RO

    Collision-Free Poisson Motion Planning in Ultra High-Dimensional Molecular Conformation Spaces

    Authors: Rasmus Fonseca, Dominik Budday, Henry van den Bedem

    Abstract: The function of protein, RNA, and DNA is modulated by fast, dynamic exchanges between three-dimensional conformations. Conformational sampling of biomolecules with exact and nullspace inverse kinematics, using rotatable bonds as revolute joints and non-covalent interactions as holonomic constraints, can accurately characterize these native ensembles. However, sampling biomolecules remains challeng…

    Submitted 25 July, 2016; originally announced July 2016.

    Comments: 15 pages, 5 figures, submitted to WAFR16

    ACM Class: I.2.9; J.3

  23. arXiv:1607.03385  [pdf, other]

    cs.NI

    Compiling Stateful Network Properties for Runtime Verification

    Authors: Tim Nelson, Nicholas DeMarinis, Timothy Adam Hoff, Rodrigo Fonseca, Shriram Krishnamurthi

    Abstract: Networks are difficult to configure correctly, and tricky to debug. These problems are accentuated by temporal and stateful behavior. Static verification, while useful, is ineffectual for detecting behavioral deviations induced by hardware faults, security failures, and so on, so dynamic property monitoring is also valuable. Unfortunately, existing monitoring and runtime verification for networks…

    Submitted 15 July, 2016; v1 submitted 12 July, 2016; originally announced July 2016.