Skip to main content

Showing 1–17 of 17 results for author: Serafini, M

  1. arXiv:2406.00552  [pdf, other

    cs.LG cs.DC

    Graph Neural Network Training Systems: A Performance Comparison of Full-Graph and Mini-Batch

    Authors: Saurabh Bajaj, Hui Guan, Marco Serafini

    Abstract: Graph Neural Networks (GNNs) have gained significant attention in recent years due to their ability to learn representations of graph structured data. Two common methods for training GNNs are mini-batch training and full-graph training. Since these two methods require different training pipelines and systems optimizations, two separate categories of GNN training systems emerged, each tailored for… ▽ More

    Submitted 8 June, 2024; v1 submitted 1 June, 2024; originally announced June 2024.

    Comments: 12 pages, 1 appendix, 8 Figures, 16 Tables, Graph Neural Network, Graph Neural Networks, Full-graph training, Mini-batch training, full-batch training, distributed training, performance, epoch time, time to accuracy, accuracy

  2. GraphMini: Accelerating Graph Pattern Matching Using Auxiliary Graphs

    Authors: Juelin Liu, Sandeep Polisetty, Hui Guan, Marco Serafini

    Abstract: Graph pattern matching is a fundamental problem encountered by many common graph mining tasks and the basic building block of several graph mining systems. This paper explores for the first time how to proactively prune graphs to speed up graph pattern matching by leveraging the structure of the query pattern and the input graph. We propose building auxiliary graphs, which are different pruned… ▽ More

    Submitted 1 March, 2024; originally announced March 2024.

  3. arXiv:2312.15405  [pdf, other

    cs.DB

    Enhancing Computation Pushdown for Cloud OLAP Databases

    Authors: Yifei Yang, Xiangyao Yu, Marco Serafini, Ashraf Aboulnaga, Michael Stonebraker

    Abstract: Network is a major bottleneck in modern cloud databases that adopt a storage-disaggregation architecture. Computation pushdown is a promising solution to tackle this issue, which offloads some computation tasks to the storage layer to reduce network traffic. Existing cloud OLAP systems statically decide whether to push down computation during the query optimization phase and do not consider the st… ▽ More

    Submitted 29 December, 2023; v1 submitted 23 December, 2023; originally announced December 2023.

    Comments: 13 pages, 15 figures

  4. arXiv:2303.13775  [pdf, other

    cs.DC cs.LG

    GSplit: Scaling Graph Neural Network Training on Large Graphs via Split-Parallelism

    Authors: Sandeep Polisetty, Juelin Liu, Kobi Falus, Yi Ren Fung, Seung-Hwan Lim, Hui Guan, Marco Serafini

    Abstract: Graph neural networks (GNNs), an emerging class of machine learning models for graphs, have gained popularity for their superior performance in various graph analytical tasks. Mini-batch training is commonly used to train GNNs on large graphs, and data parallelism is the standard approach to scale mini-batch training across multiple GPUs. One of the major performance costs in GNN training is the l… ▽ More

    Submitted 27 June, 2024; v1 submitted 23 March, 2023; originally announced March 2023.

  5. arXiv:2212.10387  [pdf, other

    cs.DB cs.DC

    Tuning the Tail Latency of Distributed Queries Using Replication

    Authors: Nathan Ng, Hung Le, Marco Serafini

    Abstract: Querying graph data with low latency is an important requirement in application domains such as social networks and knowledge graphs. Graph queries perform multiple hops between vertices. When data is partitioned and stored across multiple servers, queries executing at one server often need to hop to vertices stored by another server. Such distributed traversals represent a performance bottleneck… ▽ More

    Submitted 20 December, 2022; originally announced December 2022.

    Comments: An earlier version of this paper was submitted in April 2022. Previous versions are available at https://marcoserafini.github.io/projects/latency-replication/index.html

  6. Scalable Graph Neural Network Training: The Case for Sampling

    Authors: Marco Serafini, Hui Guan

    Abstract: Graph Neural Networks (GNNs) are a new and increasingly popular family of deep neural network architectures to perform learning on graphs. Training them efficiently is challenging due to the irregular nature of graph data. The problem becomes even more challenging when scaling to large graphs that exceed the capacity of single devices. Standard approaches to distributed DNN training, such as data… ▽ More

    Submitted 5 May, 2021; originally announced May 2021.

    Comments: 9 pages, 2 figures

    Journal ref: ACM SIGOPS Operating Systems Review, Volume 55, Issue 1, July 2021, pp 68-76

  7. arXiv:2009.06693  [pdf, other

    cs.DC cs.LG

    Accelerating Graph Sampling for Graph Machine Learning using GPUs

    Authors: Abhinav Jangda, Sandeep Polisetty, Arjun Guha, Marco Serafini

    Abstract: Representation learning algorithms automatically learn the features of data. Several representation learning algorithms for graph data, such as DeepWalk, node2vec, and GraphSAGE, sample the graph to produce mini-batches that are suitable for training a DNN. However, sampling time can be a significant fraction of training time, and existing systems do not efficiently parallelize sampling. Samplin… ▽ More

    Submitted 10 May, 2021; v1 submitted 14 September, 2020; originally announced September 2020.

    Comments: Published in EuroSys 2021

  8. arXiv:2003.03604  [pdf, other

    cs.DC

    Aion: Better Late than Never in Event-Time Streams

    Authors: Sérgio Esteves, Gianmarco De Francisci Morales, Rodrigo Rodrigues, Marco Serafini, Luís Veiga

    Abstract: Processing data streams in near real-time is an increasingly important task. In the case of event-timestamped data, the stream processing system must promptly handle late events that arrive after the corresponding window has been processed. To enable this late processing, the window state must be maintained for a long period of time. However, current systems maintain this state in memory, which ei… ▽ More

    Submitted 22 April, 2020; v1 submitted 7 March, 2020; originally announced March 2020.

    Comments: 14 pages, 28 figures

  9. arXiv:2002.05837  [pdf, other

    cs.DB

    PushdownDB: Accelerating a DBMS using S3 Computation

    Authors: Xiangyao Yu, Matt Youill, Matthew Woicik, Abdurrahman Ghanem, Marco Serafini, Ashraf Aboulnaga, Michael Stonebraker

    Abstract: This paper studies the effectiveness of pushing parts of DBMS analytics queries into the Simple Storage Service (S3) engine of Amazon Web Services (AWS), using a recently released capability called S3 Select. We show that some DBMS primitives (filter, projection, aggregation) can always be cost-effectively moved into S3. Other more complex operations (join, top-K, group-by) require reimplementatio… ▽ More

    Submitted 13 February, 2020; originally announced February 2020.

  10. LiveGraph: A Transactional Graph Storage System with Purely Sequential Adjacency List Scans

    Authors: Xiaowei Zhu, Guanyu Feng, Marco Serafini, Xiaosong Ma, Jiping Yu, Lei Xie, Ashraf Aboulnaga, Wenguang Chen

    Abstract: The specific characteristics of graph workloads make it hard to design a one-size-fits-all graph storage system. Systems that support transactional updates use data structures with poor data locality, which limits the efficiency of analytical workloads or even simple edge scans. Other systems run graph analytics workloads efficiently, but cannot properly support transactions. This paper presents… ▽ More

    Submitted 29 August, 2020; v1 submitted 13 October, 2019; originally announced October 2019.

  11. arXiv:1804.01942  [pdf, other

    cs.DC cs.DB

    Scaling Out Acid Applications with Operation Partitioning

    Authors: Habib Saissi, Marco Serafini, Neeraj Suri

    Abstract: OLTP applications with high workloads that cannot be served by a single server need to scale out to multiple servers. Typically, scaling out entails assigning a different partition of the application state to each server. But data partitioning is at odds with preserving the strong consistency guarantees of ACID transactions, a fundamental building block of many OLTP applications. The more we scale… ▽ More

    Submitted 5 April, 2018; originally announced April 2018.

  12. arXiv:1705.09073  [pdf, other

    cs.DC

    Load Balancing for Skewed Streams on Heterogeneous Cluster

    Authors: Muhammad Anis Uddin Nasir, Hiroshi Horii, Marco Serafini, Nicolas Kourtellis, Rudy Raymond, Sarunas Girdzijauskas, Takayuki Osogami

    Abstract: Streaming applications frequently encounter skewed workloads and execute on heterogeneous clusters. Optimal resource utilization in such adverse conditions becomes a challenge, as it requires inferring the resource capacities and input distribution at run time. In this paper, we tackle the aforementioned challenges by modeling them as a load balancing problem. We propose a novel partitioning strat… ▽ More

    Submitted 1 October, 2017; v1 submitted 25 May, 2017; originally announced May 2017.

    Comments: 12 pages, under submission

  13. arXiv:1510.07623  [pdf, other

    cs.DC

    Partial Key Grouping: Load-Balanced Partitioning of Distributed Streams

    Authors: Muhammad Anis Uddin Nasir, Gianmarco De Francisci Morales, David Garcia-Soriano, Nicolas Kourtellis, Marco Serafini

    Abstract: We study the problem of load balancing in distributed stream processing engines, which is exacerbated in the presence of skew. We introduce Partial Key Grouping (PKG), a new stream partitioning scheme that adapts the classical "power of two choices" to a distributed streaming setting by leveraging two novel techniques: key splitting and local load estimation. In so doing, it achieves better load b… ▽ More

    Submitted 26 October, 2015; originally announced October 2015.

    Comments: 14 pages. arXiv admin note: substantial text overlap with arXiv:1504.00788

  14. arXiv:1510.05714  [pdf, other

    cs.DC

    When Two Choices Are not Enough: Balancing at Scale in Distributed Stream Processing

    Authors: Muhammad Anis Uddin Nasir, Gianmarco De Francisci Morales, Nicolas Kourtellis, Marco Serafini

    Abstract: Carefully balancing load in distributed stream processing systems has a fundamental impact on execution latency and throughput. Load balancing is challenging because real-world workloads are skewed: some tuples in the stream are associated to keys which are significantly more frequent than others. Skew is remarkably more problematic in large deployments: more workers implies fewer keys per worker,… ▽ More

    Submitted 27 January, 2016; v1 submitted 19 October, 2015; originally announced October 2015.

    Comments: 12 pages, 14 Figures, this paper is accepted and will be published at ICDE 2016

  15. arXiv:1510.04233  [pdf, other

    cs.DC

    Arabesque: A System for Distributed Graph Mining - Extended version

    Authors: Carlos H. C. Teixeira, Alexandre J. Fonseca, Marco Serafini, Georgos Siganos, Mohammed J. Zaki, Ashraf Aboulnaga

    Abstract: Distributed data processing platforms such as MapReduce and Pregel have substantially simplified the design and deployment of certain classes of distributed graph analytics algorithms. However, these platforms do not represent a good match for distributed graph mining problems, as for example finding frequent subgraphs in a graph. Given an input graph, these problems require exploring a very large… ▽ More

    Submitted 14 October, 2015; originally announced October 2015.

    Comments: A short version of this report appeared in the Proceedings of the 25th ACM Symp. on Operating Systems Principles (SOSP), 2015

    Report number: QCRI-TR-2015-005

  16. The Power of Both Choices: Practical Load Balancing for Distributed Stream Processing Engines

    Authors: Muhammad Anis Uddin Nasir, Gianmarco De Francisci Morales, David García-Soriano, Nicolas Kourtellis, Marco Serafini

    Abstract: We study the problem of load balancing in distributed stream processing engines, which is exacerbated in the presence of skew. We introduce Partial Key Grouping (PKG), a new stream partitioning scheme that adapts the classical "power of two choices" to a distributed streaming setting by leveraging two novel techniques: key splitting and local load estimation. In so doing, it achieves better load b… ▽ More

    Submitted 3 April, 2015; originally announced April 2015.

    Comments: 31st IEEE International Conference on Data Engineering (ICDE), 2015

  17. arXiv:1308.2979  [pdf, other

    cs.DC

    On Barriers and the Gap between Active and Passive Replication (Full Version)

    Authors: Flavio P. Junqueira, Marco Serafini

    Abstract: Active replication is commonly built on top of the atomic broadcast primitive. Passive replication, which has been recently used in the popular ZooKeeper coordination system, can be naturally built on top of the primary-order atomic broadcast primitive. Passive replication differs from active replication in that it requires processes to cross a barrier before they become primaries and start broadc… ▽ More

    Submitted 12 October, 2015; v1 submitted 13 August, 2013; originally announced August 2013.

    Comments: A shorter version of this work (without appendices) appears in the proceedings of the 27th International Symposium on Distributed Computing (DISC) 2013 conference