-
Rumors with Changing Credibility
Authors:
Charlotte Out,
Nicolás Rivera,
Thomas Sauerwald,
John Sylvester
Abstract:
Randomized rumor spreading processes diffuse information on an undirected graph and have been widely studied. In this work, we present a generic framework for analyzing a broad class of such processes on regular graphs. Our analysis is protocol-agnostic, as it only requires the expected proportion of newly informed vertices in each round to be bounded, and a natural negative correlation property.…
▽ More
Randomized rumor spreading processes diffuse information on an undirected graph and have been widely studied. In this work, we present a generic framework for analyzing a broad class of such processes on regular graphs. Our analysis is protocol-agnostic, as it only requires the expected proportion of newly informed vertices in each round to be bounded, and a natural negative correlation property.
This framework allows us to analyze various protocols, including PUSH, PULL, and PUSH-PULL, thereby extending prior research. Unlike previous work, our framework accommodates message failures at any time $t\geq 0$ with a probability of $1-q(t)$, where the credibility $q(t)$ is any function of time. This enables us to model real-world scenarios in which the transmissibility of rumors may fluctuate, as seen in the spread of ``fake news'' and viruses. Additionally, our framework is sufficiently broad to cover dynamic graphs.
△ Less
Submitted 28 November, 2023;
originally announced November 2023.
-
An Educational Tool for Exploring the Pumping Lemma Property for Regular Languages
Authors:
Josue N. Rivera,
Haiping Xu
Abstract:
Pumping lemma has been a very difficult topic for students to understand in a theoretical computer science course due to a lack of tool support. In this paper, we present an active learning tool called MInimum PUmping length (MIPU) educational software to explore the pumping lemma property for regular languages. For a given regular language, MIPU offers three major functionalities: determining the…
▽ More
Pumping lemma has been a very difficult topic for students to understand in a theoretical computer science course due to a lack of tool support. In this paper, we present an active learning tool called MInimum PUmping length (MIPU) educational software to explore the pumping lemma property for regular languages. For a given regular language, MIPU offers three major functionalities: determining the membership of an input string, generating a list of short strings that belong to the language, and automatically calculating the minimal pumping length of the language. The software tool has been developed to provide educational assistance to students to better understand the concepts of pumping lemma and minimum pumping length, and promote active learning through hand-on practice.
△ Less
Submitted 24 February, 2023;
originally announced February 2023.
-
Distributed Averaging in Opinion Dynamics
Authors:
Petra Berenbrink,
Colin Cooper,
Cristina Gava,
David Kohan Marzagão,
Frederik Mallmann-Trenn,
Nicolás Rivera,
Tomasz Radzik
Abstract:
We consider two simple asynchronous opinion dynamics on arbitrary graphs where every node $u$ has an initial value $ξ_u(0)$. In the first process, the NodeModel, at each time step $t\ge 0$, a random node $u$ and a random sample of $k$ of its neighbours $v_1,v_2,\cdots,v_k$ are selected. Then, $u$ updates its current value $ξ_u(t)$ to $ξ_u(t+1) = αξ_u(t) + \frac{(1-α)}{k} \sum_{i=1}^k ξ_{v_i}(t)$,…
▽ More
We consider two simple asynchronous opinion dynamics on arbitrary graphs where every node $u$ has an initial value $ξ_u(0)$. In the first process, the NodeModel, at each time step $t\ge 0$, a random node $u$ and a random sample of $k$ of its neighbours $v_1,v_2,\cdots,v_k$ are selected. Then, $u$ updates its current value $ξ_u(t)$ to $ξ_u(t+1) = αξ_u(t) + \frac{(1-α)}{k} \sum_{i=1}^k ξ_{v_i}(t)$, where $α\in (0,1)$ and $k\ge 1$ are parameters of the process. In the second process, the EdgeModel, at each step a random pair of adjacent nodes $(u,v)$ is selected, and then node $u$ updates its value equivalently to the NodeModel with $k=1$ and $v$ as the selected neighbour. For both processes, the values of all nodes converge to $F$, a random variable depending on the random choices made in each step. For the NodeModel and regular graphs, and for the EdgeModel and arbitrary graphs, the expectation of $F$ is the average of the initial values $\frac{1}{n}\sum_{u\in V} ξ_u(0)$. For the NodeModel and non-regular graphs, the expectation of $F$ is the degree-weighted average of the initial values. Our results are two-fold. We consider the concentration of $F$ and show tight bounds on the variance of $F$ for regular graphs. We show that, when the initial values do not depend on the number of nodes, then the variance is negligible, hence the nodes are able to estimate the initial average of the node values. Interestingly, this variance does not depend on the graph structure. For the proof we introduce a duality between our processes and a process of two correlated random walks. We also analyse the convergence time for both models and for arbitrary graphs, showing bounds on the time $T_\varepsilon$ required to make all node values `$\varepsilon$-close' to each other. Our bounds are asymptotically tight under assumptions on the distribution of the initial values.
△ Less
Submitted 11 May, 2023; v1 submitted 30 November, 2022;
originally announced November 2022.
-
A Multiple kernel testing procedure for non-proportional hazards in factorial designs
Authors:
Marc Ditzhaus,
Tamara Fernández,
Nicolás Rivera
Abstract:
In this paper we propose a Multiple kernel testing procedure to infer survival data when several factors (e.g. different treatment groups, gender, medical history) and their interaction are of interest simultaneously. Our method is able to deal with complex data and can be seen as an alternative to the omnipresent Cox model when assumptions such as proportionality cannot be justified. Our methodol…
▽ More
In this paper we propose a Multiple kernel testing procedure to infer survival data when several factors (e.g. different treatment groups, gender, medical history) and their interaction are of interest simultaneously. Our method is able to deal with complex data and can be seen as an alternative to the omnipresent Cox model when assumptions such as proportionality cannot be justified. Our methodology combines well-known concepts from Survival Analysis, Machine Learning and Multiple Testing: differently weighted log-rank tests, kernel methods and multiple contrast tests. By that, complex hazard alternatives beyond the classical proportional hazard set-up can be detected. Moreover, multiple comparisons are performed by fully exploiting the dependence structure of the single testing procedures to avoid a loss of power. In all, this leads to a flexible and powerful procedure for factorial survival designs whose theoretical validity is proven by martingale arguments and the theory for $V$-statistics. We evaluate the performance of our method in an extensive simulation study and illustrate it by a real data analysis.
△ Less
Submitted 14 June, 2022;
originally announced June 2022.
-
Diversity, Fairness, and Sustainability in Population Protocols
Authors:
Nan Kang,
Frederik Mallmann-Trenn,
Nicolás Rivera
Abstract:
Over the years, population protocols with the goal of reaching consensus have been studied in great depth. However, many systems in the real-world do not result in all agents eventually reaching consensus, but rather in the opposite: they converge to a state of rich diversity. Consider for example task allocation in ants. If eventually all ants perform the same task, then the colony will perish (l…
▽ More
Over the years, population protocols with the goal of reaching consensus have been studied in great depth. However, many systems in the real-world do not result in all agents eventually reaching consensus, but rather in the opposite: they converge to a state of rich diversity. Consider for example task allocation in ants. If eventually all ants perform the same task, then the colony will perish (lack of food, no brood care, etc.). Then, it is vital for the survival of the colony to have a diverse set of tasks and enough ants working on each task. What complicates matters is that ants need to switch tasks periodically to adjust the needs of the colony; e.g., when too many foragers fell victim to other ant colonies. Moreover, all tasks are equally important and maybe they need to keep certain proportions in the distribution of the task. How can ants keep a healthy and balanced allocation of tasks?
To answer this question, we propose a simple population protocol for $n$ agents on a complete graph and an arbitrary initial distribution of $k$ colours (tasks). We assume that each colour $i$ has an associated weight (importance) $w_i \geq 1$. By denoting $w$ as the sum of the weights of different colours, we show that the protocol converges in $O(w^2 n \log n)$ rounds to a configuration where the number of agents supporting each colour $i$ is concentrated on the fair share $w_in/w$ and will stay concentrated for a large number of rounds, w.h.p.
Our protocol has many interesting properties: agents do not need to know other colours and weights in the system, and our protocol requires very little memory per agent. Furthermore, the protocol guarantees fairness meaning that over a long period each agent has each colour roughly a number of times proportional to the weight of the colour. Finally, our protocol also fulfils sustainability meaning that no colour ever vanishes.
△ Less
Submitted 7 June, 2021; v1 submitted 20 May, 2021;
originally announced May 2021.
-
Multiple Random Walks on Graphs: Mixing Few to Cover Many
Authors:
Nicolás Rivera,
Thomas Sauerwald,
John Sylvester
Abstract:
Random walks on graphs are an essential primitive for many randomised algorithms and stochastic processes. It is natural to ask how much can be gained by running $k$ multiple random walks independently and in parallel. Although the cover time of multiple walks has been investigated for many natural networks, the problem of finding a general characterisation of multiple cover times for worst-case s…
▽ More
Random walks on graphs are an essential primitive for many randomised algorithms and stochastic processes. It is natural to ask how much can be gained by running $k$ multiple random walks independently and in parallel. Although the cover time of multiple walks has been investigated for many natural networks, the problem of finding a general characterisation of multiple cover times for worst-case start vertices (posed by Alon, Avin, Koucký, Kozma, Lotker, and Tuttle~in 2008) remains an open problem.
First, we improve and tighten various bounds on the stationary cover time when $k$ random walks start from vertices sampled from the stationary distribution. For example, we prove an unconditional lower bound of $Ω((n/k) \log n)$ on the stationary cover time, holding for any $n$-vertex graph $G$ and any $1 \leq k =o(n\log n )$. Secondly, we establish the stationary cover times of multiple walks on several fundamental networks up to constant factors. Thirdly, we present a framework characterising worst-case cover times in terms of stationary cover times and a novel, relaxed notion of mixing time for multiple walks called the partial mixing time. Roughly speaking, the partial mixing time only requires a specific portion of all random walks to be mixed. Using these new concepts, we can establish (or recover) the worst-case cover times for many networks including expanders, preferential attachment graphs, grids, binary trees and hypercubes.
△ Less
Submitted 29 December, 2022; v1 submitted 16 November, 2020;
originally announced November 2020.
-
Kernelized Stein Discrepancy Tests of Goodness-of-fit for Time-to-Event Data
Authors:
Tamara Fernandez,
Nicolas Rivera,
Wenkai Xu,
Arthur Gretton
Abstract:
Survival Analysis and Reliability Theory are concerned with the analysis of time-to-event data, in which observations correspond to waiting times until an event of interest such as death from a particular disease or failure of a component in a mechanical system. This type of data is unique due to the presence of censoring, a type of missing data that occurs when we do not observe the actual time o…
▽ More
Survival Analysis and Reliability Theory are concerned with the analysis of time-to-event data, in which observations correspond to waiting times until an event of interest such as death from a particular disease or failure of a component in a mechanical system. This type of data is unique due to the presence of censoring, a type of missing data that occurs when we do not observe the actual time of the event of interest but, instead, we have access to an approximation for it given by random interval in which the observation is known to belong. Most traditional methods are not designed to deal with censoring, and thus we need to adapt them to censored time-to-event data. In this paper, we focus on non-parametric goodness-of-fit testing procedures based on combining the Stein's method and kernelized discrepancies. While for uncensored data, there is a natural way of implementing a kernelized Stein discrepancy test, for censored data there are several options, each of them with different advantages and disadvantages. In this paper, we propose a collection of kernelized Stein discrepancy tests for time-to-event data, and we study each of them theoretically and empirically; our experimental results show that our proposed methods perform better than existing tests, including previous tests based on a kernelized maximum mean discrepancy.
△ Less
Submitted 26 August, 2020; v1 submitted 19 August, 2020;
originally announced August 2020.
-
Best-of-Three Voting on Dense Graphs
Authors:
Nan Kang,
Nicolas Rivera
Abstract:
Given a graph $G$ of $n$ vertices, where each vertex is initially attached an opinion of either red or blue. We investigate a random process known as the Best-of-three voting. In this process, at each time step, every vertex chooses three neighbours at random and adopts the majority colour. We study this process for a class of graphs with minimum degree $d = n^α$\,, where…
▽ More
Given a graph $G$ of $n$ vertices, where each vertex is initially attached an opinion of either red or blue. We investigate a random process known as the Best-of-three voting. In this process, at each time step, every vertex chooses three neighbours at random and adopts the majority colour. We study this process for a class of graphs with minimum degree $d = n^α$\,, where $α= Ω\left( (\log \log n)^{-1} \right)$. We prove that if initially each vertex is red with probability greater than $1/2+δ$, and blue otherwise, where $δ\geq (\log d)^{-C}$ for some $C>0$, then with high probability this dynamic reaches a final state where all vertices are red within $O\left( \log \log n\right) + O\left( \log \left( δ^{-1} \right) \right)$ steps.
△ Less
Submitted 22 March, 2019;
originally announced March 2019.
-
The dispersion time of random walks on finite graphs
Authors:
Nicolas Rivera,
Alexandre Stauffer,
Thomas Sauerwald,
John Sylvester
Abstract:
We study two random processes on an $n$-vertex graph inspired by the internal diffusion limited aggregation (IDLA) model. In both processes $n$ particles start from an arbitrary but fixed origin. Each particle performs a simple random walk until first encountering an unoccupied vertex, and at which point the vertex becomes occupied and the random walk terminates. In one of the processes, called \t…
▽ More
We study two random processes on an $n$-vertex graph inspired by the internal diffusion limited aggregation (IDLA) model. In both processes $n$ particles start from an arbitrary but fixed origin. Each particle performs a simple random walk until first encountering an unoccupied vertex, and at which point the vertex becomes occupied and the random walk terminates. In one of the processes, called \textit{Sequential-IDLA}, only one particle moves until settling and only then does the next particle start whereas in the second process, called \textit{Parallel-IDLA}, all unsettled particles move simultaneously. Our main goal is to analyze the so-called dispersion time of these processes, which is the maximum number of steps performed by any of the $n$ particles.
In order to compare the two processes, we develop a coupling which shows the dispersion time of the Parallel-IDLA stochastically dominates that of the Sequential-IDLA; however, the total number of steps performed by all particles has the same distribution in both processes. This coupling also gives us that dispersion time of Parallel-IDLA is bounded in expectation by dispersion time of the Sequential-IDLA up to a multiplicative $\log n$ factor. Moreover, we derive asymptotic upper and lower bound on the dispersion time for several graph classes, such as cliques, cycles, binary trees, $d$-dimensional grids, hypercubes and expanders. Most of our bounds are tight up to a multiplicative constant.
△ Less
Submitted 26 November, 2019; v1 submitted 28 August, 2018;
originally announced August 2018.
-
Dispersion processes
Authors:
Colin Cooper,
Andrew McDowell,
Tomasz Radzik,
Nicolas Rivera,
Takeharu Shiraga
Abstract:
We study a synchronous dispersion process in which $M$ particles are initially placed at a distinguished origin vertex of a graph $G$. At each time step, at each vertex $v$ occupied by more than one particle at the beginning of this step, each of these particles moves to a neighbour of $v$ chosen independently and uniformly at random. The dispersion process ends once the particles have all stopped…
▽ More
We study a synchronous dispersion process in which $M$ particles are initially placed at a distinguished origin vertex of a graph $G$. At each time step, at each vertex $v$ occupied by more than one particle at the beginning of this step, each of these particles moves to a neighbour of $v$ chosen independently and uniformly at random. The dispersion process ends once the particles have all stopped moving, i.e. at the first step at which each vertex is occupied by at most one particle.
For the complete graph $K_n$ and star graph $S_n$, we show that for any constant $δ>1$, with high probability, if $M \le n/2(1-δ)$, then the process finishes in $O(\log n)$ steps, whereas if $M \ge n/2(1+δ)$, then the process needs $e^{Ω(n)}$ steps to complete (if ever). We also show that an analogous lazy variant of the process exhibits the same behaviour but for higher thresholds, allowing faster dispersion of more particles.
For paths, trees, grids, hypercubes and Cayley graphs of large enough sizes (in terms of $M$) we give bounds on the time to finish and the maximum distance traveled from the origin as a function of the number of particles $M$.
△ Less
Submitted 9 December, 2017;
originally announced December 2017.
-
Fast plurality consensus in regular expanders
Authors:
Colin Cooper,
Tomasz Radzik,
Nicolás Rivera,
Takeharu Shiraga
Abstract:
Pull voting is a classic method to reach consensus among $n$ vertices with differing opinions in a distributed network: each vertex at each step takes on the opinion of a random neighbour. This method, however, suffers from two drawbacks. Even if there are only two opposing opinions, the time taken for a single opinion to emerge can be slow and the final opinion is not necessarily the initially he…
▽ More
Pull voting is a classic method to reach consensus among $n$ vertices with differing opinions in a distributed network: each vertex at each step takes on the opinion of a random neighbour. This method, however, suffers from two drawbacks. Even if there are only two opposing opinions, the time taken for a single opinion to emerge can be slow and the final opinion is not necessarily the initially held majority.
We refer to a protocol where 2 neighbours are contacted at each step as a 2-sample voting protocol. In the two-sample protocol a vertex updates its opinion only if both sampled opinions are the same. Not much was known about the performance of two-sample voting on general expanders in the case of three or more opinions. In this paper we show that the following performance can be achieved on a $d$-regular expander using two-sample voting. We suppose there are $k \ge 3$ opinions, and that the initial size of the largest and second largest opinions is $A_1, A_2$ respectively.
We prove that, if $A_1 - A_2 \ge C n \max\{\sqrt{(\log n)/A_1}, λ\}$, where $λ$ is the absolute second eigenvalue of matrix $P=Adj(G)/d$ and $C$ is a suitable constant, then the largest opinion wins in $O((n \log n)/A_1)$ steps with high probability.
For almost all $d$-regular graphs, we have $λ=c/\sqrt{d}$ for some constant $c>0$. This means that as $d$ increases we can separate an opinion whose majority is $o(n)$, whereas $Θ(n)$ majority is required for $d$ constant.
This work generalizes the results of Becchetti et. al (SPAA 2014) for the complete graph $K_n$.
△ Less
Submitted 13 April, 2017; v1 submitted 26 May, 2016;
originally announced May 2016.
-
Discordant voting processes on finite graphs
Authors:
Colin Cooper,
Martin Dyer,
Alan Frieze,
Nicolas Rivera
Abstract:
We consider an asynchronous voting process on graphs which we call discordant voting, and which can be described as follows. Initially each vertex holds one of two opinions, red or blue say. Neighbouring vertices with different opinions interact pairwise. After an interaction both vertices have the same colour. The quantity of interest is T, the time to reach consensus , i.e. the number of interac…
▽ More
We consider an asynchronous voting process on graphs which we call discordant voting, and which can be described as follows. Initially each vertex holds one of two opinions, red or blue say. Neighbouring vertices with different opinions interact pairwise. After an interaction both vertices have the same colour. The quantity of interest is T, the time to reach consensus , i.e. the number of interactions needed for all vertices have the same colour.
An edge whose endpoint colours differ (i.e. one vertex is coloured red and the other one blue) is said to be discordant. A vertex is discordant if its is incident with a discordant edge. In discordant voting, all interactions are based on discordant edges. Because the voting process is asynchronous there are several ways to update the colours of the interacting vertices. Push: Pick a random discordant vertex and push its colour to a random discordant neighbour. Pull: Pick a random discordant vertex and pull the colour of a random discordant neighbour. Oblivious: Pick a random endpoint of a random discordant edge and push the colour to the other end point.
We show that E(T), the expected time to reach consensus, depends strongly on the underlying graph and the update rule. For connected graphs on n vertices, and an initial half red, half blue colouring the following hold. For oblivious voting, E(T) =n^2/4 independent of the underlying graph. For the complete graph K_n, the push protocol has E(T)= Theta(n log(n)), whereas the pull protocol has E(T)= Theta(2^n). For the cycle C_n all three protocols have E(T) = Theta(n^2). For the star graph however, the pull protocol has E(T)=O(n^2), whereas the push protocol is slower with E(T) = Theta(n^2 log(n)).
The wide variation in E(T) for the pull protocol is to be contrasted with the well known model of synchronous pull voting, for which E(T) = O(n) on many classes of expanders.
△ Less
Submitted 25 December, 2016; v1 submitted 23 April, 2016;
originally announced April 2016.
-
The coalescing-branching random walk on expanders and the dual epidemic process
Authors:
Colin Cooper,
Tomasz Radzik,
Nicolas Rivera
Abstract:
Information propagation on graphs is a fundamental topic in distributed computing. One of the simplest models of information propagation is the push protocol in which at each round each agent independently pushes the current knowledge to a random neighbour. In this paper we study the so-called coalescing-branching random walk (COBRA), in which each vertex pushes the information to $k$ randomly sel…
▽ More
Information propagation on graphs is a fundamental topic in distributed computing. One of the simplest models of information propagation is the push protocol in which at each round each agent independently pushes the current knowledge to a random neighbour. In this paper we study the so-called coalescing-branching random walk (COBRA), in which each vertex pushes the information to $k$ randomly selected neighbours and then stops passing information until it receives the information again. The aim of COBRA is to propagate information fast but with a limited number of transmissions per vertex per step. In this paper we study the cover time of the COBRA process defined as the minimum time until each vertex has received the information at least once. Our main result says that if $G$ is an $n$-vertex $r$-regular graph whose transition matrix has second eigenvalue $λ$, then the COBRA cover time of $G$ is $\mathcal O(\log n )$, if $1-λ$ is greater than a positive constant, and $\mathcal O((\log n)/(1-λ)^3))$, if $1-λ\gg \sqrt{\log( n)/n}$. These bounds are independent of $r$ and hold for $3 \le r \le n-1$. They improve the previous bound of $O(\log^2 n)$ for expander graphs. Our main tool in analysing the COBRA process is a novel duality relation between this process and a discrete epidemic process, which we call a biased infection with persistent source (BIPS). A fixed vertex $v$ is the source of an infection and remains permanently infected. At each step each vertex $u$ other than $v$ selects $k$ neighbours, independently and uniformly, and $u$ is infected in this step if and only if at least one of the selected neighbours has been infected in the previous step. We show the duality between COBRA and BIPS which says that the time to infect the whole graph in the BIPS process is of the same order as the cover time of the COBRA process
△ Less
Submitted 26 May, 2016; v1 submitted 18 February, 2016;
originally announced February 2016.