subscribe to arXiv mailings

Network Representation and Modular Decomposition of Combinatorial Structures: A Galled-Tree Perspective

Authors: Anna Lindeberg, Guillaume E. Scholz, Marc Hellmuth

Abstract: In phylogenetics, reconstructing rooted trees from distances between taxa is a common task. Böcker and Dress generalized this concept by introducing symbolic dated maps $δ:X \times X \to Υ$, where distances are replaced by symbols, and showed that there is a one-to-one correspondence between symbolic ultrametrics and labeled rooted phylogenetic trees. Many combinatorial structures fall under the u… ▽ More In phylogenetics, reconstructing rooted trees from distances between taxa is a common task. Böcker and Dress generalized this concept by introducing symbolic dated maps $δ:X \times X \to Υ$, where distances are replaced by symbols, and showed that there is a one-to-one correspondence between symbolic ultrametrics and labeled rooted phylogenetic trees. Many combinatorial structures fall under the umbrella of symbolic dated maps, such as 2-dissimilarities, symmetric labeled 2-structures, or edge-colored complete graphs, and are here referred to as strudigrams. Strudigrams have a unique decomposition into non-overlapping modules, which can be represented by a modular decomposition tree (MDT). In the absence of prime modules, strudigrams are equivalent to symbolic ultrametrics, and the MDT fully captures the relationships $δ(x,y)$ between pairs of vertices $x,y$ in $X$ through the label of their least common ancestor in the MDT. However, in the presence of prime vertices, this information is generally hidden. To provide this missing structural information, we aim to locally replace the prime vertices in the MDT to obtain networks that capture full information about the strudigrams. While starting with the general framework of prime-vertex replacement networks, we then focus on a specific type of such networks obtained by replacing prime vertices with so-called galls, resulting in labeled galled-trees. We introduce the concept of galled-tree explainable (GATEX) strudigrams, provide their characterization, and demonstrate that recognizing these structures and reconstructing the labeled networks that explain them can be achieved in polynomial time. △ Less

Submitted 26 June, 2024; originally announced June 2024.

arXiv:2311.18284 [pdf, other]

The Complement of the Djokovic-Winkler Relation

Authors: Marc Hellmuth, Bruno J. Schmidt, Guillaume E. Scholz, Sandhya Thekkumpadan Puthiyaveedu

Abstract: The Djoković-Winkler relation $Θ$ is a binary relation defined on the edge set of a given graph that is based on the distances of certain vertices and which plays a prominent role in graph theory. In this paper, we explore the relatively uncharted ``reflexive complement'' $\overlineΘ$ of $Θ$, where $(e,f)\in \overlineΘ$ if and only if $e=f$ or $(e,f)\notin Θ$ for edges $e$ and $f$. We establish th… ▽ More The Djoković-Winkler relation $Θ$ is a binary relation defined on the edge set of a given graph that is based on the distances of certain vertices and which plays a prominent role in graph theory. In this paper, we explore the relatively uncharted ``reflexive complement'' $\overlineΘ$ of $Θ$, where $(e,f)\in \overlineΘ$ if and only if $e=f$ or $(e,f)\notin Θ$ for edges $e$ and $f$. We establish the relationship between $\overlineΘ$ and the set $Δ_{ef}$, comprising the distances between the vertices of $e$ and $f$ and shed some light on the intricacies of its transitive closure $\overlineΘ^*$. Notably, we demonstrate that $\overlineΘ^*$ exhibits multiple equivalence classes only within a restricted subclass of complete multipartite graphs. In addition, we characterize non-trivial relations $R$ that coincide with $\overlineΘ$ as those where the graph representation is disconnected, with each connected component being the (join of) Cartesian product of complete graphs. The latter results imply, somewhat surprisingly, that knowledge about the distances between vertices is not required to determine $\overlineΘ^*$. Moreover, $\overlineΘ^*$ has either exactly one or three equivalence classes. △ Less

Submitted 30 November, 2023; originally announced November 2023.

arXiv:2309.13634 [pdf, other]

Unique Least Common Ancestors and Clusters in Directed Acyclic Graphs

Authors: Ameera Vaheeda Shanavas, Manoj Changat, Marc Hellmuth, Peter F. Stadler

Abstract: We investigate the connections between clusters and least common ancestors (LCAs) in directed acyclic graphs (DAGs). We focus on the class of DAGs having unique least common ancestors for certain subsets of their minimal elements since these are of interest, particularly as models of phylogenetic networks. Here, we use the close connection between the canonical k-ary transit function and the closu… ▽ More We investigate the connections between clusters and least common ancestors (LCAs) in directed acyclic graphs (DAGs). We focus on the class of DAGs having unique least common ancestors for certain subsets of their minimal elements since these are of interest, particularly as models of phylogenetic networks. Here, we use the close connection between the canonical k-ary transit function and the closure function on a set system to show that pre-k-ary clustering systems are exactly those that derive from a class of DAGs with unique LCAs. Moreover, we show that k-ary T-systems and k-weak hierarchies are associated with DAGs that satisfy stronger conditions on the existence of unique LCAs for sets of size at most k. △ Less

Submitted 24 September, 2023; originally announced September 2023.

arXiv:2307.08654 [pdf, other]

The weighted total cophenetic index: A novel balance index for phylogenetic networks

Authors: Linda Knüver, Mareike Fischer, Marc Hellmuth, Kristina Wicke

Abstract: Phylogenetic networks play an important role in evolutionary biology as, other than phylogenetic trees, they can be used to accommodate reticulate evolutionary events such as horizontal gene transfer and hybridization. Recent research has provided a lot of progress concerning the reconstruction of such networks from data as well as insight into their graph theoretical properties. However, methods… ▽ More Phylogenetic networks play an important role in evolutionary biology as, other than phylogenetic trees, they can be used to accommodate reticulate evolutionary events such as horizontal gene transfer and hybridization. Recent research has provided a lot of progress concerning the reconstruction of such networks from data as well as insight into their graph theoretical properties. However, methods and tools to quantify structural properties of networks or differences between them are still very limited. For example, for phylogenetic trees, it is common to use balance indices to draw conclusions concerning the underlying evolutionary model, and more than twenty such indices have been proposed and are used for different purposes. One of the most frequently used balance index for trees is the so-called total cophenetic index, which has several mathematically and biologically desirable properties. For networks, on the other hand, balance indices are to-date still scarce. In this contribution, we introduce the \textit{weighted} total cophenetic index as a generalization of the total cophenetic index for trees to make it applicable to general phylogenetic networks. As we shall see, this index can be determined efficiently and behaves in a mathematical sound way, i.e., it satisfies so-called locality and recursiveness conditions. In addition, we analyze its extremal properties and, in particular, we investigate its maxima and minima as well as the structure of networks that achieve these values within the space of so-called level-$1$ networks. We finally briefly compare this novel index to the two other network balance indices available so-far. △ Less

Submitted 18 March, 2024; v1 submitted 17 July, 2023; originally announced July 2023.

arXiv:2306.06878 [pdf, other]

Fitch Graph Completion

Authors: Marc Hellmuth, Peter F. Stadler, Sandhya Thekkumpadan Puthiyaveedu

Abstract: Horizontal gene transfer is an important contributor to evolution. According to Walter M.\ Fitch, two genes are xenologs if they are separated by at least one HGT. More formally, the directed Fitch graph has a set of genes is its vertices, and directed edges $(x,y)$ for all pairs of genes $x$ and $y$ for which $y$ has been horizontally transferred at least once since it diverged from the last comm… ▽ More Horizontal gene transfer is an important contributor to evolution. According to Walter M.\ Fitch, two genes are xenologs if they are separated by at least one HGT. More formally, the directed Fitch graph has a set of genes is its vertices, and directed edges $(x,y)$ for all pairs of genes $x$ and $y$ for which $y$ has been horizontally transferred at least once since it diverged from the last common ancestor of $x$ and $y$. Subgraphs of Fitch graphs can be inferred by comparative sequence analysis. In many cases, however, only partial knowledge about the ``full'' Fitch graph can be obtained. Here, we characterize Fitch-satisfiable graphs that can be extended to a biologically feasible ``full'' Fitch graph and derive a simple polynomial-time recognition algorithm. We then proceed to showing that finding the Fitch graphs with total maximum (confidence) edge-weights is an NP-hard problem. △ Less

Submitted 12 June, 2023; originally announced June 2023.

arXiv:2306.04367 [pdf, other]

Solving NP-hard Problems on \textsc{GaTEx} Graphs: Linear-Time Algorithms for Perfect Orderings, Cliques, Colorings, and Independent Sets

Authors: Marc Hellmuth, Guillaume E. Scholz

Abstract: The class of $\mathsf{Ga}$lled-$\mathsf{T}$ree $\mathsf{Ex}$plainable ($\mathsf{GaTEx}$) graphs has recently been discovered as a natural generalization of cographs. Cographs are precisely those graphs that can be uniquely represented by a rooted tree where the leaves correspond to the vertices of the graph. As a generalization, $\mathsf{GaTEx}$ graphs are precisely those that can be uniquely repr… ▽ More The class of $\mathsf{Ga}$lled-$\mathsf{T}$ree $\mathsf{Ex}$plainable ($\mathsf{GaTEx}$) graphs has recently been discovered as a natural generalization of cographs. Cographs are precisely those graphs that can be uniquely represented by a rooted tree where the leaves correspond to the vertices of the graph. As a generalization, $\mathsf{GaTEx}$ graphs are precisely those that can be uniquely represented by a particular rooted acyclic network, called a galled-tree. This paper explores the use of galled-trees to solve combinatorial problems on $\mathsf{GaTEx}$ graphs that are, in general, NP-hard. We demonstrate that finding a maximum clique, an optimal vertex coloring, a perfect order, as well as a maximum independent set in $\mathsf{GaTEx}$ graphs can be efficiently done in linear time. The key idea behind the linear-time algorithms is to utilize the galled-trees that explain the $\mathsf{GaTEx}$ graphs as a guide for computing the respective cliques, colorings, perfect orders, or independent sets. △ Less

Submitted 26 April, 2024; v1 submitted 7 June, 2023; originally announced June 2023.

arXiv:2304.11826 [pdf, other]

The Theory of Gene Family Histories

Authors: Marc Hellmuth, Peter F. Stadler

Abstract: Most genes are part of larger families of evolutionary related genes. The history of gene families typically involves duplications and losses of genes as well as horizontal transfers into other organisms. The reconstruction of detailed gene family histories, i.e., the precise dating of evolutionary events relative to phylogenetic tree of the underlying species has remained a challenging topic desp… ▽ More Most genes are part of larger families of evolutionary related genes. The history of gene families typically involves duplications and losses of genes as well as horizontal transfers into other organisms. The reconstruction of detailed gene family histories, i.e., the precise dating of evolutionary events relative to phylogenetic tree of the underlying species has remained a challenging topic despite their importance as a basis for detailed investigations into adaptation and functional evolution of individual members of the gene family. The identification of orthologs, moreover, is a particularly important subproblem of the more general setting considered here. In the last few years, an extensive body of mathematical results has appeared that tightly links orthology, a formal notion of best matches among genes, and horizontal gene transfer. The purpose of this chapter is the broadly outline some of the key mathematical insights and to discuss their implication for practical applications. In particular, we focus on tree-free methods, i.e., methods to infer orthology or horizontal gene transfer as well as gene trees, species trees and reconciliations between them without using \emph{a priori} knowledge of the underlying trees or statistical models for the inference of phylogenetic trees. Instead, the initial step aims to extract binary relations among genes. △ Less

Submitted 24 April, 2023; originally announced April 2023.

arXiv:2304.06453 [pdf, other]

On a generalization of median graphs: $k$-median graphs

Authors: Marc Hellmuth, Sandhya Thekkumpadan Puthiyaveedu

Abstract: Median graphs are connected graphs in which for all three vertices there is a unique vertex that belongs to shortest paths between each pair of these three vertices. To be more formal, a graph $G$ is a median graph if, for all $μ, u,v\in V(G)$, it holds that $|I(μ,u)\cap I(μ,v)\cap I(u,v)|=1$ where $I(x,y)$ denotes the set of all vertices that lie on shortest paths connecting $x$ and $y$. In this… ▽ More Median graphs are connected graphs in which for all three vertices there is a unique vertex that belongs to shortest paths between each pair of these three vertices. To be more formal, a graph $G$ is a median graph if, for all $μ, u,v\in V(G)$, it holds that $|I(μ,u)\cap I(μ,v)\cap I(u,v)|=1$ where $I(x,y)$ denotes the set of all vertices that lie on shortest paths connecting $x$ and $y$. In this paper we are interested in a natural generalization of median graphs, called $k$-median graphs. A graph $G$ is a $k$-median graph, if there are $k$ vertices $μ_1,\dots,μ_k\in V(G)$ such that, for all $u,v\in V(G)$, it holds that $|I(μ_i,u)\cap I(μ_i,v)\cap I(u,v)|=1$, $1\leq i\leq k$. By definition, every median graph with $n$ vertices is an $n$-median graph. We provide several characterizations of $k$-median graphs that, in turn, are used to provide many novel characterizations of median graphs. △ Less

Submitted 13 April, 2023; originally announced April 2023.

MSC Class: 05C75 ACM Class: G.2.2

arXiv:2212.02201 [pdf, other]

Relative Timing Information and Orthology in Evolutionary Scenarios

Authors: David Schaller, Tom Hartmann, Manuel Lafond, Nicolas Wieseke, Peter F. Stadler, Marc Hellmuth

Abstract: Evolutionary scenarios describing the evolution of a family of genes within a collection of species comprise the mapping of the vertices of a gene tree $T$ to vertices and edges of a species tree $S$. The relative timing of the last common ancestors of two extant genes (leaves of $T$) and the last common ancestors of the two species (leaves of $S$) in which they reside is indicative of horizontal… ▽ More Evolutionary scenarios describing the evolution of a family of genes within a collection of species comprise the mapping of the vertices of a gene tree $T$ to vertices and edges of a species tree $S$. The relative timing of the last common ancestors of two extant genes (leaves of $T$) and the last common ancestors of the two species (leaves of $S$) in which they reside is indicative of horizontal gene transfers (HGT) and ancient duplications. Orthologous gene pairs, on the other hand, require that their last common ancestors coincides with a corresponding speciation event. The relative timing information of gene and species divergences is captured by three colored graphs that have the extant genes as vertices and the species in which the genes are found as vertex colors: the equal-divergence-time (EDT) graph, the later-divergence-time (LDT) graph and the prior-divergence-time (PDT) graph, which together form an edge partition of the complete graph. Here we give a complete characterization in terms of informative and forbidden triples that can be read off the three graphs and provide a polynomial time algorithm for constructing an evolutionary scenario that explains the graphs, provided such a scenario exists. We show that every EDT graph is perfect. While the information about LDT and PDT graphs is necessary to recognize EDT graphs in polynomial-time for general scenarios, this extra information can be dropped in the HGT-free case. However, recognition of EDT graphs without knowledge of putative LDT and PDT graphs is NP-complete for general scenarios. In contrast, PDT graphs can be recognized in polynomial-time. We finally connect the EDT graph to the alternative definitions of orthology that have been proposed for scenarios with horizontal gene transfer. With one exception, the corresponding graphs are shown to be colored cographs. △ Less

Submitted 2 August, 2023; v1 submitted 5 December, 2022; originally announced December 2022.

arXiv:2211.16854 [pdf, other]

Resolving Prime Modules: The Structure of Pseudo-cographs and Galled-Tree Explainable Graphs

Authors: Marc Hellmuth, Guillaume E. Scholz

Abstract: The modular decomposition of a graph $G$ is a natural construction to capture key features of $G$ in terms of a labeled tree $(T,t)$ whose vertices are labeled as "series" ($1$), "parallel" ($0$) or "prime". However, full information of $G$ is provided by its modular decomposition tree $(T,t)$ only, if $G$ is a cograph, i.e., $G$ does not contain prime modules. In this case, $(T,t)$ explains $G$,… ▽ More The modular decomposition of a graph $G$ is a natural construction to capture key features of $G$ in terms of a labeled tree $(T,t)$ whose vertices are labeled as "series" ($1$), "parallel" ($0$) or "prime". However, full information of $G$ is provided by its modular decomposition tree $(T,t)$ only, if $G$ is a cograph, i.e., $G$ does not contain prime modules. In this case, $(T,t)$ explains $G$, i.e., $\{x,y\}\in E(G)$ if and only if the lowest common ancestor $\mathrm{lca}_T(x,y)$ of $x$ and $y$ has label "$1$". Pseudo-cographs, or, more general, GaTEx graphs $G$ are graphs that can be explained by labeled galled-trees, i.e., labeled networks $(N,t)$ that are obtained from the modular decomposition tree $(T,t)$ of $G$ by replacing the prime vertices in $T$ by simple labeled cycles. GaTEx graphs can be recognized and labeled galled-trees that explain these graphs can be constructed in linear time. In this contribution, we provide a novel characterization of GaTEx graphs in terms of a set $\mathfrak{F}_{\mathrm{GT}}$ of 25 forbidden induced subgraphs. This characterization, in turn, allows us to show that GaTEx graphs are closely related to many other well-known graph classes such as $P_4$-sparse and $P_4$-reducible graphs, weakly-chordal graphs, perfect graphs with perfect order, comparability and permutation graphs, murky graphs as well as interval graphs, Meyniel graphs or very strongly-perfect and brittle graphs. Moreover, we show that every GaTEx graph as twin-width at most 1 and and provide linear-time algorithms to solve several NP-hard problems (clique, coloring, independent set) on GaTEx graphs by utilizing the structure of the underlying galled-trees they explain. △ Less

Submitted 4 May, 2023; v1 submitted 30 November, 2022; originally announced November 2022.

Comments: 18 pages, 3 figures

arXiv:2211.04322 [pdf, other]

Injective split systems

Authors: M. Hellmuth, K. T. Huber, V. Moulton, G. E. Scholz, P. F. Stadler

Abstract: A split system $\mathcal S$ on a finite set $X$, $|X|\ge3$, is a set of bipartitions or splits of $X$ which contains all splits of the form $\{x,X-\{x\}\}$, $x \in X$. To any such split system $\mathcal S$ we can associate the Buneman graph $\mathcal B(\mathcal S)$ which is essentially a median graph with leaf-set $X$ that displays the splits in $\mathcal S$. In this paper, we consider properties… ▽ More A split system $\mathcal S$ on a finite set $X$, $|X|\ge3$, is a set of bipartitions or splits of $X$ which contains all splits of the form $\{x,X-\{x\}\}$, $x \in X$. To any such split system $\mathcal S$ we can associate the Buneman graph $\mathcal B(\mathcal S)$ which is essentially a median graph with leaf-set $X$ that displays the splits in $\mathcal S$. In this paper, we consider properties of injective split systems, that is, split systems $\mathcal S$ with the property that $\mathrm{med}_{\mathcal B(\mathcal S)}(Y) \neq \mathrm{med}_{\mathrm B(\mathcal S)}(Y')$ for any 3-subsets $Y,Y'$ in $X$, where $\mathrm {med}_{\mathcal B(\mathcal S)}(Y)$ denotes the median in $\mathcal B(\mathcal S)$ of the three elements in $Y$ considered as leaves in $\mathcal B(\mathcal S)$. In particular, we show that for any set $X$ there always exists an injective split system on $X$, and we also give a characterization for when a split system is injective. We also consider how complex the Buneman graph $\mathcal B(\mathcal S)$ needs to become in order for a split system $\mathcal S$ on $X$ to be injective. We do this by introducing a quantity for $|X|$ which we call the injective dimension for $|X|$, as well as two related quantities, called the injective 2-split and the rooted-injective dimension. We derive some upper and lower bounds for all three of these dimensions and also prove that some of these bounds are tight. An underlying motivation for studying injective split systems is that they can be used to obtain a natural generalization of symbolic tree maps. An important consequence of our results is that any three-way symbolic map on $X$ can be represented using Buneman graphs. △ Less

Submitted 8 November, 2022; originally announced November 2022.

Comments: 22 pages, 3 figures

arXiv:2204.13466 [pdf, other]

Clustering Systems of Phylogenetic Networks

Authors: Marc Hellmuth, David Schaller, Peter F. Stadler

Abstract: Rooted acyclic graphs appear naturally when the phylogenetic relationship of a set $X$ of taxa involves not only speciations but also recombination, horizontal transfer, or hybridization, that cannot be captured by trees. A variety of classes of such networks have been discussed in the literature, including phylogenetic, level-1, tree-child, tree-based, galled tree, regular, or normal networks as… ▽ More Rooted acyclic graphs appear naturally when the phylogenetic relationship of a set $X$ of taxa involves not only speciations but also recombination, horizontal transfer, or hybridization, that cannot be captured by trees. A variety of classes of such networks have been discussed in the literature, including phylogenetic, level-1, tree-child, tree-based, galled tree, regular, or normal networks as models of different types of evolutionary processes. Clusters arise in models of phylogeny as the sets $\mathtt{C}(v)$ of descendant taxa of a vertex $v$. The clustering system $\mathscr{C}_N$ comprising the clusters of a network $N$ conveys key information on $N$ itself. In the special case of rooted phylogenetic trees, $T$ is uniquely determined by its clustering system $\mathscr{C}_T$. Although this is no longer true for networks in general, it is of interest to relate properties of $N$ and $\mathscr{C}_N$. Here, we systematically investigate the relationships of several well-studied classes of networks and their clustering systems. The main results are correspondences of classes of networks and clustering system of the following form: If $N$ is a network of type $\mathbb{X}$, then $\mathcal{C}_N$ satisfies $\mathbb{Y}$, and conversely if $\mathscr{C}$ is a clustering system satisfying $\mathbb{Y}$ then there is network $N$ of type $\mathbb{X}$ such that $\mathscr{C}\subseteq\mathscr{C}_N$.This, in turn, allows us to investigate the mutual dependencies between the distinct types of networks in much detail. △ Less

Submitted 28 April, 2022; originally announced April 2022.

arXiv:2112.05537 [pdf, other]

From Modular Decomposition Trees to Level-1 Networks: Pseudo-Cographs, Polar-Cats and Prime Polar-Cats

Authors: Marc Hellmuth, Guillaume E. Scholz

Abstract: The modular decomposition of a graph $G$ is a natural construction to capture key features of $G$ in terms of a labeled tree $(T,t)$ whose vertices are labeled as "series" ($1$), "parallel" ($0$) or "prime". However, full information of $G$ is provided by its modular decomposition tree $(T,t)$ only, if $G$ does not contain prime modules. In this case, $(T,t)$ explains $G$, i.e., $\{x,y\}\in E(G)$… ▽ More The modular decomposition of a graph $G$ is a natural construction to capture key features of $G$ in terms of a labeled tree $(T,t)$ whose vertices are labeled as "series" ($1$), "parallel" ($0$) or "prime". However, full information of $G$ is provided by its modular decomposition tree $(T,t)$ only, if $G$ does not contain prime modules. In this case, $(T,t)$ explains $G$, i.e., $\{x,y\}\in E(G)$ if and only if the lowest common ancestor $\mathrm{lca}_T(x,y)$ of $x$ and $y$ has label "$1$". This information, however, gets lost whenever $(T,t)$ contains vertices with label "prime". In this contribution, we aim at replacing "prime" vertices in $(T,t)$ by simple 0/1-labeled cycles, which leads to the concept of rooted labeled level-1 networks $(N,t)$. We characterize graphs that can be explained by such level-1 networks $(N,t)$, which generalizes the concept of graphs that can be explained by labeled trees, that is, cographs. We provide three novel graph classes: \emph{polar-cats} are a proper subclass of \emph{pseudo-cographs} which forms a proper subclass of \emph{prime polar-cats}. In particular, every cograph is a pseudo-cograph and prime polar-cats are precisely those graphs that can be explained by a labeled level-1 network. The class of prime polar-cats is defined in terms of the modular decomposition of graphs and the property that all prime modules "induce" polar-cats. We provide a plethora of structural results and characterizations for graphs of these new classes. In addition, we show under which conditions there is a unique least-resolved labeled level-1 network that explains a given graph and provide linear-time algorithms to recognize all these types of graphs and to construct level-1 networks to explain them. △ Less

Submitted 15 June, 2022; v1 submitted 10 December, 2021; originally announced December 2021.

arXiv:2112.00403 [pdf, other]

Orientation of Fitch Graphs and Detection of Horizontal Gene Transfer in Gene Trees

Authors: David Schaller, Marc Hellmuth, Peter F. Stadler

Abstract: Horizontal gene transfer events partition a gene tree $T$ and thus, its leaf set into subsets of genes whose evolutionary history is described by speciation and duplication events alone. Indirect phylogenetic methods can be used to infer such partitions $\mathcal{P}$ from sequence similarity or evolutionary distances without any a priory knowledge about the underlying tree $T$. In this contributio… ▽ More Horizontal gene transfer events partition a gene tree $T$ and thus, its leaf set into subsets of genes whose evolutionary history is described by speciation and duplication events alone. Indirect phylogenetic methods can be used to infer such partitions $\mathcal{P}$ from sequence similarity or evolutionary distances without any a priory knowledge about the underlying tree $T$. In this contribution, we assume that such a partition $\mathcal{P}$ of a set of genes $X$ is given and that, independently, an estimate $T$ of the original gene tree on $X$ has been derived. We then ask to what extent $T$ and the xenology information, i.e., $\mathcal{P}$ can be combined to determine the horizontal transfer edges in $T$. We show that for each pair of genes $x$ and $y$ with $x,y$ being in different parts of $\mathcal{P}$, it can be decided whether there always exists or never exists a horizontal gene transfer in $T$ along the path connecting $y$ and the most recent common ancestor of $x$ and $y$. This problem is equivalent to determining the presence or absence of the directed edge $(x,y)$ in so-called Fitch graphs; a more fine-grained version of graphs that represent the dependencies between the sets in $\mathcal{P}$. We then consider the generalization to insufficiently resolved gene trees and show that analogous results can be obtained. We show that the classification of $(x,y)$ can be computed in constant time after linear-time preprocessing. Using simulated gene family histories, we observe empirically that the vast majority of horizontal transfer edges in the gene tree $T$ can be recovered unambiguously. △ Less

Submitted 1 December, 2021; originally announced December 2021.

arXiv:2110.09346 [pdf, other]

Planar Median Graphs and Cubesquare-Graphs

Authors: Carsten R. Seemann, Vincent Moulton, Peter F. Stadler, Marc Hellmuth

Abstract: Median graphs are connected graphs in which for all three vertices there is a unique vertex that belongs to shortest paths between each pair of these three vertices. In this paper we provide several novel characterizations of planar median graphs. More specifically, we characterize when a planar graph $G$ is a median graph in terms of forbidden subgraphs and the structure of isometric cycles in… ▽ More Median graphs are connected graphs in which for all three vertices there is a unique vertex that belongs to shortest paths between each pair of these three vertices. In this paper we provide several novel characterizations of planar median graphs. More specifically, we characterize when a planar graph $G$ is a median graph in terms of forbidden subgraphs and the structure of isometric cycles in $G$, and also in terms of subgraphs of $G$ that are contained inside and outside of 4-cycles with respect to an arbitrary planar embedding of $G$. These results lead us to a new characterization of planar median graphs in terms of cubesquare-graphs that is, graphs that can be obtained by starting with cubes and square graphs, and iteratively replacing 4-cycle boundaries (relative to some embedding) by cubes or square-graphs. As a corollary we also show that a graph is planar median if and only if it can be obtained from cubes and square-graphs by a sequence of ``square-boundary'' amalgamations. These considerations also lead to an $\mathcal{O}(n\log n)$-time recognition algorithm to compute a decomposition of a planar median graph with $n$ vertices into cubes and square-graphs. △ Less

Submitted 18 October, 2021; originally announced October 2021.

arXiv:2109.10235 [pdf, other]

Quasi-Best Match Graphs

Authors: Annachiara Korchmaros, David Schaller, Marc Hellmuth, Peter F. Stadler

Abstract: Quasi-best match graphs (qBMGs) are a hereditary class of directed, properly vertex-colored graphs. They arise naturally in mathematical phylogenetics as a generalization of best match graphs, which formalize the notion of evolutionary closest relatedness of genes (vertices) in multiple species (vertex colors). They are explained by rooted trees whose leaves correspond to vertices. In contrast to… ▽ More Quasi-best match graphs (qBMGs) are a hereditary class of directed, properly vertex-colored graphs. They arise naturally in mathematical phylogenetics as a generalization of best match graphs, which formalize the notion of evolutionary closest relatedness of genes (vertices) in multiple species (vertex colors). They are explained by rooted trees whose leaves correspond to vertices. In contrast to BMGs, qBMGs represent only best matches at a restricted phylogenetic distance. We provide characterizations of qBMGs that give rise to polynomial-time recognition algorithms and identify the BMGs as the qBMGs that are color-sink-free. Furthermore, two-colored qBMGs are characterized as directed graphs satisfying three simple local conditions, two of which have appeared previously, namely bi-transitivity in the sense of Das et al. (2021) and a hierarchy-like structure of out-neighborhoods, i.e., $N(x)\cap N(y)\in\{N(x),N(y),\emptyset\}$ for any two vertices $x$ and $y$. Further results characterize qBMGs that can be explained by binary phylogenetic trees. △ Less

Submitted 22 January, 2023; v1 submitted 21 September, 2021; originally announced September 2021.

arXiv:2109.06755 [pdf, other]

Construction of $k$-matchings and $k$-regular subgraphs in graph products

Authors: Anna Lindeberg, Marc Hellmuth

Abstract: A $k$-matching $M$ of a graph $G=(V,E)$ is a subset $M\subseteq E$ such that each connected component in the subgraph $F = (V,M)$ of $G$ is either a single-vertex graph or $k$-regular, i.e., each vertex has degree $k$. In this contribution, we are interested in $k$-matchings within the four standard graph products: the Cartesian, strong, direct and lexicographic product. As we shall see, the pro… ▽ More A $k$-matching $M$ of a graph $G=(V,E)$ is a subset $M\subseteq E$ such that each connected component in the subgraph $F = (V,M)$ of $G$ is either a single-vertex graph or $k$-regular, i.e., each vertex has degree $k$. In this contribution, we are interested in $k$-matchings within the four standard graph products: the Cartesian, strong, direct and lexicographic product. As we shall see, the problem of finding non-empty $k$-matchings ($k\geq 3$) in graph products is NP-complete. Due to the general intractability of this problem, we focus on distinct polynomial-time constructions of $k$-matchings in a graph product $G\star H$ that are based on $k_G$-matchings $M_G$ and $k_H$-matchings $M_H$ of its factors $G$ and $H$, respectively. In particular, we are interested in properties of the factors that have to be satisfied such that these constructions yield a maximum $k$-matching in the respective products. Such constructions are also called "well-behaved" and we provide several characterizations for this type of $k$-matchings. Our specific constructions of $k$-matchings in graph products satisfy the property of being weak-homomorphism preserving, i.e., constructed matched edges in the product are never "projected" to unmatched edges in the factors. This leads to the concept of weak-homomorphism preserving $k$-matchings. Although the specific $k$-matchings constructed here are not always maximum $k$-matchings of the products, they have always maximum size among all weak-homomorphism preserving $k$-matchings. Not all weak-homomorphism preserving $k$-matchings, however, can be constructed in our manner. We will, therefore, determine the size of maximum-sized elements among all weak-homomorphims preserving $k$-matching within the respective graph products, provided that the matchings in the factors satisfy some general assumptions. △ Less

Submitted 14 September, 2021; originally announced September 2021.

arXiv:2107.01893 [pdf, other]

Combining Orthology and Xenology Data in a Common Phylogenetic Tree

Authors: Marc Hellmuth, Mira Michel, Nikolai N. Nøjgaard, David Schaller, Peter F. Stadler

Abstract: A rooted tree $T$ with vertex labels $t(v)$ and set-valued edge labels $λ(e)$ defines maps $δ$ and $\varepsilon$ on the pairs of leaves of $T$ by setting $δ(x,y)=q$ if the last common ancestor $\text{lca}(x,y)$ of $x$ and $y$ is labeled $q$, and $m\in \varepsilon(x,y)$ if $m\inλ(e)$ for at least one edge $e$ along the path from $\text{lca}(x,y)$ to $y$. We show that a pair of maps… ▽ More A rooted tree $T$ with vertex labels $t(v)$ and set-valued edge labels $λ(e)$ defines maps $δ$ and $\varepsilon$ on the pairs of leaves of $T$ by setting $δ(x,y)=q$ if the last common ancestor $\text{lca}(x,y)$ of $x$ and $y$ is labeled $q$, and $m\in \varepsilon(x,y)$ if $m\inλ(e)$ for at least one edge $e$ along the path from $\text{lca}(x,y)$ to $y$. We show that a pair of maps $(δ,\varepsilon)$ derives from a tree $(T,t,λ)$ if and only if there exists a common refinement of the (unique) least-resolved vertex labeled tree $(T_δ,t_δ)$ that explains $δ$ and the (unique) least resolved edge labeled tree $(T_{\varepsilon},λ_{\varepsilon})$ that explains $\varepsilon$ (provided both trees exist). This result remains true if certain combinations of labels at incident vertices and edges are forbidden. △ Less

Submitted 5 July, 2021; originally announced July 2021.

arXiv:2107.00072 [pdf, other]

A Simple Linear-Time Algorithm for the Common Refinement of Rooted Phylogenetic Trees on a Common Leaf Set

Authors: David Schaller, Marc Hellmuth, Peter F. Stadler

Abstract: Background. The supertree problem, i.e., the task of finding a common refinement of a set of rooted trees is an important topic in mathematical phylogenetics. The special case of a common leaf set $L$ is known to be solvable in linear time. Existing approaches refine one input tree using information of the others and then test whether the results are isomorphic. Results. A linear-time algorithm,… ▽ More Background. The supertree problem, i.e., the task of finding a common refinement of a set of rooted trees is an important topic in mathematical phylogenetics. The special case of a common leaf set $L$ is known to be solvable in linear time. Existing approaches refine one input tree using information of the others and then test whether the results are isomorphic. Results. A linear-time algorithm, LinCR, for constructing the common refinement $T$ of $k$ input trees with a common leaf set is proposed that explicitly computes the parent function of $T$ in a bottom-up approach. Conclusion. LinCR is simpler to implement than other asymptotically optimal algorithms for the problem and outperforms the alternatives in empirical comparisons. Availability. An implementation of LinCR in Python is freely available at https://github.com/david-schaller/tralda. △ Less

Submitted 24 September, 2021; v1 submitted 30 June, 2021; originally announced July 2021.

arXiv:2104.14146 [pdf, other]

Compatibility of Partitions with Trees, Hierarchies, and Split Systems

Authors: Marc Hellmuth, David Schaller, Peter F. Stadler

Abstract: The question whether a partition $\mathcal{P}$ and a hierarchy $\mathcal{H}$ or a tree-like split system $\mathfrak{S}$ are compatible naturally arises in a wide range of classification problems. In the setting of phylogenetic trees, one asks whether the sets of $\mathcal{P}$coincide with leaf sets of connected components obtained by deleting some edges from the tree $T$ that represents… ▽ More The question whether a partition $\mathcal{P}$ and a hierarchy $\mathcal{H}$ or a tree-like split system $\mathfrak{S}$ are compatible naturally arises in a wide range of classification problems. In the setting of phylogenetic trees, one asks whether the sets of $\mathcal{P}$coincide with leaf sets of connected components obtained by deleting some edges from the tree $T$ that represents $\mathcal{H}$ or $\mathfrak{S}$, respectively. More generally, we ask whether a refinement $T^*$ of $T$ exists such that $T^*$ and $\mathcal{P}$ are compatible in this sense. The latter is closely related to the question as to whether there exists a tree at all that is compatible with $\mathcal{P}$. We report several characterizations for (refinements of) hierarchies and split systems that are compatible with (systems of) partitions. In addition, we provide a linear-time algorithm to check whether refinements of trees and a given partition are compatible. The latter problem becomes NP-complete but fixed-parameter tractable if a system of partitions is considered instead of a single partition. In this context, we also explore the close relationship of the concept of compatibility and so-called Fitch maps. △ Less

Submitted 30 November, 2021; v1 submitted 29 April, 2021; originally announced April 2021.

arXiv:2103.07280 [pdf, other]

Heuristic Algorithms for Best Match Graph Editing

Authors: David Schaller, Manuela Geiß, Marc Hellmuth, Peter F. Stadler

Abstract: Best match graphs (BMGs) are a class of colored digraphs that naturally appear in mathematical phylogenetics and can be approximated with the help of similarity measures between gene sequences, albeit not without errors. The corresponding graph editing problem can be used as a means of error correction. Since the arc set modification problems for BMGs are NP-complete, efficient heuristics are need… ▽ More Best match graphs (BMGs) are a class of colored digraphs that naturally appear in mathematical phylogenetics and can be approximated with the help of similarity measures between gene sequences, albeit not without errors. The corresponding graph editing problem can be used as a means of error correction. Since the arc set modification problems for BMGs are NP-complete, efficient heuristics are needed if BMGs are to be used for the practical analysis of biological sequence data. Since BMGs have a characterization in terms of consistency of a certain set of rooted triples, we consider heuristics that operate on triple sets. As an alternative, we show that there is a close connection to a set partitioning problem that leads to a class of top-down recursive algorithms that are similar to Aho's supertree algorithm and give rise to BMG editing algorithms that are consistent in the sense that they leave BMGs invariant. Extensive benchmarking shows that community detection algorithms for the partitioning steps perform best for BMG editing. △ Less

Submitted 10 March, 2021; originally announced March 2021.

arXiv:2103.06683 [pdf, other]

From Modular Decomposition Trees to Rooted Median Graphs

Authors: Carmen Bruckmann, Peter F. Stadler, Marc Hellmuth

Abstract: The modular decomposition of a symmetric map $δ\colon X\times X \to Υ$ (or, equivalently, a set of symmetric binary relations, a 2-structure, or an edge-colored undirected graph) is a natural construction to capture key features of $δ$ in labeled trees. A map $δ$ is explained by a vertex-labeled rooted tree $(T,t)$ if the label $δ(x,y)$ coincides with the label of the last common ancestor of $x$ a… ▽ More The modular decomposition of a symmetric map $δ\colon X\times X \to Υ$ (or, equivalently, a set of symmetric binary relations, a 2-structure, or an edge-colored undirected graph) is a natural construction to capture key features of $δ$ in labeled trees. A map $δ$ is explained by a vertex-labeled rooted tree $(T,t)$ if the label $δ(x,y)$ coincides with the label of the last common ancestor of $x$ and $y$ in $T$, i.e., if $δ(x,y)=t(\mathrm{lca}(x,y))$. Only maps whose modular decomposition does not contain prime nodes, i.e., the symbolic ultrametrics, can be exaplained in this manner. Here we consider rooted median graphs as a generalization to (modular decomposition) trees to explain symmetric maps. We first show that every symmetric map can be explained by "extended" hypercubes and half-grids. We then derive a a linear-time algorithm that stepwisely resolves prime vertices in the modular decomposition tree to obtain a rooted and labeled median graph that explains a given symmetric map $δ$. We argue that the resulting "tree-like" median graphs may be of use in phylogenetics as a model of evolutionary relationships. △ Less

Submitted 11 March, 2021; originally announced March 2021.

arXiv:2103.06665 [pdf, other]

Arc-Completion of 2-Colored Best Match Graphs to Binary-Explainable Best Match Graphs

Authors: David Schaller, Manuela Geiß, Marc Hellmuth, Peter F. Stadler

Abstract: Best match graphs (BMGs) are vertex-colored digraphs that naturally arise in mathematical phylogenetics to formalize the notion of evolutionary closest genes w.r.t. an a priori unknown phylogenetic tree. BMGs are explained by unique least resolved trees. We prove that the property of a rooted, leaf-colored tree to be least resolved for some BMG is preserved by the contraction of inner edges. For t… ▽ More Best match graphs (BMGs) are vertex-colored digraphs that naturally arise in mathematical phylogenetics to formalize the notion of evolutionary closest genes w.r.t. an a priori unknown phylogenetic tree. BMGs are explained by unique least resolved trees. We prove that the property of a rooted, leaf-colored tree to be least resolved for some BMG is preserved by the contraction of inner edges. For the special case of two-colored BMGs, this leads to a characterization of the least resolved trees (LRTs) of binary-explainable trees and a simple, polynomial-time algorithm for the minimum cardinality completion of the arc set of a BMG to reach a BMG that can be explained by a binary tree. △ Less

Submitted 11 March, 2021; originally announced March 2021.

arXiv:2101.07000 [pdf, other]

Least resolved trees for two-colored best match graphs

Authors: David Schaller, Manuela Geiß, Marc Hellmuth, Peter F. Stadler

Abstract: 2-colored best match graphs (2-BMGs) form a subclass of sink-free bi-transitive graphs that appears in phylogenetic combinatorics. There, 2-BMGs describe evolutionarily most closely related genes between a pair of species. They are explained by a unique least resolved tree (LRT). Introducing the concept of support vertices we derive an $O(|V|+|E|\log^2|V|)$-time algorithm to recognize 2-BMGs and t… ▽ More 2-colored best match graphs (2-BMGs) form a subclass of sink-free bi-transitive graphs that appears in phylogenetic combinatorics. There, 2-BMGs describe evolutionarily most closely related genes between a pair of species. They are explained by a unique least resolved tree (LRT). Introducing the concept of support vertices we derive an $O(|V|+|E|\log^2|V|)$-time algorithm to recognize 2-BMGs and to construct its LRT. The approach can be extended to also recognize binary-explainable 2-BMGs with the same complexity. An empirical comparison emphasizes the efficiency of the new algorithm. △ Less

Submitted 18 January, 2021; originally announced January 2021.

arXiv:2011.00511 [pdf, other]

Best Match Graphs with Binary Trees

Authors: David Schaller, Manuela Geiß, Marc Hellmuth, Peter F. Stadler

Abstract: Best match graphs (BMG) are a key intermediate in graph-based orthology detection and contain a large amount of information on the gene tree. We provide a near-cubic algorithm to determine whether a BMG is binary-explainable, i.e., whether it can be explained by a fully resolved gene tree and, if so, to construct such a tree. Moreover, we show that all such binary trees are refinements of the uniq… ▽ More Best match graphs (BMG) are a key intermediate in graph-based orthology detection and contain a large amount of information on the gene tree. We provide a near-cubic algorithm to determine whether a BMG is binary-explainable, i.e., whether it can be explained by a fully resolved gene tree and, if so, to construct such a tree. Moreover, we show that all such binary trees are refinements of the unique binary-resolvable tree (BRT), which in general is a substantial refinement of the also unique least resolved tree of a BMG. Finally, we show that the problem of editing an arbitrary vertex-colored graph to a binary-explainable BMG is NP-complete and provide an integer linear program formulation for this task. △ Less

Submitted 8 March, 2021; v1 submitted 1 November, 2020; originally announced November 2020.

arXiv:2006.02249 [pdf, other]

Complete Characterization of Incorrect Orthology Assignments in Best Match Graphs

Authors: David Schaller, Manuela Geiß, Peter F. Stadler, Marc Hellmuth

Abstract: Genome-scale orthology assignments are usually based on reciprocal best matches. In the absence of horizontal gene transfer (HGT), every pair of orthologs forms a reciprocal best match. Incorrect orthology assignments therefore are always false positives in the reciprocal best match graph. We consider duplication/loss scenarios and characterize unambiguous false-positive (u-fp) orthology assignmen… ▽ More Genome-scale orthology assignments are usually based on reciprocal best matches. In the absence of horizontal gene transfer (HGT), every pair of orthologs forms a reciprocal best match. Incorrect orthology assignments therefore are always false positives in the reciprocal best match graph. We consider duplication/loss scenarios and characterize unambiguous false-positive (u-fp) orthology assignments, that is, edges in the best match graphs (BMGs) that cannot correspond to orthologs for any gene tree that explains the BMG. Moreover, we provide a polynomial-time algorithm to identify all u-fp orthology assignments in a BMG. Simulations show that at least $75\%$ of all incorrect orthology assignments can be detected in this manner. All results rely only on the structure of the BMGs and not on any a priori knowledge about underlying gene or species trees. △ Less

Submitted 26 November, 2020; v1 submitted 3 June, 2020; originally announced June 2020.

MSC Class: 92-08; 92D15; 68R01

arXiv:2004.07118 [pdf, other]

Complete Edge-Colored Permutation Graphs

Authors: Tom Hartmann, Max Bannach, Martin Middendorf, Peter F. Stadler, Nicolas Wieseke, Marc Hellmuth

Abstract: We introduce the concept of complete edge-colored permutation graphs as complete graphs that are the edge-disjoint union of "classical" permutation graphs. We show that a graph $G=(V,E)$ is a complete edge-colored permutation graph if and only if each monochromatic subgraph of $G$ is a "classical" permutation graph and $G$ does not contain a triangle with~$3$ different colors. Using the modular de… ▽ More We introduce the concept of complete edge-colored permutation graphs as complete graphs that are the edge-disjoint union of "classical" permutation graphs. We show that a graph $G=(V,E)$ is a complete edge-colored permutation graph if and only if each monochromatic subgraph of $G$ is a "classical" permutation graph and $G$ does not contain a triangle with~$3$ different colors. Using the modular decomposition as a framework we demonstrate that complete edge-colored permutation graphs are characterized in terms of their strong prime modules, which induce also complete edge-colored permutation graphs. This leads to an $\mathcal{O}(|V|^2)$-time recognition algorithm. We show, moreover, that complete edge-colored permutation graphs form a superclass of so-called symbolic ultrametrics and that the coloring of such graphs is always a Gallai coloring. △ Less

Submitted 15 April, 2020; originally announced April 2020.

arXiv:2004.06340 [pdf, other]

Hierarchical and Modularly-Minimal Vertex Colorings

Authors: Dulce I. Valdivia, Manuela Geiß, Maribel Hernández Rosales, Peter F. Stadler, Marc Hellmuth

Abstract: Cographs are exactly the hereditarily well-colored graphs, i.e., the graphs for which a greedy vertex coloring of every induced subgraph uses only the minimally necessary number of colors $χ(G)$. We show that greedy colorings are a special case of the more general hierarchical vertex colorings, which recently were introduced in phylogenetic combinatorics. Replacing cotrees by modular decomposition… ▽ More Cographs are exactly the hereditarily well-colored graphs, i.e., the graphs for which a greedy vertex coloring of every induced subgraph uses only the minimally necessary number of colors $χ(G)$. We show that greedy colorings are a special case of the more general hierarchical vertex colorings, which recently were introduced in phylogenetic combinatorics. Replacing cotrees by modular decomposition trees generalizes the concept of hierarchical colorings to arbitrary graphs. We show that every graph has a modularly-minimal coloring $σ$ satisfying $|σ(M)|=χ(M)$ for every strong module $M$ of $G$. This, in particular, shows that modularly-minimal colorings provide a useful device to design efficient coloring algorithms for certain hereditary graph classes. For cographs, the hierarchical colorings coincide with the modularly-minimal coloring. As a by-product, we obtain a simple linear-time algorithm to compute a modularly-minimal coloring of $P_4$-sparse graphs. △ Less

Submitted 14 April, 2020; originally announced April 2020.

Comments: arXiv admin note: text overlap with arXiv:1906.10031

arXiv:2001.05921 [pdf, other]

doi 10.46298/dmtcs.6040

Generalized Fitch Graphs III: Symmetrized Fitch maps and Sets of Symmetric Binary Relations that are explained by Unrooted Edge-labeled Trees

Authors: Marc Hellmuth, Carsten R. Seemann, Peter F. Stadler

Abstract: Binary relations derived from labeled rooted trees play an import role in mathematical biology as formal models of evolutionary relationships. The (symmetrized) Fitch relation formalizes xenology as the pairs of genes separated by at least one horizontal transfer event. As a natural generalization, we consider symmetrized Fitch maps, that is, symmetric maps $\varepsilon$ that assign a subset of co… ▽ More Binary relations derived from labeled rooted trees play an import role in mathematical biology as formal models of evolutionary relationships. The (symmetrized) Fitch relation formalizes xenology as the pairs of genes separated by at least one horizontal transfer event. As a natural generalization, we consider symmetrized Fitch maps, that is, symmetric maps $\varepsilon$ that assign a subset of colors to each pair of vertices in $X$ and that can be explained by a tree $T$ with edges that are labeled with subsets of colors in the sense that the color $m$ appears in $\varepsilon(x,y)$ if and only if $m$ appears in a label along the unique path between $x$ and $y$ in $T$. We first give an alternative characterization of the monochromatic case and then give a characterization of symmetrized Fitch maps in terms of compatibility of a certain set of quartets. We show that recognition of symmetrized Fitch maps is NP-complete. In the restricted case where $|\varepsilon(x,y)|\leq 1$ the problem becomes polynomial, since such maps coincide with class of monochromatic Fitch maps whose graph-representations form precisely the class of complete multi-partite graphs. △ Less

Submitted 18 May, 2021; v1 submitted 16 January, 2020; originally announced January 2020.

MSC Class: 68R01; 05C05; 92D15

Journal ref: Discrete Mathematics & Theoretical Computer Science, vol. 23 no. 1, Graph Theory (June 3, 2021) dmtcs:6040

arXiv:1911.07469 [pdf, other]

doi 10.1016/j.dam.2020.01.036

Generalized Fitch Graphs II: Sets of Binary Relations that are explained by Edge-labeled Trees

Authors: Marc Hellmuth, Carsten R. Seemann, Peter F. Stadler

Abstract: Fitch graphs $G=(X,E)$ are digraphs that are explained by $\{\emptyset, 1\}$-edge-labeled rooted trees $T$ with leaf set $X$: there is an arc $(x,y) \in E$ if and only if the unique path in $T$ that connects the last common ancestor $\mathrm{lca}(x,y)$ of $x$ and $y$ with $y$ contains at least one edge with label "1". In practice, Fitch graphs represent xenology relations, i.e., pairs of genes… ▽ More Fitch graphs $G=(X,E)$ are digraphs that are explained by $\{\emptyset, 1\}$-edge-labeled rooted trees $T$ with leaf set $X$: there is an arc $(x,y) \in E$ if and only if the unique path in $T$ that connects the last common ancestor $\mathrm{lca}(x,y)$ of $x$ and $y$ with $y$ contains at least one edge with label "1". In practice, Fitch graphs represent xenology relations, i.e., pairs of genes $x$ and $y$ for which a horizontal gene transfer happened along the path from $\mathrm{lca}(x,y)$ to $y$. In this contribution, we generalize the concept of Fitch graphs and consider trees $T$ that are equipped with edge-labeling $λ: E\to \mathcal{P}(M)$ that assigns to each edge a subset $M'\subseteq M$ of colors. Given such a tree, we can derive a map $\varepsilon_{(T,λ)}$ (or equivalently a set of not necessarily disjoint binary relations), such that $i\in \varepsilon_{(T,λ)}(x,y)$ (or equivalently $(x,y)\in R_i$) with $x,y\in X$, if and only if there is at least one edge with color $i$ from $\mathrm{lca}(x,y)$ to $y$. The central question considered here: Is a given map $\varepsilon$ a Fitch map, i.e., is there there an edge-labeled tree $(T,λ)$ with $\varepsilon_{(T,λ)} = \varepsilon$, and thus explains $\varepsilon$? Here, we provide a characterization of Fitch maps in terms of certain neighborhoods and forbidden submaps. Further restrictions of Fitch maps are considered. Moreover, we show that the least-resolved tree explaining a Fitch map is unique (up to isomorphism). In addition, we provide a polynomial-time algorithm to decide whether $\varepsilon$ is a Fitch map and, in the affirmative case, to construct the (up to isomorphism) unique least-resolved tree $(T^*,λ^*)$ that explains $\varepsilon$. △ Less

Submitted 30 January, 2020; v1 submitted 18 November, 2019; originally announced November 2019.

MSC Class: 68R01; 05C05; 92D15

arXiv:1910.13123 [pdf, other]

Reconstruction of time-consistent species trees

Authors: Manuel Lafond, Marc Hellmuth

Abstract: The history of gene families -- which are equivalent to event-labeled gene trees -- can to some extent be reconstructed from empirically estimated evolutionary event-relations containing pairs of orthologous, paralogous or xenologous genes. The question then arises as whether inferred event-labeled gene trees are "biologically feasible" which is the case if one can find a species tree with which t… ▽ More The history of gene families -- which are equivalent to event-labeled gene trees -- can to some extent be reconstructed from empirically estimated evolutionary event-relations containing pairs of orthologous, paralogous or xenologous genes. The question then arises as whether inferred event-labeled gene trees are "biologically feasible" which is the case if one can find a species tree with which the gene tree can be reconciled in a time-consistent way. In this contribution, we consider event-labeled gene trees that contain speciation, duplication as well as horizontal gene transfer and we assume that the species tree is unknown. We provide a cubic-time algorithm to decide whether a "time-consistent" binary species for a given event-labeled gene tree exists and, in the affirmative case, to construct the species tree within the same time-complexity. △ Less

Submitted 29 October, 2019; originally announced October 2019.

arXiv:1907.08865 [pdf, other]

Complexity of Modification Problems for Reciprocal Best Match Graphs

Authors: Marc Hellmuth, Manuela Geiß, Peter F. Stadler

Abstract: Reciprocal best match graphs (RBMGs) are vertex colored graphs whose vertices represent genes and the colors the species where the genes reside. Edges identify pairs of genes that are most closely related with respect to an underlying evolutionary tree. In practical applications this tree is unknown and the edges of the RBMGs are inferred by quantifying sequence similarity. Due to noise in the dat… ▽ More Reciprocal best match graphs (RBMGs) are vertex colored graphs whose vertices represent genes and the colors the species where the genes reside. Edges identify pairs of genes that are most closely related with respect to an underlying evolutionary tree. In practical applications this tree is unknown and the edges of the RBMGs are inferred by quantifying sequence similarity. Due to noise in the data, these empirically determined graphs in general violate the condition of being a ``biologically feasible'' RBMG. Therefore, it is of practical interest in computational biology to correct the initial estimate. Here we consider deletion (remove at most $k$ edges) and editing (add or delete at most $k$ edges) problems. We show that the decision version of the deletion and editing problem to obtain RBMGs from vertex colored graphs is NP-hard. Using known results for the so-called bicluster editing, we show that the RBMG editing problem for $2$-colored graphs is fixed-parameter tractable. A restricted class of RBMGs appears in the context of orthology detection. These are cographs with a specific type of vertex coloring known as hierarchical coloring. We show that the decision problem of modifying a vertex-colored graph (either by edge-deletion or editing) into an RBMG with cograph structure or, equivalently, to an hierarchically colored cograph is NP-complete. △ Less

Submitted 20 July, 2019; originally announced July 2019.

arXiv:1906.10031 [pdf, other]

Hierarchical Colorings of Cographs

Authors: D. I. Valdivia, M. Geiß, M. Hellmuth, M. Hernandez Rosales, P. F. Stadler

Abstract: Cographs are exactly hereditarily well-colored graphs, i.e., the graphs for which a greedy coloring of every induced subgraph uses only the minimally necessary number of colors $χ(G)$. In recent work on reciprocal best match graphs so-called hierarchically coloring play an important role. Here we show that greedy colorings are a special case of hierarchical coloring, which also require no more tha… ▽ More Cographs are exactly hereditarily well-colored graphs, i.e., the graphs for which a greedy coloring of every induced subgraph uses only the minimally necessary number of colors $χ(G)$. In recent work on reciprocal best match graphs so-called hierarchically coloring play an important role. Here we show that greedy colorings are a special case of hierarchical coloring, which also require no more than $χ(G)$ colors. △ Less

Submitted 24 June, 2019; originally announced June 2019.

arXiv:1903.07920 [pdf, other]

Reciprocal Best Match Graphs

Authors: Manuela Geiß, Peter F. Stadler, Marc Hellmuth

Abstract: Reciprocal best matches play an important role in numerous applications in computational biology, in particular as the basis of many widely used tools for orthology assessment. Nevertheless, very little is known about their mathematical structure. Here, we investigate the structure of reciprocal best match graphs (RBMGs). In order to abstract from the details of measuring distances, we define reci… ▽ More Reciprocal best matches play an important role in numerous applications in computational biology, in particular as the basis of many widely used tools for orthology assessment. Nevertheless, very little is known about their mathematical structure. Here, we investigate the structure of reciprocal best match graphs (RBMGs). In order to abstract from the details of measuring distances, we define reciprocal best matches here as pairwise most closely related leaves in a gene tree, arguing that conceptually this is the notion that is pragmatically approximated by distance- or similarity-based heuristics. We start by showing that a graph $G$ is an RBMG if and only if its quotient graph w.r.t.\ a certain thinness relation is an RBMG. Furthermore, it is necessary and sufficient that all connected components of $G$ are RBMGs. The main result of this contribution is a complete characterization of RBMGs with 3 colors/species that can be checked in polynomial time. For 3 colors, there are three distinct classes of trees that are related to the structure of the phylogenetic trees explaining them. We derive an approach to recognize RBMGs with an arbitrary number of colors; it remains open however, whether a polynomial-time for RBMG recognition exists. In addition, we show that RBMGs that at the same time are cographs (co-RBMGs) can be recognized in polynomial time. Co-RBMGs are characterized in terms of hierarchically colored cographs, a particular class of vertex colored cographs that is introduced here. The (least resolved) trees that explain co-RBMGs can be constructed in polynomial time. △ Less

Submitted 29 August, 2019; v1 submitted 19 March, 2019; originally announced March 2019.

arXiv:1812.09002 [pdf, other]

Reconciling Event-Labeled Gene Trees with MUL-trees and Species Networks

Authors: Marc Hellmuth, Katharina T. Huber, Vincent Moulton

Abstract: Phylogenomics commonly aims to construct evolutionary trees from genomic sequence information. One way to approach this problem is to first estimate event-labeled gene trees (i.e., rooted trees whose non-leaf vertices are labeled by speciation or gene duplication events), and to then look for a species tree which can be reconciled with this tree through a \emph{reconciliation map} between the tree… ▽ More Phylogenomics commonly aims to construct evolutionary trees from genomic sequence information. One way to approach this problem is to first estimate event-labeled gene trees (i.e., rooted trees whose non-leaf vertices are labeled by speciation or gene duplication events), and to then look for a species tree which can be reconciled with this tree through a \emph{reconciliation map} between the trees. In practice, however, it can happen that there is no such map from a given event-labeled tree to \emph{any} species tree. An important situation where this might arise is where the species evolution is better represented by a \emph{network} instead of a tree. In this paper, we therefore consider the problem of reconciling event-labeled trees with species networks. In particular, we prove that any event-labeled gene tree can be reconciled with some network and that, under certain mild assumptions on the gene tree, the network can even be assumed to be multi-arc free. To prove this result, we show that we can always reconcile the gene tree with some multi-labeled (MUL-)tree, which can then be "folded up" to produce the desired reconciliation and network. In addition, we study the interplay between reconciliation maps from event-labeled gene trees to MUL-trees and networks. Our results could be useful for understanding how genomes have evolved after undergoing complex evolutionary events such as polyploidy. △ Less

Submitted 8 May, 2019; v1 submitted 21 December, 2018; originally announced December 2018.

MSC Class: 05C90; 92D15

arXiv:1811.12661 [pdf, other]

doi 10.1007/s00285-019-01384-x

Alternative Characterizations of Fitch's Xenology Relation

Authors: Marc Hellmuth, Carsten R. Seemann

Abstract: According to Walter M. Fitch, two genes are xenologs if they are separated by at least one horizontal gene transfer. This concept is formalized through Fitch relations, which are defined as binary relations that comprise all pairs $(x,y)$ of genes $x$ and $y$ for which $y$ has been horizontally transferred at least once since it diverged from the least common ancestor of $x$ and $y$. This definiti… ▽ More According to Walter M. Fitch, two genes are xenologs if they are separated by at least one horizontal gene transfer. This concept is formalized through Fitch relations, which are defined as binary relations that comprise all pairs $(x,y)$ of genes $x$ and $y$ for which $y$ has been horizontally transferred at least once since it diverged from the least common ancestor of $x$ and $y$. This definition, in particular, preserves the directional character of the transfer. Fitch relations are characterized by a small set of forbidden induced subgraphs on three vertices and can be recognized in linear time. In this contribution, we provide two novel characterizations of Fitch relations and present an alternative, short and elegant proof of the characterization theorem established by Geiß et al.\ in \emph{J.\ Math.\ Bio 77(5), 2018}. △ Less

Submitted 30 November, 2018; originally announced November 2018.

arXiv:1803.10989 [pdf, other]

doi 10.1007/s00285-019-01332-9

Best Match Graphs

Authors: Manuela Geiß, Edgar Chavez, Marcos Gonzalez, Alitzel Lopez, Bärbel M. R. Stadler, Dulce I. Valdivia, Marc Hellmuth, Maribel H. Rosales, Peter F. Stadler

Abstract: THIS IS A CORRECTED VERSION INCLUDING AN APPENDED CORRIGENDUM. Best match graphs arise naturally as the first processing intermediate in algorithms for orthology detection. Let $T$ be a phylogenetic (gene) tree $T$ and $σ$ an assignment of leaves of $T$ to species. The best match graph $(G,σ)$ is a digraph that contains an arc from $x$ to $y$ if the genes $x$ and $y$ reside in different species… ▽ More THIS IS A CORRECTED VERSION INCLUDING AN APPENDED CORRIGENDUM. Best match graphs arise naturally as the first processing intermediate in algorithms for orthology detection. Let $T$ be a phylogenetic (gene) tree $T$ and $σ$ an assignment of leaves of $T$ to species. The best match graph $(G,σ)$ is a digraph that contains an arc from $x$ to $y$ if the genes $x$ and $y$ reside in different species and $y$ is one of possibly many (evolutionary) closest relatives of $x$ compared to all other genes contained in the species $σ(y)$. Here, we characterize best match graphs and show that it can be decided in cubic time and quadratic space whether $(G,σ)$ derived from a tree in this manner. If the answer is affirmative, there is a unique least resolved tree that explains $(G,σ)$, which can also be constructed in cubic time. △ Less

Submitted 9 September, 2020; v1 submitted 29 March, 2018; originally announced March 2018.

arXiv:1802.03657 [pdf, other]

Generalized Fitch Graphs: Edge-labeled Graphs that are explained by Edge-labeled Trees

Authors: Marc Hellmuth

Abstract: Fitch graphs $G=(X,E)$ are di-graphs that are explained by $\{\otimes,1\}$-edge-labeled rooted trees with leaf set $X$: there is an arc $xy\in E$ if and only if the unique path in $T$ that connects the least common ancestor $\textrm{lca}(x,y)$ of $x$ and $y$ with $y$ contains at least one edge with label $1$. In practice, Fitch graphs represent xenology relations, i.e., pairs of genes $x$ and $y$… ▽ More Fitch graphs $G=(X,E)$ are di-graphs that are explained by $\{\otimes,1\}$-edge-labeled rooted trees with leaf set $X$: there is an arc $xy\in E$ if and only if the unique path in $T$ that connects the least common ancestor $\textrm{lca}(x,y)$ of $x$ and $y$ with $y$ contains at least one edge with label $1$. In practice, Fitch graphs represent xenology relations, i.e., pairs of genes $x$ and $y$ for which a horizontal gene transfer happened along the path from $\textrm{lca}(x,y)$ to $y$. In this contribution, we generalize the concept of xenology and Fitch graphs and consider complete di-graphs $K_{|X|}$ with vertex set $X$ and a map $ε$ that assigns to each arc $xy$ a unique label $ε(x,y)\in M\cup \{\otimes\}$, where $M$ denotes an arbitrary set of symbols. A di-graph $(K_{|X|},ε)$ is a generalized Fitch graph if there is an $M\cup \{\otimes\}$-edge-labeled tree $(T,λ)$ that can explain $(K_{|X|},ε)$. We provide a simple characterization of generalized Fitch graphs $(K_{|X|},ε)$ and give an $O(|X|^2)$-time algorithm for their recognition as well as for the reconstruction of the unique least resolved phylogenetic tree that explains $(K_{|X|},ε)$. △ Less

Submitted 25 April, 2018; v1 submitted 10 February, 2018; originally announced February 2018.

arXiv:1712.01544 [pdf, ps, other]

A Short Note on Undirected Fitch Graphs

Authors: Manuela Geiß, Marc Hellmuth, Yangjing Long, Peter F. Stadler

Abstract: The symmetric version of Fitch's xenology relation coincides with class of complete multipartite graph and thus cannot convey any non-trivial phylogenetic information. The symmetric version of Fitch's xenology relation coincides with class of complete multipartite graph and thus cannot convey any non-trivial phylogenetic information. △ Less

Submitted 5 December, 2017; originally announced December 2017.

arXiv:1707.01667 [pdf, other]

doi 10.1016/j.ejc.2018.02.013

The Matroid Structure of Representative Triple Sets and Triple-Closure Computation

Authors: Marc Hellmuth, Carsten R. Seemann

Abstract: The closure $\textrm{cl}(R)$ of a consistent set $R$ of triples (rooted binary trees on three leaves) provides essential information about tree-like relations that are shown by any supertree that displays all triples in $R$. In this contribution, we are concerned with representative triple sets, that is, subsets $R'$ of $R$ with $\textrm{cl}(R') = \textrm{cl}(R)$. In this case, $R'$ still contains… ▽ More The closure $\textrm{cl}(R)$ of a consistent set $R$ of triples (rooted binary trees on three leaves) provides essential information about tree-like relations that are shown by any supertree that displays all triples in $R$. In this contribution, we are concerned with representative triple sets, that is, subsets $R'$ of $R$ with $\textrm{cl}(R') = \textrm{cl}(R)$. In this case, $R'$ still contains all information on the tree structure implied by $R$, although $R'$ might be significantly smaller. We show that representative triple sets that are minimal w.r.t.\ inclusion form the basis of a matroid. This in turn implies that minimal representative triple sets also have minimum cardinality. In particular, the matroid structure can be used to show that minimum representative triple sets can be computed in polynomial time with a simple greedy approach. For a given triple set $R$ that "identifies" a tree, we provide an exact value for the cardinality of its minimum representative triple sets. In addition, we utilize the latter results to provide a novel and efficient method to compute the closure $\textrm{cl}(R)$ of a consistent triple set $R$ that improves the time complexity $\mathcal{O}(|R||L_R|^4)$ of the currently fastest known method proposed by Bryant and Steel (1995). In particular, if a minimum representative triple set for $R$ is given, it can be shown that the time complexity to compute $\textrm{cl}(R)$ can be improved by a factor up to $|R||L_R|$. As it turns out, collections of quartets (unrooted binary trees on four leaves) do not provide a matroid structure, in general. △ Less

Submitted 25 January, 2018; v1 submitted 6 July, 2017; originally announced July 2017.

arXiv:1705.03823 [pdf, other]

Local algorithms for the prime factorization of strong product graphs

Authors: Marc Hellmuth, Wilfried Imrich, Werner Klöckl, Peter F. Stadler

Abstract: The practical application of graph prime factorization algorithms is limited in practice by unavoidable noise in the data. A first step towards error-tolerant "approximate" prime factorization, is the development of local approaches that cover the graph by factorizable patches and then use this information to derive global factors. We present here a local, quasi-linear al- gorithm for the prime fa… ▽ More The practical application of graph prime factorization algorithms is limited in practice by unavoidable noise in the data. A first step towards error-tolerant "approximate" prime factorization, is the development of local approaches that cover the graph by factorizable patches and then use this information to derive global factors. We present here a local, quasi-linear al- gorithm for the prime factorization of "locally unrefined" graphs with respect to the strong product. To this end we introduce the backbone B(G) for a given graph G and show that the neighborhoods of the backbone vertices provide enough information to determine the global prime factors. △ Less

Submitted 10 May, 2017; originally announced May 2017.

arXiv:1705.03817 [pdf, ps, other]

A Local Prime Factor Decomposition Algorithm for Strong Product Graphs

Authors: Marc Hellmuth

Abstract: This work is concerned with the prime factor decomposition (PFD) of strong product graphs. A new quasi-linear time algorithm for the PFD with respect to the strong product for arbitrary, finite, connected, undirected graphs is derived. Moreover, since most graphs are prime although they can have a product-like structure, also known as approximate graph products, the practical application of the we… ▽ More This work is concerned with the prime factor decomposition (PFD) of strong product graphs. A new quasi-linear time algorithm for the PFD with respect to the strong product for arbitrary, finite, connected, undirected graphs is derived. Moreover, since most graphs are prime although they can have a product-like structure, also known as approximate graph products, the practical application of the well-known "classical" prime factorization algorithm is strictly limited. This new PFD algorithm is based on a local approach that covers a graph by small factorizable subgraphs and then utilizes this information to derive the global factors. Therefore, we can take advantage of this approach and derive in addition a method for the recognition of approximate graph products. △ Less

Submitted 10 May, 2017; originally announced May 2017.

arXiv:1610.04199 [pdf, other]

doi 10.1080/10618600.2018.1513366

Simultaneous Credible Regions for Multiple Changepoint Locations

Authors: Tobias Siems, Marc Hellmuth, Volkmar Liebscher

Abstract: Within a Bayesian retrospective framework, we present a way of examining the distribution of \cps through a novel set estimator. For a given level, $α$, we aim at smallest sets that cover all \cps with a probability of at least $1-α$. These so-called smallest simultaneous credible regions, computed for certain values of $α$, provide parsimonious representations of the possible \cp locations. In ad… ▽ More Within a Bayesian retrospective framework, we present a way of examining the distribution of \cps through a novel set estimator. For a given level, $α$, we aim at smallest sets that cover all \cps with a probability of at least $1-α$. These so-called smallest simultaneous credible regions, computed for certain values of $α$, provide parsimonious representations of the possible \cp locations. In addition, combining them for a range of different $α$'s enables very informative yet condensed visualisations. Therewith we allow for the evaluation of model choices and the analysis of \cp data to an unprecedented degree. This approach exhibits superior sensitivity, specificity and interpretability in comparison with highest density regions, marginal inclusion probabilities and confidence intervals inferred by \stepR. Whilst their direct construction is usually intractable, asymptotically correct solutions can be derived from posterior samples. This leads to a novel NP-complete problem. Through reformulations into an Integer Linear Program we show empirically that a fast greedy heuristic computes virtually exact solutions. △ Less

Submitted 3 September, 2018; v1 submitted 13 October, 2016; originally announced October 2016.

Journal ref: Journal of Computational and Graphical Statistics 2018

arXiv:1511.07162 [pdf, ps, other]

Associativity and non-associativity of some hypergraph products

Authors: Richard H. Hammack, Marc Hellmuth, Lydia Ostermeier, Peter F. Stadler

Abstract: Several variants of hypergraph products have been introduced as generalizations of the strong and direct products of graphs. Here we show that only some of them are associative. In addition to the Cartesian product, these are the minimal rank preserving direct product, and the normal product. Counter-examples are given for the strong product as well as the non-rank-preserving and the maximal rank… ▽ More Several variants of hypergraph products have been introduced as generalizations of the strong and direct products of graphs. Here we show that only some of them are associative. In addition to the Cartesian product, these are the minimal rank preserving direct product, and the normal product. Counter-examples are given for the strong product as well as the non-rank-preserving and the maximal rank preserving direct product. △ Less

Submitted 23 November, 2015; originally announced November 2015.

arXiv:1407.3164 [pdf, ps, other]

The Relaxed Square Property

Authors: Marc Hellmuth, Tilen Marc, Lydia Ostermeier, Peter F. Stadler

Abstract: Graph products are characterized by the existence of non-trivial equivalence relations on the edge set of a graph that satisfy a so-called square property. We investigate here a generalization, termed RSP-relations. The class of graphs with non-trivial RSP-relations in particular includes graph bundles. Furthermore, RSP-relations are intimately related with covering graph constructions. For K_23-f… ▽ More Graph products are characterized by the existence of non-trivial equivalence relations on the edge set of a graph that satisfy a so-called square property. We investigate here a generalization, termed RSP-relations. The class of graphs with non-trivial RSP-relations in particular includes graph bundles. Furthermore, RSP-relations are intimately related with covering graph constructions. For K_23-free graphs finest RSP-relations can be computed in polynomial-time. In general, however, they are not unique and their number may even grow exponentially. They behave well for graph products, however, in sense that a finest RSP-relations can be obtained easily from finest RSP-relations on the prime factors. △ Less

Submitted 11 July, 2014; originally announced July 2014.

arXiv:1401.4965 [pdf, ps, other]

doi 10.1016/j.tcs.2014.10.045

On the Cartesian Skeleton and the Factorization of the Strong Product of Digraphs

Authors: Marc Hellmuth, Tilen Marc

Abstract: The three standard products (the Cartesian, the direct and the strong product) of undirected graphs have been wellinvestigated, unique prime factor decomposition (PFD) are known and polynomial time algorithms have been established for determining the prime factors. For directed graphs, unique PFD results with respect to the standard products are known. However, there is still a lack of algorithm… ▽ More The three standard products (the Cartesian, the direct and the strong product) of undirected graphs have been wellinvestigated, unique prime factor decomposition (PFD) are known and polynomial time algorithms have been established for determining the prime factors. For directed graphs, unique PFD results with respect to the standard products are known. However, there is still a lack of algorithms, that computes the PFD of directed graphs with respect to the direct and the strong product in general. In this contribution, we focus on the algorithmic aspects for determining the PFD of directed graphs with respect to the strong product. Essential for computing the prime factors is the construction of a so-called Cartesian skeleton. This article introduces the notion of the Cartesian skeleton of directed graphs as a generalization of the Cartesian skeleton of undirected graphs. We provide new, fast and transparent algorithms for its construction. Moreover, we present a first polynomial time algorithm for determining the PFD with respect to the strong product of arbitrary connected digraphs. △ Less

Submitted 20 January, 2014; originally announced January 2014.

arXiv:1308.2101 [pdf, ps, other]

Fast Recognition of Partial Star Products and Quasi Cartesian Products

Authors: Marc Hellmuth, Wilfried Imrich, Tomas Kupka

Abstract: This paper is concerned with the fast computation of a relation $\R$ on the edge set of connected graphs that plays a decisive role in the recognition of approximate Cartesian products, the weak reconstruction of Cartesian products, and the recognition of Cartesian graph bundles with a triangle free basis. A special case of $\R$ is the relation $δ^\ast$, whose convex closure yields the product r… ▽ More This paper is concerned with the fast computation of a relation $\R$ on the edge set of connected graphs that plays a decisive role in the recognition of approximate Cartesian products, the weak reconstruction of Cartesian products, and the recognition of Cartesian graph bundles with a triangle free basis. A special case of $\R$ is the relation $δ^\ast$, whose convex closure yields the product relation $σ$ that induces the prime factor decomposition of connected graphs with respect to the Cartesian product. For the construction of $\R$ so-called Partial Star Products are of particular interest. Several special data structures are used that allow to compute Partial Star Products in constant time. These computations are tuned to the recognition of approximate graph products, but also lead to a linear time algorithm for the computation of $δ^\ast$ for graphs with maximum bounded degree. Furthermore, we define \emph{quasi Cartesian products} as graphs with non-trivial $δ^\ast$. We provide several examples, and show that quasi Cartesian products can be recognized in linear time for graphs with bounded maximum degree. Finally, we note that quasi products can be recognized in sublinear time with a parallelized algorithm. △ Less

Submitted 9 August, 2013; originally announced August 2013.

arXiv:1305.1824 [pdf, ps, other]

doi 10.1016/j.dam.2014.02.017

Strong Products of Hypergraphs: Unique Prime Factorization Theorems and Algorithms

Authors: Marc Hellmuth, Manuel Noll, Lydia Ostermeier

Abstract: It is well-known that all finite connected graphs have a unique prime factor decomposition (PFD) with respect to the strong graph product which can be computed in polynomial time. Essential for the PFD computation is the construction of the so-called Cartesian skeleton of the graphs under investigation. In this contribution, we show that every connected thin hypergraph H has a unique prime facto… ▽ More It is well-known that all finite connected graphs have a unique prime factor decomposition (PFD) with respect to the strong graph product which can be computed in polynomial time. Essential for the PFD computation is the construction of the so-called Cartesian skeleton of the graphs under investigation. In this contribution, we show that every connected thin hypergraph H has a unique prime factorization with respect to the normal and strong (hypergraph) product. Both products coincide with the usual strong graph product whenever H is a graph. We introduce the notion of the Cartesian skeleton of hypergraphs as a natural generalization of the Cartesian skeleton of graphs and prove that it is uniquely defined for thin hypergraphs. Moreover, we show that the Cartesian skeleton of hypergraphs can be determined in O(|E|^2) time and that the PFD can be computed in O(|V|^2|E|) time, for hypergraphs H = (V,E) with bounded degree and bounded rank. △ Less

Submitted 8 May, 2013; originally announced May 2013.

arXiv:1303.6803 [pdf, ps, other]

doi 10.1007/s11786-013-0156-7

Partial Star Products: A Local Covering Approach for the Recognition of Approximate Cartesian Product Graphs

Authors: Marc Hellmuth, Wilfried Imrich, Tomas Kupka

Abstract: This paper is concerned with the recognition of approximate graph products with respect to the Cartesian product. Most graphs are prime, although they can have a rich product-like structure. The proposed algorithms are based on a local approach that covers a graph by small subgraphs, so-called partial star products, and then utilizes this information to derive the global factors and an embedding o… ▽ More This paper is concerned with the recognition of approximate graph products with respect to the Cartesian product. Most graphs are prime, although they can have a rich product-like structure. The proposed algorithms are based on a local approach that covers a graph by small subgraphs, so-called partial star products, and then utilizes this information to derive the global factors and an embedding of the graph under investigation into Cartesian product graphs. △ Less

Submitted 17 June, 2013; v1 submitted 27 March, 2013; originally announced March 2013.

arXiv:1301.6898 [pdf, ps, other]

doi 10.1016/j.disc.2013.12.012

Square Property, Equitable Partitions, and Product-like Graphs

Authors: Marc Hellmuth, Lydia Ostermeier, Peter F. Stadler

Abstract: Equivalence relations on the edge set of a graph $G$ that satisfy restrictive conditions on chordless squares play a crucial role in the theory of Cartesian graph products and graph bundles. We show here that such relations in a natural way induce equitable partitions on the vertex set of $G$, which in turn give rise to quotient graphs that can have a rich product structure even if $G$ itself is p… ▽ More Equivalence relations on the edge set of a graph $G$ that satisfy restrictive conditions on chordless squares play a crucial role in the theory of Cartesian graph products and graph bundles. We show here that such relations in a natural way induce equitable partitions on the vertex set of $G$, which in turn give rise to quotient graphs that can have a rich product structure even if $G$ itself is prime. △ Less

Submitted 2 September, 2013; v1 submitted 29 January, 2013; originally announced January 2013.

Comments: 20 pages, 6 figures

Showing 1–50 of 51 results for author: Hellmuth, M