-
Optimised Storage for Datalog Reasoning
Authors:
Xinyue Zhang,
Pan Hu,
Yavor Nenov,
Ian Horrocks
Abstract:
Materialisation facilitates Datalog reasoning by precomputing all consequences of the facts and the rules so that queries can be directly answered over the materialised facts. However, storing all materialised facts may be infeasible in practice, especially when the rules are complex and the given set of facts is large. We observe that for certain combinations of rules, there exist data structures that compactly represent the reasoning result and can be efficiently queried when necessary. In this paper, we present a general framework that allows for the integration of such optimised storage schemes with standard materialisation algorithms. Moreover, we devise optimised storage schemes targeting transitive rules and union rules, two types of (combinations of) rules that commonly occur in practice. Our experimental evaluation shows that our approach significantly reduces memory consumption, sometimes by orders of magnitude, while remaining competitive in terms of query answering time.
Submitted 19 December, 2023; v1 submitted 18 December, 2023;
originally announced December 2023.
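To make the storage concern in the abstract above concrete, here is a minimal sketch of plain materialisation for a single transitive rule, written as a seminaive-style loop. It is not the optimised storage scheme from the paper; the function name `materialise_transitive` and the predicate names `edge` and `path` are assumptions chosen for illustration.

```python
def materialise_transitive(edges):
    """Materialise path(x, y) :- edge(x, y) and path(x, z) :- path(x, y), edge(y, z).

    Hypothetical helper for illustration; every derived fact is stored explicitly.
    """
    # Index edges by source so the join with edge(y, z) is a lookup.
    successors = {}
    for x, y in edges:
        successors.setdefault(x, set()).add(y)

    path = set(edges)   # all facts derived so far
    delta = set(edges)  # facts derived in the previous round only
    while delta:
        new = set()
        for x, y in delta:
            for z in successors.get(y, ()):
                if (x, z) not in path:
                    new.add((x, z))
        path |= new
        delta = new
    return path


facts = materialise_transitive([("a", "b"), ("b", "c"), ("c", "d")])
print(("a", "d") in facts)  # True: the query is a lookup, no rule is applied
print(len(facts))           # 6 stored facts from 3 input edges
```

On a chain of n edges this stores n(n+1)/2 facts, which illustrates why explicitly storing every consequence of a transitive rule can become infeasible on large inputs.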
-
Enhancing Datalog Reasoning with Hypertree Decompositions
Authors:
Xinyue Zhang,
Pan Hu,
Yavor Nenov,
Ian Horrocks
Abstract:
Datalog reasoning based on the seminaïve evaluation strategy evaluates rules using traditional join plans, which often leads to redundancy and inefficiency in practice, especially when the rules are complex. Hypertree decompositions help identify efficient query plans and reduce similar redundancy in query answering. However, it is unclear how this can be applied to materialisation and incremental reasoning with recursive Datalog programs. Moreover, hypertree decompositions require additional data structures and thus introduce nonnegligible overhead in both runtime and memory consumption. In this paper, we provide algorithms that exploit hypertree decompositions for the materialisation and incremental evaluation of Datalog programs. Furthermore, we combine this approach with standard Datalog reasoning algorithms in a modular fashion so that the overhead caused by the decompositions is reduced. Our empirical evaluation shows that, when the program contains complex rules, the combined approach is usually significantly faster than the baseline approach, sometimes by orders of magnitude.
Submitted 15 May, 2023; v1 submitted 11 May, 2023;
originally announced May 2023.
-
Data Science with Vadalog: Bridging Machine Learning and Reasoning
Authors:
Luigi Bellomarini,
Ruslan R. Fayzrakhmanov,
Georg Gottlob,
Andrey Kravchenko,
Eleonora Laurenza,
Yavor Nenov,
Stephane Reissfelder,
Emanuel Sallinger,
Evgeny Sherkhonov,
Lianlong Wu
Abstract:
Following the recent successful examples of large technology companies, many modern enterprises seek to build knowledge graphs to provide a unified view of corporate knowledge and to draw deep insights using machine learning and logical reasoning. There is currently a perceived disconnect between the traditional approaches for data science, typically based on machine learning and statistical modelling, and systems for reasoning with domain knowledge. In this paper we present a state-of-the-art Knowledge Graph Management System, Vadalog, which delivers highly expressive and efficient logical reasoning and provides seamless integration with modern data science toolkits, such as the Jupyter platform. We demonstrate how to use Vadalog to perform traditional data wrangling tasks, as well as complex logical and probabilistic reasoning. We argue that this is a significant step forward towards combining machine learning and reasoning in data science.
Submitted 23 July, 2018;
originally announced July 2018.
-
Combining Rewriting and Incremental Materialisation Maintenance for Datalog Programs with Equality
Authors:
Boris Motik,
Yavor Nenov,
Robert Piro,
Ian Horrocks
Abstract:
Materialisation precomputes all consequences of a set of facts and a datalog program so that queries can be evaluated directly (i.e., independently from the program). Rewriting optimises materialisation for datalog programs with equality by replacing all equal constants with a single representative; and incremental maintenance algorithms can efficiently update a materialisation for small changes in the input facts. Both techniques are critical to practical applicability of datalog systems; however, we are unaware of an approach that combines rewriting and incremental maintenance. In this paper we present the first such combination, and we show empirically that it can speed up updates by several orders of magnitude compared to using either rewriting or incremental maintenance in isolation.
Submitted 1 May, 2015;
originally announced May 2015.
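The rewriting idea mentioned in the abstract above, replacing all equal constants with a single representative, can be sketched with a union-find structure. This is an illustrative sketch only, not the algorithm from the paper: the helper names `find`, `union` and `rewrite` are assumptions, facts are assumed to be (predicate, subject, object) triples, and only the subject and object positions are rewritten.

```python
def find(parent, c):
    """Return the representative of constant c, halving paths along the way."""
    parent.setdefault(c, c)
    while parent[c] != c:
        parent[c] = parent[parent[c]]
        c = parent[c]
    return c


def union(parent, a, b):
    """Merge the equivalence classes of a and b; the smaller root becomes the representative."""
    ra, rb = find(parent, a), find(parent, b)
    if ra != rb:
        parent[max(ra, rb)] = min(ra, rb)


def rewrite(facts, equalities):
    """Replace each constant in the facts by the representative of its equality class."""
    parent = {}
    for a, b in equalities:
        union(parent, a, b)
    return {(p, find(parent, s), find(parent, o)) for p, s, o in facts}


facts = {("likes", "b", "x"), ("likes", "c", "x"), ("likes", "a", "y")}
equalities = [("b", "a"), ("c", "b")]  # a, b and c denote the same individual
print(sorted(rewrite(facts, equalities)))
# [('likes', 'a', 'x'), ('likes', 'a', 'y')]
```

Three input facts collapse to two once `a`, `b` and `c` share one representative, which is the space saving that rewriting aims for.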
-
Handling owl:sameAs via Rewriting
Authors:
Boris Motik,
Yavor Nenov,
Robert Piro,
Ian Horrocks
Abstract:
Rewriting is widely used to optimise owl:sameAs reasoning in materialisation based OWL 2 RL systems. We investigate issues related to both the correctness and efficiency of rewriting, and present an algorithm that guarantees correctness, improves efficiency, and can be effectively parallelised. Our evaluation shows that our approach can reduce reasoning times on practical data sets by orders of magnitude.
Submitted 13 November, 2014;
originally announced November 2014.
-
Datalog Rewritability of Disjunctive Datalog Programs and its Applications to Ontology Reasoning
Authors:
Mark Kaminski,
Yavor Nenov,
Bernardo Cuenca Grau
Abstract:
We study the problem of rewriting a disjunctive datalog program into plain datalog. We show that a disjunctive program is rewritable if and only if it is equivalent to a linear disjunctive program, thus providing a novel characterisation of datalog rewritability. Motivated by this result, we propose weakly linear disjunctive datalog, a novel rule-based KR language that extends both datalog and linear disjunctive datalog and for which reasoning is tractable in data complexity. We then explore applications of weakly linear programs to ontology reasoning and propose a tractable extension of OWL 2 RL with disjunctive axioms. Our empirical results suggest that many non-Horn ontologies can be reduced to weakly linear programs and that query answering over such ontologies using a datalog engine is feasible in practice.
Submitted 11 April, 2014;
originally announced April 2014.
-
Topological Logics with Connectedness over Euclidean Spaces
Authors:
Roman Kontchakov,
Yavor Nenov,
Ian Pratt-Hartmann,
Michael Zakharyaschev
Abstract:
We consider the quantifier-free languages, Bc and Bc0, obtained by augmenting the signature of Boolean algebras with a unary predicate representing, respectively, the property of being connected, and the property of having a connected interior. These languages are interpreted over the regular closed sets of n-dimensional Euclidean space (n greater than 1) and, additionally, over the regular closed polyhedral sets of n-dimensional Euclidean space. The resulting logics are examples of formalisms that have recently been proposed in the Artificial Intelligence literature under the rubric "Qualitative Spatial Reasoning." We prove that the satisfiability problem for Bc is undecidable over the regular closed polyhedra in all dimensions greater than 1, and that the satisfiability problem for both languages is undecidable over both the regular closed sets and the regular closed polyhedra in the Euclidean plane. However, we also prove that the satisfiability problem for Bc0 is NP-complete over the regular closed sets in all dimensions greater than 2, while the corresponding problem for the regular closed polyhedra is ExpTime-complete. Our results show, in particular, that spatial reasoning over Euclidean spaces is much harder than reasoning over arbitrary topological spaces.
Submitted 18 October, 2011;
originally announced October 2011.
-
On the Decidability of Connectedness Constraints in 2D and 3D Euclidean Spaces
Authors:
Roman Kontchakov,
Yavor Nenov,
Ian Pratt-Hartmann,
Michael Zakharyaschev
Abstract:
We investigate (quantifier-free) spatial constraint languages with equality, contact and connectedness predicates as well as Boolean operations on regions, interpreted over low-dimensional Euclidean spaces. We show that the complexity of reasoning varies dramatically depending on the dimension of the space and on the type of regions considered. For example, the logic with the interior-connectedness predicate (and without contact) is undecidable over polygons or regular closed sets in the Euclidean plane, NP-complete over regular closed sets in three-dimensional Euclidean space, and ExpTime-complete over polyhedra in three-dimensional Euclidean space.
Submitted 1 April, 2011;
originally announced April 2011.