
Optimised Storage for Datalog Reasoning

Authors: Xinyue Zhang, Pan Hu, Yavor Nenov, Ian Horrocks

Abstract: Materialisation facilitates Datalog reasoning by precomputing all consequences of the facts and the rules so that queries can be directly answered over the materialised facts. However, storing all materialised facts may be infeasible in practice, especially when the rules are complex and the given set of facts is large. We observe that for certain combinations of rules, there exist data structures that compactly represent the reasoning result and can be efficiently queried when necessary. In this paper, we present a general framework that allows for the integration of such optimised storage schemes with standard materialisation algorithms. Moreover, we devise optimised storage schemes targeting at transitive rules and union rules, two types of (combination of) rules that commonly occur in practice. Our experimental evaluation shows that our approach significantly improves memory consumption, sometimes by orders of magnitude, while remaining competitive in terms of query answering time.

Submitted 19 December, 2023; v1 submitted 18 December, 2023; originally announced December 2023.

Comments: 19 pages

arXiv:2305.06854 [pdf, other]

Enhancing Datalog Reasoning with Hypertree Decompositions

Authors: Xinyue Zhang, Pan Hu, Yavor Nenov, Ian Horrocks

Abstract: Datalog reasoning based on the seminaïve evaluation strategy evaluates rules using traditional join plans, which often leads to redundancy and inefficiency in practice, especially when the rules are complex. Hypertree decompositions help identify efficient query plans and reduce similar redundancy in query answering. However, it is unclear how this can be applied to materialisation and incremental reasoning with recursive Datalog programs. Moreover, hypertree decompositions require additional data structures and thus introduce nonnegligible overhead in both runtime and memory consumption. In this paper, we provide algorithms that exploit hypertree decompositions for the materialisation and incremental evaluation of Datalog programs. Furthermore, we combine this approach with standard Datalog reasoning algorithms in a modular fashion so that the overhead caused by the decompositions is reduced. Our empirical evaluation shows that, when the program contains complex rules, the combined approach is usually significantly faster than the baseline approach, sometimes by orders of magnitude.

Submitted 15 May, 2023; v1 submitted 11 May, 2023; originally announced May 2023.

arXiv:1807.08712 [pdf, other]

Data Science with Vadalog: Bridging Machine Learning and Reasoning

Authors: Luigi Bellomarini, Ruslan R. Fayzrakhmanov, Georg Gottlob, Andrey Kravchenko, Eleonora Laurenza, Yavor Nenov, Stephane Reissfelder, Emanuel Sallinger, Evgeny Sherkhonov, Lianlong Wu

Abstract: Following the recent successful examples of large technology companies, many modern enterprises seek to build knowledge graphs to provide a unified view of corporate knowledge and to draw deep insights using machine learning and logical reasoning. There is currently a perceived disconnect between the traditional approaches for data science, typically based on machine learning and statistical modelling, and systems for reasoning with domain knowledge. In this paper we present a state-of-the-art Knowledge Graph Management System, Vadalog, which delivers highly expressive and efficient logical reasoning and provides seamless integration with modern data science toolkits, such as the Jupyter platform. We demonstrate how to use Vadalog to perform traditional data wrangling tasks, as well as complex logical and probabilistic reasoning. We argue that this is a significant step forward towards combining machine learning and reasoning in data science.

Submitted 23 July, 2018; originally announced July 2018.

arXiv:1505.00212 [pdf, other]

Combining Rewriting and Incremental Materialisation Maintenance for Datalog Programs with Equality

Authors: Boris Motik, Yavor Nenov, Robert Piro, Ian Horrocks

Abstract: Materialisation precomputes all consequences of a set of facts and a datalog program so that queries can be evaluated directly (i.e., independently from the program). Rewriting optimises materialisation for datalog programs with equality by replacing all equal constants with a single representative; and incremental maintenance algorithms can efficiently update a materialisation for small changes in the input facts. Both techniques are critical to practical applicability of datalog systems; however, we are unaware of an approach that combines rewriting and incremental maintenance. In this paper we present the first such combination, and we show empirically that it can speed up updates by several orders of magnitude compared to using either rewriting or incremental maintenance in isolation.

Submitted 1 May, 2015; originally announced May 2015.

Comments: All proofs contained in the appendix. 7 pages + 4 pages appendix. 7 algorithms and one table with evaluation results

arXiv:1411.3622 [pdf, ps, other]

Handling owl:sameAs via Rewriting

Authors: Boris Motik, Yavor Nenov, Robert Piro, Ian Horrocks

Abstract: Rewriting is widely used to optimise owl:sameAs reasoning in materialisation based OWL 2 RL systems. We investigate issues related to both the correctness and efficiency of rewriting, and present an algorithm that guarantees correctness, improves efficiency, and can be effectively parallelised. Our evaluation shows that our approach can reduce reasoning times on practical data sets by orders of magnitude.

Submitted 13 November, 2014; originally announced November 2014.

Comments: This is the technical report supporting the AAAI 2015 Conference submission with the same title

arXiv:1404.3141 [pdf, other]

Datalog Rewritability of Disjunctive Datalog Programs and its Applications to Ontology Reasoning

Authors: Mark Kaminski, Yavor Nenov, Bernardo Cuenca Grau

Abstract: We study the problem of rewriting a disjunctive datalog program into plain datalog. We show that a disjunctive program is rewritable if and only if it is equivalent to a linear disjunctive program, thus providing a novel characterisation of datalog rewritability. Motivated by this result, we propose weakly linear disjunctive datalog---a novel rule-based KR language that extends both datalog and linear disjunctive datalog and for which reasoning is tractable in data complexity. We then explore applications of weakly linear programs to ontology reasoning and propose a tractable extension of OWL 2 RL with disjunctive axioms. Our empirical results suggest that many non-Horn ontologies can be reduced to weakly linear programs and that query answering over such ontologies using a datalog engine is feasible in practice.

Submitted 11 April, 2014; originally announced April 2014.

Comments: 14 pages. To appear at AAAI-14

arXiv:1110.4034 [pdf, other]

doi 10.1145/2480759.2480765

Topological Logics with Connectedness over Euclidean Spaces

Authors: Roman Kontchakov, Yavor Nenov, Ian Pratt-Hartmann, Michael Zakharyaschev

Abstract: We consider the quantifier-free languages, Bc and Bc0, obtained by augmenting the signature of Boolean algebras with a unary predicate representing, respectively, the property of being connected, and the property of having a connected interior. These languages are interpreted over the regular closed sets of n-dimensional Euclidean space (n greater than 1) and, additionally, over the regular closed polyhedral sets of n-dimensional Euclidean space. The resulting logics are examples of formalisms that have recently been proposed in the Artificial Intelligence literature under the rubric "Qualitative Spatial Reasoning." We prove that the satisfiability problem for Bc is undecidable over the regular closed polyhedra in all dimensions greater than 1, and that the satisfiability problem for both languages is undecidable over both the regular closed sets and the regular closed polyhedra in the Euclidean plane. However, we also prove that the satisfiability problem for Bc0 is NP-complete over the regular closed sets in all dimensions greater than 2, while the corresponding problem for the regular closed polyhedra is ExpTime-complete. Our results show, in particular, that spatial reasoning over Euclidean spaces is much harder than reasoning over arbitrary topological spaces.

Submitted 18 October, 2011; originally announced October 2011.

MSC Class: 68T30 (Primary) 03D15; 68Q17 (Secondary)

ACM Class: I.2.4; F.4.3; F.2.2

Journal ref: ACM Transactions on Computational Logic, 14(2:13), 2013

arXiv:1104.0219 [pdf, ps, other]

On the Decidability of Connectedness Constraints in 2D and 3D Euclidean Spaces

Authors: Roman Kontchakov, Yavor Nenov, Ian Pratt-Hartmann, Michael Zakharyaschev

Abstract: We investigate (quantifier-free) spatial constraint languages with equality, contact and connectedness predicates as well as Boolean operations on regions, interpreted over low-dimensional Euclidean spaces. We show that the complexity of reasoning varies dramatically depending on the dimension of the space and on the type of regions considered. For example, the logic with the interior-connectedness predicate (and without contact) is undecidable over polygons or regular closed sets in the Euclidean plane, NP-complete over regular closed sets in three-dimensional Euclidean space, and ExpTime-complete over polyhedra in three-dimensional Euclidean space.

Submitted 1 April, 2011; originally announced April 2011.

Comments: Accepted for publication in the IJCAI 2011 proceedings

Showing 1–8 of 8 results for author: Nenov, Y