Learning Guided Automated Reasoning: A Brief Survey

Authors: Lasse Blaauwbroek, David Cerna, Thibault Gauthier, Jan Jakubův, Cezary Kaliszyk, Martin Suda, Josef Urban

Abstract: Automated theorem provers and formal proof assistants are general reasoning systems that are in theory capable of proving arbitrarily hard theorems, thus solving arbitrary problems reducible to mathematics and logical reasoning. In practice, however, such systems face a large combinatorial explosion, and therefore include many heuristics and choice points that considerably influence their performance. This is an opportunity for trained machine learning predictors, which can guide the work of such reasoning systems. Conversely, deductive search supported by the notion of logically valid proof allows one to train machine learning systems on large reasoning corpora. Such bodies of proof are usually correct by construction and when combined with more and more precise trained guidance they can be bootstrapped into very large corpora, with increasingly long reasoning chains and possibly novel proof ideas. In this paper we provide an overview of several automated reasoning and theorem proving domains and the learning and AI methods that have been so far developed for them. These include premise selection, proof guidance in several settings, AI systems and feedback loops iterating between reasoning and learning, and symbolic classification problems.

Submitted 6 March, 2024; originally announced March 2024.

arXiv:2303.06686 [pdf, other]

MizAR 60 for Mizar 50

Authors: Jan Jakubův, Karel Chvalovský, Zarathustra Goertzel, Cezary Kaliszyk, Mirek Olšák, Bartosz Piotrowski, Stephan Schulz, Martin Suda, Josef Urban

Abstract: As a present to Mizar on its 50th anniversary, we develop an AI/TP system that automatically proves about 60% of the Mizar theorems in the hammer setting. We also automatically prove 75% of the Mizar theorems when the automated provers are helped by using only the premises used in the human-written Mizar proofs. We describe the methods and large-scale experiments leading to these results. This includes in particular the E and Vampire provers, their ENIGMA and Deepire learning modifications, a number of learning-based premise selection methods, and the incremental loop that interleaves growing a corpus of millions of ATP proofs with training increasingly strong AI/TP systems on them. We also present a selection of Mizar problems that were proved automatically.

Submitted 12 March, 2023; originally announced March 2023.

arXiv:2208.06652 [pdf, other]

Differentiable Inductive Logic Programming in High-Dimensional Space

Authors: Stanisław J. Purgał, David M. Cerna, Cezary Kaliszyk

Abstract: Synthesizing large logic programs through symbolic Inductive Logic Programming (ILP) typically requires intermediate definitions. However, cluttering the hypothesis space with intensional predicates typically degrades performance. In contrast, gradient descent provides an efficient way to find solutions within such high-dimensional spaces. Neuro-symbolic ILP approaches have not fully exploited this so far. We propose extending the δILP approach to inductive synthesis with large-scale predicate invention, thus allowing us to exploit the efficacy of high-dimensional gradient descent. We show that large-scale predicate invention benefits differentiable inductive synthesis through gradient descent and allows one to learn solutions for tasks beyond the capabilities of existing neuro-symbolic ILP systems. Furthermore, we achieve these results without specifying the precise structure of the solution within the language bias.

Submitted 19 August, 2023; v1 submitted 13 August, 2022; originally announced August 2022.

Comments: 8 pages, under review

arXiv:2205.06640 [pdf, ps, other]

Lash 1.0 (System Description)

Authors: Chad E. Brown, Cezary Kaliszyk

Abstract: Lash is a higher-order automated theorem prover created as a fork of the theorem prover Satallax. The basic underlying calculus of Satallax is a ground tableau calculus whose rules only use shallow information about the terms and formulas taking part in the rule. Lash uses new, efficient C representations of vital structures and operations. Most importantly, Lash uses a C representation of (normal) terms with perfect sharing along with a C implementation of normalizing substitutions. We describe the ways in which Lash differs from Satallax and the performance improvement of Lash over Satallax when used with analogous flag settings. With a 10s timeout Lash outperforms Satallax on a collection of TH0 problems from the TPTP. We conclude with ideas for continuing the development of Lash.

Submitted 13 May, 2022; originally announced May 2022.

Journal ref: IJCAR 2022 Conference Submission

arXiv:2205.01981 [pdf, other]

The Isabelle ENIGMA

Authors: Zarathustra A. Goertzel, Jan Jakubův, Cezary Kaliszyk, Miroslav Olšák, Jelle Piepenbrock, Josef Urban

Abstract: We significantly improve the performance of the E automated theorem prover on the Isabelle Sledgehammer problems by combining learning and theorem proving in several ways. In particular, we develop targeted versions of the ENIGMA guidance for the Isabelle problems, targeted versions of neural premise selection, and targeted strategies for E. The methods are trained in several iterations over hundreds of thousands of untyped and typed first-order problems extracted from Isabelle. Our final best single-strategy ENIGMA and premise selection system improves the best previous version of E by 25.3% in 15 seconds, also outperforming all other previous ATP and SMT systems.

Submitted 4 May, 2022; originally announced May 2022.

Comments: 21 pages, 12 tables, ITP 2022

arXiv:2204.12311 [pdf, other]

Formalizing a Diophantine Representation of the Set of Prime Numbers

Authors: Karol Pąk, Cezary Kaliszyk

Abstract: The DPRM (Davis-Putnam-Robinson-Matiyasevich) theorem is the main step in the negative resolution of Hilbert's 10th problem. Almost three decades of work on the problem have resulted in several equally surprising results. These include the existence of diophantine equations with a reduced number of variables, as well as the explicit construction of polynomials that represent specific sets, in particular the set of primes. In this work, we formalize these constructions in the Mizar system. We focus on the set of prime numbers and its explicit representation using 10 variables. It is the smallest representation known today. For this, we show that the exponential function is diophantine, together with the same properties for the binomial coefficient and factorial. This formalization is the next step in the research on formal approaches to diophantine sets following the DPRM theorem.

Submitted 26 April, 2022; originally announced April 2022.

Journal ref: ITP 2022 Conference Paper

arXiv:2204.02737 [pdf, other]

Adversarial Learning to Reason in an Arbitrary Logic

Authors: Stanisław J. Purgał, Cezary Kaliszyk

Abstract: Existing approaches to learning to prove theorems focus on particular logics and datasets. In this work, we propose Monte-Carlo simulations guided by reinforcement learning that can work in an arbitrarily specified logic, without any human knowledge or set of problems. Since the algorithm does not need any training dataset, it is able to learn to work with any logical foundation, even when there is no body of proofs or even conjectures available. We practically demonstrate the feasibility of the approach in multiple logical systems. The approach is stronger than training on randomly generated data but weaker than the approaches trained on tailored axiom and conjecture sets. It however allows us to apply machine learning to automated theorem proving for many logics, where no such attempts have been tried to date, such as intuitionistic logic or linear logic.

Submitted 6 April, 2022; originally announced April 2022.

Journal ref: FLAIRS 2022

arXiv:2112.14603 [pdf, other]

doi 10.24963/ijcai.2022/378

Learning Higher-Order Programs without Meta-Interpretive Learning

Authors: Stanisław J. Purgał, David M. Cerna, Cezary Kaliszyk

Abstract: Learning complex programs through inductive logic programming (ILP) remains a formidable challenge. Existing higher-order enabled ILP systems show improved accuracy and learning performance, though remain hampered by the limitations of the underlying learning mechanism. Experimental results show that our extension of the versatile Learning From Failures paradigm by higher-order definitions significantly improves learning performance without the burdensome human guidance required by existing systems. Our theoretical framework captures a class of higher-order definitions preserving soundness of existing subsumption-based pruning methods.

Submitted 15 May, 2022; v1 submitted 29 December, 2021; originally announced December 2021.

Comments: Accepted at IJCAI 2022

arXiv:2107.10188 [pdf, ps, other]

JEFL: Joint Embedding of Formal Proof Libraries

Authors: Qingxiang Wang, Cezary Kaliszyk

Abstract: The heterogeneous nature of the logical foundations used in different interactive proof assistant libraries has rendered discovery of similar mathematical concepts among them difficult. In this paper, we compare a previously proposed algorithm for matching concepts across libraries with our unsupervised embedding approach that can help us retrieve similar concepts. Our approach is based on the fasttext implementation of Word2Vec, on top of which a tree traversal module is added to adapt its algorithm to the representation format of our data export pipeline. We compare the explainability, customizability, and online-servability of the approaches and argue that the neural embedding approach has more potential to be integrated into an interactive proof assistant.

Submitted 21 July, 2021; originally announced July 2021.

Comments: Submission to FroCoS 2021

arXiv:2104.05207 [pdf, other]

Online Machine Learning Techniques for Coq: A Comparison

Authors: Liao Zhang, Lasse Blaauwbroek, Bartosz Piotrowski, Prokop Černý, Cezary Kaliszyk, Josef Urban

Abstract: We present a comparison of several online machine learning techniques for tactical learning and proving in the Coq proof assistant. This work builds on top of Tactician, a plugin for Coq that learns from proofs written by the user to synthesize new proofs. Learning happens in an online manner, meaning that Tactician's machine learning model is updated immediately every time the user performs a step in an interactive proof. This has important advantages compared to the more studied offline learning systems: (1) it provides the user with a seamless, interactive experience with Tactician, and (2) it takes advantage of locality of proof similarity, which means that proofs similar to the current proof are likely to be found close by. We implement two online methods, namely approximate k-nearest neighbors based on locality sensitive hashing forests and random decision forests. Additionally, we conduct experiments with gradient boosted trees in an offline setting using XGBoost. We compare the relative performance of Tactician using these three learning methods on Coq's standard library.

Submitted 7 June, 2021; v1 submitted 12 April, 2021; originally announced April 2021.

Comments: Intelligent Computer Mathematics 14th International Conference, CICM 2021

arXiv:2101.11716 [pdf, ps, other]

Disambiguating Symbolic Expressions in Informal Documents

Authors: Dennis Müller, Cezary Kaliszyk

Abstract: We propose the task of disambiguating symbolic expressions in informal STEM documents in the form of LaTeX files - that is, determining their precise semantics and abstract syntax tree - as a neural machine translation task. We discuss the distinct challenges involved and present a dataset with roughly 33,000 entries. We evaluated several baseline models on this dataset, which failed to yield even syntactically valid LaTeX before overfitting. Consequently, we describe a methodology using a transformer language model pre-trained on sources obtained from arxiv.org, which yields promising results despite the small size of the dataset. We evaluate our model using a plurality of dedicated techniques, taking the syntax and semantics of symbolic expressions into account.

Submitted 25 January, 2021; originally announced January 2021.

Comments: ICLR 2021 conference paper

arXiv:2101.09142 [pdf, other]

doi 10.1093/logcom/exab006

A Study of Continuous Vector Representations for Theorem Proving

Authors: Stanisław Purgał, Julian Parsert, Cezary Kaliszyk

Abstract: Applying machine learning to mathematical terms and formulas requires a suitable representation of formulas that is adequate for AI methods. In this paper, we develop an encoding that allows for logical properties to be preserved and is additionally reversible. This means that the tree shape of a formula including all symbols can be reconstructed from the dense vector representation. We do that by training two decoders: one that extracts the top symbol of the tree and one that extracts embedding vectors of subtrees. The syntactic and semantic logical properties that we aim to preserve include structural formula properties, applicability of natural deduction steps, and even more complex operations like unifiability. We propose datasets that can be used to train these syntactic and semantic properties. We evaluate the viability of the developed encoding across the proposed datasets as well as for the practical theorem proving problem of premise selection in the Mizar corpus.

Submitted 22 January, 2021; originally announced January 2021.

arXiv:2005.12876 [pdf, ps, other]

A Survey of Languages for Formalizing Mathematics

Authors: Cezary Kaliszyk, Florian Rabe

Abstract: In order to work with mathematical content in computer systems, it is necessary to represent it in formal languages. Ideally, these are supported by tools that verify the correctness of the content, allow computing with it, and produce human-readable documents. These goals are challenging to combine and state-of-the-art tools typically have to make difficult compromises. In this paper we discuss languages that have been created for this purpose, including logical languages of proof assistants and other formal systems, semi-formal languages, intermediate languages for exchanging mathematical knowledge, and language frameworks that allow building customized languages. We evaluate their advantages based on our experience in designing and applying languages and tools for formalizing mathematics. We reach the conclusion that no existing language is truly good enough yet and derive ideas for possible future improvements.

Submitted 26 May, 2020; originally announced May 2020.

Comments: CICM 2020 conference paper preprint

arXiv:1912.02636 [pdf, other]

doi 10.1145/3372885.3373827

Exploration of Neural Machine Translation in Autoformalization of Mathematics in Mizar

Authors: Qingxiang Wang, Chad Brown, Cezary Kaliszyk, Josef Urban

Abstract: In this paper we share several experiments trying to automatically translate informal mathematics into formal mathematics. In our context informal mathematics refers to human-written mathematical sentences in the LaTeX format; and formal mathematics refers to statements in the Mizar language. We conducted our experiments against three established neural network-based machine translation models that are known to deliver competitive results on translating between natural languages. To train these models we also prepared four informal-to-formal datasets. We compare and analyze our results according to whether the model is supervised or unsupervised. In order to augment the data available for auto-formalization and improve the results, we develop a custom type-elaboration mechanism and integrate it in the supervised translation.

Submitted 13 December, 2019; v1 submitted 5 December, 2019; originally announced December 2019.

Comments: The 9th ACM SIGPLAN International Conference on Certified Programs and Proofs

arXiv:1911.12073 [pdf, other]

doi 10.3233/FAIA200244

Property Invariant Embedding for Automated Reasoning

Authors: Miroslav Olšák, Cezary Kaliszyk, Josef Urban

Abstract: Automated reasoning and theorem proving have recently become major challenges for machine learning. In other domains, representations that are able to abstract over unimportant transformations, such as abstraction over translations and rotations in vision, are becoming more common. Standard methods of embedding mathematical formulas for learning theorem proving are however yet unable to handle many important transformations. In particular, embedding previously unseen labels, that often arise in definitional encodings and in Skolemization, has been very weak so far. Similar problems appear when transferring knowledge between known symbols. We propose a novel encoding of formulas that extends existing graph neural network models. This encoding represents symbols only by nodes in the graph, without giving the network any knowledge of the original labels. We provide additional links between such nodes that allow the network to recover the meaning and therefore correctly embed such nodes irrespective of the given labels. We test the proposed encoding in an automated theorem prover based on the tableaux connection calculus, and show that it improves on the best characterizations used so far. The encoding is further evaluated on the premise selection task and a newly introduced symbol guessing task, and shown to correctly predict 65% of the symbol names.

Submitted 27 November, 2019; originally announced November 2019.

Journal ref: ECAI 2020 - 24th European Conference on Artificial Intelligence

arXiv:1911.04873 [pdf, ps, other]

Can Neural Networks Learn Symbolic Rewriting?

Authors: Bartosz Piotrowski, Josef Urban, Chad E. Brown, Cezary Kaliszyk

Abstract: This work investigates whether current neural architectures are adequate for learning symbolic rewriting. Two kinds of data sets are proposed for this research -- one based on automated proofs and the other being a synthetic set of polynomial terms. Experiments using current neural machine translation models are performed and their results are discussed. Ideas for extending this line of research are proposed, and its relevance is motivated.

Submitted 26 May, 2020; v1 submitted 7 November, 2019; originally announced November 2019.

arXiv:1905.13100 [pdf, other]

Towards Finding Longer Proofs

Authors: Zsolt Zombori, Adrián Csiszárik, Henryk Michalewski, Cezary Kaliszyk, Josef Urban

Abstract: We present a reinforcement learning (RL) based guidance system for automated theorem proving geared towards Finding Longer Proofs (FLoP). Unlike most learning based approaches, we focus on generalising from very little training data and achieving near complete confidence. We use several simple, structured datasets with very long proofs to show that FLoP can successfully generalise a single training proof to a large class of related problems. On these benchmarks, FLoP is competitive with strong theorem provers despite using very limited search, due to its ability to solve problems that are prohibitively long for other systems.

Submitted 29 June, 2021; v1 submitted 30 May, 2019; originally announced May 2019.

Comments: 16 pages, 3 figures, published at TABLEAUX2021

arXiv:1903.02539 [pdf, ps, other]

doi 10.1007/978-3-030-29436-6_8

GRUNGE: A Grand Unified ATP Challenge

Authors: Chad E. Brown, Thibault Gauthier, Cezary Kaliszyk, Geoff Sutcliffe, Josef Urban

Abstract: This paper describes a large set of related theorem proving problems obtained by translating theorems from the HOL4 standard library into multiple logical formalisms. The formalisms are in higher-order logic (with and without type variables) and first-order logic (possibly with multiple types, and possibly with type variables). The resultant problem sets allow us to run automated theorem provers that support different logical formats on corresponding problems, and compare their performances. This also results in a new "grand unified" large theory benchmark that emulates the ITP/ATP hammer setting, where systems and metasystems can use multiple ATP formalisms in complementary ways, and jointly learn from the accumulated knowledge.

Submitted 19 November, 2019; v1 submitted 6 March, 2019; originally announced March 2019.

Comments: CADE 27 -- 27th International Conference on Automated Deduction

arXiv:1808.06413 [pdf, ps, other]

doi 10.1007/978-3-319-96812-4_5

Concrete Semantics with Coq and CoqHammer

Authors: Łukasz Czajka, Burak Ekici, Cezary Kaliszyk

Abstract: The "Concrete Semantics" book gives an introduction to imperative programming languages accompanied by an Isabelle/HOL formalization. In this paper we discuss a re-formalization of the book using the Coq proof assistant. In order to achieve a similar brevity of the formal text we extensively use CoqHammer, as well as Coq Ltac-level automation. We compare the formalization efficiency, compactness, and the readability of the proof scripts originating from a Coq re-formalization of two chapters from the book.

Submitted 20 August, 2018; originally announced August 2018.

arXiv:1805.07563 [pdf, ps, other]

Reinforcement Learning of Theorem Proving

Authors: Cezary Kaliszyk, Josef Urban, Henryk Michalewski, Mirek Olšák

Abstract: We introduce a theorem proving algorithm that uses practically no domain heuristics for guiding its connection-style proof search. Instead, it runs many Monte-Carlo simulations guided by reinforcement learning from previous proof attempts. We produce several versions of the prover, parameterized by different learning and guiding algorithms. The strongest version of the system is trained on a large corpus of mathematical problems and evaluated on previously unseen problems. The trained system solves within the same number of inferences over 40% more problems than a baseline prover, which is an unusually high improvement in this hard AI domain. To our knowledge this is the first time reinforcement learning has been convincingly applied to solving general mathematical problems on a large scale.

Submitted 19 May, 2018; originally announced May 2018.

arXiv:1805.06502 [pdf, other]

First Experiments with Neural Translation of Informal to Formal Mathematics

Authors: Qingxiang Wang, Cezary Kaliszyk, Josef Urban

Abstract: We report on our experiments to train deep neural networks that automatically translate informalized LaTeX-written Mizar texts into the formal Mizar language. To the best of our knowledge, this is the first time that neural networks have been adopted in the formalization of mathematics. Using Luong et al.'s neural machine translation model (NMT), we tested our aligned informal-formal corpora against various hyperparameters and evaluated their results. Our experiments show that our best performing model configurations are able to generate correct Mizar statements on 65.73% of the inference data, with the union of all models covering 79.17%. These results indicate that formalization through artificial neural networks is a promising approach for automated formalization of mathematics. We present several case studies to illustrate our results.

Submitted 11 June, 2018; v1 submitted 10 May, 2018; originally announced May 2018.

Comments: Submission to CICM'2018

arXiv:1805.03107 [pdf, ps, other]

Machine Learning Guidance and Proof Certification for Connection Tableaux

Authors: Michael Färber, Cezary Kaliszyk, Josef Urban

Abstract: Connection calculi allow for very compact implementations of goal-directed proof search. We give an overview of our work related to connection tableaux calculi: First, we show optimised functional implementations of clausal and nonclausal proof search, including a consistent Skolemisation procedure for machine learning. Then, we show two guidance methods based on machine learning, namely reordering of proof steps with Naive Bayesian probabilities, and expansion of a proof search tree with Monte Carlo Tree Search. Finally, we give a translation of connection proofs to LK, enabling proof certification and automatic proof search in interactive theorem provers.

Submitted 15 May, 2018; v1 submitted 8 May, 2018; originally announced May 2018.

Comments: Submitted to JAR

arXiv:1804.00596 [pdf, ps, other]

doi 10.1007/s10817-020-09580-x

TacticToe: Learning to Prove with Tactics

Authors: Thibault Gauthier, Cezary Kaliszyk, Josef Urban, Ramana Kumar, Michael Norrish

Abstract: We implement an automated tactical prover TacticToe on top of the HOL4 interactive theorem prover. TacticToe learns from human proofs which mathematical technique is suitable in each proof situation. This knowledge is then used in a Monte Carlo tree search algorithm to explore promising tactic-level proof paths. On a single CPU, with a time limit of 60 seconds, TacticToe proves 66.4 percent of the 7164 theorems in HOL4's standard library, whereas E prover with auto-schedule solves 34.5 percent. The success rate rises to 69.0 percent by combining the results of TacticToe and E prover.

Submitted 1 December, 2021; v1 submitted 2 April, 2018; originally announced April 2018.

Journal ref: J. Automated Reasoning 65(2): 257-286, 2021

arXiv:1804.00595 [pdf, ps, other]

doi 10.29007/ntlb

Learning to Reason with HOL4 tactics

Authors: Thibault Gauthier, Cezary Kaliszyk, Josef Urban

Abstract: Techniques combining machine learning with translation to automated reasoning have recently become an important component of formal proof assistants. Such "hammer" techniques complement traditional proof assistant automation as implemented by tactics and decision procedures. In this paper we present a unified proof assistant automation approach which attempts to automate the selection of appropriate tactics and tactic-sequences combined with an optimized small-scale hammering approach. We implement the technique as a tactic-level automation for HOL4: TacticToe. It implements a modified A*-algorithm directly in HOL4 that explores different tactic-level proof paths, guiding their selection by learning from a large number of previous tactic-level proofs. Unlike the existing hammer methods, TacticToe avoids translation to FOL, working directly on the HOL level. By combining tactic prediction and premise selection, TacticToe is able to re-prove 39 percent of 7902 HOL4 theorems in 5 seconds whereas the best single HOL(y)Hammer strategy solves 32 percent in the same amount of time.

Submitted 2 April, 2018; originally announced April 2018.

Comments: LPAR-21. 21st International Conference on Logic for Programming, Artificial Intelligence and Reasoning. EasyChair 2017

arXiv:1703.00426 [pdf, other]

HolStep: A Machine Learning Dataset for Higher-order Logic Theorem Proving

Authors: Cezary Kaliszyk, François Chollet, Christian Szegedy

Abstract: Large computer-understandable proofs consist of millions of intermediate logical steps. The vast majority of such steps originate from manually selected and manually guided heuristics applied to intermediate goals. So far, machine learning has generally not been used to filter or generate these steps. In this paper, we introduce a new dataset based on Higher-Order Logic (HOL) proofs, for the purpose of developing new machine learning-based theorem-proving strategies. We make this dataset publicly available under the BSD license. We propose various machine learning tasks that can be performed on this dataset, and discuss their significance for theorem proving. We also benchmark a set of simple baseline machine learning models suited for the tasks (including logistic regression, convolutional neural networks and recurrent neural networks). The results of our baseline models show the promise of applying machine learning to HOL theorem proving.

Submitted 1 March, 2017; originally announced March 2017.

arXiv:1701.06972 [pdf, other]

Deep Network Guided Proof Search

Authors: Sarah Loos, Geoffrey Irving, Christian Szegedy, Cezary Kaliszyk

Abstract: Deep learning techniques lie at the heart of several significant AI advances in recent years including object recognition and detection, image captioning, machine translation, speech recognition and synthesis, and playing the game of Go. Automated first-order theorem provers can aid in the formalization and verification of mathematical theorems and play a crucial role in program analysis, theory reasoning, security, interpolation, and system verification. Here we suggest deep learning based guidance in the proof search of the theorem prover E. We train and compare several deep neural network models on the traces of existing ATP proofs of Mizar statements and use them to select processed clauses during proof search. We give experimental evidence that with a hybrid, two-phase approach, deep learning based guidance can significantly reduce the average number of proof search steps while increasing the number of theorems proved. Using a few proof guidance strategies that leverage deep neural networks, we have found first-order proofs of 7.36% of the first-order logic translations of the Mizar Mathematical Library theorems that did not previously have ATP generated proofs. This increases the ratio of statements in the corpus with ATP generated proofs from 56% to 59%.

Submitted 24 January, 2017; originally announced January 2017.

Journal ref: In Thomas Eiter and David Sands, editors, 21st International Conference on Logic for Programming, Artificial Intelligence and Reasoning (LPAR-21). EPiC Series in Computing, vol. 46, pages 85-105, EasyChair, 2017. ISSN 2398-7340

arXiv:1611.09703 [pdf, other]

Semantic Parsing of Mathematics by Context-based Learning from Aligned Corpora and Theorem Proving

Authors: Cezary Kaliszyk, Josef Urban, Jiří Vyskočil

Abstract: We study methods for automated parsing of informal mathematical expressions into formal ones, a main prerequisite for deep computer understanding of informal mathematical texts. We propose a context-based parsing approach that combines efficient statistical learning of deep parse trees with their semantic pruning by type checking and large-theory automated theorem proving. We show that the methods very significantly improve on previous results in parsing theorems from the Flyspeck corpus.

Submitted 29 November, 2016; originally announced November 2016.

arXiv:1611.05990 [pdf, other]

doi 10.1007/978-3-319-63046-5_34

Monte Carlo Tableau Proof Search

Authors: Michael Färber, Cezary Kaliszyk, Josef Urban

Abstract: We study Monte Carlo Tree Search to guide proof search in tableau calculi. This includes proposing a number of proof-state evaluation heuristics, some of which are learnt from previous proofs. We present an implementation based on the leanCoP prover. The system is trained and evaluated on a large suite of related problems coming from the Mizar proof assistant, showing that it is capable of finding new and different proofs.

Submitted 14 June, 2019; v1 submitted 18 November, 2016; originally announced November 2016.

Journal ref: Proceedings of the 26th International Conference on Automated Deduction, CADE 2017

arXiv:1606.05946 [pdf, ps, other]

doi 10.4204/EPTCS.210.4

Goal Translation for a Hammer for Coq (Extended Abstract)

Authors: Łukasz Czajka, Cezary Kaliszyk

Abstract: Hammers are tools that provide general purpose automation for formal proof assistants. Despite the gaining popularity of the more advanced versions of type theory, there are no hammers for such systems. We present an extension of the various hammer components to type theory: (i) a translation of a significant part of the Coq logic into the format of automated proof systems; (ii) a proof reconstruction mechanism based on a Ben-Yelles-type algorithm combined with limited rewriting, congruence closure and a first-order generalization of the left rules of Dyckhoff's system LJT.

Submitted 19 June, 2016; originally announced June 2016.

Comments: In Proceedings HaTT 2016, arXiv:1606.05427

Journal ref: EPTCS 210, 2016, pp. 13-20

arXiv:1606.05427

doi 10.4204/EPTCS.210

Proceedings First International Workshop on Hammers for Type Theories

Authors: Jasmin Christian Blanchette, Cezary Kaliszyk

Abstract: This volume of EPTCS contains the proceedings of the First Workshop on Hammers for Type Theories (HaTT 2016), held on 1 July 2016 as part of the International Joint Conference on Automated Reasoning (IJCAR 2016) in Coimbra, Portugal. The proceedings contain four regular papers, as well as abstracts of the two invited talks by Pierre Corbineau (Verimag, France) and Aleksy Schubert (University of Warsaw, Poland).

Submitted 17 June, 2016; originally announced June 2016.

Journal ref: EPTCS 210, 2016

arXiv:1509.03534 [pdf, ps, other]

doi 10.1145/2676724.2693173

Premise Selection and External Provers for HOL4

Authors: Thibault Gauthier, Cezary Kaliszyk

Abstract: Learning-assisted automated reasoning has recently gained popularity among the users of Isabelle/HOL, HOL Light, and Mizar. In this paper, we present an add-on to the HOL4 proof assistant and an adaptation of the HOLyHammer system that provides machine learning-based premise selection and automated reasoning also for HOL4. We efficiently record the HOL4 dependencies and extract features from the theorem statements, which form a basis for premise selection. HOLyHammer transforms the HOL4 statements in the various TPTP-ATP proof formats, which are then processed by the ATPs. We discuss the different evaluation settings: ATPs, accessible lemmas, and premise numbers. We measure the performance of HOLyHammer on the HOL4 standard library. The results are combined accordingly and compared with the HOL Light experiments, showing a comparably high quality of predictions. The system directly benefits HOL4 users by automatically finding proof dependencies that can be reconstructed by Metis.

Submitted 11 September, 2015; originally announced September 2015.

arXiv:1509.03527 [pdf, ps, other]

Sharing HOL4 and HOL Light proof knowledge

Authors: Thibault Gauthier, Cezary Kaliszyk

Abstract: New proof assistant developments often involve concepts similar to already formalized ones. When proving their properties, a human can often take inspiration from the existing formalized proofs available in other provers or libraries. In this paper we propose and evaluate a number of methods, which strengthen proof automation by learning from proof libraries of different provers. Certain conjectures can be proved directly from the dependencies induced by similar proofs in the other library. Even if exact correspondences are not found, learning-reasoning systems can make use of the association between proved theorems and their characteristics to predict the relevant premises. Such external help can be further combined with internal advice. We evaluate the proposed knowledge-sharing methods by reproving the HOL Light and HOL4 standard libraries. The learning-reasoning system HOL(y)Hammer, whose single best strategy could automatically find proofs for 30% of the HOL Light problems, can prove 40% with the knowledge from HOL4.

Submitted 11 September, 2015; originally announced September 2015.

arXiv:1507.08375

doi 10.4204/EPTCS.186

Proceedings Fourth Workshop on Proof eXchange for Theorem Proving

Authors: Cezary Kaliszyk, Andrei Paskevich

Abstract: This volume of EPTCS contains the proceedings of the Fourth Workshop on Proof Exchange for Theorem Proving (PxTP 2015), held as part of the International Conference on Automated Deduction (CADE 2015) on August 2-3, 2015 in Berlin. The PxTP workshop series brings together researchers working on various aspects of communication, integration, and cooperation between reasoning systems and formalisms. These proceedings contain seven regular papers, as well as the abstracts of the invited talks by Georges Gonthier (Microsoft Research) and Bart Jacobs (University of Leuven).

Submitted 30 July, 2015; originally announced July 2015.

Journal ref: EPTCS 186, 2015

arXiv:1501.02155 [pdf, ps, other]

A formal proof of the Kepler conjecture

Authors: Thomas Hales, Mark Adams, Gertrud Bauer, Dat Tat Dang, John Harrison, Truong Le Hoang, Cezary Kaliszyk, Victor Magron, Sean McLaughlin, Thang Tat Nguyen, Truong Quang Nguyen, Tobias Nipkow, Steven Obua, Joseph Pleso, Jason Rute, Alexey Solovyev, An Hoai Thi Ta, Trung Nam Tran, Diep Thi Trieu, Josef Urban, Ky Khac Vu, Roland Zumkeller

Abstract: This article describes a formal proof of the Kepler conjecture on dense sphere packings in a combination of the HOL Light and Isabelle proof assistants. This paper constitutes the official published account of the now completed Flyspeck project.

Submitted 9 January, 2015; originally announced January 2015.

Comments: 21 pages

arXiv:1410.5476 [pdf, ps, other]

Certified Connection Tableaux Proofs for HOL Light and TPTP

Authors: Cezary Kaliszyk, Josef Urban, Jiri Vyskocil

Abstract: In the recent years, the Metis prover based on ordered paramodulation and model elimination has replaced the earlier built-in methods for general-purpose proof automation in HOL4 and Isabelle/HOL. In the annual CASC competition, the leanCoP system based on connection tableaux has however performed better than Metis. In this paper we show how the leanCoP's core algorithm can be implemented inside HOLLight. leanCoP's flagship feature, namely its minimalistic core, results in a very simple proof system. This plays a crucial role in extending the MESON proof reconstruction mechanism to connection tableaux proofs, providing an implementation of leanCoP that certifies its proofs. We discuss the differences between our direct implementation using an explicit Prolog stack, to the continuation passing implementation of MESON present in HOLLight and compare their performance on all core HOLLight goals. The resulting prover can be also used as a general purpose TPTP prover. We compare its performance against the resolution based Metis on TPTP and other interesting datasets.

Submitted 20 October, 2014; originally announced October 2014.

arXiv:1410.5467 [pdf, ps, other]

Machine Learning of Coq Proof Guidance: First Experiments

Authors: Cezary Kaliszyk, Lionel Mamane, Josef Urban

Abstract: We report the results of the first experiments with learning proof dependencies from the formalizations done with the Coq system. We explain the process of obtaining the dependencies from the Coq proofs, the characterization of formulas that is used for the learning, and the evaluation method. Various machine learning methods are compared on a dataset of 5021 toplevel Coq proofs coming from the CoRN repository. The best resulting method covers on average 75% of the needed proof dependencies among the first 100 predictions, which is a comparable performance of such initial experiments on other large-theory corpora.

Submitted 20 October, 2014; originally announced October 2014.

arXiv:1406.1559 [pdf, ps, other]

doi 10.4204/EPTCS.152.6

Initial Experiments with TPTP-style Automated Theorem Provers on ACL2 Problems

Authors: Sebastiaan Joosten, Cezary Kaliszyk, Josef Urban

Abstract: This paper reports our initial experiments with using external ATP on some corpora built with the ACL2 system. This is intended to provide the first estimate about the usefulness of such external reasoning and AI systems for solving ACL2 problems.

Submitted 5 June, 2014; originally announced June 2014.

Comments: In Proceedings ACL2 2014, arXiv:1406.1238

Journal ref: EPTCS 152, 2014, pp. 77-85

arXiv:1405.3906 [pdf, ps, other]

Matching concepts across HOL libraries

Authors: Thibault Gauthier, Cezary Kaliszyk

Abstract: Many proof assistant libraries contain formalizations of the same mathematical concepts. The concepts are often introduced (defined) in different ways, but the properties that they have, and are in turn formalized, are the same. For the basic concepts, like natural numbers, matching them between libraries is often straightforward, because of mathematical naming conventions. However, for more advanced concepts, finding similar formalizations in different libraries is a non-trivial task even for an expert. In this paper we investigate automatic discovery of similar concepts across libraries of proof assistants. We propose an approach for normalizing properties of concepts in formal libraries and a number of similarity measures. We evaluate the approach on HOL based proof assistants HOL4, HOL Light and Isabelle/HOL, discovering 398 pairs of isomorphic constants and types.

Submitted 15 May, 2014; originally announced May 2014.

arXiv:1405.3451 [pdf, ps, other]

Developing Corpus-based Translation Methods between Informal and Formal Mathematics: Project Description

Authors: Cezary Kaliszyk, Josef Urban, Jiri Vyskocil, Herman Geuvers

Abstract: The goal of this project is to (i) accumulate annotated informal/formal mathematical corpora suitable for training semi-automated translation between informal and formal mathematics by statistical machine-translation methods, (ii) to develop such methods oriented at the formalization task, and in particular (iii) to combine such methods with learning-assisted automated reasoning that will serve as a strong semantic component. We describe these ideas, the initial set of corpora, and some initial experiments done over them.

Submitted 14 May, 2014; originally announced May 2014.

arXiv:1402.3578 [pdf, ps, other]

Learning-assisted Theorem Proving with Millions of Lemmas

Authors: Cezary Kaliszyk, Josef Urban

Abstract: Large formal mathematical libraries consist of millions of atomic inference steps that give rise to a corresponding number of proved statements (lemmas). Analogously to the informal mathematical practice, only a tiny fraction of such statements is named and re-used in later proofs by formal mathematicians. In this work, we suggest and implement criteria defining the estimated usefulness of the HOL Light lemmas for proving further theorems. We use these criteria to mine the large inference graph of the lemmas in the HOL Light and Flyspeck libraries, adding up to millions of the best lemmas to the pool of statements that can be re-used in later proofs. We show that in combination with learning-based relevance filtering, such methods significantly strengthen automated theorem proving of new conjectures over large formal mathematical libraries such as Flyspeck.

Submitted 10 February, 2014; originally announced February 2014.

Comments: journal version of arXiv:1310.2797 (which was submitted to LPAR conference)

arXiv:1402.2359 [pdf, ps, other]

Machine Learner for Automated Reasoning 0.4 and 0.5

Authors: Cezary Kaliszyk, Josef Urban, Jiří Vyskočil

Abstract: Machine Learner for Automated Reasoning (MaLARea) is a learning and reasoning system for proving in large formal libraries where thousands of theorems are available when attacking a new conjecture, and a large number of related problems and proofs can be used to learn specific theorem-proving knowledge. The last version of the system has by a large margin won the 2013 CASC LTB competition. This paper describes the motivation behind the methods used in MaLARea, discusses the general approach and the issues arising in evaluation of such system, and describes the Mizar@Turing100 and CASC'24 versions of MaLARea.

Submitted 28 May, 2014; v1 submitted 10 February, 2014; originally announced February 2014.

arXiv:1310.2805 [pdf, ps, other]

doi 10.1007/s10817-015-9330-8

MizAR 40 for Mizar 40

Authors: Cezary Kaliszyk, Josef Urban

Abstract: As a present to Mizar on its 40th anniversary, we develop an AI/ATP system that in 30 seconds of real time on a 14-CPU machine automatically proves 40% of the theorems in the latest official version of the Mizar Mathematical Library (MML). This is a considerable improvement over previous performance of large-theory AI/ATP methods measured on the whole MML. To achieve that, a large suite of AI/ATP methods is employed and further developed. We implement the most useful methods efficiently, to scale them to the 150000 formulas in MML. This reduces the training times over the corpus to 1-3 seconds, allowing a simple practical deployment of the methods in the online automated reasoning service for the Mizar users (MizAR).

Submitted 10 October, 2013; originally announced October 2013.

Journal ref: J. Automated Reasoning 55(3): 245-256 (2015)

arXiv:1310.2797 [pdf, ps, other]

Lemma Mining over HOL Light

Authors: Cezary Kaliszyk, Josef Urban

Abstract: Large formal mathematical libraries consist of millions of atomic inference steps that give rise to a corresponding number of proved statements (lemmas). Analogously to the informal mathematical practice, only a tiny fraction of such statements is named and re-used in later proofs by formal mathematicians. In this work, we suggest and implement criteria defining the estimated usefulness of the HOL Light lemmas for proving further theorems. We use these criteria to mine the large inference graph of all lemmas in the core HOL Light library, adding thousands of the best lemmas to the pool of named statements that can be re-used in later proofs. The usefulness of the new lemmas is then evaluated by comparing the performance of automated proving of the core HOL Light theorems with and without such added lemmas.

Submitted 10 October, 2013; originally announced October 2013.

arXiv:1309.4962 [pdf, other]

HOL(y)Hammer: Online ATP Service for HOL Light

Authors: Cezary Kaliszyk, Josef Urban

Abstract: HOL(y)Hammer is an online AI/ATP service for formal (computer-understandable) mathematics encoded in the HOL Light system. The service allows its users to upload and automatically process an arbitrary formal development (project) based on HOL Light, and to attack arbitrary conjectures that use the concepts defined in some of the uploaded projects. For that, the service uses several automated reasoning systems combined with several premise selection methods trained on all the project proofs. The projects that are readily available on the server for such query answering include the recent versions of the Flyspeck, Multivariate Analysis and Complex Analysis libraries. The service runs on a 48-CPU server, currently employing in parallel for each task 7 AI/ATP combinations and 4 decision procedures that contribute to its overall performance. The system is also available for local installation by interested users, who can customize it for their own proof development. An Emacs interface allowing parallel asynchronous queries to the service is also provided. The overall structure of the service is outlined, problems that arise and their solutions are discussed, and an initial account of using the system is given.

Submitted 19 September, 2013; originally announced September 2013.

arXiv:1307.1528

doi 10.4204/EPTCS.118

Proceedings 10th International Workshop On User Interfaces for Theorem Provers

Authors: Cezary Kaliszyk, Christoph Lüth

Abstract: This EPTCS volume collects the post-proceedings of the 10th International Workshop On User Interfaces for Theorem Provers (UITP 2012), held as part of the Conferences on Intelligent Computer Mathematics (CICM 2012) in Bremen on July 11th 2012. The UITP workshop series aims at bringing together researchers interested in designing, developing and evaluating interfaces for interactive proof systems, such as theorem provers, formal method tools, and other tools manipulating and presenting mathematical formulae. Started in 1995, it can look back on seventeen years of history by now. The papers in the present volume give a good indication of the range of questions currently addressed in the UITP community; this ranges from interface design (Windsteiger; Dunchev et al) to using technologies such as machine learning to assist the user (Komendantskaya et al). The web features prominently (Tankink), and new technology necessitates changes right down to the very basic modes of interaction (Wenzel) - the old REPL (read, evaluate, print, loop) mode of interaction cannot take advantage of modern technology, such as the web and multi-core machines.

Submitted 5 July, 2013; originally announced July 2013.

Journal ref: EPTCS 118, 2013

arXiv:1305.5710 [pdf, other]

Formal Mathematics on Display: A Wiki for Flyspeck

Authors: Carst Tankink, Cezary Kaliszyk, Josef Urban, Herman Geuvers

Abstract: The Agora system is a prototype "Wiki for Formal Mathematics", with an aim to support developing and documenting large formalizations of mathematics in a proof assistant. The functions implemented in Agora include in-browser editing, strong AI/ATP proof advice, verification, and HTML rendering. The HTML rendering contains hyperlinks and provides on-demand explanation of the proof state for each proof step. In the present paper we show the prototype Flyspeck Wiki as an instance of Agora for HOL Light formalizations. The wiki can be used for formalizations of mathematics and for writing informal wiki pages about mathematics. Such informal pages may contain islands of formal text, which is used here for providing an initial cross-linking between Hales's informal Flyspeck book, and the formal Flyspeck development. The Agora platform intends to address distributed wiki-style collaboration on large formalization projects, in particular both the aspect of immediate editing, verification and rendering of formal code, and the aspect of gradual and mutual refactoring and correspondence of the initial informal text and its formalization. Here, we highlight these features within the Flyspeck Wiki.

Submitted 24 May, 2013; originally announced May 2013.

Comments: 16 pages, published as part of the CICM 2013 conference proceedings

arXiv:1211.7012 [pdf, ps, other]

doi 10.1007/s10817-014-9303-3

Learning-Assisted Automated Reasoning with Flyspeck

Authors: Cezary Kaliszyk, Josef Urban

Abstract: The considerable mathematical knowledge encoded by the Flyspeck project is combined with external automated theorem provers (ATPs) and machine-learning premise selection methods trained on the proofs, producing an AI system capable of answering a wide range of mathematical queries automatically. The performance of this architecture is evaluated in a bootstrapping scenario emulating the development of Flyspeck from axioms to the last theorem, each time using only the previous theorems and proofs. It is shown that 39% of the 14185 theorems could be proved in a push-button mode (without any high-level advice and user interaction) in 30 seconds of real time on a fourteen-CPU workstation. The necessary work involves: (i) an implementation of sound translations of the HOL Light logic to ATP formalisms: untyped first-order, polymorphic typed first-order, and typed higher-order, (ii) export of the dependency information from HOL Light and ATP proofs for the machine learners, and (iii) choice of suitable representations and methods for learning from previous proofs, and their integration as advisors with HOL Light. This work is described and discussed here, and an initial analysis of the body of proofs that were found fully automatically is provided.

Submitted 26 October, 2014; v1 submitted 29 November, 2012; originally announced November 2012.

Journal ref: J. Automated Reasoning 54(1): 99, 2014

arXiv:1206.0136 [pdf, ps, other]

doi 10.2168/LMCS-8(2:14)2012

General Bindings and Alpha-Equivalence in Nominal Isabelle

Authors: Christian Urban, Cezary Kaliszyk

Abstract: Nominal Isabelle is a definitional extension of the Isabelle/HOL theorem prover. It provides a proving infrastructure for reasoning about programming language calculi involving named bound variables (as opposed to de-Bruijn indices). In this paper we present an extension of Nominal Isabelle for dealing with general bindings, that means term constructors where multiple variables are bound at once. Such general bindings are ubiquitous in programming language research and only very poorly supported with single binders, such as lambda-abstractions. Our extension includes new definitions of alpha-equivalence and establishes automatically the reasoning infrastructure for alpha-equated terms. We also prove strong induction principles that have the usual variable convention already built in.

Submitted 19 June, 2012; v1 submitted 1 June, 2012; originally announced June 2012.

Comments: 35 pages

ACM Class: F.3.1

Journal ref: Logical Methods in Computer Science, Volume 8, Issue 2 (June 20, 2012) lmcs:813

arXiv:0809.1644 [pdf, ps, other]

Computing with Classical Real Numbers

Authors: Cezary Kaliszyk, Russell O'Connor

Abstract: There are two incompatible Coq libraries that have a theory of the real numbers; the Coq standard library gives an axiomatic treatment of classical real numbers, while the CoRN library from Nijmegen defines constructively valid real numbers. Unfortunately, this means results about one structure cannot easily be used in the other structure. We present a way of interfacing these two libraries by showing that their real number structures are isomorphic assuming the classical axioms already present in the standard library reals. This allows us to use O'Connor's decision procedure for solving ground inequalities present in CoRN to solve inequalities about the reals from the Coq standard library, and it allows theorems from the Coq standard library to apply to problems about the CoRN reals.

Submitted 9 September, 2008; originally announced September 2008.

Journal ref: Journal of Formalized Reasoning, 2(1):27-39, 2009
