# Tutorials
# Table of Contents
[TOC]
----
# First steps
## Create a prover9 input file and find a proof
Press the **New Blank Prover9** File button ![][image-1]
ProverX will automatically write some of the text for you:
```matlab
include(preferences).
formulas(assumptions).
% Write the axioms here:
end_of_list.
formulas(goals).
% and the goals here:
end_of_list.
```
Notice on the file mathat there is a file called "preferences" with some common settings. You can, of course, modify this file to suit your preferences.
No let's try to prove that a group in which all elements have order 2 is commutative.
So enter the following lines:
```matlab
include(preferences).
formulas(assumptions).
% group axioms
(x * y) * z = x * (y * z). % associativity
1 * x = x. % left identity
x * 1 = x. % right identity
x' * x = 1. % lef inverse
x * x' = 1. % right inverse
% special assumption
x * x = 1. % all elements have order 2
end_of_list.
formulas(goals).
x * y = y * x. % commutativity
end_of_list.
```
press the **Save File** button ![][image-2] and call this file abel1.in.
Then press the **Run with Prover9** button ![][image-3]
You should get a new window on the right with the result:
```html
============================== PROOF =================================
% Proof 1 at 0.04 (+ 0.06) seconds.
% Length of proof is 11.
% Level of proof is 4.
% Maximum clause weight is 11.000.
% Given clauses 12.
1 x * y = y * x # label(non_clause) # label(goal). [goal].
2 (x * y) * z = x * (y * z). [assumption].
3 1 * x = x. [assumption].
4 x * 1 = x. [assumption].
7 x * x = 1. [assumption].
8 c2 * c1 != c1 * c2. [deny(1)].
13 x * (x * y) = y. [para(7(a,1),2(a,1,1)),rewrite([3(2)]),flip(a)].
14 x * (y * (x * y)) = 1. [para(7(a,1),2(a,1)),flip(a)].
20 x * (y * x) = y. [para(14(a,1),13(a,1,2)),rewrite([4(2)]),flip(a)].
24 x * y = y * x. [para(20(a,1),13(a,1,2))].
25 $F. [resolve(24,a,8,a)].
============================== end of proof ==========================
```
## Using the preprocessor
Now, let's do exactly the same thing but using some of the preprocessor directives.
Create this file and save it (call it abel2.in):
```matlab
include(preferences).
formulas(assumptions).
include(group).
% special assumption
x ** 2 = 1.
end_of_list.
formulas(goals).
x * y = y * x.
end_of_list.
```
Notice that all the group theory axioms have been replaced by the directive **include(group)**. This file contains the group axioms and is stored on the server; you can also create you own include files (if you create a file called "group" it will take precedence over the default one). Notice also that the special assumption now uses the exponentiation directive.
Press the **Run with Prover9** button and you should get exactly the same result.
Now, if you look at the file tree on the left, you will notice a new file called abel2.in.tmp (this is the actual file that is passed to ProverX). Click on that file to reveal an identical file to abel1.in (note that .tmp files are not editable).
## Syntax checking
If you wish to check what the preprocessor did to your input file, just press the **Syntax Check** button ![][image-4]
A new pane will open side-by-side with your original code, revealing the .tmp file that will be passed to ProverX.
But the main purpose of this button is to find errors in your input file. For instance, remove the last dot in "x \* y = y \* x" and press the button.
You should now see the offending line highlighted in the .tmp file:
![][image-5]
## Using mace4
Now remove the special assumption ('x \*\* 2 = 1.') and try to run prover9 again.
By removing this line, we are asking the program to prove that all groups are commutative. This is obviously false and you will get the message:
```html
No proof found !
```
So, instead, press the **Run with Mace4** button to find a model that satisfies this theory (a non commutative group).
You should get:
```html
============================== MODEL =================================
interpretation( 6, [number=1, seconds=0], [
function(c1, [ 0 ]),
function(c2, [ 2 ]),
function('(_), [ 0, 1, 2, 4, 3, 5 ]),
function(*(_,_), [
1, 0, 3, 2, 5, 4,
0, 1, 2, 3, 4, 5,
4, 2, 1, 5, 0, 3,
5, 3, 0, 4, 1, 2,
2, 4, 5, 1, 3, 0,
3, 5, 4, 0, 2, 1 ])
]).
============================== end of model ==========================
```
which is the smallest non commutative group.
To see this result in the REPL instead of a window, just check the **Run on REPL** option ![][image-6]
![][image-7]
This can be useful when there are errors on your input file / script.
## Using the command line
Open the ProverX REPL ![][image-8] and type:
```python
p = Proverx('abelian.in')
```
then type:
```python
p.find_proofs()
```
and:
```python
p.proofs[0]
```
you should see:
![][image-9]
IMPORTANT: when you are not using the command line, to reduce bandwidth on the server, you should disconnect the REPL ![][image-10]
(The first time, you do this on a session, you might get an alert message from your browser saying: "Are you sure you want to leave this page?" , just press "Leave page" and ignore it.)
If you want to hide the REPL, just press: ![][image-11]
You can also interact with gap ![][image-12]:
![][image-13]
## Creating scripts
We will do the same as above, but this time using a python script.
Start by creating a new file with the text below and save it with a .py extension (for instance abel.py)
Enter this text:
```python
p = Proverx('abelian.in')
print 'Axioms:\n'
print p.axioms
print 'Goals:\n'
print p.goals
print 'Proofs found:\n'
print p.find_proofs()
# Shows the proof
print p.proofs
```
and press the **Run Script** button ![][image-14]
```python
Axioms:
(x * y) * z = x * (y * z)
1 * x = x
x * 1 = x
x' * x = 1
x * x' = 1
x * x = 1
Goals:
x * y = y * x
Proofs found:
1
============================== Proof 1 ==============================
Proof 1 in 0.01 seconds .
Length of proof is 11.
Level of proof is 4
Maximum clause weight is 11.0.
Given clauses 12.
1 x * y = y * x # label(non_clause) # label(goal). [goal].
2 (x * y) * z = x * (y * z). [assumption].
3 1 * x = x. [assumption].
4 x * 1 = x. [assumption].
7 x * x = 1. [assumption].
8 c2 * c1 != c1 * c2. [deny(1)].
13 x * (x * y) = y. [para(7(a,1),2(a,1,1)),rewrite([3(2)]),flip(a)].
14 x * (y * (x * y)) = 1. [para(7(a,1),2(a,1)),flip(a)].
20 x * (y * x) = y. [para(14(a,1),13(a,1,2)),rewrite([4(2)]),flip(a)].
24 x * y = y * x. [para(20(a,1),13(a,1,2))].
25 $F. [resolve(24,a,8,a)].
```
A neat trick is to use Python's triple-quoted strings. This way, you can have your prover9 input and your script in the same file:
```python
abel = """
formulas(assumptions).
% group axioms
(x * y) * z = x * (y * z). % associatitivity
1 * x = x. % left identity
x * 1 = x. % right identity
x' * x = 1. % lef inverse
x * x' = 1. % right inverse
% special assumption
x * x = 1. % all elements have order 2
end_of_list.
formulas(goals).
x * y = y * x.
end_of_list.
"""
p = Proverx(abel)
p.find_proofs()
print p.proofs[0].stats.max_weight
```
Run it and you will get:
```python
11.0
```
As you keep running scripts, the result windows begins to accumulate. If you try to close them, you will get a warning that you are trying to close an unsaved file. This can become annoying when there are a lot of them. So, if you want to close all results without warnings, just press the **Close all results** button ![][image-15]
## GAP scripts
You are not limited to Python, you can also write GAP scripts.
For instance, suppose you want to know the order as well as the elements of the symmetric group S4.
Just create a file a save it with the extension .g
Enter the following gap instructions:
```pascal
S4 := SymmetricGroup(4);
Print("Order of S4: ", Order(S4), "\n\n");
Print("Elements:\n");
Display(Elements(S4));
```
Then press the **Run Script** button ![][image-16]. This time the GAP interpreter will be invoked (because of the .g extension in the file name). The result will be:
```html
Order of S4: 24
Elements:
[ (), (3,4), (2,3), (2,3,4), (2,4,3), (2,4), (1,2), (1,2)(3,4), (1,2,3),
(1,2,3,4), (1,2,4,3), (1,2,4), (1,3,2), (1,3,4,2), (1,3), (1,3,4),
(1,3)(2,4), (1,3,2,4), (1,4,3,2), (1,4,2), (1,4,3), (1,4), (1,4,2,3),
(1,4)(2,3) ]
```
# Using strategies
Having a programming language means that we can extend ProverX anyway we see fit.
One of the first needs that comes to mind is to use the power of python to create strategies that allows us to attack complex proofs (see the module strategies).
Note that the files for these examples are in the **Tutorials** folder.
## Using hints
Let's now take an input file (hard.in) with some hard theorems to prove:
```html
% assign(new_constants, 1).
assign(eq_defs, fold).
set(restrict_denials).
formulas(assumptions).
% Veroff's 2-basis for BA in terms of the Sheffer stroke.
f(x,y) = f(y,x).
f(f(x,y),f(x,f(y,z))) = x.
% Define a new operation (which turns out to be complement).
% The "assign(eq_defs, fold)" above causes this definition to be
% oriented as a rewrite rule so that the operation is introduced
% whenever possible.
x' = f(x,x).
end_of_list.
formulas(goals).
% Sheffer basis for Boolean Algebra
f(f(x,x),f(x,x)) = x # label(Sheffer_1).
f(x,f(y,f(y,y))) = f(x,x) # label(Sheffer_2).
f(f(f(y,y),x),f(f(z,z),x)) = f(f(x,f(y,z)),f(x,f(y,z))) # label(Sheffer_3).
end_of_list.
```
And the file easy.in has and easier proof of the same theorems (by adding an extra assumption):
```html
% assign(new_constants, 1).
assign(eq_defs, fold).
set(restrict_denials).
formulas(assumptions).
% Veroff's 2-basis for BA in terms of the Sheffer stroke.
f(x,y) = f(y,x).
f(f(x,y),f(x,f(y,z))) = x.
f(x,f(y,f(x',z))) = f(x,y'). % extra assumption
% Define a new operation (which turns out to be complement).
% The "assign(eq_defs, fold)" above causes this definition to be
% oriented as a rewrite rule so that the operation is introduced
% whenever possible.
x' = f(x,x).
end_of_list.
formulas(goals).
% Sheffer basis for Boolean Algebra
f(f(x,x),f(x,x)) = x # label(Sheffer_1).
f(x,f(y,f(y,y))) = f(x,x) # label(Sheffer_2).
f(f(f(y,y),x),f(f(z,z),x)) = f(f(x,f(y,z)),f(x,f(y,z))) # label(Sheffer_3).
end_of_list.
```
If we run prover9 on these two files, hard.in will take a lot longer to obtain a proof than easy.in.
A strategy to find a quicker proof for the hard case would be to extract the hints from the easy proof and use them to prove the hard one.
This can be easily done (and tested) with this script (remember to give this file the .py extension to be runnable):
```python
easy = Proverx('Tutorials/easy.in')
hard = Proverx('Tutorials/hard.in')
easy.find_proofs()
print "easy proof:", easy.proofs.stats.time, "secs."
hard.find_proofs()
print "hard proof:", hard.proofs.stats.time, "secs."
hard.hints.add(easy.proofs.hints)
hard.find_proofs()
print "hard with hints:", hard.proofs.stats.time, "secs."
```
You can verify that the hard proof with the hints should be a lot faster to obtain:
```html
easy proof: 0.24 secs.
hard proof: 1.91 secs.
hard with hints: 0.27 secs.
```
As an exercise, try to measure the diference in length of each individual proof.
## Using interpretations lists
Suppose that you have this file (LT-82-2.in):
```html
if(Prover9).
assign(order, kbo).
assign(max_weight, 25).
end_if.
formulas(sos).
% lattice theory
x v y = y v x.
(x v y) v z = x v (y v z).
x ^ y = y ^ x.
(x ^ y) ^ z = x^ (y ^ z).
x ^ (x v y) = x.
x v (x ^ y) = x.
end_of_list.
formulas(sos).
(x ^ y) v (x ^ z) = x ^ ((y ^ (x v z)) v (z ^ (x v y))) # label(H82).
end_of_list.
formulas(goals).
x ^ (y v (x ^ z)) = x ^ (y v (z ^ ((x ^ (y v z)) v (y ^ z)))) # label(H2).
end_of_list.
```
If you run prover9, it will take some time to find a proof (more than 15 minutes, depending on your hardware).
One strategy would be to find a model of a subset of the axioms and add it to an interpretation list. This way, prover9 can discard some clauses quicker.
Just by observing the input, we can see that it should be easy to find a model for the lattice theory.
So let's try this script:
```python
p = Proverx('LT-82-2.in')
last = p.axioms.pop() # remove the last axiom (H82)
if p.find_models():
print p.models[0]
```
the answer is:
```html
interpretation( 6 , [number = 1, seconds=0], [
function(c1, [0]),
function(c2, [1]),
function(c3, [2]),
function(^(_,_), [
0, 3, 3, 3, 0, 3,
3, 1, 5, 3, 1, 5,
3, 5, 2, 3, 2, 5,
3, 3, 3, 3, 3, 3,
0, 1, 2, 3, 4, 5,
3, 5, 5, 3, 5, 5 ]),
function(v(_,_), [
0, 4, 4, 0, 4, 4,
4, 1, 4, 1, 4, 1,
4, 4, 2, 2, 4, 2,
0, 1, 2, 3, 4, 5,
4, 4, 4, 4, 4, 4,
4, 1, 2, 5, 4, 5 ])
]).
```
So, if we add this interpretation to the input file, we should get a quicker proof. We can do it this way:
```python
p = Proverx('LT-82-2.in')
last = p.axioms.pop()
p.find_models()
p.interps.add(str(p.models[0]))
p.axioms.append(last)
p.find_proofs()
print p.proofs
```
And we get the proof in a few seconds:
```html
============================== Proof 1 H2 ==============================
Proof 1 in 5.72 seconds H2.
Length of proof is 45.
Level of proof is 9
Maximum clause weight is 25.0.
Given clauses 288.
1 x ^ (y v (x ^ z)) = x ^ (y v (z ^ ((x ^ (y v z)) v (y ^ z)))) # label(H2) # label(non_clause) # label(goal). [goal].
2 x v y = y v x. [assumption].
3 (x v y) v z = x v (y v z). [assumption].
4 x ^ y = y ^ x. [assumption].
5 (x ^ y) ^ z = x ^ (y ^ z). [assumption].
6 x ^ (x v y) = x. [assumption].
7 x v (x ^ y) = x. [assumption].
8 (x ^ y) v (x ^ z) = x ^ ((y ^ (x v z)) v (z ^ (x v y))) # label(H82). [assumption].
9 x ^ ((y ^ (x v z)) v (z ^ (x v y))) = (x ^ y) v (x ^ z) # label(false). [copy(8),flip(a)].
10 c4 ^ (c5 v (c6 ^ ((c4 ^ (c5 v c6)) v (c5 ^ c6)))) != c4 ^ (c5 v (c4 ^ c6)) # label(H2) # answer(H2). [deny(1)].
11 c4 ^ (c5 v (c6 ^ ((c5 ^ c6) v (c4 ^ (c5 v c6))))) != c4 ^ (c5 v (c4 ^ c6)) # answer(H2). [copy(10),rewrite([2(12)])].
13 x v (y v z) = y v (x v z). [para(2(a,1),3(a,1,1)),rewrite([3(2)])].
15 x ^ (y ^ z) = y ^ (x ^ z). [para(4(a,1),5(a,1,1)),rewrite([5(2)])].
16 x ^ (y v x) = x. [para(2(a,1),6(a,1,2))].
18 x ^ ((x v y) ^ z) = x ^ z. [para(6(a,1),5(a,1,1)),flip(a)].
22 x v (y ^ x) = x. [para(4(a,1),7(a,1,2))].
24 x ^ x = x. [para(7(a,1),6(a,1,2))].
25 x v x = x. [para(6(a,1),7(a,1,2))].
26 x ^ ((y ^ (z v x)) v (z ^ (x v y))) = (x ^ y) v (x ^ z) # label(false). [para(2(a,1),9(a,1,2,1,2))].
28 x ^ ((y ^ (x v z)) v ((x v y) ^ z)) = (x ^ z) v (x ^ y) # label(false). [para(4(a,1),9(a,1,2,1)),rewrite([2(5)])].
31 x ^ (y v (x v z)) = x. [para(13(a,1),6(a,1,2))].
32 x v (y v (x ^ z)) = y v x. [para(7(a,1),13(a,1,2)),flip(a)].
34 x ^ ((y ^ (x v z)) v ((y v x) ^ z)) = (x ^ z) v (x ^ y) # label(false). [para(4(a,1),26(a,1,2,1)),rewrite([2(5)])].
46 x ^ (x ^ y) = x ^ y. [para(24(a,1),5(a,1,1)),flip(a)].
48 x ^ (y ^ x) = y ^ x. [para(24(a,1),5(a,2,2)),rewrite([4(2)])].
57 x v (y v x) = y v x. [para(25(a,1),3(a,2,2)),rewrite([2(2)])].
58 x ^ (y v (z v x)) = x. [para(3(a,1),16(a,1,2))].
69 x ^ ((y ^ x) v (z ^ x)) = (x ^ y) v (z ^ x). [para(22(a,1),9(a,1,2,1,2)),rewrite([5(4),6(3),48(7)])].
74 x ^ (y ^ (x v z)) = y ^ x. [para(6(a,1),15(a,1,2)),flip(a)].
81 x ^ (y ^ (z v x)) = y ^ x. [para(16(a,1),15(a,1,2)),flip(a)].
127 x ^ (((x v y) ^ z) v (y ^ (x v ((x v y) ^ z)))) = (x ^ z) v (x ^ y) # label(false). [para(46(a,1),28(a,1,2,2)),rewrite([2(7),18(11)])].
131 x ^ ((y ^ (x v z)) v (z ^ (x v (y ^ (x v z))))) = (y ^ x) v (x ^ z) # label(false). [para(48(a,1),28(a,1,2,2)),rewrite([2(7),74(11)])].
132 x ^ ((y ^ (z v x)) v (z ^ (x v (y ^ (z v x))))) = (y ^ x) v (x ^ z) # label(false). [para(48(a,1),34(a,1,2,2)),rewrite([2(7),81(11)])].
159 x ^ ((x ^ y) v (z ^ x)) = (x ^ y) v (z ^ x). [para(22(a,1),127(a,1,2,1,1)),rewrite([22(4),7(4),4(3),48(3),48(7)])].
168 x ^ (y v ((z v y) ^ (x v y))) = (x ^ y) v (x ^ (z v y)). [para(58(a,1),9(a,1,2,1))].
223 x ^ ((y ^ (z v x)) v (z ^ (x v (y ^ (x v z))))) = (y ^ x) v (x ^ z) # label(false). [para(2(a,1),131(a,1,2,1,2))].
230 x ^ ((y ^ x) v (z ^ y)) = x ^ y. [para(22(a,1),132(a,1,2,1,2)),rewrite([22(5),69(6),57(6),5(5),159(4),4(8),48(8),2(8),22(8)])].
318 (x ^ y) v (x ^ (z v y)) = x ^ (z v y). [para(81(a,1),22(a,1,2)),rewrite([2(4)])].
331 x ^ (y v ((z v y) ^ (x v y))) = x ^ (z v y). [back_rewrite(168),rewrite([318(9)])].
360 (x ^ y) v (y ^ (z v x)) = y ^ (z v x). [para(58(a,1),223(a,1,2,2,2,2)),rewrite([3(2),31(3),331(5)]),flip(a)].
983 x ^ (y v ((y v z) ^ x)) = x ^ (y v z). [para(6(a,1),230(a,1,2,2)),rewrite([2(3)])].
3796 (x ^ y) v (z v (y ^ (u v x))) = z v (y ^ (u v x)). [para(360(a,1),13(a,1,2)),flip(a)].
7467 x ^ ((x ^ y) v (z ^ (y v x))) = (x ^ z) v (x ^ y) # label(false). [para(26(a,1),983(a,2)),rewrite([4(8),26(8),13(6),2(5),3796(6)])].
7628 x ^ ((y ^ x) v (z ^ (y v x))) = (x ^ z) v (x ^ y). [para(4(a,1),7467(a,1,2,1))].
7634 $F # answer(H2). [back_rewrite(11),rewrite([7628(13),4(5),4(8),32(10),2(6)]),xx(a)].
```
What's happening is that the model falsifies the goal theorem in the simplified axiom set. Proofs of the original theorem must include some clauses that evaluate to false in this model. (Note that two clauses that evaluate to true in the model can only produce clauses that follow from the simplifed axioms and therefore cannot deduce the goal clause.) Prover9's default clause selection strategy will choose these "false in model" clauses sooner than it would without having the model as input.
This is why this strategy *can* help (but doesn't necessarily help) to find a proof.
# More scripting
## Proving with interpretations
Now we can generalize the previous example and wrap it in a reusable function:
```python
def prove_with_interps(fname, time_out = 5):
p = Proverx(fname)
p.axioms.backup()
p.set_option('Mace4', 'assign', 'max_seconds, {}'.format(time_out))
p.set_option('Prover9', 'assign', 'max_seconds, {}'.format(time_out))
for axioms in p.axioms.subsets(order = 'desc'):
p.axioms.replace(axioms)
if p.find_models():
p.interps.add(str(p.models))
p.axioms.restore()
if p.find_proofs():
return copy(p)
return None
```
### How does it work?
What this function does is to take all subsets of the axioms (a generalisation of the previous script) and checks if a model exists for that subset. If it does it will add into the interpretation list and tries to find a proof. If a proof is found, a Proverx instance is returned, otherwise it goes on to the next subset.
Note that this function already exists in the module **strategies**.
## Writing a graph viewer for proofs
Sometimes it is very useful to see the proofs in a graphical form (for example with a directed graph). Wouldn't it be nice if we could do that with ProverX?
Not only can we do that, but most important, it is quite an easy task to accomplish !
One of the advantages of using python is that we have access to it's huge third-party library.
For this project, we will use [Graphviz][1].
Create a new python file and enter:
```python
import graphviz
def depends_from(clause):
ids = []
rewrites = re.findall(r"rewrite\(\[.*?\]\)", clause, re.I) #find rewrite justifications
for rewrite in rewrites:
ids += re.findall(r"([0-9]+[A-Z]*)\(", rewrite, re.I) #get clauses ids from rewrite
clause = clause.replace(rewrite, '')
paras = re.findall(r"para\(.*?\)\)", clause, re.I) #find para justifications
for para in paras:
ids += re.findall(r"([0-9]+[A-Z]*)\(", para, re.I) #get clauses ids from para
clause = clause.replace(para, '')
ids += re.findall('[0-9]+[A-Z]*', clause, re.I) # get remaining ids
return list(set(ids)) #remove duplicate ids
def graph(proof, filename):
graph = graphviz.Digraph(format='png', filename = filename + '.gv')
graph.node('$F',shape='doublecircle')
for id, clause in proof.clauses.iteritems():
depends = depends_from(clause.justifications)
if depends:
for dep_id in depends:
graph.edge(proof.clauses[dep_id].literals, proof.clauses[id].literals)
graph.render()
p = Proverx('abelian.in')
p.find_proofs()
graph(p.proofs[0], 'graph')
```
Run this file and you should see two new files on the file tree on the left (if you don't see them, just press the **Refresh**
button): 'graph.gv' and 'graph.gv.png'.
If you press the .png file you will see:
![][image-17]
### How does it work?
For this script, we use the _clauses_ attribute from the class **Proof**.
First we create a function **depends\_from()** that takes a justification from a clause and returns all the id's this clause depends from (rewrite, para, etc.). This is accomplished with regular expressions.
Then, the main function instantiates a graphviz.Digraph instance and for each clause in the proof, it gets the justifications from that clause and passes them to **depends\_from()**. It creates the edges from all this clauses (using the literals as the name) and then it renders the graph.
[1]: https://www.graphviz.org
[image-1]: images/newp9.png
[image-2]: images/save.png
[image-3]: images/prover9.png
[image-4]: images/syntax.png
[image-5]: images/syntax_error.png
[image-6]: images/run_on_repl.png
[image-7]: images/repl_result.png
[image-8]: images/proverx.png
[image-9]: images/repl_session.png
[image-10]: images/unplug.png
[image-11]: images/show_repl.png
[image-12]: images/gap.png
[image-13]: images/gap_session.png
[image-14]: images/run_script.png
[image-15]: images/close_all.png
[image-16]: images/run_script.png
[image-17]: images/graph.gv.png