Proof Terms in Rocq
Post Metadata
In this blog post, we’re going to take a deep dive into looking at proof terms in Rocq.
When we write proofs in a proof assistant like Rocq, we’re used to thinking in
terms of goals and tactics (intros, apply, rewrite, induction, and so on),
but these goals and tactics are just a layer of abstraction over the
proof terms themselves.
In fact, every proof we write is actually just a program.
The Curry-Howard correspondence
is the foundational idea that propositions correspond to types,
and proofs correspond to programs that inhabit those types.
To prove a proposition, we construct a term (a function) whose type is that proposition.
If such a function exists, the proposition is true; if not, the type is uninhabited and the proposition is false.
Note that the Curry-Howard correspondence applies to all programs, not just those we write in a proof language.
For example, factorial is a function that proves the proposition corresponding
to the type int -> int.
Of course, this isn’t a very interesting proposition (it just says if there is an
int, then there is an int), but it is a proof nonetheless!
When we write proofs about programs, it’s easy to think of the “proof code” as
being a totally separate kind of thing from the “normal code,” but it’s worth
noting that this distinction is artificial.
The proof code is also just a program, and the normal code is also a proof of some proposition.
That being said, the goal of this blog post is to understand proofs we’ve written in Rocq from the program perspective. No more goals and tactics. Just program terms and functions!
Example 1: forall X Y : Prop, X -> Y -> X
This proposition says that if you have two propositions, X and Y, and you know X is true, and you know Y is true, then you can conclude X is true. This is obviously true, but because it’s so simple, it’s a good place to start looking at proof terms.
Proving this proposition in Rocq using tactics might look something like this:
Goal forall X Y : Prop, X -> Y -> X.
intros. apply H.
Qed.
Or even like this:
Goal forall X Y : Prop, X -> Y -> X.
auto.
Qed.
But what do these tactics actually mean?
We know from the Curry-Howard correspondence that proofs are just programs,
and we can prove a proposition by writing a program,
but this proof script that we just wrote doesn’t really feel like a program.
The Rocq proof engine uses the tactics we write to construct
a “proof term,” which is exactly the program that proves the proposition.
That is, the completed proof is a program that inhabits the type of the
proposition, thus serving as a witness that the proposition is true.
We can use Show Proof. in Rocq to inspect this proof term, and we’ll see
something like this:
(fun (X Y : Prop) (H : X) (_ : Y) => H)
This looks much more like a program!
It’s a function with 4 curried arguments: X of type Prop, Y of type Prop,
H of type X, and an unnamed argument of type Y.
The return type is X, so the body of the function is a proof of the proposition X.
You can think of this proof term as saying “If you give me propositions X and Y,
a proof of X, and a proof of Y, then I will give you a proof X”.
Written that way, it becomes obvious that the function body should just
immediately return the proof of X that was supplied as the third argument.
Though people don’t usually write proof terms directly in Rocq, you totally can:
Goal forall X Y : Prop, X -> Y -> X.
exact (fun X Y H _ => H).
Qed.
Example 2: forall X Y Z : Prop, (X -> Y -> Z) -> (X -> Y) -> (X -> Z)
First, let’s look at how we might prove this goal using tactics in Rocq:
Goal forall X Y Z : Prop,
(X -> Y -> Z) -> (X -> Y) -> (X -> Z).
intros X Y Z H1 H2 HX.
apply H1. Show Proof.
- apply HX. Show Proof.
- apply H2. apply HX.
Qed..
In this proof script, we first introduce names for the three propositions X,
Y, and Z, and for the hypotheses: H1 for the X -> Y -> Z premise,
H2 for the X -> Y premise, and HX for the X premise.
Now our goal is to construct a proof of Z from these components.
H1 says “If you give me a proof for X and a proof for Y,
then I will give you a proof for Z.”
The Rocq tactic apply H matches the conclusion of H with the current goal,
and then replaces the goal with the premise(s) of H.
So in our case, we apply H1, resulting in two subgoals: X and Y.
We can discharge the first subgoal with HX and the second with H2 followed
by HX.
These tactics explain procedurally how to complete the proof,
but again, they don’t look much like a program.
Let’s use Show Proof again:
(fun (X Y Z : Prop)
(H1 : X -> Y -> Z)
(H2 : X -> Y)
(HX : X)
=> H1 HX (H2 HX))
This proof term is again a function, this time with 6 curried arguments:
Xwith typePropYwith typePropZwith typePropH1with typeX -> Y -> ZH2with typeX -> YHXwith typeX
And the return type of the function is Z.
The body of the function is H1 HX (H2 HX), which if you spend some time
working through the types of the various function calls, you’ll see has
type Z as expected.
Notice that the two arguments to H1 in the function body correspond
to the two subgoals we had when we proved the goal using tactics.
Our goal was to prove (X -> Y -> Z) -> (X -> Y) -> (X -> Z), and we do it
by writing a program that produces something of type Z when supplied
with the 3 propositions and the 3 hypotheses (as functions).
Said another way, the program is itself the proof of the proposition.
Because we are able to write the function, it must be the case that
the proposition we are trying to prove is true.
Example 3: forall x y z : nat, x + (y + z) = (x + y) + z
So far, we’ve been able to prove the propositions just by directly applying
the hypotheses as functions and pushing symbols around until we construct
a term that has the type we want.
But simple symbol pushing won’t allow us to prove properties that rely
on recursive or inductive structures.
If you’ve written proofs in Rocq, you’ve probably used the induction tactic.
This tactic is a bread-and-butter tool for reasoning about natural numbers,
lists, trees, recursive functions, and so on.
Goal forall x y z : nat, x + (y + z) = (x + y) + z.
induction x.
- intros. reflexivity.
- intros. simpl. rewrite IHx. reflexivity.
Show Proof.
Qed.
induction x gives us two subgoals: one for the base case 0,
and one for the inductive case S x.
In the inductive case, we additionally get to use
the induction hypothesis forall y z, x + (y + z) = (x + y) + z.
Using the IH, we can discharge the goal using just a few tactics.
In the previous example, we saw that the two subgoals of the proof
ended up as function arguments in the proof term.
If we inspect the proof term in this case using Show Proof, we see:
(fun x : nat =>
nat_ind (fun x0 : nat => forall y z : nat, x0 + (y + z) = x0 + y + z)
(fun y z : nat => eq_refl)
(fun (x0 : nat) (IHx : forall y z : nat, x0 + (y + z) = x0 + y + z)
(y z : nat) =>
eq_ind_r (fun n : nat => S n = S (x0 + y + z)) eq_refl (IHx y z)
:
S x0 + (y + z) = S x0 + y + z) x)
It’s a little hard to pick out what subterms of this proof term correspond
to the cases in our induction proof, but they’re in there I promise!
Let’s start by trying to understand nat_ind.
nat is an inductive type defined in the Rocq standard library:
Inductive nat :=
| O
| S : nat -> nat.
All inductive types in Rocq come with an induction principle,
so when nat gets defined, Rocq automatically generates nat_ind,
with the following type:
forall P : nat -> Prop,
P 0 ->
(forall n : nat, P n -> P (S n)) ->
forall n : nat, P n`.
In plain terms, this says “for any proposition about nats, if you have a proof
that the proposition holds for 0 and if you have a proof showing that if
the proposition holds for n then it holds for n + 1, then you can conclude that
the proposition holds for all natural numbers”.
nat_ind is a function, so to construct a proof by induction, we just
need to call this function with the right arguments.
One thing that’s helpful for making sense of proof terms is that we can
mix and match writing the terms explicitly and using tactics.
For example, we can partially apply nat_ind with just the first argument
(the proposition P) like this:
nat_ind (fun a : nat => forall b c : nat, a + (b + c) = (a + b) + c)
Which gives us this goal:
Goal forall x y z : nat, x + (y + z) = (x + y) + z.
apply (nat_ind (fun a => forall b c, a + (b + c) = (a + b) + c)).
Show Proof at this point gives us
(nat_ind (fun a : nat => forall b c : nat, a + (b + c) = (a + b) + c) ?Goal ?Goal0)
which says that we’ve told nat_ind what Prop we’re trying to prove,
but we haven’t given the proofs of P 0 or P n -> P (S n) yet, so those are left as holes.
We can solve these subgoals using tactics if we want:
Goal forall x y z : nat, x + (y + z) = (x + y) + z.
apply (nat_ind (fun a => forall b c, a + (b + c) = (a + b) + c)).
- reflexivity.
- intros. simpl. rewrite H. reflexivity.
Qed.
But now that we’ve gotten some practice constructing our own proof terms,
we should be able to fill in these holes with functions.
For the first argument, we need something with type P 0, which for our
choice of P is forall y z: 0 + (y + z) = (0 + y) + z.
So our function is going to look something like this: fun y z : nat => ...,
but what should we put in the function body to show the equality
between 0 + (y + z) and (0 + y) + z?
There’s an inductive type built into Rocq called eq:
Inductive eq (A : Type) (x : A) : A -> Prop :=
| eq_refl : x = x.
The eq_refl constructor says that any value is equal to itself (the reflexivity
tactic corresponds to using this constructor in the proof term)
So we can fill in our function body for the P 0 term: fun y z : nat => eq_refl.
Goal forall x y z : nat, x + (y + z) = (x + y) + z.
apply (nat_ind (fun a => forall b c, a + (b + c) = (a + b) + c)
(fun y z => eq_refl)).
If we Show Proof again, we’ll see that we only have one hole left:
(nat_ind (fun a : nat => forall b c : nat, a + (b + c) = (a + b) + c) (fun y z : nat => eq_refl) ?Goal)
The last hole needs to have type forall n : nat, P n -> P (S n).
Notice that the P n in the type is what allows us to use the induction
hypothesis to construct the P (S n) term.
The proof term should have four arguments:
x0: nat, which come from theforall nIHx : forall y z : nat, x0 + (y + z) = (x0 + y) + z, which comes fromP n- and 4.
y : natandz : nat, which come from theforall b cin the type ofP. So it’ll end up looking like this:fun x0 IHx y z => ..., where the body of the function needs to have typeS n + (b + c) = (S n + b) + c.
We’re going to need one more piece before we can fill in the function body:
we need a way to substitute equal terms.
eq_ind_r is a function with type
forall (A : Type) (x : A) (P : A -> Prop), P x -> forall y : A, y = x -> P y
Basically what this says is that if you know some property holds for x
and you know x = y, then the property holds for y too.
Nothing groundbreaking, but it’s going to be useful to have a way to do this manipulation
of terms in our proof construction (this corresponds to the rewrite tactic).
To use eq_ind_r we supply the following 6 arguments:
A = natThis is the type of the input toPx = ((x0 + y) + z)This is the value for which we knowPholdsP = fun n => S n = S ((x0 + y) + z)This is the property we want to conclude holds fory. Notice that this is a differentPthan we had at the top level.P x = eq_reflThis is the proof thatP xholds. In this case,S ((x0 + y) + z) = S ((x0 + y) + z), which we prove using the reflexivity definition from before.y = x0 + (y + z)This is the value we want to concludePholds for.(y = x) = (IHx y z)Finally, we need to supply a proof thaty = x. We can do this by calling the induction hypothesis.
So the whole thing looks like this:
@eq_ind_r
nat
(x0 + y + z)
(fun n => S n = S (x0 + y + z))
eq_refl
(x0 + (y + z))
(IHx y z)
And it allows us to conclude S (x0 + (y + z)) = S ((x0 + y) + z), which is exactly what we needed to prove!
(Notation note: the @ tells Rocq not to use implicit arguments. If we omit it, then Rocq will infer
arguments 1, 2, and 4, and we can just write eq_ind_r (fun n => S n => S ((x0 + y) + z)) eq_refl (IHx y z))
Putting the whole thing together, we have our proof by induction without using any tactics:
Goal forall x y z : nat, x + (y + z) = (x + y) + z.
exact (nat_ind (fun a => forall b c, a + (b + c) = (a + b) + c)
(fun y z => eq_refl)
(fun x0 IHx y z =>
@eq_ind_r nat ((x0 + y) + z) (fun n => S n = S ((x0 + y) + z)) eq_refl (x0 + (y + z)) (IHx y z))).
Qed.
If you made it this far, thank you!
Hopefully something in these examples helped you understand proofs or programs in a slightly different way.
Obviously there’s lots of stuff in proof assistants that I didn’t get into here.
For example, I didn’t get into search tactics (auto, intuition, lia, etc.) at all.
These tactics can seem like magic when you use them,
but they are still constructing concrete proof terms internally.
Instead of the tactic corresponding one-to-one with the proof term it constructs,
these tactics perform bounded searches looking for ways to fill in the holes in the proof term.