osa1 github gitlab twitter cv rss

A complicated proof of a simple theorem

July 10, 2014 - Tagged as: coq, en.

It’s that time of the year again in which I’m suddenly very interested about verification and interactive theorem proving. I’m currently working on proving properties of some well-known algorithms.

At least in the beginning even the simplest properties may turn out to be very hard to prove. In this post I’m going to give an example of a very simple theorem which turns out to be hard to prove for beginners like me.

Let’s say I want to divide peano nats by 10. I can define div10 in two ways: 1) I can implement a function that directly divides by 10 2) I can implement div2 and div5 and use those two to implement div10:

(* first definition *)
Fixpoint div10 (n : nat) : nat :=
  match n with
  | S (S (S (S (S (S (S (S (S (S n'))))))))) => S (div10 n')
  | _ => 0
  end.

(* second definition *)
Fixpoint div5 (n : nat) : nat :=
  match n with
  | S (S (S (S (S n')))) => S (div5 n')
  | _ => 0
  end.

Fixpoint div2 (n : nat) : nat :=
  match n with
  | S (S n') => S (div2 n')
  | _ => 0
  end.

Definition div10' (n : nat) : nat := div5 (div2 n).

It may be obvious enough that those two definitions are equal(in the sense that they always terminate and given same arguments they return same values) but as an exercise let’s try to prove it:

Theorem divs_eq : forall n, div10 n = div10' n.

It’s obvious that we need to do induction on n, but unfortunately that leads to very complicated induction hypothesis and I can’t make any use of them:

Proof.
  unfold div10'. intro.
  do 10 (induction n; auto). simpl. f_equal. 
Abort.

At this point goal is same as our original goal, but induction hypothesis are about a hundred lines long so I can’t make any use of it. (and I don’t understand why induction hypothesis getting that big)

It turns out that by using a different induction scheme we can easily prove this. Instead of using standard induction scheme of nats “prove P 0, assume P n and prove P (S n)” we can use “prove P 0, P 1, … P 9, assume P n and prove P (10 + n)”:

Definition nat10ind : forall (P : nat -> Prop),
  P 0 -> P 1 -> P 2 -> P 3 -> P 4 -> P 5 -> P 6 -> P 7 -> P 8 -> P 9 ->
  (forall n, P n -> P (10 + n)) -> forall n, P n.

  intros.
  assert (P n /\ P (1 + n) /\ P (2 + n) /\ P (3 + n) /\ P (4 + n) /\ P (5 + n) /\
          P (6 + n) /\ P (7 + n) /\ P (8 + n) /\ P (9 + n)).
  + induction n.
    - simpl. repeat split; assumption.
    - repeat (match goal with H : _ /\ _ |- _ => destruct H end).
      repeat split; try assumption.
      apply H9. assumption.
  + destruct H10. assumption.
Defined.

Now our main theorem is very easy to prove:

Theorem divs_eq : forall n, div10 n = div10' n.
Proof.
  unfold div10'. intro.
  induction n using nat10ind; auto.
  simpl. f_equal. assumption.
Qed.

One thing that I found weird in nat10ind definition is that the expression match goal with ... is kind of special in that goal is not an identifier that represents some part of the context. Rather, match goal with ... is a special syntax to pattern match against the whole context, with hypothesis and goals1. H : _ /\ _ |- _ part is then matching for any hypothesis with form _ /\ _ and binding it to H, ignoring the current goal(RHS of turnstile shown as |-).


  1. See tacexpr_1 production of Coq expression syntax in reference manual for details.↩︎