Franklin Pezzuti Dyer

Home     Posts     CV     Contact     People

Totality and non-standard recursion in Idris

Lately I've been continuing my adventures in dependent type theory and computer-verified proofs by working with the Idris language. I've really enjoyed using Agda, but I think that Idris has started to appeal to me more because it feels like it strikes nice balance between practicality and more "academic" use cases, e.g. as a proof assistant. But I still consider myself a beginner at all of this stuff, so instead of arguing for the merits of Idris I'd like to talk about a specific interesting problem that I've solved recently. (You can find all of the code that I've written for this post in this Github repo.)

Idris has this neat feature called totality checking. A function is called total when it is guaranteed to terminate "eventually" for any possible input value you might pass it. Due to the uncomputability of the Halting Problem, it's not possible to look at an arbitrary function and decide algorithmically whether or not it can enter an infinite loop. But for certain kinds of recursive definitions that adhere to a specific pattern, it's possible to confirm that they define total functions.

For instance, they can generally verify that the following kind of definition is total:

f : Nat -> a
f 0     = x
f (S n) = g (f n)

where x : a is the base-case value of f, and g : a -> a is some function that is iteratively applied to x. Totality checkers like this sort of definition because the argument decreases by one with each recursive call. If one of your function definitions can be seen to fit this pattern through static analysis, then it will get a big thumbs-up from the totality checker, because this is a known way of defining total functions on Nat.

Totality is super important when it comes to writing proofs. If you allow proofs to be non-total, then it's actually possible to prove anything at all! In Idris (and Agda) we represent "the false proposition" as the empty type, which in Idris is called Void. If you allow non-total proofs, then you can write the following recursive proof of "false", or equivalently, construct the following element Void:

x : Void
x = x

Yes, this is ridiculous. If someone asked you to prove a statement $P$, this would be equivalent to saying "sure thing! ask me tomorrow and I'll give you my proof". And then saying the same thing the next day, and the next day, and so on. If a language is meant to be used for proof verification, then it certainly should not allow this sort of thing to happen.

In Agda, this is not an issue because Agda is a total language, meaning that it does not even allow you to write definitions that it can't verify as total. This is good, but it has the inconvenient consequence that some functions can't be written at all in Agda - that is, it's not Turing Complete. Not only that, but many recursive functions that are total are actually significantly more difficult to write because the most straightforward way of writing them doesn't satisfy Agda's totality checker. Idris occupies a middle ground on this issue: it does not require all functions to be total, but you can annotate specific functions with the total keyword if you'd like Idris to make sure that they are total. Any proofs you write in Idris should be marked total.

To take a concrete example, consider the recursive function $L : \mathbb N\to \mathbb N$ defined by a base case of $L(0) = 0$ and the following recurrence: Incidentally, this function's explicit value is actually given as follows for arguments $n\neq 0$: In Idris, we can define this function directly using its recursive definition, just like this:

log2Careless : Nat -> Nat
log2Careless 0      = 0
log2Careless (S x)  = 1 + log2Careless (div2 $ S x)

where div2 is a helper function I've written to compute the quantity $\lfloor n/2\rfloor$. This definition works fine in Idris, but if you were to write an analogous definition in Agda, it would yell at you because it doesn't recognize that this will always terminate. This is because its termination actually depends on the properties of the function div2 involved in the recursive call. After all, if div2 were, say, the identity function instead of $n\mapsto \lfloor n/2\rfloor$, then in fact this definition would give a non-total function. Similarly, if you try putting the total keyword before this definition in Idris, it would also yell at you, and for good reason.

So what if we want our cake and to eat it too? How can we write a function that both satisfies this recurrence and is a total function? After all, if we're going to be doing pure math in Idris, ideally all of the functions involved should be total.

Remember the example function we saw earlier, which is accepted by the totality checker because its argument decrements by one with each recursive call? We can mimic this behavior by adding a second accumulator-like argument to our function that also decrements by one wih each recursive call. The new definition looks like this:

total ɑlog2CarefulHelper : Nat -> Nat -> Natβ ɑlog2CarefulHelper _ 0 = 0β ɑlog2CarefulHelper m (S n)β = ɑcaseSplitβ ɑ(\_ => 1 + log2CarefulHelper (div2 m) n)β ɑ(\_ => log2CarefulHelper m n)β ɑ(decLeq (S n) m)β total ɑlog2Careful : Nat -> Natβ log2Careful n = log2CarefulHelper n n

This is definitely a more roundabout way of defining the same recursive function, and it's not obvious at first glance why it satisfies the same recurrence. I like to think about the difference between these two recursive definitions graphically: you might picture the sequence of recursive calls involved in computing $L(7)$ like this, for the "careless" recursive definition:

Fig 1

and you might picture the sequence of recursive calls involved in computing $L(7)$ for the "careful" recursive definition like this:

Fig 2

If it's not obvious to us that this function satisfies the same recurrence as $L$, then it almost certainly isn't obvious to Idris, either. This is the price we pay for making our function total: the recursive formula is no longer part of its definition, and therefore we actually have to write a proof to convince Idris that log2Careful satisfies the recurrence for $L$, if we want to have this fact at our disposal. The recurrence for log2Careless can be proved trivially because it's part of the definition:

log2CarelessRecurrence : (n : Nat) ->
                         log2Careless (S n) = 1 + log2Careless (div2 $ S n)
log2CarelessRecurrence n = Refl

But oops, if you try to mark this proof as total, Idris will complain that log2Careless may not a total function. And a proof that isn't necessarily total isn't of much value, as we've seen. So while log2Careless is much more natural to write, it doesn't play nice with proofs.

What we want, then, is to construct a proof of the following type:

log2CarefulRecurrence : (n : Nat) ->
                         log2Careful (S n) = 1 + log2Careful (div2 $ S n)

I'm not going to lie - this proof was a real pain in the butt and it took a lot of practice with proof assistants before I was capable of writing it. My first step was to prove an intermediate proposition about the log2Careful function, essentially stating that when the second argument is greater than the first argument, it can be "dropped down" to the value of the first argument without changing the function's value. You can see this pictorialy in our diagram representing the sequence of recursive calls for log2Careful, too. It's reflected in the fact that the arrows always move horizontally and to the left until reaching the diagonal.

Anyways, here's my proof of this lemma:

total log2DropDown : (m, n : Nat) -> LeqNat m n -> log2CarefulHelper m n = log2CarefulHelper m m ɑlog2DropDown 0 0 _ = Reflβ ɑlog2DropDown 0 (S n) (LeqZero (S n))β = trans ɑ(caseSplitNo {y = 1 + log2CarefulHelper 0 n} (decLeq (S n) 0) (succNotLeqZero n))β ɑ(log2DropDown 0 n (LeqZero n))β ɑlog2DropDown (S m) (S n) (LeqShift leq1)β = ɑcaseSplitβ ɑ(\leq2 => cong (log2CarefulHelper $ S m) (leqAntisym leq2 $ LeqShift leq1))β (\nleq => let ɑleq2 = leqImmediateSuc leq1 (\eq => nleq $ LeqShift $ eqImpliesLeq $ sym eq)β in trans ɑ(caseSplitNo (decLeq (S n) (S m)) nleq)β ɑ(log2DropDown (S m) n leq2))β ɑ(decLeq (S n) (S m))β

Lemma in hand, the rest of the proof is not so bad. Here's what it looks like:

total ɑlog2CarefulRecurrence : (n : Nat) -> log2Careful (S n) = 1 + log2Careful (div2 $ S n)β log2CarefulRecurrence n = trans ɑ(caseSplitYes (decLeq (S n) (S n)) (eqImpliesLeq Refl))β ɑ(cong (1+) $ log2DropDown (div2 $ S n) n (div2Leq n))β

This whole ordeal was painfully gruelling for such a simple-looking recurrence relation, so of course I immediately wanted to rewrite these proofs in as abstract of a form as possible, so that in the future I could just appeal to my abstract proof for any additional recurrences without having to think through this whole ordeal again. My "recurrence template" looks pretty similar to the foldr higher-order function, with just a couple of differences.

Firstly, it requires that the domain type a be endowed with some sort of "size function" that assigns to each element of a a natural number representing its "size". This is necessary to express the idea of a "well-founded" recurrence. In order to guarantee that a recursive definition of a function on an arbitrary domain terminates, we need some way of saying that each function value is recursively defined in terms of a function value at a "smaller" input. And this is predicated on there existing some notion of size for elements of a.

Secondly, it allows for multiple base cases of the function to be defined. In particular, the base cases are defined to be precisely the elements of a whose size equals zero. There can be multiple such elements, and they are all of the preimages of 0 : Nat under the size function.

Analogously to the drawings we saw earlier describing how the log2 functions' recursive definitions progress from one input to the next, we might picture the sequence of recursive calls involved in this kind of generalized recurrence as follows:

Fig 3

So, to completely describe a recurrence defining a recursive function a -> b, we need a function recg : a -> b -> b which is a direct analogue of the function passed as an argument to foldr; a function recf : a -> a specifying which element of a is to be the "previous case" for any given element of a; a function size : a -> Nat assigning a quantitative size to each possible input; and a function base : a -> b mapping each size-zero input value with the base-case value that should be assigned to it. My design choice was to bundle all of these data into a record type, like this:

public export
record GenericRecurrence a b where
    constructor MakeGenRec
    size : a -> Nat
    base : a -> b
    recf : a -> a
    recg : a -> b -> b 

The "careless" way of defining a recursive function from these data is implemented like this:

public export
carelessNatRecurse : GenericRecurrence a b -> (a -> b)
carelessNatRecurse gr x
    = if (gr.size x) == 0 
      then gr.base x 
      else gr.recg x (carelessNatRecurse gr (gr.recf x))

This is very similar to the definition of foldr! Sadly, much like our log2Careless function, Idris does not recognize this as total (and for good reason, as it may not be total at all depending on the recf function). So here is my "careful" version of the same function, which Idris does recognize as being total:

carefulHelper : GenericRecurrence a b -> (a -> Nat -> Nat -> b)
carefulHelper gr x s 0 = gr.base x
carefulHelper gr x s (S k)
    = caseSplit
        (\_ => gr.recg x (carefulHelper gr (gr.recf x) (gr.size $ gr.recf x) k))
        (\_ => carefulHelper gr x s k)
        (decLeq (S k) s)

public export
carefulNatRecurse : GenericRecurrence a b -> (a -> b)
carefulNatRecurse gr x = carefulHelper gr x (gr.size x) (gr.size x)

As before, Idris can't figure out on its own that recursive functions defined using carefulNatRecurse satisfy the recurrence they're supposed to satisfy, so it requires a proof on my part. The proof is pretty similar to the one I wrote for log2Careful and just as unwieldy, so I won't include it here. However, it has an interesting difference in the type signature that is worth pointing out. (But if you're curious, you can see the source code here.) The type signature looks like this:

total public export carefulNatRecurrence : ɑ(gr : GenericRecurrence a b)β -> ɑ(bound : (x : a) -> (n : Nat) -> (gr.size x = S n) -> LeqNat (gr.size (gr.recf x)) n)β -> ɑ(x : a) -> (n : Nat) -> (gr.size x = S n)β -> ɑ(carefulNatRecurse gr x = gr.recg x (carefulNatRecurse gr $ gr.recf x))β

The difference is that we need to actually provide as an additional argument the boundedness property of recf guaranteeing that it reduces input sizes with each recursive call. We didn't need this when proving log2CarefulRecurrence because we were working with a specific recurrence, in which the recursive calls were to arguments of half the size. For this specific case, I had already proven the boundedness property of div2 as a lemma called div2Leq, but in the more general case, this lemma would have to be passed as an argument.

Finally, here's what the whole definition of $L$ looks like when using my abstracted recursion scheme, rather than doing the whole proof by hand.

total ɑlog2RecursionPackage : GenericRecurrence Nat Natβ log2RecursionPackage = MakeGenRec ɑidβ ɑidβ ɑdiv2β ɑ(\_ => (1+))β total ɑlog2 : Nat -> Nat log2 = carefulNatRecurse log2RecursionPackageβ total ɑlog2Recurrence : (n : Nat) -> log2 (S n) = 1 + log2 (div2 (S n))β log2Recurrence n = carefulNatRecurrence ɑlog2RecursionPackageβ ɑ(\sm, m, eq => transport (\k => LeqNat (div2 k) m) (sym eq) (div2Leq m))β (S n) n Refl

Pretty handy, right?

back to home page