Franklin Pezzuti Dyer

Avoiding endless evaluations in Scheme

In the Scheme language, certain control structures have an interesting quirk. When we perform an ordinary function call like (f x y) (in this case, a call to a function of two arguments), both of the arguments x and y are evaluated before being passed to the function f. We can demonstrate this by fabricating arguments that cause side effects when they're evaluated:

> (define x '(begin (display "evaluating x\n") 34))
> (define y '(begin (display "evaluating y\n") 35))
> (define (f a b) (+ a b))
> (eval x)
evaluating x
34
> (eval y)
evaluating y
35
> (f (eval x) (eval y))
evaluating x
evaluating y
69

However, we can see that some control structures that appear to be functions actually aren't because they don't handle their arguments this way. For instance, check out what happens when we carry out boolean operations with the and construction:

> (define x '(begin (display "evaluating x\n") #t))
> (define y '(begin (display "evaluating y\n") #f))
> (and (eval x) (eval y))
evaluating x
evaluating y
#f
> (and (eval y) (eval x))
evaluating y
#f

In particular, a call to and evaluates its arguments one at a time until it finds a false value; if it finds one it returns #f right away, and otherwise it returns the value of the last argument. After all, once it finds an argument with value #f there's no need to evaluate the rest of the arguments, because the appearance of a single false value immediately determines the result of this boolean operation. The or construction employs a similar trick: as soon as one of its arguments evaluates to a true value, it returns that value without evaluating the rest.
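For instance, with the same x and y as above, we'd expect or to stop as soon as it sees a true value, with a REPL interaction along these lines:

> (or (eval x) (eval y))
evaluating x
#t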

This peculiarity doesn't usually have much of an impact on the performance of our programs, but it can make a huge difference in cases where one of the expressions passed as an argument causes an error or enters an infinite loop, so that it never returns any valid value. It's not too difficult to come up with examples of expressions whose evaluations never terminate. For example, if we define (define btm '(eval btm)) and then attempt to evaluate (eval btm), we've just entered an infinite loop. If we try to pass this devious expression as an argument to a function, then the evaluation will never finish, even if the result doesn't actually depend on the value of the argument. For instance, the function (lambda x 0) should in principle always return 0, but when we try to perform the call ((lambda x 0) (eval btm)) it doesn't return anything because of its attempted evaluation of btm. On the other hand:

> (and #f (eval btm))
#f

since and never even tries to evaluate the btm in the second argument, and for this reason it's able to return a value. In contrast, the call (and (eval btm) #f) never terminates because it tries to evaluate the first argument before the second, and the call (and #t (eval btm)) never terminates because the result of the expression truly does depend on the value of the second argument.

But I'd say that the evaluations (and (eval btm) #f) and (and #t (eval btm)) are very different. In the latter case, it wouldn't even make sense to assign a value, because it could end up either true or false depending on the value of (eval btm), and since this never returns any kind of boolean value there's no way of non-arbitrarily assigning a value to this expression. On the other hand, intuition tells us that the expression (and (eval btm) #f) "should" evaluate to false - after all, even though the first argument never returns a boolean, if it were to return one then the expression would turn out false no matter its value. In fact, we can define our own version of and that evaluates the second argument before the first:

(define-syntax and2
        (syntax-rules ()
                ((and2 x y)
                 (let ((tmp y))
                      (if tmp x #f)))))
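If we try this out at the REPL (with btm defined as above), we'd expect something like:

> (and2 (eval btm) #f)
#f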

With this construction (and2 (eval btm) #f) works but (and2 #f (eval btm)) doesn't - hence, it has the same weakness as and, but reversed. How could we write control structures that evaluate boolean expressions in every case in which their result is determined by the arguments that do return a value?

Let me try to explain more precisely what exactly I'm trying to say here. Suppose we have a function (not a Scheme function, but a mathematical function that we'd like to implement in Scheme) that takes several arguments:

$$f:A_1\times\cdots\times A_n\to B$$

To represent our ability to pass ill-defined arguments to this function, we can extend the domain of the function like this:

$$\tilde f:(A_1\cup\{\bot\})\times\cdots\times(A_n\cup\{\bot\})\to B\cup\{\bot\}$$

In other words, we're adding to each component of the domain (and to the codomain) an extra "value" $\bot$ that represents something that can't be evaluated or that doesn't have a specified value. (I swear, I'm not the one who came up with this weird notation.) We're going to define a partial order on the domain of the extended function $\tilde f$ in the following way: we'll say that a tuple $t=(a_1,\cdots,a_n)$ is "less than" the tuple $t'=(a_1',\cdots,a_n')$, written $t\leq t'$, if and only if $t$ "could equal $t'$" if some of its undefined elements (equal to $\bot$) were defined. Stated more rigorously, we say that $t\leq t'$ if $a_i\ne a_i'$ only when $a_i = \bot$. For instance, we would say that $(1,\bot,3)\leq (1,2,3)$. Intuitively speaking, if we start evaluating the arguments to a function and at some given moment the first argument has evaluated to $1$ and the third argument to $3$ but the second argument is still being evaluated, it's still possible that a $2$ comes out and that the tuple of arguments ends up being $(1,2,3)$. On the other hand, $(1,\bot,3)\nleq (1,2,4)$, since the third argument has already evaluated to $3$ and it's no longer possible for it to be a $4$. In Scheme specifically we'll have the following partial order for a pair of boolean arguments:

[Figure 1: the partial order on pairs of boolean arguments, with $\bot$ included]

in which $\bot$ represents the "value" of an expression like (eval btm).

We'll require some properties of this function $\tilde f$ so that it fits our intuition of the computation that it represents. The first is that the more well-defined the input is, the more well-defined the output should be: if $t'$ is a "possible evaluation" of $t$, that is $t\leq t'$, then $\tilde f(t')$ should be a possible evaluation of $\tilde f(t)$, that is $\tilde f(t)\leq\tilde f(t')$. Thus, we'll require the following monotonicity property:

$$t\leq t'\implies \tilde f(t)\leq\tilde f(t')$$

(Here the codomain $B\cup\{\bot\}$ carries the analogous order, in which $\bot\leq b$ for every $b\in B$.) In particular, if the arguments $t$ already suffice to determine the value of $\tilde f(t)$, and if $t'$ is "even more specific" than $t$, then $\tilde f(t')$ should have this same value. Additionally, if we want to consider $\tilde f$ to be an implementation of $f$, then it should have the same values as $f$ on the subset $A_1\times\cdots\times A_n$ of its domain, that is, when all of its arguments are well-defined. We can use the picture from before of the partial ordering on pairs of booleans to visualize the subordering of arguments on which the Scheme implementation of and is defined and the subordering on which this logical operation could be defined:

[Figure 2: the pairs of boolean arguments on which Scheme's and returns a value, compared with the (green) region of pairs whose well-defined entries already determine the result]

The best possible implementation of $f$ with respect to how robustly it handles ill-formed arguments would be one that assigns a well-defined value to every set of arguments whose well-defined values uniquely determine the result of the computation. That is, the best implementation of a logical conjunction in Scheme would return a value for each of the pairs of arguments inside of the green region. Can you formulate this property mathematically?
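For what it's worth, here's one candidate way of writing such a property down (just a sketch of one possible formulation, where "total" means a tuple with no $\bot$ components):

$$\tilde f(t) = \begin{cases} b & \text{if } f(t')=b \text{ for every total } t'\geq t \\ \bot & \text{otherwise} \end{cases}$$

An extension defined this way agrees with $f$ on total tuples, is monotone, and returns a well-defined value exactly when the well-defined arguments already force that value.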

Well, that's enough with the mathematical formalism. Maybe I'll come back to this in a future post, but if you'd like to read more about this way of formalizing partial computations you should check out domain theory. This is a topic I'd like to learn more about myself.

In any case, my attempt to implement logical conjunction in a way that handles non-terminating computations as robustly as possible involves threads. If we evaluate one argument after the other then there will always remain a case in which we're waiting on the result of a computation that never terminates while the other argument is sufficient to determine the result. On the other hand if we proceed by evaluating the two arguments simultaneously, or rather by evaluating each of them bit by bit, then we'll be able to end the computation as soon as we receive a value that determines the value of the operation. Here's my code:

(define-syntax andp 
        (syntax-rules ()
                ((andp x y)
                 (let* ((ch (make-channel))
                        (th1 (thread (lambda () (channel-put ch x)))) 
                        (th2 (thread (lambda () (channel-put ch y)))))
                       (and (channel-get ch) (channel-get ch)))))) 

This macro creates a channel and two threads th1 and th2 that communicate with the parent thread through that channel. According to Racket's documentation on concurrency, the function channel-get blocks until some thread sends a value into the provided channel, and then it returns this value. Hence, by writing (and (channel-get ch) (channel-get ch)), the first argument to and receives whichever of the two computations returns a result first. If this value turns out to be false, then we can return #f without even waiting on the second value with another call to channel-get. If we test out this implementation of andp we get the following results:

> (andp #f (eval btm))
#f
> (andp (eval btm) #f)
#f

I'm not sure if it would really ever be feasible to use this operation in practice, though. The creation of threads tends to be costly, so a program that uses the parallel implementation andp for every logical conjunction would be super inefficient. In any case, maybe there could arise cases in which it would be worthwhile, for instance when handling computations where it's likely for one part to take much longer to evaluate than the other.

We can also consider how to handle non-evaluating arguments in operations more complex than boolean ones. For instance, consider the control structure if, which also has special evaluation rules in Scheme. During the evaluation of the expression (if t x y), the argument t is evaluated first; if it turns out to be true then the value of x is returned, and otherwise the value of y is returned, so that only one of x and y is ever evaluated. For instance, the computations (if #t 23 (eval btm)) and (if #f (eval btm) 23) do return a value (both evaluate to 23). If the first argument t can't be evaluated then (if t x y) never finishes evaluating, but notice that there are cases in which the result doesn't actually depend on the value of t. For instance, in the expression (if (eval btm) 23 23) there isn't really any need to evaluate (eval btm), because regardless of whether the first argument is #t or #f the final result would be 23. To implement an if-else structure that handles this special case better, we can again make use of threads:

(define-syntax ifp 
        (syntax-rules ()
                ((ifp test x y)
                 (let* ((ch (make-channel))
                        (th1 (thread (lambda () (if (eqv? x y) (channel-put ch x) 0))))
                        (th2 (thread (lambda () (channel-put ch (if test x y))))))
                       (channel-get ch))))) 

With this implementation we get the following results:

> (ifp #t 23 (eval btm))
23
> (ifp #f (eval btm) 23)
23
> (ifp (eval btm) 23 23)
23

while evaluations like (ifp (eval btm) 23 24) and (ifp #t (eval btm) 23) will still go on forever - after all, their results actually depend on the non-terminating computation (eval btm).

Let's briefly take a look at a more complex example regarding the implementation of a function that compares two lists and decides if they are equal or not. Here are two implementations that don't make use of parallelism:

(define (comp-list-1 l1 l2)
        (cond ((empty? l1) (empty? l2))
              ((empty? l2) #f) 
              (else (and (eqv? (eval (car l1)) (eval (car l2))) 
                         (comp-list-1 (cdr l1) (cdr l2))))))

(define (comp-list-2 l1 l2)
        (cond ((empty? l1) (empty? l2))
              ((empty? l2) #f) 
              (else (and (comp-list-2 (cdr l1) (cdr l2)) 
                         (eqv? (eval (car l1)) (eval (car l2)))))))

For well-behaved arguments, these two implementations will produce the same results, but they're very different when it comes to their treatment of lists that contain entries that can't be evaluated. For instance, since comp-list-1 compares the two lists from beginning to end, it's capable of distinguishing between the two lists '(1 (eval btm)) and '(2 3), because it detects the difference between 1 and 2. It's not capable of distinguishing between '(1 (eval btm) 3) and '(1 2 4), since it runs into a non-terminating computation at the second entry before reaching the entries 3 and 4 that would allow it to tell the two lists apart. For comp-list-2 the situation is reversed. In fact, we can explicitly describe the pairs of lists for which each implementation returns a result. In particular, (comp-list-1 l1 l2) will return a result if:

  1. Neither l1 nor l2 has any non-evaluating entries
  2. There is a position at which the entries of the two lists evaluate to distinct values, and every entry of both lists before the first such position can also be evaluated
  3. One list is shorter than the other and every entry of both lists, up to the length of the shorter one, can be evaluated

On the other hand, (comp-list-2 l1 l2) returns a result when:

  1. Neither l1 nor l2 has any non-evaluating entries
  2. One list is shorter than the other
  3. There is a position at which the entries of the two lists evaluate to distinct values, and every entry of both lists after the last such position can also be evaluated
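For instance, with btm defined as before, we'd expect REPL interactions along these lines:

> (comp-list-1 '(1 (eval btm)) '(2 3))
#f
> (comp-list-2 '((eval btm) 2) '((eval btm) 3 4))
#f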

An interesting detail here is that comp-list-2 detects inequality between lists with different lengths without having to evaluate a single element. However, in addition to comp-list-1 and comp-list-2 we can design a parallel (but certainly not very efficient) implementation that makes use of our andp construction:

(define (comp-list-p l1 l2)
        (cond ((empty? l1) (empty? l2))
              ((empty? l2) #f)
              (else (andp (eqv? (eval (car l1)) (eval (car l2)))
                          (comp-list-p (cdr l1) (cdr l2))))))

With this implementation the termination conditions are much less stringent. In particular, to conclude that two lists are unequal, it's only necessary that the two lists have distinct lengths, or that there's some position at which both lists have entries that evaluate to distinct values. This implementation is capable of distinguishing both between the pair of lists '(1 (eval btm)) and '(2 3) and between the pair of lists '(1 (eval btm) 3) and '(1 2 4). It's even capable of distinguishing between the lists '((eval btm) 34 (eval btm)) and '((eval btm) 35 (eval btm)), which neither comp-list-1 nor comp-list-2 manages to accomplish.
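Concretely, with btm defined as before, we'd expect something like the following (keeping in mind that the threads stuck on the non-terminating entries keep running in the background even after the answer comes back):

> (comp-list-p '((eval btm) 34 (eval btm)) '((eval btm) 35 (eval btm)))
#f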

