Slogan: Definitions build environments.
Let us add a recursive binding mechanism to our language where we restrict right hand sides to lambda expressions.
(define-struct rec-let (lhs ; variable rhs ; required to be a lambda-expression body))where
lhs
is the new local variable, rhs
is
the lambda-expression defining the value of the new variable,
and body
is an expression that can use the new
local variable. The new variable
lhs
is visible in both
rhs
and
body
. The code to process a rec-let in our interpreter
MEval might look like
((rec-let? M) ... (MEval (rec-let-body M) (extend env (rec-let-lhs M) (make-closure (rec-let-rhs M) E))))To turn the
rhs
expression of M
into a
recursive closure, we must ensure that E
is exactly the
same environment that we are in the process of constructing. In other
words, we need a special kind of environment, env*
,
with the following property:
env* = (extend env (rec-let-lhs M) (make-closure (rec-let-rhs M) env*))
Using the following procedure
(define FENV (lambda (env*) (extend env (rec-let-lhs M) (make-closure (rec-let-rhs M) env*))))the equation for
env*
can be rewritten as
env* = (FENV env*)which shows that we want the environment to be the fixed-point of the function
FENV
(mapping environments
to environments).
If environments are implemented as functions (symbol -> Value), the definition of
fix-env: (Env -> Env) -> Env
, the function that finds the fixed-point of
FENV (and all other functions mapping environments to enviroments),
is simple:
(define fix-env (lambda (F) (local [(define renv (F (lambda (id) (lookup id renv))))] renv)))If Scheme were a call-by-name functional language, the code would be even simpler:
(define fix-env (lambda (F) (local [(define renv (F renv))] renv)))If Scheme were a call-by-name language, then
(local [(define renv (F renv))] renv)))would bind
renv
to a suspension (closure of no arguments)
containing the expression (F renv) and the new environment
containing renv. When renv
is evaluated in the
body of the local
, the suspension evaluates (F
renv)
in the environment constructed by local
,
which includes the new binding of renv
.
Hence,
(fix-env FENV) = (local [(define renv (FENV renv))] renv) = (local [(define renv (extend env (rec-let-lhs M) (make-closure (rec-let-rhs M) renv)))] renv)If
M = (make-rec-let x proc ast),then
(local [(define renv (extend env (rec-let-lhs M) (make-closure (rec-let-rhs M) renv)))] renv) = (local [(define renv (extend env x (make-closure proc renv)))] renv) = (define renv' (extend env (make-closure proc renv'))) renv'which is exactly the recursively defined environment that we need. The (call-by-value) Scheme solution requires the expression
(lambda (id) (lookup id renv))instead of
renv
because call-by-value evaluation
requires a lambda form to suspend evaluation. On inputs of type
(symbol -> Value),
the function
(lambda (env) (lambda (id) (lookup id env))is the identity function except on improper values of env (undefined, divergence, or errors). On improper values for env,
(lambda (id) (lookup id env))is still a legal environment (which will blow up only when it is applied).
In the absence of this suspension trick, the evaluation of the right hand side of the definition
(define renv (extend env (rec-let-lhs M) (make-closure (rec-let-rhs M) renv)))in Scheme, blows up because renv has no value (is undefined) until the evaluation of the definition is complete. The suspension trick wraps the argument
renv
inside a lambda-expresssion
without changing its meaning for proper values of env.
We have used the recursive binding mechanism local
in
Scheme (and call-by-name Scheme) to implement rec-let
.
Note that we have placed a strong requirement on the rhs
of rec-let
in LC. Therefore, this interpreter does not
completely explain how local
works
in Scheme. We will give a more complete explanation later in the course.
local
or letrec
to interpret
rec-let
from LC?
Hints: Take a look at the sample programs in1 and in2 for this assignment. The book The Seasoned Schemer by Matthias Felleisen and Daniel Friedman contains an extensive discussion of the Y operator.
If we represent environments as structures, how do we write
fix-env
? We need to construct an environment that
contains references to copies of itself. In a "lazy" functional
language like Haskell or Miranda, we could easily build the
required structures using "lazy" constructors that defer the
evaluation of their arguments.
For example if we added
lazy-cons to LC, the code would
(rec-let x (lazy-cons 0 x) ...)define x as a list consisting of 0 followed by x.
Instead of storing a value in each field of the constructed structure, a lazy constructor stores a suspension specifying how to evaluate the field when it is needed. This is exactly the same mechanism used to support call-by-name argument passing to program defined functions. Note that this approach formulates self referential environments as infinite data structures rather data structures with cycles.
In functional Scheme, we cannot use ordinary data structures (non-functions) to solve the recursion equation
env* = (FENV env*)because these structures cannot be infinite. However, in imperative Scheme, we can create the the necessary self-references using cycles created by brute-force. We can overwrite the environment field inside the new closure to point to the extended environment. If we adhere to the list representation of environments introduced in Lecture 4, we must abandon the explicit definition the
fix-env
operation because finding the references to
env* that
we have to patch in (FENV env*) is either impossible (if
F uses constructors that we have not anticipated) or inefficient
(because we have to traverse all of the data structure (F env*)).
Fortunately, we don't need a
a general function fix-env
for our interpreter;
all we need is a function
(rec-extend M env)
that builds the fixed-point for FENV:
(FENV (rec-extend M env)) = (rec-extend M env)We can define rec-extend as follows
(define rec-extend (lambda (M env) (local [(define var (rec-let-lhs M)) (define rhs (rec-let-rhs M)) (define clsr (make-closure rhs null)) (define renv (extend env var clsr))] ; Is renv the appropriate environment yet? No. (set-closure-env! clsr renv))))
Given the environment env
and the rec-let expression
M
of a function,
(rec-extend env M)
extends the environment with the new function
binding in M such that the environment embedded in the function binding
is the extended environment. The rec-let
clause of our interpreter can now be written
((rec-let? M) ... (MEval (rec-let-body M) (rec-extend env M)))
fix-env
function. Hint: Scheme boxes.
(define rec-extend (lambda (M env) (local [(define var (rec-let-lhs M)) (define rhs (rec-let-rhs M)) (define renv (extend env var (void)))] ; (void) is a dummy value for var ; Is renv the appropriate environment yet? No. (set-first-pair-val! renv (MEval rhs renv))))) (define set-first-pair-val! (lambda (env val) (local [(define first-pair (first env))] (set-pair-val! first-pair val))))To make MEval abort on the inspection of (void) we also need to add the clause
[(void? M) (error 'MEval "variable referenced before definition in rec-let")]to the definition of MEval.
Exercise Generalize the code from the previous exercise to allow an arbitrary expression as the right hand side of a rec-let. Hint: it is much simpler than the code above.
Say we have functions from R to R (where R denotes the set of real numbers. We might have a function like
f(x) = 2x + 1The fixed-point of
f
is the value x
such that x = f(x)
. Hence,
x = 2 x + 1
, implying that the
fixed-point x = -1
.
Does every function from R to R
have a fixed-point? No: consider g(x) = x +
1
. Substituting x
and reducing, we get 0 =
1
, which is a contradiction. On the other hand, h(x) = x
has an infinite number of solutions. Hence, a function from R
to R could have zero, one, several, or an infinite number of fixed-points.
However, we will show that for every function from environments to environments, we can
construct a fixed-point.
We can use fixed-points to assign a mathematical meaning to any recursive definition expressed in a computational framework. For example, we can assign a mathematical meaning to any recursive definition of a function in terms of already defined functions.
Consider the equation
fact(n) = if zero?(n) then 1 else n * fact(n - 1)over the natural numbers N = {0, 1, 2, ...} where the primitive functions
zero?
, *
, -
, and
if-then-else
have their usual meaning. The defined
function factobviously must satisfy the equation
fact = map n to if zero?(n) then 1 else n * fact(n - 1)which is mathematically equivalent to being a fixed-point of the functional FACT defined by the non-recursive equation
FACT(f) = map n to if zero?(n) then 1 else n * f(n - 1)
How do we find the fixed-point of such an equation? We interpret F as an inductive definition of the graph of the partial function fix(F).
A partial function mapping N to N is any a subset G of N x N such that (a,b) in G and (a,c) in G imply that b = c. A (total) function mapping N to N is a partial function F such that for every a in N, there exists b such that (a,b) in F. Computed functions correspond to partial functions rather than total functions because they may diverge on some inputs.
We use the term graph of a function to indicate that we are interpreting the function simply as a set of ordered pairs.
Let us restate the definition of the fixed-point fact of FACT as an inductive definition of the graph of a function.
Given a, b in N, the pair (a,b) belongs to the graph of fact iff either:
The preceding construction can be interpreted as building successively better approximations to the graph of fact. We start with the empty set empty as our initial approximation. Using this interpretation for fact, we deterine that fact must contain the element (0,1). Using the new approximation {(0,1)} for fact, we apply the defintion again to determine that fact must contain the element (1,1) in addition to (0,1). By repeating this process forever, we build better and better approximations to fact. The entire graph of fact is the infinite union of all of the finite approximations Morever, it is the least set satisfying the preceding definition. All of the inductive data definitions presented in Comp 210 can be interpreted in exactly the same way. Each data definition specifies a function D mapping sets to sets. The set specfied by the definition is the least set d such that D(d) = d.We just showed how the recursive definition of factorial could be interpreted as an inductive definition of its graph. It is suggestive of a more general construction that enables us to build the least fixed-point of a computable function F mapping Pow(NxN) (graphs of functions on N) into Pow(NxN). By repeatedly applying the function F to the empty set we can build the least fixed point of F. The fixed-point f of F is
Union {Fk(empty) | k = 0,1,...}where Fk(empty) is k applications of F to empty: F(F(F( ... F(empty)...))) and F0(empty) = empty.
Consider the FACT example above.
FACT1(empty) = {(0,1)} FACT2(empty) = {(0,1), (1,1)} FACT3(empty) = {(0,1), (1,1), (2,2)} ... FACTN+1(empty) = {(0,1), (1,1), (2,2), ..., (N,N!)} ...Each additional application of FACT permits the recursive evaluation of our equation defining fact to go one level deeper. FACT1 is the function computed by the equation for fact if the first recursive call diverges. FACTN is the function computed by the equation if the Nth recursive call (in depth) diverges.
Any recursive definition of a function from N to N can be interpreted in this way. How do we apply this idea to solving our recursion equation over environments? If we represent envivronments as functions, then exactly the same technique works. The only difference is that the function FENV maps environments to environments instead of N to N.
If environments are represented some other way, then the representation must permit us to find a non-empty least fixed-point for FENV. Otherwise, the representation is incorrect.
Consider the function FADD on sets of N defined by
FADD(S) = {s+1 | s in S}What is its least fixed-point? The emtpy set! In a computational setting every computable function mapping a data domain D (e.g., N, sets over N, functions over N) into D has a least fixed-point. But the fixed-point may be the trivial "empty" element of D. By definition, a computational data domain must be a partial order with a least element that contains the limits of infinite ascending chains of elements. Similarly, any computable function over such a domain must preserve limits of ascending chains (continuity). It is not difficult to prove that every continuous function f on a domain D has a least fixed point: it the limit (least upper bound) of the ascending chain empty, f(empty), f(f(empty)), ... Consider the recursive definition
x = x + 1over N. What is the "least" element of N satisfying this equation? There is no conventional natural number that suffices. But the computational analog of the set N includes one additional element, called bottom, which is the meaning of a non-terminating computation. Does the equation
bottom = bottom + 1hold? Yes! Both sides of the equation represent non-termination.
At this point in the course, we have explained several language constructs, such as procedures and applications, in terms of their Scheme counterparts. A notable exception is environments, which we have represented explicitly. Indeed, since we explicitly manage variable-value associations in our interpreters, we can analyze some of the lower-level behavior of the language such as its memory management.
Consider the evaluation of
(let (x 1) (let (y 2) (let (z 3) ...)))We begin with the empty environment; as we encounter each
let
, we add a new level to our environment; as we leave
the let
body, we remove that level. At the end of the
entire expression, our environment will be empty again. Hence, our
environment will have behaved like a stack. Do environments
always behave in this manner?
Consider this expression:
((lambda (x) (lambda (y) (y x))) 10)This evaluates to an ``apply to
10
'' function. This
expression
((lambda (z) (lambda (w) (+ z w))) 20)evaluates to a procedure that adds
20
to its argument.
If we apply the former to the latter, what happens to the environment at
each stage?
We begin with the empty environment; the first application is
performed and the environment is extended with the binding [x =
10]
. The closure is now created, and the environment is
emptied. Then we evaluate the second expression, which adds the
binding [z = 20]
. The second closure (call it
C2
) is produced, and the environment is emptied again.
At this point, we are ready to perform the desired application. When
we do, the environment has the bindings [x = 10]
and
[y = C2]
. Now C2
is applied, which has the
effect of replacing the current environment with its own, which
contains the bindings [z = 20]
. The application adds the
binding [w = 10]
, at which point the addition is performed,
30
is returned, and the environment is emptied again.
The moral of this example is that environments for LC programs, no matter how we choose to represent them, branch out like trees, and a simple stack discipline is insufficient for maintaining them. In many programming languages (such as Fortran, Pascal, and C/C++), it is often the programmer's job to manage memory and the collection of environments in existence at any point in the program must be representable by a stack. Such languages prohibit language constructions that can create a collection of environments that cannot represented by a stack. In our interpreters for LC, we have left the task of managing the storage for LC environments to Scheme's run-time system. Otherwise, our interpreters would have been much more complex programs.
cork@cs.rice.edu/