Section 5: A Meta-Interpreter for LC (Meaning)
|
|
|
In Lecture 4, we developed a "syntactic" interpreter for LC that relied exclusively on substitution to interpret variables; such interpreters repeatedly rewrite programs in the syntax of the source language until the reduced form is a value. This is a powerful interpretation technique. For instance, even utilities as seemingly far removed from programming languages as the sendmail daemon use this rewriting technique to process their configuration files.
At the end of Lecture 4, we introduced a different form of interpreter that relies on a table of variable/value pairs, called an environment, to interpret variables. Our rationale for introducing this approach was computational efficiency. This approach to interpretation eliminates the high cost of traversing the body of a procedure (performing substitutions) whenever the procedure is applied.
In this lecture, we will look at a different motivation for environment-based interpreters, which we henceforth call meta-interpreters. A meta-interpreter for LC uses an environment to assign meaning to arbitrary program phrases (expressions), not just closed ones. An environment is a partial function mapping variable names (symbols) to values in the programming language. A meta-interpreter takes an environment as an auxiliary argument and relies on this environment to assign meaning to the free variables in the input phrase.
The primary motivation for the term "meta" in "meta-interpreter" is that a meta-interpreter assigns meaning to programs by reducing the meaning of program phrases to the meanings of their components. A meta-interpreter assigns meaning to program phrases using the same inductive framework that logicians use to assign meaning to mathematical formulas. This process reduces the meaning of all program phrases to the meaning of a single program written in a metalanguage. If this program happens to be purely functional, then (as we will see later in the course) the meta-interpreter defines the meaning of programs in exactly the same way as logicians assign meaning to formulas in a mathematical theory such as set theory.
In the last lecture, we wrote a meta-interpreter for LC in Scheme that represented environments as lists of variable/value pairs and closures as records containing the procedure text and the closing environment. We also represented LC numbers as Scheme integers. This convention enables us to interpret LC addition as Scheme addition. We can make this interpreter for LC more abstract by representing closures (evaluated lambda-expressions) as Scheme procedures. Then, we can use Scheme application to interpret LC application.
Here is a sketch of a meta-interpreter Eval, which is essentially our environment-based Eval from last lecture with the representations of environments and closures left unspecified.
(define Eval
  (lambda (M env)
    (cond
      ((var? M)  (lookup (var-name M) env))
      ((num? M)  (num-num M))
      ((add? M)  (+ (Eval (add-left M) env)
                    (Eval (add-right M) env)))
      ((proc? M) (make-closure M env))
      ((app? M)  (Apply (Eval (app-rator M) env)
                        (Eval (app-rand M) env))))))
Note: The + operation used above must be chosen with care, since the addition operation in the meta-language won't necessarily be the same as that of the implemented language.
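For concreteness, here is one plausible set of abstract-syntax declarations, written with the define-structure form used in Lecture 4; the exact struct and field names are assumptions, chosen only so that they generate the selectors used in the Eval sketch above.
(define-structure (var name))         ; variable reference
(define-structure (num num))          ; numeric literal
(define-structure (add left right))   ; addition expression
(define-structure (proc param body))  ; lambda-expression: parameter and body
(define-structure (app rator rand))   ; application: operator and operand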
What are the values in LC? There are two: numbers and procedures. Numbers can be represented directly in the meta-language. To avoid a premature choice of representation for closures, we have chosen to use the abstractions make-closure and Apply. Thus, if we ever need to change the interpretation of closures, we can do so without changing the interpreter itself.
Exercise: Which elements of the interpreter would we have to change if we changed one of our representation choices?
In the special case when the language we are interpreting is the same as that in which the interpreter is written (for instance, a Scheme interpreter written in Scheme), we call the interpreter meta-circular.
The preceding interpreter leaves the representations of closures and environments unspecified.
In Lecture 4, we represented closures as structs (pairs) containing an abstract syntax tree M for a lambda expression and an environment assigning a value to each free variable in M.
Let us re-examine the representation of LC closures (function values). The closure type has only two operations: make-closure, which constructs a closure from the text of a lambda-expression and an environment, and Apply, which applies a closure to an argument value.
Since Scheme supports procedures as values, we can represent LC procedures using Scheme procedures. Consider the following definition of make-closure which returns a procedure rather than a struct:
; proc Env -> closure
(define make-closure
  (lambda (proc-exp env)
    (lambda (value-for-param)
      (Eval (proc-body proc-exp)
            (extend env (proc-param proc-exp) value-for-param)))))
To use this representation of closures, we must modify the definition of Apply:
; closure Val -> Val
(define Apply
  (lambda (clos val)
    (clos val)))
Note that the Scheme closure returned by (make-closure proc-exp env) closes over env: env appears free in the inner lambda-expression that make-closure returns.
Abstractly, we can characterize Apply and Eval as follows:
(Apply (make-closure (make-proc var body) env) val)
  = (Eval body (extend env var val))
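For instance, applying the identity procedure to 5 in some environment env unfolds as follows (a sketch, using the make-proc and make-var constructors from the assumed abstract syntax above):
(Apply (make-closure (make-proc 'x (make-var 'x)) env) 5)
  = (Eval (make-var 'x) (extend env 'x 5))
  = 5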
A representation of closures as structures (code-env pairs) was presented in the preceding lecture. Exactly the same representation works here.
Exercise: Given the declaration
(define-structure (closure P E))
how do we write Apply?
One part of the interpreter remains unspecified: the representation of environments. Before considering the available alternatives, it is worthwhile to consider environments abstractly too. There are three operations we need to understand with respect to environments: lookup, which retrieves the value bound to a variable; empty-env, which constructs an environment containing no bindings; and extend, which adds a binding to an existing environment.
The latter two create new environments, while the first extracts information from an environment. Here are the equations that relate the constructors and the selector:
(lookup var (empty-env))
  = (error 'lookup "variable ~a not found" var)

(lookup var (extend env varN val))
  = (if (eq? var varN)   ;; are var and varN the same symbol?
        val
        (lookup var env))
What is a good representation choice for environments? Note that there is only a fixed number of free variables in a given program, and that we can ascertain how many there are before we begin evaluating the program. On the other hand, we can be lax and assume that there can be arbitrarily many free variables. A good representation in the former case is the vector; in the latter case, we might wish to use lists. However, there is at least one more representation.
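Before turning to that last representation, here is a minimal sketch of the list-based choice, representing an environment as a list of (variable . value) pairs with the most recent binding first; it is easy to check that it satisfies the two equations above.
(define empty-env
  (lambda () '()))

(define extend
  (lambda (env varN val)
    (cons (cons varN val) env)))

(define lookup
  (lambda (var env)
    (cond ((null? env) (error 'lookup "variable ~a not found" var))
          ((eq? var (car (car env))) (cdr (car env)))
          (else (lookup var (cdr env))))))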
Since environments are simply extendable functions mapping symbols to values, we can represent them as Scheme procedures (or Java commands).
Consider the following implementations:
(define lookup
  (lambda (var env)
    (env var)))

(define empty-env
  (lambda ()
    (lambda (var)
      (error 'lookup "variable ~a not found" var))))
We can then prove that this implementation satisfies one of the equations that characterize environments:
  (lookup var (empty-env))
= (lookup var ((lambda () (lambda (varN) (error 'lookup "variable ~a not found" varN)))))
= (lookup var (lambda (varN) (error 'lookup "variable ~a not found" varN)))
= ((lambda (varN) (error 'lookup "variable ~a not found" varN)) var)
= (error 'lookup "variable ~a not found" var)
as desired. We can similarly define extend:
(define extend
  (lambda (env varN val)
    (lambda (name)
      (if (eq? name varN)
          val
          (env name)))))
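As a quick sanity check of the procedural representation (a sketch; the particular bindings are made up for illustration):
(lookup 'x (extend (extend (empty-env) 'x 1) 'x 5))   ; => 5, the newer binding shadows the older one
(lookup 'y (extend (empty-env) 'x 5))                 ; signals the error "variable y not found"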
Exercise: Verify that extend and lookup satisfy the second equation above.
Now suppose we added some new binding constructs to LC. For instance, suppose we added the struct seq-let to our abstract syntax, and defined its behavior as follows:
(Eval (make-seq-let var rhs body) env) = (Eval body (extend env var (Eval rhs env)))
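A minimal sketch of the corresponding clause in Eval, assuming the declaration (define-structure (seq-let var rhs body)) so that the selectors are seq-let-var, seq-let-rhs, and seq-let-body:
((seq-let? M)
 (Eval (seq-let-body M)
       (extend env (seq-let-var M) (Eval (seq-let-rhs M) env))))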
However, now say we add recursive lexical bindings, represented by the abstract syntax struct rec-let. Then we want
(Eval (make-rec-let var rhs body) env) = (Eval body (extend env var (Eval rhs ...)))
where the ... represents the (extend env var ...) term. How can we implement such a construct? We clearly need a way to create an environment that refers to itself. If we represent environments as procedures, we can use recursive procedures to implement this kind of extension.
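Here is a sketch of such a clause for rec-let, using the procedural representation of environments; the selector names (rec-let-var, rec-let-rhs, rec-let-body) are assumed, and the right-hand side is re-evaluated on every lookup, which is adequate when rhs is a lambda-expression:
((rec-let? M)
 (letrec ((new-env
           (lambda (name)
             (if (eq? name (rec-let-var M))
                 (Eval (rec-let-rhs M) new-env)   ; the rhs is evaluated in the new environment itself
                 (env name)))))                   ; all other names defer to the enclosing environment
   (Eval (rec-let-body M) new-env)))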
Exercise: Can we use lists or other representations to accomplish this goal?
Hint: What did we do in Comp 210 to create data structures that refer to themselves?
Say we have functions from R to R, where R denotes the set of real numbers. We might have a function like
f(x) = 2x + 1
A fixed-point of f is a value x such that x = f(x). Here, x = 2x + 1 implies x = -1, so -1 is the fixed-point of f.
Does every function from R to R have a fixed-point? No: consider g(x) = x + 1. Substituting and reducing, we get 0 = 1, which is a contradiction. On the other hand, h(x) = x has an infinite number of fixed-points. Hence, a function from R to R could have zero, one, several, or an infinite number of fixed-points. However, we will show that for every function from environments to environments, we can construct a fixed-point.
We can use fixed-points to assign a mathematical meaning to any recursive definition expressed in a computational framework. For example, we can assign a mathematical meaning to any recursive definition of a function in terms of already defined functions.
Consider the equation
fact(n) = if zero?(n) then 1 else n * fact(n - 1)
over the natural numbers N = {0, 1, 2, ...} where the primitive functions zero?, *, -, and if-then-else have their usual meaning. The defined function
fact
obviously must satisfy the equation
fact = map n to if zero?(n) then 1 else n * fact(n - 1)
which is mathematically equivalent to being a fixed-point of the functional FACT defined by the non-recursive equation
FACT(f) = map n to if zero?(n) then 1 else n * f(n - 1)
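To make the functional concrete, here is a sketch of FACT in Scheme (the names FACT and diverge are assumed); each application of FACT turns an approximation of the factorial function into a slightly better one:
(define FACT
  (lambda (f)
    (lambda (n)
      (if (zero? n)
          1
          (* n (f (- n 1)))))))

(define diverge
  (lambda (n) (diverge n)))          ; a function that diverges on every input

((FACT diverge) 0)                    ; => 1
((FACT (FACT (FACT diverge))) 2)      ; => 2, but ((FACT diverge) 1) would not terminate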
How do we find the fixed-point of such an equation? We interpret FACT as an inductive definition of the graph of the partial function fix(FACT).
A partial function mapping N to N is any subset G of N x N such that (a,b) in G and (a,c) in G imply that b = c. A (total) function mapping N to N is a partial function F such that for every a in N, there exists b such that (a,b) in F. Computed functions correspond to partial functions rather than total functions because they may diverge on some inputs.
We use the term graph of a function to indicate that we are interpreting the function simply as a set of ordered pairs.
Let us restate the definition of the fixed-point fact of FACT as an inductive definition of the graph of a function.
Given a, b in N, the pair (a,b) belongs to the graph of fact iff either a = 0 and b = 1, or a > 0 and there is some c in N such that (a-1, c) belongs to the graph of fact and b = a * c.
All of the inductive data definitions presented in Comp 210 can be interpreted in exactly the same way. Each data definition specifies a function D mapping sets to sets. The set specified by the definition is the least set d such that D(d) = d.
How is the fixed-point of a function F mapping sets over N to sets over N constructed? By repeatedly applying the function F to the empty set. The fixed-point f of F is
Union { F^k(empty) | k = 0, 1, ... }
where F^k(empty) denotes k applications of F to empty, that is, F(F(F( ... F(empty) ... ))), and F^0(empty) = empty.
Consider the FACT example above.
FACT^1(empty) = {(0,1)}
FACT^2(empty) = {(0,1), (1,1)}
FACT^3(empty) = {(0,1), (1,1), (2,2)}
...
FACT^(N+1)(empty) = {(0,1), (1,1), (2,2), ..., (N,N!)}
...
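The same iteration can be carried out on graphs directly. The following sketch (graph-FACT is a hypothetical name) represents a graph as a list of (input . output) pairs and implements FACT as a transformer of graphs:
(define graph-FACT
  (lambda (g)
    (cons (cons 0 1)                                  ; (0,1) always belongs to FACT(G)
          (map (lambda (p)
                 (cons (+ 1 (car p))                  ; if (n-1, b) is in G,
                       (* (+ 1 (car p)) (cdr p))))    ; then (n, n*b) is in FACT(G)
               g))))

(graph-FACT '())                               ; => ((0 . 1))
(graph-FACT (graph-FACT '()))                  ; => ((0 . 1) (1 . 1))
(graph-FACT (graph-FACT (graph-FACT '())))     ; => ((0 . 1) (1 . 1) (2 . 2))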
Each additional application of FACT permits the recursive evaluation of our equation defining fact to go one level deeper. FACT^1(empty) is the graph of the function computed by the equation for fact if the first recursive call diverges; FACT^N(empty) is the graph of the function computed by the equation if the Nth recursive call (in depth) diverges.
Any recursive definition of a function from N to N can be interpreted in this way.
Consider the function FADD on sets of N defined by
FADD(S) = {s+1 | s in S}
What is its least fixed-point? The empty set! In a computational setting, every computable function mapping a data domain D (e.g., N, sets over N, functions over N) into D has a least fixed-point, but that fixed-point may be the trivial "empty" element of D. By definition, a computational data domain must be a partial order with a least element that contains the limits of infinite ascending chains of elements. Similarly, any computable function over such a domain must preserve the limits of ascending chains (continuity). It is not difficult to prove that every continuous function f on a domain D has a least fixed-point: it is the limit (least upper bound) of the ascending chain empty, f(empty), f(f(empty)), ...

Consider the recursive definition
x = x + 1
over N. What is the "least" element of N satisfying this equation? There is no conventional natural number that suffices. But the computational analog of the set N includes one additional element, called bottom, which is the meaning of a non-terminating computation. Does the equation
bottom = bottom + 1
hold? Yes! Both sides of the equation represent non-termination.