We will add an assignment operation to the language, with the syntax
(set! x M)and the abstract representation
(define-structure (setter lhs rhs))where
x
can be any lambda
-bound identifier.
set!
allows us to model changing events in the real
world.
There are two defensible models of how assignment should behave.
In languages like Scheme and Java, an assignment changes the binding
of an identifier. Conceptually, assignment to an identifier
destructively modifies
the value field of the binding for the specified identifier.
In this model, only the value of an identifer can be passed as an argument
to a procedure. Hence, assignment to a formal parameter has no
effect on the corresponding actual parameter.
In the other model, only fields of data structures can be modified.
"Assignable variables" are simply identifiers bound to
modifiable structures called "cells" or ref
s.
These structures have a single updatable field. ML supports
this model of assignment.
Typically, a single set!
enables us to create
cycles and graphs, while multiple set!
s model state in
the modeled universe.
Consider the following code fragment:
((lambda (x) ... (set! x 6) ...) 5)What does the
set!
do to x
? Initially,
x
is bound to the value 5
. But the
5
is a constant, so it can't directly be changed (since
the 5
can't be changed into a 6
). Rather,
we will need to change the value associated with x
. We
can accomplish this by altering the environment:
((setter? M) change the environment)But how do we do this?
To make variables assignable, we need to change what they stand for. They cannot be directly associated with values; rather, they must be associated with something, which is associated with the value, and which can be changed to associate with a different value. What can these somethings be? Boxes are a reasonable object to use here.
Moral: Variables must stand for boxes.
Where do we associate boxes with values? Since every use of a procedure is supposed to create a new copy of the body with all occurrences of the parameter replaced by the argument value, it is natural to create a box right after the argument is evaluated and to put the value of the argument into that box. Put technically, we change the interpretation of application as follows:
((app? M) (MApply (MEval (app-fp M) env) (box (MEval (app-ap M) env))))This in turn induces another change in the interpreter: since every variable is boxed, we now have to unbox it to get its actual value.
((var? M) (unbox (lookup (var-name M) env)))
At this point, there is no LC program we can write to distinguish this implementation of LC from the one we had before.
We are now ready to interpret a set!
expression:
((setter? M) (set-box! (lookup (setter-lhs M) env) (MEval (setter-rhs M) env)))Why is the value returned by
MEval
not itself boxed? A
set!
expression only changes the value that a variable is
associated with; it does not introduce a new variable.
Consider this program, which contains a mutation:
(let ((f (lambda (x) (set! x 5)))) (let ((y 10)) (let ((_ (f y))) y)))What is the result of this? The value of
y
,
10
, is placed in a new box when f
is applied; this new box is thrown away after the procedure body
(including the set!
) has been evaluated, so the eventual
value is that of y
, which is still 10
. This
is call-by-value: we passed the value of
y
, not the capability to change its value. Therefore, in
this language, we cannot write a swap
procedure.
Assume we want that the following expression
(let ((f (lambda (x y) (let ((t x)) (set! x y) (set! y t))))) (let ((a 5) (b 6)) (f a b) (cons a b)))evaluate to
(cons 6 5)
. Then we have to pass
references to variables a
and b
,
not their value alone. We can accomplish this with a small
change in the interpreter, which is motivated by a simple
implementation-oriented observation: when the argument expression in
an application is already a variable, it is associated with a box in
the environment. Hence, we can pass this box to the procedure and
don't need to create a new one:
((app? M) (MApply (... fp ...) (if (var? (app-ap M)) (lookup (var-name (app-ap M)) env) (...))))This new mode of parameter-passing is called call-by-reference. Pascal and Fortran IV used variants of this parameter-passing technique. While passing references enables programmers to write procedures like
swap
, it also
introduces a new phenomenon into the language: variable aliasing.
Variable aliasing occurs when two syntactially distinct
variables refer to the same mutable location in the
environment. In Scheme such a coincidence is impossible; in Pascal it
is common.
Note: a different form of aliasing, data aliasing, occurs when two distinct paths into a compound data structure refer to the same location. Both call-by-value and call-by-reference languages permit data aliasing.
SML offers one middle ground between pass-by-value and pass-by-reference. It forces a programmer to associate variables that are used for modeling cycles or state change with reference cells (or boxes). Put differently, it makes references into values and can thus turn pass-by-value into the parameter passing of references exactly when needed.
To model SML we introduce three new classes of expressions:
(ref M)
, which creates a ref cell, which is a record
with one slot that holds the value of M
;
(! M)
, which assumes that M
will
evaluate to a ref cell; and,
(:= M1 M2)
, which assumes M1
will
evaluate to a ref cell, and replaces the contents of that ref cell
with the value M2
reduces to.
In LC-SML, ref cells are a new class of values, distinct from numbers and procedures (closures). Interpreting the new expressions requires the addition of three new lines to the original interpreter of LC:
(define MEval (lambda (M env) (cond ((var? M) (lookup (var-name M) env)) ((num? M) (num-num M)) ((add? M) (+ (MEval (add-left M) env) (MEval (add-right M) right))) ((proc? M) (make-closure M env)) ((app? M) (MApply (MEval (app-rator M) env) (MEval (app-rand M) env))) ((ref? M) (box (MEval (ref-init M) env))) ((!? M) (unbox (MEval (!-expr M) env))) ((:=? M) (set-box! (MEval (:=-lhs M) env) (MEval (:=-rhs M) env)))))) (define MApply (lambda (val-of-fp val-of-arg-p) (val-of-fp val-of-arg-p)))
swap
in LC-SML.
What did we have to change here? We added one line for each new feature, but were able to leave the old code untouched. This is a very elegant design: adding the new set of features does not change the underlying interpreter for the old language. This is called orthogonality.
Shriram Krishnamurthi / shriram@cs.rice.edu