We will henceforth refer to our typed version of LC as TLC. TLC is an extremely ``boring'' programming language for the following reason: every program in TLC terminates. For example, consider the canonical infinite loop
((lambda x (x x)) (lambda x (x x)))which we write in TLC as
((lambda x t (x x)) ...)where
t
is the type we assign the parameter. Since
x
is applied in the body of the procedure, it must have
some type t = u -> v
. But the argument to x
is x
itself, so the type u
must be whatever
t
is. Hence, we need a type t = t -> v
to
be able to type this application. You should be able to convince
yourself that we cannot construct such a type from our inductive
definition of types for TLC.
We cannot type this infinite loop. Indeed, a stronger property holds:
Every program in TLC terminates. The following theorem holds
for TLC: For all programs M
, if M
is of
type t
, then M
evaluates to a value of type
t
. Since programs are just closed expressions, a
similar theorem holds for all (closed) sub-expressions of programs.
Just as we cannot express infinite loops, we also cannot introduce other constructions such as pairing without explicitly adding type judgements for them to the language.
Thus, we see that types have imposed a tremendous restriction on our ability to express computations. Of course, this is the purpose of type systems: to restrict the expressivity of a language so that programmers cannot easily shoot themselves in the foot.
Let us consider some extensions we can add to TLC and examine their associated typing rules.
We extend TLC with terms of the form
(rec-let x tx Vx B)that have the following type judgement:
tenv + [x : tx] |- Vx : tx tenv + [x : tx] |- B : t ------------------------------------------------------ tenv |- (reclet x tx Vx B) : t
We can add the constructor cons
, which has type t s
-> t x s
(read: ``cons
takes two arguments, of
type t
and s
respectively, and returns an
object of which is a Cartesian product, whose first projection has
type t
and the second s
''). Note that
juxtaposition is used for arguments to a function, while the
x
operator represents a type constructed from two
component types.
In adding cons
, we have added a new constructor to our
language of types:
t ::= ... | (t x t)
We can then define car
and cdr
to represent
the projection functions:
|- car : t x s -> t |- cdr : t x s -> sHowever, note that any argument list must still be a tuple whose length is pre-determined and hence cannot be arbitrary. Thus, this
cons
cannot be used to create lists.
To circumvent this, we can add lists explicitly to the language.
First, we must again extend our type syntax to represent the list
type. For simplicity, we shall assume that lists can only contain
integers. Then the type of an integer list is ilist
:
t ::= ... | ilistWe then specify the types of the following constants and primitives:
null : ilist cons : int ilist -> ilist car : ilist -> int cdr : ilist -> ilist null? : ilist -> bool cons? : ilist -> bool
We should consider whether these additions to TLC change the property we stated earlier, ie, that all TLC programs terminate. If we add explicit recursion to our language, we can write programs such as the following:
(reclet f (int -> int) (lambda x int (f x)) (f 0))This program will not terminate, which is contrary to the original statement.Now the Central Theorem of Typed Languages is: For all programs, if
M
is of type t
and if
M
evaluates to a value V
, then
V
has the type t
.
The addition of pairs to TLC does not change the above statement of
the Central Theorem of Typed Languages. However, consider the
addition of lists. What happens if we evaluate the expression
(car null)
? According to the typing rules,
null
is of type ilist
, so the application is
well-typed; however, the result of this expression cannot be any
meaningful integer. What should the result be?
This leads us to the following dictum: Types are Lies.
There are several approaches we can take to resolving this situation. Some of these are:
(car null)
and (/ 1 0)
diverge.
In a safe implementation, the behavior upon encountering each error is clearly specified. In contrast, an unsafe implementation would depend upon the meta-level error processing. For instance, in C, if an element beyond the end of an array is accessed, the behavior is unspecified: it might yield an error (at the meta-level), or it might blithely return some value. In constrast, a safe language like ML clearly specifies what action to take: for instance, it might contain rules such as
(car null) : int ==> (raise IListEmpty) : int
What we see here is that types alone do not guarantee anything about a
language. We also need to have a safe implementation of the
programming language to enjoy the guarantees promised by a Type
Theorem. However, even if a language is typed and safe, a programmer
must think about those phrases in his programs that may raise
pre-defined run-time exceptions (such as (car null)
or
(/ 1 0)
). In the future, programming environments like
DrScheme will help the programmer reason about such problems. In the
PL research literature, there are analysis methods (many developed
here at Rice) similar to type-checking that prove the overwhelming
majority of these run-time checks can never abort (and hence, can be
eliminated). In addition, one form of analysis (called set-based
analysis) provides an explanation for each "retained" check showing
why the analysis could not eliminate it.