In our last lecture, we introduced two key concepts: union types and
recursive types. We also introduced the datatype
mechanism, which can express both these concepts. With
datatype
, we can do away with some of the types we had
introduced into TLC earlier, such as ilist
, which can be
written (using a more compact concrete syntax that omits selector names) as
datatype ilist = n () + c ( int , x )
datatype x = n () + c ( int , x ) in datatype y = ni () + ci ( int -> int , y ) in ... datatype z = nz ( int ) + cz ( int -> z ) ...
In TLC with datatype definitions, we can define both lists of
int
and lists of (int -> int)
as follows
(datatype list_int ([null_int null_int?] [cons_int cons_int? (int first_int) (list_int rest_int)] (datatype list_int->int ([null_int->int null_int->int?] [cons_int->int cons_int->int? (int->int first_int->int) (list_int->int rest_int->int)]) ...))Given these datatype definitions and the addition of Scheme-like
if
expressions
to TLC, we can write a
procedure that maps its argument over a list of integers, returning a
list of integers:
(define map (lambda (f (int -> int) l list_int) (if (null_int? l) null_int (cons_int (f (first_int l)) (map f (rest_int l))))))Similarly, we can write a procedure that maps its argument over a list of functions ...
(define map (lambda (f ((int -> int) -> int) l list_int->int) (if (null_int->int? l) null_int (cons_int (f (first_int->int l)) (map f (restr_int->int l))))))Of course, these two procedures are identical except for the types specified for the arguments. In Scheme, we have only one
map
procedure that subsumes all of these; in languages
like Pascal or TLC, we need to have one for each argument type.
If we look carefully at these declarations, we notice that only two of
the types actually matter: the type of the elements in the list
argument and the return type of the function argument. Everything
else in the declaration is implied by these two parameters. Hence,
we could rewrite map
using parameterized types as follows:
(define map (lambda (alpha) (lambda (beta) (reclet g ((beta -> alpha) -> (list beta) -> (list alpha)) (lambda (f (beta -> alpha)) (lambda (l (list beta)) (if (null_beta? l) null_alpha (cons_alpha (f (first_beta l)) ((g f) (rest_beta l)))))) g))))In this declaration, the arguments being passed to
map
are types; hence, we cannot write this procedure in
TLC. So we need another extension to the type language to handle
these kinds of abstractions. We call this new language XPolyLC.
Lambda
, which is similar to lambda
except
that it binds types, not values, to identifiers. Using
Lambda
, we can write map
as
(Lambda (alpha Omega) (Lambda (beta Omega) (reclet g ((beta -> alpha) -> (list beta) -> (list alpha)) (lambda (f (beta -> alpha)) (lambda (l (list beta)) (if (null_beta? l) null_alpha (cons_alpha (f (first_beta l)) ((g f) (rest_beta l)))))))))where the keyword
Lambda
refers to abstraction over types
rather than abstraction over data values of some type.
Recall that in TLC, we have to annotate each binding occurrence with
its type. But what type can we assign to types themselves? We use
Omega
, which represents the set of all types. Since the
``type'' of all arguments bound by Lambda
will always be
Omega
, we shall henceforth leave this declaration
implicit. This is the introduction rule for type
abstractions.
Now consider this definition of map
:
(define map (Lambda (alpha) (Lambda (beta) (lambda (f (beta -> alpha)) (lambda (l (list beta)) (if (null_beta? l) null_alpha (cons_alpha (f (first_beta l)) ((map f) (rest_beta l)))))))))When we perform the recursive call to
map
, we need to
pass to pass the types first to instantiate map
.
However, we do not have any facilities for passing types as arguments,
since they are not values.
Tapp
(for ``type application''). Tapp
invokes an objected created with Lambda
with a type as an
argument. Hence, we would write the recursive call to
map
above as
(((Tapp (Tapp map alpha) beta) f) (rest_beta l))with the rest of the code remaining unchanged.
In TLC, and indeed in many programming languages, a programmer provides type annotations at binding instances of variables, but leaves the type of individual expressions to be inferred (computed) and checked by the language implementation. In XPolyLC, we have added two new kinds of expressions, type introductions and eliminations, so we examine what the types of these expressions are.
(lambda (l (list beta)) ...)
expression in either definition of map
?
((list beta) -> (list alpha))
.
(lambda (f (beta -> alpha)) ...)
expression in either definition of map
?
((beta -> alpha) ->
type of (1))
.
(Lambda (beta) ...)
in the latter
definition of map
?
The type is (beta ->
type of (2))
, for
all possible values of beta
. Hence, we write
this as (forall beta .
(2))
.
(Lambda (alpha) ...)
?
This has type (alpha ->
type of (3))
, for
all possible values of alpha
. We write this type
as (forall alpha .
(3))
.
(define map (Lambda (alpha) (Lambda (beta) (lambda (f (beta -> alpha)) (lambda (l (list beta)) (if ([Tapp null? beta] l) [Tapp null alpha] ([Tapp cons alpha] (f ([Tapp first beta] l)) (([Tapp [Tapp map alpha] beta] f) ([Tapp rest beta] l)))))))))
Hence, alpha
and beta
range (abstract) over
types in the same manner that lambda
-bound identifiers
range (abstract) over values. This property of a type system, wherein
``type variables'' are used to abstract over types, is called
polymorphism. This new form of types is called a type
schema, since it represents a framework which can be
instantiated with several different types. This kind of type system
is said to be explicitly polymorphic, since the programmer is
required to manually specify each instance of type abstraction, and to
perform the appropriate type applications.
While explicit polymorphism adds considerable power to a programming language, it is notationally clumsy because all type instantiations (applications) must be explicitly performed in the code. (Generic Java requires explicit instantiation of polymorphic (generic) classes but not of polymorphic methods.) To eliminate this problem, we allow the programmer the following freedom: every type declaration required in TLC can be left blank. We then design typing rules that will attempt to deduce an appropriate type from context such that the resulting expression will pass the type checker.
Consider the following example. If we have
(lambda (x _) x)what can we write in the blank? We could certainly use
int
; likewise, we could use bool
. In fact,
any well-formed type we write in the blank will satisfy the type
checker.
Now consider a typing judgement for let
:
Tenv |- Exp : t Tenv + [f : t] |- Bexp : type ------------------------------------------------ Tenv |- (let f t Exp Bexp) : typeSay the
Exp
is (lambda (x _) x)
. Then how
many ways are there to type it and use it in the body? We can only
use whatever was put in the place of _
, so the resulting
identity function can only be used on objects of the one type
t
above. For instance, it could not be used to type
(let ((id (lambda (x _) x))) (if (id true) (id 5) (id 6)))since the closure is applied to both booleans and integers. Hence, this is not a convenient typing rule. Instead, the SML programming language uses a modified judgement for
let
:
Tenv |- Bexp [f / Exp] : type --------------------------------- Tenv |- (let f _ Exp Bexp) : typeThis rule literally copies the
Exp
expression through
Bexp
; when applied to the code fragment we examined
earlier, we end up typing the expression
(if ((lambda (x _) x) true) ((lambda (x _) x) 5) ((lambda (x _) x) 6))As can be seen, everything -- including the programmer's demand to deduce a type -- is copied by this rule. Hence, an independent deduction can be made at each location.
The last rule is easy to implement, though it can be expensive in
terms of the time taken for the typing phase. But, how do we
implement type deduction, especially since there is an unbounded number of
types? The answer is that we create a new type, assuming no
properties for it, and based on the constraints derived from typing
the body M
, we derive the "deduced" type.
In summary, polymorphism is the ability to abstract over types.
Implicit polymorphism is the ability of a type system to
assign many different types to the same phrase without the need for
explicit type declarations. Several languages, including SML, Haskell
and Clean, provide type inference ("computing the most general possible
type for each expression") with implicit
polymorphism for let
(and possibly other binding
constructs).
cork@cs.rice.edu (adapted from shriram@cs.rice.edu)