Section 14: Polymorphism |
|
|
|
In our last lecture, we introduced two key concepts: union types and recursive
types. We also introduced the datatype
mechanism, which can express
both these concepts. With datatype
, we can do away with some of
the types we had introduced into TLC earlier, such as ilist
, which
can be written (using a more compact concrete syntax that omits selector names)
as
datatype ilist = n () + c ( int , x )
Puzzle: Determine what these declarations represent:
datatype x = n () + c ( int , x ) in datatype y = ni () + ci ( int -> int , y ) in ... datatype z = nz ( int ) + cz ( int -> z ) ... |
In TLC with datatype definitions, we can define both lists of int
and lists of (int -> int)
as follows
(datatype list_int ([null_int null_int?] [cons_int cons_int? (int first_int) (list_int rest_int)] (datatype list_int->int ([null_int->int null_int->int?] [cons_int->int cons_int->int? (int->int first_int->int) (list_int->int rest_int->int)]) ...))
Given these datatype definitions and the addition of Scheme-like if
expressions to TLC, we can write a procedure that maps its argument over a list
of integers, returning a list of integers:
(define map (lambda (f (int -> int) l list_int) (if (null_int? l) null_int (cons_int (f (first_int l)) (map f (rest_int l))))))
Similarly, we can write a procedure that maps its argument over a list of functions...
(define map (lambda (f ((int -> int) -> int) l list_int->int) (if (null_int->int? l) null_int (cons_int (f (first_int->int l)) (map f (restr_int->int l))))))
Of course, these two procedures are identical except for the types specified for
the arguments. In Scheme, we have only one map
procedure that subsumes
all of these; in languages like Pascal or TLC, we need to have one for each argument
type.
If we look carefully at these declarations, we notice that only two of the
types actually matter: the type of the elements in the list argument and the
return type of the function argument. Everything else in the declaration is
implied by these two parameters. Hence, we could rewrite map
using
parameterized types as follows:
(define map (lambda (alpha) (lambda (beta) (reclet g ((beta -> alpha) -> (list beta) -> (list alpha)) (lambda (f (beta -> alpha)) (lambda (l (list beta)) (if (null_beta? l) null_alpha (cons_alpha (f (first_beta l)) ((g f) (rest_beta l)))))) g))))
In this declaration, the arguments being passed to map
are types;
hence, we cannot write this procedure in TLC. So we need another extension to
the type language to handle these kinds of abstractions. We call this new language
XPolyLC.
Lambda
,
which is similar to lambda
except that it binds types,
not values, to identifiers. Using Lambda
, we can write map
as
(Lambda (alpha Omega) (Lambda (beta Omega) (reclet g ((beta -> alpha) -> (list beta) -> (list alpha)) (lambda (f (beta -> alpha)) (lambda (l (list beta)) (if (null_beta? l) null_alpha (cons_alpha (f (first_beta l)) ((g f) (rest_beta l)))))))))
Lambda
refers to abstraction over types rather
than abstraction over data values of some type. Recall that in TLC, we have
to annotate each binding occurrence with its type. But what type can we assign
to types themselves? We use Omega
, which represents the set of
all types. Since the ``type'' of all arguments bound by Lambda
will always be Omega
, we shall henceforth leave this declaration
implicit. This is the introduction rule for type abstractions.map
:(define map (Lambda (alpha) (Lambda (beta) (lambda (f (beta -> alpha)) (lambda (l (list beta)) (if (null_beta? l) null_alpha (cons_alpha (f (first_beta l)) ((map f) (rest_beta l)))))))))
map
, we need to pass to
pass the types first to instantiate map
. However, we do not have
any facilities for passing types as arguments, since they are not values.Tapp
(for
``type application''). Tapp
invokes an objected created with
Lambda
with a type as an argument. Hence, we would write the
recursive call to map
above as
(((Tapp (Tapp map alpha) beta) f) (rest_beta l))with the rest of the code remaining unchanged.
In TLC, and indeed in many programming languages, a programmer provides type annotations at binding instances of variables, but leaves the type of individual expressions to be inferred (computed) and checked by the language implementation. In XPolyLC, we have added two new kinds of expressions, type introductions and eliminations, so we examine what the types of these expressions are.
(lambda (l (list beta)) ...)
expression
in either definition of map
?
((list beta) -> (list alpha))
.
(lambda (f (beta -> alpha)) ...)
expression
in either definition of map
?
((beta -> alpha) ->
type of (1))
.
(Lambda (beta) ...)
in the latter definition
of map
?
The type is (beta ->
type of (2))
, for
all possible values of beta
. Hence, we write this
as (forall beta .
(2))
.
(Lambda (alpha) ...)
?
This has type (alpha ->
type of (3))
,
for all possible values of alpha
. We write this type as (forall
alpha .
(3))
.
In reality, we need to perform a type application to instantiate each primitive as well. Hence, the resulting code looks like
(define map (Lambda (alpha) (Lambda (beta) (lambda (f (beta -> alpha)) (lambda (l (list beta)) (if ([Tapp null? beta] l) [Tapp null alpha] ([Tapp cons alpha] (f ([Tapp first beta] l)) (([Tapp [Tapp map alpha] beta] f) ([Tapp rest beta] l)))))))))
Hence, alpha
and beta
range (abstract) over types
in the same manner that lambda
-bound identifiers range (abstract)
over values. This property of a type system, wherein ``type variables'' are
used to abstract over types, is called polymorphism. This new form
of types is called a type schema, since it represents a framework which
can be instantiated with several different types. This kind of type system is
said to be explicitly polymorphic, since the programmer is required
to manually specify each instance of type abstraction, and to perform the appropriate
type applications.
While explicit polymorphism adds considerable power to a programming language, it is notationally clumsy because all type instantiations (applications) must be explicitly performed in the code. (Generic Java requires explicit instantiation of polymorphic (generic) classes but not of polymorphic methods.) To eliminate this problem, we allow the programmer the following freedom: every type declaration required in TLC can be left blank. We then design typing rules that will attempt to deduce an appropriate type from context such that the resulting expression will pass the type checker.
Consider the following example. If we have
(lambda (x _) x)
what can we write in the blank? We could certainly use int
; likewise,
we could use bool
. In fact, any well-formed type we write in the
blank will satisfy the type checker.
Now consider a typing judgement for let
:
Tenv |- Exp : t Tenv + [f : t] |- Bexp : type ------------------------------------------------ Tenv |- (let f t Exp Bexp) : type
Say the Exp
is (lambda (x _) x)
. Then how many ways
are there to type it and use it in the body? We can only use whatever was put
in the place of _
, so the resulting identity function can only be
used on objects of the one type t
above. For instance, it could not
be used to type
(let ((id (lambda (x _) x))) (if (id true) (id 5) (id 6)))
since the closure is applied to both booleans and integers. Hence, this is not
a convenient typing rule. Instead, the SML programming language uses a modified
judgement for let
:
Tenv |- Bexp [f / Exp] : type --------------------------------- Tenv |- (let f _ Exp Bexp) : type
This rule literally copies the Exp
expression through Bexp
;
when applied to the code fragment we examined earlier, we end up typing the expression
(if ((lambda (x _) x) true) ((lambda (x _) x) 5) ((lambda (x _) x) 6))
As can be seen, everything -- including the programmer's demand to deduce a type -- is copied by this rule. Hence, an independent deduction can be made at each location.
The last rule is easy to implement, though it can be expensive in terms of
the time taken for the typing phase. But, how do we implement type deduction,
especially since there is an unbounded number of types? The answer is that we
create a new type, assuming no properties for it, and based on the
constraints derived from typing the body M
, we derive the "deduced"
type.
In summary, polymorphism is the ability to abstract over types. Implicit
polymorphism is the ability of a type system to assign many different types
to the same phrase without the need for explicit type declarations. Several
languages, including SML, Haskell and Clean, provide type inference ("computing
the most general possible type for each expression") with implicit polymorphism
for let
(and possibly other binding constructs).
Exercise: What is the connexion between explicit and implicit polymorphism? |