For the moment, let us confine ourselves to monolithic programming languages: languages that can only express partial functions mapping the natural numbers N (bitstrings) to the natural numbers. (We could restrict Jam to such a form, for example, by omitting all operations involving lists and either banning functions as arguments or prohibiting them in top-level function definitions.)
Consider any such programming language P. To qualify as a legitimate programming language, its programs must have a "surface syntax" representation consisting of sequences of symbols over a finite alphabet A. Can P express all total functions mapping N to N? The answer is no. We will present two proofs of this fact.

The first proof is a simple counting argument. How many different total functions are there mapping N to N? An infinite number. How many programs are there in P? An infinite number. (We will assume programs in P can be arbitrarily long; otherwise, it is obvious that there aren't enough programs to compute all the total functions on the natural numbers.) But infinite sets are not necessarily comparable in size. You are presumably aware that there are far more real numbers than rationals (or integers), while the set of rational numbers and the set of integers have the same size (cardinality). Two infinite sets A and B have the same size if they can be put into one-to-one correspondence, i.e., if there exists a function f:A->B that is one-to-one (for all x,y in A [f(x)=f(y) -> x=y]) and onto (for every z in B, there exists x in A such that f(x)=z).
Exercise Show that the rationals and the natural numbers can be put into one-to-one correspondence. Hint: depict the set of rationals as a two-dimensional table of unbounded width and height and traverse the table along each finite-length diagonal.
It is easy to show that the set of all programs in P has the same cardinality (the same size) as the set of natural numbers. Simply sort the programs in P by length and then alphabetize the programs of each length (using any ordering you choose on the alphabet A for P). Such an enumeration of programs places them in one-to-one correspondence with the natural numbers N.
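To make the enumeration concrete, here is a minimal sketch in Scheme. It treats programs as arbitrary strings (lists of symbols) over a small alphabet; the helper name nth-string and the two-symbol alphabet are illustrative assumptions, not part of the course code.

(define (nth-string n alphabet)
  ;; maps each natural number to a distinct string over `alphabet',
  ;; ordered by length and then alphabetically, hitting every string exactly once
  (let ((b (length alphabet)))
    (if (zero? n)
        '()                                   ; index 0 is the empty string
        (let ((m (- n 1)))
          (append (nth-string (quotient m b) alphabet)
                  (list (list-ref alphabet (remainder m b))))))))

(map (lambda (n) (nth-string n '(a b))) '(0 1 2 3 4 5 6 7))
;; => (() (a) (b) (a a) (a b) (b a) (b b) (a a a))

Reading the output back gives the other direction of the correspondence, so the strings over a finite alphabet, and hence the programs of P, are countable.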
Now consider how many functions there are mapping N to N. We know that this set must be at least as big as the set of characteristic functions for all subsets of the natural numbers (the power set of N). The characteristic function fS for a subset S is defined by the equation
fS(n) = if n in S then 1 else 0

One of the most basic results of set theory is the fact that the power set of N is larger than N. It is not difficult to prove this fact by a diagonalization argument. We will present the proof for the natural numbers N, but it easily generalizes to any infinite set S. Assume that the power set Pow(N) can be put into one-to-one correspondence with the natural numbers. Let S0, S1, S2, ... be an enumeration of Pow(N) that establishes that correspondence. Then define S' as {i | i not in Si}. In other words, natural number i belongs to S' iff it does not belong to Si. Since S' is a member of Pow(N), it must appear in the enumeration S0, S1, ... somewhere, say as Sj. Is j a member of S' = Sj? Assume that j is an element of Sj. Then by the definition of S', j is not a member of S' = Sj, yielding a contradiction. Similarly, if j is not a member of S' = Sj, then by the definition of S', j is a member of S', again yielding a contradiction. Hence, no such enumeration S0, S1, ... can exist.

Now we have all of the machinery at hand required to prove that there are more total functions mapping N to N than programs in P. The number of such functions is at least as large as the size of the power set of N, which is strictly larger than N. But the set of programs in P is countable, i.e., it has the same size as the set of natural numbers N. Hence there are more total functions than programs.
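The diagonal construction can be written down directly in Scheme if we represent a subset of N by its characteristic predicate and a claimed enumeration of subsets by a function from indices to predicates. The names enum and diagonal below, and the particular toy enumeration, are illustrative assumptions; the point is only that the diagonal set disagrees with every enumerated set.

;; a toy "enumeration" of subsets of N: enum j is the set of multiples of j+1
(define (enum j)
  (lambda (i) (zero? (remainder i (+ j 1)))))

;; the diagonal set S' = { i | i not in Si }
(define (diagonal enum)
  (lambda (i) (not ((enum i) i))))

;; S' disagrees with Sj at input j, so no enumeration can contain it:
((enum 3) 3)         ; => #f  (3 is not a multiple of 4)
((diagonal enum) 3)  ; => #t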
The second proof applies the same diagonalization idea directly to programs. Let P0, P1, P2, ... be the enumeration of the programs in P described above, and picture the infinite table whose entry in row i and column j is the result (if any) of running Pi on input j:

        0      1      2      3    ...
  P0  P0(0)  P0(1)  P0(2)  P0(3)  ...
  P1  P1(0)  P1(1)  P1(2)  P1(3)  ...
  .
  .
  .

Define P' as the function that, on input i, returns Pi(i)+1 if Pi(i) terminates and 0 if Pi(i) diverges. (If we identify 0 with false and non-zero values with true, then P' reports whether or not the computation Pi(i) converges.) Can P' be computed by some program Pj in our enumeration of all programs? It is easy to prove by contradiction that the answer is no. Assume j exists such that Pj = P'. Since P' terminates for all inputs, the computation P'(j) terminates. Hence, the computation Pj(j) terminates, implying that P'(j) = Pj(j)+1 = P'(j)+1, a contradiction. Hence, P' is not a computable function.
Slogan: Computation is successive approximation
Let us explore what properties domains (sets of data values) and functions on those domains must have to support recursive definitions. In the detour in Lecture 6, we informally showed how to construct the fixed-point of a function mapping sets to sets. In this lecture, we study this issue more rigorously in a general setting. Why does this question matter? It shows the boundaries of what can be expressed computationally. Much of mathematics lies outside the reach of computation.
We begin by reviewing a standard definition from discrete mathematics.
A partial order is a set D together with a relation <= (which we will read as "approximates") on D (a subset of D x D) with the following three properties: reflexivity (d <= d for every d in D), antisymmetry (d <= e and e <= d imply d = e), and transitivity (d <= e and e <= f imply d <= f).
Some examples include:
A partial order does not have enough structure to support the general solution of fixed-point equations. We need to add two restrictions: chain-completeness and consistent-completeness. A chain C in a partial order (D,<=) is a (possibly empty) denumerable sequence c1, c2, ..., cn, ... such that

c1 <= c2 <= ... <= cn <= ...

An upper bound for a chain C in (D,<=) is an element d in D such that ci <= d for all ci in C. A partial order (D,<=) is chain-complete (or simply complete) iff every chain C has a least upper bound. (An upper bound c* for C is least if c* approximates every upper bound for C.) We typically abbreviate the term complete partial order as cpo. Note that a chain can be empty, which implies that every cpo has a least element (the least upper bound of the empty chain). We call this least element "bottom" and typically denote it bot. As we will see, bot corresponds to divergence.
Of the three examples of partial orders given above, only the first one is complete. In the partial order of finite alphabetic strings, a chain of the form
"a" <= "aa" <= "aaa" <= ...has no upper bound, much less a least upper bound. Similarly, in the partial order (N,<=), the chain
0 <= 1 <= 2 <= ...has no upper bound. In essence, these spaces are missing the "limit" points of chains of approximations.
The notion of cpo described above is still too broad to characterize potential domains for incremental computation. The problem is that such domains do not necessarily have enough elements to summarize all possible finite snapshots of the results of computations. Consider the following cpo (nicknamed "pentagon" or "home-plate"):
  o       o
  |\     /|
  | \   / |
  |  \ /  |
  |   X   |
  |  / \  |
  | /   \ |
  |/     \|
A o       o B
   \     /
    \   /
     \ /
      o

where each circle ("o") represents a domain element. The two elements A and B are labeled because they have an upper bound but no least upper bound. A computation yielding a result in this domain could describe an answer by outputting both A and B, since these elements are "consistent", i.e., they have an upper bound. But what element of the domain would be the meaning of such a computation if it produces no additional output? There are two distinct upper bounds for {A,B}. We can rule out this problem by forcing the domain D to include least upper bounds for all bounded subsets (subsets such that every member approximates some particular element of D). In essence, this property asserts that given any set of approximations to the answer to a computation, there is a unique element of the domain that summarizes all of this approximation information. The pentagon domain fails this test because the set {A,B} is bounded but has no least upper bound.
We formalize this property as consistent completeness. A cpo (D,<=) is consistently complete iff every bounded subset S of D has a least upper bound. A subset S of D is bounded iff there exists d in D such that for all s in S, s <= d.
So we assert that a domain for incremental computation is a consistently complete cpo. In fact, we will use the term domain as a synonym for consistently complete cpo.
A function f mapping one domain to another is monotonic iff x <= y implies f(x) <= f(y); every computable function is monotonic.

Example Consider a function f on lazy streams of integers that maps lazy-cons(1,bottom) to 1, lazy-cons(1,2) to 0, and everything else to bottom. This function is not monotonic: bottom (the meaning of an infinite loop) approximates 2, yet f(lazy-cons(1,loop())) = 1 does not approximate f(lazy-cons(1,2)) = 0.
This property asserts that as a function gathers more information about its input, it cannot withdraw information about its output that it has already produced!
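To make the example concrete, here is a minimal sketch that encodes lazy streams with Scheme promises; lazy-cons, lazy-car, lazy-cdr and loop are illustrative definitions for this sketch, not the actual primitives of the course language.

(define-syntax lazy-cons            ; delay the tail so it is not evaluated
  (syntax-rules ()
    ((_ hd tl) (cons hd (delay tl)))))
(define (lazy-car s) (car s))
(define (lazy-cdr s) (force (cdr s)))

(define (loop) (loop))              ; a divergent computation: its value is bottom

(define s1 (lazy-cons 1 (loop)))    ; lazy-cons(1,bottom) -- constructing it is fine
(define s2 (lazy-cons 1 2))         ; lazy-cons(1,2)

(lazy-car s1)                       ; => 1
;; (lazy-cdr s1)                    ; forcing the tail of s1 diverges

Any procedure that returns 1 on s1 must not touch the tail, while any procedure that returns 0 on s2 must inspect the tail; since inspecting the tail of s1 diverges, no program can exhibit both behaviors, which is the computational content of the monotonicity requirement.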
Monotonicity is not the only topological constraint on incrementally computable functions. Consider a function f on infinite streams of integers that maps every finite approximation

lazy-cons(n1, lazy-cons(n2, ... lazy-cons(nk, bottom) ...))

to bottom, but maps every infinite stream

lazy-cons(n1, lazy-cons(n2, ... lazy-cons(nk, ...) ...))

to true. A computable function cannot produce this behavior because it can never determine that its input is infinite; all it sees are progressively better finite approximations. For this reason, computable functions must be continuous: given any chain of approximations a0 <= a1 <= ...,

f(lub{ai | i = 0,1, ...}) = lub{ f(ai) | i = 0,1, ...}.

In other words, the behavior of a continuous function on an infinite element is simply its cumulative behavior on the finite approximations to the infinite element.
Example Consider a function f on lazy streams of integers that maps every element except

zeroes = lazy-cons(0, lazy-cons(0, ... (lazy-cons(0, ...))))

(the infinite stream of zeroes) to bottom and maps zeroes to null. This function is clearly monotonic, since it differs from the everywhere-bottom function only on a maximal element (zeroes). Yet it is not continuous, because the function is bottom on every element of the chain of finite approximations to zeroes (finite streams of zeroes ending in bottom). By the definition of continuity, the function would have to yield bottom on zeroes to be continuous at zeroes, but it doesn't.
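Here is a minimal sketch of the continuity intuition, reusing the same promise-based encoding of lazy streams as above (again, the helper names are illustrative assumptions).

(define-syntax lazy-cons
  (syntax-rules ()
    ((_ hd tl) (cons hd (delay tl)))))
(define (lazy-car s) (car s))
(define (lazy-cdr s) (force (cdr s)))

(define zeroes                       ; the infinite stream of zeroes
  (letrec ((z (lazy-cons 0 z))) z))

(define (stream-take s k)            ; a continuous function: it inspects only
  (if (zero? k)                      ; a finite prefix of its input
      '()
      (cons (lazy-car s) (stream-take (lazy-cdr s) (- k 1)))))

(stream-take zeroes 3)               ; => (0 0 0)

stream-take's answer on zeroes is already determined by a finite approximation to zeroes, exactly as continuity demands; the function in the example above, by contrast, would have to inspect infinitely many elements before answering null, which no program can do.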
The everywhere undefined function (the empty set of argument/result pairs) obviously approximates all other functions. Numbers and functions are incomparable. The special element bot approximates all functions as well as all numbers. Why? Because an attempt to define a function can diverge, which is different from defining the everywhere-divergent function. (The latter produces a closure, which can be identified using the function? primitive; it diverges only when it is applied to an argument.)
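The distinction can be seen directly in Scheme (using procedure? where the course language uses function?); this is just an illustrative sketch.

;; The everywhere-divergent *function* is a perfectly good value: evaluating
;; the lambda-expression produces a closure without ever calling it.
(define diverge (lambda (x) (diverge x)))
(procedure? diverge)     ; => #t   (cf. the function? primitive)
;; (diverge 0)           ; applying it to any argument diverges

;; By contrast, an attempt at a definition whose right-hand side itself loops,
;; such as (define oops (let f () (f))), never produces a value at all;
;; that outcome is bot, which approximates numbers and functions alike.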
Consider the function definition written in Scheme:
(define fact
  (lambda (n)
    (if (zero? n) 1 (* n (fact (sub1 n))))))

What meaning in the space of strict partial functions mapping integers to integers should we assign to fact? We need to find a function fact in this space such that

FACT(fact) = fact

where FACT is a functional corresponding to the definition above: it maps an estimate for the meaning of fact into a better estimate:
(define FACT
  (lambda (fact)
    (lambda (n)
      (if (zero? n) 1 (* n (fact (sub1 n)))))))
Given the empty function, FACT constructs the partial function {(0,1)} (the function that diverges everywhere except for returning 1 when the input is 0).
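Here is a sketch of the first few iterates, using the FACT functional just defined; the name empty below is an illustrative stand-in for the everywhere-divergent partial function (it is never actually called in the sample evaluations).

(define empty (lambda (n) (empty n)))   ; diverges on every input: bot everywhere

(define fact0 (FACT empty))             ; defined only on 0:      {(0,1)}
(define fact1 (FACT fact0))             ; defined on 0 and 1:     {(0,1),(1,1)}
(define fact2 (FACT fact1))             ; defined on 0, 1 and 2:  {(0,1),(1,1),(2,2)}

(fact0 0)    ; => 1
(fact2 2)    ; => 2
;; (fact0 1) ; diverges -- it falls through to empty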
All computable functions, including FACT, are monotonic: if x <= y, then FACT(x) approximates FACT(y). Given this property and the property that FACT is continuous (all computable functions are continuous), we can deduce (by repeatedly appealing to the monotonicity of FACT):

empty <= FACT(empty) <= FACT(FACT(empty)) <= ... <= FACT(... FACT(empty) ...) <= ...

The preceding sequence is clearly a chain. Hence, it must have a least upper bound, which we write fix(FACT) because it is the least fixed-point of the functional FACT. (Continuity guarantees that applying FACT to this least upper bound reproduces it, so it really is a fixed point.) This function is the meaning of fact determined by the recursive definition above.
In general, every continuous (hence every computable) function mapping a cpo D to D has a least fixed-point. Hence, we can interpret any recursive definition as the least fixed-point of the corresponding functional.
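As a sketch of how this plays out in Scheme, the least fixed-point of a functional like FACT can be unfolded on demand; the definition of fix below is illustrative (and eta-expanded, because Scheme is call-by-value), not a course-supplied primitive.

(define (fix F)
  (lambda (n) ((F (fix F)) n)))   ; unfold F one more step each time the result is applied

(define fact (fix FACT))          ; FACT is the functional defined earlier
(fact 5)                          ; => 120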
For example, giving a meaning to a language like LC requires a domain D satisfying the equation

D = Zbot + (D -> D)

where Zbot is the integers augmented by bot, + means the standard union operation except that the bottom elements are coalesced, and (D -> D) means the set of all strict continuous functions mapping D into D. (A function f mapping the cpos D1, D2, ..., Dn into the cpo D0 is strict iff f(d1,d2,...,dn) = bot whenever some di = bot.)
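The equation just says that a value of D is either a (possibly undefined) integer or a function that itself consumes and produces such values, which is the everyday situation in an untyped language. A small illustrative Scheme fragment (twice and inc are made-up names for this sketch):

(define (twice f) (lambda (x) (f (f x))))   ; a function that consumes and produces functions
(define (inc x) (+ x 1))

((twice inc) 3)            ; => 5   (a function applied to a number)
(((twice twice) inc) 3)    ; => 7   (a function applied to a function)
(procedure? twice)         ; => #t  (cf. the function? primitive)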
This equation can be interpreted as a recursive definition of the domain D. We can solve this equation by constructing a functional on domains and taking its least fixed-point. But a lot of technical machinery must be developed. First we have to show that the set of domains forms a cpo. Then we have to show that the + and -> operations (functions) on domains are continuous.
This deeper inquiry into the semantics of LC is the subject matter
of Comp 511.
cork@cs.rice.edu (adapted from shriram@cs.rice.edu)