Programming
Assignment 6: |
|
Preparation Create a programs/6 directory within your comp411 directory. All of your files for this assignment should be stored in the programs/6 subdirectory.
Teamwork You are expected to do this assignment and all other programming assignments using pair programming. When you submit your assignment, indicate the composition of your team (names, student ids, and email addresses) in your README file.
If you cannot find a partner and would like one, post a message to our Piazza page and we will try to find one for you. Teams of more than two members are not permitted.
Task Your assignment is to transform an untyped, call-by-value, eager dialect of Jam to continuation passing style (CPS) and use these transformations to support the addtion of the letcc construct. The letcc construct is formally defined below. We are providing a Jam interpreter for the Jam dialect that you will extend. It supports two different forms of "let" binding: unrestricted ordinary let as in Assignment 2 and "letrec" (recursive "let") as in Assignment 3, except that the right hand sides of the "let" bindings are restricted to map constructions. The provided parser supports both of these "let" constructs including the "map" restriction on letrec The assignment consists of three separate phases: (i) modifying the parser to consistently rename variables to eliminate the shadowing of variables by nested declarations (map parameters, let bindings, letrec bindings) with the same names [already done in the support code as discussed below], (ii) transforming Jam programs to CPS, leaving the interpreter from phase (i) unchanged; and (iii) supporting the letcc construct in your interpreter by using the CPS transformation to eliminate letcc from Jam programs. The interpreter provided in our support code correctly parses letcc but aborts with an error if you try to execute it. The CPS transformation augmented by rule given below transforms Jam programs with letcc to equivalent, more complex Jam programs without letcc. The support code for checking Jam syntax already renames variables to achieve task (i) above. The solution code for old Assignment 6 included this step as part of syntax checking and removing it would complicate our code base while adding to the routine software engineering burden for this assignment. So you really only have to do parts (ii) and (iii) above, but you need to understand what the provided syntax checker does, including renaming.
This is a longer, more challenging assignment than the previous assignments (except for the extra credit version of Assignment 5) and hence is worth 150 points instead of 100 points.
Background The Jam language for this assignment is identical to the call-by-value, eager language used in Assignment 4 with the following tweak. In Assignment 6 Jam, there are two "let" constructs: (i) letrec, which is recursive let with right hand sides limited to map constructions, and (ii) let, which is ordinary non-recursive let abbreviating the application of a map to the right hand sides. These modifications to the Jam language simplify the definition of the CPS transformation. This "tweak" revises the definition of the language syntax as follows:
Expressions
Exp ::= ... | let Def+ in Exp | letrec MapDef+ in Exp | Map | ... Map ::= map IdList to Exp
Definitions
Def ::= Id := Exp ; MapDef ::= Id := Map ;
In your Java implementation, the only public evaluation method in the Interpreter class must be called eval() instead of the former name valueValue() in Assignment 4 and eagerEval in Assignment 5.
Phase Ia [This phase is already implemented by the "syntax checker" in the provided support code.] Modify your parser to prevent shadowing variable names by converting the name of each Jam variable from x to x:d where d is the lexical depth of the declaration of x. If x is free (not allowed in legal programs) then it has lexical depth 0. If it occurs inside one level of lexical nesting, it has lexical depth 1. For example, the program
(map x to x)(7)
becomes
(map x:1 to x:1)(7)
after renaming. Similarly,
let id := map y to y; in id(1)
becomes
let id:1 := map y:1 to y:1; in id:1(1)
after renaming. The right-hand-sides of let bindings have the same nesting level as the surrounding context. Recall the expansion of let in terms of lambda.
Renamed variables cannot be confused with existing variable names because : is not a legal character in variable names read by the parser. Within a given scope all variables in the lexical environment must have distinct names because all variables introduced in a particular let or map must be distinct and the names at every nesting level are disjoint because of the added : suffixes.
This transformation can either be applied directly during parsing, or as a separate pass after the input program has been parsed into an initial AST.
In your Java code, add a method
public AST unshadow();
in the Interpreter class that returns the input AST after applying the unshadowing transformation.
This unshadowing transformation should be applied just after parsing regardless of whether or not the CPS transform is later applied. (This is a requirement for matching parser output in our grading test suite.)
The unshadowing transformation permits let constructs to be re-interpreted as let* constructs without affecting program semantics. The correctness of our rules for CPS transformation hinges on this identity.
Note: In order to support the full CPS transformation without changing the semantics of the input program, we need to ensure that "short-circuit" evaluation is used for in CPSed programs for the boolean & and | operators. The CPS rules assume that all primitive functions evaluate all of their arguments which conflicts with short-circuit evaluation. The provided interpreter already works around this problem by supporing the primitive function asBool to eliminate the & and | operators as part of syntax checking (as implemented by the checkProg() method using CheckVisitor instances). This asBool function has a very simple definition. For asBool(X):
The provided syntax checker performs the following transformations on input programs
M & N => if M then asBool(N) else false M | N => if M then true else asBool(N)
so that your CPS code does not have handle any primitive functions that do not evaluate all of their arguments. Without these transformations, CPS conversion will treat & and | just like other binary operators and always force the evaluation of both arguments (as in call-by-value evaluation of program-defined functions). Also note how this transformation prevents & or | from returning non-boolean values. Without it, the CPS transformation, the expression true & 1 after CPS transformatoin would evaluate to 1.
Phase II Write a postprocessor (function from abstract syntax to abstract syntax) for your parser that transforms a Phase I Jam program to equivalent CPS form. Specifically, given a program M, your postprocessor will generate the program Cps[map x to x, M0] where M0 is M converted to unshadowed form and Cps is a binary function mapping Jam program text to Jam program defined by the transformation rules given below. These rules are a loosely based on the exposition in Friedman, et al., Chapter 8.
These rules presume that variables have been renamed to prevent “holes in scope” (nested variables with the same name) as described in Phase I. They will not work correctly on programs that shadow variable names without Phase I being performed first.
For the purposes of this assignment, we will consider operator applications (both unary and binary) as syntactic sugar for applications of corresponding primitive operations. Hence operator applications are treated just like primitive applications.
If your CPS tranformer encounters a letcc construct embedded in an input program, throw an exception reporting the error and abort exection. We will replace this code by a simple translation of this construct in Phase III.
In your Java code, formulate the CPS postprocessor as a method
public AST convertToCPS()
in your Interpreter class.
These interfaces are important because we will use them to test your code.
You must also provide a new method/function for performing interpretation of the CPSed program. In your Java code, you must add a method
public JamVal cpsEval();
to the Interpreter class that converts the associated program to CPS and then interprets the transformed program.
You can test that your implementation of the CPS transformation preserves the meaning of programs in Java by comparing the results produced by eval() and cpsEval().
To state the CPS transformation rules, we need to introduce a few technical definitions. Study them until you thoroughly understand them.
A Jam application E(E1, ...,En) is primitive if E is a primitive Jam function. Recall that we are interpreting operator applications as syntactic sugar for applications of corresponding primitive operations. So an application is primitive if and only if the rator of the application is either a primitive function or an operator. For example, the applications first(append(x,y)) and square(x) * y are both primitive while the applications square(4) and append(x,y) are not.
A Jam expression E is simple if and only if all applications except those nested inside map constructions are primitive, i.e. have a primitive function or operator as the rator. For example,
let x := first(a) * b * first(c); Y := map f to let g := map z to f(z(z)); in g(g); in cons(x, cons(Y, null))
and
x+(y*z)
are both simple. In contrast,
f(1)
and
let Y := map f to let g := map z to f(z(z)); in map x to (g(g))(x); in Y(map fact to map n to if n=0 then 1 else n*fact(n-1))
are not simple because f is not primitive and Y is not primitive.
The following rules define two syntactic tranformers (functions) on Jam program text: the binary transformer Cps : Jam * Jam -> Jam and the unary transformer Rsh : Simp -> Simp, where Jam is the set of Jam expressions and Simp is the set of simple Jam expressions (Rsh stands for “reshape”). The binary transformer Cps[k,M] takes a Jam expression k denoting a unary function, and an unshadowed Jam expression M as input and produces a tail-recursive Jam expression with the same meaning as k(M).
The unary transformer Rsh is a “help” function for Cps that take an unshadowed simple expression as input and adds a continuation parameter to the map expressions and function constants embedded in simple expressions. Rsh also adjusts applications of the arity primitive function to ignore the added continuation argument.
In the following rules, S, S1, S2, ... are meta-variables (pattern matching variables) denoting simple Jam expressions; k, A, B, C, E, E1, E2, ..., T are meta-variables denoting arbitrary Jam expressions; x1, x2, ... are metavariables denoting ordinary Jam identifiers; v, v1, v2, ... denote fresh Jam identifiers that do not appear in any other program text; and k is a metavariable denoting a Jam expression that evaluates to a unary function. The variable names x, y, and k embedded in program text denote themselves.
Definition of Cps The following clauses define the textual tranformation Cps[k, M]:
Definition of Rsh The helper transformer Rsh[S] is defined by the following rules:
For the purposes of testing your programs we require the following standardization. The top-level continuation must have exactly the syntactic form
map x to x
using the variable name x. In some transformations, you must generate a new variable name. For this purpose, use variable names of the form :i where i is an integer. These name cannot be confused with the names of variables that already exist in the program. The sequence of variable names generated by your CPS tranformer must be :0, :1, :2, ... so that your CPS tranformer has exactly the same behavior as our solution. Note that you must transform a program by making the leftmost possible reduction given that match variables S and E can only match raw program text (any embedded calls on Cps and Rsh must have already been reduced).
Phase III As the third part of the assignment, you will enhance your CPS transformer from Phase II, to handle the letcc construct. This language extension is often called supporting first-class continuations
As we have already observed, the provided parser handles the letcc construct but the provided interpreter does not and aborts with an error if it encounters such a contruct. In this phase, we officially extend the source langage to include the new construct.
Exp ::= ... | letcc x in M
The new construct letcc x in M binds the identifier x to the current continuation, and evaluates M in the extended environment. A continuation is a closure of one argument, reshaped to take an auxiliary continuation argument (like all other closures after CPS) which unlike other closure, it discards. Invoking the continuation k on a return value replaces the current continuation by k and applies k to the specified value and the current (soon to be discarded) continuation. Since continuations are ordinary values, they can be stored in data structures, passed as argments, and returned as values. There is a large Scheme literature on various programming operations (such as co-routines) that can be performed using continuations
The letcc construct is only supported in the interpreters that perform CPS conversion on the code. The conventional interpreters abort execution with an error if they encounter a use of letcc. To perform CPS conversion on program containing letcc, we extend our rules for CPS conversion as follows.
First, a Jam expression E is simple if and only if all occurrences of the letcc construct and non-primitive applications appear nested within map constructions.
Second, we add the following clause to the definition of the Cps syntax transformer:
Testing You are expected to write unit tests for all of your non-trivial methods or functions and submit your test code as part of your program. For more information, refer back to Assignment 1.
The directory tests contains the Java test file Assign6Test.java consisting of some very simple unit tests for your program. You can also use the test file Assign4Test.java as basis for creating test files for this assignment.
Your program directory should only contain the files that you want to submit. We will subsequently provide you with test scripts similar to the ones we provided for Assignment 4. Running these scripts will alert you to any problems with your directory configuration. We might also place a few addtional test programs in the directory tests.
Make sure that your program passes the sample unit tests and works with our grading script before submitting it. Create a README file in the your directory program/6 that
We have provided a sample README for your reference. Make sure that your README file is structured correctly:
mgricken dmp This is where the readme text starts. |
Any test data files for your program must be stored in the programs/6 directory. Do not store anything in subdirectories below programs/6.
Each procedure or method in your program should include a short comment stating precisely what it does. For routines that parse a particular form of program phrase, give grammar rules describing that form.
Please make sure to remove or disable all debug output before submitting. Excessive use of print statements considerably slows your interpreter and thus testing your project down. If your interpreter generates a timeout error because debug output slows it down, you may not get credit for the test cases that took too long.
To submit your program, make sure that everything that you want to submit is located in the directory comp411/programs/6 on CLE@R and type the command
~comp411/bin/turnin411 6
from within the comp411/programs directory.
The command will inform you whether or not your submission succeeded. Only submit one copy of your program per team. If you need to resubmit an improved version your program, submit it from the same account as the original so that the old version is replaced.