Assignment 1: Symbolic Evaluation of Boolean Expressions in Scala

Due: 2pm Friday, Jan 20, 2012

Overview

Write a collection of Scala functions in the Scala object BoolExp that reduces boolean expressions (represented in the input and output streams in Scheme-like notation) to simplified form. Stubs are provided for these functions in the file BoolExp.scala linked below. For the purposes of this assignment, boolean expressions are Scheme expressions constructed from:

Some sample inputs include
x
T
(& x y)
(| x (! y))
(? x y F)
(> x F)
(> (& F y) x)
(> z (| y z))
(> (| (! a) b) (> a b))

The shorter names T, F, !, &, |, >, and ? are used instead of true, false, not, and, or, implies, and if for notational brevity which matters in very large inputs.

The course staff is providing:

The stub file BoolExp.scala also includes comments showing you exactly what code you have to write to complete writing your simplifier. Of course, you also need to write corresponding tests and add them to the file BoolExpTest.scala.

The code in Parser.scala enables you to test your solution on large inputs stored in files. Parser.scala includes two Parser constructors Parser(File file) and Parser(String form) for building parsers to parse the boolean expression (in external text form) in the specified File or String, respectively. Since the library class File is defined in the package java.io, you need to insert

import java.io._

at the head of a test file that uses the Parser class on the contents of a file.
To construct a Parser for the formula in a file <fileName> you must invoke

new Parser(new File(<fileName>)).

If you omit the File(...) construction in the argument to Parser and use "<fileName>" instead, you will create a Parser for the String "<fileName>". which is interpreted as a simple boolean variable. The File input format is important because it enables us to conveniently apply your simplifier to formulas that are thousands of symbols long. You are expected to appropriately use case classes and pattern matching to write your code Since the only stub files that you have to modify are BoolExp.scala and BoolExpTest.scala, simply submit expanded versions of these files via OwlSpace to submit your assignment. Warning: we will run your program on large inputs to determine if you wrote the code correctly. Try using the large test files provided at the course website. Add appropriate tests to BoolExpTest.scala. You may need to increase the JVM stack size using the JVM argument -Xss64m (in the "(x)=arguments" tab in the Run Configuration for Scala applications on the Run menu) to run the largest tests.

Given a parsed input of type BoolExp, the simplification process consists of following four phases:

A description of each of these phases follows. The reduce function has type String -> String.

Conversion to IfExp form

A boolean expression (BoolExp) can be converted to ifExp form by repeatedly applying the following rewrite rules in any order until no rule is applicable.

(!  X)   	=>	(if  X  false true)
(&  X  Y)	=>	(if  X  Y  false)
(|  X  Y)	=>	(if  X  true  Y)
(>  X  Y)	=>	(if  X  Y  true)    // recall that ">" denotes "implies"

In these rules, we use the external textual notation for formulas rather than the internal Scala representation. Note that the "if" constructor on the right hand side of these rules should be implemented in your program by IIf objects. The symbols X and Y denote arbitrary boolExps}. The conversion process always terminates (since each rewrite strictly reduces the number of logical connectives excluding {{make-If) and yields a unique answer independent of the order in which the rewrites are performed. This property is called the Church-Rosser property, after the logicians (Alonzo Church and Barkley Rosser) who invented the concept.

Since the reduction rules for this phase are Church-Rosser, you can write the function convertToIf using simple structural recursion. For each of the boolean operators &, |, !, >, and if, reduce the component expressions first and then applying the matching reduction (except for if for which there is no top-level reduction).

The following examples (written in Scala notation) illustrate the conversion process:

convertToIf(Or(And(Var("x"),Var("y")), Var("z")))  => IIf(IIf(IVar("x"), IVar("y"), false), true, IVar("z"))
convertToIf(Implies(Var("x"), Not(Var("y")))) => IIf(IVar("x"), IIf(IVar("y"), IVal(false), IVal(true)), IVal(true))

We suggest simply traversing the BoolExp input tree using structural recursion and converting all BoolExp objects to the corresponding IfExp objects.

The provided function parse: String -> BoolExp takes a Scheme-like expression and returns the corresponding BoolExp.

Normalization

An IfExp is normalized iff every sub-expression in test position is either a variable (IVar) or a boolean constant (IVal). We will refer to the set of normalized IfExps as norm-IfExp. Unfortunatley, this subset of BoolExp cannot be defined as a Scala type.

For example, the IfExp

IIf(IIf(IVar("x"), IVar("y"), IVar("z")), IVar("u"), IVar("z"))
is not a norm-IfExp because it has an IIf construction in test position. In contrast, the equivalent IfExp
IIf(IVar("x"), IIf(IVar("y"), IVar("u"), IVar("v"), IIf(IVar("z"), IVar("u"), IVar("v")))
is normalized and hence is a norm-IfExp.

The normalization process, implemented by the function normalize: IfExp -> norm-IfExp eliminates all if constructions that appear in test positions inside IIf constructions. We perform this transformation by repeatedly applying the following rewrite rule (to any portion of the expression) until it is inapplicable:

IIf(IIf(X, Y, Z), U, V) => IIf(X, IIf(Y, U, V), IIf(Z, U, V)).

This transformation always terminates and yields a unique answer independent of the order in which rewrites are performed. The proof of this fact is left as an optional exercise.

In the normalize function, it is critically important not to duplicate any work, so the order in which reductions are made really matters. Do NOT apply the normalization rule above unless U and V are already normalized, because the rule duplicates both U and V. If you reduce the consequent and the alternative (U and V in the left hand side of the rule above) before reducing the test, normalize runs in linear time (in the number of nodes in the input); if done in the wrong order it runs in exponential time in the worst case. (And some of our test cases will exhibit this worst case behavior.)

Define an auxiliary function headNormalize that takes three norm-IfExps X, Y, and Z and constructs a norm-IfExp equivalent to IIf(X, Y, Z). This help function processes X because the test position must be a variable or a constant, yet X can be an arbitrary norm-IfExp. In contrast, (headNormalize X Y Z) never even inspects Y and Z because they are already normalized and the normalizing transformations performed in headNormalize never place these expressions in test position.

The following examples illustrate how the normalize and headNormalize functions behave (expressed in Scala notation):

headNormalize('x, 'y, 'z)     =>  IIf('x, 'y, 'z)
headNormalize(True, 'y, 'z)   =>  IIf(True, 'y, 'z)
headNormalize(False, 'y, 'z)  =>  IIf(False, 'y, 'z)

headNormalize(IIf('x, 'y, 'z), 'u, 'v) => IIf('x, IIf('y, 'u, 'v), IIf('z, 'u, 'v))

headNormalize(IIf('x, IIf('yt, 'yc, 'ya), IIf('zt, 'zc, 'za)), 'u, 'v) =>
  IIf('x, IIf('yt, IIf('yc, 'u, 'v)), IIf('ya, 'u, 'v), IIf('zt, IIf('zc, 'u, 'v)), IIf('za, 'u, 'v))

normalize(True)  => True
normalize(False) => False
normalize('x)    => 'x
normalize(IIf('x, 'y, 'z)) => IIf('x, 'y, 'z)
normalize(IIf(IIf('x, 'y, 'z), 'u, 'v)) => IIf('x, IIf('y, 'u, 'v), IIf('z, 'u, 'v))

Once a large formula has been normalized, do not try to print it unless you know that the formula is small! The printed form can be exponentially larger than the internal representation (because the internal representation can share subtrees).

Before you start writing normalize, write the template corresponding to the inductive data definition of norm-IfExp.

Symbolic Evaluation

The symbolic evaluation process, implemented by the function eval: norm-IfExp environment -> norm-IfExp, reduces a norm-IfExp to simple form. In particular, it reduces all tautologies (expressions that are always true) to IVal(true) and all contradictions (expressions that are always false) to IVal(false).

Symbolic evaluation applies the following rewrite rules to an expression until none is applicable (with one exception discussed below):

IIf(T, X, Y)  =>  X
IIf(F, X, Y)  =>  Y
IIf(X, T, F)  =>  X
IIf(X, Y, Y)  =>  Y
IIf(X, Y, Z)  =>  IIf(X, Y[X <- true], Z[X <- false])
where T denotes the boolean constant true (IVal(true)); F denotes the boolean constant false (IVal(false)); and X, Y, and Z are variables (IVars)

The notation M[X <- N] means M with occurrences of the symbol X replaced by the expression N. It is very costly to actually perform these subtitutions on an IfExp. To avoid this computational expense, we simply maintain a list of bindings which are pairs (the Scala library type Tuple2) consisting of variable names (Strings) and Boolean values {true, false}.

An environment is a set of bindings, which can be conveniently represented using the Scala library type List[Tuple2[String, Boolean]] or the Scala library type immutable.Map[String, Boolean].

When the eval function encounters a variable (IVar), it looks up the variable name in the environment and replaces the symbol by its boolean value if it exists.

These rewrite rules do not have the Church-Rosser property. The last two rewrite rules are the spoilers; the relative order in which they are applied can affect the result in some cases. However, the rewrite rules do have the Church-Rosser property on expressions which are tautologies or contradictions.

If the final rule is applied only when X actually occurs in either Y or Z, then the symbolic evaluation process is guaranteed to terminate. In this case, every rule either reduces the size of the expression or the number of variable occurrences in it.

We recommend applying the rules in the order shown from the top down until no more reductions are possible (using the constraint on the final rule). Note that the last rule should only be applied once to a given sub-expression.

Conversion to Boolean Form

The final phase converts an expression in (not necessarily reduced or normalized) If form to an equivalent expression constructed from variables and { true, false, And, Or, Not, Implies, If }. This process eliminates every expression of the form

IIf(X, Y, Z)

where one of the arguments {X, Y, Z} is a constant {true, false}.

Use the following set of reduction rules to perform this conversion

IIf(X, false, true)  =>  Not(X)
IIf(X, Y, false)     =>  And(X, Y)
IIf(X, true, Y)      =>  Or(X, Y)
IIf(X, Y, T)         =>  Implies(X, Y)

where X , Y , and Z are arbitrary IfExps and T is the true IfExp (IVal(true)). This set of rules is Church-Rosser, so the rules can safely be applied using simple structural recursion.

Point Distribution

Hints on Writing Your Scala Code

The abstract syntax classes in BoolExp.scala include a separate composite (case class) hierarchy called IfExp for representing boolean expressions exclusively in terms of conditionals. This representation includes only case classes, making it easy to write the functions perform normalization, evaluation, and conversion-to-Boolean-form using pattern matching. This representation also improves program performance.

In Scala, you can write pattern matching functions explicitly instead of encoding them as visitor objects.

Here are the links for the files:

Update: Augmented versions of BoolExp.scala and BoolExpTest.scala are located in this directory. The new versions include

These extensions to our support files enable you to write tests that process input files simply using Scala/Java string notation for the answer.

Sample Input Files

The following files contain large formulas that can be reduced by your simplifier. Only the files named bigDataX require a larger stack size than the JVM default on most platforms. NOTE: to handle the bigDataX files, you may need to use JVM argument -Xss64M in the Eclipse Run Configuration for your program.