Lecture 1: Studying Programming Languages

 

 

What you will learn

A large portion of this course focuses on reading ASCII program text, representing it using appropriate data structures, and then "walking" those structures to compute a result. A simple example is reading a mathematical expression and calculating a final value from it by successively reducing terms. Many of the more advanced issues involved in parsing program text, such as order of precedence, left vs. right recursion in grammars, and how to do fast lexical analysis, are not covered in this class. You can find out more in COMP 412, the introductory compiler class.
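
As a rough illustration of the read, represent, and walk steps described above, here is a minimal sketch in Python. It is not the course's own code: the fully parenthesized prefix syntax (chosen so that precedence never comes up) and the names Num, BinOp, tokenize, parse, and evaluate are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass
class Num:
    """A numeric literal (hypothetical node type for this sketch)."""
    value: int


@dataclass
class BinOp:
    """A binary operation applied to two subexpressions."""
    op: str
    left: "Expr"
    right: "Expr"


Expr = Union[Num, BinOp]


def tokenize(text: str) -> List[str]:
    # Pad parentheses with spaces so that split() isolates every token.
    return text.replace("(", " ( ").replace(")", " ) ").split()


def parse(tokens: List[str]) -> Expr:
    """Consume tokens from the front of the list and build a tree."""
    token = tokens.pop(0)
    if token == "(":
        op = tokens.pop(0)
        left = parse(tokens)
        right = parse(tokens)
        if tokens.pop(0) != ")":
            raise SyntaxError("expected ')'")
        return BinOp(op, left, right)
    return Num(int(token))


def evaluate(expr: Expr) -> int:
    """Walk the tree, reducing each subexpression to a value."""
    if isinstance(expr, Num):
        return expr.value
    left = evaluate(expr.left)
    right = evaluate(expr.right)
    if expr.op == "+":
        return left + right
    if expr.op == "-":
        return left - right
    if expr.op == "*":
        return left * right
    raise ValueError(f"unknown operator: {expr.op}")


tree = parse(tokenize("(+ 1 (* 2 3))"))
print(evaluate(tree))  # prints 7
```

The tree, not the text, is what the evaluator works on; any other front end that produces the same Num and BinOp nodes could reuse evaluate unchanged.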

The concepts and skills covered in this early portion of the course provide students with sufficient background to design and implement customized languages to support the implementation and extension of application software (e.g., the expression language for a spreadsheet, the macro language for a word processor, the internal language for generating completed tax forms within a tax package like TurboTax). In some applications, such as Emacs, building the application largely reduces to designing and implementing such a language. The underlying language for Emacs is eLisp, a Lisp dialect that includes an extensive set of primitives to support text editing and textual display. As a result, there are Emacs modules for reading mail, subscribing to newsgroups, etc. An ambitious user can extend Emacs to support essentially any display-based application. While a language facility like eLisp (with a large collection of primitives for manipulating the surrounding computing environment) is complex, it will not be out of reach for graduates of this course.

In the second portion of the course, we will focus on analyzing programs. Instead of just reducing a program expression to an answer, we can actually prove certain things about an input program, such as the fact that it conforms to a specified set of constraints. If the input language includes a type system, we can prove that the program makes sense from the perspective of the types that it manipulates. In effect, we can guarantee that a program will never misuse data, for example by adding a number to a boolean. In a programming language supporting references (pointers), we can also prove that a program will not make memory accesses to non-existent data. Interestingly, you will find that for these analyses, you can employ techniques similar to the ones you have already used for the reduction of a program to a result. Improving type and memory safety in commercial languages is still a very active area. Java just added generic types to the language, a feature which many of you will use if you decide to write your projects in Java. The way in which these generic types were added still leaves a lot to be desired, and ways to improve the type safety and expressiveness of Java are still being developed, here at Rice, for example.
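
To make the similarity between evaluation and analysis concrete, here is a minimal sketch of a type checker in Python. It walks an expression tree the same way an evaluator would, but computes a type instead of a value. The tiny language (numbers, booleans, addition, and a conditional) and the names Num, Bool, Add, If, and type_of are assumptions invented for this example, not the course's actual type system.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class Num:
    value: int


@dataclass
class Bool:
    value: bool


@dataclass
class Add:
    left: "Expr"
    right: "Expr"


@dataclass
class If:
    test: "Expr"
    then: "Expr"
    otherwise: "Expr"


Expr = Union[Num, Bool, Add, If]


def type_of(expr: Expr) -> str:
    """Return "int" or "bool", or raise TypeError if expr is ill-typed."""
    if isinstance(expr, Num):
        return "int"
    if isinstance(expr, Bool):
        return "bool"
    if isinstance(expr, Add):
        # Both operands must be numbers; this rejects 1 + true.
        if type_of(expr.left) != "int" or type_of(expr.right) != "int":
            raise TypeError("'+' expects two int operands")
        return "int"
    if isinstance(expr, If):
        if type_of(expr.test) != "bool":
            raise TypeError("'if' test must be a bool")
        then_type = type_of(expr.then)
        if then_type != type_of(expr.otherwise):
            raise TypeError("'if' branches must have the same type")
        return then_type
    raise TypeError(f"unknown expression: {expr!r}")


print(type_of(If(Bool(True), Add(Num(1), Num(2)), Num(0))))  # prints int

try:
    type_of(Add(Num(1), Bool(True)))
except TypeError as err:
    print("rejected:", err)
```

Note that type_of never runs the program: it rejects the addition of a number and a boolean without computing anything, which is what lets it act as a guarantee about every possible run.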

In the third portion of the course, you will learn how to transform programs to a lower-level form, eliminating nested procedure calls (recursion!) and hence the need for a central stack. After these transformations, a program can easily be transliterated to machine code. These transformations are employed in many compilers (particularly those for "advanced languages" supporting first-class procedures), essentially reducing the task of compiling an advanced language to compiling a low-level language like C.
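
The text above does not name the specific transformations, so the following Python sketch is only one plausible illustration, assuming a conversion to continuation-passing style followed by representing the continuation as plain data. The function names fact, fact_cps, and fact_loop are hypothetical. Python itself does not optimize tail calls, so fact_cps still consumes Python's own stack; the point is the shape of the code at each step.

```python
def fact(n):
    # Direct style: the multiplication must wait for the recursive call
    # to return, so every call needs its own stack frame.
    return 1 if n == 0 else n * fact(n - 1)


def fact_cps(n, k):
    # Continuation-passing style: every call is a tail call, and the
    # function k carries "the rest of the work" explicitly.
    if n == 0:
        return k(1)
    return fact_cps(n - 1, lambda result: k(n * result))


def fact_loop(n):
    # The continuation is now plain data: a list of pending multipliers.
    # With no nested calls left, a simple loop replaces the recursion.
    pending = []
    while n != 0:
        pending.append(n)
        n -= 1
    result = 1
    while pending:
        result = pending.pop() * result
    return result


print(fact(5), fact_cps(5, lambda x: x), fact_loop(5))  # prints 120 120 120
```

The last version uses only assignments, a list, and loops, which is close to what a low-level language offers directly.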

Why you should learn it

Studying programming languages is important for many reasons. In the past few years, several new languages have emerged, some of which have even become dominant, and each of these languages is targeted at a particular technological niche. Think of Java, C#, XML, Perl, Python, and Ruby, for example. To make effective use of a technology, you have to master the language that controls it.

New programming languages are constantly being invented. The proliferation of languages has made it especially important to understand the design and properties of languages. In particular, many technologies already have languages designed to address them; the user only needs to find the appropriate language. Having studied programming languages, you will likely have to expend less energy finding and learning that language in the future.

Understanding programming language concepts is key to better program design. A grasp of programming language principles enables the programmer to explore a much larger spectrum of possible programming techniques for solving a given computational problem and to assess the tradeoffs among them. You might also have to choose a programming language for some project in the future. Making such a selection involves several issues, including technical, sociological, and economic considerations. In this course, we will focus primarily on technical considerations because they are more universal and more enduring. Finally, you might be asked to design a new language as part of writing a particular software application. To be successful, you must understand the conceptual building blocks used to construct programming languages.

While you may never have the task of creating "the new C++" or "the new Scheme", the material in this class will probably be helpful more often than you think: not only in computing something, but also in creating a small new language to fill some niche, in extending programs by adding a programming language, and in simply using existing programming languages.
