Introduction
CptS 355 - Programming Language Design
Washington State University

Thought questions

  • There are over 3000 programming languages. Why?
  • New programming languages are still being invented. Why? New application domains, improved ways of expressing ideas, ...
  • If many languages are the norm, why do we teach you to program in only one language in CS 121/122? An introductory course must assume no prior familiarity with programming.
  • Which language is better or worse? We will present some criteria for evaluating languages, in a very non-rigorous manner.
  • With so many languages, what should we study? Like Biology 101, we study abstractions.

What is a Programming Language?

Some features that it should have
  • universal - it can compute anything that can be computed (too strict in my opinion)
  • implementable - it can be implemented
  • efficient - it can be used to efficiently encode algorithms (a factor of 2-3 times slower matters a great deal in some applications)
  • has a syntax (form) and semantics (meaning) - we will study semantics
    A few different forms of the expression x + 1
    
     x + 1     C, Pascal, Ada, Algol
     $x + 1    Perl
     x 1 add   Forth, Postscript
     (+ x 1)   Lisp, Scheme
    
    

Why learn about Programming Languages?

  • Expression of ideas will improve - Improved potential for expression leads to improved thinking. You can improve your programming/problem solving by learning to "think" in new languages.
  • Some languages are especially suited for some applications - Knowing can improve your ability to choose an appropriate language.
  • Practice learning new languages quickly.
  • Understand implementation - implementation concerns sometimes limit language features.
  • Historical perspective - why did language X become mainstream?

Programming domains

  • Scientific computing - 1950s, 1960s, Fortran
  • Business applications - 1960s, COBOL
  • AI - 1950s, Lisp, Prolog
  • Systems prog. - 1960s - IBM and PL/S, Unix and C
  • Scripting languages - awk, Tcl, JavaScript
  • Special purpose - PostScript

Influences

  • Hardware
    • Imperative languages model changes in state (e.g., in the values in registers, memory, and on disk).
    • Parallel machines - there are languages for parallel computing
  • Mathematics
    • Lambda calculus - LISP
    • Functional languages view a program as a function - ML, Scheme, LISP
    • Logic programming views a program as a mathematical proof; the program evaluates by trying to prove a theorem - Prolog
  • Declarative - Programmer "declares" the desired result, the computer figures out how to compute that result - SQL and Prolog are declarative programming languages
  • Methodology
    • 1960s - larger and larger systems
    • 1970s - top-down design - Pascal
    • late 1970s, early 80s - data-oriented design (ADTs) Ada and Modula
    • 1980s - object-oriented design - C++, Smalltalk
  • Special purpose languages
    • Visual languages for designing user interfaces - Visual Basic
    • Markup Languages - HTML, XML, SGML

Computer Language Levels

  • low-level -- close to the machine instruction set, examples are machine language and assembler language (symbolic machine code)
  • high-level -- make it easier for humans, examples are C and Pascal
  • very high-level -- give a general idea of what the computer should do, let it do it, example is Prolog

Criteria for Evaluating Languages

                            Readability   Writability  Reliability
Simplicity/orthogonality        X             X            X
Control structures              X             X            X
Data types                      X             X            X
Syntax design                   X             X            X
Support for abstraction                       X            X
Expressivity                                  X            X
Type checking                                              X
Exception handling                                         X
Restricted aliasing                                        X
  • Simplicity - is there a single way of doing something? Multiplicity means there are many different ways to do the same thing. In C there are many different ways to add 1:
     
       x = x + 1
       x += 1
       ++x
       x++
     
    Another common example is operator overloading, which is useful but gives a symbol more than one definition. In Java, is the + symbol in the following addition or string concatenation?
     
       x + y
     
    Orthogonality is the ability to mix and match features. Example from assembler language. In IBM assembler,
       add reg, mem
       add reg, reg
       add mem, mem
     
    Parameter passing in C.
       int x;
       int y[300]; 
       struct ... z;
    
       /* x and z are passed by value, y by address */
       foo(x, y, z);
     
  • Control statements - goto considered harmful
  • Data types - lack of boolean type in C makes programs a little difficult to read and write. Orthogonality of data types is also important (e.g., can I declare an array of type X where X can be any type?).
  • Syntax considerations - Identifier length should not be restricted. There should be few "reserved words". There are a variety of block delimiters (e.g., begin/end), with various tradeoffs.
          begin .. end
          { ... }
          if ... begin ... end if
          while ... begin ... end while
        
    Finally, reserved words should mean the same thing, independent of context. In C, "static" means different things in different contexts.
  • Support for abstraction - ability to write/call subprograms/procedures/functions/classes helps writability. Does it help readability?
  • Expressivity - Is there precision in how we express concepts? Consider short-circuit evaluation. In C, what if foo() is slow and bar() is fast in the following code fragment?
          if (foo() == 3 && bar() == 4) ...
        
    It is more efficient to put the fast conjunct first, assuming that there is an equal probability that each conjunct will return 0 (false).
          if (bar() == 4 && foo() == 3) ...
        
    But this relies on the programmer's understanding to make sense of the implementation trick. In Ada we can specify short-circuit evaluation explicitly to help alert the reader/writer to the underlying semantics.
          if bar = 4 and then foo = 3 then ...
        
  • Type checking - Lack of it is a potential source of errors. For example, early versions of C did not check the types of actual parameters against formal parameters, although the return type was checked. But type checking makes it harder to write programs, in the sense that you have to match up types when you are writing code.
  • Exception handling - Makes code ugly to read sometimes, but improves reliability.
  • Aliasing - two names for the same thing, e.g.,
         int x;
         int *y;
         y = &x;
    
         /* Contents of variable x known as x and *y */
       

Cost

Cost is another way to evaluate languages. Various kinds of costs.
  • Training.
  • Writing.
  • Compilation (in terms of time and size of produced code)
  • Execution. A dominant cost in some applications. Often, optimization can reduce the cost, at the expense of increasing the cost of compilation, writing, and/or training.
  • Language implementation - is the compiler free?
  • Reliability - Failure in produced code/run-time environment can be very costly, think of a program running an X-ray machine.
  • Maintenance - Useful code is always refined, updated, debugged. This cost can outweigh development cost.

Implementation Strategies

[Diagram: steps in compilation]

Compilation: compile once, run the executable many times.

[Diagram: steps in interpretation]

Interpretation: every time it is run, the program has to be parsed line-by-line. So if you have a loop in your program, every line in the loop is parsed each time through the loop. Generally, then, interpretation is "slower" than compilation in terms of execution.

[Diagram: steps in hybrid implementation]

Hybrid implementation: the code is compiled down to intermediate code, and the intermediate code program is then evaluated directly. Hybrid implementations offer increased speed over interpretation since the code does not have to be parsed repeatedly.

Source of Information

These lecture notes are based on Chapter 1 in "Programming Language Concepts and Paradigms" by David Watt and Chapter 1 in "Programming Languages" by Robert Sebesta.
                                                                                                                                                                                                                                                                                                                                             
  (c) 2003 Curtis Dyreson, (c) 2004 Carl H. Hauser           E-mail questions or comments to Prof. Carl Hauser