Syntax and Semantics
CptS 355 - Programming Language Design
Washington State University

Thought question

  • How do you know what are well-formed programs in a language?

Introduction

Languages are described by a
  • syntax - "form", a specification of the well-formed constructs in a language
  • semantics - "function", a specification of the meaning of the constructs in a language
Describing syntax is easier than semantics.

Formal Descriptions of Syntax

Formally, a language, represented as L, is a set of strings, usually called sentences, over some alphabet, represented as S. A grammar, G, is a set of rules that describes the legal sentences of a language. From a grammar we can automatically construct the following machines:
  • a language recognizer - A recognizer is a machine that takes a string and determines if it belongs to the language described by the grammar.
  • a language generator - A generator is a machine that produces only legal sentences in the language. It produces random sentences.

To simplify the specification of a grammar, usually the syntax of a programming language is specified with respect to an alphabet that is a set of tokens (Contrast with CptS 317 where alphabets are typically collections of single characters). A token is a category of lexemes. Example lexemes (literally, what appears in the program) are:

  • begin
  • }
  • "asdfas"
  • 3.54
  • (
  • foo
Common tokens (categories of lexemes) are:
  • identifier: foo
  • reserved word begin: begin
  • string literal: "asdfas"
  • integer literal: 7
  • right brace: }
  • left parenthesis: (
  • plus operation: + -
  • mult operation: * / %
So let's look at a sentence in a programming language
  sum = x + y;
The lexemes are (from left to right): sum, =, x, +, y, ; and the corresponding tokens are: identifier, assignment op, identifier, plus op, identifier, semicolon.
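
The lexeme-to-token mapping above can be sketched as a small scanner. This is only an illustration: the token names and regular expressions below are assumptions chosen to mirror the categories listed earlier, not the definition of any particular language.

```python
import re

# Hypothetical token names and patterns, mirroring the categories above.
TOKEN_SPEC = [
    ("float_literal",   r"\d+\.\d+"),
    ("integer_literal", r"\d+"),
    ("string_literal",  r'"[^"]*"'),
    ("reserved_begin",  r"\bbegin\b"),
    ("identifier",      r"[A-Za-z_]\w*"),
    ("assignment_op",   r"="),
    ("plus_op",         r"[+-]"),
    ("mult_op",         r"[*/%]"),
    ("left_paren",      r"\("),
    ("right_brace",     r"\}"),
    ("semicolon",       r";"),
    ("skip",            r"\s+"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in TOKEN_SPEC))

def tokenize(text):
    """Return a list of (token, lexeme) pairs; whitespace is discarded."""
    pairs = []
    pos = 0
    while pos < len(text):
        m = MASTER.match(text, pos)
        if m is None:
            raise ValueError(f"unexpected character {text[pos]!r} at position {pos}")
        if m.lastgroup != "skip":
            pairs.append((m.lastgroup, m.group()))
        pos = m.end()
    return pairs

for token, lexeme in tokenize("sum = x + y;"):
    print(f"{lexeme!r:8} {token}")
```

Note that the order of the patterns matters: the reserved word is tried before the general identifier pattern, so begin is not reported as an identifier.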

BNF

Work by Turing and Chomsky in the 1940s and 1950s identified four categories of languages of increasing power and complexity: regular, context-free, context-sensitive, and recursively enumerable. Usually, the syntax of a programming language is described by a context-free grammar. The first programming language to have a formally specified grammar was ALGOL 60. The formal description was in a "metalanguage" called Backus-Naur Form (BNF). A metalanguage is a language used to describe other languages. The components of BNF include the following.
  • Terminals - These are the tokens. Terminals will be represented as the name of the token, e.g., begin means the token corresponding to the reserved word begin, identifier means any identifier, etc.
  • Nonterminals - These are represented in angle brackets, e.g., <stmt> means the nonterminal statement.
  • Rules or productions of the form
      nonterminal -> body
    The body of the rule consists of a list of
    • terminals,
    • nonterminals,
    • | (meaning "or"),
    • a pair of brackets, [], enclosing an optional clause,
    • a pair of braces, {}, enclosing a repeating clause and followed by a * (meaning zero or more) or + (meaning one or more),
    • e or E, representing the empty string.
    The interpretation of a rule is that the syntax of the nonterminal, sometimes called the head or left-hand side (LHS) is described by the body, sometimes called the right-hand side (RHS). For example, the following rule describes the syntax of an if statement.
    <if_stmt> -> if <predicate> then <stmt>
    | if <predicate> then <stmt> else <stmt>

    Note that a non-terminal may be the LHS of several rules. The rule given above is the same as the pair of rules given below.
    <if_stmt> -> if <predicate> then <stmt>
    <if_stmt> -> if <predicate> then <stmt> else <stmt>

    Another equivalent formulation is as an optional clause.
    <if_stmt> -> if <predicate> then <stmt> [ else <stmt> ]

  • A start symbol - By default the start symbol is the nonterminal on the LHS of the first rule.
Usually there is also a terminator symbol, often '.', for each rule but since the book doesn't use one, we won't either.

Derivations

Let's look at how a grammar can be used to generate a sentence in the language. The process is called derivation. The idea is that if we can somehow derive the sentence from the start symbol, then the sentence is part of the language described by the grammar. Derivation proceeds by replacing a nonterminal with its body. Consider the following simple grammar.
<stmt_list> -> <stmt>
| <stmt> ; <stmt_list>
<stmt> -> <var> = <expr>
<expr> -> <expr> - <expr>
| <expr> * <expr>
| <var>
<var> -> X
| Y
and consider the sentence X = Y - Y * X. Let's try to derive it.
<stmt_list>
=> <stmt>
=> <var> = <expr>
=> X = <expr>
=> X = <expr> * <expr>
=> X = <expr> * <var>
=> X = <expr> * X
=> X = <expr> - <expr> * X
=> X = <var> - <expr> * X
=> X = Y - <expr> * X
=> X = Y - <var> * X
=> X = Y - Y * X

Derivation Tree

Each derivation creates a derivation tree. (Note, Sebesta calls these parse trees, which they are, but I want to hold off on discussing them as parse trees for the moment.)
  • The root of the tree is the start symbol.
  • Every interior node is a nonterminal.
  • Every leaf is a terminal.
  • There is one child for each nonterminal or terminal in the body of a production that is used in the derivation.
Here is the derivation tree for the above derivation
                      stmt_list
                          |
                        stmt
                       / | \
                    var  =  expr
                     |      / | \
                     X   expr *  expr 
                        /  | \      |
                    expr   -  expr  var
                     |         |     |
                    var       var    X
                     |         |      
                     Y         Y      
A grammar is ambiguous if there are two or more derivation trees for some sentence. This grammar is ambiguous since there is more than one possible derivation tree for the sentence above. Here is a second derivation and its corresponding tree.
<stmt_list>
=> <stmt>
=> <var> = <expr>
=> X = <expr>
=> X = <expr> - <expr>
=> X = <var> - <expr>
=> X = Y - <expr>
=> X = Y - <expr> * <expr>
=> X = Y - <expr> * <var>
=> X = Y - <expr> * X
=> X = Y - <var> * X
=> X = Y - Y * X
                      stmt_list
                          |
                        stmt
                       / | \
                    var  =  expr
                     |      / | \
                     X   expr -  expr 
                           |     / | \
                          var expr *  expr
                           |   |       | 
                           Y  var     var
                               |       | 
                               Y       X 
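
Ambiguity can also be checked mechanically for short sentences. The sketch below enumerates every derivation tree that the <expr> rules of this grammar allow for a list of tokens. It is a brute-force illustration, not a practical parser; the function name trees and the nested-tuple representation (operator, left subtree, right subtree) are assumptions made for this example, and the <var> level is collapsed into the leaf.

```python
def trees(tokens):
    """All derivation trees of <expr> for a sequence of tokens, as nested tuples."""
    tokens = tuple(tokens)
    if len(tokens) == 1 and tokens[0] in ("X", "Y"):
        return [tokens[0]]                    # <expr> -> <var> -> X | Y
    found = []
    for i, tok in enumerate(tokens):
        if tok in ("-", "*"):                 # <expr> -> <expr> op <expr>
            for left in trees(tokens[:i]):
                for right in trees(tokens[i + 1:]):
                    found.append((tok, left, right))
    return found

# Right-hand side of the sentence X = Y - Y * X
for tree in trees("Y - Y * X".split()):
    print(tree)
```

Two trees are printed, one with - at the root and one with * at the root, matching the two derivation trees drawn above.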

Parsing

Derivation starts with the start symbol and proceeds by replacing nonterminals. Parsing is the inverse process: starting with a string purportedly in the language it attempts to find a derivation tree which is now called a parse tree. For our purposes, informal approaches to parsing will be sufficient. Parsing is examined more rigorously in the Compilers course, CptS 452.

Relationship between Grammar, Associativity and Precedence

Specifying the right grammar for a language can help to control associativity and precedence.

Associativity refers to a "direction" in which (binary) operators associate. In mathematical notation, subtraction is left-associative meaning that

  7 - 3 - 4 
is interpreted to mean
  (7 - 3) - 4 
rather than
  7 - (3 - 4)
which has a very different meaning!

Precedence refers to which operations are executed prior to others. Multiplication typically has higher precedence than addition and subtraction, meaning it should be done first, so

  7 + 3 * 4
evaluates to 19 and not to 40. Most (but not all) programming languages respect these mathematical conventions.

A parse tree implicitly says which operations' results are input to other operations. For example, in the second parse tree given above, if we assume X = 3 and Y = 4, then the result is 4 - (4 * 3) or -8 because the result of the multiplication is the second operand of the subtraction.

Question: what is the result if we use the first parse tree instead?

Associativity and precedence can be specified in a grammar by choosing whether recursion occurs on the left or right side of a rule, and by layering the rules so that lower-precedence operators are derived first (nearer the root of the tree).

To specify precedence, the trick is to split the production where the precedence is ambiguous into two (or more) productions. Notice that this moves multiplication down the parse tree, so that a multiplication can never be the ancestor of a subtraction.
<stmt_list> -> <stmt>
| <stmt> ; <stmt_list>
<stmt> -> <var> = <expr>
<expr> -> <expr> - <expr>
| <term>
<term> -> <term> * <term>
| <var>
<var> -> X
| Y

Question: what might we do to the grammar so that the result of a subtraction might be an operand of a multiplication? How would you do it in mathematical notation?

We can specify associativity in the grammar by giving a direction to the parse, that is, by recursing on only the left side (or right side, but not both) of an operation. Let's make subtraction left associative and make multiplication right-associative (just as an illustration).
<expr> -> <expr> - <term>
| <term>
<term> -> <var> * <term>
| <var>
<var> -> X
| Y

Exercise: convince yourself that the above grammar gives multiplication precedence over subtraction, that subtraction associates to the left, and that multiplication associates to the right, by creating parse trees for several expressions.
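
One way to check your hand-drawn trees is with a small recursive-descent parser for the <expr> and <term> rules above. This is a sketch under some assumptions: tokens are separated by spaces, trees are nested tuples (operator, left, right), and the left-recursive rule for <expr> is written as a loop because a recursive-descent procedure cannot call itself before consuming any input. The names parse, parse_expr and parse_term are made up for this example.

```python
def parse_expr(tokens):
    """<expr> -> <expr> - <term> | <term>, with the left recursion written as a loop."""
    tree, rest = parse_term(tokens)
    while rest and rest[0] == "-":
        right, rest = parse_term(rest[1:])
        tree = ("-", tree, right)          # fold to the left: (a - b) - c
    return tree, rest

def parse_term(tokens):
    """<term> -> <var> * <term> | <var>, recursing on the right."""
    if not tokens or tokens[0] not in ("X", "Y"):
        raise SyntaxError(f"expected X or Y, got {tokens[:1]}")
    var, rest = tokens[0], tokens[1:]
    if rest and rest[0] == "*":
        right, rest = parse_term(rest[1:])
        return ("*", var, right), rest     # nest to the right: a * (b * c)
    return var, rest

def parse(text):
    tree, rest = parse_expr(text.split())
    if rest:
        raise SyntaxError(f"unexpected tokens: {rest}")
    return tree

print(parse("X - Y - X"))
print(parse("X * Y * X"))
print(parse("Y - Y * X"))
```

Compare the shape of each printed tuple with the trees you drew by hand.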

So now for a nasty little secret about programming languages and context-free grammars. The grammar for a PL typically does *not* specify the acceptable programs of the language. Consider

   int *c;
   c = 17;

The fragment above is accepted by the grammar even though it assigns an integer to a pointer variable. Some aspects of the language, such as type checking, are difficult (in the sense that they make the grammar blow up in size) or impossible to express using CFGs. These aspects usually go by the name of static semantics. We will take up the issue of types and type checking later in the semester.

Dynamic Semantics

The dynamic semantics refers to the question "what does this program compute?" Often this is taken to mean the state, or collection of values of the variables in the program, as the computation proceeds from state to state. Usually, we are interested primarily in knowing about the final state. Since programs in many languages evaluate statement by statement, the dynamic semantics is a description or understanding of precisely how a statement modifies the state. There are three main approaches to describing dynamic semantics.
  • Operational semantics - Describe precisely the meaning of each statement in a high-level language in terms of a low-level language. My description of PostScript was essentially done by giving an informal operational semantics.
  • Axiomatic semantics - Useful in proving correctness of programs. Think of a program as a proof and each statement as an axiom. The axiom relates a precondition (about the state) to a postcondition (about the state).
  • Denotational semantics - Each statement can be modeled as a function that relates the input (the state prior to the statement) to the output (the state after the statement). The program is modeled by composing functions.

Operational Semantics

This is a commonly-used semantics. It is easiest for the compiler-implementer to use. You can think of this as a precise description of how to map a statement in a high-level language to a low-level intermediate code. For example, consider a statement of the form.
  if <predicate> then
    <stmt_list>1
  else
    <stmt_list>2
Operational semantics can be expressed by means of a translation function, T, that maps programs in a high-level language to programs in a lower-level language. For example, T applied to the if statement above might yield:
     T(<predicate>)
     # Assume boolean result of predicate ends up in R1
     eq  R1, false, L1:
     T(<stmt_list>1)
     goto L2:
L1:
      T(<stmt_list>2)
L2:
Note that we are assuming that the meaning or semantics of the low-level language is understood, sound, and rigorous.
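
The translation function can itself be written as a short recursive program. The sketch below is illustrative only: it assumes statements are represented as nested tuples, uses a hypothetical "raw" node to stand in for code that has already been translated, and draws label numbers from a counter passed in by the caller. The instruction strings follow the informal low-level notation used above.

```python
import itertools

def T(node, labels):
    """Translate a tiny syntax tree (nested tuples) into a list of low-level lines."""
    kind = node[0]
    if kind == "if":
        _, predicate, then_part, else_part = node
        else_label = f"L{next(labels)}"
        end_label = f"L{next(labels)}"
        return (T(predicate, labels)
                + [f"eq R1, false, {else_label}"]   # predicate result assumed in R1
                + T(then_part, labels)
                + [f"goto {end_label}", f"{else_label}:"]
                + T(else_part, labels)
                + [f"{end_label}:"])
    if kind == "raw":                               # stand-in for translated code
        return [node[1]]
    raise ValueError(f"unknown construct: {kind!r}")

program = ("if", ("raw", "R1 = x < 0"),
                 ("raw", "x = y + 1"),
                 ("raw", "x = y - 1"))
for line in T(program, itertools.count(1)):
    print(line)
```

Because T calls itself on the sub-statements, nested if statements receive fresh labels automatically.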

Axiomatic Semantics

While operational semantics tells us in detail how a program is to be executed, the goal of axiomatic semantics is to help us understand what a program does.

Review the material covered in Math 216 on propositional and predicate logic as a prelude to tackling this section. You are likely to need DeMorgan's laws for negation of conjunctions (and) and disjunctions (or) and also their extension to statements quantified using "for all" and "there exists".

Fundamental to understanding axiomatic semantics is an understanding of state. For our purposes here, a state of a computation (of some program) is simply the values of all the variables in the program. We use predicate logic formulas to describe properties of states that are of interest to us. Examples

    x = y            %% true in states where x and y have equal values
    z < x + y        %% true in states where z is less than x+y
In the context of axiomatic semantics these will be called assertions and will be written inside curly braces {}. Annotating a point in a program with an assertion means "whenever the program is executing at that point, the assertion is true of the current state". The assertion before a statement is called its precondition and the one after is called its postcondition. Example: if S is a statement in a program and we see
  {P} S {Q}
then P is the precondition and Q is the postcondition.

The fundamental insight of axiomatic semantics is that if we want some property to be true of the state after executing a statement it is straightforward to determine what must be true of the state before executing the statement. The before-to-after direction isn't nearly as easy, so axiomatic semantics is initially counter-intuitive and feels like it works "backwards" to many people.

Suppose that I have some statement S with postcondition Q that I wish to be true of the state after executing S. The weakest precondition of S with respect to Q, written wp(S, {Q}), is a predicate that describes all of the states from which executing S leads to a state satisfying Q.

An axiomatic semantics is essentially a set of rules for reasoning about and deriving weakest preconditions for statements of a programming language.

Notice that wp() is a bit like an integral expression in calculus in the following way: it has a well-defined meaning, but without applying some rules to transform it to a simpler form, we won't understand very much about what it is telling us of the initial state. Indeed, axiomatic semantics is sometimes called the weakest-precondition calculus.

Assignment

The axiom for the assignment statement x = E, where E is an expression: wp(x=E, {Q}) is
   Q_{x -> E}
meaning Q with every occurrence of x replaced by E. For example,
  wp(x = y * 5, {x = 10 and z = 4})
is {y * 5 = 10 and z = 4} or equivalently {y = 2 and z = 4}. As another example consider
  wp(x = x * 5, {x < 10 and z = 4})
We can deduce that the precondition must be {x * 5 < 10 and z = 4} or equivalently {x < 2 and z = 4}.

We want the weakest precondition since often there are many possible preconditions. Consider

  wp(x = y * 5, {x < 10})
One precondition is {y < 0}, but it is not the "weakest" since if y is 1 or 0 prior to the assignment, the postcondition is still satisfied. The weakest precondition is {y * 5 < 10} or equivalently {y < 2}.
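
The assignment axiom can be tried out in code by treating assertions as functions from states to truth values. In this sketch (an assumption of the example, not part of the axiom) a state is a Python dictionary, and "replace x by E in Q" is modeled by evaluating Q in the state that the assignment would produce. The helper name wp_assign is made up here. The results are then compared with the simplified preconditions on a sample of integer states, which is a spot check and not a proof.

```python
def wp_assign(var, expr, post):
    """wp(var = expr, {post}): evaluate post in the state updated by the assignment."""
    return lambda state: post({**state, var: expr(state)})

# A sample of integer states to compare predicates on.
states = [{"x": x, "y": y, "z": z}
          for x in range(-3, 4) for y in range(-3, 4) for z in range(2, 6)]

# wp(x = y * 5, {x = 10 and z = 4}) should agree with {y = 2 and z = 4}
pre1 = wp_assign("x", lambda s: s["y"] * 5,
                 lambda s: s["x"] == 10 and s["z"] == 4)
print(all(pre1(s) == (s["y"] == 2 and s["z"] == 4) for s in states))

# wp(x = x * 5, {x < 10 and z = 4}) should agree with {x < 2 and z = 4}
pre2 = wp_assign("x", lambda s: s["x"] * 5,
                 lambda s: s["x"] < 10 and s["z"] == 4)
print(all(pre2(s) == (s["x"] < 2 and s["z"] == 4) for s in states))

# wp(x = y * 5, {x < 10}) should agree with {y < 2}
pre3 = wp_assign("x", lambda s: s["y"] * 5, lambda s: s["x"] < 10)
print(all(pre3(s) == (s["y"] < 2) for s in states))
```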

Sequences

The sequence rule:
   wp( S1; S2, {Q})
is
   wp(S1, wp(S2, {Q}))
So for example
   wp(x1 = E1 ; x2 = E2, {Q})
is
  wp(x1 = E1, {P'})
where
  P' = Q_{x2 -> E2}
so
  wp(x1 = E1 ; x2 = E2, {Q})
is
  (Q_{x2 -> E2})_{x1 -> E1}
As a concrete example:
  wp(x=z+1; y=x*2, {y=10})
is
  wp(x=z+1, {x*2=10})
is
  (z+1)*2=10
or equivalently by solving for z:
  z = 4
Notice that the application of these rules is completely straightforward: The rules tell you exactly what has to be done to go from wp(S, {Q}) to a formula that doesn't involve wp(); again in analogy to integrals: once you know the rules for powers of x you can just apply them without thinking about it.
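
The "apply the rules without thinking" flavor shows up nicely in code: if each statement is modeled as a function from a postcondition to its weakest precondition, the sequence rule is just function composition. The sketch below assumes states are dictionaries and assertions are functions from states to truth values; assign and seq are names invented for this example. The final line compares the computed precondition with z = 4 on a sample of integer states.

```python
def assign(var, expr):
    """Predicate transformer for 'var = expr': maps a postcondition to its wp."""
    return lambda post: (lambda s: post({**s, var: expr(s)}))

def seq(s1, s2):
    """wp(S1; S2, {Q}) = wp(S1, wp(S2, {Q}))."""
    return lambda post: s1(s2(post))

# x = z + 1; y = x * 2  with postcondition {y = 10}
program = seq(assign("x", lambda s: s["z"] + 1),
              assign("y", lambda s: s["x"] * 2))
pre = program(lambda s: s["y"] == 10)

states = [{"x": x, "y": y, "z": z}
          for x in range(-2, 3) for y in range(-2, 3) for z in range(0, 8)]
print(all(pre(s) == (s["z"] == 4) for s in states))
```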

A compact representation of the sequence rule is in asserted program form

   {P} S1 {Q} S2 {R}
where Q => wp(S2, R) and P => wp(S1, Q).

Selection (if)

The axiomatic meaning of a selection:
  wp(if B then S1 else S2, {Q})
is
  (B => wp(S1, {Q})) and (not B => wp(S2, {Q}))
Let's look at an example.
  wp(if x < 0 then x = y + 1 else x = y - 1, {x = 4})
is
 (x < 0 => y=3) and (x >= 0 => y=5)
Equivalently, using def'n of => and the distributive law
  (x < 0 and y = 3) or (x >= 0 and y = 5) 
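Note that this implies, but is not equivalent to,
  (y = 3 or y = 5)
because the value of x still determines which branch runs. For example, the state x = 5, y = 3 satisfies (y = 3 or y = 5), yet the else branch then sets x to 2, not 4.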
A compact representation of the selection rule is in asserted program form
   {P}
   if B then {P and B} S1 {Q} else {P and not B} S2 {Q}
   {Q} 
where (P and B)=> wp(S1,Q) and (P and not B) => wp(S2,Q).
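
The selection rule can be exercised the same way as the other rules, by modeling statements as functions from postconditions to weakest preconditions. In this sketch states are dictionaries, assertions are functions from states to truth values, and assign and if_then_else are names made up for the example. The computed precondition for the example above is compared, on a sample of integer states, with the form (x < 0 and y = 3) or (x >= 0 and y = 5).

```python
def assign(var, expr):
    """Predicate transformer for 'var = expr': maps a postcondition to its wp."""
    return lambda post: (lambda s: post({**s, var: expr(s)}))

def if_then_else(cond, s1, s2):
    """wp(if B then S1 else S2, {Q}) = (B => wp(S1,{Q})) and (not B => wp(S2,{Q}))."""
    def transformer(post):
        pre1, pre2 = s1(post), s2(post)
        return lambda s: ((not cond(s)) or pre1(s)) and (cond(s) or pre2(s))
    return transformer

# if x < 0 then x = y + 1 else x = y - 1, with postcondition {x = 4}
stmt = if_then_else(lambda s: s["x"] < 0,
                    assign("x", lambda s: s["y"] + 1),
                    assign("x", lambda s: s["y"] - 1))
pre = stmt(lambda s: s["x"] == 4)

states = [{"x": x, "y": y} for x in range(-6, 7) for y in range(0, 9)]
expected = lambda s: (s["x"] < 0 and s["y"] == 3) or (s["x"] >= 0 and s["y"] == 5)
print(all(pre(s) == expected(s) for s in states))
```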

Loops

Loops are harder to reason about because we don't know how many times the loop body will be executed. Like the general integration problem in calculus, computing the weakest precondition of a loop is hard, unlike the other program structures we have looked at.

Loops are a powerful programming tool to represent much computation with little code, and axiomatic semantics has an equally powerful tool to reason about what loops do. The key to reasoning about loops is to identify a loop invariant, denoted I. The invariant is an assertion that is true before each execution of the loop body. Using invariants is like using induction: we show that it is true for the base case and show that if it is true for n-1 iterations it is true for n iterations. The rule for

  while B do S end
is easiest (not easy!) to understand when presented in asserted program form.
  {I and B} S {I} implies {I} while B do S end {I and (not B)}
which spelled out in wp() terminology says if
  (I and B) => wp(S, {I})
then
  I => wp(while B do S, {I and not B})
Therefore, using this approach to establish {P} while B do S {Q} we must establish four things.
  1) P implies I
  2) {I and B} S {I}
  3) (I and (not B)) implies Q
  4) the loop terminates
We have to find candidates for the invariant by experience, insight, experiment. Once we have a candidate we can test it using the rules above to see if it meets our needs. Let's look at an example.
  while x != y do y = y - 1 {x = y}
For zero iterations the weakest precondition is
  {x = y}
For one iteration the weakest precondition is
  {x = y - 1}
For two iterations the weakest precondition is
  {x = y - 2}
For n iterations the weakest precondition is
  {x = y - n}
We know that for any non-negative n
  {x = y - n} implies {x <= y}
So our loop invariant is
  {x <= y}
which we will also choose for P. Now let's check whether {I and B} S {I} holds.
  {x <= y and x != y} y = y - 1 {x <= y}
and it does since
  {x <= y and x != y} implies {x < y} 
and
  {x < y} y = y - 1 {x <= y}
Now let's check if (I and (not B)) implies Q.
  {x <= y and (not x != y)} implies {x = y}
simplified
  {x <= y and x = y} implies {x = y}
which we know is true since
  P and Q implies Q
Finally, we have to check that the loop terminates. Informally, we observe that at each iteration the difference between x and y decreases by one and that if the two are ever equal the loop exits. So indeed the loop terminates when x starts out no bigger than y. If we can show loop termination, we have total correctness. If we can't we have only partial correctness.
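
Since candidate invariants are found by experiment, it can help to test one on sample states before attempting a proof. The sketch below does this for the example loop, assuming integer variables and representing states as dictionaries. It checks that the invariant is preserved by the body, that the invariant plus the negated condition gives the postcondition, and that the loop stops from every sampled state satisfying the invariant. The helper run and its fuel bound are assumptions of this example. Passing these checks is evidence, not a proof.

```python
def body(s):                         # S: y = y - 1
    return {**s, "y": s["y"] - 1}

B = lambda s: s["x"] != s["y"]       # loop condition
I = lambda s: s["x"] <= s["y"]       # candidate invariant (also used as P)
Q = lambda s: s["x"] == s["y"]       # postcondition

states = [{"x": x, "y": y} for x in range(-5, 6) for y in range(-5, 6)]

# {I and B} S {I}
preserved = all(I(body(s)) for s in states if I(s) and B(s))
# (I and (not B)) implies Q
exits_ok = all(Q(s) for s in states if I(s) and not B(s))

def run(s, fuel=100):
    """Execute the loop for at most `fuel` iterations; None means it did not stop."""
    while B(s):
        if fuel == 0:
            return None
        s, fuel = body(s), fuel - 1
    return s

terminates = all(run(s) is not None and Q(run(s)) for s in states if I(s))
print(preserved, exits_ok, terminates)
print(run({"x": 1, "y": 0}))         # starts outside the invariant
```

The last line starts from a state with x greater than y, where the invariant does not hold, and the bounded run gives up without reaching x = y.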

Denotational Semantics

The idea behind denotational semantics is to build a function that describes or denotes the meaning of a program. This differs from operational semantics because we want to build a rigorous mathematical definition. Let's assume a simple language consisting of the following.
  numbers, e.g., 3, -10
  a plus operation, e.g., [3 + 4]
  a print operation, e.g., print [3 + 4] would output 7
  an if-then-else statement (with non-zero for "truth"), e.g., 
     if [3 + 4] then 4 else [4 + 5]
The denotation of each construct in the language is given below.
  M('0') =def 0
  M('1') =def 1
   ....
  M([x + y]) =def M(x) + M(y)
  M(if B then expr1 else expr2) =def 
           if M(B) != 0 then M(expr1) else M(expr2)
  M(print expr) =def output M(expr)
Now let's look at the meaning of a program.
  print if [[3 + 4] + -7] then 5 else 6
The meaning is
  M(print if [[3 + 4] + -7] then 5 else 6) 
    -> output M(if [[3 + 4] + -7] then 5 else 6) 
    -> output if M([[3 + 4] + -7]) != 0 then M(5) else M(6)
    -> output if M([3 + 4]) + M(-7) != 0 then 5 else 6
    -> output if M(3) + M(4) + -7 != 0 then 5 else 6
    -> output if 3 + 4 + -7 != 0 then 5 else 6
    -> output if 0 != 0 then 5 else 6
    -> output 6
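
The meaning function M translates almost line for line into a recursive function. The sketch below makes a few assumptions of its own: programs are given as nested tuples rather than text (so no parsing is involved), numbers are Python integers, and output is modeled as a tagged value ("output", n) instead of actually printing.

```python
def M(term):
    """Meaning function for the toy language, over a nested-tuple syntax tree."""
    if isinstance(term, int):              # M('3') = 3
        return term
    op = term[0]
    if op == "plus":                       # M([x + y]) = M(x) + M(y)
        return M(term[1]) + M(term[2])
    if op == "if":                         # non-zero counts as true
        return M(term[2]) if M(term[1]) != 0 else M(term[3])
    if op == "print":                      # M(print expr) = output M(expr)
        return ("output", M(term[1]))
    raise ValueError(f"unknown construct: {op!r}")

# print if [[3 + 4] + -7] then 5 else 6
program = ("print", ("if", ("plus", ("plus", 3, 4), -7), 5, 6))
print(M(program))
```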

Capturing state

Recall that the state of a program is the collection of values of its variables. We can add the state to the meaning (for languages that have assignment) as follows.
  M(x = E, state) =def state_{x -> M(E, state)}
So the meaning of an assignment is to replace the value of x in state with the result of the expression, E, computed on the current state.

Loops

Loops utilize a recursive definition that "unrolls" the loop by one step.
   M(while B do S, state)  =def 
                  if M(B,state) then M(while B do S, M(S,state))
                  else state
So the meaning of a loop depends on the loop condition, B. If B is true, then we go through the loop body at least once, generating a new state, and then re-evaluate the loop with the new state. Otherwise, the loop does not execute and so the state is unchanged by the loop.
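
The recursive definition can be written down directly. In this sketch a state is a dictionary, and the meanings of the condition B, the expression E and the body S are supplied as Python functions on states; M_assign and M_while are names invented for the example. The loop used is while x != y do y = y - 1.

```python
def M_assign(var, expr):
    """M(x = E, state): the state with x rebound to the value of E."""
    return lambda state: {**state, var: expr(state)}

def M_while(cond, body):
    """M(while B do S, state), defined by unrolling one step at a time."""
    def meaning(state):
        if cond(state):
            return meaning(body(state))    # re-evaluate the loop on the new state
        return state                       # condition false: state unchanged
    return meaning

loop = M_while(lambda s: s["x"] != s["y"],
               M_assign("y", lambda s: s["y"] - 1))
print(loop({"x": 2, "y": 5}))
print(loop({"x": 3, "y": 3}))
```

A loop that never terminates has no final state under this definition. In Python the recursion simply fails with a RecursionError once the interpreter's recursion limit is reached.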

Source of Information

These lecture notes are based on Chapter 3 in "Programming Languages, 6ed" by Robert Sebesta.
                                                                                                                                                                                                                                                                                                                                             
  (c) 2003 Curtis Dyreson, (c) 2004 Carl H. Hauser           E-mail questions or comments to Prof. Carl Hauser