Homework #6

1.  Construct a pushdown automata for strings over {a,b} with
exactly twice as many "a"s as "b"s.

Solution:

This machine is similar to the one given in the textbook to accept strings
over {a,b} with exactly as many "a"s as "b".  In this case, for each a we
push an A or pop a B, depending on what is on the top of the stack.
For each b we push BB or pop AA, depending on what is on the top of the
stack.  Because we cannot pop AA in one move, we pop A and change state.
In this new state, we an A is on the top of the stack we can pop the A
without processing any input and move back to the old state.

M = ({s, p, q, f}, {a, b}, {A,B,C}, delta, s, {f})

delta(s, e, e) = (s, C)
delta(a, b, C) = (a, BBC)	delta(a, b, B) = (a, BBB)
delta(a, a, C) = (a, AC)	delta(a, a, B) = (a, e)
delta(a, a, A) = (a, AA)	delta(a, b, A) = (b, e)
delta(a, e, C) = (f, e)
delta(b, b, C) = (b, BBC)	delta(b, b, B) = (b, BBB)
delta(b, a, C) = (b, AC)	delta(b, a, B) = (b, e)
delta(b, e, A) = (a, e)

2.  Construct a PDA for the language {a^i b^j:  i neq j}

Solution:

M = ({q0, q1, q2, q3}, {a,b}, {A,B}, delta, q0, {q2, q3})

delta(q0, a, e) = (q0, A)	delta(q0, a, A) = (q0, AA)
delta(q0, b, e) = (q1, B)	delta(q0, b, A) = (q1, e)
delta(q1, b, A) = (q1, e)	delta(q1, b, e) = (q2, e)
delta(q1, e, A) = (q3, e)	delta(q2, b, e) = (q2, e)

3.  Use the procedure outlined in class to generate a PDA from a CFG to
generate a PDA for the CFG G below.  Trace the operation of the PDA on the
input string (()()).

G = ({S, (, )}, {(, )}, R, S)
R = {S -> e,
S -> SS,
S -> (S)}

Solution:

M = ({p,q}, {(,)}, {S, (, )}, Delta, p, {q})
Delta(p,e,e) = (q,S)
Delta(q,e,S) = (q,e)
Delta(q,e,S) = (q,SS)
Delta(q,e,S) = (q,(S))
Delta(q,(,() = (q,e)
Delta(q,),)) = (q,e)

(p, (()()), e) |- (q, (()()), S) |- (q, (()()), (S)) |-
(q, ()()), S)) |- (q, ()()), SS)) |- (q, ()()), (S)S)) |-
(q, )()), S)S)) |- (q, )()), )S)) |- (q, ()), S)) |- (q, ()), (S))) |-
(q, )), S))) |- (q, )), ))) |- (q, ), )) |- (q, e, e)

4.  Convert grammar G into an equivalent Chomsky Normal Form grammar.

G = (V, Sigma, R, E)
V =     {+, *, (, ), id, T, F, E}
Sigma = {+, *, (, ), id}
R =     {E -> E + T | T
T -> T * F | F
F -> ( E ) | id

Solution:

* There are no e-productions.

* There are 2 unit productions:
E -> T
T -> F

After replacement grammar is

E -> E + T | T * F | ( E ) | id
T -> T * F | ( E ) | id
F -> ( E ) | id

* There are no useless symbols.

* Replace terminals on right sides of rules (with 2 or more symbols) with
nonterminals

After replacement grammar is

E -> E X1 T | T X2 F | X3 E X4 | id
T -> T X2 F | X3 E X4 | id
F -> X3 E X4 | id
X1 -> +
X2 -> *
X3 -> (
X4 -> )

* Make each rule have two nonterminals or one terminal on the right side.

After replacement is

E -> E D1
D1 -> X1 T
E -> T D2
D2 -> X2 F
E -> X3 D3
D3 -> E X4
E -> id
T -> T D2
T -> X3 D3
T -> id
F -> X3 D3
F -> id
X1 -> +
X2 -> *
X3 -> (
X4 -> )

5.  Let G be a grammar in Chomsky normal form.  If w has length n and is an
element of L(G), what is the maximum possible depth of any derivation
tree for w?  What is the minimum possible depth of any derivation tree for w?
Provide precise bounds, and prove that your answers are the best possible.
Note that the depth of a tree is the maximum number of edges on any simple path
from the root to a leaf.

Solution:

The depth of the derivation tree depends on the number of variables in the
grammar.  We show in class that if the number of variables is k, the
maximum-length path for w (and thus the depth of the derivation tree)
is k.  The proof of this bound is provided as part of the proof of the
pumping lemma for context free languages.

For the minimum depth, note that CNF rules are of the form
A -> BC
or
A -> a
(B,C nonterminals, a terminal).

Let n = |w|, w = a1 a2 ... an.  Each character a1 ... an will be derived
by a separate rule of the form Xi -> ai.  However, no rule will contain
more than two of the Xi nonterminals on the right hand side.  Thus the
shortest possible derivation will only be able to contain one Xi on the
right hand side of each rule used, except for the bottom rule in the tree
which will contain two Xi nonterminals.  The form of this derivation will
be

S      -> X1 D1
D1     -> X2 D2
D2     -> X3 D3
...
D{n-1} -> X{n-1} Xn

The shortest derivation will apply all of these n rules in term plus
the rule Xn -> an (other terminals can be generated earlier in the tree).
The total derivation length of this shortest derivation, and thus the
minimum length of the derivation tree, is n + 1.

5.  From Homework #5:  Lex and Yacc.

File "l"

letter   [a-zA-Z]
delim    [ \n\t]
ws       {delim}+
digit    [0-9]

%%

{ws}           {}
"program"      { fprintf(stderr,"Parsed PROGRAM token\n"); return(TPROGRAM); }
"begin"        { fprintf(stderr,"Parsed BEGIN token\n"); return(TBEGIN); }
"end"          { fprintf(stderr,"Parsed END token\n"); return(TEND); }
"("            { fprintf(stderr,"Parsed ( token\n"); return(TLPAREN); }
")"            { fprintf(stderr,"Parsed ) token\n"); return(TRPAREN); }
";"            { fprintf(stderr,"Parsed ; token\n"); return(TSEMI); }
"."            { fprintf(stderr,"Parsed . token\n"); return(TDOT); }
","            { fprintf(stderr,"Parsed , token\n"); return(TCOMMA); }
\"{letter}({letter}|{digit})*\"          { pident(yytext); return(TIDENT); }

%%

pident(lexeme)
char *lexeme;
{
fprintf(stderr,"Parsed ident %s\n",lexeme);
}

File "y"

%token TPROGRAM TLPAREN TRPAREN TSEMI TDOT TCOMMA TBEGIN TEND TIDENT

%%

pascal_program   : TPROGRAM ident
TLPAREN program_heading TRPAREN TSEMI block TDOT
{fprintf(stderr,"Done\n");}

| program_heading TCOMMA ident

block            : TBEGIN TEND

ident            : TIDENT

%%

#include "lex.yy.c"
#include

main()
{
return(yyparse());
}

yyerror(s)
char *s;
{
fprintf(stderr,"%s\n",s);
}