Homework #6 1. Construct a pushdown automata for strings over {a,b} with exactly twice as many "a"s as "b"s. Solution: This machine is similar to the one given in the textbook to accept strings over {a,b} with exactly as many "a"s as "b". In this case, for each a we push an A or pop a B, depending on what is on the top of the stack. For each b we push BB or pop AA, depending on what is on the top of the stack. Because we cannot pop AA in one move, we pop A and change state. In this new state, we an A is on the top of the stack we can pop the A without processing any input and move back to the old state. M = ({s, p, q, f}, {a, b}, {A,B,C}, delta, s, {f}) delta(s, e, e) = (s, C) delta(a, b, C) = (a, BBC) delta(a, b, B) = (a, BBB) delta(a, a, C) = (a, AC) delta(a, a, B) = (a, e) delta(a, a, A) = (a, AA) delta(a, b, A) = (b, e) delta(a, e, C) = (f, e) delta(b, b, C) = (b, BBC) delta(b, b, B) = (b, BBB) delta(b, a, C) = (b, AC) delta(b, a, B) = (b, e) delta(b, e, A) = (a, e) 2. Construct a PDA for the language {a^i b^j: i neq j} Solution: M = ({q0, q1, q2, q3}, {a,b}, {A,B}, delta, q0, {q2, q3}) delta(q0, a, e) = (q0, A) delta(q0, a, A) = (q0, AA) delta(q0, b, e) = (q1, B) delta(q0, b, A) = (q1, e) delta(q1, b, A) = (q1, e) delta(q1, b, e) = (q2, e) delta(q1, e, A) = (q3, e) delta(q2, b, e) = (q2, e) 3. Use the procedure outlined in class to generate a PDA from a CFG to generate a PDA for the CFG G below. Trace the operation of the PDA on the input string (()()). G = ({S, (, )}, {(, )}, R, S) R = {S -> e, S -> SS, S -> (S)} Solution: M = ({p,q}, {(,)}, {S, (, )}, Delta, p, {q}) Delta(p,e,e) = (q,S) Delta(q,e,S) = (q,e) Delta(q,e,S) = (q,SS) Delta(q,e,S) = (q,(S)) Delta(q,(,() = (q,e) Delta(q,),)) = (q,e) (p, (()()), e) |- (q, (()()), S) |- (q, (()()), (S)) |- (q, ()()), S)) |- (q, ()()), SS)) |- (q, ()()), (S)S)) |- (q, )()), S)S)) |- (q, )()), )S)) |- (q, ()), S)) |- (q, ()), (S))) |- (q, )), S))) |- (q, )), ))) |- (q, ), )) |- (q, e, e) 4. Convert grammar G into an equivalent Chomsky Normal Form grammar. G = (V, Sigma, R, E) V = {+, *, (, ), id, T, F, E} Sigma = {+, *, (, ), id} R = {E -> E + T | T T -> T * F | F F -> ( E ) | id Solution: * There are no e-productions. * There are 2 unit productions: E -> T T -> F After replacement grammar is E -> E + T | T * F | ( E ) | id T -> T * F | ( E ) | id F -> ( E ) | id * There are no useless symbols. * Replace terminals on right sides of rules (with 2 or more symbols) with nonterminals After replacement grammar is E -> E X1 T | T X2 F | X3 E X4 | id T -> T X2 F | X3 E X4 | id F -> X3 E X4 | id X1 -> + X2 -> * X3 -> ( X4 -> ) * Make each rule have two nonterminals or one terminal on the right side. After replacement is E -> E D1 D1 -> X1 T E -> T D2 D2 -> X2 F E -> X3 D3 D3 -> E X4 E -> id T -> T D2 T -> X3 D3 T -> id F -> X3 D3 F -> id X1 -> + X2 -> * X3 -> ( X4 -> ) 5. Let G be a grammar in Chomsky normal form. If w has length n and is an element of L(G), what is the maximum possible depth of any derivation tree for w? What is the minimum possible depth of any derivation tree for w? Provide precise bounds, and prove that your answers are the best possible. Note that the depth of a tree is the maximum number of edges on any simple path from the root to a leaf. Solution: The depth of the derivation tree depends on the number of variables in the grammar. We show in class that if the number of variables is k, the maximum-length path for w (and thus the depth of the derivation tree) is k. The proof of this bound is provided as part of the proof of the pumping lemma for context free languages. For the minimum depth, note that CNF rules are of the form A -> BC or A -> a (B,C nonterminals, a terminal). Let n = |w|, w = a1 a2 ... an. Each character a1 ... an will be derived by a separate rule of the form Xi -> ai. However, no rule will contain more than two of the Xi nonterminals on the right hand side. Thus the shortest possible derivation will only be able to contain one Xi on the right hand side of each rule used, except for the bottom rule in the tree which will contain two Xi nonterminals. The form of this derivation will be S -> X1 D1 D1 -> X2 D2 D2 -> X3 D3 ... D{n-1} -> X{n-1} Xn The shortest derivation will apply all of these n rules in term plus the rule Xn -> an (other terminals can be generated earlier in the tree). The total derivation length of this shortest derivation, and thus the minimum length of the derivation tree, is n + 1. 5. From Homework #5: Lex and Yacc. File "l" letter [a-zA-Z] delim [ \n\t] ws {delim}+ digit [0-9] %% {ws} {} "program" { fprintf(stderr,"Parsed PROGRAM token\n"); return(TPROGRAM); } "begin" { fprintf(stderr,"Parsed BEGIN token\n"); return(TBEGIN); } "end" { fprintf(stderr,"Parsed END token\n"); return(TEND); } "(" { fprintf(stderr,"Parsed ( token\n"); return(TLPAREN); } ")" { fprintf(stderr,"Parsed ) token\n"); return(TRPAREN); } ";" { fprintf(stderr,"Parsed ; token\n"); return(TSEMI); } "." { fprintf(stderr,"Parsed . token\n"); return(TDOT); } "," { fprintf(stderr,"Parsed , token\n"); return(TCOMMA); } \"{letter}({letter}|{digit})*\" { pident(yytext); return(TIDENT); } %% pident(lexeme) char *lexeme; { fprintf(stderr,"Parsed ident %s\n",lexeme); } File "y" %token TPROGRAM TLPAREN TRPAREN TSEMI TDOT TCOMMA TBEGIN TEND TIDENT %% pascal_program : TPROGRAM ident TLPAREN program_heading TRPAREN TSEMI block TDOT {fprintf(stderr,"Done\n");} program_heading : ident | program_heading TCOMMA ident {fprintf(stderr,"Parsed program heading\n");} block : TBEGIN TEND ident : TIDENT %% #include "lex.yy.c" #includemain() { return(yyparse()); } yyerror(s) char *s; { fprintf(stderr,"%s\n",s); }