October 13

* Pushdown Automata

  DFAs provided a very limited model of computation
  No external memory
  This prevents us from recognizing string in languages like 0^n 1^n,
     where we need to REMEMBER the number of 0s we have seen

  A PDA is an NFA with a stack (LIFO)
  In one move it can

     - pop the top symbol off the stack
     - push string of finite length onto stack
     - read the top symbol on the stack
     - change state
     - read the input symbol

  With a stack we can push a symbol for each processed 0, and pop them
as we process 1s.

  Think of stack machines as a stack of plates.  The plates are in a holder,
so the only plate that can be seen is the top plate.  Each different plate
pattern represents a symbol.  Plates can be added to and removed from the stack.

   PDAs are nondeterministic - we will define deterministic PDAs later.

   A PDA M = (K, Sigma, gamma, Delta, s, F)
      K: finite set of states
      Sigma:  alphabet (input symbols)
      Gamma:  alphabet (stack symbols)
      s: s in K is the initial state
      F: F <= K is the set of final states
      Delta:  transition relation, <= (K x (Sigma v e) x Gamma*) x (K x Gamma*)


   Pictoral representation


      +-+ a, z/alpha +-+
      |p|----------->|q|
      +-+            +-+

   A transition here means "when in state p reading input symbol `a'
and z is the symbol popped off of the stack, change state to q and push
alpha onto the stack".  If a = e, then we can move regardless of the
input symbol and do not advance the read head.


   Example
                    0, 0/00        1, 0/e
                     +--+           +--+
                     |  |           |  |
                     v  |           v  |
      +--+ 0, e/0    +--+ 1, 0/     +--+ e, e/e    +====+
      |s0|---------->|s1|---------->|s2|---------->||sf||
      +--+           +--+           +--+           +====+

      L(M) = ?

      L(M) = {0^n 1^n : n >= 0}


                    0, 1/01
                    0, 0/00
		    1, 1/11        1, 1/e
		    1, 0/10        0, 0/e
                     +--+           +--+
                     |  | e, 0/0    |  |
           0, e/0    v  | e, 1/1    V  |
      +--+ 1, e/1    +--+ e, e/e    +--+ e, e/e    +====+
      |s0|---------->|s1|---------->|s2|---------->||sf||
      +--+           +--+           +--+           +====+

      L(M) = ?

      L(M) = palindromes


   A "configuration" of a PDA that describes the current status of a PDA
is a member of K x Sigma* x Gamma*.  This is a triple including the current
state, the input yet to be read, and the contexts of the stack.
Configuration (q, w, abc) means we are in state q, we still need to process
w, a is at the top of the stack (next to pop) and c is at the bottom
of the stack.

   (p, x, alpha) "yields in one step" (q, y, zeta), or
(p, x, alpha) |-M (q, x, beta) if there is a transition
((p, a, beta), (q, gamma)) in Delta s.t.

   x = ay
   alpha = beta miu
   zeta = gamma miu

   miu in Gamma*

   In other words, there is a transition from p to q when processing a,
the first character of alpha (beta) is removed from the stack during the
transition and a new string (gamma) is pushed on the top of the stack.

   M "accepts" string w iff (s, w, e) |-*M (p, e, e) for some p in F.
The "language accepted by M", L(M), is the set of all strings accepted by M.

   When processing 001100 in the example machine
(s0, 001100, e) |- (s1, 01100, 0) |- (s1, 1100, 00) |- (s1, 100, 100) |-
   (s2, 100, 100) |- (s2, 00, 00) |- (s2, 0, 0) |- (s2, 0, e) |- (sf, e, e)

(s0, 001100), e) |-* (sf, e, e)



* Equivalent class of languages for CFG and PDA

  Proof

  1.  CFG -> PDA

     Given G = (V, Sigma, R, S), construct PDA M s.t. L(M) = L(G).

     M = ({p,q}, Sigma, V, Delta, p, {q})
     Delta = {((p, e, e), (q, S))     Transition immediately to q and push S
	      ((q, e, A), (q, x))     For each rule A -> x in R
	      ((q, a, a), (q, e))}    For each a in Sigma


   M first pushes S and enters q.
For each step after this, simulate CFG rules - either replace
a nonterminal A on the stack by x if A -> x, or pop a terminal
symbol that matches the next input symbol.

   This PDA imitiates a leftmost derivation of the string.

         G = ({S,a,b}, {a,b,c}, R, S)
	 R = S-> c | aSa | bSb
   L(G) = {wcw^R:  w in Sigma*}


   M = ({p,q}, Sigma, V, Delta, p, {q})

      Delta = {((p,e,e), (q,S)),
	       ((q,e,S), (q,c))
	       ((q,e,S), (q,aSa)),
	       ((q,e,S), (q,bSb)),
	       ((q,c,c), (q,c)),
	       ((q,a,a), (q,a)),
	       ((q,b,b), (q,b))}

(p,aabbcbbaa,e) |- (q,aabbcbbaa,S) |- (q,aabbcbbaa,aSa) |- (q,abbcbbaa,Sa) |-
   (q,abbcbbaa,aSaa) |- (q,bbcbbaa,Saa) |- (q,bbcbbaa,bSbaa) |-
   (q,bcbbaa,Sbaa) |- (q,bcbbaa,bSbbaa) |- (q,cbbaa,Sbbaa) |- (q,cbbaa,cbbaa) |-
   (q,bbaa,bbaa) |- (q,baa,baa) |- (q,aa,aa) |- (q,a,a) |- (q,e,e)

   Proof of equivalence in book using induction on |w|.


   2.  PDA -> CFG

       Given G = (V, T, P, S)

       Construct M such that L(M) = L(G).


       Intuitive idea.

       Suppose M reads "a" and top of stack is "A", and M replaces
       A with B1 B2 ... Bm and advances read head.

       The grammar should generate "a" and replace nonterminal A with
       B1 B2 ... Bm.

	  * Variables in derivation = symbols of stack

	  * Terminal generated = portion of input already scanned.

       We would have production A -> a B1 B2 ... Bm.

       This is the converse of our construction of PDAs from grammars.


       Problem with this approach:

	  M cannot always replace A (while reading a) with B1 ... Bm,
	     it can only do this if it is in the correct state.

	  If M has only one state, then our construction is correct.
	     In general, we have to take care of different states.


       Rather than add A -> a B1 B2 ... Bm,
	  we restrict the grammar to take into account the state
	  transitions of the machine.

	  Variables = {[s, A, p]:  s, p in S, A in Gama}

	  Intuitively, [s, A, p] -> a[p, B, ?]
	     will mean that in M we can, while reading "a", go to state p
	     from state s while replacing top-of-stack A with the symbol B.

	     A leftmost derivation of x will correspond to M on input x.

       Our goal is to show that [s, A, p] =>*G x iff M, starting in state s,
	  reads input x, erases A, and ends up in state p.

       If M = (K, Sigma, Gamma, delta, s, empty)
	  (accept by null stack - can show this is equivalent to accept
	   by final state)

       Define G = (V, Sigma, P, S) where
	  V = {[q, A, p]: q, p in K, A in Gamma} union {S}

	  Productions are

	     (1) S -> [s, e, q] for each q in K
	     (2) [q, A, q{m+1}] -> a[q1, B1, q2][q2, B2, q3]...[qm, Bm, q{m+1}]
		 for each q, q1, q2, ..., q{m+1} in K, each a in Sigma union e,
		 and A, B1, B2, ..., Bm in Gamma, s.t. delta(q, a, A)
		 contains (q1, B1 B2 ... Bm).
		 If m = 0, the production is [q, A, q1] -> a.

   Example
                        1, X/e
         0, e/X         e, X/e
	 0, X/XX        e, e/e
          +--+           +--+
          |  |           |  |
          v  |           v  |
          +--+ 1, X/e    +--+
      --->|q0|---------->|q1|
          +--+           +--+

      V = {S, [q0, X, q0], [q0, X, q1], [q1, X, q0], [q1, X, q1],
	      [q0, e, q0], [q0, e, q1], [q1, e, q0], [q1, e, q1], 0, 1}

      T = {0,1}

	 S -> [q0, e, q0]
	 S -> [q0, e, q1]

         delta(q0, 0, e) = {(q0, X)}
	 [q0, e, q0] -> 0[q0, X, q0][q0, e, q0]
	 [q0, e, q0] -> 0[q0, X, q1][q1, e, q0]

	 delta(q0, 0, e) = {(q0, X)}
	 [q0, e, q1] -> 0[q0, X, q0][q0, e, q1]
	 [q0, e, q1] -> 0[q0, X, q1][q1, e, q1]

	 delta(q0, 0, X) = {(q0, XX)}
	 [q0, X, q0] -> 0[q0, X, q0][q0, X, q0]
	 [q0, X, q0] -> 0[q0, X, q1][q1, X, q0]
	 [q0, X, q1] -> 0[q0, X, q0][q0, X, q1]
	 [q0, X, q1] -> 0[q0, X, q1][q1, X, q1]

	 delta(q0, 1, X) = {(q1, e)}
	 [q0, X, q1] -> 1

	 delta(q1, 1, X) = {(q1, e)}
	 [q1, e, q1] -> e

	 delta(q1, e, e) = {(q1, e)}
	 [q1, X, q1] -> e

	 delta(q1, e, X) = {(q1, e)}
	 [q1, X, q1] -> e

	 delta(q1, 1, X) = {(q1, e)}
	 [q1, X, q1] -> 1



         There are no productions for [q1, X, q0] and [q1, e, q0].
	 All productions for [q0, X, q0] and [q0, e, q0] and one of these on the
	 right hand side, so no terminal string can be derived from them.
	 Deleting rules with these four symbols, we have the following rules.

	 S -> [q0, e, q1]
	 [q0, e, q1] -> 0[q0, X, q1][q1, e, q1]
	 [q0, X, q1] -> 0[q0, X, q1][q1, X, q1]
	 [q1, X, q1] -> 1
	 [q1, e, q1] -> e
	 [q1, X, q1] -> e
	 [q1, X, q1] -> 1