Homework #7


1.  Show that the language L = {a^i b^j c^k: i < j < k} is not a context-free
language.

    Solution:  If L were context free, then the pumping lemma should hold.
    Let z = a^n b^{n+1} c^{n+2}.  Given this string and knowing that |z| >= n,
    we want to define z as uvwxy such that |vwx| <= n, |vx| >= 1.
    Because |vwx| <= n, there are five possible descriptions of uvwxy:

       1.  vwx is a^p for some p<=n, p>=1
       2.  vwx is a^p b^q for some p+q<=n, p+q>=1
       3.  vwx is b^p for some p<=n, p>=1
       4.  vwx is b^p c^q for some p+q<=n, p+q>=1
       5.  vwx is c^q for some i<=n, i>=1

       Note that because |vwx| <= n, vwx cannot contain both "a"s and "c".
       For all of these cases, u v^i w x^i y, i>=0, should be in the language.

       In case 1, if i=2 we will be adding an a to the string, making the
       number of "a"s n+1 and thus the string is not in the language.  The
       same argument holds for case 3 in which the number of "b"s will be
       equal to the number of "c"s.  A similar argument holds in case 5.
       In case 5 if i=0 then the number of "c"s will be less than or equal to
       the number of "b"s.

       In case 2, when i=2 either the number of "a"s will be greater than the
       number of "b"s or the number of "b"s will be greater than the number of
       "c"s (depending on the distribution of v and x).

       In case 4, when i=0 either the number of "b"s will be less than or equal
       to number of "a"s or the number of "c"s will be less than or equal to
       the number of "b"s (depending on the distribution of v and x).

2.  Show that the language L = {a^i: is is a prime} is not a context-free
language.

    Solution:  If L were context free, then the pumping lemma should hold.
    We pick a string z equal to 0^m, where m is prime and >= n + 2.
    0^m = uvwxy = 0^|u| 0^|v| 0^|w| 0^|x| 0^|y|, |vwx| <= n, |vx| >= 1.
    If we pick i = |u| + |w| + |y|, then uv^iwx^iy =
    0^|u| 0^{|v|(|u|+|w|+|y|)} 0^|w| 0^{|x|(|u|+|w|+|y|)} 0^|y|
    = 0^{(|v|+|x|+1)(|u|+|w|+|y|)}.  Because |v| + |x| + 1 >= 2 and
    |u| + |y| >= 2 (this has to be true because |vwx| <= n), the string 
    is not in L.  Thus the language is not context free.

3.  Is the language L = {w w^R w: w is in {a,b}*} a context-free language?
Prove or disprove your answer.

    Solution:  L is not a context-free language.  Remember that CFL are closed
    under union.  We know that because L' = a*b*a*b* is a regular language it is
    also a context-free language.  However, L union L' is
    L'' = a^i b^j a^i b^{j/2}, which is not a context-free language.

    If L'' were context-free than the pumping lemma should hold.  Let
    z = a^n b^n a^n b^{n/2}.  Because z = uvwxy and |vwx| <= n, |vw| >= 1,
    vwx cannot span the first and second set of "a"s (the string of "b"s
    between the two substrings is of length n).  Thus if vwx contains any "a"s
    then when i=0 the resulting string will not be in L''.
    
    If vwx does not contain any "a"s it must contain all "b"s.  However, vwx
    cannot span the first and second set of "b"s (the string of "a"s between
    the two substrings if of length n).  Thus if vwx contains "b"s from the
    first group, when i=0 the number of "b"s in the first group will be less
    than twice the number of "b"s in the second group.  Similarly, if vwx
    contains "b"s from the second group, when i=0 the number of "b"s in the
    second group will be less than half the number of "b"s in the first group.
    Because the pumped strings are not in L'', L is not context free.

4.  Show that CFLs are closed under the REVERSE operation, where
REVERSE(L) = {x: x^R is in L}.

    Solution:  Let G = (V, T, P, S) be a CFG in Chomsky Normal Form
    such that L = L(G).  We construct a CFG G' such that L(G') = REVERSE(L(G)).
    Note that every rule in G is of the form
       A -> BC or
       A -> a     for nonterminals B,C and terminal a.

    Let G' = (V, T, P', S) where P' is defined as follows:

       - Every rule in P of the form A -> a is also contained in P'
       - For every rule in P of the form A -> BC, the rule
	 A -> CB is contained in P'.

    For example, if P contains the following rules

       S -> aSb | ab    (a^i b^i:  i >= 1),

       in Chomsky Normal Form these rules become

       S -> AC | AB
       C -> SB
       A -> a
       B -> b


       G' contains the rules

       S -> CA | BA
       C -> BS
       A -> a
       B -> b

       which generates strings {ba, bbaa, bbbaaa, etc., or b^i a^i: i >= 1).


5.  How would you modify the CYK algorithm (or the dynamic programming
algorithm presented in the book, choose either one) to produce an actual
derivation of x in G?

    Solution:  Using the CYK algorithm we can store along with each symbol in
    the box, the two boxes higher in the table that were used to produce this
    symbol.  For example, if S were stored in location (i=1, j=6) because A is
    stored in location (i=1, j=2), B is stored in location (i=3, j=4), and
    S -> AB is in the set of rules, then we store the two contributing locations
    along with the rule and symbol S in location (i=1, j=6).  To produce a
    derivation of x, use the CYK algorithm to decide if x is in G.  If x is in
    G, use the stored information to generate the derivation in the following
    way.  Starting with the last box, output the stored rule(s) for S.  For each
    symbol on the right hand side of the rule, visit the box that corresponds to
    the symbol and expand the derived string according to the rule stored there.
    Repeat this operation until no nonterminals are left in the string.

    Consider the application of CYK to the string baaaa in G, where the
    rules are   S -> AB | AC, A -> BC | a, B -> CB | b, C -> AA | b

             1       2       3       4       5
         +-------+-------+-------+-------+-------+
         |  B,C  |   A   |   A   |   A   |   A   |
       1 | B->b  | A->a  | A->a  | A->a  | A->a  |
         | C->b  |       |       |       |       |
         +-------+-------+-------+-------+-------+
         |  {}   |   C   |   C   |   C   |
       2 |       | C->AA | C->AA | C->AA |
         |       | (2,1) | (3,1) | (4,1) |
         |       | (3,1) | (4,1) | (5,1) |
         +-------+-------+-------+-------+
         |   A   |   S   |   S   |
       3 | A->BC | S->AC | S->AC |
         | (1,1) | (2,1) | (3,1) |
         | (2,2) | (3,2) | (4,2) |
         +-------+-------+-------+
         |   C   |  {}   |
       4 | C->AA |       |
         | (1,3) |       |
         | (4,1) |       |
         +-------+-------+
         |   S   |
       5 | S->AC |
         | (1,3) |
         | (4,2) |
         +-------+

    S -> A(1,3) C(4,2) -> B(1,1) C(2,2) C(4,2) -> b C(2,2) C(4,2) ->
	 b A(2,1) A(3,1) C(4,2) -> ba A(3,1) C(4,2) -> baa C(4,2) ->
	 baa A(4,1) A(5,1) -> baaa A(5,1) -> baaaa

    The rules and box locations can be stored in the dynamic programmin
    algorithm described in the textbook to generate the derivation in a
    similar fashion.

6.  Use the CYK algorithm for the grammar G below to determine whether
    a) aaaaa
    b) aabbab

    are in the language generated by grammar G.

    G = ({S, A, B, C, a, b}, {a, b}, R, S)
    R = {S -> AB | BC
	 A -> BA | a
	 B -> CC | b
	 C -> AB | a}


    Solution:

    a)
                 position
	      1   2   3   4   5
            +---+---+---+---+---+
      l   1 |A,C|A,C|A,C|A,C|A,C|
      e     +---+---+---+---+---+
      n   2 | B | B | B | B |
      g     +---+---+---+---+
      t   3 |S,A|S,A|S,A|
      h     +---+---+---+
          4 |{} |{} |
            +---+---+
          5 |S,A|       S is in the box (1,5), so aaaaa is in the language.
            +---+


     b)
                 position
	      1     2   3   4   5   6
            +-----+---+---+---+---+---+
      l   1 |A,C  |A,C| B | B |A,C| B |
      e     +-----+---+---+---+---+---+
      n   2 | B   |S,C|{} |S,A|S,C|
      g     +-----+---+---+---+---+
      t   3 | B   | B | A |S,C|
      h     +-----+---+---+---+
          4 |S,C  | A |S,C|
            +-----+---+---+
          5 |A,B  |A,B|
            +-----+---+
          6 |S,B,C|       S is in the box (1,6), so aabbab is in the language.
            +-----+