```September 17

*  Applications

http://www.ibc.wustl.edu/~zuker/Bio-5495/sequences-html/node5.html

Use FSM to look for a pattern in a DNA sequence

*  L(M) = ((ab v aab)*a*)*
M = ?

a   b
+-++--------+
v |v        |
--->+===+ a  +-+ |
+->|| ||--->| |-+
|  +===+    +-+
|   a|
|    v
|   +-+
b |   | |
|   +-+
|   a|
|    v
|   +-+
|   | |
|   +-+
|    ^
+----+

*  NFAs with epsilon-transitions

Allow change of state without reading input

delta:  K x (Sigma union {e}) -> P(S)

Example:

0       1
+--+    +--+
v  |    v  |
+--+ e  +--+
|q0|--->|q1|
+--+    +--+
| e \ 1
|    \
v     v
+--+ 0  +====+<+
|q3|--->||q2|| | 0
+--+    +====+-+

Provides flexibility in constructing FSM

(0 v 11)101*

+-+ 0   +-+
--->| |---->| |
+-+     +-+
1|       |
v       |
+-+     e|
| |      |
+-+      |        1
1|       | +-------------+
v       | v             |
+-+ e   +===+ 1  +-+ 0  +-+
| |---->|| ||--->| |--->| |
+-+     +===+    +-+    +-+

* Equivalence of Techniques

We have shown four methods of representing regular languages
DFAs, NFAs, NFAs with epsilon-transitions, regular expressions

We will now show that these are equivalent
We will show methods of converting from one representation to another

+-------->DFAs----------+
|                       |
|                       |
v                       v
NFAs                    REs
^                       |
|                       |
|                       |
+------>NFAs with<------+
epsilon-transitions

* DFAs <-> NFAs

DFAs -> NFAs

Theorem:  if L is accepted by some DFA, then L is accepted by some NFA.
Proof:  Obvious.  NFAs subsume DFAs.

NFAs -> DFAs

Theorem:  if L is accepted by some NFA, then L is accepted by some DFA.
Proof:  Build a DFA that keeps track of all the states the NFA could be in at
any time while reading the input.

Note:  Finite automata M1 and M2 are said to be "equivalent"
iff L(M1) = L(M2).

Example

M1 = ({q0,q1,q2}, {0,1}, delta, q0, {q2})
d(q0,0) = q0, d(q0,0) = q1, d(q0,1) = q2,
d(q1,0) = q2,
d(q2,1) = q2, d(q2,1) = q1

M2 = ({{}, q0, q1, q2, q0q1, q0q2, q1q2, q0q1q2}, {0,1}, delta, q0,
{q2, q0q2, q1q2, q0q1q2})

Note that
* DFA states correspond to all possible sets of states of NFA
* Final states correspond to all states containing NFA final states
* Transitions correspond to NFA transitions

d(q0,0) = q0q1		d(q0,1) = q2
d(q1,0) = q2		d(q1,1) = {}
d(q2,0) = {}		d(q2,1) = q1q2
d({},0) = {}		d({},1) = {}
d(q0q1,0) = q0q1q2 	d(q0q1,1) = q2
d(q0q2,0) = q0q1		d(q0q2,1) = q1q2
d(q1q2,0) = q2		d(q1q2,1) = q1q2
d(q0q1q2,0) = q0q1q2	d(q0q1q2,1) = q1q2

Proof
Let M = (K, Sigma, delta, s, F) be an NFA.
Construct a DFA M' = (K', Sigma, delta', s', F') s.t. L(M) = L(M').

K' = P(K) = power set of K
Note that we will remove states that cannot be reached
s' = s
F' = {k' in K': k' intersect F neq empty}
delta'(q0q1...qi, a) = p0p1...pj  iff
delta({q0, q1, ..., qi}, a) = {p0, p1, ..., pj}
Note that delta' is computed by applying delta to each state of K
represented by q0q1...qi.  When we apply delta to each of these
states and take the union we get a new set of states, p0 ... pj.
The new set of states has a representative state p0...pj, which is the
value of delta'(q0q1...qi, a).

Prove by induction on |w| that
delta'(s', w) = q0q1...qi iff delta(s, w) = {q0, q1, ..., qi}.

Base Case:  Let |w| = 0.  In this case s' = s and w must be e,
and we know delta'(s',e) = s' and delta(s,e) = {s}.

Inductive Hypothesis:  Assume property holds for |w| <= n.

Inductive Step:  Prove property then holds for |w| = n+1.
Let a represent the last symbol in the string (string is xa), |x| = n.
delta'(s', xa) = delta'(delta'(s', x), a).

From the inductive hypothesis we know
delta'(s', x) = q0q1...qj iff delta(s, x) = {q0, q1, ..., qj}.

From the definition of delta' we know
delta'(q0q1...qj, a) = [r0r1...rk] iff
delta({q0, q1, ..., qj}, a) = {r0, r1, ..., rk}.

Thus we know that
delta'(s', xa) = r0r1...rk iff delta(s, xa) = {r0, r1, ..., rk}.

Thus the property holds.

We also know that delta'(s', w) is in F' exactly when delta(s, w) contains
a state of K that is in F.

Thus L(M) = L(M').
Q.E.D.

* NFAs <-> NFAs with epsilon-transitions

NFAs -> NFAs with epsilon-transitions

Theorem:  if L is accepted by some NFA,
then L is accepted by some NFA with epsilon-transitions.
Proof:  Obvious.  NFAs with epsilon-transitions subsume NFAs.

NFAs with epsilon-transitions -> NFAs

Theorem: if L is accepted by some NFA with epsilon-transitions,
then L is accepted by some NFA without epsilon-transitions.
Proof:  Given M = (K, Sigma, delta, s, F) which is an NFA with
epsilon-transitions,

construct M' = (K', Sigma, delta', s', F') which is an
NFA without epsilon-transitions, s.t. L(M) = L(M').

K' = K
s' = s
delta' is an extension of delta.  We add extra edges to compensate for lack
of epsilon-edges.

delta' has all non-epsilon-transitions that delta has.
In addition, forall p,q in K, forall a in Sigma, if
e   a   e
p ~~~>--->~~~>q in M, where ~~~> indicates a finite-length path of e edges
a
then add transition p--->q in M'

e
F' = F unless s~~~>S_{accept} in M, in which case F' = F union {s}.

For the previous example we note:

q0 on 0 can lead to q0, q1, q2, q3
1             q1, q2, q3
q1 on 0 can lead to q2
1             q1, q2, q3
q2 on 0 can lead to q2
1             empty
q3 on 0 can lead to q2
1             empty

This defines our NFA without epsilon-transitions, F = {q2} because there
was not a path from s to a final state consisting of epsilon-edges only
in the original NFA.

Leave the proof as an exercise.

* Example of the conceptual power of nondeterminism

Let L be a language.  Define 1/2(L) to be
{x: for some y such that |x| = |y|, xy is in L}.

1/2(L) is the set of first halves of strings in L.  Prove that for each
regular L, 1/2(L) is regular.

Proof:  Let M be a DFA for L.  M' will be an NFA constructed to accept 1/2(L).
M' uses the "finger method" as follows.
M' places the left finger on s in M, and the right finger on s_i in M
where s_i is a nondeterministic "guess" as to where input x will lead.

As aharacters of x are read, M' moves the left finger along transitions
as dictated by characters read, and simultaneously moves the right finger
along a nondeterministically chosen edge (which may be labeled with any
character, not necessarily the one of x being read).

M' accepts x iff at the end its left finger is on s_i and
its right finger is on the accept state.
Thus M' accepts x iff
* x leads to some state s_i in M, and
* there exists y |y| = |x| s.t. y leads from s_i to an accept state in M.

Formally:

Use tuple for fingers (we need a middle finger to remember
the guessed s_i state).

M' = (K', Sigma, delta', s', F'), where
K' = K_{left finger} x K{middle finger} x K{right finger} and
s in K = new start state

* delta'(s, epsilon) = {: si in S}
"Guess" the state si to which input x will lead

* delta'(, a) =
{: delta(si, a) = sl and
there exists b in Sigma s.t. delta(sk, b) = sm}
Note that the middle finger does not change after the initial guess

F' = {:  si in S, sj in F}

As you can see, the left finger moved on input x to where the
middle finger guessed (si, si).  The right finger moved from si
to some accepting state sj of M (si, sj).
```