Lecture 4: Deterministic Finite Automata

The following figure shows the state diagram of a deterministic finite automaton (DFA),

The automaton has three states \(q_0\), \(q_1\), and \(q_2\). The state \(q_0\) is designated as the start state and \(q_1\) is designated as an accept state. The directed edges between states specify the transitions of the automaton when reading an input string.

This automaton accepts the set of strings that contain \(1\) and have an even number of \(0\)’s after the last \(1\).

Languages

An alphabet \(\Sigma\) is a finite set of symbols, e.g., \(\Sigma=\{0,1\}\) or \(\Sigma={a,b,c}\). (We will usually content ourselves with the binary alphabet.) A string \(w\) over \(\Sigma\) is a finite sequence of symboles, written as \(w=x_1\cdots x_n\) for \(x_1,\ldots,x_n\in \Sigma\). The empty string (string of length zero) is denoted \(\varepsilon\). The set of all strings is denoted \(\Sigma^\ast\). A language \(L\) is any subset of strings, so that \(L\subseteq \Sigma^\ast\).

We will use languages to encode computational problems. (A language \(L\) can be thought to encode the function \(f{\colon}\Sigma^\ast\to\{0,1\}\) with \(f(w)=1\) if and only if \(w\in L\)).


Operations on languages: Let \(A,B\subseteq \Sigma^*\) be two languages. The following operations on languages are useful:

(Notice that the Kleene star notation is compatible with the notation for the set of all strings.)

Deterministic Finite Automaton

Definition: A deterministic finite automaton \(M\) consists of the following parts:

  • a transition function \(\delta{\colon}{} Q\times \Sigma \to Q\) for an alphabet \(\Sigma\) and a set of states \(Q\),
  • a start state \(q_0\in Q\),
  • a set of accept states \(F\subseteq Q\).

Language of a DFA

Definition: A DFA \(M\) accepts a string \(w=x_1\cdots x_n\) if there exists a sequence of states \(r_0,r_1,\ldots,r_n\in Q\) such that

  • the first state is the start statem \(r_0=q_0\)
  • the \(i\)th state is obtained by reading the \(i\)th symbol, \(r_i=\delta(r_{i-1},x_i)\)
  • the final state is an accept state, \(r_n\in F\).

The language of \(M\) is defined as \[ L(M) = \{ w \mid \text{$M$ accepts $w$}\}. \] If \(L\) is the language of some deterministic finite automaton, we say that \(L\) is regular.

Example proof

Recall the automaton \(M\) from the beginning:
q₀q₁q₂01100,1

Lemma: The language of the automaton \(M\) is the set of all strings \(w\in\{0,1\}^*\) such that \(w\) contains a \(1\) and ends with an even number of \(0\)’s after the last \(1\).

We will prove this lemma by inductions. To make the proof easier, we will show a stronger statement. (Proofs by induction often become easier if the statement is strengthened.)

Define the repeated transition function \(\delta^*{\colon}\Sigma^*\to Q\), \[ \delta^*(w) := \text{state of $M$ after reading $w$} \] This function satisfies \(\delta^\ast(w x)=\delta(\delta^\ast(w),x)\) for every string \(w\in \Sigma^\ast\) and symbol \(x\in \Sigma\). Furthermore, \(L(M) = \{w \mid \delta^\ast(w) \in F \}.\) Hence, the following claim implies the lemma.


Claim: The repeated transition function of \(M\) is \[ \delta^\ast(w) = \begin{cases} q_0 & \text{ if $w$ does not contains $1$ ($A$)},\\ q_1 & \text{ if $w$ contains $1$ and ends with an even number of $0$'s after the the last $1$ ($B$)},\\ q_2 & \text{ if $w$ contains $1$ and ends with an odd number of $0$'s after the the last $1$ ($C$)}. \end{cases} \] Proof of claim: By induction on the length of \(w\). (At this point, the proof is completely mechanical.)

If \(w=\varepsilon\), then \(\delta^*(w)=q_0\) and the claim is true (because \(\varepsilon\in A\)).

Suppose length of \(w=ux\) for \(x\in \{0,1\}\) is greater than \(0\). The induction hypothesis is that the claim holds for all strings that are shorter than \(w\). (In particular, the claim holds for the substring \(u\) of \(w\).)

We consider three cases:

We conclude that the claim is true in all three cases, which proves the claim.