Lecture 11: Turing machines
What is computation?
The idea of computation is ancient (especially arithmetic computation, like addition and multiplication). Also computing devices have been around for a long time. (Think about tally sticks and abaci)
If we want to reason about the limitations of computation, we need a rigorous mathematical definition of computation. Turing machines provide such a definition. Before the formal definition, we will discuss an “informal definition” of computation. Then we will see that Turing machines indeed capture this informal definition.
“Informal definition” of computation
At an informal level, we can define computation as the process of manipulating data according to simple rules. If we represent data as a binary string \(w \in \{ 0,1 \}^n\), then we can visualize a (discrete) process that manipulates data by the following kind of diagram: \[ w \longrightarrow { {w}^{\scriptscriptstyle(1)} } \longrightarrow \cdots \longrightarrow { {w}^{\scriptscriptstyle(i)} }\longrightarrow { {w}^{\scriptscriptstyle(i+1)} } \longrightarrow \cdots \longrightarrow { {w}^{\scriptscriptstyle(n)} } \] To make the notion of “simple rules” more concrete, the idea of locality is helpful; we want that a computation step \({ {w}^{\scriptscriptstyle(i)} } \to { {w}^{\scriptscriptstyle(i+1)} }\) only changes a small part of the data. To further concretize this idea, we will add a “marker,” denoted \(\dagger\), to indicate which part of the data is allowed to change. With this marker, the computation looks as follows: \[ \dagger w \longrightarrow { {u}^{\scriptscriptstyle(1)} } \dagger { {v}^{\scriptscriptstyle(1)} } \longrightarrow \cdots \longrightarrow { {u}^{\scriptscriptstyle(i)} } \dagger { {v}^{\scriptscriptstyle(i)} }\longrightarrow { {u}^{\scriptscriptstyle(i+1)} } \dagger { {v}^{\scriptscriptstyle(i+1)} } \longrightarrow \cdots \] The rules of the computation specify how the data around the marker changes in one step depending on the current data around the marker.
Turing machines are a further formalization along this line of thought.
Turing machine
We can visualize a Turing machine as the union of three components:
- a finite-state control (akin to a deterministic finite automaton)
- a tape, extending infinitely in one direction
- a read / write head, pointing to a tape position
The computation starts with the head and input placed at the beginning of the tape. In each computation step, the control reads the symbols under the head, writes a symbol under the head, and moves the head one position to the left or to the right. The computation halts if the control accepts or rejects.
Definition: A Turing machine \(M\) consists of the following:
- input alphabet \(\Sigma\) (e.g., \(\Sigma=\{0,1\}\)).
- tape alphabet \(\Gamma\),
- contains all symbols of input alphabet, \(\Sigma \subseteq \Gamma\),
- contains a designated blank symbol \({ {\scriptstyle \square} }\in \Gamma \setminus \Sigma\).
- transition function \(\delta: Q \times \Gamma \to Q\times \Gamma \times \{{\mathtt R},{\mathtt L}\}\),
- set of states \(Q\),
- \({\mathtt R}\) and \({\mathtt L}\) indicate left and right movement of head,
- special states: start \(q_0 \in Q\), accept \({q_{\mathrm{accept}}}\in Q\), and reject \({q_{\mathrm{reject}}}\in Q \setminus \{{q_{\mathrm{accept}}}\}\).
Configurations of Turing machines
A configuration describes the state of the whole machine (as opposed to just the control) during some step of a computation. In particular, the configuration contains the content of the tape, the position of the head, and the state of the control. We represent a configuration by a string \(C\), \[ C = u q v \] where \(u\in \Gamma^\ast\) is the tape content before the head position, \(q\in Q\) is the state of the machines, and \(v\in \Gamma^+\) is the tape content after the head position, including the symbol directly under the head.
Definition: For a Turing machine \(M\), a configuration \(C\) yields in one step a configuration \(C'\), denoted $ C _C’, $ if one of the following rules is matched:
- Suppose \(\delta(q,a)=(q',a',{\mathtt R})\) for some \(q,q'\in Q\) and \(a,a'\in \Gamma\). Then, for all \(u,v\in \Gamma^\ast\), \[u q a v \vdash_\delta u a' q' v \text{ and } u q a \vdash_\delta u a' q' { {\scriptstyle \square} }\]
- Suppose \(\delta(q,a)=(q',a',{\mathtt L})\) for some \(q,q'\in Q\) and \(a,a'\in \Gamma\). Then, for all \(u,v\in \Gamma^\ast\) and \(b\in \Gamma\), \[u b q a v \vdash_\delta u q b a' v\text{ and }q a v \vdash_\delta q' a' v\]
For every configuration \(C\), there exists exactly one configuration \(C'\) such that \(C \vdash_\delta C'\) (the computation is deterministic).
Computation of Turing machines
Definition: We say a configuration \(C\) yields in a finite number of steps a configuration \(C'\), denoted \(C \vdash _\delta^\ast C'\), if there exists a sequence of configurations \(C_0,\ldots,C_n\) such that \(C_i \vdash_\delta C_{i+1}\) for all \(i\in \{0,\ldots,n-1\}\). A Turing machine \(M\) accepts on input \(w\in \Sigma^\ast\) if \(q_0 w \vdash_\delta^\ast C\) for some configuration \(C\) that contains the accept state. Similarly, \(M\) rejects on \(w\) if \(q_0 w \vdash_\delta^\ast C\) for some configuration \(C\) that contains the reject state. Finally we say that \(M\) halts on input \(w\) if it either accepts or rejects on \(w\).
Definition: The language of a Turing machine \(M\), \[ L(M) := \{ w\in \Sigma^\ast \mid \text{$M$ accepts on input $w$} \}. \]
Decidability
We say a language \(L\subseteq \Sigma^\ast\) is recognizable (by a Turing machine) if there exists a Turing machine \(M\) such that \(L(M)=L\). A language \(L\) is decidable if there exists a Turing machine \(M\) that accepts every \(w\in L\) and rejects every \(w\notin L\).
Theorem: A language \(L\) is decidable if and only if both \(L\) and \(\neg L\) are recognizable.
Proof: One direction is trivial. A machine that decides \(L\) also recognizes \(L\). If we exchange the accept and reject state of this machine, it recognizes \(\neg L\).
It remains to show that if both \(L\) and \(\neg L\) are recognizable then \(L\) is decidable. Let \(M_Y\) and \(M_N\) be machines with \(L(M_Y) = L\) and \(L(M_N)= \neg L\). Then, we can build a Turing machine \(M\) that decides \(L\) as follows:
Operation of \(M\) on input \(w\):
- Repeat the following, starting with \(i=0\):
- Simulate \(M_Y\) for \(i\) steps on input \(w\). Accept if \(M_Y\) accepts in \(i\) steps.
- Simulate \(M_N\) for \(i\) steps on input \(w\). Reject if \(M_Y\) accepts in \(i\) steps.
- Increment \(i\) by one.
The machine \(M\) accepts \(w\) if and only if \(M_Y\) accepts \(w\), and \(M\) rejects \(w\) if and only if \(M_N\) accepts \(w\). \(\square\)
Example: Powers of Two
Lemma: The following language is decidable, \[ \{0^{2^n} \mid n\in{\mathbb N}_0\}. \]
Proof: We are to construct a Turing machine \(M\) that decides the above language. A general way to implement a Turing machine is to sweep over the tape content repeatedly, and on each sweep, the machine behaves like a DFA (with the difference that the machine may also write to the tape). The additional power of Turing machines over finite automata comes from the fact that they may sweep over the data multiple times.
Operation of \(M\) on input \(w\):1
- Reject if \(w\) is not of the form \(0^+\).
- Repeat the following:
- Accept if exactly one (uncrossed) \(0\) is on the tape.
- Reject if the number (uncrossed) \(0\)’s is odd and bigger than \(1\).
- Cross out every other \(0\) on the tape
(say by replacing them with the symbol $\mathtt X \in \Gamma$)..comment
Each of these steps can be implemented by one sweep over the data, using a constant number of states. Hence, the above algorithm as a whole can be implemented as a Turing machine.
It remains to argue that the above algorithm decides the language \(\{0^{2^n} \mid n \in{\mathbb N}_0\}\). (We need to show that the machine always halts and if it halts its answer is correct.) We may assume that \(w\) has the form \(0^+\). (Otherwise, the machine rejects, which is correct.) During the operation of the machine, we will maintain the invariant that there is at least one uncrossed \(0\) on the tape. The third step of the loop is reached only if the number of uncrossed \(0\)’s is even and at least two (by the invariant). Hence, each full iteration exactly halves the number of uncrossed \(0\)’s. Thus, if \(w=0^{m}\), there are exactly \(m/2^{i}\) uncrossed \(0\)’s after \(i\) full iterations of the loop. We conclude that \(M\) accepts only if \(1=m/2^i\) for some \(i\in {\mathbb N}_0\), which means \(m\) is a power of \(2\), and \(M\) rejects only if \(m/2^i\) is an odd number bigger than \(1\), which means that \(m\) is not a power of \(2\). Furthermore, the machine halts after at most \(i\) iterations, where \(i\) is the smallest integer such that \(m/2^i\le 1\). \(\square\)
As very first step, we mark the beginning of the tape, using special symbols from the tape alphabet, \(\hat{}0\) and \(\hat{}1\). This marker will allow us to detect in future iterations if we reached the beginning of the tape. For all other purposes, these special symbols will be treated as regular symbols, e.g., \(\hat{}0\) is treated as \(0\) and \(\hat{}1\) is treated as \(1\).↩