Lecture 3: Hard functions

Generic lower bound

Theorem: For every $n\in{\mathbb N}$ with $n\ge 100$, there exists an $n$-bit Boolean function that requires circuits of size at least $2^{n}/100n$.

Proof: The idea is to compare the number of $n$-bit Boolean functions $f{\colon}\{0,1\}^n\to \{0,1\}$ to the number of circuits of size at most $s=2^n/100n$. If the number of functions is larger than the number of such circuits, it means that there exists a functions that is not computed by such a circuit.

Let us first count the number of $n$-bit Boolean functions. In general, the number of functions from a finite set $X$ to a finite set $Y$ is equal to $|Y|^{|X|}$. (You should convince yourself of this fact.) Hence, the number of $n$-bit Boolean functions is $2^{2^n}$.

Now let’s count the number of size-$s$ circuits. A circuit of size $s$ consists of a directed acyclic graph (DAG) on $s'=s+n\le 2s$ nodes ($n$ input nodes and $s$ gates). Each node has in-degree at most $2$. (The in-degree of a node is the number of ingoing edges.) Furthermore, each input node (in-degree $0$) is labeled with an input variable $i\in[n]$. Each gate is labeled with a Boolean operation, either ${\mathrm{\scriptstyle AND}}$, ${\mathrm{\scriptstyle OR}}$, or ${\mathrm{\scriptstyle NOT}}$.

We claim that we can describe such a circuit completely by a bit string of length at most $10s\log s$. First, we give every node a unique ID. Since there are only $s'$ nodes in the circuit, we only need $\log s'$ bits for the ID. (We can use the numbers from $1$ to $s'$ as IDs.) Our description of the circuit will consist of $s'$ parts, each of length at most $10\log s'$ (one part per vertex). To describe a single vertex $u$, we store the ID of $u$ ($\log s'$ bits), the IDs of the vertices that have incoming arcs to $u$ ($2 \log s'$ bits) and the label of $u$, which is either a Boolean function or the index of an input/output variable (at most $\log n\le \log s'$ bits). We don’t need to store the outgoing arcs of a vertex, because those will be stored as incoming arcs by other vertices. In total, we store at most $4\log s'$ bits per vertex. Hence, the description of the whole circuit is indeed shorter than $10s\log s$ bits.

It follows that the number of circuits of size $s$ is at most $2^{10 s\log s}$ (the number of bit strings of length $10 s \log s$). For $s=2^n/100n$, this number is much less than $2^{2^n}$. (Check this calculation.) It follows that an overwhelming majority of $n$-bit Boolean functions does not have circuits of size $2^n/100n$.

Remark: While the theorem shows that some functions require large circuits, the proof of the theorem does not give an explicit construction of such a function. (The proof is non-constructive.) It’s a famous open problem to show strong lower bounds on the circuit size of concrete functions. For example, the following funtion is conjectured to require circuits of exponential size \[ \mathrm{\scriptstyle CIRCUIT-SAT}_n(x) := \begin{cases} 0 & \text{ if $x$ is the binary encoding of a circuit}\\ & \text{ that computes the constant $0$ function,}\\ 1 & \text{ otherwise}. \end{cases} \]

Straight-line Programs

In this part, we will discuss an exact correspondence between circuits and a simple kind of programs.

Here is an example of a straight-line program:

Input: $x_1,x_2$
Instructions:

$y_1 := \neg x_1$

$y_2 := \neg x_2$

$y_3 := x_1 \land y_2$

$y_4 := y_1 \land x_2$

$y_5 := y_3 \lor y_4$

Output: $y_5$

This program outputs the parity of the two input bits. We can construct a circuit that corresponds to this program in the following way: For every instruction, we have a gate in the circuit that applies the same Boolean operation and has incoming edges from nodes corresponding to the right-hand-side variables of the instruction. The size of the resulting circuit is exactly equal to the number of instructions of the straight-line program.

Definition: A straight-line program is a sequence of instructions of the following form: Each instruction applies a basic Boolean operation (${\mathrm{\scriptstyle NOT}},{\mathrm{\scriptstyle AND}},{\mathrm{\scriptstyle OR}}$) to input variables or previously assigned variables and assigns the outcome of this operation to a new variable. The output of the program is the outcome of last instruction. The length of a straight-line program is the number of instructions.

Exercise: A Boolean function has a straight-line program of length at most $\ell$ if and only if it has a Boolean circuit of size at most $\ell$.