# Threshold Phenomena for Random Discrete Structures

Jinyoung Park

Communicated by Notices Associate Editor Emilie Purvine

## 1. Erdős–Rényi Model

To begin, we briefly introduce a model of random graphs. Recall that a graph is a mathematical structure that consists of vertices (nodes) and edges.

Roughly speaking, a random graph in this article means that, given a vertex set, the existence of each potential edge is decided at random. We will specifically focus on the Erdős–Rényi random graph (denoted by $G(n,p)$), which is defined as follows.

Consider $n$ vertices that are labelled from 1 to $n$.

Observe that on those $n$ vertices, there are $\binom{n}{2}$ potential edges, that is, the edges $\{i,j\}$ for $1 \le i < j \le n$. Given a probability $p \in [0,1]$, include each of the $\binom{n}{2}$ potential edges with probability $p$, where the choice for each edge is made independently of the choices for the other edges.

In reality, when we consider $G(n,p)$, $n$ is a large (yet finite) number that tends to infinity, and $p$ is usually a function of $n$ that tends to zero as $n \to \infty$. For example, $p = 1/n$, $p = \log n/n$, etc.
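The sampling procedure just described is easy to make concrete. Below is a minimal Python sketch (the function name and interface are my own, purely illustrative):

```python
import random

def sample_gnp(n, p, seed=None):
    """Return the edge set of one sample of G(n, p) on vertices 1..n."""
    rng = random.Random(seed)
    edges = set()
    for u in range(1, n + 1):
        for v in range(u + 1, n + 1):
            # each of the C(n, 2) potential edges appears independently
            # with probability p
            if rng.random() < p:
                edges.add((u, v))
    return edges
```

For instance, `sample_gnp(4, 1.0)` returns all six edges of the complete graph on four vertices, while `sample_gnp(4, 0.0)` returns the empty set.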

As we saw in Example 1.1, a random graph is a random variable with a certain probability distribution (as opposed to a fixed graph) that depends on the values of $n$ and $p$. Assuming $n$ is given, the structure of $G(n,p)$ changes as the value of $p$ changes, and in order to understand $G(n,p)$, we ask questions about its structure such as

*Is $G(n,p)$ connected?*

or

*Does $G(n,p)$ contain a triangle?*

Basically, for any graph property $\mathcal{P}$ of interest, we can ask

*Does $G(n,p)$ have property $\mathcal{P}$?*

In those questions, usually we are interested in understanding the typical structure/behavior of $G(n,p)$. Observe that, unless $p = 0$ or $p = 1$, there is always a positive probability that all of the edges in $G(n,p)$ are absent, or all of them are present (see Examples 1.2, 1.3). But in this article, we would rather ignore such extreme events that happen with a tiny probability, and focus on properties that $G(n,p)$ possesses with probability close to 1.

We often use language and tools from probability theory to describe and understand the behavior of $G(n,p)$. Below we discuss some very basic examples.

We will write $f(n) \ll g(n)$ if $f(n)/g(n) \to 0$ as $n \to \infty$.

## 2. Threshold Phenomena

One striking thing about $G(n,p)$ is that the appearance and disappearance of certain properties are abrupt. Probably one of the most well-known examples that exhibit the threshold phenomena of $G(n,p)$ is the appearance of the giant component. A component of a graph is a maximal connected subgraph. For example, the graph in Figure 4 consists of four components, and the size (the number of vertices) of each component is as shown in the figure.

For $G(n,p)$, observe that, when $p = 0$, the size of a largest component of $G(n,p)$ is 1; in this case all of the edges are absent with probability 1, so each of the components is an isolated vertex. On the other hand, when $p = 1$, $G(n,p)$ is the complete graph with probability 1, so the size of its largest component is $n$.

Then what if $p$ is strictly between 0 and 1?

Of course, one would naturally guess that as $p$ increases from 0 to 1, the typical size of a largest component in $G(n,p)$ would also increase from 1 to $n$. But what is really interesting here is that there is a “sudden jump” in this increase.

In the following statement and everywhere else, *with high probability* means that the probability that the event under consideration occurs tends to 1 as $n \to \infty$.

The above theorem says that if $p$ is “slightly smaller” than $1/n$, then typically all of the components of $G(n,p)$ are very small (note that $\log n$ is much smaller than the number of vertices, $n$).

On the other hand, if $p$ is “slightly larger” than $1/n$, then the size of a largest component of $G(n,p)$ is as large as linear in $n$. It is also well known that all other components are very small (at most of order $\log n$), and this unique largest component is called the giant component.
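This sudden jump is easy to observe experimentally. The following Python sketch (the function name is my own; union-find is used only for convenience) computes the size of a largest component of one sample of $G(n,p)$:

```python
import random

def largest_component(n, p, seed=0):
    """Size of a largest component of one sample of G(n, p), via union-find."""
    rng = random.Random(seed)
    parent = list(range(n))

    def find(x):
        # find the root of x's component, with path halving
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # sample each potential edge independently and merge components
    for u in range(n):
        for v in range(u + 1, n):
            if rng.random() < p:
                ru, rv = find(u), find(v)
                if ru != rv:
                    parent[ru] = rv

    # count the vertices in each component
    sizes = {}
    for x in range(n):
        r = find(x)
        sizes[r] = sizes.get(r, 0) + 1
    return max(sizes.values())
```

With, say, $n = 1000$, one typically finds that `largest_component(1000, 0.2 / 1000)` is a tiny number, while `largest_component(1000, 3.0 / 1000)` is a constant fraction of $n$, in line with the theorem.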

So around the value $p = 1/n$, the giant component “suddenly” appears, and therefore the structure of $G(n,p)$ also drastically changes. This is one example of the threshold phenomena that $G(n,p)$ exhibits, and the value $1/n$ is a threshold function for the property of having a giant component. (The formal definition of a threshold function is given in Definition 2.3. See also the definition of the threshold in Section 5.)

The abrupt appearance of the giant component of $G(n,p)$ is just one instance of the vast threshold phenomena for random discrete structures. In this article, we will mostly deal with $G(n,p)$ for the sake of concreteness, but there will be a brief discussion of a more general setting in Section 5.

Now we introduce the formal definition of a threshold function due to Erdős and Rényi. Recall that, in $G(n,p)$, all the vertices are labelled $1, \dots, n$. A graph property is a property that is invariant under graph isomorphisms (i.e., relabelling the vertices), such as connectivity, planarity, containing a triangle, etc. We use $\mathcal{P}$ for a graph property, and $G \in \mathcal{P}$ denotes that $G$ has property $\mathcal{P}$.

For example, $1/n$ is a threshold function for the existence of the giant component.

Note that it is not obvious at all whether a given graph property admits a threshold function. Erdős and Rényi proved that many graph properties have a threshold function, and about 20 years later, Bollobás and Thomason proved that, in fact, a very wide class of properties admits a threshold function. In what follows, an increasing (graph) property is a property that is preserved under the addition of edges. For example, connectivity is an increasing property, because if a graph is connected then it remains connected no matter what edges are added.

Now it immediately follows from the above theorem that all the properties that we have mentioned so far—connectivity, planarity,⁠Footnote3 having the giant component, etc.—have a threshold function (and thus exhibit a threshold phenomenon). How fascinating!

3

We can apply the theorem for nonplanarity, which is an increasing property.

On the other hand, knowing that a property has a threshold function does not tell us anything about the value of the threshold function. So it naturally became a central interest in the study of random graphs to find threshold functions for various increasing properties. One of the most studied classes of increasing properties is subgraph containment, i.e., the question of for what $p$ the random graph $G(n,p)$ is likely/unlikely to contain a copy of a given graph. Figure 8 shows some of the well-known threshold functions for various subgraph containments (and that for connectivity).

We stress that, in general, finding a threshold function for a given increasing property is a very hard task. To illustrate this point, let’s consider one of the most basic objects in graph theory, a spanning tree—a tree that contains all of the vertices.

The question of finding a threshold function for $G(n,p)$ of containing a spanning tree⁠Footnote4 was one of the first questions studied by Erdős and Rényi. Already in their seminal paper [6], Erdős and Rényi showed that a threshold function for containing a spanning tree is $\log n/n$. However, the difficulty of this problem changes immensely if we require $G(n,p)$ to contain a specific (up to isomorphism) spanning tree (or, more broadly, a spanning graph⁠Footnote5). For example, one of the biggest open questions in this area back in the 1960s was finding a threshold function for containing a Hamiltonian cycle (a cycle that contains all of the vertices).
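The $\log n/n$ threshold is also easy to probe numerically. Here is a small Python sketch (names are my own) that samples $G(n,p)$ and tests connectivity, which is equivalent to containing a spanning tree, by breadth-first search:

```python
import random
from collections import deque

def is_connected_sample(n, p, seed=0):
    """Sample G(n, p) on vertices 0..n-1 and test connectivity via BFS."""
    rng = random.Random(seed)
    adj = [[] for _ in range(n)]
    # sample each potential edge independently with probability p
    for u in range(n):
        for v in range(u + 1, n):
            if rng.random() < p:
                adj[u].append(v)
                adj[v].append(u)

    # breadth-first search from vertex 0
    seen = {0}
    queue = deque([0])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in seen:
                seen.add(w)
                queue.append(w)
    return len(seen) == n
```

For $n = 500$ the threshold $\log n/n$ is about $0.012$; samples at $p = 0.04$ are almost always connected, while samples at $p = 0.003$ almost never are.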

4

This is equivalent to the property that $G(n,p)$ is connected.

5

A spanning graph means a graph that contains all of the vertices.

This problem was famously solved by Pósa in 1976.

Note that both threshold functions, for containing some spanning tree and for containing a Hamiltonian cycle, are of order $\log n/n$, even though the latter is a stronger requirement. Later we will see (in the discussion that follows Example 4.6) that $\log n/n$ is actually an easy lower bound on both threshold functions. It has long been conjectured that for any spanning tree⁠Footnote6 with a constant maximum degree, its threshold function is of order $\log n/n$. This conjecture was only very recently proved by Montgomery [14].

6

More precisely, for any sequence of spanning trees.

## 3. The Kahn–Kalai Conjecture: A Preview

Henceforth, $\mathcal{F}$ always denotes an increasing property.

In 2006, Jeff Kahn and Gil Kalai [12] posed an extremely bold conjecture that captures the location of threshold functions for all increasing properties. Its formal statement will be given in Conjecture 4.11 (graph version) and Theorem 5.7 (abstract version); in this section we first give an informal description of the conjecture. All of the terms not defined here will be discussed in the forthcoming sections.

Given an increasing property $\mathcal{F}$, we are interested in locating its threshold function, $p_c(\mathcal{F})$.⁠Footnote7 But again, this is in general a very hard task.

7

We switch the notation from $p_c$ to $p_c(\mathcal{F})$ to emphasize its dependence on $\mathcal{F}$.

Kahn and Kalai introduced another quantity, which they named the expectation threshold and denoted by $q(\mathcal{F})$, and which is associated with some sort of expectation calculation, as its name indicates. By its definition (Definition 4.5),

$$q(\mathcal{F}) \le p_c(\mathcal{F}),$$

and, in particular, $q(\mathcal{F})$ is easy to compute for many interesting increasing properties $\mathcal{F}$. So $q(\mathcal{F})$ provides an “easy” lower bound on the hard parameter $p_c(\mathcal{F})$. A really fascinating part is that Kahn and Kalai then conjectured that $p_c(\mathcal{F})$ is, in fact, bounded above by $q(\mathcal{F})$ multiplied by some tiny quantity!

So this conjecture asserts that, for any $\mathcal{F}$, the hard parameter $p_c(\mathcal{F})$ is actually well-predicted by the (much) easier $q(\mathcal{F})$!

The graph version of this conjecture (Conjecture 4.11) is still open, but the abstract version (Theorem 5.7) was recently proved in [15].

## 4. Motivating Examples

The conjecture of Kahn and Kalai is very strong, and even the authors of the conjecture wrote in their paper [12] that “it would probably be more sensible to conjecture that it is not true.” The fundamental question that motivated this conjecture was:

All of the examples in this section are carefully chosen to show the motivation behind the conjecture.

Recall that the definition of a threshold (Definition 2.3) doesn’t distinguish constant factors. So in this section, we will use the convenient notation $\lesssim$, $\gtrsim$, and $\asymp$ to mean (respectively) $\le$, $\ge$, and $=$ up to constant factors. Finally, write $p_c(H)$ for a threshold function for $G(n,p)$ of containing a copy of a graph $H$, for notational simplicity.

The next example shows that the above dream is too dreamy to be true.

In Example 4.3, $p_c(H)$ is not predicted by the expected number of copies of $H$; thus the Dream is broken. However, it still shows that $p_c(H)$ is predicted by the expected number of copies of some subgraph of $H$, and, intriguingly, this holds true in general. To provide its formal statement, define the density of a graph $H$ by

$$d(H) = \frac{e(H)}{v(H)},$$

where $v(H)$ and $e(H)$ denote the numbers of vertices and edges of $H$, respectively.

The next theorem tells us the exciting fact that we can find $p_c(H)$ by just looking at the densest subgraph of $H$, as long as $H$ is a fixed graph.⁠Footnote8

8

For example, a Hamiltonian cycle is not a fixed graph, since it grows as $n$ grows.

For example, in Example 4.2, the densest subgraph of the graph $H$ considered there is $H$ itself, so $p_c(H)$ is determined by the expected number of copies of $H$. This also determines the threshold in Example 4.3, since the densest subgraph in that example is again the graph $H$ from Example 4.2.
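To see this density calculation in action on a standard example (the triangle, which may differ from the specific graphs of Examples 4.2 and 4.3), note that the expected number of copies of $K_3$ in $G(n,p)$ is

$$\binom{n}{3}\, p^3 \asymp n^3 p^3,$$

which is of order 1 exactly when $p \asymp n^{-1}$. Since $d(K_3) = 3/3 = 1$ and $K_3$ is its own densest subgraph, the densest-subgraph principle predicts

$$p_c(K_3) \asymp n^{-1/d(K_3)} = n^{-1}.$$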

Motivated by the preceding examples and Theorem 4.4, we give a formal definition of the expectation threshold.

Observe that

$$q(\mathcal{F}) \le p_c(\mathcal{F}),$$

and in particular, Theorem 4.4 gives that

$$p_c(\mathcal{F}) \asymp q(\mathcal{F})$$

whenever $\mathcal{F}$ is the property of containing a fixed graph. Note that this gives a beautiful answer to Question 4.1 for such properties.

The existence of an isolated vertex in $G(n,p)$ is essentially equivalent to the coupon collector’s problem:
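To make the analogy concrete (a standard first-moment sketch, not the omitted example itself): the expected number of isolated vertices in $G(n,p)$ is

$$n(1-p)^{n-1} \approx n e^{-pn},$$

which equals $n^{1-c}$ when $p = c\log n/n$; it blows up for $c < 1$ and vanishes for $c > 1$. Likewise, a coupon collector drawing uniformly from $n$ coupon types needs about $n\log n$ draws to see every type; here each vertex must “collect” at least one of its $n-1$ potential edges.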

Observe that, in Example 4.6, the “coupon-collector behavior” of