Friendly bisections of random graphs

Resolving a conjecture of F\"uredi from 1988, we prove that with high probability, the random graph $G(n,1/2)$ admits a friendly bisection of its vertex set, i.e., a partition of its vertex set into two parts whose sizes differ by at most one in which $n-o(n)$ vertices have at least as many neighbours in their own part as across. The engine of our proof is a new method to study stochastic processes driven by degree information in random graphs; this involves combining enumeration techniques with an abstract second moment argument.


Introduction
In a cut of a graph, i.e., a partition of its vertex set into two parts, we call a vertex friendly if it has more neighbours in its own part than across, and unfriendly otherwise.Questions about finding friendly and unfriendly partitions of graphs, i.e., partitions in which all (or almost all) the vertices are friendly or unfriendly, have been investigated in various contexts: in combinatorics, on account of their inherent interest [5,10,18,25,29,33,35], in computer science, as 'local' analogues of important NP-complete partitioning problems [4,13], in probability and statistical physics, owing to their connections to spin glasses [1,17,19,31], and in logic and set theory [2,30]; this list is merely a representative sample (and by no means exhaustive) since such partitions have been studied extremely broadly.On the other hand, when it comes to finding friendly or unfriendly bisections, i.e., partitions into two parts whose sizes differ by at most one, much less is known.Our aim here is to prove an old and well-known conjecture about random graphs due to Füredi [16].This problem has gained some notoriety over the years, in part due to its inclusion in Green's list of 100 open problems [20,Problem 91].Our main result is as follows.
Degree-driven stochastic processes.Although Theorem 1.1 is specifically about friendly bisections of random graphs, the approach we adopt to prove this result is rather general, and it may be that the more important point of this work is its contribution to methodology.Concretely, we develop a method that appears suitable for analysing many different types of stochastic processes on random graphs driven primarily by degree information; for example, in forthcoming work, the fourth and fifth authors [28] use modifications of these techniques to settle various conjectures of Tran and Vu [36] concerning majority dynamics on random graphs.Below, we outline how our approach allows us to prove Theorem 1.1.
We adopt a constructive approach that yields an efficient algorithm to find the bisection promised by Theorem 1.1.To motivate our approach, it is instructive to consider the following basic algorithm, motivated by the classical large-cut-finding algorithm: starting with any bisection A ∪ B of a graph G, repeatedly check whether there are vertices v ∈ A and w ∈ B such that deg B (v) > deg A (v) and deg A (w) > deg B (v), and if so, swap v and w.It is easy to see that such a swap must decrease the size (i.e., the number of crossing edges) of the bisection, so this algorithm must terminate.Of course, if we are unlucky, it might happen that when the algorithm terminates, all the vertices in Date: June 9, 2021.A are friendly, while very few of the vertices in B are friendly, so the resulting bisection may be very far from satisfying the conclusion of Theorem 1.1.However, it seems plausible that such an outcome is rather unusual: if G is sampled from G(n, 1/2), then one might expect this algorithm (interpreted as a random process) to typically follow a predictable trajectory, and in particular, the number of friendly vertices in A and in B to stay roughly the same for most of the duration of the algorithm.This is a promising starting point, especially due to the fact that we do not actually need to fully understand the typical trajectory of the process.Indeed, we only need to show that at each step k, the number of friendly vertices in A concentrates around some value N k .By symmetry (assuming for the moment that n is even), the number of friendly vertices in B would then concentrate around N k as well, so the numbers of friendly vertices in A and B would never get 'too imbalanced'.However, it is far from obvious how to actually establish concentration.Roughly speaking, the main issue is that in order to execute even the first step of the algorithm, we have to inspect every vertex of our graph, meaning that there is seemingly 'no remaining randomness' for the second step.This is in contrast with most other random graph processes in the literature (such as H-free or H-removal processes, as in [7,8,15] for example), where each individual step is defined in terms of a random choice.
There are two ideas that allow us to salvage enough randomness to establish the desired concentration.First, instead of swapping vertices one at at time, we shall instead swap a sizeable 'batch' of vertices between A and B in each step; this is strongly reminiscent of the influential 'nibbling' idea introduced by Rödl [27].We will be able to use discrepancy properties of random graphs to show that, in a typical outcome of the random graph G(n, 1/2), when we have a bisection A ∪ B in which many vertices in A and in B are unfriendly, swapping a large number of the 'unfriendliest' vertices in A and in B dramatically decreases the size of the bisection.That is to say, it should only take a few steps, (about exp(1/ε), in fact) to reach a bisection in which one of the two parts has (1 − ε)n/2 friendly vertices.This makes the problem of establishing concentration more tractable, since we now only need to do this for a large constant number of steps.Our second main observation is that in order to execute a step of our algorithm, we only need to know the degrees deg A (v) and deg B (v) for each vertex v at that stage (and not any other information about the graph).Thus, instead of revealing the whole graph to study the first step, we may simply reveal the required degree information, meaning that our random graph is now conditionally a degree-constrained random graph.We then have the randomness of this degree-constrained random graph with which to show concentration at the next step, for which we again only need to (dynamically) reveal some more degree information, and so on.
The above observations leave us with the task of demonstrating concentration in some (families of) degree-constrained random graphs.In order to study these degree-constrained random graphs, we have at our disposal powerful enumeration theorems due to McKay and Wormald [26], and extensions by Canfield, Greenhill, and McKay [12], which give very precise asymptotic formulae for the number of graphs with specified degree information.In principle, this allows one to write down explicit formulae for essentially all relevant probabilities, from which one could attempt to compute the typical trajectory of the process.However, the necessary computations are formidable, and in particular, the various densities under consideration do not appear to have closed-form expressions past the first few iterations.
Our approach to circumventing these issues brings us to the heart of the matter: we develop an abstract second-moment argument with which one can establish concentration of various statistics at a given step, using only stability and anti-concentration information about the outcomes of previous steps.In particular, this enables us to establish concentration without actually knowing the trajectory of the process.This is superficially reminiscent of martingale arguments establishing concentration around the mean without any knowledge of the location of the mean itself (see [3]), but the inputs to such arguments, typically Lipschitz-like behaviour of the random variables of interest, are rather different from the inputs to our argument.As mentioned earlier, the methods in our argument are quite general, and we anticipate that a broad range of similar stochastic processes will now become amenable to analysis.Notation.Our graph-theoretic notation is for the most part standard; see [9] for terms not defined here.In a graph G, we write deg(v) for the degree of a vertex v ∈ V (G), and N (v) for its neighbourhood; also, for a subset U ⊆ V (G), we write deg U (v) for the number of neighbours of v in U , i.e., for the size of N (v) ∩ U .We write G(n, p) for the Erdős-Rényi random graph on n vertices with edge density p.
Our use of asymptotic notation is mostly standard as well.We say that an event occurs with high probability if it holds with probability 1 − o(1) as some parameter (usually n, unless we specify otherwise) grows large.Constants suppressed by asymptotic notation may be absolute, or might depend on other fixed parameters; we shall spell out the latter situation explicitly whenever there might be cause for confusion.To lighten notation, we write f = g ± h for |f − g| ≤ h.We maintain this convention with asymptotic notation as well, so f = g ± n −Ω (1) for example is taken to mean |f − g| = n −Ω (1) .We also adopt the following non-standard bit of notation: as a parameter n grows large, we write f ≃ h if f = (1 ± n −Ω(1) )h.Finally, following a common abuse, we omit floors and ceilings wherever they are not crucial.
Organisation.This paper is organised as follows.In Section 2, we describe the swapping process that allows us to prove Theorem 1.1, and also give the deduction of our main result from a few key lemmas.In Section 3, we dispose of the more routine of these lemmas.The beef of our argument is in Section 4, where we must work rather hard to establish the key concentration properties of our swapping process.

Proof overview
In this section we make some initial observations, then describe a random swapping process that underlies our argument and state some facts about this process (with proofs to follow later).We then show how to deduce Theorem 1.1 from these facts.
Given a bisection A∪B of a graph, the friendliness ∆ A,B (v) of a vertex v is the difference between the number of its neighbours on its own side and the number of its neighbours on the other side.We say a vertex is friendly if its friendliness is positive, and otherwise, we say it is unfriendly.The total friendliness ∆ A,B of the bisection A ∪ B is then given by We also make a simple observation that allows us to restrict our attention to random graphs of even order (which in turn allows us to somewhat simplify the presentation).A simple union bound (similar to calculations we will see in Section 3) shows that with high probability, in any partition of the vertex set of G(n, 1/2), at most 10n/ log n vertices have friendliness 1, i.e., have exactly one more neighbour on their own side than across, or vice versa.Consequently, it clearly suffices to establish Theorem 1.1 for G(n, 1/2) when n is even; indeed, when n is odd, we may delete an arbitrary vertex from the random graph, apply Theorem 1.1 to the result, and add back the deleted vertex to either part to get the desired bisection.Therefore, all graphs under consideration will be of even order unless explicitly specified otherwise, and we shall not belabour this point any further.
The following lemma shows that for a typical outcome of the random graph G(n, 1/2), there is a window of length O(n 3/2 ) within which the total friendliness of any bisection lies.

Lemma 2.1.
There is a γ > 0 such that for a random graph G ∼ G(n, 1/2), with high probability, every bisection Next, we shall define a simple random 'swap' operation that modifies a bisection with the aim of making it more friendly.Definition 2.2.Given a bisection A∪ B of an n-vertex graph G, the α-swap of A∪ B is the random bisection obtained by the following procedure.First, we take the subset A ′ ⊆ A of the ⌊αn⌋ most unfriendly vertices in A, and the subset B ′ ⊆ B of the ⌊αn⌋ most unfriendly vertices in B (breaking ties according to some a priori fixed ordering of the vertex set), and swap A ′ and B ′ .At this stage, the parts of the resulting bisection are then (A \ A ′ ) ∪ B ′ and (B \ B ′ ) ∪ A ′ .Next, we make a uniformly random choice of ⌊α 4 n⌋ vertices on both of these sides, and swap these subsets.
We remark that the second (random) swap in the α-swap procedure is not actually necessary for the proof of Theorem 1.1, but the analysis later in the paper would become substantially more involved without it.
The following lemma shows that in a typical outcome of the random graph G(n, 1/2), for every bisection A ∪ B, either our swapping operation increases the total friendliness by Ω(n 3/2 ), or almost all the vertices in one of the parts (either A or B) are already friendly.
Lemma 2.3.For every fixed ε > 0, there are α ∈ (0, ε) and β > 0 for which a random graph G ∼ G(n, 1/2) has, with high probability, the following property.In any bisection A ∪ B of G in which at least εn vertices are unfriendly in each of A and B, the random bisection A 1 ∪ B 1 obtained from an α-swap of A ∪ B always satisfies Finally, the next lemma establishes concentration properties for bisections obtained by iterating our swapping operation.With these facts in hand, we may now easily deduce Theorem 1.1.
Proof of Theorem 1.1.For any fixed ε > 0, we shall show that G ∼ G(n, 1/2) with high probability has a bisection in which at most 2εn + o(n) vertices are unfriendly.
Say that a bisection A ∪ B is ε-good if there are at most εn unfriendly vertices in A or at most εn unfriendly vertices in B. Now, the following properties hold with high probability, by Lemmas 2.1, 2.3 and 2.4.
(1) There is an interval of length at most 2γn 3/2 such that the total friendliness of every bisection of G lies in this interval.(2) For every 0 (3) For every 1 ≤ k ≤ K, the numbers of unfriendly vertices in A k and in B k differ by o(n).Fix outcomes of G and A 1 ∪ B 1 , A 2 ∪ B 2 , . . ., A K ∪ B K satisfying all these properties.Now, by property (1), it is not possible for the total friendliness to increase by βn 3/2 in each of the K iterations.So, by property (2), there must be some k for which A k ∪ B k is ε-good, meaning that there are at most εn unfriendly vertices in A k or at most εn unfriendly vertices in B k .The third property (3) now ensures that there are at most 2εn + o(n) unfriendly vertices in total at this stage.The bisection A k ∪ B k has the properties we desire, proving the result.
2.1.Overview of the proofs of the key lemmas.We now briefly discuss the proofs of Lemmas 2.1, 2.3 and 2.4.First, Lemma 2.1 is proved via a Chernoff bound and a simple union bound over all possible bisections.Second, Lemma 2.3 is also proved by a union bound: we show that that no bisection of the graph has many vertices with friendliness very close to zero, so that there is always some reasonably large gain from swapping unfriendly vertices; here, one must also control the (small) amount of additional unfriendliness potentially introduced between pairs of swapped vertices.
The proof of Lemma 2.4 is by far the most technical ingredient in the proof.At a high level, one runs the iterated swap algorithm on a random graph G ∼ G(n, 1/2), at each step revealing only that information about G (namely, degrees into certain parts) which is necessary to determine the outcome of the α-swap procedure.So, at every step, we need to study a degree-constrained random graph model; this is accomplished using graph enumeration techniques in the style of McKay-Wormald [26].One can track the fraction of vertices that live in prescribed parts at prescribed times inductively, showing via the second moment method in our degree-constrained random graph model that the numbers of different types of vertices are concentrated.However, several obstacles arise naturally due to the presence of complicated conditional distributions, and the need for all of the different 'well-conditioned' degree-constrained models (based on different revelations) to converge to a single distribution of degrees.The totality of what must be tracked to implement this argument is contained in Proposition 4.3.
In particular, we note that the first part of the proof (Lemmas 2.1 and 2.3) and the second part of the proof (Lemma 2.4) are essentially logically independent, and the analysis here can be extended to a variety of similar algorithms based on degree sequences.One can think of the first part as providing a monovariant to the graph process analysed in the second part, guaranteeing that the graph partition 'gets better' over time and converges to a friendly distribution of degrees rather than to an abstract (iterated) optimiser of some associated variational problem.

Swapping decrement
In this section we prove Lemmas 2.1 and 2.3.To start with, we need some simple facts about centered binomial distributions.The first is a Chernoff bound (see [21,Theorem 2.1], for example) and the second follows from either Stirling's approximation or the Erdős-Littlewood-Offord theorem (see [34,Corollary 7.4]).
(2) For all t ≥ 1 and all x ∈ R, we have The proof of Lemma 2.1 is extremely simple, being a routine application of the union bound.
Proof of Lemma 2.1.There are n n/2 ≤ 2 n bisections in total.For each such bisection A ∪ B, the random variable ∆ A,B + n/2 has a centered binomial distribution to which Theorem 3.1 applies (with N = n 2 ).For sufficiently large γ, we then have so the desired result follows from the union bound.
Lemma 2.3 is also proved by the union bound, but for this, we will first need to prove some auxiliary lemmas.Lemma 3.2.For any sufficiently small fixed η > 0, a random graph G ∼ G(n, 1/2) with high probability has the property that for every bisection A ∪ B of G, we have |∆ A,B (v)| ≥ 4 −1/η √ n for all but at most ηn vertices v ∈ A, and for all but at most ηn vertices v ∈ B.
Proof.For each bisection A ∪ B, if we condition on an outcome of G[A], then the random variables {∆ A,B (v) : v ∈ A} become mutually independent.Conditionally, for each v ∈ A, the random variable 2∆ A,B (v) + 1 has a centered binomial distribution to which Theorem 3.1 applies (with N = n − 1).Therefore, for large n, from which it follows that the probability that the property in the statement of the lemma does not hold is at most Lemma 3.3.For any sufficiently small fixed α > 0, a random graph G ∼ G(n, 1/2) with high probability has the property that for every bisection A ∪ B of G and every pair of subsets A ′ ⊆ A and B ′ ⊆ B each of size αn, we have where we view Proof.Note that the event does not depend on A, B, only on A ′ , B ′ .For subsets A ′ and B ′ as in the statement of the lemma, the random variable ∆ A ′ ,B ′ + αn has a centered binomial distribution to which Theorem 3.1 applies (with N = 2αn 2 ).We then have , so the desired result follows from a union bound over all choices of A ′ and B ′ .
Lemma 3.4.For any sufficiently small fixed δ > 0, a random graph G ∼ G(n, 1/2) with high probability has the following property.For every bisection A ∪ B, and every pair of subsets Proof.For each bisection A ∪ B and subsets A ′ and B ′ as in the lemma statement, the random variable ∆ A 1 ,B 1 − ∆ A,B has a centered binomial distribution to which Theorem 3.1 applies (with N = 2(n/2 − δn)δn).We then have , so the desired result follows once again from the union bound.
We are now ready to prove Lemma 2.3.
Proof of Lemma 2.3.Let η < ε/2 be small enough for Lemma 3.2 to hold.Let α ∈ (0, ε/2) be small enough so that Lemma 3.3 holds and Lemma 3.4 holds for δ = α 4 , and also α ≤ 4 −3/η .Now assume that the properties in Lemmas 3.2 to 3.4 all hold for G with these parameters, which occurs with high probability.Now, consider an arbitrary bisection A ∪ B where at least εn vertices in A are unfriendly and at least εn vertices in B are unfriendly.Let A ′ be the subset of the αn most unfriendly vertices in A, and let B ′ ⊆ B be the subset of the αn most unfriendly vertices in B. By assumption, at least εn vertices in A are unfriendly, so at least (ε − α)n ≥ ηn vertices in A are unfriendly but not as unfriendly as the vertices in A ′ .By Lemma 3.2 we deduce that for all v ∈ A ′ we have the parts resulting from the first step in an α-swap.We know that |∆ A ′ ,B ′ | ≤ α 4/3 n 3/2 by Lemma 3.3, so we have Finally, by the guarantee in Lemma 3.4, we note that the final random swap in the definition of the α-swap procedure changes the friendliness of the bisection A ′′ ∪ B ′′ by at most in passing to the final bisection A 1 ∪ B 1 .It follows that we have the desired result with β = 3α4 −1/η .

Concentration of the iterated swapping process
In this section we prove Lemma 2.4.In fact, it will follow from the more technical Proposition 4.3, which we shall shortly state and prove by induction.To get started, we need some definitions.
First, we introduce some notation to handle empirical distributions.Given a sequence (a i : i ∈ I), the uniform measure L on this sequence is the probability distribution of a j where j is chosen uniformly from I. When the sequence (a i : i ∈ I) is itself random -for example, comprised of jointly random vectors -we emphasise that the associated uniform measure L is itself a random object, i.e., each realisation of the random sequence (a i : i ∈ I) gives rise to an associated uniform measure on this realisation.
We now define some empirical degree distributions associated with our iterated swapping process.
Definition 4.1.Given a graph G on the vertex set {1, . . ., n}, we consider the iterated swapping process in which we start with the bisection A 0 ∪ B 0 , where A 0 = {1, . . ., n/2} and B 0 = {n/2 + 1, . . ., n}, and repeatedly perform α-swaps k times to yield a sequence (A t ∪ B t ) k t=0 of bisections.For a binary sequence x = (x t ) k+1 t=1 ∈ {0, 1} k+1 , let V x be the set of vertices that are in part A t at those times t with x t−1 = 0, and in part B t at those times t with x t−1 = 1 for 1 ≤ t ≤ k + 1.For a binary sequence x ∈ {0, 1} k+1 , let L x be the uniform measure on the sequence of degree vectors Next, we recall the definition of multidimensional Kolmogorov distance on R d .
Definition 4.2.Let L and L ′ be probability distributions on R d .We define the Kolmogorov distance d K (L, L ′ ) between L and L ′ to be the supremum of where a 1 , . . ., a d ∈ R.

Note that the Kolmogorov distance controls the probability of lying in any half-open box: indeed, for any such box
we can use the inclusion-exclusion principle to express L(B) as a signed sum of The promised generalisation of Lemma 2.4 is now as follows.
A1 For each x ∈ {0, 1} k+1 , we have A2 For each x ∈ {0, 1} k+1 , we have A4 For each x ∈ {0, 1} k+1 , and each box B = y∈{0,1} k+1 (a y , b y ] with side lengths b y − a y = n −c α,k (and, therefore, vol(B) = (n −c α,k ) 2 k+1 ) we have Again, we emphasise that we treat α and k as fixed constants for the purpose of the 'with high probability' statement in the above proposition; in particular, Proposition 4.3 only holds if n grows sufficiently fast (with respect to α and k).
Before discussing the proof of Proposition 4.3, we explain how it implies Lemma 2.4.The key observation is that A1 to A4 essentially allow us to read off, from the distributions L x , arbitrary information about degree statistics (and, in particular, the number of friendly vertices in each part).We will need the following lemma.Lemma 4.4.Suppose that G is such that A2 to A4 are satisfied, and let H ⊆ R {0,1} k+1 be any closed half-space (i.e., a region bounded by a hyperplane).Then for any x ∈ {0, 1} k+1 , we have We defer the proof of Lemma 4.4 (in a slightly stronger form, see Lemma 4.6) to Section 4. To this end, for i ∈ {0, 1}, let S i = {x ∈ {0, 1} k+1 : x k+1 = i} and note that a vertex v ∈ A k is unfriendly if and only if So, defining the affine half-space . By Proposition 4.3 and Lemma 4.4, with high probability we have We will prove Proposition 4.3 by induction on k.In its full generality, our argument will rely on a second moment computation that utilises results of McKay-Wormald [26] and Canfield-Greenhill-McKay [12] about enumerating graphs with specified vertex-degrees.Since the argument is rather technical, we shall proceed slowly, first illustrating the base case before jumping into the meat of the argument.
4.1.The base case.In this subsection we prove Proposition 4.3 for k = 0.This entails some explicit calculations in the random graph G(n, 1/2); the inductive step can be seen as a 'relativised' version of this argument, with the randomness coming from a well-conditioned random graph with specified degree information rather than G(n, 1/2).
Recall that we need to prove that the four properties in A1 to A4 each hold with high probability.The most interesting of these properties is A2, which will be established using the following lemma.Lemma 4.5.Fix c > 0 and d ∈ N. Let ( d(v)) v∈V be a sequence of n discrete jointly random vectors in R d , and let L be the (fixed) distribution on R d defined by choosing v uniformly at random from V and then sampling from d(v).Suppose that for a box Q = (−q, q] d with q ≥ 1, the following conditions hold: (1) for each s, t ∈ Q and each u, v ∈ V , we have for each box B ⊆ Q with side lengths at least n −c , we have L(B) ≤ q vol(B).
For a given realisation of the random sequence ( d(v)) v∈V , let L be the (random) distribution on R d which is the uniform measure on this realisation.With probability at least In applications, d(v) will be a list of degrees from v to a number of other fixed subsets, and ( d(v)) v∈V will be the random ensemble of these lists.The above lemma roughly states that given decorrelation between these degree statistics, and (for technical reasons) a tail bound and anticoncentration, the empirical degree distribution of V is very likely to concentrate around an explicit distribution.
Here, we again reiterate that the constants suppressed by the asymptotic notation in Lemma 4.5 are allowed to depend on the fixed parameters c and d.
Proof of Lemma 4.5.For any v ∈ V , and any box B, let E v,B be the event that d(v) lies in B, so that n L(B) is the number of v ∈ V such that E v,B holds.For u, v ∈ V and boxes B, B ′ ⊆ Q, we can sum the bound in (1) over all the points t ∈ B and s ∈ B ′ to see that , so by Chebyshev's inequality, with probability at least 1 − n −c/4 , we have Now, consider a family B of O(n c/8 q d ) half-open boxes with side lengths at most D = n −c/(8d) that partition the (big) box Q.By the union bound, with probability 1 − O(q d n −c/8 ), the bound (4.1) holds for all B ∈ B. Also, since . Now, it is a routine matter to deduce the desired conclusion from these two facts.The details are as follows.
For any semi-infinite box and Furthermore, using (3) and (4.1) for all B ∈ B, we see that both the sum B∈B − L(B) and the

So, we have
proving the lemma.Now we use Lemma 4.5 to prove the base case of Proposition 4.3.
Proof of the k = 0 case of Proposition 4.3.First, we have log n with probability at least 1 − 1/n 2 , say, just by the Chernoff bound, whence a union bound demonstrates A3.
It remains to prove A2 and A4.It is enough to prove them for x = (0), by symmetry.We will take L 0 to be the distribution of the random vector where v ∈ V 0 is arbitrary; clearly, this distribution does not actually depend on the specific choice of v ∈ V 0 .Then, L 0 has a simple description in terms of independent binomial distributions.Although it will not be necessary for the proof, we remark that L 0 is well-approximated by the bivariate normal distribution N (0, 1/2) 2 , and it is possible to take L 0 to be this distribution as well.
Before proceeding further, we note that the aforementioned Chernoff bound shows that with establishing A4.Now, we claim that for every pair of vertices u, v and every pair of points s, t ∈ Q, we have Indeed, we will then be able to apply Lemma 4.5 to establish that A2 holds with high probability.
The claim follows from the following explicit calculation.The only dependence between d(u) and d(v) comes from the potential edge between u and v, but we can check that if we condition on this edge being present (or not), the probabilities P( d(u) = t) and P( d(v) = s) vary only by a factor of (1 ± O( log n/n)), which in itself boils down to the observation that n/2−1 t

4.2.
Preliminaries for the inductive step.We start with some preparations before proceeding to the details of the inductive step.First, we provide a proof of Lemma 4.4; actually we prove the following more general lemma.Lemma 4.6.For fixed c > 0, d ∈ N and any q ≥ 1, let L, L ′ be probability distributions on R d satisfying d K (L, L ′ ) ≤ n −c , L ′ (−q, q] d = 1, and L(B) ≤ q vol(B) for all boxes B with side lengths at least n −c .Then the following conclusions hold. ( Here, the constants suppressed by the asymptotic notation in Lemma 4.6 are allowed to depend on the fixed parameters c and d.
Proof of Lemma 4.6.Let Q = (−q, q] d , and note that For the first point, let B + ⊆ B be the subcollection of boxes which intersect H, and let B − ⊆ B be the subcollection of boxes fully included in H, so that For the second part, let B + be the subcollection of boxes that intersect R, so |B + | = O((q/D) d−1 ).We similarly observe that Second, we isolate the part of the proof of Lemma 4.5 in which we approximated Kolmogorov distance via small boxes.Lemma 4.7.For fixed c > 0 and d ∈ N, there exists a c ′ = c ′ (c, d) > 0 for which the following holds.Let L, L ′ be probability distributions on R d , where L ′ is (possibly) a random object.Let Q = (−q, q] d ⊆ R d be a box for q ≥ 1, and let B be a partition of it into at most q d n c/2 boxes with side lengths at most n −c/(2d) .Suppose the following conditions are satisfied.
( Then, with high probability, we have We will also need some lemmas for working with random graphs with constrained degree sequences.These lemmas will be deduced from powerful enumeration theorems due to McKay and Wormald [26] and Canfield, Greenhill, and McKay [12].Before stating these lemmas, we define a notion of 'closeness' between two degree sequences.This definition is chosen to be convenient for the proof of Proposition 4.3; it has two cases which will both arise in different parts of the proof.Definition 4.8.Consider a pair of sequences (a(v)) v∈V and (b(w)) w∈W .Let A, B be the uniform measures on these sequences (obtained by choosing a random element of each of these sequences).We say that (a(v)) v∈V and (b(w)) w∈W are proximate if at least one of the following two conditions holds.
We are now ready to state the promised pair of lemmas.We defer the details of their proofs to Appendix A. The first of these lemmas is for the non-bipartite setting.Recall that ≃ means equality up to a multiplicative factor (1 ± n −Ω(1) ).Lemma 4.9.Let (d w ) w∈W be a sequence with even sum on a set W of n vertices such that ) for all T ⊆ W , and ).Such a sequence is a graphic sequence for all sufficiently large n.Let G be a uniformly random graph on W with this degree sequence.Then, for any fixed v ∈ W and S ⊆ W satisfying |S|, n−|S| = Ω(n), the following hold.
(1) For any integer (2) Let us write P(deg S (v) = t) = p(v, (d w ) w∈S , (d w ) w / ∈S , t) as a function of v, the relevant degree sequences, and t.Then, for t = |S|/2 ± O( √ n log n) and the other parameters as constrained above, this function p(•) depends continuously on its parameters, in the following sense: if ) w∈W ′ \S ′ , t ′ ), recalling that ≃ denotes equality up to a multiplicative factor of 1 ± n −Ω (1) .
Next, the second of the promised pair of lemmas is for the bipartite setting.Lemma 4.10.Let ((d v ) v∈V , (d w ) w∈W ) be a pair of sequences with identical sums on a bipartition ) for all T ⊆ W , and ).Such a pair of sequences form a bipartite-graphic sequence for all sufficiently large n.Let G be a uniformly random bipartite graph between V and W with this degree sequence.Then, for any fixed u ∈ V and S ⊆ W satisfying |S|, n − |S| = Ω(n), the following hold.(2) Let us write as a function of u, the relevant degree sequences, and t.Then, for t = |S|/2 ± O( √ n log n) and the other parameters as constrained above, this function p(•) depends continuously on its parameters, in the following sense: if recalling that ≃ denotes equality up to a multiplicative factor of 1 ± n −Ω (1) .
Finally, we require the following concentration properties of the edge-counts in a random graph.
Lemma 4.11.There are absolute constants C, c > 0 such that if G ∼ G(n, 1/2) is a random graph, then with probability at least 1 − 2 exp(−cn) we have for all disjoint S, T that (1) The proof of Lemma 4.11 is an immediate application of a Chernoff bound and the union bound, similar to the proof of Lemma 2.1, so we omit the details.Now we are ready to finish the proof of Proposition 4.3 by establishing its inductive step.

4.3.
Proof of the inductive step.Consider k − 1 iterations of the α-swap process, giving rise to a partition of the vertices into sets V x , for x ∈ {0, 1} k , as defined in Definition 4.1.An additional iteration of the α-swap process will refine this to a partition into sets V x , for x ∈ {0, 1} k+1 ; to emphasise the difference between these two partitions we write W x instead of V x when x ∈ {0, 1} k .By the inductive hypothesis, there are real numbers π x ≥ α 4(k−1) /2 and distributions L x for x ∈ {0, 1} k such that the following properties are satisfied with high probability.
B1 For each x ∈ {0, 1} k , we have B4 For each x ∈ {0, 1} k , and each box B with side lengths n −c α,k−1 we have Here, we remind the reader that L x is an empirical distribution measuring the degrees of vertices in W x into the various sets W y .Also, we remark that although B4 as written only concerns boxes with side lengths exactly n −c α,k , a simple covering argument shows that the same conclusion holds when B is a box with side lengths at least n −c α,k (up to a constant factor).
Next, let record the part and degree information after k − 1 iterations of the α-swap process, so B1 to B4 are all really properties of R. Let E be the event that all the conclusions of Lemma 4.11 hold for all disjoint subsets of vertices S and T .By Lemma 4.11, we have for some universal c > 0, so by Markov's inequality, with high probability, R has the property that Now, let us condition on an outcome of R satisfying B1 to B5; we say that such an outcome is well-behaved.It suffices to prove that, in the resulting conditional probability space, A1 to A4 hold with high probability.Note that, conditionally, G is now a random graph with certain degree constraints.To be precise, for each x ∈ {0, 1} k , the induced subgraph G[W x ] is uniform over all graphs in which each v ∈ W x has degree deg Wx (v), and for each pair of distinct x, y ∈ {0, 1} k , the subgraph G[W x , W y ] (consisting of the edges of G between W x and W y ) is uniform over all bipartite graphs in which each v ∈ W x has degree deg Wy (v) and each v ∈ W y has degree deg Wx (v).Furthermore, all these random subgraphs of the form G[W x ], G[W x , W y ] are independent, and B1, B3 and B5 in particular ensure that either Lemma 4.9 or Lemma 4.10 apply to all these subgraphs.
Recalling that we have performed k − 1 iterations of the α-swap procedure so far, we now consider the effect of a kth α-swap.Recall that this α-swap has two steps.First, the ⌊αn⌋ unfriendliest vertices on each side are swapped.The information recorded in R is enough to determine the outcome of this first step.Second, a random set of ⌊α 4 n⌋ vertices on each side are swapped; let S be the random pair of sets that are swapped in this second step, and note that S is independent from G conditional on the partition at that time.
For the remainder of this proof, asymptotic notation should be understood to be treating k, α as fixed constants, so, for example, the inequality in B2 can be described as saying d K ( L x , L x ) ≤ n −Ω(1) .4.3.1.Concentration of the part sizes.First we prove that A1 holds with high probability.Let S i = {z ∈ {0, 1} k : z k = i}, and recall that the bisection resulting from the first k − 1 iterations of the α-swap process has parts Consider any z ∈ {0, 1} k , and let W ′ z be the portion of W z that is swapped during the first step of the kth α-swap (i.e., these vertices are among the ⌊αn⌋ unfriendliest vertices in their part of the bisection A k−1 ∪ B k−1 ; this is determined by the outcome of R we have conditioned on).It suffices to prove that 1) , for some π ′ z that does not depend on the specific choice of R that we are conditioning on (but demanding no lower bound on π ′ z ).Indeed, for any b ∈ {0, 1}, the second part of the α-swap process (in which we randomly swap sets A ′ , B ′ of ⌊α 4 n⌋ vertices on both sides) will then, with high probability, yield 1) , where Here we have used B1 and a Chernoff bound for the hypergeometric distribution; see for example [21,Theorem 2.10].
To this end, we study the sets W ′ z .Assume without loss of generality that z k = 0 (i.e., W ′ z ⊆ A k−1 ).Let A ′ be the set of the ⌊αn⌋ unfriendliest vertices in A k−1 (so W ′ z = W z ∩ A ′ ), and let A (ζ) be the set of vertices in A k−1 with friendliness at most ζ √ n.We will approximate A ′ with A (ζ) , for an appropriate choice of ζ.
For ζ ∈ R, define the affine half-space Then, By the second point in Lemma 4.6, the function f satisfies a Lipschitz-like property: (1) .
By the first point in Lemma 4.6, we then have 1) .That is to say, the set A ′ differs from the set A (ζα) by only n 1−Ω(1) elements (noting that either A ′ ⊆ A (ζ) or A (ζ) ⊆ A ′ always).Again using the first point in Lemma 4.6, it follows that 1) , as desired, where π ′ z = π z L(H ζα ).

Some intermediate empirical degree distributions.
For a vertex v, define the degree vector (which is determined by R), and recall that for z ∈ {0, 1} k , L z is the uniform measure on the sequence ( g(v)) v∈Wz .For b ∈ {0, 1}, let D (z,b) be the uniform measure on ( g(v)) v∈V (z,b) (which depends on R, S, but not the remaining randomness of G).This can be thought of as an 'intermediate' empirical degree distribution between L z and L (z,b) , where we consider the degrees from vertices in V (z,b) into the sets W y .The considerations in the previous section give us quite strong control over the D (z,b) .Indeed, for any box B ⊆ R {0,1} k let W z (B) be the set of all v ∈ W z with g(v) ∈ B, and as in the last section, assume without loss of generality that 1) , and a concentration inequality for the hypergeometric distribution shows that with probability 1 − O(1/n) over the randomness of S, we have 1) , where 1) , where for all boxes S ⊆ R {0,1} k .Recalling B3 and B4, and partitioning the big box into n c/2+o (1) boxes with side lengths n −c/(2•2 k ) for a sufficiently small c > 0, it follows from Lemma 4.7 that 1) with high probability over the randomness of S.

4.3.3.
Controlling the outlier degrees.We next prove that A3 holds with high probability.In addition to our conditioning on R, in this subsection we also condition on an outcome of S such that each |V x | = Ω(n) (we have just observed in our consideration of A1 that such bounds hold with high probability).Fix an arbitrary x ∈ {0, 1} k+1 and y ∈ {0, 1} k .We wish to show that with high probability, for every v ∈ W y we have deg √ n log n, for some C α,k > 0. This suffices, since we will then be able to take the union bound over all O(1) choices of x, y.The desired bound follows from part (1) of Lemma 4.9 and part (1) of Lemma 4.10 along with a Chernoff bound for the hypergeometric distribution and a union bound over v ∈ W y : if z = (x 1 , . . ., x k ) satisfies z = y, then we consider the degree-constrained random graph G[W y ], and if we instead have z = y, then we consider the degree-constrained bipartite graph G[W y , W z ].

4.3.4.
Defining the ideal distributions.We shall address A4 first before turning to A2 (which is by far the most involved of the four properties).Therefore, at this juncture, we take a moment to say something about how we will define the distributions L x for x ∈ {0, 1} k+1 .First, for specific outcomes of R, S (which determine the sets V x for x ∈ {0, 1} k+1 ), we let L R,S x be the distribution obtained by choosing a random v ∈ V x and sampling its degree vector according to the remaining randomness in G.We will later show that if R is well-behaved, and S also satisfies certain properties that hold with high probability, then L R,S x is actually not very sensitive to the specific choice of R and S, whence we will be able to prove that A2 holds with high probability when we take L x to be any such L R,S x .
4.3.5.Anti-concentration.Here, we show that A4 holds.As in Section 4.3.3,we condition on a well-behaved outcome of R as well as on an outcome of S such that each |V x | = Ω(n).By the above discussion, it suffices to show that L R,S x satisfies the anti-concentration property in A4.The rough idea for establishing this involves combining Lemmas 4.9 and 4.10 (which provide anti-concentration subject to the remaining randomness in G) with the anti-concentration property in B4 coming from the outcome of the process so far.
Fix a vertex v ∈ W z for some z ∈ {0, 1} k .By part (1) of Lemma 4.9 and part (1) of Lemma 4.10, for y ∈ {0, 1} k and t ∈ N, parameterising t = |V (y,0) |/2 + τ √ n and writing d v = deg Wy (v), we have uniformly in t.Indeed, when applying Lemma 4.9, this holds with room to spare when |τ | > |V (y,0) | 1/10 = Ω(n 1/10 ), and when |τ | ≤ |V (y,0) | 1/10 , we may see that we uniformly have by a standard anti-concentration inequality for the hypergeometric distribution (see for example [14,Lemma 3.2]).Since we are conditioning on R, S, the degree-constrained random graph G[W z ] and the degreeconstrained bipartite graphs G[W z , W y ] are all independent, so the 2 k different degrees deg V (y,0) (v), for y ∈ {0, 1} k , are all independent as well.Thus, we obtain the uniform joint anti-concentration bound Note that for each y ∈ {0, 1} k , the degrees deg V (y,0) (v) and deg V (y,1) (v) are certainly not independent, since deg V (y,0) (v) + deg V (y,1) (v) = deg Wy (v) is determined by R. Nonetheless, our joint anti-concentration bound does imply that for any box B ⊆ R {0,1} k+1 with side lengths D ≥ 1/ √ n, we have Note that vol(B) = D 2 k+1 , so (4.4)only provides 'half as much anti-concentration' as we desire for A4.So far, we have only considered anti-concentration of d(v) when v is a fixed vertex; we will next establish the remainder of our anti-concentration and A4 proper by allowing v to vary and appealing to B2 and B4.
Recall the definition of the degree vectors g(v) and the empirical distributions D (z,b) from Section 4.3.2.Each D (z,b) is obtained from L z by conditioning on an event that holds with probability Ω(1), so B4 implies the same anti-concentration property for these intermediate empirical distributions, i.e., for all boxes B ⊆ R {0,1} k+1 with side lengths at least n −c , where c = c α,k−1 , and all x ∈ {0, 1} k+1 .Now, let π : R {0,1} k+1 → R {0,1} k be the projection map Note that g(v) = π( d(v)) for all v, and note that if B ⊆ R {0,1} k+1 is a box with side lengths n −c , then π(B) is contained in a box with side lengths 2n −c .So, by (4.4) and (4.5), we have for all x ∈ {0, 1} k+1 , as desired.
4.3.6.Concentration of the empirical degree distributions.In this subsection we use a second moment calculation as in Section 4.1 to show that, if we condition on appropriate outcomes of R and S, then with high probability, for any x ∈ {0, 1} k+1 , we have 1) .
We shall later prove that the distributions L R,S x , for appropriate R, S, are all Kolmogorov-close to each other; it will then follow that A2 holds with high probability.
As in the previous two subsections, we condition on a well-behaved outcome of R and an outcome of S for which |V x | = Ω(n) for all x ∈ {0, 1} k+1 .Fix an x ∈ {0, 1} k+1 , and as before, let , where C α,k is as chosen in Section 4.3.3(so, we have say We wish to apply Lemma 4.5.To this end, we shall, for an arbitrary pair of vertices u and v, study conditional probabilities of the form √ n log n .We will show that for such data u, v, z, b, and each t ∈ R (z,b) , the value of the above conditional probability is not very sensitive to the choice of T .
Let y ∈ {0, 1} k be such that v ∈ W y .As usual, we need to consider separately the case where y = z and where y = z; in the former case, we study the degree-constrained random graph G[W y ], and in the latter case we study the degree-constrained random bipartite graph G[W y , W z ].
If y = z, then having conditioned on the event N Wy (u) = T , now G[W y \ {u}] is a random graph with a particular degree sequence (namely, the degree sequence where we delete u if it is in W y , and if so we also decrement the degree of every vertex in T by one).Considering how this degree sequence varies for different choices of T, T ′ , it follows from part (2) of Lemma 4.9 (and the first part of Definition 4.8) that for each u, v, z, b as above, each t ∈ R (z,b) , and each such pair T, T ′ , we have We obtain the same conclusion if z = y by considering the bipartite graph G[W y , W z ], except now relying on Lemma 4.10.
The above argument implies that for all u, v, z, b, t as above, we in fact have Observing that all the random subgraphs of the form G[W y ], G[W y , W z ] are independent, we deduce that for any τ , σ ∈ Q, we have Therefore we can apply Lemma 4.5, using A4 (which we have already proved) and the fact that L x (Q c ) ≤ 1/n 2 for all x ∈ {0, 1} k+1 , to conclude that A2 holds with high probability.
4.3.7.Sensitivity to the conditioned information.To finish, we wish to show that for all x ∈ {0, 1} k+1 , well-behaved R and R ′ , and almost all outcomes S and S ′ , we have 1) .
This will complete the proof of the inductive step of Proposition 4.3.
Recall the definitions of the degree vectors g(v) and the intermediate degree distributions D x , D x from Section 4.3.2.In that subsection, we showed for all well-behaved R that, with high probability over S, we have d K ( D x , D x ) = n −Ω (1) .Let c (depending on α, k) be sufficiently small such that d K ( D x , D x ) ≤ n −c with high probability, and let us now call an outcome of S well-behaved if this is the case for all x ∈ {0, 1} k+1 .
Then for any x ∈ {0, 1} k+1 and t, t ′ = π x n/2 ± n 1/2−Ω (1) , by part (2) of Lemma 4.9 and part (2) of Lemma 4.10 (and using the second part of Definition 4.8), we have Now, consider well-behaved data R, S, R ′ , S ′ , and fix some x ∈ {0, 1} k+1 .Our next objective is to construct an injective mapping φ between V R,S x with 'roughly the same statistics' as v.This will allow us to compare probabilities conditional on the outcomes (R, S) with probabilities conditional on the outcomes (R ′ , S ′ ).
, so by the same considerations as in Section 4.3.3,we know that (1) boxes with side lengths n −c/(2•2 k+1 ) .Since R, S, R ′ , S ′ are all well-behaved, we have Also, we may assume with no loss of generality that c is sufficiently small, and in particular, that c < c α,k , so by A1, we have |V R,S x x be obtained by choosing m(B) elements from each V R,S x (B) for B ∈ B, so that (1) .
) ), so applying (4.6) and summing over points in B, we see for all v ∈ U that for some c ′ > 0 depending on c and k.Now, if we coarsen B into a partition B ′ of n −c ′ /2+o (1)  boxes with side lengths at most n −c ′ /(2•2 k+1 ) , then we easily see that the conditions of Lemma 4.7 are satisfied, and we deduce that 1) as desired.This finishes the inductive proof of Proposition 4.3.
Proof.First, note that The desired result now follows from the fact that We are now ready for the proof of Lemma 4.9.
Proof of Lemma 4.9.We shall estimate the probabilities in question using Proposition A.1.Indeed, the hypothesis in the statement of Lemma 4.9, in the language of Proposition A.1, may be stated as , and it follows from the facts that i∈W β as well.For part (2) of Lemma 4.9, it is sufficient to verify that the expression in (A.2) is polynomiallystable when the parameters in question vary by the amounts specified in the statement of Lemma 4.9; here, we say that an expression is polynomially-stable if it varies by at most a multiplicative factor of 1 ± n −Ω (1) .This may be done term by term, as we outline below. Suppose and t ′ satisfy the hypothesis in the statement of the lemma, and additionally, are such that  1) , this being a consequence of the previous two points.
In the regime where h, n is polynomially-stable when n and h vary by n 1−Ω (1) , and d and t vary by n 1/2−Ω (1) , which in particular tells us that This can be seen via a careful (and rather tedious) application of Stirling's approximation, or alternately, by using a sufficiently precise form of the de Moivre-Laplace normal approximation, as in [39] for example.Next, we need to verify that each of exp(Λ 1 ), exp(−Λ 2 ), exp(−Λ 3 ) and exp(Λ 4 ) are similarly polynomially-stable, and this may be accomplished in a straightforward manner using Lemma A.2.To illustrate, we spell out the details for exp(−Λ 3 ) below.
Recall that Our goal is to show, with β ′ i defined by 1) .This is true with room to spare if the two sequences are proximate on account of the first part of Definition 4.8, since in this case, we know that for some bijection ψ : S → S ′ , from which it follows that i∈S If the two sequences are proximate on account of the second part of Definition 4.8, then since |n − n ′ | ≤ n 1−Ω (1) , it is easily checked that the Kolmogorov distance between the uniform measures on (β i ) i∈S and (β ′ i ) i∈S ′ is at most n −Ω (1) , so by Lemma A.2 (with k = 2 and q = log n), we have i∈S as claimed.Reasoning similarly about the proximate pair (d w ) w∈W \S and (d ′ w ) w∈W ′ \S ′ , we deduce that i∈W \S as well.Putting these pair of estimates together shows that 3 ).The details in the other three cases (i.e., for Λ 1 , Λ 2 and Λ 4 ) are similar, and we leave them to the reader.Proposition A.1 is a consequence of the following more general statement, the proof of which will be given in Appendix C once we have collected the requisite machinery in Appendix B.
Proposition A.3.Let (d w ) w∈W be a sequence with even sum on a set W of n vertices such that, defining β w by d w = (n − 1)/2 + β w (n − 1)/2, we have |β w | ≤ log n for each ∈ W .Such a sequence is a graphic sequence for all sufficiently large n.Let G be a uniformly random graph with this degree sequence on the vertex set W .For any fixed v ∈ W , S ⊆ W of size h satisfying min(h, n − h) ≥ n/(log n) 1/8 , and an integer t ∈ [0, d v ], we have where T = T 1 ∪ T 2 is a random set chosen by picking T 1 uniformly from S\v t and T 2 uniformly from S c \v dv−t , and where Λ 1 , Λ 3 and Λ T are given by To proceed, we will need to understand expressions as appearing in the right side of Proposition A.3.To this end, we state two general results about sums of random variables constrained to live on a "slice" of the Boolean hypercube.
Lemma A.4.Let a 1 , . . ., a n ∈ R and let X = n i=1 a i ξ i , where ξ = (ξ 1 , . . ., ξ n ) is uniform on the subset of binary vectors in {0, 1} n which have sum s.Writing η Proof.The first part follows from the Azuma-Hoeffding inequality, as outlined in [ If σ ≤ n −1/8 , then η is similarly bounded and we obtain an upper bound of the form 1 + O(n −1/9 ).
Combining with Ee X ≥ e EX , the result follows.If σ > n −1/8 , then a combinatorial central limit theorem of Bolthausen [11] shows that and T 2 uniformly from S c \v dv−t , we have where the small additive error term comes from the fact that whether v ∈ S or v ∈ S c slightly change the fractions listed above, but not by much.At this point, if |t − h/2| > n 3/5 , we have From now on we assume |t − h/2| ≤ n 3/5 .We next compute the variance of Λ T .Following the computation in the proof of Lemma A.5, we see these sums being over all (unordered) two-element subsets; here, we again use the fact that the fraction t/|S \ v| is close to t/h regardless of if v ∈ S or v ∈ S c .Now using t = h/2 ± n Since the above estimate holds for every choice of T ⊆ W \ v, we may finish by noting that where T = T 1 ∪ T 2 is a random set chosen by picking T 1 uniformly from S t and T 2 uniformly from W \S dv−t .Rearranging this, and recalling that Φ = exp(Λ 1 ±O(n −1/6 )), gives us the desired result.
To finish, we outline the proof of Proposition A.

Lemma 2 . 4 .
Fix ε > α > 0, k ∈ N, and an arbitrary bisection A ∪ B of the vertex set of G(n, 1/2).For a random graph G ∼ G(n, 1/2), let A k ∪ B k be the bisection obtained by performing k iterations of the α-swap procedure starting from A ∪ B. Writing X and Y respectively for the number of unfriendly vertices in A k and B k , we have with high probability that |X − Y | = o(n).
2; we now deduce Lemma 2.4 from Proposition 4.3 and Lemma 4.4.Proof of Lemma 2.4.Let A k ∪ B k be the bisection resulting from k iterations of the α-swap process.Recall that in the statement of Lemma 2.4, the random variables X and Y are the numbers of unfriendly vertices in A k and B k .It suffices to prove that there is some value N (potentially depending on all of α, k, n) such that X = N + o(n) with high probability.Indeed, by symmetry it would follow that Y = N + o(n) with high probability as well, implying that |X − Y | = o(n) with high probability, as desired.
) For any region H ⊆ R d defined as the intersection of O(1) (closed or open) affine half-spaces, we have L ′ (H) = L(H) ± O(q d n −c/(2d) ).(2) For any R ⊆ R d obtained as the region between two parallel (closed or open) affine hyperplanes separated by a distance of at most n −c , we have L As in the proof of the base case of Proposition 4.3 (in Section 4.1), we consider a family B of O(q d n c/2 ) half-open boxes with side lengths at most D = n −c/(2d) that partition Q.

)
For each B ∈ B, we have |L ′ (B) − L(B)| ≤ n −c with probability at least 1 − n −c .(2) L(Q c ) ≤ n −c , and L ′ (Q c ) ≤ n −c with probability at least 1 − n −c .(3) For each box B ∈ B with side lengths at least n −c , we have L(B) ≤ q vol(B).

7 .
Proof of Proposition A.7.The proof of this proposition mirrors that of Proposition A.3, except now using Theorem B.2 instead of Theorem B.1, and the second estimate in Lemma B.3 instead of the first.Since the requisite calculations are routine (and are analogous to those spelled out in the proof Proposition A.3), we leave the details of these calculations to the reader.