1 Introduction

Denote by \(A_n \subset \mathbb Z ^2\) the box of side length \(n\), i.e., \(A_n = \{(x, y)\in \mathbb Z ^2: 0\le x, y\le n\}\), and let \(\partial A_n = \{v\in A_n: \exists u\in \mathbb Z ^2 {\setminus } A_n: v\sim u\}\) be its boundary. The discrete Gaussian free field (GFF) \(\{\eta _v: v\in A_n\}\) on \(A_n\) with Dirichlet boundary condition is defined to be the mean zero Gaussian process which takes the value 0 on \(\partial A_n\) and satisfies the following Markov field condition for all \(v\in A_n{\setminus } \partial A_n\): conditioned on the GFF on \(A_n{\setminus } \{v\}\), the value \(\eta _v\) is distributed as a Gaussian variable with variance \(1\) and mean equal to the average over the neighbors of \(v\) (see below for a definition of the GFF in terms of Green functions). Throughout the paper, we use the notation

$$\begin{aligned} M_n = \sup _{v\in A_n} \eta _v. \end{aligned}$$
(1)

We prove the following tail behavior for \(M_n\).

Theorem 1.1

There exist absolute constants \(C,c>0\) so that for all \(n\in \mathbb N \) and \(0\le \lambda \le (\log n)^{2/3}\)

$$\begin{aligned} c\mathrm{e}^{-C\lambda }&\le \mathbb{P }(M_n \ge \mathbb E M_n + \lambda ) \le C\mathrm{e}^{-c\lambda }\\ c\mathrm{e}^{-C \mathrm{e}^{C\lambda }}&\le \mathbb{P }(M_n \le \mathbb E M_n - \lambda ) \le C\mathrm{e}^{-c\mathrm{e}^{c\lambda }}. \end{aligned}$$

The preceding theorem gives the tail behavior when the deviation is at most \((\log n)^{2/3}\). For \(\lambda \ge (\log n)^{2/3}\), the isoperimetric inequality for general Gaussian processes (see, e.g., Ledoux [16, Theorem 7.1, Eq. (7.4)]) and the simple fact that \(\max _v \mathrm{Var}\, \eta _v = 2\log n/\pi + O(1)\) (see Lemma 2.2) give

$$\begin{aligned} \mathbb{P }(|M_n - \mathbb E M_n | \ge \lambda ) \le 2\, \mathrm{e}^{- c\lambda ^2/\log n}, \quad \text{for an absolute constant } c>0. \end{aligned}$$
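For the reader's convenience, here is the one-line deduction (a sketch; the constant \(c\) is not optimized): writing \(\sigma _n^2 = \max _v \mathrm{Var}\, \eta _v\), the Gaussian concentration (Borell–TIS) inequality gives

$$\begin{aligned} \mathbb{P }(|M_n - \mathbb E M_n | \ge \lambda ) \le 2\exp \Big (-\frac{\lambda ^2}{2\sigma _n^2}\Big ) = 2\exp \Big (-\frac{\lambda ^2}{4\log n/\pi + O(1)}\Big ) \le 2\, \mathrm{e}^{- c\lambda ^2/\log n}. \end{aligned}$$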

Combined with Theorem 1.1, this immediately gives the order of the variance of \(M_n\). Before stating the result, let us specify some notational conventions used throughout the paper. The letters \(c\) and \(C\) denote absolute positive constants, whose values might vary from line to line. By convention, \(C\) denotes large constants and \(c\) denotes small constants. Other absolute constants that appear are fixed once and for all. If there exists an absolute constant \(C>0\) such that \(a_n \le C b_n\) for all \(n\ge 1\), we write \(a_n = O(b_n)\); we write \(a_n = \Theta (b_n)\) if \(a_n = O(b_n)\) as well as \(b_n = O(a_n)\); if \(\lim _{n\rightarrow \infty } a_n /b_n = 0\), we write \(a_n = o(b_n)\). We are now ready to state the corollary.

Corollary 1.2

We have that \(\mathrm{Var} M_n = \Theta (1)\).

Corollary 1.2 improves an \(o(\log n)\) bound on the variance due to Chatterjee [7], thereby confirming a folklore conjecture (see Question (4) of [7]). An important ingredient for our proof is the following result on the tightness of the maximum of the GFF on the 2D box, due to Bramson and Zeitouni [6].

Theorem 1.3

[6] The sequence of random variables \(M_n - \mathbb E M_n\) is tight and

$$\begin{aligned} \mathbb E M_n = 2\sqrt{2/\pi } \big (\log n - \tfrac{3}{8\log 2} \log \log n\big ) + O(1). \end{aligned}$$

Prior to [6], Bolthausen et al. [3] proved that \((M_n - \mathbb E M_n)\) is tight along a deterministic subsequence \((n_k)_{k\in \mathbb N }\). Earlier works on the extremal values of the GFF include Bolthausen et al. [2], who established the leading-order asymptotics for \(M_n\), and Daviaud [8], who studied the extreme level sets of the GFF.

We compare our results with the tail behavior for the maximum of the GFF on a binary tree. Interestingly, in the case of the tree, the maximum exhibits an exponential decay for the right tail but a Gaussian type decay for the left tail, as opposed to the double exponential decay for the 2D box. This is because in the case of the 2D box, the Dirichlet boundary condition decouples the GFF near the boundary, where the field behaves almost independently. The same phenomenon also occurs for the event that all the GFF values are nonnegative: for a binary tree of height \(n\) the probability is about \(\mathrm{e}^{-\Theta (n^2)}\), while for a box of side length \(n\) it is about \(\mathrm{e}^{-\Theta (n)}\) (see Deuschel [9]).

Much more is known about the maximal displacement of branching Brownian motion (BBM). In their classical paper, Kolmogorov et al. [13] studied its connection with the so-called KPP-equation, from which it can be deduced that both the right and left tails exhibit exponential decay. The probabilistic interpretation of the KPP-equation in terms of BBM was further exploited by Bramson [4]. The precise asymptotic tails were then computed, and in particular a polynomial prefactor for the right tail was detected (this appears to be fundamentally different from the tail of the Gumbel distribution, which arises as the maximum of, say, i.i.d. Gaussian variables). See, e.g., Bramson [5] and Harris [12] for the right tail, and see Arguin et al. [1] for the left tail (the argument is due to De Lellis). In addition, Lalley and Sellke [14] obtained an integral representation for the limiting law of the centered maximum.

We now give the definition of the GFF using the connection with random walks (in particular, Green functions). Consider a connected graph \(G = (V, E)\). For \(U \subset V\), the Green function \(G_U(\cdot , \cdot )\) of the discrete Laplacian is given by

$$\begin{aligned} G_U(x, y) = \mathbb E _x\left(\sum _{k=0}^{\tau _U - 1} \mathbf {1}\{S_k = y\}\right), \quad \text{for all } x, y\in V, \end{aligned}$$
(2)

where \(\tau _U\) is the hitting time of the set \(U\) for the random walk \((S_k)\), defined by (this notation applies throughout the paper)

$$\begin{aligned} \tau _U = \min \{k\ge 0: S_k \in U\}. \end{aligned}$$
(3)

The GFF \(\{\eta _v: v\in V\}\) with Dirichlet boundary condition on \(U\) is then defined to be the mean zero Gaussian process indexed by \(V\) whose covariance matrix is given by the Green function \((G_U(x, y))_{x, y\in V}\). (On a general graph, it is typical to normalize the Green function by the degree of the target vertex \(y\); in the case of 2D lattices, this normalization is usually dropped since the degrees are constant.) In particular, \(\eta _v = 0\) for all \(v\in U\).
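To make the definition concrete, the following Python sketch (ours, purely for illustration; the helper name gff_sample is not from any library) assembles the covariance (2) for the box \(A_n\) as \(G = (I-P)^{-1}\), where \(P\) is the transition kernel of the simple random walk killed on \(\partial A_n\), and then samples the field via a Cholesky factorization.

```python
import numpy as np

def gff_sample(n, rng=None):
    """Sample a GFF on A_n = {0,...,n}^2 with Dirichlet boundary condition.

    The covariance is the Green function of eq. (2): G = (I - P)^{-1} on the
    interior, where P is the kernel of SRW killed on the boundary.
    """
    rng = np.random.default_rng(rng)
    interior = [(x, y) for x in range(1, n) for y in range(1, n)]
    idx = {v: i for i, v in enumerate(interior)}
    m = len(interior)
    P = np.zeros((m, m))
    for (x, y), i in idx.items():
        for u in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            if u in idx:              # steps onto the boundary are killed
                P[i, idx[u]] = 0.25
    G = np.linalg.inv(np.eye(m) - P)  # Green function, eq. (2)
    eta = np.linalg.cholesky(G) @ rng.standard_normal(m)
    field = np.zeros((n + 1, n + 1))  # eta_v = 0 on the boundary
    for (x, y), i in idx.items():
        field[x, y] = eta[i]
    return field

print(gff_sample(20, rng=1).max())    # one realization of M_n
```

One can check numerically that the diagonal entry \(G(v, v)\) at the center of the box grows like \(2\log n/\pi \), in line with Lemma 2.2 below.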

2 Proofs

In this section, we prove Theorem 1.1. We start with a brief discussion of the proof strategy, and then establish the upper and lower bounds for the right and left tails in the four subsections that follow.

2.1 A word on proof strategy

Our proof typically employs a two-level structure which involves either a partitioning or a packing of the 2D box \(A_n\) by (slightly) smaller boxes. In all the proofs, we use Theorem 1.3 to control the behavior in the small boxes, studying “typical” events on small boxes whose probabilities are strictly bounded away from 0 and 1. The large deviation bounds then come from gluing the small boxes together into the big box, with probability either inversely proportional to the number of small boxes or exponentially small in that number.

By Theorem 1.3, there exists a universal constant \(\kappa >0\) such that for all \(n \ge 3 n^{\prime }\)

$$\begin{aligned}&2\sqrt{2/\pi }\log (n/n^{\prime }) - \tfrac{3\sqrt{2/\pi }}{4\log 2}\log (\log n/\log n^{\prime }) - \kappa \le \mathbb E M_n - \mathbb E M_{n^{\prime }} \nonumber \\&\quad \le 2\sqrt{2/\pi }\log (n/n^{\prime }) + \kappa . \end{aligned}$$
(4)

That is to say, in order to observe a difference of \(\lambda \) in the expectation of the maximum, the side length of the box has to increase (decrease) by a factor of \(\exp (\Theta (\lambda ))\). This suggests that the number of small boxes should be \(\exp (\Theta (\lambda ))\) in our two-level structure. Depending on how the large deviation arises, this yields a tail of either exponential or double exponential decay.
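For instance, ignoring the \(\log \log \) correction and the additive constant in (4): increasing the side length from \(n\) to \(N = kn\) raises the expected maximum by \(2\sqrt{2/\pi }\log k\), and solving for a shift of \(\lambda \) gives

$$\begin{aligned} 2\sqrt{2/\pi }\, \log k = \lambda \quad \Longleftrightarrow \quad \log k = \frac{\lambda }{2\sqrt{2/\pi }} = \sqrt{\pi /8}\, \lambda , \end{aligned}$$

which is exactly the number of small boxes per side, \(k = \lceil \mathrm{e}^{\sqrt{\pi /8} (\lambda - \kappa - \alpha )} \rceil \), used in Sect. 2.2 below.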

In order to construct the two-level structure, we repeatedly use the decomposition of Gaussian processes: for a jointly Gaussian pair \((X, Y)\), we can write \(X\) as the sum of a (linear) function of \(Y\) and a Gaussian process \(X^{\prime }\) independent of \(Y\). Here we use the crucial fact that Gaussian processes possess a linear structure in which orthogonality implies independence. Furthermore, the next well-known property specific to the GFF proves to be quite useful (see Dynkin [10, Theorem 1.2.2]).
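Explicitly (a standard fact, recorded here for convenience, assuming the covariance matrix \(\Sigma _{YY}\) of \(Y\) is nondegenerate):

$$\begin{aligned} X = \Sigma _{XY} \Sigma _{YY}^{-1} Y + X^{\prime }, \quad \text{where } X^{\prime } := X - \mathbb E (X \mid Y), \end{aligned}$$

with \(\Sigma _{XY} = \mathrm{Cov}(X, Y)\); here \(X^{\prime }\) is again a centered Gaussian process and is independent of \(Y\). The decomposition (7) below is precisely this identity, with \(Y\) the restriction of the field to the relevant boundary set.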

Lemma 2.1

Let \(\{\eta _v\}_{v\in V}\) be a GFF on a graph \(G=(V, E)\). For \(U\subset V\), define \(\tau _U\) as in (3). Then, for \(v\in V\), we have

$$\begin{aligned} \mathbb E (\eta _v \mid \eta _u, u\in U) = \sum _{u\in U}\mathbb{P }_v(S_{\tau _U} = u) \cdot \eta _u. \end{aligned}$$

2.2 Upper bound on the right tail

In this subsection, we prove that for absolute constants \(C, \lambda _0>0\)

$$\begin{aligned} \mathbb{P }(M_n - \mathbb E M_n \ge \lambda ) \le C \mathrm{e}^{-\sqrt{\pi /2} \lambda }, \quad \text{for all } n\in \mathbb{N } \text{ and } \lambda \ge \lambda _0. \end{aligned}$$
(5)

Note that we can choose \(\lambda _0\) arbitrarily large by adjusting the constant \(C\) in Theorem 1.1. Let \(N = n \lceil \mathrm{e}^{\sqrt{\pi /8} (\lambda - \kappa - \alpha )} \rceil \), where \(\kappa \) is from (4) and \(\alpha > 0\) will be selected later. Write \(p = p_\alpha = \mathrm{e}^{-\sqrt{\pi /2} (\lambda - \kappa - \alpha )}\) and \(k = \lceil \mathrm{e}^{\sqrt{\pi /8} (\lambda - \kappa - \alpha )} \rceil \), so that \(N = kn\) and \(pk^2 \ge 1\). It suffices to prove that \(\mathbb{P }(M_n - \mathbb E M_n \ge \lambda ) \le p\), and we prove this by contradiction. To this end, we assume that

$$\begin{aligned} \mathbb{P }(M_n - \mathbb E M_n \ge \lambda ) > p \end{aligned}$$
(6)

and try to derive a contradiction.

Now, consider an \(N \times N\) 2D box \(A_N\) and let \(\{\eta _v : v\in A_N\}\) be a GFF on \(A_N\) with Dirichlet boundary condition. We partition \(A_N\) into \(k^2\) boxes of side length \(n\) and denote by \(\mathcal B \) the collection of these boxes. We abuse notation and write \(\partial \mathcal B \) for the union of the boundary sets of the smaller boxes in \(\mathcal B \). For \(B\in \mathcal B \), we let \(\{g_v^B: v\in B\}\) be a GFF on \(B\) with Dirichlet boundary condition, such that the fields \(\{\{g_v^B: v\in B\}\}_{B\in \mathcal B }\) are independent of each other and of \(\{\eta _v: v\in \partial \mathcal B \}\). Using the decomposition of Gaussian processes, we can write, for every \(v\in B\subseteq A_N\),

$$\begin{aligned} \eta _v = g^B_v + \mathbb E (\eta _v \mid \{\eta _u: u\in \partial \mathcal B \}). \end{aligned}$$
(7)

Denote by \(\phi _v = \mathbb E (\eta _v \mid \{\eta _u: u\in \partial \mathcal B \})\). We note that \(\phi _v\) is a convex combination of \(\{\eta _u: u\in \partial \mathcal B \}\) with deterministic coefficients. Thus,

$$\begin{aligned} \{\phi _v : v\in A_N\} \text{ is independent of } \{\{g^B_v: v\in B\}: B\in \mathcal B \}. \end{aligned}$$
(8)

Denote by \(M_B = \sup _{v\in B} g^B_v\). It is clear that \(\{M_B : B \in \mathcal B \}\) is a collection of i.i.d. random variables, each distributed as \(M_n\). Therefore, by (6), we obtain that \(\mathbb{P }(M_B \ge \mathbb E M_n + \lambda ) \ge p\). Using independence and the fact that \(pk^2 \ge 1\), we get

$$\begin{aligned} \mathbb{P }\left(\sup _{B\in \mathcal B } \sup _{v\in B}\, g^B_v \ge \mathbb E M_n + \lambda \right) = \mathbb{P }\left(\sup _{B\in \mathcal B } M_B \ge \mathbb E M_n + \lambda \right) \ge 1 - (1-p)^{k^2} \ge 1 - \mathrm{e}^{-1} \ge 1/2. \end{aligned}$$

Let \(\chi \in B \subseteq A_N\) be such that \(g^B_\chi = \sup _{B\in \mathcal B } \sup _{v\in B} g^B_v\). The (random) location \(\chi \) is independent of \(\{\phi _v : v\in A_N\}\) by (8). Therefore, since each \(\phi _v\) is a centered Gaussian variable, we obtain

$$\begin{aligned} \mathbb{P }\left(\sup _{v\in A_N} \eta _v \ge \mathbb E M_n + \lambda \right) \ge \mathbb{P }\left(g^B_\chi \ge \mathbb E M_n + \lambda ,\ \phi _\chi \ge 0\right) \ge (1/2) \min _{v\in A_N} \mathbb{P }(\phi _v \ge 0) \ge 1/4. \end{aligned}$$
(9)

Recalling (4) and our definition of \(N\) (note that \(2\sqrt{2/\pi } \cdot \sqrt{\pi /8} = 1\), so (4) gives \(\mathbb E M_N \le \mathbb E M_n + \lambda - \alpha \)), we thus derive that

$$\begin{aligned} \mathbb{P }(M_N - \mathbb E M_N \ge \alpha ) \ge 1/4. \end{aligned}$$

However, Theorem 1.3 implies that there exists a universal constant \(\alpha (1/4) > 0\) such that \(\mathbb{P }(M_n - \mathbb E M_n \ge \alpha (1/4)) < 1/4\) for all \(n\in \mathbb N \). Setting \(\alpha = \alpha (1/4)\), we arrive at a contradiction and thus show that (6) cannot hold, thereby establishing (5).
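As a quick empirical illustration of this exponential right tail (ours, not from the paper; it reuses the hypothetical gff_sample helper sketched in Sect. 1, and such small \(n\) are of course far from the asymptotic regime):

```python
import numpy as np

# assumes gff_sample from the sketch in Sect. 1
samples = np.array([gff_sample(30, rng=i).max() for i in range(300)])
for lam in (0.0, 0.5, 1.0, 1.5, 2.0):
    tail = (samples >= samples.mean() + lam).mean()
    print(lam, tail)   # empirical right tail of M_n - E M_n
```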

2.3 Lower bound on the right tail

In this subsection, we analyze the lower bound on the right tail, proving that for absolute constants \(c, \lambda _0>0\)

$$\begin{aligned} \mathbb{P }(M_n - \mathbb E \, M_n \ge \lambda ) \ge \tfrac{c}{\lambda }\, \mathrm{e}^{- 8 \sqrt{2\pi }\lambda } , \quad \text{for all } n \in \mathbb N \text{ and } \lambda _0\le \lambda \le (\log n)^{2/3}. \end{aligned}$$
(10)

To prove the above lower bound, we consider a box \(A_{n^{\prime }}\) of side length \(n^{\prime } = n \mathrm{e}^{-\beta \lambda }\) in the center of \(A_n\), where \(\beta > 0\) is to be selected (note that since \(\lambda \le (\log n)^{2/3}\), we have \(n^{\prime }\ge 1\), so the box is well defined). Let \(\{g_v: v\in A_{n^{\prime }}\}\) be a Gaussian free field on \(A_{n^{\prime }}\) with Dirichlet boundary condition, independent of \(\{\eta _v : v\in \partial A_{n^{\prime }}\}\). Analogous to (7), we can write

$$\begin{aligned} \eta _v = g_v + \phi _v , \quad \text{ for} \text{ all} \, v\in A_{n^{\prime }}, \end{aligned}$$

where \(\phi _v = \mathbb E (\eta _v \mid \{\eta _u : u\in \partial A_{n^{\prime }}\})\) is a convex combination of \(\{\eta _u: u\in \partial A_{n^{\prime }}\}\). We wish to estimate the variance of \(\phi _v\). For this purpose, we need the following standard estimates on Green functions for random walks on 2D lattices. See, e.g., [15, Proposition 4.6.2, Theorem 4.4.4] for a reference.

Lemma 2.2

For \(A\subset \mathbb Z ^2\), consider a random walk \((S_t)\) on \(\mathbb Z ^2\) and define \(\tau _{\partial A} = \min \{j\ge 0: S_j \in \partial A\}\) to be the hitting time of \(\partial A\). For \(u, v\in A\), let \(G_{\partial A}(u, v)\) be the Green function as in (2). There exists a nonnegative function \(a(\cdot , \cdot )\) (the potential kernel) with \(a(x, x) = 0\) and \(a(x, y) = \frac{2}{\pi } \log |x-y| + \frac{2\gamma + \log 8}{\pi } + O(|x-y|^{-2})\), where \(\gamma \) is Euler’s constant, such that

$$\begin{aligned} G_{\partial A}(u, v) = \mathbb E _u(a(S_{\tau _{\partial A}}, v)) - a(u, v). \end{aligned}$$

By the preceding lemma, we infer that for any \(u, w\in \partial A_{n^{\prime }}\),

$$\begin{aligned} \text{ Cov}(\eta _u, \eta _w) = G_{\partial A_n}(u, w) \ge \tfrac{2}{\pi } \beta \lambda + O(1). \end{aligned}$$
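In more detail (a two-line computation from Lemma 2.2): each \(w\in \partial A_{n^{\prime }}\) is at distance \(\Theta (n)\) from \(\partial A_n\) and at distance \(O(n^{\prime })\) from any \(u\in \partial A_{n^{\prime }}\), so

$$\begin{aligned} G_{\partial A_n}(u, w) = \mathbb E _u\big (a(S_{\tau _{\partial A_n}}, w)\big ) - a(u, w) \ge \tfrac{2}{\pi } \log (cn) - \tfrac{2}{\pi } \log (Cn^{\prime }) + O(1) = \tfrac{2}{\pi } \beta \lambda + O(1). \end{aligned}$$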

Since \(\phi _v\) is a convex combination of \(\{\eta _u: u\in \partial A_{n^{\prime }}\}\), this implies that for all \(v\in A_{n^{\prime }}\)

$$\begin{aligned} \text{ Var} \phi _v \ge \tfrac{2}{\pi } \beta \lambda + O(1). \end{aligned}$$
(11)

By Theorem 1.3, there exists an absolute constant \(\alpha (1/2)\) such that

$$\begin{aligned} \mathbb{P }(M_n - \mathbb E M_n \ge -\alpha (1/2)) \ge 1/2 \quad \text{for all } n\in \mathbb N . \end{aligned}$$
(12)

Let \(\chi \in A_{n^{\prime }}\) be such that \(g_\chi = \sup _{v\in A_{n^{\prime }}} g_v\). Recalling that \(|\mathbb E M_n - \mathbb E M_{n^{\prime }}| \le 2\sqrt{2/\pi } \beta \lambda + O(\log \beta \lambda )+\kappa \) and that \(\lambda \ge \lambda _0\), we obtain that

$$\begin{aligned} \mathbb{P }\left(\sup _{v\in A_n} \eta _v \ge \mathbb E M_n + \lambda \right)&\ge \mathbb{P }\big (g_\chi \ge \mathbb E M_{n^{\prime }} - \alpha (1/2),\ \phi _\chi \ge \alpha (1/2) + \kappa + (2\sqrt{2/\pi }\beta + 1) \lambda \big )\\&\ge \frac{1}{2}\, \frac{\pi }{\sqrt{\beta \lambda + O(1)}} \int \limits _{z \ge \alpha (1/2) + \kappa + (2\sqrt{2/\pi }\beta + 1) \lambda }{\mathrm{e}^{-\frac{z^2}{2\beta \lambda /\pi +O(1)}}}\, dz \\&\ge \frac{c}{\sqrt{\lambda }}\, \mathrm{e}^{-\pi (2\sqrt{2/\pi } \beta + 1)^2\lambda /\beta }, \end{aligned}$$

where the first inequality uses the bound on \(|\mathbb E M_n - \mathbb E M_{n^{\prime }}|\) recalled above, the second follows from (12), the variance bound (11) and the independence between \(\chi \) and \(\{\phi _v: v\in A_{n^{\prime }}\}\) (analogous to (8)), and in the last inequality \(c>0\) is a small absolute constant. Setting \(\beta = \sqrt{\pi /8}\), we obtain the desired estimate (10).
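For the record, the arithmetic behind the choice \(\beta = \sqrt{\pi /8}\):

$$\begin{aligned} 2\sqrt{2/\pi }\, \beta = 2\sqrt{\tfrac{2}{\pi }\cdot \tfrac{\pi }{8}} = 1, \quad \text{whence}\quad \frac{\pi (2\sqrt{2/\pi }\, \beta + 1)^2}{\beta } = \frac{4\pi }{\sqrt{\pi /8}} = 4\sqrt{8\pi } = 8\sqrt{2\pi }, \end{aligned}$$

which is the exponent appearing in (10).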

2.4 Upper bound on the left tail

In this subsection, we give the upper bound for the lower tail of the maximum, proving that for absolute constants \(C, c, \lambda _0>0\)

$$\begin{aligned} \mathbb{P }(M_n - \mathbb E M_n \le -\lambda ) \le C \mathrm{e}^{-c\mathrm{e}^{c\lambda }}, \quad \text{for all } n\in \mathbb N \text{ and } \lambda _0\le \lambda \le (\log n)^{2/3}. \end{aligned}$$
(13)

Let \(\alpha = \alpha (1/2)\) be defined as in (12). Write \(r = n \exp (-\sqrt{\pi /8}(\lambda - \alpha - \kappa -4))\) and \(\ell = n \exp (-\sqrt{\pi /8}(\lambda - \alpha - \kappa -4)/3)\). Assume that the bottom left corner of \(A_n\) is the origin \(o= (0, 0)\). Define \(o_i = (i\ell , 2r)\) for \(1\le i \le m = \lfloor n/2\ell \rfloor \). Let \(\mathcal C _i\) be the discrete ball of radius \(r\) centered at \(o_i\) and let \(B_i \subset \mathcal C _i\) be the box of side length \(r/8\) centered at \(o_i\). Let \(\mathfrak{C }=\{\mathcal{C }_i: 1\le i\le m\}\) and \(\mathcal B = \{B_i: 1\le i\le m\}\). Analogous to (7), we can write

$$\begin{aligned} \eta _v = g_v^B + \phi _v, \quad \text{for all } v\in B \subseteq \mathcal C \in \mathfrak C , \end{aligned}$$

where \(\{g_v^B: v\in B\}\) is the projection onto \(B\) of a GFF on \(\mathcal C \) with Dirichlet boundary condition on \(\partial \mathcal C \), the fields \(\{\{g_v^B: v\in B\} : B\in \mathcal B \}\) are independent of each other and of \(\{\eta _v : v\in \partial \mathfrak C \}\) (here \(\partial \mathfrak C = \cup _{\mathcal C \in \mathfrak C } \partial \mathcal C \)), and \(\phi _v = \mathbb E (\eta _v \mid \{\eta _u: u\in \partial \mathfrak C \})\) is a convex combination of \(\{\eta _u: u\in \partial \mathfrak C \}\). For every \(B\in \mathcal B \), define \(\chi _B \in B\) such that

$$\begin{aligned} g_{\chi _B}^B = \sup _{v\in B}g_v^B. \end{aligned}$$

Recalling (4), we get that \(\mathbb E M_n - \mathbb E M_{r/8} \le \lambda - \alpha \) (using \(2\sqrt{2/\pi }\log 8 \le 4\); here we assume \(\lambda _0\) is large enough that (4) applies with \(n^{\prime } = r/8\)).

Using a derivation analogous to that of (9), we get that

$$\begin{aligned} \mathbb{P }\left(g_{\chi _B}^B \ge \mathbb E M_n - \lambda \right) \ge 1/4, \end{aligned}$$

where we used the definition of \(\alpha \) in (12). Let \(W = \{\chi _B: g_{\chi _B}^B \ge \mathbb E M_n - \lambda , B\in \mathcal B \}\). By independence, a standard concentration argument gives that for an absolute constant \(c> 0\)

$$\begin{aligned} \mathbb{P }(|W| \le \tfrac{1}{8} m) \le \text{ e}^{-c m}. \end{aligned}$$
(14)
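For instance (one concrete instantiation of the concentration argument; the constants are not optimized): \(|W|\) stochastically dominates a binomial variable \(\mathrm{Bin}(m, 1/4)\), so the multiplicative Chernoff bound gives

$$\begin{aligned} \mathbb{P }(|W| \le \tfrac{1}{8} m) \le \mathbb{P }\big (\mathrm{Bin}(m, 1/4) \le \tfrac{1}{8} m\big ) \le \mathrm{e}^{-m/32}. \end{aligned}$$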

It remains to study the process \(\{\phi _v: v\in W\}\). If there exists \(v\in W\) such that \(\phi _v > 0\), then \(\sup _{u \in A_n} \eta _u \ge g_v^B + \phi _v > \mathbb E M_n - \lambda \). Since \(\{\phi _v\}\) is independent of the fields \(\{g^B\}\), and hence of the (random) set \(W\), it then suffices to prove the following lemma.

Lemma 2.3

Let \(U\subset \cup _{B\in \mathcal B } B\) such that \(|U\cap B| \le 1\) for all \(B\in \mathcal B \). Assume that \(|U| \ge m/8\). Then, for some absolute constants \(C, c>0\)

$$\begin{aligned} \mathbb{P }(\phi _v \le 0 \text{ for all } v\in U) \le C \mathrm{e}^{-c \mathrm{e}^{c \lambda }}. \end{aligned}$$

To prove the preceding lemma, we need to study the correlation structure for the Gaussian process \(\{\phi _v: v\in U\}\).

Lemma 2.4

[15, Lemma 6.3.7] For all \(n\ge 1\), let \(\mathcal C (n) \subset \mathbb Z ^2\) be a discrete ball of radius \(n\) centered at the origin. Then there exist absolute constants \(c, C>0\) such that for all \(n\ge 1\) and \(x\in \mathcal C (n/4)\) and \(y\in \partial \mathcal C (n)\)

$$\begin{aligned} c/n\le \mathbb{P }_x(S_{\tau _{\partial \mathcal C (n)}} = y) \le C/n. \end{aligned}$$

Write \(a_{v, w} = \mathbb{P }_v(\tau _{\partial \mathcal C } = \tau _w)\). The preceding lemma implies that \(c/r\le a_{v, w} \le C/r\) for all \(v\in B\subset \mathcal C \) and \(w\in \partial \mathcal C \). Combined with Lemma 2.1, it follows that

$$\begin{aligned} \phi _v = \sum _{w\in \partial \mathcal C } a_{v, w} \eta _w. \end{aligned}$$
(15)

Therefore, we have

$$\begin{aligned} \text{Var}\, \phi _v = \Theta (1/r^2) \sum _{u, w\in \partial \mathcal C } \text{Cov}(\eta _u, \eta _w)= \Theta (1/r^2) \sum _{u, w\in \partial \mathcal C } G_{\partial A_n}(u, w). \end{aligned}$$
(16)

In order to estimate the sum of Green functions, one could use Lemma 2.2. Alternatively, the estimate is computation free if we apply the next lemma.

Lemma 2.5

[15, Proposition 6.4.1] For all \(n\ge 1\), let \(\mathcal C (n) \subset \mathbb Z ^2\) be a discrete ball of radius \(n\) centered at the origin. Then for all \(k< n\) and \(x\in \mathcal C (n) {\setminus } \mathcal C (k)\), we have

$$\begin{aligned} \mathbb{P }_x(\tau _{\partial \mathcal C (n)} < \tau _{\partial \mathcal C (k)}) = \frac{\log |x| - \log k + O(1/k)}{\log n - \log k}. \end{aligned}$$

Now, write

$$\begin{aligned} p_{\min } = \min _\mathcal{C \in \mathfrak C }\min _{u\in \partial \mathcal C } \mathbb{P }_u(\tau _{\partial A_n} < \tau ^+_{\partial \mathcal C }), \quad \text{ and}\quad p_{\max } = \max _\mathcal{C \in \mathfrak C }\max _{u\in \partial \mathcal C } \mathbb{P }_u(\tau _{\partial A_n} < \tau ^+_{\partial \mathcal C }), \end{aligned}$$

where \(\tau ^+_{\partial \mathcal C } = \min \{k \ge 1: S_k \in \partial \mathcal C \}\) is the first return time to \(\partial \mathcal C \). By the preceding lemma (note that \(\log (n/r) = \Theta (\lambda )\)), we have

$$\begin{aligned} 1/(4 r\lambda ) \le p_{\min } \le p_{\max } \le O(1/r). \end{aligned}$$

Therefore, by the Markov property we have

$$\begin{aligned} \Theta (r) \le \frac{1}{p_{\max }} \le \sum _{w\in \partial \mathcal C } G_{\partial A_n}(u, w) \le 1+ \frac{1}{p_{\min }} = O( r\lambda ), \quad \text{for all } u\in \partial \mathcal C \text{ and } \mathcal C \in \mathfrak C . \end{aligned}$$
(17)
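Indeed (a standard excursion count, recorded for completeness): by (2), \(\sum _{w\in \partial \mathcal C } G_{\partial A_n}(u, w)\) is the expected number of visits to \(\partial \mathcal C \) before \(\tau _{\partial A_n}\) for the walk started at \(u\); at each visit, the walk escapes to \(\partial A_n\) before revisiting \(\partial \mathcal C \) with probability between \(p_{\min }\) and \(p_{\max }\), so this count is stochastically sandwiched between two geometric variables:

$$\begin{aligned} \frac{1}{p_{\max }} \le \mathbb E _u\, \#\{0 \le k < \tau _{\partial A_n}: S_k \in \partial \mathcal C \} \le \frac{1}{p_{\min }}. \end{aligned}$$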

Combined with (16), this implies that

$$\begin{aligned} \Theta (1) \le \text{Var}(\phi _v) = O(\lambda ), \quad \text{for all } v \in U. \end{aligned}$$

We also wish to bound the covariance between \(\phi _v\) and \(\phi _u\) for \(u, v\in U\). Assume \(u\in \mathcal C _i\) and \(v\in \mathcal C _j\) for \(i\ne j\). By (17), we see that

$$\begin{aligned} \text{Cov}(\phi _u, \phi _v)&\le O(1/r) \max _{x\in \mathcal C _i} G_{\partial A_n}(x, \partial \mathcal C _j)\\&\le O(1/r) \max _{x\in \mathcal C _i} \mathbb{P }_x(\tau _{\partial \mathcal C _j} < \tau _{\partial A_n}) \max _{y\in \partial \mathcal C _j} \sum _{z\in \partial \mathcal C _j}G_{\partial A_n}(y, z)\\&\le O(\lambda )\max _{x\in \mathcal C _i} \mathbb{P }_x(\tau _{\partial \mathcal C _j} < \tau _{\partial A_n}). \end{aligned}$$
(18)

Here \(G_{\partial A_n}(x, \partial \mathcal C _j)\) denotes \(\sum _{z\in \partial \mathcal C _j} G_{\partial A_n}(x, z)\); the second inequality in (18) follows from the strong Markov property at \(\tau _{\partial \mathcal C _j}\), and the last one uses (17). We incorporate the estimate for the above hitting probability in the next lemma.

Lemma 2.6

For any \(i\ne j\) and \(x\in \mathcal C _i\), we have

$$\begin{aligned} \mathbb{P }_x( \tau _{\partial \mathcal C _j} < \tau _{\partial A_n}) \le C\sqrt{r/\ell }, \end{aligned}$$

where \(C>0\) is a universal constant.

Proof

We consider the projections of the random walk onto the horizontal and vertical axes, and denote them by \((X_t)\) and \((Y_t)\) respectively. Define

$$\begin{aligned} T_{_X} = \min \left\{ t: |X_t - X_0| \ge \ell /2 \right\} ,\quad \text{and} \quad T_{_Y} = \min \{t: Y_t = 0\}. \end{aligned}$$

It is clear that \(\tau _{\partial A_n}\le T_{_Y}\) and \(T_{_X} \le \tau _{\partial \mathfrak C {\setminus } \partial \mathcal C _i}\). Write \(t^\star = r \ell \). Since the number of steps among the first \(t^\star \) that the walk spends on the horizontal (respectively, vertical) coordinate is binomially distributed with parameters \(t^\star \) and \(1/2\), a standard concentration bound for the binomial distribution yields that with probability at least \(1- \exp (-c t^\star )\) (here \(c>0\) is an absolute constant) this number is at least \(t^\star /3\) (and thus at most \(2t^\star /3\)). Combined with standard estimates for 1-dimensional random walks (see, e.g., [18, Theorem 2.17, Lemma 2.21]), it follows that for a universal constant \(C>0\)

$$\begin{aligned} \mathbb{P }(T_{_Y} \ge t^\star ) \le C \sqrt{r/\ell }. \end{aligned}$$

Furthermore, \(t^\star = r\ell = o(\ell ^2)\), while the horizontal projection typically needs time of order \(\ell ^2\) to travel distance \(\ell /2\); by the reflection principle and a Gaussian tail estimate,

$$\begin{aligned} \mathbb{P }(T_{_X} \le t^\star ) \le C \mathrm{e}^{-c\ell ^2/t^\star } = C \mathrm{e}^{-c\ell /r} \le \varepsilon ^{\ell /r}, \end{aligned}$$

where \(\varepsilon <1\) is an absolute constant (for \(\lambda _0\) large enough). Since \(\{\tau _{\partial \mathcal C _j} < \tau _{\partial A_n}\} \subseteq \{T_{_X} < T_{_Y}\} \subseteq \{T_{_X} \le t^\star \} \cup \{T_{_Y} \ge t^\star \}\) and \(\varepsilon ^{\ell /r} = O(\sqrt{r/\ell })\), combining the two estimates completes the proof.\(\square \)

Combining the preceding lemma and (18), we obtain that (here we assume that \(\lambda _0\) is large enough)

$$\begin{aligned} \text{Cov}(\phi _u, \phi _v) = O(\lambda \sqrt{r/\ell }), \quad \text{for all } u \ne v \in U. \end{aligned}$$

Therefore, we have the following bounds on the correlation coefficients \(\rho _{u, v}\):

$$\begin{aligned} 0\le \rho _{u, v} = O(\lambda \sqrt{r/\ell }), \quad \text{for all } u \ne v\in U. \end{aligned}$$
(19)

At this point, we wish to apply Slepian’s comparison theorem [20] (see also [11, 17]).

Theorem 2.7

If \(\{\xi _i : 1\le i \le n\}\) and \(\{\zeta _i : 1\le i\le n\}\) are two mean zero Gaussian processes such that

$$\begin{aligned} \text{Var}\, \xi _i = \text{Var}\, \zeta _i \quad \text{and} \quad \text{Cov} (\xi _i, \xi _j) \le \text{Cov}(\zeta _i, \zeta _j) \quad \text{for all } 1\le i, j \le n, \end{aligned}$$
(20)

then for all real numbers \(\lambda _1, \ldots , \lambda _n\),

$$\begin{aligned} \mathbb{P }(\xi _i \le \lambda _i \text{ for all } 1\le i\le n) \le \mathbb{P }(\zeta _i \le \lambda _i \text{ for all } 1\le i\le n). \end{aligned}$$

The following is an immediate consequence.

Corollary 2.8

Let \(\{\xi _i \,\mathrm{:}\, 1\le i\le n\}\) be a mean zero Gaussian process such that the correlation coefficients satisfy \(0\le \rho _{i, j} \le \rho \le 1/2\) for all \(1\le i< j\le n\). Then,

$$\begin{aligned} \mathbb{P }(\xi _i\le 0 \text{ for all } 1\le i\le n ) \le \mathrm{e}^{-1/(2\rho )} + (19/20)^n. \end{aligned}$$

Proof

Since we are comparing the \(\xi _i\)’s with zero, we may assume that \(\text{Var}\, \xi _i = 1\) for all \(1\le i\le n\). Let \(\zeta _i = \sqrt{\rho } X + \sqrt{1-\rho } Y_i\), where \(X\) and the \(Y_i\)’s are i.i.d. standard Gaussian variables. It is clear that the processes \(\{\xi _i: 1\le i\le n\}\) and \(\{\zeta _i: 1\le i \le n\}\) satisfy (20). By Theorem 2.7, we obtain that

$$\begin{aligned} \mathbb{P }(\xi _i \le 0 \text{ for all } 1\le i\le n) \le \mathbb{P }(\zeta _i \le 0 \text{ for all } 1\le i\le n). \end{aligned}$$

Since \(\{\zeta _i \le 0 \text{ for all } 1\le i\le n\} \subseteq \{X \le -1/\sqrt{\rho }\} \cup \{Y_i \le 1/\sqrt{1-\rho } \text{ for all } 1\le i\le n\}\), we have

$$\begin{aligned} \mathbb{P }(\zeta _i \le 0 \text{ for all } 1\le i\le n)&\le \mathbb{P }(X \le -1/\sqrt{\rho }) + \mathbb{P }(Y_i \le 1/\sqrt{1-\rho } \text{ for all } 1\le i\le n)\\&\le \mathrm{e}^{-1/(2\rho )} + (19/20)^n, \end{aligned}$$

where in the last step we used the Gaussian tail bound \(\mathbb{P }(X \le -t) \le \mathrm{e}^{-t^2/2}\), together with the fact that \(1/\sqrt{1-\rho } \le \sqrt{2}\) for \(\rho \le 1/2\), so that \(\mathbb{P }(Y_i \le 1/\sqrt{1-\rho }) \le 19/20\).

Altogether, this completes the proof.\(\square \)
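As a quick Monte Carlo sanity check of Corollary 2.8 (our illustration, not part of the proof), one can simulate the comparison process \(\zeta _i = \sqrt{\rho } X + \sqrt{1-\rho } Y_i\) used above and confirm that the empirical probability sits below the stated bound:

```python
import numpy as np

# Monte Carlo check of Corollary 2.8 for the equicorrelated field.
rng = np.random.default_rng(0)
n, rho, trials = 50, 0.1, 100_000
X = rng.standard_normal(trials)[:, None]
Y = rng.standard_normal((trials, n))
zeta = np.sqrt(rho) * X + np.sqrt(1 - rho) * Y
p_hat = (zeta <= 0).all(axis=1).mean()
bound = np.exp(-1 / (2 * rho)) + (19 / 20) ** n
print(p_hat, bound)   # empirical probability vs. the bound
```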

Proof of Lemma 2.3

Recall the definitions of \(r\), \(\ell \) and \(m\): by (19) the correlations satisfy \(\rho _{u, v} = O(\lambda \sqrt{r/\ell }) \le \mathrm{e}^{-c\lambda }\), while \(|U| \ge m/8 \ge c\, \mathrm{e}^{c\lambda }\). The desired estimate thus follows from an application of the preceding corollary to \(\{\phi _v: v\in U\}\) and the correlation bounds (19) (here we assume that \(\lambda _0\) is large enough that \(\rho _{u, v}\le 1/2\) for all \(u\ne v\)).\(\square \)

Combining Lemma 2.3 and (14), we complete the proof of the upper bound (13) on the left tail.

2.5 Lower bound on the left tail

In this subsection, we study the lower bound for the lower tail of the maximum and show that for absolute constants \(C , c , n_0, \lambda _0> 0\)

$$\begin{aligned} \mathbb{P }(M_n - \mathbb E M_n \le -\lambda ) \ge c \mathrm{e}^{-C \mathrm{e}^{C\lambda }}, \quad \text{for all } n\ge n_0 \text{ and } \lambda _0 \le \lambda \le (\log n)^{2/3}. \end{aligned}$$
(21)

The proof consists of two steps: (1) we estimate the probability that \(\sup _{v\in B} \eta _v \le \mathbb E M_n - \lambda \) for a small box \(B\) in \(A_n\); (2) applying the FKG inequality for the GFF, we bootstrap the estimate from a small box to the whole box.

By Theorem 1.3, there exists an absolute constant \(\alpha ^* > 0\) such that

$$\begin{aligned} \mathbb{P }(M_n \le \mathbb E M_n + \alpha ^*) \ge 3/4 \quad \text{for all } n\in \mathbb N . \end{aligned}$$
(22)

We first consider the behavior of GFF in a box of side length \(\ell \), where

$$\begin{aligned} \ell \stackrel{\scriptscriptstyle \triangle }{=}n \text{ e}^{-10(\lambda + \kappa + \alpha ^* + 2)}. \end{aligned}$$
(23)

Lemma 2.9

Let \(B\subseteq A_n\) be a box of side length \(\ell \). Then,

$$\begin{aligned} \mathbb{P }\left(\sup _{v\in B} \eta _v \le \mathbb E M_n - \lambda \right) \ge 1/2. \end{aligned}$$

In order to prove the lemma, let \(B^{\prime }\) be a box of side length \(2\ell \) that has the same center as \(B\), and let \(\hat{B} = B^{\prime } \cap A_n\). Consider the GFF \(\{g_v: v\in \hat{B}\}\) on \(\hat{B}\) with Dirichlet boundary condition (on \(\partial \hat{B}\)). We wish to compare \(\{\eta _v: v\in B\}\) with \(\{g_v: v\in B\}\). For \(u, v\in B\), let

$$\begin{aligned} \rho _{u, v} = \frac{\text{ Cov}(\eta _u, \eta _v)}{\sqrt{\text{ Var} \eta _u \, \text{ Var} \eta _v}} \quad \text{ and}\quad \hat{\rho }_{u, v} = \frac{\text{ Cov}(g_u, g_v)}{\sqrt{\text{ Var} g_u \, \text{ Var} g_v}} \end{aligned}$$

be the correlation coefficients of the two GFFs under consideration.

Lemma 2.10

For all \(u, v \in B\), we have \(\rho _{u, v} \ge \hat{\rho }_{u, v}\).

Proof

Since by definition \(\hat{B} \subset A_n\), we see that \(\tau _{\partial \hat{B}} \le \tau _{\partial A_n}\) deterministically for a random walk started from an arbitrary vertex in \(B\). Note that, by the strong Markov property at \(\tau _v\),

$$\begin{aligned} G_{\partial A_n} (u, v) = \mathbb{P }_u(\tau _v < \tau _{\partial A_n}) G_{\partial A_n}(v, v) \quad \text{and} \quad G_{\partial \hat{B}} (u, v) = \mathbb{P }_u(\tau _v < \tau _{\partial \hat{B}}) G_{\partial \hat{B}}(v, v). \end{aligned}$$

Altogether, we obtain that

$$\begin{aligned} \rho _{u, v} = \sqrt{\mathbb{P }_u(\tau _v < \tau _{\partial A_n}) \mathbb{P }_v(\tau _u < \tau _{\partial A_n})} \ge \sqrt{\mathbb{P }_u(\tau _v < \tau _{\partial \hat{B}}) \mathbb{P }_v(\tau _u < \tau _{\partial \hat{B}})} = \hat{\rho }_{u, v}. \end{aligned}$$

\(\square \)

We next compare the variances for the two GFFs.

Lemma 2.11

For all \(v\in B\), we have that

$$\begin{aligned} \mathrm{Var}\, \eta _v \le \Big (1 + \frac{(1+o(1))(\log (n/\ell ) + O(1))}{\log n}\Big ) \mathrm{Var}\, g_v. \end{aligned}$$

Proof

It suffices to compare the Green functions \(G_{\partial A_n}(v, v)\) and \(G_{\partial \hat{B}}(v, v)\). We can decompose them in terms of the hitting points to \(\partial \hat{B}\) and obtain that

$$\begin{aligned} G_{\partial A_n} (v, v) = G_{\partial \hat{B}}(v, v) + \sum _{w\in \partial \hat{B}}\mathbb{P }_v(\tau _w = \tau _{\partial \hat{B}}) G_{\partial A_n}(w, v). \end{aligned}$$

Note that for \(w\in \partial \hat{B} \cap \partial A_n\), we have \(G_{\partial A_n}(w, v) = 0\). For \(w\in \partial \hat{B} {\setminus } \partial A_n\), we see that \(|v-w| \ge \ell /2\) by our definition of \(\hat{B}\). Therefore, by Lemma 2.2, we have

$$\begin{aligned} G_{\partial A_n}(w, v) \le \tfrac{2}{\pi } \log (n/\ell ) + O(1). \end{aligned}$$

Since \(|v-w|\ge \ell /2\) for \(w\in \partial \hat{B} {\setminus } \partial A_n\), Lemma 2.2 gives that

$$\begin{aligned} G_{\partial \hat{B}}(v, v)&\ge \sum _{w\in \partial \hat{B} {\setminus } \partial A_n} \mathbb{P }_v(\tau _w = \tau _{\partial \hat{B}}) \cdot a(w, v) \\&\ge \big (\tfrac{2}{\pi } +o(1)\big ) \log n \sum _{w\in \partial \hat{B} {\setminus } \partial A_n} \mathbb{P }_v(\tau _w = \tau _{\partial \hat{B}}), \end{aligned}$$

where we used the assumption that \(\lambda \le (\log n)^{2/3}\). Altogether, we get that

$$\begin{aligned} G_{\partial A_n} (v, v) \le \big (1 + \tfrac{(1+o(1)) (\log (n/\ell ) + O(1))}{\log n}\big ) G_{\partial \hat{B}} (v, v), \end{aligned}$$

completing the proof.\(\square \)

We will need the following lemma to handle some technical issues.

Lemma 2.12

For a graph \(G = (V, E)\), consider \(V_1\subset V_2 \subset V\). Let \(\{\eta ^{(1)}_v\}_{v\in V}\) and \(\{\eta ^{(2)}_v\}_{v\in V}\) be GFFs on \(V\) with \(\eta ^{(1)}|_{V_1} = 0\) and \(\eta ^{(2)}|_{V_2} = 0\), respectively. Then for any \(U \subseteq V\) and any number \(t \in \mathbb R \),

$$\begin{aligned} \mathbb{P }\left(\sup _{v\in U}\eta ^{(1)}_v \ge t\right) \ge \tfrac{1}{2} \mathbb{P }\left(\sup _{v\in U}\eta ^{(2)}_v \ge t\right). \end{aligned}$$

Proof

Note that the conditional covariance matrix of \(\{\eta ^{(1)}_v\}_{v\in U}\) given the values of \(\{\eta ^{(1)}_v\}_{v\in V_2{\setminus } V_1}\) corresponds to the covariance matrix of \(\{\eta ^{(2)}_v\}_{v\in U}\). This implies that

$$\begin{aligned} \left\{ \eta ^{(1)}_v: v\in U\right\} \stackrel{law}{=} \left\{ \eta ^{(2)}_v + \mathbb E \left(\eta ^{(1)}_v \mid \left\{ \eta ^{(1)}_u: u\in V_2 \setminus V_1\right\} \right): v\in U\right\} , \end{aligned}$$

where on the right hand side \(\{\eta ^{(2)}_v: v\in U\}\) is independent of \(\{\eta ^{(1)}_u: u\in V_2{\setminus } V_1\}\). Write \(\phi _v = \mathbb E (\eta ^{(1)}_v \mid \{\eta ^{(1)}_u: u\in V_2 {\setminus } V_1\})\). Note that \(\phi _v\) is a linear combination of \(\{\eta ^{(1)}_u: u\in V_2 {\setminus } V_1\}\), and thus a mean zero Gaussian variable. By the above identity in law, we derive that

$$\begin{aligned} \mathbb{P }\left(\sup _{v\in U}\eta ^{(1)}_v \ge t\right) \ge \mathbb{P }\left(\eta ^{(2)}_\xi + \phi _\xi \ge t\right) \ge \tfrac{1}{2}\mathbb{P }\left(\eta ^{(2)}_\xi \ge t\right) = \tfrac{1}{2} \mathbb{P }\left(\sup _{v\in U} \eta ^{(2)}_v \ge t\right)\!, \end{aligned}$$

where we denote by \(\xi \in U\) the maximizer of \(\{\eta ^{(2)}_u: u\in U\}\); the second transition follows from the independence of \(\xi \) and \(\{\phi _v\}\), together with the symmetry of the centered Gaussian variable \(\phi _\xi \).\(\square \)

We are now ready to give

Proof of Lemma 2.9

Write \(b_v = \sqrt{\text{Var}\, \eta _v / \text{Var}\, g_v}\) for every \(v\in B\). By Lemma 2.11, we see that \(b_v \le 1 + (1/2+o(1))(\log (n/\ell )+O(1))/\log n\) for all \(v\in B\). Consider the Gaussian process defined by \(\xi _v = \eta _v/b_v\). By Lemma 2.10, the processes \(\{\xi _v: v\in B\}\) and \(\{g_v: v\in B\}\) satisfy the assumptions of Theorem 2.7, and thus

$$\begin{aligned} \mathbb{P }\left(\sup _{v\in B} \xi _v \le \gamma \right) \ge \mathbb{P }\left(\sup _{v\in B} g_v \le \gamma \right), \quad \text{for all } \gamma \in \mathbb R . \end{aligned}$$
(24)

Plugging in \(\gamma = \mathbb E M_{2\ell } + \alpha ^*\) and using (22) and Lemma 2.12 (we need Lemma 2.12 since the box \(\hat{B}\) might not be a square box of side length \(2\ell \) but only a subset of one), we obtain that

$$\begin{aligned} \mathbb{P }\left(\sup _{v\in B} \xi _v \le \mathbb E M_{2\ell } + \alpha ^*\right)&\ge \mathbb{P }\left(\sup _{v\in B} g_v \le \mathbb E M_{2\ell } + \alpha ^*\right) \\&\ge \mathbb{P }\left(\sup _{v\in \hat{B}} g_v \le \mathbb E M_{2\ell } + \alpha ^*\right) \ge 1/2. \end{aligned}$$

Also, by the definition of \(\ell \) and (4), as well as our assumption that \(\lambda \le (\log n)^{2/3}\), we see that

$$\begin{aligned} \mathbb E M_n \ge \mathbb E M_{2\ell } + 2\sqrt{2/\pi } \log (n/\ell ) - 10. \end{aligned}$$

Therefore, for large constants \(\lambda _0, n_0\), we can deduce that

$$\begin{aligned}&(1 + (1/2+o(1))(\log (n/\ell )+O(1))/\log n) (\mathbb E M_{2\ell } + \alpha ^*) \\&\quad \le \mathbb E M_{2\ell } + \tfrac{5}{6}\log (n/\ell ) +1 \le \mathbb E M_n - \lambda , \end{aligned}$$

where we used Theorem 1.3 and the definition of \(\ell \) in (23). Altogether, we deduce that

$$\begin{aligned} \mathbb{P }\left(\sup _{v\in B} \eta _v \le \mathbb E M_n - \lambda \right) \ge 1/2. \end{aligned}$$

\(\square \)

Now, we wish to apply the FKG inequality to estimate the probability that \(\sup _{v\in A_n} \eta _v \le \mathbb E M_n - \lambda \). Pitt [19] proved that the FKG inequality holds for Gaussian processes with nonnegative covariances. Since the GFF clearly has nonnegative covariances, the FKG inequality holds for the GFF.

Partition \(A_n\) into a union \(\mathcal B \) of boxes, each of side length at most \(\ell \), chosen such that \(|\mathcal B |\) is minimized. Clearly, \(|\mathcal B | \le (\lceil n/\ell \rceil )^2\). Observing that the event \(\{\sup _{v\in B} \eta _v \le \mathbb E M_n - \lambda \}\) is decreasing for every \(B\in \mathcal B \), we apply the FKG inequality and Lemma 2.9, and conclude that

$$\begin{aligned} \mathbb{P }\left(\sup _{v\in A_n} \eta _v \le \mathbb E M_n - \lambda \right) \ge \prod _{B\in \mathcal B } \mathbb{P }\left(\sup _{v\in B} \eta _v \le \mathbb E M_n - \lambda \right) \ge (1/2)^{|\mathcal B |}. \end{aligned}$$

Recalling from (23) that \(n/\ell = \mathrm{e}^{10(\lambda + \kappa + \alpha ^* + 2)}\), so that \(|\mathcal B | \le (\lceil n/\ell \rceil )^2 \le \mathrm{e}^{C\lambda }\) for an absolute constant \(C>0\), this completes the proof of (21).