Probabilistic discrepancy bound for Monte Carlo point sets

By a profound result of Heinrich, Novak, Wasilkowski, and Woźniakowski the inverse of the star-discrepancy $n^*(s,\varepsilon)$ satisfies the upper bound $n^*(s,\varepsilon) \leq c_{\mathrm{abs}} s \varepsilon^{-2}$. Equivalently, for any $N$ and $s$ there exists a set of $N$ points in $[0,1]^s$ whose star-discrepancy is bounded by $c_{\mathrm{abs}} s^{1/2} N^{-1/2}$. The proof is based on the observation that a random point set satisfies the desired discrepancy bound with positive probability. In the present paper we prove a version of this result which is applicable for computational purposes: for any given number $q \in (0,1)$ there exists an (explicitly stated) number $c(q)$ such that the star-discrepancy of a random set of $N$ points in $[0,1]^s$ is bounded by $c(q) s^{1/2} N^{-1/2}$ with probability at least $q$, uniformly in $N$ and $s$.


Introduction and statement of results
The number n*(s, ε), defined as the smallest possible cardinality of a point set in [0, 1]^s having star-discrepancy bounded by ε, is called the inverse of the discrepancy. Heinrich, Novak, Wasilkowski, and Woźniakowski [10] proved the upper bound

n*(s, ε) ≤ c_abs s ε^{-2},    (1)

which is complemented by the lower bound n*(s, ε) ≥ c_abs s ε^{-1} due to Hinrichs [11] (throughout the paper, c_abs denotes absolute constants, not always the same). Hence the inverse of the star-discrepancy depends linearly on the dimension, while the precise dependence on ε is still unknown. It is easy to see that (1) is equivalent to the fact that for any N and s there exists a set P_N of N points in [0, 1]^s such that the star-discrepancy D*_N of this point set is bounded by

D*_N(P_N) ≤ c_abs s^{1/2} N^{-1/2}    (2)

(recently we showed that it is possible to choose c_abs = 10 in (2), see [1]). The existence of such a point set follows directly from the surprising observation that a randomly generated point set (that is, a Monte Carlo point set) satisfies the desired discrepancy estimate with positive probability. Of course, for applications such a mere existence result is not of much use, as was remarked by several colleagues at the MCQMC 2012 conference in Sydney. For this reason, in the present paper we prove an applied version of (2), which provides estimates for the probability of a random point set satisfying (2) (depending on the value of the constant). As our Theorem 1 below shows, this probability is extremely large already for moderate values of c, for example for c = 20. Additionally, the quality of our estimates for these probabilities improves as the dimension s increases (which is somewhat counter-intuitive, and originates from the exponential inequalities used in the proof, which cause a "concentration of mass" phenomenon).
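To make the equivalence between (1) and (2) concrete: if for every N there is an N-point set with star-discrepancy at most c √(s/N), then requesting discrepancy at most ε forces N ≥ c² s ε^{-2}. The following minimal sketch (the function name is ours; c = 10 is the admissible constant from [1]) computes the resulting sample size.

```python
import math

def points_needed(s: int, eps: float, c: float = 10.0) -> int:
    """Smallest N with c * sqrt(s / N) <= eps, i.e. N >= c^2 * s / eps^2.

    With c = 10 (the constant from [1] in bound (2)), the existence result
    yields n*(s, eps) <= ceil(c^2 * s / eps^2)."""
    return math.ceil(c * c * s / eps ** 2)

# e.g. dimension 10, target discrepancy 0.1:
print(points_needed(s=10, eps=0.1))  # 100000
```

Note the linear growth in s, in line with the tractability discussion above.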
The fact that the probability of a random point set satisfying (2) is very large is in contrast to the fact that no general constructions of point sets satisfying such discrepancy bounds are known. So far, the best results are a component-by-component construction of Doerr, Gnewuch, Kritzer and Pillichshammer [2], a semi-deterministic algorithm based on dependent randomized rounding due to Doerr, Gnewuch, and Wahlström [3], and a construction of Hinrichs [12] of a "structured" set of N = 256 points in dimension s = 15 having discrepancy less than 1/4 (thereby solving one instance of an open problem in [14]).
For more information concerning the inverse of the discrepancy and tractability of multidimensional integration we refer to a recent survey article of Gnewuch [7], and to the monographs of Novak and Woźniakowski [13, 14]. A collection of open problems on this topic can be found in [9].
In the present paper, we will prove the following theorem.

Theorem 1 Let q ∈ (0, 1), s ≥ 1 and N ≥ 1, and set

c(q, s) = 5.7 (4.9 + ln((1 − q)^{-1}) / s)^{1/2}.

Then the star-discrepancy D*_N of N independent, uniformly distributed random points in [0, 1]^s satisfies D*_N ≤ c(q, s) s^{1/2} N^{-1/2} with probability at least q.
It is interesting that the quality of the discrepancy estimate in Theorem 1 improves as the dimension s increases; for example, the value c(q, s) needed to have star-discrepancy bounded by c(q, s) s^{1/2} N^{-1/2} with probability at least 90% is 15.30 in dimension s = 1, while it is only 12.65 in dimension s = 100. However, neglecting this advantage of large dimensions in order to obtain a result which holds uniformly in s, one immediately obtains the following corollary.

Corollary 1 Let q ∈ (0, 1) and set c(q) = c(q, 1). Then for all N and s the star-discrepancy of N independent, uniformly distributed random points in [0, 1]^s is bounded by c(q) s^{1/2} N^{-1/2} with probability at least q.
Theorem 1 shows that the probability that a random point set satisfies the discrepancy bound c(q, s) s^{1/2} N^{-1/2} is extremely large already for moderate values of c(q, s). The following table illustrates this fact, for s = 10 and s = 100.

q          0.01    0.5     0.9     0.99    0.999
c(q, 10)   12.62   12.71   12.92   13.20   13.48
c(q, 100)  12.62   12.63   12.65   12.68   12.71

As the table shows, the probability that a random point set has "small" discrepancy, in the sense that its discrepancy is bounded by c s^{1/2} N^{-1/2} for some moderate c (for example, c = 20), is extremely large. This observation is an exciting counterpart of the fact that we do not have the slightest idea how to construct point sets satisfying such discrepancy bounds, even for moderate N and s. It should also be noted that calculating the star-discrepancy of a given (high-dimensional) point set is computationally very difficult, see [4, 8]. Hence, although our results show that the probability of a random point set having small discrepancy is very large, checking that a concrete point set satisfies such a discrepancy bound is in general (in high dimensions) a computationally intractable problem.
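While the star-discrepancy is hard to compute in high dimensions, for tiny s and N a brute-force grid search is feasible and illustrates the phenomenon. The following sketch (our own illustration, not code from the references) computes the star-discrepancy of a random 2-dimensional point set by evaluating the local discrepancy on the grid of point coordinates, and compares it with the bound c √(s/N) for the moderate value c = 20 mentioned above.

```python
import itertools
import random

def star_discrepancy_2d(pts):
    """Star-discrepancy of a 2-d point set via a search over the critical
    grid: the supremum over anchored boxes [0, x) is attained when each
    coordinate of x is a point coordinate or 1 (checking both the open-box
    and closed-box point counts)."""
    n = len(pts)
    xs = sorted({p[0] for p in pts} | {1.0})
    ys = sorted({p[1] for p in pts} | {1.0})
    disc = 0.0
    for x, y in itertools.product(xs, ys):
        vol = x * y  # Lebesgue measure of [0, (x, y)]
        open_cnt = sum(1 for p in pts if p[0] < x and p[1] < y)
        closed_cnt = sum(1 for p in pts if p[0] <= x and p[1] <= y)
        disc = max(disc, vol - open_cnt / n, closed_cnt / n - vol)
    return disc

random.seed(1)
s, N = 2, 64
pts = [(random.random(), random.random()) for _ in range(N)]
d = star_discrepancy_2d(pts)
print(d, "<=", 20 * (s / N) ** 0.5)  # discrepancy vs. the bound 20 * sqrt(s/N)
```

The cost of the grid search grows like N^s, which is exactly the obstruction mentioned above for high dimensions.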

Preliminaries
Throughout the paper, s ≥ 1 denotes the dimension and λ denotes the s-dimensional Lebesgue measure. For x, y ∈ [0, 1]^s, where x = (x_1, ..., x_s) and y = (y_1, ..., y_s), we write x ≤ y if x_i ≤ y_i for 1 ≤ i ≤ s, and for any x ∈ [0, 1]^s we write [0, x] for the set {y ∈ [0, 1]^s : 0 ≤ y ≤ x}. Furthermore, we write |A| for the number of elements of a set A.
The following Lemma 1 of Gnewuch [5, Theorem 1.15] is a central ingredient in the proof of our main result. For convenience we use the notation from [5] and [6]: the number N(s, δ) denotes the smallest possible cardinality of a δ-cover of [0, 1]^s. Similarly, for any δ ∈ (0, 1] a set ∆ of pairs of points from [0, 1]^s is called a δ-bracketing cover of [0, 1]^s if for every pair (x, z) ∈ ∆ the estimate λ([0, z)) − λ([0, x)) ≤ δ holds, and if for every y ∈ [0, 1]^s there exists a pair (x, z) ∈ ∆ such that x ≤ y ≤ z. The number N_[](s, δ) denotes the smallest possible cardinality of a δ-bracketing cover of [0, 1]^s.

Lemma 1 For any s ≥ 1 and δ ∈ (0, 1],

N(s, δ) ≤ (2e)^s (δ^{-1} + 1)^s  and  N_[](s, δ) ≤ 2^{s-1} (δ^{-1} + 1)^s.

By Lemma 1, for any 1 ≤ k ≤ K there exists a 2^{-k}-cover of [0, 1]^s, denoted by Γ_k, such that |Γ_k| ≤ (2e)^s (2^k + 1)^s. Furthermore, we denote by ∆_K a 2^{-K}-bracketing cover satisfying |∆_K| ≤ 2^{s-1} (2^K + 1)^s, which also exists due to Lemma 1. Moreover, we define Γ_K as the set of all points occurring in the pairs of ∆_K (which is, in particular, a 2^{-K}-cover). By definition, for every x ∈ [0, 1]^s there exists a pair (v_K(x), w_K(x)) ∈ ∆_K such that v_K(x) ≤ x ≤ w_K(x). We define points p_K(x), p_{K-1}(x), ..., p_1(x) recursively, where p_k(x) ∈ Γ_k, and set p_{K+1}(x) = w_K(x) and p_0(x) = 0. For x, y ∈ [0, 1]^s with x ≤ y we write [x, y] for the set difference [0, y] \ [0, x]. Then the sets [p_k(x), p_{k+1}(x)], 1 ≤ k ≤ K, are disjoint, and we obtain

[0, p_{K+1}(x)] = ⋃_{k=0}^{K} [p_k(x), p_{k+1}(x)].

Hence for every x, y the indicator 1_{[0, p_{K+1}(x)]}(y) is the sum of the indicators 1_{[p_k(x), p_{k+1}(x)]}(y), 0 ≤ k ≤ K. Moreover, independent of x, we have λ([p_k(x), p_{k+1}(x)]) ≤ 2^{-k} for 0 ≤ k ≤ K. For 0 ≤ k ≤ K we define A_k to be the set of all sets of the form [p_k(x), p_{k+1}(x)], where x ∈ [0, 1]^s. Then for 0 ≤ k ≤ K, as a consequence of Lemma 1, we can bound the cardinality of A_k by

|A_k| ≤ (2e)^s (2^{k+1} + 1)^s.    (6)

Note that all elements of A_k, where 0 ≤ k ≤ K, have Lebesgue measure bounded by 2^{-k}. This dyadic decomposition method was introduced in [1], where it is described in more detail.
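The cardinality bounds of Lemma 1 are the delicate part; the bracketing property itself is easy (if wastefully) to realize with a coordinate grid of spacing δ/s, since the volume difference of two comparable boxes is at most the sum of the coordinate differences. The following sketch (our illustration; all names are ours) builds such a grid bracket and checks the defining property on random points.

```python
import math
import random

def bracket(y, delta, s):
    """Return (x, z) with x <= y <= z and vol([0,z]) - vol([0,x]) <= delta,
    taken from the grid {0, 1/m, 2/m, ..., 1} with 1/m <= delta/s.
    This is a wasteful delta-bracketing cover with (ceil(s/delta) + 1)^s
    pairs, compared to the 2^(s-1) (1/delta + 1)^s of Lemma 1."""
    m = math.ceil(s / delta)  # grid resolution
    x = tuple(math.floor(c * m) / m for c in y)  # round each coordinate down
    z = tuple(math.ceil(c * m) / m for c in y)   # round each coordinate up
    return x, z

def vol(x):
    """Lebesgue measure of the anchored box [0, x]."""
    out = 1.0
    for c in x:
        out *= c
    return out

random.seed(0)
s, delta = 5, 0.25
for _ in range(1000):
    y = tuple(random.random() for _ in range(s))
    x, z = bracket(y, delta, s)
    assert all(a <= b <= c for a, b, c in zip(x, y, z))
    assert vol(z) - vol(x) <= delta + 1e-12
print("bracketing property verified")
```

The verification uses the telescoping estimate vol(z) − vol(x) ≤ Σ_i (z_i − x_i) ≤ s/m ≤ δ, valid since all coordinates lie in [0, 1].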
Let X_1, ..., X_N be independent, identically distributed (i.i.d.) random variables defined on some probability space (Ω, A, P), having uniform distribution on [0, 1]^s, and let I ∈ A_k for some k ≥ 0. Then the random variables 1_I(X_1), ..., 1_I(X_N) are i.i.d. random variables, having expected value λ(I) and variance λ(I) − λ(I)^2. Since the X_n are independent, it follows that the random variable Σ_{n=1}^{N} 1_I(X_n) has expected value Nλ(I) and variance N(λ(I) − λ(I)^2).
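These moment formulas are easy to check empirically. The following sketch (our illustration; the box I is an arbitrary example) simulates the counting variable Σ_n 1_I(X_n) for uniform points and compares its empirical mean and variance with Nλ(I) and N(λ(I) − λ(I)²).

```python
import random

random.seed(42)
s, N, trials = 3, 200, 4000
I = (0.5, 0.8, 0.7)            # an example anchored box [0, I]
lam = 0.5 * 0.8 * 0.7          # lambda(I) = 0.28

def count_in_box():
    """Number of the N uniform points of one sample falling into [0, I]."""
    c = 0
    for _ in range(N):
        x = [random.random() for _ in range(s)]
        if all(x[i] <= I[i] for i in range(s)):
            c += 1
    return c

samples = [count_in_box() for _ in range(trials)]
mean = sum(samples) / trials
var = sum((v - mean) ** 2 for v in samples) / trials
print(mean, "vs", N * lam)                 # expected value N*lambda(I) = 56
print(var, "vs", N * (lam - lam ** 2))     # variance N*(lambda - lambda^2) = 40.32
```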
In the proof of our main result we need two well-known results from probability theory, namely Bernstein's and Hoeffding's inequality. Bernstein's inequality states that for Z_1, ..., Z_N being i.i.d. random variables satisfying E Z_n = 0 and |Z_n| ≤ C a.s. for some C > 0,

P(|Σ_{n=1}^{N} Z_n| ≥ t) ≤ 2 exp(− t^2 / (2N E Z_1^2 + (2/3) C t)),   t > 0.

By applying this inequality to the random variables 1_I(X_n) − λ(I), we obtain

P(|Σ_{n=1}^{N} (1_I(X_n) − λ(I))| ≥ t) ≤ 2 exp(− t^2 / (2N(λ(I) − λ(I)^2) + (2/3) t))

for t > 0. Using (7) we conclude

P(|Σ_{n=1}^{N} (1_I(X_n) − λ(I))| ≥ t) ≤ 2 exp(− t^2 / (2N 2^{-k} + (2/3) t)).

For k ∈ {0, 1} we use Hoeffding's inequality, which yields

P(|Σ_{n=1}^{N} (1_I(X_n) − λ(I))| ≥ t) ≤ 2 exp(−2 t^2 / N).

Proof of Theorem 1
Since the theorem is trivial for N < 32(s + log((1 − q)^{-1})) < 5.7^2 (s + log((1 − q)^{-1})) (in this range the asserted discrepancy bound exceeds 1), we assume that N ≥ 32(s + log((1 − q)^{-1})) and define K accordingly; then K ≥ 3. By choosing t = c √(sN) for some c > 0, we conclude from (8), (9) and (11) that the corresponding tail estimates hold for any c > 0. Let B_k, k = 0, ..., K, denote the corresponding exceptional events. The strategy of the proof is to find, for any given q ∈ (0, 1), constants c_k = c_k(q), k = 0, ..., K, for which the total probability of the events B_k is at most 1 − q.
First we consider the case k = 0. By (6) we have |A_0| ≤ (6e)^s. With our choice of c_0, together with (12) and (13) it follows that P(B_0) is suitably bounded. Furthermore, by (6) we get |A_1| ≤ (10e)^s, and with the corresponding choice of c_1 we obtain a bound for P(B_1). Next we consider the case 2 ≤ k ≤ K. By (12) and (13) we can bound P(B_k) after setting c_k accordingly for 2 ≤ k ≤ K; thus by (16) we obtain the desired estimates. Summing up the estimated probabilities gives a total of at most 1 − q. Therefore, with probability at least q, a realization X_1(ω), ..., X_N(ω) satisfies the desired discrepancy estimate. We denote by z_n the point set defined by such a realization, i.e. z_n = X_n(ω) for 1 ≤ n ≤ N.