A simple Proof of Stolarsky's Invariance Principle

Stolarsky [Proc. Amer. Math. Soc. 41 (1973), 575--582] showed a beautiful relation that balances the sums of distances of points on the unit sphere and their spherical cap $\mathbb{L}_2$-discrepancy to give the distance integral of the uniform measure on the sphere, a potential-theoretical quantity (Björck [Ark. Mat. 3 (1956), 255--269]). Read differently, it expresses the worst-case numerical integration error for functions from the unit ball in a certain Hilbert space setting in terms of the $\mathbb{L}_2$-discrepancy and vice versa (first author and Womersley [Preprint]). In this note we give a simple proof of the invariance principle using reproducing kernel Hilbert spaces.


Introduction
We consider the unit sphere
\[ \mathbb{S}^d = \left\{ z = (z_1, \dots, z_{d+1}) \in \mathbb{R}^{d+1} : \|z\| = \sqrt{z_1^2 + \cdots + z_{d+1}^2} = 1 \right\} \]
embedded in the Euclidean space $\mathbb{R}^{d+1}$, $d \ge 2$. Let $f : \mathbb{S}^d \to \mathbb{C}$ be a continuous function. Then we approximate the integral $\int_{\mathbb{S}^d} f(x) \, d\sigma_d(x)$, where $\sigma_d$ is the normalized Lebesgue surface area measure on $\mathbb{S}^d$ ($\int_{\mathbb{S}^d} d\sigma_d = 1$), by an equal weight numerical integration rule
\[ Q_N(f) = \frac{1}{N} \sum_{k=0}^{N-1} f(z_k), \tag{1} \]
where $z_0, \dots, z_{N-1} \in \mathbb{S}^d$ are the integration nodes on the sphere. In order to analyze the integration error committed by the approximation, we define a worst-case error by
\[ e(H, Q_N) = \sup_{f \in H, \, \|f\|_H \le 1} \left| \int_{\mathbb{S}^d} f(x) \, d\sigma_d(x) - Q_N(f) \right|, \]
where $H$ denotes a normed function space with norm $\|\cdot\|_H$. The rate of decay of the worst-case error depends on the function space and the integration nodes. For a fixed function space, the worst-case error can serve as a quality criterion for different sets of integration nodes: the performance of a set of $N$ quadrature points $z_0, \dots, z_{N-1}$ can be compared to that of another set of $N$ quadrature points by comparing the corresponding worst-case errors. In general, a smaller worst-case error only means a smaller bound on the integration error, but in some settings the worst-case error also admits a geometric interpretation.

Another quality criterion for points on the sphere exploits the potential energy, or more generally, the Riesz $s$-energy of configurations of points modeling unit charges which are thought to interact through a potential $1/\|\cdot\|^s$ ($s \ne 0$), where $\|\cdot\|$ denotes the Euclidean norm. (We refer the reader to the survey papers [10] and [13] and, for universally optimal configurations, to [8].) A particular instance is the (normalized) sum of distances ($s = -1$)
\[ \frac{1}{N^2} \sum_{j,k=0}^{N-1} \| z_j - z_k \|. \]
It is well-known from potential theory (see Björck [4]) that this (discrete) sum of distances of optimal $N$-point configurations approaches the associated (continuous) distance integral
\[ \int_{\mathbb{S}^d} \int_{\mathbb{S}^d} \| x - y \| \, d\sigma_d(x) \, d\sigma_d(y) \]
of the uniform measure $\sigma_d$ on $\mathbb{S}^d$ as $N \to \infty$.
In fact, any sequence of $N$-point systems with this property turns out to be 'asymptotically uniformly distributed'; that is, the discrete probability measure obtained by placing equal charges at the points tends to the uniform measure in the weak-star sense. The difference
\[ \int_{\mathbb{S}^d} \int_{\mathbb{S}^d} \| x - y \| \, d\sigma_d(x) \, d\sigma_d(y) - \frac{1}{N^2} \sum_{j,k=0}^{N-1} \| z_j - z_k \|, \tag{2} \]
measuring the deviation between theoretical and empirical $(-1)$-energy, quantifies the quality of points on the sphere (and, indirectly, their uniform distribution) using energy. It should be mentioned that the upper bound of correct order $N^{-1-1/d}$ for (2) (for optimal configurations) was obtained by Stolarsky [16] using his invariance principle and a result of Schmidt [14] on the discrepancy of spherical caps. The correct-order lower bound ($N^{-1-1/d}$) was established by Beck [3] using his Fourier transform technique.

The spherical cap discrepancy measures the maximum deviation between theoretical and empirical distribution with respect to spherical caps as test sets. It can be used to compare point sets on the sphere with respect to their distribution properties. To introduce the concept of spherical cap discrepancy, we require some notation. A spherical cap centered at $z \in \mathbb{S}^d$ with height $t \in [-1, 1]$ is given by the set
\[ C(z; t) = \{ x \in \mathbb{S}^d : \langle x, z \rangle \ge t \}. \]
The family of all spherical caps is denoted by
\[ \mathcal{C} = \{ C(z; t) : z \in \mathbb{S}^d, \ t \in [-1, 1] \}. \]
For a set $J \subseteq \mathbb{R}^{d+1}$ we define the indicator function
\[ \mathbf{1}_J(x) = \begin{cases} 1 & \text{if } x \in J, \\ 0 & \text{otherwise.} \end{cases} \]
For a measurable set $J \subseteq \mathbb{S}^d$ let
\[ \sigma_d(J) = \int_{\mathbb{S}^d} \mathbf{1}_J(x) \, d\sigma_d(x). \]
Using spherical caps we can define another quality criterion for points on the sphere $\mathbb{S}^d$ in terms of their distribution properties. One such criterion is the spherical cap $\mathbb{L}_2$-discrepancy, which is given by
\[ D_{\mathbb{L}_2}(z_0, \dots, z_{N-1}) = \left( \int_{-1}^{1} \int_{\mathbb{S}^d} \left| \sigma_d(C(z; t)) - \frac{1}{N} \sum_{k=0}^{N-1} \mathbf{1}_{C(z; t)}(z_k) \right|^2 d\sigma_d(z) \, dt \right)^{1/2}. \]

We have considered three seemingly different measures which can be applied to point sets on the sphere. It turns out that in some instances the three measures are related to each other. Stolarsky's insight [16] was that the deficit (2) in the sum of distances of points on the sphere and the squared spherical cap $\mathbb{L}_2$-discrepancy coincide up to a constant factor.
On the other hand, the deficit (2) in the sum of distances of points on the sphere also coincides, up to a constant factor, with the squared worst-case error for a certain choice of function space; see [7] and also Sloan and Womersley [15] regarding a generalized discrepancy of Cui and Freeden [9].
In this paper we give a simple proof of these results based on reproducing kernel Hilbert spaces. We also provide some generalizations which follow from our approach.

Reproducing kernel Hilbert space
We define a reproducing kernel Hilbert space using the general approach of [12, Ch. 9.6].
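For $x, y \in \mathbb{S}^d$ we define the function
\[ K_{\mathcal{C}}(x, y) = \int_{-1}^{1} \int_{\mathbb{S}^d} \mathbf{1}_{C(z; t)}(x) \, \mathbf{1}_{C(z; t)}(y) \, d\sigma_d(z) \, dt. \tag{3} \]
Since $\mathbf{1}_{C(z; t)}(x) = \mathbf{1}_{C(x; t)}(z)$, we also have
\[ K_{\mathcal{C}}(x, y) = \int_{-1}^{1} \sigma_d\big( C(x; t) \cap C(y; t) \big) \, dt. \]
The function is obviously symmetric, i.e. we have $K_{\mathcal{C}}(x, y) = K_{\mathcal{C}}(y, x)$. Further, let $a_0, \dots, a_{N-1} \in \mathbb{C}$ and $x_0, \dots, x_{N-1} \in \mathbb{S}^d$. Then we have
\[ \sum_{j,k=0}^{N-1} a_j \overline{a_k} \, K_{\mathcal{C}}(x_j, x_k) = \int_{-1}^{1} \int_{\mathbb{S}^d} \left| \sum_{j=0}^{N-1} a_j \mathbf{1}_{C(z; t)}(x_j) \right|^2 d\sigma_d(z) \, dt \ge 0. \]
Thus, the function $K_{\mathcal{C}}$ is symmetric and positive definite. By [2], this implies that $K_{\mathcal{C}}$ is a reproducing kernel. It is also shown in [2] that a reproducing kernel uniquely defines a Hilbert space of functions with a certain inner product. Let $H_{\mathcal{C}} = H(K_{\mathcal{C}}, \mathbb{S}^d)$ denote the corresponding reproducing kernel Hilbert space of functions $f : \mathbb{S}^d \to \mathbb{R}$ with reproducing kernel $K_{\mathcal{C}}$.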
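We now consider functions $f_1, f_2 : \mathbb{S}^d \to \mathbb{C}$ which permit the integral representation
\[ f_i(x) = \int_{-1}^{1} \int_{\mathbb{S}^d} g_i(z; t) \, \mathbf{1}_{C(z; t)}(x) \, d\sigma_d(z) \, dt, \qquad i = 1, 2, \tag{4} \]
with functions $g_i$ that are square-integrable on $\mathbb{S}^d \times [-1, 1]$.
Notice that for any fixed $y \in \mathbb{S}^d$ the function $K_{\mathcal{C}}(\cdot, y)$ is also of this form, where the function $g$ is given by $\mathbf{1}_{C(z; t)}(y)$ (considered as a function of $z$ and $t$, with $y$ fixed). For functions of this form we can define an inner product by
\[ \langle f_1, f_2 \rangle_{K_{\mathcal{C}}} = \int_{-1}^{1} \int_{\mathbb{S}^d} g_1(z; t) \, \overline{g_2(z; t)} \, d\sigma_d(z) \, dt. \tag{5} \]
Let $y \in \mathbb{S}^d$ be fixed. With this definition we obtain
\[ \langle f_1, K_{\mathcal{C}}(\cdot, y) \rangle_{K_{\mathcal{C}}} = \int_{-1}^{1} \int_{\mathbb{S}^d} g_1(z; t) \, \mathbf{1}_{C(z; t)}(y) \, d\sigma_d(z) \, dt = f_1(y). \]
By [2], the inner product in $H_{\mathcal{C}}$ is unique. Therefore, functions $f_i$ which are given by (4) belong to $H_{\mathcal{C}}$, and (5) is the inner product for those functions in $H_{\mathcal{C}}$.

Consider now the reproducing kernel $K_{\mathcal{C}}$. Integrating over $t$ in (3) and using $\min\{a, b\} = (a + b)/2 - |a - b|/2$ together with $\int_{\mathbb{S}^d} \langle x, z \rangle \, d\sigma_d(z) = 0$, we have for $x \ne y$
\[ K_{\mathcal{C}}(x, y) = 1 + \int_{\mathbb{S}^d} \min\{ \langle x, z \rangle, \langle y, z \rangle \} \, d\sigma_d(z) = 1 - \frac{\| x - y \|}{2} \int_{\mathbb{S}^d} \left| \left\langle \frac{x - y}{\| x - y \|}, z \right\rangle \right| d\sigma_d(z). \]
The last integral does not depend on the unit vector $(x - y)/\| x - y \|$ by rotational symmetry. Thus, we have (cf. Appendix A)
\[ C_d := \frac{1}{2} \int_{\mathbb{S}^d} | \langle p, z \rangle | \, d\sigma_d(z) = \frac{1}{d} \frac{\omega_{d-1}}{\omega_d} = \frac{\mathcal{H}_d(\mathbb{B}^d)}{\mathcal{H}_d(\mathbb{S}^d)} = \frac{1}{d} \frac{\Gamma((d+1)/2)}{\sqrt{\pi} \, \Gamma(d/2)} \sim \frac{1}{\sqrt{2 \pi d}} \quad \text{as } d \to \infty. \tag{6} \]
(Here $p$ is a fixed unit vector, $\omega_d$ denotes the surface area of $\mathbb{S}^d$, $\mathbb{B}^d$ is the unit ball in $\mathbb{R}^d$, $\mathcal{H}_d$ is the $d$-dimensional Hausdorff measure normalized such that the $d$-dimensional unit cube $[0, 1)^d$ has measure one, and $\Gamma(z)$ is the Gamma function.) Therefore we obtain the following closed form representation:
\[ K_{\mathcal{C}}(x, y) = 1 - C_d \| x - y \|, \qquad x, y \in \mathbb{S}^d. \tag{7} \]
The reproducing kernel has the following properties: for $y, z \in \mathbb{S}^d$ we have
\[ K_{\mathcal{C}}(y, y) = 1, \qquad \int_{\mathbb{S}^d} K_{\mathcal{C}}(x, y) \, d\sigma_d(x) = \int_{\mathbb{S}^d} K_{\mathcal{C}}(x, z) \, d\sigma_d(x) = 1 - C_d \int_{\mathbb{S}^d} \int_{\mathbb{S}^d} \| x - x' \| \, d\sigma_d(x) \, d\sigma_d(x'). \]

Note that the Karhunen-Loève expansion of the function $\| x - y \|$ is based on ultraspherical harmonics. Hence the eigenfunctions of $K_{\mathcal{C}}$ are the ultraspherical harmonics. The corresponding eigenvalues are also known. Therefore, the functions in $H_{\mathcal{C}}$ can be expanded using ultraspherical harmonics and the inner product can be written using the coefficients of such an expansion. See [7] for these results.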
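As a rough numerical illustration of the closed form $K_{\mathcal{C}}(x, y) = 1 - C_d \| x - y \|$, the following sketch compares it with a Monte Carlo estimate of $1 - \frac{1}{2} \int_{\mathbb{S}^d} | \langle x - y, z \rangle | \, d\sigma_d(z)$. The function names, the sample size and the seed are illustrative choices and are not taken from the paper; uniform points on the sphere are generated by normalizing Gaussian vectors.

```python
import math
import random


def c_d(d):
    """C_d = (1/d) * Gamma((d+1)/2) / (sqrt(pi) * Gamma(d/2)), via lgamma to avoid overflow."""
    log_ratio = math.lgamma((d + 1) / 2) - math.lgamma(d / 2)
    return math.exp(log_ratio) / (d * math.sqrt(math.pi))


def random_point(d, rng):
    """Uniform random point on S^d: a normalized Gaussian vector in R^(d+1)."""
    v = [rng.gauss(0.0, 1.0) for _ in range(d + 1)]
    norm = math.sqrt(sum(c * c for c in v))
    return [c / norm for c in v]


def kernel_closed_form(x, y):
    """Closed form 1 - C_d * ||x - y|| for x, y on S^d (d inferred from the vector length)."""
    d = len(x) - 1
    return 1.0 - c_d(d) * math.dist(x, y)


def kernel_monte_carlo(x, y, samples=100_000, seed=0):
    """Monte Carlo estimate of 1 - (1/2) * integral of |<x - y, z>| d sigma_d(z)."""
    rng = random.Random(seed)
    d = len(x) - 1
    diff = [a - b for a, b in zip(x, y)]
    total = 0.0
    for _ in range(samples):
        z = random_point(d, rng)
        total += abs(sum(a * b for a, b in zip(diff, z)))
    return 1.0 - 0.5 * total / samples


if __name__ == "__main__":
    x, y = [0.0, 0.0, 1.0], [1.0, 0.0, 0.0]   # two orthogonal points on S^2
    print(kernel_closed_form(x, y), kernel_monte_carlo(x, y))
```

The two printed values should agree up to the Monte Carlo error, which for this sample size is of the order $10^{-3}$.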
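Worst-case error
We define the worst-case error for a quadrature rule $Q_N$ given in (1), with nodes $x_0, \dots, x_{N-1} \in \mathbb{S}^d$, by
\[ e(H_{\mathcal{C}}, Q_N) = \sup_{f \in H_{\mathcal{C}}, \, \|f\|_{K_{\mathcal{C}}} \le 1} \left| \int_{\mathbb{S}^d} f(x) \, d\sigma_d(x) - Q_N(f) \right|. \]
Let $f \in H_{\mathcal{C}}$. Then, by the reproducing kernel property $f(y) = \langle f, K_{\mathcal{C}}(\cdot, y) \rangle_{K_{\mathcal{C}}}$ for $y \in \mathbb{S}^d$, and, since the integration functional $f \mapsto \int_{\mathbb{S}^d} f \, d\sigma_d$ is bounded on $H_{\mathcal{C}}$ and has the 'representer' $\int_{\mathbb{S}^d} K_{\mathcal{C}}(\cdot, y) \, d\sigma_d(y)$, we have
\[ \int_{\mathbb{S}^d} f(x) \, d\sigma_d(x) - Q_N(f) = \langle f, R(H_{\mathcal{C}}, Q_N; \cdot) \rangle_{K_{\mathcal{C}}}, \]
where the 'representer' of the error of numerical integration for the rule $Q_N$ for functions in $H_{\mathcal{C}}$ is given by
\[ R(H_{\mathcal{C}}, Q_N; x) = \int_{\mathbb{S}^d} K_{\mathcal{C}}(x, y) \, d\sigma_d(y) - \frac{1}{N} \sum_{k=0}^{N-1} K_{\mathcal{C}}(x, x_k). \]
The Cauchy-Schwarz inequality yields
\[ \left| \int_{\mathbb{S}^d} f(x) \, d\sigma_d(x) - Q_N(f) \right| \le \| f \|_{K_{\mathcal{C}}} \, \| R(H_{\mathcal{C}}, Q_N; \cdot) \|_{K_{\mathcal{C}}}. \]
In particular, equality is attained in the last relation when taking $f$ to be the 'representer' $R(H_{\mathcal{C}}, Q_N; \cdot)$ itself. It follows that
\[ e(H_{\mathcal{C}}, Q_N) = \| R(H_{\mathcal{C}}, Q_N; \cdot) \|_{K_{\mathcal{C}}}. \tag{8} \]
Expanding the square of the worst-case error and substituting the closed form of the reproducing kernel we arrive at the well-known representation
\[ [e(H_{\mathcal{C}}, Q_N)]^2 = \int_{\mathbb{S}^d} \int_{\mathbb{S}^d} K_{\mathcal{C}}(x, y) \, d\sigma_d(x) \, d\sigma_d(y) - \frac{2}{N} \sum_{k=0}^{N-1} \int_{\mathbb{S}^d} K_{\mathcal{C}}(x, x_k) \, d\sigma_d(x) + \frac{1}{N^2} \sum_{j,k=0}^{N-1} K_{\mathcal{C}}(x_j, x_k) = C_d \left[ \int_{\mathbb{S}^d} \int_{\mathbb{S}^d} \| x - y \| \, d\sigma_d(x) \, d\sigma_d(y) - \frac{1}{N^2} \sum_{j,k=0}^{N-1} \| x_j - x_k \| \right]. \tag{9} \]
This shows how the squared worst-case error in our reproducing kernel Hilbert space is related to (2). In the next section we show how the worst-case error $e(H_{\mathcal{C}}, Q_N)$ in our reproducing kernel Hilbert space relates to the spherical cap $\mathbb{L}_2$-discrepancy.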
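The squared worst-case error can be evaluated directly from the nodes. The sketch below does this on $\mathbb{S}^2$ only, where $C_2 = 1/4$ and the distance integral of $\sigma_2$ equals $4/3$ (both constants are hard-coded assumptions of this example), and compares the octahedron vertices with a degenerate configuration of six coincident points. It computes the error once by expanding the square with the kernel $1 - C_2 \| x - y \|$ and once through the sum of distances; all names are illustrative.

```python
import math

C_2 = 0.25        # the constant C_d for d = 2
W_2 = 4.0 / 3.0   # distance integral of the uniform measure on S^2


def kernel(x, y):
    """Kernel 1 - C_2 * ||x - y|| on S^2."""
    return 1.0 - C_2 * math.dist(x, y)


def squared_wce_expanded(points):
    """Double integral - (2/N) * sum of single integrals + (1/N^2) * double sum of the kernel."""
    n = len(points)
    # By rotational symmetry, the integral of K(x, y) over x does not depend on y.
    mean_kernel = 1.0 - C_2 * W_2
    double_sum = sum(kernel(x, y) for x in points for y in points) / n**2
    return mean_kernel - 2.0 * mean_kernel + double_sum


def squared_wce_distances(points):
    """C_2 * (distance integral - normalized sum of distances)."""
    n = len(points)
    mean_dist = sum(math.dist(x, y) for x in points for y in points) / n**2
    return C_2 * (W_2 - mean_dist)


OCTAHEDRON = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]
CLUSTER = [(0, 0, 1)] * 6   # six copies of the North Pole

if __name__ == "__main__":
    for name, pts in (("octahedron", OCTAHEDRON), ("cluster", CLUSTER)):
        print(name, squared_wce_expanded(pts), squared_wce_distances(pts))
```

In this example the well-separated octahedron yields a much smaller squared worst-case error than the clustered configuration, which illustrates the use of the worst-case error as a quality criterion.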

Spherical cap discrepancy and Stolarsky's invariance principle
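Using the integral representation of the reproducing kernel (3) we have
\[ R(H_{\mathcal{C}}, Q_N; x) = \int_{-1}^{1} \int_{\mathbb{S}^d} \left[ \sigma_d(C(z; t)) - \frac{1}{N} \sum_{k=0}^{N-1} \mathbf{1}_{C(z; t)}(x_k) \right] \mathbf{1}_{C(z; t)}(x) \, d\sigma_d(z) \, dt. \]
Thus, the 'representer' of the error of numerical integration is of the form (4); that is,
\[ g(z; t) = \sigma_d(C(z; t)) - \frac{1}{N} \sum_{k=0}^{N-1} \mathbf{1}_{C(z; t)}(x_k). \]
Therefore, using the inner product representation (5) in (8), we obtain
\[ e(H_{\mathcal{C}}, Q_N) = \left( \int_{-1}^{1} \int_{\mathbb{S}^d} \left| \sigma_d(C(z; t)) - \frac{1}{N} \sum_{k=0}^{N-1} \mathbf{1}_{C(z; t)}(x_k) \right|^2 d\sigma_d(z) \, dt \right)^{1/2}, \tag{10} \]
stating that the worst-case error of the numerical integration formula $Q_N$ in (1) in the considered Sobolev space setting equals the so-called spherical cap $\mathbb{L}_2$-discrepancy of the integration nodes.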
Combining (9) and (10), we arrive at Stolarsky's invariance principle for the Euclidean distance on spheres.
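Proposition 1 (Stolarsky [16]) Let $x_0, \dots, x_{N-1} \in \mathbb{S}^d$ be an arbitrary $N$-point configuration on the sphere $\mathbb{S}^d$. Then we have
\[ \frac{1}{N^2} \sum_{j,k=0}^{N-1} \| x_j - x_k \| + \frac{1}{C_d} \int_{-1}^{1} \int_{\mathbb{S}^d} \left| \sigma_d(C(z; t)) - \frac{1}{N} \sum_{k=0}^{N-1} \mathbf{1}_{C(z; t)}(x_k) \right|^2 d\sigma_d(z) \, dt = \int_{\mathbb{S}^d} \int_{\mathbb{S}^d} \| x - y \| \, d\sigma_d(x) \, d\sigma_d(y). \]

The $\mathbb{L}_2$-discrepancy of an $N$-point configuration on $\mathbb{S}^d$ decreases as its sum of distances increases and vice versa. The right-hand side is the distance integral of the uniform measure $\sigma_d$ on the sphere $\mathbb{S}^d$, which is the unique extremal measure (also known as the equilibrium measure) maximizing the distance integral over the family of (Borel) probability measures $\mu$ supported on $\mathbb{S}^d$. For the potential theory of the generalized distance integral we refer to Björck [4].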
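The invariance principle can be checked numerically in a simple case. The following sketch treats $\mathbb{S}^2$ only and relies on facts not spelled out above: a cap $\{ x : \langle x, z \rangle \ge t \}$ on $\mathbb{S}^2$ has normalized area $(1 - t)/2$, $C_2 = 1/4$, and the distance integral of $\sigma_2$ is $4/3$. The squared $\mathbb{L}_2$-discrepancy is estimated by plain Monte Carlo over cap centers $z$ and heights $t$, independently of the sum of distances; the function names, sample size and seed are illustrative.

```python
import math
import random

C_2 = 0.25        # the constant C_d for d = 2
W_2 = 4.0 / 3.0   # distance integral of the uniform measure on S^2


def random_point_s2(rng):
    """Uniform random point on S^2: a normalized Gaussian vector in R^3."""
    v = [rng.gauss(0.0, 1.0) for _ in range(3)]
    norm = math.sqrt(sum(c * c for c in v))
    return [c / norm for c in v]


def mean_distance(points):
    """Normalized sum of distances (1/N^2) * sum over j, k of ||x_j - x_k||."""
    n = len(points)
    return sum(math.dist(x, y) for x in points for y in points) / n**2


def l2_cap_discrepancy_sq(points, samples=100_000, seed=1):
    """Monte Carlo estimate of the squared spherical cap L2-discrepancy on S^2."""
    rng = random.Random(seed)
    n = len(points)
    total = 0.0
    for _ in range(samples):
        z = random_point_s2(rng)        # cap center, uniform on the sphere
        t = rng.uniform(-1.0, 1.0)      # cap height, uniform on [-1, 1]
        inside = sum(1 for x in points if sum(a * b for a, b in zip(x, z)) >= t)
        delta = (1.0 - t) / 2.0 - inside / n   # cap area minus empirical fraction
        total += delta * delta
    return 2.0 * total / samples        # factor 2 is the length of [-1, 1]


OCTAHEDRON = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]

if __name__ == "__main__":
    lhs = mean_distance(OCTAHEDRON) + l2_cap_discrepancy_sq(OCTAHEDRON) / C_2
    print(lhs, W_2)
```

Up to the Monte Carlo error, the printed left-hand side should match $4/3$. Replacing the caps by their complements changes the sign of the local discrepancy but not its square, so the check does not depend on that convention.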

A weighted reproducing kernel
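The above results can be generalized by introducing a weight function. Let $v : [-1, 1] \to \mathbb{R}$ satisfy $v(t) > 0$ for all $t$ and have an antiderivative, which we denote by $V$. Then we define the reproducing kernel with weight function $v$ as follows:
\[ K_{\mathcal{C},v}(x, y) = \int_{-1}^{1} v(t) \int_{\mathbb{S}^d} \mathbf{1}_{C(z; t)}(x) \, \mathbf{1}_{C(z; t)}(y) \, d\sigma_d(z) \, dt, \qquad x, y \in \mathbb{S}^d. \tag{12} \]
For functions represented by integrals
\[ f_i(x) = \int_{-1}^{1} \int_{\mathbb{S}^d} g_i(z; t) \, \mathbf{1}_{C(z; t)}(x) \, d\sigma_d(z) \, dt, \qquad i = 1, 2, \]
the corresponding inner product is now given by
\[ \langle f_1, f_2 \rangle_{K_{\mathcal{C},v}} = \int_{-1}^{1} \frac{1}{v(t)} \int_{\mathbb{S}^d} g_1(z; t) \, \overline{g_2(z; t)} \, d\sigma_d(z) \, dt. \]
The reproducing kernel can be written as
\[ K_{\mathcal{C},v}(x, y) = \int_{\mathbb{S}^d} V\big( \min\{ \langle x, z \rangle, \langle y, z \rangle \} \big) \, d\sigma_d(z) - V(-1). \tag{13} \]
For certain weight functions $v$, this expression may have a concise form. This reproducing kernel defines a reproducing kernel Hilbert space $H_{\mathcal{C},v}$. The 'representer' of the error of numerical integration for the rule $Q_N$ for functions in $H_{\mathcal{C},v}$ takes on the form
\[ R(H_{\mathcal{C},v}, Q_N; x) = \int_{\mathbb{S}^d} K_{\mathcal{C},v}(x, y) \, d\sigma_d(y) - \frac{1}{N} \sum_{k=0}^{N-1} K_{\mathcal{C},v}(x, x_k). \]
We claim that $K_{\mathcal{C},v}(x, y)$ is a function of the inner product $\langle x, y \rangle$, cf. Appendix B. Consequently, $\int_{\mathbb{S}^d} K_{\mathcal{C},v}(x, y) \, d\sigma_d(y)$ does not depend on $x$. Using the same approach as before we obtain
\[ [e(H_{\mathcal{C},v}, Q_N)]^2 = \frac{1}{N^2} \sum_{j,k=0}^{N-1} K_{\mathcal{C},v}(x_j, x_k) - \int_{\mathbb{S}^d} \int_{\mathbb{S}^d} K_{\mathcal{C},v}(x, y) \, d\sigma_d(x) \, d\sigma_d(y). \tag{14} \]
This worst-case error can also be expressed in terms of a weighted discrepancy measure:
\[ [e(H_{\mathcal{C},v}, Q_N)]^2 = \int_{-1}^{1} v(t) \int_{\mathbb{S}^d} \left| \sigma_d(C(z; t)) - \frac{1}{N} \sum_{k=0}^{N-1} \mathbf{1}_{C(z; t)}(x_k) \right|^2 d\sigma_d(z) \, dt. \tag{15} \]
Using (14) and (15) we obtain the weighted version of Stolarsky's invariance principle.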
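Theorem 1 Let $x_0, \dots, x_{N-1} \in \mathbb{S}^d$ be an arbitrary $N$-point configuration on the sphere $\mathbb{S}^d$. Let $K_{\mathcal{C},v}$ be the weighted reproducing kernel given by (12). Then we have
\[ \frac{1}{N^2} \sum_{j,k=0}^{N-1} K_{\mathcal{C},v}(x_j, x_k) - \int_{-1}^{1} v(t) \int_{\mathbb{S}^d} \left| \sigma_d(C(z; t)) - \frac{1}{N} \sum_{k=0}^{N-1} \mathbf{1}_{C(z; t)}(x_k) \right|^2 d\sigma_d(z) \, dt = \int_{\mathbb{S}^d} \int_{\mathbb{S}^d} K_{\mathcal{C},v}(x, y) \, d\sigma_d(x) \, d\sigma_d(y). \]

The double integral above can be expressed in terms of the weight function, see (20). In [5], Stolarsky's (general) invariance principle is extended and used to get bounds for the spherical cap discrepancy, see also [6]. Stolarsky [17] also extended his principle to certain metric spaces arising from measures.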
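Stolarsky [16] introduced a function $\rho(x, y)$ which becomes a metric if the kernel $g$ (integrable on $[0, 1]$) is positive; the proof of the corresponding invariance principle, which makes use of Haar integrals over the special orthogonal group $SO(d + 1)$, does not require positivity. Note that for $g \equiv 1$ the function $\rho(x, y)$ is a constant multiple of the Euclidean distance. With some care one may even consider $g(x) = 1/(1 - x^2)$. It is well-known that a reproducing kernel $K(x, y)$ induces a distance (metric) by means of
\[ \operatorname{dist}_K(x, y) = \sqrt{ K(x, x) - 2 K(x, y) + K(y, y) }. \]
For example, the reproducing kernel (7) yields
\[ \operatorname{dist}_{K_{\mathcal{C}}}(x, y) = \sqrt{ 2 C_d \| x - y \| }. \]
In general, for the symmetric weighted kernel $K_{\mathcal{C},v}(x, y)$, which depends only on the inner product $\langle x, y \rangle$, it follows that ($a \in \mathbb{S}^d$ fixed)
\[ [\operatorname{dist}_{K_{\mathcal{C},v}}(x, y)]^2 = 2 K_{\mathcal{C},v}(a, a) - 2 K_{\mathcal{C},v}(x, y). \]
By Theorem 1 one arrives at
\[ \frac{1}{N^2} \sum_{j,k=0}^{N-1} [\operatorname{dist}_{K_{\mathcal{C},v}}(x_j, x_k)]^2 + 2 \int_{-1}^{1} v(t) \int_{\mathbb{S}^d} \left| \sigma_d(C(z; t)) - \frac{1}{N} \sum_{k=0}^{N-1} \mathbf{1}_{C(z; t)}(x_k) \right|^2 d\sigma_d(z) \, dt = \int_{\mathbb{S}^d} \int_{\mathbb{S}^d} [\operatorname{dist}_{K_{\mathcal{C},v}}(x, y)]^2 \, d\sigma_d(x) \, d\sigma_d(y), \]
which should be compared with (16).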
Acknowledgement: The first author is grateful to the School of Mathematics and Statistics at UNSW for their support.

A Auxiliary results
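The normalized surface area measure $\sigma_d$ on $\mathbb{S}^d$ admits the following decomposition
\[ d\sigma_d(y) = \frac{\omega_{d-1}}{\omega_d} (1 - t^2)^{d/2 - 1} \, dt \, d\sigma_{d-1}(y^*), \qquad y = \big( \sqrt{1 - t^2} \, y^*, t \big), \tag{17} \]
where $t \in [-1, 1]$, $y^* \in \mathbb{S}^{d-1}$ and $\omega_d$ denotes the surface area of $\mathbb{S}^d$ (cf. Müller [11]). (By definition $\langle y, p \rangle = t$, where $p$ is the North Pole of $\mathbb{S}^d$.) Thus, by rotational symmetry, the integral of a zonal function $f(\langle z, \cdot \rangle)$, $z \in \mathbb{S}^d$ fixed, with respect to $\sigma_d$ reduces to
\[ \int_{\mathbb{S}^d} f(\langle z, y \rangle) \, d\sigma_d(y) = \frac{\omega_{d-1}}{\omega_d} \int_{-1}^{1} f(t) \, (1 - t^2)^{d/2 - 1} \, dt. \]

Proof. [Proof of relations (6)] One gets
\[ C_d = \frac{1}{2} \frac{\omega_{d-1}}{\omega_d} \int_{-1}^{1} |t| \, (1 - t^2)^{d/2 - 1} \, dt = \frac{1}{d} \frac{\omega_{d-1}}{\omega_d} = \frac{1}{d} \frac{\Gamma((d+1)/2)}{\sqrt{\pi} \, \Gamma(d/2)}. \]
The second equality follows from
\[ \int_{-1}^{1} |t| \, (1 - t^2)^{d/2 - 1} \, dt = \int_{0}^{1} (1 - u)^{d/2 - 1} \, du = \mathrm{B}(1, d/2) = \frac{2}{d}, \]
where $\mathrm{B}(a, b) = \Gamma(a) \Gamma(b) / \Gamma(a + b)$ is the beta function. The third equality follows from the well-known formulas for the volume of the unit ball in $\mathbb{R}^d$ and the surface area of $\mathbb{S}^d$, namely $\omega_{d-1}/d = \pi^{d/2} / \Gamma(d/2 + 1)$ and $\omega_d = 2 \pi^{(d+1)/2} / \Gamma((d+1)/2)$. The asymptotics follows from the asymptotic expansion of a ratio of Gamma functions (cf. [1]). $\Box$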
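As a small worked instance of the relations (6), consider $d = 2$, using the standard values $\omega_1 = 2\pi$, $\omega_2 = 4\pi$ and $\Gamma(3/2) = \sqrt{\pi}/2$ (these values are supplied here for illustration). The zonal integral and the Gamma function expression give the same constant:

```latex
\[
C_2 = \frac{1}{2} \cdot \frac{\omega_1}{\omega_2} \int_{-1}^{1} |t| \, dt
    = \frac{1}{2} \cdot \frac{1}{2} \cdot 1 = \frac{1}{4},
\qquad
\frac{1}{2} \, \frac{\Gamma(3/2)}{\sqrt{\pi} \, \Gamma(1)}
    = \frac{1}{2} \cdot \frac{\sqrt{\pi}/2}{\sqrt{\pi}} = \frac{1}{4}.
\]
```

Hence on $\mathbb{S}^2$ the kernel $1 - C_d \| x - y \|$ reads $1 - \| x - y \| / 4$.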

B The weighted reproducing kernel
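Next, we investigate the weighted reproducing kernel (13) in more detail. In particular, it will be shown that the kernel $K_{\mathcal{C},v}(x, y)$ is a function of the inner product $\langle x, y \rangle$. On observing that $\langle x, z \rangle \le \langle y, z \rangle$ if and only if $\langle y - x, z \rangle \ge 0$, we may write for $x \ne y$
\[ K_{\mathcal{C},v}(x, y) = A_{\mathcal{C},v}(x, y) + A_{\mathcal{C},v}(y, x) - V(-1), \]
which immediately shows symmetry of the reproducing kernel, where
\[ A_{\mathcal{C},v}(x, y) = \int_{\mathbb{S}^d} V(\langle x, z \rangle) \, \mathbf{1}_{[0, 1]}\!\left( \left\langle \frac{y - x}{\| y - x \|}, z \right\rangle \right) d\sigma_d(z). \]
By abuse of notation we set (note that $\langle x, y \rangle = u$)
\[ x = (0, 1), \qquad y = \big( \sqrt{1 - u^2} \, y^*, u \big), \qquad z = \big( \sqrt{1 - t^2} \, z^*, t \big), \qquad y^*, z^* \in \mathbb{S}^{d-1}. \]
In this way $x$ will be the 'North Pole' in the decomposition (17) and we obtain
\[ A_{\mathcal{C},v}(x, y) = \frac{\omega_{d-1}}{\omega_d} \int_{-1}^{1} V(t) \, (1 - t^2)^{d/2 - 1} \int_{\mathbb{S}^{d-1}} \mathbf{1}_{[0, 1]}\!\left( \left\langle \frac{y - x}{\| y - x \|}, z \right\rangle \right) d\sigma_{d-1}(z^*) \, dt. \]
The indicator function in the inner integral is a zonal function depending on $w = \langle y^*, z^* \rangle$ only. Thus, we apply again (17) with $y^*$ as 'North Pole'. That is,
\[ \int_{\mathbb{S}^{d-1}} \mathbf{1}_{[0, 1]}\!\left( \left\langle \frac{y - x}{\| y - x \|}, z \right\rangle \right) d\sigma_{d-1}(z^*) = \frac{\omega_{d-2}}{\omega_{d-1}} \int_{-1}^{1} \mathbf{1}_{[0, 1]}\!\left( \left\langle \frac{y - x}{\| y - x \|}, z \right\rangle \right) (1 - w^2)^{(d-3)/2} \, dw, \]
where the inner product evaluates as
\[ \left\langle \frac{y - x}{\| y - x \|}, z \right\rangle = \sqrt{\frac{1 + u}{2}} \, \sqrt{1 - t^2} \; w - \sqrt{\frac{1 - u}{2}} \; t. \tag{19} \]
Proceeding similarly for $A_{\mathcal{C},v}(y, x)$, one sees that, indeed, $A_{\mathcal{C},v}(y, x) = A_{\mathcal{C},v}(x, y)$. Furthermore, $A_{\mathcal{C},v}(x, y)$ depends only on the inner product $\langle x, y \rangle$, which in turn implies that the reproducing kernel $K_{\mathcal{C},v}(x, y)$ is a function of the inner product $\langle x, y \rangle$.

The right-hand side in (19) describes, as a function of $w$, a line which stays strictly between the levels $-1$ and $1$ for $w$ in $[-1, 1]$ by the left-hand side in (19). Further analysis gives that the indicator function is one (i) if $t \le -\sqrt{(1 + u)/2}$ and $-1 \le w \le 1$, or (ii) if $-\sqrt{(1 + u)/2} \le t \le \sqrt{(1 + u)/2}$ and $\big( t / \sqrt{1 - t^2} \big) \sqrt{(1 - u)/(1 + u)} \le w \le 1$, and zero otherwise. This leads to
\[ A_{\mathcal{C},v}(x, y) = \frac{\omega_{d-1}}{\omega_d} \int_{-1}^{-\cos(\phi/2)} V(t) \, (1 - t^2)^{d/2 - 1} \, dt + \frac{\omega_{d-2}}{\omega_d} \int_{-\cos(\phi/2)}^{\cos(\phi/2)} V(t) \, (1 - t^2)^{d/2 - 1} \int_{x(t)}^{1} (1 - w^2)^{(d-3)/2} \, dw \, dt, \]
where, when using $u = \langle x, y \rangle = \cos \phi$ ($0 < \phi < \pi$) and $t = \cos \psi$, one has
\[ \sqrt{\frac{1 + u}{2}} = \cos(\phi/2), \qquad \sqrt{\frac{1 - u}{1 + u}} = \tan(\phi/2), \qquad x(t) = \frac{t}{\sqrt{1 - t^2}} \tan(\phi/2) = \cot \psi \, \tan(\phi/2). \]
The change of variable $\xi = x(t)$ yields We compute the following integral (using (18)): The inner integral is one if $z$ is in the half-sphere centered at $-x$ and zero otherwise. Hence, Since $g'(t) = (\omega_{d-1}/\omega_d) (1 - t^2)^{d/2 - 1}$ for $g(t) = (1/2) \, \mathrm{I}_{t^2}(1/2, d/2)$, $0 \le t \le 1$, integration by parts gives It follows that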