Sanov’s theorem in the Wasserstein distance: A necessary and sufficient condition

https://doi.org/10.1016/j.spl.2009.12.003

Abstract

Let $(X_n)_{n\ge 1}$ be a sequence of i.i.d. r.v.’s with values in a Polish space $(E,d)$, with common law $\mu$. Consider the empirical measures $L_n=\frac{1}{n}\sum_{k=1}^{n}\delta_{X_k}$, $n\ge 1$. Our purpose is to generalize Sanov’s theorem about the large deviation principle of $L_n$ from the weak convergence topology to the stronger Wasserstein metric $W_p$. We show that $L_n$ satisfies the large deviation principle in the Wasserstein metric $W_p$ ($p\in[1,+\infty)$) if and only if $\int_E e^{\lambda d^p(x_0,x)}\,d\mu(x)<+\infty$ for all $\lambda>0$ and for some $x_0\in E$.
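To see how the exponential moment condition separates different values of $p$, consider the following worked case. It is only an illustration and is not taken from the paper: it assumes $E=\mathbb{R}$ with $d(x,y)=|x-y|$, $x_0=0$, and $\mu$ the standard Gaussian law.

```latex
% Illustrative assumption: mu = N(0,1) on E = R, d(x,y) = |x - y|, x_0 = 0.
% Case p = 1: the condition holds for every lambda > 0.
\[
\int_{\mathbb{R}} e^{\lambda |x|}\,\mu(dx)
  \le \int_{\mathbb{R}} \bigl(e^{\lambda x} + e^{-\lambda x}\bigr)\,\mu(dx)
  = 2e^{\lambda^{2}/2} < +\infty
  \quad\text{for all } \lambda > 0 .
\]
% Case p = 2: the condition fails as soon as lambda >= 1/2.
\[
\int_{\mathbb{R}} e^{\lambda x^{2}}\,\mu(dx) =
\begin{cases}
(1-2\lambda)^{-1/2}, & 0 < \lambda < 1/2,\\[2pt]
+\infty, & \lambda \ge 1/2 .
\end{cases}
\]
```

Under the stated criterion, the empirical measures of a standard Gaussian sample would therefore satisfy the large deviation principle in $W_1$ but not in $W_2$.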

Section snippets

Introduction and main results

Let $(E,d)$ be a complete separable metric (i.e., Polish) space equipped with its Borel σ-field $\mathcal{E}$. We denote the set of all probability measures on $E$ by $M_1(E)$.

Several lemmas

Lemma 2.1

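Let $Z$ be a real nonnegative random variable such that $\Lambda(\lambda)=\log E e^{\lambda Z}<+\infty$ for all $\lambda>0$, and let $\Lambda^*$ be its semi-Legendre transform, $$\Lambda^*(r):=\sup_{\lambda\ge 0}\{r\lambda-\log E(e^{\lambda Z})\}\quad\text{for all } r\ge 0.$$ Then $\Lambda^*$ is a nondecreasing lower semicontinuous convex function, and $$\frac{\Lambda^*(r)}{r}\to+\infty\quad\text{as } r\to+\infty.$$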
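The lemma can be checked numerically on a concrete case. The sketch below assumes $Z$ is Poisson with mean 1 (a choice made here for illustration, not an example from the paper), so that $\Lambda(\lambda)=e^{\lambda}-1$ and the semi-Legendre transform has the closed form $r\log r-r+1$ for $r\ge 1$ and $0$ for $0\le r\le 1$. The function names and the grid parameters `lam_max` and `steps` are hypothetical.

```python
import math


def log_mgf_poisson(lam, mean=1.0):
    """Lambda(lam) = log E exp(lam * Z) for Z ~ Poisson(mean)."""
    return mean * (math.exp(lam) - 1.0)


def semi_legendre(r, log_mgf, lam_max=20.0, steps=20000):
    """Approximate sup over lam >= 0 of r*lam - log_mgf(lam) on a uniform grid.

    The grid is truncated at lam_max, so the result is only reliable when
    the maximizer lies in [0, lam_max].
    """
    best = 0.0  # lam = 0 contributes r*0 - log_mgf(0) = 0
    for i in range(1, steps + 1):
        lam = lam_max * i / steps
        best = max(best, r * lam - log_mgf(lam))
    return best


def closed_form(r):
    """Exact semi-Legendre transform for Poisson(1)."""
    return r * math.log(r) - r + 1.0 if r > 1.0 else 0.0


if __name__ == "__main__":
    for r in (0.5, 1.0, 2.0, 10.0, 100.0, 1000.0):
        approx = semi_legendre(r, log_mgf_poisson)
        print(r, approx, closed_form(r), approx / r)
```

In this case the ratio $\Lambda^*(r)/r=\log r-1+1/r$ grows without bound, in line with the superlinear growth stated in the lemma.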

Proof

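Taking $\lambda=0$, we get $\Lambda^*(r)\ge 0$. $\Lambda^*(r)$ is a nondecreasing lower semicontinuous convex function because it is the supremum of a class of nondecreasing continuous affine functions. For any fixed numbers $\lambda>0$ and $r>0$, $$\Lambda^*(r)\ge r\lambda-\Lambda(\lambda),\qquad \frac{\Lambda^*(r)}{r}\ge\lambda-\frac{\Lambda(\lambda)}{r}.$$ As $\Lambda(\lambda)<+\infty$ for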

Another proof of the necessity in Theorem 1.1 for p=1

When $p=1$, we have by Kantorovich–Rubinstein’s theorem (see Villani, 2003, Theorem 1.14) $$W_1(\mu,\nu)=\sup\left\{\int_E\varphi\,d(\nu-\mu);\ \varphi:E\to\mathbb{R},\ \|\varphi\|_{Lip}\le 1\right\},$$ where $\|\varphi\|_{Lip}=\sup_{x\ne y}\frac{|\varphi(x)-\varphi(y)|}{d(x,y)}$ is the Lipschitzian seminorm. This result identifies the LDP estimation of $L_n$ on $(M_1^1(E),W_1)$ as that of $L_n(f)$ over the class of all Lipschitz functions on $E$, with Lipschitz constant not more than 1.

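More precisely, given $\mathcal{F}:=\{\varphi;\ \varphi:E\to\mathbb{R},\ \|\varphi\|_{Lip}\le 1,\ \varphi(x_0)=0\}$ for some fixed $x_0\in E$, let $\ell^\infty(\mathcal{F})$ be the space of all bounded real functions on $\mathcal{F}$ with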
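The Kantorovich–Rubinstein duality can be illustrated on the real line with a short Python sketch. It assumes two empirical measures with the same number of atoms on $\mathbb{R}$, for which $W_1$ equals the average distance between sorted samples; the sample values, the test functions, and the function names are hypothetical choices made for illustration. Every 1-Lipschitz test function gives a lower bound on $W_1$.

```python
import math


def w1_sorted(xs, ys):
    """W1 between two empirical measures on R with equally many atoms.

    On the real line the monotone (sorted) coupling is optimal.
    """
    if len(xs) != len(ys):
        raise ValueError("samples must have the same size")
    return sum(abs(a - b) for a, b in zip(sorted(xs), sorted(ys))) / len(xs)


def dual_value(phi, xs, ys):
    """Integral of phi against (nu - mu), with mu, nu the empirical laws of xs, ys."""
    return sum(phi(y) for y in ys) / len(ys) - sum(phi(x) for x in xs) / len(xs)


if __name__ == "__main__":
    xs = [0.0, 1.0, 2.0]   # atoms of mu (illustrative data)
    ys = [1.0, 2.0, 4.0]   # atoms of nu (illustrative data)

    # A few functions with Lipschitz constant at most 1.
    candidates = {
        "x": lambda x: x,
        "-x": lambda x: -x,
        "|x - 1|": lambda x: abs(x - 1.0),
        "sin x": math.sin,
    }

    print("W1 =", w1_sorted(xs, ys))
    for name, phi in candidates.items():
        print(name, dual_value(phi, xs, ys))
```

For this particular data the identity function attains the supremum, because each sorted atom of the second sample lies to the right of the corresponding atom of the first.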

Two examples

Let us give two statistical applications.

Example 4.1

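Let $(X_n)_{n\ge 1}$ be real valued i.i.d. r.v.’s with common distribution function $F$ ($\mu(dx)=dF(x)$) such that $\int_{\mathbb{R}}|x|\,\mu(dx)<+\infty$. Let $F_n(x)=\frac{1}{n}\sum_{k=1}^{n}I_{(-\infty,x]}(X_k)$, $n\ge 1$, be the empirical distribution functions. Since $$W_1(\nu_1,\nu_2)=\int_{\mathbb{R}}|F_{\nu_1}(x)-F_{\nu_2}(x)|\,dx,\qquad \nu_1,\nu_2\in M_1^1(\mathbb{R})$$ (see Villani, 2003, Page 75), where $F_\nu(x):=\nu(-\infty,x]$ is the distribution function of $\nu$, the map $\nu\mapsto F_\nu(x)-F(x)$ is an isometry from $(M_1^1(\mathbb{R}),W_1)$ to $(L^1(\mathbb{R},dx),\|\cdot\|_1)$. By Theorem 1.1, $P(F_n-F\in\cdot)$ satisfies the LDP on $L^1(\mathbb{R},dx)$ if and only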
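The identity between $W_1$ and the $L^1$ distance of distribution functions can be checked directly for empirical measures. The following Python sketch is an illustration under assumptions, not code from the paper: it compares the integral of $|F_{\nu_1}-F_{\nu_2}|$ with the sorted-sample formula for $W_1$, which applies when both samples have the same size. The function names and sample data are hypothetical.

```python
import bisect
import random


def w1_via_cdf(xs, ys):
    """Integrate |F_xs - F_ys| over R for two empirical distribution functions.

    Both functions are piecewise constant between consecutive atoms, so the
    integral is an exact finite sum.
    """
    xs, ys = sorted(xs), sorted(ys)
    grid = sorted(set(xs) | set(ys))
    total = 0.0
    for a, b in zip(grid, grid[1:]):
        f_x = bisect.bisect_right(xs, a) / len(xs)  # F_xs on [a, b)
        f_y = bisect.bisect_right(ys, a) / len(ys)  # F_ys on [a, b)
        total += abs(f_x - f_y) * (b - a)
    return total


def w1_via_sorting(xs, ys):
    """W1 through the monotone coupling (requires equal sample sizes)."""
    if len(xs) != len(ys):
        raise ValueError("samples must have the same size")
    return sum(abs(a - b) for a, b in zip(sorted(xs), sorted(ys))) / len(xs)


if __name__ == "__main__":
    print(w1_via_cdf([0.0, 1.0], [1.0, 2.0]))      # 1.0
    print(w1_via_sorting([0.0, 1.0], [1.0, 2.0]))  # 1.0

    rng = random.Random(0)
    xs = [rng.gauss(0.0, 1.0) for _ in range(500)]
    ys = [rng.gauss(0.5, 1.0) for _ in range(500)]
    print(w1_via_cdf(xs, ys), w1_via_sorting(xs, ys))
```

The two computations agree up to floating-point rounding, which reflects the isometry between $(M_1^1(\mathbb{R}),W_1)$ and a subset of $L^1(\mathbb{R},dx)$ used in the example.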

Acknowledgements

The authors are very grateful to the editor and associate editor for their comments. We are particularly indebted to the referee for his attentive corrections and helpful comments. The first named author also thanks Professors F.Q. Gao, Z.L. Zhang and N. Yao for useful discussions.

References (13)

  • P. Eichelsbacher et al., Exponential approximations in completely regular topological spaces and extensions of Sanov’s theorem, Stochastic Process. Appl. (1998)
  • M.A. Arcones, Large deviations of empirical processes
  • F. Bolley et al., Quantitative concentration inequalities for empirical measures on non compact spaces, Probab. Theory Related Fields (2007)
  • M.F. Chen, From Markov Chains to Non-equilibrium Particle Systems (2004)
  • A. Dembo et al.
  • P. Eichelsbacher et al., Large deviations for partial sums U-processes, Theory Probab. Appl. (1998)
There are more references available in the full text version of this article.
