Large Deviations estimates for some non-local equations. General bounds and applications

Large deviation estimates for the following linear parabolic equation are studied: \[ \frac{\partial u}{\partial t}=\tr\Big(a(x)D^2u\Big) + b(x)\cdot D u + \int_{\R^N} \Big\{u(x+y)-u(x)-\big(D u(x)\cdot y\big)\ind{|y|<1}(y)\Big\}\d\mu(y), \] where $\mu$ is a L\'evy measure (which may be singular at the origin). Assuming only that some exponential is integrable with respect to the tail of $\mu$, it is shown that, for a given initial data, solutions defined in a bounded domain converge exponentially fast to the solution of the problem posed in the whole space. The exact rate, which depends strongly on the decay of $\mu$ at infinity, is also estimated.


Introduction
The equations we consider in this paper take the following form: \[ \frac{\partial u}{\partial t}=\tr\Big(a(x)D^2u\Big) + b(x)\cdot D u + \int_{\R^N} \Big\{u(x+y)-u(x)-\big(D u(x)\cdot y\big)\ind{|y|<1}(y)\Big\}\d\mu(y). \tag{1.1} \] Recently, equations like (1.1) involving Lévy-type non-local terms have been under thorough investigation by many authors and in many directions. Indeed, they present challenging problems which are not covered by the existing "local" vision of PDEs. References are too many to cite; those most closely related to our concern are given below.
Our main goal here is to obtain estimates on how the solutions $u_R$ that are defined in the ball $B_R = \{|x| < R\}$ approach the solution in $\R^N$ as $R \to \infty$, in the spirit of [5]. To sum up briefly, the main results of the present paper show that, provided $\mu$ has at most an exponential tail, a bound of the following kind always holds as $R \to \infty$: \[ (u - u_R)(x,t) \;\le\; \exp\big(-R\,f(R)\,(1+o(1))\big). \tag{1.2} \] The behaviour of $f$ depends on the tail of $\mu$, but in any case $f$ cannot grow faster than linearly. This has to be compared with the classical $\exp(-R^2)$-type estimate associated with the heat equation [5]: the presence of non-local terms implies an order $\exp(-R\ln R)$ at best.
Typical examples - As for (1.1), a first example we have in mind is the convolution equation \[ \frac{\partial u}{\partial t} = J*u - u. \tag{1.3} \] Here $a, b = 0$ and $\mu$ is a Lévy measure with probability density $J$, which is symmetric and integrable - see [13,14]. In [9], the authors gave partial answers for this equation when $J$ decays strictly faster than an exponential at infinity. Our results here include and generalize those given in [9]. Be careful that writing the equation in the form (1.3) requires $J$ to be of unit mass, and moreover that in this case $\mu(z) = J(-z)$, a detail which has to be taken into account when dealing with non-symmetric kernels.
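To fix ideas, here is a minimal numerical sketch of the convolution equation (1.3); this is our own illustration (tent kernel, Gaussian initial data, explicit Euler), none of the parameters come from the paper.

```python
import numpy as np

# Explicit Euler discretization of u_t = J*u - u (equation (1.3)).
# Toy setup: tent kernel J(y) = (1 - |y|)_+ normalized to unit mass.
L, n = 10.0, 512
x = np.linspace(-L, L, n, endpoint=False)
h = x[1] - x[0]

J = np.maximum(1.0 - np.abs(x), 0.0)   # supp(J) = B_1
J /= J.sum() * h                       # unit mass, as required for (1.3)

u = np.exp(-x**2)                      # positive, bounded initial data u_0
mass0 = u.sum() * h

dt = 0.01
for _ in range(200):                   # evolve up to t = 2
    Ju = h * np.convolve(u, J, mode='same')
    u = u + dt * (Ju - u)

mass = u.sum() * h                     # J has unit mass: mass is conserved
```

Since $J$ has unit mass, the scheme conserves total mass (up to boundary truncation) and preserves positivity and the bound by $\sup u_0$, consistent with the comparison arguments used throughout the paper.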
A more sophisticated example is the following: \[ \frac{\partial u}{\partial t} = \mathrm{P.V.}\int_{\R^N}\big(u(x+y)-u(x)\big)\,\frac{e^{-|y|}}{|y|^{N+\alpha}}\,\d y, \] where P.V. stands for principal value. This is equivalent to including the compensator term $Du(x)\cdot y$, since the measure is symmetric. The measure $J(y) = e^{-|y|}/|y|^{N+\alpha}$, called the tempered $\alpha$-stable law, appears for instance in finance [20].
It is important to notice that our framework (which will be fully established in Section 2) does not allow us to treat fractional Laplace-type non-local terms, for which $J(y) = 1/|y|^{N+\alpha}$, $\alpha \in (0,2)$, since we require the tail of $J$ to go to zero at least exponentially fast.
Existence and uniqueness of solutions - Concerning the existence and uniqueness of solutions of (1.1), first notice that in the case of constant coefficients $a$ and $b$, this can be derived from a Fourier analysis of the equation. In the absence of differential terms, i.e., $a = b = 0$, and if $\mu$ is not singular at the origin, various results about existence and uniqueness of solutions in $\R^N$ or in bounded domains can be found in [9,10,13,14].
Finally, in the case when $a(x) = \sigma\sigma^T$ with $\sigma$ and $b$ Lipschitz, existence and uniqueness of bounded solutions are proven in [7] for the elliptic (stationary) version of the equation, under some assumptions on $\mu$. With minor modifications (which essentially amount to replacing the $(-\gamma u)$-term in the equation with a $u_t$-term), the same result holds true for the parabolic version, and gives existence and uniqueness of solutions in $\R^N$ or in a bounded domain. See also [1,11,12] for some other results.
We shall not derive here a full theory of existence, uniqueness and comparison for (1.1), which is not the central point of this paper. Rather, we will simply assume that given a continuous, positive and bounded initial data $u_0$, there exists a unique viscosity solution $u(x,t)$ of (1.1) with initial data $u_0$. We also consider, for any $R > 0$, the solution $u_R$ of (1.1) in $B_R$ with initial data $u_0$ and $u_R(x,t) = 0$ for any $|x| > R$ (see [9] for details), and assume that for any $R > 0$ there exists a unique viscosity solution $u_R$. Starting from this, we will now estimate the difference $(u - u_R)$.
Large deviations -Estimate (1.2) may be seen as a large deviations result if one considers the probabilistic viewpoint associated to the equation. We shall not enter here into details about Lévy processes and refer for instance to [8,15] for a probability-oriented approach, but let us just mention a few facts.
A Lévy process is a stochastic process $(Y_t)_{t\ge 0}$ which has stationary and independent increments: for any $0 < s < t$, the law of $(Y_t - Y_s)$ only depends on $(t-s)$, and increments over non-overlapping time intervals are independent. An important fact is that $(Y_t)$ is not required to be continuous (in time), so that this type of process can model continuous diffusions, jump diffusions, or a mix of both. Typical Lévy processes are the "usual" Brownian motion, the compound Poisson process and the $\alpha$-stable jump process. Now, if $u(\cdot,t)$ represents the density of the law of $Y_t$, then it can be shown, using the Lévy-Khintchine formula, that the characteristic function of $Y_t$, which is the Fourier transform of $u$, takes the form $\hat u(\xi,t) = e^{t\varphi(\xi)}$, where the characteristic exponent $\varphi$ is the sum of the contributions of (i) a Brownian motion, (ii) a drift and (iii) a pure jump process. More precisely, in Fourier variables, \[ \varphi(\xi) = -\,\xi\cdot A\xi + i\,B\cdot\xi + \int_{\R^N}\Big\{e^{i\xi\cdot y}-1-i(\xi\cdot y)\ind{|y|<1}(y)\Big\}\d\mu(y), \] where $\mu$ is a Lévy measure. By taking the inverse Fourier transform of $\hat u_t = \varphi(\xi)\hat u$, we recover exactly equation (1.1): Brownian motions are associated to Laplacian-type diffusion terms $\tr(a(x)D^2u)$, while a drift corresponds to the $b(x)\cdot Du$-term. In the case of the compound Poisson process, $\mu$ has a density $J \in L^1$, the jump distribution of the process, so that the non-local term can be re-written as $J*u$, leading to (1.3). Finally, in the case of the $\alpha$-stable law $\mu(z) = |z|^{-N-\alpha}$, we recover the well-known fractional Laplace operator.
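As a consistency check (a standard computation, not reproduced from the paper), the compound Poisson case can be made completely explicit. With the convention $\hat u(\xi) = \int e^{-ix\cdot\xi}u(x)\,\d x$, the non-local term transforms as \[ \int_{\R^N}\big(u(x+y)-u(x)\big)J(y)\,\d y \;\longmapsto\; \hat u(\xi)\int_{\R^N}\big(e^{i\xi\cdot y}-1\big)J(y)\,\d y, \] so that $\varphi(\xi) = \hat J(-\xi) - 1$, and $\hat u_t = \varphi(\xi)\hat u$ inverts to $u_t = \check J * u - u$ with $\check J(y) := J(-y)$. For symmetric $J$ this is exactly (1.3); for non-symmetric kernels, the reflection $\check J$ is precisely the point $\mu(z) = J(-z)$ raised above.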
In some sense, estimate (1.2) measures the probability that the underlying process has escaped the ball $B_R$ between times $t = 0$ and $t = T$ (we refer to [5] for a precise proof of this assertion in the case of the Laplacian), and obtaining an exponential-type estimate is precisely what is called a large deviations result - see the introduction of [9], and [18] for more about this.
Notations - Throughout the paper we will denote by $(e_i)$ the canonical basis of $\R^N$, while $\nu$ will refer to any unit vector. The Euclidean scalar product is denoted by $a\cdot b$, and we use $|\cdot|$ for the associated norm in $\R^N$. We use $\mathrm{ang}(a,b)$ to denote the angle between the vectors $a$ and $b$ (see Definition 5.1), and $a \mapsto a^T$ is the transposition. The notation $B(x,r)$ stands, as usual, for the ball of radius $r > 0$ centered at $x \in \R^N$, and we also use the simpler notation $B_r = B(0,r)$. We denote by $Du \in \R^N$ the gradient of $u$ and by $D^2u \in \mathcal M_N(\R)$ the Hessian matrix of $u$. Throughout the paper, $c(J)$ denotes any constant which depends only on $J$, possibly varying from one line to another. The notation $J_1 \succ J_2$ essentially means that $J_1 \ge J_2$ in $\R^N$, but with some strict separation inside a small region (see Definition 2.4). We denote by $a \wedge b$ the infimum of $a$ and $b$.
Main results - We first have a general result, Theorem 3.1, which gives a theoretical bound in terms of a rate function $I_\infty$, typical of large deviation estimates: as $R\to\infty$, \[ (u - u_R)(x,t) \;\le\; \exp\Big(-R\,I_\infty\big(x/R,\,t/R\big)\,(1+o(1))\Big). \tag{1.4} \] The rate function $I_\infty$ satisfies a limit Hamilton-Jacobi problem, where the Hamiltonian associated to (1.1) is given by \[ H(p) = Ap\cdot p + B\cdot p + \int_{\R^N}\Big\{e^{p\cdot y}-1-(p\cdot y)\ind{|y|<1}(y)\Big\}\d\mu(y). \tag{1.5} \] The idea of the proof is to suitably rescale the problem for $u-u_R$ (which is not the same as in the case of the local heat equation, though) and to derive an equation for the log-transform; we then pass to the limit as $R \to \infty$.
The Lagrangian $L$ associated to $H$ is defined as usual as \[ L(q) := \sup_{p\in\R^N}\big\{p\cdot q - H(p)\big\}, \] and using a Lax-Oleinik formula, see [19], we obtain a semi-explicit expression (3.2) for $I_\infty$ in terms of $L$.
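For comparison (a classical computation, recalled only to fix ideas; this is not the paper's non-local Hamiltonian), for the heat equation one has $H(p) = |p|^2$ and the Legendre transform is explicit: \[ L(q) = \sup_{p\in\R^N}\big\{p\cdot q - |p|^2\big\} = \frac{|q|^2}{4}, \] the sup being attained at $p = q/2$. The quadratic growth of $L$ is the source of the $\exp(-R^2)$-type estimate of [5] recalled above; for the non-local $H$, which grows exponentially, $L$ grows far more slowly (like $|q|\ln|q|$ in the compactly supported case of Section 4), whence the weaker $\exp(-R\ln R)$ rate.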
In our present framework, obtaining (1.4) is more involved than in [9], since we have to deal with the differential terms and the singular part of $H$. Hence, we face several difficulties coming from the generalization of the equation, which require non-trivial adaptations and new developments of some aspects of the theory. Let us briefly explain the main points we have to deal with: (i) Singular measures at the origin: using kernels with singularities first requires a suitable concept of solution; we use here the notion of viscosity solutions derived in [6] for Lévy-type operators, which allows us to handle this situation. Then we introduce what we call the essential Hamiltonian, $H^{\mathrm{ess}}$, wiping out the singularity at the origin, see Definition 2.3, and prove that asymptotically $H$ and $H^{\mathrm{ess}}$ are equivalent (and the same holds for the associated Lagrangians).
(ii) Non-symmetric kernels: we want to estimate $L$ in order to obtain an expression for $I_\infty$. This requires an extra effort in this case: since we do not have symmetry, the vectors $p$ and $DH(p)$ do not necessarily point in the same direction. Hence dot products cannot be simplified, and we have to carefully analyze the angle between vectors. We prove some results valid for non-symmetric kernels $J$ at the logarithmic level, and others using the smallest symmetric kernel above $J$. (iii) Differential terms: the first two terms in equation (1.1) do not introduce many difficulties with respect to [9], but we have to handle them within the notion of viscosity solution. To be able to pass to the limit, we will assume that $a$ and $b$ have limits at infinity. (iv) Possibly infinite Hamiltonians: in the case when $J(y) \sim e^{-\alpha|y|}$ as $|y| \to \infty$, $H$ is infinite outside $B_\alpha$, which makes the analysis of (1.4) much more delicate, since an initial layer appears. Thus, one of the major contributions of this paper is to provide existence, uniqueness and comparison results for the Hamilton-Jacobi equation $u_t + H(Du) = 0$ when $H$ is infinite outside a ball.
Organization - After a preliminary section, Section 2, in which some properties of the generalized non-local equation as well as of the Hamiltonian are stated, we devote Section 3 to the theoretical behaviour of the problem, where we prove Theorem 3.1. Then we concentrate our efforts on finding the behaviour of $L$, which allows us to give more explicit convergence rates for $u - u_R$. We deal with this issue in Sections 4, 5 and 6, where we consider different types of measures (compactly supported, intermediate, exponentially decaying), which lead to different behaviours. Section 6 deals especially with critical kernels for which $H$ is not finite everywhere; we thus devote a large part of this section to the Hamilton-Jacobi equation with such an $H$. Finally, in Section 7 we collect some further results: we give some explicit calculations of rates and consider totally asymmetric measures in 1-D. We also explain how our methods allow us to obtain some estimates for the related non-local KPP equation.

Preliminaries
Let us now focus on the concrete hypotheses concerning the equations we have in mind. For $\mu$, $a$, $b$, $A$, $B$ in (1.1) and (1.5) there are essentially three assumptions we shall make throughout the paper.

Hypothesis 1. The measure $\mu$ is a Lévy measure with density $J(\cdot)$ satisfying (2.1).

Hence in the sequel we rewrite (1.5) as \[ H(p) = Ap\cdot p + B\cdot p + \int_{\R^N}\Big\{e^{p\cdot y}-1-(p\cdot y)\ind{|y|<1}(y)\Big\}J(y)\,\d y. \tag{2.2} \] The last assumption in (2.1) implies in particular that the integral in (2.2) over $\{|y| < 1\}$ converges.

Hypothesis 2.
The $N\times N$-matrix $a(\cdot)$ is continuous and nonnegative, $b(\cdot)$ is a continuous vector field in $\R^N$, and we also assume that the following limits are well-defined: \[ A := \lim_{|x|\to\infty} a(x), \qquad B := \lim_{|x|\to\infty} b(x) \tag{2.3} \] (notice that necessarily $A$ is nonnegative).
Except in Section 6, we also make the following assumption on $J$:

Hypothesis 3. For any $\beta > 0$, \[ \int_{\{|y|>1\}} e^{\beta|y|}\,J(y)\,\d y < \infty. \tag{2.4} \]
This ensures that the exponential part of the Hamiltonian in (1.5) (or (2.2)) converges for any $p \in \R^N$. In Section 6, we shall only assume that \[ \sup\Big\{\beta > 0 \;:\; \int_{\{|y|>1\}} e^{\beta|y|}J(y)\,\d y < \infty\Big\} \;>\; 0, \tag{2.5} \] which is the case for exponential-type tails; for such tails the sup above is finite, which implies that the associated Hamiltonian is not everywhere defined.
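For instance (an elementary computation, for a purely exponential tail), if $J(y) \asymp e^{-\alpha|y|}$ as $|y|\to\infty$ for some $\alpha > 0$, then \[ \int_{\{|y|>1\}} e^{\beta|y|}\,J(y)\,\d y < \infty \iff \beta < \alpha, \] so the sup in (2.5) equals $\alpha$. Accordingly, $\int e^{p\cdot y}J(y)\,\d y$, and hence the Hamiltonian, is finite only for $|p| < \alpha$: this is the situation "$H = +\infty$ outside $B_\alpha$" described in the Introduction and studied in Section 6.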
Most of our results would also be valid in a more general setting; for instance, we could relax the $C^1$ condition on $J$, but this is not a major issue for this paper. There are also some other specific assumptions that we shall make in Section 5.

Viscosity solutions of the equation
Equations like (1.1) can be treated in various ways, depending on the coefficients $a(\cdot)$ and $b(\cdot)$ and on whether the measure $\mu$ is singular at the origin or not. Since we only make general assumptions, the best tool to deal with differential terms and a singular measure is the notion of viscosity solutions.
Essentially, the notion of viscosity solution gives a sense to the differential terms as well as to the integral term without knowing a priori that the solution is regular, by comparison with smooth test functions. Notice however that in the integral term, one only needs to replace $u$ by a smooth function in a small neighborhood of $y = 0$. We refer to [7,11,12] for more results on viscosity solutions in the presence of singular measures. We then introduce the following definition: an u.s.c. locally bounded function is a subsolution of (1.1) iff, for any $\delta \in (0,1)$ and any test function $\phi \in C^2(\R^N\times\R_+)$, the corresponding inequality holds at each maximum point of the difference with $\phi$. A locally bounded l.s.c. viscosity supersolution is defined in the same way, with reversed inequality and min instead of max. Finally, a viscosity solution is a function whose upper and lower semicontinuous envelopes are respectively sub- and supersolutions of the problem.
When initial and/or boundary data are involved, the definition takes into account that at a boundary point, either the inequality has to hold, or the boundary data has to be attained in the sub/supersolution sense. If $\Omega$ is the domain of definition, $f$ is the boundary data (in the sense of non-local equations) and $u(x,0) = u_0(x)$ is the initial data of the problem, we define:

Definition 2.2. An u.s.c. locally bounded function is a subsolution of (1.1) in $\Omega$ with boundary data $f$ and initial data $u(x,0) = u_0(x)$ iff, for any $\delta > 0$ and any test function $\phi \in C^2(\R^N\times(0,\infty))$, the corresponding relaxed inequality holds at each maximum point. A supersolution is defined with reversed inequalities and min/max exchanged accordingly, and a solution is such that its u.s.c./l.s.c. envelopes are respectively sub-/supersolutions of the equation.

Hamiltonians
We shall now prove some properties of the specific Hamiltonians we consider.

Lemma 2.1. Let $J$ be a kernel satisfying (2.1). Then the Hamiltonian $H$ is superlinear and strictly convex, and thus the associated Lagrangian $L$ is well-defined, convex and superlinear.
Proof. A straightforward computation shows that, in the sense of matrices, \[ D^2H(p) \;\ge\; \int_{\R^N} (y\otimes y)\, e^{p\cdot y} J(y)\,\d y \;>\; 0, \] so that $H$ is strictly convex. Now, one can easily check that the growth of $H$ is governed by that of $\int e^{p\cdot y} J(y)\,\d y$.
Indeed, the differential terms are clearly of the order of $|p|^2$ and $|p|$ respectively, while the integral over $\{|y|<1\}$ converges by (2.1). On the other hand, \[ \int_{\{p\cdot y > \rho_0|p|/2\}} e^{p\cdot y} J(y)\,\d y \;\ge\; c(J)\, e^{\rho_0|p|/2}, \] so that indeed $H$ is superlinear. It is well-known (see [21] for instance) that if $H$ is strictly convex and superlinear, then so is $L$.
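A quick numerical sanity check (our toy example; the tent kernel is not from the paper) of the exponential lower bound driving superlinearity:

```python
import numpy as np

# For the tent kernel J(y) = (1 - |y|)_+, supp(J) = B_1 (so rho = 1),
# the exponential part of H is E(p) = \int e^{p y} J(y) dy, with the
# closed form E(p) = (e^p + e^{-p} - 2) / p^2.
y = np.linspace(-1.0, 1.0, 2001)
h = y[1] - y[0]
J = 1.0 - np.abs(y)

def E(p):
    # Riemann sum; the integrand vanishes at the endpoints
    return np.sum(np.exp(p * y) * J) * h

def E_exact(p):
    return (np.exp(p) + np.exp(-p) - 2.0) / p**2

# log E(p) / p increases towards rho = 1: exponential growth at a rate
# given by the radius of the support, as in the proof of Lemma 2.1.
r10 = np.log(E(10.0)) / 10.0
r20 = np.log(E(20.0)) / 20.0
```

The growth of $E$ is genuinely exponential (here of rate approaching the support radius $\rho = 1$), which dominates the polynomial differential terms.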
We shall go a bit further in this direction, using the fact that the main contribution to the Hamiltonian comes from the exponential term in the integral. So let us define the essential part of the Hamiltonian and the corresponding Lagrangian:

Definition 2.3. \[ H^{\mathrm{ess}}(p) := \int_{\{|y|>\rho_0/2\}} e^{p\cdot y} J(y)\,\d y \qquad\text{and}\qquad L^{\mathrm{ess}}(q) := \sup_{p\in\R^N}\big\{p\cdot q - H^{\mathrm{ess}}(p)\big\}, \] where $\rho_0$ is defined in (2.1).
The reason why we integrate over $\{|y| > \rho_0/2\}$ is that we want to avoid the singularity at the origin, but we also want to be sure to integrate within the support of $J$ when it is compactly supported. The condition $\mathrm{supp}(J) \supset B_{\rho_0}$ ensures that $\{|y| > \rho_0/2\} \cap \mathrm{supp}(J) \neq \emptyset$.
We first have a basic estimate, similar to the one used in Lemma 2.1.

Lemma 2.2. Let $J$ be a kernel satisfying (2.1) and (2.5). Then $H^{\mathrm{ess}}(p)$ and $p\cdot DH^{\mathrm{ess}}(p)$ grow at least exponentially; moreover there exists $\varepsilon > 0$ such that the same lower bound holds for $\nu\cdot DH^{\mathrm{ess}}(p)$, for any unit vector $\nu$.

Proof. The first estimate is easily obtained by restricting the integral to the region $\{p\cdot y > \rho_0|p|/2\}$. Notice that the constant $c(J)$ may be small, but it is still positive, since $J$ is continuous and positive inside $\{p\cdot y > \rho_0|p|/2\}$.
For the second estimate, the proof is essentially the same, where in this case $c(J) = \int_{\{|y|>1\}} |y|\,J(y)\,\d y < \infty$ by (2.5).
Finally, for any unit vector $e_i$ of the canonical basis of $\R^N$, we compute with the same decomposition. We notice that the set $\{p\cdot y > \rho_0|p|/2\}$ contains a truncated cone $B$; hence, for some $\varepsilon > 0$ small enough (which can be chosen independently of $e_i$), and since $B \cap B_{\rho_0} \neq \emptyset$, the integral of $J$ on $B$ is positive. So we can estimate the integral from below, and the result also holds for any unit vector $\nu$.
The next estimate will be essential in the sequel.

Lemma 2.3. Let $J$ be a kernel satisfying (2.1) and (2.4). Then, for any $\gamma \in (0,\rho_0)$, estimate (2.6) holds.

Proof. We begin by splitting the integral into two terms, $I_1$ and $I_2$. The first integral is estimated directly. Now, notice that since $\gamma < \rho_0$, even if $J$ is compactly supported, the integral $I_2$ concerns a region where $J$ is not zero, so that we can indeed control it by $p\cdot DH^{\mathrm{ess}}(p)$. Summing up $I_1$ and $I_2$ gives the first result of the lemma.
Remark 2.4. If $DH^{\mathrm{ess}}(p)$ were to grow faster than an exponential, estimate (2.6) would be sufficient to conclude that $H^{\mathrm{ess}}(p)$ is negligible in front of $p\cdot DH^{\mathrm{ess}}(p)$. This will actually be the case when $J$ is not compactly supported, see Lemma 5.6.
The following lemma shows that, essentially, the estimates we produce in this paper depend neither on $A$, $B$, nor on the behaviour of $J$ near the origin.

Lemma 2.5. We have $H(p) \sim H^{\mathrm{ess}}(p)$ and $L(q) \sim L^{\mathrm{ess}}(q)$ as $|p|, |q| \to \infty$. Moreover, if $J$ is symmetric, then $H^{\mathrm{ess}}$ and $L^{\mathrm{ess}}$ are also symmetric.
Proof. We split the Hamiltonian as follows: $H(p) = H^{\mathrm{ess}}(p) + O(|p|^3)$. Since the Hamiltonians grow at least exponentially (Lemma 2.2), the $O(|p|^3)$-term is negligible by far, and $H$ and $H^{\mathrm{ess}}$ are equivalent. The calculations for $DH$ and $DH^{\mathrm{ess}}$ are similar: we get a polynomial bound for the difference, while $|DH(p)|$ grows at least exponentially. Notice that since $H$ and $H^{\mathrm{ess}}$ are equivalent, Lemma 2.3 implies that both are negligible in front of $p\cdot DH$ and $p\cdot DH^{\mathrm{ess}}$.
More involved is the proof that $L^{\mathrm{ess}}(q) \sim L(q)$. Let us first notice that the point $p_0 = p_0(q)$ where the sup defining $L(q)$ is attained goes to infinity as $|q| \to \infty$. Indeed, this comes from the fact that $q = DH(p_0)$ and that $DH(p)$ grows at least exponentially, see Lemma 2.2. The same holds for $DH^{\mathrm{ess}}$.
Then we proceed as follows: for $|q|$ large enough, since the sups below are attained at $|p|$ also large, there is some constant $C > 0$ such that (2.7) holds. Hence, if we denote by $p_1 = p_1(q)$ the point where the sup on the right is attained, we have $q = DH^{\mathrm{ess}}(p_1) + 3C|p_1|p_1$, while $q = DH(p_0)$ for the sup in $L(q)$. From this we will conclude that $p_1(q) \sim p_0(q)$. More precisely, a Taylor expansion gives an identity for some $\xi$ in the segment $[p_0, p_1]$. We take the scalar product with $(p_1 - p_0)$ and use Lemma 2.2 to get an estimate from above. It follows that, for some constant still denoted $C > 0$, and noticing that $|\xi| \ge \min\{|p_0|, |p_1|\} \to \infty$, this estimate implies that $p_1(q) \sim p_0(q)$ as $|q| \to \infty$.
The same calculation is valid for the sup on the left of (2.7), so that we conclude that $L^{\mathrm{ess}}(q) \sim L(q)$. Regarding the symmetry, see [9, Lemma 2.4].
We finally prove a result which allows us to compare the Lagrangians when the kernels are ordered. For technical reasons, we need the kernels to be strictly ordered in $B_\rho \setminus B_{\rho_0/2}$; hence we introduce the following notation:

Definition 2.4. We say that $J_1$ and $J_2$ are essentially ordered, and we denote this by $J_1 \succ J_2$, if there exist $a, b > 0$ such that $\rho_0/2 < a < b < \rho_0$ and the ordering is strict on $\{a < |y| < b\}$.

Lemma 2.6. Let us assume that $J_1$, $J_2$ satisfy (2.1) and (2.4), and that $J_1 \succ J_2$. Denote by $H_1$, $H_2$, $L_1$, $L_2$ the associated Hamiltonians and Lagrangians. Then, for $p$, $q$ big enough, the corresponding order holds.

Proof. The inequality $H_1^{\mathrm{ess}}(p) \ge H_2^{\mathrm{ess}}(p)$ is always true, since the integrand in $H^{\mathrm{ess}}$ is always positive and $J_1 \ge J_2$. But as such, this is not enough to derive the same inequality between $H_1$ and $H_2$. To this end, let us notice that since $J_1 \succ J_2$, there exist $\varepsilon > 0$ and $\rho_0/2 < a < b < \rho_0$ such that a lower bound holds with constant $c(\chi)$, the integral of $\chi$ over $\{a < |y| < b\} \cap \{p\cdot y > a|p|\}$. It is clear that this set has non-empty interior, and since $\chi > 0$ in this set, $c(\chi)$ is a strictly positive constant. This proves that not only are $H_1^{\mathrm{ess}}$ and $H_2^{\mathrm{ess}}$ ordered, but moreover that the difference between the two is at least of the order of an exponential.
Then we use, as in the proof of Lemma 2.5, that $H_1(p) = H_1^{\mathrm{ess}}(p) + O(|p|^3)$, and the same for $H_2$; thus, for $|p|$ big enough, we indeed have $H_1(p) \ge H_2(p)$.
The inequality for L 1 and L 2 follows by definition.
In this case, the tails of both kernels and their support (if compactly supported) remain exactly the same so that essentially, all the estimates we give in the sequel are the same for both kernels.

Viscosity solutions of the limit Hamilton-Jacobi equation
In Section 3, we will have to study the following Hamilton-Jacobi equation with Cauchy-Dirichlet boundary values: \[ \begin{cases} v_t + H(Dv) = 0 & \text{in } B_1\times(0,\infty),\\ v = 0 & \text{on } \partial B_1\times(0,\infty),\\ v(\cdot,0) = A & \text{in } B_1, \end{cases} \tag{2.9} \] where we assume here that $H \in C(\R^N;\R)$. We refer to Section 6 for the case when the Hamiltonian can take infinite values. Let us first recall the definition of viscosity solutions for this equation (see for instance [17]): an u.s.c. locally bounded function is a viscosity subsolution if the corresponding inequalities hold; a locally bounded l.s.c. function is a viscosity supersolution if the same holds with reversed inequalities and min replaced by max at the boundary. Finally, a viscosity solution is a locally bounded function $u$ such that its u.s.c. and l.s.c. envelopes are respectively sub- and supersolutions of (2.9).
Notice that in general, the initial data has to be understood in the relaxed viscosity way (either the data is attained, or the sub/supersolution condition holds), but it is well-known (see for instance [2]) that in the case of a continuous Hamiltonian, this condition is equivalent to the one we use here. Keep in mind, however, that this will not be the case in Section 6.
Since $H$ is convex, we have the following representation:

Lemma 2.8. If $J$ satisfies (2.1) and (2.4), then there exists a unique viscosity solution of (2.9), which is given by a Lax-Oleinik-type formula.

Proof. The assumptions made on $J$ imply that both $H$ and $L$ are finite everywhere, convex and superlinear, so that uniqueness holds for this problem, see [19]. Then the Lax-Oleinik formula applies in the bounded domain $B_1\times(0,\infty)$. Using that $L \ge 0$ and $L(0) = 0$, we obtain that the first min equals $A$. Then, using that the function $r \mapsto L(cr)/r$ is increasing, the second minimum is attained at $s = 0$, so the result holds.
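To illustrate the Lax-Oleinik (Hopf-Lax) representation invoked here, the following sketch uses a model quadratic Hamiltonian in the whole space (our choice; this is neither the paper's non-local $H$ nor its bounded-domain variant):

```python
import numpy as np

# Hopf-Lax formula u(x,t) = min_y { u0(y) + t L((x-y)/t) } for
# H(p) = p^2/2, hence L(q) = q^2/2, computed by brute force on a grid.
x = np.linspace(-5.0, 5.0, 201)
u0 = np.abs(x)

def hopf_lax(v, t):
    Q = (x[:, None] - x[None, :]) / t        # rows: x, columns: y
    return np.min(v[None, :] + t * 0.5 * Q**2, axis=1)

u_direct = hopf_lax(u0, 2.0)                 # one step of size 2
u_two = hopf_lax(hopf_lax(u0, 1.0), 1.0)     # two steps of size 1
err = np.max(np.abs(u_direct - u_two))       # semigroup property
```

The semigroup property (applying the formula twice over $[0,1]$ and $[1,2]$ agrees with applying it once over $[0,2]$) is what makes the Lax-Oleinik formula a genuine solution operator for the Hamilton-Jacobi equation.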

Theoretical Behaviour
The main goal of this section is to derive a theoretical bound, in terms of the Lagrangian $L$, for the error made when approximating the solution $u$ of (1.1) by the solutions $u_R$ of the Dirichlet problem. This result will allow us to derive explicit rates of convergence, provided we know the behaviour at infinity of $L(q)$ (this will be the aim of the next sections).

Theorem 3.1. If $J$ satisfies (2.1) and (2.4), then for any fixed $x \in \R^N$ and $t > 0$, estimate (3.1) holds as $R\to\infty$, where the rate function is given by (3.2). Moreover, for any $T > 0$ the $o(1)$ is uniform on sets of the form $\{|x| \le R,\ 0 \le t \le TR\}$.

We dedicate the rest of this section to the proof of this theorem. Notice that if one only assumes (2.5) instead of (2.4), the theorem still remains valid, but the proof requires some finer arguments, since the Hamiltonian can be infinite in some regions of $\R^N$. In such a case an initial layer appears, which we analyse in detail in Section 6.

Formal convergence
Let us denote by $v_R = u - u_R$, which, by linearity, satisfies (1.1) in $B_R$.
We first rescale the equation both in $x$ and $t$ as follows: $w_R(x,t) := v_R(Rx, Rt)$. Then $w_R$ satisfies a rescaled equation in the fixed ball $B_1$, in the sense of viscosity. In order to estimate $w_R$ we follow [5] and perform the "usual" logarithmic transform, but we have to rescale accordingly, dividing by $R$ (and not by $R^2$, as is the case for the heat equation). So, remembering that $w_R > 0$ for $t > 0$, let us define \[ I_R(x,t) := -\frac1R \ln w_R(x,t). \]
As for the differential terms, a direct computation shows how they are transformed. We thus arrive at the following equation for $I_R$ (to be interpreted in the sense of viscosity, with test functions), namely equation (3.3). In the following subsection we justify the convergence of $I_R$ towards the solution of the limit problem, in the sense of viscosity solutions.
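Formally (a sketch of the standard computation, assuming $I_R$ smooth), the effect of the logarithmic transform on the non-local term is the following. Since $w_R = e^{-R I_R}$ and the space variable has been rescaled by $R$, \[ \frac{w_R(x+y/R,\,t)}{w_R(x,t)} = e^{-R\,[\,I_R(x+y/R,\,t)-I_R(x,t)\,]} = e^{-DI_R(x,t)\cdot y + o(1)} \quad\text{as } R\to\infty, \] for each fixed $y$. The increments of $w_R$ thus produce exactly the exponential $e^{p\cdot y}$ appearing in the Hamiltonian (with $p = -DI_R$, up to the orientation convention for the kernel discussed after (1.3)), which is how the equation for $I_R$ closes into a Hamilton-Jacobi equation in the limit.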

Passing to the limit in the viscous sense
A first problem comes from the fact that if $w_R$ approaches zero, then $I_R$ may not remain bounded. Hence, to keep $I_R$ under control from above, we use the same trick as in [5], which consists in modifying $I_R$ a little bit. For any $A > 0$, let \[ I^A_R := -\frac1R \ln\big(w_R + e^{-RA}\big), \] which is bounded from above by $A$. Let us notice that since equation (1.1) is invariant under addition of constants, $I^A_R$ satisfies the same equation as $I_R$, that is, equation (3.3).
Proposition 3.2. The sequence (I A R ) converges locally uniformly in B 1 ×[0, ∞) as R → +∞ towards the unique viscosity solution I A of (2.9).
Proof. We introduce the half-relaxed limits, for $x \in \bar B_1$, $t \ge 0$: \[ \overline I^A(x,t) := \limsup_{R\to+\infty,\ (y,s)\to(x,t)} I^A_R(y,s) \qquad\text{and}\qquad \underline I^A(x,t) := \liminf_{R\to+\infty,\ (y,s)\to(x,t)} I^A_R(y,s), \] and we shall prove that they are respectively viscosity sub- and supersolutions of the limit problem (2.9). Then a uniqueness (comparison) result will allow us to conclude.
Let us take $\delta \in (0,1)$ and a test function $\phi$ such that $\overline I^A - \phi$ has a maximum at $(x_0,t_0)$. Up to a standard modification of $\phi$, we can assume the maximum is strict, so that there exist sequences $R_n \to +\infty$ and $(x_n,t_n) \to (x_0,t_0)$ such that $I^A_{R_n} - \phi$ has a strict maximum at $(x_n,t_n)$.
Case 1: the point (x 0 , t 0 ) is inside B 1 × (0, ∞). Then for n big enough, all the points (x n , t n ) are also inside B 1 × (0, ∞) so we may use the equation for I A Rn at those points and pass to the limit.
We first write down the viscosity inequality for $I^A_R$, where $\mathrm{Diff}(n)$ represents the differential terms, $\mathrm{Int}_1(n,\delta)$ is the integral over $\{|y| < \delta\}$ and $\mathrm{Int}_2(n,\delta)$ is the integral over $\{|y| \ge \delta\}$.
Let us first remark that passing to the limit in the differential terms is easy, using (2.3). For $n$ big enough, the first integral term can be controlled, uniformly in $n$, by a quantity which is small with $\delta$. It remains to pass to the limit in $\mathrm{Int}_2$. To this end we use the fact that, since we have a maximum point, the test function dominates the increments for any $z \in \R^N$. Then we fix $\varepsilon > 0$, choose some $M > 1$ and split $\mathrm{Int}_2$ into two terms. Since $w_R$ is bounded by $\|u_0\|_\infty$, we can choose $M$ big enough so that the second term is less than $\varepsilon$, independently of $n$.
Then we write a Taylor expansion of $\phi$ near the point $x_n$: there exists $\xi_n \in B_M$ such that the expansion holds. Since $\xi_n$ remains in $B_M$ and $\phi$ is smooth, $D^2\phi(\xi_n)$ remains bounded. Hence, we can pass to the limit as $n \to +\infty$. Summing up the various terms, we obtain that for any $\delta > 0$ and any $\varepsilon > 0$, there exists $M = M(\varepsilon) > 1$ such that the limit inequality holds up to $\varepsilon + o_\delta(1)$, where $o_\delta(1)$ represents a quantity that goes to zero as $\delta \to 0$.
It only remains to pass to the limit as $\varepsilon, \delta \to 0$. Since $J$ satisfies (2.1) and (2.4), the integral over $\R^N$ converges, and we can send $M$ to $+\infty$ to obtain the subsolution inequality in the limit, which shows that $\overline I^A$ is a subsolution at $(x_0,t_0)$ in the sense of viscosity.
Case 2: the point $(x_0,t_0)$ is located at the boundary: $x_0 \in \partial B_1$, $t_0 > 0$. Then the sequence $(x_n,t_n)$ may have points $x_n$ either inside $B_1\times(0,\infty)$, or at the boundary, or even outside $B_1$. If $x_n \in B_1$, we use the equation as in the previous case, while if $x_n \in \partial B_1$, we use the relaxed boundary condition in Definition 2.2. Finally, if $|x_n| > 1$, then $I_{R_n}(x_n,t_n) = 0$, so that in any case one has \[ \min\big\{\partial_t\phi + H(D\phi)\;;\; I^A_{R_n}\big\} \;\le\; 0 \quad\text{at } (x_n,t_n). \]
We then pass to the limit as n → +∞ and get the relaxed condition for I A at the boundary.
Case 3: the point $(x_0,t_0)$ is located at $t_0 = 0$, $x_0 \in B_1$. The same as in Case 2 happens: either $t_n = 0$, and then we use the initial condition, or $t_n > 0$, in which case we use the equation. In any case we get a relaxed inequality which passes to the limit. Now, it is well-known (see for instance [2, Thm 4.7]) that in this case, the initial condition is equivalent to $\lim_{t\to 0} \overline I^A \le A$. Actually this can be proved as in Proposition 6.3, using that in the present situation the Hamiltonian is finite everywhere.
Conclusion: first, the supersolution conditions for $\underline I^A$ are obtained by the same method, with reversed inequalities. Then, using comparison between u.s.c./l.s.c. sub/supersolutions for (2.9), we get the inequality $\overline I^A \le \underline I^A$, which implies equality of both functions. Hence, the whole sequence converges uniformly in $\bar B_1 \times [0,T]$, for all $T > 0$, to the unique solution $I^A$.

Proof of Theorem 3.1
This result simply comes from the fact that, for any $A > 0$, by construction $(u - u_R)(x,t) = \exp\big(-R\,I_R(x/R,\,t/R)\big)$. The fact that $\overline I^A = \underline I^A$, together with Lemma 2.8, yields the result for fixed $(x,t)$, passing to the limit as $A \to \infty$.
Now, the convergence of $I_R$ to $I_\infty$ is locally uniform in $B_1\times[0,\infty)$, so that for any $T > 0$, as long as $|x|/R \le 1$ and $0 \le t/R \le T$, the $o(1)$ is uniform with respect to $x$ and $t$ as above. Thus estimate (3.1) indeed holds, which ends the proof. Notice that at $t = 0$ both $I_R$ and $I_\infty$ are infinite (which corresponds to $A = +\infty$), but the difference $I_R - I_\infty$ nevertheless remains uniformly controlled. In the next sections we shall derive more explicit estimates only for $|x| \le \theta R$, with $\theta \in (0,1)$, because we use the asymptotic behaviour of $L(q)$ as $|q| \to \infty$.

Compactly supported kernels
In this section we prove that a general "$R\ln R$" bound is valid for compactly supported kernels, extending the symmetric and regular case proved in [9]. In order to take into account the possible asymmetry of the kernel, we define below, for any unit vector $\nu$, the size of the support of $J$ in the direction $\nu$: \[ \rho(\nu) := \sup\{r > 0 \,:\, r\nu \in \mathrm{supp}(J)\}. \tag{4.1} \] Notice that since $J$ is continuous, for any $r > 0$ close enough to $\rho(\nu)$, $J$ is positive in a neighborhood of $r\nu$. Notice also that if $J$ is symmetric, then $\rho(\nu) = \rho$, the radius of the support of $J$. We first derive a bound from below for non-symmetric kernels at the logarithmic scale.

Lemma 4.1. Let $J$ be a continuous compactly supported kernel, let $\nu = p/|p|$ and define $\rho(\nu)$ as above. Then we have the corresponding lower bound.

Proof. Let us first choose a unit vector $\nu$, define $\rho(\nu)$ by (4.1), and consider $p \in \R^N$ going to infinity in this direction: $p/|p| = \nu$ and $|p| \to \infty$. We begin by writing \[ p\cdot DH^{\mathrm{ess}}(p) = \int_{\{p\cdot y \le 0\}\cap\{|y|>\rho_0/2\}} (p\cdot y)\, e^{p\cdot y} J(y)\,\d y \;+\; \int_{\{p\cdot y > 0\}\cap\{|y|>\rho_0/2\}} (p\cdot y)\, e^{p\cdot y} J(y)\,\d y. \]
Then, in order to obtain a more explicit bound from Theorem 3.1, we shall compare with a symmetric kernel, using the radius of the support of $J$.
Theorem 4.2. Let J be a compactly supported kernel satisfying (2.1). We denote by ρ the size of the support of J: ρ = inf{r > 0 : supp (J) ⊂ B r } .
Then the following "$R\ln R$"-type estimate holds for any $\theta \in (0,1)$ and $T > 0$, as $R \to \infty$. Notice that $J$ can be asymmetric and have a singularity at the origin.
Proof. We first use Lemma 2.6 to reduce our estimate to the case of symmetric, compactly supported kernels. More precisely, if $J_*$ is a symmetric kernel such that $J_* \succ J$ and $\mathrm{supp}(J_*) = \bar B_\rho$, then for $|p|$ big enough, $L_* \le L$, where $L_*$ is the Lagrangian associated to $J_*$. In view of Theorem 3.1, this implies a corresponding bound for $u - u_R$. Now we assume that $|x| < \theta R$, so that in this region the relevant distances are at least $(1-\theta)R \to \infty$, and we shall use the behaviour at infinity of $L_*$. Lemma 2.5 allows us to wipe out the possible singular part near the origin, as well as the differential terms of the Hamiltonian.
Since $J_*$ is symmetric, so is $L^{\mathrm{ess}}_*$. Then we use the results of [9, Lemma 4.1 and Corollary 4.2] applied to $L_*$, which is symmetric and associated to a nonsingular kernel, to conclude.

Intermediate kernels
We now consider a general kernel $J$ satisfying (2.1), positive everywhere in $\R^N$, so that we can always write \[ J(y) = e^{-|y|\,\omega(y)}, \qquad\text{with } \omega(y) = \frac{-\ln J(y)}{|y|}. \]
We will make some further assumptions in this section: \[ y \mapsto |y|\,\omega(y) \ \text{is superlinear and convex}, \qquad \exists\,\eta \in (0,1],\ \ \liminf_{|y|\to+\infty} \frac{y\cdot D\omega(y)}{|y|\,|D\omega(y)|} \;\ge\; \eta. \tag{5.1} \] Let us comment on these hypotheses: (i) The regularity assumption on $J$ (which implies the same regularity for $\omega$) is not crucial, since by comparison we can deal with less regular kernels, using the results of Section 2. (ii) The convexity of $|y|\omega(y)$ is actually only required for large $|y|$, for the same reason: we only care about the tail of $J$. (iii) The case of compactly supported kernels, which would correspond to $\omega(y) = +\infty$ outside a ball, is treated in Section 4. On the other hand, the case of critical kernels treated in Section 6 corresponds to $\lim \omega(y) = \ell < \infty$. So the assumption of superlinearity sits in between: $\lim \omega(y) = \infty$. This is why we speak of intermediate kernels here. (iv) The superlinearity assumption implies that $J$ automatically satisfies (2.4), since indeed, for any $\beta > 0$, we have $\omega(y) > \beta$ for $|y|$ large enough. (v) The convexity and superlinearity assumptions altogether allow us to define the Legendre transform $K(\cdot)$ of $|y|\omega(y)$, which will also be superlinear and convex, see [21]: \[ K(p) := \sup_{y\in\R^N}\big\{p\cdot y - |y|\,\omega(y)\big\}. \tag{5.2} \] This function $K(\cdot)$ will play a big role in estimating the rate of convergence. Notice that it is also the Legendre transform of $\ln(1/J)$. (vi) The "angle" condition on $D\omega$ says that the gradient cannot take a purely tangential position. This is a very weak assumption which, in this form, allows us to derive a minimal behaviour for non-symmetric kernels.
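A toy numerical check of (5.2) (our own example; the Gaussian-type kernel below is not from the paper, but it does fall in the intermediate class, since $|y|\omega(y) = y^2$ is superlinear and convex):

```python
import numpy as np

# For J(y) = e^{-y^2} one has w(y) = |y|, so |y| w(y) = y^2 and the
# Legendre transform (5.2) is explicit: K(p) = sup_y { p y - y^2 } = p^2/4.
y = np.linspace(-50.0, 50.0, 200001)

def K(p):
    # brute-force sup over a fine grid
    return np.max(p * y - y**2)
```

In this example the sup is attained at $y_0(p) = p/2$, which goes to infinity with $|p|$; this is the behaviour of the maximum point $y_0(p)$ exploited in the uniqueness argument of the next subsection.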

Properties of K
Thanks to Lemma 2.3, we know that L(q) ∼ p_0(q) · DH(p_0(q)) as |q| → ∞, where p_0(q) is such that L(q) = p_0 · q − H(p_0). Thus, a main step consists in finding a lower bound for p · DH(p). Here is where K(p) plays an important role: roughly speaking, we will see that

" p · DH(p) = ∫_{R^N} p · y e^{p·y−|y|ω(y)} dy ≈ e^{K(p)} ".
Hence a detailed study of the properties of K is needed. To this aim, let y 0 (p) be the point where the sup in (5.2) is attained.
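The Legendre transform (5.2) and its maximizer y_0(p) are easy to approximate on a grid, which can be handy for checking the explicit computations of Section 7. Below is a minimal 1-D sketch; the profile |y|ω(y) = y² (Gaussian-type tail) is a model choice for which K(p) = p²/4 and y_0(p) = p/2 in closed form.

```python
import numpy as np

def legendre_1d(phi, p, ys):
    """Approximate K(p) = sup_y (p*y - phi(y)) and its maximizer y0(p)."""
    vals = p * ys - phi(ys)
    i = int(np.argmax(vals))
    return vals[i], ys[i]

ys = np.linspace(-50.0, 50.0, 2_000_001)   # grid step 5e-5
phi = lambda y: y ** 2                      # model profile |y|*omega(y) = y^2
for p in [2.0, 6.0]:
    K, y0 = legendre_1d(phi, p, ys)
    print(p, K, y0)    # closed form here: K(p) = p^2/4, y0(p) = p/2
```

The same grid search works for any convex superlinear profile, with the caveat that the grid must be wide enough to contain y_0(p).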
We shall now prove that for |p| large enough, the maximum point y_0(p) is unique. Let ϕ_p(y) := p · y − |y|ω(y). Since y ↦ |y|ω(y) is convex for large |y|, say |y| > C, ϕ_p is concave for |y| > C, independently of p: indeed, this comes from the fact that D²ϕ_p(y) = −D²(|y|ω(y)). Thus we take M > 0 large enough so that for any |p| > M, any maximum point y_0(p) satisfies |y_0(p)| > C and enters the region where ϕ_p is concave. Then the only case in which there may exist several maximum points is when ϕ_p is constant on some open set. But since ω(y) → +∞ as |y| → ∞, this cannot happen for large y. Hence if |p| is large, the maximum is attained at a unique point y_0(p).
Let us now make some definitions precise.

Definition 5.1. The angle between two nonzero vectors a, b ∈ R^N is defined as angle(a, b) = arccos( a · b / (|a| |b|) ). Moreover, given a vector a ∈ R^N and an angle α ∈ [0, π/2), we define the positive cone C_+ in the direction a with aperture α as C_+(a, α) = {0} ∪ { y ≠ 0 : angle(y, a) ≤ α }. Accordingly, we define the negative cone as C_−(a, α) = {0} ∪ { y ≠ 0 : angle(y, −a) ≤ α }. Notice that by considering only apertures 0 ≤ α < π/2 (which is enough for our purpose here), we make sure that C_−(a, α) ∩ C_+(a, α) = {0}.

Then we have the following lemma: for |p| big enough,

∀ y ∈ A(p), p · y − |y|ω(y) ≥ p · y_0 − |y_0|ω(y_0) − 1,

where A(p) is a suitable region near y_0(p), described in the proof below.
To end the proof of the lemma, we only need to mention that A(p) is given by the intersection of the ball B(y_0, 1/|p|) with a cone with vertex y_0, of aperture π/2 − arccos(η/2). Hence its volume is indeed given by c(N, η)/|p|^N for some constant c(N, η) > 0.
We now assume that J_*(y) = e^{−|y|ω_*(y)} is symmetric, in order to get a more precise estimate. We denote by L_* and K_* the associated Lagrangian and the Legendre transform of |y|ω_*(y). Since in this case K_* is symmetric (and still superlinear), we know that for |p| large enough the inverse image K_*^{-1}(|z|) is well defined and is a sphere: |p| is constant for p ∈ K_*^{-1}(|z|).
Using again that K is superlinear and symmetric, for any ε ∈ (0, 1) it is enough to take |q| big to get the corresponding bound. From this we get an estimate valid for any ε ∈ (0, 1), and we pass to the limit as ε → 0 to get the result for L^ess(q) ∼ p_0 · DH^ess(p_0). Finally, we invoke Lemma 2.5 to conclude that the result holds for L(q).

Conclusion
We are now ready to prove one of the main results of this paper:

Theorem 5.8. Let J(y) = e^{−|y|ω(y)} be a kernel satisfying (2.1) and (5.1). Let us consider a symmetric kernel J_*(y) = e^{−|y|ω_*(y)} such that J ≤ J_*, and denote by K_* the associated Legendre transform of |y|ω_*(y). Then the following estimate holds as R → ∞: for any θ ∈ (0, 1),

sup_{|x| ≤ θR} |u − u_R|(x, t) ≤ exp( −(1 − θ) R K_*^{-1}( (1 − θ)R / t ) (1 + o(1)) ).
Proof. The proof essentially follows the one in the case of compactly supported kernels: we first use Lemma 2.6 to reduce our estimate to the case of a symmetric kernel J_* ≥ J, for which L_*(|x|) = L_*(x) is the symmetric Lagrangian associated with J_*. Now we assume that |x| < θR, so that in this set dist(x; ∂B_R) ≥ (1 − θ)R → ∞, and we use the behaviour at infinity of L_* given by (5.6) to get the result.
Corollary 5.9. In particular, if x remains in a bounded set B_M we can take any θ ∈ (0, 1), and if t also remains bounded we obtain a simpler estimate:

sup_{B_M} |u − u_R|(x, t) ≤ exp( −R K_*^{-1}(R) (1 + o(1)) ).
Several remarks are in order: (i) The authors gave in [9] some explicit estimates, which here amounts more or less to making the function K(·) explicit. We refer to Section 7.1 for a list of known explicit behaviours. (ii) Even if we are able to prove a lower estimate for asymmetric kernels (Lemma 5.6) using K(p), which is not symmetric in general, we face a problem: knowing the behaviour of p · DH(p) is not enough to know the behaviour of each of the factors, unless we make sure that they point more or less in the same direction. And this is not clear unless the kernel is "almost" symmetric, because of the exponential behaviour. This is why we compare with the smallest symmetric kernel above J, in order to have a more explicit behaviour. (iii) Even if we were able to derive a bound taking into account the asymmetry, we would then have to study the min in Theorem 3.1, which is again not obvious unless we have an almost symmetric Lagrangian. (iv) However, see Section 7.2 for the 1-D case, where we can deal with asymmetric kernels since the regions {x > 0} and {x < 0} are clearly separated.

Critical kernels
We now assume that J is symmetric and that lim_{|y|→∞} ω(y) = ℓ < ∞. Hence J satisfies (2.5) with β_0 = ℓ. We want to show that the estimate remains valid even if H is not finite everywhere. To this aim, we have to study the Hamilton-Jacobi equation more carefully. In this case, the domain of definition of the Hamiltonian H is exactly dom(H) = B_ℓ, that is, H = +∞ outside B_ℓ. For simplicity, we will first assume that ℓ = β_0 = 1, the adaptations for other values being straightforward. Then we shall give the general result in Theorem 6.7.
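The statement dom(H) = B_ℓ can be checked numerically on a model example. The sketch below takes the 1-D critical kernel J(y) = e^{−|y|} (so ℓ = 1) and the drift-free, nonsingular Hamiltonian H(p) = ∫(e^{py} − 1)J(y) dy, for which H(p) = 2/(1 − p²) − 2 when |p| < 1; truncated integrals blow up as soon as |p| > 1. The quadrature parameters are arbitrary choices of this sketch.

```python
import numpy as np

# Critical kernel in 1-D: J(y) = exp(-|y|), i.e. l = beta_0 = 1.  For the
# drift-free, nonsingular Hamiltonian H(p) = int (e^{py} - 1) J(y) dy we
# have H(p) = 2/(1 - p^2) - 2 for |p| < 1, and H(p) = +infinity otherwise.
def H(p, ymax=200.0, n=400_001):
    ys = np.linspace(-ymax, ymax, n)
    f = (np.exp(p * ys) - 1.0) * np.exp(-np.abs(ys))
    h = ys[1] - ys[0]
    return h * (f.sum() - 0.5 * (f[0] + f[-1]))   # trapezoid rule

print(H(0.9), 2.0 / (1.0 - 0.81) - 2.0)   # inside dom(H): finite, they agree

# outside dom(H) the truncated integrals diverge as ymax grows:
for ymax in [50.0, 100.0, 200.0]:
    print(ymax, H(1.1, ymax=ymax))
```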

Hamilton-Jacobi equation with nonfinite hamiltonian
We study the equation u_t + H(Du) = 0, posed in the cylinder Q := B_1 × (0, ∞) (although most of the results of this section would hold for more general cylinders). Here we assume that the Hamiltonian H is infinite in the complement of B_1, which is the main difficulty. Following [3], we begin by constructing a new equation which is equivalent in the viscosity sense to u_t + H(Du) = 0; its main interest is that it allows us to prove comparison and to analyze the initial trace of solutions.
On the parabolic boundary we impose a continuous boundary condition f in the viscosity sense. More precisely, we consider the following problem: (6.1)

Definition 6.1. Given f ∈ C(∂_P Q), we say that an upper semi-continuous function u is a viscosity subsolution of (6.1) if for any smooth function ϕ such that u − ϕ reaches a maximum at (x_0, t_0) we have the corresponding inequality. The same definition holds with reversed inequalities (and min instead of max) for a lower semi-continuous viscosity supersolution. And finally:

Definition 6.2. A locally bounded function u : Q → R is a viscosity solution of (6.1) if its upper semi-continuous envelope is a subsolution and its lower semi-continuous envelope is a supersolution of (6.1).
Let us mention that in the case when H is finite everywhere, solutions take on the initial data in a classical way. But since here some data may not be compatible with the fact that dom(H) = B_1, a boundary/initial layer appears, and this is precisely the phenomenon we are facing. In order to understand this layer, we first need to reinterpret the equation with a new Hamiltonian:

Proposition 6.1. Subsolutions and supersolutions of (6.1) are also subsolutions and supersolutions (in the viscosity sense) of equation (6.2), with the same data on the parabolic boundary.
Proof. Let u be a viscosity subsolution and consider a smooth test function ϕ such that u − ϕ has a maximum at (x_0, t_0). We assume for simplicity that (x_0, t_0) ∈ Q (the argument being similar if it is a boundary point). Then, since by definition we necessarily have H(Dϕ) < +∞, it follows that |Dϕ| ≤ 1, and thus u also satisfies (in the viscosity sense) the inequality of (6.2). Now if v is a supersolution and ϕ is such that v − ϕ has a minimum at (x_0, t_0), then the reverse inequality holds, which implies that v is a supersolution of (6.2).

We then have a comparison result: if u is a subsolution and v a supersolution of (6.2) with u ≤ v on the parabolic boundary ∂_P Q, then u ≤ v.
We fix µ ∈ (0, 1) and T > 0, and consider a point (x_0, y_0, t_0, s_0) where Φ reaches its maximum. We assume that it is an interior point; otherwise, using the boundary values, one obtains immediately Φ ≤ 0 in B_1^2 × (0, T)^2, which is what we want. Fixing one variable, since (x, t) ↦ Φ(x, y_0, t, s_0) reaches its maximum at (x_0, t_0), one may consider the corresponding test function for u at (x_0, t_0); if we denote by p the associated gradient, this yields (6.3). On the other hand, for v we use at (y_0, s_0) the analogous test function, which leads to

max { q + H(p) ; |p| − 1 } ≥ 0 . (6.4)
If we assume that |p| ≥ 1, then µ^{-1}|p| ≥ µ^{-1} > 1, which is impossible from (6.3). So both |p| and µ^{-1}|p| are less than 1, and then the proof follows standard arguments of viscosity solutions: we can combine (6.3) and (6.4), getting rid of the max, which gives (after multiplying the first inequality by µ) the inequality (6.5). We claim that h(µ) := µH(µ^{-1}p) − H(p) ≥ 0 for any p ∈ R^N and µ ∈ (0, 1), which leads to a contradiction with (6.5), so that an interior maximum of Φ is impossible. Hence Φ ≤ 0 in (B_1)^2 × (0, T)^2, and since β, ε, C, T, µ are arbitrary, we finally conclude that u ≤ v in Q. To end the proof, let us check the claim: using the convexity of H and the fact that h(1) = 0, we see that h ≥ 0 for µ ∈ (0, 1).

We take a test function ϕ(x, t) = C_ε t + |x − x_0|²/ε² such that the required inequality holds at (x_0, 0). This is always possible if ε is small enough, and this implies u(0) ≤ f. Thus: now we consider a supersolution u and take ϕ(x) such that f − ϕ has a minimum at x_0.
Then, for any C > 0, the function ψ(x, t) := ϕ(x) − Ct is an admissible test function at t = 0, x = x_0. Using ψ as a test function, we obtain the corresponding viscosity inequality. For C big enough we have −C + H(Dϕ(0)) < 0, so that there remain two possibilities. So if u is a solution, both inequalities give the equality. Consequently, using (6.2), we obtain that the upper and lower relaxed limits coincide, Ī^A = I^A, so that as R → +∞ the whole sequence I_R^A converges to the unique solution I^A of the problem. Then, as A → +∞, I^A → I, which satisfies the equation with I(0, x) = dist(x; ∂B_1). Now we have to identify the limit I, and to do so we study some properties of this specific Lagrangian.

Lemma 6.5. The Lagrangian L satisfies the following properties: |DL(q)| < 1 and L(q) ∼ |q| as |q| → ∞.
Proof. Since by definition L(q) = sup_p { p · q − H(p) } and H(p) → +∞ as |p| → 1, the sup is attained at some p_0(q) with |p_0(q)| < 1. On the other hand, a simple computation shows that DL(q) = p_0(q), so that indeed |DL(q)| < 1 for any q. This also implies a first basic estimate: L(q) ≤ |q|. To get the equivalence, we first bound H(p). Notice that the singular and differential parts of the Hamiltonian remain bounded in the set {|p| < 1}, as well as Df(p). Then, for any |p| < 1, we can bound H(p), and the sup defining L is attained for p_0 = p_0(q) satisfying the corresponding first-order equation. Thus, as |q| → ∞, necessarily |p_0| → 1, and since p_0 and q point in the same direction, L(q) ≥ |q|(1 + o(1)). Since we have seen that L(q) ≤ |q|, we conclude that L(q) ∼ |q| as |q| → ∞.
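The mechanism of the proof (the maximizer p_0(q) being pushed to the boundary of dom(H), which gives linear growth of L) can be observed numerically. The sketch below uses the model Hamiltonian H(p) = −ln(1 − p²), which is not the Hamiltonian of the paper but shares the relevant features: convex, H(0) = 0, finite exactly on (−1, 1), and blowing up at the boundary.

```python
import numpy as np

# Model Hamiltonian with dom(H) = (-1, 1): H(p) = -log(1 - p^2).  Convex,
# H(0) = 0 and H(p) -> +infinity as |p| -> 1 (an illustration only, not
# the Hamiltonian of the paper).
H = lambda p: -np.log(1.0 - p ** 2)

def L(q, n=2_000_001):
    """L(q) = sup_{|p|<1} (p*q - H(p)), approximated on a grid."""
    ps = np.linspace(-1.0 + 1e-9, 1.0 - 1e-9, n)
    return float(np.max(ps * q - H(ps)))

for q in [10.0, 100.0, 1000.0]:
    print(q, L(q) / q)   # the ratio L(q)/q climbs towards 1
```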
Since J is assumed to be symmetric, so are H and L and we write L(|x|) = L(x).
Proposition 6.6. The solution of (6.1) with initial data I(0, x) = dist(x; ∂B_1) and I = 0 on the boundary is given by the Lax-Oleinik formula.

Proof. Since |DL| < 1, we also have |DI| < 1, so that the compatibility condition is always fulfilled and the equation holds everywhere. Now we take a look at the initial data: since L(q) ∼ |q|, the initial data is recovered as t → 0, so that the Lax-Oleinik formula indeed gives a solution. Since the viscosity solution is unique, this ends the proof.
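A small numerical experiment with the Lax-Oleinik formula illustrates the proposition. We take the model Lagrangian L(q) = |q| (the paper only gives L(q) ∼ |q| asymptotically, so this is a simplification) and the initial data I_0(y) = 1 − |y|; with these choices the infimum can be evaluated on a grid and turns out to be stationary in time.

```python
import numpy as np

# Lax-Oleinik sketch in 1-D:  I(x, t) = inf_y { I0(y) + t * L((x - y)/t) }
# with the model Lagrangian L(q) = |q| and initial data
# I0(y) = dist(y, boundary of B_1) = 1 - |y| on [-1, 1].
ys = np.linspace(-1.0, 1.0, 200_001)
I0 = 1.0 - np.abs(ys)

def I(x, t):
    return float(np.min(I0 + t * np.abs((x - ys) / t)))

# with these choices the formula is stationary: I(x, t) = 1 - |x|
for x in [0.0, 0.3, -0.7]:
    print(x, I(x, 0.5), I(x, 2.0))
```

The stationarity is special to the exactly linear model Lagrangian; with the true L of the paper one only recovers this behaviour for large arguments.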
The reader will easily check that if ℓ = β_0 ≠ 1, all the results of this section remain valid, and then L(q) ∼ β_0|q| as |q| → ∞. Moreover, Lemma 2.5 and Lemma 2.6 are also valid in the present case since |p| remains bounded: as we have seen, H(p) = H^ess(p) + O(1) as |p| → β_0.
Hence we may write down a more general result for possibly non-symmetric and singular kernels:

Theorem 6.7. Let J be a kernel satisfying (2.1) and (2.5). In particular, J may be asymmetric and have a singularity at the origin. Then for any θ ∈ (0, 1) and T > 0, we have the following estimate as R → ∞:

sup_{|x| ≤ θR} |u − u_R|(x, T) ≤ exp( −β_0 (1 − θ) R (1 + o(1)) ).

Proof. We skip the details since the proof is the same as for Proposition 4.2: we first reduce the estimate to a symmetric kernel by comparison, putting a symmetric kernel above J with the same β_0 in (2.5), and then we wipe out the possible singularities and differential terms. The proof is actually even simpler, since those terms remain bounded in the set {|p| < β_0}; hence only the exponential part of the Hamiltonian plays a role in the estimate.

Remark 6.8. As β_0 → 0, the estimate deteriorates. Indeed, this means that the kernel tends to decay more slowly than an exponential, and we face a problem of fat tails (like a power decay), which this method cannot handle.

Explicit bounds
In some particular cases we already gave concrete estimates in [9], by studying directly the Hamiltonian H. Computing the function K^{-1}, we recover these estimates here in the form of bounds on sup(u − u_R). Table 7.1 collects some known asymptotic behaviours of K^{-1} and sup(u − u_R). Notice that since K is superlinear (except in the critical case), K^{-1}(z) is defined for z > 0 big enough. Most of the calculations are straightforward; we only sketch the case J(y) = e^{−e^{|y|}}: in this case K(p) ∼ |p| ln |p| and, setting z = |p| ln |p|, we get ln z = ln |p| + ln ln |p| ∼ ln |p|. Hence z ∼ |p| ln z, which implies |p| ∼ z/ln z, that is, K^{-1}(z) ∼ z/ln z. Let us also mention that in the case J(y) = e^{−α|y|}, K^{-1}(z) ≡ α in the sense of graphs.
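The asymptotics K^{-1}(z) ∼ z/ln z sketched above can be checked numerically. For the model profile K(p) = p ln p − p (the 1-D Legendre transform of e^{y}, restricted to p, y > 0, corresponding to J(y) = e^{−e^{|y|}}), a bisection solver shows the ratio K^{-1}(z)/(z/ln z) slowly approaching 1, with the usual ln ln corrections.

```python
import math

# Model profile for J(y) = exp(-e^{|y|}): ln(1/J)(y) = e^{|y|}, whose 1-D
# Legendre transform for p > 0 is K(p) = sup_y (p*y - e^y) = p ln p - p.
K = lambda p: p * math.log(p) - p

def K_inv(z, lo=math.e, hi=1e30, iters=200):
    """Invert K on (e, 1e30) by geometric bisection (K is increasing there)."""
    for _ in range(iters):
        mid = math.sqrt(lo * hi)
        if K(mid) < z:
            lo = mid
        else:
            hi = mid
    return mid

for z in [1e4, 1e8, 1e16]:
    print(z, K_inv(z) / (z / math.log(z)))   # slowly decreases towards 1
```

The very slow convergence of the ratio is a reminder that these equivalents carry (1 + o(1)) factors that only vanish logarithmically.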
Remark 7.1. As we have seen, the presence of a singularity at the origin does not modify the behaviour of K^{-1}: for instance, if we multiply any of the previous kernels by |y|^{-2}, we obtain the same estimate.

Non-symmetric kernels in one space dimension
As we have seen, it is in general difficult to give an explicit behaviour of sup |u − u_R| for non-symmetric kernels, unless we compare with a symmetric one. But in the case N = 1, the regions {x > 0} and {x < 0} are clearly separated, so that more precise estimates can be given according to the tails of J at −∞ and +∞, which may be different. We shall illustrate this in an explicit case which combines the most extreme cases we can cover: on one side a compactly supported kernel, on the other an exponential decay.

Proof. A straightforward computation shows that the Hamiltonian is defined in the region {p > −1}. For q > 0, we compute the Lagrangian as follows: if q > 0, then pq > 0 only for p > 0, and we know the sup is nonnegative, so that it has to be attained for p > 0. With this remark, the estimate is just the same as in the case J(y) = ½ · 1_{{0 ≤ y ≤ 1}}. The same remark holds when q < 0: the sup is attained in the region −1 < p < 0, and the behaviour is given by the exponential decay of J.
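The two regimes of the Lagrangian can be observed numerically on a model asymmetric kernel (an assumption of this sketch, not necessarily the exact kernel of the proposition): J(y) = ½ on [0, 1] and J(y) = ½e^{y} for y < 0. Then H(p) = ½[(e^p − 1)/p − 1] + ½[1/(p + 1) − 1] on {p > −1}, and a grid search for L(q) = sup(pq − H(p)) exhibits superlinear growth for q > 0 and linear growth for q < 0.

```python
import numpy as np

# Model asymmetric kernel (a choice made for this sketch):
#   J(y) = 1/2 on [0, 1],   J(y) = (1/2) e^{y} for y < 0,
# so H(p) = 1/2[(e^p - 1)/p - 1] + 1/2[1/(p + 1) - 1] on dom(H) = (-1, oo).
def H(ps):
    right = 0.5 * (np.expm1(ps) / ps - 1.0)   # compactly supported side
    left = 0.5 * (1.0 / (ps + 1.0) - 1.0)     # exponential side
    return right + left

def L(q):
    """L(q) = sup_{p > -1} (p*q - H(p)) by grid search (p = 0 excluded)."""
    ps = np.concatenate([np.linspace(-1.0 + 1e-6, -1e-6, 200_000),
                         np.linspace(1e-6, 30.0, 600_000)])
    return float(np.max(ps * q - H(ps)))

print(L(1e4) / 1e4)    # q > 0: superlinear (q log q - type) growth
print(L(-1e4) / 1e4)   # q < 0: linear growth, ratio close to 1
```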
Then we are able to use Theorem 3.1 in a more precise way (we consider x fixed in a bounded domain for simplicity):

Proof. We just come back to the expression of I_∞. If 1 < x < M, then whether y = −R or y = +R, we have Rx − y → +∞ as R → ∞. Hence the min of L is attained for y = −R, and we recover the behaviour of L(q) for q = R(M + 1) → ∞, that is, an R ln R behaviour. On the other hand, if −M < x < −1, the min is attained for y = +R and we get the linear behaviour of L(q) as q → −∞. In Figure 3 we plot, for u_0 = 1, the approximations u_R for R = 10, 15, 20 and a non-symmetric J as in Proposition 7.2. This illustrates the different rates of convergence for x < 0 and x > 0.

KPP-type results
In this section, we briefly explain how our results allow us to treat a non-local version of the KPP problem (Kolmogorov-Petrovskii-Piskounov) associated with equation (1.1), with the classical monostable u(1 − u) term. For simplicity we shall explain this on the following equation: with a continuous initial data u_0(x) = g(x), 0 ≤ g ≤ 1. Existence of solutions with initial data 0 ≤ g ≤ 1 may be obtained for instance by Perron's method.
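For illustration, here is a minimal explicit-in-time scheme for the non-local KPP equation u_t = J∗u − u + u(1 − u), with the model kernel J = ½·1_{[−1,1]} and a bump initial datum (all discretization choices here are arbitrary assumptions of this sketch). One observes the two features used below: the solution stays in [0, 1], and the state u = 1 invades the region where u = 0.

```python
import numpy as np

# Explicit scheme for u_t = J*u - u + u(1 - u) with the model kernel
# J = (1/2) 1_{[-1,1]} (a choice made for this sketch only).
xs = np.linspace(-50.0, 50.0, 2001)
dx = xs[1] - xs[0]
Jd = np.full(int(round(2.0 / dx)) + 1, 0.5) * dx   # J sampled on [-1, 1]
Jd /= Jd.sum()                                     # enforce unit mass

u = (np.abs(xs) <= 2.0).astype(float)              # initial bump in [0, 1]
dt = 0.01
for _ in range(500):                               # integrate up to t = 5
    conv = np.convolve(u, Jd, mode='same')         # J*u (zero far field)
    u = u + dt * (conv - u + u * (1.0 - u))

# invariance of [0, 1] and invasion of the region where u = 0:
print(u.min(), u.max(), u[np.searchsorted(xs, 3.0)])
```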
The interested reader will find further references about this equation and traveling waves in the works of J. Coville and L. Dupaigne [16].
Passage to the limit in the left-hand side is done exactly as in Section 3, while handling the right-hand side follows classical KPP techniques: first notice that, by construction, I_0^{ε,A} ≥ 0. Then, if in the limit I_0^A(x, t) = lim_{ε→0} I_0^{ε,A} > 0, this means that u^ε → 0, so that clearly, as ε → 0, the right-hand side converges to 1.
For (ii), we set similarly the corresponding function I^{ε,A}. The first step consists in proving that I_1^A := lim sup_{ε→0} I^{ε,A} satisfies a suitable inequality; a full comparison argument cannot be derived here, since no information on the regularity of C is available. The final result then follows from a representation formula for I_1^A. We refer to [4] for the details. Then, coming back to the original variables, one can obtain explicit exponential convergence rates, which follow from our study of the asymptotic behaviour of the Lagrangian associated with H (as was done in Sections 4, 5 and 6).

Relation with optimal existence results
We would like to add a final comment on a related subject. As was said, by estimating sup_{B_R} |u − u_R|(x, T), we are measuring the total amount of processes that can escape the box B_R between t = 0 and t = T. Another way of understanding this is that we are somehow estimating the Green kernel associated with the equation.
Thus in [10], the authors, together with R. Ferreira, derive similar estimates but in the context of optimal initial data, which is also a way of measuring the behaviour of the kernel at infinity. Hence it is not so surprising that similar estimates appear, even though they are obtained through a totally different method.
For instance, it turns out that if J is compactly supported, the optimal class of existence for u_t = J ∗ u − u in R^N consists of initial data satisfying |u_0(x)| ≤ C e^{|x| ln |x|}; hence we recover an R ln R estimate for the Green function, typical of compactly supported kernels. We refer to [10] for more results in this direction.