Boundary Harnack inequality for Markov processes with jumps

We prove a boundary Harnack inequality for jump-type Markov processes on metric measure state spaces, under comparability estimates for the jump kernel and a Urysohn-type property of the domain of the generator of the process. The result holds for positive harmonic functions in arbitrary open sets. It applies, e.g., to many subordinate Brownian motions, Lévy processes with and without continuous part, stable-like and censored stable processes, jump processes on fractals, and rather general Schrödinger, drift and jump perturbations of such processes.


Introduction
The boundary Harnack inequality (BHI) is a statement about nonnegative functions which are harmonic on an open set and vanish outside the set near a part of its boundary. BHI asserts that the functions have a common boundary decay rate. The property requires proper assumptions on the set and the underlying Markov process, ones which secure relatively good communication from near the boundary to the center of the set. By this we mean that the process starting near the boundary is at least as likely to visit the center of the set as to creep far along the boundary before leaving the set.
BHI for harmonic functions of the Laplacian ∆ in Lipschitz domains was proved in 1977-78 by B. Dahlberg, A. Ancona and J.-M. Wu ([4, 38, 83]), after a pioneering attempt of J. Kemper ([57,58]). In 1989 R. Bass and K. Burdzy proposed an alternative probabilistic proof based on elementary properties of the Brownian motion ( [13]). The resulting 'box method' was then applied to more general domains, including Hölder domains of order r > 1/2, and to more general second order elliptic operators ( [14,15]). BHI trivially fails for disconnected sets, and counterexamples for Hölder domains with r<1/2 are given in [15]. In 2001-09, H. Aikawa studied BHI for classical harmonic functions in connection to the Carleson estimate and under exterior capacity conditions ( [1,2,3]).
Moving on to nonlocal operators and jump-type Markov processes, in 1997 K. Bogdan proved BHI for the fractional Laplacian ∆ α/2 (and the isotropic α-stable Lévy process) for 0 < α < 2 and Lipschitz sets ( [19]). In 1999 R. Song and J.-M. Wu extended the results to the so-called fat sets ( [75]), and in 2007 K. Bogdan, T. Kulczycki and M. Kwaśnicki proved BHI for ∆ α/2 in arbitrary, in particular disconnected, open sets ( [26]). In 2008 P. Kim, R. Song and Z. Vondraček proved BHI for subordinate Brownian motions in fat sets ( [63]) and in 2011 extended it to a general class of isotropic Lévy processes and arbitrary domains ( [65]). Quite recently, BHI for ∆ + ∆ α/2 was established by Z.-Q. Chen, P. Kim, R. Song and Z. Vondraček [32]. We also like to mention BHI for censored processes [21,44] by K. Bogdan, K. Burdzy, Z.-Q. Chen and Q. Guan, and fractal jump processes [55,77] by K. Kaleta, M. Kwaśnicki and A. Stós. Generally speaking, BHI is more a topological issue for diffusion processes, and more a measure-theoretic issue for jump-type Markov processes, which may transport from near the boundary to the center of the set by direct jumps. However, [19,26] show in a special setting that such jumps determine the asymptotics of harmonic functions only at those boundary points where the set is rather thin, while at other boundary points the main contribution to the asymptotics comes from gradual 'excursions' away from the boundary.
We recall that BHI in particular applies to and may yield an approximate factorization of the Green function. This line of research was completed for Lipschitz domains in 2000 by K. Bogdan ([20]) for ∆ and in 2002 by T. Jakubowski ([52]) for ∆ α/2 . It is now a well-established technique ( [47]) and extensions were proved, e.g., for subordinate Brownian motions by P. Kim, R. Song and Z. Vondraček ( [66]). We should note that so far the technique is typically restricted to Lipschitz or fat sets. Furthermore, for smooth sets, e.g. C 1,1 sets, the approximate factorization is usually more explicit. This is so because for smooth sets the decay rate in BHI can often be explicitly expressed in terms of the distance to the boundary of the set. The first complete results in this direction were given for ∆ in 1986 by Z. Zhao ([84]) and for ∆ α/2 in 1997 by T. Kulczycki ([67]) and in 1998 by Z.-Q. Chen and R. Song ([36]). The estimates are now extended to subordinate Brownian motions, and the renewal function of the subordinator is used in the corresponding formulations ( [66]). Accordingly, the Green function of smooth sets enjoys approximate factorization for rather general isotropic Lévy processes ( [29,66]). We expect further progress in this direction with applications to perturbation theory via the so-called 3G theorems, and to nonlinear partial differential equations ( [25,47,70]). We should also mention estimates and approximate factorization of the Dirichlet heat kernels, which are intensively studied at present. The estimates depend on BHI ( [24]), and reflect the fundamental decay rate in BHI ( [31,46]).
BHI tends to self-improve and may lead to the existence of the boundary limits of ratios of nonnegative harmonic functions, thanks to oscillation reduction ( [13,19,26,54]). The oscillation reduction technique is rather straightforward for local operators. It is more challenging for non-local operators, as it involves subtraction of harmonic functions, which destroys global nonnegativity. The technique requires a certain scale invariance, or uniformity of BHI, and works, e.g., for ∆ in Lipschitz domains ( [13]) and for ∆ α/2 in arbitrary domains ( [26]). We should remark that Hölder continuity of harmonic functions is a similar phenomenon, related to the usual Harnack inequality, and that BHI extends the usual Harnack inequality if, e.g., constant functions are harmonic. Hölder continuity of harmonic functions is crucial in the theory of partial differential equations [6,16], and the existence of limits of ratios of nonnegative harmonic functions leads to the construction of the Martin kernel and to representation of nonnegative harmonic functions ( [5,26]).
The above summary indicates further directions of research resulting from our development. The main goal of this article is to study the following boundary Harnack inequality. In Section 2 we specify notation and assumptions which validate the estimate.
(BHI) Let $x_0 \in X$, $0 < r < R < R_0$, and let $D \subseteq B(x_0, R)$ be open. Suppose that nonnegative functions $f$, $g$ on $X$ are regular harmonic in $D$ with respect to the process $X_t$, and vanish in $B(x_0, R) \setminus D$. There is $c_{(1.1)} = c_{(1.1)}(x_0, r, R)$ such that
$$f(x)\, g(y) \le c_{(1.1)}\, f(y)\, g(x), \qquad x, y \in B(x_0, r). \tag{1.1}$$
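To see why (1.1) expresses a common boundary decay rate, it may help to rewrite it as a two-sided bound on ratios. The following is only a sketch, under the extra assumption (not part of (BHI)) that $g$ is strictly positive on $B(x_0, r)$:

```latex
% Assumption for this sketch only: g > 0 on B(x_0, r), so the ratios are well defined.
\[
  f(x)\,g(y) \le c_{(1.1)}\, f(y)\,g(x)
  \quad\Longleftrightarrow\quad
  \frac{f(x)}{g(x)} \le c_{(1.1)}\, \frac{f(y)}{g(y)},
  \qquad x, y \in B(x_0, r).
\]
% Exchanging the roles of x and y gives the matching lower bound:
\[
  c_{(1.1)}^{-1}\, \frac{f(y)}{g(y)}
  \le \frac{f(x)}{g(x)}
  \le c_{(1.1)}\, \frac{f(y)}{g(y)},
  \qquad x, y \in B(x_0, r).
\]
```

In this form the supremum of $f/g$ over $B(x_0, r)$ is at most $c_{(1.1)}$ times its infimum, so $f$ and $g$ decay at comparable rates near the part of the boundary of $D$ inside $B(x_0, r)$.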
Here $X_t$ is a Hunt process, having a metric measure space $X$ as the state space, and $R_0 \in (0, \infty]$ is a localization radius (discussed in Section 2). Also, a nonnegative function $f$ is said to be regular harmonic in $D$ with respect to $X_t$ if
$$f(x) = E_x f(X(\tau_D)), \qquad x \in X, \tag{1.2}$$
where $\tau_D$ is the time of the first exit of $X_t$ from $D$. To facilitate cross-referencing, in (1.1) and later on we let $c_{(i)}$ denote the constant in the displayed formula (i). By $c$ or $c_i$ we denote secondary (temporary) constants in a lemma or a section, and $c = c(a, \ldots, z)$, or simply $c(a, \ldots, z)$, means a constant $c$ that may be so chosen to depend only on $a, \ldots, z$.
Throughout the article, all constants are positive. The present work started with an attempt to obtain bounded kernels which reproduce harmonic functions. We were motivated by the so-called regularization of the Poisson kernel for ∆ α/2 ( [22], [26,Lemma 6]), which is crucial for the Carleson estimate and BHI for ∆ α/2 . In the present paper we construct kernels obtained by gradually stopping the Markov process with a specific multiplicative functional before the process approaches the boundary. The construction is the main technical ingredient of our work, and is presented in Section 4. The argument is intrinsically probabilistic and relies on delicate analysis on the path space. At the beginning of Section 4 the reader will also find a short informal presentation of the construction. Section 2 gives assumptions and auxiliary results. The boundary Harnack inequality (Theorem 3.5), and the so-called local supremum estimate (Theorem 3.4) are presented in Section 3, but the proof of Theorem 3.4 is deferred to Section 4. In Section 5 we verify in various settings the scale-invariance of BHI, discuss the relevance of our main assumptions from Section 2, and present many applications, including subordinate Brownian motions, Lévy processes with or without continuous part, stable-like and censored processes, Schrödinger, gradient and jump perturbations, processes on fractals and more.

Assumptions and Preliminaries
Let $(X, d, m)$ be a metric measure space such that all bounded closed sets are compact and $m$ has full support. Let $B(x, r) = \{y \in X : d(x, y) < r\}$, where $x \in X$ and $r > 0$. All sets, functions and measures considered in this paper are Borel. Let $R_0 \in (0, \infty]$ (the localization radius) be such that $X \setminus B(x, 2r) \ne \emptyset$ for all $x \in X$ and all $r < R_0$. Let $X \cup \{\partial\}$ be the one-point compactification of $X$ (if $X$ is compact, then we add $\partial$ as an isolated point). Without much mention we extend functions $f$ on $X$ to $X \cup \{\partial\}$ by letting $f(\partial) = 0$. In particular, we write $f \in C_0(X)$ if $f$ is a continuous real-valued function on $X \cup \{\partial\}$ and $f(\partial) = 0$. If furthermore $f$ has compact support in $X$, then we write $f \in C_c(X)$. For a kernel $k(x, dy)$ on $X$ ([39]) we let $kf(x) = \int f(y)\, k(x, dy)$, provided the integral makes sense, i.e., $f$ is (measurable and) either nonnegative or absolutely integrable. Similarly, for a kernel density function $k(x, y) \ge 0$, we let $k(x, E) = \int_E k(x, y)\, m(dy)$ and $k(E, y) = \int_E k(x, y)\, m(dx)$.
Let $(X_t, \zeta, \mathcal{M}_t, P_x)$ be a Hunt process with state space $X$ (see, e.g., [18, I.9] or [40, 3.23]). Here $X_t$ are the random variables, $\mathcal{M}_t$ is the usual right-continuous filtration, $P_x$ is the distribution of the process starting from $x \in X$, and $E_x$ is the corresponding expectation. The random variable $\zeta \in (0, \infty]$ is the lifetime of $X_t$, so that $X_t = \partial$ for $t \ge \zeta$. This should be kept in mind when interpreting (1.2) above, (2.1) below, etc. The transition operators of $X_t$ are defined by
$$T_t f(x) = E_x f(X_t), \qquad t \ge 0,\ x \in X, \tag{2.1}$$
whenever the expectation makes sense. We assume that the semigroup $T_t$ is Feller and strong Feller, i.e., for $t > 0$, $T_t$ maps bounded functions into continuous ones and $C_0(X)$ into $C_0(X)$. The Feller generator $A$ of $X_t$ is defined on the set $\mathcal{D}(A)$ of all those $f \in C_0(X)$ for which the limit
$$A f(x) = \lim_{t \searrow 0} \frac{T_t f(x) - f(x)}{t}$$
exists uniformly in $x \in X$. The $\alpha$-potential operator,
$$U_\alpha f(x) = E_x \int_0^\infty f(X_t)\, e^{-\alpha t}\, dt = \int_0^\infty e^{-\alpha t}\, T_t f(x)\, dt, \qquad \alpha \ge 0,\ x \in X,$$
is defined whenever the expectation makes sense. We let $U = U_0$, the potential operator. The kernels of $T_t$, $U_\alpha$ and $U$ are denoted by $T_t(x, dy)$, $U_\alpha(x, dy)$ and $U(x, dy)$, respectively.
We enforce a number of conditions, namely Assumptions A, B, C and D below. We start with a duality assumption, which builds on our discussion of X t .
Assumption A. There are Hunt processes X t and X̂ t which are dual with respect to the measure m (see [18, VI.1] or [37, 13.1]). The transition semigroups of X t and X̂ t are both Feller and strong Feller. Every semi-polar set of X t is polar.
In what follows, objects pertaining to $\hat{X}_t$ are distinguished in notation from those for $X_t$ by adding a hat over the corresponding symbol. For example, $\hat{T}_t$ and $\hat{U}_\alpha$ denote the transition and $\alpha$-potential operators of $\hat{X}_t$. The first sentence of Assumption A means that for all $\alpha > 0$, there are functions $U_\alpha(x, y) = \hat{U}_\alpha(y, x)$ such that
$$U_\alpha f(x) = \int_X U_\alpha(x, y)\, f(y)\, m(dy), \qquad \hat{U}_\alpha f(x) = \int_X \hat{U}_\alpha(x, y)\, f(y)\, m(dy)$$
for all $f \ge 0$ and $x \in X$, and such that $x \mapsto U_\alpha(x, y)$ is $\alpha$-excessive with respect to $T_t$, and $y \mapsto U_\alpha(x, y)$ is $\alpha$-excessive with respect to $\hat{T}_t$ (that is, $\alpha$-co-excessive). The $\alpha$-potential kernel $U_\alpha(x, y)$ is unique (see [37, Theorem 13.2] or remarks after [18, Proposition VI.1.3]).
The condition in Assumption A that semi-polar sets are polar is also known as Hunt's hypothesis (H). Most notably, it implies that the process $X_t$ never hits irregular points, see, e.g., [18, I.11 and II.3] or [37, Chapter 3]. The $\alpha$-potential kernel is non-increasing in $\alpha > 0$, and hence the potential kernel $U(x, y) = \lim_{\alpha \searrow 0} U_\alpha(x, y) \in [0, \infty]$ is well defined.
We consider an open set $D \subset X$ and the time of the first exit from $D$ for $X_t$ and $\hat{X}_t$,
$$\tau_D = \inf\{t \ge 0 : X_t \notin D\}, \qquad \hat{\tau}_D = \inf\{t \ge 0 : \hat{X}_t \notin D\}.$$
We define the processes killed at $\tau_D$,
$$X^D_t = \begin{cases} X_t & \text{if } t < \tau_D, \\ \partial & \text{if } t \ge \tau_D, \end{cases} \qquad \hat{X}^D_t = \begin{cases} \hat{X}_t & \text{if } t < \hat{\tau}_D, \\ \partial & \text{if } t \ge \hat{\tau}_D. \end{cases}$$
We let $T^D_t(x, dy)$ and $\hat{T}^D_t(x, dy)$ be their transition kernels. By [37, Remark 13.26], $X^D_t$ and $\hat{X}^D_t$ are dual processes with state space $D$. Indeed, for each $x \in D$, $P_x$-a.s. the process $X_t$ only hits regular points of $X \setminus D$ when it exits $D$. In the nomenclature of [37, 13.6], this means that the left-entrance time and the hitting time of $X \setminus D$ are equal $P_x$-a.s. for every $x \in D$. In particular, the potential kernel $G_D(x, y)$ of $X^D_t$ exists and is unique, although in general it may be infinite ([18, pp. 256-257]). $G_D(x, y)$ is called the Green function for $X_t$ on $D$, and it defines the Green operator $G_D$,
$$G_D f(x) = \int_X G_D(x, y)\, f(y)\, m(dy) = E_x \int_0^{\tau_D} f(X_t)\, dt.$$
Note that $U(x, y) = G_X(x, y)$. When $X_t$ is symmetric (self-dual) with respect to $m$, then Assumption A is equivalent to the existence of the $\alpha$-potential kernel $U_\alpha(x, y)$ for $X_t$, since then Hunt's hypothesis (H) is automatically satisfied, see [37].
The following Urysohn regularity hypothesis plays a crucial role in our paper, providing enough 'smooth' functions on X to approximate indicator functions of compact sets.
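Assumption B. There is a class $\mathcal{D} \subseteq \mathcal{D}(A) \cap \mathcal{D}(\hat{A})$ with the following property. If $K$ is compact, $D$ is open, and $K \subseteq D \subseteq X$, then there is $f \in \mathcal{D}$ such that $f(x) = 1$ for $x \in K$, $f(x) = 0$ for $x \in X \setminus D$, $0 \le f(x) \le 1$ for $x \in X$, and the boundary of the set $\{x : f(x) > 0\}$ has measure $m$ zero. We let
$$\varrho(K, D) = \inf_f \sup_{x \in X} \max\big(A f(x), \hat{A} f(x)\big), \tag{2.2}$$
where the infimum is taken over all such functions $f$.
Thus, nonnegative functions in $\mathcal{D}(A) \cap \mathcal{D}(\hat{A})$ separate the compact set $K$ from the closed set $X \setminus D$: there is a Urysohn (bump) function for $K$ and $X \setminus D$ in the domains. Since the supremum in (2.2) is finite for any $f \in \mathcal{D}$ and the infimum is taken over a nonempty set, $\varrho(K, D)$ is always finite.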
Note that constant functions are not in $\mathcal{D}(A)$ nor $\mathcal{D}(\hat{A})$ unless $X$ is compact. In the Euclidean case $X = \mathbb{R}^d$, $\mathcal{D}$ can often be taken as the class $C_c^\infty(\mathbb{R}^d)$ of compactly supported smooth functions. The existence of $\mathcal{D}$ is problematic if $X$ is more general. However, for the Sierpiński triangle and some other self-similar (p.c.f.) fractals, $\mathcal{D}$ can be constructed by using the concept of splines on fractals ([55,78]). Also, a class of smooth indicator functions was recently constructed in [71] for heat kernels satisfying upper sub-Gaussian estimates on $X$. Further discussion is given in Section 5 and Appendix A.
Here we note that Assumption B implies that the jumps of $X_t$ are subject to the following identity, which we call the Lévy system formula for $X_t$,
$$E_x \sum_{s \in [0, t]} f(s, X_{s-}, X_s) = E_x \int_0^t \int_X f(s, X_{s-}, z)\, \nu(X_{s-}, dz)\, ds. \tag{2.3}$$
Here $f : [0, \infty) \times X \times X \to [0, \infty]$ satisfies $f(s, y, y) = 0$ for all $s \ge 0$ and $y \in X$, the identity holds for all $x \in X$, and $\nu$ is a kernel on $X$ (satisfying $\nu(x, \{x\}) = 0$ for all $x \in X$), called the Lévy kernel of $X_t$, see [17,74,80]. For more general Markov processes, $ds$ in (2.3) is superseded by the differential of a perfect, continuous additive functional, and (2.3) defines $\nu(x, \cdot)$ only up to a set of zero potential, that is, for $m$-almost every $x \in X$. By inspecting the construction in [17,74], and using Assumption B, one proves in a similar way as in [12, Section 5] that the Lévy kernel $\nu$ satisfies the following: if $f \in \mathcal{D}(A)$ and $x \in X \setminus \operatorname{supp} f$, then $\nu f(x) = A f(x)$. By Assumption B, the mapping $f \mapsto \nu f(x)$ is a densely defined, nonnegative linear functional on $C_c(X \setminus \{x\})$, hence it corresponds to a nonnegative Radon measure $\nu(x, dy)$ on $X \setminus \{x\}$. As usual, we let $\nu(x, \{x\}) = 0$. The Lévy kernel $\hat{\nu}(y, dx)$ for $\hat{X}_t$ is defined in a similar manner. By duality, $\nu(x, dy)\, m(dx) = \hat{\nu}(y, dx)\, m(dy)$.
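The relation between the Lévy kernel and the generator away from the support of a function can be checked by hand in a finite-state toy model. The sketch below is an illustration only: the 4-state continuous-time chain, its rate matrix `RATES` and the test function `f` are assumptions made for this example, with `RATES[x][y]` playing the role of $\nu(x, \{y\})$. Since $f(x) = 0$ at the chosen point, $Af(x)$ equals the small-time limit of $T_t f(x)/t$, and both agree with $\nu f(x)$.

```python
# Toy model (an assumption for illustration): a 4-state continuous-time Markov chain
# whose jump rates RATES[x][y], y != x, play the role of the Levy kernel nu(x, {y}).
RATES = [
    [0.0, 2.0, 1.0, 0.5],
    [1.0, 0.0, 3.0, 0.2],
    [0.5, 0.5, 0.0, 1.0],
    [2.0, 0.1, 0.4, 0.0],
]


def generator_apply(rates, f):
    """(A f)(x) = sum_y rates[x][y] * (f(y) - f(x))."""
    n = len(rates)
    return [sum(rates[x][y] * (f[y] - f[x]) for y in range(n)) for x in range(n)]


def semigroup_apply(rates, f, t, terms=30):
    """T_t f = exp(tA) f, computed from the power series sum_k (tA)^k f / k!."""
    total = list(f)
    term = list(f)
    for k in range(1, terms):
        term = [t * v / k for v in generator_apply(rates, term)]
        total = [a + b for a, b in zip(total, term)]
    return total


f = [0.0, 1.0, 2.0, 3.0]          # nonnegative and vanishing at the point x = 0
x = 0
nu_f = sum(RATES[x][y] * f[y] for y in range(len(f)))   # nu f(x)
a_f = generator_apply(RATES, f)[x]                        # A f(x)
# T_t f(x) / t for decreasing t; the values approach nu f(x) = A f(x).
ratios = [semigroup_apply(RATES, f, t)[x] / t for t in (1e-1, 1e-2, 1e-3, 1e-4)]
```

The error of `ratios` is of order $t$, which is why only the last entries are close to `nu_f`.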
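As an application of (2.3) we consider the martingale
$$t \mapsto \sum_{s \in [0, t]} f(s, X_{s-}, X_s) - \int_0^t \int_X f(s, X_{s-}, z)\, \nu(X_{s-}, dz)\, ds,$$
where $f(s, y, z) = 1_A(s)\, 1_E(y)\, 1_F(z)$. We stop the martingale at $\tau_D$ and we see that
$$P_x\big(\tau_D \in dt,\ X(\tau_D-) \in dy,\ X(\tau_D) \in dz\big) = dt\, T^D_t(x, dy)\, \nu(y, dz) \tag{2.5}$$
on $(0, \infty) \times D \times (X \setminus D)$. A similar result was first proved in [51]. For this reason we refer to (2.5) as the Ikeda-Watanabe formula (see also (2.12) and (2.6) below). Integrating (2.5) against $dt$ and $dy$ we obtain
$$P_x\big(X(\tau_D-) \in D,\ X(\tau_D) \in E\big) = \int_D G_D(x, y)\, \nu(y, E)\, m(dy), \qquad E \subseteq X \setminus D. \tag{2.6}$$
For $x_0 \in X$ and $0 < r < R$, we consider the open and closed balls $B(x_0, r) = \{x \in X : d(x_0, x) < r\}$ and $\overline{B}(x_0, r) = \{x \in X : d(x_0, x) \le r\}$, and the annular regions $A(x_0, r, R) = \{x \in X : r < d(x_0, x) < R\}$ and $\overline{A}(x_0, r, R) = \{x \in X : r \le d(x_0, x) \le R\}$. Note that $\overline{B(x_0, r)}$, the closure of $B(x_0, r)$, may be a proper subset of $\overline{B}(x_0, r)$.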
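A discrete-time analogue may make the integrated Ikeda-Watanabe formula more concrete: for a pure-jump chain, the exit distribution from $D$ is the Green function of the killed chain integrated against the jump kernel. The sketch below is a toy model and not the setting of the paper; the state space $\{0, \ldots, 30\}$, the kernel `jump_kernel` with weights $|x - y|^{-1-\alpha}$, the set `D` and the starting point are all assumptions chosen for illustration.

```python
def jump_kernel(n, alpha):
    """Row-stochastic p[x][y] proportional to |x - y| ** -(1 + alpha) for y != x."""
    p = []
    for x in range(n):
        w = [0.0 if y == x else abs(x - y) ** -(1.0 + alpha) for y in range(n)]
        total = sum(w)
        p.append([v / total for v in w])
    return p


def green_row(p, domain, x, steps=600):
    """G_D(x, z): expected number of visits to z before the chain leaves D."""
    n = len(p)
    mu = [1.0 if z == x else 0.0 for z in range(n)]   # law of the killed chain
    green = [0.0] * n
    for _ in range(steps):
        for z in domain:
            green[z] += mu[z]
        # One step of the chain; mass landing outside D is discarded (killed).
        mu = [sum(mu[z] * p[z][y] for z in domain) if y in domain else 0.0
              for y in range(n)]
    return green


N, ALPHA = 31, 1.0
P = jump_kernel(N, ALPHA)
D = range(10, 21)
start = 12
G = green_row(P, D, start)

# Discrete analogue of the formula: P_x(X(tau_D) = y) = sum_{z in D} G_D(x, z) p(z, y).
exit_law = {y: sum(G[z] * P[z][y] for z in D) for y in range(N) if y not in D}
total_mass = sum(exit_law.values())       # should be 1: the chain leaves D almost surely
expected_exit_time = sum(G[z] for z in D)
```

That `total_mass` comes out as 1 is a genuine consistency check on `G`, since it relies on the identity $G_D - G_D P_D = I$ on $D$.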
Recall that R 0 denotes the localization radius of X. The following assumption is our main condition for the boundary Harnack inequality. It asserts a relative constancy of the density of the Lévy kernel. This is a natural condition, as seen in Example 5.14.
In particular, if 0 < R < R 0 and D ⊆ B(x 0 , R), then the Green function G D (x, y) exists (see the discussion following Assumption A), and for each x ∈ X it is finite for all y in X less a polar set. We need to assume slightly more. The following condition may be viewed as a weak version of Harnack's inequality.
Assumption D. If x 0 ∈ X, 0 < r < p < R < R 0 and B = B(x 0 , R), then (2.10) Assumptions A, B, C and D are tacitly assumed throughout the entire paper. We recall them explicitly only in the statements of BHI and local maximum estimate.
When saying that a statement holds for almost every point of X, we refer to the measure m. The following technical result is a simple generalization of [18,Proposition II.3.2].
Proposition 2.2. Suppose that Y t is a standard Markov process such that for every x ∈ X and α > 0, the α-potential kernel V α (x, dy) of Y t is absolutely continuous with respect to m(dy). Suppose that function f is excessive for the transition semigroup of Y t , and f is not identically infinite. If function g is continuous and f (x) ≤ g(x) for almost every for all x ∈ B(x 0 , r), as desired. (See e.g. [18,37] for the notion of fine topology and fine continuity of excessive functions.) If X t is transient, (2.10) often holds even when G B is replaced by G X = U. In the recurrent case, we can use estimates of U α , as follows.
The following formula obtained by Dynkin (see [40, formula (5.8)]) plays an important role. If $\tau$ is a Markov time, $E_x \tau < \infty$ and $f \in \mathcal{D}(A)$, then
$$E_x f(X_\tau) = f(x) + E_x \int_0^\tau A f(X_t)\, dt, \qquad x \in X. \tag{2.11}$$
If, moreover, $f$ is supported in $X \setminus D$ and $X_t \in D$ $P_x$-a.s. for $t < \tau$ and $x \in X$, then $f(x) = 0$ and $Af = \nu f$ on $D$, so that
$$E_x f(X_\tau) = E_x \int_0^\tau \nu f(X_t)\, dt, \qquad x \in D. \tag{2.12}$$
We note that (2.12) extends to nonnegative functions $f$ on $X$ which vanish on $D$. Indeed, both sides of (2.12) define nonnegative functionals of $f \in C_0(X \setminus D)$, and hence also nonnegative Radon measures on $X \setminus D$. By (2.12), the two functionals coincide on $\mathcal{D} \cap C_0(X \setminus D)$, and this set is dense in $C_0(X \setminus D)$ by the Urysohn regularity hypothesis. This proves that the corresponding measures are equal. We also note that one cannot in general relax the condition that $f = 0$ on $D$. Indeed, even if $m(\partial D) = 0$, $X_\tau$ may hit $\partial D$ with positive probability.
Recall that a function $f \ge 0$ on $X$ is called regular harmonic in an open set $D \subseteq X$ if $f(x) = E_x f(X(\tau_D))$ for all $x \in X$. Here a typical example is $x \mapsto E_x \int_0^\infty g(X_t)\, dt$ if $g \ge 0$ vanishes on $D$. By the strong Markov property we then have $f(x) = E_x f(X_\tau)$ for all stopping times $\tau \le \tau_D$. Accordingly, we call $f \ge 0$ regular subharmonic in $D$ (for $X_t$), if $f(x) \le E_x f(X_\tau)$ for all stopping times $\tau \le \tau_D$ and $x \in X$. Here a typical example is a regular harmonic function raised to a power $p \ge 1$. We recall that $f \ge 0$ is called harmonic in $D$ if $f(x) = E_x f(X(\tau_U))$ for all open and bounded $U$ such that $\overline{U} \subseteq D$, and all $x \in U$. This condition is satisfied, e.g., by the Green function $G_D(\cdot, y)$ in $D \setminus \{y\}$, and it is weaker than regular harmonicity. In this work, however, only the notion of regular harmonicity is used. For further discussion, we refer to [35,48,40,81].
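The definitions above can be tested on a finite toy chain, where regular harmonic functions are solutions of a linear system. The sketch below is an illustration under assumptions of our own choosing (discrete time, state space $\{0, \ldots, 30\}$, jump weights $|x - y|^{-1-\alpha}$, the sets `D` and `U`, and the boundary data `g`); none of these come from the paper. It computes $f(x) = E_x g(X(\tau_D))$, checks that $f(x) = E_x f(X(\tau_U))$ for a smaller set $U \subseteq D$, and checks the one-step subharmonicity of $f^2$.

```python
def jump_kernel(n, alpha):
    """Row-stochastic p[x][y] proportional to |x - y| ** -(1 + alpha) for y != x."""
    p = []
    for x in range(n):
        w = [0.0 if y == x else abs(x - y) ** -(1.0 + alpha) for y in range(n)]
        total = sum(w)
        p.append([v / total for v in w])
    return p


def solve(a, b):
    """Solve the linear system a z = b by Gauss-Jordan elimination with pivoting."""
    n = len(a)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        scale = m[col][col]
        m[col] = [v / scale for v in m[col]]
        for r in range(n):
            if r != col:
                factor = m[r][col]
                m[r] = [vr - factor * vc for vr, vc in zip(m[r], m[col])]
    return [row[n] for row in m]


def harmonic_extension(p, domain, boundary):
    """Return f with f = boundary off `domain` and f(x) = sum_y p[x][y] f(y) on it."""
    n = len(p)
    dom = sorted(domain)
    a = [[(1.0 if x == y else 0.0) - p[x][y] for y in dom] for x in dom]
    b = [sum(p[x][y] * boundary[y] for y in range(n) if y not in domain) for x in dom]
    f = list(boundary)
    for x, value in zip(dom, solve(a, b)):
        f[x] = value
    return f


N, ALPHA = 31, 1.0
P = jump_kernel(N, ALPHA)
D = set(range(8, 23))
g = [1.0 if y >= 26 else 0.0 for y in range(N)]    # data outside D
f = harmonic_extension(P, D, g)                    # f(x) = E_x g(X(tau_D))

U = set(range(12, 19))                             # U inside D, so tau_U <= tau_D
f_via_U = harmonic_extension(P, U, f)              # E_x f(X(tau_U))
gap = max(abs(u - v) for u, v in zip(f, f_via_U))

# One-step check that f ** 2 is subharmonic in D (Jensen's inequality).
sub_gap = min(sum(P[x][y] * f[y] ** 2 for y in range(N)) - f[x] ** 2 for x in D)
```

Here `gap` measures the failure of $f(x) = E_x f(X(\tau_U))$ and should vanish up to rounding, while `sub_gap` should be nonnegative up to rounding.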

Boundary Harnack inequality
Recall that Assumptions A, B, C and D are in force throughout the entire paper. Some results, however, hold in greater generality. For example, the following Lemma 3.1 relies solely on Assumption B and (2.9), and it remains true also when X t is a diffusion process. Also, Lemma 3.2 and Corollary 3.3 require Assumptions B and C but not A or D.
Lemma 3.1. If x 0 ∈ X and 0 < r < R < R̃ < ∞, then for all D ⊆ B(x 0 , R) we have
Proof. We fix an auxiliary number r̃ > R̃ and x ∈ B(x 0 , r). Let f ∈ D be a bump function from Assumption B for the compact set Ā(x 0 , R, R̃) and the open set A(x 0 , r, r̃).
We will now clarify the relation between BHI and local supremum estimate.
Lemma 3.2. The following conditions are equivalent: and if (b) holds, then we may let Denote the terms on the right hand side by I and J, respectively. By (3.1) and (3.2), in the upper bound and 1/c (2.7) in the lower bound.
We like to remark that BHI boils down to the approximate factorization (3.3 (2.6). However, ν(z, E) in (2.6) is quite singular and much larger than ν(x 0 , E) if both z and E are close to ∂B(x 0 , R). Our main task is to prove that the contribution to (2.6) from such points z is compensated by the relatively small time spent there by X D t when starting at x ∈ D. In fact, we wish to control (2.6) by an integral free from singularities (i.e. (3.2)), if x and E are not too close.
The main technical result of the paper is the following local supremum estimate for subharmonic functions, which is of independent interest. The result is proved in Section 4.
Theorem 3.4. Suppose that Assumptions A, B, C and D hold true. Let x 0 ∈ X and 0 < r < q < R < R 0 , where R 0 is the localization radius from Assumptions C and D. Let function f be nonnegative on X and regular subharmonic with respect to X t in B(x 0 , R). Then where Theorem 3.4 (to be proved in the next section) and Corollary 3.3 lead to BHI. We note that no regularity of the open set D is assumed.
Remark 3.6. (BHI) is said to be scale-invariant if c (1.1) may be so chosen to depend on r and R only through the ratio r/R. In some applications, the property plays a crucial role, see, e.g., [14,26]. If X t admits stable-like scaling, then c (1.1) given by (3.10) is scale-invariant indeed, as explained in Section 5 (see Theorem 5.4).
Remark 3.7. The constant c (1.1) in Theorem 3.5 depends only on basic characteristics of X t . Accordingly, in Section 5 it is shown that BHI is stable under small perturbations.
Remark 3.8. BHI applies in particular to hitting probabilities:
Remark 3.9. BHI implies the usual Harnack inequality if, e.g., constants are harmonic.
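The last remark can be made explicit as follows. This is a sketch under the assumption, stated here and not verified in the text, that the constant function $1$ is regular harmonic in $D = B(x_0, R)$ with respect to $X_t$.

```latex
% Sketch: take D = B(x_0, R) in (BHI), so that B(x_0, R) \setminus D is empty and the
% vanishing condition is void. Assumption: g \equiv 1 is regular harmonic in D.
\[
  f(x) = f(x)\,g(y) \le c_{(1.1)}\, f(y)\,g(x) = c_{(1.1)}\, f(y),
  \qquad x, y \in B(x_0, r),
\]
% which is the usual Harnack inequality for nonnegative f regular harmonic in B(x_0, R):
\[
  \sup_{B(x_0, r)} f \le c_{(1.1)} \inf_{B(x_0, r)} f .
\]
```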
The approach to BHI via approximate factorization was applied to isotropic stable processes in [26], to stable-like subordinate diffusion on the Sierpiński gasket in [55], and to a wide class of isotropic Lévy processes in [65]. In all these papers, the taming of the intensity of jumps near the boundary was a crucial step. This parallels the connection of the Carleson estimate and BHI in the classical potential theory, see Section 1.

Regularization of the exit distribution
In this section we prove Theorem 3.4. The proof is rather technical, so we begin with a few words of introduction and an intuitive description of the idea of the proof.
In [26,Lemma 6], an analogue of Theorem 3.4 was obtained for the isotropic α-stable Lévy processes by averaging harmonic measure of the ball against the variable radius of the ball. The procedure yields a kernel with no singularities and a mean value property for harmonic functions. In the setting of [26] the boundedness of the kernel follows from the explicit formula and bounds for the harmonic measure of a ball. A similar argument is classical for harmonic functions of the Laplacian and the Brownian motion. For more general processes X t this approach is problematic: while the Ikeda-Watanabe formula gives precise bounds for the harmonic measure far from the ball, satisfactory estimates near the boundary of the ball require exact decay rate of the Green function, which is generally unavailable. In fact, resolved cases indicate that sharp estimates of the Green function are equivalent to BHI ( [20]), hence not easier to obtain. Below we use a different method to mollify the harmonic measure.
Recall that the harmonic measure of B is the distribution of X(τ B ). It may be interpreted as the mass lost by a particle moving along the trajectory of X t , when it is killed at the moment τ B . In the present paper we let the particle lose the mass gradually before time τ B , with intensity ψ(X t ) for a suitable function ψ ≥ 0 sharply increasing at ∂B. The resulting distribution of the lost mass defines a kernel with a mean value property for harmonic functions, and it is less singular than the distribution of X(τ B ).
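The idea of losing mass gradually can be tried out in a discrete-time toy model, where the mean value property of the resulting kernel is easy to verify numerically. Everything specific in the sketch below is an assumption made for illustration: the chain on $\{0, \ldots, 30\}$ with jump weights $|x - y|^{-1-\alpha}$, the sets `B` and `V`, and the killing probabilities `kill`, which stand in for the intensity $\psi$ (small in the middle of `V`, growing towards its edge, and equal to 1 outside `V`).

```python
def jump_kernel(n, alpha):
    """Row-stochastic p[x][y] proportional to |x - y| ** -(1 + alpha) for y != x."""
    p = []
    for x in range(n):
        w = [0.0 if y == x else abs(x - y) ** -(1.0 + alpha) for y in range(n)]
        total = sum(w)
        p.append([v / total for v in w])
    return p


def harmonic_in(p, ball, boundary, sweeps=800):
    """Jacobi iteration for f = boundary off `ball`, f(x) = sum_y p[x][y] f(y) on it."""
    n = len(p)
    f = list(boundary)
    for _ in range(sweeps):
        f = [sum(p[x][y] * f[y] for y in range(n)) if x in ball else boundary[x]
             for x in range(n)]
    return f


def lost_mass(p, kill, start, steps=2000):
    """Distribution of the mass left behind by a unit mass started at `start`."""
    n = len(p)
    survivors = [z for z in range(n) if kill[z] < 1.0]
    mu = [1.0 if z == start else 0.0 for z in range(n)]   # mass still travelling
    lost = [0.0] * n
    for _ in range(steps):
        # At each site a fraction kill[z] of the arriving mass is left there ...
        lost = [l + m * k for l, m, k in zip(lost, mu, kill)]
        # ... and the remainder makes one jump of the chain.
        mu = [sum(mu[z] * (1.0 - kill[z]) * p[z][y] for z in survivors)
              for y in range(n)]
    return lost


N, ALPHA = 31, 1.0
P = jump_kernel(N, ALPHA)
B = range(8, 23)      # f is regular harmonic here (discrete stand-in for a ball)
V = range(10, 21)     # mass is lost gradually inside V and completely outside it
kill = [1.0 / (1.0 + (min(z - V[0], V[-1] - z) + 1) ** 2) if z in V else 1.0
        for z in range(N)]
g = [1.0 if y >= 26 else 0.0 for y in range(N)]
f = harmonic_in(P, B, g)

CENTER = 15
pi = lost_mass(P, kill, CENTER)
total_mass = sum(pi)
mean_value = sum(w * v for w, v in zip(pi, f))   # should reproduce f[CENTER]
mass_inside_V = sum(pi[z] for z in V)            # positive: pi charges the interior
```

Unlike the exit distribution of `V`, which is carried by the complement of `V`, the kernel `pi` spreads part of its mass over the interior of `V` while still reproducing the harmonic function at `CENTER`.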
Throughout this section, we fix x 0 ∈ X and four numbers 0 < r < p < q < R < R 0 , where R 0 is defined in Assumptions C and D. For the compact set B(x 0 , q) and the open and , and δ can be arbitrarily close to (B(x 0 , q), B(x 0 , R)).
We consider a function ψ : X ∪ {∂} → [0, ∞] continuous in the extended sense and such that ψ( We see that A t is a right-continuous, strong Markov, nonnegative (possibly infinite) additive functional, and A t = ∞ for t ≥ ζ. We define the right-continuous multiplicative functional For a ∈ [0, ∞], we let τ a be the first time when A t ≥ a. In particular, τ ∞ is the time when A t becomes infinite. Note that A t and M t are continuous except perhaps at the single (random) moment τ ∞ when A t becomes infinite and the left limit If ψ grows sufficiently fast near ∂V , then in fact τ ∞ = τ V , as we shall see momentarily.
Proof. We first assume that x ∈ X\V . In this case it suffices to prove that A 0 = ∞. Since Aϕ(y) ≤ δ for all y ∈ X, and ϕ(x) = 0, from Dynkin's formula for the (deterministic) time s it follows that E x (ϕ(X s )) ≤ δs for all s > 0. By the Schwarz inequality, where 0 < ε < t. Here we use the conventions 1/0 = ∞ and 0 · ∞ = ∞. Thus, with the convention 1/∞ = 0. Hence, By taking ε ↘ 0, we obtain It follows that A t = ∞ P x -a.s. We conclude that A 0 = ∞ and M 0 = 0 P x -a.s., as desired. When x ∈ V , the result in the statement of the lemma follows from the strong Markov property. Indeed, by the definition is the shift operator on the underlying probability space, which shifts sample paths of X t by the random time τ V , and
From now on we only consider the case when the assumptions of Lemma 4.1 are satisfied, and c 1 , c 2 are reserved for the constants in the condition ψ(x) ≥ c 1 /ϕ(x) − c 2 . By the definition and right-continuity of paths of X t , A t and M t are monotone right-differentiable continuous functions of t on [0, τ V ), with derivatives ψ(X t ) and −ψ(X t )M t , respectively.
Let ε a (·) be the Dirac measure at a. Lemma 4.1 yields the following result.
for any measurable random time τ and nonnegative or bounded function f .
We emphasize that if M t has a jump at τ , in which case we must have τ = τ V , then the jump does not contribute to the Lebesgue-Stieltjes integral [0,τ ) f (X t )dM t in (4.5). The same remark applies to (4.6) below.
Recall that τ a = inf {t ≥ 0 : A t ≥ a}. Note that τ a are Markov times for X t , a → τ a is the left-continuous inverse of t → A t , and the events {t < τ a } and {A t < a} are equal. We have A(τ a ) = a unless τ a = τ V , and, clearly, τ a ≤ τ ∞ = τ V .
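The inverse relation between $a \mapsto \tau_a$ and $t \mapsto A_t$ can be checked exactly on a toy example. In the sketch below the intensity along the path is assumed to be $\psi(X_s) = (1 - s)^{-2}$ for $s < 1$, so that $A_t = t/(1 - t)$ explodes at time 1, which plays the role of $\tau_V = \tau_\infty$; this deterministic choice is an illustration only. Exact rational arithmetic avoids rounding issues at the boundary cases.

```python
import math
from fractions import Fraction


def additive_functional(t):
    """A_t = int_0^t psi ds for the toy intensity psi(s) = (1 - s) ** -2, s < 1."""
    return t / (1 - t) if t < 1 else math.inf


def tau(a):
    """tau_a = inf{t >= 0 : A_t >= a}; the explosion time tau_infinity equals 1."""
    return Fraction(1) if a == math.inf else a / (1 + a)


times = [Fraction(k, 64) for k in range(80)]              # includes times past 1
levels = [Fraction(k, 8) for k in range(1, 200)] + [math.inf]

# The events {t < tau_a} and {A_t < a} coincide for every t and a.
events_agree = all((t < tau(a)) == (additive_functional(t) < a)
                   for t in times for a in levels)

# A(tau_a) = a for finite a, and tau_a never exceeds the explosion time.
inverse_ok = all(additive_functional(tau(a)) == a for a in levels[:-1])
bounded_ok = all(tau(a) <= tau(math.inf) for a in levels)
```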
The following may be considered as an extension of Dynkin's formula.
In fact, (4.6) holds for every strong Markov right-continuous multiplicative functional M t .
Since min(τ, τ a ) is a Markov time for X t , we can apply Dynkin's formula. It follows that By Fubini and the substitution τ We emphasize that the last equality holds true also if τ = τ V with positive probability. We see that (4.6) holds. By (4.5) we obtain (4.7).
The functional M t is a Feynman-Kac functional, interpreted as the diminishing mass of a particle started at x ∈ X. We shall estimate the kernel π ψ (x, dy), defined as the expected amount of mass left by the particle at dy. Namely, for any nonnegative or bounded f we define x ∈ X.
The potential kernel $G_\psi(x, dy)$ of the functional $M_t$ will play an important role. Namely, for any nonnegative or bounded $f$ we let
$$G_\psi f(x) = E_x \int_0^\infty f(X_t)\, M_t\, dt = \int_0^\infty e^{-a} \Big( E_x \int_0^{\tau_a} f(X_t)\, dt \Big) da, \qquad x \in X.$$
In the second equality above, the identities $M_t = \int_{A_t}^\infty e^{-a}\, da$ and $\{t < \tau_a\} = \{A_t < a\}$ were used together with Fubini, as in the proof of Lemma 4.3. We note that $G_\psi(x, dy)$ measures the expected time spent by the process $X_t$ at $dy$, weighted by the decreasing mass of $X_t$ (compare with the similar role of $G_V(x, y)\, m(dy)$). There is a semigroup of operators $T^\psi_t f(x) = E_x(f(X_t) M_t)$ associated with the multiplicative functional $M_t$. Furthermore, $T^\psi_t$ are transition operators of a Markov process $X^\psi_t$, the subprocess of $X_t$ corresponding to $M_t$. With the definitions of [18], $M_t$ is a strong Markov right-continuous multiplicative functional and $V$ is the set of permanent points for $M_t$. Therefore, $X^\psi_t$ is a standard Markov process with state space $V$, see [18, III.3.12, III.3.13 and the discussion after III.3.17]. (From (4.4) and [18, Proposition III.5.9] it follows that $M_t$ is an exact multiplicative functional. Furthermore, since $M_t$ can be discontinuous only at $t = \tau_V$, the functional $M_t$ is quasi-left continuous in the sense of [18, III.3.14], and therefore $X^\psi_t$ is a Hunt process on $V$. However, we do not use these properties in our development.) Informally, $X^\psi_t$ is obtained from $X_t$ by terminating the paths of $X_t$ with rate $\psi(X_t)\, dt$, and $\pi_\psi(x, dy)$ is the distribution of $X_t$ stopped at the time when $X^\psi_t$ is killed. Furthermore, $G_\psi(x, dy)$ is the potential kernel of $X^\psi_t$. To avoid technical difficulties related to subprocesses and the domains of their generators, in what follows we rely mostly on the formalism of additive and multiplicative functionals.
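The Fubini step behind the two expressions for the potential of $M_t$ can be unpacked as follows. This is a sketch for nonnegative $f$, using only the identities $M_t = \int_{A_t}^\infty e^{-a}\, da$ and $\{t < \tau_a\} = \{A_t < a\}$ quoted in the text.

```latex
% Sketch for f >= 0, so that Fubini-Tonelli applies without integrability conditions.
\begin{align*}
  E_x \int_0^\infty f(X_t)\, M_t \, dt
  &= E_x \int_0^\infty f(X_t) \int_{A_t}^\infty e^{-a}\, da \, dt \\
  &= \int_0^\infty e^{-a}\, E_x \int_0^\infty f(X_t)\, \mathbf{1}_{\{A_t < a\}} \, dt \, da \\
  &= \int_0^\infty e^{-a}\, E_x \int_0^{\tau_a} f(X_t) \, dt \, da .
\end{align*}
```

The last line exhibits the potential of $M_t$ as an exponential average over $a$ of the occupation measures of $X_t$ up to the times $\tau_a$.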
The multiplicative functional M̂ t is defined just as M t , but for the dual process X̂ t . We correspondingly define π̂ ψ and Ĝ ψ . Since the paths of X̂ t can be obtained from those of X t by time-reversal and M t and M̂ t are defined by integrals invariant upon time-reversal, the definition of M̂ t agrees with that of [37, formula (13.24)]. Hence, by [37,Theorem 13.25], M t and M̂ t are dual multiplicative functionals. It follows that the subprocess X̂ ψ t of X̂ t corresponding to the multiplicative functional M̂ t is the dual process of X ψ t ; see [37,13.6 and Remark 13.26]. Hence, the potential kernel G ψ of X ψ t admits a uniquely determined density function G ψ (x, y) (x, y ∈ V ), which is excessive in x with respect to the transition semigroup T ψ t of X ψ t , and excessive in y with respect to the transition semigroup T̂ ψ t of X̂ ψ t . Furthermore, Ĝ ψ (x, y) = G ψ (y, x) is the density of the potential kernel of X̂ ψ t . Since G ψ (x, dy) is concentrated on V , we let G ψ (x, y) = 0 if x ∈ X \ V or y ∈ X \ V . Clearly, G ψ (x, dy) is dominated by G V (x, dy) for all x ∈ V , and therefore There are important relations between π ψ , G ψ , ψ and A. If f is nonnegative or bounded and vanishes in X \ V , then by Corollary 4.2 we have x ∈ V. (4.11) Considering τ = τ V , we note that M (τ V ) = 0, and so for bounded or nonnegative f If f ∈ D(A), then formula (4.6) gives Furthermore, by (4.7), for f ∈ D(A) we have In particular, if f ∈ D(A) vanishes outside of V , then we have (which also follows directly from (4.11) and (4.12)). Formula (4.13) means that the generator of X ψ t agrees with A − ψ on the intersection of the respective domains.
We now introduce the Green operators G ψ U and harmonic measures π ψ U for X ψ t . Let U be an open subset of V . For nonnegative or bounded f and x ∈ V we let By (4.7), for any f ∈ D(A) we have (4.14) In particular, by an approximation argument, Formulas (4.14) and (4.15) can be viewed correspondingly as Dynkin's formula applied to the first exit time, and the Ikeda-Watanabe formula for X ψ t . Recall that x 0 ∈ X, 0 < r < p < q < R < R 0 , B(x 0 , q) ⊆ V ⊆ B(x 0 , R), see Figure 1, ϕ ∈ D is positive in V and vanishes in X \ V , and ϕ(x) = 1 for x ∈ B(x 0 , q).
Proof. By (4.14), for x ∈ U we have Essentially, we use here (and later on) superharmonicity of ϕ with respect to A − ψ.
This and (4.18) yield that g(x) ≤ c (4.17) ϕ(x), with c (4.17) given in the statement of the lemma.
Recall that g = G ψ f , where f is an arbitrary nonnegative function vanishing outside B(x 0 , r) with integral equal to 1. Hence, by approximation, for each x ∈ X \ B(x 0 , q), formula (4.17) holds for almost every y ∈ B(x 0 , r). By Proposition 2.2 (applied to X ψ t ), (4.17) holds for every y ∈ B(x 0 , r).
The above arguments can be repeated for the dual process X̂ t . Hence, the dual versions of Lemmas 4.4 and 4.5 hold true, with the same c (4.17) .
Proof. For x ∈ V define g(x) = π ψ (x, ∂V ). By (4.19), ∫ V g(x)ϕ(x)m(dx) = 0, so that g vanishes almost everywhere in V . We claim that g is excessive for the transition semigroup T ψ t of X ψ t . Indeed, we have g(x) = E x (M (τ V −); X(τ V ) ∈ ∂V ), so that by the Markov property, for any t > 0 and x ∈ V , The right-hand side does not exceed g(x), and by monotone convergence, it converges to g(x) as t ↘ 0. Hence g is an excessive function equal to zero almost everywhere in V . By [18, Proposition II.3.2] (or by Proposition 2.2), g(x) = 0 for all x ∈ V .
Recall that according to the remark following Lemma 4.1, we keep assuming that for some c 3 > 0, and let M̃ t be the multiplicative functional defined in a similar manner as M t , but with ψ replaced by ψ̃ . Clearly, for all t > 0 we have M t = 0 if and only if M̃ t = 0. Since ψ̃ (x) ≥ c 3 + δ/ϕ(x), an application of Lemma 4.7 to ψ̃ yields the following result.
The local maximum estimate is now proved as follows.
We conclude this section with a result on diffusion processes. The above argument remains valid when ν vanishes everywhere, i.e., X t is a diffusion process. In this case (2.9) is not a consequence of Assumption C, so we need to add (2.9) as an assumption. No other changes in the argument are needed, and in fact the proof of Lemma 4.5 simplifies significantly, since X t exits U through the boundary of U , and therefore X(τ U ) is never in B(x 0 , p). Therefore, we have proved the following result.
Theorem 4.11. Assume that X t is a diffusion process satisfying Assumptions A, B and D, and formula (2.9). Let x 0 ∈ X and 0 < r < q < R < R 0 , where R 0 is the localization radius of (2.9) and Assumption D. Let f be a nonnegative function on B(x 0 , R), regular subharmonic in B(x 0 , R) with respect to X t . Then x ∈ B(x 0 , r).
Remark 4.12. For diffusion processes, local supremum estimate (4.24) for subharmonic functions is typically proved analytically, using Sobolev embeddings and Moser iteration, see, e.g., [42]. Theorem 3.4 requires more regularity of the process X t as compared to the analytical approach because we assume the existence of bump functions in the domain of the Feller generator (Assumption B), while Moser iteration is based on the energy form. However, our approach does not depend on Sobolev embeddings, and so it applies also to Sierpiński carpets and some other highly irregular state spaces X. It would be interesting to find an analytical proof of the local supremum estimate for jump-type processes, which would not require Assumption B. Related results have been recently studied when the Lévy kernel ν(x, y) is comparable to (d(x, y)) −d−α (see [56] and the references therein). Further comments on this subject are given in Example 5.6 and Appendix A.

Extensions and examples
In this section we study several applications of our boundary Harnack inequality, and discuss limitations of Theorem 3.5. We sketch the range of possible applications by indicating rather general classes of processes satisfying the assumptions of Theorem 3.5, without getting into technical details. Before that, however, we discuss an important notion of scale-invariance introduced in Remark 3.6. This property can be proved in a fairly general setting, which we call stable-like scaling.
for some q ≥ 0 and for all x, y ∈ X.
Note that the same parameter q appears in the lower and the upper bound.
Proof. Conditions (b) and (c) follow directly from (g). Furthermore, by (a) and the triangle inequality, there is R 0 > 0 such that if x 0 ∈ X and 0 < r < R 0 , then for some y ∈ B(x 0 , c 1 r) \ B(x 0 , r) where c 1 > 2, the balls B(x 0 , r) and B(y, r) are disjoint. Hence, for all x ∈ B(x 0 , r) we have by (a) and (g), ν(x, X \ B(x 0 , r)) ≥ ν(x, B(y, r)) ≥ c 2 r −α . As in the proof of Proposition 2.1, it follows that P x (τ B(x 0 ,r) > t) ≤ exp(−c 2 r −α t), and therefore E x (τ B(x 0 ,r) ) ≤ (c 2 ) −1 r α .
We also have the following sufficient condition for scaling properties (d) and (e).
Proposition 5.3. Assume that scaling property (a) holds. Suppose that the transition density T t (x, y) of a Hunt process X t exists, and that for some α > 0, r 0 > 0, for x, y ∈ X with d(x, y) < r 0 , and any t ∈ (0, r α 0 ). Then Assumption D and scaling conditions (d) and (e) hold. The constant c (2.10) and the localization radius R 0 in (2.10) depend only on the constants in (5.1) (including α and r 0 ) and in the Ahlfors regularity condition.
Proof. Both cases α > n and α < n are very similar (in fact, slightly simpler) to the remaining case α = n. Hence we give a detailed argument only when α = n.
If X t has α-stable-like scaling, then, by a simple substitution, in Theorem 3.5 we have Hence the boundary Harnack inequality is uniform in all scales R ∈ (0, R 0 ), or scale-invariant, as claimed in Remark 3.6. We state this result as a separate theorem for future reference.
Theorem 5.4. If the assumptions of Theorem 3.5 are satisfied, and the process X t has α-stable-like scaling, then the boundary Harnack inequality (BHI) is scale-invariant: c (1.1) (x 0 , r, R) depends only on r/R.
In typical applications, one verifies conditions (a) and (g) (which is typically quite straightforward), formula (5.1) (which has been proved for a fairly general class of processes), and condition (f). When dealing with processes given by the Lévy kernel ν(x, y), condition (f) turns out to be the most restrictive one.
Example 5.5 (Lévy processes). Theorem 3.5 applies to a large class of Lévy processes. In this case, the notion of processes in duality and properties of the Feller generator simplify significantly, see [72].
Let X t be a Lévy process in X = R k (with the Euclidean distance d and Lebesgue measure m). Then X t is always Feller, and it is strong Feller if and only if the distribution of X t is absolutely continuous (with respect to the Lebesgue measure). If this is the case, Assumption A is satisfied: the dual of X t exists, and it is the reflected process, X̂ t − X̂ 0 = −(X t − X 0 ). Assumption B is always satisfied with D = C ∞ c (R k ). The Lévy kernel of X t is translation-invariant, ν(x, E) = ν(E − x), where ν(dz) is the Lévy measure of X t . Therefore, Assumption C can be restated as follows: the Lévy measure of X t is absolutely continuous, and its density function ν(z) satisfies whenever 0 < r < R, with constant c (2.7) (0, r, R). If, e.g., ν(z) is isotropic and radially non-increasing, then (5.2) is equivalent to ν(z 2 ) ≥ cν(z 1 ) > 0 being valid whenever z 1 , z 2 ∈ R k , |z 1 | ≥ 1 and |z 2 | = |z 1 | + 1. Indeed, let us assume the latter condition. By radial monotonicity, ν is locally bounded on R k \ {0} from above and below by positive constants. Therefore, c 1 = c 1 (c, ν, r, R) > 0 exists such that ν(z 2 ) ≥ c 1 ν(z 1 ) if |z 1 | ≥ R − r and |z 2 | = |z 1 | + 1. It follows that (c 1 ) n ν(z 1 ) ≤ ν(z 2 ) ≤ ν(z 1 ) if R − r ≤ |z 1 | ≤ |z 2 | ≤ |z 1 | + n and n = 1, 2, . . . Taking n ≥ r we obtain (5.2), as desired. Finally, Assumption D in many cases follows from estimates of the potential kernel U(x, y) = U(y − x), or, in the recurrent case, the α-potential kernel U α (x, y) = U α (y − x), see Proposition 2.3. We conclude that the boundary Harnack inequality holds for a Lévy process X t , provided that its Lévy measure satisfies (5.2), one-dimensional distributions of X t are absolutely continuous, and the Green functions of balls satisfy Assumption D.
This class includes:
• subordinate Brownian motions which are not compound Poisson processes and have non-zero Lévy measure density function satisfying ν(z 2 ) ≥ cν(z 1 ) if |z 1 | ≥ 1 and |z 2 | = |z 1 | + 1 (for properties of these processes, see, e.g., [23,64]);
• (possibly asymmetric) Lévy processes with non-degenerate Brownian part and Lévy measure satisfying (5.2);
• (possibly asymmetric) strictly stable Lévy processes, whose Lévy measure is of the form |z| −d−α f (z/|z|)dz for a function f bounded from below and above by positive constants.
Scale-invariance depends on more accurate estimates. We give here some examples and directions.
• For the class of strictly stable Lévy processes just mentioned above, scale-invariance follows from the estimates of the transition density given in [82, Theorem 1.1] and Proposition 5.3; see also [28] and the references therein for related estimates in the symmetric (but anisotropic) case.
• Some Lévy processes for which Theorem 3.5 gives scale-invariant BHI are included in Example 5.6 (stable-like Lévy processes) and Example 5.8 (mixtures of isotropic stable processes, relativistic stable processes, etc.).
• A non-scale-invariant case (mixture of an isotropic stable process and the Brownian motion) is discussed in Example 5.13.
• Our results may be used to recover the recent scale-invariant BHI given in [65, Theorem 1.1]. More specifically, using our results and the first part of [65], one can obtain scale-invariant BHI for the Lévy processes considered therein (isotropic Lévy processes with the Lévy measure comparable to that of a rather general subordinate Brownian motion with some scaling properties), thus replacing the second part of [65] and significantly simplifying the whole argument of [65].
• Similarly, the estimates given in [60], combined with Theorem 3.5, should give a compact proof of scale-invariant BHI for a class of subordinate Brownian motions with Lévy-Khintchine exponent slowly varying at ∞.
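As a quick sanity check of the condition ν(z 2 ) ≥ cν(z 1 ) > 0 from Example 5.5, the following worked computation takes the isotropic α-stable density as an illustrative special case. The normalizing constant C > 0 is a placeholder introduced here, and the computation is only a sketch of how the condition is verified in practice.

```latex
% Illustration only: isotropic alpha-stable Levy density on R^k, 0 < alpha < 2.
% C > 0 is a placeholder normalizing constant.
\[
  \nu(z) = C\,|z|^{-k-\alpha}, \qquad z \in \mathbb{R}^k \setminus \{0\}.
\]
% For |z_1| >= 1 and |z_2| = |z_1| + 1 the ratio depends only on |z_1|:
\[
  \frac{\nu(z_2)}{\nu(z_1)}
  = \left( \frac{|z_1|}{|z_1| + 1} \right)^{k+\alpha}
  \ge \left( \frac{1}{2} \right)^{k+\alpha},
\]
% because t -> t/(t+1) is increasing on [1, infinity) and equals 1/2 at t = 1.
% Hence the condition holds with c = 2^{-k-\alpha}.
```

By contrast, a density that vanishes for |z| ≥ 1 (as in Example 5.14) cannot satisfy the positivity requirement ν(z 1 ) > 0 for |z 1 | ≥ 1.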
Example 5.6 (Stable-like processes). Let X be a closed set in R k , and let m be a measure on X such that X, with the Euclidean distance, is an Ahlfors regular n-space for some n > 0. For example, X can be entire R k or the closure of an open set in R k (with the Lebesgue measure m; then n = k). On the other hand, X can be a fractal set, such as Sierpiński gaskets (n = log(k + 1)/ log 2) or Sierpiński carpets (n = log(3 k − 1)/ log 3) in R 2 , equipped with an appropriate Hausdorff measure. By this assumption, scaling property (a) is satisfied. Let α ∈ (0, 2), and suppose that ν(x, y) = ν(y, x) and This immediately gives Assumption C with scaling property (g). By [33,Theorem 1], there is a Feller, strong Feller, symmetric pure-jump Hunt process X t with Lévy kernel ν, and the continuous transition probability T t (x, y) of X t satisfies (5.1) for some r 0 . Assumption D and scaling property (e) follow by Proposition 5.3. Since X t is symmetric (self-dual) and has continuous transition densities, Assumption A is also satisfied.
Finally, we assume that Assumption B holds with scaling property (f) (see below). Under the above assumptions, scale-invariant boundary Harnack inequality holds with some localization radius. When X is unbounded, α = n and scaling property (a) holds for all r > 0, then (5.1) holds for all t > 0 and all x, y ∈ X, see [33], and therefore we can take R 0 = ∞.
We list some cases when Assumption B with scaling property (f) is known to hold true.
• When X = R k and ν(x, y) is a function of x − y, then X t is a symmetric Lévy process and we can simply take D = C ∞ c (R k ). • More generally, let X = R k , and assume that ν(x, y) = κ(x, y)|y − x| −k−α for a C ∞ b (R k × R k ) function κ. We claim that Assumption B with scaling property (f) |z| k+α dz · ∇f (x).

(5.4)
Then Ã is a symmetric pseudo-differential operator with appropriately smooth symbol, and by [50, Theorem 5.7], the closure of Ã is the Feller generator of a symmetric Hunt process X̃ t (we omit the details). Since the pure-jump Feller processes X t and X̃ t have equal Lévy kernels, they are in fact equal processes, and hence the closure of Ã is the Feller generator of X t . Assumption B with D = C ∞ c (R k ) follows, and scaling property (f) is a simple consequence of (5.4). See also [49,79].
• When α ∈ [1, 2), X is the closure of an open set with C 1,β -smooth boundary for some β > α − 1, and ν(x, y) = c|x − y| −k−α , then one can take D to be the class of C ∞ c (R k ) functions with normal derivative vanishing everywhere on the boundary of X (see [45, Theorem 6.1(ii)]).
• For the case when X t is a subordinate diffusion on X, see Example 5.7. In this case, when X is a fractal set, one can even deal with α greater than 2. Note that an analytical proof of Theorem 3.4 discussed in Remark 4.12 may lead to a generalization of this example, which would not require Assumption B.
Example 5.7 (Stable-like subordinate diffusions in metric measure spaces). Suppose that (X, d, m) is an Ahlfors regular n-space for some n > 0. Assume that the metric d is uniformly equivalent to the shortest-path metric in X. Suppose that there is a diffusion process Z t with a symmetric, continuous transition density T Z t (x, y) satisfying the sub-Gaussian bounds for all x, y ∈ X and t ∈ (0, t 0 ) (t 0 = ∞ when X is unbounded). Here d w ≥ 2 is the walk dimension of the space X. The existence of such a diffusion process Z t is well-known when X is a Riemannian manifold (d w = 2; see [43]), the k-dimensional Sierpiński gasket (d w = log(k + 3)/ log 2 > 2; see [11]), more general nested fractals [41,68], or the Sierpiński carpets [7,8]; see [59] for more information.
Let α ∈ (0, d w ) and let X t be the stable-like process obtained by subordination of Z t with the α/d w -stable subordinator η t , X t = Z(η t ). These processes were first studied in [27,69,76]. By the subordination formula, the transition density estimate (5.1) holds for some r 0 (if X is unbounded, then it was proved in [27] that we can take r 0 = ∞).
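The choice of the index α/d w for the subordinator can be motivated by a heuristic scaling computation, sketched below. It is not a proof and is not taken from the cited references; the normalization of the Laplace exponent of η t is an assumption made only for this illustration.

```latex
% Heuristic sketch (not a proof). Normalization of the subordinator is assumed.
% (i) Sub-Gaussian diffusion: displacement over time u is of order u^{1/d_w}.
\[
  d(Z_0, Z_u) \approx u^{1/d_w}.
\]
% (ii) (alpha/d_w)-stable subordinator and its self-similarity.
\[
  \mathbb{E}\, e^{-\lambda \eta_t} = e^{-t \lambda^{\alpha/d_w}},
  \qquad
  \eta_t \overset{d}{=} t^{\,d_w/\alpha}\, \eta_1 .
\]
% (iii) Combining (i) and (ii) for X_t = Z(\eta_t):
\[
  d(X_0, X_t) = d(Z_0, Z_{\eta_t})
  \approx \eta_t^{1/d_w}
  \approx \bigl( t^{\,d_w/\alpha} \bigr)^{1/d_w}
  = t^{1/\alpha}.
\]
```

In other words, space scale r corresponds to time scale r α , which is the relation between t and d(x, y) that one expects in an α-stable-like estimate such as (5.1).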
Since X t is symmetric and has continuous transition densities, Assumption A is clearly satisfied. The Lévy kernel of X t satisfies c −1 d(x, y) −n−α ≤ ν(x, y) ≤ cd(x, y) −n−α , see [27], and Assumption C with scaling property (g) follows. Assumption D and scaling property (e) follow from the transition density estimate (5.1) by Proposition 5.3; see also [27,Lemmas 5.3 and 5.6]. Finally, Assumption B with scaling property (f) follows by the construction of [71,Section 2]. Roughly speaking, the method of [71] yields smooth bump functions in the domain of the generator of the diffusion Z t with appropriate scaling. By the subordination formula, these bump functions are in the domain of A, and the constants scale appropriately. Since there are some nontrivial issues related to the construction, we repeat the construction with all details in Appendix A. By Corollary A.4 there, Assumption B is satisfied with scaling property (f).
We conclude that the scale-invariant boundary Harnack inequality for X t holds in the full range of α ∈ (0, d w ). Notably, we obtain a regularity result also for α ≥ 2, when Lipschitz functions no longer belong to the domain of the Dirichlet form of X t .
This example can be extended in various directions. Instead of taking η t to be the α/d w -stable subordinator, one can consider a subordinator η t whose Laplace exponent ψ is a complete Bernstein function regularly varying of order α/d w (α ∈ (0, d w )) at infinity. Such subordinators have no drift, and their Lévy measure has a completely monotone density function, regularly varying of order −1−α/d w at 0. Their potential kernel is regularly varying of order −1 + α/d w at 0. We refer the reader to [23,64,73] for more information about subordination, complete Bernstein functions and regular variation. By the subordination formula, following the method applied for the Euclidean case X = R k in [64,65], one can obtain two-sided estimates for the Lévy kernel ν(x, y) and the potential kernel U(x, y) in terms of ψ, at least when X is unbounded and α < d. These estimates are sufficient to prove the scale-invariant boundary Harnack inequality.
Similar methods should be applicable also when X t is recurrent (that is, X is bounded, or α ≥ d). In this case, estimates of U(x, y) need to be replaced by estimates of the λ-potential kernel U λ (x, y). Other interesting directions are the case of slowly varying ψ, which corresponds to α = 0, and, on the other hand, the case of pure-jump processes with ψ regularly varying of order 1 (that is, α = d w ). Finally, one can perturb the processes considered above, in a similar way as in the next example.
Example 5.8 (Stability under small perturbations). Let X = R k , d be the Euclidean distance, m be the Lebesgue measure, and α ∈ (0, 2). Suppose that ν̃ (x, y) is a Lévy kernel of a Hunt process X̃ t considered in Example 5.6, and Ã is the corresponding Feller generator. For example, ν̃ (x, y) can be any function of y − x satisfying (5.3). In this example we consider a perturbation ν(x, y) of the kernel ν̃ (x, y).
Although a more general construction is feasible, we are satisfied with the following setting. Let ν(x, y) = ν̃ (x, y) + n(x, y), where n(x, y) is chosen so that ν(x, y) satisfies the scaling property (g), n(x, y) and n̂ (x, y) = n(y, x) are kernels of bounded operators on C 0 (R k ), and the last assumption guarantees that m is an excessive (in fact, invariant) measure for the process X t defined below.
The formula N f (x) = ∫ R k (f (y) − f (x))n(x, y)dy defines a bounded linear operator on C 0 (R k ), and A = Ã + N (defined on the domain of Ã ) has the positive maximum property. By a standard perturbation argument, A is the Feller generator of a Hunt process X t , and ν(x, y) is the Lévy kernel of X t . The process X̂ t and its Feller generator Â are constructed in a similar manner, using the Feller generator of the dual of X̃ t and the kernel n̂ (x, y). It is easy to see that , from which it follows that X̂ t is indeed the dual of X t . The transition density of X̃ t satisfies (5.1) (see Example 5.6). The process X t can be constructed probabilistically using X̃ t and Meyer's method of adding and removing jumps. Hence, by [9, Lemma 3.6] and [10, Lemma 3.1(c)], the transition density of X t exists and also satisfies (5.1) for smaller r 0 (see also [30, Proposition 2.1]).
It follows that Assumption A is satisfied. Assumption B holds with D = C ∞ c (R k ), and scaling property (f) (with finite R 0 ) follows from the α-stable-like scaling of Ã and boundedness of N . Since we assumed that (g) holds true, Assumption C is satisfied with scaling properties (b), (c). Assumption D and scaling properties (d), (e) follow from transition density estimate (5.1) by Proposition 5.3. Hence, the scale-invariant boundary Harnack inequality holds true for X t .
The above setting includes mixtures of isotropic stable processes (Lévy processes generated by A = −(−∆) α/2 − c(−∆) β/2 with 0 < β < α < 2 and c > 0) and relativistic stable processes (Lévy processes generated by A = m − (−∆ + m 2/α ) α/2 with m > 0). Also, the dependence of constants on the parameters c, β, m can be easily tracked. Since the perturbation n(x, y) can be asymmetric, many non-symmetric processes are included. Finally, this example can be adapted to the setting of Ahlfors-regular n-sets in R k , as in Example 5.6.
Example 5.9 (Processes killed by a Schrödinger potential). Suppose that the assumptions for the boundary Harnack inequality in Theorem 3.5 are satisfied. Let X′ be an open set in X. Let M t be a strong right-continuous multiplicative functional quasi-left continuous on [0, ∞), for which all points of X′ are permanent, and such that M t = 0 for t ≥ τ X′ . Finally, let X M t be the subprocess corresponding to M t (in a similar way as in Section 4; see [18] for definitions). Then X M t is a Hunt process on X′ , uniquely determined by the relation P M x (X M t ∈ E) = E x (M t ; X t ∈ E) for any E ⊆ X′ and x ∈ X′ . Assume that M t is a continuous function of t ∈ [0, τ X′ ). We claim that in this case the Lévy kernel ν M (x, y) of X M t is again given by ν(x, y), restricted to X′ × X′ . Indeed, by formula (4.6) of Lemma 4.3, for x ∈ X′ and f ∈ D(A) vanishing in a neighborhood of x, When divided by t, this converges (for a fixed x) to Af (x) as t → 0 + . Hence, ν M f (x) = νf (x). By an approximation argument, this holds for any f ∈ C c (X′ ) vanishing in a neighborhood of x, proving our claim. (Note that, however, in general, functions in D(A) need not belong to the domain of the generator of X M t , even if X′ = X.) We remark that many such functionals M t are related to Schrödinger potentials V : for a nonnegative function V , we have M t = exp(− ∫ 0 t V (X s )ds) for t < τ X′ , see [18].
A similar construction was used in Section 4 for a particular choice of V . In some applications, the potential V can take negative values, the case not covered by this example.
Let D ⊆ X be an open set. By the definition of a subharmonic function, a nonnegative function f regular subharmonic on D ∩ X′ with respect to the process X M t , extended by f (x) = 0 for x ∈ X \ X′ , is also regular subharmonic in D with respect to X t . Hence, the hypothesis of Theorem 3.4 holds for X M t with the same constant c (3.9) . Of course, one needs to replace the sets in the statement of Theorem 3.4 by their intersections with X′ .
We claim that also Lemma 3.1 holds for X M t with the same constant. Indeed, with the definitions of the proof of Lemma 3.1 and D′ = D ∩ X′ , for x ∈ B(x 0 , r) ∩ X′ we have By formula (4.6) of Lemma 4.3, The second summand on the right hand side is nonpositive. It follows that, as desired. In Lemma 3.2, only the estimates of the Lévy measure and mean exit time are used. Therefore, also Lemma 3.2 holds for the process X M t with unaltered constants. In a similar way, the proof of Theorem 3.5 works for the process X M t without modifications. We conclude that the boundary Harnack inequality holds for X M t with the same constants. For convenience, we state this result as a separate theorem.
Theorem 5.10. Suppose that Assumptions A, B, C and D hold true. Let X′ be an open subset of X, and let X M t be a subprocess of X t , with state space X′ , corresponding to a strong right-continuous multiplicative functional for X t , continuous before X t hits X \ X′ , vanishing after that time, and quasi-left continuous on [0, ∞). Then the boundary Harnack inequality holds true for the process X M t with the same constant c (1.1) given by (3.10). More precisely, if x 0 ∈ X, 0 < r < R < R 0 , D ⊆ B(x 0 , R) is open, f, g are nonnegative regular harmonic functions in D ∩ X′ (with respect to the process X M t ), and f, g vanish in (B(x 0 , R) \ D) ∩ X′ , then we have
We remark that the continuity assumption for M t is essential. If, for example, M t is equal to 1 until the first jump larger than 1, and then 0, the boundary Harnack inequality typically does not hold, by an argument similar to the one in Example 5.14 below.
Example 5.11 (Actively reflected and censored stable processes). Let X′ ⊆ R k be open and let X be the closure of X′ in R k . Suppose that X satisfies property (a). Let ν(x, y) = c|x − y| −n−α . As in Example 5.6, under suitable assumptions on X, there is a stable-like process X t with the Lévy kernel ν(x, y), and scale-invariant BHI holds for X t . In [21], the process X t is called the actively reflected α-stable process in X, and the process X′ t , obtained from X t by killing it upon hitting X \ X′ , is named the censored α-stable process in X′ (see [21, Remark 2.1]). Clearly, the boundary Harnack inequality for X′ t is the special case of the boundary Harnack inequality for X t , corresponding to open sets D contained in X′ . (Note that this is in fact a special case of Theorem 5.10, with M t = 1 for t < τ X′ .) Hence, we have scale-invariant BHI for the actively reflected α-stable process X t and the censored α-stable process X′ t , whenever X′ is a Lipschitz set in the case α ∈ (0, 1), and X′ is an open set with C 1,β -smooth boundary for some β > α − 1 in the case α ∈ [1, 2). The above extends the results of [21,44].
Example 5.12 (Gradient-type perturbations of stable processes). Let α ∈ (1, 2). If b : R k → R k is bounded and differentiable, partial derivatives of b are bounded, and div b = 0, then the process X t generated by −(−∆) α/2 + b · ∇, and the process X̂ t generated by −(−∆) α/2 − b · ∇ are mutually dual. Such processes are considered in the recent paper [53]. The Lévy kernels of X t and X̂ t are the same as that of the isotropic α-stable Lévy process generated by −(−∆) α/2 , see [25]. Furthermore, D = C ∞ c (R k ) is contained in the domains of A and Â . Therefore, a scale-invariant (with finite R 0 ) boundary Harnack inequality holds for the process X t .
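The role of the condition div b = 0 in Example 5.12 can be seen from a formal integration by parts on test functions, sketched below. This is only a check at the level of generators acting on C ∞ c (R k ), under the stated smoothness of b; it is not a substitute for the duality of the processes themselves, for which we refer to the cited literature.

```latex
% Formal check on test functions f, g in C_c^\infty(R^k); b is bounded with
% bounded first derivatives. No boundary terms appear since f, g have compact support.
\[
  \int_{\mathbb{R}^k} (b \cdot \nabla f)\, g \, dx
  = - \int_{\mathbb{R}^k} f \, \operatorname{div}(b\, g) \, dx
  = - \int_{\mathbb{R}^k} f \, (b \cdot \nabla g) \, dx
    - \int_{\mathbb{R}^k} f \, g \, \operatorname{div} b \, dx .
\]
% If div b = 0, the last integral vanishes, so the formal adjoint of
% b . grad is -b . grad. Since -(-\Delta)^{\alpha/2} is symmetric on L^2,
\[
  \int_{\mathbb{R}^k} \bigl( -(-\Delta)^{\alpha/2} f + b \cdot \nabla f \bigr)\, g \, dx
  = \int_{\mathbb{R}^k} f \, \bigl( -(-\Delta)^{\alpha/2} g - b \cdot \nabla g \bigr) \, dx .
\]
```

Without the divergence-free condition, the extra zero-order term f g div b would appear, and the Lebesgue measure would in general no longer serve as the reference measure for the duality.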
We conclude this article with some negative or partially negative examples.
Example 5.13 (Lévy processes with Brownian component). Let X = R k , d be the Euclidean distance, m be the Lebesgue measure, and α ∈ (0, 2). Let X t be the sum of two independent processes, the Brownian motion and the isotropic α-stable Lévy process. That is, X t is the Lévy process with generator A = c 1 ∆ − c 2 (−∆) α/2 .
Clearly, X t is symmetric and has transition densities, so Assumption A is satisfied. Furthermore, D(A) contains C ∞ c (R k ), and hence Assumption B is satisfied with 2-stable-like scaling: the property (f) holds with α replaced by 2. On the other hand, Assumption C clearly holds with α-stable-like scaling (g). Furthermore, detailed estimates for the transition density of X t can be established ( [34]), from which Assumption D follows as in Proposition 5.3, with 2-stable scaling.
It follows that boundary Harnack inequality holds despite the diffusion component. However, the constant c (1.1) (x 0 , r, R) is not bounded when, for example, R = 2r and r → 0 + . This is a typical behavior for processes comprising both jump and diffusion part, and for general open sets one cannot expect a scale-invariant result: the boundary Harnack inequality in the form given in (BHI) does not hold for the Brownian motion without some regularity assumptions on the boundary of D, cf. [14]. On the other hand, the scale-invariant boundary Harnack inequality for X t in more smooth domains was established in [32].
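The lack of scale-invariance in Example 5.13 can be illustrated on the level of the Fourier symbol. The computation below is a heuristic sketch; the notation Ψ for the symbol of −A is introduced here only for this illustration.

```latex
% Symbol of -A for A = c_1 \Delta - c_2 (-\Delta)^{\alpha/2}, 0 < alpha < 2.
% The name \Psi is introduced for this illustration only.
\[
  \Psi(\xi) = c_1 |\xi|^2 + c_2 |\xi|^{\alpha}, \qquad
  \Psi(\xi / r) = c_1 r^{-2} |\xi|^2 + c_2 r^{-\alpha} |\xi|^{\alpha}.
\]
% Diffusive normalization (time scale r^2):
\[
  r^{2}\, \Psi(\xi / r) = c_1 |\xi|^2 + c_2\, r^{2-\alpha} |\xi|^{\alpha}
  \;\longrightarrow\; c_1 |\xi|^2 \qquad (r \to 0^+).
\]
% Stable normalization (time scale r^alpha):
\[
  r^{\alpha}\, \Psi(\xi / r) = c_1\, r^{\alpha - 2} |\xi|^2 + c_2 |\xi|^{\alpha},
  \qquad c_1\, r^{\alpha-2} \to \infty \quad (r \to 0^+).
\]
```

No single exponent keeps both terms of order one, and at small scales the rescaled process resembles the Brownian motion, for which (BHI) fails in general open sets. This is consistent with the constant c (1.1) (x 0 , r, 2r) being unbounded as r → 0 + .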
Example 5.14 (Truncated stable processes). This example shows why Assumption C is essential for the boundary Harnack inequality in the form given in (BHI). Consider the truncated isotropic α-stable Lévy process X t in X = R k , α ∈ (0, 2), n ≥ 1. This is a pure-jump Lévy process with Lévy kernel ν(x, y) = c|x − y| −n−α 1 B(x,1) (y). Clearly, Assumptions A, B and D, as well as formula (2.9), hold true with α-stable-like scaling and R 0 = 1, but Assumption C is violated.
We examine two specific harmonic functions. Let v be a vector in R d with |v| = 2/3, let r ∈ (0, 1/6) be a small number, and define B 1 = B(x 1 , r) and B 2 = B(x 2 , r), where x 1 , x 2 ∈ R k are arbitrary points satisfying Suppose that x ∈ B 1 . By (2.12), we have When x ∈ B 2 , then, again by (2.12), Similar estimates hold true for f 2 . It follows that This ratio can be arbitrarily small when r → 0, and therefore (BHI) cannot hold for truncated stable process uniformly with respect to the domain. We remark that by an appropriate modification of the above example, one can even construct a single domain (an infinite union of balls) for which (BHI) is false. Also, modifications of the above example for other truncated processes, or for processes with super-exponential decay of the density of the Lévy measure can be given.
On the other hand, if the regular harmonic functions f and g (of the truncated α-stable process X t ) vanish outside a unit ball, then clearly f and g are harmonic in D also with respect to the standard (that is, non-truncated) isotropic α-stable process in R k . Therefore, the boundary Harnack inequality actually holds true for such functions. A different version of boundary Harnack inequality was proved for X t under some regularity assumptions on the domain of harmonicity in [61,62].
In this part we repeat the construction of smooth bump functions of [71]. We adopt the setting of Example 5.7: Z t is a diffusion process on an Ahlfors-regular n-space X, the transition semigroup T Z t of Z t satisfies sub-Gaussian bounds (5.5), and X t is defined to be the process Z t subordinated by an independent α/d w -stable subordinator, α ∈ (0, d w ). The generator of Z t serves as the (Neumann) Laplacian ∆ on X, and T Z t is the heat semigroup.
Let h = T Z t g for some t > 0 and g ∈ L 2 (X). One of the main results of [71], Theorem 2.2, states that given any compact K and ε > 0, there is a function f such that f ∈ D(∆ l ) for all l > 0, f (x) = h(x) on K and f (x) = 0 when dist(x, K) ≥ ε. There are at least three issues when one tries to apply this result in our setting.
First, Theorem 2.2 in [71] is given under the assumption that the spectral gap of ∆ is positive. However, this assumption is used only in the proof of Lemma 2.6, which contains a flaw: positivity of the spectral gap λ does not imply the inequality ‖P t f − f ‖ L 2 (X) ≤ λt ‖f ‖ L 2 (X) (see line 3 on page 1769 and line 12 on page 1773 in [71]). This issue has been resolved by the authors of [71] in an unpublished note, containing a corrected version of the proof of Lemma 2.6. The new argument does not involve the condition on the spectral gap, which therefore turns out to be superfluous. For future reference, we provide the corrected version of the proof of Lemma 2.6 below.
Second, to get Assumption B, we need to apply the above theorem with h(x) = 1 for x ∈ K, where h = T Z t g. This condition is satisfied when g(x) = 1 for all x ∈ X. However, such a function g is in L 2 (X) only when m is a finite measure, and the general case is not covered by [71]. For that reason, we choose to repeat the construction of [71] in L ∞ (X) (instead of L 2 (X)) setting.
Finally, for a scale-invariant boundary Harnack inequality, we need an upper bound for ‖∆f ‖ L ∞ (X) with explicit dependence on scale, that is, explicit in ε and the size (e.g. the diameter) of K. Such properties of the estimates are irrelevant in [71], but it turns out that they can be obtained by carefully following the proof of Theorem 2.2 in [71].
For the above reasons, we decide to give a complete proof of an L ∞ (X) version of Theorem 2.2 in [71]. However, it should be emphasized that the method was completely developed in [71]. Although we only need the result for g(x) = h(x) = 1 for all x ∈ X, for future reference we consider the general case.
Theorem A.1 (a variant of [71, Theorem 2.2]). Suppose that K ⊆ X is a compact set, ε, s > 0 and h = T Z s g for some g ∈ L ∞ (X). Then there is a function f ∈ L ∞ (X) such that f (x) = h(x) for x ∈ K, f (x) = 0 when dist(x, K) ≥ ε, and f ∈ D(∆ l ) for any l > 0. Furthermore, the L ∞ (X) norm of f is bounded by the L ∞ (X) norm of g, f is nonnegative if g is nonnegative, and for all l > 0 we have where c (A.1) = c (A.1) (l, ε dw /s, Z t ).
Proof. We divide the argument into five steps. All constants in this proof may depend not only on the parameters given in parentheses, but also on the space X and the process Z t . Since we never refer to the semigroup of the subordinate process X t , in this proof for simplicity we write T t = T Z t . Furthermore, also in this proof only, we extend ∆ to the L ∞ (X) generator of T t (recall that originally ∆ was defined as the C 0 (X) generator), and denote by ∆ L 2 (X) the L 2 (X) generator of T t , that is, the generator of the semigroup of operators T t acting on L 2 (X). Clearly, ∆f = ∆ L 2 (X) f m-a.e. whenever f ∈ D(∆) ∩ D(∆ L 2 (X) ).
Step 1. We begin with some general estimates. By the spectral theorem and the inequality λ l e −λt ≤ (le/t) l , for any f ∈ L 2 (X) and l ≥ 0, we have T t f ∈ D((∆ L 2 (X) ) l ), and ‖(∆ L 2 (X) ) l T t f ‖ L 2 (X) ≤ (le/t) l ‖f ‖ L 2 (X) .
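For completeness, the elementary inequality used in Step 1 can be checked by a one-variable maximization, sketched below. The auxiliary function h and the spectral resolution E λ of −∆ L 2 (X) are notation introduced here; the computation assumes l > 0 and t > 0 (for l = 0 the bound is trivial with the convention 0 0 = 1).

```latex
% Maximize h(\lambda) = \lambda^l e^{-\lambda t} over \lambda > 0 (l > 0, t > 0).
\[
  h'(\lambda) = \lambda^{l-1} e^{-\lambda t} \,(l - \lambda t),
  \qquad h'(\lambda) = 0 \iff \lambda = l/t .
\]
% The critical point is the global maximum, so
\[
  \sup_{\lambda > 0} \lambda^{l} e^{-\lambda t}
  = \Bigl( \frac{l}{t} \Bigr)^{l} e^{-l}
  = \Bigl( \frac{l}{e\,t} \Bigr)^{l}
  \le \Bigl( \frac{l\,e}{t} \Bigr)^{l} .
\]
% Spectral calculus for the nonnegative self-adjoint operator -\Delta_{L^2(X)}
% with spectral resolution E_\lambda (notation assumed here):
\[
  \bigl\| (\Delta_{L^2(X)})^{l}\, T_t f \bigr\|_{L^2(X)}^{2}
  = \int_{[0,\infty)} \lambda^{2l} e^{-2\lambda t} \, d\langle E_\lambda f, f \rangle
  \le \Bigl( \sup_{\lambda > 0} \lambda^{l} e^{-\lambda t} \Bigr)^{2} \| f \|_{L^2(X)}^{2} .
\]
```

In particular, the constant (le/t) l stated in Step 1 is sufficient, and the maximization shows that the smaller value (l/(et)) l would also do.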
In particular, given any s > 0 and ε > 0 it is possible to choose a strictly increasing sequence s j > 0 convergent to s, with s 0 = 0, such that if t j = s j − s j−1 (j ≥ 1), then lim j→∞ D(2 −j ε, s − s j ) = 0 and ≤ c 6 (l, ε dw /s) ε ldw < ∞ for any ε > 0, l ≥ 0. For example, one can take s j = (1 − 4 −dwj )s. Note, however, that the above series would diverge if t j decreased either too slowly or too rapidly.
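The example s j = (1 − 4 −dwj )s can be unpacked by direct computation, as sketched below. Only elementary algebra is used. The final ratio is offered as a heuristic indication of why the remaining time s − s j is short compared with the diffusive time scale of the distance 2 −j ε; the precise statement involves the quantity D, which is not restated here.

```latex
% Time steps for s_j = (1 - 4^{-d_w j}) s, j = 0, 1, 2, ...  (so s_0 = 0).
\[
  s - s_j = 4^{-d_w j}\, s,
  \qquad
  t_j = s_j - s_{j-1} = \bigl( 4^{d_w} - 1 \bigr)\, 4^{-d_w j}\, s
  \quad (j \ge 1).
\]
% The steps add up to the total time s (geometric series):
\[
  \sum_{j=1}^{\infty} t_j
  = \bigl( 4^{d_w} - 1 \bigr)\, s \cdot \frac{4^{-d_w}}{1 - 4^{-d_w}}
  = s .
\]
% Remaining time versus the diffusive time scale of distance 2^{-j}\varepsilon:
\[
  \frac{s - s_j}{(2^{-j} \varepsilon)^{d_w}}
  = 2^{-d_w j}\, \frac{s}{\varepsilon^{d_w}}
  \;\longrightarrow\; 0 \qquad (j \to \infty).
\]
```

Thus t j decays geometrically at the rate 4 −dw , faster than the rate 2 −dw at which the diffusive time scale of the shrinking distances 2 −j ε decays.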
Below we prove that T s−s j u j converges to a function f with the desired properties.

(A.3)
Step 4. We follow the corrected version of the proof of [71, Lemma 2.6]. Let l ≥ 0. For j ≥ 1 we have Observe that the results of Step 1 and the equality u
It follows that the sequence ∆ l T s−s j u j converges in L ∞ (X) as j → ∞ for every l ≥ 0. Therefore, if f (x) = lim j→∞ T s−s j u j (x), then for all l ≥ 0 we have f ∈ D(∆ l ) and ≤ c 7 (l, ε dw /s) ε ldw+n/2 m(K 0 ) g L ∞ (X) , as desired.
Step 5. By the definition of u j , for j ≥ 1 we have It follows that The right hand side converges to 0 as j → ∞. Hence, f (x) = T s g(x) = h(x) for x ∈ K. Furthermore, ‖u j ‖ L ∞ (X) ≤ ‖g‖ L ∞ (X) , and therefore also ‖f ‖ L ∞ (X) ≤ ‖g‖ L ∞ (X) . Finally, if g ≥ 0, then u j ≥ 0 for all j ≥ 1, and so f ≥ 0.
By choosing g(x) = h(x) = 1 and s = ε dw , we obtain the following result.
Corollary A.2. Suppose that K ⊆ X is a compact set and ε > 0. Then there is a function f ∈ L ∞ (X) such that f (x) = 1 for x ∈ K, f (x) = 0 when dist(x, K) ≥ ε, and f ∈ D(∆ l ) for any l > 0. Furthermore, 0 ≤ f (x) ≤ 1 for all x ∈ X, and for all l > 0 we have where c (A.4) = c (A.4) (l, Z t ).
In general, the boundary of the set {x ∈ X : f (x) > 0} might be highly irregular. However, when we relax the smoothness hypothesis on f , we can require f to be positive on an arbitrary given open set.
Proof. Let f 0 be the function constructed in Theorem A.1 for h(x) = g(x) = 1, and denote by V an arbitrary open set with the following properties: {x ∈ X : f (x) > 0} ⊆ V ⊆ {x ∈ X : dist(x, K) < 2ε}, and m(∂V ) = 0. For example, one can take V = {x ∈ X : dist(x, K) < r} for a suitable r ∈ (ε, 2ε). Let B j , j = 1, 2, . . ., be a family of balls contained in V ∩ {x ∈ X : f 0 (x) < 1/2} such that twice smaller balls B j form a countable covering of V ∩ {x ∈ X : f 0 (x) < 1/2}, and let f j be the function as in Corollary A.2, equal to 1 on B j and vanishing on X \ B j . Finally, choose ε j > 0 so that for l = 0, 1, . . . , L, Then f = f 0 + ∑ ∞ i=1 ε i f i has all the desired properties, with ε replaced by 2ε.
Corollary A.4. Assumption B holds with α-stable scaling.

Therefore,
‖Af ‖ L ∞ (X) ≤ c 1 (‖∆f ‖ L ∞ (X) ) α/dw . Let 0 < r < R, and take K = B(x 0 , r), D = B(x 0 , R), ε = R − r. We see that This gives half of the α-stable scaling property (f), and the other half is proved in a similar manner.