Semi-Lagrangian schemes for linear and fully non-linear diffusion equations

For linear and fully non-linear diffusion equations of Bellman-Isaacs type, we introduce a class of approximation schemes based on differencing and interpolation. As opposed to classical numerical methods, these schemes work for general diffusions with coefficient matrices that may be non-diagonal dominant and arbitrarily degenerate. In general such schemes have to have a wide stencil. Besides providing a unifying framework for several known first order accurate schemes, our class of schemes includes new first and higher order versions. The methods are easy to implement and more efficient than some other known schemes. We prove consistency and stability of the methods, and for the monotone first order methods, we prove convergence in the general case and robust error estimates in the convex case. The methods are extensively tested.


Introduction
In this paper we introduce and analyze a class of monotone approximation schemes for fully non-linear diffusion equations of Bellman-Isaacs type, The coefficients a α,β = 1 2 σ α,β σ α,β ⊤ , b α,β , c α,β , f α,β and the initial data g take values respectively in S N , the space of N × N symmetric matrices, R N , R, R, and R. We will only assume that a α,β is positive semi-definite, thus the equation is allowed to degenerate and does not have any smooth solutions in general. Under suitable assumptions (see Section 2), the initial value problem (1.1)-(1.2) has a unique, bounded, Hölder continuous, viscosity solution u. This function is the upper or lower value of a stochastic differential game, or, if A or B is a singleton, the value function of a finite horizon, optimal stochastic control problem [25].
We introduce a family of schemes that we call Semi-Lagrangian (SL) schemes. They are a type difference-interpolation schemes and arise as time-discretizations of the following semi-discrete equation where L α,β k is a monotone difference approximation of L α,β and I h is a monotone interpolation operator on the spatial grid X h . For more details see Section 3. Typically these scheme are first order wide-stencil schemes, and if the matrix a α,β is bad enough, the stencil has to keep increasing as the grid is refined to have convergence. They include as special cases schemes from [16,19,7,12], more efficient versions of these schemes, and a new second order compact stencil scheme. There are two main advantages of these schemes: (i) they are easy to understand and implement, and more importantly, (ii) they are consistent and monotone for every positive semi-definite diffusion matrix a α,β = 1 2 σ α,β σ α,β ⊤ . The last point is important because monotone methods are known to converge to the correct solution [3], while non-monotone methods need not converge [21] or can even converge to false solutions [23].
Classical finite difference approximations (FDMs) of (1.1) are not monotone (of positive type) unless the matrix a α,β satisfies additional assumptions like e. g. being diagonally dominant [18]. More general assumptions are given in e. g. [5,14] but at the cost of increased stencil length. In fact, Dong and Krylov [14] proved that no fixed stencil FDM can approximate equations with a second derivative term involving a general positive semi-definite matrix function a α,β . Note that this type of result has been known for a long time, see e. g. [20,12]. Some very simple examples of such "bad" matrices are given by and these type of matrices appear in Finance, Stochastic Control Theory, and Mean Curvature Motion. The third example leads to quasi-linear equations and will not be considered here, we refer instead to [12].
To obtain convergent or monotone methods for problems involving non-diagonally dominant matrices, we know of two strategies: (i) The classical method of rotating the coordinate system locally to obtain diagonally dominant matrices a α,β , see e. g. Section 5.4 in [18], and (ii) the use of wide stencil methods. The two solutions seem to be somewhat related, but at least the defining ideas and implementation are different. Both ways lead to methods that have reduced order compared to standard schemes for diagonally dominant problems. But it seems to us that it is much more difficult to implement the first strategy.
In addition to the wide-stencil methods mentioned above, we also mention the method of Bonnans-Zidani [5] which is not an SL type scheme. Schemes for other types of equations related to our SL schemes have been studied by Crandall-Lions [12] for the Mean Curvature Motion equation, by Oberman [22] for Monge-Ampère equations, and by Camilli and the second author for non-local Bellman equations [8]. The terminology SL schemes is already used for schemes for transport equations, conservation laws, and first order Hamilton-Jacobi equations. In the Hamilton-Jacobi setting, these schemes go back to the 1983 paper [9] of Capuzzo-Dolcetta.
The rest of this paper is organized as follows. In the next Section we explain the notation and state a well-posedness and regularity result for (1.1)-(1.2). The SL schemes are motivated and defined in an abstract setting in Section 3, and in Section 4 we prove that they are consistent, monotone, L ∞ -stable, and convergent. We provide several examples of SL schemes in Section 5, including the linear interpolation SL (LISL) scheme. This is the basic example of this paper, and it is a first order scheme that can be defined on unstructured grids.
Our SL schemes make use of monotone interpolation, and higher order interpolation is not monotone in general. But for essentially monotone solutions, we can use monotone cubic Hermite interpolation (see Fritsch and Carlson [17] and Eisenstat, Jackson and Lewis [15]) to obtain new second order schemes called monotone cubic interpolation SL (MCSL) schemes. These new schemes are defined in Section 6, and in contrast to the LISL schemes they are compact stencil schemes. Note well that in the special case of first order HJ-equations with monotone solutions, these schemes are consistent, monotone, second order schemes! To our knowledge, this is the first example of a monotone scheme which is more than first order accurate in the HJ-setting.
We discuss various issues concerning the SL schemes in Section 7. We compare the LISL scheme to the scheme of Bonnans-Zidani [5] and find that the LISL scheme is much easier to understand and implement, and in general it is much faster. However, on bounded domains the LISL scheme will in some cases over-step the boundaries and some ad hoc solution has to be found. This problem is avoided by the Bonnans-Zidani scheme at the cost of being less accurate near the boundary than in the interior. Finally, we explain that the SL schemes can be interpreted as collocation methods for derivative free equations, or as dynamic programming equations of discrete stochastic differential games or optimal control problems.
In Sections 8,9 and Appendix B, we produce robust error estimates for convex equations (i. e. B is a singleton in (1.1)). These estimates are obtained through the regularization method of Krylov and apply to degenerate equations, non-smooth solutions, and both the LISL and MCSL schemes. Finally, in Section 10, our methods are extensively tested. In particular we find that the LISL and MCSL schemes yield much faster methods to solve the finance problem of Bruder, Bokanowski, Maroso, and Zidani [6].

Notation and well-posedness
In this section we introduce notation and assumptions, and give a well-posedness and comparison result for the initial value problem (1.1) -(1.2).
We denote by ≤ the component by component ordering in R M and the ordering in the sense of positive semi-definite matrices in S N . The symbols ∧ and ∨ denote the minimum respectively the maximum. By | · | we mean the Euclidean vector norm in any R p type space (including the spaces of matrices and tensors). Hence if X ∈ R N,P , then |X| 2 = i,j |X ij | 2 = tr(XX ⊤ ) where X ⊤ is the transpose of X.
If w is a bounded function from some set Q ′ ⊂ Q ∞ into either R, R M , or the space of N × P matrices, we set Furthermore, for δ ∈ (0, 1], we set Let C b (Q ′ ) and C 0,δ (Q ′ ), δ ∈ (0, 1], denote respectively the space of bounded continuous functions on Q ′ and the subset of C b (Q ′ ) in which the norm | · | δ is finite. Note in particular the choices Q ′ = Q T and Q ′ = R N . In the following we always suppress the domain Q ′ when writing norms.
The proof is standard. Assumption (A1) can be relaxed in many ways, e. g. using weighted norms, Hölder or uniform continuity, etc. But in doing so, solutions can become unbounded and less smooth, and the analysis becomes harder and less transparent. Therefore we will not consider such extensions in this paper.
By solutions in this paper we always mean viscosity solutions, see e. g. [11,25].

Definition of SL schemes
In this section we propose a class of monotone approximation schemes for (1.1)-(1.2) which we call Semi-Lagrangian or SL schemes. This class includes (parabolic versions of) the "control schemes" of Menaldi [19] and Camilli and Falcone [7] and the monotone schemes of Crandall and Lions [12]. It also includes SL schemes for first order Bellman equations [9,16], and it allows for more effective versions of these schemes as discussed in Section 5. For a motivation for the name, we refer to Remark 7.2. For the time discretization we propose a generalized midpoint rule that includes explicit, implicit, and a second order Crank-Nicolson like approximation. Note that the equation is non-smooth in general, so the usual way of defining a Crank-Nicolson scheme would only give a first order scheme in time.
The schemes will be defined on a possibly unstructured family of grids {G ∆t,∆x } with for ∆t, ∆x > 0. Here 0 = t 0 < t 1 < · · · < t n < t n+1 satisfy max n ∆t n ≤ ∆t where ∆t n = t n − t n−1 , and X ∆x = {x i } i∈N is the set of vertices or nodes for a non-degenerate polyhedral where diam is the diameter of the set and B Tj is the greatest ball contained in T j .
To motivate the numerical schemes, we write σ = (σ 1 , σ 2 , ..., σ m , ..., σ P ) where σ m is the m-th column of σ and observe that for k > 0 and smooth functions φ, These approximations are monotone (of positive type) and the errors are bounded by 1 48 P |σ| 4 0 |D 4 φ| 0 k 2 and 1 2 |b| 2 0 |D 2 φ| 0 k 2 respectively. To relate these approximations to a grid G, we replace φ by its interpolant Iφ on that grid and obtain If the interpolation is monotone (positive) then the full discretization is still monotone and represents a typical example of the discretizations we consider below.
To construct the general scheme, we generalize the above construction. We now consider general finite difference approximations of the differential operator L α,β [φ] in (1.1) defined as for k > 0 and some M ≥ 1, where for all smooth functions φ, This approximation and interpolation yield a semi-discrete approximation of (1.1), and the final scheme can then be found after discretizing in time using a parameter θ ∈ [0, 1], As initial conditions we take Remark 3.1. For the choices θ = 0, 1, and 1/2 the time discretization corresponds to respectively explicit Euler, implicit Euler and midpoint rule. For θ = 1/2, the full scheme can be seen as generalized Crank-Nicolson type discretizations.

Analysis of SL schemes
In this section we prove that the SL scheme (3.3) is consistent and monotone, and we present L ∞ -stability, existence, uniqueness, and convergence results for these schemes. In Section 8 we also give error estimates when B is a singleton and equation (1.1) is convex.
For the approximation L α,β k and interpolation I we will assume that There are K ≥ 0, r ∈ N such that |(Iφ) − φ| 0 ≤ K|D p φ| 0 ∆x p for (I1) all N ∋ p ≤ r and all smooth functions φ.
There is a non-negative basis of functions {w j (x)} j such that (I2) Under assumption (Y1), a Taylor expansion shows that L α,β k is a second order consistent approximation satisfying (3.2). If we assume also (I1) , it then follows (3.2), and by (I1), Remark 4.1. Assumption (Y1) is similar to the local consistency conditions used in [18]. The O(k 4 ) terms insure that the method is second order accurate as k → 0. Convergence will still be achieved if we relax O(k 4 ) to o(k 2 ) as k → 0.
Remark 4.2. An interpolation satisfying (I2) is said to be positive or monotone and preserves monotonicity of the data. Note that such an interpolation Iφ does not use (exact) derivatives to reconstruct the function φ. Typically I will be constant, linear, or multi-linear interpolation (i. e. r ≤ 2 in (I1)) since higher order interpolation is not monotone in general. For later use we note that from (I1) and (I2) it follows that is monotone if the following CFL condition holds, Remark 4.3. By parabolic regularity D 2 ∼ ∂ t which means that e. g. |D 2 φ tt | 0 is proportional to |φ ttt | 0 . When θ = 1/2 ("Crank-Nicolson"), the scheme (3.3) is second order accurate in time.
Proof. It is immediate that the scheme (3.3) is consistent with (1.1) with a truncation error bounded by To prove part (b), we note that since i w i ≡ 1 we have This quantity is non-negative by (I2), and i l α,β,n−1+θ j,i = M k 2 by (I1), (I2), and i w i ≡ 1. Therefore the only non-zero coefficients in (4.2) at (t n , x j ) are B α,β,n,n Existence, uniqueness, stability, and convergence results are given below. In the following, we denote by c α,β,+ the positive part of c α,β .
Proof. Existence and uniqueness of bounded solutions follow by induction. Let t = t n and assume U n−1 is a known bounded function. For ε > 0 we define the operator T by Note that the fixed point equation U n = T U n is equivalent to equation (3.3). By the definition and sign of the B-coefficients we see that ) ≥ 0 and ε(1−∆t n θ sup α,β |c α,β,+ | 0 ) < 1 for all j, n, α, β. Taking the supremum over all j and interchanging the role of U andŨ proves that T is a contraction on the Banach space of bounded functions on X ∆x under the sup-norm. Existence and uniqueness then follows from the fixed point theorem (for U n ) and for all of U by induction since U 0 = g is bounded.

Examples of SL schemes
, including previous approximations that have appeared in [16,19,7,12] plus more computational efficient variants. 1. The approximation of Falcone [16] (see also [9]), The corrected version of the approximation of Camilli-Falcone [7] (see also [19]), The new approximation obtained by combining approximations 1 and 2, for j ≤ P , y α,β,± k,P +1 = k 2 b α,β and M = P + 1. 5. The new, more efficient version of approximation 3, Approximation 5 is always more efficient than 3 in the sense that it requires fewer arithmetic operations. The most efficient of approximations 3, 4, and 5, is 4 when σ α,β does not depend on α, β but b α,β does, and 5 in the other cases.

Linear interpolation SL scheme (LISL).
To keep the scheme (3.3) monotone, linear or multi-linear interpolation is the most accurate interpolation one can use in general. In this typical case we call the full scheme (3.3)-(3.4) the LISL scheme, and we will now summarize the results of Section 4 for this special case.
3) hold, then there exists a unique bounded and L ∞ -stable solution U h of the LISL scheme converging uniformly to the solution u of (1.1)-(1.2) as ∆t, k, ∆x k → 0. From this result it follows that the scheme is at most first order accurate, has wide and increasing stencil and a good CFL condition. From the consistency error and the definition of L α,β k the stencil is wide since the scheme is consistent only if ∆x/k → 0 as ∆x → 0 and has stencil length proportional to Here we have used that if (A1) holds and σ ≡ 0, then typically y α,β,± k,i ∼ k. Also note that if k = ∆x 1/2 , then l ∼ ∆x −1/2 . Finally, in the case θ = 1 the CFL condition for (3.3) is ∆t ≤ Ck 2 ∼ ∆x when k = O(∆x 1/2 ), and it is much less restrictive than the usual parabolic CFL condition, ∆t = O(∆x 2 ).

A second order SL scheme for monotone solutions
In this section we introduce new second order SL schemes of the form (3.3)-(3.4) for non-degenerate grids of tensor product type. These schemes are based on monotone cubic Hermite interpolation [17,15], and are consistent for monotone solutions of the scheme. We will call these schemes monotone cubic interpolation SL schemes or MCSL schemes in short.
To define monotone cubic Hermite interpolation, we start by considering a 1D function φ.
where φ i = φ(x i ) and d i is an estimate of the derivative of φ at x i . It follows that To obtain a fourth order accurate interpolant, φ ′ i must be at least third order accurate. We will use the symmetric fourth order approximation However, the resulting interpolation is not monotone. Necessary and sufficient conditions for monotonicity were found by Fritsch and Carlson [17] (see also [24]): If ∆ i = 0, then monotonicity follows if and only if d i = d i+1 = 0, and if Eisenstat, Jackson and Lewis [15] give an algorithm that modifies the derivative approximation d i such that the above conditions are fulfilled, and for monotone data the resulting interpolant is a C 1 fourth order approximation. We will only consider C 0 interpolants, and in that case their algorithm simplifies to: Step 1 Compute the initial d i using (6.2). Step Step 3 Set α i := max{α i , 0} and β i := max{β i , 0}. Step Multidimensional interpolation operators are obtained as tensor products of onedimensional interpolation operators, i. e. by interpolating dimension by dimension.
Lemma 6.1. The above monotone cubic interpolation is always monotone and satisfies (I2). If the interpolated function is strictly monotone between grid points, then (I1) holds with r = 4 and the method is fourth order accurate.
Proof. Assumption (I2) holds by construction. The error estimate follows from [15], since the above algorithm coincides with the two sweep algorithm given there when n = 1 interval is considered. In [15] it is proved that this algorithm gives third order accurate approximations to the exact derivatives and hence the cubic Hermite polynomial constructed using this approximation is fourth order accurate.
Remark 6.1. Carlson and Fritsch [10] constructed an alternative monotone bicubic interpolation algorithm for R 2 . We are not aware of any work on high order monotone interpolation on unstructured grids.
By the Lemma 6.1 and the results in Section 4 we have the following result: Corollary 6.2. Assume that (A1), (Y1) hold, and that for all h ∈ (0, 1), there exists a bounded solution U h of the MCSL scheme such that IU h is strictly xmonotone between points in the x-grid X ∆x . (a) The MCSL scheme is monotone if the CFL condition (4.3) hold.
, and hence the scheme is second order accurate when k = O(∆x) and ∆t = O(∆x 2 ) for θ = 1 2 or ∆t = O(∆x) for θ = 1 2 . (c) If 2θ∆t sup α,β |c α,β | 0 ≤ 1, then the solution U h is unique, L ∞ -stable, and converges uniformly to the solution u of (1.1)-(1.2) as ∆t, k, ∆x 4 k 2 → 0. Remark 6.2. The MCSL scheme is always monotone, but for non-monotone solutions the scheme is not consistent when k = O(∆x) since the consistency error then is O(∆t + k + ∆x 2 k 2 ). Moreover, it is easy to see that the MCSL scheme has strictly monotone solutions (between grid points) whenever the collocation equation (7.1) (see Section 7) has strictly monotone solutions (between grid points). To prove such type of results one can use comparison principle arguments, and we refer to Appendix A for results concerning equation (1.1). Since (7.1) satisfies the comparison principle (the proof is essentially a simplified version of the proof in Appendix B), the argument proving Lemma A.1 shows that its solutions are monotone under assumption (A2) in Appendix A. Under this assumption, existence of a monotone solution follows from Theorem 4.2 in Section 4. Remark 6.3. It is well known that consistent monotone methods for first order equations are at most first order accurate in general. However, the MCSL scheme is an example showing that second order consistent monotone schemes are possible if the solutions have special structure: When σ α,β ≡ 0, the MCSL scheme is a monotone second order scheme for a first order equation when solutions are monotone. Experts we have talked to seem to be surprised by this fact.
7. Discussion 7.1. Comparison with the scheme of Bonnans-Zidani (BZ). In the paper [5] (see also [4,6]) Bonnans and Zidani suggest an alternative approach to discretize non-linear degenerate diffusion equations. Their idea is to approximate the diffusion matrix a α,β by a nicer matrix a α,β k which admits monotone finite difference approximations. For every k ∈ N they find a stencil and positive numbers a α,β k,ξ such that This new diffusion matrix a α,β k gives a diffusion term that can be decomposed into a linear combination of directional derivatives and these are again approximated by central difference approximations, This approximation is monotone by construction and respects the grid. In two space dimensions, a α,β k can be chosen such that |a α,β − a α,β k | = O(k −2 ) (cf. [4]), and then it is easy to see that the consistency error is When b α,β ≡ 0, the BZ scheme can be obtained from (3.3) by replacing our L α,β k by the above Bonnans-Zidani diffusion approximation. This scheme shares many properties with the LISL scheme, it is at most first order accurate (take k ∼ ∆x −1/2 ), it has a similar wide and increasing stencil, and it has a similar good CFL condition ∆t ≤ Ck 2 ∆x 2 (∼ ∆x when k ∼ ∆x −1/2 ). To understand why the stencil is wide, simply note that k by definition is the stencil length and that the scheme is consistent only if k → ∞ and k∆x → 0. The typical stencil length is k ∼ ∆x −1/2 , just as it was for the LISL scheme.
The main drawback of this method is that it is costly since we must compute the matrix a α,β k for every x, t, α, β in the grid. The LISL scheme is easier to understand and implement and runs faster. Later we will see numerical indications that LISL runs at least 10 times faster than the BZ scheme on some test problems. The BZ scheme has the advantage that it is easy to modify to prevent it from leaving the domain (accuracy is then reduced or monotonicity is lost). It is less natural to do this for the LISL scheme. However, in many problems it is not necessary to do any modification near the boundary as we will see below.
The MCSL scheme in the typical case when k = ∆x, is a second order accurate and compact stencil scheme having the usual (not so good) CFL conditions for parabolic problems ∆t ∼ k 2 = ∆x 2 . It is far more efficient than the other two schemes as can be seen in Section 10, but it is only guaranteed to converge when the computed solutions are essentially monotone. The other two schemes "always" converge. We have summarized our findings in the  [16,7]. The idea is that if denotes the interpolant space associated to the interpolation I, equation (3.3) can be stated in the following equivalent way: Find U ∈ W ∆x (Q T ) solving In general W ∆x can be any space of approximations which is interpolating on the grid X ∆x , e. g. a space of splines. We do not consider this case here.

Stochastic game/control interpretation.
In general the scheme (3.3)-(3.4) can be interpreted as the dynamical programming equation of a discrete stochastic differential game. We will now try to explain this in the less technical case when B is a singleton and the game simplifies to an optimal stochastic control problem.
Assume that (A1) holds, and for simplicity, that c α (t, x) ≡ 0 and the other coefficients are independent of t. Then it is well-known (cf. [25]) that the (viscosity) solution u of equation (1.1)-(1.2) is the value function of the stochastic control problem: where A is a set of admissible A-valued controls and the diffusion process X s = X t,x,α(·) s is constrained to satisfy the SDE This result is a consequence of dynamical programming (DP) and (1.1) is called the DP equation of the control problem (7.2)-(7.3). Similarly, the schemes (3.3)-(3.4) are DP equations (at least in the explicit case) of suitably chosen discrete time and space control problems approximating (7.2)-(7.3). We refer to [18] for more details.
We take the slightly different approach explored in [9,16,19,7] to show the relation to control theory. The idea is to write the SL scheme in collocation form (7.1) and show that (7.1) is the DP equation of a discrete time continuous space optimal control problem. We illustrate this approach by deriving an explicit scheme involving L α k as defined in part 4 Section 5.1. Let {t 0 = 0, t 1 , . . . , t M = T } be discrete times and consider the discrete time approximation of (7.2)-(7.3) given bỹ 4)X m = x,X n =X n−1 + σ αn (X n−1 ) k n ξ n + b αn (X n−1 ) k 2 n η n , n > m, (7.5) where k n = (P + 1)∆t n , A M ⊂ A is an appropriate subset of piecewise constant controls, and ξ n = (ξ n,1 , . . . , ξ n,P ) ⊤ and η n are mutually independent sequences of i. i. d. random variables satisfying P (ξ n,1 , . . . , ξ n,P , η n ) = ±e j = 1 2(P + 1) if j ∈ {1, . . . , P }, P (ξ n,1 , . . . , ξ n,P , η n ) = e P +1 = 1 P + 1 , (e j denotes the j-th unit vector) and all other values of (ξ n,1 , . . . , ξ n,P , η n ) have probability zero. Here we have used a weak Euler approximation of the SDE coupled with a quadrature approximation of the integral. Note thatX ≈ X andũ ≈ u when ∆t is small. By DP P +1 , we find (7.1) with θ = 0. In [7], a similar argument is given in the stationary case for schemes involving the L α k of part 3 Section 5.1. In fact it is possible to identify all L α k 's appearing in Section 5.1 with DP equations of suitably chosen discrete time continuous space control problems. However assumption (Y1) is not strong enough for this approach to work for the general L α k defined in Sections 3 and 4. Remark 7.1. A DP approach naturally leads to explicit methods for time dependent PDEs. But implicit methods can also be derived using a trick. Discretize the PDE in time by backward Euler to find a (sequence of) stationary PDEs and use the DP approach on each stationary PDE. For stationary problems the DP equation is always implicit, so the result is an implicit iteration scheme.
Remark 7.2. By the definition of L α k and (Y1), x + y α,± i,k can be seen as a short time approximation of (7.3). Hence the scheme (3.3) tracks particle paths approximately. In fact by the above discussion we might say that the scheme follows particles in the mean because of the expectation. For first order PDEs, schemes defined in this way are called SL schemes by e. g. Falcone. Moreover, in this case our schemes will coincide with the SL schemes of Falcone [16] in the explicit case. This explains why we choose to call these schemes SL schemes also in the general case.

Error estimates in the convex case
We will derive error bounds in the case when B is a singleton and (1.1) is convex. It is not known how to prove such results in the general case. Here and in the following, we do not indicate the trivial β dependence any more. For simplicity we also take a uniform time-grid, letting G = ∆t {0, 1, . . . , N T } × X ∆x in this section. Let Q ∆t,T := ∆t {0, 1, . . . , N T } × R N and consider the intermediate equation The first step is now to find a bound on |U − V |.
Assuming V n has p bounded derivatives, we rearrange the equation and use (I1) to see that By the CFL condition (4.3), the coefficients of the above inequality are all non-negative. Hence since W n ≤ |W n | 0 := sup i |W n i |, we may replace W n by |W n | 0 on the right hand side. Moreover, since I|W n | 0 = |W n | 0 and L α k [|W n | 0 ] = 0, the upper bound on the right hand side then reduces to The same bound holds if we replace W by −W , and hence we can conclude that

an iteration then reveals that
when ∆t is small enough. Since V n is Lipschitz (p = 1), the lemma follows.
Next we want to estimate |V − u| when u solves (1.1)-(1.2). This can be done using the regularization method of Krylov if we can find suitable continuity and continuous dependence results for the scheme. These results rely on the following additional (covariance-type) assumptions: Whenever two sets of data σ, b andσ,b are given, the corresponding approximations L α k , y α,± k,i andL α k ,ỹ α,± k,i in (3.1) satisfy when σ, b, y ± k are evaluated at (t, x) andσ,b,ỹ ± k are evaluated at (t, y) for all t, x, y. In Section 9 we will prove the following error estimate.
It also follows from the regularity results in Section 9 (see Proposition 9.4) that |V n | 1 ≤ 2C T , so by Lemma 8.1 and Theorem 8.2 we have the following result.
This error bound applies to both the LISL and MCSL schemes, and it also holds for unstructured grids. If the solutions are more regular, it is possible to obtain better error estimates. But general and optimal results are not available. The best estimate in our case is O(∆x 1/5 ) which is achieved when k = O(∆x 2/5 ) and ∆t = O(k 2 ). Note that the CFL conditions (4.3) already imply that ∆t = O(k 2 ) if θ < 1. Also note that the above bound does not show convergence when k is optimal for the LISL scheme (k = O(∆x 1/2 )) or the MCSL scheme (k = O(∆x)).
Remark 8.1. These results are consistent with results for LISL type schemes for stationary Bellman equations. In fact if all coefficients are independent of time and c α (x) < −c < 0, then by combining the results of [7] and [1], exactly the same error estimate is obtained for the solution of a particular stationary LISL scheme and the unique stationary Lipschitz solution of (1.1).

Proof of Theorem 8.2
We start by an existence and uniqueness result. The proof is similar to (but simpler than) the proof of Theorem 4.2 with the modification that the fixed point is achieved in the Banach space C b (R N ) instead of the space of bounded functions on X ∆x .
Proof. Part (i) follows from Theorem 9.2 and Remark 9.1 sinceŨ ≡ 0 satisfies (9.1) with (σ α ,b α ,c α ,f α ,g α ) = (σ α , b α , c α , 0, 0). Part (ii) follows by taking U =Ũ and x = y. Now we extend the scheme (8.1) to the whole space Q T . One way to do this and to obtain continuous in time solutions, is to pose initial conditions on [0, ∆t) by interpolating between g(x) and U (∆t, x) where U is the solution of (8.1)-(8.2).
From the previous results for U the existence, uniqueness, and properties of V easily follow.
Then there is a constant C T ≥ 0 independent of k, ∆t, ∆x such that for all t ∈ [0, T ], Proof. First note that the initial data on [0, ∆t] is uniformly bounded and Lipschitz continuous in x and t by construction and Corollary 9.3. (a) Existence of a bounded x-continuous solution follows from repeated use of Lemma 9.1 since we have initial conditions on [0, ∆t]. Continuity in time follows from Theorem 9.2 (with x = y) since the data is t-continuous.
(c) Note that by construction of the initial data and Theorem 9.2 with x = y, the result holds for t ∈ [0, ∆t], and then the result holds for any t > ∆t by another application of Theorem 9.2 with x = y.
Using Krylov's method of shaking the coefficients, we will now find smooth subsolutions of (9.2). First we introduce the auxiliary equation where τ e φ(t, x) = φ(t, x + e) and V ε (∆t, x) is obtained by first solving (9.4) for discrete times t n = n∆t. For this equation to be well-defined for t ∈ (∆t, T ], the data and y α,± k,i must be defined for t ∈ (−∆t − ε 2 , T + ε 2 ]. But this is ok since one can easily extend these functions to t ∈ [−r, T + r] for any r > 0 in such a way that (A1), (Y1), (Y2) still hold. Also note that and hence (9.4) is an equation of the same type as (9.2) (with different A and shifted coefficients) satisfying (A1), (Y1), (Y2) whenever (9.2) does. By Proposition 9.4 there is a unique solution V ε of (9.4)-(9.5) in [0, T + ∆t + ε 2 ] × R N . Let U ε (t, x) := V ε (t + ∆t + ε 2 , x) and define by convolution, x ε ), and Note that U ε is well defined on the time interval [−∆t, T ]. By the next result it is the sought after smooth subsolution of (9.2).
Proposition 9.5. Under the assumptions of Proposition 9.4, the function U ε defined in (9.7) satisfies Proof. The regularity estimates in (i) are immediate from properties of convolutions and the regularity of V ε . The bound on U ε − V (in [0, T ]) in (ii) follows from Proposition 9.4 (c) and (A1) which imply and regularity of V ε along with properties of convolutions, To see that U ε is a subsolution of (9.2), first note that from the definition of U ε and (9.4) it follows that for all (t, x) ∈ [−ε 2 , T ] × R N , |e|, s 2 ≤ ε, and α ∈ A. Now we change variables from (t + s, x + e) to (t, x) to find that for all (t, x) ∈ [0, T ] × R N , |e|, s 2 ≤ ε, and α ∈ A. Then we multiply by ρ ε (s, e) and integrate w. r. t. (s, e). To see what the result is, note that For the whole equation we then have, x) + f α (t θ , x) for all (t, x) ∈ Q T and α ∈ A. Since this inequality holds for all α, it follows that U ε is a subsolution of (9.2) in all of Q T .
We are now in a position to prove the error estimate given in Theorem 8.2.
Proof of Theorem 8.2. Let U ε be defined in (9.7). By Proposition 9.5 (i) and Lemma 4.1 (a), in Q T . Moreover, by Proposition 9.5 (ii), It follows that there is a constant C ≥ 0 such that is a classical subsolution of (1.1)-(1.2). By the comparison principle and hence by Proposition 9.5 (ii), We minimize w. r. t. ε and find that The lower bound on u−U follows with symmetric -but much easier -arguments where a smooth supersolution of the equation (1.1) is constructed. Consistency and comparison for the scheme (9.2) is then used to conclude. In view of Lemma 4.1, the lower bound is a direct consequence of Theorem 3.1 (a) in [2].

Numerical results
In the following, we apply the LISL and MCSL schemes to linear and convex test problems in two space-dimensions. Hence all problems in this section are independent of β. For the LISL scheme, we choose k = √ ∆x and a regular triangular grid, whereas for the MCSL scheme we choose k = ∆x and a regular rectangular grid. If not stated otherwise, we use θ = 0 (explicit methods), CFL condition ∆t = k 2 , and approximation 5.1.5 for L α,β . As error measure we will always use the L ∞ -norm, and the error rates are calculated as r i = ln ei −ln ei−1 ln ∆xi −ln ∆xi−1 . All calculations are done in Matlab, on an INTEL Core2 Duo Mobile T7700, 2.4Ghz Laptop.
10.2. Linear problem with non-smooth solution. The second problem we test is a problem with non-smooth exact solution in [−π, π] 2 given by The coefficients in (1.1) are given by and we pose Dirichlet boundary conditions. This is a monotone non-smooth problem, and we obtain order one half applying the LISL scheme and order one applying the MCSL scheme, i. e. reduced rates, see Table 3.
b) The next test problem has exact solution u(t, x 1 , x 2 ) = (2 − t) sin x 1 sin x 2 and coefficients and control set given by In both examples, the control is discretized on the unit circle by 4π h grid points. The results at t = 0.5 are given in Table 4 , where again the grid is adapted to monotonicity. As expected for smooth solutions, the LISL scheme yields a numerical order of convergence of one, whereas the MCSL scheme yields order two.  Table 4: Results for optimal control problems at t = 0.5, grid adapted to monotonicity 10.4. Convergence test for a super-replication problem. We consider a test problem from [6] which was used to test convergence rates for numerical approximations of a super-replication problem from finance we will consider in Subsection 10.5. The corresponding PDE is as exact solution as in [6], and then f is forced to be In [6] η(x) = x, while we take η(x) = x(3 − x) to prevent the LISL scheme from overstepping the boundaries. Note that changing η does not change the solutions as long as η > 0 in the interior of the domain, see [6], and hence the above equation is equivalent to the equation used in [6]. The initial values and Dirichlet boundary values at x 1 = 0 and x 2 = 0 are taken from the exact solution. As in [6], at x = 3 and y = 3 homogeneous Neumann boundary conditions are implemented.
To approximate the values of α 1 , α 2 , the Howard algorithm is used (see [6]), which requires an implicit time discretization, so we choose θ = 1. The minimization is done over α 1,k + iα 2,k = e 2πik/2N∆x , k = 1, . . . , N ∆x , where N ∆x is the number of space grid points, i. e. N ∆x = 3/∆x. The results at t = 1 are given in Table 5. Again, the numerical order of convergence is approximately one when the LISL scheme is used and approximately two for the MCSL scheme. Note that compared to the results in [4], for comparable accuracies the LISL scheme is about ten times faster, the MCSL scheme about 100 to 1000 times faster.   16 11.58 7.50e-2 5.03e-3 1.86 149.24 Table 5: Results for the convergence test for the super-replication problem at t = 1 10.5. A super-replication problem. We apply our method to solve a problem from finance, the super-replication problem under gamma constraints considered in [6]. It consists of solving equation ( The solution obtained with the LISL scheme is given in Figure 1 and coincides with the solution found in [6]. It gives the price of a put option of strike and maturity 1, and x 1 and x 2 are respectively the price of the underlying and the price of the forward variance swap on the underlying. Proof. Assume that u ≥ 0 and c α,β ≤ 0 and let v(t, x) = u(t, x + he). Since v(t, x) satisfies (1.1)-(1.2) at the point (t, x + he), an application of (A2) shows that it also is a supersolution at the point (t, x). By the comparison principle u ≤ v and the theorem is proved. In the general case consider w = e − sup α,β |c|0t (u + |u| 0 ), and note that w ≥ 0 and the corresponding c α,β -coefficient c α,β − sup α,β |c α,β | 0 is non-positive. The first result then applies to w, and hence the theorem holds for u. If u is non-smooth, thenf is well-defined only if a e ≡ 0 ≡ b e , and the condition thatf ≥ 0 is essentially equivalent to assumption (A2). Of course, it is possible to relax (A2) if N = 1 or solutions are more regular.
After the change of variables (x,ȳ) = (ln x, ln y), this equation reduces to a constant coefficient equation. Since the initial data is decreasing inx andȳ, the same is true for the solution u by Lemma A.1. Going back to (x, y) variables, we then find that u is decreasing also in x and y. (Strictly speaking we must extend u(t, ·, ·) to R 2 in a suitable way to apply Lemma A.1).
The rest of the proof will aim at getting a contradiction for the caset > 0. Even if we do not write it like that, what we show below is that ψ(t,x,ỹ)−ψ(t−∆t,x,ỹ) ∆t ≤ o(1) − η as δ → 0, and this is impossible since (t,x,ỹ) is a maximum point of ψ.