Adaptive wavelet methods for elliptic operator equations: Convergence rates

By Albert Cohen, Wolfgang Dahmen, and Ronald DeVore

Abstract

This paper is concerned with the construction and analysis of wavelet-based adaptive algorithms for the numerical solution of elliptic equations. These algorithms approximate the solution of the equation by a linear combination of wavelets. Therefore, a benchmark for their performance is provided by the rate of best approximation to by an arbitrary linear combination of wavelets (so called -term approximation), which would be obtained by keeping the largest wavelet coefficients of the real solution (which of course is unknown). The main result of the paper is the construction of an adaptive scheme which produces an approximation to with error in the energy norm, whenever such a rate is possible by -term approximation. The range of for which this holds is only limited by the approximation properties of the wavelets together with their ability to compress the elliptic operator. Moreover, it is shown that the number of arithmetic operations needed to compute the approximate solution stays proportional to . The adaptive algorithm applies to a wide class of elliptic problems and wavelet bases. The analysis in this paper puts forward new techniques for treating elliptic problems as well as the linear systems of equations that arise from the wavelet discretization.

1. Introduction

1.1. Background

Adaptive methods, such as adaptive finite elements methods (FEM), are frequently used to numerically solve elliptic equations when the solution is known to have singularities. A typical algorithm uses information gained during a given stage of the computation to produce a new mesh for the next iteration. Thus, the adaptive procedure depends on the current numerical resolution of . Accordingly, these methods produce a form of nonlinear approximation of the solution, in contrast with linear methods in which the numerical procedure is set in advance and does not depend on the solution to be resolved.

The motivation for adaptive methods is that they provide flexibility to use finer resolution near singularities of the solution and thereby improve on the approximation efficiency. Since the startling papers Reference 2Reference 3 the understanding and practical realization of adaptive refinement schemes in a finite element context has been documented in numerous publications Reference 3Reference 4Reference 5Reference 13Reference 36. A key ingredient in most adaptive algorithms are a-posteriori error estimators or indicators derived from the current residual or the solution of local problems. They consist of local quantities such as jumps of derivatives across the interface between adjacent triangles or simplices. One often succeeds in bounding the (global) error of the current solution with respect to the energy norm, say, by sums of these quantities from below and above. Thus refining the mesh where these local quantities are large is hoped to reduce the bounds and hence the error in the next computation. Computational experience frequently confirms the success of such techniques for elliptic boundary value problems in the sense that adaptively generated highly nonuniform meshes indeed give rise to an accuracy that would require the solution of much larger systems of equations based on uniform refinements. However, on a rigorous level the quantitative gain of adaptive techniques is usually not clear. The central question is whether the mesh refinements actually result, at each step, in some fixed error reduction. To our knowledge, only in Reference 35 has convergence of an adaptive scheme been established for a rather special case, namely a piecewise linear finite element discretization of the classical Dirichlet problem for Laplace’s equation. There is usually no rigorous proof of the overall convergence of such schemes unless one assumes some quantitative information such as the saturation property about the unknown solution Reference 13. Saturation properties are assumed but not proven to hold.

Moreover, the derivation of error indicators in conventional discretizations hinges on the locality of differential operators. Additional difficulties are therefore encountered when considering elliptic operators with nonlocal Schwartz kernel arising, for instance, in connection with boundary integral equations.

In summary, there seem to be at least two reasons for this state of affairs:

(i) There is an inherent difficulty, even for local operators, in utilizing the information available at a given stage in the adaptive computation to guarantee that a suitable reduction will occur in the residual error during the next adaptive step.

(ii) Finite element analysis is traditionally based on Sobolev regularity (see e.g., Reference 14 or Reference 15), which is known to govern the performance of linear methods. Only recent developments in the understanding of nonlinear methods have revealed that Besov regularity is a decidedly different and more appropriate smoothness scale for the analysis of adaptive schemes, see e.g., Reference 31.

In view of the significant computational overhead and the severe complications caused by handling appropriate data structures for adaptive schemes, not only guaranteeing convergence but above all knowing its speed is of paramount importance for deciding whether or under which circumstances adaptive techniques actually pay off. To our knowledge nothing is known so far about the actual rate of convergence of adaptive FEM solvers, by which we mean the relation between the accuracy of the approximate solution and the involved degrees of freedom, or better yet the number of arithmetic operations.

1.2. Wavelet methods

An alternative to FEM are wavelet based methods. Similarily to mesh refinement in FEM, these methods offer the possibility to compress smooth functions with isolated singularities into high-order adaptive approximations involving a small number of basis functions. In addition, it has been recognized for some time Reference 11 that for a large class of operators (including integral operators) wavelet bases give rise to matrix representations that are quasi-sparse (see Sections 2 and 3 for a definition of quasi-sparse) and admit simple diagonal preconditioners in the case of elliptic operators. Therefore, it is natural to develop adaptive strategies based on wavelet discretizations in order to solve elliptic operator equations numerically.

The state of wavelet-based solvers is still in its infancy, and certain inherent impediments to their numerical use remain. These are mainly due to the difficulty of dealing with realistic domain geometries. Nevertheless, these solvers show great promise, especially for adaptive approximation (see e.g., Reference 1Reference 12Reference 16Reference 18Reference 25). Most adaptive strategies exploit the fact that wavelet coefficients convey detailed information on the local regularity of a function and thereby allow the detection of its singularities. The rule of thumb is that wherever wavelet coefficients of the currently computed solution are large in modulus, additional refinements are necessary. In some sense, this amounts to using the size of the computed coefficients as local a-posteriori error indicators. Note that here refinement has a somewhat different meaning than in the finite element setting. There the adapted spaces result from refining a mesh. The mesh is the primary controlling device and may create its own problems (of geometric nature) that have nothing to do with the underlying analytic task. In the wavelet context refinement means to add suitably selected further basis functions to those that are used to approximate the current solution. We refer to this as space refinement.

In spite of promising numerical performances, the problem remains (as in the finite element context) to quantify these strategies, that is, to decide which and how many additional wavelets need to be added in a refinement step in order to guarantee a fixed error reduction rate at the next resolution step. An adaptive wavelet scheme based on a-posteriori error estimators has been recently developed in Reference 20, which ensures this fixed error reduction for a wide class of elliptic operators, including those of negative order. This shows that making use of the characteristic features of wavelet expansions, such as the sparsification and preconditioning of elliptic operators, allows one to go beyond what is typically known in the conventional framework of adaptive FEM. However, similar to FEM, there are so far no results about the rate of convergence of adaptive wavelet based solvers, i.e., the dependence of the error on the number of degrees of freedom.

1.3. The objectives

The purpose of the present paper is twofold. First, we provide analytical tools that can be utilized in studying the theoretical performance of adaptive algorithms. Second, we show how these tools can be used to construct and analyze wavelet based adaptive algorithms which display optimal approximation and complexity properties in the sense that we describe below.

The adaptive methods we analyze in this paper take the following form. We assume that we have in hand a wavelet basis to be used for numerically resolving the elliptic equation. Our adaptive scheme will iteratively produce finite sets , , and the Galerkin approximation to from the space . The function is a linear combination of wavelets. Thus the adaptive method can be viewed as a particular form of nonlinear -term wavelet approximation, and a benchmark for the performance of such an adaptive method is provided by comparison with best -term approximation (in the energy norm) when full knowledge of is available.

Much is known about -term approximation. In particular, there is a characterization of the functions that can be approximated in the energy norm with accuracy by using linear combinations of wavelets. As we already mentioned, this class is typically a Besov space, which is substantially larger than the corresponding Sobolev space which ensures accuracy for uniform discretization with parameters. In several instances of the elliptic problems, e.g., when the right hand side has singularities, or when the boundary of has corners, the Besov regularity of the solution will exceed its Sobolev regularity (see Reference 19 and Reference 21). So these solutions can be approximated better by best -term approximation than by uniformly refined spaces, and the use of adaptive methods is suggested. Another important feature of -term approximation is that a near best approximation is produced by thresholding, i.e., simply keeping the largest contributions (measured in the same metric as the approximation error) of the wavelet expansion of .

Of course, since best -term approximation requires complete information on the approximated function, it cannot be applied directly to the unknown solution. It is certainly not clear beforehand whether a concrete numerical scheme can produce at least asymptotically the same convergence rate. Thus ideally an optimal adaptive wavelet algorithm should produce a result similar to thresholding the exact solution. In more quantitative terms this means that whenever the solution is in , the approximations should satisfy

where is the energy norm and is the norm for . Since in practice one is mostly interested in controlling a prescribed accuracy with a minimal number of parameters, we shall rather say that the adaptive algorithm is of optimal order if whenever the solution is in , then for all , there exists such that

and such that

Such a property ensures an optimal memory size for the description of the approximate solution.

Another crucial aspect of the adaptive algorithms is their computational complexity: we shall say that the adaptive algorithm is computationally optimal if, in addition to Equation 1.2Equation 1.3, the number of arithmetic operation needed to derive is proportional to . Note that an instance of computational optimality in the context of linear methods is provided by the full multigrid algorithm when represents the number of unknowns necessary to achieve a given accuracy on a uniform grid. We are thus interested in algorithms that exhibit the same type of computational optimality with respect to an optimal adaptive grid which is not known in advance and should itself be generated by the algorithm.

The main accomplishment of this paper is the development and analysis of an adaptive numerical scheme which for a wide class of operator equations (including those of negative order) is optimal with regard to best -term approximation and is also computationally optimal in the above sense. Let us mention that a simplified version of this algorithm has been developed and tested in Reference 18, as well as a more elaborate version in Reference 6. In this last version, appropriate object-oriented data structures play a crucial role for handling properly the sparse representation of the solution. In both cases, the numerical experiments confirm a similar behavior between the numerical error generated by such adaptive algorithms and by thresholding the exact solution. We note however, that depending on the concrete application at hand, these implementations still suffer from currently suboptimal schemes for a central task, namely evaluating the entries of the stiffness matrices. Since, in the case of piecewise polynomial wavelets and (piecewise) constant coefficients PDE’s, these elements are given by explicit formulae, the cost of this task is negligible. However, a much more sophisticated treatment appears to be necessary in the general case where, for instance, the techniques recently proposed in Reference 10 should lead to significant improvements.

Although the present investigations are confined to symmetric elliptic problems, the results provide, in our opinion, a core ingredient for the treatment of more complex tasks. For instance, the first steps in this direction are the development of a-posteriori wavelet strategies for saddle point problems in Reference 22, the use of stabilized variational formulations in Reference 9, or least squares formulations in Reference 24.

1.4. Organization of the paper

In Section 2, we introduce the general setting of elliptic operator equations where our results apply. In this context, after applying a diagonal preconditioner, wavelet discretizations allow us to view the equation as a discrete well conditioned linear system.

In Section 3, we review certain aspects of nonlinear approximation, quasi-sparse matrices and fast multiplication using such matrices. The main result of this section is an algorithm for the fast computation of the application of a quasi-sparse matrix to a vector.

In Section 4, we analyze the rate of convergence of the refinement procedure introduced earlier in Reference 20. We will refer to this scheme here as Algorithm I. We show that this algorithm is optimal for a small range of . However, the full range of optimality should be limited only by the properties of the wavelet basis (smoothness and vanishing moments) and the operator; this is not the case for Algorithm I. The analysis in Section 4, however, identifies the barrier that keeps Algorithm I from being optimal in the full range of .

In Section 5, we introduce a second strategy—Algorithm II—for adaptively generating the sets that is shown to provide optimal approximation of order for the full range of . The new ingredient that distinguishes Algorithm II from Algorithm I is the addition of thresholding steps which delete some indices from . This would be the analogue of coarsening the mesh in FEM.

Although we have qualified so far both procedures in Sections 4 and 5 as “algorithms”, we have actually ignored any issue concerning practical realization. They are idealized in the sense that the exact assessment of residuals and Galerkin solutions is assumed. This was done in order to clearly identify the essential analytical tasks. Practical realizations require truncations and approximations of these quantities. Section 6 is devoted to developing the ingredients of a realistic numerical scheme. This includes quantitative thresholding procedures, approximate matrix/vector multiplication, approximate Galerkin solvers and the approximate evaluation of residuals.

In Section 7 we employ these ingredients to formulate a computable version of Algorithm II, which is shown to be computationally optimal for the full range of . Recall that this means that it realizes for this range the order of best -term approximation at the expense of a number of arithmetic operations that stays proportional to the number of significant coefficients. Computational optimality hinges to a great extent on the fast approximate matrix/vector multiplication from Section 3.

It should be noted however that an additional cost in our wavelet adaptive algorithm is incurred by sorting the coefficients in the currently computed solution. This cost at stage is of order , where , thus slightly larger than the cost in arithmetic operations. It should be stressed that the complexity of the algorithm is analysed under the assumption that the solution exhibits a certain rate of best -term approximation which is, for instance, implied by a certain Besov regularity. The algorithm itself does not require any a-priori assumption of that sort.

We have decided to carry out the (admittedly more technical) analysis of the numerical ingredients in some detail in order to substantiate our claim that the optimality analysis is not based on any hidden assumptions (beyond those hypotheses that are explicitly stated) such as accessing infinitely many data. Nevertheless the main message of this paper can be read in Sections 4 and 5: optimal adaptive approximations of elliptic equations can be computed by iterative wavelet refinements using a-posteriori error estimators, provided that the computed solution is regularly updated by appropriate thresholding procedures. This fact was already suggested by numerical experiments in Reference 18 that show similar behavior between the numerical error generated by such adaptive algorithms and by thresholding the exact solution.

2. The setting

In this section, we shall introduce the setting in which our results apply. In essence, our analysis applies whenever the elliptic operator equation takes place on a manifold or domain which admits a biorthogonal wavelet basis.

2.1. Ellipticity assumptions

This subsection gives the assumptions we make on the operator equation to be numerically solved. These assumptions are quite mild and apply in great generality.

Let denote a bounded open domain in the Euclidean space with Lipschitz boundary or, more generally, a Lipschitz manifold of dimension . In particular, could be a closed surface which arises as a domain for a boundary integral equation. The space consists of all square integrable functions on with respect to the (canonically induced) Lebesgue measure. The corresponding inner product is denoted by

Let be a linear operator mapping a Hilbert space into (its dual relative to the pairing ), where is a space with the property that either or its dual is embedded in . The operator induces the bilinear form defined on by

where denotes the duality product.

(A1): We assume that the bilinear form is symmetric positive definite and elliptic in the sense that

Here, and throughout this paper, means that both quantities can be uniformly bounded by constant multiples of each other. Likewise indicates inequalities up to constant factors.

It follows that is also a Hilbert space with respect to the inner product and that this inner product induces an equivalent norm (called the energy norm) on by

By duality, thus defines an isomorphism from onto . We shall study the equation

with . From our assumptions, it follows that for any , this equation has a unique solution in , which will always be denoted by . This is also the unique solution of the variational equation

The typical examples included in the above assumptions are the Poisson or the biharmonic equations on bounded domains in ; single or double layer potentials, and hypersingular operators on closed surfaces arising in the context of boundary integral equations. In these examples is a Sobolev space, e.g., , , or ; see Reference 23Reference 20Reference 41 for examples.

2.2. Wavelet assumptions

By now wavelet bases are available for various types of domains that are relevant for the formulation of operator equations. This covers, for instance, polyhedral surfaces of dimension two and three Reference 29 as well as manifolds or domains that can be represented as a disjoint union of smooth regular parametric images of a simple parameter domain such as the unit cube Reference 27.

There are many excellent accounts of wavelets on (see e.g., Reference 38 or Reference 30). For the construction and description of wavelet bases on domains and manifolds, we refer the reader to the survey paper Reference 23 and the references therein. This survey also sets forth the notation we shall employ below for indexing the elements in a wavelet basis. To understand this notation, it may be useful for the reader to keep in mind the case of wavelet bases on . In this setting, a typical biorthogonal wavelet basis of compactly supported functions is given by the shifted dilates of a set of functions. Namely, the collection of functions

forms a Riesz basis for . The dual basis is given by

with again a set of functions. The integer gives the dyadic level ( the frequency) of the wavelet. The multi-integer gives the position () of the wavelet. Namely, the wavelet has support contained in a cube of diameter centered at the point . Note that there are functions with the same dyadic level and position .

Another way to construct a wavelet basis for is to start the multiscale decomposition at a finite dyadic level . In this case, the basis consists of the functions of Equation 2.7 with , together with a family of functions

with a fixed (scaling) function. Wavelet bases for domains take a similar form except that some alterations are necessary near the boundary.

We shall denote wavelet bases by . In the particular case above, this notation incorporates the three parameters into the one . We use to denote the dyadic level of the wavelet. We let , , consist of the wavelets at level .

In all classical constructions of compactly supported wavelets, there exist fixed constants and such that and such that for all there are at most indices such that .

Since we shall consider only bounded domains in this paper, the wavelet decomposition will begin at some fixed level . For notational convenience only, we assume . We define to be the set of scaling functions in the wavelet basis. We shall assume that is a domain or manifold which admits two sets of functions:

that form a biorthogonal wavelet bases on : writing for any two collections of functions in , one has

where is the identity matrix.

A typical feature in the theory of biorthogonal bases is that the sequences are Riesz bases. That is, using the shorthand notation , one has

This property means that the wavelet bases characterize . In the present context of elliptic equations, we shall need not Equation 2.12 but rather the fact that these bases provide a characterization of and in terms of wavelet coefficients. This is expressed by the following specific assumption.

(A2): Let the energy space be equipped with the norm and its dual space with the norm . We assume that the wavelets in are in , whereas those in are in (in this context, we can assume that Equation 2.11 simply holds in the sense of the duality ). We assume that each has a wavelet expansion (with coordinates ) and that

with a fixed positive diagonal matrix.

Observe that Equation 2.13 implies that , and that (resp. ) is an unconditional (resp. Riesz) basis for . By duality, one easily obtains that each has a wavelet expansion (with coordinates ) that satisfies

One should keep in mind, though, that is only needed for analysis purposes. The Galerkin schemes to be considered below only involve , while never enters any computation and need not even be known explicitly.

It is well known (see e.g., Reference 27) that wavelet bases provide such characterizations for a large variety of spaces (in particular the Sobolev and Besov spaces for a certain parameter range which depends on the smoothness of the wavelets). In the context of elliptic equations, is typically some Sobolev space . In this case (A2) is satisfied whenever the wavelets are sufficiently smooth, with . For instance, when , one has .

2.3. Discretization and preconditioning of the elliptic equation

Using wavelets, we can rewrite Equation 2.5 as an infinite system of linear equations. We take wavelet bases and satisfying (A2) and write the unknown solution and the given right hand side in terms of the basis . This gives the system of equations

The solution to Equation 2.15 gives the wavelet coefficients of the solution to Equation 2.5.

An advantage of wavelet bases is that they allow for trivial preconditioning of the linear system Equation 2.15. This preconditioning is given by the matrix of (A2) and results in the system of equations

or, more compactly,

where

Let us briefly explain the effect of the above diagonal scaling with regard to preconditioning. To this end, note that by (A1), the matrix is symmetric positive definite. We define its associated bilinear form by

where is the standard inner product in , and denote the norm associated with this bilinear form by . In other words,

Combining the ellipticity assumption (A1) together with the wavelet characterization of (A2), we obtain that and are equivalent norms, i.e., there exist constants such that

It is immediate that these constants are also such that

and

In particular, the condition number of satisfies

The fact that the diagonal scaling turns the original operator into an isomorphism on will be a cornerstone of the subsequent development. Denoting by the entries of and by the section of restricted to the set , we see from the positive definiteness of that

and that the condition numbers of the submatrices remain uniformly bounded for any subset , i.e.,

Finally, it is easy to check that the constants and also provide the equivalence

Here and later, we adopt the following rule about denoting constants. We shall denote constants which appear later in our analysis by . Other constants, whose value is not so important for us, will be denoted by or incorporated into the notation.

A typical instance of the above setting involves Sobolev spaces , in which case the entries of the diagonal matrix can be chosen as . Of course, the constants in Equation 2.24 will then depend on the relation between the energy norm Equation 2.20 and the Sobolev norm. In some cases such a detour through a Sobolev space is not necessary, and Equation 2.13 can be arranged to hold for a suitable when already coincides with the energy norm. A simple example is , where is an appropriate choice. In fact, Equation 2.13 will then hold independently of .

2.4. Quasi-sparsity assumptions on the stiffness matrix

Another advantage of the wavelet basis is that for a large class of elliptic operators, the resulting preconditioned matrix exhibits fast decay away from the diagonal. This will later be crucial with regard to storage economy and efficiency of (approximate) matrix/vector multiplication.

Consider for example the (typical) case when is the Sobolev space of order or its subspace . Then, for a large class of elliptic operators, we have

with and and

The validity of Equation 2.28 has been established in numerous settings (see e.g., Reference 23Reference 11Reference 40Reference 42). Decay estimates of the form Equation 2.28 were initially introduced in Reference 37 in the context of Littlewood-Paley analysis. The constant depends on the smoothness of the wavelets, whereas is related to the approximation order of the dual multiresolution (resp. the vanishing moments of the wavelets) and the order of the operator . Estimates of the type Equation 2.28 are known to hold for a wide range of cases, including classical pseudo-differential operators and Calderón-Zygmund operators (see e.g., Reference 26Reference 41). In particular, the single and double layer potential operators fall into this category. We refer the reader to Reference 23 for a full discussion of settings where Equation 2.28 is valid.

We introduce the class of all matrices which satisfy

with defined by Equation 2.29. We say that a matrix is quasi-sparse if it is in the class for some and . Properties of quasi-sparse matrices will be discussed in Section 3.

(A3): We assume that, for some , , the matrix of Equation 2.17 is in the class .

Let us note that in the case discussed earlier, we obtain Equation 2.30 from Equation 2.28 because .

2.5. Wavelet Galerkin methods

A wavelet based Galerkin method for solving Equation 2.5 takes the following form. We choose a finite set of wavelet indices and use the space as our trial and analysis space. The approximate Galerkin solution from is defined by the conditions