# A projection-based error analysis of HDG methods

By Bernardo Cockburn, Jayadeep Gopalakrishnan, Francisco-Javier Sayas

## Abstract

We introduce a new technique for the error analysis of hybridizable discontinuous Galerkin (HDG) methods. The technique relies on the use of a new projection whose design is inspired by the form of the numerical traces of the methods. This renders the analysis of the projections of the discretization errors simple and concise. By showing that these projections of the errors are bounded in terms of the distance between the solution and its projection, our studies of influence of the stabilization parameter are reduced to local analyses of approximation by the projection. We illustrate the technique on a specific HDG method applied to a model second-order elliptic problem.

## 1. Introduction

This paper is dedicated to presenting a new technique for the error analysis of an emerging class of numerical methods called hybridizable discontinuous Galerkin (HDG) methods Reference 11. The main idea is to devise a projection that matches the form of the numerical traces of the HDG method. The analysis of the projection of the error then becomes simple, concise and independent of the particular choice of the stabilization parameters.

We do not aim for maximal generality, but rather to convey the advantages of our technique compared to other DG analyses. Hence we only consider one specific HDG method, namely, the LDG-hybridizable (LDG-H) method Reference 11 for the model problem

where is a Lipschitz polyhedral domain in (). Here is a variable matrix-valued coefficient, which we assume to be symmetric and uniformly positive definite, is in and in

Let us put our result in historical perspective. To do that, we begin by describing the HDG method. Consider a partitioning of the domain into elements forming a mesh satisfying the standard finite element conditions Reference 7; the faces of are going to be denoted by . This method yields a scalar approximation to , a vector approximation to , and a scalar approximation to the trace of on element boundaries, in spaces of the form

respectively, where , , and are finite dimensional spaces. The HDG approximations in , in , and the numerical trace in , are determined by requiring that

hold for all in and with a specific defined on ; see Reference 11. Above and throughout, we use the notation

where we write whenever is a domain of , and whenever is a domain of . For vector functions and , the notations are similarly defined with the integrand being the dot product . For HDG methods is taken to be an unknown in , while the numerical trace is prescribed on in such a way that both and can be eliminated from the above equations to give rise to a single equation for ; see Reference 11. Thus the often made criticism that DG methods have too many unknowns does not apply to HDG methods. Additional advantages of HDG methods include the ability to postprocess to get higher order solutions as we shall see in Section 5.

In contrast, the DG methods of the last century for second-order elliptic problems use only the first two equations Equation 1.3aEquation 1.3b, together with specific prescriptions of both and to define their approximations. For most such methods, the above mentioned elimination is not feasible. Moreover, their analysis is also very different from ours, as it does not require the use of any special projection. This can be seen in Reference 2, where most of the then known DG methods were analyzed in a single unifying framework. All of these methods were shown to converge with order in the scalar variable and with order in the flux . The order of convergence in is optimal and that in sub-optimal since the methods use as local spaces and the sets and , respectively, where and is the space of polynomials of total degree at most .

Two particular DG methods of the local discontinuous Galerkin (LDG) type, fitting in the above-mentioned unifying framework, deserve special mention as they provide approximations to the flux which converge with better orders of convergence. Their analyses do use special projections. The first is defined in the one-dimensional case and provides an order of convergence for the approximate flux of ; see Reference 6. The second is an extension of that method to multiple dimensions. It uses Cartesian grids and takes as local spaces and the sets and , respectively, where and is the space of polynomials of degree at most in each variable. The order of convergence for the approximate flux can be proven to be ; see Reference 13. In both cases, the special projections used in the analysis were carefully devised to capture the structure of the numerical fluxes and .

Another example of a method requiring a special projection to carry out its analysis is the so-called minimal dissipation LDG method. This is a special LDG method whose penalization parameter is identically zero in all the interior faces; as a consequence, it cannot be analyzed as in Reference 2. However, by using a suitably defined projection, the lack of stabilization can be overcome and sharp error estimates can be obtained. The flux can be shown to converge with the sub-optimal but sharp order of ; see Reference 8.

Now, let us consider the HDG methods for which

for some nonnegative penalty function defined on which we assume to be constant on each face of the triangulation. As explained in Reference 11, these methods are called the LDG-hybridizable (LDG-H) methods because the above numerical trace is that of the LDG method applied separately on each mesh element . The well-known hybridized versions of the Raviart-Thomas (RT) method Reference 18 and the Brezzi-Douglas-Marini (BDM) method Reference 5 can be considered to be HDG methods for which ; their analyses are carried out by using the celebrated RT and BDM projections. The analysis of HDG methods for which is not identically equal to zero, has been carried out in Reference 12 also by using special projections. All of the above methods were proven to provide approximations to the flux converging with the optimal order of ; the first HDG method with this property was introduced in Reference 9.

In this paper, we provide a new approach to the error analysis of HDG methods. It is an alternative to the techniques in Reference 12 for general HDG methods, and those in Reference 9 for the particular method treated therein. We recover all the results of Reference 9Reference 12 and obtain new superconvergence results for the projection of the scalar variable with our new approach. The novelty of our analysis is the use of new projections which are fitted to the structure of the numerical traces of the scheme in the sense that

where is the -projection into ; cf. the expression for from Equation 1.4.

Projections tailored to the numerical traces have actually been widely used. Indeed, beginning with the simplest example, notice that the projections used to analyze both the RT and BDM methods capture the structure of its numerical fluxes by satisfying

This should be compared with the definition of their numerical trace Reference 11

Besides the projections used to carry out the analysis of the DG methods in Reference 8, Reference 6 and Reference 13, we have the projections used in Reference 9 to analyze an HDG type method called single-face hybridizable (SFH) method. The name arises due to the fact that for every simplex , the penalty function is nonzero just on a single face, say . The projections and used there are such that for each simplex ,

This is in accordance with the numerical traces of the SFH method given by

The SFH method is also an HDG method of the LDG-H type, so our projection can also be used to analyze the SFH method. It is interesting to point out that the component of the SFH projection introduced in Reference 9 and ours do coincide when is as above, whereas the component does not, except when is a polynomial of degree on each simplex . The final estimates for the SFH method are the same with both approaches.

The remainder of this paper is organized as follows. In Section 2, we define the new projection and state its key properties. In Section 3, we use an energy argument to obtain an optimal error estimate for the approximate flux and its numerical trace . In Section 4, we use a duality argument to obtain a superconvergence estimate of the projection of the error in and its numerical trace . In Section 5 we discuss previously known element-by-element postprocessing of the flux and the scalar variable. They will result in further approximations and with interesting properties. Finally, we conclude in Section 6 by summarizing our results and relating them to the previous work Reference 12. Proof of the properties of the projection have been gathered in the Appendix.

## 2. The projection

The main ingredient of our error analysis is a new projection into the product space . In this section, we introduce it and establish its properties. From now on, we are going to use the following local spaces:

Also, we consider the stabilization function to be constant on each face.

The projected function is denoted by or by where and are the components of the projection in and , respectively. The values of the projection on any simplex are fixed by requiring that the components satisfy the equations

for all faces of the simplex . If , then Equation 2.1a and Equation 2.1b are vacuous and is defined solely by Equation 2.1c. Note that although we denoted the first component of the projection by , it depends not just on , but rather on both and , as we see from 2.1. The same is true for . Hence the notation for is somewhat misleading, but its convenience outweighs this disadvantage.

The domain of is a subspace of on which the right-hand sides of 2.1 are well defined. Indeed, all functions and that are regular enough for their traces and to be in are in the domain of , for example, , where

That the left-hand sides of 2.1 uniquely determine and is proved as part of the next theorem.

To state the theorem, we need to introduce additional notation and conventions. We use to denote the -norm for any . More generally, we denote the norm and seminorm on any Sobolev space by and , respectively. Also, as is usual in finite element analysis, we restrict elements to have shape regularity, i.e., all mesh elements under consideration satisfy , where , is the diameter of the largest ball contained in , and is a fixed constant. We use , with or without subscripts, to denote a generic constant, independent of the elements and functions involved in our inequalities (but it may depend on ). The value of  at different occurrences may differ. While we absorb dependencies on into , the dependencies on will always be explicitly mentioned.

Our first result states that the projection is well defined and has reasonable approximation properties. See Appendix A for a detailed proof.

### Theorem 2.1.

Suppose , is nonnegative and . Then the system 2.1 is uniquely solvable for and . Furthermore, there is a constant independent of and such that

for in . Here , where is a face of at which is maximum.

Note that if is of unit order, both approximation errors converge with the optimal order of , when the functions and are smooth enough. If , or if is bounded above and below uniformly by fixed constants, then a standard Bramble-Hilbert argument proves the order  approximation estimates of the theorem. However, to track the dependence of the constants on , and to compare with previous works, we need to perform a more careful analysis. This is done in Appendix A in elaborate detail. To make an immediate comparison, note that for the used in the SFH method Reference 9, and the approximation properties of are independent of the stabilization function .

We conclude this section with a property of the projection that we use critically in the error analysis of the method.

### Proposition 2.1 (A weak commutativity property).

For any in and any in the domain of , we have

For any ,

## 3. Flux error estimates by an energy argument

The purpose of this section is to give error estimates under minimal regularity assumptions on the solution. Since the projection  is designed to fit the structure of the numerical trace, the equations satisfied by the projection of the errors have a form amenable to simple analysis, as we see next. We will need the -orthogonal projection onto , which we denote by . Note that because is piecewise constant,

a fact that we will use repeatedly without explicit mention. The projection of the errors satisfy the following.

### Lemma 3.1.

Let , , and Then

for all and , where

### Proof.

Let us begin by noting that the exact solution and satisfies

for all and . By the definition of and , the above implies

for all and . Subtracting Equation 1.3a and Equation 1.3b from the above two equations, respectively, we obtain Equation 3.1a and Equation 3.1b. The equation Equation 3.1c follows directly from the boundary condition Equation 1.3c. To prove Equation 3.1d we proceed as follows:

where we have used the definition of . Note that this proves the identity Equation 3.1e because . Since is in and since satisfies Equation 1.3d, both terms above are zero. This completes the proof.

We have

### Proof.

Taking in Equation 3.1a, in Equation 3.1b, in Equation 3.1c and in Equation 3.1d, and adding the resulting four equations, we get

where

But, by integration by parts, we get

by the definition of , namely Equation 3.1e. This completes the proof.

While Lemma 3.2 identifies an “energy” norm, Lemma 3.1 gives “consistency” relations. These are enough to immediately prove an error estimate for the flux. To state it, we need the norm defined by for any function , and the -weighted -norm

### Theorem 3.1 (Flux error estimates).

Let and solve the HDG equations Equation 1.3 and let the exact solution be in the domain of . Then, for ,

where .

### Proof.

By the Cauchy-Schwarz inequality applied to the identity of Lemma 3.2, we get

and Equation 3.2 follows since . For Equation 3.3, we start by using the identity Equation 3.1e to get

where we used an inverse inequality and the fact that is uniformly bounded. Then Equation 3.3 follows from Equation 3.4.

## 4. Superconvergence of the scalar variableby a duality argument

In the previous section, we established an error estimate for that only requires that the solution be as regular as required for the application of the projection. On domains permitting higher regularity estimates, we can perform an analogue of the Aubin-Nitsche duality argument Reference 3Reference 17 to get higher rates of convergence. In particular, such arguments will give us error estimates for and .

We thus begin by introducing the dual problem for any given in :

We assume that this boundary value problem admits the regularity estimate

for all in . This is well known to hold in several cases, e.g., if and is a convex polygon Reference 16. Recall that we have been tacitly assuming that is in the domain of . By Equation 4.2, is also regular enough to apply , so we have the following lemma.

### Lemma 4.1 (The duality argument).

For any , we have

Consequently,

where

### Proof.

We have

by the continuity of and the fact that on by Equation 3.1c. Then

Moreover,

To bring this into the needed form, we continue:

Finally, the inequality of the lemma follows by applying the Cauchy-Schwarz inequality to the identity and using the first estimate of Theorem 3.1.

### Theorem 4.1.

Suppose the regularity assumption Equation 4.2 holds. Then

where and .

### Proof.

Let us prove the first inequality. By the first estimate of Theorem 2.1 with set to 0 and , we get that

by the regularity estimate Equation 4.2, and the first estimate follows.

The second estimate follows from the first from the same local argument used in Reference 5 to obtain a similar estimate for the BDM method. Indeed, when , we can select a function such that on and Using as the test function in Equation 3.1a, and applying an inverse inequality, we find that

Applying the first estimate of the theorem for and the estimate in Theorem 3.1 for , we get the second inequality of the theorem. This completes the proof.

## 5. Enhanced accuracy by postprocessing

In this section we describe a few techniques to postprocess the approximate solution and flux.

### 5.1. Flux postprocessing

We can obtain a postprocessed flux with better conservation properties. Although converges at the same order as , it is in and its divergence converges at one higher order than .

To define , we use a slight modification of the Raviart-Thomas projection Reference 18, as used in the framework of Darcy flows Reference 4, or for the Navier-Stokes equations Reference 14, or for the postprocessing of superconvergent DG methods for second-order elliptic problems Reference 12. We follow Reference 12. Thus, on each simplex , we take where is the only element of satisfying

If -orthogonal hierarchical basis functions are used, computation of reduces to solving, on each element, a linear system of size times the dimension of the space of homogeneous polynomials of degree . It is instructive to compare the above definition to that of the Raviart-Thomas projection, namely on each simplex , we set as the only element of satisfying

This comparison yields the following theorem. For a proof, see Reference 12.

#### Theorem 5.1.

For any , we have that . Moreover,

### 5.2. Postprocessing the approximate scalar variable

There are a few well-known ways to postprocess to obtain a new approximation  of enhanced accuracy.

As the first postprocessing method, we define in satisfying

where and Clearly, satisfies a discrete Neumann problem on each element with the computed approximate solution as data. This method was introduced in Reference 15Reference 19Reference 20 in the framework of mixed methods.

A variation of this postprocessing was proposed in Reference 9Reference 12 for DG methods. It is obtained simply by substituting in place of on the right-hand side of Equation 5.1a. This yields our second postprocessing alternative. The solution so obtained is denoted by .

As a third alternative, we define obtained by following Reference 19Reference 20. Let denote the -orthogonal complement of in . The solution is of the form where is the unique function in satisfying

As in the case of the flux postprocessing, if we are using an -orthogonal hierarchical basis to find , we need only invert a symmetric, positive definite matrix whose order is the dimension of . Note also that to evaluate the right-hand side of Equation 5.2, we need only use -dimensional quadratures, as

This follows by setting in Equation 1.3a, the first equation of the HDG method.

All the above postprocessed solutions converge at a higher rate than  (whenever ) as stated in the next theorem. It can be proved using the superconvergence estimate of Theorem 4.1 in a standard way (see Reference 20) so we omit it.

#### Theorem 5.2.

Under the same assumption as Theorem 4.1, the result of any of the three above mentioned postprocessings satisfy

for any and and any .

## 6. Concluding remarks

We have presented a technique for error analysis of HDG methods that is remarkable for its brevity, especially in comparison with previous DG analyses Reference 2Reference 9Reference 12. We achieved this through the use of a new projection . While brevity and elegance is traditionally achieved in the analysis of mixed methods, like RT and BDM methods, via the use of projections with commutativity properties, in the case of DG methods, commutativity seems not to be of paramount importance. Rather, what seems important is a projection tailored to fit the structure of the DG method, such as our .

Because is adapted to the structure of our numerical traces, we found it easy to estimate the projection of the errors. To summarize, we proved,

where the first three estimates hold for all and the last for . Thus, by the approximation properties of the projection (Theorem 2.1), if the penalty function is such that is of order one on each , we obtain the optimal order of convergence of for the approximate flux and its numerical trace. Of course, by the triangle inequality, the above estimates imply that the error of these variables converges to zero at the optimal order. If , the projection of the errors for and its trace superconverge at order . This can be exploited to get locally postprocessed solutions of enhanced accuracy.

To end, note that the above estimates imply that

According to the main result in Reference 12, such an inequality implies optimal order of convergence for the numerical flux and its postprocessing for . It also implies, for , the superconvergence of the orthogonal projection of the error in  into , and furthermore the superconvergence of . This is in perfect agreement with our results.

The extension of our approach to other equations of practical interest appearing in, for example, fluid flow and solid mechanics, can prove to be useful not only to analyze already existing HDG methods but also to devise new ones. This constitutes the objective of ongoing research.

## Appendix A. Proof of Theorem 2.1

We begin by observing that once we prove the approximation estimates of the theorem, then the unisolvency of the equations defining the projection follows as a corollary. This is because the number of equations and unknowns in 2.1, namely

and

respectively, coincide, i.e. 2.1 is a square linear system. Hence setting and in the approximation estimates we find that the projection must vanish.

In view of this, in the remainder of this section, we develop estimates for any and satisfying 2.1 without assuming uniqueness a priori (although it will follow a posteriori). But first, we begin with two auxiliary lemmas.

### A.1. Two estimates involving orthogonal polynomials

Let

#### Lemma A.1.

Let be any face of a simplex . The trace map

is a bijection. Moreover,

#### Proof.

We first prove that is injective. Suppose for some . Then we can write , where denotes the barycentric coordinate function of that vanishes on and is some function in . But since for all , we then have so and hence . The surjectivity of now follows by counting dimensions and using the injectivity.

Finally, the estimate of the lemma follows from the injectivity and a standard scaling argument.

#### Lemma A.2.

Let be a nonnegative function on , constant on each face of , and such that . Let satisfy the equation

where is linear. Then

where .

#### Proof.

Let be a face of at which . Then, by Lemma A.1,

and the wanted estimate follows.

### A.2. Decoupling the projection component  upper Pi Subscript upper W Baseline u $\varPi _{\!\scriptscriptstyle {W}} u$

Now we characterize the second (scalar) component of the projection , namely , independently of the first, and prove its approximation properties.

#### Proposition A.1.

On each element , the component satisfies

#### Proof.

The first equation Equation A.1a is the same as an equation defining the projection Equation 2.1b. For the second, note that Equation 2.1c implies

Simplifying the right-hand side using

we finish the proof.

Proposition A.1 permits comparison with the SFH method. Suppose is selected as in the SFH method and suppose . Then the system A.1 becomes

Note that to obtain the last equation, we used the surjectivity of (Lemma A.1). Thus, in this case, coincides with the projection used in the analysis of the SFH method Reference 9 (denoted there by ).

We are now ready to obtain the estimate of in Theorem 2.1.

#### Proposition A.2.

Suppose the assumptions on in Theorem 2.1 hold. Then,

for in (and consequently, is uniquely determined by A.1).

#### Proof.

To prove the result, we set , where is the -orthogonal projection of into , and note that

The first term can be readily estimated by using the standard approximation properties of the -projection. Let us estimate the second term.

Equation Equation A.1a shows that belongs to , and Equation A.1b implies

where and . By Lemma A.2 with , and , this implies that

Let us estimate . Since , we have

where is the -projection of into when . If , we set . Hence,

for in , by the approximation properties of the -projection.

Finally, let us estimate . By a scaling argument,

A trace inequality and the approximation properties of the -projection imply

for any in . This completes the proof.

### A.3. Properties of the flux component  upper P Subscript upper V Baseline bold-italic q $P_{\scriptscriptstyle {V}}\boldsymbol{q}$

To study the flux component, we recall another projection introduced and studied in Reference 8 and later used in Reference 9Reference 12. Let be a face of at which is a maximum. For any function in the domain of , the restriction of to is defined to be the unique element of satisfying