Local well-posedness for quasi-linear problems: A primer

By Mihaela Ifrim and Daniel Tataru

Abstract

Proving local well-posedness for quasi-linear problems in partial differential equations presents a number of difficulties, some of which are universal and others of which are more problem specific. On one hand, a common standard for what well-posedness should mean has existed for a long time, going back to Hadamard. On the other hand, in terms of getting there, there are by now many variations, as well as many misconceptions.

The aim of these expository notes is to collect a number of both classical and more recent ideas in this direction, and to assemble them into a cohesive roadmap that can be then adapted to the reader’s problem of choice.

1. Introduction

Local well-posedness is the first question to ask for any evolution problem in partial differential equations (PDEs). These notes, prepared by the authors for a summer graduate school seminar at MSRI Reference 13 in 2020, aim to discuss ideas and strategies for local well-posedness in quasi-linear and fully nonlinear evolution equations, primarily of hyperbolic type. We hope to persuade the reader that the structure presented here should be adopted as the standard for proving these results. Of course, there are many possible variations, and we try to point out some of them in our many remarks. While a few of the ideas here can be found in several of the classical books (see, e.g., Reference 30, Reference 10, Reference 3, Reference 24), some of the others have appeared only in articles devoted to specific problems and have never been collected together, to the best of our knowledge.

1.1. Nonlinear evolutions

For our exposition we will adopt a two-track structure, where we will broadly discuss ideas for a general problem and, in parallel, implement these ideas on a simple, classical, concrete example.

Our general problem will be a nonlinear PDE of the form

$$\begin{equation} u_t = N(u), \qquad u(0) = u_0, \cssId{gen-eq}{\tag{1.1}} \end{equation}$$

i.e., a first-order system in time, where we think of $u$ as a scalar- or a vector-valued function belonging to a scale of either real or complex Sobolev spaces. This scale will be chosen to be $H^s \coloneq H^s(\mathbb{R}^n)$ for the purpose of this discussion, though in practice it often has to be adapted to the class of problems to be considered. The nonlinearity $N$ represents a nonlinear function of $u$ and its derivatives,

$$\begin{equation*} N(u) = N (\{ \partial ^\alpha u\}_{|\alpha | \leq k}) , \end{equation*}$$

where we will refer to $k$ as the order of the evolution. Here typical examples include $k=1$ (hyperbolic equations), $k = 2$ (Schrödinger-type evolutions) and $k=3$ (Korteweg–de Vries-type evolutions). But many other situations arise in models which are nonlocal, e.g., in water waves one encounters $k = \frac{1}{2}$ for gravity waves (resp., $k = \frac{3}{2}$ for capillary waves).

Some problems are most naturally formulated as second-order evolutions in time, for instance nonlinear wave equations. While some such problems admit also good first-order in time formulations (e.g., the compressible Euler flow), it is sometimes better to treat them as second order. Regardless, our roadmap still applies, with obvious adjustments.

Our model problem will be a classical first-order symmetric hyperbolic system in $\mathbb{R} \times \mathbb{R}^n$, of the form

$$\begin{equation} \partial _t u = \mathcal{A}^j(u) \partial _j u, \qquad u(0) = u_0 , \cssId{sym-hyp}{\tag{1.2}} \end{equation}$$

where $u$ takes values in $\mathbb{R}^m$ and the $m\times m$ matrices $\mathcal{A}^j$ are symmetric and smooth as functions of $u$. Here the order of the nonlinearity $N$ is $k = 1$, and the scale of spaces to be used is indeed the Sobolev scale $H^s$.

1.2. What is well-posedness?

To set the expectations for our problems, we recall the classical Hadamard standard for well-posedness, formulated relative to our chosen scale of spaces.

Definition 1.1.

The problem Equation 1.1 is locally well-posed in a Sobolev space $H^s (\mathbb{R}^n)$ if the following properties are satisfied:

(i): Existence: For each $u_0 \in H^s$ there exists some time $T > 0$ and a solution $u \in C([0,T]; H^s)$.
(ii): Uniqueness: The above solution is unique in $C([0,T]; H^s)$.
(iii): Continuous dependence: The data to solution map is continuous from $u_0 \in H^s$ to $u \in C([0,T];H^s)$.

As a historical remark, we note that Hadamard primarily discussed the question of well-posedness in the context of linear PDEs, specifically for the Laplace and wave equations, beginning with an incipient form in Reference 8 and a more developed form in Reference 9. It is in the latter reference where the continuous dependence is discussed, seemingly inspired by Cauchy’s theorem for ordinary differential equations.

The above definition should not be taken as universal, but rather as a good starting point, which may need to be adjusted depending on the problem. Consider for instance the uniqueness statement, which, as given in Definition 1.1(ii), is in the strongest form, which is often referred to as unconditional uniqueness. Often this may need to be relaxed somewhat, particularly when low regularity solutions are concerned. Some common variations concerning uniqueness are as follows:

(a): The solutions $u$ in Definition 1.1(i) are shown to belong to a smaller space, $X^s_T \subset C([0,T]; H^s(\mathbb{R}^n))$, and then the uniqueness in Definition 1.1(ii) holds in the same class.
(b): Unconditional uniqueness holds a priori only in a more regular class $H^{N}$ with $N > s$, but the data to solution map extends continuously as a map from $H^s$ to $C([0,T]; H^s)$.

Since we are discussing nonlinear equations here, the lifespan of the solutions need not be infinite, i.e., there is always the possibility that solutions may blow up in finite time. In particular, in the context of well-posed problems it is natural to consider the notion of maximal lifespan, which is the largest $T$ for which the solution exists in $C([0,T); H^s)$; here the limit of $u(t)$ as $t$ approaches $T$ cannot exist, or else the solution $u$ may be continued further.

In this context, the last property in Definition 1.1 should be interpreted to mean in particular that, for a solution $u\in C([0,T];H^s)$, small perturbations of the initial data $u_0$ yield solutions which are also defined in $[0,T]$. This in turn implies that the maximal lifespan $T = T(u_0)$ is lower semicontinuous as a function of $u_0 \in H^s(\mathbb{R}^n)$.
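To spell out the last implication, here is a short sketch of the argument; the symbols $T_0$, $\delta$, and $v_0$ are introduced only for this illustration. Fix any $T_0 < T(u_0)$, so that $u \in C([0,T_0];H^s)$. By the above interpretation of continuous dependence, there exists $\delta > 0$ so that

$$\begin{equation*} \|v_0 - u_0\|_{H^s} < \delta \quad \Longrightarrow \quad T(v_0) \geq T_0 . \end{equation*}$$

Since $T_0 < T(u_0)$ was arbitrary, this gives

$$\begin{equation*} \liminf _{v_0 \to u_0} T(v_0) \geq T(u_0), \end{equation*}$$

which is exactly the lower semicontinuity property.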

In view of the above discussion, it is always interesting to provide more precise assertions about the lifespan of solutions, or, equivalently, continuation (or blow-up) criteria for the solutions. Some interesting examples are as follows:

(a) The lifespan $T(u_0)$ is bounded from below uniformly for data in a bounded set,

$$\begin{equation*} T(u_0) \geq C(\|u_0\|_{H^s}) > 0. \end{equation*}$$

This implies a blow-up criterion as follows:

$$\begin{equation*} \lim _{t \to T(u_0)} \| u(t) \|_{H^s} = \infty . \end{equation*}$$

(b) The blowup may be characterized in terms of weaker bounds,

$$\begin{equation*} \lim _{t \to T(u_0)} \| u(t) \|_{Y} = \infty , \end{equation*}$$

relative to a Banach topology $Y \supset H^s$, or perhaps a time integrated version thereof,

$$\begin{equation*} \int _{0}^{T(u_0)} \| u(t) \|_{Y} dt = \infty . \end{equation*}$$
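As a sketch of why the lifespan bound in (a) yields the stated blow-up criterion, one may argue by contradiction. The notation $M$, $t_n$, and $c(M)$ is introduced only for this illustration, with $c(M) > 0$ denoting a lower bound for the lifespan of all data of $H^s$ size at most $M$; we also assume that the equation is invariant with respect to time translations, as is the case for Equation 1.1. Suppose $T(u_0) < \infty$ but the $H^s$ norm of $u(t)$ does not tend to infinity. Then there exist times $t_n \to T(u_0)$ with

$$\begin{equation*} \| u(t_n) \|_{H^s} \leq M . \end{equation*}$$

Solving the equation with data $u(t_n)$ at time $t_n$ produces a solution on $[t_n, t_n + c(M)]$, which by uniqueness must agree with $u$ on their common domain. For $n$ large enough we have

$$\begin{equation*} t_n + c(M) > T(u_0), \end{equation*}$$

so $u$ extends beyond $T(u_0)$, contradicting the maximality of the lifespan.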

To conclude our discussion of the above definition, we note that many well-posedness statements also provide additional properties for the flow:

Higher regularity: If the initial data has more regularity, $u_0 \in H^\sigma$ with $\sigma > s$, then this regularity carries over to the solution, $u \in C([0,T);H^\sigma )$, with bounds and lifespan bounds depending only on the $H^s$ size of the data.
Weak Lipschitz bounds: On bounded sets in $H^s$, the flow is Lipschitz in a weaker topology (e.g., up to $H^{s-1}$ in our model problem).

Both of these properties are often an integral part of a complete theory and frequently also serve as intermediate steps in establishing the main well-posedness result.

In all of the above discussion, a common denominator remains the fact that the data to solution map is locally continuous but not uniformly continuous. It is very natural indeed to redefine (expand) the notion of quasi-linear evolution equations to include all flows that share this property.

In many problems of this type, one is interested not only in local well-posedness in some Sobolev space $H^s$, but also in lowering the exponent $s$ as much as possible. We will refer to such solutions as rough solutions. Then, a natural question is what kind of regularity thresholds should one expect or aim for in such problems? One clue in this direction comes from the scaling symmetry, whenever available. As an example, our model problem exhibits the scaling symmetry

$$\begin{equation*} u(t,x) \to u(\lambda t, \lambda x), \qquad \lambda > 0. \end{equation*}$$

The scale-invariant initial-data Sobolev space corresponding to this symmetry is the homogeneous space $\dot{H}^{s_c}$, where $s_c = n/2$. This space is called the critical Sobolev space, and it should heuristically be thought of as an absolute lower bound for any reasonable well-posedness result. Whereas in some semilinear dispersive evolutions one can actually reach this threshold, in quasi-linear flows it seems to be out of reach in general.
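For the reader's convenience, here is a short verification of both claims; the notation $u_\lambda (t,x) \coloneq u(\lambda t, \lambda x)$ and $u_{0,\lambda }(x) \coloneq u_0(\lambda x)$ is introduced only for this computation. If $u$ solves Equation 1.2, then by the chain rule

$$\begin{equation*} \partial _t u_\lambda = \lambda (\partial _t u)(\lambda t, \lambda x) = \lambda \mathcal{A}^j(u(\lambda t, \lambda x)) (\partial _j u)(\lambda t, \lambda x) = \mathcal{A}^j(u_\lambda ) \partial _j u_\lambda , \end{equation*}$$

so $u_\lambda$ is again a solution, with initial data $u_{0,\lambda }$. On the Fourier side we have $\widehat{u_{0,\lambda }}(\xi ) = \lambda ^{-n} \widehat{u_0}(\xi /\lambda )$; therefore, changing variables $\xi = \lambda \eta$,

$$\begin{equation*} \| u_{0,\lambda } \|_{\dot{H}^s}^2 = \int |\xi |^{2s} \lambda ^{-2n} |\widehat{u_0}(\xi /\lambda )|^2 \, d\xi = \lambda ^{2s-n} \| u_0 \|_{\dot{H}^s}^2 . \end{equation*}$$

The norm is thus invariant under the scaling exactly when $s = n/2$.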

1.3. A set of results for the model problem

In order to state the results, we begin with a discussion of control parameters. We will use two such control parameters. The first one is

$$\begin{equation*} A = \| u \|_{L^\infty }. \end{equation*}$$

This is a scale-invariant quantity, which appears in the implicit constants in all of our bounds. Our second control parameter is

$$\begin{equation*} B = \| \nabla u \|_{L^\infty }, \end{equation*}$$

which instead will be shown to control the energy growth in all the energy estimates. Precisely, the norm $B$ plays the role of the norm $Y$ mentioned in the discussion above.

The primary well-posedness result for the model problem is as follows:

Theorem 1.

The equation Equation 1.2 is locally well-posed in $H^s$ in the Hadamard sense for $s > \frac{n}{2}+1$.

The reader will notice that this result is one derivative above scaling. It is also optimal in some cases, including the scalar case (where the problem can be solved locally using the method of characteristics), but it is not optimal in many other cases where the system is dispersive.

For the uniqueness result we have in effect a stronger statement that only requires Lipschitz bounds for $u$. This however does not improve the scaling comparison relative to the critical spaces:

Theorem 2.

Uniqueness holds in the Lipschitz class, and we have the $L^2$ difference bound

$$\begin{equation} \| (u_1- u_2)(t)\|_{L^2} \lesssim e^{C(A) \int _0^t B(s)\, ds} \| (u_1- u_2)(0)\|_{L^2}. \tag{1.3} \end{equation}$$

This is exactly the kind of weak Lipschitz bound discussed earlier. With a bit of additional effort, for the $H^s$ solutions in Theorem 1 this may be extended to a larger range of Sobolev spaces,

$$\begin{equation} \| (u_1- u_2)\|_{L^\infty ( [0,T];H^\sigma )} \lesssim \| (u_1- u_2)(0)\|_{H^\sigma }, \qquad |\sigma | \leq s-1. \tag{1.4} \end{equation}$$

The small price to pay here is that now the implicit constant in the estimate depends not only on $A$ and $B$ but also on the norms of $u_1$ and $u_2$ in $C([0,T];H^s)$.

A key role in the proof of the well-posedness result is played by the energy estimates, which are also of independent interest:

Theorem 3.

The following bounds hold for solutions to Equation 1.2 for all $s \geq 0$:

$$\begin{equation} \| u(t)\|_{H^s} \lesssim e^{C(A) \int _0^t B(s)\, ds} \| u(0)\|_{H^s}. \tag{1.5} \end{equation}$$

Finally, as a corollary of the last result, we obtain a continuation criterion for solutions:

Theorem 4.

Solutions can be continued in $H^s$ for as long as $\int B$ remains finite.

Theorem 1 was first proved by Kato Reference 16, borrowing ideas from nonlinear semigroup theory; see, e.g., Barbu’s book Reference 4. The existence and uniqueness part, as well as the energy estimates, can also be found in standard references, e.g., in the books of Taylor Reference 30, Hörmander Reference 10, and Sogge Reference 24 (in the last two the wave equation is considered, but the idea is similar). However, interestingly enough, the continuous dependence part is missing in all these references. We did find presentations of continuous dependence arguments inspired by Kato’s work in Chemin’s book Reference 3, and also on Tao’s blog Reference 26.

Our objective for the remainder of the paper will be to provide complete proofs for Theorems 1, 2, 3, and 4, which readers may take as a guide for their problem of choice. While these results are not new in the model case we consider, to the best of our knowledge this is the first time that the proofs of these results are presented in this manner. Along the way, we will also provide extensive comments and pointers to alternative methods developed over the years.

In particular, we would emphasize the frequency envelope approach for the regularization and continuous dependence parts, as well as the time discretization approach for the existence proof. The frequency envelope approach has been repeatedly used by the authors, jointly with different collaborators, in a number of papers (see, e.g., Reference 23, Reference 29, Reference 18, Reference 12, Reference 15), with some of the ideas crystallizing along the way. The version of the existence proof based on time discretization is in some sense very classical, going back to ideas which originally appeared in the context of semigroup theory; however, its implementation is inspired by the authors’ recent work Reference 15, though the situation considered here is considerably simpler.

1.4. An outline of these notes

Our strategy will be, in each section, to provide some ideas and a broader discussion in the context of the general equation Equation 1.1 and then show how this works in detail in the context of our chosen example Equation 1.2.

In Section 2 we introduce the paradifferential form of our equations, both the main equation and its linearization. This is an idea that goes back to work of Bony Reference 6 and helps clarify the roles played by different frequency interaction modes in the equation. Another very useful reference here is Metivier’s more recent book Reference 21.

Section 3 is devoted to the energy estimates in multiple contexts. These are presented for the full equation, for its linearization, for its associated linear paradifferential flow, and for differences of solutions. The latter, in turn, yields the uniqueness part of the well-posedness theorem. A common misconception here has been that for well-posedness it suffices to prove energy estimates for the full equation. Instead, in our presentation we regard the bound for the linearized problem as fundamental, though, at the implementation level, it is the paradifferential flow bound that can be found at the core.

Section 4 provides two approaches for the existence part of the well-posedness theorem. The first one, more classical, is based on an iteration scheme, which works well on our model problem but may run into implementation issues in more complex problems. The second approach, which we regard as more robust, relies on time discretization, and is somewhat related to nonlinear semigroup theory, which also inspired Kato’s work. Two other possible strategies, which have played a role historically, are briefly outlined.

Section 5 introduces Tao’s notion of frequency envelopes (see for example Reference 27), which is very well suited to track the flow of energy as time progresses. This is then used to show how rough solutions can be obtained as uniform limits of smooth solutions. This is a key step in many well-posedness arguments, and helps decouple the regularity for the initial existence result from the rough data results.

Finally, Section 6 is devoted to the continuous dependence result, where we provide the modern frequency-envelope-based approach. At the same time, for a clean, elegant reinterpretation of Kato’s original strategy, we refer the reader to Tao’s blog Reference 26.

2. A menagerie of related equations

While ultimately one would want all the results stated in terms of the full nonlinear equation, any successful approach to quasi-linear problems needs to also consider a succession of closely related linear equations as well as associated reformulations of the nonlinear flow. Here we aim to motivate and describe these related flows, stripping away technicalities.

2.1. The linearized equation

This plays a key role in comparing different solutions; we will write it in the form

$$\begin{equation} v_t = DN(u)v, \qquad v(0) = v_0 , \cssId{gen-eq-lin}{\tag{2.1}} \end{equation}$$

where $DN$ stands for the differential of $N$, which in our setting is a partial differential operator of order $k$. One may also reinterpret the equation for the difference of two solutions as a perturbed linearized equation with a quadratic source term. Some caution is required here, because often some structure is lost in doing this, and the question is whether or not that is too much.

In the particular case of Equation 1.2, the linearized equation takes the form

$$\begin{equation} \partial _t v = \mathcal{A}^j(u) \partial _j v + D\mathcal{A}^j(u) v \, \partial _j u, \qquad v(0) = v_0 . \cssId{sym-hyp-lin}{\tag{2.2}} \end{equation}$$
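One way to arrive at Equation 2.2 is sketched below; the one-parameter family $u^h$ is introduced only for this illustration. Let $u^h$ be a smooth family of solutions to Equation 1.2 with $u^0 = u$, and set $v = \partial _h u^h |_{h=0}$. Differentiating the equation $\partial _t u^h = \mathcal{A}^j(u^h) \partial _j u^h$ with respect to $h$ at $h=0$ gives

$$\begin{equation*} \partial _t v = \mathcal{A}^j(u) \partial _j v + (D\mathcal{A}^j(u) v) \, \partial _j u . \end{equation*}$$

For comparison, the difference of two solutions $u_1$ and $u_2$ satisfies

$$\begin{equation*} \partial _t (u_1 - u_2) = \mathcal{A}^j(u_1) \partial _j (u_1 - u_2) + (\mathcal{A}^j(u_1) - \mathcal{A}^j(u_2)) \, \partial _j u_2 , \end{equation*}$$

which agrees with the linearized equation around $u_1$ up to terms which are quadratic in $u_1 - u_2$.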

2.2. The linear paradifferential equation

One distinguishing feature of quasi-linear evolutions is that the nonlinearity cannot be interpreted as perturbative. Nevertheless, one may seek to separate parts of the nonlinearity which can be seen as perturbative, at least at high regularity, in order to better isolate and understand the nonperturbative part.

To narrow things down, consider a nonlinear term which is quadratic, say of the form $\partial ^\alpha u_1 \partial ^\beta u_2$, and consider the three modes of interaction between these terms, according to the Littlewood–Paley trichotomy or paraproduct decomposition,

$$\begin{equation*} \partial ^\alpha u_1 \partial ^\beta u_2 = T_{\partial ^\alpha u_1} \partial ^\beta u_2 + T_{\partial ^\beta u_2} \partial ^\alpha u_1 + \Pi (\partial ^\alpha u_1, \partial ^\beta u_2), \end{equation*}$$

where the three terms represent the low-high, high-low, and high-high frequency interactions, respectively. The high-high interactions in the last term are always perturbative at high regularity, so they are placed into the perturbative box. But one cannot do the same with the low-high or high-low interactions, which are kept on the nonperturbative side. This is closely related to the linearization and, indeed, at the end of the day, we are left with a paradifferential style nonperturbative part of our evolution, which we can formally write as

$$\begin{equation} w_t = T_{DN(u)} w, \qquad w(0) = w_0 . \cssId{gen-eq-para}{\tag{2.3}} \end{equation}$$

Here, one can naively use Bony’s notion of a paraproduct Reference 6 to define the linear operator $T_{DN(u)}$ as

$$\begin{equation*} T_{DN(u)} w = \sum _{|\alpha |\leq k } T_{\partial _{p^\alpha }N(u) } \partial ^\alpha w, \end{equation*}$$

where $p^\alpha$ is a placeholder for the $\partial ^\alpha u$ argument of the nonlinearity $N$. However, there are also other related choices one can make; see for instance the discussion at the end of this subsection. For a discussion on the use of paradifferential calculus in nonlinear PDEs (though not the above notation), we refer the reader to Metivier’s book Reference 21.
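For concreteness, one common convention for the paraproduct decomposition is recalled below as a sketch. Here $P_k$ and $P_{<k}$ denote Littlewood–Paley projections onto frequencies $\approx 2^k$, respectively $\lesssim 2^k$; the frequency gap $4$ is an arbitrary choice made for this illustration, and we ignore the minor adjustments needed at the lowest frequencies:

$$\begin{equation*} T_f g = \sum _{k} P_{<k-4} f \, P_k g, \qquad \Pi (f,g) = \sum _{|j-k| \leq 4} P_j f \, P_k g . \end{equation*}$$

Splitting the double sum $fg = \sum _{j,k} P_j f \, P_k g$ into the regions $j < k-4$, $k < j-4$, and $|j-k| \leq 4$ then gives

$$\begin{equation*} f g = T_f g + T_g f + \Pi (f,g). \end{equation*}$$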

One can think of the above evolution as a linear evolution of high frequency waves on a low frequency background. Then one can interpret solving the nonperturbative part of our evolution as an infinite-dimensional triangular system, where each dyadic frequency of the solution is obtained at some step by solving a linear system with coefficients depending only on the lower components, and in turn it affects the coefficients of the equations for the higher frequency components. Of course, this should only be understood in a philosophical sense, because a variable coefficient flow in general does not preserve frequency localizations. This can sometimes be achieved with careful choices of the paraproduct quantizations, but it never seems worthwhile to implement, as the perturbative terms will mix frequencies anyway and add tails.

Turning to our model problem, in a direct interpretation the associated paradifferential equation will have the form

$$\begin{equation} \partial _t w = T_{\mathcal{A}^j(u)} \partial _j w + T_{D\mathcal{A}^j(u) \partial _j u} w \, , \qquad w(0) = w_0 . \cssId{sym-hyp-para}{\tag{2.4}} \end{equation}$$

However, upon closer examination one may see several choices that could be made. Considering for instance the first paraproduct, which of the following expressions would make the better choice at frequency $2^k$?

$$\begin{equation*} \mathcal{A}^j(u)_{< k-8} \partial _j w_k, \qquad \mathcal{A}^j(u_{< k-8}) \partial _j w_k, \qquad [\mathcal{A}^j(u_{< k-8})]_{<k-4} \partial _j w_k. \end{equation*}$$

The last one may seem the most complicated, but it is also the most accurate. In many cases, including our model problem, it makes no difference in practice. However, one should be aware that often a simpler choice, which is made for convenience in one problem, might not work in a more complex setting.

Remark 2.1.

Here the frequency gap, which was set to be equal to $8$ in the above formulas, is chosen rather arbitrarily; its role is simply to enforce the frequency separation between the coefficients and the leading term. On occasion, particularly in large data problems, it is also useful to work instead with a large frequency gap as a proxy for smallness; see, e.g., Reference 25.

2.3. The paradifferential formulation of the main equations

Consider first our general equation Equation 1.1, which we can write in the form

$$\begin{equation} u_t = T_{DN(u)} u + F(u), \qquad u(0) = u_0 . \cssId{gen-para}{\tag{2.5}} \end{equation}$$

Here one would hope that the paradifferential source term can be seen as perturbative, in the sense that

$$\begin{equation*} F: H^s \to H^s, \qquad \text{ Lipschitz}. \end{equation*}$$

Similarly, we can write the linearized equation Equation 2.1 in the same format,

$$\begin{equation} v_t = T_{DN(u)} v + F^{\operatorname {lin}}(u) v, \qquad v(0) = v_0 , \cssId{gen-para-lin}{\tag{2.6}} \end{equation}$$

with the appropriate nonlinearity $F^{\operatorname {lin}}$. This is still based on the paradifferential equation Equation 2.3 but can no longer be interpreted as the direct paralinearization of the linearized equation. This is because the expression $F^{\operatorname {lin}}(u) v$ also contains some low-high interactions, precisely those where $v$ is the low frequency factor.

3. Energy estimates

Energy estimates are a critical part of any well-posedness result, even if they do not tell the entire story. In this section we begin with a heuristic discussion of several ideas in the general case and then continue with some more concrete analysis in the model case.

3.1. The general case

Consider first the energy estimates for the general problem Equation 1.1, where it is simpler to think of this in the paradifferential formulation Equation 2.5. An energy estimate for this problem is an estimate that allows us to control the time evolution of the Sobolev norms of the solution. In the simplest formulation, the idea would be to prove that

$$\begin{equation*} \frac{d}{dt} \| u \|^2_{H^\sigma } \lesssim C \| u\|^2_{H^\sigma }, \end{equation*}$$

with a constant $C$ that at the very least depends on the $H^s$ norm of $u$.

There are two points that one should take into account when considering such estimates. The first is that it is often useful to strengthen such bounds by relaxing the dependence of the constant $C$ on $u$. Heuristically, the idea is that this constant measures the effect of nonlinear interactions, which are strongest when our functions are pointwise large, not only large in an $L^2$ sense. Thus, it is often possible to replace the constant $C$ with an analogue of the uniform control norm $B$ in the model case, perhaps with some additional implicit dependence on another scale-invariant uniform control parameter $A$. See however the discussion in Remark 3.2.

A second point is that, although it is tempting to try to work directly with the $H^s$ norm, it is often the case that the straight $H^s$ norm is not well adapted to the structure of the problem; see, e.g., what happens in water waves Reference 2, Reference 12. Then it is useful to construct energy functionals $E^\sigma$ adapted to the problem at hand. For these energies we should aim for the following properties.

(i) Energy equivalence:

$$\begin{equation} E^\sigma (u) \approx \|u\|_{H^\sigma }^2. \tag{3.1} \end{equation}$$

(ii) Energy propagation:

$$\begin{equation} \frac{d}{dt} E^\sigma (u) \lesssim _A B \| u\|^2_{H^\sigma }, \cssId{En-sigma}{\tag{3.2}} \end{equation}$$

where the control parameter $B$ satisfies

$$\begin{equation} B \lesssim \| u\|_{H^s} . \tag{3.3} \end{equation}$$
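To see how the two properties above are typically combined, here is a sketch of the Gronwall step. We assume, only for simplicity, that $A$ stays bounded on the time interval considered, so that the implicit constants can be replaced by a single constant $C(A)$ which is uniform in time. The energy equivalence and the energy propagation bound together give

$$\begin{equation*} \frac{d}{dt} E^\sigma (u(t)) \leq C(A) B(t) E^\sigma (u(t)), \end{equation*}$$

and integrating this differential inequality yields

$$\begin{equation*} E^\sigma (u(t)) \leq e^{C(A) \int _0^t B(\tau )\, d\tau } E^\sigma (u(0)). \end{equation*}$$

Using the energy equivalence once more at times $0$ and $t$, and adjusting $C(A)$, one arrives at

$$\begin{equation*} \| u(t)\|_{H^\sigma } \lesssim e^{C(A) \int _0^t B(\tau )\, d\tau } \| u(0)\|_{H^\sigma }, \end{equation*}$$

which is a bound of the same form as Equation 1.5.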

Now consider our main equation written in the form Equation 2.5. For the perturbative part of the nonlinearity $F$ we hope to have some boundedness,

$$\begin{equation} \| F(u) \|_{H^\sigma } \lesssim _A B \| u\|_{H^\sigma } . \cssId{F-bound}{\tag{3.4}} \end{equation}$$

This in turn allows us to reduce nonlinear energy bounds of the form Equation 3.2 to similar bounds for the linear paradifferential equation Equation 2.3. One may legitimately worry here that some structure is lost when we decouple the paradifferential coefficients from the evolution variable; however, the point is that these two objects are indeed separate, as they represent different frequencies of the solution.

Remark 3.1.

In our discussion here we took the simplified view that bounds for $F$ begin at $\sigma = 0$. But this is not always the case in practice, and often one needs to identify the lower range for $\sigma$ where this works; see, e.g., the nonlinear wave equation Reference 23, the wave map equation Reference 29, or the water wave problem considered in Reference 1.

Now consider the paradifferential evolution Equation 2.3, and begin with the $L^2$ case by setting $\sigma = 0$. Then we need to produce a linearized type energy $E^{0,\operatorname {lin}}_u$ so that the solutions satisfy

$$\begin{equation} \frac{d}{dt} E^{0,\operatorname {lin}}_u(w) \lesssim _A B \| w\|^2_{L^2} . \cssId{En-0-para}{\tag{3.5}} \end{equation}$$

Then the associated nonlinear energy at $\sigma =0$ would be

$$\begin{equation*} E^0(u) = E^{0,\operatorname {lin}}_u (u). \end{equation*}$$

If $E_u^{0,\operatorname {lin}}(w)= \|w\|_{L^2}^2$, then the bound Equation 3.5 would simply require that the paradifferential operator $T_{DN(u)}$ is essentially antisymmetric in $L^2$. If that is not true, then the backup plan is to find an equivalent Hilbert norm on $L^2$ so that the antisymmetry holds. Some care is needed however; if this norm depends on $u$, then this dependence needs to be mild.
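In the model case, the antisymmetry mechanism can be seen in the following sketch, which we carry out for the linearized equation Equation 2.2, i.e., with ordinary products in place of paraproducts, and with the energy $\|v\|_{L^2}^2$. We assume that $v$ is sufficiently regular and decaying to justify the integration by parts, and we write $\langle \cdot , \cdot \rangle$ for the real $L^2$ inner product. Since the matrices $\mathcal{A}^j$ are symmetric, we have

$$\begin{equation*} 2 \langle \mathcal{A}^j(u) \partial _j v, v \rangle = - \langle \partial _j [\mathcal{A}^j(u)] \, v, v \rangle , \end{equation*}$$

and therefore

$$\begin{equation*} \frac{d}{dt} \| v\|_{L^2}^2 = - \langle \partial _j [\mathcal{A}^j(u)] \, v, v \rangle + 2 \langle (D\mathcal{A}^j(u) v) \, \partial _j u, v \rangle . \end{equation*}$$

Both terms on the right contain exactly one derivative of $u$, with coefficients depending smoothly on $u$, which leads to

$$\begin{equation*} \frac{d}{dt} \| v\|_{L^2}^2 \lesssim _A B \| v\|_{L^2}^2 . \end{equation*}$$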

The next step is to consider a larger $\sigma$. By interpolation it suffices to work with integer $\sigma$, in which case one might simply differentiate Equation 2.3,

$$\begin{equation*} (\partial ^\sigma w)_t = T_{DN(u)} (\partial ^\sigma w) + [\partial ^\sigma , T_{DN(u)}] w. \end{equation*}$$

Here we would be done if the last commutator is bounded from $H^\sigma$ into $L^2$. In principle that would be the case almost automatically, at least when the order $k$ of $N$ is at most one. One can heuristically associate this with the finite speed of propagation in the high frequency limit.
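As an illustration in the model case with $\sigma = 1$, consider a single term $T_a \partial _j$ with $a = \mathcal{A}^j(u)$. This is only a sketch, under the assumption that the paraproduct is defined by a quantization of the form $T_a w = \sum _k P_{<k-4} a \, P_k w$, where the gap $4$ is an arbitrary choice. Since derivatives commute with the Littlewood–Paley projections, the Leibniz rule applies to $T_a$, and we obtain

$$\begin{equation*} [\partial , T_a \partial _j] w = T_{\partial a} \partial _j w . \end{equation*}$$

By the first bound in Equation 3.6 this gives

$$\begin{equation*} \| [\partial , T_a \partial _j] w \|_{L^2} \lesssim \| \partial a\|_{L^\infty } \| \partial _j w\|_{L^2} \lesssim _A B \| w\|_{H^1}, \end{equation*}$$

which is the desired commutator bound from $H^1$ into $L^2$.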

Remark 3.2.

The case $k > 1$, which corresponds to an infinite speed of propagation, is often more delicate; see, e.g., Reference 17, Reference 18, Reference 19 for quasi-linear Schrödinger flows or Reference 14 for capillary waves. There one needs to further develop the function space structure based on either dispersive properties of solutions or on normal forms analysis.

3.2. Coifman-Meyer and Moser type estimates

Before considering our model problem, we briefly review some standard bilinear and nonlinear estimates that play a role later on. In the context of bilinear estimates, a standard tool is to consider the Littlewood–Paley paraproduct type decomposition of the product of two functions, which leads to Coifman–Meyer type estimates; see Reference 7, Reference 22:

Proposition 3.3.

Using the standard paraproduct notations, one has the following estimates,

$$\begin{equation} \begin{aligned} &\Vert T_fg\Vert _{L^2} \lesssim \Vert f\Vert _{L^\infty } \Vert g\Vert _{L^2},\\ &\Vert T_fg\Vert _{L^2} \lesssim \Vert g\Vert _{BMO} \Vert f\Vert _{L^2},\\ &\Vert \Pi (f,g)\Vert _{L^2} \lesssim \Vert f\Vert _{BMO} \Vert g\Vert _{L^2},\\ \end{aligned} \cssId{CM}{\tag{3.6}} \end{equation}$$

as well as the commutator bound

$$\begin{equation} \Vert [P_k, f]g\Vert _{L^2} \lesssim 2^{-k}\Vert \partial _xf\Vert _{L^{\infty }}\Vert g\Vert _{L^2}. \tag{3.7} \end{equation}$$

Here $P_k$ is the Littlewood–Paley projection onto frequencies $\approx 2^k$.

These results are standard in the harmonic/microlocal analysis community. For nonlinear expressions we use Moser type estimates instead:

Proposition 3.4.

The following Moser estimate holds for a smooth function $F$ with $F(0)=0$ and for $s \geq 0$:

$$\begin{equation*} \Vert F(u)\Vert _{H^s}\lesssim _{\Vert u\Vert _{L^{\infty }}}\Vert u\Vert _{H^s}. \end{equation*}$$

Of course many more extensions of both the bilinear and the nonlinear estimates above are available.

3.3. The model case

We now turn our attention to our model problem, where, if we adopt the expression Equation 2.4 for the paradifferential flow, the source term $F(u)$ is given by

$$\begin{equation} F(u) = \mathcal{A}^j(u) \partial _j u - T_{\mathcal{A}^j(u)} \partial _j u - T_{D\mathcal{A}^j(u) \partial _j u} u . \tag{3.8} \end{equation}$$

We can rewrite this in the form

$$\begin{equation} F(u) = \Pi (\mathcal{A}^j(u), \partial _j u) + T_{\partial _j u} \mathcal{A}^j(u) - T_{D\mathcal{A}^j(u) \partial _j u} u . \cssId{texmlid1}{\tag{3.9}} \end{equation}$$
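To verify this, one applies the paraproduct decomposition to the product $\mathcal{A}^j(u) \partial _j u$,

$$\begin{equation*} \mathcal{A}^j(u) \partial _j u = T_{\mathcal{A}^j(u)} \partial _j u + T_{\partial _j u} \mathcal{A}^j(u) + \Pi (\mathcal{A}^j(u), \partial _j u), \end{equation*}$$

and substitutes this into the definition of $F(u)$. The low-high term $T_{\mathcal{A}^j(u)} \partial _j u$ then cancels, which yields Equation 3.9.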

For this expression we can show that it always plays a perturbative role:

Proposition 3.5.

The above nonlinearity $F$ satisfies the following bounds:

(i) Sobolev bounds:

$$\begin{equation} \| F(u) \|_{H^\sigma } \lesssim _A B \|u\|_{H^\sigma }, \qquad \sigma \geq 0. \tag{3.10} \end{equation}$$

(ii) Difference bounds:

$$\begin{equation} \| F(u) - F(v) \|_{H^\sigma } \lesssim _A B \left[\|u-v\|_{H^\sigma }+ \| u-v\|_{L^\infty } (\| u\|_{H^\sigma } + \|v\|_{H^\sigma }) \right] , \qquad \sigma \geq 0 , \tag{3.11} \end{equation}$$

as well as

$$\begin{equation} \| F(u) - F(v) \|_{L^2} \lesssim _A B \|u-v\|_{L^2}. \tag{3.12} \end{equation}$$

The next-to-last bound shows in particular that $F$ is Lipschitz in $H^s$ for $s > n/2$. The simplification in the case $\sigma = 0$ is also useful in order to bound differences of solutions in the $L^2$ topology.

Proof.

(i) We use the expression Equation 3.9 for $F$. The first term can be estimated using a version of the Coifman–Meyer estimates and Moser estimates by

$$\begin{equation*} \| \Pi (\mathcal{A}^j(u), \partial _j u)\|_{H^\sigma } \lesssim \| \mathcal{A}^j(u)\|_{H^\sigma } \| \partial _j u\|_{BMO} \lesssim _A B \|u\|_{H^\sigma }. \end{equation*}$$

For the second term we use again paraproduct bounds and Moser estimates to get

$$\begin{equation*} \| T_{\partial _j u} \mathcal{A}^j(u) \|_{H^\sigma } \lesssim \| \partial _j u\|_{L^\infty } \| \mathcal{A}^j(u) \|_{H^\sigma } \lesssim _A \| \partial _j u\|_{L^\infty } \| u \|_{H^\sigma }. \end{equation*}$$

The third term is similar to the second.

(ii) First, we note the representation

$$\begin{equation*} \mathcal{A}(u) - \mathcal{A}(v) \eqcolon G(u,v)(u-v), \end{equation*}$$

which we use to separate $u-v$ factors. Here $G(u,v)$ is a smooth function of $u$ and $v$. Then taking differences in the first term of $F$, we need two estimates

$$\begin{equation*} \| \Pi (\mathcal{A}^j(u), \partial _j (u-v))\|_{H^\sigma } \lesssim \| \partial \mathcal{A}^j(u)\|_{L^\infty } \| u-v\|_{H^\sigma } \lesssim _A B \|u-v\|_{H^\sigma } \end{equation*}$$

and

$$\begin{equation*} \begin{split} \| \Pi (G(u,v)(u-v), \partial _j v)\|_{H^\sigma } &\lesssim \| G(u,v)(u-v)\|_{H^\sigma } \| \partial v\|_{L^\infty }\\ &\lesssim _A B (\|u-v\|_{H^\sigma } + \| u-v\|_{L^\infty }(\|u\|_{H^\sigma } +\|v\|_{H^\sigma })), \end{split} \end{equation*}$$

noting that for $\sigma =0$ the last term can be avoided.

Similarly, we have two estimates corresponding to the second term in $F$, namely

$$\begin{equation*} \begin{aligned} \|T_{\partial _j u} \mathcal{A}^j(u) - T_{\partial _j v} \mathcal{A}^j(v)\|_{H^\sigma } &= \|T_{\partial _j u}[G(u,v) (u-v)] + T_{\left[\partial _j u-\partial _j v\right]} \mathcal{A}^j(v)\|_{H^\sigma }\\ &\lesssim \|T_{\partial _j u}[G(u,v) (u-v)]\|_{H^\sigma } + \|T_{\left[\partial _j u-\partial _j v\right]} \mathcal{A}^j(v)\|_{H^\sigma }, \end{aligned} \end{equation*}$$

where

$$\begin{equation*} \|T_{\left[\partial _j u-\partial _j v\right]} \mathcal{A}^j(v)\|_{H^\sigma } \lesssim \| u-v\|_{L^\infty } \| \partial _j \mathcal{A}^j(v) \|_{H^\sigma } \lesssim _A B \|u-v\|_{L^\infty } \| v \|_{H^\sigma } \end{equation*}$$

and

$$\begin{equation*} \|T_{\partial _j u} [G(u,v)(u-v)] \|_{H^\sigma } \lesssim _A \| \partial _j u\|_{L^\infty } (\|u-v\|_{H^\sigma } + \| u-v\|_{L^\infty }(\|u\|_{H^\sigma } +\|v\|_{H^\sigma })), \end{equation*}$$

both with obvious simplifications if $\sigma = 0$. Finally, the bounds for the third term in $F$ are similar to the ones for the second.

■

Remark 3.6.

For Proposition 3.5 one can further relax $B$ to a $BMO$ norm,

$$\begin{equation*} B = \| \nabla u \|_{BMO}. \end{equation*}$$

On the other hand we can also simplify the paradifferential equation Equation 2.4 to a simpler version,

$$\begin{equation*} w_t = T_{\mathcal{A}^j(u)} \partial _j w , \end{equation*}$$

but in this case we can no longer relax $B$ to a BMO norm.

Next we consider the paradifferential equation:

Proposition 3.7.

Assume that $u \in L_{t,x}^\infty$ and $\nabla u \in L^1_t L_x^\infty$ (i.e., $B\in L^1_t$). Then the paradifferential equation Equation 2.4 is well-posed in all $H^\sigma$ spaces, $\sigma \in \mathbb{R}$, and

$$\begin{equation} \frac{d}{dt} \| w\|_{H^\sigma }^2 \lesssim _A B \|w\|_{H^\sigma }^2 . \tag{3.13} \end{equation}$$

Proof.

We first consider the energy estimate, where we work with the corresponding inhomogeneous equation,

$$\begin{equation} \partial _t w = T_{\mathcal{A}^j(u)} \partial _j w + T_{D\mathcal{A}^j(u) \partial _j u} w + f\, , \qquad w(0) = w_0 . \cssId{sym-hyp-para-inhom}{\tag{3.14}} \end{equation}$$

The $L^2$ bound is easiest; we have

$$\begin{equation*} \frac{1}{2} \frac{d}{dt} \|w\|_{L^2}^2 = \int w \cdot T_{\mathcal{A}^j(u)} \partial _j w + w\cdot T_{D\mathcal{A}^j(u) \partial _j u} w + w \cdot f \, dx. \end{equation*}$$

In the second term we simply estimate the paracoefficient in $L^\infty$. In the first term we commute and integrate by parts to arrive at

$$\begin{equation*} \frac{1}{2} \int - w \cdot T_{\partial _j \mathcal{A}^j(u)} w + w \cdot (T_{\mathcal{A}^j(u)}- (T_{\mathcal{A}^j(u)})^*) \partial _j w\, dx , \end{equation*}$$

where due to the symmetry of the matrices $\mathcal{A}^j$ we have the bound

$$\begin{equation} \| (T_{\mathcal{A}^j(u)}- (T_{\mathcal{A}^j(u)})^*) \partial _j w \|_{L^2} \lesssim _A B\|w\|_{L^2}, \cssId{adj-diff}{\tag{3.15}} \end{equation}$$

which shows that the corresponding paraproduct operators are self-adjoint at leading order. Here we use the $^*$ notation to denote the adjoint of an operator. Hence we obtain

$$\begin{equation*} \left|\frac{d}{dt} \| w\|_{L^2}^2 \right| \lesssim _A B \|w\|_{L^2}^2 + \| w\|_{L^2} \|f\|_{L^2}, \end{equation*}$$

which further by Gronwall’s inequality yields

$$\begin{equation} \| w\|_{L_t^\infty ([0,T]; L_x^2)} \lesssim _{A}e^{\int _0^TB\,dt} (\|w(0)\|_{L_x^2} + \| f \|_{L_t^1 L^2_x}) . \tag{3.16} \end{equation}$$

This by itself does not prove well-posedness in $L^2$; it only proves uniqueness. However, a similar bound holds for the backward adjoint system in the same spaces, because the adjoint system coincides with the direct system modulo $L^2$ bounded terms. Together, these two pieces of information yield $L^2$ well-posedness for the paradifferential system. This is a standard linear duality argument, where the solutions are constructed by a direct application of the Hahn–Banach theorem. In a nutshell, one has the following equivalences (see for instance Reference 11):

$$\begin{equation*} \begin{split} &\text{Energy estimates for the direct forward problem}\\ &\qquad \Longleftrightarrow \text{Existence for the adjoint backward problem},\\[6.0pt] &\text{Energy estimates for the adjoint backward problem}\\ &\qquad \Longleftrightarrow \text{Existence for the direct forward problem}. \end{split} \end{equation*}$$

Exactly the same argument applies in $H^\sigma$, with the small change that now the adjoint system should be considered in $H^{-\sigma }$. There the bound Equation 3.15 is replaced by

$$\begin{equation} \| (\langle D \rangle ^\sigma T_{\mathcal{A}^j(u)}- (T_{\mathcal{A}^j(u)})^*\langle D \rangle ^\sigma ) \partial _j w \|_{L^2} \lesssim \| \nabla \mathcal{A}(u)\|_{L^\infty } \|w\|_{H^\sigma }\lesssim _A B \|w\|_{H^\sigma }. \cssId{texmlid2}{\tag{3.17}} \end{equation}$$ ■

Combining the last two propositions, Proposition 3.5 and Proposition 3.7, we obtain the $H^\sigma$ bound in Theorem 3.

3.4. The linearized equation

Next, we turn our attention to the linearized equation, which we also write in a paradifferential form,

$$\begin{equation} \partial _t v = T_{\mathcal{A}^j(u)} \partial _j v + T_{D\mathcal{A}^j(u) \partial _j u} v + F^{\operatorname {lin}}(u) v, \qquad v(0) = v_0 , \cssId{sym-hyp-lin-para}{\tag{3.18}} \end{equation}$$

where

$$\begin{equation*} \begin{split} F^{\operatorname {lin}}(u) v &\coloneq \Pi (\mathcal{A}^j(u), \partial _j v ) + \Pi (D\mathcal{A}^j(u) \partial _j u, v) + T_{\partial _j v} \mathcal{A}^j(u) + T_v (D\mathcal{A}^j(u) \partial _j u)\\ &\eqcolon F^{\operatorname {lin}}_{\Pi }(u) v + F^{\operatorname {lin}}_{T}(u) v. \end{split} \end{equation*}$$

We note here that equation Equation 3.18 is not exactly a true paralinearization of the linearized equation, as $F^{\operatorname {lin}}_{T}(u) v$ does contain low-high interactions. This difference is observed in the estimates satisfied by the two terms.

On one hand, the term $F^{\operatorname {lin}}_{\Pi }(u) v$ satisfies good bounds in all Sobolev spaces,

$$\begin{equation} \| F^{\operatorname {lin}}_\Pi (u) v \|_{H^\sigma } \lesssim _A B \|v\|_{H^\sigma }, \qquad \sigma \geq 0 , \tag{3.19} \end{equation}$$

so it can be seen as a true perturbative term. This is a simple Coifman–Meyer type estimate, which is left to the reader.

On the other hand, assuming we know that $u \in H^s$, the term $F^{\operatorname {lin}}_{T}(u) v$ can at best be estimated in $H^{s-1}$. There of course we could not use the control norms; instead we would have to use the full $H^s$ norm of $u$. However, we can use the control norms for $L^2$ bounds to directly estimate

$$\begin{equation} \| F^{\operatorname {lin}}_T(u) v \|_{L^2} \lesssim _A B \|v\|_{L^2}. \tag{3.20} \end{equation}$$

Combining the last two estimates with Proposition 3.7 we perturbatively obtain the following:

Proposition 3.8.

Assume that $A \in L^\infty$ and that $B \in L^1$. Then the linearized equation Equation 2.2 is well-posed in $L^2$, with bounds

$$\begin{equation} \frac{d}{dt} \| v\|_{L^2}^2 \lesssim _A B \|v\|_{L^2}^2 . \tag{3.21} \end{equation}$$

We observe the obvious fact that one does not need paradifferential calculus in order to prove this proposition; a simple integration by parts suffices. However, it is instructive to dissect the terms in the equation and understand their respective roles. Also, it is interesting to observe that in appropriate settings, the linearized equation can be thought of as a perturbation of the associated paradifferential equation.

Remark 3.9.

Well-posedness and bounds for the linearized equation can be also obtained in all $H^\sigma$ spaces for $|\sigma | \leq s-1$. However, this can no longer be done in terms of our control parameters; for instance if $\sigma = s-1$, then we need to use the full $H^s$ norm of the solutions. While interesting, this observation will not be needed for the rest of the paper.

3.5. Difference bounds and uniqueness

The easiest way to compare two solutions $u_1$ and $u_2$ for Equation 1.1 is to subtract their respective equations to obtain an equation for $v= u_1-u_2$. In the general case, using the form Equation 2.5 of the equation, we obtain

$$\begin{equation*} v_t = T_{DN(u_1)} v + T_{DN(u_1)-DN(u_2)} u_2 + F(u_1)-F(u_2). \end{equation*}$$

Here we identify this equation as the paradifferential equation associated to $u_1$, but with two source terms, which we would like to interpret as perturbative in a low regularity Sobolev space, say $L^2$. That would yield a bound of the form

$$\begin{equation} \| v(t)\|_{L^2} \lesssim e^{C(A)\int _0^t B(s) ds} \|v(0)\|_{L^2}, \cssId{diff-est}{\tag{3.22}} \end{equation}$$

where $A= A_1+A_2$, $B = B_1+B_2$, with $A_i=\Vert u_i\Vert _{L^\infty }$ and $B_i=\Vert \nabla u_i\Vert _{L^{\infty }}$ for $i=1,2$.

Let us see how this works out in our model problem. We will show the following:

Proposition 3.10.

Let $u_1$ and $u_2$ be two Lipschitz solutions to Equation 1.2 with associated control parameters $A_1,B_1$ (resp., $A_2,B_2$). Then their difference $v = u_1-u_2$ satisfies the bound Equation 3.22.

Proof.

We have already seen in Proposition 3.7 that the paradifferential evolution is well-posed in $L^2$, and in Proposition 3.5 that we have a good Lipschitz bound for $F$. It remains to bound the remaining difference,

$$\begin{equation*} \|T_{DN(u_1)-DN(u_2)} u_2\|_{L^2} \lesssim _A B \|u_1-u_2\|_{L^2}. \end{equation*}$$

For this we write

$$\begin{equation*} \begin{aligned} T_{DN(u_1)-DN(u_2)} u_2 = & \ T_{\mathcal{A}^j(u_1)-\mathcal{A}^j(u_2)} \partial _j u_2 + T_{D\mathcal{A}^j(u_1) \partial _j u_1 - D\mathcal{A}^j(u_2) \partial _j u_2} u_2 \\ = & \ T_{\mathcal{A}^j(u_1)-\mathcal{A}^j(u_2)} \partial _j u_2 + T_{(D\mathcal{A}^j(u_1) - D\mathcal{A}^j(u_2)) \partial _j u_1} u_2 \\ & - T_{\partial _j D\mathcal{A}^j(u_2) (u_1 - u_2)} u_2 + T_{\partial _j(D\mathcal{A}^j(u_2) (u_1 - u_2))} u_2 . \end{aligned} \end{equation*}$$

For the first term we have a Coifman–Meyer type bound

$$\begin{equation*} \|T_{\mathcal{A}(u_1) - \mathcal{A}(u_2)} \nabla u_2\|_{L^2} \lesssim \| u_1-u_2\|_{L^2} \| \nabla u_2 \|_{BMO} \lesssim B \|u_1-u_2\|_{L^2}. \end{equation*}$$

The second term is even easier,

$$\begin{equation*} \begin{split} \| T_{(D\mathcal{A}^j(u_1) - D\mathcal{A}^j(u_2)) \partial _j u_1} u_2\|_{L^2} &\lesssim \| (D\mathcal{A}^j(u_1) - D\mathcal{A}^j(u_2)) \partial _j u_1\|_{L^2} \|u_2\|_{L^\infty }\\ &\lesssim _A B \|u_1-u_2\|_{L^2}, \end{split} \end{equation*}$$

and the third term is similar. Finally, in the fourth term we can use a Coifman–Meyer type bound to rebalance again the derivatives and obtain

$$\begin{equation*} \| T_{\partial _j(D\mathcal{A}^j(u_2) (u_1 - u_2))} u_2 \|_{L^2} \lesssim \| D\mathcal{A}^j(u_2) (u_1 - u_2)\|_{L^2} \| \nabla u_2\|_{BMO}, \end{equation*}$$

concluding as before.

■

Remark 3.11.

The observant reader may have noticed that for our model problem the difference bound can be directly proved using a simple integration by parts, without any need for paradifferential calculus, and may wonder why we are doing it this way. There are three reasons for this: (i) to show that it works, (ii) to show how both the bound for the full equation and the bound for the difference equation can be seen as two sides of the same coin, and (iii) to provide a guide for the reader for situations where a simpler approach does not work.

Remark 3.12.

In the same vein as Remark 3.9, bounds for the difference equation can be also obtained in all $H^\sigma$ spaces for $|\sigma | \leq s-1$.

Remark 3.13.

In our particular example it was easy to cast the difference equation in a form which is very much like the linearized equation. However, this is not always the case. For this reason, we point out that there is another way one can think of difference bounds, namely by viewing the two initial data $u_{01}$ and $u_{02}$ as being connected via a one parameter family of data $u_{0h}$ where $h \in [1,2]$. Then we can interpret the difference $u_2-u_1$ as

$$\begin{equation*} u_2 - u_1 = \int _{1}^2 \frac{d}{dh} u_h\, dh, \end{equation*}$$

where $u_h$ are the solutions with data $u_{0h}$. Here the integrand represents a solution to the linearized equation around $u_h$. Hence difference bounds for $u_2-u_1$ can be obtained by integrating bounds for the linearized equation. The only downside to such an argument is that such bounds will require the control parameters for the entire family of solutions, rather than just the endpoints.
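
As an illustration, here is a formal sketch of this argument. It assumes (our choice, for concreteness) that the data are joined by the straight line $u_{0h} = (2-h) u_{01} + (h-1) u_{02}$, that $u_h$ is differentiable in $h$, and that each linearized flow satisfies an $L^2$ bound of the type in Proposition 3.8 with control parameters $A_h$, $B_h$ associated to $u_h$. Writing $v_h = \frac{d}{dh} u_h$, so that $v_h(0) = u_{02} - u_{01}$, one would get

$$\begin{equation*} \begin{split} \| u_2(t) - u_1(t)\|_{L^2} &\leq \int _1^2 \| v_h(t)\|_{L^2} \, dh \\ &\leq \int _1^2 e^{C(A_h) \int _0^t B_h(s)\, ds} \| v_h(0)\|_{L^2} \, dh \\ &\leq \sup _{h \in [1,2]} e^{C(A_h) \int _0^t B_h(s)\, ds} \, \| u_{02} - u_{01}\|_{L^2}. \end{split} \end{equation*}$$

The supremum over $h$ is exactly where the control parameters of the intermediate solutions enter.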

4. Existence of solutions

Here we consider the question of existence of solutions for the evolution Equation 1.1 with initial data in $H^s$, where $s$ will be taken sufficiently large. The idea here is to construct a good sequence of approximate solutions $u^{n}$, which will eventually be shown to converge in a weaker topology. The tricky bit is to choose the correct iteration scheme.

Naively, one might think of trying to base such a scheme on the linearized flow, setting

$$\begin{equation*} \partial _t (u^{n+1}-u^n) - DN(u^{n})(u^{n+1}-u^n) = - (\partial _t u^n - N(u^n)), \qquad (u^{n+1}-u^n)(0) = 0, \end{equation*}$$

where the expression on the right represents the error at step $n$. Here one can eliminate the time derivative of $u^n$ and rewrite this as

$$\begin{equation*} \partial _t u^{n+1} - DN(u^{n})u^{n+1} = N(u^n) - DN(u^n) u^n, \qquad u^{n+1}(0) = u_0. \end{equation*}$$

This would be akin to a Nash–Moser scheme, which, even when it works, loses derivatives. That may be reasonable in a small divisor situation, but not so much if our goal is to obtain a Hadamard style well-posedness result. Nevertheless, Nash–Moser schemes have been used on occasion to produce solutions for quasi-linear evolutions, though often they prove to be unnecessary.

Remark 4.1.

We observe that for the existence of solutions one does not need to work from the start at low regularity. As we will see, rough solutions can be constructed later on as limits of smooth solutions. This is, strictly speaking, not necessary in our model problem, but for more nonlinear, geometric problems it does seem to make a difference. This is because in such situations it is often easier to compare exact solutions via the linearized equation, which is a geometric object, instead of working with approximate solutions where the geometric character might be lost.

We will present two strategies to prove existence, and at the end we point out several other methods which have been successfully used in existence proofs.

4.1. Take 1: An iterative/fixed point construction

In order not to lose derivatives in the approximation scheme, the idea here is to carefully choose how to distribute $u^{n+1}$ and $u^n$ in the iteration. A key observation is that, whereas solving the linearized equation would cause a loss of derivatives, solving the paradifferential equation does not in general. Then, a good starting point would be the formulation Equation 2.3 of the equations, which would suggest the following iteration scheme:

$$\begin{equation} \partial _t u^{n+1} - T_{DN(u^{n})} u^{n+1} = F(u^n), \qquad u^{n+1}(0) = u_0. \tag{4.1} \end{equation}$$

We will apply this scheme on a time interval $[0,T]$, with $T=T(M)$ sufficiently small depending on the initial data size

$$\begin{equation*} M \coloneq \|u_0\|_{H^s}. \end{equation*}$$

For the above sequence $u^{n}$ the aim would be to inductively prove two uniform bounds in $[0,T]$,

$$\begin{equation} \| u^n \|_{L_t^\infty H_x^s} \leq CM , \tag{4.2} \end{equation}$$

and

$$\begin{equation} \| u^{n+1} - u^{n}\|_{L_t^\infty L_x^2} \leq C(M) T \| u^{n} - u^{n-1}\|_{L_t^\infty L_x^2}, \tag{4.3} \end{equation}$$

where $C$ is a fixed large constant. In the last bound, the time interval size $T$ is used in order to gain smallness for the constant, which is needed in order to obtain convergence. Together, these two bounds imply convergence in $L_t^\infty L_x^2$ to some function $u$, as well as $L_t^\infty H_x^s$ regularity for the limit. This in general suffices in order to show that the limit solves the equation.
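
To make the convergence claim concrete, suppose (as one may arrange by shrinking $T$) that $C(M) T \leq \frac{1}{2}$ in Equation 4.3. A minimal sketch of the argument, using only the triangle inequality and the standard interpolation inequality between $L^2$ and $H^s$, is then

$$\begin{equation*} \begin{split} \| u^{n+1} - u^n\|_{L_t^\infty L_x^2} &\leq 2^{-n} \| u^{1} - u^0\|_{L_t^\infty L_x^2}, \\ \| u^{m} - u^n\|_{L_t^\infty L_x^2} &\leq \sum _{k=n}^{m-1} 2^{-k} \| u^{1} - u^0\|_{L_t^\infty L_x^2} \leq 2^{1-n} \| u^{1} - u^0\|_{L_t^\infty L_x^2}, \qquad m > n, \\ \| u^{m} - u^n\|_{L_t^\infty H_x^\sigma } &\leq \| u^{m} - u^n\|_{L_t^\infty L_x^2}^{1-\frac{\sigma }{s}} \| u^{m} - u^n\|_{L_t^\infty H_x^s}^{\frac{\sigma }{s}} \lesssim 2^{-n(1-\frac{\sigma }{s})} (CM)^{\frac{\sigma }{s}} \| u^{1} - u^0\|_{L_t^\infty L_x^2}^{1-\frac{\sigma }{s}}, \qquad 0 \leq \sigma < s. \end{split} \end{equation*}$$

Thus $u^n$ is a Cauchy sequence in $L_t^\infty H_x^\sigma$ for every $\sigma < s$, while the uniform bound Equation 4.2 is inherited by the limit through weak lower semicontinuity of the $H^s$ norm.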

To obtain uniform bounds for this evolution one would need two pieces of information:

(1)

Well-posedness of the paradifferential equation Equation 2.3 in $L^2$ and more generally in all $H^s$ spaces. Heuristically, the two should be equivalent, as the operator $T_{DN(u^{n})}$ does not change the dyadic frequency localization. In practice though it might not be as easy, as leakage to other frequencies may occur, and in particular even the associated Hamilton flow might not preserve the dyadic localization on a unit time scale.

(2)

Lipschitz property of $F$ in Sobolev spaces. More generally, a bound of the form$$\begin{equation} \| F(u) - F(v) \|_{H^\sigma } \leq C(\| u\|_{H^s},\|v\|_{H^s}) \| u-v \|_{H^\sigma }, \qquad \sigma \geq 0, \cssId{dF}{\tag{4.4}} \end{equation}$$

which should be thought of as a Moser type inequality.

In addition to uniform bounds in a strong norm $H^s$, one would also like to have convergence in a weaker topology, say $L^2$ for the purpose of this presentation. The difference equation reads

$$\begin{equation} (\partial _t - T_{DN(u^{n})}) (u^{n+1}-u^{n}) = F(u^n) - F(u^{n-1}) + (T_{DN(u^{n})}- T_{DN(u^{n-1})}) u^n. \cssId{diff-eqq}{\tag{4.5}} \end{equation}$$

Here energy estimates in $L^2$ would follow from (1) and (2) above, provided that the last difference has a good bound

$$\begin{equation*} \| (T_{DN(u^{n-1})}- T_{DN(u^{n})}) u^n \|_{L^2} \lesssim C(\|u^{n-1}\|_{H^s},\|u^n\|_{H^s}) \| u^{n} - u^{n-1} \|_{L^2}. \end{equation*}$$

This is in general relatively straightforward if $s$ is large enough.

Remark 4.2.

The argument above yields solutions which are a priori only in $L_t^\infty H_x^s$, as opposed to $C(H^s)$ as desired. Getting continuity in $H^\sigma$ for $\sigma < s$ is relatively straightforward by interpolation, but proving continuity in $H^s$ requires considerable extra work (e.g., by showing continuity in time of solutions to the linear paradifferential equation) if one wants a direct argument. The easy way out is to rely on the arguments in the next section, where we show that all $H^s$ solutions can be seen as uniform limits of smooth solutions.

Remark 4.3.

The above iterative argument can be rephrased as a fixed point argument as follows. For $u \in C[0,T;H^s]$ we define $Lu := v$ as the solution to

$$\begin{equation*} \partial _t v - T_{DN(u)} v = F(u), \qquad v(0) = u_0 . \end{equation*}$$

Then the desired solution $u$ has to be a fixed point for $L$. Solutions to this fixed point problem may often be obtained using the contraction principle in the right topology. Precisely, the strategy is to choose the domain of $L$ to be the ball $B(0,CM)$ in $L^\infty [0,T;H^s]$, but endow this ball with a weaker topology, e.g., $C[0,T; L^2]$. Then both the mapping properties of $L$ and the small Lipschitz constant can be achieved by choosing the time $T$ sufficiently small. Here for the domain we have to choose $L^\infty$ rather than continuity in order to guarantee completeness.

We now implement this scheme for our model problem. Denoting $M = \|u_0\|_{H^s}$, we will prove inductively that for fixed large enough $C$ and small enough $T$, we have the bound

$$\begin{equation*} \| u^n \|_{C(0,T;H^s)} \leq CM. \end{equation*}$$

Taking this as an induction hypothesis, we have the following bounds for the control parameters $A^n$ and $B^n$ associated to $u^n$:

$$\begin{equation*} A^n, B^n \lesssim CM. \end{equation*}$$

Then we can estimate $u^{n+1}$ in $H^s$ by combining Propositions 3.7 and 3.5 to obtain

$$\begin{equation*} \frac{d}{dt} \| u^{n+1}\|_{H^s}^2 \lesssim C(M)(1 + \| u^{n+1}\|_{H^s}^2) , \end{equation*}$$

and by Gronwall’s inequality we arrive at

$$\begin{equation*} \| u^{n+1} \|_{C(0,T;H^s)} \lesssim M e^{C(M) T}, \end{equation*}$$

with a universal implicit constant. This completes the induction if we first choose $C$ large enough (to dominate the implicit constant) and then $T$ small enough (depending on $C$ and $M$).

On the other hand, in order to prove the convergence in $L^2$, we use the equation Equation 4.5 for the difference $u^{n+1}-u^n$ and claim that the following $L^2$ estimate holds:

$$\begin{equation} \frac{d}{dt} \| u^{n+1}-u^n\|_{L^2}^2 \lesssim C(M) \| u^{n+1}-u^n\|_{L^2}^2 + C(M) \| u^{n}-u^{n-1}\|_{L^2}^2 . \cssId{l2-diff}{\tag{4.6}} \end{equation}$$

Assuming this is true, by Gronwall’s inequality we obtain

$$\begin{equation*} \| u^{n+1}-u^n\|_{C(0,T;L^2)} \lesssim C(M) T e^{ C(M)T} \| u^{n}-u^{n-1}\|_{C(0,T;L^2)} , \end{equation*}$$

which gives us the small Lipschitz constant if $T$ is sufficiently small, depending only on $M$.
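
One way to see where the factor $C(M) T e^{C(M)T}$ comes from is to apply the linear bound Equation 3.16 directly to the difference equation Equation 4.5. In the sketch below, $g^n$ is our shorthand for the right-hand side of Equation 4.5, we take $n \geq 1$ so that the data vanishes because $u^{n+1}(0) = u^n(0) = u_0$, and we use $B^n \lesssim CM$ together with the $L^2$ Lipschitz bounds for the two source terms:

$$\begin{equation*} \begin{split} \| u^{n+1} - u^n \|_{C(0,T;L^2)} &\lesssim e^{C(M) T} \| g^n \|_{L^1_t L^2_x} \\ &\leq T e^{C(M) T} \| g^n \|_{L^\infty _t L^2_x} \\ &\lesssim C(M) T e^{C(M) T} \| u^n - u^{n-1} \|_{C(0,T;L^2)}. \end{split} \end{equation*}$$

Choosing $T$ so that the last factor is at most $\frac{1}{2}$ then gives a contraction.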

It remains to prove Equation 4.6. For the paradifferential equation we can use Proposition 3.7 and for the $F$ difference we can use Proposition 3.5, so it remains to examine the last term in Equation 4.5, and show that

$$\begin{equation*} \|(T_{DN(u^{n-1})}- T_{DN(u^{n})}) u^n\|_{L^2} \lesssim C(M) \|u^{n-1}-u^n\|_{L^2}. \end{equation*}$$

In the case of the model problem the difference on the left reads

$$\begin{equation*} T_{\mathcal{A}^j(u^{n-1}) - \mathcal{A}^j(u^n)} \partial _j u^n + T_{D\mathcal{A}^j(u^{n-1}) \partial _j u^{n-1}- D\mathcal{A}^j(u^{n}) \partial _j u^{n}} u^n. \end{equation*}$$

For the first term we have the obvious bound

$$\begin{equation*} \begin{split} \| T_{{\mathcal{A}}^j(u^{n-1}) - \mathcal{A}^j(u^n)} \partial _j u^n\|_{L^2} &\lesssim \| \mathcal{A}^j(u^{n-1}) - \mathcal{A}^j(u^n)\|_{L^2} \|\partial _j u^n\|_{L^\infty }\\ &\lesssim C(M)\| u^{n-1}- u^n\|_{L^2}. \end{split} \end{equation*}$$

The second term is split into three parts,

$$\begin{equation*} T_{(D\mathcal{A}^j(u^{n-1})- D\mathcal{A}^j(u^{n})) \partial _j u^{n}} u^n - T_{\partial _j D\mathcal{A}^j(u^{n-1})( u^{n-1}- u^{n})} u^n + T_{\partial _j [D\mathcal{A}^j(u^{n-1})( u^{n-1}- u^{n})]} u^n, \end{equation*}$$

where the first two parts are easy to estimate. A similar bound follows for the third term after we move the derivative onto the high frequency factor, using an estimate of the form

$$\begin{equation*} \|T_{\partial f} g\|_{L^2} \lesssim \|f\|_{L^2} \|\partial g\|_{BMO}, \end{equation*}$$

which is a corollary of the second bound in Equation 3.6.

4.2. Take 2: A time discretization method

Here the idea is to discretize time at a small scale $\epsilon$, and to construct approximate discrete solutions $u^\epsilon (j\epsilon )$ with the following properties:

(i): Uniform bounds:$$\begin{equation} \| u^\epsilon (j\epsilon ) \|_{H^s} \leq CM, \qquad j \ll _M \epsilon ^{-1}. \cssId{onestep-ee}{\tag{4.7}} \end{equation}$$
(ii): Approximate solution:$$\begin{equation} \| u^\epsilon ((j+1)\epsilon ) - u^\epsilon (j\epsilon ) - \epsilon N(u^\epsilon (j\epsilon )) \|_{L^2} \lesssim \epsilon ^2. \cssId{onestep-approx}{\tag{4.8}} \end{equation}$$

Once this is done, if $s$ is large enough (for instance, in our model case $s > n/2+1$ suffices), then it is a relatively straightforward matter to show that a uniform limit $u$ exists on a subsequence as $\epsilon \to 0$ by applying the Arzelà–Ascoli theorem; here one may extend $u^\epsilon$ to all times by linear interpolation. This works in a time interval $[0,T]$ with $T \ll _M 1$. By passing to the limit in the above bounds in a weak topology, it follows that the limit $u$ solves the equation and has regularity

$$\begin{equation*} u \in L^\infty (0,T;H^s) \cap \text{Lip}(0,T;L^2). \end{equation*}$$

The nice feature of this method is that one really only needs to carry out one single step. Precisely, given $u_0 \in H^s$ with size $M$, and $0 < \epsilon \ll 1$, one needs to find $u_1$ (which corresponds to $u^\epsilon (\epsilon )$ above) with the following properties:

(i)$^\prime$: Uniform bounds:$$\begin{equation} \| u_1 \|_{H^s} \leq (1+ C(M) \epsilon ) \|u_0\|_{H^s}. \cssId{single-ee}{\tag{4.9}} \end{equation}$$
(ii)$^\prime$: Approximate solution:$$\begin{equation} \| u_1 - u_0 - \epsilon N(u_0) \|_{L^2} \lesssim \epsilon ^2. \cssId{single-eqn}{\tag{4.10}} \end{equation}$$

Reiterating this, the bound Equation 4.7 follows by applying a discrete form of Gronwall’s inequality.
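
For concreteness, here is a sketch of that discrete Gronwall step. We assume, as an induction hypothesis of our own choosing, that the iterates before step $j$ have $H^s$ size at most $2M$, so that Equation 4.9 applies at each step with the same constant $C(2M)$; the factor $2$ plays the role of the constant $C$ in Equation 4.7. Then, using $1 + x \leq e^x$,

$$\begin{equation*} \| u^\epsilon (j\epsilon ) \|_{H^s} \leq (1+ C(2M) \epsilon )^j \|u_0\|_{H^s} \leq e^{C(2M) j \epsilon } M \leq 2M \qquad \text{provided } j\epsilon \leq \frac{\ln 2}{C(2M)}, \end{equation*}$$

which closes the induction and is consistent with the restriction $j \ll _M \epsilon ^{-1}$.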

Remark 4.4.

The $\epsilon ^2$ bound in (ii)$^\prime$ can be harmlessly replaced by $\epsilon ^{1+\delta }$ with a small constant $\delta > 0$.

Remark 4.5.

Sometimes the squared $H^s$ norm of $u$ is not the correct quantity to propagate in time, and one needs to replace it with appropriate equivalent energies $E^s$ in property (i)$^\prime$.

Remark 4.6.

The choice of the $L^2$ norm in (ii)$^\prime$ above was made in order to keep the exposition simple. However, sometimes a different topology may be required by the problem; see, e.g., Reference 29, Reference 1.

The remaining question is how to construct the single iterate satisfying properties (i)$^\prime$ and (ii)$^\prime$ above. The obvious choice would be Euler’s method, which is to set

$$\begin{equation*} u_1 = u_0 + \epsilon N(u_0), \end{equation*}$$

but this does not work because it loses derivatives.

Inspired by the nonlinear semigroup theory Reference 4, one may choose instead to solve

$$\begin{equation*} u_1 - \epsilon N(u_1) = u_0. \end{equation*}$$

This idea has potential at least when this is an elliptic equation. Alternatively, one may opt for a paradifferential version

$$\begin{equation*} u_1 - \epsilon T_{DN(u_0)} u_1 = u_0 + \epsilon F(u_0), \end{equation*}$$

which has the advantage that one only needs to solve a linear elliptic equation. However, ellipticity is not guaranteed.

Instead, here we will adopt a two-step approach, which has the advantage that no partial differential equation needs to be solved. Precisely, our steps are as follows:

Step 1 (Regularization).

Here we take the initial data $u_0$, and we regularize it on an $\epsilon$ dependent scale. Precisely, if $k$ is the order of the nonlinearity $N$, then it is natural to choose the spatial truncation frequency scale to be $\epsilon ^{-\frac{1}{2k}}$, which corresponds to an order $2k$ parabolic regularization; this regularization scale is needed in order to be able to bound the error in the Euler step. Then our regularization $\tilde{u}$ would have the following properties:

(a): Regularization:$$\begin{equation} \| \tilde{u} \|_{H^{s+k}} \lesssim \epsilon ^{-\frac{1}{2}} \|u_0\|_{H^s}. \cssId{tu-high}{\tag{4.11}} \end{equation}$$
(b): Energy bound:$$\begin{equation} E^s(\tilde{u}) \leq (1+ C(M) \epsilon ) E^s(u_0). \cssId{tu-ee}{\tag{4.12}} \end{equation}$$
(c): Approximate solution:$$\begin{equation} \| \tilde{u} - u_0 \|_{L^2} \lesssim \epsilon ^2. \cssId{tu-err}{\tag{4.13}} \end{equation}$$

Step 2 (Euler iteration).

Here we simply set

$$\begin{equation} u_1 = \tilde{u} + \epsilon N(\tilde{u}) , \cssId{u1-def}{\tag{4.14}} \end{equation}$$

so that the approximate solution bound Equation 4.10 becomes relatively straightforward, and the energy bound Equation 4.9 becomes akin to proving the energy estimate; see the example below.

We now implement the above strategy on our chosen model problem. Here our chosen energy is simply the Sobolev norm,

$$\begin{equation*} E^N(u) = \| u\|_{H^N}^2. \end{equation*}$$

Our equation has order $k=1$, so the proper regularization scale is $\delta x = \epsilon ^\frac{1}{2}$. Hence, we use a Littlewood–Paley projector to simply define

$$\begin{equation*} \tilde{u} = P_{<\epsilon ^{-\frac{1}{2}}} u, \end{equation*}$$

and the three properties (a), (b), and (c) above are trivially satisfied.
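
For the reader who wants to see the three properties checked, here is a sketch. We assume that $P_{<\lambda }$ is a Fourier multiplier whose symbol takes values in $[0,1]$, equals $1$ for $|\xi | \leq \lambda /2$, and is supported in $|\xi | \leq \lambda$ (a normalization chosen here for convenience); we write $u$ for the data $u_0 \in H^N$, and we set $\lambda = \epsilon ^{-\frac{1}{2}}$, $k=1$:

$$\begin{equation*} \begin{split} \text{(a)} \quad & \| \tilde{u} \|_{H^{N+1}} \lesssim \lambda \| u \|_{H^N} = \epsilon ^{-\frac{1}{2}} \|u\|_{H^N}, \\ \text{(b)} \quad & \| \tilde{u} \|_{H^{N}}^2 \leq \| u \|_{H^N}^2, \\ \text{(c)} \quad & \| \tilde{u} - u \|_{L^2} \lesssim \lambda ^{-N} \| u \|_{H^N} = \epsilon ^{\frac{N}{2}} \|u\|_{H^N}. \end{split} \end{equation*}$$

With this reading, (c) matches the $\epsilon ^2$ rate in Equation 4.13 as soon as $N \geq 4$, and in view of Remark 4.4 any $N > 2$ would still give an acceptable rate $\epsilon ^{1+\delta }$.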

Next we turn our attention to the Euler iteration Equation 4.14 for which we need to establish the properties (i)$^\prime$ and (ii)$^\prime$. We begin with (i)$^\prime$, where it suffices to compare the energies of $u_1$ and $\tilde{u}$. For $|\alpha | \leq N$ we have

$$\begin{equation*} \partial ^\alpha u_1 = \partial ^\alpha \tilde{u} + \epsilon \partial ^\alpha (\mathcal{A}^j(\tilde{u}) \partial _j \tilde{u}). \end{equation*}$$

If $|\alpha | < N$, then in the second term on the right we have at most $N$ derivatives, so this term has size $O(\epsilon )$ in the $L^2$ norm

$$\begin{equation*} \| \partial ^\alpha (\mathcal{A}^j(\tilde{u}) \partial _j \tilde{u}) \|_{L^2} \lesssim _A \| \tilde{u}\|_{H^N}, \end{equation*}$$

and we can neglect it.

It remains to consider $|\alpha |=N$. Then we can separate the terms with no more than $N$ derivatives and estimate them as above, using appropriate interpolation inequalities,

$$\begin{equation*} \partial ^\alpha (\mathcal{A}^j(\tilde{u}) \partial _j \tilde{u} ) = \mathcal{A}^j(\tilde{u}) \partial ^\alpha \partial _j \tilde{u} + O_{L^2}(B \| \tilde{u}\|_{H^N}). \end{equation*}$$

Hence we have

$$\begin{equation*} \partial ^\alpha u_1 = \partial ^\alpha \tilde{u} + \epsilon \mathcal{A}^j(\tilde{u}) \partial ^\alpha \partial _j \tilde{u} + O_{L^2}(\epsilon ), \end{equation*}$$

and, neglecting $O(\epsilon )$ terms, we compute $L^2$ norms,

$$\begin{equation*} \|\partial ^\alpha u_1\|_{L^2}^2 = \|\partial ^\alpha \tilde{u}\|_{L^2}^2 + 2\epsilon \int \partial ^\alpha \tilde{u} \cdot \mathcal{A}^j(\tilde{u}) \partial ^\alpha \partial _j \tilde{u} \, dx + \epsilon ^2 \| \mathcal{A}^j(\tilde{u}) \partial ^\alpha \partial _j \tilde{u} \|^2_{L^2} . \end{equation*}$$

The last $L^2$ norm has size $O(\epsilon )$ in view of property (a) above. On the other hand, in the integral we use the symmetry of $\mathcal{A}$ to integrate by parts,

$$\begin{equation*} 2\int \partial ^\alpha \tilde{u} \cdot \mathcal{A}^j(\tilde{u}) \partial ^\alpha \partial _j \tilde{u} \, dx = - \int \partial ^\alpha \tilde{u} \cdot \partial _j \mathcal{A}^j(\tilde{u}) \partial ^\alpha \tilde{u} \, dx , \end{equation*}$$

which can again be estimated by $\lesssim _A B \| \tilde{u}\|_{H^N}^2$. Thus we obtain

$$\begin{equation*} \| u_1 \|_{H^N}^2 \lesssim _A (1+ \epsilon B) \| \tilde{u}\|_{H^N}^2, \end{equation*}$$

as desired, as $B$ can be estimated by the Sobolev norm of $u_0$ by Sobolev embeddings.

It remains to consider (ii)$^\prime$, where, by (c) above, it suffices to show that

$$\begin{equation*} \| \mathcal{A}^j(u) \partial _j u - \mathcal{A}^j(\tilde{u}) \partial _j \tilde{u} \|_{L^2} \lesssim _M \epsilon . \end{equation*}$$

This is a soft argument, where we simply write

$$\begin{equation*} \| \mathcal{A}^j(u) \partial _j u - \mathcal{A}^j(\tilde{u}) \partial _j \tilde{u} \|_{L^2} \lesssim _M \| \mathcal{A}(u) - \mathcal{A}(\tilde{u})\|_{L^2} + \| \partial _j u - \partial _j \tilde{u} \|_{L^2} \lesssim _M \| u - \tilde{u}\|_{H^1}, \end{equation*}$$

where the $H^1$ norm on the right is bounded by interpolating (c) above with the uniform $H^N$ bound provided by (b). This requires $N \geq 2$.
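
Spelled out, the last two steps can be sketched as follows, again writing $u$ for $u_0$, taking the bound $\| u - \tilde{u}\|_{H^N} \lesssim M$ from (b) for granted, and assuming $\epsilon < 1$:

$$\begin{equation*} \begin{split} \| u - \tilde{u}\|_{H^1} &\leq \| u - \tilde{u}\|_{L^2}^{1-\frac{1}{N}} \| u - \tilde{u}\|_{H^N}^{\frac{1}{N}} \lesssim _M \epsilon ^{2 - \frac{2}{N}} \leq \epsilon \qquad \text{if } N \geq 2, \\ \| u_1 - u - \epsilon \mathcal{A}^j(u) \partial _j u \|_{L^2} &\leq \| \tilde{u} - u\|_{L^2} + \epsilon \| \mathcal{A}^j(\tilde{u}) \partial _j \tilde{u} - \mathcal{A}^j(u) \partial _j u \|_{L^2} \lesssim _M \epsilon ^2 + \epsilon \cdot \epsilon . \end{split} \end{equation*}$$

The second line uses the definition Equation 4.14 of $u_1$, and it is the bound Equation 4.10.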

4.3. Other strategies

Most of the other strategies to prove existence of solutions are based on constructing approximate flows, and solutions are obtained as limits of solutions to the approximate flows. There are two such methods which are more widely used.

(a) Parabolic regularization. Here one uses a parabolic regularization of the original flow Equation 1.1, defining the approximate solutions $u^\epsilon$ by

$$\begin{equation*} u^\epsilon _t = N(u^\epsilon ) - \epsilon (-\Delta )^k u^\epsilon , \qquad u^\epsilon (0) = u_0, \end{equation*}$$

where the correct choice for the parabolic term seems to be to double the order of the original equation. These problems can often be solved for a short $\epsilon$-dependent time, as semilinear problems, with a direct fixed-point argument. However, in doing this, the main challenge is to prove uniform in $\epsilon$ bounds for these approximate flows. This sometimes requires more careful choices of the regularization term, to make it fit better with the geometry of the problem.

(b) Galerkin approximation. Here the idea is to work with a low frequency projector in the equation, e.g., of the type

$$\begin{equation*} u_t = P_{<h} N(P_{<h} u) \end{equation*}$$

with $h \to \infty$; see, e.g., the example in Reference 30. Local solvability for this evolution is straightforward, since it is an ordinary differential equation in a Hilbert space, but the challenge is again to prove uniform in $h$ bounds for these approximate flows. The double use of the projector above is a choice that usually facilitates achieving this objective. Depending on the problem, this may require careful choices for the frequency projectors, adapted to the problem.
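
Two short computations may help to illustrate the remarks in (a) and (b); both are sketches under stated assumptions rather than complete arguments. For (a), assuming that $N$ has order $k$, so that $(-\Delta )^k$ has order $2k$, the regularized problem can be written in the Duhamel form

$$\begin{equation*} u^\epsilon (t) = e^{-\epsilon t (-\Delta )^k} u_0 + \int _0^t e^{-\epsilon (t-\tau ) (-\Delta )^k} N(u^\epsilon (\tau )) \, d\tau , \end{equation*}$$

and the parabolic semigroup recovers the $k$ derivatives lost in $N$ at an integrable cost in time,

$$\begin{equation*} \| e^{-\epsilon t (-\Delta )^k} f \|_{H^{\sigma +k}} \lesssim \left(1 + (\epsilon t)^{-\frac{1}{2}}\right) \| f\|_{H^\sigma }, \qquad \int _0^T (\epsilon \tau )^{-\frac{1}{2}} \, d\tau = 2 \epsilon ^{-\frac{1}{2}} T^{\frac{1}{2}} . \end{equation*}$$

This is consistent with a fixed-point argument on a time interval of $\epsilon$-dependent length. For (b), assume that $P_{<h}$ is a self-adjoint Fourier multiplier on $L^2$, and write $\langle \cdot , \cdot \rangle$ for the $L^2$ inner product. Then a solution to the projected flow satisfies

$$\begin{equation*} \frac{d}{dt} \| u\|_{L^2}^2 = 2 \langle u, P_{<h} N(P_{<h} u) \rangle = 2 \langle P_{<h} u, N(P_{<h} u) \rangle , \end{equation*}$$

so the right-hand side has the same form as in the energy identity for the original flow, evaluated at $P_{<h} u$. With a single projector, i.e., for $u_t = P_{<h} N(u)$, one would instead obtain $2 \langle P_{<h} u, N(u) \rangle$, where the two arguments no longer match.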

5. Rough solutions as limits of smooth solutions

Here we explore the idea of constructing rough solutions as limits of smooth solutions. There are at least two good reasons to do this, which we discuss in order:

(1): In quasi-linear problems one does not expect any sort of uniformly continuous dependence of solutions on the initial data, so the continuity of the flow map becomes a purely qualitative assertion. However, one can still ask for a quantitative way of comparing solutions, and such a quantitative avenue is found by using the regular approximations as a convenient proxy. This is discussed in the last section.
(2): It is also often the case that more regular solutions are easier to produce, and in such situations, obtaining the rough solutions as limits of smooth solutions might be the only option. This is particularly the case in problems where the state space is not a linear space, such as Schrödinger maps Reference 20, Yang–Mills, or other problems with a nontrivial gauge structure. See also Reference 15 for an implementation of this idea in a free boundary problem. This is because in such problems it is always easier to obtain estimates for the linearized equations, or at least to compare exact solutions, rather than to cook up a constructive scheme which is consistent with the geometry.

To make this analysis quantitative, it is very useful to track the flow of energy between different frequencies. Whereas energy cascades (energy migration to higher frequencies) have long been associated with blow-up phenomena, well-posedness should correspond to a lack thereof. To quantify this, we will use Tao’s notion of frequency envelopes.

5.1. Frequency envelopes

Frequency envelopes, introduced by Tao (see for example Reference 27), are a very useful device in order to track the evolution of the energy of solutions between dyadic energy shells. As there is always nearby leakage between the dyadic shells in nonlinear flows, one needs to do this in a more stable way, rather than look directly at the exact amount of energy in every shell.

This is realized via the following definition:

Definition 5.1.

We say that $\{c_k\}_{k\geq 0} \in \ell ^2$ is a frequency envelope for a function $u$ in $H^s$ if we have the following two properties:

(a) Energy bound:

$$\begin{equation} \|P_k u\|_{H^s} \leq c_k. \tag{5.1} \end{equation}$$

(b) Slowly varying:

$$\begin{equation} \frac{c_k}{c_j} \lesssim 2^{\delta |j-k|} , \quad j,k\in \mathbb{N}. \cssId{fe-delta}{\tag{5.2}} \end{equation}$$

Here the $P_k$ denote the standard Littlewood–Paley projectors, and $\delta$ is a positive constant, which is taken small enough in order to account for the energy leakage between nearby frequencies.

One can also try to limit from above the size of a frequency envelope, for instance by requiring that

$$\begin{equation*} \| u\|_{H^s}^2 \approx \sum c_k^2. \end{equation*}$$

We call such envelopes sharp. Such frequency envelopes always exist; for instance, one can take

$$\begin{equation*} c_k = \sup _j 2^{-\delta |j-k|} \| P_j u\|_{H^s}. \end{equation*}$$

For a better understanding see Figure 1, where the actual dyadic norms, indicated by bullets on a logarithmic scale, are lifted (based on the above formula) to a slowly varying frequency envelope, indicated by the circles.
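
The properties of this construction can be checked in a few lines. The following is a sketch, in which the shorthand $a_j \coloneq \| P_j u\|_{H^s}$ is introduced only for illustration, and we set $c_k \coloneq \sup _j 2^{-\delta |j-k|} a_j$. Taking $j = k$ in the supremum gives $c_k \geq a_k$, which is the energy bound (a). Next, the triangle inequality $|j-k'| \leq |j-k| + |k-k'|$ yields

$$\begin{equation*} c_k = \sup _j 2^{-\delta |j-k|} a_j \leq 2^{\delta |k-k'|} \sup _j 2^{-\delta |j-k'|} a_j = 2^{\delta |k-k'|} c_{k'}, \end{equation*}$$

which is the slowly varying property (b). Finally, bounding the supremum by an $\ell ^2$ sum, we have

$$\begin{equation*} \sum _k a_k^2 \leq \sum _k c_k^2 \leq \sum _k \sum _j 2^{-2\delta |j-k|} a_j^2 \leq \frac{1+2^{-2\delta }}{1-2^{-2\delta }} \sum _j a_j^2 , \end{equation*}$$

and since $\sum _j a_j^2 \approx \| u\|_{H^s}^2$, the envelope is sharp, with an implicit constant which depends on $\delta$.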

We will use frequency envelopes in order to track the evolution of energy in time as follows: we start with a sharp frequency envelope for the initial data, and then seek to show that we can propagate this frequency envelope to the solutions to our quasi-linear flow, at least for a short time.

Remark 5.2.

One alternative here is to unbalance the choice of $\delta$ in Equation 5.2, asking for a small $\delta$ if $k < j$, but replacing $\delta$ with a large constant for $k > j$. This heuristically corresponds to a better control of leakage to higher frequencies, and it is useful in order to deal with higher regularity properties also within the frequency envelope setup.

5.2. Regularized data

Consider initial data $u_0 \in H^s$ with size $M$, and let $\left\{c_k\right\}_{k\geq 0}$ be a sharp frequency envelope for $u_0$ in $H^s$. For $u_0$ we consider a family of regularizations $u_0^h \in H^\infty \coloneq \bigcap _{s=0}^{\infty }H^s$ at frequencies $\lesssim 2^h$ where $h$ is a dyadic frequency parameter. This parameter can be taken either discrete or continuous, depending on whether we have access to difference bounds or only to the linearized equation. Suppose we work with differences. Then the family $u^h_0$ can be taken to have similar properties to Littlewood–Paley truncations as follows.

(i): Uniform bounds:$$\begin{equation} \| P_k u^h_0 \|_{H^s} \lesssim c_k. \tag{5.3} \end{equation}$$
(ii): High frequency bounds:$$\begin{equation} \| u^h_0 \|_{H^{s+j}} \lesssim 2^{jh} c_h , \qquad j > 0. \tag{5.4} \end{equation}$$
(iii): Difference bounds:$$\begin{equation} \| u^{h+1}_0- u^h_0 \|_{L^2} \lesssim 2^{-sh} c_h . \tag{5.5} \end{equation}$$
(iv): Limit as $h \to \infty$:$$\begin{equation} u_0 = \lim _{h\to \infty } u_0^h \qquad \text{ in } H^s. \tag{5.6} \end{equation}$$

Correspondingly, we obtain a family of smooth solutions $u^h$.

Here, in the simplest setting where the phase space is linear, one may simply choose $u^h_0 = P_{< h} u_0$, which would have all the above properties. However, in geometric settings where the phase space is nonlinear, a more complex regularization method may be needed—for instance, using a corresponding geometric heat flow, see Reference 28, or a variable scale regularization, as in Reference 15.
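
For the linear choice $u_0^h = P_{<h} u_0$, the two quantitative properties can be verified directly. The following is a sketch, assuming the normalization $P_{<h+1} - P_{<h} = P_h$, which is a convention adopted here for illustration. The difference bound reduces to Bernstein's inequality,

$$\begin{equation*} \| u_0^{h+1} - u_0^h \|_{L^2} = \| P_h u_0 \|_{L^2} \approx 2^{-sh} \| P_h u_0\|_{H^s} \leq 2^{-sh} c_h . \end{equation*}$$

For the high frequency bound we use the slowly varying property in the form $c_k \lesssim 2^{\delta (h-k)} c_h$ for $k < h$, which gives

$$\begin{equation*} \| P_{<h} u_0 \|_{H^{s+j}}^2 \approx \sum _{k < h} 2^{2jk} \| P_k u_0\|_{H^s}^2 \lesssim \sum _{k<h} 2^{2jk} 2^{2\delta (h-k)} c_h^2 \lesssim 2^{2jh} c_h^2, \qquad j > \delta . \end{equation*}$$

The geometric sum converges precisely because $j > \delta$, which is one place where the smallness of $\delta$ is used.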

5.3. Uniform bounds

Corresponding to the above family of regularized data, we obtain a family of smooth solutions $u^h$. For this we can use the energy estimates as in Theorem 3 to propagate Sobolev regularity for solutions as well as difference bounds as in Proposition 3.10. This yields a time interval $[0,T]$ where all these solutions exist, and whose size $T$ depends only on $M = \|u_0\|_{H^s}$, where we have the following properties:

(i): High frequency bounds:$$\begin{equation} \| u^h \|_{C(0,T;H^{s+j})} \lesssim 2^{jh} c_h , \qquad j > 0. \cssId{hf-bd}{\tag{5.7}} \end{equation}$$
(ii): Difference bounds:$$\begin{equation} \| u^{h+1}- u^h\|_{C(0,T;L^2)} \lesssim 2^{-sh} c_h . \cssId{diff-bd}{\tag{5.8}} \end{equation}$$

From Equation 5.7 one may obtain a similar bound for the difference $u^{h+1}-u^h$. Interpolating this with Equation 5.8, we also have

$$\begin{equation} \| u^{h+1}- u^h\|_{C(0,T;H^m)} \lesssim 2^{-(s-m)h} c_h, \qquad m \geq 0. \cssId{interp-bd}{\tag{5.9}} \end{equation}$$
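
The interpolation step can be spelled out as follows. This is a sketch, in which $w^h \coloneq u^{h+1} - u^h$ and the exponent $\theta$ are notations introduced here, and all norms are taken uniformly in $t \in [0,T]$. By the high frequency bounds for $u^h$ and $u^{h+1}$, together with the slowly varying property $c_{h+1} \approx c_h$, we have $\| w^h\|_{H^{s+j}} \lesssim 2^{jh} c_h$. Then, for $0 \leq m \leq s+j$ and $\theta = \frac{m}{s+j}$,

$$\begin{equation*} \| w^h \|_{H^m} \leq \| w^h\|_{L^2}^{1-\theta } \| w^h\|_{H^{s+j}}^{\theta } \lesssim \left(2^{-sh} c_h\right)^{1-\theta } \left(2^{jh} c_h\right)^{\theta } = 2^{(-s + \theta (s+j))h} c_h = 2^{-(s-m)h} c_h . \end{equation*}$$

Since $j > 0$ is arbitrary, this covers every $m \geq 0$, with an implicit constant which depends on $m$.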

One may use these bounds to establish uniform frequency envelope bounds for $u^h$,

$$\begin{equation} \| P_k u^h \|_{C(0,T;H^{s})} \lesssim c_k 2^{-N(k-h)_+}, \cssId{fe-bd}{\tag{5.10}} \end{equation}$$

on the same time interval which depends only on the initial data $H^s$ size. This is a direct consequence of Equation 5.7 for $k \geq h$, while if $k < h,$ we can use the telescopic expansion

$$\begin{equation*} u^h = u^k + \sum _{l = k}^{h-1}\left( u^{l+1} - u^l\right), \end{equation*}$$

and use Equation 5.7 for the first term and Equation 5.8 for the differences.
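
In more detail, here is a sketch of both cases, where the implicit constants are allowed to depend on $j$ and $\delta$. For $k \geq h$, Bernstein's inequality and the high frequency bound give

$$\begin{equation*} \| P_k u^h \|_{H^s} \lesssim 2^{-jk} \| u^h\|_{H^{s+j}} \lesssim 2^{-j(k-h)} c_h \lesssim 2^{-(j-\delta )(k-h)} c_k , \end{equation*}$$

where the slowly varying property was used in the last step; this corresponds to $N = j - \delta$. For $k < h$, the same computation with $h$ replaced by $k$ bounds $P_k u^k$ by $c_k$, while for the differences with $l \geq k$ we have

$$\begin{equation*} \| P_k ( u^{l+1} - u^l ) \|_{H^s} \lesssim 2^{sk} \| u^{l+1} - u^l \|_{L^2} \lesssim 2^{-s(l-k)} c_l \lesssim 2^{-(s-\delta )(l-k)} c_k , \end{equation*}$$

which is summable over $l \geq k$ provided that $\delta < s$.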

5.4. The limiting solution

Consider now the convergence of $u^h$ as $h \to \infty$. From the difference bounds Equation 5.8 we obtain convergence in $L^2$ to a limit $u \in C(0,T; L^2)$, with

$$\begin{equation*} \|u - u^h \|_{C(0,T;L^2)} \lesssim 2^{-sh}. \end{equation*}$$

On the other hand, expanding the difference as a telescopic sum, we get

$$\begin{equation*} u - u^h = \sum _{m = h}^\infty \left( u^{m+1} - u^m \right), \end{equation*}$$

where, in view of the above bounds Equation 5.7 and Equation 5.8, each summand is essentially concentrated at frequency $2^m$, with $H^s$ size $c_m$ and exponentially decreasing tails. This leads to

$$\begin{equation} \| u - u^h \|_{C(0,T;H^s)} \lesssim c_{\geq h}\coloneq \left(\sum _{m \geq h} c_m^2\right)^\frac{1}{2}, \cssId{rough-limit}{\tag{5.11}} \end{equation}$$

so we also have convergence in $C(0,T;H^s)$.
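
One way to carry out this summation is sketched below. Here $w^m \coloneq u^{m+1} - u^m$ is a notation introduced for illustration, the decay rate is a convenient choice rather than an optimal one, and we assume $s \geq 1$. Applying the interpolated difference bound at the regularity indices $s+1$ and $s-1$ gives $\| w^m\|_{H^{s \pm 1}} \lesssim 2^{\pm m} c_m$ uniformly in $t \in [0,T]$, and therefore, by Bernstein's inequality,

$$\begin{equation*} \| P_k w^m \|_{H^s} \lesssim 2^{-|k-m|} c_m . \end{equation*}$$

Then the almost orthogonality of the projectors $P_k$ and Young's inequality for the discrete convolution $\ell ^1 \ast \ell ^2 \to \ell ^2$ yield

$$\begin{equation*} \| u - u^h \|_{H^s}^2 \approx \sum _k \Big \| \sum _{m \geq h} P_k w^m \Big \|_{H^s}^2 \lesssim \sum _k \Big ( \sum _{m \geq h} 2^{-|k-m|} c_m \Big )^2 \lesssim \sum _{m \geq h} c_m^2 . \end{equation*}$$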

This type of argument plays multiple roles:

(1): It produces rough solutions as limits of smooth solutions, justifying the earlier assertion that it often suffices to carry out the initial construction of solutions only in a smooth setting.
(2): It establishes the continuity of solutions as $H^s$ valued flows, which is sometimes missing from the constructive proof of existence.
(3): It provides the quantitative bound Equation 5.11 for the difference between the rough and the smooth solutions, which plays a key role in the continuous dependence proof in the next section.

6. Continuous dependence

Here we use frequency envelopes in order to prove continuous dependence of the solution $u \in C(0,T;H^s)$ as a function of the initial data $u_0 \in H^s$, and also to discuss some historical alternatives.

6.1. The continuous dependence proof

Consider a sequence of initial data

$$\begin{equation*} u_{0j} \to u_0 \qquad \text{in} \ H^s, \quad s > \frac{d}{2}+1, \end{equation*}$$

and the corresponding solutions $u_j$, $u$ which exist with a uniform lifespan $[0,T]$, where $T$ depends only on the initial data size $\|u_0\|_{H^s}$. We will prove that $u_j \to u$ in $C(0,T;H^s)$. Once we have this property, it automatically extends to any larger time interval $[0,T_1]$, where the solution $u$ is defined and satisfies $u \in C(0,T_1;H^s)$. This should be understood in the sense that for all large enough $j$, the solutions $u_j$ are defined in $[0,T_1]$, with similar regularity, and the convergence holds as $j \to \infty$.

The difference bounds in Proposition 3.10 guarantee that $u_j \to u$ in $C(0,T;L^2)$. Since $u_j$ are uniformly bounded in $C(0,T;H^s)$, this also implies convergence in $C(0,T;H^\sigma )$ for every $0 \leq \sigma < s$, but not for $\sigma = s$.

It remains to consider the convergence in the strong topology, i.e., in $H^s$. Rather than trying to compare the solutions $u_j$ and $u$ directly, we will use as a proxy the approximate solutions $u_j^h$ (resp., $u^h$). For these, we will take advantage of the fact that their initial data converge in all Sobolev norms,

$$\begin{equation*} u_{0j}^h \to u_0^h \qquad \text{in} \ H^\sigma , \quad 0 \leq \sigma < \infty . \end{equation*}$$

Hence, according to the preceding discussion, we have convergence of the regular solutions in all Sobolev norms,

$$\begin{equation*} u_{j}^h \to u^h \qquad \text{in} \ C(0,T;H^\sigma ), \quad 0 \leq \sigma < \infty . \end{equation*}$$

To compare the solutions $u$ and $u_j$ themselves, we use the triangle inequality,

$$\begin{equation} \| u_j - u \|_{C(0,T;H^s)} \lesssim \| u_j^h - u^h \|_{C(0,T;H^s)} +\| u^h - u \|_{C(0,T;H^s)} +\| u_j^h - u_j \|_{C(0,T;H^s)}. \cssId{diff-bd-j}{\tag{6.1}} \end{equation}$$

The first term goes to zero as $j \to \infty$ for fixed $h$, while the second goes to zero as $h \to \infty$, but does not depend on $j$. It is the third term which is the problem, and for which we need to gain some smallness uniformly in $j$.

However, in the previous section we have learned to estimate such differences using frequency envelopes. Precisely, let $\big \{c_k\big \}_{k\geq 0}$ (resp., $\big \{ c_k^j\big \}_{k\geq 0}$) be frequency envelopes for the initial data $u_0$ (resp., $u_{0j}$) in $H^s$. Then, as we saw in the previous section, we can estimate the last two terms above in terms of frequency envelopes and obtain

$$\begin{equation} \| u_j - u \|_{C(0,T;H^s)} \lesssim \| u_j^h - u^h \|_{C(0,T;H^s)} + c_{\geq h} + c^j_{\geq h}. \tag{6.2} \end{equation}$$

The important observation is that the convergence $u_{0j} \to u_0$ in $H^s$ allows us to choose the frequency envelopes $c$ (resp., $c^j$) so that

$$\begin{equation*} c^j \to c \qquad \text{in } \ell ^2. \end{equation*}$$

This implies that

$$\begin{equation*} \lim _{j \to \infty } c^j_{\geq h} = c_{\geq h}. \end{equation*}$$

Hence, passing to the limit $j \to \infty$ in the relation Equation 6.2, we obtain

$$\begin{equation} \limsup _{j \to \infty } \| u_j - u \|_{C(0,T;H^s)} \lesssim c_{\geq h}, \tag{6.3} \end{equation}$$

and finally letting $h \to \infty$, we obtain

$$\begin{equation*} \lim _{j \to \infty } \| u_j - u \|_{C(0,T;H^s)} = 0, \end{equation*}$$

as desired.
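
The choice of convergent envelopes used above can be made concrete. The following is a sketch, assuming that the envelopes are given by the explicit formulas $c_k = \sup _l 2^{-\delta |l-k|} \| P_l u_0\|_{H^s}$ and $c^j_k = \sup _l 2^{-\delta |l-k|} \| P_l u_{0j}\|_{H^s}$. Since the difference of two suprema is bounded by the supremum of the differences, the reverse triangle inequality gives

$$\begin{equation*} | c^j_k - c_k | \leq \sup _l 2^{-\delta |l-k|} \| P_l (u_{0j} - u_0) \|_{H^s} . \end{equation*}$$

The right-hand side is a sharp frequency envelope for $u_{0j} - u_0$ in $H^s$, so that

$$\begin{equation*} \| c^j - c \|_{\ell ^2} \lesssim _\delta \| u_{0j} - u_0 \|_{H^s} \to 0 , \qquad | c^j_{\geq h} - c_{\geq h} | \leq \| c^j - c\|_{\ell ^2} , \end{equation*}$$

where the second bound shows that the convergence of the tails is uniform in $h$.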

6.2. Comparison with Kato, and Bona and Smith

The more classical approach for continuous dependence goes back to Kato Reference 16 as well as a variation due to Bona and Smith Reference 5. We will briefly describe this approach using our notations and setup. We caution the reader that the original arguments in these papers are not self-contained and are instead mixed with the other parts of well-posedness proofs, so it is not exactly easy to correlate the papers with the description below. In effect our discussion below is more closely based on the interpretations of Kato’s work provided by Chemin Reference 3 and, even closer, by Tao Reference 26.

This also relies on the use of some sort of approximate solutions $u^h$. However, in this approach one aims to directly estimate the difference $u^h - u$ in $H^s$ in terms of the corresponding initial data. One might at first hope to directly track the difference $\| u^h - u\|_{C(0,T;H^s)}$, but this cannot work without knowledge that the low frequencies of the difference (i.e., below $2^h$) are better controlled. So the better object to track turns out to be a norm of the form

$$\begin{equation} \| u^h - u\|_{H^s} + 2^{kh} \| u^h - u\|_{H^{s-k}}, \cssId{control-norm-reg}{\tag{6.4}} \end{equation}$$

where we recall that $k$ is the order of our nonlinearity. Here the second part can be estimated directly for any two $H^s$ solutions (see Remark 3.12), so one can think of this as decoupled into a two-step process. To better understand why this works, it is useful to write the equation for the difference $w= u^h-u$ in a paradifferential form,

$$\begin{equation} \partial _t w + T_{DN(u)} w = [F(u) - F(u^h)] + T_{DN(u)- DN(u^h)} u^h, \cssId{diff-eqn}{\tag{6.5}} \end{equation}$$

which should essentially be thought of as a perturbation of the linear paradifferential flow, which can be estimated in all Sobolev spaces. The $F$ difference is tame because $F$ admits Lipschitz bounds in all Sobolev spaces, so the issue is the last term.

There is seemingly a loss of $k$ derivatives here, but these derivatives are applied to $u^h$, which has higher regularity bounds, so they yield losses of at most a $2^{kh}$ factor. But this factor can be absorbed by the lower frequency paradifferential coefficients given by $DN(u)- DN(u^h)$, in view of the $2^{kh}$ factor in Equation 6.4. Here it is important that we write the equation using $T_{DN(u)}$ rather than $T_{DN(u^h)}$ on the left, which allows us to use $u^h$ as the argument in the last term on the right.
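
For orientation, here is how this absorption looks in a model situation. This is a sketch under assumptions which are not taken from the discussion above: we take $k = 1$, we suppose that the paradifferential term has the form $T_{\mathcal{A}^j(u)} \partial _j$ with $\mathcal{A}^j$ smooth, and we assume $s > \frac{d}{2}+1$. Using the paraproduct bound $\| T_a f\|_{H^s} \lesssim \| a\|_{L^\infty } \| f\|_{H^s}$, the Sobolev embedding $H^{s-1} \subset L^\infty$, and the high frequency bound $\| u^h\|_{H^{s+1}} \lesssim 2^h c_h$, we obtain

$$\begin{equation*} \| T_{\mathcal{A}^j(u) - \mathcal{A}^j(u^h)} \partial _j u^h \|_{H^s} \lesssim \| \mathcal{A}(u) - \mathcal{A}(u^h) \|_{L^\infty } \| u^h\|_{H^{s+1}} \lesssim _M c_h \, 2^{h} \| u - u^h \|_{H^{s-1}} . \end{equation*}$$

The last factor is exactly the second term in the norm Equation 6.4 with $k = 1$, so the apparent derivative loss is compensated by the weight $2^{h}$.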

In Kato’s argument the same principle is used to get $H^s$ bounds not only for the difference $u^h - u$ but also for $u^h-v$ for an arbitrary solution $v$. On the other hand, in the Bona–Smith version one estimates only $u^h - u$, but the proof is more roundabout in that $u^h$ is not only assumed to have regularized data but also to solve a regularized equation, thus combining the existence and the continuous dependence arguments.

In our opinion, working with frequency envelopes has definite advantages:

•: It provides more accurate information on the solutions.
•: It does not require any direct difference bounds in the strong $H^s$ topology.
•: By working with a continuous, rather than a discrete family of regularizations, one can fully replace difference estimates by bounds for the linearized equation, which is to be preferred in many cases, in particular in geometric contexts where the state space is an infinite-dimensional manifold.

Acknowledgments

Both authors are extremely grateful to MSRI for their full support in holding the graduate summer school “Introduction to water waves” in a virtual format due to the less than ideal circumstances.

About the authors

Mihaela Ifrim is Luce Associate Professor in the department of mathematics at University of Wisconsin, Madison. She works on a broad class of nonlinear evolutions arising in fluid dynamics and in the study of other nonlinear wave phenomena.

Daniel Tataru is professor in the department of mathematics at University of California, Berkeley. His research interests are in nonlinear partial differential equations, primarily in nonlinear waves and other dispersive and fluid flows.
