# Some recent progress in singular stochastic partial differential equations

## Abstract

Stochastic partial differential equations are ubiquitous in mathematical modeling. Yet, many such equations are too singular to admit classical treatment. In this article we review some recent progress in defining, approximating, and studying the properties of a few examples of such equations. We focus mainly on the dynamical $\Phi^4$ equation, the KPZ equation, and the parabolic Anderson model, as well as a few other equations which arise mainly in physics.

## 1. Introduction

Partial differential equations (PDEs) and randomness are ubiquitous constructions used to model both mathematical and physical phenomena. For instance, PDEs have been used for centuries to describe the building block laws of physics, and to model aggregate macroscopic phenomena, such as heat conduction, diffusion, electro-magnetic dynamics, interface and fluid dynamics. Randomness has become a default paradigm for modeling systems with uncertainty or with many complicated or chaotic microscopic interactions.

Combining these two approaches leads to the study of stochastic PDEs (SPDEs) in which the coefficients or forcing terms in PDEs are described via certain random processes. While SPDEs have become increasingly important in applications, there remain many fundamental mathematical challenges in their study—in particular, showing how they arise from microscopic particle based models remains a major source of research problems and has seen some radical progress in the past decade.

The purpose of this article is to introduce a few important classes of SPDEs and to describe how they arise and the mathematical challenges that go along with demonstrating that. Though this article will mainly focus on nonlinear systems, we will start our investigation in Section 2 in the simpler and more classical setting of linear SPDEs which are very well understood. In Section 3 we turn our attention to nonlinear SPDEs and introduce our two main examples (the dynamical $\Phi^4$ equation and the KPZ (Kardar–Parisi–Zhang) equation) along with a host of other important SPDEs which arise in physics. Our discussion in this section is heuristic and ignores some of the serious mathematical challenges which arise when one tries to make sense of what it means to “solve” an SPDE. This challenge is addressed in Section 4. In the course of making sense of SPDEs, there are often *renormalizations* which arise (effectively changing the equation). Section 5 describes how these renormalizations have physical meaning and arise in certain discrete approximation schemes for the continuum equations. Finally, Section 6 seeks to demonstrate how these SPDEs (in particular, the KPZ equation) arise as universal limits from microscopic systems.

Before proceeding to our main text, one disclaimer. Our aim is to make this material approachable to nonexperts. As such, we will not state precise theorems or give proofs but rather will attempt to provide some intuition behind results and the challenges which accompany proving them. An interested reader can find much more detail and precision in the works cited or can consult other survey articles such as References GP18b, Gub18, CW17, Hai15a, Hai14a.

## 2. A first (linear) SPDE

We will start our discussion on linear SPDEs with the *stochastic heat equation*, which is driven by a random additive noise term $\xi$:

$$\partial_t u(t,x) = \partial_x^2 u(t,x) + \xi(t,x),$$

where $\xi$ is the so-called *space-time white noise*. It will take a bit of work to define this noise and make sense of what it means to solve this equation. However, before going down that route, we will first address the question of what sort of physical system this models. In particular, we will explain heuristically how this equation arises from a simple microscopic model of polymers in liquid.

Consider modeling a polymer chain (e.g., composed of DNA or proteins) in a liquid. A simple model involves describing the polymer by a string of beads that are linked together sequentially by springs and that are subject to *kicking* by noise, as shown in Figure 2.1, where $u_i(t)$ denotes the position in $\mathbb{R}^3$ of the $i$th bead at time $t$.

Imagine that each bead of the polymer is kicked by the surrounding liquid molecules. In our simplified model,Footnote^{1} As usual, one always has to make various simplifying assumptions in order to describe a complicated physical system via a mathematically analyzable model. It is natural to ask whether having random kicking leads to a reasonable microscopic model. After all, the liquid itself is governed by certain physical laws of motion for its particles. Such concerns arose early in the development of Brownian motion as the model for a single tracer particle moving in a liquid; see Reference Bru68 for a nice historical review. We do not provide further justification for this as a reasonable microscopic model here.^{✖} we describe such a system via the following equations of motion for the position $u_i(t)$ of the $i$th bead:

$$\frac{d}{dt}\,u_i(t) = u_{i+1}(t) - 2u_i(t) + u_{i-1}(t) + \xi_i(t), \qquad (2.2)$$

where $\xi_i(t)$ is the random kicking at time $t$, and where a boundary condition is given by fixing the endpoint beads (say, $u_0$ and $u_N$) for all time. Equation 2.2 means the following.

- •
The linear drift terms $u_{i+1}(t) - u_i(t)$ and $u_{i-1}(t) - u_i(t)$ arise from assuming a linear spring force between the $i$th bead and its neighboring (in the sense of label number) beads. Without the kicking term $\xi_i(t)$, Equation 2.2 would simply be a coupled system of ordinary differential equations.

- •
The term $\xi_i(t)$ represents the *random kicking* that is experienced by the $i$th bead at time $t$. We make the simplifying assumption that the kicking is “overdamped”,Footnote^{2}Essentially, this means that the kicks occur instantaneously in time and do not result in any inertia. This effectively decouples the various kicks.^{✖} and we model the kicks in terms of random jumps in the location of the bead. Namely, for each particle $i$ there is a random sequence of kicking times $t_{i,1} < t_{i,2} < \cdots$.Footnote^{3}It is natural to assume the gaps between times are chosen according to independent exponential random variables of mean 1. In this case, the times are distributed as a *Poisson point process* of intensity 1.^{✖} At the kicking time $t_{i,j}$ we update $u_i(t_{i,j}) = u_i(t_{i,j}^-) + Y_{i,j}$, where $Y_{i,j}$ is an $\mathbb{R}^3$-valued random variable. We assume that the $Y_{i,j}$ are statistically isotropic (i.e., their distribution is invariant under rotation) and are all independent and identically distributed (i.i.d.). Note that the resulting process is piecewise continuous, with jumps occurring at the kicking times.

The question with which we are concerned is what happens to the polymer when its length $N$ grows, and possibly space and time are scaled accordingly. By default one might expect that as $N$ increases, so does the complexity of studying this system. However, it turns out that there is a very tractable continuum limit for the evolution of our polymer model. That is to say, in the scaling limit, things simplify! In fact, this limit is quite robust and (up to some scaling constants) is not affected by various changes in the microscopic model, such as how we model the kicking (e.g., different distributions on the $Y_{i,j}$ or on the kicking times). This robustness can, itself, be seen as evidence that the microscopic model may be reasonable.

With the aim of demonstrating a continuum limit of our model, think of $N$ as large and define

$$u^N(t,x) := N^{-1/2}\, u_i(N^2 t), \qquad (2.3)$$

where $x$ encodes the label $i$ via $i = [Nx]$ (the closest integer to $Nx$). The linear drift term in Equation 2.2 is in fact a discrete Laplacian, so under our diffusive scaling (that is, scaling $t$ by $N^2$ and $i$ by $N$) it approximates a continuum Laplacian $\partial_x^2$. Thus, for large $N$, one can expect that the following relation approximately holds:

$$\partial_t u^N(t,x) \approx \partial_x^2 u^N(t,x) + \text{(rescaled kicking)}. \qquad (2.4)$$

In the scaled coordinates, the kicking starts to add up. Namely, in a time-space region of size $\Delta t \times \Delta x$ there are roughly $N^3 \Delta t\, \Delta x$ kicks (there are $N\Delta x$ beads, each kicked about once per unit time, over a time span of $N^2 \Delta t$). By the central limit theorem, the sum of $n$ i.i.d. random variables divided by $\sqrt{n}$ converges to a Gaussian random variable. Thus, on each time-space region, the kicking adds up to a Gaussian random variable with variance equal to the area of the region. Different regions have covariance given by the area of their overlap. This limit is called *space-time* (or *time-space*, given our ordering of variables) *white noise* and is denoted by $\xi$. This heuristic leads us to the following typeFootnote^{4} This type of convergence result was first proved by Funaki Reference Fun83 in the slightly different setting where the $u_i$ are driven by Brownian motions. In the present setting, we do not know if a precise result of this sort has been proved (though we have no doubt that it can be). We are suppressing coefficients which may (depending on the nature of the discrete noise) arise in the limiting equation.^{✖} of limit as $N \to \infty$,

$$\partial_t u(t,x) = \partial_x^2 u(t,x) + \xi(t,x), \qquad (2.5)$$

where $u = \lim_{N \to \infty} u^N$ and where $\xi$ is space-time white noise.
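This diffusive-scaling heuristic is easy to experiment with numerically. The sketch below (our own discretization; the rate-1 kicking and standard Gaussian jumps are the simplifying assumptions named above) evolves the bead-spring chain of Equation 2.2 with a forward Euler step:

```python
import numpy as np

def simulate_polymer(N=50, T=5.0, dt=1e-3, seed=0):
    """Forward Euler scheme for du_i/dt = u_{i+1} - 2u_i + u_{i-1} + kicks,
    with pinned endpoint beads and isotropic standard Gaussian jumps at
    (approximately) rate-1 Poisson kicking times for each interior bead."""
    rng = np.random.default_rng(seed)
    u = np.zeros((N + 1, 3))                        # beads 0..N, positions in R^3
    for _ in range(int(T / dt)):
        lap = np.zeros_like(u)
        lap[1:-1] = u[2:] - 2 * u[1:-1] + u[:-2]    # discrete Laplacian (spring drift)
        u += dt * lap
        kicked = rng.random(N - 1) < dt             # each interior bead kicked w.p. ~ dt
        u[1:-1][kicked] += rng.standard_normal((int(kicked.sum()), 3))
        u[0] = u[-1] = 0.0                          # fixed boundary beads
    return u
```

Rescaling as in Equation 2.3 (amplitude by $N^{-1/2}$, time by $N^2$, labels by $N$) and letting $N$ grow is then a direct, if slow, way to watch the limit Equation 2.5 emerge.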

Equation 2.5 is our first example of an SPDE—it is called the linear stochastic heat equation with additive noise. Even in this simple linear example we encounter an equation which requires some work to make sense of because of the noise. For convenience of our exposition, from this point forward, we will think of $x$ as a *spatial* variable (although in our example it actually stands for the parametrization of the limiting polymer length); and although $u$ and $\xi$ are $\mathbb{R}^3$-valued, the three components are completely independent (decoupled), so it will be convenient to simply consider Equation 2.5 as $\mathbb{R}$-valued instead of $\mathbb{R}^3$-valued in the rest of this paper.

Let us look at Equation 2.5 more closely, with the spatial dimension $d$ now being arbitrary (recall $d=1$ corresponds with the above polymer example),

$$\partial_t u(t,x) = \Delta u(t,x) + \xi(t,x), \qquad (2.6)$$

with, for instance, periodic boundary condition.

Consider the case $d = 0$, where Equation 2.6 becomes the stochastic (ordinary) differential equation $\frac{du}{dt} = \xi(t)$. The white noise $\xi(t)$ is defined to be the derivative of Brownian motion $B(t)$, so that $u(t) = u(0) + B(t)$ is a Brownian motion. Of course, Brownian motion is famously almost nowhere differentiable, so $\xi$ is not defined as a function. Rather, $\xi$ can be defined as a random *distribution* in a suitable *negative* regularity space (such as a negative Sobolev space). Since the Hölder regularity of Brownian motion is $\frac12^-$ (meaning any exponent below $\frac12$), its derivative $\xi$ is said to have regularity $-\frac12^-$. There are other ways to define $\xi$. For instance, if we restrict to a periodic interval, then $\xi(t) = \sum_k c_k e_k(t)$, where the $c_k$ are i.i.d. Gaussian random variables, and the $e_k$ constitute an orthonormal basis of $L^2$. Alternatively, one can define $\xi$ via the machinery of Gaussian processes (see, for instance Reference Jan97) wherein it suffices to specify its mean and covariance. By definition, $\xi$ is mean zero, and since $\xi$ is a *distribution* as mentioned above, its covariance

$$\mathbb{E}\big[\xi(t)\,\xi(s)\big] = \delta(t-s) \qquad (2.7)$$

(here $\mathbb{E}$ represents the expectation operator and $\delta$ is a Dirac delta function) must be interpreted in a distributional sense as well. For a smooth test function $\varphi$, one defines the stochastic integral $\xi(\varphi) := \int \xi(t)\,\varphi(t)\,dt$. Then $\xi$ is defined by the property that $\xi(\varphi)$ is a mean zero Gaussian random variable for all test functions $\varphi$, and for test functions $\varphi$ and $\bar\varphi$,

$$\mathbb{E}\big[\xi(\varphi)\,\xi(\bar\varphi)\big] = \int \varphi(t)\,\bar\varphi(t)\,dt.$$

The random distribution $\xi$ is defined from this information using Kolmogorov’s continuity theorem.
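The Fourier-basis description can be turned into a quick numerical check (a sketch with our own truncation and names): pairing a truncated noise $\xi = \sum_k c_k e_k$ with a test function $\varphi$ should produce a mean zero Gaussian with variance $\|\varphi\|_{L^2}^2$.

```python
import numpy as np

def noise_pairing_samples(phi, n_modes=64, n_samples=4000, seed=1):
    """Samples of xi(phi) for white noise on the periodic interval [0,1),
    truncated to xi = sum_k c_k e_k with i.i.d. standard Gaussian c_k and
    an orthonormal Fourier basis e_k."""
    rng = np.random.default_rng(seed)
    x = np.linspace(0.0, 1.0, 4096, endpoint=False)
    dx = x[1] - x[0]
    basis = [np.ones_like(x)]                       # e_0 = 1
    for k in range(1, n_modes):
        basis.append(np.sqrt(2) * np.cos(2 * np.pi * k * x))
        basis.append(np.sqrt(2) * np.sin(2 * np.pi * k * x))
    basis = np.array(basis)
    coeffs = basis @ phi(x) * dx                    # inner products <phi, e_k>
    c = rng.standard_normal((n_samples, len(basis)))
    return c @ coeffs                               # xi(phi) = sum_k c_k <phi, e_k>

phi = lambda x: 0.5 + np.sin(2 * np.pi * x)
samples = noise_pairing_samples(phi)
# expect mean ~ 0 and variance ~ ||phi||^2 = 1/4 + 1/2 = 3/4
```

By Parseval, the variance of the truncated pairing is $\sum_k \langle\varphi, e_k\rangle^2$, which converges to $\|\varphi\|_{L^2}^2$ as the truncation is removed.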

For general dimension $d$, space-time white noise can be defined via analogous methods. As a Gaussian process, $\xi$ is a random distribution with covariance

$$\mathbb{E}\big[\xi(t,x)\,\xi(s,y)\big] = \delta(t-s)\,\delta(x-y), \qquad (2.8)$$

where the last $\delta$ is the Dirac delta function on $d$-dimensional space. Its action $\xi(\varphi)$ on space-time test functions $\varphi(t,x)$ is mean zero Gaussian and has covariance given as in Equation 2.7, where the integral is now over space-time.

As the dimension $d$ increases, the regularity of $\xi$ *decreases*. We will work with spaces of space-time distributions (for $\alpha < 0$) or functions (for $\alpha > 0$) denoted by $\mathcal{C}^\alpha$. These are essentially equivalent to the Besov spaces $B^\alpha_{\infty,\infty}$ in harmonic analysis, and their precise definitions can be given via wavelets as in Reference Hai14b, Eq. (3.2). The smaller $\alpha$ is, the less regular (or more *singular*) the functions or distributions. A well-known result is that the space-time white noise $\xi \in \mathcal{C}^{-\frac{d+2}{2}-\kappa}$ for any $\kappa > 0$.Footnote^{5} As we will primarily work with parabolic equations, these spaces have a built-in *parabolic scaling* between time and space wherein time regularity counts double. For instance, a function in $\mathcal{C}^2$ has second continuous spatial derivatives and first continuous time derivative. Extending the situation discussed earlier for $d = 0$ (where $\xi$ has regularity $-\frac12^-$), we have that space-time white noise has regularity $-\frac{d+2}{2}^-$ for general $d$. The case $d = 0$ has $-\frac{2}{2} = -1$ which, given the doubling of time regularity, corresponds to the $-\frac12^-$ regularity discussed above.^{✖}

Having made sense of $\xi$, it remains to understand what it means to *solve* equation Equation 2.6. For a linear equation such as Equation 2.6, the meaning of solution is not hard to define; essentially one only needs to give a suitable meaning to the inverted linear differential operator $(\partial_t - \Delta)^{-1}$ acting on the noise $\xi$: given an initial data $u(0,\cdot)$, the solution to Equation 2.6 is defined by

$$u(t,\cdot) = e^{t\Delta} u(0,\cdot) + \big((\partial_t - \Delta)^{-1} \xi\big)(t,\cdot).$$

Here $e^{t\Delta}$ is the heat semigroup, so that $e^{t\Delta}u(0,\cdot)$ solves the classical (deterministic) heat equation starting from $u(0,\cdot)$, with heat kernel $P_t(x)$. The expression $(\partial_t - \Delta)^{-1}$ also acts as an integral operator on $\xi$ via

$$\big((\partial_t - \Delta)^{-1} \xi\big)(t,x) = \int_0^t \int P_{t-s}(x-y)\, \xi(s,y)\, dy\, ds,$$

which is the space-time convolution of the heat kernel with $\xi$. Just like $\xi$ itself, $(\partial_t - \Delta)^{-1}\xi$ is also a *well-defined* random distribution.
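Concretely, this mild solution can be approximated on a periodic grid; the discretization below is entirely our own (a standard explicit scheme, with white-noise increments of standard deviation $\sqrt{dt/dx}$ per grid cell):

```python
import numpy as np

def she_additive(n=128, T=0.1, dt=1e-5, seed=2):
    """Explicit finite-difference scheme for the d = 1 stochastic heat
    equation u_t = u_xx + xi on the unit torus with zero initial data.
    The noise increment over a cell of area dt*dx, divided by dx, is
    Gaussian with standard deviation sqrt(dt/dx)."""
    rng = np.random.default_rng(seed)
    dx = 1.0 / n
    u = np.zeros(n)
    for _ in range(int(T / dt)):
        lap = (np.roll(u, -1) - 2.0 * u + np.roll(u, 1)) / dx**2
        u = u + dt * lap + np.sqrt(dt / dx) * rng.standard_normal(n)
    return u
```

The stability constraint $dt \lesssim dx^2 / 2$ is what forces the small time step; a single run returns a rough spatial profile consistent with the Hölder-$\frac12^-$ regularity discussed in (P2) below.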

To get a bit more flavor of “solution theories” of stochastic PDEs, we list some well-known properties for Equation 2.6.

- (P1) The solution, in the above sense, is obviously *unique*, since the difference of two solutions would solve a deterministic heat equation with zero initial condition, which must be zero.

- (P2) With the aforementioned regularity of $\xi$, by standard parabolic PDE theory, in particular the Schauder estimate which states that the operator $(\partial_t - \Delta)^{-1}$ increases regularity by $2$, one has $u \in \mathcal{C}^{\frac{2-d}{2}-\kappa}$ for any $\kappa > 0$. In particular, $u$ is (almost surely) a random continuous function in $d = 1$ and a random distribution in $d \ge 2$. So the limiting polymer parametrized by $x$ in the above example is a random continuous curve.

- (P3) The random distribution $u$ has *Gaussian probability law*. This is because $\xi$ is Gaussian and any *linear* combination of Gaussian random variables is still Gaussian.

- (P4) Equation 2.6 has an *invariant measure* called the Gaussian free field. This is a Gaussian random field with covariance given by the Green’s function of the Laplacian $\Delta$. Being invariant means that if the initial data $u(0,\cdot)$ is random and distributed as the Gaussian free field, then $u(t,\cdot)$ has the same law as the Gaussian free field for all $t \ge 0$. On the other hand, starting from arbitrary initial data, the law of $u(t,\cdot)$ will approach that of the Gaussian free field as $t \to \infty$. We refer to Reference She07 for a nice review of the Gaussian free field.

- (P5) Equation 2.6 is scaling invariant in any dimension $d$; namely,

$$u^\lambda(t,x) := \lambda^{\frac{d-2}{2}}\, u(\lambda^2 t, \lambda x) \quad \text{solves Equation 2.6 with } \xi \text{ replaced by } \xi^\lambda(t,x) := \lambda^{\frac{d+2}{2}}\, \xi(\lambda^2 t, \lambda x),$$

where $\xi^\lambda$ is equal in law to $\xi$. The last scaling relation of the white noise can be seen from its covariance Equation 2.8, recalling that the Dirac $\delta$ on $d$-dimensional space has scaling dimension $-d$. Note that the scaling taken in Equation 2.3 and Equation 2.4 was precisely the one that leaves the limit equation invariant.

So far a reader who is new to the area of SPDEs should have acquired the following message: the solution theories of SPDEs share some of the same fundamental challenges as in the study of classical PDEs. These include showing that solutions exist (or can be defined) both locally and globally, are unique within certain regularity classes, and arise as scaling limits for various approximation schemes. The rest of this article will focus on recent progress on these challenges for nonlinear SPDEs. Before doing so, let us briefly remark on another important challenge present both for PDEs and SPDEs—explicitly representing solutions via formulas.

Linear PDEs always admit explicit solutions. Linear SPDEs, as we saw above, have solutions which are random Gaussian processes with explicit mean and covariance (computable from the equation explicitly). Most nonlinear PDEs *do not* admit explicit solutions; those that do are generally related to the area of *integrable systems*. Likewise, most nonlinear SPDEs do not admit explicit descriptions for the probability distribution of their solutions. There are, however, a few special SPDEs (such as the KPZ equation discussed below) which can be explicitly solved in this sense. The study of such SPDEs falls under the area of *integrable probability* or *exactly solvable systems*; see for instance References Cor14, BP15 and references therein. We will not pursue this direction further in this article.

## 3. Nonlinear SPDEs

Linear systems are often insufficient to effectively model many interesting phenomena. Indeed, as we will now see, nonlinear SPDEs arise in a number of important areas of physics (and many other directions that we will not discuss here). Such nonlinear systems, however, are generally much more challenging to work with. Before coming to that, let us start with a few examples.

Consider a piece of magnet that is being heated up, as in Figure 3.1. As the temperature increases, the magnetic field produced by the magnet weakens, and at a critical temperature $T_c$ known as the *Curie temperature*,Footnote^{6} The Curie temperature is named after Pierre Curie who first experimentally demonstrated that certain magnets lost their magnetism entirely at a critical temperature.^{✖} the magnetic field disappears. Though various magnet materials have different microscopic structures, a common physical explanation for magnetism is that it comes from the alignment of the magnetic moments of many of the atoms in the material. As a simplified mathematical model one can imagine that a magnet is made up of millions of tiny arrows (or spins) with directions oscillating over time. Below the Curie temperature (i.e., $T < T_c$), the spins tend to align in order to minimize an interaction energy (which energetically prefers alignment), which causes a macroscopic magnetization (shown in the bottom-right picture); above the Curie temperature (i.e., $T > T_c$), the spin configurations are much more disordered due to strong thermal fluctuation,Footnote^{7} In statistical mechanics, thermal fluctuations are random deviations of a system from its low energy state. All thermal fluctuations become larger and more frequent as the temperature increases, and likewise they decrease as temperature approaches absolute zero. Thermal fluctuations are a basic manifestation of the temperature of systems.^{✖} and as a result the magnetic fields cancel out (shown in the bottom-left picture).

A general mantra in statistical physics holds that “interesting scaling limits arise at critical points”. In particular, here we would like to understand what happens to the spin system when the temperature approaches $T_c$ while time and space are tuned accordingly. Near criticality, the spins start to oscillate more and more drastically and the small scale disorder starts to propagate to larger and larger scales. The resulting magnetic field fluctuations are believed to be described by the nonlinear SPDE,

$$\partial_t \Phi = \Delta \Phi - \Phi^3 + \xi, \qquad (3.1)$$

when $T \to T_c$ and the spatial dimension of $x$ is $2$ or $3$. This is called the *dynamical $\Phi^4$ equation* since the deterministic part arises from the gradient of an energy $\mathcal{H}(\Phi) = \int \big(\tfrac12 |\nabla \Phi|^2 + \tfrac14 \Phi^4\big)\, dx$. We will return to this equation later in Section 5.1 and describe how it arises from a particular model of magnets.
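The statement that the deterministic part is (minus) an energy gradient can be checked in a discretized toy version; the grid Hamiltonian and all names below are our own:

```python
import numpy as np

def hamiltonian(phi, dx):
    """Discretization of H(phi) = int (|phi'|^2/2 + phi^4/4) dx
    on a periodic grid."""
    grad = (np.roll(phi, -1) - phi) / dx
    return float(np.sum(0.5 * grad**2 + 0.25 * phi**4) * dx)

def variational_derivative(phi, dx, h=1e-6):
    """delta H / delta phi, computed site by site with centered finite
    differences; the trailing 1/dx converts the partial derivative in
    phi_i into a functional-derivative density."""
    out = np.empty_like(phi)
    for i in range(len(phi)):
        e = np.zeros_like(phi)
        e[i] = h
        out[i] = (hamiltonian(phi + e, dx) - hamiltonian(phi - e, dx)) / (2 * h * dx)
    return out

n = 32
dx = 1.0 / n
phi = np.sin(2 * np.pi * dx * np.arange(n))
lap = (np.roll(phi, -1) - 2 * phi + np.roll(phi, 1)) / dx**2
# the gradient-flow drift -delta H / delta phi should equal lap - phi^3
```

Numerically, $-\delta\mathcal{H}/\delta\Phi$ matches the discrete $\Delta\Phi - \Phi^3$, which is the deterministic part of the dynamical $\Phi^4$ equation above.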

As another example, we consider a model for interface growth, where each point of the interface randomly grows up or drops down over time, with a trend to locally smooth out the interface (like the spring force in the polymer example in Section 2). Such systems are ubiquitously found in nature—for instance, the left image in Figure 3.2 shows the end result of an interface grown in the ocean from a volcanic eruption.

We are interested in modeling the evolution of such interfaces.

To drastically simplify the situation, let us assume our interface is a one-dimensional function $h(t,x)$ for $x \in \mathbb{R}$. The simplest scenario is that the upward growth and downward drop of $h$ occur *equally* likely, though randomly. In this case, it turns out that the interface behaves similarly to the one-dimensional version of the polymer in Section 2, whose beads are kicked by an *isotropic* random force (see the right image in Figure 3.2). Thus, the large scale fluctuations should be given by the linear SPDE Equation 2.5.

In the asymmetric scenario where the interface is more likely to grow up than to drop down, one expects to see nontrivial fluctuations described by Equation 2.5 perturbed by a nonlinearity. In particular, the asymmetry should not be too strong, or it will overwhelm the local smoothing (the $\partial_x^2 h$ term) and the randomness (the $\xi$ term); and it should not be too weak, or it will not change the limiting equation. This critical tuning is called *weakly asymmetric* and results (under the same sort of scaling as in the symmetric case) in the following SPDE description for the fluctuations of $h$:

$$\partial_t h = \partial_x^2 h + (\partial_x h)^2 + \xi. \qquad (3.2)$$

Due to the asymmetry, the interface establishes an overall height shift. So, for the above limit, we must recenter $h$ into this moving frame. The KPZ equation Equation 3.2 was first proposed by Kardar, Parisi, and Zhang in Reference KPZ86; see the nice review Reference KS91b for more background. We will return to this equation later in Section 6 and describe why it arises from various models of interface growth.

### 3.1. Some other important nonlinear SPDEs

Besides the dynamical $\Phi^4$ and KPZ equations discussed above, there are a number of physically important equations—some of which we briefly review now. The reader is warned that it is a formidable challenge to define the meaning of a solution to nonlinear SPDEs driven by very singular noises. We postpone this important issue until Section 3.2.

- •
**Stochastic Navier–Stokes equation** (with spatial dimensions $d = 2, 3$ of particular physical interest),

$$\partial_t v + (v \cdot \nabla) v = \Delta v - \nabla p + \xi, \qquad \nabla \cdot v = 0,$$

where $p$ is the pressure and $\xi$ is an $\mathbb{R}^d$-vector valued noise. When the noise $\xi$ is taken to be singular (for instance, each component of $\xi$ is an independent space-time white noise), it models motion of fluid with randomness arising from microscopic scales, and in this case we refer to Reference DPD02 Reference ZZ15 for well-posedness results.

We remark that while this article focuses on singular noises, when modeling *large scale* random stirring of the fluid, the noise is often assumed to be smooth (it is called *colored noise* in contrast with white noise), and in fact the most important case is that the equation is driven by only a small number of random Fourier modes. In these situations the *long-time behavior* is of primary interest, and various dynamical system questions such as ergodicity and mixing are studied. There is a vast literature on this topic, and we only refer to the book Reference KS12 and the survey articles Reference Mat03 Reference Fla08 Reference Kup10.
- •
**Stochastic heat equation with multiplicative noise** in one spatial dimension,

$$\partial_t u = \partial_x^2 u + \sigma(u)\,\xi,$$

where $\sigma$ is some continuous function. The Itô solution theory successful for stochastic ordinary differential equations (ODEs) can be extended to this stochastic PDE; see for instance the lecture notes Reference Wal86. The specialization $\sigma(u) = u$, i.e.,

$$\partial_t Z = \partial_x^2 Z + Z\,\xi, \qquad (3.3)$$

has a significant connection to the KPZ equation: one can formally check that if $h$ solves KPZ, then the Hopf–Cole transform $Z = e^h$ solves Equation 3.3. Other choices of $\sigma$ are

$$\sigma(u) = \sqrt{u} \qquad \text{and} \qquad \sigma(u) = \sqrt{u(1-u)},$$

which (along with $\sigma(u) = u$) arise in modeling population dynamics and genetics; see for instance Reference Daw72 Reference Fle75.
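The formal Hopf–Cole computation (which ignores the Itô correction that makes the rigorous statement delicate) can be checked symbolically; the noise is treated as a placeholder smooth function here, and all names are ours:

```python
import sympy as sp

t, x = sp.symbols('t x')
h = sp.Function('h')(t, x)       # putative KPZ solution
xi = sp.Function('xi')(t, x)     # formal placeholder for the noise

Z = sp.exp(h)                    # Hopf-Cole transform
# residual of the multiplicative equation: Z_t - Z_xx - Z*xi
residual = sp.diff(Z, t) - sp.diff(Z, x, 2) - Z * xi
# impose the KPZ equation h_t = h_xx + (h_x)^2 + xi
residual = residual.subs(sp.Derivative(h, t),
                         sp.diff(h, x, 2) + sp.diff(h, x)**2 + xi)
assert sp.simplify(residual) == 0    # Z formally solves the multiplicative SHE
```

The chain rule produces exactly the quadratic term $(\partial_x h)^2$ from $\partial_x^2 e^h$, which is why the multiplicative equation linearizes KPZ.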

- •
**Nonlinear parabolic Anderson model** in spatial dimensions $d = 2, 3$,

$$\partial_t u = \Delta u + \sigma(u)\,\xi, \qquad (3.4)$$

where $\sigma$ is a continuous function and $\xi$ is a noise which typically is assumed to be spatially independent (i.e., white) but constant in time. This models the motion of mass through random media. The assumption of constant-in-time noise is consistent with the regime where the mass is assumed to move much faster than the time scale on which the media changes. We refer to Reference GIP15 Reference HL18 and references therein for well-posedness results.

The parabolic Anderson model, especially the linear case ($\sigma(u) = u$), is a simple model which exhibits *intermittency* over long time; for the study of long time behavior, one often considers the spatially discrete equation with the $\xi(x)$ being independent noises on lattice sites; see, for instance, the reviews Reference CM94 and Reference K16 for further discussion and references regarding long time behaviors of the parabolic Anderson model.
- •
**Dynamical sine-Gordon equation** in two spatial dimensions,

$$\partial_t u = \tfrac12 \Delta u + c \sin(\beta u) + \xi. \qquad (3.5)$$

This equation describes the natural dynamics of a class of two-dimensional systems that exhibit the Berezinskiĭ–Kosterlitz–Thouless (BKT) phase transition Reference Ber70 Reference KT73 Reference Jos13, such as the two-dimensional Coulomb gas and certain condensed matter materials.Footnote^{8}These include thin disordered super-conducting granular films. The phase transition is from bound vortex–antivortex pairs at low temperatures to unpaired vortices and antivortices at some critical temperature.^{✖} See Reference MS77 Reference FS81 for earlier studies of the model in equilibrium. Here $\beta$ represents the inverse temperature, and $\beta^2 = 8\pi$ is the BKT critical point. See Reference HS16 Reference CHS18 for the construction of local solutions of this dynamic.
- •
**Random motion of a curve** in an $n$-dimensional manifold $\mathcal{M}$, driven by $m$ independent space-time white noises (see Reference Hai16 Reference BGHZ19 Reference RWZZ18),

$$\partial_t u^\alpha = \partial_x^2 u^\alpha + \Gamma^\alpha_{\beta\gamma}(u)\, \partial_x u^\beta\, \partial_x u^\gamma + \sum_{i=1}^m \sigma_i^\alpha(u)\, \xi_i, \qquad (3.6)$$

where $u$ is a map from an interval to $\mathcal{M}$, the $\Gamma^\alpha_{\beta\gamma}$ are the Christoffel symbols for the Levi-Civita connection, and $(\sigma_i)_{i=1}^m$ is a collection of vector fields on the manifold. This is a non-Euclidean generalization of Equation 2.5.

- •
**Stochastic Yang–Mills flow** in spatial dimensions $d = 2, 3$,

$$\partial_t A = -\,d_A^* F_A + \xi, \qquad (3.7)$$

where the deterministic part (without $\xi$) is the Yang–Mills gradient flow introduced in Reference DK90, which is extensively studied in geometry (see the monograph Reference Fee14). Here, in the setting of differential geometry, one fixes a Lie group, $A$ is a connection (or a Lie algebra valued 1-form), $F_A$ is the curvature of $A$, $d_A$ is the covariant derivative operator, and $d_A^*$ is its adjoint. The noise $\xi$ is a 1-form with each component being an (independent copy of) Lie algebra valued space-time white noise. See Reference She18 for some initial progress in $d = 2$ in the case that the Lie group is Abelian. Note that Equation 3.7 is not a parabolic equation, and one usually adds an additional term $-\,d_A d_A^* A$ to the right-hand side to obtain a parabolic equation,

$$\partial_t A = -\,d_A^* F_A - d_A d_A^* A + \xi, \qquad (3.8)$$

which is gauge equivalent with the original equation (the Donaldson–De Turck trick).

The study of geometric equations with randomness, such as Equation 3.6 and Equation 3.7, is of general interest. Equation 3.7 is motivated by the problem of quantization of the Yang–Mills field theory; see also the next item.

- •
**Stochastic quantization**. This refers to a large class of singular SPDEs arising from Euclidean quantum field theories defined via Hamiltonians (or actions, energy, etc.). They were introduced by Parisi and Wu in Reference PW81. Given a Hamiltonian $\mathcal{H}(\Phi)$ which is a functional of $\Phi$, one considers a gradient flow of $\mathcal{H}$ perturbed by space-time white noise $\xi$:

$$\partial_t \Phi = -\frac{\delta \mathcal{H}}{\delta \Phi}(\Phi) + \xi. \qquad (3.9)$$

Here $\frac{\delta\mathcal{H}}{\delta\Phi}$ is the variational derivative of the functional $\mathcal{H}$; for instance, when $\mathcal{H}(\Phi) = \frac12 \int |\nabla \Phi|^2\, dx$ is the Dirichlet form, one has $\frac{\delta\mathcal{H}}{\delta\Phi} = -\Delta\Phi$, and Equation 3.9 boils down to the stochastic heat equation Equation 2.6. Note that $\Phi$ can also be a multicomponent field, with $\xi$ being likewise multicomponent. The aforementioned $\Phi^4$ equation, sine-Gordon equation, and stochastic Yang–Mills flow all belong to this class of stochastic quantization equations, each corresponding to a Hamiltonian $\mathcal{H}$.

The significance of these *stochastic quantization equations* Equation 3.9 is that given a Hamiltonian $\mathcal{H}$, the formal measure

$$\frac{1}{Z}\, e^{-\mathcal{H}(\Phi)}\, d\Phi \qquad (3.10)$$

is formally an invariant measureFootnote^{9}Being invariant means that if the initial condition of Equation 3.9 is random with *probability law* given by Equation 3.10, then the solution at any $t > 0$ will likewise be distributed according to this same probability law. For readers familiar with stochastic ODEs, one simple example is given by the Ornstein–Uhlenbeck process $dX = -X\,dt + dB$, where $B$ is the Brownian motion, and its invariant measure is the (one-dimensional) Gaussian measure $\frac{1}{Z} e^{-x^2}\, dx$.^{✖} for equation Equation 3.9. Here $d\Phi$ is the formal Lebesgue measure and $Z$ is a *normalization constant*. We emphasize that Equation 3.10 is only a formal measure because, among several other reasons, there is no *Lebesgue measure* on an infinite-dimensional space, and it is a priori not clear at all if the measure can be normalized. These measures arise from *Euclidean quantum field theories*. In their path integral formulations, quantities of physical interest are defined by expectations with respect to these measures. The task of *constructive quantum field theory* is to give precise meaning or constructions to these formal measures; see the book Reference Jaf00.

Given the very recent progress on SPDEs, a new approach to constructing measures of the form Equation 3.10 is to construct the *long-time* solution to the stochastic PDE Equation 3.9 and average the distribution of the solution over time. This approach has been shown to be successful for the $\Phi^4$ model in $d = 3$ in a series of very recent works, which starts with Reference MW17b on the torus, where a priori estimates were obtained to rule out the possibility of finite time blowup. Then Reference GH18b Reference GH18a established a priori estimates for solutions on the full space, yielding the construction of the quantum field theory on the whole $\mathbb{R}^3$, as well as verification of some key properties that this invariant measure must satisfy, as desired by physicists, such as reflection positivity. See also Reference AK17. Similar uniform a priori estimates were obtained by Reference MW18 using a maximum principle.
- •
**Random (nonlinear) Schrödinger equation**,

$$i\,\partial_t u = \Delta u + u\,\xi + \lambda\, |u|^2 u, \qquad (3.11)$$

where $u$ is complex valued, and $\xi$ is a real valued *spatial* white noise. The linear case $\lambda = 0$ is a model for Anderson localization (a complex version of Equation 3.4; see the recent works Reference AC15 Reference GKR18). In the nonlinear case, it describes the evolution of nonlinear dispersive waves in a totally disordered medium, with one sign of $\lambda$ corresponding to the *focusing case* and the other to the *defocusing case*; see Reference Con12 Reference GGF 12 for its physical background and Reference DW18 Reference DM17 Reference GUZ18 for recent mathematical results.
- •
**(Nonlinear) stochastic wave equation**,

$$\partial_t^2 u = \Delta u + f(u) + \xi, \qquad (3.12)$$

with given initial data $(u, \partial_t u)\big|_{t=0}$. The linear case ($f = 0$) in $d = 1$ spatial dimension, as Walsh explained in Reference Wal86, describes “a guitar left outdoors during a sandstorm. The grains of sand hit the strings continually but irregularly.” If $\xi$ is the random measure of the number of grains hitting (centered by subtracting the mean), then $\xi$ should be space-time white noise, since the numbers hitting over different time intervals or string portions will be essentially independent. The position $u(t,x)$ of the string should satisfy Equation 3.12 with $f = 0$.

Equation 3.12 with nonzero $f$ was investigated in earlier works by Reference AHR96 Reference OR98 in spatial dimensions $d \ge 2$, and they proved that with $f$ just a function (satisfying some *nice* properties) the solution to Equation 3.12 is trivial, namely, the same as the solution for $f = 0$; the reason for this triviality will be clear in the next subsection. More recently, Reference GKO18b obtained nontrivial solutions with $f$ given (formally!—see the next subsection) by a polynomial of $u$ in $d = 2$, and in $d = 3$ in Reference GKO18a. Reference GUZ18 then studied a stochastic wave equation with multiplicative noise ($u\,\xi$ on the right-hand side).

#### Remark 3.1.

Note that these nonlinear SPDEs are generally not scaling invariant, unlike the linear stochastic heat equation Equation 2.6, which is scaling invariant in any dimension (recall property (P5) at the end of Section 2). For instance, for the KPZ equation, $h^\lambda(t,x) := \lambda^{-1/2}\, h(\lambda^2 t, \lambda x)$ will satisfy $\partial_t h^\lambda = \partial_x^2 h^\lambda + \lambda^{1/2} (\partial_x h^\lambda)^2 + \xi^\lambda$, where $\xi^\lambda$ is a new white noise, and the equation is thus not invariant unless the nonlinear term is dropped. (Indeed, no choice of the three scaling exponents for $t$, $x$, and $h$ can make all four terms invariant.) This will turn out to be important for defining solutions to these equations; see Remark 4.2. Also, as we will see in Sections 5 and 6, it would not be possible to derive these equations (that are not scaling invariant) as limits of scaling certain physical models, unless the physical models have a weak asymmetry, a long interaction range, or a weak intensity of noise, which sets an *additional scale*.
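The exponent counting behind this remark can be carried out mechanically. Under $h^\lambda(t,x) = \lambda^a h(\lambda^b t, \lambda x)$, each term of KPZ carries a power of $\lambda$ (the noise term carries $\lambda^{(b+1)/2}$, from the covariance of $1{+}1$-dimensional white noise); the small script below, with our own parametrization, confirms that no $(a, b)$ equalizes all four powers:

```python
import sympy as sp

a, b = sp.symbols('a b')
# power of lambda carried by each KPZ term under h -> lam^a h(lam^b t, lam x)
time_deriv = a + b           # d_t h
smoothing = a + 2            # d_x^2 h
nonlinear = 2 * a + 2        # (d_x h)^2
noise = (b + 1) / 2          # from E[xi xi] = delta(t) delta(x)
sols = sp.solve([sp.Eq(time_deriv, smoothing),
                 sp.Eq(time_deriv, nonlinear),
                 sp.Eq(time_deriv, noise)], [a, b], dict=True)
# sols == []: the four constraints are inconsistent
```

Dropping the nonlinear constraint recovers $a = -\tfrac12$, $b = 2$, the scaling used for the linear equation in (P5).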

### 3.2. Challenge of solution theory

Solution theories for SPDEs have been developed since the 1970s. Earlier progress was recorded by the books written in the 1980s such as Reference KR82Reference Wal86, and more recent books such as Reference CR99Reference PR07Reference DKM 09Reference DPZ14Reference LR15. However, many very important equations, including some of those listed above, remained poorly understood—that is, until very recently.

The difficulty in building solution theories for nonlinear SPDEs is that these equations are often *too singular*; namely, the solution (if it exists) would have regularity low enough that certain parts of the equation do not a priori make sense. Indeed, recall that for the linear equation Equation 2.6 in $d$ spatial dimensions, the solution is almost surely an element of $\mathcal{C}^{(2-d)/2-\kappa}$ for any $\kappa > 0$, which is a continuous function when $d = 1$ and a distribution when $d \ge 2$. Since with nonlinear terms the solutions are not expected to be more regular, the term “$\Phi^3$” in the $\Phi^4$ equation is a priori meaningless when $d \ge 2$, because distributions in general cannot be multiplied. Similarly, for the KPZ equation, $\partial_x h$ is distribution valued, and thus the term “$(\partial_x h)^2$” does not have a clear meaning. For this type of singular SPDE, it is challenging to even *interpret what one means* by a solution.Footnote^{10} The same issue exists for dispersive equations. For instance, the solution to the stochastic wave equation Equation 3.12 in $d \ge 2$ spatial dimensions is distributional, and this is exactly the reason that the triviality result of Reference AHR96, Reference OR98 for the nonlinear problem should be expected. Indeed, their proofs are based on *Colombeau distribution* machinery.^{✖}
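A quick way to see the exponent $(2-d)/2$ is parabolic power counting, sketched below under the usual conventions that space-time white noise on $\mathbb{R} \times \mathbb{R}^d$ has parabolic regularity $-\frac{d+2}{2} - \kappa$ and that the heat kernel improves regularity by $2$ (Schauder estimate):

```latex
\xi \in \mathcal{C}^{-\frac{d+2}{2}-\kappa}
\quad\Longrightarrow\quad
Z = (\partial_t - \Delta)^{-1} \xi \;\in\; \mathcal{C}^{-\frac{d+2}{2}-\kappa+2}
  = \mathcal{C}^{\frac{2-d}{2}-\kappa}.
% d = 1:  regularity 1/2 - kappa  (continuous function);
% d = 2:  regularity    - kappa   (distribution);
% d = 3:  regularity -1/2 - kappa (distribution).
```

This bookkeeping recurs throughout the rest of the section.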

Starting in the 1980s, the idea of renormalization entered the study of SPDEs Reference JLM85, Reference AR91, Reference DPD03. Recently, this idea has received a far-reaching generalization in work by Hairer Reference Hai14b, Gubinelli, Imkeller, and Perkowski Reference GIP15, and many subsequent works. The idea is to subtract terms with infinite constants from the nonlinearities. Taking the $\Phi^4$ equation as an example, in spatial dimension $d \ge 2$ one needs to consider the renormalized equation

$$\partial_t \Phi = \Delta \Phi - \Phi^3 + \infty \cdot \Phi + \xi. \tag{3.13}$$

This precisely means the following. Since the origin of the problem is the singularity of the driving noise $\xi$, one starts by regularizing $\xi$. For instance, we take a space-time convolution $\xi_\epsilon := \xi * \varrho_\epsilon$ of $\xi$ with a mollifier $\varrho_\epsilon$, a smooth function of space and time with support of size $\epsilon$, so that $\varrho_\epsilon \to \delta$ (the Dirac delta distribution) as $\epsilon \to 0$. Now we consider the $\Phi^4$ equation driven by $\xi_\epsilon$,

$$\partial_t \Phi_\epsilon = \Delta \Phi_\epsilon - \Phi_\epsilon^3 + \xi_\epsilon. \tag{3.14}$$

For any $\epsilon > 0$, due to the smoothness of the noise, we can solve the above PDE in the classical sense, and $\varrho_\epsilon \to \delta$ implies $\xi_\epsilon \to \xi$ as $\epsilon \to 0$. But the solutions $\Phi_\epsilon$ do not converge to any nontrivial limit!

The idea is that before taking the limit, one should insert renormalization terms (also often called *counter-terms* in the context of quantum field theoryFootnote^{11} In fact the corresponding quantum field theory requires a renormalization in dimensions $d = 2$ and $d = 3$, which is well known in physics.^{✖}):

$$\partial_t \Phi_\epsilon = \Delta \Phi_\epsilon - \Phi_\epsilon^3 + 3 C_\epsilon \Phi_\epsilon + \xi_\epsilon, \tag{3.15}$$

where $C_\epsilon$ *diverges* as $\epsilon \to 0$ at a suitable rate. If the sequence of constants $C_\epsilon$ is suitably chosen, the sequence of smooth solutions $\Phi_\epsilon$ of Equation 3.15 will converge to a nontrivial limit as $\epsilon \to 0$:

$$\Phi := \lim_{\epsilon \to 0} \Phi_\epsilon.$$

This is what we mean by a solution to the (renormalized) $\Phi^4$ equation. Note that we do not attempt to make sense of the limiting equation Equation 3.13 directly; rather, we construct $\Phi$ by a limit procedure, as the limit of solutions to a sequence of regularized and renormalized equations Equation 3.15.

The same renormalization procedure applies to the KPZ equation (in one spatial dimension) and many of the other singular SPDEs listed above.

This discussion prompts several questions:

- •
Why does $\Phi_\epsilon$ converge, and how does one choose suitable constants $C_\epsilon$ to make this convergence happen? Is the resulting limit unique, or does it depend on the mollification? This is essentially the question of “well-posedness”, which will be addressed in Section 4.

- •
Why are we allowed to “change” the equation by inserting new terms that are not negligible—in fact infinite? We address this renormalization question in Section 5, in which we will see that SPDEs such as $\Phi^4$ arise as *scaling limits* of physical systems, and the renormalization will turn out to have physical meaning in these systems.

- •
How robust are these singular SPDEs under different approximation schemes? This is a universality question, meaning that one singular SPDE should be able to serve as the continuum large scale description of a *class* of systems which may have different small scale details. We discuss this in Section 6.

These questions are of course entangled in many ways. In terms of approximations and convergence, the procedure described in Equation 3.14 is the simplest way of approximating and obtaining a limit, but scaling limits of physical models are essentially also ways of obtaining the limits. In terms of uniqueness, one expects to get the same SPDE limit not only for different choices of mollifications in Equation 3.14 but also via scaling limits of perhaps apparently very different models, which is what universality means. Section 5 below will focus on the meaning of renormalization in physical models, and Section 6 will provide more detailed discussions of deriving an SPDE from these physical models; of course, in all these endeavors one needs to first understand the meaning of a solution, as discussed in Section 4.

## 4. Well-posedness of singular SPDEs

We discuss how to choose suitable renormalization constants so that one can obtain a nontrivial limit for solutions to renormalized equations. Our exposition consists of two parts.

- •
Starting from the 1990s, solutions to renormalized singular SPDEs have been constructed; see for instance Reference AR91, Reference DPD03. Here we present an elegant argument due to Reference DPD03 which illustrates a simple example of renormalization, plus a standard Picard iteration (fixed point) PDE argument. This argument, despite its simplicity, yields solutions to several singular (but not *too* singular) equations, such as the $\Phi^4$ equation in two spatial dimensions.

- •
The above argument fails for more singular SPDEs, such as $\Phi^4$ in three spatial dimensions and the KPZ equation in one spatial dimension. This motivates us to turn to a more robust approach—the theory of regularity structures introduced in Reference Hai14b. We will also mention some alternative theories or methods, such as paracontrolled distributions Reference GIP15 or renormalization groups Reference Kup16.

To focus our discussion in this section, we will work with the $\Phi^4$ equation.

### 4.1. A PDE argument and renormalization

Consider the $\Phi^4$ equation in which the spatial variable takes values in the two-dimensional torus. As explained above, we take a sequence of mollified noises $\xi_\epsilon$ and consider the mollified equation Equation 3.14.

Write $Z_\epsilon$ for the *stationary solution*Footnote^{12} This corresponds to dropping the term involving initial data in Equation 2.9 and integrating time in Equation 2.10 from $-\infty$ instead of $0$. Stationarity means that the distribution of $Z_\epsilon(t,x)$ does not depend on $(t,x)$. This assumption will be convenient when performing moment calculations, such as Equation 4.3; namely, the moments will not depend on space-time points.^{✖} to the mollified linear stochastic heat equation Equation 2.6 driven by $\xi_\epsilon$. The key observation is that the most singular part of $\Phi_\epsilon$ is $Z_\epsilon$, so if we write

$$\Phi_\epsilon = Z_\epsilon + v_\epsilon, \tag{4.1}$$

we can expect the remainder $v_\epsilon$ to converge in a space of better regularity. Subtracting the linear equation from Equation 3.14 gives

$$\partial_t v_\epsilon = \Delta v_\epsilon - \big( Z_\epsilon^3 + 3 Z_\epsilon^2 v_\epsilon + 3 Z_\epsilon v_\epsilon^2 + v_\epsilon^3 \big). \tag{4.2}$$

This equation looks more promising since the rough driving noise has dropped out. This manipulation alone has not solved the problem of multiplying distributions, since the limit of $Z_\epsilon$ is still distribution valued in two spatial dimensions (as we discussed earlier—see fact (P2) at the end of Section 2). However, $Z_\epsilon$ is a rather concrete object since it is *Gaussian* distributed ((P3) at the end of Section 2). This makes it possible to study the behavior of $Z_\epsilon^2$ and $Z_\epsilon^3$ via probabilistic methods.

As an illustration, consider the expectation

$$\mathbb{E}\big[ Z_\epsilon(t,x)^2 \big] = \int P_\epsilon(s,y)^2 \, \mathrm{d}s \, \mathrm{d}y, \tag{4.3}$$

where $P_\epsilon = P * \varrho_\epsilon$ is the convolution of the mollifier $\varrho_\epsilon$ introduced in Equation 3.14 with the heat kernel $P$ introduced in Equation 2.10. Due to the singularity of the heat kernel at the origin, this integral diverges like $\log(1/\epsilon)$ as $\epsilon \to 0$ in two spatial dimensions. Denoting $C_\epsilon := \mathbb{E}[Z_\epsilon^2]$ (which does not depend on $(t,x)$ by stationarity of $Z_\epsilon$), this calculation indicates that in Equation 4.2 we should subtract $C_\epsilon$ from $Z_\epsilon^2$, and subtract $3 C_\epsilon Z_\epsilon$Footnote^{13} The factor $3$ arises from the three ways of choosing two powers of $Z_\epsilon$ from the cubic term $Z_\epsilon^3$. A *Wick theorem* allows one to compute the expectation of a product of arbitrarily many Gaussian variables, and the Wick renormalized powers are $Z_\epsilon^{\diamond 2} := Z_\epsilon^2 - C_\epsilon$ and $Z_\epsilon^{\diamond 3} := Z_\epsilon^3 - 3 C_\epsilon Z_\epsilon$.^{✖} from $Z_\epsilon^3$.
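As a sanity check on the claimed logarithmic divergence, one can integrate the squared heat kernel numerically. The snippet below is a sketch (not from the paper): it uses the closed form $\int_{\mathbb{R}^2} P(s,y)^2 \, \mathrm{d}y = (8\pi s)^{-1}$ for the heat kernel of $\partial_t - \Delta$, and crudely models mollification at spatial scale $\epsilon$ as a time cutoff at $s = \epsilon^2$ (parabolic scaling), which gives exactly $\log(1/\epsilon)/(4\pi)$.

```python
import numpy as np

def heat_kernel_sq_mass(s, L=6.0, n=400):
    """Numerically compute int_{R^2} P(s,y)^2 dy for the 2D heat kernel
    P(s,y) = exp(-|y|^2/(4s)) / (4*pi*s); the closed form is 1/(8*pi*s)."""
    r = np.linspace(-L * np.sqrt(s), L * np.sqrt(s), n)
    dx = r[1] - r[0]
    X, Y = np.meshgrid(r, r)
    P = np.exp(-(X**2 + Y**2) / (4 * s)) / (4 * np.pi * s)
    return np.sum(P**2) * dx * dx

def variance_lower_cutoff(eps, n_s=4000):
    """Model of E[Z_eps^2]: int_{eps^2}^{1} 1/(8*pi*s) ds, computed by the
    trapezoid rule on a log-spaced grid; equals log(1/eps)/(4*pi) exactly."""
    s = np.logspace(2 * np.log10(eps), 0.0, n_s)
    f = 1.0 / (8 * np.pi * s)
    return np.sum(0.5 * (f[1:] + f[:-1]) * np.diff(s))

for eps in (1e-2, 1e-3, 1e-4):
    print(eps, variance_lower_cutoff(eps), np.log(1 / eps) / (4 * np.pi))
```

The printed pairs agree to several digits, and the variance grows without bound as $\epsilon \to 0$, matching the $\log(1/\epsilon)$ rate.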

This amounts to considering the renormalized equation

$$\partial_t v_\epsilon = \Delta v_\epsilon - \big( Z_\epsilon^{\diamond 3} + 3 Z_\epsilon^{\diamond 2} v_\epsilon + 3 Z_\epsilon v_\epsilon^2 + v_\epsilon^3 \big), \qquad Z_\epsilon^{\diamond 2} := Z_\epsilon^2 - C_\epsilon, \quad Z_\epsilon^{\diamond 3} := Z_\epsilon^3 - 3 C_\epsilon Z_\epsilon. \tag{4.4}$$

These renormalized powers of $Z_\epsilon$ do converge to nontrivial limits. In fact, thanks to the Gaussianity of $Z_\epsilon$, given a smooth test function $\varphi$, one can explicitly compute any probabilistic moment of $\langle Z_\epsilon^{\diamond k}, \varphi \rangle$ and prove its convergence. By choosing $\varphi$ from a suitable set of wavelets or a Fourier basis, one can apply a version of Kolmogorov’s theoremFootnote^{14} This is a version of Kolmogorov’s theorem (formulated differently from the classical Kolmogorov continuity theorem) which is adapted to prove convergence in the spaces $\mathcal{C}^\alpha$ in the present context.^{✖} to prove that $Z_\epsilon^{\diamond 2}$ and $Z_\epsilon^{\diamond 3}$ converge in $\mathcal{C}^{-\kappa}$ for any $\kappa > 0$. We denote these limits by $Z^{\diamond 2}$ and $Z^{\diamond 3}$. They are elements of $\mathcal{C}^{-\kappa}$ for any $\kappa > 0$.
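The pointwise effect of Wick renormalization can already be seen for a single Gaussian random variable: as the variance $\sigma^2$ (playing the role of the diverging $C_\epsilon$) grows, the raw powers have divergent means, while the Wick powers $Z^2 - \sigma^2$ and $Z^3 - 3\sigma^2 Z$ stay centered, with the Hermite-polynomial variances $2\sigma^4$ and $6\sigma^6$. A minimal Monte Carlo sketch (my own illustration, not from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10**6

for sigma in (1.0, 2.0, 4.0):             # sigma^2 plays the role of C_eps
    Z = sigma * rng.standard_normal(n)
    wick2 = Z**2 - sigma**2               # Wick square: mean zero by construction
    wick3 = Z**3 - 3 * sigma**2 * Z       # Wick cube: mean zero by construction
    print(f"sigma={sigma}:",
          f"E[Z^2]={np.mean(Z**2):.3f} (diverges with sigma),",
          f"E[wick2]={np.mean(wick2):.4f},",
          f"E[wick3]={np.mean(wick3):.4f}")
    # Gaussian calculus predicts Var(wick2) = 2 sigma^4 and Var(wick3) = 6 sigma^6
```

The centering is what makes it possible to pair the Wick powers against test functions and obtain convergent limits.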

To summarize, we have found that the renormalization constants can be found through *explicit* moment calculations/expectations.

Passing Equation 4.2 (in its renormalized form Equation 4.4) to the limit, we get

$$\partial_t v = \Delta v - \big( Z^{\diamond 3} + 3 Z^{\diamond 2} v + 3 Z v^2 + v^3 \big). \tag{4.5}$$

We can prove local well-posedness of this equation as a *classical* PDE, by a standard fixed point argument. For this, we use a classical result in harmonic analysis that goes under the name of Young’s theorem,Footnote^{15} Note that the original form of Young’s theorem is only for one-dimensional functions of finite $p$-variation; the version of Young’s theorem we are referring to here can be found in Reference Hai14b, Proposition 4.14 and Reference BCD11, Theorems 2.47 and 2.52, for instance.^{✖} which states that if $f \in \mathcal{C}^\alpha$, $g \in \mathcal{C}^\beta$, and $\alpha + \beta > 0$, then $fg \in \mathcal{C}^{\min(\alpha, \beta)}$. Thus if we assume that $v \in \mathcal{C}^\beta$ for, say, $\beta \in (\kappa, 2 - \kappa)$, then the worst term in the parenthesis in Equation 4.5 has regularity $-\kappa$. By the classical Schauder estimate, which states that the heat kernel improves regularity by $2$, the fixed point map

$$v \mapsto P * \big( Z^{\diamond 3} + 3 Z^{\diamond 2} v + 3 Z v^2 + v^3 \big) \tag{4.6}$$

is well defined. Namely, it maps a generic element $v \in \mathcal{C}^\beta$ to a new element which is *again* in $\mathcal{C}^\beta$. This is since, for $\kappa$ sufficiently close to $0$, one has $2 - \kappa \ge \beta$. With a bit of extra effort, one can show that over a short time interval the fixed point map is contractive and thus has a fixed point $v$ in $\mathcal{C}^\beta$, and this fixed point is the solution. (The sharp result is $v \in \mathcal{C}^{2-\kappa}$ for any $\kappa > 0$.) To conclude, one has $\Phi = Z + v$, which is the local solution to the renormalized $\Phi^4$ equation in two spatial dimensions.
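To illustrate the flavor of the Picard/fixed-point step in a setting anyone can run, the sketch below iterates the mild-solution map $v \mapsto e^{t\Delta} v_0 - \int_0^t e^{(t-s)\Delta} v(s)^3 \, \mathrm{d}s$ for the deterministic equation $\partial_t v = \Delta v - v^3$ on a one-dimensional torus (no noise terms; a toy stand-in for Equation 4.6), and watches the successive Picard differences contract over a short time horizon:

```python
import numpy as np

N, M, T = 64, 80, 0.1                        # space points, time points, short horizon
x = np.linspace(0, 2 * np.pi, N, endpoint=False)
k = np.fft.fftfreq(N, d=1.0 / N)             # integer wavenumbers on the torus
t = np.linspace(0, T, M)
v0 = np.sin(x)

def heat(w, dt):
    """Heat semigroup e^{dt * Laplacian}, applied via its Fourier multiplier."""
    return np.real(np.fft.ifft(np.exp(-k**2 * dt) * np.fft.fft(w)))

def picard(v):
    """One iteration of the mild-solution (Duhamel) map for dv/dt = Lap v - v^3."""
    new = np.empty_like(v)
    dt = t[1] - t[0]
    for i, ti in enumerate(t):
        acc = heat(v0, ti)
        for j in range(i):                   # left Riemann sum for the Duhamel integral
            acc -= heat(v[j]**3, ti - t[j]) * dt
        new[i] = acc
    return new

v = np.array([heat(v0, ti) for ti in t])     # zeroth iterate: the linear evolution
diffs = []
for _ in range(8):
    v_new = picard(v)
    diffs.append(np.max(np.abs(v_new - v)))  # sup-norm distance between iterates
    v = v_new
print(diffs)                                 # geometric decay, i.e. contraction
```

Over the short horizon $T = 0.1$ the map is a contraction (factor roughly $3T \sup |v|^2$), so the differences decay geometrically; over long horizons, or with the singular terms of Equation 4.5 in dimension three, this naive argument breaks down.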

The above argument was first used by Da Prato and Debussche in Reference DPD03, and it applies to other equations, for instance the stochastic Navier–Stokes equation Equation 3.1 with space-time white noise on a two-dimensional torus Reference DPD02. Let us mention another, somewhat surprising application, namely the dynamical sine-Gordon equation Equation 3.5 in two spatial dimensions in the regime $\beta^2 < 4\pi$. The renormalized equation reads

$$\partial_t u_\epsilon = \tfrac{1}{2} \Delta u_\epsilon + C_\epsilon \sin(\beta u_\epsilon) + \xi_\epsilon,$$

where $C_\epsilon$ is a renormalization constant which diverges like $\epsilon^{-\beta^2/(4\pi)}$. By writing $u_\epsilon = Z_\epsilon + v_\epsilon$ with $Z_\epsilon$ as above, one finds that

$$\partial_t v_\epsilon = \tfrac{1}{2} \Delta v_\epsilon + C_\epsilon \big( \sin(\beta Z_\epsilon) \cos(\beta v_\epsilon) + \cos(\beta Z_\epsilon) \sin(\beta v_\epsilon) \big).$$

Reference HS16
proved that converges to a nontrivial limit in so , is preciselyFootnote^{16} Recall that, for the fixed point argument to work, we must have that the regularity plus 2 is at least 1.^{✖} the regime where the above classical PDE argument applies. Note that the constant can be again found by calculating the expectation of i.e., the characteristic function of the Gaussian random variable , .
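The origin of the divergence rate can be seen from the Gaussian characteristic function; the following sketch assumes the normalization $\partial_t Z_\epsilon = \frac12 \Delta Z_\epsilon + \xi_\epsilon$ in two spatial dimensions, for which $\mathbb{E}[Z_\epsilon^2] \approx \frac{1}{2\pi} \log(1/\epsilon)$:

```latex
\mathbb{E}\, e^{i\beta Z_\epsilon}
  \;=\; e^{-\frac{\beta^2}{2}\,\mathbb{E}[Z_\epsilon^2]}
  \;\approx\; \exp\!\Big( -\frac{\beta^2}{4\pi}\,\log(1/\epsilon) \Big)
  \;=\; \epsilon^{\beta^2/(4\pi)} .
% Hence  sin(beta Z_eps)  and  cos(beta Z_eps)  are damped by  eps^{beta^2/(4 pi)}
% in expectation, and one multiplies by  C_eps ~ eps^{-beta^2/(4 pi)}  to compensate.
```

This also explains why the divergence is only a power of $\epsilon$ depending on $\beta$, rather than the fixed rates seen for the polynomial models.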

The same idea (but with a slightly different transformation than Equation 4.1) applies to the linear parabolic Anderson model in Reference HL15:

$$\partial_t u_\epsilon = \Delta u_\epsilon + u_\epsilon \xi_\epsilon,$$

where $\xi_\epsilon$ is a regularized *spatial* white noise on $\mathbb{R}^2$. With the transformation $u_\epsilon = e^{Y_\epsilon} w_\epsilon$, where $-\Delta Y_\epsilon = \xi_\epsilon$, one can simply check

$$\partial_t w_\epsilon = \Delta w_\epsilon + 2 \nabla Y_\epsilon \cdot \nabla w_\epsilon + |\nabla Y_\epsilon|^2 w_\epsilon.$$

Again, $Y_\epsilon$ is a Gaussian process, and with a suitable diverging constant $C_\epsilon$, $|\nabla Y_\epsilon|^2 - C_\epsilon$ converges to a nontrivial limit, and the equation for $w_\epsilon$ is shown to be locally well-posed by standard PDE methods as above. (In fact Reference HL15 constructed a global solution on $\mathbb{R}^2$, making use of the linearity of the equation.) Reference DW18 studies the stochastic Schrödinger equation Equation 3.11 on $\mathbb{R}^2$, in which a transformation similar to that in Reference HL15 can be applied.
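The computation behind the exponential transformation is elementary; here is a sketch in the mollified, unrenormalized setting, writing $u = e^{Y} w$ with a time-independent $Y$ solving $-\Delta Y = \xi$:

```latex
\partial_t u = e^{Y}\, \partial_t w, \qquad
\Delta u = e^{Y} \big( \Delta w + 2 \nabla Y \cdot \nabla w + (\Delta Y + |\nabla Y|^2)\, w \big).
% Substituting into  \partial_t u = \Delta u + u \xi  and using  \Delta Y = -\xi ,
% the terms  \Delta Y \, w  and  \xi \, w  cancel, leaving
\partial_t w = \Delta w + 2 \nabla Y \cdot \nabla w + |\nabla Y|^2 \, w .
```

The singular product $u\xi$ has thus been traded for $|\nabla Y|^2$, which is exactly the term renormalized by subtracting the diverging constant $C_\epsilon = \mathbb{E}\,|\nabla Y_\epsilon|^2$.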

This type of strategy has also been applied to stochastic hyperbolic equations. Consider the stochastic nonlinear wave equation

$$\partial_t^2 u = \Delta u + f(u) + \xi,$$

with given initial data, where $\xi$ is the space-time white noise and $f$ is a polynomial nonlinearity. Reference GKO18b adopts the above Da Prato–Debussche trick to write $u$ as a linear part plus a remainder. Such an idea previously appeared in the context of deterministic dispersive PDEs with random initial data in earlier work of McKean Reference McK95 and Bourgain Reference Bou96. The proof in Reference GKO18b is based on a fixed point argument for the remainder equation (as above), but with the Schauder estimates replaced by Strichartz estimates for the wave equations. The key point is to use function spaces where the wave equation allows for a gain in regularity. This gain is sufficient to prove that the remainder has better regularity than the linear solution, and it gives a well-defined nonlinearity for which suitable local-in-time estimates can be established.

### 4.2. Regularity structures and paracontrolled distributions

The above argument fails for the $\Phi^4$ equation in three spatial dimensions. As the dimension increases, the space-time white noise (and thus the solution) becomes more singular. To see how this problem rears its head, consider the term $3 Z^{\diamond 2} v$ in Equation 4.5. One can showFootnote^{17} In three spatial dimensions $Z \in \mathcal{C}^{-1/2-\kappa}$; heuristically $Z^{\diamond 2}$ should have twice that regularity. From this one can show that $Z^{\diamond 2} \in \mathcal{C}^{-1-\kappa}$ (the rigorous proof of this fact is done by moment analysis).^{✖} that in three spatial dimensions, as a space-time distribution, $Z^{\diamond 2} \in \mathcal{C}^{-1-\kappa}$ for any $\kappa > 0$. Thus, to multiply $Z^{\diamond 2}$ with $v$, we would have to formulate the fixed point map Equation 4.6 for $v \in \mathcal{C}^\beta$ with $\beta > 1 + \kappa$. The product $Z^{\diamond 2} v$ would then lie in $\mathcal{C}^{-1-\kappa}$ for any $\kappa > 0$. Unfortunately, the Schauder estimate only provides two more degrees of regularity, and $-1-\kappa+2 = 1-\kappa < 1+\kappa$; thus the fixed point map Equation 4.6 will not bring an element in $\mathcal{C}^\beta$ back to the same space.
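The dead end can be checked mechanically. The sketch below (my own bookkeeping, not from the paper) encodes the three rules used so far (parabolic regularity of the noise, the Schauder gain of $2$, and Young's condition $\alpha + \beta > 0$ for products) and confirms that the loop closes in $d = 2$ but not in $d = 3$:

```python
KAPPA = 0.01   # an arbitrarily small positive loss of regularity

def noise_reg(d):
    """Parabolic regularity of space-time white noise in d spatial dimensions."""
    return -(d + 2) / 2 - KAPPA

def schauder(alpha):
    """The heat kernel improves regularity by 2."""
    return alpha + 2

def young_product(alpha, beta):
    """Young's theorem: the product is defined only if alpha + beta > 0."""
    if alpha + beta <= 0:
        raise ValueError("Young's theorem does not apply")
    return min(alpha, beta)

for d in (2, 3):
    Z = schauder(noise_reg(d))     # linear solution: regularity (2 - d)/2 - kappa
    Z2 = 2 * Z                     # heuristic regularity of the Wick square
    v = schauder(Z2)               # best hoped-for regularity of the remainder
    try:
        young_product(Z2, v)       # the product Z^{<>2} * v in the fixed point map
        print(f"d={d}: fixed point map closes (v in C^{v:.2f})")
    except ValueError:
        print(f"d={d}: obstruction -- Z^2 has regularity {Z2:.2f}, v only {v:.2f}")
```

For $d = 2$ the sum of exponents is about $1.96 > 0$, while for $d = 3$ it is about $-0.04 \le 0$: exactly the borderline failure described above.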

A natural idea is to go one step further in the expansion Equation 4.1. In view of the equation Equation 4.2, we define a second order perturbative term $W$ by $\partial_t W = \Delta W - Z^{\diamond 3}$, and rewrite the expansion as

$$\Phi = Z + W + v. \tag{4.7}$$

It turns out that one can prove that $W$ converges in $\mathcal{C}^{1/2-\kappa}$ for any $\kappa > 0$ to a limit. It remains to see whether $v$ converges to a limit with even better regularity. Using Equation 4.4, it is straightforward to derive an equation for $v$:

$$\partial_t v = \Delta v - \big( 3 Z^{\diamond 2} v + 3 Z^{\diamond 2} W + \cdots \big). \tag{4.8}$$

There should be eight terms in the parenthesis, but we have only written down the two terms that are important for our discussion; the other terms (in “$\cdots$”) can be treated by the standard PDE argument as in the two-dimensional case above. It turns out that even after this higher order expansion Equation 4.7, the above PDE fixed point argument still does not work, because of the two terms written on the right-hand side of Equation 4.8.

For the second term, we have $Z^{\diamond 2} \in \mathcal{C}^{-1-\kappa}$, $W \in \mathcal{C}^{1/2-\kappa}$, and $(-1-\kappa) + (\tfrac12 - \kappa) < 0$, which is below the borderline of applicability of Young’s theorem. It is not hard to overcome this difficulty. In fact, in three spatial dimensions, the term $Z^{\diamond 2} W$ requires further renormalization to converge to a nontrivial limit. Since this term is nothing but a convolution of several heat kernels and Gaussian noises, one can again carry out a moment analysis to find a suitable renormalization constant $\tilde{C}_\epsilon$, which turns out to diverge logarithmically, such that

$$Z_\epsilon^{\diamond 2} W_\epsilon - \tilde{C}_\epsilon \quad \text{converges to a nontrivial limit.} \tag{4.9}$$

This amounts to renormalizing the $\Phi^4$ equation in three spatial dimensions as

$$\partial_t \Phi_\epsilon = \Delta \Phi_\epsilon - \Phi_\epsilon^3 + \big( 3 C_\epsilon - 9 \tilde{C}_\epsilon \big) \Phi_\epsilon + \xi_\epsilon, \tag{4.10}$$

where $C_\epsilon = \mathbb{E}[Z_\epsilon^2]$ diverges like $\epsilon^{-1}$ and $\tilde{C}_\epsilon$ diverges logarithmically. See also Remark 4.1.

The first term, $3 Z^{\diamond 2} v$, is of the same nature as the product $Z^{\diamond 2} v$ in Equation 4.6, so we suffer exactly the same vicious circle of difficulty as in the discussion above; namely, the fixed point argument does not close. In fact, higher order expansions beyond Equation 4.7 will always end up with such a term, so the same problem will remain. This is the real obstacle. The idea of regularity structures (which overcomes this obstacle) is that the solutions to the two equations

$$\partial_t v = \Delta v - 3 Z^{\diamond 2} v + \cdots \qquad \text{and} \qquad \partial_t X = \Delta X - Z^{\diamond 2} \tag{4.11}$$

should have the same small scale behavior, because $Z^{\diamond 2}$ is more singular than $v$, and it is the factor $Z^{\diamond 2}$ that dominates the small scale roughness. (Here we have ignored all the other terms in Equation 4.8, which have better regularities than that of $3 Z^{\diamond 2} v$, in order to focus on the main issue of the problem.) This a priori knowledge that $v$ should locally look like $X$ can be formulated as follows: when the space-time points $z$ and $z'$ are close, one should expect thatFootnote^{18} Equation 4.12 is reminiscent of a Taylor expansion, where one approximates a differentiable function by Taylor polynomials. Here we approximate $v$ by $X$, which is also an object that is simply a convolution of heat kernels with white noises. Taylor polynomials are special examples of regularity structures, and the theory of regularity structures is a generalization of Taylor expansion.^{✖}

$$v(z) - v(z') \approx 3\, v(z') \big( X(z) - X(z') \big). \tag{4.12}$$

Namely, the local increment of $v$ is approximately the same as the local increment of $X$—up to multiplication by a factor $3 v(z')$ which depends on the base point $z'$; the reason that this multiplicative factor should be $3 v(z')$ is clear from the structure of equation Equation 4.11.

Since $X$ is again a *concrete* object, which is simply a convolution of heat kernels with powers of Gaussians, it is easy to prove $X \in \mathcal{C}^{1-\kappa}$ by analyzing its moments as before. Thus if $v$ satisfies Equation 4.12, that is, if $v$ locally looks like $X$, one has $v \in \mathcal{C}^{1-\kappa}$ as well. The converse is not true; the set of $v$ satisfying Equation 4.12 is a strictly smaller set than $\mathcal{C}^{1-\kappa}$. The key is to formulate a fixed point problem in the space of all functions $v$ that have the prescribed local expansion Equation 4.12 (rather than in standard function spaces such as $\mathcal{C}^\beta$). The aforementioned vicious term $Z^{\diamond 2} v$, which could not be defined for arbitrary $v \in \mathcal{C}^{1-\kappa}$, can now be defined if $v$ locally looks like $X$, because $Z^{\diamond 2} X$ is again simply a concrete combination of Gaussian processes and heat kernels! It turns out that the fixed point argument closes in the space of functions having such prescribed local expansions, and the fixed point together with Equation 4.7 yields the solution to the $\Phi^4$ equation in three dimensions.

The above idea of solving stochastic equations in a space of functions or distributions that have prescribed local approximations by certain canonical objects to some extent had its precursor, in the simpler setting of stochastic ordinary differential equations, in *rough path theory* (see Reference Lyo98 or the book Reference FH14), in particular a formulation by Reference Gub04. Constructing the solution to the $\Phi^4$ equation on a three-dimensional torus was the first example of the theory built in Reference Hai14b. The review articles Reference Hai15a, Reference Hai15b have more detailed pedagogical explanations of the theory and of its application to this equation.

#### Remark 4.1.

We have found the renormalizations of $Z_\epsilon^2$, $Z_\epsilon^3$, and $Z_\epsilon^{\diamond 2} W_\epsilon$, and proved convergence of the renormalized objects by moment analysis. Analyzing moments of these random objects is the only probabilistic component of Reference Hai14b. As the equation in question becomes more singular, the number of such random objects to be studied increases, and it is tedious or even impossible to analyze each of them by hand. Reference CH16 develops a *blackbox* that provides systematic and automatic treatment of renormalization and moment analysis for these perturbative objects arising from general singular SPDEs. Moreover, there are also algebraic aspects of the renormalization procedure (so-called renormalization groups), which have been systematically treated in Reference BHZ16. Finally, there is a question regarding what the renormalized equation (e.g., Equation 4.10) will look like after renormalizing these random objects, and this is answered in Reference BCCH17.

Hairer’s theory has been applied to provide solutions to other very singular SPDEs, for instance, a generalized parabolic Anderson model (a generalization of Equation 3.4),

$$\partial_t u = \Delta u + f(u) + g(u) \xi,$$

where $f$ and $g$ are sufficiently regular functions. The well-posedness of the KPZ equation in one spatial dimension was solved in Reference Hai13 using the theory of *controlled rough paths* Reference Gub04, which can now be viewed as a special case of regularity structures; see the book Reference FH14 for rough paths, regularity structures, and applications to KPZ.

Other applications include (but are not limited to) the stochastic Navier–Stokes equation Equation 3.1 with white noise on the three-dimensional torus Reference ZZ15, the stochastic heat equation with multiplicative noise Equation 3.2 Reference HP15, Reference HL18, the dynamical sine-Gordon equation Equation 3.5 on the two-dimensional torus for arbitrary $\beta^2 < 8\pi$ Reference HS16, Reference CHS18, the stochastic quantization of Abelian gauge theory/stochastic gauged Ginzburg–Landau equation by Reference She18, and the random motion of a string in a manifold Equation 3.6 Reference BGHZ19.

Besides Hairer’s theory Reference Hai14b, some alternative methods have also been introduced. The *paracontrolled distribution* method of Gubinelli, Imkeller, and Perkowski Reference GIP15 is based on a similar idea of controlling the local behavior of solutions, but it is implemented in a different way, using Littlewood–Paley theory and paraproducts Reference BCD11. See Reference GP18b for a review of paracontrolled distributions. The paracontrolled distribution method has also been successfully applied to, for instance, the KPZ equation Reference GP17b, Reference Hos18 (and more recently the construction of a solution on the entire real line instead of a circle Reference PR18), a multi-component coupled KPZ equation Reference FH17, and the $\Phi^4$ equation Reference CC18b in three spatial dimensions (and, more interestingly, its global solution by Reference MW17b).

The paracontrolled distribution method has not only allowed us to prove well-posedness results for stochastic PDEs, but it has also resulted in the construction of other singular objects that could not be made sense of before. For instance, Reference AC15 constructed the Anderson Hamiltonian (i.e., Schrödinger operator) on the two-dimensional torus, formally defined as $-\Delta + \xi$, where $\xi$ is a singular potential such as white noise. As another example, Reference CC18a proved existence and uniqueness of solutions for stochastic *ordinary* differential equations $\mathrm{d}X_t = b(X_t)\,\mathrm{d}t + \mathrm{d}B_t$ with *distribution*-valued drift $b$, where $B$ is a Brownian motion; this is achieved via the study of the generator of the above stochastic ordinary differential equation. Reference CC18a also managed to make sense of a singular *polymer measure* on the space of continuous functions, formally given by a density proportional to $\exp\big( \int_0^T \xi(X_s)\,\mathrm{d}s - \infty \big)$ with respect to the Wiener measure (i.e., the Gaussian measure for Brownian motions), where $\xi$ is a spatial white noise on the torus independent of the Brownian motion, and $\infty$ stands for an (infinite) renormalization constant.

We will discuss another application of the paracontrolled distribution method on the scaling limit problem with a bit more detail in Section 5.2.

Along the lines of this paracontrolled distribution approach, Reference BB16 provided a semigroup approach, which has been applied to the generalized parabolic Anderson model Equation 3.4 on a potentially unbounded two-dimensional Riemannian manifold.

Another method based on renormalization group flow was introduced by Kupiainen Reference Kup16, which for instance has been applied to prove local well-posedness for a generalized KPZ equation Reference KM17 introduced by H. Spohn Reference Spo14 in the context of stochastic hydrodynamics.

Among all these alternative methods, the theory of regularity structures is by far the most systematic and general approach; for instance, it has developed the *blackbox theorems* mentioned in Remark 4.1, which make the implementation of this theory very automatic, and it can deal with equations which are extremely singular (that is, very close or even arbitrarily close to criticality; see Remark 4.2), such as the random string in a manifold Equation 3.6 or the dynamical sine-Gordon equation Equation 3.5 for arbitrary $\beta^2 < 8\pi$.

#### Remark 4.2 (*Subcriticality of stochastic PDE*).

The methods developed in Reference Hai14b, Reference GIP15, and Reference Kup16 are all for *subcritical* semilinear stochastic PDEs. For stochastic PDEs with white noise, the equation being subcritical means that the nonlinear term has better regularity than the linear terms; namely, small scale roughness is dominated by the linear solution. For instance, for the $\Phi^4$ equation in three spatial dimensions, the term $\Phi^3$ has regularity $-\tfrac32 - \kappa$, while $\partial_t \Phi$ and $\Delta \Phi$ have regularities $-\tfrac52 - \kappa$. Subcriticality often depends on the spatial dimension: the KPZ equation, Equation 3.2, and Equation 3.6 are subcritical in $d = 1$, while the $\Phi^4$ equation, the parabolic Anderson model Equation 3.4, the Navier–Stokes equation Equation 3.1 with space-time white noise, and the stochastic Yang–Mills heat flow Equation 3.8 are subcritical in $d < 4$. The dynamical sine-Gordon equation Equation 3.5, however, is subcritical for $\beta^2 < 8\pi$.

The stochastic PDEs discussed here, in supercritical regimes (i.e., above the aforementioned criticalities), are not expected to have nontrivial meanings of solutions. We only expect to get Gaussian limits, although the Gaussian variances may be nontrivial; the reader is referred to Reference MU18, Theorem 1.1 for a flavor of such a result for the KPZ equation in $d \ge 3$.

Critical dimensions are much more subtle. We refer to Reference CD18, Reference CSZ17b, Reference CSZ18, Reference Gu18 for the very recent progress on the KPZ equation in $d = 2$.

#### Remark 4.3.

We remark that although we have focused on semilinear equations in our exposition, the methods developed in Reference Hai14b, Reference GIP15 have also been extended to quasilinear equations; see Reference OW18, Reference FG19, Reference BDH19, Reference GH17b.

### 4.3. A brief discussion on weak solutions

The solutions to SPDEs that we have discussed so far are called *strong solutions*, as opposed to the *weak solutions* that we will now briefly discuss.Footnote^{19} Not all equations that admit weak solutions admit strong solutions. A famous stochastic differential equation example is called Tanaka’s equation; see Reference KS91a, Example 3.5.^{✖} Let us immediately point out that the weak solutions in the stochastic context have nothing to do with the weak solutions in deterministic PDE theory; one sometimes adopts the terminology “probabilistically weak solutions”.

For a strong solution, one starts with a probability space on which the noise is defined and then builds a mapping from that probability space and the initial data space to a space of functions (or distributions) that satisfies the prescribed equationFootnote^{20} As we saw, making sense of what it means to satisfy the equation often takes significant work and involves regularizations and renormalizations. There are also some measurability assumptions which should be imposed on strong solutions so that future noise cannot affect the evolution before its time.^{✖} with probability one (i.e., for almost every point in the probability space). Though subtle, it is important to understand that a strong solution to an SPDE need not be function valued (as we saw, in some instances it is distribution valued, living in some space of negative regularity).

For a standard PDE, a weak solution requires that the equation holds when tested against a suitable class of functions. For SPDEs, the analogue of this involves treating solutions statistically as probability measures on the solution space, rather than as random variables supported on the probability space on which the noise is defined. Roughly, a weak solution means that we can define some noise (with the right distribution and measurability assumptions satisfied) so that the canonical processFootnote^{21} The canonical process is the random variable whose probability space is defined as the solution space equipped with the probability measure of the proposed solution.^{✖} on the solution space, along with the noise, satisfies the desired equation.Footnote^{22} Here is a, hopefully, more intuitive explanation for this different notion of strong versus weak. Imagine that human life were governed by an SPDE. Then, a strong solution would tell us how each individual’s life would unfold, given the knowledge of all of the randomness which befalls them, in addition to the world around them. A weak solution is statistical—it tells us that people with certain characteristics have certain probabilities of having their life unfold in various ways. Given such a prescription of probabilities, how can we verify that this is, indeed, a weak solution to the “SPDE of life”? Well, we need to demonstrate that there exists randomness which would, in fact, result in the aforementioned probabilities. Then, we would need to verify that the randomness is distributed in the way that the SPDE of life claims (e.g., space-time white noise). While this is all a bit tongue-in-cheek, we hope it helps explain the difference.^{✖} As we will see, *martingale problems* provide a very convenient way to demonstrate that a weak solution solves an SPDE (instead of demonstrating the existence of a suitable noise as above).

Let us illustrate these ideas in the simplest setting of stochastic differential equations (SDEs). Let $\xi$ denote temporal white noise. Like its space-time counterpart, this can be defined in various ways (e.g., as a series with random coefficients). Consider the SDE $\frac{\mathrm{d}X_t}{\mathrm{d}t} = \xi$, which in integrated form reads $X_t = X_0 + \int_0^t \xi(s)\,\mathrm{d}s$ (let us assume that $X_0 = 0$ for simplicity). Once integration with respect to white noise is understood, this defines a solution map (and hence a strong solution) from $\xi$ to the full trajectory of $X_t$ for $t \ge 0$. One checks that $X$ is continuous and that its marginal distributions are Gaussian, with the covariance of $X_s$ and $X_t$ given by the minimum of $s$ and $t$. This, in fact, implies that the distribution of the function $t \mapsto X_t$ is the Wiener measure—that is, the distribution of Brownian motion. If instead of $X$ we had another Brownian motion $\widetilde{X}$ (for instance, we could have $\widetilde{X} = -X$, or just an independent Brownian motion), then $\widetilde{X}$ would be a weak solution, but not a strong solution. This is because $X$ and $\widetilde{X}$ have the same distribution, even if they are not “driven” by the same noise.
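A quick simulation illustrates both claims: the integrated-noise process has covariance $\min(s,t)$, and the flipped process $-X$ has exactly the same statistics even though it is driven by a different noise. (A sketch; the discrete approximation of white noise here is a scaled i.i.d. Gaussian sequence.)

```python
import numpy as np

rng = np.random.default_rng(42)
paths, steps = 200_000, 100
dt = 1.0 / steps

# X_t = int_0^t xi(s) ds, approximated by a scaled random walk
xi = rng.standard_normal((paths, steps)) / np.sqrt(dt)   # discrete white noise
X = np.cumsum(xi * dt, axis=1)

s_idx, t_idx = 30, 70                                    # times s = 0.3, t = 0.7
cov = np.mean(X[:, s_idx - 1] * X[:, t_idx - 1])
print("Cov(X_s, X_t) ~", cov, "   min(s,t) =", 0.3)

# -X is a weak solution driven by the different noise -xi: same covariance
cov_flip = np.mean((-X[:, s_idx - 1]) * (-X[:, t_idx - 1]))
print("Cov(-X_s, -X_t) ~", cov_flip)
```

The two estimated covariances coincide exactly (the products are unchanged by the sign flip), which is precisely the sense in which $-X$ solves the equation weakly but not strongly.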

The martingale problem provides an alternative characterization, to the Gaussian description above, of Brownian motion. The Lévy characterization theorem says that $X$ is distributed as a Brownian motion if it is almost surely continuous and both $X_t$ and $X_t^2 - t$ are martingales.Footnote^{23} In fact, *local* martingales.^{✖} A measure on path space that satisfies this is said to satisfy the martingale problem characterizing Brownian motion.

What does it mean that $X_t$ (or $X_t^2 - t$) is a martingale? Roughly speaking, this means that given the history of $X$ up to time $s$, the expected value of its future location $X_t$ (for $t > s$) is exactly $X_s$. This is like a fair gambling system in which your future expected profit is always zero. Martingales are essentially a particular class of centered noise.

Martingale problems exist for general classes of SDEs and are often very useful for proving convergence results. For instance, to show that a discrete time Markov chain converges to an SDE (e.g., a random walk converges to Brownian motion), one can demonstrate that the discrete chain satisfies a discrete version of the SDE’s martingale problem. Then, provided one can demonstrate compactness of the measures (on the evolution of the Markov chain), all limit points must satisfy the limiting SDE’s martingale problem. This generally proves uniqueness of the limit points and, hence, convergence.
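For the simple random walk $S_{n+1} = S_n \pm 1$ (each with probability $\tfrac12$), the discrete analogue of Lévy's two martingales can be verified by direct computation. The snippet below checks the one-step conditional expectations $\mathbb{E}[S_{n+1} \mid S_n] = S_n$ and $\mathbb{E}[S_{n+1}^2 - (n+1) \mid S_n] = S_n^2 - n$ exactly:

```python
def cond_exp_next(s):
    """E[S_{n+1} | S_n = s] for +-1 steps, each with probability 1/2."""
    return 0.5 * (s + 1) + 0.5 * (s - 1)

def cond_exp_next_sq(s):
    """E[S_{n+1}^2 | S_n = s]; equals s^2 + 1."""
    return 0.5 * (s + 1) ** 2 + 0.5 * (s - 1) ** 2

for s in (-3, 0, 2, 10):
    assert cond_exp_next(s) == s                         # S_n is a martingale
    for n in (0, 5):
        # S_n^2 - n is a martingale: E[S_{n+1}^2 - (n+1)] = s^2 - n
        assert cond_exp_next_sq(s) - (n + 1) == s ** 2 - n
print("discrete martingale problem verified")
```

This is exactly the "discrete version of the SDE's martingale problem" used in convergence arguments: under diffusive rescaling, these two identities pass to Lévy's characterization of Brownian motion.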

Weak solutions to linear SPDEs can also be characterized in terms of martingale problems. Let us describe how this works for the multiplicative noise stochastic heat equation Equation 3.3, recalled here:Footnote^{24} This equation also admits a strong solution which can be written as a *chaos series* of multiple stochastic integrals against the space-time white noise $\xi$.^{✖}

$$\partial_t Z = \partial_x^2 Z + Z \xi.$$

Let us write $Z = Z(t,x)$ and think of its law as a measure on $C\big([0,\infty), C(\mathbb{R})\big)$ (continuous maps from $[0,\infty)$ to continuous spatial functions). For any test function $\varphi$, write $Z_t(\varphi) := \int Z(t,x)\, \varphi(x)\,\mathrm{d}x$. With this notation, define the processes

$$N_t(\varphi) := Z_t(\varphi) - Z_0(\varphi) - \int_0^t Z_s(\varphi'')\,\mathrm{d}s, \qquad \Lambda_t(\varphi) := N_t(\varphi)^2 - \int_0^t \!\int Z(s,x)^2\, \varphi(x)^2\,\mathrm{d}x\,\mathrm{d}s.$$

We say that $Z$ satisfies the martingale problem for the multiplicative noise stochastic heat equation if both $N_t(\varphi)$ and $\Lambda_t(\varphi)$ are (local) martingales for all test functions $\varphi$. Any $Z$ that satisfies this is a weak solution; see for instance Reference BG97, Definition 4.10. Just as martingale problems are useful in proving convergence of Markov chains to SDEs, so too can they be used in SPDE convergence proofs; see Sections 6.2 and 6.3 for some examples where this type of martingale problem has been used for such a purpose.

It is generally hard to formulate a martingale problem characterization for weak solutions to singular nonlinear SPDEs. For the KPZ equation (in one spatial dimension) one can use the Hopf–Cole transform and define $h := \log Z$ as a Hopf–Cole solution to the KPZ equation if $Z$ is a solution to the multiplicative noise stochastic heat equation. This notion of solution agrees with those discussed earlier in this text. However, such linearizing transformations are uncommon, and this should be thought of as a rather useful trick, not a general theory.
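To indicate formally where the Hopf–Cole transform comes from, here is the classical chain-rule computation (a heuristic sketch: it treats the noise as if it were smooth and ignores the Itô correction, which is precisely where the renormalization of the KPZ equation enters):

```latex
% If Z solves the multiplicative noise stochastic heat equation
%   \partial_t Z = \tfrac12 \partial_x^2 Z + Z \xi ,
% then h := \log Z formally satisfies, using
%   \partial_x^2 Z = e^h \big( \partial_x^2 h + (\partial_x h)^2 \big):
\partial_t h \;=\; \frac{\partial_t Z}{Z}
             \;=\; \frac{\tfrac12\,\partial_x^2 Z}{Z} + \xi
             \;=\; \tfrac12\,\partial_x^2 h + \tfrac12\,(\partial_x h)^2 + \xi .
```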

Remarkably, for the stochastic Burgers equation (which is formally the equation satisfied by $u := \partial_x h$, the spatial derivative of the solution to the KPZ equation), Reference GJ14 found a way to formulate a martingale problem characterization, and Reference GP17a (with a slightly improved formulation) matched the solution to this martingale problem to the Hopf–Cole solution (see Equation 3.3), thereby showing uniqueness of the solution to this martingale problem; they call it an *energy solution*. There were some limitations to this notion of solution, namely it only worked for particular types of initial data. Very recently, however, Reference GP18c has generalized the notion of energy solution to system configurations with finite entropy with respect to stationarity, and Reference Yan18 has extended this method to include more general initial data such as flat initial data. It has proved to be quite useful in demonstrating convergence results, as we explain later in Section 6.4. Finally, let us mention that very recently Reference GP18a developed a martingale approach for a class of singular stochastic PDEs of Burgers type, including fractional and multicomponent Burgers equations.

Let us end this discussion by mentioning (without any explanation) another powerful approach to defining weak solutions of SPDEs—Dirichlet forms. For example, for the $\Phi^4$ equation in two spatial dimensions, before Da Prato and Debussche constructed their strong solution in Reference DPD03, the paper Reference AR91 constructed a weak solution via Dirichlet forms (which involves significant functional analysis). For a comprehensive discussion of this topic we refer to the book Reference FOT11 and the references therein.

## 5. Renormalization in physical models

Let us take stock of what we have learned so far. In Section 2 we observed that (at least in the linear case) SPDEs arise as scaling limits for microscopic models of physical systems. In Section 3 we introduced a number of nonlinear SPDEs and claimed that they model various interesting physical systems. However, before trying to justify that claim, we had to confront the challenge of well-posedness, namely, how to make sense of what it means to “solve” these equations. Section 4 surveyed the main techniques for doing this.

In defining a solution, we mollified (or smoothed) the noise and defined the solution through a limit transition. From that perspective, it is reasonable to hope that the same methods can be applied to other types of regularizations of the noise, or equation—for instance to show that discrete systems converge to the SPDEs. We will address this further in Section 6.

In Section 4 we found that besides regularizing the noise, we also had to introduce certain renormalizations to our equations for them to admit limits. At first glance this tweaking of the equation may seem a bit suspect. In this section we will explain how these renormalizations have concrete physical meaning, thus justifying our definitions. For instance, a diverging renormalization constant may relate to a tuning for a microscopic system of the overall scale, reference frame, temperature, or other physically meaningful parameters. We will focus our discussion on two systems: the dynamical $\Phi^4$ equation and the parabolic Anderson model. For the KPZ equation, we save this discussion until Section 6, where we will also highlight the notion of universality.

### 5.1. The $\Phi^4_2$ equation

Let us consider the example of a magnet near its critical temperature from Section 3. Many mathematical models have been proposed to describe various behaviors of magnetic systems. Here we investigate one particular example called the *Kac–Ising model*. (The Ising model was introduced in 1920 by Lenz and named after his student Ising, who showed that in one spatial dimension it does not admit any phase transition. That original model involves nearest-neighbor pair interactions of $+1$ and $-1$ spins. There are many generalizations of this model besides the long-range Kac–Ising model we consider here. For instance, one can consider the higher spin versions with $q$ different types of spins and interactions which reward equal spins along edges; this is known as the $q$-state Potts model.)

We define the model in two dimensions. Denote by $\Lambda_\varepsilon$ a large two-dimensional discrete torus of “radius” $\varepsilon^{-1}$ (we introduce $\varepsilon$ here so as to later take it to zero), which represents the space in which our magnetic material lives. Each site $x \in \Lambda_\varepsilon$ is decorated with a spin $\sigma(x)$ which for simplicity is assumed to take values in $\{-1,+1\}$. Denote by $\Sigma_\varepsilon$ the set of spin configurations on $\Lambda_\varepsilon$. For a spin configuration $\sigma$, we define the Hamiltonian as

$$H_\gamma(\sigma) := -\frac{1}{2} \sum_{x,y \in \Lambda_\varepsilon} \kappa_\gamma(x-y)\,\sigma(x)\,\sigma(y),$$

where $\kappa_\gamma$ is a nonnegative function supported on a ball of radius $\gamma^{-1}$ which integrates to $1$. (A concrete choice of the interaction kernel is to set $\kappa_\gamma(x) := c_\gamma\,\gamma^2\,\kappa(\gamma x)$, where $\kappa$ is a smooth, nonnegative function with compact support and the constant $c_\gamma$ is chosen to ensure that $\sum_{x} \kappa_\gamma(x) = 1$.) Then for any *inverse temperature* $\beta > 0$ we can define the *Gibbs measure* on $\Sigma_\varepsilon$ as

$$\mu_\beta(\sigma) := \frac{1}{\mathcal{Z}_\beta}\, e^{-\beta H_\gamma(\sigma)},$$

where the normalizing constant $\mathcal{Z}_\beta$ (the *partition function*) makes $\mu_\beta$ into a probability measure.

The measure $\mu_\beta$ is known as an equilibrium measure since the probability of finding a configuration is proportional to the exponential of $-\beta$ times the energy of that configuration. It is also known as equilibrium because it arises as the equilibrium (or stationary, or invariant) measure for various simple, local stochastic dynamics on the configuration space. We will consider one such example known as *Glauber dynamics* Reference Gla63. For $x \in \Lambda_\varepsilon$, let $\sigma^x$ denote the spin configuration that coincides with $\sigma$ except that the spin at position $x$ is flipped: $\sigma^x(x) = -\sigma(x)$. The Glauber dynamic is the following continuous time Markov process: for each $x \in \Lambda_\varepsilon$, the configuration $\sigma$ is updated to $\sigma^x$ at rate

$$c(\sigma, x) := \frac{e^{-\beta H_\gamma(\sigma^x)}}{e^{-\beta H_\gamma(\sigma)} + e^{-\beta H_\gamma(\sigma^x)}},$$

where $H_\gamma$ is the Hamiltonian defined above. (For those not used to continuous time Markov processes, let us explain this more precisely. Starting from some configuration $\sigma$, to each $x$ we associate independent exponentially distributed random variables of rate $c(\sigma,x)$ (i.e., with mean inverse to the rate). One then compares all of these random variables, and for the $x$ whose random variable is minimal, the configuration updates to $\sigma^x$. The time at which this occurs is the value of the associated random variable. From that point on, one repeats the whole story, choosing new exponential random variables with rates given by the updated configuration. Due to the “memoryless” property of exponential random variables (i.e., for $T$ an exponential random variable of rate 1, conditional on $T > t$, the law of $T - t$ is that of a rate 1 exponential random variable), this constructs a continuous time Markov process.)

Once an update occurs for some $x$, all rates are recalculated relative to the new configuration. It is standard to show that the measure $\mu_\beta$ is the unique invariant measure for the Glauber dynamic, meaning that from any starting measure, the distribution of the configuration eventually converges to $\mu_\beta$. Likewise, if started according to $\mu_\beta$, then the distribution at any later time will still be $\mu_\beta$.
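To make the invariance mechanism concrete, here is a minimal toy implementation (our own sketch, with an arbitrary kernel, torus size, and temperature, not code from any reference): it computes Kac–Ising Gibbs weights on a tiny one-dimensional torus and exhaustively verifies the detailed balance relation $\mu_\beta(\sigma)\,c(\sigma,x) = \mu_\beta(\sigma^x)\,c(\sigma^x,x)$, which is the standard reason the Gibbs measure is invariant for heat-bath Glauber dynamics.

```python
import math
from itertools import product

# Toy sketch: Kac-Ising Gibbs weights and heat-bath (Glauber) flip rates
# on a tiny one-dimensional torus of N sites, with exhaustive verification
# of detailed balance. All parameter choices here are arbitrary.

N = 4        # number of sites on the discrete torus
BETA = 0.7   # inverse temperature

def kernel(d):
    # toy nonnegative finite-range interaction kernel on the torus
    # (normalization is irrelevant for detailed balance)
    d = min(d % N, -d % N)
    return 1.0 if d <= 1 else 0.0

def hamiltonian(sigma):
    # H(sigma) = -1/2 * sum_{x,y} kappa(x - y) * sigma(x) * sigma(y)
    return -0.5 * sum(kernel(x - y) * sigma[x] * sigma[y]
                      for x in range(N) for y in range(N))

def weight(sigma):
    # unnormalized Gibbs weight exp(-beta * H(sigma))
    return math.exp(-BETA * hamiltonian(sigma))

def flip(sigma, x):
    # sigma^x: the configuration with the spin at site x flipped
    return sigma[:x] + (-sigma[x],) + sigma[x + 1:]

def rate(sigma, x):
    # heat-bath rate for the update sigma -> sigma^x
    w, wx = weight(sigma), weight(flip(sigma, x))
    return wx / (w + wx)

# Detailed balance: weight(s) * rate(s, x) == weight(s^x) * rate(s^x, x),
# since both sides equal w * wx / (w + wx), symmetric in s and s^x.
for sigma in product((-1, 1), repeat=N):
    for x in range(N):
        lhs = weight(sigma) * rate(sigma, x)
        rhs = weight(flip(sigma, x)) * rate(flip(sigma, x), x)
        assert abs(lhs - rhs) < 1e-12
```

Detailed balance holds exactly here because $\mu_\beta(\sigma)\,c(\sigma,x)$ and $\mu_\beta(\sigma^x)\,c(\sigma^x,x)$ both reduce to the same expression, symmetric under exchanging $\sigma$ and $\sigma^x$; summing the relation over $x$ shows $\mu_\beta$ is stationary for the dynamic.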

Figure 5.1 illustrates the dynamics where, for each fixed time $t$, one has a spin configuration $\sigma_t$. We would like to observe a scaling limit of the system on a large distance scale of $\varepsilon^{-1}$ and a long time scale of $\varepsilon^{-2}$. At larger scales, the spins oscillating between $\pm 1$ would yield a field in a very weak topology; it will be more convenient to consider an averaged field

$$X_t(x) := \sum_{y \in \Lambda_\varepsilon} \kappa_\gamma(x-y)\,\sigma_t(y).$$

(Using the interaction kernel $\kappa_\gamma$ to average out the field is merely a matter of convenience, and it will lead to a clean form of Equation 5.5. Also, convergence of the unaveraged field follows a fortiori.)

We may also abuse notation and write $X_t$, suppressing the explicit dependence on $\varepsilon$ and $\gamma$.

#### Remark 5.1.

In order to prove an SPDE limit, we first write down a discretized SPDE. Generally, this involves a coupled system of stochastic differential equations driven by martingales (recall the brief discussion from Section 4.3). Without delving deeply into details of the theory of Markov processes, let us illustrate this with a simple example. Our applications of this general idea will be more involved, though we will avoid going into details there also. Consider a continuous time random walk $X_t$ on $\mathbb{Z}$ where a left jump (by 1) occurs at rate $\ell$ and a right jump (by 1) occurs at rate $r$. For any function $f : \mathbb{Z} \to \mathbb{R}$, the expected value $u(t,x) := \mathbb{E}_x\big[f(X_t)\big]$ (where $\mathbb{E}_x$ means the expected value assuming initial data $X_0 = x$) satisfies the system of ODEs $\partial_t u(t,x) = (L u)(t,x)$, where the operator $L$ acts on functions $f : \mathbb{Z} \to \mathbb{R}$ as

$$(Lf)(x) := \ell\,\big(f(x-1) - f(x)\big) + r\,\big(f(x+1) - f(x)\big).$$

Without taking the expectations, $f(X_t) - f(X_0) - \int_0^t (Lf)(X_s)\,ds$ is a martingale.
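The generator relation $\partial_t u = Lu$ can be checked numerically in a few lines (a self-contained sketch in our own notation, posed on a finite cycle rather than all of $\mathbb{Z}$ so that the ODE system is finite; the rates and test function are arbitrary choices):

```python
# Continuous-time random walk on the cycle Z/nZ: left jumps at rate l,
# right jumps at rate r. The generator L acts on u: Z/nZ -> R by
#   (L u)(x) = l*(u(x-1) - u(x)) + r*(u(x+1) - u(x)).
# We approximate u(t, .) = E_.[f(X_t)] by explicit Euler for du/dt = L u
# and check that the time derivative of u indeed matches L u.

n, l, r = 12, 0.5, 1.5  # cycle size and jump rates (arbitrary)

def apply_L(u):
    """Apply the generator to a function on the cycle."""
    return [l * (u[(x - 1) % n] - u[x]) + r * (u[(x + 1) % n] - u[x])
            for x in range(n)]

def evolve(u, t, steps):
    """Crude explicit-Euler approximation of e^{tL} u."""
    h = t / steps
    for _ in range(steps):
        Lu = apply_L(u)
        u = [u[x] + h * Lu[x] for x in range(n)]
    return u

f = [float((x * x) % 7) for x in range(n)]  # an arbitrary test function
u1 = evolve(f[:], 1.0, 20000)               # u(1, .) ~ E_.[f(X_1)]

# d/dt u at t = 1 should equal (L u)(1, .), up to O(h) error:
h = 1e-4
u2 = evolve(u1[:], h, 100)
Lu1 = apply_L(u1)
for x in range(n):
    assert abs((u2[x] - u1[x]) / h - Lu1[x]) < 1e-2

# Total mass sum_x u(t, x) is conserved, since the rows of L sum to zero:
assert abs(sum(u1) - sum(f)) < 1e-8
```

Conservation of total mass reflects the fact that the columns of the transition semigroup $e^{tL}$ sum to one; for $\ell = r$ the operator $L$ is a discrete Laplacian, and under diffusive rescaling the walk converges to Brownian motion.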