1 Introduction

Lévy trees arise as a natural generalization of the continuum trees defined by Aldous [8]. They are located at the intersection of several important fields: combinatorics of large discrete trees, Lévy processes and branching processes. Consider a branching mechanism \(\psi \), that is, a function of the form

$$\begin{aligned} \psi (\lambda ) = \alpha \lambda + \beta \lambda ^2 + \int _{(0,+\infty )} \left( \mathrm{e}^{-\lambda x} -1 + \lambda x{\mathbf{1}}_{\{x<1\}}\right) \Pi (dx) \end{aligned}$$
(1)

with \(\alpha \in \mathbb R , \beta \ge 0, \Pi \) a \(\sigma \)-finite measure on \((0,\infty )\) such that \(\int _{(0,+\infty )} (1\wedge x^2)\ \Pi (dx)<+\infty \). In the (sub)critical case \(\psi ^{\prime }(0)\ge 0\), Le Gall and Le Jan [25] defined a continuum tree structure, which can be described by a tree \(\mathcal{T }\), for the genealogy of a population whose total size is given by a continuous-state branching process (CSBP) with branching mechanism \(\psi \). We will consider the distribution \(\mathbb{P }_r^\psi (d\mathcal{T })\) of this Lévy tree when the CSBP starts at mass \(r>0\), or its excursion measure \(\mathbb{N }^\psi [d\mathcal{T }]\), when the CSBP is distributed under its canonical measure. The \(\psi \)-Lévy tree possesses several striking features as pointed out in the work of Duquesne and Le Gall [13, 14]. For instance, the branching nodes can only be of degree 3 (binary branching) if \(\beta >0\) or of infinite degree if \(\Pi \ne 0\). Furthermore, there exists a “mass” measure \(\mathbf{m}^\mathcal{T }\) on the leaves of \(\mathcal{T }\), whose total mass corresponds to the total population size \(\sigma =\mathbf{m}^\mathcal{T }(\mathcal{T })\) of the CSBP. We will also consider the extinction time of the CSBP which corresponds to the height \(H_{\mathrm{max}}(\mathcal{T })\) of the tree \(\mathcal{T }\). The results can be extended to the super-critical case, using a Girsanov transformation given by Abraham and Delmas [2].

In [2], a decreasing continuum tree-valued process is defined using the so-called pruning procedure of Lévy trees introduced in Abraham, Delmas and Voisin [7]. By marking a \(\psi \)-Lévy tree with two different kinds of marks (the first ones lying on the skeleton of the tree, the other ones on the nodes of infinite degree), one can prune the tree by throwing away all the points having a mark on their ancestral line, that is, the branch connecting them to the root. The main result of [7] is that the remaining tree is still a Lévy tree, with branching mechanism related to \(\psi \). The idea of [2] is to consider a particular pruning with an intensity depending on a parameter \(\theta \), so that the corresponding branching mechanism \(\psi _\theta \) is \(\psi \) shifted by \(\theta \):

$$\begin{aligned} \psi _\theta (\lambda )=\psi (\theta +\lambda )- \psi (\theta ). \end{aligned}$$
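
As a simple illustration (a worked example under the assumption \(\alpha =0\), \(\Pi =0\) and \(\beta >0\), not a result of this paper), take the quadratic branching mechanism \(\psi (\lambda )=\beta \lambda ^2\). A direct expansion gives

$$\begin{aligned} \psi _\theta (\lambda )=\beta (\theta +\lambda )^2-\beta \theta ^2 = 2\beta \theta \lambda + \beta \lambda ^2, \end{aligned}$$

so in this case the shift only modifies the linear coefficient: \(\psi _\theta ^{\prime }(0)=2\beta \theta \), which is non-negative for \(\theta \ge 0\) ((sub)critical case) and negative for \(\theta <0\) (super-critical case).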

Letting \(\theta \) vary makes it possible to define a decreasing tree-valued Markov process \((\mathcal{T }_\theta ,\ \theta \in \Theta ^\psi )\), with \(\Theta ^\psi \subset \mathbb R \) the set of \(\theta \) for which \(\psi _\theta \) is well-defined, and such that \(\mathcal{T }_\theta \) is distributed according to \(\mathbb{N }^{\psi _\theta }\). If we write \(\sigma _\theta =\mathbf{m}^{\mathcal{T }_\theta }(\mathcal{T }_\theta )\) for the total mass of \(\mathcal{T }_\theta \), then the process \((\sigma _\theta ,\ \theta \in \Theta ^\psi )\) is a pure-jump process. The case \(\Pi =0\) was studied by Aldous and Pitman [9]. The time-reversed tree-valued process is also a Markov process which defines a growing tree process. Let us mention that the same kind of ideas have been used by Aldous and Pitman [10] and by Abraham et al. [5] in the framework of Galton–Watson trees to define growing discrete tree-valued Markov processes.

In the discrete framework of [5], it is possible to define the infinitesimal transition rates of the growing tree process. In [19], Evans and Winter define another continuum tree-valued process using a prune and re-graft procedure. This process is reversible with respect to the law of Aldous’s continuum random tree and its infinitesimal transitions are described using the theory of Dirichlet forms.

In this paper, we describe the infinitesimal behavior of the growing continuum tree-valued process, which is \((\mathcal{T }_\theta , \theta \in \Theta ^\psi )\) seen backwards in time. The Special Markov Property in [7] describes only two-dimensional distributions and hence the transition probabilities but, since the space of real trees is not locally compact, we cannot use the theory of infinitesimal generators to describe its infinitesimal transitions. Dirichlet forms cannot be used either since the process is not symmetric (it is increasing). However, it is a pure-jump process and our first main result shows that the infinitesimal transitions of the process can be described using a random point process of trees which are grafted one by one on the leaves of the growing tree. More precisely, let \(\{\theta _j,\ j\in J\}\) be the set of jumping times of the mass process \( (\sigma _\theta ,\ \theta \in \Theta ^\psi )\). Then, informally, at time \(\theta _j\), a tree \(\mathcal{T }^j\) distributed according to \({\mathbf{N}}^{\psi _{\theta _j}}[\mathcal{T }\in \cdot ]\), with:

$$\begin{aligned} {\mathbf{N}}^{\psi _\theta }[\mathcal{T }\in \cdot ] = 2\beta \mathbb{N }^{\psi _\theta }[\mathcal{T }\in \cdot ] + \int _{(0,+\infty )} \Pi (dr) r \mathrm{e}^{-\theta r} \mathbb{P }_r^{\psi _\theta }(\mathcal{T }\in \cdot ), \end{aligned}$$

is grafted at \(x_j\), a leaf of \(\mathcal{T }_{\theta _j}\) chosen at random (according to the mass measure \(\mathbf{m}^{\mathcal{T }_{\theta _j}}\)). We also prove that the random point measure

$$\begin{aligned} \mathcal{N }=\sum _{j\in J}\delta _{(x_j,\mathcal{T }^j,\theta _j)} \end{aligned}$$

has predictable compensator

$$\begin{aligned} \mathbf{m}^{\mathcal{T }_\theta }(dx) {\mathbf{N}}^{\psi _\theta }[d\mathcal{T }]\, {\mathbf{1}}_{\Theta ^\psi }(\theta )\,d\theta \end{aligned}$$

with respect to the backwards in time natural filtration of the process (Corollary 3.4).

The precise statement requires the introduction of the set of locally compact weighted real trees endowed with a Gromov–Hausdorff–Prokhorov distance. Therefore, we will assume that Lévy trees are locally compact, which corresponds to the Grey condition: \(\int ^{+\infty }\frac{du}{\psi (u)}<\infty \). In the (sub)critical case this implies that the corresponding height process of the Lévy tree is continuous and that the tree is compact. However, the tree-valued process is defined in [7] without this assumption and we conjecture that the jump representation of the tree-valued Markov process holds without this assumption.
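
To make the Grey condition concrete, here is a short check in two illustrative cases (both chosen for this example). In the quadratic case \(\psi (u)=\beta u^2\) with \(\beta >0\), we have

$$\begin{aligned} \int _1^{+\infty }\frac{du}{\beta u^2} = \frac{1}{\beta } <\infty , \end{aligned}$$

so the Grey condition holds. On the other hand, if \(\beta =0\) and \(\Pi \) is a finite measure, then \(|\mathrm{e}^{-ux}-1|\le 1\) and \(\int _{(0,1)} x\, \Pi (dx)<\infty \), so that \(\psi (u)\le C(1+u)\) for some constant \(C\) and all \(u\ge 0\); the integral \(\int ^{+\infty } du/\psi (u)\) then diverges and the Grey condition fails.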

The representation using the random point measure allows us to describe the ascension time or explosion time (when it is defined)

$$\begin{aligned} A = \inf \left\{ \theta \in \Theta ^\psi \!,\ \sigma _\theta < \infty \right\} \end{aligned}$$

as \(\inf \{\theta _j,\ \mathbf{m}^{\mathcal{T }^j}(\mathcal{T }^j)=\infty \}\), that is, the first time (backwards in time) at which a tree with infinite mass is grafted. This representation is also used in Abraham and Delmas [3, 4], respectively for the asymptotics of the records on discrete subtrees of the continuum random tree and for the study of the record process on general Lévy trees.

This structure, somewhat similar to the Poissonian structure of the jumps of a Lévy process (although in our case the structure is neither homogeneous nor independent), allows us to study the time of first passage of the growing tree-valued process above a given height:

$$\begin{aligned} A_h = \sup \left\{ \theta \in \Theta ^\psi \!,\ H_{\mathrm{max}} (\mathcal{T }_\theta ) > h\right\} \!. \end{aligned}$$

We give the joint distribution of the ascension time and the exit time \((A,A_h)\), see Proposition 4.3. In particular, \(A_h\) goes to \(A\) as \(h\) goes to infinity: for \(h\) very large, with high probability the process up to \(A\) will not have crossed height \(h\), so that the first jump to cross height \(h\) will correspond to the grafting time of the first infinite tree, which happens at ascension time \(A\).

We also give in Theorem 4.6 the joint distribution of \((\mathcal{T }_{A_h-}, \mathcal{T }_{A_h})\), the trees just after and just before the jumping time \(A_h\). We further give a spinal decomposition of \(\mathcal{T }_{A_h}\) along the ancestral branch of the leaf on which the overshooting tree is grafted, which is similar to the classical Bismut decomposition of Lévy trees. Conditionally on this ancestral branch, the overshooting tree is then distributed as a regular Lévy tree, conditioned on being high enough to perform the overshooting. This generalizes results in [2] about the ascension time of the tree-valued process. Note that this approach could easily be generalized to study spatial exit times of growing families of super-Brownian motions.

All the results of this paper are stated in terms of real trees and not in terms of the height process or the exploration process that encode the tree as in [7]. For this purpose, we define in Sect. 2.1 the state space of rooted real trees with a mass measure (here called weighted trees or w-trees) endowed with the so-called Gromov–Hausdorff–Prokhorov metric defined by Abraham, Delmas and Hoscheit [6] which is a slight generalization of the Gromov–Hausdorff metric on the space of metric spaces, and also a generalization of the Gromov–Prokhorov topology of [20] on the space of compact metric spaces endowed with a probability measure.

The paper is organized as follows. In Sect. 2, we introduce all the material for our study: the state space of weighted real trees and the metric on it, see Sect. 2.1; the definition of (sub)critical Lévy trees via the height process; the extension of the definition to super-critical Lévy trees; the pruning procedure of Lévy trees. In Sect. 3, we recall the definition of the growing tree-valued process by the pruning procedure as in [7] in the setting of real trees and give another construction using the grafting of trees given by random point processes. We prove in Theorem 3.2 that the two definitions agree and then give in Corollary 3.4 the random point measure description. Section 4 is devoted to the application of this construction to the distribution of the tree at the time it overshoots a given height and just before, see Theorem 4.6.

2 The pruning of Lévy trees

2.1 Real trees

The first definitions of continuum random trees go back to Aldous [8]. Later, Evans, Pitman and Winter [18] used the framework of real trees, previously applied in the context of geometric group theory, to describe continuum trees. We refer to [17, 24] for a general presentation of random real trees. Informally, real trees are metric spaces without loops, locally isometric to the real line.

More precisely, a metric space \((T,d)\) is a real tree (or \(\mathbb R \) -tree) if the following properties are satisfied:

  (1) For every \(s,t\in T\), there is a unique isometric map \(f_{s,t}\) from \([0,d(s,t)]\) to \(T\) such that \(f_{s,t}(0)=s\) and \(f_{s,t}(d(s,t))=t\).

  (2) For every \(s,t\in T\), if \(q\) is a continuous injective map from \([0,1]\) to \(T\) such that \(q(0)=s\) and \(q(1)=t\), then \(q([0,1])=f_{s,t}([0,d(s,t)])\).

We say that a real tree is rooted if there is a distinguished vertex \(\varnothing \), which will be called the root of \(T\). Such a real tree is denoted by \((T,d,\varnothing )\). If \(s,t\in T\), we will denote by \([\![s,t ]\!]\) the range of the isometric map \(f_{s,t}\) described above. We will also write \([\![s,t [\![\) for the set \([\![s,t ]\!]{\setminus }\{t\}\). We give some vocabulary on real trees, which will be used constantly when dealing with Lévy trees. Let \(T\) be a real tree. If \(x\in T\), we will call degree of \(x\), and denote by \(n(x)\), the number of connected components of the set \(T{\setminus }\{x\}\). In a general tree, this number can be infinite, and this will actually be the case with Lévy trees. The set of leaves is defined as

$$\begin{aligned} \mathrm{Lf}(T)=\left\{ x\in T{\setminus } \{\varnothing \},\ n(x)=1\right\} \!. \end{aligned}$$

If \(n(x)\ge 3\), we say that \(x\) is a branching point. The set of branching points will be noted \(\mathrm{Br}(T)\). Among those, there is the set of infinite branching points, defined by

$$\begin{aligned} \mathrm{Br}_\infty (T) = \left\{ x\in \mathrm{Br}(T),\ n(x) = \infty \right\} \!. \end{aligned}$$

Finally, the skeleton of a real tree, noted \(\mathrm{Sk}(T)\), is the set of points in the tree that aren’t leaves. It should be noted, following Evans, Pitman and Winter [18], that the trace of the Borel \(\sigma \)-field of \(T\) on \(\mathrm{Sk}(T)\) is generated by the sets \([\![s,s^{\prime } ]\!],\ s,s^{\prime } \in \mathrm{Sk}(T)\). Hence, it is possible to define a \(\sigma \)-finite Borel measure \(\ell ^{T}\) on \(T\), such that

$$\begin{aligned} \ell ^{T}(\mathrm{Lf}(T)) = 0 \quad \text{ and }\quad \ell ^{T}([\![s,s^{\prime } ]\!])=d(s,s^{\prime }). \end{aligned}$$

This measure will be called length measure on \(T\). If \(x,y\) are two points in a rooted real tree \((T,d,\varnothing )\), then there is a unique point \(z\in T\), called the Most Recent Common Ancestor (MRCA) of \(x\) and \(y\) such that \([\![\varnothing ,x ]\!]\cap [\![\varnothing ,y ]\!]= [\![\varnothing ,z ]\!]\). This vocabulary is an illustration of the genealogical vision of real trees, in which the root is seen as the ancestor of the population represented by the tree. Similarly, if \(x\in T\), we will call height of \(x\), and note by \(H_x\) the distance \(d(\varnothing ,x)\) to the root. The function \(x\mapsto H_x\) is continuous on \(T\), and we define the height of \(T\) by

$$\begin{aligned} H_{\mathrm{max}}(T)=\sup _{x\in T} H_x. \end{aligned}$$

2.2 Gromov–Hausdorff–Prokhorov metric

2.2.1 Rooted weighted metric spaces

This section is inspired by [15], but for the fact that we include measures on the trees, in the spirit of [27]. The detailed proofs of the results stated here can be found in [6].

Let \((X,d^X)\) be a Polish metric space. For \(A,B\in \mathcal B (X)\), we set

$$\begin{aligned} d_\mathrm{H}^X(A,B)= \inf \left\{ \varepsilon >0,\ A\subset B^\varepsilon \ \mathrm{and}\ B\subset A^\varepsilon \right\} \!, \end{aligned}$$

the Hausdorff distance between \(A\) and \(B\), where \(A^\varepsilon = \{ x\in X,\ \inf _{y\in A} d^X(x,y) < \varepsilon \}\) is the \(\varepsilon \)-halo set of \(A\). If \(X\) is compact, then the space of compact subsets of \(X\), endowed with the Hausdorff distance, is compact, see Theorem 7.3.8 in [12].
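
For finite sets, the infimum over \(\varepsilon \) in the definition above reduces to a max-min computation. The following Python sketch illustrates this for finite subsets of the real line; the function name `hausdorff_distance` and the restriction to subsets of \(\mathbb R \) are assumptions made for this example only.

```python
from typing import Sequence


def hausdorff_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Hausdorff distance between two finite, non-empty subsets of the real line.

    For finite sets, inf{eps > 0 : A in B^eps and B in A^eps} equals the larger
    of the two one-sided quantities max_{x in A} dist(x, B) and
    max_{y in B} dist(y, A).
    """
    if not a or not b:
        raise ValueError("both sets must be non-empty")
    # Largest distance from a point of A to the set B.
    a_to_b = max(min(abs(x - y) for y in b) for x in a)
    # Largest distance from a point of B to the set A.
    b_to_a = max(min(abs(x - y) for x in a) for y in b)
    return max(a_to_b, b_to_a)


# Illustrative data: A = {0, 1}, B = {0, 3}.
SET_A = [0.0, 1.0]
SET_B = [0.0, 3.0]
```

Here the point 3 of `SET_B` lies at distance 2 from `SET_A`, while every point of `SET_A` lies within distance 1 of `SET_B`, so the two one-sided quantities differ and the larger one determines the distance.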

We will use the notation \(\mathcal{M }_{f}(X)\) for the space of all finite Borel measures on \(X\). If \(\mu ,\nu \in \mathcal{M }_f(X)\), we set:

$$\begin{aligned} d_\mathrm{P}^X(\mu ,\nu ) = \inf \left\{ \varepsilon >0,\ \mu (A)\le \nu (A^\varepsilon ) + \varepsilon \ \text{ and }\ \nu (A)\le \mu (A^\varepsilon ) +\varepsilon \ \text{ for every closed set } A \right\} \!, \end{aligned}$$

the Prokhorov distance between \(\mu \) and \(\nu \). It is well known that \((\mathcal{M }_f(X), d_\mathrm{P}^X)\) is a Polish metric space, and that the topology generated by \(d_\mathrm{P}^X\) is exactly the topology of weak convergence (convergence against continuous bounded functionals). If \(\Phi :X\rightarrow X^{\prime }\) is a Borel map between two Polish metric spaces and if \(\mu \) is a Borel measure on \(X\), we will denote by \(\Phi _*\mu \) the image measure on \(X^{\prime }\) defined by \(\Phi _*\mu (A)=\mu (\Phi ^{-1}(A))\), for any Borel set \(A\subset X^{\prime }\). Recall that a Borel measure is boundedly finite if the measure of any bounded Borel set is finite.
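
As an elementary illustration of the Prokhorov distance (a worked example added here, with \(x,y\in X\) arbitrary), consider two Dirac masses \(\delta _x\) and \(\delta _y\). If \(\varepsilon \ge 1\) or \(\varepsilon >d^X(x,y)\), both inequalities in the definition hold for every closed set \(A\). If \(\varepsilon <1\) and \(\varepsilon \le d^X(x,y)\), the choice \(A=\{x\}\) gives \(\delta _x(A)=1>\varepsilon =\delta _y(A^\varepsilon )+\varepsilon \). Hence

$$\begin{aligned} d_\mathrm{P}^X(\delta _x,\delta _y) = 1\wedge d^X(x,y). \end{aligned}$$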

Definition 2.1

  • A rooted weighted metric space \(\mathcal{X }= (X,d^X, \varnothing ^X,\mu ^X)\) is a metric space \((X, d^X)\) with a distinguished element \(\varnothing ^X\in X\) and a boundedly finite Borel measure \(\mu ^X\).

  • Two rooted weighted metric spaces \(\mathcal{X }\!=\!(X,d^X,\varnothing ^X,\mu ^X)\) and \(\mathcal{X }^{\prime }\!=\!(X^{\prime },d^{X^{\prime }},\varnothing ^{X^{\prime }},\mu ^{X^{\prime }})\) are said to be GHP-isometric if there exists an isometric bijection \(\Phi :X \rightarrow X^{\prime }\) such that \(\Phi (\varnothing ^X)= \varnothing ^{X^{\prime }}\) and \(\Phi _* \mu ^X = \mu ^{X^{\prime }}\).

Notice that if \((X, d^X)\) is compact, then a boundedly finite measure on \(X\) is finite and belongs to \(\mathcal{M }_f(X)\). We will now use a procedure due to Gromov [21] to compare any two compact rooted weighted metric spaces, even if they are not subspaces of the same Polish metric space.

2.2.2 Gromov–Hausdorff–Prokhorov distance for compact metric spaces

Let \(\mathcal{X }=(X,d,\varnothing ,\mu )\) and \(\mathcal{X }^{\prime }=(X^{\prime },d^{\prime },\varnothing ^{\prime },\mu ^{\prime })\) be two compact rooted weighted metric spaces, and define

$$\begin{aligned} d_{\mathrm{GHP}}^{\mathrm{c}}(\mathcal{X },\mathcal{X }^{\prime })&= \inf _{\Phi ,\Phi ^{\prime },Z} \left( d_\mathrm{H}^Z(\Phi (X),\Phi ^{\prime }(X^{\prime })) + d^Z(\Phi (\varnothing ),\Phi ^{\prime }(\varnothing ^{\prime }))\right. \nonumber \\&\qquad \qquad \left. + d_\mathrm{P}^Z(\Phi _* \mu ,\Phi _*^{\prime } \mu ^{\prime }) \right) \!, \end{aligned}$$
(2)

where the infimum is taken over all isometric embeddings \(\Phi :X\hookrightarrow Z\) and \(\Phi ^{\prime }:X^{\prime }\hookrightarrow Z\) into some common Polish metric space \((Z,d^Z)\).

Note that equation (2) does not actually define a metric, as \(d_{\mathrm{GHP}}^{\mathrm{c}}(\mathcal{X },\mathcal{X }^{\prime })=0\) if \(\mathcal{X }\) and \(\mathcal{X }^{\prime }\) are GHP-isometric. Therefore, we will consider \(\mathbb{K }\), the set of GHP-isometry classes of compact rooted weighted metric spaces, and identify a compact rooted weighted metric space with its class in \(\mathbb{K }\). Then the function \(d_{\mathrm{GHP}}^{\mathrm{c}}(\cdot ,\cdot )\) is finite on \(\mathbb{K }^2\).

Theorem 2.2

The function \(d_{\mathrm{GHP}}^{\mathrm{c}}(\cdot ,\cdot )\) defines a metric on \(\mathbb{K }\) and the space \((\mathbb{K }, d_{\mathrm{GHP}}^\mathrm{c})\) is a Polish metric space.

We will call \(d_{\mathrm{GHP}}^\mathrm{c}\) the Gromov–Hausdorff–Prokhorov metric.

2.2.3 Gromov–Hausdorff–Prokhorov distance

However, the definition of the Gromov–Hausdorff–Prokhorov metric on compact metric spaces is not yet general enough, as we want to deal with unbounded trees with \(\sigma \)-finite measures. To consider such an extension, we will consider complete and locally compact length spaces. We recall that a metric space \((X,d)\) is a length space if for every \(x,y\in X\), we have

$$\begin{aligned} d(x,y) = \inf L(\gamma ), \end{aligned}$$

where the infimum is taken over all rectifiable curves \(\gamma :[0,1]\rightarrow X\) such that \(\gamma (0)=x\) and \(\gamma (1)=y\), and where \(L(\gamma )\) is the length of the rectifiable curve \(\gamma \).

Definition 2.3

Let \(\mathbb L \) be the set of GHP-isometry classes of rooted weighted complete and locally compact length spaces and identify a rooted weighted complete and locally compact length space with its class in \(\mathbb L \).

If \(\mathcal{X }=(X,d,\varnothing ,\mu )\in \mathbb L \), then for \(r\ge 0\) we will consider its restriction to the ball of radius \(r\) centered at \(\varnothing \), \(\mathcal{X }^{(r)}=(X^{(r)}, d^{(r)}, \varnothing , \mu ^{(r)})\), where

$$\begin{aligned} X^{(r)}=\{x\in X,\ d(\varnothing ,x)\le r\}, \end{aligned}$$

the metric \(d^{(r)}\) is the restriction of \(d\) to \(X^{(r)}\), and the measure \(\mu ^{(r)}(dx)={\mathbf{1}}_{X^{(r)}} (x)\ \mu (dx)\) is the restriction of \(\mu \) to \(X^{(r)}\). Recall that the Hopf–Rinow theorem (Theorem 2.5.28 in [12]) implies that if \((X, d)\) is a complete and locally compact length space, then every closed bounded subset of \(X\) is compact. In particular, if \(\mathcal{X }\) belongs to \( \mathbb L \), then \(\mathcal{X }^{(r)}\) belongs to \(\mathbb{K }\) for all \(r\ge 0\).

We state a regularity lemma of \(d^\mathrm{c}_{\mathrm{GHP}}\) with respect to the restriction operation.

Lemma 2.4

Let \(\mathcal{X }\) and \(\mathcal{Y }\) belong to \(\mathbb L \). Then the function defined on \(\mathbb R _+\) by

$$\begin{aligned} r\mapsto d^\mathrm{c}_{\mathrm{GHP}}\left( \mathcal{X }^{(r)},\mathcal{Y }^{(r)}\right) \end{aligned}$$

is càdlàg.

This implies that the following function is well defined on \(\mathbb L ^2\):

$$\begin{aligned} d_{\mathrm{GHP}}(\mathcal{X },\mathcal{Y }) = \int _0^\infty \mathrm{e}^{-r} \left( 1 \wedge d^\mathrm{c}_{\mathrm{GHP}}\left( \mathcal{X }^{(r)},\mathcal{Y }^{(r)}\right) \right) \ dr. \end{aligned}$$
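
Two immediate consequences of this formula may help to build intuition (the radius \(R\ge 0\) below is a symbol introduced for this remark only). Since the integrand is bounded by \(\mathrm{e}^{-r}\), we always have \(d_{\mathrm{GHP}}(\mathcal{X },\mathcal{Y })\le 1\). Moreover, if \(\mathcal{X }^{(r)}\) and \(\mathcal{Y }^{(r)}\) are GHP-isometric for every \(r\le R\), then the integrand vanishes on \([0,R]\) and

$$\begin{aligned} d_{\mathrm{GHP}}(\mathcal{X },\mathcal{Y }) \le \int _R^\infty \mathrm{e}^{-r}\ dr = \mathrm{e}^{-R}. \end{aligned}$$

In other words, two spaces are close for \(d_{\mathrm{GHP}}\) as soon as they agree on a large ball around the root.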

Theorem 2.5

The function \(d_{\mathrm{GHP}}\) defines a metric on \(\mathbb L \) and the space \((\mathbb L , d_{\mathrm{GHP}})\) is a Polish metric space.

The next result implies that \(d_{\mathrm{GHP}}^\mathrm{c}\) and \(d_{\mathrm{GHP}}\) define the same topology on \(\mathbb{K }\cap \mathbb L \).

Theorem 2.6

Let \((\mathcal{X }_n, n\in \mathbb{N })\) and \(\mathcal{X }\) be elements of \(\mathbb{K }\cap \mathbb L \). Then the sequence \((\mathcal{X }_n, n\in \mathbb{N })\) converges to \(\mathcal{X }\) in \((\mathbb{K },d_{\mathrm{GHP}}^\mathrm{c})\) if and only if it converges to \(\mathcal{X }\) in \((\mathbb L ,d_{\mathrm{GHP}})\).

Remark 2.7

At this point, we should clarify the connection between the Gromov–Hausdorff–Prokhorov metric \(d_{\mathrm{GHP}}\) we introduced here and various other metrics in the literature. First of all, a very similar approach was used by Miermont [27] to define a metric on the space of compact metric spaces carrying a probability measure. On this space, the topologies generated by Miermont’s metric and by \(d_{\mathrm{GHP}}^\mathrm{c}\) coincide. As for the Gromov–Prokhorov metric introduced by Greven, Pfaffelhuber and Winter [20], it is in general neither weaker nor stronger than the \(d_{\mathrm{GHP}}\) metric. Indeed, the Gromov–Prokhorov metric does not take into account the geometrical features of the spaces (by design, it ignores sets of zero measure), which are however seen by the \(d_{\mathrm{GHP}}\) metric. For an enlightening discussion of the differences between all these points of view, see Chapter 27 of [28].

2.2.4 The space of w-trees

Note that real trees are always length spaces and that complete real trees are the only complete connected spaces that satisfy the so-called four-point condition:

$$\begin{aligned}&\forall x_1,x_2,x_3,x_4 \in X,\ d(x_1,x_2)+d(x_3,x_4)\nonumber \\&\quad \le (d(x_1,x_3)+d(x_2,x_4) ) \vee (d(x_1,x_4)+d(x_2,x_3)). \end{aligned}$$
(3)
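
On a finite metric space, the four-point condition (3) can be checked by brute force. The sketch below is only an illustration: the function name `satisfies_four_point`, the tolerance `tol` and the two distance matrices are assumptions made for this example. `STAR` is the metric of a star-shaped tree whose three leaves lie at distances 1, 2 and 3 from the centre, and `CYCLE` is the graph distance on a cycle with four vertices, which contains a loop.

```python
from itertools import product
from typing import Sequence


def satisfies_four_point(d: Sequence[Sequence[float]], tol: float = 1e-12) -> bool:
    """Check the four-point condition (3) for a finite metric given as a matrix."""
    n = len(d)
    for i, j, k, l in product(range(n), repeat=4):
        lhs = d[i][j] + d[k][l]
        rhs = max(d[i][k] + d[j][l], d[i][l] + d[j][k])
        if lhs > rhs + tol:
            return False
    return True


# Index 0 is the centre of the star, indices 1..3 are the leaves.
STAR = [
    [0, 1, 2, 3],
    [1, 0, 3, 4],
    [2, 3, 0, 5],
    [3, 4, 5, 0],
]

# Graph distance on the cycle 0 - 1 - 2 - 3 - 0 with unit edge lengths.
CYCLE = [
    [0, 1, 2, 1],
    [1, 0, 1, 2],
    [2, 1, 0, 1],
    [1, 2, 1, 0],
]
```

For `CYCLE`, the choice \(x_1=0\), \(x_2=2\), \(x_3=1\), \(x_4=3\) gives a left-hand side equal to 4 and a right-hand side equal to 2, so the condition fails, as expected for a space with a loop.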

Definition 2.8

We denote by \(\mathbb{T }\) the set of (GHP-isometry classes of) complete locally compact rooted real trees endowed with a locally finite Borel measure, in short w-trees.

We deduce the following corollary from Theorem 2.5 and the four-point condition characterization of real trees.

Corollary 2.9

The set \(\mathbb{T }\) is a closed subset of \(\mathbb L \) and \((\mathbb{T },d_{\mathrm{GHP}})\) is a Polish metric space.

Height erasing. We define the restriction operators on the space of w-trees. Let \(a\ge 0\). If \((T,d,\varnothing ,\mathbf{m})\) is a w-tree, define

$$\begin{aligned} \pi _a(T) = \{ x \in T,\ d(\varnothing ,x) \le a \} \end{aligned}$$
(4)

and let \((\pi _a(T), d^{\pi _a(T)}, \varnothing ,\mathbf{m}^{\pi _a(T)})\) be the w-tree constituted of the points of \(T\) having height lower than \(a\), where \(d^{\pi _a(T)}\) and \(\mathbf{m}^{\pi _a(T)}\) are the restrictions of \(d\) and \(\mathbf{m}\) to \(\pi _a(T)\). When there is no confusion, we will also write \(\pi _a(T)\) for \((\pi _a(T), d^{\pi _a(T)}, \varnothing ,\mathbf{m}^{\pi _a(T)})\). We will also write \(T(a)=\{ x\in T,\ d(\varnothing ,x)=a\}\) for the level set at height \(a\). We say that a w-tree \(T\) is bounded if \(\pi _a(T)=T\) for some finite \(a\). Notice that a tree \(T\) is bounded if and only if \(H_{\mathrm{max}}(T)\) is finite.

Grafting procedure. We will define in this section a procedure by which we add (graft) w-trees on an existing w-tree. More precisely, let \(T \in \mathbb{T }\) and let \(((T_i,x_i),i\in I)\) be a finite or countable family of elements of \(\mathbb{T }\times T\). We define the real tree obtained by grafting the trees \(T_i\) on \(T\) at point \(x_i\). We set \(\tilde{T} = T \sqcup \left( \bigsqcup _{i\in I} T_i{\setminus }\{\varnothing ^{T_i}\} \right) \) where the symbol \(\sqcup \) means that we choose for the sets \(T\) and \((T_i)_{i\in I}\) representatives of isometry classes in \(\mathbb{T }\) which are disjoint subsets of some common set and that we perform the disjoint union of all these sets. We set \(\varnothing ^{\tilde{T}}=\varnothing ^T\). The set \(\tilde{T}\) is endowed with the following metric \(d^{\tilde{T}}\): if \(s,t\in \tilde{T}\),

$$\begin{aligned} d^{\tilde{T}} (s,t) = {\left\{ \begin{array}{ll} d^T(s,t)\ &{} \text{ if }\ s,t\in T, \\ d^T(s,x_i)+d^{T_i}\left( \varnothing ^{T_i},t\right) \ &{} \text{ if }\ s\in T,\ t\in T_i{\setminus }\{\varnothing ^{T_i}\}, \\ d^{T_i}(s,t)\ &{} \text{ if }\ s,t\in T_i{\setminus }\{\varnothing ^{T_i}\},\\ d^T(x_i,x_j)+d^{T_j}\left( \varnothing ^{T_j},s\right) &{} \text{ if }\ i\ne j \ \text{ and }\ s\in T_j{\setminus }\{\varnothing ^{T_j}\},\ \\ \quad +d^{T_i} \left( \varnothing ^{T_i},t\right) \ &{}\quad t\in T_i{\setminus }\{\varnothing ^{T_i}\}. \end{array}\right. } \end{aligned}$$

We define the mass measure on \(\tilde{T}\) by

$$\begin{aligned} \mathbf{m}^{\tilde{T}}=\mathbf{m}^T+\sum _{i\in I}\left( {\mathbf{1}}_{ T_i{\setminus }\{\varnothing ^{T_i}\}} \mathbf{m}^{T_i}+ \mathbf{m}^{T_i}\left( \{\varnothing ^{T_i}\}\right) \delta _{x_i}\right) , \end{aligned}$$

where \(\delta _x\) is the Dirac mass at point \(x\). It is clear that the metric space \((\tilde{T},d^{\tilde{T}},\varnothing ^{\tilde{T}})\) is still a rooted complete real tree. However, it is not always true that \(\tilde{T}\) remains locally compact (it still remains a length space anyway), or, for that matter, that \(\mathbf{m}^{\tilde{T}}\) defines a locally finite measure (on \(\tilde{T}\)). So, we will have to check that \((\tilde{T},d^{\tilde{T}},\varnothing ^{\tilde{T}}, \mathbf{m}^{\tilde{T}} )\) is a w-tree in the particular cases we will consider.
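
The simplest instance of the grafting procedure may be useful to keep in mind (the specific trees below are chosen for illustration only). Let \(T=[0,1]\) and \(T_1=[0,1]\), both endowed with the usual distance, rooted at 0 and weighted by the Lebesgue measure, and let \(x_1=1/2\in T\). For \(s\in T\) and \(t^{\prime }\in T_1{\setminus }\{0\}\), the definition of \(d^{\tilde{T}}\) gives

$$\begin{aligned} d^{\tilde{T}}(s,t^{\prime }) = \left| s-\tfrac{1}{2}\right| + t^{\prime }. \end{aligned}$$

The resulting tree \(\tilde{T}\) is Y-shaped: the point \(1/2\) becomes a branching point of degree 3, the height is \(H_{\mathrm{max}}(\tilde{T})=1/2+1=3/2\), and the total mass is \(\mathbf{m}^{\tilde{T}}(\tilde{T})=2\), with no atom at \(x_1\) since \(\mathbf{m}^{T_1}(\{0\})=0\). In this finite case, \(\tilde{T}\) is clearly compact and \(\mathbf{m}^{\tilde{T}}\) is finite, so \(\tilde{T}\) is a w-tree.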

We will use the following notation:

$$\begin{aligned} \left( \tilde{T},d^{\tilde{T}},\varnothing ^{\tilde{T}}, \mathbf{m}^{\tilde{T}} \right) = T \circledast _{i\in I}(T_i,x_i) \end{aligned}$$
(5)

and write \(\tilde{T}\) instead of \((\tilde{T},d^{\tilde{T}},\varnothing ^{\tilde{T}}, \mathbf{m}^{\tilde{T}} ) \) when there is no confusion.

Real trees coded by functions. Lévy trees are natural generalizations of Aldous’s Brownian tree, where the underlying process coding for the tree (reflected Brownian motion in Aldous’s case) is replaced by a certain functional of a Lévy process, the height process. Le Gall and Le Jan [25] and Duquesne and Le Gall [14] showed how to generate random real trees using the excursions of a Lévy process above its minimum. We will briefly recall this construction, in order to introduce the pruning procedure on Lévy trees. Let us first work in a deterministic setting.

Let \(f\) be a continuous non-negative function defined on \([0,+\infty )\), with compact support, such that \(f(0)=0\). We set:

$$\begin{aligned} \sigma ^f=\sup \left\{ t,\ f(t)>0\right\} \!, \end{aligned}$$

with the convention \(\sup \varnothing =0\). Let \(d^f\) be the non-negative function defined by

$$\begin{aligned} d^f(s,t) = f(s) + f(t) - 2 \inf _{u\in [s\wedge t, s\vee t]} f(u). \end{aligned}$$

It can be easily checked that \(d^f\) is a semi-metric on \([0,\sigma ^f]\). One can define the equivalence relation associated to \(d^f\) by \(s\sim t\) if and only if \(d^f(s,t)=0\). We consider the quotient space

$$\begin{aligned} T^f=\left[ 0,\sigma ^f\right] _{/ \sim } \end{aligned}$$

and denote again by \(d^f\) the induced metric on \(T^f\). Rooting \(T^f\) at \(\varnothing ^f\), the equivalence class of 0, one can check that the space \((T^f,d^f,\varnothing ^f)\) is a compact rooted real tree. We denote by \(p^f\) the canonical projection from \([0,\sigma ^f]\) onto \(T^f\), which is extended by \(p^f(t)=\varnothing ^f\) for \(t\ge \sigma ^f\). Note that \(p^f\) is continuous. We define \(\mathbf{m}^{f}\), the mass measure on \(T^f\), as the image measure by \(p^f\) of the Lebesgue measure on \([0, \sigma ^f]\). We consider the (compact) w-tree \((T^f, d^f, \varnothing ^f, \mathbf{m}^f)\), which we will denote by \(T^f\).

It should be noticed that, if \(x\in T^f\) is an equivalence class, the common value of \(f\) on all the points in this equivalence class is exactly \(d^f(\varnothing ,x)=H_x\). Note also that, in this setting, \(H_{\mathrm{max}}(T^f)=\Vert f\Vert _\infty \) where \(\Vert f\Vert _\infty \) stands for the uniform norm of \(f\).
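
The coding of a tree by a function can be explored numerically on a sampled function. The following Python sketch is a discretized illustration under stated assumptions: the function \(f\) is only known at grid points, the infimum in \(d^f\) is replaced by a minimum over the grid, and the names `tree_distance` and `SAMPLES` are hypothetical.

```python
from typing import Sequence


def tree_distance(f: Sequence[float], s: int, t: int) -> float:
    """Discrete analogue of d^f(s, t) = f(s) + f(t) - 2 * inf_{[s ^ t, s v t]} f.

    The arguments s and t are grid indices, and the infimum is taken over the
    sampled values between them (both endpoints included).
    """
    lo, hi = min(s, t), max(s, t)
    return f[s] + f[t] - 2 * min(f[lo:hi + 1])


# Samples of a non-negative function with f(0) = 0 and two peaks of height 2.
SAMPLES = [0.0, 1.0, 2.0, 1.0, 2.0, 1.0, 0.0]
```

With these samples, the two peaks (indices 2 and 4) are at distance 2 from each other, since the path between them goes down to height 1. The indices 1 and 5 are at distance 0: they are identified in the quotient and represent the same point of the tree. The distance from index 0 to index 2 equals the sampled value 2, in line with the identity \(d^f(\varnothing ,x)=H_x\).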

We have the following elementary result (see Lemma 2.3 of [14] when dealing with the Gromov–Hausdorff metric instead of the Gromov–Hausdorff–Prokhorov metric).

Proposition 2.10

Let \(f,g\) be two compactly supported, non-negative continuous functions such that \(f(0)=g(0)=0\). Then:

$$\begin{aligned} d_{\mathrm{GHP}}^{\mathrm{c}}(T^f,T^g) \le 6 \Vert f-g \Vert _\infty + \left| \sigma ^f-\sigma ^g \right| . \end{aligned}$$
(6)

Proof

The Gromov–Hausdorff distance can be evaluated using correspondences, see [12], section 7.3. A correspondence between two metric spaces \((E_1, d_1)\) and \((E_2, d_2)\) is a subset \(\mathfrak{R }\) of \(E_1\times E_2\) such that for \(\delta \in \{1,2\}\) the projection of \(\mathfrak{R }\) on \(E_\delta \) is onto: \(\{x_\delta , \ (x_1, x_2)\in \mathfrak{R }\}=E_\delta \). The distortion of \(\mathfrak{R }\) is defined by:

$$\begin{aligned} \mathrm{dis}(\mathfrak{R })= \sup \left\{ |d_1(x_1, x_2) - d_2(y_1,y_2)|,\ (x_1, y_1)\in \mathfrak{R }, (x_2, y_2)\in \mathfrak{R }\right\} \!. \end{aligned}$$

Let \(Z=E_1 \sqcup E_2\) be the disjoint union of \(E_1\) and \(E_2\) and consider the function \(d^Z\) defined on \(Z^2\) by \(d^Z= d_\delta \) on \(E_\delta ^2\) for \(\delta \in \{1,2\}\) and for \(x_1\in E_1\), \(x_2\in E_2\):

$$\begin{aligned} d^Z(x_1, x_2)=\inf \left\{ d_1(x_1,y_1)+\frac{1}{2}\mathrm{dis}(\mathfrak{R })+ d_2(y_2, x_2),\ (y_1, y_2)\in \mathfrak{R }\right\} . \end{aligned}$$

Then if \(\mathrm{dis}(\mathfrak{R })>0\), the function \(d^Z\) is a metric on \(Z\) such that

$$\begin{aligned} d^Z_H(E_1, E_2)\le \frac{1}{2} \mathrm{dis}(\mathfrak{R }). \end{aligned}$$

Let \(f,g\) be compactly supported, non-negative continuous functions with \(f(0)=g(0)=0\). Following [14], we consider the following correspondence between \(T^f\) and \(T^g\):

$$\begin{aligned} \mathfrak{R }=\left\{ (x^f,x^g), \, x^f=p^f(t) \text{ and } x^g=p^g(t) \text{ for } \text{ some } t\ge 0\right\} \!, \end{aligned}$$

and we have \(\mathrm{dis}(\mathfrak{R })\le 4 ||f-g||_\infty \) according to the proof of Lemma 2.3 in [14]. Notice \((\varnothing ^f, \varnothing ^g)\in \mathfrak{R }\). Thus, with the notation above and \(E_1=T^f\), \(E_2=T^g\), we get:

$$\begin{aligned} d_H^Z(T^f, T^g)\le 2 ||f-g||_\infty \quad \text{ and }\quad d^Z(\varnothing ^f, \varnothing ^g)\le 2 ||f-g||_\infty . \end{aligned}$$

Then, we consider the Prokhorov distance between \(\mathbf{m}^f\) and \(\mathbf{m}^g\). Let \(A^f\) be a Borel set of \(T^f\). We set \(I=\{t\in [0, \sigma ^f], \ p^f(t)\in A^f\}\). By definition of \(\mathbf{m}^f\), we have \(\mathbf{m}^f(A^f)=\mathrm{Leb} (I)\). We set \(A^g=p^g(I \cap [0, \sigma ^g] )\) so that \(\mathbf{m}^g(A^g) \ge \mathrm{Leb} (I \cap [0, \sigma ^g] )\ge \mathrm{Leb}(I) - |\sigma ^f -\sigma ^g|\). By construction, we also have that for any \(x^g\in A^g\), there exists \(t\in I\) such that \(p^g(t)=x^g\) and such that \(d^Z(x^g,x^f)=\mathrm{dis}(\mathfrak{R })/2\), with \(x^f=p^f(t)\in A^f\). This implies that \(A^g \subset (A^f)^r\) for any \(r>\mathrm{dis}(\mathfrak{R })/2\). We deduce that:

$$\begin{aligned} \mathbf{m}^f\left( A^f\right) \le \mathbf{m}^g\left( A^g\right) + \left| \sigma ^f -\sigma ^g\right| \le \mathbf{m}^g\left( (A^f)^r\right) + \left| \sigma ^f -\sigma ^g\right| . \end{aligned}$$

The same is true with \(f\) and \(g\) replaced by \(g\) and \(f\). We deduce that:

$$\begin{aligned} d^Z_P\left( \mathbf{m}^f, \mathbf{m}^g\right) \le \frac{1}{2}\mathrm{dis}(\mathfrak{R })+\left| \sigma ^f -\sigma ^g\right| \le 2 ||f-g||_\infty +\left| \sigma ^f -\sigma ^g\right| . \end{aligned}$$

We get:

$$\begin{aligned} d_H^Z\left( T^f, T^g\right) +d^Z\left( \varnothing ^f, \varnothing ^g\right) + d^Z_P\left( \mathbf{m}^f, \mathbf{m}^g\right) \le 6 \Vert f-g \Vert _\infty + \left| \sigma ^f-\sigma ^g\right| . \end{aligned}$$

This gives the result. \(\square \)

Remark 2.11

We could define the correspondence for more general functions \(f\): lower semi-continuous functions that satisfy the intermediate values property (see [13]). In that case, the associated real tree is not even locally compact (hence not necessarily proper). But the measurability of the mapping \(f\mapsto T^f\) is not clear in this general setting; this is why we only consider continuous functions \(f\) here and thus will assume the Grey condition (see next section) for Lévy trees.

2.3 Branching mechanisms

Let \(\Pi \) be a \(\sigma \)-finite measure on \((0,+\infty )\) such that we have \(\int (1 \wedge x^2) \Pi (dx) < \infty \). We set:

$$\begin{aligned} \Pi _{\theta } (dr)=\mathrm{e}^{-\theta r}\ \Pi (dr)\!. \end{aligned}$$
(7)

Let \(\Theta ^{\prime }\) be the set of \(\theta \in \mathbb R \) such that \( \int _{(1,+\infty )} \Pi _\theta (dr) < +\infty \). If \(\Pi =0\), then \(\Theta ^{\prime }=\mathbb R \). We also set \(\theta _\infty = \inf \Theta ^{\prime }\). It is obvious that \([0,+\infty )\subset \Theta ^{\prime }\), \(\theta _\infty \le 0\) and either \(\Theta ^{\prime }=[\theta _\infty , +\infty )\) or \(\Theta ^{\prime }=(\theta _\infty ,+\infty )\).

Let \(\alpha \in \mathbb R \) and \(\beta \ge 0\). We consider the branching mechanism \(\psi \) associated with \((\alpha ,\beta ,\Pi )\):

$$\begin{aligned} \psi (\lambda ) = \alpha \lambda + \beta \lambda ^2 + \int _{(0,+\infty )} (\mathrm{e}^{-\lambda r} -1 + \lambda r {\mathbf{1}}_{\{r<1\}}) \Pi (dr), \quad \lambda \in \Theta ^{\prime }. \end{aligned}$$
(8)

Note that the function \(\psi \) is smooth and convex over \((\theta _\infty ,+\infty )\). We say that \(\psi \) is conservative if for all \(\varepsilon >0\):

$$\begin{aligned} \int _{(0,\varepsilon ]} \frac{du}{|\psi (u)|} = +\infty . \end{aligned}$$

This condition will be equivalent to the non-explosion in finite time of continuous-state branching processes associated with \(\psi \) (see below). A sufficient condition for \(\psi \) to be conservative is to have \(\psi ^{\prime }(0+)>-\infty \), which is actually equivalent to \(\int _{(1,\infty )} r \Pi (dr) < \infty \). If \(X\) is a Lévy process with Laplace exponent \(\psi \), we can always write

$$\begin{aligned} \mathbb E [X_t] = -t\psi ^{\prime }(0), \end{aligned}$$

so that the condition \(\psi ^{\prime }(0+)>-\infty \) is equivalent to the existence of first moments for \(X\). Under this assumption, the branching mechanism can be rewritten in a simpler form. However, we point out that there exist several interesting branching mechanisms that satisfy \(\psi ^{\prime }(0+)=-\infty \) and yet are conservative (such as Neveu’s branching mechanism \(\psi (u)=u\log u\)). Hence, we will not automatically assume that \(\psi ^{\prime }(0+)>-\infty \), but we will always make the following, slightly weaker, assumption.

Assumption 1

The function \(\psi \) is conservative and we have \(\beta >0\) or \(\int _{(0,1)} \ell \Pi (d\ell )=+\infty \).
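As a quick check of the claim about Neveu’s branching mechanism \(\psi (u)=u\log u\), note that \(\psi ^{\prime }(u)=1+\log u\rightarrow -\infty \) as \(u\downarrow 0\), while for \(0<\delta <\varepsilon <1\) a direct computation (a sketch, using only \(|\psi (u)|=u|\log u|\) on \((0,1)\)) gives:

$$\begin{aligned} \int _\delta ^\varepsilon \frac{du}{u|\log u|} = \log |\log \delta | - \log |\log \varepsilon | \longrightarrow +\infty \quad \text{ as } \delta \downarrow 0. \end{aligned}$$

For \(\varepsilon \ge 1\), the integral over \((0,\varepsilon ]\) dominates the one over \((0,1/2]\), so the conservativity condition holds for every \(\varepsilon >0\) even though \(\psi ^{\prime }(0+)=-\infty \).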

The branching mechanism is said to be sub-critical (resp. critical, super-critical) if \(\psi ^{\prime }(0+) >0\) (resp. \(\psi ^{\prime }(0+)=0\), \(\psi ^{\prime }(0+) <0\)). We say that \(\psi \) is (sub)critical if it is critical or sub-critical.

We introduce the following branching mechanisms \(\psi _\theta \) for \(\theta \in \Theta ^{\prime }\):

$$\begin{aligned} \psi _\theta (\lambda )=\psi (\lambda +\theta )-\psi (\theta ), \quad \lambda +\theta \in \Theta ^{\prime }. \end{aligned}$$
(9)

Let \(\Theta ^\psi \) be the set of \(\theta \in \Theta ^{\prime }\) such that \(\psi _\theta \) is conservative. Obviously, we have:

$$\begin{aligned}{}[0,+\infty )\subset \Theta ^\psi \subset \Theta ^{\prime } \subset \Theta ^\psi \cup \{\theta _\infty \}. \end{aligned}$$

If \(\theta \in \Theta ^\psi \), we set:

$$\begin{aligned} \bar{\theta }=\max \left\{ q\in \Theta ^\psi ,\ \psi (q)=\psi (\theta )\right\} \!\!. \end{aligned}$$
(10)

We can give an alternative definition of \(\bar{\theta }\) if Assumption 1 holds. Let \(\theta ^*\) be the unique non-negative root of \(\psi ^{\prime }\) if it exists. Notice that \(\theta ^*=0\) if \(\psi \) is critical and that \(\theta ^*\) exists and is positive if \(\psi \) is super-critical. If \(\theta ^*\) exists, then the branching mechanism \(\psi _{\theta ^*}\) is critical. We set \(\Theta _*^\psi \) for \([\theta ^*,+\infty )\) if \(\theta ^*\) exists and \(\Theta _*^\psi =\Theta ^\psi \) otherwise. The function \(\psi \) is a one-to-one mapping from \(\Theta _*^\psi \) onto \(\psi (\Theta ^\psi _*)\). We write \(\psi ^{-1}\) for the inverse of the previous mapping. The set \(\{ q\in \Theta ^\psi ,\ \psi (q)=\psi (\theta )\}\) has at most two elements and we have:

$$\begin{aligned} \bar{\theta }=\psi ^{-1} \circ \psi (\theta ). \end{aligned}$$

In particular, if \(\psi _\theta \) is (sub)critical we have \(\bar{\theta }= \theta \) and if \(\psi _\theta \) is super-critical then we have \(\theta <\theta ^*<\bar{\theta }\). We will later on consider the following assumption.

Assumption 2

(Grey condition) The branching mechanism is such that:

$$\begin{aligned} \int ^{+\infty } \frac{du}{\psi (u)} < \infty . \end{aligned}$$

Let us point out that Assumption 2 implies that \(\beta >0\) or \(\int _{(0,1)} r \Pi (dr)=+\infty \).
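To illustrate the objects above on the simplest example, take (as an assumption made only for this illustration) \(\Pi =0\) and \(\beta >0\), so that \(\psi (\lambda )=\alpha \lambda +\beta \lambda ^2\) and \(\Theta ^{\prime }=\mathbb R \). Expanding (9) and factorizing \(\psi (q)-\psi (\theta )\), one finds:

$$\begin{aligned} \psi _\theta (\lambda )&=(\alpha +2\beta \theta )\lambda +\beta \lambda ^2,\\ \psi (q)-\psi (\theta )&=(q-\theta )\big (\alpha +\beta (q+\theta )\big ),\\ \bar{\theta }&=\max \left\{ \theta ,\ -\frac{\alpha }{\beta }-\theta \right\} . \end{aligned}$$

Since \(\psi _\theta ^{\prime }(0)=\alpha +2\beta \theta \) is finite, every \(\psi _\theta \) is conservative and \(\Theta ^\psi =\mathbb R \) in this example. The mechanism \(\psi _\theta \) is super-critical exactly when \(\theta <-\alpha /(2\beta )\), the root of \(\psi ^{\prime }\), and in that case \(\bar{\theta }=-\alpha /\beta -\theta >-\alpha /(2\beta )>\theta \). The Grey condition also holds here, because \(1/\psi (u)\) behaves like \(1/(\beta u^2)\) at infinity.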

Connections with branching processes. Let \(\psi \) be a branching mechanism satisfying Assumption 1. A continuous state branching process (CSBP) with branching mechanism \(\psi \) and initial mass \(x>0\) is the càdlàg \(\mathbb R _+\)-valued Markov process \((Z_a,\ a\ge 0)\) whose distribution is characterized by \(Z_0=x\) and:

$$\begin{aligned} \mathbb{E }[ \exp (-\lambda Z_{a+a^{\prime }}) | Z_a ] = \exp (-Z_a u(a^{\prime },\lambda ) ), \quad \lambda \ge 0, \end{aligned}$$

where \((u(a,\lambda ),a\ge 0, \lambda >0)\) is the unique non-negative solution to the integral equation:

$$\begin{aligned} \int _{u(a,\lambda )}^\lambda \frac{dr}{\psi (r)} = a;\ \ u(0,\lambda )=\lambda . \end{aligned}$$
(11)

The distribution of the CSBP started at mass \(x\) will be denoted by \(\mathbf{P}^\psi _x\). For a detailed presentation of CSBPs, we refer to the monographs [22, 23] or [26].

In this context, the conservativity assumption is equivalent to the CSBP not blowing up in finite time (Theorem 10.3 in [22]), and Assumption 2 is equivalent to the strong extinction time, \(\inf \{a,\ Z_a=0\}\), being a.s. finite. If Assumption 2 holds, then for all \(h>0\), \(\mathbf{P}_x^\psi (Z_{h} =0) = \exp (-x b(h))\), where \(b(h)=\lim _{\lambda \rightarrow +\infty } u(h,\lambda )\). In particular \(b(h)\) is such that

$$\begin{aligned} \int _{b(h)}^\infty \frac{dr}{\psi (r)} = h. \end{aligned}$$
(12)
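For instance, in the quadratic case \(\psi (\lambda )=\beta \lambda ^2\) with \(\beta >0\) (an example chosen here purely for illustration), Eqs. (11) and (12) can be solved in closed form:

$$\begin{aligned} \frac{1}{\beta }\left( \frac{1}{u(a,\lambda )}-\frac{1}{\lambda }\right) = a \quad \Longrightarrow \quad u(a,\lambda )=\frac{\lambda }{1+\beta a\lambda }, \qquad b(h)=\lim _{\lambda \rightarrow +\infty }u(h,\lambda )=\frac{1}{\beta h}. \end{aligned}$$

Letting \(\lambda \rightarrow +\infty \) in \(\mathbb{E }[\exp (-\lambda Z_h)]=\exp (-x u(h,\lambda ))\), valid when \(Z_0=x\), then gives the extinction probability \(\mathbf{P}_x^\psi (Z_h=0)=\exp (-x/(\beta h))\) in this example.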

Let us now describe a Girsanov transform for CSBPs introduced in [2] related to the shift of the branching mechanism \(\psi \) defined by (9). Recall notation \(\Theta ^\psi \) and \(\theta _\infty \) from the previous section. For \(\theta \in \Theta ^\psi \), we consider the process \(M^{\psi ,\theta }=(M^{\psi ,\theta }_a, a\ge 0)\) defined by:

$$\begin{aligned} M_a^{\psi ,\theta } = \exp \left( \theta x-\theta Z_a-\psi (\theta ) \int _0^a Z_s ds\right) . \end{aligned}$$
(13)

Theorem 2.12

(Girsanov transformation for CSBPs, [2]) Let \(\psi \) be a branching mechanism satisfying Assumption 1. Let \((Z_a,a\ge 0)\) be a CSBP with branching mechanism \(\psi \) and let \(\mathcal{F }= (\mathcal{F }_a,a\ge 0)\) be its natural filtration. Let \(\theta \in \Theta ^\psi \) be such that either \(\theta \ge 0\) or \(\theta <0\) and \(\int _{(1,+\infty )} r \Pi _\theta (dr)<+\infty \). Then we have the following:

  (1) The process \(M^{\psi ,\theta }\) is an \(\mathcal{F }\)-martingale under \(\mathbf{P}_x^\psi \).

  (2) Let \(a,x\ge 0\). On \(\mathcal{F }_a\), the probability measure \(\mathbf{P}_x^{\psi _\theta }\) is absolutely continuous w.r.t. \(\mathbf{P}_x^\psi \), and

    $$\begin{aligned} \frac{{d\mathbf{P}_x^{\psi _\theta }}_{|\mathcal{F }_a}}{{d\mathbf{P}_x^{\psi }}_{|\mathcal{F }_a}} = M_a^{\psi ,\theta }. \end{aligned}$$
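As a consistency check on the martingale property, here is a sketch under the extra assumption that \(\theta >0\) satisfies \(\psi (\theta )=0\) (which happens, for instance, at the positive root of a super-critical quadratic mechanism). In that case \(M_a^{\psi ,\theta }=\exp (\theta x-\theta Z_a)\), and the constant function \(u(a,\theta )=\theta \) solves the differential form \(\partial _a u=-\psi (u)\), \(u(0,\theta )=\theta \), of (11). Writing \(\mathbf{E}_x^\psi \) for the expectation under \(\mathbf{P}_x^\psi \), we get:

$$\begin{aligned} \mathbf{E}_x^\psi \left[ M_a^{\psi ,\theta }\right] = \mathrm{e}^{\theta x}\, \mathbf{E}_x^\psi \left[ \mathrm{e}^{-\theta Z_a}\right] = \mathrm{e}^{\theta x}\, \mathrm{e}^{-x u(a,\theta )} = 1, \end{aligned}$$

in agreement with \(\mathbf{E}_x^\psi [M_a^{\psi ,\theta }]=M_0^{\psi ,\theta }=1\).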

2.4 The height process

Let \((X_t,\ t\ge 0)\) be a Lévy process with Laplace exponent \(\psi \) satisfying Assumption 1. This assumption implies that a.s. the paths of \(X\) have infinite total variation over any non-trivial interval. The distribution of the Lévy process will be denoted by \(\mathbb{P }^\psi (dX)\). It is a probability measure on the Skorokhod space of real-valued càdlàg processes. For the remainder of this section, we will assume that \(\psi \) is (sub)critical.

For \(t\ge 0\), let us write \(\hat{X}^{(t)}\) for the time-reversed process:

$$\begin{aligned} \hat{X}^{(t)}_s= X_t-X_{(t-s)_-},\quad 0\le s< t \end{aligned}$$

and \(\hat{X}^{(t)}_t=X_t\). Then \((\hat{X}^{(t)}_s,\ 0\le s\le t)\) has the same distribution as the process \((X_s,\ 0\le s\le t)\). We will also write \(\hat{S}^{(t)}_s = \sup _{[0,s]} \hat{X}^{(t)}_r\) for the supremum process of \(\hat{X}^{(t)}\).

Proposition 2.13

(The height process, [13]) Let \(\psi \) be a (sub)critical branching mechanism satisfying Assumption 1. There exists a lower semi-continuous process \(H=(H_t,\ t\ge 0)\) taking values in \([0,+\infty ]\), with the intermediate values property, which is a local time at 0, at time \(t\), of the process \(\hat{X}^{(t)}-\hat{S}^{(t)}\), such that the following convergence holds in probability:

$$\begin{aligned} H_t = \lim _{\varepsilon \downarrow 0} \frac{1}{\varepsilon } \int _0^t {\mathbf{1}}_{\{I_s^t \le X_s \le I_s^t +\varepsilon \}} ds \end{aligned}$$

where \(I_s^t=\inf _{s\le r \le t} X_r\). Furthermore, if Assumption 2 holds, then the process \(H\) admits a continuous modification.

From now on, we always assume that Assumptions 1 and 2 hold, and we always work with this continuous version of \(H\). The process \(H\) is called the height process.

For \(x>0\), we consider the stopping time \(\tau _x = \inf \{ t\ge 0,\ I_t \le -x \} \), where \(I_t=I^t_0\) is the infimum process of \(X\). We denote by \(\mathbb{P }^\psi _x(dH) \) the distribution of the stopped height process \((H_{t\wedge \tau _x}, t\ge 0)\) under \(\mathbb{P }^\psi \), defined on the space \(\mathcal C _+([0,+\infty ))\) of non-negative continuous functions on \([0,+\infty )\). The (sub)criticality of the branching mechanism entails \(\tau _x<\infty \) \(\mathbb{P }^\psi \)-a.s., so that under \(\mathbb{P }^\psi _x(dH)\), the height process has a.s. compact support.

The excursion measure. The height process is not a Markov process, but it has the same zero sets as \(X-I\) (see [13], Paragraph 1.3.1), so that we can develop an excursion theory based on the latter. By standard fluctuation theory, it is easy to see that 0 is a regular point for \(X-I\) and that \(-I\) is a local time of \(X-I\) at 0. We denote by \(\mathbb{N }^\psi \) the associated excursion measure. As such, \(\mathbb{N }^\psi \) is a \(\sigma \)-finite measure. Under \(\mathbb{P }^\psi _x\) or \(\mathbb{N }^\psi \), we set:

$$\begin{aligned} \sigma (H) = \int _0^\infty {\mathbf{1}}_{\{H_t \ne 0\}} dt. \end{aligned}$$

When there is no risk of confusion, we will write \(\sigma \) for \(\sigma (H)\). Notice that, under \(\mathbb{P }_x^\psi \), \(\sigma =\tau _x\) and that under \(\mathbb{N }^\psi \), \(\sigma \) represents the lifetime of the excursion. Abusing notations, we will write \(\mathbb{P }_x^\psi (dH)\) and \(\mathbb{N }^\psi [dH]\) for the distribution of \(H\) under \(\mathbb{P }_x^\psi \) or \(\mathbb{N }^\psi \). Let us also recall the Poissonian decomposition of the measure \(\mathbb{P }^\psi _x\). Under \(\mathbb{P }^\psi _x\), let \((a_j,b_j)_{j\in J}\) be the excursion intervals of \(X-I\) away from 0. Those are also the excursion intervals of the height process away from \(0\). For \(j\in J\), we will denote by \(H^{(j)}:[0,\infty )\rightarrow \mathbb R _+\) the corresponding excursion, that is

$$\begin{aligned} \ H^{(j)}_t = H_{(a_j+t)\wedge b_j}, \quad t\ge 0. \end{aligned}$$

Proposition 2.14

[14] Let \(\psi \) be a (sub)critical branching mechanism satisfying Assumption 1. Under \(\mathbb{P }^\psi _x\), the random point measure \(\mathcal{N }= \sum _{j\in J} \delta _{H^{(j)}}(dH)\) is a Poisson point measure with intensity \(x\mathbb{N }^\psi [dH]\).
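As an illustration of this Poissonian decomposition, for \(h>0\) the number of excursions \(H^{(j)}\) whose supremum exceeds \(h\) is a Poisson random variable with parameter \(x\,\mathbb{N }^\psi [\sup H>h]\). Assuming in addition that Assumption 2 holds, and taking as given the identity \(b(h)=\mathbb{N }^\psi [\sup H>h]\) with \(b\) defined by (12), one gets:

$$\begin{aligned} \mathbb{P }_x^\psi \left( \sup _{t\ge 0} H_t\le h\right) = \mathbb{P }_x^\psi \left( \mathcal{N }\left( \{\sup H>h\}\right) =0\right) = \exp \left( -x\, \mathbb{N }^\psi [\sup H>h]\right) = \mathrm{e}^{-x b(h)}, \end{aligned}$$

which is the probability that a CSBP with branching mechanism \(\psi \) started at mass \(x\) is extinct at time \(h\).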

Local times of the height process

Proposition 2.15

[13, Formula (36)] Let \(\psi \) be a (sub)critical branching mechanism satisfying Assumption 1. Under \(\mathbb{N }^\psi \), there exists a jointly measurable process \((L^a_s, a\ge 0, s\ge 0)\) which is continuous and non-decreasing in the variable \(s\) such that,

$$\begin{aligned} L_s^0=0,\quad s\ge 0 \end{aligned}$$

and for every \(t\ge 0\), for every \(\delta >0\) and every \(a>0\)

$$\begin{aligned} \lim _{\varepsilon \rightarrow 0}\mathbb{N }^\psi \left[ \sup _{0\le s\le t\wedge \sigma }\left| \varepsilon ^{-1} \int _0^s {\mathbf{1}}_{\{ a<H_r\le a+\varepsilon \}}\,dr - L^a_s\right| {\mathbf{1}}_{\{\sup H>\delta \}}\right] =0. \end{aligned}$$

Moreover, by Lemma 3.3 in [14], the process \((L_\sigma ^a,\ a\ge 0)\) has a càdlàg modification under \(\mathbb{N }^\psi \) with no fixed discontinuities.

(Sub)critical Lévy trees. Let \(\psi \) be a (sub)critical branching mechanism satisfying Assumptions 1 and 2. Let \(H\) be the height process defined under \(\mathbb{P }^\psi _x\) or \(\mathbb{N }^\psi \). We consider the so-called Lévy tree \(\mathcal{T }^H\) which is the random w-tree coded by the function \(H\), see Sect 2.2.4. Notice that we are indeed within the framework of proper real trees, since Assumption 2 entails compactness of \(\mathcal{T }^H\). The measurability of the random variable \(\mathcal{T }^H\) taking values in \(\mathbb{T }\) follows from Proposition 2.10 and Theorem 2.6. When there is no confusion, we will write \(\mathcal{T }\) for \(\mathcal{T }^H\). Abusing notations, we will write \(\mathbb{P }_x^\psi (d\mathcal{T })\) and \(\mathbb{N }^\psi [d\mathcal{T }]\) for the distribution on \(\mathbb{T }\) of \(\mathcal{T }=\mathcal{T }^H\) under \(\mathbb{P }_x^\psi (dH)\) or \(\mathbb{N }^\psi [dH]\). By construction, under \(\mathbb{P }^\psi _x\) or under \(\mathbb{N }^\psi \), we have that the total mass of the mass measure on \(\mathcal{T }\) is given by

$$\begin{aligned} \mathbf{m}^{\mathcal{T }}(\mathcal{T }) = \sigma . \end{aligned}$$
(14)

Proposition 2.14 enables us to view the measure \(\mathbb{N }^\psi [d\mathcal{T }]\) as describing a single Lévy tree. Thus, we will mostly work under this excursion measure, which is the distribution of the (isometry class of the) w-tree \(\mathcal{T }\) described by the height process under \(\mathbb{N }^\psi \). In order to state the branching property of a Lévy tree, we must first define a local time at level \(a\) on the tree. Let \((\mathcal{T }^{i,\circ },i\in I)\) be the trees that were cut off by cutting at level \(a\), namely the connected components of the set \(\mathcal{T }{\setminus }\pi _a(\mathcal{T })\). If \(i\in I\), then all the points in \(\mathcal{T }^{i,\circ }\) have the same MRCA \(x_i\) in \(\mathcal{T }\) which is precisely the point where the tree was cut off. We consider the compact tree \(\mathcal{T }^{i}=\mathcal{T }^{i,\circ } \cup \{x_i\}\) with the root \(x_i\), the metric \(d^{\mathcal{T }^i}\), which is the metric \(d^\mathcal{T }\) restricted to \(\mathcal{T }^i\), and the mass measure \(\mathbf{m}^{\mathcal{T }^i}\), which is the mass measure \(\mathbf{m}^\mathcal{T }\) restricted to \(\mathcal{T }^i\). Then \((\mathcal{T }^i, d^{\mathcal{T }^i}, x_i, \mathbf{m}^{\mathcal{T }^i})\) is a w-tree. Let

$$\begin{aligned} \mathcal{N }_a^{\mathcal{T }}(dx,d\mathcal{T }^{\prime }) = \sum _{i\in I} \delta _{(x_i,\mathcal{T }^i)}(dx,d\mathcal{T }^{\prime }) \end{aligned}$$
(15)

be the point measure on \(\mathcal{T }(a)\times \mathbb{T }\) taking account of the cutting points as well as the trees cut away. The following theorem gives the structure of the decomposition we just described. From excursion theory, we deduce that \(b(h)=\mathbb{N }^\psi [H_{\mathrm{max}}(\mathcal{T }) > h]\), where \(b(h)\) solves (12). An easy extension of [14] from real trees to w-trees gives the following result.

Theorem 2.16

[14] Let \(\psi \) be a (sub)critical branching mechanism satisfying Assumptions 1 and 2. There exists a \(\mathcal{T }\)-measure valued process \((\ell ^a, a\ge 0)\) càdlàg for the weak topology on finite measures on \(\mathcal{T }\) such that \(\mathbb{N }^\psi \)-a.e.:

$$\begin{aligned} \mathbf{m}^{\mathcal{T }}(dx) = \int _0^\infty \ell ^a(dx) da, \end{aligned}$$
(16)

\(\ell ^0=0, \inf \{a > 0,\ \ell ^a = 0\}=\sup \{a \ge 0,\ \ell ^a\ne 0\}=H_{\mathrm{max}}(\mathcal{T })\) and for every fixed \(a\ge 0\), \(\mathbb{N }^\psi \text{-a.e. }\):

  • \(\ell ^a\) is supported on \(\mathcal{T }(a)\),

  • We have for every bounded continuous function \(\varphi \) on \(\mathcal{T }\):

    $$\begin{aligned} \langle \ell ^a,\varphi \rangle&= \lim _{\varepsilon \downarrow 0} \frac{1}{b(\varepsilon )} \int \varphi (x) {\mathbf{1}}_{\{h(\mathcal{T }^{\prime })\ge \varepsilon \}} \mathcal{N }_a^{\mathcal{T }}(dx, d\mathcal{T }^{\prime }) \end{aligned}$$
    (17)
    $$\begin{aligned}&= \lim _{\varepsilon \downarrow 0} \frac{1}{b(\varepsilon )} \int \varphi (x) {\mathbf{1}}_{\{h(\mathcal{T }^{\prime })\ge \varepsilon \}} \mathcal{N }_{a-\varepsilon }^{\mathcal{T }}(dx, d\mathcal{T }^{\prime }),\ \text{ if }\ a>0. \end{aligned}$$
    (18)

Furthermore, we have the branching property: for every \(a>0\), the conditional distribution of the point measure \(\mathcal{N }_a^{\mathcal{T }}(dx,d\mathcal{T }^{\prime })\) under \(\mathbb{N }^\psi [d\mathcal{T }|H_{\mathrm{max}}(\mathcal{T })>a]\), given \(\pi _a(\mathcal{T })\), is that of a Poisson point measure on \(\mathcal{T }(a)\times \mathbb{T }\) with intensity \(\ell ^a(dx)\mathbb{N }^\psi [d\mathcal{T }^{\prime }]\).

The measure \(\ell ^a\) will be called the local time measure of \(\mathcal{T }\) at level \(a\). In the case of Lévy trees, it can also be defined as the image of the measure \(d_sL_s^a(H)\) by the canonical projection \(p^H\) (see [13]), so the above statement is in fact the translation of the excursion theory of the height process in terms of real trees. This definition shows that the local time is a function of the tree \(\mathcal{T }\) and does not depend on the choice of the coding height function. It should be noted that Equation (18) implies that \(\ell ^a\) is measurable with respect to the \(\sigma \)-algebra generated by \(\pi _a(\mathcal{T })\).

The next theorem, also from [14], relates the discontinuities of the process \((\ell ^a,a\ge 0)\) to the infinite nodes in the tree. Recall \(\mathrm{Br}_\infty (\mathcal{T })\) denotes the set of infinite nodes in the Lévy tree \(\mathcal{T }\).

Theorem 2.17

[14] Let \(\psi \) be a (sub)critical branching mechanism satisfying Assumptions 1 and 2. The set \(\{ d(\varnothing ,x),\ x\in \mathrm{Br}_\infty (\mathcal{T }) \}\) coincides \(\mathbb{N }^\psi \)-a.e. with the set of discontinuity times of the mapping \(a\mapsto \ell ^a\). Moreover, \(\mathbb{N }^\psi \)-a.e., for every such discontinuity time \(b\), there is a unique \(x_b\in \mathrm{Br}_\infty (\mathcal{T })\cap \mathcal{T }(b)\), and

$$\begin{aligned} \ell ^b = \ell ^{b-} + \Delta _b \delta _{x_b}, \end{aligned}$$

where \(\Delta _b>0\) is called the mass of the node \(x_b\) and can be obtained by the approximation

$$\begin{aligned} \Delta _b = \lim _{\varepsilon \rightarrow 0} \frac{1}{b(\varepsilon )} n(x_b,\varepsilon ), \end{aligned}$$
(19)

where \(n(x_b,\varepsilon )=\int {\mathbf{1}}_{\{x=x_b\}}(x){\mathbf{1}}_{\{H_{\mathrm{max}}(\mathcal{T }^{\prime }) > \varepsilon \}}(\mathcal{T }^{\prime }) \mathcal{N }_b^\mathcal{T }(dx,d\mathcal{T }^{\prime })\) is the number of sub-trees originating from \(x_b\) with height larger than \(\varepsilon \).

Decomposition of the Lévy tree. We will frequently use the following measure on \(\mathbb{T }\):

$$\begin{aligned} {\mathbf{N}}^{\psi }[\mathcal{T }\in \cdot ] = 2\beta \mathbb{N }^{\psi }[\mathcal{T }\in \cdot ] + \int _{(0,+\infty )} r\Pi (dr)\, \mathbb{P }_r^{\psi }[\mathcal{T }\in \cdot ]. \end{aligned}$$
(20)

where \(\psi \) is given by (8).

The decomposition of a (sub)critical Lévy tree \(\mathcal{T }\) according to a spine \([\![\varnothing , x ]\!]\), where \(x\in \mathcal{T }\) is a leaf picked at random at level \(a>0\), that is according to the local time \(\ell ^a(dx)\), is given in Theorem 4.5 in [14]. Then by integrating with respect to \(a\), we get the decomposition of \(\mathcal{T }\) according to a spine \([\![\varnothing , x ]\!]\), where \(x\in \mathcal{T }\) is a leaf picked at random on \(\mathcal{T }\), that is according to the mass measure \(\mathbf{m}^\mathcal{T }\). Therefore, we will state this decomposition without proof.

Let \(x\in \mathcal{T }\) and let \(\{x_i, i\in I_x\}=\mathrm{Br}(\mathcal{T })\cap [\![\varnothing , x ]\!]\) be the set of branching points on the spine \([\![\varnothing , x ]\!]\). For \(i\in I_x\), we set:

$$\begin{aligned} \mathcal{T }^i=\mathcal{T }{\setminus } \left( \mathcal{T }^{(x,x_i)} \cup \mathcal{T }^{(\varnothing ,x_i)}\right) , \end{aligned}$$

where \(\mathcal{T }^{(y,x_i)}\) is the connected component of \(\mathcal{T }{\setminus }\{x_i\}\) containing \(y\). We let \(x_i\) be the root of \(\mathcal{T }^i\). The metric and measure on \(\mathcal{T }^i\) are respectively the restriction of \(d^\mathcal{T }\) to \(\mathcal{T }^i\) and the restriction of \(\mathbf{m}^\mathcal{T }\) to \(\mathcal{T }^i{\setminus }\{x_i\}\). By construction, if \(x\) is a leaf, we have:

$$\begin{aligned} \mathcal{T }= [\![\varnothing , x ]\!]\circledast _{i\in I_x}(\mathcal{T }^i,x_i), \end{aligned}$$

where \( [\![\varnothing , x ]\!]\) is a w-tree with root \(\varnothing \), metric and mass measure the restrictions of \(d^\mathcal{T }\) and \(\mathbf{m}^\mathcal{T }\) to \( [\![\varnothing , x ]\!]\). We consider the point measure on \([0, H_x]\times \mathbb{T }\) defined by:

$$\begin{aligned} \mathcal{M }_x=\sum _{i\in I_x} \delta _{(H_{x_i},\mathcal{T }^i)}. \end{aligned}$$

Theorem 2.18

[14] Let \(\psi \) be a (sub)critical branching mechanism satisfying Assumptions 1 and 2. We have for any non-negative measurable function \(F\) defined on \([0,+\infty )\times \mathbb{T }\):

$$\begin{aligned} \mathbb{N }^\psi \left[ \int \mathbf{m}^\mathcal{T }(dx) F( H_x, \mathcal{M }_x) \right] =\int _0^{\infty }da \, \mathrm{e}^{-\psi ^{\prime }(0) a }\, \mathbb{E }\left[ F\left( a, \sum _{i\in I} {\mathbf{1}}_{\{z_i\le a\}} \delta _{(z_i, \hat{\mathcal{T }}^i)}\right) \right] , \end{aligned}$$

where under \(\mathbb{E }, \sum _{i\in I} \delta _{(z_i, \hat{\mathcal{T }}^i)}(dz, dT)\) is a Poisson point measure on \([0,+\infty )\times \mathbb{T }\) with intensity \( dz \, {\mathbf{N}}^{\psi }[dT]\).
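A simple special case, given here as an illustration: if \(F(a,\mathcal{M })=g(a)\) depends only on the height, for some non-negative measurable function \(g\) on \([0,+\infty )\), the formula of Theorem 2.18 reduces to

$$\begin{aligned} \mathbb{N }^\psi \left[ \int \mathbf{m}^\mathcal{T }(dx)\, g(H_x)\right] = \int _0^{\infty } g(a)\, \mathrm{e}^{-\psi ^{\prime }(0) a}\, da. \end{aligned}$$

Taking \(g=1\) and using (14), this gives \(\mathbb{N }^\psi [\sigma ]=1/\psi ^{\prime }(0)\) when \(\psi \) is sub-critical, and \(\mathbb{N }^\psi [\sigma ]=+\infty \) when \(\psi \) is critical.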

CSBP process in the Lévy trees. Lévy trees give a genealogical structure for CSBPs, which is made precise in the next theorem. We consider the process \(\mathcal{Z }=(\mathcal{Z }_a, a\ge 0)\) defined by:

$$\begin{aligned} \mathcal{Z }_a=\langle \ell ^a,1\rangle . \end{aligned}$$

If needed we will write \(\mathcal{Z }_a(\mathcal{T })\) to emphasize that \(\mathcal{Z }_a \) corresponds to the tree \(\mathcal{T }\).

Theorem 2.19

(CSBP in Lévy trees, [13] and [14]) Let \(\psi \) be a (sub)critical branching mechanism satisfying Assumptions 1 and 2, and let \(x>0\). The process \(\mathcal{Z }\) under \(\mathbb{P }_x^\psi \) is distributed as the CSBP \(Z\) under \(\mathbf{P}^\psi _x\).

Remark 2.20

This theorem can be stated in terms of the height process without Assumption 2.

2.5 Super-critical Lévy trees

Let us now briefly recall the construction from [2] for super-critical Lévy trees using a Girsanov transformation similar to the one used for CSBPs, see Theorem 2.12.

Let \(\psi \) be a super-critical branching mechanism satisfying Assumptions 1 and 2. Recall \(\theta ^*\) is the unique positive root of \(\psi ^{\prime }\) and that the branching mechanism \(\psi _\theta \) is sub-critical if \(\theta >\theta ^*\), critical if \(\theta =\theta ^*\) and super-critical otherwise. We consider the filtration \(\mathcal{H }=(\mathcal{H }_a,\ a\ge 0)\), where \(\mathcal{H }_a\) is the \(\sigma \)-field generated by the random variable \(\pi _a(\mathcal{T })\) and the \(\mathbb{P }_x^{\psi _{\theta ^*}}\)-negligible sets. For \(\theta \ge \theta ^*\), we define the process \(M^{\psi ,\theta }=(M_a^{\psi ,\theta }, a\ge 0)\) with

$$\begin{aligned} M_a^{\psi ,\theta } = \exp \Bigg ( \theta x -\theta \mathcal{Z }_a - \psi (\theta ) \int _0^a \mathcal{Z }_s ds \Bigg ) \end{aligned}$$

By absolute continuity of the measures \(\mathbb{P }_x^{\psi _\theta }\) (resp. \(\mathbb{N }^{\psi _\theta }\)) with respect to \(\mathbb{P }_x^{\psi _{\theta ^*}}\) (resp. \(\mathbb{N }^{\psi _{\theta ^*}}\)), all the processes \(M^{\psi _{\theta },-\theta }\) for \(\theta >\theta ^*\) are \(\mathcal{H }\)-adapted. Moreover, all these processes are \(\mathcal{H }\)-martingales (see [2] for the proof). Theorem 2.16 shows that \(M^{\psi _{\theta ^*},-\theta ^*}\) is \(\mathcal{H }\)-adapted. Let us now define the \(\psi \)-Lévy tree, cut at level \(a\) by the following Girsanov transformation.

Definition 2.21

Let \(\psi \) be a super-critical branching mechanism satisfying Assumptions 1 and 2. Let \(\theta \ge \theta ^*\). For \(a\ge 0\), we define the distribution \(\mathbb{P }_x^{\psi ,a}\) (resp. \(\mathbb{N }^{\psi ,a}\)) by: if \(F\) is a non-negative, measurable functional defined on \(\mathbb{T }\),

$$\begin{aligned} \mathbb E _x^{\psi ,a}[F(\mathcal{T })]&= \mathbb E _x^{\psi _{\theta }} \Big [ M_a^{\psi _\theta ,-\theta } F(\pi _a(\mathcal{T })) \Big ],\end{aligned}$$
(21)
$$\begin{aligned} \mathbb{N }^{\psi ,a}[F(\mathcal{T })]&= \mathbb{N }^{\psi _{\theta }} \left[ \exp \left( \theta \mathcal{Z }_a +\psi (\theta ) \int _0^a \mathcal{Z }_s\, ds\right) F(\pi _a(\mathcal{T })) \right] . \end{aligned}$$
(22)

It can be checked that the definition of \(\mathbb{P }_x^{\psi ,a}\) (and of \(\mathbb{N }^{\psi ,a}\)) does not depend on \(\theta \ge \theta ^*\). The probability measures \(\mathbb{P }_x^{\psi ,a}\) satisfy a consistency property, allowing us to define the super-critical Lévy tree in the following way.

Theorem 2.22

Let \(\psi \) be a super-critical branching mechanism satisfying Assumptions 1 and 2. There exists a probability measure \(\mathbb{P }_x^{\psi }\) (resp. a \(\sigma \)-finite measure \(\mathbb{N }^\psi \)) on \(\mathbb{T }\) such that for \(a>0\), we have, if \(F\) is a measurable non-negative functional on \(\mathbb{T }\),

$$\begin{aligned} \mathbb E _x^\psi [F(\pi _a(\mathcal{T }))] = \mathbb E _x^{\psi ,a} [F(\mathcal{T })], \end{aligned}$$

the same being true under \(\mathbb{N }^\psi \).

The w-tree \(\mathcal{T }\) under \(\mathbb{P }_x^\psi \) or \(\mathbb{N }^\psi \) is called a \(\psi \)-Lévy w-tree or simply a Lévy tree.

Proof

For \(n\ge 1,\ 0<a_1<\dots <a_n\), we define a probability measure on \(\mathbb{T }^n\) by:

$$\begin{aligned}&\mathbb{P }_x^{\psi ,a_1,\dots ,a_n} (\mathcal{T }_1\in A_1,\dots ,\mathcal{T }_n \in A_n) = \mathbb{P }_x^{\psi ,a_n} ( \mathcal{T }\in A_n, \pi _{a_{n-1}}(\mathcal{T })\\&\quad \in A_{n-1},\dots , \pi _{a_1}(\mathcal{T }) \in A_1 ) \end{aligned}$$

if \(A_1,\dots ,A_n\) are Borel subsets of \(\mathbb{T }\). The probability measures

$$\begin{aligned} (\mathbb{P }_x^{\psi ,a_1,\dots ,a_n},\ n\ge 1,\ 0<a_1<\dots <a_n) \end{aligned}$$

then form a projective family. This is a consequence of the martingale property of \(M^{\psi _\theta ,-\theta }\) and the fact that the projectors \(\pi _a\) satisfy the obvious compatibility relation \(\pi _b \circ \pi _a = \pi _b\) if \(0<b<a\).

By the Daniell–Kolmogorov theorem, there exists a probability measure \(\tilde{\mathbb{P }}_x^\psi \) on the product space \(\mathbb{T }^\mathbb{R _+}\) such that the finite-dimensional distributions of a \(\tilde{\mathbb{P }}_x^\psi \)-distributed family are described by the measures defined above. It is easy to construct a version of a \(\tilde{\mathbb{P }}_x^\psi \)-distributed process that is a.s. increasing. Indeed, almost all sample paths of a \(\tilde{\mathbb{P }}_x^\psi \)-distributed process are increasing when restricted to rational numbers. We can then define a w-tree \(\mathcal{T }^a\) for any \(a>0\) by considering a decreasing sequence of rational numbers \(a_n \downarrow a\) and defining \(\mathcal{T }^a = \cap _{n\ge 1} \mathcal{T }^{a_n}\). Notice that \(\mathcal{T }^a\) is closed for all \(a\in \mathbb R _+\). It is easy to check that the finite-dimensional distributions of this new process are unchanged by this procedure. Let us then consider \(\mathcal{T }= \cup _{a>0} \mathcal{T }^a\), endowed with the obvious metric \(d^\mathcal{T }\) and mass measure \(\mathbf{m}\). It is clear that \(\mathcal{T }\) is a real tree, rooted at the common root of the \(\mathcal{T }^a\). All the \(\mathcal{T }^a\) are compact, so that \(\mathcal{T }\) is locally compact and complete. The measure \(\mathbf{m}\) is locally finite since all the \(\mathbf{m}^{\mathcal{T }^a}\) are finite measures. Therefore, \(\mathcal{T }\) is a.s. a w-tree. Then, if we define \(\mathbb{P }_x^\psi \) to be the distribution of \(\mathcal{T }\), the conclusion follows. Similar arguments hold under \(\mathbb{N }^\psi \). \(\square \)

Remark 2.23

Another definition of super-critical Lévy trees was given by Duquesne and Winkel [15, 16]: they consider increasing families of Galton–Watson trees with exponential edge lengths which satisfy a certain hereditary property (such as uniform Bernoulli coloring of the leaves). Lévy trees are then defined to be the Gromov–Hausdorff limits of these processes. Another approach via backbone decompositions is given in [11].

All the definitions we made for sub-critical Lévy trees then carry over to the super-critical case. In particular, the level set measure \(\ell ^a\), which is \(\pi _a(\mathcal{T })\)-measurable, can be defined using the Girsanov formula. Thanks to Theorem 2.12, it is easy to show that the mass process \((\mathcal{Z }_a=\langle \ell ^a,1 \rangle ,\ a\ge 0)\) is under \(\mathbb{P }_x^\psi \) a CSBP with branching mechanism \(\psi \). In particular, with \(u\) defined in (11) and \(b\) by (12), we have:

$$\begin{aligned} \mathbb{N }^\psi \left[ 1-\mathrm{e}^{-\lambda \mathcal{Z }_a }\right] =u(a,\lambda ) \quad \text{ and }\quad \mathbb{N }^\psi \left[ H_{\mathrm{max}}(\mathcal{T })>a \right] =\mathbb{N }^\psi \left[ \mathcal{Z }_a >0 \right] = b(a). \end{aligned}$$
(23)

Notice that \(b\) is finite only under Assumption 2. We set:
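To make (23) concrete in a super-critical case, consider (as an illustrative assumption) \(\psi (\lambda )=\alpha \lambda +\beta \lambda ^2\) with \(\alpha <0\) and \(\beta >0\). A partial fraction decomposition in (12) gives, for \(b>-\alpha /\beta \):

$$\begin{aligned} \int _{b}^\infty \frac{dr}{r(\alpha +\beta r)} = \frac{1}{\alpha }\log \left( 1+\frac{\alpha }{\beta b}\right) = a \quad \Longrightarrow \quad b(a)=\frac{\alpha }{\beta \left( \mathrm{e}^{\alpha a}-1\right) } = \frac{|\alpha |}{\beta \left( 1-\mathrm{e}^{-|\alpha | a}\right) }. \end{aligned}$$

In particular, \(b(a)\) decreases to \(-\alpha /\beta \), the positive root of \(\psi \), as \(a\rightarrow +\infty \), so that \(\mathbb{N }^\psi [H_{\mathrm{max}}(\mathcal{T })=+\infty ]=-\alpha /\beta >0\) in this example. Letting \(\alpha \uparrow 0\), one finds the critical value \(b(a)=1/(\beta a)\).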

$$\begin{aligned} \sigma =\int _0^{+\infty } \mathcal{Z }_a\, da =\mathbf{m}^\mathcal{T }(\mathcal{T }) \end{aligned}$$
(24)

for the total mass of the Lévy tree \(\mathcal{T }\). Notice this is consistent with (16) and (14) which are defined for (sub)critical Lévy trees. Thanks to (24), notice that \(\sigma \) is distributed as the total population size of a CSBP with branching mechanism \(\psi \). In particular, its Laplace transform is given for \(\lambda > 0\) by:

$$\begin{aligned} \mathbb{N }^\psi \left[ 1-\mathrm{e}^{-\lambda \sigma }\right] = \psi ^{-1}(\lambda ). \end{aligned}$$
(25)
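As a quick numerical illustration of (25), the following sketch evaluates \(\psi ^{-1}\) in closed form for a quadratic branching mechanism \(\psi (\lambda )=\alpha \lambda +\beta \lambda ^2\) (that is, \(\Pi =0\)). The function names and parameter values are illustrative assumptions. We also assume that \(\psi ^{-1}\) denotes the inverse of \(\psi \) on \([q_0,+\infty )\), where \(q_0\) is the largest root of \(\psi \).

```python
import math

def psi(lam, alpha, beta):
    """Quadratic branching mechanism psi(lam) = alpha*lam + beta*lam**2 (assumed, Pi = 0)."""
    return alpha * lam + beta * lam ** 2

def psi_inverse(y, alpha, beta):
    """Largest solution lam of psi(lam) = y, for y >= 0 and beta > 0."""
    return (-alpha + math.sqrt(alpha ** 2 + 4.0 * beta * y)) / (2.0 * beta)

# Critical case (alpha = 0): psi^{-1}(0) = 0, so sigma is finite N-a.e.
critical = psi_inverse(0.0, alpha=0.0, beta=1.0)

# Super-critical case (alpha < 0): psi^{-1}(0) > 0 is the N-measure of {sigma = +infinity}.
supercritical = psi_inverse(0.0, alpha=-2.0, beta=1.0)
```

In this sketch the value at \(0\) vanishes in the critical case and is strictly positive in the super-critical case.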

Notice that \(\mathbb{N }^\psi [\sigma =+\infty ]=\psi ^{-1}(0)> 0\). We recall the following theorem, from [2], which sums up the situation for general branching mechanisms \(\psi \).

Theorem 2.24

[2]  Let \(\psi \) be any branching mechanism satisfying Assumptions 1 and 2, and let \(q>0\) be such that \(\psi (q)\ge 0\). Then, the probability measure \(\mathbb{P }_x^{\psi _q}\) on \(\mathbb{T }\) is absolutely continuous w.r.t. \(\mathbb{P }_x^\psi \), with

$$\begin{aligned} \frac{d\mathbb{P }_x^{\psi _q}}{d\mathbb{P }_x^\psi } =M_\infty ^{\psi ,q} = \mathrm{e}^{qx-\psi (q)\sigma } {\mathbf{1}}_{\{\sigma <+\infty \}}. \end{aligned}$$
(26)

Similarly, the excursion measure \(\mathbb{N }^{\psi _q}\) on \(\mathbb{T }\) is absolutely continuous w.r.t. \(\mathbb{N }^\psi \) and we have

$$\begin{aligned} \frac{d\mathbb{N }^{\psi _q}}{d\mathbb{N }^\psi } = \mathrm{e}^{-\psi (q)\sigma } {\mathbf{1}}_{\{\sigma <+\infty \}}. \end{aligned}$$
(27)

Applying the Girsanov formulas (26) and (27) to the branching mechanism \(\psi _\theta \) with \(q=\bar{\theta }-\theta \), where \(\bar{\theta }\) is defined by (10), we get the following remarkable corollary, since \(\psi _{\theta }(\bar{\theta }-\theta ) = 0\).

Corollary 2.25

Let \(\psi \) be a critical branching mechanism satisfying Assumptions 1 and 2, and \(\theta \in \Theta ^\psi \) with \(\theta <0\). Let \(F\) be a non-negative measurable functional defined on \(\mathbb{T }\). We have:

$$\begin{aligned} \begin{aligned} \mathrm{e}^{(\bar{\theta }-\theta )x} \, \mathbb{E }_x^{\psi _\theta }[F(\mathcal{T }){\mathbf{1}}_{\{\sigma <+\infty \}} ]&= \mathbb{E }_x^{\psi _{\bar{\theta }}}[F(\mathcal{T })],\\ \mathbb{N }^{\psi _\theta }[F(\mathcal{T }){\mathbf{1}}_{\{\sigma <+\infty \}}]&= \mathbb{N }^{\psi _{\bar{\theta }}}[F(\mathcal{T })]. \end{aligned} \end{aligned}$$
(28)
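The second identity in (28) can be checked numerically in a simple case. The sketch below is an illustration under assumptions: it takes the critical quadratic mechanism \(\psi (\lambda )=\beta \lambda ^2\), assumes the shift \(\psi _\theta (\lambda )=\psi (\theta +\lambda )-\psi (\theta )\), and uses \(F(\mathcal{T })=1-\mathrm{e}^{-\lambda \sigma }\), so that both sides reduce to Laplace functionals of the form (25). All function names are hypothetical.

```python
import math

BETA = 1.0  # assumed critical quadratic mechanism psi(lam) = BETA * lam**2

def psi(lam):
    return BETA * lam ** 2

def psi_shift(theta, lam):
    """psi_theta(lam) = psi(theta + lam) - psi(theta) (assumed form of the shift)."""
    return psi(theta + lam) - psi(theta)

def psi_shift_inverse(theta, y):
    """Largest root lam of psi_theta(lam) = y, for y >= 0."""
    return -theta + math.sqrt(theta ** 2 + y / BETA)

def theta_bar(theta):
    """bar(theta) = psi^{-1}(psi(theta)); equals |theta| for this psi."""
    return abs(theta)

theta, lam = -0.7, 1.3
tb = theta_bar(theta)

# Left side of (28): F = 1 on {sigma = +infinity}, so restricting to
# {sigma < +infinity} subtracts N^{psi_theta}[sigma = +infinity] = psi_theta^{-1}(0).
lhs = psi_shift_inverse(theta, lam) - psi_shift_inverse(theta, 0.0)

# Right side of (28): Laplace functional (25) for the mechanism psi_{bar(theta)}.
rhs = psi_shift_inverse(tb, lam)
```

Both quantities equal \(\theta +\sqrt{\theta ^2+\lambda /\beta }\) in this example, and `psi_shift(theta, tb - theta)` vanishes up to rounding, which is the fact used to derive the corollary.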

We deduce from Proposition 2.14 and Theorem 2.22 that the point process \(\mathcal{N }_0^{\mathcal{T }}(dx,d\mathcal{T }^{\prime }) \) defined by (15) with \(a=0\) is under \(\mathbb{P }^\psi _x(d\mathcal{T })\) a Poisson point measure on \(\{\varnothing \}\times \mathbb{T }\) with intensity \(\sigma \delta _\varnothing (dx) \mathbb{N }^\psi [d\mathcal{T }^{\prime }]\). Then we deduce from (21), with \(F=1\), that for \(\theta \ge \theta ^*\):

$$\begin{aligned} \mathbb{N }^{\psi _{\theta }} \left[ 1-\exp \left( \theta \mathcal{Z }_a +\psi (\theta ) \int _0^a \mathcal{Z }_s ds\right) \right] = -\theta . \end{aligned}$$
(29)

2.6 Pruning Lévy trees

We recall the construction from [7] on the pruning of Lévy trees. Let \(\mathcal{T }\) be a random Lévy w-tree under \(\mathbb{P }_x^\psi \) (or under \(\mathbb{N }^\psi \)), with \(\psi \) conservative. Let

$$\begin{aligned} m^{(\mathrm{ske})}(dx,d\theta ) = \sum _{i\in I^{\mathrm{ske}}} \delta _{(x_i,\theta _i)}(dx,d\theta ) \end{aligned}$$

be, conditionally on \(\mathcal{T }\), a Poisson point measure on \(\mathcal{T }\times \mathbb R _+\) with intensity \(2\beta l^{\mathcal{T }}(dx) d\theta \). Since there is a.s. a countable number of branching points (which have \(l^{\mathcal{T }}\)-measure 0), the atoms of this measure are distributed on \(\mathcal{T }{\setminus }(\mathrm{Br}(\mathcal{T }) \cup \mathrm{Lf}(\mathcal{T }))\).

If \(\Pi =0\), we have \(\mathrm{Br}_\infty (\mathcal{T })=\varnothing \) a.s. whereas if \(\Pi (\mathbb R _+)=\infty \), \(\mathrm{Br}_\infty (\mathcal{T })\) is a.s. a countable dense subset of \(\mathcal{T }\). If the latter condition holds, we consider, conditionally on \(\mathcal{T }\), a Poisson point measure

$$\begin{aligned} m^{(\mathrm{nod})}(dx,d\theta ) = \sum _{i\in I^{\mathrm{nod}}} \delta _{(x_i,\theta _i)}(dx,d\theta ) \end{aligned}$$

on \(\mathcal{T }\times \mathbb R _+\) with intensity

$$\begin{aligned} \sum _{y\in \mathrm{Br}_\infty (\mathcal{T })}\Delta _y\delta _y(dx)\,d\theta \end{aligned}$$

where \(\Delta _x\) is the mass of the node \(x\), defined by (19). Hence, if \(\theta >0\), a node \(x\in \mathrm{Br}_\infty (\mathcal{T })\) is an atom of \(m^{(\mathrm{nod})}(dx,[0,\theta ])\) with probability \(1-\exp (-\theta \Delta _x)\). The set

$$\begin{aligned} \left\{ x_i,\ i\in I^{\mathrm{nod}}\right\} =\left\{ x\in \mathcal{T },\ m^{(\mathrm{nod})} \bigl (\{x\}\times \mathbb R _+\bigr )>0\right\} \end{aligned}$$

of marked branching points corresponds \(\mathbb{P }_x^\psi \)-a.s. or \(\mathbb{N }^\psi \)-a.e. to \(\mathrm{Br}_\infty (\mathcal{T })\). For \(i\in I^{\mathrm{nod}}\), we set

$$\begin{aligned} \theta _i=\inf \left\{ \theta >0,\ m^{(\mathrm{nod})}\bigl (\{x_i\}\times [0,\theta ]\bigr )>0\right\} \end{aligned}$$

to be the first mark on \(x_i\) (which is conditionally on \(\mathcal{T }\) exponentially distributed with parameter \(\Delta _{x_i}\)), and we set

$$\begin{aligned} \left\{ \theta _j,\ j\in J_i^{\mathrm{nod}}\right\} =\left\{ \theta >\theta _i, \ m^{(\mathrm{nod})}\bigl (\{x_i\}\times \{\theta \}\bigr )>0\right\} \end{aligned}$$

so that we can write

$$\begin{aligned} m^{(\mathrm{nod})}(dx,d\theta ) = \sum _{i\in I^{\mathrm{nod}}} \delta _{x_i}(dx) \left( \delta _{\theta _i}(d\theta ) + \sum _{j\in J^{\mathrm{nod}}_i} \delta _{\theta _j}(d\theta )\right) . \end{aligned}$$
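To illustrate the marking of a single node, the following minimal simulation samples the first mark on a node of mass \(\Delta \) as the first atom of a Poisson process of rate \(\Delta \) on \(\mathbb R _+\), and compares the empirical probability of a mark in \([0,\theta ]\) with \(1-\exp (-\theta \Delta )\). The function names, sample size and numerical values are illustrative assumptions.

```python
import math
import random

def first_mark_time(delta, rng):
    """First atom of a Poisson process on R_+ with rate delta (exponential with parameter delta)."""
    return rng.expovariate(delta)

def marked_fraction(delta, theta, n_samples, rng):
    """Empirical probability that a node of mass delta carries a mark in [0, theta]."""
    hits = sum(first_mark_time(delta, rng) <= theta for _ in range(n_samples))
    return hits / n_samples

rng = random.Random(0)          # fixed seed for reproducibility
delta, theta = 2.0, 0.4         # hypothetical node mass and pruning level
empirical = marked_fraction(delta, theta, 100_000, rng)
exact = 1.0 - math.exp(-theta * delta)
```

Up to Monte Carlo error, `empirical` should be close to `exact`.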

We set the measure of marks:

$$\begin{aligned} \mathcal{M }(dx,d\theta )=m^{(\mathrm{ske})}(dx,d\theta ) + m^{(\mathrm{nod})}(dx,d\theta ), \end{aligned}$$
(30)

and consider the family of w-trees \(\Lambda (\mathcal{T },\mathcal{M })=(\Lambda _\theta (\mathcal{T },\mathcal{M }),\ \theta \ge 0)\), where the \(\theta \)-pruned w-tree \(\Lambda _\theta \) is defined by:

$$\begin{aligned} \Lambda _\theta (\mathcal{T },\mathcal{M }) = \Bigg \{ x\in \mathcal{T },\ \mathcal{M }([\![\varnothing , x [\![\times [0,\theta ]) = 0 \Bigg \}, \end{aligned}$$

rooted at \(\varnothing ^{\Lambda _\theta (\mathcal{T },\mathcal{M }) } = \varnothing ^\mathcal{T }\), and where the metric \(d^{\Lambda _\theta (\mathcal{T },\mathcal{M })} \) and the mass measure \(\mathbf{m}^{\Lambda _\theta (\mathcal{T },\mathcal{M })} \) are the restrictions of \(d^\mathcal{T }\) and \(\mathbf{m}^\mathcal{T }\) to \(\Lambda _\theta (\mathcal{T },\mathcal{M }) \). In particular, we have \(\Lambda _0( \mathcal{T },\mathcal{M })=\mathcal{T }\). The family of w-trees \(\Lambda (\mathcal{T },\mathcal{M })\) is a non-increasing family of real trees, in the sense that \(\Lambda _\theta (\mathcal{T },\mathcal{M })\) is a subtree of \(\Lambda _{\theta ^{\prime }}(\mathcal{T },\mathcal{M })\) for \(0\le \theta ^{\prime }\le \theta \), see Fig. 1. Moreover, the pruning operators satisfy a cocycle property, for \(\theta _1\ge 0\) and \(\theta _2\ge 0\):

$$\begin{aligned} \Lambda _{\theta _2}\bigl (\Lambda _{\theta _1}(\mathcal{T },\mathcal{M }),\mathcal{M }_{\theta _1}\bigr ) = \Lambda _{\theta _2+\theta _1}(\mathcal{T },\mathcal{M }), \end{aligned}$$

where \(\mathcal{M }_{\theta }(A\times [0,q])=\mathcal{M }(A\times [\theta , \theta +q])\). Abusing notation, we write \(\mathbb{N }^\psi (d\mathcal{T },d\mathcal{M })\) for the distribution of the pair \((\mathcal{T },\mathcal{M })\) when \(\mathcal{T }\) is distributed according to \(\mathbb{N }^\psi (d\mathcal{T })\) and, conditionally on \(\mathcal{T }\), \(\mathcal{M }\) is distributed as described above.
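The cocycle property can be illustrated on a finite discrete analogue of the pruning procedure. In the sketch below, which is a toy model and not the continuum construction, a tree is a dictionary of parent pointers, marks are lists of integer times attached to nodes, and a node is kept when none of its strict ancestors carries a mark with time at most \(\theta \) (mimicking the half-open segment from the root to \(x\)). The names `prune` and `shift` and the example data are hypothetical.

```python
def prune(parent, marks, theta):
    """Keep the nodes x having no mark of time <= theta on their strict ancestors."""
    kept = set()
    for x in parent:
        y, alive = parent[x], True
        while y is not None:
            if any(t <= theta for t in marks.get(y, ())):
                alive = False
                break
            y = parent[y]
        if alive:
            kept.add(x)
    sub_parent = {x: parent[x] for x in kept}
    sub_marks = {x: ts for x, ts in marks.items() if x in kept}
    return sub_parent, sub_marks

def shift(marks, theta):
    """Discrete analogue of M_theta: keep marks with time >= theta and subtract theta."""
    return {x: [t - theta for t in ts if t >= theta] for x, ts in marks.items()}

# Toy tree given by parent pointers (node 0 is the root) and integer mark times.
parent = {0: None, 1: 0, 2: 0, 3: 1, 4: 1, 5: 2, 6: 5}
marks = {1: [3, 7], 2: [5], 5: [1]}
theta1, theta2 = 2, 2

step1_parent, step1_marks = prune(parent, marks, theta1)
two_step, _ = prune(step1_parent, shift(step1_marks, theta1), theta2)
one_step, _ = prune(parent, marks, theta1 + theta2)
```

In this example, pruning at level \(\theta _1\) and then at level \(\theta _2\) with the shifted marks gives the same set of nodes as pruning directly at level \(\theta _1+\theta _2\).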

The following result can be deduced from [2].

Theorem 2.26

Let \(\psi \) be a branching mechanism satisfying Assumptions 1 and 2. There exists a non-increasing \(\mathbb{T }\)-valued Markov process \((\mathcal{T }_\theta , \theta \in \Theta ^\psi )\) such that for all \(q\in \Theta ^\psi \), the process \((\mathcal{T }_{\theta +q}, \theta \ge 0)\) is distributed as \(\Lambda (\mathcal{T },\mathcal{M })\) under \(\mathbb{N }^{\psi _q}[d\mathcal{T },d\mathcal{M }]\).

In particular, this theorem implies that \(\mathcal{T }_\theta \) is distributed according to \(\mathbb{N }^{\psi _\theta }\) for \(\theta \in \Theta ^\psi \) and that for \(\theta _0 \ge 0\), under \(\mathbb{N }^\psi \), the process of pruned trees \((\Lambda _{\theta _0+\theta }(\mathcal{T }), \theta \ge 0)\) has the same distribution as \((\Lambda _\theta (\mathcal{T }),\theta \ge 0)\) under \(\mathbb{N }^{\psi _{\theta _0}}[d\mathcal{T }]\).

We want to study the time-reversed process \((\mathcal{T }_{-\theta }, \theta \in -\Theta ^\psi )\), which can be seen as a growth process. This process grows by attaching sub-trees at a random point, rather than slowly growing uniformly along the branches. We recall some results from [2] on the growth process. From now on, we will assume in this section that the branching mechanism  \(\psi \) is critical, so that \(\psi _\theta \) is sub-critical iff \(\theta >0\) and super-critical iff \(\theta <0\).

Fig. 1 The pruning process, starting from explosion time \(A\) defined in (32)

We will use the following notation for the total mass of the tree \(\mathcal{T }_\theta \) at time \(\theta \in \Theta ^\psi \):

$$\begin{aligned} \sigma _\theta =\mathbf{m}^{\mathcal{T }_\theta } (\mathcal{T }_\theta ). \end{aligned}$$
(31)

The total mass process \((\sigma _\theta , \theta \in \Theta ^\psi )\) is a pure-jump process taking values in \((0,+\infty ]\).

Lemma 2.27

[2] Let \(\psi \) be a critical branching mechanism satisfying Assumptions 1 and 2. If \(0 \le \theta _2 < \theta _1\), then we have:

$$\begin{aligned} \mathbb{N }^\psi [\sigma _{\theta _2} | \mathcal{T }_{\theta _1} ] = \sigma _{\theta _1} \frac{\psi ^{\prime }(\theta _1)}{\psi ^{\prime }(\theta _2)}\cdot \end{aligned}$$
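Taking expectations in Lemma 2.27 gives \(\mathbb{N }^\psi [\sigma _{\theta _2}]\,\psi ^{\prime }(\theta _2)=\mathbb{N }^\psi [\sigma _{\theta _1}]\,\psi ^{\prime }(\theta _1)\) for \(0<\theta _2<\theta _1\). Since \(\mathcal{T }_\theta \) is distributed according to \(\mathbb{N }^{\psi _\theta }\), differentiating (25) at \(\lambda =0\) suggests \(\mathbb{N }^\psi [\sigma _\theta ]=1/\psi ^{\prime }(\theta )\) for \(\theta >0\). The following sketch checks this consistency numerically for the quadratic mechanism \(\psi (\lambda )=\beta \lambda ^2\), assuming \(\psi _\theta (\lambda )=\psi (\theta +\lambda )-\psi (\theta )\). The function names and the finite-difference step are illustrative choices.

```python
import math

BETA = 0.5  # assumed critical quadratic mechanism psi(lam) = BETA * lam**2

def psi_prime(theta):
    return 2.0 * BETA * theta

def psi_shift_inverse(theta, y):
    """Inverse of psi_theta(lam) = psi(theta + lam) - psi(theta), for theta > 0 and y >= 0."""
    return -theta + math.sqrt(theta ** 2 + y / BETA)

def mean_total_mass(theta, h=1e-6):
    """Finite-difference approximation of the derivative of psi_theta^{-1} at 0."""
    return (psi_shift_inverse(theta, h) - psi_shift_inverse(theta, 0.0)) / h

theta1, theta2 = 1.5, 0.4
lhs = mean_total_mass(theta2)
rhs = mean_total_mass(theta1) * psi_prime(theta1) / psi_prime(theta2)
```

Here `lhs` and `rhs` agree up to the finite-difference error, and both are close to `1 / psi_prime(theta2)`.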

Consider the ascension time (or explosion time):

$$\begin{aligned} A = \inf \left\{ \theta \in \Theta ^\psi ,\ \sigma _\theta < \infty \right\} \!, \end{aligned}$$
(32)

where we use the convention \(\inf \varnothing = \theta _\infty \). The following theorem gives the distribution of the ascension time \(A\) and the distribution of the tree at this random time. Recall that \(\bar{\theta }=\psi ^{-1}(\psi (\theta ))\) is defined in (10).

Theorem 2.28

[2] Let \(\psi \) be a critical branching mechanism satisfying Assumptions 1 and 2.

  (1) For all \(\theta \in \Theta ^\psi \), we have \( \mathbb{N }^{\psi } [ A > \theta ] = \bar{\theta }- \theta \).

  (2) If \(\theta _\infty < \theta <0\), under \(\mathbb{N }^\psi \), we have, for any non-negative measurable functional \(F\),

    $$\begin{aligned} \mathbb{N }^\psi [ F(\mathcal{T }_{A+\theta ^{\prime }},\theta ^{\prime }\ge 0) | A=\theta ] = \psi ^{\prime }(\bar{\theta }) \mathbb{N }^\psi \left[ F(\mathcal{T }_{\theta ^{\prime }},\theta ^{\prime }\ge 0) \sigma _0 \mathrm{e}^{-\psi (\theta )\sigma _0} \right] . \end{aligned}$$

  (3) For all \(\theta \in \Theta ^\psi \), we have \( \mathbb{N }^\psi [\sigma _A<+\infty |A=\theta ]=1\).

In other words, at the ascension time, the tree can be seen as a size-biased critical Lévy tree. A precise description of \(\mathcal{T }_A\) is given in [2]. Notice that Assumption 2 is not needed in the setting of [2].

3 The growing tree-valued process

3.1 Special Markov Property of pruning

In [7], the authors prove a formula describing the structure of a Lévy tree, conditionally on the \(\theta \)-pruned tree obtained from it in the (sub)critical case. We will give a general version of this result. From the measure of marks, \(\mathcal{M }\) in (30), we define a measure of increasing marks by:

$$\begin{aligned} \mathcal{M }^\uparrow (dx,d\theta ^{\prime }) = \sum _{i\in I^\uparrow } \delta _{(x_i,\theta _i)}(dx,d\theta ^{\prime }), \end{aligned}$$
(33)

with

$$\begin{aligned} I^\uparrow =\left\{ i\in I^\mathrm{ske}\cup I^\mathrm{nod},\ \mathcal{M }([\![\varnothing ,x_i ]\!]\times [0,\theta _i]) = 1\right\} . \end{aligned}$$

The atoms \((x_i,\theta _i)\) for \( i\in I^\uparrow \) correspond to marks such that there are no marks of \(\mathcal{M }\) on \([\![\varnothing ,x_i ]\!]\) with a \(\theta \)-component smaller than \(\theta _i\). In the case of multiple \(\theta _j\) for a given node \(x_i\in \mathrm{Br}_\infty (\mathcal{T })\), we only keep the smallest one. In the case \(\Pi =0\), the measure \(\mathcal{M }^\uparrow \) describes the jumps of a record process on the tree, see [3] for further work in this direction. The \(\theta \)-pruned tree can alternatively be defined using \(\mathcal{M }^\uparrow \) instead of \(\mathcal{M }\) since for \(\theta \ge 0\):

$$\begin{aligned} \Lambda _\theta (\mathcal{T },\mathcal{M }) = \left\{ x\in \mathcal{T },\ \mathcal{M }^\uparrow ([\![\varnothing , x [\![\times [0,\theta ]) = 0 \right\} \!. \end{aligned}$$

We set:

$$\begin{aligned} I_\theta ^{\uparrow }&= \left\{ i\in I^\uparrow ,\ x_i\in \mathrm{Lf}(\Lambda _\theta (\mathcal{T }, \mathcal{M }))\right\} \\&= \left\{ i\in I^\uparrow ,\ \theta _i<\theta \quad \text{ and }\quad \mathcal{M }^\uparrow ([\![\varnothing ,x_i [\![\times [0,\theta ]) = 0\right\} \end{aligned}$$

and for \(i\in I_\theta ^{\uparrow }\):

$$\begin{aligned} \mathcal{T }^i = \mathcal{T }{\setminus } \mathcal{T }^{\varnothing , x_i}=\{ x\in \mathcal{T },\ x_i \in [\![\varnothing ,x]\!]\}, \end{aligned}$$

where \(\mathcal{T }^{y,x}\) is the connected component of \(\mathcal{T }{\setminus }\{x\}\) containing \(y\). For \(i\in I_\theta ^{\uparrow }\), \(\mathcal{T }^i\) is a real tree, and we will consider \(x_i\) as its root. The metric and mass measure on \(\mathcal{T }^i\) are the restrictions of the metric and mass measure of \(\mathcal{T }\) on \(\mathcal{T }^i\). By construction, we have:

$$\begin{aligned} \mathcal{T }=\Lambda _\theta (\mathcal{T },\mathcal{M }) \circledast _{i\in I_\theta ^{\uparrow }} (\mathcal{T }^i,x_i). \end{aligned}$$
(34)

Now we can state the general special Markov property.

Theorem 3.1

(Special Markov property) Let \(\psi \) be a branching mechanism satisfying Assumptions 1 and 2. Let \(\theta >0\). Conditionally on \(\Lambda _\theta (\mathcal{T },\mathcal{M })\), the point measure:

$$\begin{aligned} \mathcal{M }_\theta ^{\uparrow }(dx,d\mathcal{T }^{\prime },d\theta ^{\prime }) = \sum _{i\in I_\theta ^{\uparrow }} \delta _{(x_i,\mathcal{T }^i,\theta _i)}(dx,d\mathcal{T }^{\prime },d\theta ^{\prime }) \end{aligned}$$

under \(\mathbb{P }^\psi _{r_0}\) (or under \(\mathbb{N }^\psi \)) is a Poisson point measure on \(\Lambda _\theta (\mathcal{T },\mathcal{M })\times \mathbb{T }\times (0,\theta ]\) with intensity:

$$\begin{aligned} \mathbf{m}^{\Lambda _\theta (\mathcal{T },\mathcal{M })}(dx) \left( 2\beta \mathbb{N }^\psi [d\mathcal{T }^{\prime }] + \int _{(0,+\infty )} \Pi (dr) \, r \mathrm{e}^{-\theta ^{\prime } r} \mathbb{P }^\psi _r(d\mathcal{T }^{\prime }) \right) \, {\mathbf{1}}_{(0,\theta ]}(\theta ^{\prime })\, d\theta ^{\prime }.\nonumber \\ \end{aligned}$$
(35)

Proof

It is not difficult to adapt the proof of the special Markov property in [7] to get Theorem 3.1 in the (sub)critical case by taking into account the pruning times \(\theta _i\) and the w-tree setting; we omit this proof, which can be found in [6]. We show how to extend the result to super-critical Lévy trees using the Girsanov transform of Definition 2.21.

Assume that \(\psi \) is super-critical. For \(a>0\), we will write \(\Lambda _{\theta ,a} (\mathcal{T },\mathcal{M })=\pi _a(\Lambda _\theta (\mathcal{T },\mathcal{M }))\) for short. According to (34) and the definition of super-critical Lévy trees, we have that for any \(a>0\), the truncated tree \(\pi _a(\mathcal{T })\) can be written as:

$$\begin{aligned} \pi _a(\mathcal{T }) = \Lambda _{\theta ,a}(\mathcal{T },\mathcal{M }){\mathop {\mathop {\circledast }\nolimits _{i\in I_\theta ^{\uparrow },}}\limits _{H_{x_i}\le a}} \left( \pi _{a-H_{x_i}}(\mathcal{T }^i),x_i\right) \end{aligned}$$

and we have to prove that \(\sum _{i\in I_\theta ^{\uparrow }} \delta _{(x_i,\mathcal{T }^i,\theta _i)}(dx,d\mathcal{T }^{\prime },d\theta ^{\prime })\) is conditionally on \(\Lambda _{\theta }(\mathcal{T },\mathcal{M }) \) a Poisson point measure with intensity (35). Since \(a\) is arbitrary, it is enough to prove that the point measure \(\mathcal{M }_a\), defined by

$$\begin{aligned} \mathcal{M }_a(dx,d\mathcal{T }^{\prime },d\theta ^{\prime }) = \sum _{i\in I_\theta ^{\uparrow }} {\mathbf{1}}_{\{ H_{x_i} \le a \}} \, \delta _{(x_i,\pi _{a-H_{x_i}}(\mathcal{T }^i),\theta _i)} (dx, d\mathcal{T }^{\prime },d\theta ^{\prime }), \end{aligned}$$

is conditionally on \(\Lambda _{\theta ,a}(\mathcal{T },\mathcal{M }) \) a Poisson point measure with intensity:

$$\begin{aligned}&{\mathbf{1}}_{[0,a]}(H_x)\, \mathbf{m}^{\Lambda _{\theta }(\mathcal{T },\mathcal{M })}(dx)\, {\mathbf{1}}_{(0,\theta ]}(\theta ^{\prime })\, d\theta ^{\prime }\nonumber \\&\quad \quad \times \left( 2\beta (\pi _{a-H_x})_* \mathbb{N }^\psi (d\mathcal{T }^{\prime })+ \int _{(0,+\infty )} \Pi (dr) \, r \mathrm{e}^{-\theta ^{\prime } r} (\pi _{a-H_x})_*\mathbb{P }^\psi _r(d\mathcal{T }^{\prime })\right) .\qquad \end{aligned}$$
(36)

Recall \(\theta ^*\) is the unique real number such that \(\psi _{\theta ^*}^{\prime }(0)=0\), that is, such that \(\psi _{\theta ^*}\) is critical. Let \(\Phi \) be a non-negative, measurable functional on \(\Lambda _{\theta ,a}(\mathcal{T },\mathcal{M })\times \mathbb{T }\times (0,\theta ]\) and let \(F\) be a non-negative measurable functional on \(\mathbb{T }\). Let

$$\begin{aligned} B = \mathbb{N }^\psi \left[ F(\Lambda _{\theta ,a}(\mathcal{T },\mathcal{M })) \exp (-\left\langle \mathcal{M }_a,\Phi \right\rangle ) \right] \!. \end{aligned}$$

Thanks to the Girsanov formula (22) and the special Markov property for critical branching mechanisms, we get:

$$\begin{aligned} B&= \mathbb{N }^{\psi _{\theta ^*}} \left[ \!F(\Lambda _{\theta ,a}(\mathcal{T },\mathcal{M })) \exp (-\left\langle \mathcal{M }_a,\Phi \right\rangle ) \exp \left( \theta ^* \mathcal{Z }_a(\mathcal{T })+ \psi (\theta ^*) \int _0^a \mathcal{Z }_h(\mathcal{T }) dh \right) \!\right] \\&= \mathbb{N }^{\psi _{\theta ^*}}\left[ F(\Lambda _{\theta ,a}(\mathcal{T },\mathcal{M })) \exp \left( \theta ^* \mathcal{Z }_a(\Lambda _\theta (\mathcal{T },\mathcal{M })) + \psi (\theta ^*) \int _0^a \mathcal{Z }_h(\Lambda _\theta (\mathcal{T },\mathcal{M })) dh \right) \right. \\&\quad \left. \times \exp \left( -\int \mathbf{m}^{\Lambda _{\theta ,a}(\mathcal{T },\mathcal{M })}(dx) G(H_x,x,\theta ) \right) \right] , \end{aligned}$$

with \(\mathbf{m}^{\Lambda _{\theta ,a}(\mathcal{T },\mathcal{M })}(dx)= {\mathbf{1}}_{[0,a]}(H_x)\, \mathbf{m}^{\Lambda _{\theta }(\mathcal{T },\mathcal{M })}(dx) \) and \(G(h,x,\theta )\) equal to:

$$\begin{aligned}&\int _0^\theta d\theta ^{\prime } \, \left\{ 2\beta \mathbb{N }^{\psi _{\theta ^*}}\! \left[ \!1\!-\!\exp \left( \!-\! \Phi (x,\pi _{a-h}(\mathcal{T }),\theta ^{\prime })+ \theta ^* \mathcal{Z }_{a-h} (\mathcal{T }) +\psi (\theta ^*)\int _0^{a-h} \mathcal{Z }_t(\mathcal{T }) dt \!\right) \!\right] \right. \\&\quad + \int _{(0,+\infty )} \Pi _{\theta ^*} (dr) r \mathrm{e}^{-\theta ^{\prime }r}\mathbb{E }_r^{\psi _{\theta ^*}} \left. \quad \times \left[ 1-\exp \Bigg (-\Phi (x,\pi _{a-h}(\mathcal{T }),\theta ^{\prime }) + \theta ^* \mathcal{Z }_{a-h} (\mathcal{T }) \right. \right. \\&\left. \left. \quad +\psi (\theta ^*)\int _0^{a-h} \mathcal{Z }_t(\mathcal{T }) dt \Bigg ) \right] \right\} . \end{aligned}$$

By using the Poisson decomposition of   \(\mathbb{P }_r^{\psi _{\theta ^*}}\) (Proposition 2.14), we see that \(G(h,x,\theta )\) can be written as:

$$\begin{aligned} G(h,x,\theta ) \!=\! \int _0^\theta d\theta ^{\prime } \left\{ 2\beta g(h,x,\theta ^{\prime }) +\!\! \int _{(0,\infty )} \Pi _{\theta ^*} (dr)\ r \mathrm{e}^{-\theta ^{\prime } r}\left( 1\!- \!\exp (-r g(h,x,\theta ^{\prime }))\right) \!\right\} \!, \end{aligned}$$

with

$$\begin{aligned}&g(h,x,\theta ^{\prime }) \\&\quad = \mathbb{N }^{\psi _{\theta ^*}} \left[ 1-\exp \left( - \Phi (x,\pi _{a-h}(\mathcal{T }),\theta ^{\prime }) +\theta ^* \mathcal{Z }_{a-h} (\mathcal{T })+\psi (\theta ^*)\int _0^{a-h} \mathcal{Z }_t(\mathcal{T }) dt\right) \!\right] \!. \end{aligned}$$

Thanks to the Girsanov formula and (29), we get:

$$\begin{aligned}&g(h,x,\theta ^{\prime }) = \mathbb{N }^{\psi _{\theta ^*}} \\&\qquad \times \left[ (1-\exp (- \Phi (x,\pi _{a-h}(\mathcal{T }),\theta ^{\prime }))) \exp \left( \theta ^* \mathcal{Z }_{a-h} (\mathcal{T })+\psi (\theta ^*)\int _0^{a-h} \mathcal{Z }_t(\mathcal{T })dt \right) \right] \\&\qquad + \mathbb{N }^{\psi _{\theta ^*}}\left[ 1-\exp \left( \theta ^* \mathcal{Z }_{a-h} (\mathcal{T }) +\psi (\theta ^*)\int _0^{a-h} \mathcal{Z }_t(\mathcal{T }) dt \right) \right] \\&\quad = \mathbb{N }^{\psi } \Big [ 1-\exp (-\Phi (x,\pi _{a-h}(\mathcal{T }),\theta ^{\prime })) \Big ]- \theta ^*. \end{aligned}$$

With \(\tilde{g}(h,x,\theta ^{\prime })= \mathbb{N }^{\psi } [ 1-\exp (-\Phi (x,\pi _{a-h}(\mathcal{T }),\theta ^{\prime })) ]\) and thanks to (7), we get:

$$\begin{aligned} G(h,x,\theta )&= \int _0^\theta d\theta ^{\prime } \left\{ 2\beta \tilde{g}(h,x,\theta ^{\prime }) + \int _{(0,\infty )} \Pi (dr)\ r \mathrm{e}^{-\theta ^{\prime } r}\left( 1- \exp (-r\tilde{g}(h,x,\theta ^{\prime }))\right) \!\right\} \\&+\psi (\theta ^*) - \psi _\theta (\theta ^*). \end{aligned}$$
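The additional term \(\psi (\theta ^*)-\psi _\theta (\theta ^*)\) comes from substituting \(g=\tilde{g}-\theta ^*\) and integrating over \(\theta ^{\prime }\in [0,\theta ]\). The sketch below checks the resulting algebraic identity numerically, under the assumptions that \(\Pi _{\theta ^*}(dr)=\mathrm{e}^{-\theta ^* r}\Pi (dr)\), that \(\psi _\theta (\lambda )=\psi (\theta +\lambda )-\psi (\theta )\), and that \(\Pi \) is a finite sum of atoms (a hypothetical choice made only so that the integrals become finite sums). All names and numerical values are illustrative.

```python
import math

ALPHA, BETA = 0.3, 0.8
ATOMS = [(0.5, 2.0), (1.7, 0.6)]  # (jump size r, weight Pi({r})): assumed finite Levy measure

def psi(lam):
    """Branching mechanism of the form (1) with Pi given by ATOMS."""
    jumps = sum(w * (math.exp(-lam * r) - 1.0 + lam * r * (r < 1.0)) for r, w in ATOMS)
    return ALPHA * lam + BETA * lam ** 2 + jumps

def psi_shift(theta, lam):
    """psi_theta(lam) = psi(theta + lam) - psi(theta) (assumed form of the shift)."""
    return psi(theta + lam) - psi(theta)

def correction(theta, theta_star):
    """Terms of G involving theta_star, integrated over theta' in [0, theta]."""
    quadratic = -2.0 * BETA * theta * theta_star
    jumps = -sum(w * (1.0 - math.exp(-theta * r)) * (1.0 - math.exp(-theta_star * r))
                 for r, w in ATOMS)
    return quadratic + jumps

theta, theta_star = 0.9, 0.35
lhs = correction(theta, theta_star)
rhs = psi(theta_star) - psi_shift(theta, theta_star)
```

In this example `lhs` and `rhs` coincide up to rounding, in agreement with the additional term in the expression of \(G\).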

Compared with the definition of \(G\), \(g\) is replaced by \(\tilde{g}\), \(\Pi _{\theta ^*}\) is replaced by \(\Pi \), and there is the additional term \(\psi (\theta ^*) - \psi _\theta (\theta ^*)\). As \( \int \mathbf{m}^{\Lambda _{\theta ,a}(\mathcal{T },\mathcal{M })}(dx) = \int _0^a \mathcal{Z }_h(\Lambda _\theta (\mathcal{T }))dh \), we get:

$$\begin{aligned} B&= \mathbb{N }^{\psi _{\theta ^*}}\! \left[ F(\Lambda _{\theta ,a}(\mathcal{T },\mathcal{M }))R(\Lambda _{\theta ,a}(\mathcal{T },\mathcal{M })) \right. \nonumber \\&\left. \times \exp \left( \theta ^* \mathcal{Z }_a(\Lambda _\theta (\mathcal{T },\mathcal{M })) + \psi _\theta (\theta ^*) \int _0^a \mathcal{Z }_h(\Lambda _\theta (\mathcal{T },\mathcal{M })) dh \right) \!\right] , \end{aligned}$$
(37)

with

$$\begin{aligned} R(\mathcal{T })&= \exp \left( -\int \mathbf{m}^\mathcal{T }(dx) \int _{0}^{\theta } d\theta ^{\prime }\left[ 2\beta \tilde{g}(H_x,x,\theta ^{\prime }) \nonumber \right. \right. \\&\left. \left. + \int _{(0,\infty )} \Pi (dr)\ r \mathrm{e}^{-\theta ^{\prime } r}\left( 1- \exp (-r \tilde{g}(H_x,x,\theta ^{\prime }))\right) \right] \right) . \end{aligned}$$
(38)

Taking \(\Phi =0\) (and thus \(R=1\)) in (37) yields:

$$\begin{aligned} \mathbb{N }^\psi [F(\Lambda _{\theta ,a}(\mathcal{T },\mathcal{M }))]&= \mathbb{N }^{\psi _{\theta ^*}} \left[ F(\Lambda _{\theta ,a}(\mathcal{T },\mathcal{M })) \exp \Big (\theta ^* \mathcal{Z }_a(\Lambda _\theta (\mathcal{T },\mathcal{M }))\right. \nonumber \\&\left. + \psi _\theta (\theta ^*) \int _0^a \mathcal{Z }_h(\Lambda _\theta (\mathcal{T },\mathcal{M })) dh \Big ) \right] . \end{aligned}$$
(39)

Using (39) with \(F\) replaced by \(FR\) gives:

$$\begin{aligned}&\mathbb{N }^\psi \Big [ \exp (-\langle \mathcal{M }_a, \Phi \rangle ) F(\Lambda _{\theta ,a}(\mathcal{T },\mathcal{M })) \Big ]= B\\&\quad =\mathbb{N }^\psi \left[ F(\Lambda _{\theta ,a}(\mathcal{T },\mathcal{M })) R(\Lambda _{\theta ,a}(\mathcal{T },\mathcal{M })) \right] \!. \end{aligned}$$

This implies that \(\mathcal{M }_a\) is, conditionally on \(\Lambda _{\theta ,a}(\mathcal{T },\mathcal{M })\), a Poisson point measure with intensity (36). This ends the proof. \(\square \)

3.2 An explicit construction of the growing process

In this section, we will construct the growth process using a family of Poisson point measures. Let \(\psi \) be a branching mechanism satisfying Assumptions 1 and 2. Let \(\theta \in \Theta ^\psi \). According to (20) and (7), we have:

$$\begin{aligned} {\mathbf{N}}^{\psi _\theta }[\mathcal{T }\in \cdot ] = 2\beta \mathbb{N }^{\psi _\theta }[\mathcal{T }\in \cdot ] + \int _{(0,+\infty )} \Pi (dr) r \mathrm{e}^{-\theta r} \mathbb{P }_r^{\psi _\theta }(\mathcal{T }\in \cdot ). \end{aligned}$$
(40)

Let \(\mathcal{T }^{(0)}\in \mathbb{T }\) with root \(\varnothing \). For \(q\in \Theta ^\psi \) and \(q\le \theta \), we set:

$$\begin{aligned} \mathfrak{T }_q^{(0)}=\mathcal{T }^{(0)}\quad \text{ and }\quad \mathbf{m}_q^{(0)}= {\mathbf{m}}^{{\mathcal{T }}^{(0)}}. \end{aligned}$$

We define the w-trees grafted on \(\mathcal{T }^{(0)}\) by recursion on their generation. We suppose that all the random point measures used for the next construction are defined on \(\mathbb{T }\) under a probability measure \(Q^{\mathcal{T }^{(0)}}(d\omega )\).

Suppose that we have constructed the family \(((\mathfrak{T }_q^{(k)},\mathbf{m}_q^{(k)}),\ 0\le k\le n,\ q \in \Theta ^\psi \cap (-\infty ,\theta ))\). We write

$$\begin{aligned} \mathfrak{T }^{(n)}=\bigsqcup _{q\in \Theta ^\psi ,\ q\le \theta }\mathfrak{T }_q^{(n)}. \end{aligned}$$

We define the (\(n+1\))-th generation as follows. Conditionally on all trees from generations up to \(n\), \((\mathfrak{T }_q^{(k)},\ 0\le k\le n,\ q \in \Theta ^\psi \cap (-\infty ,\theta ))\), let

$$\begin{aligned} \mathcal{N }^{n+1}_{\theta }(dx,d\mathcal{T },dq)=\sum _{j\in J^{(n+1)}} \delta _{(x_j,\mathcal{T }^j,\theta _j)}(dx,d\mathcal{T },dq) \end{aligned}$$

be a Poisson point measure on \( \mathfrak{T }^{(n)} \times \mathbb{T }\times \Theta ^\psi \) with intensity:

$$\begin{aligned} \mu ^{n+1}_\theta (dx,d\mathcal{T },dq)= \mathbf{m}^{(n)}_q(dx) {\mathbf{N}}^{\psi _q}[d\mathcal{T }]\, {\mathbf{1}}_{\{ q\le \theta \}}\, dq. \end{aligned}$$

For \(q\in \Theta ^\psi \) and \(q\le \theta \), we set

$$\begin{aligned} J_q^{(n+1)}=\left\{ j\in J^{(n+1)},\ q<\theta _j\right\} \end{aligned}$$

and we define the tree \(\mathfrak{T }_q^{(n+1)}\) and the mass measure \(\mathbf{m}_q^{(n+1)}\) by:

$$\begin{aligned} \mathfrak{T }_q^{(n+1)}=\mathfrak{T }_q^{(n)}\circledast _{j\in J_q^{(n+1)}}(\mathcal{T }^j,x_j)\quad \text{ and }\quad \mathbf{m}_q^{(n+1)}=\sum _{j\in J_q^{(n+1)}}\mathbf{m}^{\mathcal{T }^j}(dx). \end{aligned}$$

Notice that by construction, \((\mathfrak{T }_q^{(n)}, n\in \mathbb{N })\) is a non-decreasing sequence of trees. We set \(\mathfrak{T }_q\) to be the completion of \(\cup _{n\in \mathbb{N }} \mathfrak{T }_q^{(n)}\), which is a real tree with root \(\varnothing \) and metric \(d^{\mathfrak{T }_q}\), and we define a mass measure on \(\mathfrak{T }_q\) by \(\mathbf{m}^{\mathfrak{T }_q}= \sum _{n\in \mathbb{N }} \mathbf{m}^{(n)}_q\).

For \(q\in \Theta ^\psi \) and \(q<\theta \), we consider \(\mathcal{F }_{q}\) the \(\sigma \)-field generated by \(\mathfrak{T }^{(0)}\) and the sequence of random point measures \(({\mathbf{1}}_{\{q^{\prime }\in [q,\theta ]\}}\mathcal{N }_{\theta }^{(n)}(dx,d\mathcal{T },dq^{\prime }),\ n\in \mathbb{N })\). We set \(\mathcal{N }_{\theta }=\sum _{n\in \mathbb{N }} \mathcal{N }^n_{\theta }\). The backward random point process \(q\mapsto {\mathbf{1}}_{\{q\le q^{\prime }\}} \mathcal{N }_\theta (dx, d\mathcal{T }, dq^{\prime })\) is by construction adapted to the backward filtration \((\mathcal{F }_q, q\in \Theta ^\psi \cap (-\infty ,\theta ])\).

The proof of the following result is postponed to Sect. 3.3.

Theorem 3.2

Let \(\psi \) be a branching mechanism satisfying Assumptions 1 and 2. Under \(Q^{\psi _\theta }=\mathbb{N }^{\psi _\theta }[d\mathcal{T }^{(0)}]Q^{\mathcal{T }^{(0)}}(d\omega )\), the process

$$\begin{aligned} \left( \left( \mathfrak{T }_q,d^{\mathfrak{T }_q}, \varnothing , \mathbf{m}^{\bar{\mathfrak{T }}_q}\right) , q\in \Theta ^\psi \cap (-\infty ,\theta ]\right) \end{aligned}$$

is a \(\mathbb{T }\)-valued backward Markov process with respect to the backward filtration \(\mathcal{F }^{\theta }= (\mathcal{F }_q, q\in \Theta ^\psi \cap (-\infty ,\theta ])\). It is distributed as \(((\mathcal{T }_q,\mathbf{m}^{\mathcal{T }_q}), q\in \Theta ^\psi \cap (-\infty ,\theta ])\) under \(\mathbb{N }^\psi \).

Notice the theorem in particular entails that \(( \mathfrak{T }_q,d^{\mathfrak{T }_q}, \varnothing , \mathbf{m}^{\bar{\mathfrak{T }}_q})\) is a w-tree for all \(q\). To prove it, we will use the following lemma.

Lemma 3.3

Let \(\psi \) be a branching mechanism satisfying Assumptions 1 and 2. Let \(K\) be a measurable non-negative process (as a function of \(q\)) defined on \(\mathbb R _+\times \mathbb{T }\times \mathbb{T }\) which is predictable with respect to the backward filtration \(\mathcal{F }^{\theta }\). We have:

$$\begin{aligned}&Q^{\psi _\theta }\left[ \int \mathcal{N }_\theta (dx, d\mathcal{T },dq)\, K(q, \mathfrak{T }_{q}, \mathfrak{T }_{q-}) \right] \\&\quad \quad =Q^{\psi _\theta }\left[ \int K\Big (q, \mathfrak{T }_q,\mathfrak{T }_q \circledast (\mathcal{T },x)\Big ) \, \mu _\theta (dx,d\mathcal{T },dq) \right] , \end{aligned}$$

where \(\mu _\theta (dx,d\mathcal{T },dq)\!=\! \sum _{n\ge 1} \mu ^{n}_\theta (dx,d\mathcal{T },dq)\!=\!\mathbf{m}^{\mathfrak{T }_q}(dx) {\mathbf{N}}^{\psi _q}[d\mathcal{T }] \, {\mathbf{1}}_{\{q\in \Theta ^\psi , q\le \theta \}}\, dq\).

This means that the predictable compensator of \(\mathcal{N }_\theta \) is given by:

$$\begin{aligned} \mu _\theta (dx,d\mathcal{T },dq)=\mathbf{m}^{\mathfrak{T }_q}(dx) {\mathbf{N}}^{\psi _q}[d\mathcal{T }]\, {\mathbf{1}}_{\{q \in \Theta ^\psi , q\le \theta \}}\, dq. \end{aligned}$$

Notice that this construction does not fit in the usual framework of random point measures since the support at time \(q\) of the predictable compensator is the (predictable backward in time) random set \(\mathfrak{T }_q\times \mathbb{T }\times \Theta ^\psi \).

Proof

Based on the recursive construction, we have:

$$\begin{aligned}&Q^{\psi _\theta }\left[ \int \mathcal{N }_\theta (dx, d\mathcal{T },dq)\, K(q, \mathfrak{T }_{q}, \mathfrak{T }_{q-}) \right] \\&\quad \quad =\sum _{n=0}^{+\infty }Q^{\psi _\theta }\Bigg [Q^{\psi _\theta }\Bigg [\int \mathcal{N }_\theta ^n(dx,d\mathcal{T },dq)\, K(q, \mathfrak{T }_{q}, \mathfrak{T }_{q}\circledast (\mathcal{T }, x))\\&\qquad \qquad \qquad \qquad \quad \Bigm | (\mathfrak{T }_s^{(k)},\ k\le n,\ s\le \theta )\Bigg ]\Bigg ]. \end{aligned}$$

Now, by construction, we have that:

$$\begin{aligned} \mathfrak{T }_q=\mathfrak{T }_q^{(n)}\circledast _{j\in J_q^{(n)}}(\tilde{\mathcal{T }}_j,x_j) \end{aligned}$$

for \(\tilde{\mathcal{T }}_j=\mathfrak{T }_q{\setminus }\mathfrak{T }_q^{(x_j,\varnothing )}\) which is a measurable function of \({\mathbf{1}}_{\{q^{\prime }>q\}}\mathcal{N }_\theta ^n(dx, d\mathcal{T },dq^{\prime })\) and of the point measures \({\mathbf{1}}_{\{q^{\prime }>q\}} \mathcal{N }_\theta ^\ell (dx, d\mathcal{T }, dq^{\prime })\) for \(\ell \ge n+1\). Therefore, applying the Palm formula with the function

$$\begin{aligned}&F_n\Big (q,\mathcal{T },x,\sum _{j\in J^{(n)},q_j>q}\delta _{(x_j,\mathcal{T }^j,\theta _j)}\Big )\\&\quad =Q^{\psi _\theta }\left[ K\Big (q, \mathfrak{T }_q^{(n)}\circledast _{j\in J_q^{(n)}}(\tilde{\mathcal{T }}_j,x_j),\right. \\&\qquad \left. \mathfrak{T }_q^{(n)}\circledast _{j\in J_q^{(n)}}(\tilde{\mathcal{T }}_j,x_j)\circledast (\mathcal{T },x)\Big )\Bigm | (\mathfrak{T }_s^{(k)},\ k\le n,\ s\le \theta ),\mathcal{N }_\theta ^n\right] , \end{aligned}$$

we get:

$$\begin{aligned}&Q^{\psi _\theta } \left[ \int \mathcal{N }_\theta (dx, d\mathcal{T },dq)\, K(q, \mathfrak{T }_{q}, \mathfrak{T }_{q-}) \right] \\&\quad =\sum _{n=0}^{+\infty }Q^{\psi _\theta }\left[ Q^{\psi _\theta }\left[ \int \mathcal{N }_\theta ^n(dx,d\mathcal{T },dq)\,\right. \right. \\&\quad \quad \times \left. \left. F_n\left( q,\mathcal{T },x,\sum _{j\in J^{(n)},q_j>q}\delta _{(x_j,\mathcal{T }^j,\theta _j)}\right) \Bigm | (\mathfrak{T }_s^{(k)},\ k\le n,\ s\le \theta )\right] \right] \\&\quad =\sum _{n=0}^{+\infty }Q^{\psi _\theta }\left[ Q^{\psi _\theta }\left[ \int \mu ^n_\theta (dx,d\mathcal{T },dq)\,\right. \right. \\&\quad \quad \times \left. \left. F_n\left( q,\mathcal{T },x,\sum _{j\in J^{(n)},q_j>q}\delta _{(x_j,\mathcal{T }^j,\theta _j)}\right) \Bigm | (\mathfrak{T }_s^{(k)},\ k\le n,\ s\le \theta )\right] \right] \\&\quad =\sum _{n=0}^{+\infty }Q^{\psi _\theta }\Bigg [Q^{\psi _\theta }\Bigg [\int \mu _\theta ^n(dx,d\mathcal{T },dq)\, K\Big (q, \mathfrak{T }_q^{(n)}\circledast _{j\in J_q^{(n)}}(\tilde{\mathcal{T }}_j,x_j),\\&\quad \quad \mathfrak{T }_q^{(n)}\circledast _{j\in J_q^{(n)}}(\tilde{\mathcal{T }}_j,x_j)\circledast (\mathcal{T },x)\Big )\Bigm | (\mathfrak{T }_s^{(k)},\ k\le n,\ s\le \theta )\Bigg ]\Bigg ]\\&\quad =\sum _{n=0}^{+\infty }Q^{\psi _\theta }\left[ \int \mu _\theta ^n(dx,d\mathcal{T },dq)\,K\Big (q, \mathfrak{T }_{q}, \mathfrak{T }_{q}\circledast (\mathcal{T },x)\Big ) \right] \\&\quad =Q^{\psi _\theta }\left[ \int K\Big (q, \mathfrak{T }_q, \mathfrak{T }_q \circledast (\mathcal{T },x)\Big ) \, \mu _\theta (dx,d\mathcal{T },dq) \right] . \end{aligned}$$

\(\square \)

It can be noticed that the map \(q\mapsto \mathfrak{T }_q\) is non-decreasing càdlàg (backwards in time) and that we have, for \(j\in \cup _{n\in \mathbb{N }} J^{(n)}\), \(x_j\in \mathfrak{T }_{\theta _j}\): \( \mathfrak{T }_{\theta _j-}= \mathfrak{T }_{\theta _j}\circledast (\mathcal{T }^j,x_j)\). In particular, we can recover the random measure \(\mathcal{N }_\theta \) from the jumps of the process \(( \mathfrak{T }_q, q\in \Theta ^\psi \cap (-\infty ,\theta ])\). This and the natural compatibility relation of \(\mathcal{N }_\theta \) with respect to \(\theta \) give the next corollary.

Corollary 3.4

Let \(\psi \) be a branching mechanism satisfying Assumptions 1 and 2. Let \((\mathcal{T }_\theta , \theta \in \Theta ^\psi )\) be defined under \(\mathbb{N }^\psi \), and let

$$\begin{aligned} \mathcal{N }=\sum _{j\in J}\delta _{(x_j,\mathcal{T }^j,\theta _j)} \end{aligned}$$

be the random point measure defined as follows:

  • The set \(\{\theta _j,\ j\in J\}\) is the set of jumping times of the process \( (\mathcal{T }_\theta , \theta \in \Theta ^\psi )\): for \(j\in J\), \(\mathcal{T }_{\theta _j-}\ne \mathcal{T }_{\theta _j}\).

  • The real tree \(\mathcal{T }^j\) is the closure of \(\mathcal{T }_{\theta _j-} {\setminus } \mathcal{T }_{\theta _j}\).

  • The point \(x_j\) is the root of \(\mathcal{T }^j\) (that is \(x_j\) is the only element \(y\in \mathcal{T }_{\theta _j-}\) such that \(x\in \mathcal{T }^j\) implies \([\![y,x ]\!]\subset \mathcal{T }^j\)).

Then the backward point process \(\theta \mapsto {\mathbf{1}}_{\{\theta \le q^{\prime }\}} \mathcal{N }(dx, d\mathcal{T }, dq^{\prime })\) defined on \(\Theta ^\psi \) has predictable compensator:

$$\begin{aligned} \mu (dx,d\mathcal{T },dq)=\mathbf{m}^{\mathcal{T }_q}(dx) {\mathbf{N}}^{\psi _q}[d\mathcal{T }]\, {\mathbf{1}}_{\{q\in \Theta ^\psi \}}\, dq, \end{aligned}$$

with respect to the backward left-continuous filtration \(\mathcal{F }=(\mathcal{F }_\theta , \theta \in \Theta ^\psi )\) defined by:

$$\begin{aligned} \mathcal{F }_\theta =\sigma ((x_j, \mathcal{T }^j,\theta _j),\ \theta \le \theta _j)= \sigma (\mathcal{T }_{q-},\ \theta \le q). \end{aligned}$$

More precisely, for any non-negative predictable process \(K\) with respect to the backward filtration \(\mathcal{F }\), we have:

$$\begin{aligned}&\mathbb{N }^\psi \left[ \int \mathcal{N }(dx, d\mathcal{T },dq)\, K\Big ( q,\mathcal{T }_{q}, \mathcal{T }_{q-}\Big ) \right] \nonumber \\&\quad =\mathbb{N }^\psi \left[ \int \mu (dx,dT,dq) \, K\Big ( q,\mathcal{T }_q, \mathcal{T }_q \circledast (T,x)\Big ) \, \right] . \end{aligned}$$
(41)

Remark 3.5

Note that Assumption 2 is assumed only for technical measurability purposes, see Remark 2.11. We conjecture that this result also holds without Assumption 2.

As a consequence, thanks to property 3 of Theorem 2.28, we get, with the convention \(\sup \varnothing =\theta _\infty \), that:

$$\begin{aligned} A=\sup \{\theta _j,\ j\in J\quad \text{ and } \quad \sigma _j=+\infty \} \quad \text{ with }\quad \sigma _j=\mathbf{m}^{\mathcal{T }^j}(\mathcal{T }^j). \end{aligned}$$

3.3 Proof of Theorem 3.2

By construction, it is clear that the process \((\mathfrak{T }_q, q\in \Theta ^\psi \cap (-\infty ,\theta ])\) is a backward Markov process with respect to the backward filtration \((\mathcal{F }_q, q\in \Theta ^\psi \cap (-\infty ,\theta ])\). By construction this process is càglàd in backward time. Since the process \((\mathcal{T }_q,\ q\in \Theta ^\psi )\) is a forward càdlàg Markov process, it is enough to check that for \(\theta _0\in \Theta ^\psi \), such that \(\theta _0<\theta \), the two dimensional marginals \( (\mathfrak{T }_{\theta _0}, \mathfrak{T }_{\theta })\) and \((\mathcal{T }_{\theta _0},\mathcal{T }_{\theta })\) have the same distribution.

Replacing \(\psi \) by \(\psi _{\theta _0}\), we can assume that \(\theta _0=0\) and \(0<\theta \). We will decompose the big tree \(\mathcal{T }_{0}\) conditionally on the small tree \(\mathcal{T }_{\theta }\) by iteration. This decomposition is similar to the one which appears in [1] or [29] for the fragmentation of the (sub)critical Lévy tree, but roughly speaking the fragmentation is here frozen except for the fragment containing the root.

We set \(\mathcal{T }^{(0)}\!=\!\mathcal{T }_\theta \) and \(\tilde{\mathbf{m}}^{(0)}\!=\!\mathbf{m}^{\mathcal{T }_\theta }\), so that \((\mathfrak{T }^{(0)}, \mathbf{m}^{{(0)}})\) and \((\mathcal{T }^{(0)},\tilde{\mathbf{m}}^{(0)}) \) have the same distribution. Recall notation \(\mathcal{M }^\uparrow \) from (33) as well as (34): \(\mathcal{T }_0\!=\!\mathcal{T }^{(0)} \circledast _{i\in I_\theta ^{\uparrow ,1}} (\mathcal{T }^i,x_i)\), where we write \( I_\theta ^{\uparrow ,1}= I_\theta ^{\uparrow }\) and where \(\mathcal{P }^1=\sum _{i\in I_\theta ^{\uparrow ,1}} \delta _{(x_i,\mathcal{T }^i,\theta _i)}\) is, conditionally on \(\mathcal{T }^{(0)}\), a Poisson point measure with intensity:

$$\begin{aligned} \nu ^1 (dx,d\mathcal{T }^{\prime },dq)&= \tilde{\mathbf{m}}^{(0)} (dx) \left( 2\beta \mathbb{N }^\psi [d\mathcal{T }^{\prime }] + \int _{(0,+\infty )} \Pi (dr) \, r \mathrm{e}^{-q r} \mathbb{P }^\psi _r(d\mathcal{T }^{\prime }) \right) \, \\&{\mathbf{1}}_{(0,\theta ]}(q)\, dq. \end{aligned}$$

For \(i\in I_\theta ^{\uparrow ,1}\), we define the subtree of \(\mathcal{T }^i\):

$$\begin{aligned} \tilde{\mathcal{T }}^i=\left\{ x\in \mathcal{T }^i,\ \mathcal{M }^\uparrow (]\!]x_i,x [\![\times [0,\theta _i])=0\right\} . \end{aligned}$$

Since \(\mathcal{T }^i\) is distributed according to \(\mathbb{N }^{\psi }\) (or to \(\mathbb{P }^\psi _{r_i}\) for some \(r_i>0\)), using the property of Poisson point measures, we have that conditionally on \(\mathcal{T }^{(0)}\) and \(\theta _i\), the tree \(\tilde{\mathcal{T }}^i\) is distributed as \(\Lambda _{\theta _i}(\mathcal{T },\mathcal{M })\) under \(\mathbb{N }^\psi \) (or under \(\mathbb{P }^\psi _{r_i}\)), that is, the distribution of \(\tilde{\mathcal{T }}^i\) is \(\mathbb{N }^{\psi _{\theta _i}}[d \mathcal{T }]\) (or \(\mathbb{P }^{\psi _{\theta _i}} _{r_i}(d\mathcal{T })\)), thanks to the special Markov property. Furthermore we have \(\mathcal{T }^i=\tilde{\mathcal{T }}^i \circledast _{i^{\prime }\in I_{\theta ,i}^{\uparrow ,2}} (\mathcal{T }^{i^{\prime }},x_{i^{\prime }})\) where

$$\begin{aligned} \sum _{i^{\prime }\in I_{\theta ,i}^{\uparrow ,2}} \delta _{(x_{i^{\prime }},\mathcal{T }^{i^{\prime }},\theta _{i^{\prime }})} \end{aligned}$$

is, conditionally on \(\mathcal{T }^{(0)}\) and \(\tilde{\mathcal{T }}^i\) a Poisson point measure on \(\tilde{\mathcal{T }}^i\times \mathbb{T }\times (0,\theta ]\) with intensity:

$$\begin{aligned} \mathbf{m}^{\tilde{\mathcal{T }}^i}(dx) \left( 2\beta \mathbb{N }^\psi (d\mathcal{T }^{\prime }) + \int _{(0,+\infty )} \Pi (dr) \, r \mathrm{e}^{-q r} \mathbb{P }^\psi _r(d\mathcal{T }^{\prime }) \right) \, {\mathbf{1}}_{[0,\theta _i)}(q)\, dq. \end{aligned}$$

Thus we deduce, using again the special Markov property, that:

$$\begin{aligned} \tilde{\mathcal{N }}^1_{\theta }(dx,d\mathcal{T },dq)=\sum _{i\in I_\theta ^{\uparrow ,1}} \delta _{(x_i,\tilde{\mathcal{T }}^i,\theta _i)}(dx,d\mathcal{T },dq) \end{aligned}$$

is, conditionally on \(\mathcal{T }^{(0)}\), a Poisson point measure on \(\mathcal{T }^{(0)}\times \mathbb{T }\times \Theta ^\psi \) with intensity:

$$\begin{aligned} \tilde{\mu }^1(dx,d\mathcal{T },dq)= \tilde{\mathbf{m}}^{(0)}_q (dx) {\mathbf{N}}^{\psi _q}[d\mathcal{T }]\, {\mathbf{1}}_{[0,\theta )}(q)\, dq, \end{aligned}$$

with \(\tilde{\mathbf{m}}^{(0)}_q (dx)=\tilde{\mathbf{m}}^{(0)} (dx)\). We set \(\mathcal{T }^{(1)}=\mathcal{T }^{(0)} \circledast _{i\in I_\theta ^{\uparrow ,1}} (\tilde{\mathcal{T }}^i,x_i)\) for the first generation tree and for \(q\in [0,\theta ]\):

$$\begin{aligned} \tilde{\mathbf{m}}^{(1)}_q (dx)=\sum _{i\in I_{\theta }^{\uparrow ,1}} \mathbf{m}^{\tilde{\mathcal{T }}^i}(dx) {\mathbf{1}}_{[0,\theta _i)}(q). \end{aligned}$$

See Fig. 2 for a simplified representation. We get that \( (\mathfrak{T }^{(1)}_\theta , (\mathbf{m}^{(1)}_q, q\in [0,\theta ]), \mathfrak{T }^{(0)}, \mathbf{m}^{\mathfrak{T }^{(0)}})\) and \( (\mathcal{T }^{(1)},(\tilde{\mathbf{m}}^{(1)}_ q, q\in [0,\theta ]), \mathcal{T }^{(0)},\tilde{\mathbf{m}}^{(0)})\) have the same distribution.

Fig. 2
The tree \(\mathcal{T }_0\), \(\mathcal{T }^{(0)}\), and a tree \(\mathcal{T }^i\) and its sub-tree \(\tilde{\mathcal{T }}^i\) belonging to the first generation tree \(\mathcal{T }^{(1)}{\setminus } \mathcal{T }^{(0)}\)

Furthermore, by collecting all the trees grafted on \(\mathcal{T }^{(1)}\), we get that

$$\begin{aligned} \mathcal{T }_0=\mathcal{T }^{(1)} \circledast _{i^{\prime }\in I_{\theta }^{\uparrow ,2}} (\mathcal{T }^{i^{\prime }},x_{i^{\prime }}), \end{aligned}$$

where \( I_{\theta }^{\uparrow ,2}=\cup _{i\in I_{\theta }^{\uparrow ,1} } I_{\theta ,i}^{\uparrow ,2}\) and where

$$\begin{aligned} \mathcal{P }^2=\sum _{i^{\prime }\in I_{\theta }^{\uparrow ,2}} \delta _{(x_{i^{\prime }},\mathcal{T }^{i^{\prime }},\theta _{i^{\prime }})} \end{aligned}$$

is, conditionally on \( (\mathcal{T }^{(1)},(\tilde{\mathbf{m}}^{(1)}_ q, q\in [0,\theta ]), \mathcal{T }^{(0)},\tilde{\mathbf{m}}^{(0)})\) a Poisson point measure on \(\mathcal{T }^{(1)}\times \mathbb{T }\times (0,\theta ]\) with intensity:

$$\begin{aligned} \nu ^2(dx,d\mathcal{T }^{\prime },dq)=\tilde{\mathbf{m}}^{(1)}_q(dx)\, \left( 2\beta \mathbb{N }^\psi (d\mathcal{T }^{\prime }) + \int _{(0,+\infty )} \Pi (dr) \, r \mathrm{e}^{-q r} \mathbb{P }^\psi _r(d\mathcal{T }^{\prime }) \right) \, {\mathbf{1}}_{[0,\theta ]}(q)\, dq. \end{aligned}$$

Notice that:

$$\begin{aligned} \mathcal{T }^{(1)}&= \{x\in \mathcal{T }_0,\ \mathcal{M }^\uparrow ([\![\varnothing , x [\![\times [0,\theta ])\le 1\} \quad \text{ and }\quad \tilde{\mathbf{m}}^{(1)}_\theta (dx)+ \tilde{\mathbf{m}}^{(0)}(dx)\nonumber \\&= {\mathbf{1}}_{\mathcal{T }^{(1)}}(x)\,\mathbf{m}^{\mathcal{T }_0} (dx). \end{aligned}$$
(42)

We can then iterate this construction, and by taking increasing limits we obtain that the pair \(((\cup _{n\in \mathbb{N }} \mathfrak{T }_\theta ^{(n)}, \sum _{n\in \mathbb{N }} \mathbf{m}^{(n)}_\theta ), \mathfrak{T }_0)\) has the same distribution as \((\mathcal{T }^{\prime },\mathcal{T }^{(0)})\), where:

$$\begin{aligned} \mathcal{T }^{\prime }=\left\{ x\in \mathcal{T }_0,\ \mathcal{M }^\uparrow ([\![\varnothing , x [\![\times [0,\theta ])<+\infty \right\} \quad \text{ and }\quad \tilde{\mathbf{m}}^{\prime }(dx)={\mathbf{1}}_{\mathcal{T }^{\prime }}(x)\,\mathbf{m}^{\mathcal{T }_0}(dx). \end{aligned}$$

To conclude, we need to check first that the completion of \(\mathcal{T }^{\prime }\) is \(\mathcal{T }_0\) or, as \(\mathcal{T }_0\) is complete, that the closure of \(\mathcal{T }^{\prime }\) as a subset of \(\mathcal{T }_0\) is exactly \(\mathcal{T }_0\) and then that \(\mathbf{m}^{\mathcal{T }_0}(\mathcal{T }^{\prime c})=0\).

Notice that \(\mathcal{M }^\uparrow \) has fewer marks than \(\mathcal{M }\). Then Proposition 1.2 in [1] in the case when \(\beta =0\), or an elementary adaptation of it in the general framework of [29], gives that there is no loss of mass in the fragmentation process. This implies that, if \(\psi \) is (sub)critical, then:

$$\begin{aligned} \mathbf{m}^{ \mathcal{T }_0}\left( \{x\in \mathcal{T }_0,\ \mathcal{M }([\![\varnothing , x [\![\times [0,\theta ])=\infty \}\right) =0. \end{aligned}$$
(43)

Then, if \(\psi \) is super-critical, by considering the truncation of \(\mathcal{T }_0\) at level \(a\), \(\pi _a(\mathcal{T }_0)\), and using a Girsanov transformation from Definition 2.21 with \(\theta =\theta ^*\) and (43), we deduce that (43) holds for \(\pi _a(\mathcal{T }_0)\). Since \(a\) is arbitrary, we deduce by monotone convergence that (43) holds also in the super-critical case. Thus we have \(\mathbf{m}^{ \mathcal{T }_0}(\mathcal{T }^{\prime c})=0\). Since the closed support of \(\mathbf{m}^{\mathcal{T }_0}\) is the set of leaves \(\mathrm{Lf}(\mathcal{T }_0)\), we deduce that \(\mathrm{Lf}(\mathcal{T }^{\prime })\) is dense in \(\mathrm{Lf}(\mathcal{T }_0)\) and, as \(\mathcal{T }^{\prime }\) and \(\mathcal{T }_0\) have the same root, that \(\mathrm{Sk}(\mathcal{T }^{\prime })= \mathrm{Sk}(\mathcal{T }_0)\). This implies that the closure of \(\mathcal{T }^{\prime }\) is \(\mathcal{T }_0\) and ends the proof.

4 Application to overshooting

We assume that \(\psi \) is critical, \(\theta _\infty <0\) and Assumptions 1 and 2 hold. We will write \(u^\theta \) (resp. \(b^\theta \)) for the solution of (11) (resp. (12)) when \(\psi \) is replaced by \(\psi _\theta \), for \(a\ge 0\), \(h>0\) and \(t\in [0,h)\):

$$\begin{aligned} \int _{u^\theta (a,\lambda )}^\lambda \frac{dr}{\psi _\theta (r)} = a, \quad \text{ and }\quad b^\theta _h(t)=b^\theta (h-t) \quad \text{ with }\quad \int _{b^\theta (h)}^\infty \frac{dr}{\psi _\theta (r)} = h. \end{aligned}$$
(44)

We have \(u^\theta (a,b^\theta (h-a))=b^\theta (h)\). Notice that \(\partial _h b^\theta (h)/ \psi _\theta (b^\theta (h))=-1\) and also that we have \(\partial _\lambda u^\theta (a,\lambda )=\psi _\theta (u^\theta (a,\lambda ))/\psi _\theta (\lambda )\) which implies that:

$$\begin{aligned} \partial _\lambda u^\theta \left( a,b^\theta (h-a)\right) = \frac{\psi _\theta (b^\theta (h))}{\psi _\theta (b^\theta (h-a))} = -\frac{\psi _\theta (b^\theta (h))}{\psi _\theta (b^\theta (h-a))^2} \partial _h b^\theta (h-a). \end{aligned}$$
(45)
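As a sanity check of the identities around (44) and (45), here is a minimal numerical sketch in the quadratic case \(\psi (u)=\beta u^2\), for which \(\psi _\theta (\lambda )=\beta \lambda (\lambda +2\theta )\) and the integrals in (44) can be solved explicitly (these are the closed forms recorded in Remark 4.4). The names `BETA`, `psi_theta`, `b` and `u`, as well as the numerical values, are illustrative choices and not part of the paper.

```python
import math

BETA = 1.0  # illustrative value of beta (assumption)

def psi_theta(theta, lam):
    # psi_theta(lam) = psi(lam + theta) - psi(theta) for psi(u) = BETA * u**2
    return BETA * lam * (lam + 2.0 * theta)

def b(theta, h):
    # Solution of int_{b}^{infty} dr / psi_theta(r) = h, second part of (44)
    return 2.0 * theta / math.expm1(2.0 * BETA * theta * h)

def u(theta, a, lam):
    # Solution of int_{u}^{lam} dr / psi_theta(r) = a, first part of (44)
    e = math.exp(2.0 * BETA * theta * a)
    return 2.0 * theta * lam / ((2.0 * theta + lam) * e - lam)

theta, h, a = 0.7, 2.0, 0.5
lam = b(theta, h - a)

# Identity u(a, b(h - a)) = b(h)
lhs = u(theta, a, lam)
rhs = b(theta, h)

# Central finite difference for the lambda-derivative of u(a, .) at lam
eps = 1e-6
du_fd = (u(theta, a, lam + eps) - u(theta, a, lam - eps)) / (2.0 * eps)
# First expression on the right-hand side of (45)
du_45 = psi_theta(theta, rhs) / psi_theta(theta, lam)

print(lhs, rhs, du_fd, du_45)
```

The two printed pairs should agree up to rounding and finite-difference error; the check only covers the quadratic case and says nothing about general \(\psi \).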

We set for \(\theta \in \Theta ^\psi \) and \(\lambda \ge 0\):

$$\begin{aligned} \gamma _\theta (\lambda ) = \psi _\theta ^{\prime }(\lambda ) - \psi _\theta ^{\prime }(0)=\psi ^{\prime }(\lambda +\theta ) -\psi ^{\prime }(\theta )=\partial _\theta \psi _\theta (\lambda ). \end{aligned}$$
(46)

Notice that the function \(\gamma _\theta \) is non-negative and non-decreasing. Recall that \(\bar{\theta }=\psi ^{-1}\circ \psi (\theta )\). We deduce from (44) that for \(\theta \in \Theta ^\psi \), \(\theta <0\) and \(h>0\):

$$\begin{aligned} \bar{\theta }+ b^{\bar{\theta }}(h)= \theta + b^{\theta }(h) \quad \text{ and }\quad \psi _{\bar{\theta }} (b^{\bar{\theta }}(h))= \psi _{\theta } (b^{\theta }(h)). \end{aligned}$$
(47)
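Relation (47) can be checked numerically in the quadratic case \(\psi (u)=\beta u^2\), where \(\bar{\theta }=-\theta \) for \(\theta <0\). The sketch below assumes that the closed form \(b^\theta (h)=2\theta /(\mathrm{e}^{2\beta \theta h}-1)\) of Remark 4.4 also solves (44) for \(\theta <0\), which follows from direct integration of \(1/\psi _\theta \) on \((-2\theta ,+\infty )\). The helper names and the values of \(\beta \), \(\theta \) and \(h\) are hypothetical.

```python
import math

BETA = 1.0  # illustrative value of beta (assumption)

def psi_theta(theta, lam):
    # psi_theta(lam) = psi(lam + theta) - psi(theta) for psi(u) = BETA * u**2
    return BETA * lam * (lam + 2.0 * theta)

def b(theta, h):
    # Closed form of b^theta(h); used here for both signs of theta (assumption)
    return 2.0 * theta / math.expm1(2.0 * BETA * theta * h)

def bar(theta):
    # For psi(u) = BETA * u**2 and theta < 0, the positive root of psi(x) = psi(theta)
    return -theta

theta, h = -0.4, 1.5
theta_bar = bar(theta)

# Both differences should vanish according to (47)
sum_gap = (theta_bar + b(theta_bar, h)) - (theta + b(theta, h))
psi_gap = psi_theta(theta_bar, b(theta_bar, h)) - psi_theta(theta, b(theta, h))

print(sum_gap, psi_gap)
```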

4.1 Exit times

Let \(h>0\). We are interested in the first time at which the process of growing trees exceeds height \(h\), in the following sense.

Definition 4.1

The first exit time out of \(h\) is the (possibly infinite) number \(A_h\) defined by

$$\begin{aligned} A_h = \sup \left\{ \theta \in \Theta ^\psi ,\ H_{\mathrm{max}}(\mathcal{T }_\theta ) > h\right\} \!, \end{aligned}$$

with the convention that \(\sup \varnothing =\theta _\infty \).

The constraint not to be higher than \(h\) will be coded by the function \(b^\theta (h)\) which is the probability (under \(\mathbb{N }^{\psi }\)) for the tree \(\mathcal{T }^\theta \) of having maximal height larger than \(h\). By definition of the function \(b\), we have for \(\theta \in \Theta ^\psi \):

$$\begin{aligned} \mathbb{N }^\psi [\theta \le A_h]=\mathbb{N }^\psi \left[ H_{\mathrm{max}}(\mathcal{T }_\theta ) \ge h \right] =b^\theta (h). \end{aligned}$$
(48)

Proposition 4.2

Let \(\psi \) be a critical branching mechanism with \(\theta _\infty <0\) and satisfying Assumptions 1 and 2. The function \(\theta \mapsto b^\theta _h\) is of class \(\mathcal C ^1\) on \((\theta _\infty , +\infty )\). And, under \( \mathbb{N }^\psi \), the distribution of \(A_h\) on \((\theta _\infty ,+\infty )\) has density \(\theta \mapsto -\partial _\theta b^\theta (h)\) with respect to the Lebesgue measure. We also have the following expression for the density of \(A_h\) on \((\theta _\infty ,+\infty )\). Let \(\theta _\infty <\theta \) and \(h>0\). Then:

$$\begin{aligned} -\partial _\theta b^\theta (h)&= \psi _\theta \left( b^\theta (h)\right) \int _0^h da \, \frac{\gamma _\theta (b^\theta (a)) }{\psi _\theta (b^\theta (a)) } = \int _0^{h }da \, \gamma _{ \theta } \left( b^{ \theta }(h-a)\right) \\&\times \, \mathrm{e}^{-\psi ^{\prime }({ \theta }) a - \int _0^a dx\, \gamma _{ \theta } (b^{ \theta }(h-x)) }. \end{aligned}$$

Notice that the distribution of \(A_h\) might have an atom at \(\theta _\infty \).

Proof

Notice that for \(\theta _\infty <\theta \), we have \(\lim _ {\lambda \rightarrow +\infty } \psi ^{\prime \prime }(\lambda )=2\beta \) and \(\lim _ {\lambda \rightarrow +\infty } \psi ^{\prime }(\lambda )=+\infty \). In particular \(\psi _\theta ^{\prime }(\lambda )/\psi _\theta (\lambda )\) is bounded for \(\lambda \) large enough. This implies that \(\int ^{+\infty }dr\, \psi ^{\prime }_\theta (r)/\psi _\theta (r)^2 \) is finite thanks to Assumption 2. We deduce that the function \(\theta \mapsto b^\theta _h\) is of class \(\mathcal C ^1\) on \((\theta _\infty , +\infty )\) and, thanks to (48), that under \( \mathbb{N }^\psi \), the distribution of \(A_h\) on \((\theta _\infty ,+\infty )\) has density \(\theta \mapsto -\partial _\theta b^\theta (h)\) with respect to the Lebesgue measure.

Taking the derivative with respect to \(\theta \) in the last term of (44), using (46) and the change of variable \(r=b^\theta (a)\) gives the first equality of the proposition:

$$\begin{aligned} -\partial _\theta b^\theta (h) = \psi _\theta \left( b^\theta (h)\right) \int _{b^\theta (h)}^{+\infty } dr \frac{\gamma _\theta (r)}{\psi _\theta (r)^2} = \psi _\theta \left( b^\theta (h)\right) \int _0^h da \, \frac{\gamma _\theta (b^\theta (a)) }{\psi _\theta (b^\theta (a)) } \cdot \end{aligned}$$
(49)

From (44) we get that \( \partial _t b_h^\theta (t) ={\psi _\theta (b_h^\theta (t))}\). Hence, we have:

$$\begin{aligned} \int _0^t \psi ^{\prime }_\theta \left( b^\theta _h(r)\right) \, dr= \int _0^t \frac{\psi _\theta ^{\prime }(b_h^\theta (r))}{\psi _\theta (b_h^\theta (r))} \partial _t b_h^\theta (r) \,dr= \log \left( \frac{\psi _\theta (b_h^\theta (t))}{\psi _\theta (b_h^\theta (0))} \right) . \end{aligned}$$

This gives:

$$\begin{aligned} \int _0^t \gamma _\theta \left( b_h^\theta (r)\right) dr = \int _0^t \psi ^{\prime }_\theta \left( b^\theta _h(r)\right) \, dr - t \psi ^{\prime }(\theta ) = \log \left( \frac{\psi _\theta (b_h^\theta (t))}{\psi _\theta (b_h^\theta (0))} \right) -t\psi ^{\prime }(\theta ).\nonumber \\ \end{aligned}$$
(50)

We deduce that:

$$\begin{aligned} \int _0^{h }da \, \gamma _{ \theta } \left( b^{ \theta }(h-a)\right) \, \mathrm{e}^{-\psi ^{\prime }({ \theta }) a - \int _0^a dx\, \gamma _{ \theta } (b^{ \theta }(h-x)) } = \psi _\theta \left( b^\theta (h)\right) \int _0^h da \, \frac{\gamma _\theta (b^\theta (a)) }{\psi _\theta (b^\theta (a)) }\cdot \end{aligned}$$

This proves the second equality of the proposition. \(\square \)
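The two expressions for the density in Proposition 4.2 can be compared numerically. The following sketch does this in the quadratic case \(\psi (u)=\beta u^2\), using a midpoint rule for the integrals and a central finite difference for \(-\partial _\theta b^\theta (h)\). All function names, the step counts and the parameter values are assumptions made for the illustration.

```python
import math

BETA = 1.0  # illustrative value of beta (assumption)

def psi_theta(theta, lam):
    return BETA * lam * (lam + 2.0 * theta)

def gamma_theta(theta, lam):
    # gamma_theta(lam) = psi'(lam + theta) - psi'(theta) = 2 * BETA * lam
    return 2.0 * BETA * lam

def b(theta, h):
    return 2.0 * theta / math.expm1(2.0 * BETA * theta * h)

def density_first_form(theta, h, n=20000):
    # psi_theta(b(h)) * int_0^h gamma(b(a)) / psi_theta(b(a)) da
    da = h / n
    total = 0.0
    for i in range(n):
        ba = b(theta, (i + 0.5) * da)
        total += gamma_theta(theta, ba) / psi_theta(theta, ba) * da
    return psi_theta(theta, b(theta, h)) * total

def density_second_form(theta, h, n=20000):
    # int_0^h gamma(b(h-a)) exp(-psi'(theta) a - int_0^a gamma(b(h-x)) dx) da
    da = h / n
    total, inner = 0.0, 0.0
    for i in range(n):
        a = (i + 0.5) * da
        rate = gamma_theta(theta, b(theta, h - a))
        inner_mid = inner + 0.5 * rate * da  # inner integral up to the midpoint
        total += rate * math.exp(-2.0 * BETA * theta * a - inner_mid) * da
        inner += rate * da
    return total

def density_finite_difference(theta, h, eps=1e-6):
    # Central difference approximation of -d/d theta of b^theta(h)
    return -(b(theta + eps, h) - b(theta - eps, h)) / (2.0 * eps)

theta, h = 0.5, 1.0
d1 = density_first_form(theta, h)
d2 = density_second_form(theta, h)
d3 = density_finite_difference(theta, h)
print(d1, d2, d3)
```

The three values should coincide up to discretisation error. The midpoint rule is used so that the singular endpoints (\(b^\theta (0)=+\infty \)) are never evaluated.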

Since we will also be dealing with super-critical trees, there is always a positive probability that in the Poisson process of trees an infinite tree arises, which will be grafted onto the process, effectively making it infinite and thus outgrowing height \(h\). In the next proposition, we will compute the conditional distribution of the overshooting time \(A_h\), given \(A\). Note that we always have \(A\le A_h\).

Proposition 4.3

Let \(\psi \) be a critical branching mechanism with \(\theta _\infty <0\) and satisfying Assumptions 1 and 2. For \(\theta _\infty <\theta _0 <\theta \) and \(\theta _0<0\) (that is \(\psi _{\theta _0}\) super-critical), we have, with \(\hat{\theta } = \bar{\theta }_0-\theta _0+\theta \):

$$\begin{aligned} \mathbb{N }^\psi [A_h \ge \theta | A=\theta _0 ]&= 1 - \psi ^{\prime }(\hat{\theta })\psi _{\hat{\theta }}\left( b^{\hat{\theta }}(h)\right) \int _{b^{\hat{\theta }}(h)}^{+\infty } \frac{dr}{\psi _{\hat{\theta }} (r)^2},\\ \mathbb{N }^\psi [A_h =A| A=\theta _0 ]&= \psi ^{\prime }(\bar{\theta }_0) \psi _{\bar{\theta }_0}\left( b^{\bar{\theta }_0}(h)\right) \int _{b^{\bar{\theta }_0}(h)}^{+\infty } \frac{dr}{\psi _{\bar{\theta }_0} (r)^2}\cdot \end{aligned}$$

Since \(\psi _{\bar{\theta }_0}\) is sub-critical, we have \(\psi ^{\prime }(\bar{\theta }_0)>0\) and \(\psi _{\bar{\theta }_0}(r) \sim r \psi ^{\prime }(\bar{\theta }_0)\) when \(r\) goes down to \(0\). Since \(\lim _{h\rightarrow +\infty } b^{\bar{\theta }_0}(h)=0\), we deduce that:

$$\begin{aligned} \lim _{h\rightarrow +\infty } \mathbb{N }^\psi [A_h =A| A=\theta _0 ]=1. \end{aligned}$$

This has a straightforward explanation. If \(h\) is very large, with high probability the process up to \(A\) will not have crossed height \(h\), so that the first jump to cross height \(h\) will correspond to the grafting time of the first infinite tree which happens at the ascension time \(A\). We also deduce from (47) that:

$$\begin{aligned} \mathbb{N }^\psi [A_h =A| A=\theta _0 ] = \psi ^{\prime }(\bar{\theta }_0) \psi _{\theta _0}\left( b^{\theta _0}(h)\right) \int _{b^{\theta _0}(h)}^{+\infty } \frac{dr}{\psi _{ \theta _0} (r)^2}\cdot \end{aligned}$$
(51)

Proof

We use the notation \(\mathcal{Z }^\theta _h=\mathcal{Z }_h(\mathcal{T }^\theta )\) and \(\mathcal{Z }_h=\mathcal{Z }_h(\mathcal{T }^0)\). We have:

$$\begin{aligned} \mathbb{N }^\psi [ A_h \ge \theta | A= \theta _0 ] = \mathbb{N }^\psi [ \mathcal{Z }_{h}^\theta >0 | A=\theta _0 ]&= \mathbb{N }^\psi [ \mathcal{Z }_h^{A+(\theta -\theta _0)} >0 | A=\theta _0 ] \\&= \psi ^{\prime }(\bar{\theta }_0) \mathbb{N }^\psi \left[ \sigma _0 {\mathbf{1}}_{\{ \mathcal{Z }_h^{(\theta -\theta _0)} >0\} } \mathrm{e}^{-\psi (\theta _0)\sigma _0} \right] \\&= \psi ^{\prime }(\bar{\theta }_0) \mathbb{N }^{\psi _{\bar{\theta }_0}} \left[ \sigma _0 {\mathbf{1}}_{\{ \mathcal{Z }_h^{(\theta -\theta _0)} >0\} } \right] \\&= \psi ^{\prime }(\bar{\theta }_0) \mathbb{N }^\psi \left[ \sigma _{\bar{\theta }_0} {\mathbf{1}}_{ \{\mathcal{Z }_{h}^{\bar{\theta }_0+(\theta -\theta _0)} >0\}} \right] \\&= \psi ^{\prime }(\bar{\theta }_0) \mathbb{N }^\psi \left[ \sigma _{\bar{\theta }_0} {\mathbf{1}}_{ \{\mathcal{Z }_{h}^{\hat{\theta }} >0\}} \right] , \end{aligned}$$

where we used (2) of Theorem 2.28 for the third equality, Girsanov formula (27) for the fourth and the homogeneity property of Theorem 2.26 in the fifth. We now condition with respect to \(\mathcal{T }^{\hat{\theta }}\). The indicator function being measurable, the only quantity left to compute is the conditional expectation of \(\sigma _{\bar{\theta }_0}\) given \(\mathcal{T }^{\hat{\theta }}\). Thanks to Lemma 2.27, the fact that \(\hat{\theta }>0\) and the homogeneity property, we get:

$$\begin{aligned} \mathbb{N }^\psi [ A_h \ge \theta | A= \theta _0 ] = \psi ^{\prime }(\hat{\theta })\mathbb{N }^\psi \left[ \sigma _{\hat{\theta }} {\mathbf{1}}_{\{\mathcal{Z }_h^{\hat{\theta }} >0 \} } \right] =\psi ^{\prime }(\hat{\theta })\mathbb{N }^{\psi _{\hat{\theta }}} \left[ \sigma {\mathbf{1}}_{\{\mathcal{Z }_{h} > 0 \}} \right] \!. \end{aligned}$$

Using that \(\mathbb{N }^{\psi _{\hat{\theta }}}[\sigma ]=1/\psi ^{\prime }(\hat{\theta })\), which can be deduced from (25), we get:

$$\begin{aligned} \mathbb{N }^\psi [ A_h \ge \theta | A= \theta _0 ]&= \psi ^{\prime }(\hat{\theta }) \mathbb{N }^{\psi _{\hat{\theta }}}[\sigma ] - \psi ^{\prime }(\hat{\theta }) \mathbb{N }^{\psi _{\hat{\theta }}} \left[ \int _0^h \mathcal{Z }_a da {\mathbf{1}}_{\{\mathcal{Z }_h =0 \}} \right] \\&= 1 - \psi ^{\prime }(\hat{\theta }) \int _0^h da\, \lim _{\lambda \rightarrow \infty } \mathbb{N }^{\psi _{\hat{\theta }}} \Big [ \mathcal{Z }_a \mathrm{e}^{-\lambda \mathcal{Z }_h} \Big ]. \end{aligned}$$

Now, conditioning by \(\mathcal{Z }_a\) and using \(\lim _{\lambda \rightarrow \infty } u^{\hat{\theta }}(h-t,\lambda )=b_h^{\hat{\theta }}(t)\) as well as (23), we get:

$$\begin{aligned} \lim _{\lambda \rightarrow \infty } \mathbb{N }^{\psi _{\hat{\theta }}} \Big [\mathcal{Z }_a \mathrm{e}^{-\lambda \mathcal{Z }_h} \Big ]&= \lim _{\lambda \rightarrow \infty } \mathbb{N }^{\psi _{\hat{\theta }}} \Big [ \mathcal{Z }_a \mathrm{e}^{-\mathcal{Z }_a u^{\hat{\theta }}(h-a,\lambda )}\Big ]\\&= \mathbb{N }^{\psi _{\hat{\theta }}}\left[ \mathcal{Z }_a \mathrm{e}^{-\mathcal{Z }_a b_h^{\hat{\theta }}(a)} \right] = \partial _\lambda u^{\hat{\theta }} (a,b_h^{\hat{\theta }}(a)). \end{aligned}$$

Then use (45) to get:

$$\begin{aligned} \int _0^h da\, \lim _{\lambda \rightarrow \infty } \mathbb{N }^{\psi _{\hat{\theta }}} \Big [ \mathcal{Z }_a \mathrm{e}^{-\lambda \mathcal{Z }_h} \Big ]&= \int _0^h da\, \partial _\lambda u^{\hat{\theta }} (a,b_h^{\hat{\theta }}(a)) \\&= \psi _{\hat{\theta }} (b^{\hat{\theta }}(h)) \int _0^h da\ \frac{|\partial _h b^{\hat{\theta }}(h-a)|}{\psi _{\hat{\theta }}(b^{\hat{\theta }}(h-a))^2}\\&= \psi _{\hat{\theta }}(b^{\hat{\theta }}(h))\int _{b^{\hat{\theta }}(h)}^{+\infty } \frac{dr}{\psi _{\hat{\theta }} (r)^2}, \end{aligned}$$

and thus deduce the first equality of the proposition. Note that \(\int ^{+\infty } dr/\psi _\theta (r)^2 <+\infty \) thanks to Assumption 2 (this is actually true in general). Let \(\theta \) go down to \(\theta _0\) and use the fact that \(\mathbb{N }^\psi \)-a.e. \(A\le A_h\) to get the second equality. \(\square \)

Remark 4.4

In the quadratic case \(\psi (u)=\beta u^2\), we can obtain closed formulæ. For all \(\theta >0\), we have:

$$\begin{aligned} u^\theta (t,\lambda ) = \frac{2\theta \lambda }{(2\theta +\lambda ) \exp (2\beta \theta t) -\lambda } \quad \text{ and }\quad b^\theta (t)=\frac{2\theta }{\mathrm{e}^{2\beta \theta t}-1}\cdot \end{aligned}$$

We deduce the following exact expression of the conditional distribution for \(\theta _0< \theta \), \(\theta _0<0\) and with \(\bar{\theta }_0=|\theta _0|=-\theta _0\) and \(\hat{\theta }=\theta +2|\theta _0|\):

$$\begin{aligned} \mathbb{N }^\psi [A_h \ge \theta | A=\theta _0 ]&= 1+ (\beta \hat{\theta } h)/ \mathrm{sinh}^2(\beta \hat{\theta } h) - \mathrm{cotanh}(\beta \hat{\theta } h),\\ \mathbb{N }^\psi [A_h =A| A=\theta _0 ]&= \beta \theta _0 h/ \mathrm{sinh}^2(\beta {\theta _0} h)-\mathrm{cotanh}(\beta {\theta _0} h). \end{aligned}$$

Notice that \(\lim _{\theta _0\rightarrow -\infty }\mathbb{N }^\psi [A_h =A| A=\theta _0]=1\). This corresponds to the fact that if \(A\) is large, then the tree \(\mathcal{T }_A\) is small and has little chance to cross level \(h\). (Note that \(\mathcal{T }_A\) has finite height but \(\mathcal{T }_{A-}\) has infinite height.) Thus the time \(A_h\) is equal to the time when an infinite tree is grafted, that is, to the ascension time \(A\).
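The closed formulæ of Remark 4.4 can be compared with the integral expressions of Proposition 4.3. The sketch below evaluates \(\psi ^{\prime }(\theta )\psi _\theta (b^\theta (h))\int _{b^\theta (h)}^{+\infty } dr/\psi _\theta (r)^2\) by a midpoint rule (after the substitution \(r=b^\theta (h)/s\)) and compares it with \(\mathrm{cotanh}(y)-y/\mathrm{sinh}^2(y)\), \(y=\beta \theta h\). Function names, step count and parameter values are assumptions of this illustration.

```python
import math

BETA = 1.0  # illustrative value of beta (assumption)

def psi_theta(theta, lam):
    return BETA * lam * (lam + 2.0 * theta)

def b(theta, h):
    return 2.0 * theta / math.expm1(2.0 * BETA * theta * h)

def tail_integral(theta, lower, n=20000):
    # int_lower^infty dr / psi_theta(r)^2, midpoint rule with r = lower / s, s in (0, 1)
    ds = 1.0 / n
    total = 0.0
    for i in range(n):
        s = (i + 0.5) * ds
        r = lower / s
        total += (lower / (s * s)) / psi_theta(theta, r) ** 2 * ds
    return total

def integral_formula(theta, h):
    # psi'(theta) * psi_theta(b(h)) * int_{b(h)}^infty dr / psi_theta(r)^2, theta > 0
    bh = b(theta, h)
    return 2.0 * BETA * theta * psi_theta(theta, bh) * tail_integral(theta, bh)

def closed_form(theta, h):
    # coth(y) - y / sinh(y)^2 with y = BETA * theta * h
    y = BETA * theta * h
    return 1.0 / math.tanh(y) - y / math.sinh(y) ** 2

theta_0, theta, h = -0.3, 0.2, 1.5
theta_bar_0 = abs(theta_0)
theta_hat = theta_bar_0 - theta_0 + theta

p_exit_after = 1.0 - integral_formula(theta_hat, h)  # N[A_h >= theta | A = theta_0]
p_exit_at_A = integral_formula(theta_bar_0, h)       # N[A_h = A | A = theta_0]
print(p_exit_after, 1.0 - closed_form(theta_hat, h))
print(p_exit_at_A, closed_form(theta_bar_0, h))
```

Since \(y\mapsto \mathrm{cotanh}(y)-y/\mathrm{sinh}^2(y)\) is odd, evaluating it at \(\beta \bar{\theta }_0 h\) gives the same number as the expression written with \(\theta _0\) in the remark.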

4.2 Distribution of the tree at the exit time

Before stating the theorem describing the tree before it overshoots a given height \(h>0\) under the form of a spinal decomposition, we will explain how this spine is distributed. Recall (46) for the definition of \(\gamma _\theta \).

Lemma 4.5

Let \(\psi \) be a critical branching mechanism satisfying Assumptions  1 and 2. Let \(\theta \in \Theta ^\psi \). The non-negative function

$$\begin{aligned} f:t\mapsto \gamma _\theta (b_h^\theta (t))\exp \left( -\int _0^t \gamma _\theta (b_h^\theta (r)) dr \right) \end{aligned}$$
(52)

is a probability density on \([0,h)\) with respect to Lebesgue measure. If \(\xi \) is a random variable with density \(f\), then we have \(\mathbb{E }[\exp (-\psi ^{\prime }(\theta ) \xi )]<+\infty \).

Notice the integrability property on \(\xi \) is trivial if \(\theta \ge 0\).

Proof

Notice that \(f=g^{\prime }\mathrm{e}^{-g}\) with \(g(t)=\int _0^t \gamma _\theta ( b_h^\theta (r))\, dr\). Thus we have

$$\begin{aligned} \int _0^h f= \int _0^h g^{\prime } \mathrm{e}^{-g} =\mathrm{e}^{-g(0)}-\mathrm{e}^{-g(h)} \end{aligned}$$

and \(f\) is a density if and only if \(g(h)=\infty \). We deduce from (50) that \(\int _0^t \gamma _\theta (b_h^\theta (r)) dr \) diverges as \(t\) goes to \(h\). The last part of Proposition 4.2 implies that \(\mathrm{e}^{-\psi ^{\prime }(\theta ) \xi }\) is integrable. \(\square \)
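The normalisation of \(f\) can be observed numerically. The sketch below integrates (52) with a midpoint rule in the quadratic case \(\psi (u)=\beta u^2\), where \(\gamma _\theta (\lambda )=2\beta \lambda \), for one positive and one negative value of \(\theta \). The names and parameter values are hypothetical, and the closed form of \(b^\theta \) is assumed to hold for both signs of \(\theta \).

```python
import math

BETA = 1.0  # illustrative value of beta (assumption)

def b(theta, h):
    # Closed form of b^theta(h) in the quadratic case
    return 2.0 * theta / math.expm1(2.0 * BETA * theta * h)

def gamma_theta(lam):
    # gamma_theta(lam) = 2 * BETA * lam, independent of theta in the quadratic case
    return 2.0 * BETA * lam

def spine_density_mass(theta, h, n=20000):
    # Returns (int_0^h f(t) dt, g(h)) with g(t) = int_0^t gamma_theta(b_h^theta(r)) dr
    dt = h / n
    mass, g = 0.0, 0.0
    for i in range(n):
        t = (i + 0.5) * dt
        rate = gamma_theta(b(theta, h - t))  # gamma_theta(b_h^theta(t))
        g_mid = g + 0.5 * rate * dt          # g evaluated at the midpoint
        mass += rate * math.exp(-g_mid) * dt
        g += rate * dt
    return mass, g

mass_pos, g_pos = spine_density_mass(0.5, 1.5)
mass_neg, g_neg = spine_density_mass(-0.4, 1.5)
print(mass_pos, g_pos, mass_neg, g_neg)
```

The accumulated value of \(g\) keeps growing as the grid is refined, in line with \(g(h)=\infty \), while the total mass stays close to 1.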

Recall Eq. (5) defining the grafting procedure.

Theorem 4.6

Let \(\psi \) be a critical branching mechanism satisfying Assumptions 1 and 2. Let \(\theta _\infty <\theta \) and let \(F\) be a non-negative measurable functional on \(\mathbb{T }^2\). Then, we have:

$$\begin{aligned} \mathbb{N }^\psi \left[ F(\mathcal{T }_{A_h},\, \mathcal{T }_{A_h-}) | A_h = \theta \right]&= \frac{1}{\mathbf{E}\left[ {\mathrm{e}^{- \psi ^{\prime }(\theta ) H_\mathbf{x}}} \right] } \mathbf{E}\left[ F\Big ([\![\varnothing ,\mathbf{x} ]\!]\circledast _{i\in I} (\mathcal{T }^i,x_i),\right. \\&\quad \times \left. ([\![\varnothing ,\mathbf{x} ]\!]\circledast _{i\in I} (\mathcal{T }^i,x_i)) \circledast (T,\mathbf{x})\Big ) \,\mathrm{e}^{- \psi ^{\prime }(\theta ) H_\mathbf{x}}\right] , \end{aligned}$$

where the spine \([\![\varnothing ,\mathbf{x} ]\!]\) is identified with the interval \([0,H_\mathbf{x}]\) (and thus \(y\in [\![\varnothing ,\mathbf{x} ]\!]\) is identified with \(H_y\)) and:

  • The random variable \(H_{\mathbf{x}}\) is distributed with density given by (52).

  • Conditionally on \(H_\mathbf{x} \), sub-trees are grafted on the spine \([0,H_\mathbf{x}]\) according to a Poisson point measure \(\mathcal{N }=\sum _{i\in I} \delta _{(x_i,\mathcal{T }^i)}\) on \([0, H_\mathbf{x}]\times \mathbb{T }\) with intensity:

    $$\begin{aligned} \nu _\theta (da,d\mathcal{T })&= da \left( 2 \beta (\theta +b_h^\theta (a)) \mathbb{N }^{\psi _{\theta }}[d\mathcal{T }, H_{\mathrm{max}}(\mathcal{T }) < h-a ]\right. \nonumber \\&\left. + \int _{(0,+\infty )} r \Pi _{\theta +b_h^{\theta }(a)}(dr) \mathbb{P }_r^{\psi _{\theta }}(d\mathcal{T }, H_{\mathrm{max}}(\mathcal{T }) < h-a ) \right) .\qquad \qquad \quad \end{aligned}$$
    (53)
  • Conditionally on \(H_\mathbf{x} \) and on \(\mathcal{N }\), \(T\) is a random variable on \(\mathbb{T }\) with distribution

    $$\begin{aligned} {\mathbf{N}}^{\psi _\theta }[dT|H_{\mathrm{max}}(T) > h -H_\mathbf{x}]. \end{aligned}$$

In other words, conditionally on \(\{A_h = \theta \}\), we can describe the tree before overshooting height \(h\) by a spinal decomposition along the ancestral branch of the point at which the overshooting sub-tree is grafted. Conditionally on the height of this point, the overshooting tree has distribution \(\mathbf{N}^{\psi _\theta }[dT]\), conditioned on overshooting.

If \(\theta >0\) then \(\psi ^{\prime }(\theta )>0\), and we can understand the weight \(\mathrm{e}^{- \psi ^{\prime }(\theta ) H_\mathbf{x} } / \mathbf{E}[\mathrm{e}^{- \psi ^{\prime }(\theta ) H_\mathbf{x} } ]\) as a conditioning of the random variable \(H_\mathbf{x}\) to be smaller than an independent exponential random variable with parameter \(\psi ^{\prime }(\theta )\): indeed, \(\mathrm{e}^{- \psi ^{\prime }(\theta ) H_\mathbf{x} }\) is the conditional probability, given \(H_\mathbf{x}\), that such an exponential variable exceeds \(H_\mathbf{x}\).
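The weight \(\mathrm{e}^{-\psi ^{\prime }(\theta )H_\mathbf{x}}\) is the conditional probability, given \(H_\mathbf{x}\), that an independent exponential variable with parameter \(\psi ^{\prime }(\theta )\) exceeds \(H_\mathbf{x}\). The following Monte Carlo sketch illustrates this on a toy three-point law for \(H\), which is purely hypothetical and is not the density (52); the constant `c` stands in for \(\psi ^{\prime }(\theta )>0\).

```python
import math
import random

def conditioned_mean(sample_h, c, n, rng):
    # Mean of H given that an independent Exp(c) variable exceeds H (rejection sampling)
    kept = []
    for _ in range(n):
        h = sample_h(rng)
        if rng.expovariate(c) > h:
            kept.append(h)
    return sum(kept) / len(kept)

def tilted_mean(values, probs, c):
    # Mean of H under the law reweighted by exp(-c * H) / E[exp(-c * H)]
    weights = [p * math.exp(-c * v) for v, p in zip(values, probs)]
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)

values = [0.2, 1.0, 2.5]  # toy support for H (assumption)
probs = [0.5, 0.3, 0.2]   # toy probabilities (assumption)
c = 0.8                   # stands in for psi'(theta) > 0

rng = random.Random(12345)  # fixed seed for reproducibility
mc = conditioned_mean(lambda r: r.choices(values, probs)[0], c, 200000, rng)
exact = tilted_mean(values, probs, c)
print(mc, exact)
```

The two means should agree up to Monte Carlo error, which is of order \(10^{-3}\) for this sample size.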

Remark 4.7

When \(h\) goes to infinity, we have, for \(\theta \ge 0\), \(\lim _{h\rightarrow +\infty } b^\theta (h)=0\) and thus the distribution of \(A_h\) concentrates on \(\Theta ^\psi \cap (-\infty ,0)\). For \(\theta <0\) and \(\theta \in \Theta ^\psi \), we deduce from (47) that \(\lim _{h\rightarrow +\infty } b^\theta (h)=\bar{\theta }-\theta >0\). And the distribution of \(\xi \) in Lemma 4.5 clearly converges to the exponential distribution with parameter \(\gamma _\theta (b^\theta (+\infty ))=\psi ^{\prime }(\bar{\theta }) -\psi ^{\prime }(\theta )\). Then the weight \(\mathrm{e}^{- \psi ^{\prime }(\theta ) H_\mathbf{x} } /\mathbf{E}[\mathrm{e}^{- \psi ^{\prime }(\theta ) H_\mathbf{x} } ]\) changes this distribution. In the end, \(H_\mathbf{x} \) is asymptotically distributed as an exponential random variable with parameter \(\psi ^{\prime }(\bar{\theta })\). Notice this is exactly the distribution of the height of a random leaf taken in \(\mathcal{T }_A\), conditionally on \(\{A=\theta \}\), see Lemma 7.6 in [5].
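The limits in Remark 4.7 can be made explicit in the quadratic case \(\psi (u)=\beta u^2\), where \(\bar{\theta }=-\theta \) for \(\theta <0\). The sketch uses the elementary fact that reweighting an exponential density of rate \(\rho \) by \(\mathrm{e}^{-cx}\) gives, after normalisation, an exponential density of rate \(\rho +c\) (provided \(\rho +c>0\)). Names and numerical values are illustrative assumptions.

```python
import math

BETA = 1.0  # illustrative value of beta (assumption)

def b(theta, h):
    # Closed form of b^theta(h) in the quadratic case, used here for theta < 0
    return 2.0 * theta / math.expm1(2.0 * BETA * theta * h)

def psi_prime(x):
    # psi'(x) for psi(u) = BETA * u**2
    return 2.0 * BETA * x

def gamma_theta(theta, lam):
    # gamma_theta(lam) = psi'(lam + theta) - psi'(theta), see (46)
    return psi_prime(lam + theta) - psi_prime(theta)

theta = -0.4
theta_bar = -theta

b_limit = b(theta, 200.0)                     # proxy for the limit h -> +infinity
spine_rate = gamma_theta(theta, b_limit)      # limiting rate of xi in Lemma 4.5
tilted_rate = spine_rate + psi_prime(theta)   # rate after the weight exp(-psi'(theta) x)

print(b_limit, spine_rate, tilted_rate)
```

With these values, `b_limit` matches \(\bar{\theta }-\theta \), `spine_rate` matches \(\psi ^{\prime }(\bar{\theta })-\psi ^{\prime }(\theta )\), and `tilted_rate` matches \(\psi ^{\prime }(\bar{\theta })\).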

Remark 4.8

A direct application of Theorem 4.6 with \(F(\mathcal{T },\mathcal{T }^{\prime })\) chosen equal to

$$\begin{aligned} G(\mathcal{T },\mathcal{T }^{\prime })={\mathbf{1}}_{\{\mathbf{m}^{\mathcal{T }}(\mathcal{T })<+\infty , \mathbf{m}^{\mathcal{T }^{\prime }}(\mathcal{T }^{\prime })=+\infty \}}, \end{aligned}$$
(54)

allows one to compute for \(\theta <0\):

$$\begin{aligned} \mathbb{N }^\psi [A=A_h|A_h=\theta ] =\left( \psi ^{\prime }(\bar{\theta })- \psi ^{\prime }(\theta ) \right) \frac{C(\theta ,h)}{\psi ^{\prime }(\bar{\theta }) - \psi ^{\prime }(\theta ) C(\theta ,h)}, \end{aligned}$$

where \(C(\theta ,h)=\psi ^{\prime }(\bar{\theta })\psi _\theta (b^\theta (h)) \int _{b^\theta (h)}^{+\infty } dr\, \psi _\theta (r)^{-2} =\mathbb{N }^\psi [A=A_h|A=\theta ]\). The last equality is a consequence of (51). As \(\lim _{h\rightarrow +\infty } \mathbb{N }^\psi [A=A_h|A=\theta ]=1\), we get that

$$\begin{aligned} \lim _{h\rightarrow +\infty }\mathbb{N }^\psi [A=A_h|A_h=\theta ]=1. \end{aligned}$$

Remark 4.9

By considering the function \(G\) in (54) instead of \(F\) in the proof of Theorem 4.6, we can recover the distribution of \(\mathcal{T }_A\) given in [5], but we can also get the joint distribution of \((\mathcal{T }_{A-},\mathcal{T }_A)\). Roughly speaking (and unsurprisingly), conditionally on \(\{A=\theta \}\), \(\mathcal{T }_{A-}\) is obtained from \(\mathcal{T }_A\) by grafting an independent random tree \(T\) on an independent leaf \(x\) chosen according to \(\mathbf{m}^{\mathcal{T }_A}(dx)\), and the distribution of \(T\) is \({\mathbf{N}}^{\psi _\theta }[dT,\,H_{\mathrm{max}}(T)=+\infty ]\). Notice that choosing a leaf at random on \(\mathcal{T }_A\) implies that the distribution of \(\mathcal{T }_A\) is a size-biased distribution of \(\mathbb{N }^{\psi _\theta }[d\mathcal{T }]\).

Proof of Theorem 4.6

Thanks to the compensation formula (41), we can write, if \(g\) is any measurable function \(\mathbb R \rightarrow \mathbb R _+\) with support in \((\theta _\infty , +\infty )\):

$$\begin{aligned}&\mathbb{N }^{\psi }[ F(\mathcal{T }_{A_h},\, \mathcal{T }_{A_h-}) g(A_h) ] \\&\quad = \mathbb{N }^\psi \left[ \sum _{j\in J} {\mathbf{1}}_{\{ H_{\mathrm{max}}(\mathcal{T }_{\theta _j}) <h \}}F(\mathcal{T }_{\theta _j},\, \mathcal{T }_{\theta _j}\circledast (\mathcal{T }^j,x_j)) g(\theta _j) {\mathbf{1}}_{\{ H_{x_j}+H_{\mathrm{max}}(\mathcal{T }^j) >h \} } \right] \\&\quad = \int _{\Theta ^\psi } d\theta \ g(\theta ) B(\theta ,h), \end{aligned}$$

where, using the homogeneity property and the Girsanov transformation (28):

$$\begin{aligned} B(\theta ,h)&= \mathbb{N }^\psi \left[ {\mathbf{1}}_{\{H_{\mathrm{max}}(\mathcal{T }_\theta ) < h\}} \int \mathbf{m}^{\mathcal{T }_\theta }(dx) \right. \\&\times \left. \int {\mathbf{N}}^{\psi _\theta }[dT] F(\mathcal{T }_\theta ,\, \mathcal{T }_\theta \circledast (T,x)) {\mathbf{1}}_{\{H_x + H_{\mathrm{max}}(T) >h\}} \right] \\&= \mathbb{N }^{\psi _\theta }\left[ {\mathbf{1}}_{\{H_{\mathrm{max}}(\mathcal{T }) < h\}} \int \mathbf{m}^{\mathcal{T }}(dx) \right. \\&\times \left. \int {\mathbf{N}}^{\psi _\theta }[dT] F(\mathcal{T },\, \mathcal{T }\circledast (T,x)) {\mathbf{1}}_{\{H_x + H_{\mathrm{max}}(T) >h\}} \right] \\&= \mathbb{N }^{\psi _{\bar{\theta }}}\left[ {\mathbf{1}}_{\{H_{\mathrm{max}}(\mathcal{T }) < h\}} \int \mathbf{m}^{\mathcal{T }}(dx) \right. \\&\times \left. \int {\mathbf{N}}^{\psi _\theta }[dT] F(\mathcal{T },\,\mathcal{T }\circledast (T,x)) {\mathbf{1}}_{\{H_x + H_{\mathrm{max}}(T) >h\}} \right] . \end{aligned}$$

Notice that the only change in the last equality is the replacement of \(\mathbb{N }^{\psi _{\theta }}\) by \(\mathbb{N }^{\psi _{\bar{\theta }}}\).

We now explain how the term \({\mathbf{1}}_{\{H_{\mathrm{max}}(\mathcal{T }) < h\}}\) changes the decomposition of \(\mathcal{T }\) according to the spine given in Theorem 2.18. Let \(\Phi \) be a non-negative measurable function defined on \([0,+\infty )\times \mathbb{T }\) and \(\varphi \) a non-negative measurable function defined on \([0,+\infty )\). Using Theorem 2.18 and the notations therein, we get:

$$\begin{aligned}&\mathbb{N }^{\psi _{\bar{\theta }}} \left[ \int \mathbf{m}^{\mathcal{T }}(dx) \right. \left. \varphi (H_x) \mathrm{e}^{-\langle \mathcal{M }_x, \Phi \rangle } {\mathbf{1}}_{\{H_{\mathrm{max}}(\mathcal{T }) < h\}} \right] \\&\quad =\int _0^\infty da \, \varphi (a) \mathrm{e}^{-\psi ^{\prime }_{\bar{\theta }} (0) a }\, \mathbb{E }\left[ \mathrm{e}^{-\sum _{i\in I} {\mathbf{1}}_{\{z_i\le a\}} \Phi (z_i, \bar{\mathcal{T }}^i)} \prod _{i\in I, z_i\le a} {\mathbf{1}}_{\{H_{\mathrm{max}}( \bar{\mathcal{T }}^i)< h-z_i\}} \right] \\&\quad =\int _0^{h }da \, \varphi (a) \, \exp \left( -\psi ^{\prime }({\bar{\theta }} ) a- \int _0^a dx\, {\mathbf{N}}^{\psi _{\bar{\theta }}}\left[ 1- \mathrm{e}^{-\Phi (x, \mathcal{T })} {\mathbf{1}}_{\{H_{\mathrm{max}}(\mathcal{T }) < h-x\}}\right] \right) \!. \end{aligned}$$

Using the definition of \({\mathbf{N}}^{\psi _{\bar{\theta }}}\), see (40), (46) and the Girsanov transformation (28), we get:

$$\begin{aligned}&{\mathbf{N}}^{\psi _{\bar{\theta }}}\left[ 1- \mathrm{e}^{-\Phi (x, \mathcal{T })} {\mathbf{1}}_{\{H_{\mathrm{max}}(\mathcal{T }) < h-x\}}\right] \\&\quad =\gamma _{\bar{\theta }}\left( \mathbb{N }^{\psi _{\bar{\theta }}}\left[ 1- \mathrm{e}^{-\Phi (x, \mathcal{T })} {\mathbf{1}}_{\{H_{\mathrm{max}}(\mathcal{T }) < h-x\}}\right] \right) \\&\quad =\gamma _{\bar{\theta }}\left( b^{\bar{\theta }}(h-x) + \mathbb{N }^{\psi _{ \theta }}\left[ \left( 1- \mathrm{e}^{-\Phi (x, \mathcal{T })}\right) {\mathbf{1}}_{\{H_{\mathrm{max}}(\mathcal{T }) < h-x\}}\right] \right) . \end{aligned}$$

Thanks to (46) and (47), we have for \(\lambda \ge 0\):

$$\begin{aligned} \gamma _{\bar{\theta }}(b^{\bar{\theta }}(h-x) +\lambda )= \gamma _{\theta +b^{\theta }(h-x) }(\lambda ) + \gamma _{\theta }(b^{\theta }(h-x)) + \psi ^{\prime }(\theta )-\psi ^{\prime }(\bar{\theta }). \end{aligned}$$
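
The way this identity enters the exponent of the spine decomposition can be sketched as follows. Let \(x\mapsto \lambda _x\) be a non-negative measurable function on \([0,a]\), with \(a<h\) (the notation \(\lambda _x\) is introduced here for illustration only). Applying the identity with \(\lambda =\lambda _x\) for each \(x\), integrating over \(x\in [0,a]\) and adding \(\psi ^{\prime }(\bar{\theta }) a\), the terms \(\psi ^{\prime }(\bar{\theta }) a\) cancel and we get:

$$\begin{aligned}&\psi ^{\prime }(\bar{\theta }) a + \int _0^a dx\, \gamma _{\bar{\theta }}\left( b^{\bar{\theta }}(h-x) +\lambda _x\right) \\&\quad = \psi ^{\prime }(\theta ) a + \int _0^a dx\, \gamma _{\theta }(b^{\theta }(h-x)) + \int _0^a dx\, \gamma _{\theta +b^{\theta }(h-x) }(\lambda _x). \end{aligned}$$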

Take \(\lambda = \mathbb{N }^{\psi _{\theta }}\left[ \left( 1- \mathrm{e}^{-\Phi (x, \mathcal{T })}\right) {\mathbf{1}}_{\{H_{\mathrm{max}}(\mathcal{T }) < h-x\}}\right] \), to deduce that:

$$\begin{aligned}&\mathbb{N }^{\psi _{\bar{\theta }}} \left[ \int \mathbf{m}^{\mathcal{T }}(dx) \right. \left. \varphi (H_x) \mathrm{e}^{-\langle \mathcal{M }_x,\Phi \rangle } {\mathbf{1}}_{\{H_{\mathrm{max}}(\mathcal{T }) < h\}} \right] \\&\quad =\int _0^h da \, \varphi (a) \, \exp \left( -\psi ^{\prime }({\theta }) a - \int _0^a dx\, \gamma _{\theta } (b^{\theta }(h-x)) \right) \\&\qquad \times \exp \left( - \int _0^a dx\, \gamma _{\theta + b^{\theta }(h-x)} \left( \mathbb{N }^{\psi _{\theta }} \left[ \left( 1- \mathrm{e}^{-\Phi (x,\mathcal{T })} \right) {\mathbf{1}}_{\{H_{\mathrm{max}}(\mathcal{T }) < h-x\}}\right] \right) \right) \\&\quad =\int _0^h da \, \varphi (a) \, \exp \left( -\psi ^{\prime }({\theta }) a - \int _0^a dx\, \gamma _{\theta } (b^{\theta }(h-x)) \right) \mathbb{E }\left[ \mathrm{e}^{-\sum _{i\in I} {\mathbf{1}}_{\{z_i\le a\}} \Phi (z_i, \tilde{\mathcal{T }}^i)} \right] , \end{aligned}$$

where under \(\mathbb{E }\), \(\sum _{i\in I} \delta _{(z_i, \tilde{\mathcal{T }}^i)}(dz, d\mathcal{T })\) is a Poisson point measure on \([0,h]\times \mathbb{T }\) with intensity \(\nu _\theta \) in (53). Since Laplace transforms characterize random measure distributions, we get that for any non-negative measurable function \(\tilde{F}\), we have:

$$\begin{aligned}&\mathbb{N }^{\psi _{\bar{\theta }}} \left[ \int \mathbf{m}^{\mathcal{T }}(dx) \tilde{F}(H_x, \mathcal{M }_x) {\mathbf{1}}_{\{H_{\mathrm{max}}(\mathcal{T }) < h\}} \right] \\&\quad = \int _0^{h}da \, \mathrm{e}^{-\psi ^{\prime }({\theta }) a - \int _0^a dx\, \gamma _{\theta } (b^{\theta }(h-x)) } \mathbb{E }\left[ \tilde{F}\left( a, \sum _{i\in I} {\mathbf{1}}_{\{ z_i\le a\}} \delta _{(z_i, \tilde{\mathcal{T }}^i)}\right) \right] . \end{aligned}$$

If we identify the spine \([\![\varnothing , x ]\!]\) (with its metric) with the interval \([ 0, H_x ]\) (with the Euclidean metric), we can use this result to compute \(B(\theta ,h) \) with:

$$\begin{aligned} \tilde{F}(H_x, \mathcal{M }_x)= \int {\mathbf{N}}^{\psi _\theta }[dT\,|\,H_x + H_{\mathrm{max}}(T) >h ] F(\mathcal{T },\, \mathcal{T }\circledast (T,x)), \end{aligned}$$

\(\mathcal{M }_x=\sum _{i\in I_x}\delta _{(H_{x_i}, \mathcal{T }^i)}\) and \(\mathcal{T }= [0, H_x ] \circledast _{i\in I_x}(\mathcal{T }^i,H_{x_i}) \). Since \({\mathbf{N}}^{\psi _\theta }[ H_{\mathrm{max}}(\mathcal{T }) >h ]= \gamma _\theta (b^\theta (h))\), we have:

$$\begin{aligned} \gamma _\theta (b^\theta (h-H_x)) \tilde{F}(H_x, \mathcal{M }_x)= \int {\mathbf{N}}^{\psi _\theta }[dT ] F(\mathcal{T },\, \mathcal{T }\circledast (T,x)) {\mathbf{1}}_{\{H_x + H_{\mathrm{max}}(T) >h\}}. \end{aligned}$$
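
This identity is a direct consequence of the definition of the conditional measure. As a sketch, take a leaf \(x\) with \(H_x<h\), which is the case on \(\{H_{\mathrm{max}}(\mathcal{T })<h\}\), and assume that \(\gamma _\theta (b^\theta (h-H_x))\) is positive and finite. Since \(\{H_x + H_{\mathrm{max}}(T) >h\}=\{H_{\mathrm{max}}(T) >h-H_x\}\), we have:

$$\begin{aligned} {\mathbf{N}}^{\psi _\theta }[dT\,|\,H_x + H_{\mathrm{max}}(T) >h ] = \frac{{\mathbf{1}}_{\{H_x + H_{\mathrm{max}}(T) >h\}}\, {\mathbf{N}}^{\psi _\theta }[dT ]}{{\mathbf{N}}^{\psi _\theta }[H_{\mathrm{max}}(T) >h-H_x ]} = \frac{{\mathbf{1}}_{\{H_x + H_{\mathrm{max}}(T) >h\}}\, {\mathbf{N}}^{\psi _\theta }[dT ]}{\gamma _\theta (b^\theta (h-H_x))}\cdot \end{aligned}$$

Integrating \(F(\mathcal{T },\, \mathcal{T }\circledast (T,x))\) against both sides and multiplying by \(\gamma _\theta (b^\theta (h-H_x))\) gives the stated equality.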

Therefore, we have:

$$\begin{aligned}&B(\theta ,h) = \mathbb{N }^{\psi _{\bar{\theta }}}\left[ {\mathbf{1}}_{\{H_{\mathrm{max}}(\mathcal{T }) < h\}} \int \mathbf{m}^{\mathcal{T }}(dx) \right. \\&\qquad \qquad \qquad \times \left. \int {\mathbf{N}}^{\psi _\theta }[dT] F(\mathcal{T },\,\mathcal{T }\circledast (T,x)) {\mathbf{1}}_{\{H_x + H_{\mathrm{max}}(T) >h\}} \right] \\&\quad =\int _0^{h }da \, \gamma _\theta (b^\theta (h-a)) \mathrm{e}^{-\psi ^{\prime }({ \theta }) a - \int _0^a dx\, \gamma _{ \theta } (b^{ \theta }(h-x)) } \mathbb{E }\left[ \tilde{F}\left( a, \sum _{i\in I} {\mathbf{1}}_{\{ z_i\le a\}} \delta _{(z_i, \tilde{\mathcal{T }}^i)} \right) \right] . \end{aligned}$$

Thus, we get:

$$\begin{aligned}&\mathbb{N }^{\psi }[ F(\mathcal{T }_{A_h},\, \mathcal{T }_{A_h-}) g(A_h) ]=\int _{\Theta ^\psi } d\theta \ g(\theta ) \int _0^{h }da \, \gamma _\theta (b^\theta (h-a) ) \\&\quad \mathrm{e}^{-\psi ^{\prime }({ \theta }) a - \int _0^a dx\, \gamma _{ \theta } (b^{ \theta }(h-x)) } \mathbb{E }\left[ \tilde{F}\left( a, \sum _{i\in I} {\mathbf{1}}_{\{ z_i\le a\}} \delta _{(z_i, \tilde{\mathcal{T }}^i)}\right) \right] . \end{aligned}$$

Then use the distribution of \(A_h\) under \(\mathbb{N }^\psi \) given in Proposition 4.2 to conclude. \(\square \)