1 Introduction

The original transport problem proposed by Monge [33] is to optimally move a pile of soil to an excavation. Mathematically, given two measures \(\nu \) and \(\mu \) of equal mass, we look for an optimal bijection of \(\mathbb {R}^d\) which moves \(\nu \) to \(\mu \), i.e., look for a map \(S\) so that

$$\begin{aligned} \int _{\mathbb {R}^d} \varphi (S(x)) d\nu (x) = \int _{\mathbb {R}^d} \varphi (x) d\mu (x), \end{aligned}$$

for all continuous functions \(\varphi \). Then, with a given cost function \(c\), the objective is to minimize

$$\begin{aligned} \int _{\mathbb {R}^d}\ c(x,S(x))\ d\nu (x) \end{aligned}$$

over all bijections \(S\).
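As a toy illustration of the Monge formulation above, the following is a minimal sketch, assuming two uniform discrete measures on the real line with the same number of atoms. This is a simplification of the general setting on \(\mathbb {R}^d\). In that case a bijection is a permutation of the atoms and the problem can be solved by brute force. The function name `monge_cost` and the sample data are hypothetical and chosen only for illustration.

```python
from itertools import permutations


def monge_cost(xs, ys, cost):
    """Brute-force the Monge problem between two uniform discrete measures.

    xs, ys: equally long lists of atoms, each atom carrying mass 1/len(xs).
    Returns (minimal average cost, optimal assignment as a tuple of indices).
    """
    n = len(xs)
    best_value, best_perm = float("inf"), None
    for perm in permutations(range(n)):
        # perm[i] = j means that the bijection S sends xs[i] to ys[j].
        value = sum(cost(xs[i], ys[perm[i]]) for i in range(n)) / n
        if value < best_value:
            best_value, best_perm = value, perm
    return best_value, best_perm


# Hypothetical data: three atoms on each side, quadratic cost c(x, y) = (x - y)^2.
xs = [0.0, 1.0, 2.0]
ys = [3.0, 1.5, 0.5]
value, perm = monge_cost(xs, ys, lambda x, y: (x - y) ** 2)
print(value, perm)
```

For the quadratic cost on the line the optimal assignment found here is the monotone one, pairing the smallest atom of one measure with the smallest atom of the other. The enumeration grows factorially with the number of atoms, so it is only meant for very small examples.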

In his classical papers [29, 30], Kantorovich relaxed this problem by considering a probability measure on \(\mathbb {R}^d \times \mathbb {R}^d\), whose marginals agree with \(\nu \) and \(\mu \), instead of a bijection. This generalization linearizes the problem. Hence it allows for an easy existence result and enables one to identify its convex dual. Indeed, the dual elements are real-valued continuous maps \((g, h)\) of \(\mathbb {R}^d\) satisfying the constraint

$$\begin{aligned} g(x) + h(y) \le c(x,y). \end{aligned}$$
(1.1)

The dual objective is to maximize

$$\begin{aligned} \int _{\mathbb {R}^d} g(x) \ d\nu (x) + \int _{\mathbb {R}^d} h(y) \ d\mu (y) \end{aligned}$$

over all \((g, h)\) satisfying the constraint (1.1). Over the last decades an impressive theory has been developed, and we refer the reader to [1, 40, 41] and to the references therein.

In robust hedging problems, we are also given two measures, namely the initial and the final distributions of a stock process. We then construct an optimal connection. In general, however, the cost functional depends on the whole path of this connection and not simply on the final value. Hence, one needs to consider processes instead of simply the maps \(S\). The probability distribution of this process has prescribed marginals at the initial and final times. Thus, it is in direct analogy with the Kantorovich measure. However, financial considerations restrict the process to be a martingale (see Definition 2.4). Interestingly, the dual also has a financial interpretation as a robust hedging (super-replication) problem. Indeed, the replication constraint is similar to (1.1). The formal connection between the original Monge–Kantorovich problem and the financial problem is further discussed in Remark 2.9 and also in the papers [6, 21].

We continue by describing the robust hedging problem. Consider a financial market consisting of one risky asset with a continuous price process. As in the classical paper of Hobson [22], all call options are liquid assets and can be traded for a “reasonable” price that is known initially. Hence, the portfolio of an investor consists of static positions in the call options in addition to the usual dynamically updated risky asset. This leads us to a structure similar to that in [22] and in other papers [7, 9, 11, 12, 15, 16, 21, 24–28, 32] which consider model-independent pricing. This approach is very closely related to path-wise proofs of well-known probabilistic inequalities [2, 10]. Apart from the continuity of the price process, no other model assumptions are placed on the dynamics of the price process.

In this market, we prove the Kantorovich duality, Theorem 2.7, and an approximation result, Theorem 2.10, for a general class of path-dependent options. The classical duality theorem, for a market with a risky asset whose price process is a semi-martingale, states that the minimal super-replication cost of a contingent claim is equal to the supremum of its expected value over all martingale measures that are equivalent to a given measure. We refer the reader to Delbaen and Schachermayer [17] (Theorem 5.7) for the case of general semi-martingale processes and to El-Karoui and Quenez [20] for its dynamic version in the diffusion case. Theorem 2.7 below also provides a dual representation of the minimal super-replication cost, but for model-independent markets. The dual is given as the supremum of the expectations of the contingent claim over all martingale measures with a given marginal at the maturity but with no dominating measure. Since no probabilistic model is pre-assumed for the price process, the class of all martingale measures is quite large. Moreover, martingale measures are typically orthogonal to each other. These facts render the problem difficult.

In the literature, there are two earlier results in this direction. In a purely discrete setup, a similar result was recently proved by Beiglböck, Henry-Labordère and Penkner [6]. In their model, the investor is allowed to buy all call options at finitely many given maturities and the stock is traded only at these possible maturities. In this paper, however, the stock is traded in continuous time together with a static position in the calls with one maturity. In [6] the dual is recognized as a Monge–Kantorovich type optimal transport for martingale measures and the main tool in [6] is a duality result from optimal transport (see Theorem 2.14 in [31]).

In continuous time, Galichon, Henry-Labordère and Touzi [21] prove a different duality and then use the dual to convert the problem to an optimal control problem. There are two main differences between our result and the one proved in [21]. The duality result, Proposition 2.1 in [21], states that the minimal super-replication cost is given as the infimum over Lagrange multipliers and supremum over martingale measures without the final time constraint, and the Lagrange multipliers are related to the constraint. Also the problem formulation is different. The model in [21] assumes a large class of possible martingale measures for the price process. The duality is then proved by extending an earlier unconstrained result proved in [38]. As in the unconstrained model of [14, 38, 39], the super-replication is defined not path-wise but rather probabilistically through quasi-sure inequalities. Namely, the super-replication cost is the minimal initial wealth from which one can super-replicate the option almost surely with respect to all measures in a given class. In general, these measures are not dominated by one measure. As already mentioned, this is the main difficulty and sets the current problem apart from the classical duality discussed earlier. However, our duality result together with the results of [21] implies that these two approaches, namely robust hedging through the path-wise definition of this paper and the quasi-sure definition of [14, 38, 39], yield the same value. This is proved in Sect. 3 below.

Our second result provides a class of portfolios which are managed on a finite number of random times and asymptotically achieve the minimal super-replication cost. This result may have practical implications allowing us to numerically investigate the corresponding discrete hedges, but we relegate this to a future study.

Robust hedging has been an active research area over the past decade. The initial paper of Hobson [22] studies the case of the lookback option. The connection to the Skorokhod embedding is also made in this paper and an explicit solution for the minimal super-replication cost is obtained. This approach is further developed by Brown, Hobson and Rogers [7], Cox and Obłój [11, 13] and in several other papers [23–28]. We refer the reader to the excellent survey of Hobson [23] for robust hedging and to Obłój [34] for the Skorokhod embedding problem. In particular, the recent paper by Cox and Wang [13] provides a discussion of various constructions of Root’s solution of the Skorokhod embedding.

A similar modeling approach is applied to volatility options by Carr and Lee [9]. In a recent paper, Davis, Obłój and Raval [16] consider variance swaps in a market with finitely many put options. In particular, in [16] the class of admissible portfolios is enlarged and numerical evidence is obtained by analyzing the S&P500 index options data. Furthermore, [16] contains a duality result in a simpler setting, using the classical Karlin–Isii duality in semi-infinite linear programming.

As already mentioned above, the dual approach is used by Galichon, Henry-Labordère and Touzi [21] and Henry-Labordère and Touzi [32] as well. In these papers, the duality provides a connection to stochastic optimal control which can then be used to compute the solution in a more systematic manner.

The proof of the main results is done in four steps. The first step is to reduce the problem to bounded claims. The second step is to represent the original robust hedging problem as a limit of robust hedging problems which live on a sequence of countable spaces. For this type of problem, robust hedging is the same as classical hedging, under the right choice of a probability measure. Thus we can apply the classical duality results for super-hedging of European options on a given probability space. The third step is to use the discrete structure and apply a standard min–max theorem (similar to the one used in [6]). The last step is to analyze the limit of the obtained prices in the discrete time markets. We combine methods from arbitrage-free pricing and limit theorems for stochastic processes.

The paper is organized as follows. The main results are formulated in the next section. In Sect. 3, the connection between the quasi sure approach and ours is proved. The two sections that follow are devoted to the proof of one inequality which implies the main results. The final section discusses a possible extension.

2 Preliminaries and main results

The financial market consists of a savings account which is normalized to unity \(B_t\equiv 1\) by discounting and of a risky asset \(S_t\), \(t\in [0,T]\), where \(T<\infty \) is the maturity date. Let \(s:=S_0>0\) be the initial stock price and without loss of generality, we set \(s=1\). Denote by \(\mathcal {C}^{+}[0,T]\) the set of all strictly positive continuous functions \(f:[0,T]\rightarrow \mathbb {R}_{+}\) which satisfy \(f_0=1\). We assume that \(S_t\) is a continuous process. Then, any element of \(\mathcal {C}^{+}[0,T]\) can be a possible path for the stock price process \(S\). Let us emphasize that this is the only assumption that we make on our financial market.

Denote by \(\mathcal {D}[0,T]\) the space of all measurable functions \(\upsilon :[0,T]\rightarrow \mathbb {R}\) with the norm \(||\upsilon ||=\sup _{0\le t \le T}|\upsilon _t|\). Let \(G:\mathcal {D}[0,T]\rightarrow \mathbb {R}\) be a given deterministic map. We then consider a path dependent European option with the payoff

$$\begin{aligned} X=G(S), \end{aligned}$$
(2.1)

where \(S\) is viewed as an element in \(\mathcal {D}[0,T]\).

2.1 An assumption on the claim

Since our proof is through an approximation argument, we need some regularity of the pay-off functional \(G\). Indeed, we first approximate the stock price process by piece-wise constant functions taking values in a finite set. We also discretize the jump times to obtain a countable set of possible price processes. This procedure necessitates a continuity assumption with respect to a Skorokhod type topology. Further discussion of this assumption is given in Remark 2.2. In particular, Asian and lookback type options satisfy the condition below. A possible generalization of our result to a more general class of pay-offs is discussed in the final Sect. 6.

Let \(\mathcal {D}_N[0,T]\) be the subset of \(\mathcal {D}[0,T]\) consisting of piece-wise constant functions with \(N\) possible jumps, i.e., \(v \in \mathcal {D}_N[0,T]\) if and only if there exists a partition \( t_0=0 < t_1 <t_2<\cdots <t_N < T\) such that

$$\begin{aligned} v_t= \sum _{i=1}^{N}\ v_i \chi _{[t_{i-1},t_i)}(t) + v_{N+1}\chi _{[t_{N},T]}(t), \ \ {\hbox {where}}\ \ v_i:= v_{t_{i-1}}, \end{aligned}$$

and \(\chi _A\) denotes the characteristic function of the set \(A\). We make the following standing assumption on \(G\).

Assumption 2.1

There exists a constant \(L>0\) so that

$$\begin{aligned} |G(\omega )- G(\tilde{\omega })|\le L \Vert \omega -\tilde{\omega }\Vert , \ \ \omega ,\tilde{\omega }\in \mathcal {D}[0,T], \end{aligned}$$

where as before, \(||\cdot ||\) is the \(\sup \) norm.

Moreover, let \(\upsilon , \tilde{\upsilon }\in \mathcal {D}_N[0,T]\) be such that \(\upsilon _{i}=\tilde{\upsilon }_{i}\) for all \(i=1,\ldots ,N\). Then,

$$\begin{aligned} |G(\upsilon )-G(\tilde{\upsilon })| \le L \Vert \upsilon \Vert \sum _{k=1}^{N}|\Delta t_k-\Delta \tilde{t}_k|, \end{aligned}$$

where as usual \(\Delta t_k := t_k- t_{k-1}\) and \(\Delta \tilde{t}_k := \tilde{t}_k- \tilde{t}_{k-1}\).

Remark 2.2

In our setup, the process \(S\) represents the discounted stock price and \(G(S)\) represents the discounted reward. Let \(r>0\) be the constant interest rate. Then, the payoff

$$\begin{aligned} G(S):=e^{-rT}H\left( e^{rT}S_T, \min _{0\le t\le T} e^{rt}S_t, \max _{0\le t\le T} e^{rt}S_t,\int _{0}^T e^{rt} S_t dt\right) , \end{aligned}$$

with a Lipschitz continuous function \(H:\mathbb {R}^4\rightarrow \mathbb {R}\) satisfies the above assumption.

The above condition on \(G\) is, in fact, a Lipschitz assumption with respect to a metric very similar to the Skorokhod one. However, it is weaker than assuming Lipschitz continuity with respect to the Skorokhod metric. Recall that this classical metric is given by

$$\begin{aligned} d(f,g):=\inf _{\lambda }\sup _{0\le t\le T} \max \left( |f(t)-g(\lambda (t))|,|\lambda (t)-t| \right) \!, \end{aligned}$$

where the infimum is taken over all time changes. A time change is a strictly increasing continuous function which satisfies \(\lambda (0)=0\) and \(\lambda (T)=T\). We refer the reader to Chapter 3 in [5] for more information. In particular, while \(\int _{0}^T S_t dt\) is continuous with respect to the Skorokhod metric in \(\mathcal {D}[0,T]\), it is not Lipschitz continuous with respect to this metric. Thus the above assumption is needed in order to include Asian options.

Moreover, from our proof of the main results it can be shown that Theorems 2.7 and 2.10 can be extended to payoffs of the form

$$\begin{aligned} e^{-rT}H\left( e^{r t_1}S_{t_1},\ldots , e^{r t_k} S_{t_k}, \min _{0\le t\le T} e^{rt} S_t, \max _{0\le t\le T}e^{rt}S_t, \int _{0}^T e^{rt} S_t dt\right) \end{aligned}$$

where \(H\) is Lipschitz and \(0<t_1<\cdots <t_k\le T\). \(\square \)
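The second inequality of Assumption 2.1 can be checked numerically on a simple case. The following is a minimal sketch, assuming \(r=0\) and the Asian-type functional \(G(\upsilon )=\int _0^T \upsilon _t dt\) evaluated on piece-wise constant paths. The helper names `asian_payoff` and `time_shift_bound` and the sample paths are hypothetical. For this particular functional a direct computation shows that the constant \(L=2\) suffices, and this is the constant used in the check.

```python
def asian_payoff(values, jump_times, T):
    """Integral over [0, T] of a piece-wise constant path.

    values: v_1, ..., v_{N+1}; jump_times: t_1 < ... < t_N < T.
    The path equals values[i] on [t_i, t_{i+1}) with t_0 = 0 and t_{N+1} = T.
    """
    grid = [0.0] + list(jump_times) + [T]
    return sum(v * (b - a) for v, a, b in zip(values, grid, grid[1:]))


def time_shift_bound(values, times_a, times_b):
    """||v|| * sum_k |dt_k - dt~_k|, i.e. the right-hand side of the second
    inequality in Assumption 2.1 without the constant L."""
    grid_a = [0.0] + list(times_a)
    grid_b = [0.0] + list(times_b)
    gaps = sum(
        abs((a1 - a0) - (b1 - b0))
        for a0, a1, b0, b1 in zip(grid_a, grid_a[1:], grid_b, grid_b[1:])
    )
    return max(abs(v) for v in values) * gaps


# Two hypothetical paths with the same values but different jump times.
values = [1.0, 1.5, 0.5]
times_a = [0.25, 0.5]
times_b = [0.5, 0.75]
T = 1.0

diff = abs(asian_payoff(values, times_a, T) - asian_payoff(values, times_b, T))
bound = time_shift_bound(values, times_a, times_b)
print(diff, 2.0 * bound)  # the first number should not exceed the second
```

The sketch only tests one pair of paths and is not a proof. It is meant to make concrete how the bound depends on the differences of the inter-jump times rather than on the sup distance between the two paths.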

2.2 European calls

We assume that, at time zero, the investor is able to buy any call option with strike \(K\ge 0\), for the price

$$\begin{aligned} C(K):= \int \left( x-K\right) ^{+}d\mu (x), \end{aligned}$$
(2.2)

where \(\mu \) is a given probability measure on \(\mathbb {R}_{+}\). The measure \(\mu \) is assumed to be derived from observed call prices that are liquidly traded in the market. One may also think of \(\mu \) as describing the probabilistic belief (in the market) about the stock price distribution at time \(T\). Then, an approximation argument implies that the price of a derivative security with the payoff \(g(S_T)\) with a bounded, measurable \(g\) must be given by \(\int g d\mu \). We then assume that this formula also holds for all \(g \in \mathbb {L}^1(\mathbb {R}_+,\mu )\).

In particular, \(C(0)= \int x d\mu (x)\). On the other hand, the pay-off of the call option with strike zero is one share of the stock. Hence, the value of \(C(0)\) must be equal to the initial stock price \(S_0\) which is normalized to one. Therefore, although the probability measure \(\mu \) is quite general, in view of our assumption (2.2) and arbitrage considerations, it should satisfy

$$\begin{aligned} C(0)=\int xd\mu (x)=S_0=1. \end{aligned}$$
(2.3)
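To make (2.2) and (2.3) concrete, the following is a minimal sketch, assuming a hypothetical three-point measure \(\mu \) with mean one. The function name `call_price` and the atoms and weights are illustrative choices and are not taken from the paper.

```python
def call_price(atoms, weights, strike):
    """C(K) = sum_i w_i * max(x_i - K, 0) for a discrete measure mu, cf. (2.2)."""
    return sum(w * max(x - strike, 0.0) for x, w in zip(atoms, weights))


# Hypothetical marginal at time T: three atoms with mean one, as required by (2.3).
atoms = [0.5, 1.0, 1.5]
weights = [0.25, 0.5, 0.25]

c0 = call_price(atoms, weights, 0.0)  # strike zero: the price of one share
c1 = call_price(atoms, weights, 1.0)  # at-the-money call
print(c0, c1)
```

With these data the strike-zero price equals the normalized initial stock price, in agreement with (2.3), and the call price is non-increasing in the strike.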

For technical reasons, we also assume that there exists \(p>1\) such that

$$\begin{aligned} \int x^p d\mu (x)<\infty . \end{aligned}$$
(2.4)

2.3 Admissible portfolios

We continue by describing the continuous time trading in the underlying asset \(S\). Since we do not assume any semi-martingale structure of the risky asset, this question is nontrivial. We adopt the path-wise approach and require that the trading strategy (in the risky asset) is of finite variation. Then, for any function \(h:[0,T]\rightarrow \mathbb {R}\) of finite variation and continuous function \(S \in \mathcal {C}[0,T]\), we use integration by parts to define

$$\begin{aligned} \int _{0}^t h_u dS_u:= h_t S_t-h_0S_0-\int _{0}^t S_udh_u, \end{aligned}$$

where the last term in the above right hand side is the standard Stieltjes integral.
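For a piece-wise constant strategy the definition above reduces to the elementary sum of holdings times price increments. The following is a minimal sketch of this identity, assuming a step strategy that holds `holdings[k]` on the interval `(times[k], times[k+1]]`, so that the Stieltjes integral becomes a sum over the jumps of the strategy. The function names and the sample path are hypothetical.

```python
import math


def integral_by_parts(holdings, times, path):
    """Path-wise integral of a step strategy against a continuous path.

    holdings[k] is held on (times[k], times[k+1]]; times[0] = 0, times[-1] = T.
    Computes h_T S_T - h_0 S_0 - (Stieltjes integral of S dh), where the
    Stieltjes integral is the sum of S at each jump time times the jump of h.
    """
    s_start, s_end = path(times[0]), path(times[-1])
    stieltjes = sum(
        path(times[k]) * (holdings[k] - holdings[k - 1])
        for k in range(1, len(holdings))
    )
    return holdings[-1] * s_end - holdings[0] * s_start - stieltjes


def trading_gains(holdings, times, path):
    """Elementary sum of holdings times price increments."""
    return sum(h * (path(b) - path(a)) for h, a, b in zip(holdings, times, times[1:]))


def path(t):
    """Hypothetical continuous, strictly positive price path with S_0 = 1."""
    return 1.0 + 0.3 * math.sin(2.0 * math.pi * t)


times = [0.0, 0.25, 0.6, 1.0]
holdings = [1.0, -0.5, 2.0]

lhs = integral_by_parts(holdings, times, path)
rhs = trading_gains(holdings, times, path)
print(lhs, rhs)  # the two values agree up to rounding
```

The agreement is an instance of summation by parts. The point of the definition in the text is that the right-hand side of the integration by parts formula makes sense for any strategy of finite variation, without any probabilistic structure on the path.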

We are now ready to give the definition of semi-static portfolios and super-hedging. Recall the exponent \(p\) in (2.4).

Definition 2.3

  1.

    We say that a map

    $$\begin{aligned} \phi : A \subset \mathcal {D}[0,T]\rightarrow \mathcal {D}[0,T] \end{aligned}$$

    is progressively measurable, if for any \(v, \tilde{v} \in A\),

    $$\begin{aligned} v_u=\tilde{v}_u, \ \ \forall u \in [0,t] \ \ \Rightarrow \ \ \phi (v)_t=\phi (\tilde{v})_t. \end{aligned}$$
    (2.5)
  2.

    A semi-static portfolio is a pair \(\pi :=(g,\gamma )\), where \(g\in \mathbb {L}^1(\mathbb {R}_+,\mu )\) and

    $$\begin{aligned} \gamma :\mathcal {C}^{+}[0,T]\rightarrow \mathcal {D}[0,T] \end{aligned}$$

    is a progressively measurable map of bounded variation.

  3.

    The corresponding discounted portfolio value is given by,

    $$\begin{aligned} Z^{\pi }_t(S)=g(S_T)\chi _{\{t=T\}}+\int _{0}^t \gamma _u( S) d{S}_u, \quad t\in [0,T], \end{aligned}$$

    where \(\chi _{A}\) is the indicator of the set \(A\). A semi-static portfolio is admissible, if there exists \(M>0\) such that

    $$\begin{aligned} Z^\pi _t(S)\ge -M \left( 1+\sup _{0\le u\le t}S^p_u\right) , \quad \forall \, t\in [0,T], \ \ S\in \mathcal {C}^{+}[0,T]. \end{aligned}$$
    (2.6)
  4.

    An admissible semi-static portfolio is called super-replicating, if

    $$\begin{aligned} Z^\pi _T(S)\ge G(S), \quad \forall {S}\in \mathcal {C}^{+}[0,T]. \end{aligned}$$

    Namely, we require that for any possible path of the stock process, the portfolio value at maturity is no less than the reward of the European claim.

  5.

    The (minimal) super-hedging cost of \(G\) is defined by,

    $$\begin{aligned} V(G):=\inf \left\{ \int g d\mu : \ \exists \gamma \ \hbox {such} \ \hbox {that} \ \pi :=(g,\gamma ) \ \hbox {is} \ \hbox {super-replicating} \ \right\} . \end{aligned}$$

Notice that the set of admissible portfolios depends on the exponent \(p\) which appears in the assumption (2.4). We suppress this possible dependence to simplify the exposition.

2.4 Martingale optimal transport

Since the dual formula refers to a probabilistic structure, we need to introduce that structure as well. Set \(\Omega :=\mathcal {C}^+[0,T]\) and let \(\mathbb {S}=(\mathbb {S}_t)_{0\le t\le T}\) be the canonical process given by \(\mathbb {S}_t(\omega ):=\omega _t\), for all \(\omega \in \Omega \). Let \(\mathcal {F}_t:=\sigma (\mathbb {S}_s,\, 0\le s\le t)\) be the canonical filtration (which is not right continuous).

The following class of probability measures is central to our results. Recall that we have normalized the stock prices to have initial value one. Therefore, the probability measures introduced below need to satisfy this condition as well.

Definition 2.4

A probability measure \(\mathbb {Q}\) on the space \((\Omega ,{\mathcal F})\) is a martingale measure, if the canonical process \(({\mathbb {S}_t})_{t=0}^T\) is a local martingale with respect to \(\mathbb {Q}\) and \(\mathbb {S}_0=1\) \(\mathbb {Q}\)-a.s.

For a probability measure \(\mu \) on \(\mathbb {R}_+\), \(\mathbb {M}_\mu \) is the set of all martingale measures \(\mathbb {Q}\) such that the probability distribution of \(\mathbb {S}_T\) under \(\mathbb {Q}\) is equal to \(\mu \).

Note that if \(\mu \) satisfies (2.3), then the canonical process \(({\mathbb {S}_t})_{t=0}^T\) is a martingale (not only a local martingale) under any measure \(\mathbb {Q}\in \mathbb {M}_\mu \). Indeed, a strict local martingale satisfies

$$\begin{aligned} 1=\mathbb {S}_0>\mathbb {E}_{\mathbb {Q}}[ \mathbb {S}_T]=\int x d\mu (x), \end{aligned}$$

and it would be in contradiction with (2.3). We use \(\mathbb {E}_{\mathbb {Q}}\) to denote the expectation with respect to \(\mathbb {Q}\).

Remark 2.5

Observe that (2.3) yields that the set \(\mathbb {M}_\mu \) is not empty. Indeed, consider a complete probability space \((\Omega ^W,\mathcal {F}^W,P^W)\) together with a standard one-dimensional Brownian motion \((W_t)_{t=0}^\infty \), and the natural filtration \(\mathcal {F}^W_t\) which is the completion of \(\sigma {\{W_s|s\le {t}\}}\). Then, there exists a function \(f: \mathbb {R}\rightarrow \mathbb {R}_{+}\) such that the probability distribution of \(f(W_T)\) is equal to \(\mu \). Define the martingale \(M_t:=E^W(f(W_T)| \mathcal {F}^W_t)\), \(t\in [0,T]\). In view of (2.3), \(M_0=1\). Since \(M\) is a Brownian martingale, it is continuous. Moreover, since \(\mu \) has support on the positive real line, \(f \ge 0\) and consequently, \(M \ge 0\). Then, the distribution of \(M\) on the space \(\Omega \) is an element in \( \mathbb {M}_\mu \). This construction underpins Bass's solution to the Skorokhod embedding problem (see [4]).

Remark 2.6

Clearly the duality is very closely related to the fundamental theorem of asset pricing, which states the existence of a measure \(\mathbb {Q}\in \mathbb {M}_\mu \). Since, as shown in the above remark, such measures exist under our set of assumptions, the market considered in this paper is arbitrage-free. Then, a natural question that arises is whether our assumptions on the option prices and the measure \(\mu \) can be replaced by the assumption of no-arbitrage. We do not address this very interesting question in this paper. However, several recent papers [3, 8] study this question in discrete time.

The following is the main result of the paper. An outline of its proof is given in Sect. 2.6, below.

Theorem 2.7

Assume that the European claim \(G\) satisfies the Assumption 2.1 and the probability measure \(\mu \) satisfies (2.3) and (2.4). Then, the minimal super-hedging cost is given by

$$\begin{aligned} V(G)=\sup _{\mathbb {Q}\in \mathbb {M}_\mu } \mathbb {E}_{\mathbb {Q}}\left[ G(\mathbb {S})\right] \!. \end{aligned}$$

Remark 2.8

The above theorem provides a duality result for the robust semi-static hedging of a general pay-off. Many specific examples have been considered in the literature. Indeed, the initial paper of Hobson [22] explicitly provides the hedge for a lookback option. Similarly, [7, 9, 11] and several other papers use the random time change and Skorokhod embedding methods to analyze barrier options, lookback options and volatility options. Also, the path-wise proof of Doob’s maximal inequality given in [2] constructs an explicit portfolio which robustly hedges the power of the running maximum. We use this hedge in the proof of Lemma 4.1 as well.

Remark 2.9

One may consider the maximizer, if it exists, of the expression

$$\begin{aligned} \sup _{\mathbb {Q}\in \mathbb {M}_\mu } \mathbb {E}_{\mathbb {Q}}\left[ G(\mathbb S)\right] \!, \end{aligned}$$

as the optimal transport of the initial probability measure \(\nu =\delta _{\{1\}}\) to the final distribution \(\mu \). However, an additional constraint that the connection is a martingale is imposed. This in turn places a restriction on the measures, namely (2.3). The penalty function \(c\) is replaced by a more general functional \(G\). In this context, one may also consider general initial distributions \(\nu \) rather than Dirac measures. Then, the martingale measures with given marginals correspond to the Kantorovich generalization of the mass transport problem.

The super-replication problem is also analogous to the Kantorovich dual. However, the dual elements reflect the fact that the cost functional depends on the whole path of the connection.

The reader may also consult [6] for a very clear discussion of the connection between robust hedging and optimal transport.

2.5 A discrete time approximation

Next we construct a special class of simple strategies which achieve asymptotically the super-hedging cost \(V\).

For a positive integer \(N\) and any \(S\in \mathcal {C}^{+}[0,T]\), set \(\tau ^{(N)}_0( S)=0\). Then, recursively define

$$\begin{aligned} \tau ^{(N)}_k(S)=\inf \left\{ t>\tau ^{(N)}_{k-1}(S): |S_t-S_{\tau ^{(N)}_{k-1}( S)}| =\frac{1}{N} \right\} \wedge T, \end{aligned}$$
(2.7)

where we set \(\tau ^{(N)}_k(S)=T\), when the above set is empty. Also, define

$$\begin{aligned} H^{(N)}(S)=\min \{k\in \mathbb {N}:\tau ^{(N)}_k(S)=T\}. \end{aligned}$$
(2.8)

Observe that for any \( S\in \mathcal {C}^{+}[0,T]\), \(H^{(N)}( S)<\infty \).
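The recursive definition (2.7) and the count (2.8) can be sketched on a sampled path. The following is a minimal sketch, assuming the path is only observed on a finite time grid. On a grid the level \(1/N\) is generally overshot, so the sketch tests \(|S_t-S_{\tau }|\ge 1/N\) instead of equality, which is an approximation of (2.7) and not the exact stopping times. The function name `hitting_times` and the sample path are hypothetical.

```python
def hitting_times(path, grid, N):
    """Approximate the times tau_k^{(N)} of (2.7) on a finite time grid.

    path: function t -> S_t; grid: increasing times from 0 to T.
    Returns the list [tau_0, ..., tau_H] with tau_0 = 0 and tau_H = T.
    """
    taus = [grid[0]]
    anchor = path(grid[0])  # price at the most recent stopping time
    for t in grid[1:]:
        if abs(path(t) - anchor) >= 1.0 / N:
            taus.append(t)
            anchor = path(t)
    if taus[-1] != grid[-1]:
        taus.append(grid[-1])  # the truncation "wedge T" in (2.7)
    return taus


# Hypothetical path S_t = 1 + t on [0, 1], sampled on 1001 grid points.
grid = [i / 1000 for i in range(1001)]
taus = hitting_times(lambda t: 1.0 + t, grid, N=4)
H = len(taus) - 1  # grid analogue of H^{(N)}(S) in (2.8)
print(taus, H)
```

Because a continuous path on a compact interval is uniformly continuous, only finitely many moves of size \(1/N\) can occur before \(T\), which is why the count in (2.8) is finite. The grid version inherits this trivially since the grid itself is finite.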

Denote by \(\mathcal {A}_N\) the set of all portfolios for which the trading in the stock occurs only at the moments \(0=\tau ^{(N)}_0(S)<\tau ^{(N)}_1(S)<\cdots <\tau ^{(N)}_{H^{(N)}(S)}(S)=T\). Formally, \(\pi :=(g,\gamma )\in \mathcal {A}_{N}\), if it is progressively measurable in the sense of (2.5) and it is of the form

$$\begin{aligned} \gamma _t(S) = \sum _{k=0}^{H^{(N)}(S)-1} \gamma _k(S) \chi _{(\tau ^{(N)}_k( S), \tau ^{(N)}_{k+1}(S)]}(t), \end{aligned}$$

for some \(\gamma _k(S)\)’s. Note that, \(\gamma _k(S)\) can depend on \(S\) only through its values up to time \(\tau ^{(N)}_k( S)\), so that \(\gamma _t\) is progressively measurable. Set

$$\begin{aligned} V_{N}(G):=\inf \left\{ \int g d\mu : \ \exists \gamma \ \hbox {such that} \ \pi :=(g,\gamma )\in \mathcal {A}_{N}\ \hbox {is super-replicating}\right\} . \end{aligned}$$

It is clear that for any integer \(N \ge 1\), \(V_N(G)\ge V(G)\). The following result proves the convergence to \(V(G)\). This approximation result is the second main result of this paper. Also, it is the key analytical step in the proof of duality.

Theorem 2.10

Under the assumptions of Theorem 2.7,

$$\begin{aligned} \lim _{N\rightarrow \infty }V_{N}(G)=V(G). \end{aligned}$$

2.6 Proofs of Theorems 2.7 and 2.10

Since \(V_N\ge V\), Theorems 2.7 and 2.10 would follow from the following two inequalities,

$$\begin{aligned} \lim \sup _{N\rightarrow \infty }V_{N}(G)\le \sup _{\mathbb {Q}\in \mathbb {M}_\mu } \mathbb {E}_{\mathbb {Q}}\left[ G(\mathbb S)\right] \end{aligned}$$
(2.9)

and

$$\begin{aligned} V(G)\ge \sup _{\mathbb {Q}\in \mathbb {M}_\mu } \mathbb {E}_{\mathbb {Q}}\left[ G(\mathbb S)\right] \!. \end{aligned}$$
(2.10)

The first inequality is the difficult one and it will be proved in Sects. 4 and 5. The second inequality is simpler and we provide its proof here.

Let \(\mathbb {Q}\in \mathbb {M}_\mu \) and let \(\pi =(g,\gamma )\) be super-replicating. Since \(\gamma \) is progressively measurable in the sense of (2.5), the stochastic integral

$$\begin{aligned} \int _{0}^t \gamma _u(\mathbb S)d\mathbb S_u \end{aligned}$$

is defined with respect to \(\mathbb {Q}\). Also \(\mathbb {Q}\) is a martingale measure. Hence, the above stochastic integral is a \(\mathbb {Q}\)-local martingale. Moreover, from (2.6) we have,

$$\begin{aligned} \int _{0}^t \gamma _u(\mathbb S)d\mathbb S_u \ge -M\left( 1+\sup _{0\le u\le t}|\mathbb {S}_u|^p\right) , \quad \ t\in [0,T]. \end{aligned}$$

Also in view of (2.4) and the Doob–Kolmogorov inequality for the martingale \(\mathbb {S}_t\),

$$\begin{aligned} \mathbb {E}_{\mathbb {Q}}\sup _{0\le t\le T}|\mathbb {S}_t|^p \le C_p \mathbb {E}_{\mathbb {Q}}|\mathbb {S}_T|^p = C_p \int |x|^p d\mu < \infty . \end{aligned}$$

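Therefore, the stochastic integral is a \(\mathbb {Q}\)-local martingale bounded from below by a \(\mathbb {Q}\)-integrable random variable, hence a \(\mathbb {Q}\)-supermartingale, and \({\mathbb E}_{\mathbb {Q}}\int _{0}^T \gamma _u(\mathbb S)d\mathbb S_u\le 0\). Since \(\pi \) is super-replicating, we conclude that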

$$\begin{aligned} \mathbb {E}_{\mathbb {Q}}\left[ G(\mathbb S)\right] \le {\mathbb E}_{\mathbb {Q}}\left( \int _{0}^T \gamma _u(\mathbb S)d\mathbb S_u +g(\mathbb S_T)\right) \le {\mathbb E}_{\mathbb {Q}} \left[ g(\mathbb S_T)\right] =\int g d\mu , \end{aligned}$$

where in the last equality we again used the fact that the distribution of \(\mathbb {S}_T\) under \(\mathbb {Q}\) is equal to \(\mu \). This completes the proof of the lower bound. Together with (2.9), which will be proved later, it also completes the proofs of the theorems. \(\square \)

3 Quasi sure approach and full duality

An alternative approach to defining robust hedging is to use the notion of quasi-sure super-hedging as was done in [21, 38]. Let us briefly recall this notion. Let \(\mathcal {Q}\) be the set of all martingale measures \(\mathbb {Q}\) on the canonical space \(\mathcal {C}^{+}[0,T]\) under which the canonical process \(\mathbb {S}\) satisfies \(\mathbb {S}_0=1, \mathbb {Q}\)-a.s., has quadratic variation and satisfies \(\mathbb E_{\mathbb {Q}} \sup _{0\le t\le T} \mathbb {S}_t<\infty \). In this market, an admissible hedging strategy (or a portfolio) is defined as a pair \(\pi =(g,\gamma )\), where \(g\in \mathbb {L}^1(\mathbb R_{+},\mu )\) and \(\gamma \) is a progressively measurable process such that the stochastic integral

$$\begin{aligned} \int _{0}^t \gamma _ud\mathbb {S}_u, \quad \ t\in [0,T] \end{aligned}$$

exists for any probability measure \(\mathbb {Q}\in \mathcal {Q}\) and satisfies (2.6) \(\mathbb {Q}\)-a.s. We refer the reader to [38] for a complete characterization of this class. In particular, one does not restrict the trading strategies to be of bounded variation. A portfolio \(\pi =(g,\gamma )\) is called an (admissible) quasi-sure super-hedge, provided that

$$\begin{aligned} g(\mathbb S_T)+\int _{0}^T \gamma _u d\mathbb {S}_u \ge G(\mathbb S), \ \ \mathbb {Q} \ \hbox {a.s.}, \end{aligned}$$

for all \(\mathbb {Q}\in \mathcal {Q}\). Then, the minimal super-hedging cost is given by

$$\begin{aligned} V_{qs}(G):=\inf \left\{ \int g d\mu : \ \exists \gamma \ \hbox {such that} \ \pi :=(g,\gamma )\ \hbox {is a quasi-sure super-hedge} \right\} . \end{aligned}$$

Clearly,

$$\begin{aligned} V(G)\ge V_{qs}(G). \end{aligned}$$

From simple arbitrage arguments it follows that

$$\begin{aligned} V_{qs}(G)\ge \inf _{\lambda \in \mathbb {L}^1(\mathbb R_{+},\mu )} \sup _{\mathbb {Q}\in \mathcal {Q}}\mathbb E_{\mathbb {Q}} \left( G(\mathbb S)-\lambda (\mathbb S_T)+\int \lambda d\mu \right) \end{aligned}$$

where we set \(\mathbb {E}_{\mathbb {Q}}\xi \equiv -\infty \), if \(\mathbb {E}_{\mathbb {Q}}\xi ^{-}\!=\!\infty \). Since \(\inf \sup \ge \sup \inf \), the above two inequalities yield,

$$\begin{aligned} V(G)&\ge V_{qs}(G)\\&\ge \inf _{\lambda \in \mathbb {L}^1(\mathbb {R}_{+},\mu )} \sup _{\mathbb {Q}\in \mathcal {Q}} \mathbb E_{\mathbb {Q}}\left( G(\mathbb S)-\lambda (\mathbb S_T) +\int \lambda d\mu \right) \\&\ge \sup _{\mathbb {Q}\in \mathcal {Q}} \inf _{\lambda \in \mathbb {L}^1(\mathbb {R}_{+},\mu )} \mathbb E_{\mathbb {Q}} \left( G(\mathbb S)-\lambda (\mathbb S_T) +\int \lambda d\mu \right) . \end{aligned}$$

Now if \(\mathbb {Q}\in \mathbb {M}_\mu \), then \(\mathbb E_{\mathbb {Q}}[\lambda (\mathbb S_T)]=\int \lambda d\mu \), so the two terms involving \(\lambda \) cancel. So we first restrict the measures to the set \(\mathbb {M}_\mu \) and then use Theorem 2.7. The result is

$$\begin{aligned} V(G)&\ge V_{qs}(G)\ge \inf _{\lambda \in \mathbb {L}^1(\mathbb {R}_{+},\mu )} \sup _{\mathbb Q\in \mathcal {Q}} \mathbb E_{\mathbb {Q}}\left( G(\mathbb S)-\lambda (\mathbb S_T) +\int \lambda d\mu \right) \\&\ge \sup _{\mathbb {Q}\in \mathcal {Q}} \inf _{\lambda \in \mathbb {L}^1(\mathbb {R}_{+},\mu )} \mathbb E_{\mathbb {Q}} \left( G(\mathbb S)-\lambda (\mathbb S_T) +\int \lambda d\mu \right) \\&\ge \sup _{\mathbb {Q}\in \mathbb {M}_\mu } \mathbb {E}_{\mathbb {Q}}\left[ G(\mathbb S)\right] = V(G). \end{aligned}$$

Hence, all terms in the above are equal. We summarize this in the following which can be seen as the full duality.

Proposition 3.1

Assume that the European claim \(G\) satisfies Assumption 2.1 and the probability measure \(\mu \) satisfies (2.3), (2.4). Then,

$$\begin{aligned} V(G)&= V_{qs}(G) =\sup _{\mathbb Q\in \mathbb {M}_\mu } \mathbb {E}_{\mathbb {Q}}\left[ G(\mathbb S)\right] \\&= \inf _{\lambda \in \mathbb {L}^1(\mathbb {R}_{+},\mu )} \sup _{\mathbb Q\in \mathcal {Q}} \mathbb E_{\mathbb {Q}}\left( G(\mathbb S)-\lambda (\mathbb S_T) +\int \lambda d\mu \right) \\&= \sup _{\mathbb {Q}\in \mathcal {Q}} \inf _{\lambda \in \mathbb {L}^1(\mathbb {R}_{+},\mu )} \mathbb E_{\mathbb {Q}} \left( G(\mathbb S)-\lambda (\mathbb S_T) +\int \lambda d\mu \right) . \end{aligned}$$

4 Proof of the main results

The rest of the paper is devoted to the proof of (2.9).

4.1 Reduction to bounded claims

The following result will be used in two places in the paper. The first place is Lemma 4.2 where we reduce the problem to claims that are bounded from above. The other place is Lemma 4.8.

Consider a claim with pay-off

$$\begin{aligned} \alpha _{K}(S):= \Vert S\Vert \ \chi _{\{\Vert S\Vert \ge K\}}+ \frac{ \Vert S\Vert }{K}. \end{aligned}$$

Recall that \(V_N(\alpha _K)\) is defined in Sect. 2.5.

Lemma 4.1

$$\begin{aligned} \limsup _{K\rightarrow \infty } \ \limsup _{N \rightarrow \infty } V_N(\alpha _{K}) =0. \end{aligned}$$

Proof

In this proof, we always assume that \(N >K > 1\). Let \(\tau _k=\tau ^{(N)}_k(S)\) and \(n=H^{(N)}(S)\) be as in (2.7), (2.8), respectively, and set

$$\begin{aligned} \theta :=\theta ^{(K)}_N(S)=\min \{k:S_{\tau _k}\ge K-1\} \wedge n. \end{aligned}$$

Set \(c_p:= p/(p-1)\), where \(p\) is as in (2.4). We define a portfolio \((g^{(N,K)},\gamma ^{(N,K)})\in \mathcal {A}_N\) as follows. For \(t\in (\tau _k,\tau _{k+1}]\) and \(k=0,1,\ldots ,n-1\), let

$$\begin{aligned} \gamma ^{(N,K)}_t(S)\!=\!\gamma ^{(N,K)}_{{\tau _k}}(S) \!=\!-\!\frac{ p^2}{K(p\!-\!1)}\left( \max _{0\le i\le k}S^{p-1}_{\tau _i}\right) \!-\!\frac{p^2}{(p\!-\!1)}\chi _{\{k\ge \theta \}}\ \left( \max _{\theta \le i\le k}\!S^{p-1}_{\tau _i}\right) , \end{aligned}$$
$$\begin{aligned} g^{(N,K)}(x)=\frac{1}{K}(1+((c_p x)^p-c_p)^{+})+ ((c_p x)^p-(c_p(K-1))^p)^{+}+\frac{2}{N}. \end{aligned}$$

We use Proposition 2.1 in [2] and the inequality \(x<1+x^p\), \(x\in \mathbb {R}_{+}\), to conclude that for any \(t\in [0,T]\)

$$\begin{aligned} g^{(N,K)}(S_t)+\int _{0}^t \gamma ^{(N,K)}_u dS_u \ge \frac{\bar{S}_t}{K} + \bar{S}_t\ \chi _{\{ \bar{S}_t \ge K\}}, \end{aligned}$$

where

$$\begin{aligned} \bar{S}_t:= \max _{0\le u\le t}S_u. \end{aligned}$$

Therefore, \(\pi ^{(N,K)}\!:=\!(g^{(N,K)},\gamma ^{(N,K)})\) satisfies (2.6) and super-replicates \(\alpha _{K}\). Hence,

$$\begin{aligned} V_N(\alpha _{K}) \le \int g^{(N,K)}d\mu . \end{aligned}$$

Also, in view of (2.4),

$$\begin{aligned} \limsup _{K\rightarrow \infty }\ \limsup _{N\rightarrow \infty } \int g^{(N,K)}d\mu =0. \end{aligned}$$

These two inequalities complete the proof of the lemma. \(\square \)
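As a numerical sanity check on the static cost \(\int g^{(N,K)}d\mu \), the following minimal sketch evaluates it exactly for a finitely supported measure. It assumes that (2.4) is a finite \(p\)-th moment condition with \(p>1\), which is not restated in this section. The choice \(p=2\), the three-atom measure `mu` and the coupling \(N=K^2\) are purely illustrative.

```python
from fractions import Fraction

def static_cost(mu, p, K, N):
    """Integrate g^{(N,K)} against a finitely supported measure mu = {atom: weight}."""
    c_p = Fraction(p, p - 1)

    def pos(v):
        # Positive part v^+.
        return v if v > 0 else 0

    total = Fraction(0)
    for x, w in mu.items():
        g = (Fraction(1, K) * (1 + pos((c_p * x) ** p - c_p))
             + pos((c_p * x) ** p - (c_p * (K - 1)) ** p)
             + Fraction(2, N))
        total += w * g
    return total

# Hypothetical example: three equally weighted atoms with mean 1.
mu = {Fraction(1, 2): Fraction(1, 3),
      Fraction(1): Fraction(1, 3),
      Fraction(3, 2): Fraction(1, 3)}
costs = [static_cost(mu, p=2, K=K, N=K * K) for K in (10, 100, 1000)]
```

For a measure with bounded support the second term of \(g^{(N,K)}\) vanishes once \(K\) is large, so the cost decays like \(1/K\), in line with the double limit in the lemma.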

A corollary of the above estimate is the following reduction to claims that are bounded from above.

Lemma 4.2

It suffices to prove (2.9) for claims \(G\) that are non-negative, bounded from above and satisfy Assumption 2.1.

Proof

We proceed in two steps. First suppose that (2.9) holds for nonnegative claims that are bounded from above. Then, the conclusions of Theorems 2.7 and 2.10 also hold for such claims.

Now let \(G\) be a non-negative claim satisfying Assumption 2.1. For \(K>0\), set

$$\begin{aligned} G_K:= G \wedge K. \end{aligned}$$

Then, \(G_K\) is bounded and (2.9) holds for \(G_K\). Therefore,

$$\begin{aligned} \limsup _{N \rightarrow \infty } V_N(G_K) \le \sup _{{\mathbb Q}\in \mathbb {M}_\mu } \mathbb {E}_\mathbb {Q}\left[ G_K(\mathbb {S})\right] \le \sup _{{\mathbb Q}\in \mathbb {M}_\mu } \mathbb {E}_\mathbb {Q}\left[ G(\mathbb {S})\right] \!. \end{aligned}$$

In view of Assumption 2.1,

$$\begin{aligned} G(S) \le G(0) + L \Vert S\Vert . \end{aligned}$$

Hence, the set \(\{G(S) \ge K\}\) is included in the set \(\{ L\Vert S\Vert +G(0)\ge K \}\) and

$$\begin{aligned} G \le G_K + \left( L\Vert S\Vert +G(0)-K\right) \chi _{\{L\Vert S\Vert +G(0)\ge K\}}. \end{aligned}$$

By the linearity of the market, this inequality implies that

$$\begin{aligned} V_N(G) \le V_N(G_K) + V_N((L\Vert S\Vert +G(0)-K)\chi _{\{L\Vert S\Vert +G(0)\ge K\}}). \end{aligned}$$

Moreover, in view of the previous lemma,

$$\begin{aligned} \limsup _{K\rightarrow \infty }\ \limsup _{N\rightarrow \infty } V_N((L\Vert S\Vert +G(0)-K)\chi _{\{L\Vert S\Vert +G(0)\ge K\}})=0. \end{aligned}$$

Using these, we conclude that

$$\begin{aligned} \limsup _{N \rightarrow \infty } V_N(G)\le \sup _{{\mathbb Q}\in \mathbb {M}_\mu } \mathbb {E}_\mathbb {Q}\left[ G(\mathbb {S})\right] \!. \end{aligned}$$

Hence, (2.9) holds for all functions that are non-negative and satisfy Assumption 2.1. By adding an appropriate constant, this result extends to all claims that are bounded from below and satisfy Assumption 2.1.

Now suppose that \(G\) is a general function that satisfies Assumption 2.1. For \(c>0\), set

$$\begin{aligned} \check{G}_c:= G \vee (-c). \end{aligned}$$

Then, \(\check{G}_c\) is bounded from below, so (2.9) and Theorems 2.7 and 2.10 hold for it, i.e.,

$$\begin{aligned} \limsup _{N\rightarrow \infty } V_N(G) \le \limsup _{N\rightarrow \infty } V_N(\check{G}_c) = \sup _{\mathbb {Q}\in \mathbb {M}_\mu }\ \mathbb {E}_\mathbb {Q}[\check{G}_c(\mathbb S)]. \end{aligned}$$

By Assumption 2.1, \(\check{G}_c( S)\le G(S) + \check{e}_c(S)\) where the error function is

$$\begin{aligned} \check{e}_c(S):=(L\Vert S\Vert -G(0)-c)\chi _{\{L\Vert S\Vert -G(0)- c\ge 0\}}. \end{aligned}$$

Since \(\check{e}_c \ge 0\) and it satisfies Assumption 2.1,

$$\begin{aligned} \sup _{\mathbb {Q}\in \mathbb {M}_\mu }\ \mathbb {E}_\mathbb {Q}[ \check{e}_c(\mathbb S)] = V(\check{e}_c) = \lim _{N\rightarrow \infty } V_N(\check{e}_c). \end{aligned}$$

In view of Lemma 4.1,

$$\begin{aligned} \limsup _{c \rightarrow \infty } \sup _{\mathbb {Q}\in \mathbb {M}_\mu }\ \mathbb {E}_\mathbb {Q}[ \check{e}_c(\mathbb S)] = \limsup _{c \rightarrow \infty } \limsup _{N\rightarrow \infty } V_N(\check{e}_c)=0. \end{aligned}$$

We combine the above inequalities to conclude that

$$\begin{aligned} \limsup _{N\rightarrow \infty } V_N(G)&\le \limsup _{c \rightarrow \infty } \sup _{\mathbb {Q}\in \mathbb {M}_\mu }\ \mathbb {E}_\mathbb {Q}[\check{G}_c(\mathbb S)]\\&\le \sup _{\mathbb {Q}\in \mathbb {M}_\mu }\ \mathbb {E}_\mathbb {Q}[G(\mathbb S)]+ \limsup _{c \rightarrow \infty } \sup _{\mathbb {Q}\in \mathbb {M}_\mu }\ \mathbb {E}_\mathbb {Q}[\check{e}_c(\mathbb S)] \\&= \sup _{\mathbb {Q}\in \mathbb {M}_\mu }\ \mathbb {E}_\mathbb {Q}[G(\mathbb S)]. \end{aligned}$$

This is exactly (2.9). \(\square \)

4.2 A countable class of piecewise constant functions

In this section, we provide a piece-wise constant approximation of any continuous function \(S\). Fix a positive integer \(N\). For any \(S \in \mathcal {C}^+[0,T]\), let \(\tau ^{(N)}_k(S)\) and \(H^{(N)}(S)\) be the times defined in (2.7) and (2.8), respectively. To simplify the notation, we suppress their dependence on \(S\) and \(N\) and also set

$$\begin{aligned} n=H^{(N)}( S). \end{aligned}$$
(4.1)

We first define the obvious piecewise constant approximation \(\hat{S}=\hat{S}^{(N)}(S)\) using these times. Indeed, set

$$\begin{aligned} \hat{S}_t:= \sum _{k=0}^{n-1} \ S_{\tau _k} \chi _{[\tau _k, \tau _{k+1})}(t) + \left[ S_{\tau _{n-1}}+\frac{1}{N} sign( S_T- S_{\tau _{n-1}})\right] \ \chi _{\{T\}}(t). \end{aligned}$$
(4.2)

The function that takes \(S\) to \(\hat{S}\) is a map of \(\mathcal {C}^+[0,T]\) into the set of all functions with values in the target set

$$\begin{aligned} A^{(N)}=\left\{ i/N\ :\ i=0,1,2, \ldots , \right\} \!. \end{aligned}$$

Indeed, \(\hat{S}\) is behind the definition of the approximating costs \(V_N\). However, this set of functions is not countable as the jump times are not restricted to a countable set. So, we provide yet another approximation by restricting the jump times as well.
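To make the discretization concrete, here is a minimal sketch that computes the crossing times for a piecewise linear path. Since (2.7) and (2.8) are not restated in this section, the sketch assumes that \(\tau _k\) is the first time after \(\tau _{k-1}\) at which the path has moved by exactly \(1/N\), capped at \(T\), and that \(n\) is the first index with \(\tau _n=T\). The function name `crossing_times` and the sample path `nodes` are illustrative only.

```python
from fractions import Fraction

def crossing_times(nodes, N):
    """Return (taus, values) for a piecewise linear path given by nodes (t, S_t).

    taus[k] is tau_k and values[k] is S at tau_k; the last entries are T and S_T.
    Assumption: tau_k is the first time the path moves by 1/N from S at tau_{k-1}.
    """
    step = Fraction(1, N)
    T = nodes[-1][0]
    taus, values = [nodes[0][0]], [nodes[0][1]]
    for (t0, s0), (t1, s1) in zip(nodes, nodes[1:]):
        # Several crossings may occur inside one linear segment.
        while abs(s1 - values[-1]) >= step:
            target = values[-1] + (step if s1 > values[-1] else -step)
            t = t0 + (target - s0) / (s1 - s0) * (t1 - t0)
            if t >= T:
                break  # a crossing exactly at T is recorded as tau_n = T below
            taus.append(t)
            values.append(target)
    taus.append(T)
    values.append(nodes[-1][1])
    return taus, values

# Hypothetical path starting at 1 on [0, 2], with N = 4.
nodes = [(Fraction(0), Fraction(1)),
         (Fraction(1), Fraction(3, 2)),
         (Fraction(2), Fraction(6, 5))]
taus, values = crossing_times(nodes, N=4)
n = len(taus) - 1  # number of intervals, playing the role of H^{(N)}(S)
```

In this example the interior values `values[0], ..., values[n-1]` all lie on the grid \(A^{(N)}\), while the times `taus` are arbitrary rationals; this is the source of the uncountability noted above.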

Let \(\hat{\Omega }:=\mathbb {D}[0,T]\) be the space of all right continuous functions \(f:[0,T]\rightarrow \mathbb {R}_{+}\) with left–hand limits (\(c\grave{a}dl\grave{a}g\) functions). For integers \(N,k\), let

$$\begin{aligned} U_k^{(N)}:= \{i/(2^{k}N): i=1,2, \ldots , \} \cup \{1/(i2^k N) : i=1,2, \ldots ,\}, \end{aligned}$$

be the sets of possible differences between two consecutive jump times. Next, we define subsets \(\mathbb {D}^{(N)}\) of \(\mathbb {D}[0,T]\).

Definition 4.3

A function \(f \in \mathbb {D}[0,T]\) belongs to \(\mathbb {D}^{(N)}\) if it satisfies the following conditions:

  1. \(f(0)\in \{1-1/N,1+1/N\}\),

  2. \(f\) is piecewise constant with jumps at times \(t_1,\ldots ,t_n\), where

    $$\begin{aligned} t_0=0<t_1<t_2<\cdots <t_n<T, \end{aligned}$$

  3. for any \(k=1,\ldots ,n\), \(|f(t_k)-f(t_{k-1})|=1/N\),

  4. for any \(k=1,\ldots ,n\), \(t_k-t_{k-1}\in U^{(N)}_k\).

We emphasize, in the fourth condition, the dependence of the set \(U^{(N)}_k\) on \(k\). So as \(k\) gets larger, jump times take values in a finer grid. Also, for technical reasons we will need the function's value at \(0\) to be equal to \(1\pm 1/N\) but not \(1\).

We continue by defining an approximation of a generic stock price process \(S\),

$$\begin{aligned} F^{(N)} : \mathcal {C}^+[0,T] \rightarrow \mathbb {D}^{(N)}, \end{aligned}$$

as follows. Recall \(\tau _k=\tau ^{(N)}_k(S)\), \(n=H^{(N)}(S)\) from above and also from (2.7), (2.8). Set \(\hat{\tau }_0:=0\), \(\hat{\tau }_n=T\) and for \(k=1,\ldots ,n-1\), define

$$\begin{aligned} \hat{\tau }_k&:= \sum _{i=1}^k\ \Delta \hat{\tau }_i,\\ \Delta \hat{\tau }_i&= \max \{\Delta t \in U^{(N)}_i: \Delta t< \Delta \tau _i= \tau _i-\tau _{i-1}\}, \qquad i=1,\ldots , n-1. \end{aligned}$$

Clearly, \(0=\hat{\tau }_0<\hat{\tau }_1<\cdots <\hat{\tau }_{n-1}<\hat{\tau }_n=T\) and \(\hat{\tau }_k < \tau _k\) for all \(k=1,\ldots ,n-1\).

We are now ready to define \(F^{(N)}(S)\). For \(n=1\), set

$$\begin{aligned} F^{(N)}(S)\equiv 1 + \frac{1}{N}\ sign(S_T-1), \end{aligned}$$

and for \(n>1\), define

$$\begin{aligned} F^{(N)}_t(S)&= \sum _{k=1}^{n-1} \ S_{\tau _{k}} \chi _{[\hat{\tau }_{k-1},\hat{\tau }_k)}(t) \nonumber \\&+ \left( S_{\tau _{n-1}}+\frac{1}{N} sign( S_T- S_{\tau _{n-1}})\right) \ \chi _{[\hat{\tau }_{{{n-1}}},T]}(t). \end{aligned}$$
(4.3)

Observe that the value of the \(k\)th jump of the process \(F^{(N)}(S)\) equals the value of the \((k+1)\)-th jump of the discretization \(\hat{S}\) of the original process \(S\). Indeed, for \(n > 2\),

$$\begin{aligned} F^{(N)}_{\hat{\tau }_{m}} - F^{(N)}_{\hat{\tau }_{m-1}} = S_{\tau _{m+1}} - S_{\tau _{m}}, \qquad \forall \ m=1,\ldots ,n-2, \end{aligned}$$
(4.4)

and for \(n\ge 2\),

$$\begin{aligned} F^{(N)}_{\hat{\tau }_{n-1}} - F^{(N)}_{\hat{\tau }_{n-2}} = \frac{1}{N}sign\left( S_{T} - S_{\tau _{n-1}}\right) \!. \end{aligned}$$

This shift is essential in order to deal with some delicate questions of adaptedness and predictability. We also recall that the jump times of \(\hat{S}\) are the random times \(\tau _k\)’s while the jump times of \(F^{(N)}(S)\) are \(\hat{\tau }_k\)’s and that all these times depend both on \(N\) and \(S\). Moreover, by construction, \(F^{(N)}(S) \in \mathbb {D}^{(N)}\). But, it may not be progressively measurable as defined in (2.5). However, we use \(F^{(N)}\) only to lift progressively measurable maps defined on \(\mathbb {D}^{(N)}\) to the initial space \(\Omega = \mathcal {C}^+[0,T]\) and this yields progressively measurable maps on \(\Omega \). This procedure is defined and the measurability is proved in Lemma 4.7, below.
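The construction of \(F^{(N)}\) and the shift identity (4.4) can be traced on a small example. The sketch below takes the crossing times and values of a path as given data (the numbers are hypothetical), rounds each increment \(\Delta \tau _i\) down into \(U^{(N)}_i\), and assembles the levels of \(F^{(N)}\) as in (4.3). The helper names are illustrative, and \(sign(0)\) is taken to be \(0\) here since the convention is not stated in this section.

```python
from fractions import Fraction
import math

def round_down_to_U(delta, k, N):
    """Largest element of U_k^{(N)} that is strictly smaller than delta > 0."""
    u = Fraction(1, 2 ** k * N)
    if delta > u:
        return (math.ceil(delta / u) - 1) * u   # largest multiple of u below delta
    return u / (math.floor(u / delta) + 1)      # largest u/i below delta

def build_F(taus, values, N):
    """taus = [tau_0, ..., tau_n = T], values = [S at tau_0, ..., S at tau_{n-1}, S_T].

    Returns (hat_taus, levels): F equals levels[j] on [hat_taus[j], hat_taus[j+1])
    for j < n-1, and levels[n-1] on [hat_taus[n-1], T].
    """
    n = len(taus) - 1
    sign = (values[n] > values[n - 1]) - (values[n] < values[n - 1])
    last = values[n - 1] + Fraction(sign, N)
    hat_taus = [Fraction(0)]
    for i in range(1, n):
        hat_taus.append(hat_taus[-1] + round_down_to_U(taus[i] - taus[i - 1], i, N))
    levels = [values[k] for k in range(1, n)] + [last]
    return hat_taus, levels

# Hypothetical crossing data for N = 4 on [0, 2].
N = 4
taus = [Fraction(0), Fraction(1, 2), Fraction(1), Fraction(11, 6), Fraction(2)]
values = [Fraction(1), Fraction(5, 4), Fraction(3, 2), Fraction(5, 4), Fraction(6, 5)]
n = len(taus) - 1
hat_taus, levels = build_F(taus, values, N)
```

Here `levels[0]` is already \(S_{\tau _1}\), so \(F^{(N)}\) announces each crossing of \(S\) one step early, at a time `hat_taus[k]` that precedes `taus[k]`.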

The following lemma shows that \(F^{(N)}\) is close to \(S\) in the sense of Assumption 2.1. Let us emphasize that the following result is a consequence of the particular structure of \(\mathbb {D}^{(N)}\) and in particular \(U^{(N)}_k\)’s.

Lemma 4.4

Let \(F^{(N)}\) be the map defined in (4.3). For any \(G\) satisfying the Assumption 2.1 with the constant \(L\),

$$\begin{aligned} |G(S) - G(F^{(N)}(S))| \le \frac{4 L \Vert S\Vert }{N}, \quad \forall \ S\in \mathcal {C}^+[0,T]. \end{aligned}$$

Proof

Set

$$\begin{aligned} \hat{F}:=\hat{F}^{(N)}_t(S):=\sum _{k=0}^{n-1} \ S_{\tau _k} \chi _{[\hat{\tau }_k,\hat{\tau }_{k+1})}(t) + \left[ S_{\tau _{n-1}} +\frac{1}{N} sign( S_T- S_{\tau _{n-1}})\right] \ \chi _{\{T\}}(t). \end{aligned}$$

Observe that \(\hat{S}\) of (4.2) and \(\hat{F}\) are like the functions \(\upsilon \) and \(\tilde{\upsilon }\) in Assumption 2.1. Hence,

$$\begin{aligned} |G(\hat{S}) - G(\hat{F})| \le L \Vert S\Vert \ \sum _{k=1}^{n} | \Delta \tau _k - \Delta \hat{\tau }_k|. \end{aligned}$$

For \(k<n\),

$$\begin{aligned} \Delta \hat{\tau }_k = \max \{ \Delta t\in U^{(N)}_k\ :\ \Delta t < \Delta \tau _k\ \}. \end{aligned}$$

The definition of \(U^{(N)}_k\) implies that

$$\begin{aligned} 0 \le \Delta \tau _k - \Delta \hat{\tau }_k\le \frac{1}{2^k N}, \quad k =1,\ldots , n-1. \end{aligned}$$

Therefore,

$$\begin{aligned} \sum _{k=1}^{n-1} | \Delta \tau _k - \Delta \hat{\tau }_k| \le \sum _{k=1}^{\infty } \frac{1}{2^k N} = \frac{1}{N}. \end{aligned}$$
(4.5)

Combining the above inequalities, we arrive at

$$\begin{aligned} |G(\hat{S}) - G(\hat{F})| \le \frac{L \Vert S\Vert }{N}. \end{aligned}$$

Set \(F=F^{(N)}(S)\) and directly estimate that

$$\begin{aligned} |G(S) - G(F)|&\le |G(S) - G(\hat{S})| + |G(\hat{S}) - G(\hat{F})| + |G(\hat{F}) - G(F)|\\&\le L \Vert S - \hat{S} \Vert + \frac{L \Vert S\Vert }{N} + |G(\hat{F}) - G(F)|\\&\le \frac{3L \Vert S\Vert }{N}+|G(\hat{F}) - G(F)|. \end{aligned}$$

Finally, we observe that by construction,

$$\begin{aligned} \Vert \hat{F}-F\Vert \le \frac{1}{N}, \quad \Rightarrow \quad |G(F) - G(\hat{F})| \le \frac{L}{N}. \end{aligned}$$

The above inequalities complete the proof of the lemma. \(\square \)

Remark 4.5

The proof of the above Lemma provides one of the reasons behind the particular structure of \(U_k^{(N)}\). Indeed, (4.5) is a key estimate which provides a uniform upper bound for the sum of the differences over \(k\). Since there is no upper bound on \(k\), the approximating set \(U_k^{(N)}\) for the \(k\)-th difference must depend on \(k\). Moreover, it should have a summable structure over \(k\). That explains the terms \(2^k\).

On the other hand, the reason for the part \(\{1/(i2^k N): i=1,2,\dots \}\) in the definition of \(U_k^{(N)}\) is to make sure that \(\Delta \hat{\tau }_k>0\). For probabilistic reasons (i.e., adaptedness), we want \(\hat{\tau }_k < \tau _k\). This forces us to approximate \( \Delta \tau _k\) by \(\Delta \hat{\tau }_k\) from below. This and \(\Delta \hat{\tau }_k>0\) would be possible only if \(U_k^{(N)}\) has a subsequence converging to zero.

Hence, different sets of \(U^{(N)}_k\)’s are also possible provided that they have these two properties. \(\square \)
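The two properties of \(U^{(N)}_k\) discussed in this remark can be checked directly. The following minimal sketch computes \(\max \{\Delta t\in U^{(N)}_k: \Delta t<\Delta \tau \}\) in exact rational arithmetic and records the rounding errors for a few hypothetical increments; the function name `round_down_to_U` and the sample values in `gaps` are illustrative.

```python
from fractions import Fraction
import math

def round_down_to_U(delta, k, N):
    """Largest element of U_k^{(N)} that is strictly smaller than delta > 0."""
    u = Fraction(1, 2 ** k * N)
    if delta > u:
        # Use the grid {i*u}: largest i with i*u < delta.
        i = math.ceil(delta / u) - 1
        return i * u
    # Otherwise use the part {u/i}: smallest i with u/i < delta.
    i = math.floor(u / delta) + 1
    return u / i

N = 4
gaps = [Fraction(3, 10), Fraction(1, 8), Fraction(1, 100)]  # hypothetical Delta tau_k, k = 1, 2, 3
errors = [d - round_down_to_U(d, k, N) for k, d in enumerate(gaps, start=1)]
```

The first branch gives the error bound \(1/(2^kN)\) used in (4.5), and the second branch is what keeps the rounded increment strictly positive when \(\Delta \tau _k\) is smaller than the grid size.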

4.3 A countable probabilistic structure

An essential step in the proof of (2.9) is a duality result for probabilistic problems. We first introduce this structure and then relate it to the problem \(V_N\).

As before, let \(\hat{\Omega }:=\mathbb {D}[0,T]\) be the space of all right continuous functions \(f:[0,T]\rightarrow \mathbb {R}_{+}\) with left–hand limits (\(c\grave{a}dl\grave{a}g\) functions). Denote by \(\hat{\mathbb S}=(\hat{\mathbb S}_t)_{0\le t\le T}\) the canonical process on the space \(\hat{\Omega }\).

The set \(\mathbb {D}^{(N)}\) defined in Definition 4.3 is a countable subset of \(\hat{\Omega }\). We choose any probability measure \(\hat{\mathbb {P}}^{(N)}\) on \(\hat{\Omega }\) which satisfies \(\hat{\mathbb {P}}^{(N)}(\mathbb {D}^{(N)})=1\) and \(\hat{\mathbb {P}}^{(N)}(\{f\})>0\) for all \(f\in \mathbb {D}^{(N)}\). Let \(\hat{\mathcal {F}}^{(N)}_t\), \(t\in [0,T]\), be the filtration generated by the process \(\hat{\mathbb S}\) and augmented by the \(\hat{\mathbb {P}}^{(N)}\)-null sets. Under the measure \(\hat{\mathbb {P}}^{(N)}\), the canonical map \(\hat{\mathbb S}\) has finitely many jumps. Let

$$\begin{aligned} 0= \hat{\tau }_0(\hat{\mathbb S}) <\hat{\tau }_1(\hat{\mathbb S})<\cdots <\hat{\tau }_{\hat{H}(\hat{\mathbb S})}(\hat{\mathbb S})<T, \end{aligned}$$

be the jump times of \(\hat{\mathbb S}\). Note that in Definition 4.3, the final jump time is always strictly less than \(T\).

A trading strategy on the filtered probability space \((\hat{\Omega },\hat{\mathcal {F}}^{(N)}, (\hat{\mathcal {F}}^{(N)}_t)_{t=0}^T, \hat{\mathbb P}^{(N)})\) is a predictable stochastic process \((\hat{\gamma }_t)_{t=0}^T\). Thus, it is a function \(\hat{\gamma }: \mathbb {D}[0,T] \rightarrow \mathcal {D}[0,T]\). Let \(a\in \mathcal {D}[0,T]\) be such that \(a\notin \hat{\gamma }(\mathbb {D}^{(N)})\). Define a map \( \phi : \mathbb {D}[0,T] \rightarrow \mathcal {D}[0,T],\) by \(\phi (\omega )=\hat{\gamma }(\omega )\) if \(\omega \in \mathbb {D}^{(N)}\), and equal to \(a\) otherwise. Clearly, \(\hat{\mathbb {P}}^{(N)}\) almost surely, \(\hat{\gamma }=\phi (\hat{\mathbb {S}})\). Also, since \(\hat{\mathbb {P}}^{(N)}\) is non-zero on every point in \(\mathbb {D}^{(N)}\), the definition of the predictable sigma algebra implies that \(\phi \) is a predictable map. Namely, for any \(v, \tilde{v} \in \mathbb {D}[0,T]\) and \(t\in [0,T]\)

$$\begin{aligned} v_u=\tilde{v}_u \ \forall u \in [0,t) \ \ \Rightarrow \ \ \phi (v)_t=\phi (\tilde{v})_t. \end{aligned}$$

Indeed, arguing by contradiction, suppose that there were \(t\in [0,T]\) and \(v,\tilde{v}\in \mathbb {D}^{(N)}\) such that \(v_u=\tilde{v}_u\) for all \(u \in [0,t)\) and \(\phi (v)_t\ne \phi (\tilde{v})_t\). Then, we would conclude that the event \(\{\hat{\gamma }_t=\phi (v)_t\}\not \in \hat{\mathcal {F}}^{(N)}_{t-}\). However, this would contradict the predictability of the process \(\hat{\gamma }\). (Recall that \(\hat{\mathcal {F}}^{(N)}_{t-}\) is the smallest \(\sigma \)–algebra which contains \(\hat{\mathcal {F}}^{(N)}_{s}\) for any \(s<t\).) Hence, any predictable process \(\hat{\gamma }\) has a version \(\phi \) that is progressively measurable in the sense of Definition 2.3. In what follows, we always use this progressively measurable version of any predictable process. In particular, the following can be seen as the probabilistic counterpart of Definition 2.3.

Definition 4.6

  1. A (probabilistic) semi-static portfolio is a pair \((h,\hat{\gamma })\) such that \( \hat{\gamma }:\mathbb {D}[0,T] \rightarrow \mathcal {D}[0,T]\) is predictable and the stochastic integral \(\int _{0}^{\cdot } \hat{\gamma }_{u} d\hat{\mathbb S}_u\) exists (with respect to the measure \(\hat{\mathbb {P}}^{(N)}\)), and \(h:{A^{(N)}}\rightarrow \mathbb {R}\).

  2. A semi-static portfolio is \(\hat{\mathbb {P}}^{(N)}\)-admissible, if \(h\) is bounded and there exists \(M>0\) such that

    $$\begin{aligned} \int _{0}^t \hat{\gamma }_{u} d\hat{\mathbb S}_u \ge -M, \quad \hat{\mathbb {P}}^{(N)}-{a.s.}, \, \ t\in [0,T]. \end{aligned}$$
    (4.6)

  3. An admissible semi-static portfolio is \(\hat{\mathbb {P}}^{(N)}\)-super-replicating \(G\), if

    $$\begin{aligned} h(\hat{\mathbb S}_T)+\int _{0}^T \hat{\gamma }_{u} d\hat{\mathbb S}_u\ge G(\hat{\mathbb S}), \ \ \hat{\mathbb {P}}^{(N)}-{a.s.} \end{aligned}$$
    (4.7)

4.4 Approximating \(\mu \)

Recall the set \(\mathcal {A}_N\) of portfolios used in the definition of \(V_N\) in Sect. 2.5.

Next we provide a connection between the probabilistic super-replication and the discrete robust problem. However, the option \(h\) in Definition 4.6 above is defined only on \(A^{(N)}\), while the static parts of the hedges in \(\mathcal {A}_N\) are functions defined on \(\mathbb {R}_+\). So for a given \(h:A^{(N)} \rightarrow \mathbb {R}\), we define the following operator

$$\begin{aligned} g^{(N)}:= \mathcal {L}^{(N)}(h): \mathbb {R}_+ \rightarrow \mathbb {R}\end{aligned}$$

by

$$\begin{aligned} g^{(N)}(x) := (1+ \lfloor Nx \rfloor -Nx) h( \lfloor Nx \rfloor /N) + (Nx- \lfloor Nx \rfloor ) h( (1+\lfloor Nx \rfloor )/N), \end{aligned}$$

where for a real number \(r\), \(\lfloor r \rfloor \) is the largest integer that is not larger than \(r\).

Next, define a measure \(\mu ^{(N)}\) on the set \(A^{(N)}\) by

$$\begin{aligned} \mu ^{(N)}(\{0\}):=\int _{[0,1/N)} \left( 1-N x\right) d\mu (x) \end{aligned}$$

and for any positive integer \(k\),

$$\begin{aligned} \mu ^{(N)}(\{k/N\}):= \int _{[(k-1)/N,k /N)} \left( N x+1-k\right) d\mu (x) + \int _{[k/N,(k+1)/N)} \left( 1+k-N x\right) d\mu (x). \end{aligned}$$

This construction has the following important property. For any bounded function \(h: A^{(N)}\rightarrow \mathbb {R}\), let \(g^{(N)}=\mathcal {L}^{(N)}(h)\) be as above. Then,

$$\begin{aligned} \int h d\mu ^{(N)}=\int g^{(N)}d\mu . \end{aligned}$$
(4.8)

In particular, by taking \(h\equiv 1\), we conclude that \(\mu ^{(N)}\) is a probability measure. Also, since for continuous \(h\), \(g^{(N)}\) converges pointwise to \(h\), one may directly show (by Lebesgue’s dominated convergence theorem) that \(\mu ^{(N)}\) converges weakly to \(\mu \).
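The identity (4.8) can be verified exactly when \(\mu \) is finitely supported. In the sketch below, `lift` implements the linear interpolation \(\mathcal {L}^{(N)}\) and `discretize` splits the mass of each atom of \(\mu \) between its two neighbouring grid points, which is what the definition of \(\mu ^{(N)}\) amounts to in this case. The two-atom measure `mu` and the function `h` are hypothetical, and `h(k)` stands for \(h(k/N)\).

```python
from fractions import Fraction
import math

def lift(h, N):
    """Piecewise linear interpolation g = L^{(N)}(h); h(k) stands for h(k/N)."""
    def g(x):
        j = math.floor(N * x)
        frac = N * x - j
        return (1 - frac) * h(j) + frac * h(j + 1)
    return g

def discretize(mu, N):
    """mu = {atom: weight}; returns mu^{(N)} as {k: mass placed on k/N}."""
    mu_N = {}
    for x, w in mu.items():
        j = math.floor(N * x)
        frac = N * x - j
        mu_N[j] = mu_N.get(j, 0) + w * (1 - frac)
        if frac > 0:
            mu_N[j + 1] = mu_N.get(j + 1, 0) + w * frac
    return mu_N

N = 4
mu = {Fraction(1, 3): Fraction(1, 2), Fraction(7, 5): Fraction(1, 2)}  # hypothetical
h = lambda k: Fraction(k * k, 1 + k)                                   # hypothetical
g = lift(h, N)
mu_N = discretize(mu, N)
lhs = sum(h(k) * w for k, w in mu_N.items())   # integral of h against mu^{(N)}
rhs = sum(g(x) * w for x, w in mu.items())     # integral of g^{(N)} against mu
mean_N = sum(Fraction(k, N) * w for k, w in mu_N.items())
mean = sum(x * w for x, w in mu.items())
```

Applying the same computation to \(h(k/N)=k/N\) shows that, at least for finitely supported \(\mu \), the discretization also preserves the mean.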

4.5 Probabilistic super-replication

Recall the probabilistic super-replication problem introduced in Definition 4.6. Let \(G\) be a European claim as before and \(N\) be a positive integer. Then, the probabilistic super-replication problem is given by,

$$\begin{aligned} \hat{V}_N(G)= \inf \left\{ \int h d\mu ^{(N)}: \exists \ \hat{\gamma } \ {\hbox {s.t.}}\ (h,\hat{\gamma }) \ \hbox {is a}\ {\hat{\mathbb {P}}^{(N)}}\ {\hbox {admissible super hedge of}}\ G \right\} . \end{aligned}$$

We continue by establishing a connection between the probabilistic super hedging \(\hat{V}_N\) and the discrete robust problem \(V_N\). Suppose that we are given a probabilistic semi-static portfolio \(\hat{\pi }= (h, \hat{\gamma })\) in the sense of Definition 4.6. We lift this portfolio to a semi-static portfolio \(\pi ^{(N)} = (g^{(N)}, \gamma ^{(N)}) \in \mathcal {A}_N\). Indeed, let \(g^{(N)}=\mathcal {L}^{(N)}(h)\) be as in Sect. 4.4 and define \(\gamma ^{(N)}:\mathcal {C}^{+}[0,T]\rightarrow \mathcal {D}[0,T]\) by

$$\begin{aligned} \gamma ^{(N)}_t(S)= \sum _{k=1}^{{n-1}} \hat{\gamma }_{\hat{\tau }_k} (F^{(N)}(S)) \chi _{(\tau _k,\tau _{k+1}]}(t), \end{aligned}$$

where \(\tau _k=\tau _k^{(N)}(S)\) are as in (2.7), \(n\) is as in (4.1) and \(F^{(N)}(S)\), \(\hat{\tau }_k:=\hat{\tau }_k(S)\) are as in (4.3). Note that the random integer \(n\) is the number of crossings of magnitude of no less than \(1/N\). Moreover, by construction it is exactly one more than the number of jumps of \(F^{(N)}\). Also notice that

$$\begin{aligned} \gamma ^{(N)}_t (S)=0, \quad \forall \ t \in [0,\tau _1]. \end{aligned}$$

Lemma 4.7

For any probabilistic semi-static portfolio \((h,\hat{\gamma })\), \(\gamma ^{(N)}\) defined above is progressively measurable in the sense of (2.5).

Proof

Let \(S, \tilde{S} \in \mathcal {C}^+[0,T]\) and \(t\in [0,T]\) be such that \(S_u= \tilde{S}_u\) for all \(u\le t\). We need to show that

$$\begin{aligned} \gamma ^{(N)}_t(S) = \gamma ^{(N)}_t(\tilde{S}). \end{aligned}$$

Since the above clearly holds for \(t=0\) and \(t=T\), we may assume that \(t\in (0,T)\). Set

$$\begin{aligned} k_t(S):=k_t^{(N)}(S):= \min \{ i\ge 1\ :\ \tau ^{(N)}_i \ge t\ \}-1, \end{aligned}$$

so that \(0\le k_t(S) < n\) and

$$\begin{aligned} t \in (\tau ^{(N)}_{k_t(S)}, \tau ^{(N)}_{k_t(S)+1}]. \end{aligned}$$

It is clear that \(k_t(S)=k_t(\tilde{S})\). If \(k_t(S)=k_t(\tilde{S})=0\), then \(\gamma ^{(N)}_t(S) = \gamma ^{(N)}_t(\tilde{S})=0\). So we assume that \(k_t(S)>0\) and use the definition of \(\hat{\tau }_k\) to conclude that

$$\begin{aligned} \theta :=\hat{\tau }_{k_t(S)}= \hat{\tau }_{k_t(\tilde{S})}(\tilde{S}). \end{aligned}$$

Since \(0<k_t(S)<n\), we have \(n>1\) and \( F^{(N)}_t\) is given by (4.3), i.e.,

$$\begin{aligned} F^{(N)}_t(S)= \sum _{k=1}^{n-1} \ S_{\tau _{k}} \chi _{[\hat{\tau }_{k-1},\hat{\tau }_{k})}(t) + \left( S_{\tau _{n-1}}+\frac{1}{N} sign( S_T- S_{\tau _{n-1}})\right) \ \chi _{[\hat{\tau }_{n-1},T]}(t). \end{aligned}$$

Now, for any \(u<\theta = \hat{\tau }_{k_t(S)}= \hat{\tau }_{k_t(\tilde{S})}(\tilde{S})\), the above definition implies that

$$\begin{aligned} F_u^{(N)}(S)=S_{\tau _k},\ \ \ F_u^{(N)}(\tilde{S})=\tilde{S}_{\tau _k}, \quad {\hbox {for some}}\quad k \le k_t(S)=k_t(\tilde{S}). \end{aligned}$$

Since by definition \(\tau _{k_t(S)}(S)<t\), we conclude that

$$\begin{aligned} F^{(N)}_u(S) = F^{(N)}_u(\tilde{S}), \quad \forall \ u \in [0,\theta ). \end{aligned}$$

Therefore, by the predictability of \(\hat{\gamma }\) we have \(\gamma ^{(N)}_t(S) =\gamma ^{(N)}_t(\tilde{S})\). \(\square \)

The following lemma provides a natural and a crucial connection between the probabilistic super-replication and the discrete robust problem.

Recall the set \(\mathcal {A}_N\) of portfolios used in the definition of \(V_N\) in Sect. 2.5.

Lemma 4.8

Suppose \(G\) is bounded from above and satisfies the Assumption 2.1. Then,

$$\begin{aligned} \limsup _{N\rightarrow \infty } V_N(G)\le \ \limsup _{N\rightarrow \infty } \hat{V}_N(G). \end{aligned}$$

Proof

Set

$$\begin{aligned} G^{(N)}(S):= G(S)-\frac{5 L \Vert S\Vert }{N}. \end{aligned}$$

We first show that

$$\begin{aligned} V_N(G^{(N)}) \le \hat{V}_N(G). \end{aligned}$$

To prove the above inequality, suppose that a portfolio \((h, \hat{\gamma })\) is a \( \hat{\mathbb {P}}^{(N)}\)-admissible super hedge of \(G\). Then it suffices to construct a map \(\gamma ^{(N)}:\mathcal {C}^{+}[0,T]\rightarrow \mathcal {D}[0,T]\) and \(g^{(N)} : \mathbb {R}_+ \rightarrow \mathbb {R}\) such that the semi-static portfolio \(\pi ^{(N)}:=(g^{(N)},\gamma ^{(N)})\) is admissible, belongs to \(\mathcal {A}_N\) and super-replicates \(G^{(N)}\) in the sense of Definition 2.3.

Let \(g^{(N)}= \mathcal {L}^{(N)}(h)\) be as in Sect. 4.4 and \(\gamma ^{(N)}\) be the lifted strategy considered in Lemma 4.7. We claim that \(\pi ^{(N)}\) is the desired portfolio. In view of Lemma 4.7, we need to show that \(\pi ^{(N)} \) is in \(\mathcal {A}_N\) and super-replicates \(G^{(N)}\) in the sense of Definition 2.3.

To simplify the notation, we set \(F:= F^{(N)}(S)\).

Admissibility of \(\gamma ^{(N)}\). By construction, trading occurs only at the random times \(\tau _k\)’s. Therefore, \(\pi ^{(N)} \in \mathcal {A}_N\) provided that it satisfies the lower bound (2.6) for every \(t \in [0,T]\). We first claim that for any \(S\in \mathcal {C}^+[0,T]\) and for every \(k \le n-1\),

$$\begin{aligned} \int _0^{\tau _k} \gamma ^{(N)}_u(S) dS_u = \int _{[0,{\hat{\tau }_{k-1}}]} \hat{\gamma }_{u}(F) dF_u. \end{aligned}$$

Since \(\gamma ^{(N)} \equiv 0\) on \([0,\tau _1]\), the above trivially holds for \(k=1\). So we assume that \(1< k \le n-1\). In particular, \(n >2\). Then, we use (4.4) and the definitions to compute that

$$\begin{aligned} \int _{[0,{\hat{\tau }_{k-1}}]} \hat{\gamma }_{u}(F) dF_u&= \sum _{m=1}^{k-1} \hat{\gamma }_{\hat{\tau }_m}(F) (F_{\hat{\tau }_m} - F_{\hat{\tau }_{m-1}})= \sum _{m=1}^{k-1} \hat{\gamma }_{\hat{\tau }_m}(F) (S_{\tau _{m+1}} - S_{\tau _{m}})\\&= \sum _{m=1}^{k-1} \gamma ^{(N)}_{ \tau _{m+1}}(S) (S_{ \tau _{m+1}} - S_{\tau _{m}}) =\int _{\tau _1}^{\tau _k} \gamma ^{(N)}_u(S) dS_u\\&= \int _{0}^{\tau _k} \gamma ^{(N)}_u(S) dS_u. \end{aligned}$$

The last identity follows from the fact that \(\gamma ^{(N)}\) is zero on the interval \([0,\tau _1]\).

Now, for a given \(t \in [0,T)\) and \( S \in \mathcal {C}^{+}[0,T]\), let \(k\le n-1\) be the largest integer so that \(\tau _k \le t\). Construct a function \(\tilde{F} \in \mathbb {D}^{(N)}\) by

$$\begin{aligned} \tilde{F}_{[0,\hat{\tau }_{k})}=F_{[0,\hat{\tau }_{k})}, \ {\hbox {(i.e.,}} \quad \tilde{F}_u=F_u, \ \forall \ u \in [0,\hat{\tau }_{k}),\!{\hbox {)}} \end{aligned}$$

and

$$\begin{aligned} \tilde{F}_{u}=2F_{\hat{\tau }_{k-1}}- F_{\hat{\tau }_{k}}, \ \ u\ge \hat{\tau }_{k}. \end{aligned}$$

Note that the constructed function \(\tilde{F}\) depends on \(S\) and \(N\), since both \(F\) and the stopping times \(\tau _k\) depend on them. But we suppress these dependences. Since

$$\begin{aligned} \tilde{F}_{\hat{\tau }_{k}}- \tilde{F}_{\hat{\tau }_{k-1}} = -[F_{\hat{\tau }_{k}}-F_{\hat{\tau }_{k-1}}] = \pm 1/N, \end{aligned}$$

and since

$$\begin{aligned} | S_t-S_{\tau _k}| \le 1/N, \end{aligned}$$

there exists \(\lambda \in [0,1]\) (depending on \(t, N, S\)) such that

$$\begin{aligned} S_t-S_{\tau _k}=\lambda (F_{\hat{\tau }_{k}}-F_{\hat{\tau }_{k-1}}) +(1-\lambda ) (\tilde{F}_{\hat{\tau }_{k}}-\tilde{F}_{\hat{\tau }_{k-1}}). \end{aligned}$$

Since \(F\) and \(\tilde{F}\) agree on \([0,\hat{\tau }_{k})\) and \(\hat{\gamma }\) is predictable, \(\hat{\gamma }_u(F)=\hat{\gamma }_u(\tilde{F})\) for all \(u \le \hat{\tau }_{k}\). Also, for \(u \in (\tau _k,t) \subset (\tau _k, \tau _{k+1})\), \(\gamma ^{(N)}_u(S)=\hat{\gamma }_{\hat{\tau }_k}(F)\) and

$$\begin{aligned} \int _0^t \gamma _u^{(N)}(S) dS_u&= \int _0^{\tau _k}\gamma _u^{(N)}(S) dS_u + \int _{\tau _k}^t \gamma _u^{(N)}(S) dS_u \\&= \int _{[0,\hat{\tau }_{k-1}]}\hat{\gamma }_u(F) dF_u + \hat{\gamma }_{\hat{\tau }_k}(F) [S_t - S_{\tau _k}]. \end{aligned}$$

Since \(F\) is piece-wise constant with jumps only at the stopping times \(\hat{\tau }_i\)’s,

$$\begin{aligned} \int _{[0, \hat{\tau }_k]} \hat{\gamma }_u(F) dF_u&= \int _{[0,\hat{\tau }_{k-1}]}\hat{\gamma }_u(F) dF_u + \int _{(\hat{\tau }_{k-1},\hat{\tau }_k]}\hat{\gamma }_u(F) dF_u\\&= \int _{[0,\hat{\tau }_{k-1}]}\hat{\gamma }_u(F) dF_u + \hat{\gamma }_{\hat{\tau }_k}(F) [F_{\hat{\tau }_k}-F_{\hat{\tau }_{k-1}}]. \end{aligned}$$

We calculate the same integral for \(\tilde{F}\) using the fact that \(F=\tilde{F}\) on \([0,\hat{\tau }_k)\). The result is

$$\begin{aligned} \int _{[0, \hat{\tau }_k]} \hat{\gamma }_u(\tilde{F}) d\tilde{F}_u = \int _{[0,\hat{\tau }_{k-1}]}\hat{\gamma }_u(F) dF_u + \hat{\gamma }_{\hat{\tau }_k}(F) [\tilde{F}_{\hat{\tau }_k}-\tilde{F}_{\hat{\tau }_{k-1}}]. \end{aligned}$$

Therefore,

$$\begin{aligned} \int _0^{t} \gamma ^{(N)}_u(S) dS_u = \lambda \int _{[0,{\hat{\tau }_{k}}]} \hat{\gamma }_{u}(F) dF_u +(1-\lambda ) \int _{[0,\hat{\tau }_{k}]} \hat{\gamma }_{u}(\tilde{F}) d\tilde{F}_u. \end{aligned}$$

Since \(F, \tilde{F} \in \mathbb {D}^{(N)}\) and \(\hat{\mathbb {P}}^{(N)}(\{F\}), \hat{\mathbb {P}}^{(N)}(\{\tilde{F}\}) >0\), (4.6) implies that

$$\begin{aligned} \int _{[0,{\hat{\tau }_{k}}]} \hat{\gamma }_{u}(F) dF_u \ge -M,\quad {\hbox {and}} \quad \int _{[0,\hat{\tau }_{k}]} \hat{\gamma }_{u}(\tilde{F}) d\tilde{F}_u \ge -M. \end{aligned}$$

Hence, \(\gamma ^{(N)}\) satisfies (2.6) and \(\pi ^{(N)} \in \mathcal {A}_N\).
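The interpolation weight \(\lambda \) used in the admissibility argument has a closed form. Since \(\tilde{F}\) jumps by the negative of the jump of \(F\), the defining relation reads \(S_t-S_{\tau _k}=(2\lambda -1)(F_{\hat{\tau }_{k}}-F_{\hat{\tau }_{k-1}})\). The minimal sketch below solves it for hypothetical numbers and checks that a convex combination of two quantities bounded below by \(-M\) stays above \(-M\); all names and values are illustrative.

```python
from fractions import Fraction

def mixing_weight(increment, jump):
    """Solve increment = lam * jump + (1 - lam) * (-jump) for lam."""
    return (1 + increment / jump) / 2

N = 4
jump = Fraction(1, N)            # jump of F at hat-tau_k, here +1/N
increment = Fraction(-1, 10)     # S_t - S_{tau_k}, with |increment| <= 1/N
lam = mixing_weight(increment, jump)

# Two hypothetical integral values along F and tilde-F, both bounded below by -M.
M = 3
along_F, along_F_tilde = Fraction(-5, 2), Fraction(7, 4)
mixed = lam * along_F + (1 - lam) * along_F_tilde
```

The weight lies in \([0,1]\) exactly because \(|S_t-S_{\tau _k}|\le 1/N\), which is why the lower bound transfers from the two piecewise constant paths to the continuous one.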

Super-replication. We need to show that

$$\begin{aligned} g^{(N)}(S_T) + \int _0^T \gamma ^{(N)}_u(S) dS_u \ge G^{(N)}(S). \end{aligned}$$

We proceed almost exactly as in the proof of admissibility. Again we define a modification \(\bar{F}\in \mathbb {D}^{(N)}\) by \(\bar{F}_{[0,\hat{\tau }_{n-2})}=F_{[0,\hat{\tau }_{n-2})}\) and \(\bar{F}_u=F_{\hat{\tau }_{n-2}}\) for \(u\ge \hat{\tau }_{n-2}\). Set

$$\begin{aligned} \hat{\lambda }:=N |S_T-S_{\tau _{n-1}}|. \end{aligned}$$

Then \(\hat{\lambda }\in [0,1]\) and by the construction of \(g^{(N)}\),

$$\begin{aligned} g^{(N)}(S_T) = \hat{\lambda }h(F_T) +(1-\hat{\lambda }) h(\bar{F}_T). \end{aligned}$$

Hence,

$$\begin{aligned}&g^{(N)}(S_T) + \int _0^T \gamma ^{(N)}_u(S) dS_u \\&\quad = \hat{\lambda }\left[ h(F_T) + \int _0^T \hat{\gamma }_{u}(F) dF_u\right] + (1-\hat{\lambda }) \left[ h(\bar{F}_T) + \int _0^T \hat{\gamma }_{u}(\bar{F}) d\bar{F}_u\right] \\&\quad \ge \hat{\lambda }G(F) + (1-\hat{\lambda }) G(\bar{F}). \end{aligned}$$

Since \(\Vert F-\bar{F}\Vert \le 1/N\), Assumption 2.1 and Lemma 4.4 imply that

$$\begin{aligned} \left| G(S)-G(\bar{F})\right| \le \left| G(S)-G( F)\right| + \left| G(F)-G(\bar{F})\right| \le \frac{5L\Vert S\Vert }{N}. \end{aligned}$$

Consequently,

$$\begin{aligned} \hat{\lambda }G(F) + (1-\hat{\lambda }) G(\bar{F}) \ge G^{(N)}(S) \end{aligned}$$

and we conclude that \(\pi ^{(N)}\) super-replicates \(G^{(N)}\).
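The identity \(g^{(N)}(S_T) = \hat{\lambda }h(F_T) +(1-\hat{\lambda }) h(\bar{F}_T)\) used in the super-replication step is a direct consequence of the linear interpolation defining \(g^{(N)}\). The sketch below checks it on hypothetical numbers, assuming that \(S_{\tau _{n-1}}\) lies on the grid \(A^{(N)}\), that \(F_T=S_{\tau _{n-1}}+\frac{1}{N}sign(S_T-S_{\tau _{n-1}})\) as in (4.3), and that \(\bar{F}_T=S_{\tau _{n-1}}\). The function `h` is illustrative and `h(k)` stands for \(h(k/N)\).

```python
from fractions import Fraction
import math

def lift(h, N):
    """Piecewise linear interpolation g = L^{(N)}(h); h(k) stands for h(k/N)."""
    def g(x):
        j = math.floor(N * x)
        frac = N * x - j
        return (1 - frac) * h(j) + frac * h(j + 1)
    return g

N = 4
h = lambda k: Fraction(k * k, 3)   # hypothetical static position on the grid
g = lift(h, N)

S_prev = Fraction(5, 4)            # S at tau_{n-1}, a grid point (assumption)
S_T = Fraction(6, 5)               # terminal value, within 1/N of S_prev
lam_hat = N * abs(S_T - S_prev)
sign = 1 if S_T > S_prev else -1
F_T = S_prev + Fraction(sign, N)   # terminal value of F
F_bar_T = S_prev                   # terminal value of the modification bar-F

lhs = g(S_T)
rhs = lam_hat * h(int(N * F_T)) + (1 - lam_hat) * h(int(N * F_bar_T))
```

The same check goes through when \(S_T\) lies above \(S_{\tau _{n-1}}\); only the neighbouring grid point playing the role of \(F_T\) changes.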

Completion of the proof. We have shown that

$$\begin{aligned} V_N(G-5 L \Vert S\Vert /N) \le \hat{V}_N(G). \end{aligned}$$

Moreover, the linearity of the market yields that the super-replication cost is sub-additive. Hence,

$$\begin{aligned} V_N(G) \le V_N(5 L \Vert S\Vert /N)+ V_N(G-5L \Vert S\Vert /N). \end{aligned}$$

Therefore,

$$\begin{aligned} V_N(G) \le V_N(5L \Vert S\Vert /N)+ \hat{V}_N(G). \end{aligned}$$

Finally, by Lemma 4.1,

$$\begin{aligned} \limsup _{N \rightarrow \infty } \ V_N(5L \Vert S\Vert /N) =0. \end{aligned}$$

We use the above inequalities to complete the proof of the lemma. \(\square \)

4.6 First duality

Recall the countable set \(\mathbb {D}^{(N)}\subset \hat{\Omega }\) and its probabilistic structure that were introduced in Sect. 4.3. We consider two classes of measures on this set.

Definition 4.9

  1. We say that a probability measure \(\mathbb {Q}\) on the space \((\hat{\Omega },\hat{\mathcal F})\) is a martingale measure if the canonical process \((\hat{\mathbb S}_t)_{t=0}^T\) is a local martingale with respect to \(\mathbb Q\).

  2. \(\mathbb {M}_N\) is the set of all martingale measures that are supported on \(\mathbb {D}^{(N)}\).

  3. For a given \(K>0\), \(\mathbb {M}^{(K)}_{N}\) is the set of all measures \( \mathbb {Q}\in \mathbb {M}_N\) that satisfy

    $$\begin{aligned} \sum _{k=0}^\infty |\mathbb {Q}({{\hat{\mathbb S}}}_T=k/N) -\mu ^{(N)} (\{k/N\})|<\frac{K}{N}. \end{aligned}$$
    (4.9)

\(\square \)

The following lemma follows from known duality results. We will combine it with Lemma 4.8 and Proposition 5.1, which is proved in the next section, to complete the proof of the inequality (2.9).

Lemma 4.10

Suppose that \(G \ge 0\) is bounded from above by \(K\) and satisfies the Assumption 2.1. Then, for any positive integer \(N\),

$$\begin{aligned} \hat{V}_N(G) \le \left( \sup _{\mathbb Q\in \mathbb {M}^{(K)}_{N}} \mathbb {E}_{\mathbb Q} [G(\hat{\mathbb S})]\right) ^{+}, \end{aligned}$$

where the right hand side of the above inequality is zero when the set \(\mathbb {M}^{(K)}_{N}\) is empty.

Proof

Fix \(N\) and define the set

$$\begin{aligned} \mathcal {Z}= \mathcal {Z}^{(N)}:=\{h: A^{(N)}\rightarrow \mathbb {R}: |h(z)|\le N, \ \ \forall {z}\}. \end{aligned}$$

Set

$$\begin{aligned} \mathbb V:=\inf _{h\in \mathcal {Z}} \sup _{\mathbb Q\in \mathbb {M}_N} \left( \mathbb {E}_{\mathbb Q} (G(\hat{\mathbb S})- h(\hat{\mathbb S}_T))+\int h d\mu ^{(N)}\right) . \end{aligned}$$

Clearly, for any \(\epsilon >0\), there exists \(h_\epsilon \in \mathcal {Z}\) such that

$$\begin{aligned} \sup _{\mathbb Q\in \mathbb {M}_N} \mathbb {E}_{\mathbb Q} (G(\hat{\mathbb S})- h_\epsilon (\hat{\mathbb S}_T)) +\int h_\epsilon d\mu ^{(N)} \le \mathbb {V}+\epsilon . \end{aligned}$$

By construction, the support of the measure \(\hat{\mathbb {P}}^{(N)}\) is \(\mathbb {D}^{(N)}\). Also all elements of \(\mathbb {D}^{(N)}\) are piece-wise constant. Therefore, under \(\hat{\mathbb {P}}^{(N)}\) the canonical process \(\hat{\mathbb S}\) is trivially a semi-martingale and we may use the results of the seminal paper [17]. In particular, by Theorem 5.7 in [17], for

$$\begin{aligned} x_\epsilon = \sup _{\mathbb Q\in \mathbb {M}_N} \mathbb {E}_{\mathbb Q} (G(\hat{\mathbb S})- h_\epsilon (\hat{\mathbb S}_T)) + \epsilon , \end{aligned}$$

there exists an admissible portfolio strategy \(\hat{\gamma }\) such that

$$\begin{aligned} x_\epsilon +\int _{0}^T \hat{\gamma }_u d\hat{\mathbb S}_u\ge G(\hat{\mathbb S})- h_\epsilon (\hat{\mathbb S}_T), \ \ \hat{\mathbb {P}}^{(N)} \ \ \hbox {a.s.} \end{aligned}$$

Therefore, \((h_\epsilon +x_\epsilon ,\hat{\gamma })\) satisfies (4.6)–(4.7), consequently

$$\begin{aligned} \hat{V}_N(G) \le x_\epsilon + \int h_\epsilon d\mu ^{(N)} \le \sup _{\mathbb Q\in \mathbb {M}_N} \mathbb {E}_{\mathbb Q} (G(\hat{\mathbb S})- h_\epsilon (\hat{\mathbb S}_T)) + \int h_\epsilon d\mu ^{(N)} + \epsilon \le \mathbb {V}+2\epsilon . \end{aligned}$$

We now let \(\epsilon \) tend to zero to conclude that

$$\begin{aligned} \hat{ V}_N(G)\le \ \inf _{h\in \mathcal {Z}}\sup _{\mathbb Q\in \mathbb {M}_N} \left( \mathbb {E}_{\mathbb Q}(G(\hat{\mathbb S}) -h(\hat{\mathbb S}_T))+\int h d\mu ^{(N)}\right) . \end{aligned}$$
(4.10)

The next step is to interchange the order of the above infimum and supremum. Consider the vector space \(\mathbb {R}^{A^{(N)}}\) of all functions \(f: A^{(N)} \rightarrow \mathbb {R}\) equipped with the topology of point-wise convergence. Clearly, this space is locally convex. Also, since \(A^{(N)}\) is countable, \(\mathcal {Z}\) is a compact subset of \(\mathbb {R}^{A^{(N)}}\). The set \(\mathbb {M}_N\) can be naturally considered as a convex subset of the vector space \(\mathbb {R}^{\mathbb {D}^{(N)}}\).

Now, define the function \(\mathcal {G}:\mathcal {Z}\times \mathbb {M}_N\rightarrow \mathbb {R}\), by

$$\begin{aligned} \mathcal G(h,\mathbb Q)=\mathbb {E}_{\mathbb Q} (G(\hat{\mathbb S})-h(\hat{\mathbb S}_T))+\int h d\mu ^{(N)}. \end{aligned}$$

Notice that \(\mathcal G\) is affine in each of the variables. From the bounded convergence theorem, it follows that \(\mathcal G\) is continuous in the first variable. Next, we apply the min-max theorem, Theorem 45.8 in [37] to \(\mathcal G\). The result is,

$$\begin{aligned} \inf _{h\in \mathcal {Z}}\sup _{\mathbb Q\in \mathbb {M}_N}\mathcal G(h,\mathbb Q) =\sup _{\mathbb Q\in \mathbb {M}_N} \inf _ {h\in \mathcal {Z}}\mathcal G(h,\mathbb Q). \end{aligned}$$

This together with (4.10) yields,

$$\begin{aligned} \hat{V}_N(G) \le \sup _{\mathbb {Q}\in \mathbb {M}_N}\inf _{h\in \mathcal {Z}} \left( \mathbb {E}_{\mathbb Q}(G(\hat{\mathbb S}) -h(\hat{\mathbb S}_T)) +\int h d\mu ^{(N)}\right) . \end{aligned}$$
(4.11)

Finally, for any measure \(\mathbb {Q}\in \mathbb {M}_N\), define \(h^{\mathbb Q}\in \mathcal {Z}\) by

$$\begin{aligned} h^{\mathbb Q}(k/N) = N \,\mathrm {sign}\left( \mathbb {Q}({{\hat{\mathbb S}}}_T =k/N) -\mu ^{(N)}(\{k/N\})\right) , \quad k=0,1,\ldots . \end{aligned}$$

In view of (4.11),

$$\begin{aligned} \hat{V}_N(G)&\le \sup _{\mathbb Q\in \mathbb {M}_N} \left( \mathbb {E}_{\mathbb Q} (G(\hat{\mathbb S})) + \int h^{\mathbb Q} d\mu ^{(N)} -\mathbb {E}_{\mathbb Q} h^{\mathbb Q}(\hat{\mathbb S}_T)\right) \\&= \sup _{\mathbb Q\in \mathbb {M}_N} \left\{ \mathbb {E}_{\mathbb Q} (G(\hat{\mathbb S})) - N \sum _{k=0}^\infty |\mathbb {Q}({{\hat{\mathbb S}}}_T=k/N) -\mu ^{(N)}\left( \left\{ k/N \right\} \right) |\right\} \end{aligned}$$
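The equality in the last display is a direct computation. As a sketch, write \(d_k:=\mathbb {Q}({{\hat{\mathbb S}}}_T=k/N)-\mu ^{(N)}(\{k/N\})\) (a shorthand used only here) and assume, as the sums over \(k\) suggest, that both \(\mu ^{(N)}\) and the law of \(\hat{\mathbb S}_T\) under \(\mathbb {Q}\) are carried by the points \(k/N\), \(k=0,1,\ldots \). Then

$$\begin{aligned} \int h^{\mathbb Q} d\mu ^{(N)} -\mathbb {E}_{\mathbb Q} h^{\mathbb Q}(\hat{\mathbb S}_T)&= \sum _{k=0}^\infty h^{\mathbb Q}(k/N)\left( \mu ^{(N)}(\{k/N\})-\mathbb {Q}({{\hat{\mathbb S}}}_T=k/N)\right) \\&= -N\sum _{k=0}^\infty \mathrm {sign}(d_k)\, d_k = -N\sum _{k=0}^\infty |d_k|. \end{aligned}$$

The inequality preceding it only uses that the infimum over \(h\in \mathcal {Z}\) in (4.11) is bounded from above by the value at the particular choice \(h=h^{\mathbb Q}\).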

Suppose that \(\mathbb {Q}\not \in \mathbb {M}^{(K)}_{N}\). Then,

$$\begin{aligned} N \sum _{k=0}^\infty |\mathbb {Q}({{\hat{\mathbb S}}}_T=k/N) -\mu ^{(N)}(\{k/N\})| \ge K. \end{aligned}$$

Since \(G\) is bounded by \(K\), this implies that

$$\begin{aligned} \mathbb {E}_{\mathbb Q} (G(\hat{\mathbb S}))- N \sum _{k=0}^\infty |\mathbb {Q}({{\hat{\mathbb S}}}_T =k/N) -\mu ^{(N)}\left( \left\{ k/N \right\} \right) | \le 0. \end{aligned}$$

Hence,

$$\begin{aligned} \hat{V}_N(G)&\le \left( \sup _{\mathbb Q\in \mathbb {M}^{(K)}_N} \left\{ \mathbb {E}_{\mathbb Q} (G(\hat{\mathbb S}))- N \sum _{k=0}^\infty |\mathbb {Q}({{\hat{\mathbb S}}}_T=k/N) -\mu ^{(N)}\left( \left\{ k/N \right\} \right) |\right\} \right) ^{+}\\&\le \left( \sup _{\mathbb Q\in \mathbb {M}^{(K)}_{N}} \mathbb {E}_{\mathbb Q}(G(\hat{\mathbb S}))\right) ^{+}. \end{aligned}$$

\(\square \)

5 Approximation of Martingale measures

In this final section, we prove the asymptotic connection between the approximating martingale measures \(\mathbb {M}^{(K)}_{N}\) defined in Definition 4.9 and the continuous martingale measures \(\mathbb {M}_\mu \) satisfying the marginal constraint at the final time, defined in Definition 2.4.

The following proposition completes the proof of the inequality (2.9) and consequently the proofs of the main theorems when the claim \(G \ge 0\) is bounded from above. The general case then follows from Lemma 4.2.

Proposition 5.1

Suppose that \(G \ge 0\) is bounded from above by \(K\) and satisfies the Assumption 2.1. Assume that \(\mu \) satisfies (2.3)–(2.4). Then

$$\begin{aligned} \limsup _{N\rightarrow \infty }\ \left( \sup _{\mathbb Q\in \mathbb {M}^{(K)}_N}\ \mathbb {E}_{\mathbb Q} [G (\hat{\mathbb S})]\right) ^{+} \ \le \ \sup _{\mathbb {Q}\in \mathbb {M}_\mu } \ \mathbb {E}_{\mathbb {Q}}\left[ G(\mathbb S) \right] \!. \end{aligned}$$

We do not prove the above proposition through a compactness argument, as one might expect. Instead, we show that any given measure \(\mathbb {Q}\in \mathbb {M}^{(K)}_{N}\) has a lifted version in \(\mathbb {M}_\mu \) that is close to \(\mathbb {Q}\) in some sense. The set \(\mathbb {M}_\mu \) is not empty, thus \(\sup _{\mathbb {Q}\in \mathbb {M}_\mu } \ \mathbb {E}_{\mathbb {Q}}\left[ G(\mathbb S)\right] \ge 0\). Therefore, without loss of generality, we can assume that for sufficiently large \(N\) the set \(\mathbb {M}^{(K)}_N\) is not empty; otherwise the proposition is trivially satisfied. Hence, the above proposition is a direct consequence of the lemma below.

Recall the Lipschitz constant \(L\) in Assumption 2.1.

Lemma 5.2

Under the hypothesis of Proposition 5.1, there exists a function \(f_{K}(\epsilon ,N)\) satisfying,

$$\begin{aligned} \lim _{\epsilon \downarrow 0}\lim _{N\rightarrow \infty }f_{K}(\epsilon ,N)=0 \end{aligned}$$

so that for any \(\hat{\mathbb Q} \in \mathbb {M}^{(K)}_{N}\) and \(\epsilon >0\),

$$\begin{aligned} \mathbb {E}_{\hat{\mathbb Q}} [ G(\hat{\mathbb S})] \le f_{K} ( \epsilon ,N) +\sup _{\mathbb {Q}\in \mathbb {M}_{\mu }} \mathbb {E}_{\mathbb {Q}}\left[ G(\mathbb S)\right] \!. \end{aligned}$$

Proof

Fix \(\epsilon \in (0,1)\), a positive integer \(N\) and \(\hat{\mathbb {Q}}\in \mathbb {M}^{(K)}_N\). Recall that \(G\) is bounded from above by \(K\).

Shift of the initial value. Denote by \(\mathbb {D}^{(N)}_1\) the set of all functions \(f \in \mathbb {D}[0,T]\) which satisfy \(f(0)=1\) and the conditions 2–4 in Definition 4.3. Define a map \(H:\mathbb {D}^{(N)}\rightarrow \mathbb {D}^{(N)}_1\) by \(H(f)=f+1-f(0)\). Consider the measure \(\mathbb Q_1= H\circ \hat{\mathbb Q}\). Clearly, \(\mathbb Q_1\) is a martingale measure.

Jump times. Since the probability measure \(\hat{\mathbb Q}\) is supported on the set \(\mathbb {D}^{(N)}\), the canonical process \(\hat{\mathbb S}\) is a pure jump process under \(\mathbb Q_1\), with a finite number of jumps. Introduce the jump times by setting \(\tau _0=0\) and for \(k>0\),

$$\begin{aligned} \tau _k=\inf \{t>\tau _{k-1}: \hat{\mathbb {S}}_t \ne \hat{\mathbb {S}}_{t{\hbox {-}}} \}\wedge T. \end{aligned}$$

Next we introduce the random index of the first jump time that equals \(T\),

$$\begin{aligned} \hat{N}:=\min \{k: \tau _k=T\}. \end{aligned}$$

Then, \(\hat{N}<\infty \) almost surely and consequently, there exists a deterministic positive integer \(m\) (depending on \(\epsilon \)) such that

$$\begin{aligned} \mathbb {Q}_1(\hat{N}>m)<\epsilon . \end{aligned}$$
(5.1)

By the definition of the set \(\mathbb {D}^{(N)}\), there is a decreasing sequence of strictly positive numbers \(t_k\downarrow 0\), with \(t_1=T\), such that for \(i=1,\ldots ,m\),

$$\begin{aligned} \tau _i-\tau _{i-1} \in {\{t_k\}}_{k=1}^\infty \ \cup \ {\{0\}}, \ \ \mathbb {Q}_1-a.s. \end{aligned}$$

Wiener space. Let \((\Omega ^W,\mathcal {F}^W,P^W)\) be a complete probability space together with a standard \((m+2)\)-dimensional Brownian motion \(\left\{ W_t=\left( W^{(1)}_t,W^{(2)}_t,\ldots ,W^{(m+2)}_t\right) \right\} _{t=0}^\infty \), and the natural filtration \(\mathcal {F}^W_t=\sigma {\{W_s|s\le {t}\}}\). The next step is to construct a martingale \(Z\) on the Brownian probability space \((\Omega ^W,\mathcal {F}^W,P^W)\) together with a sequence of stopping times (with respect to the Brownian filtration) \(\sigma _1\le \sigma _2\le \cdots \le \sigma _m\) such that the distribution (under the Wiener measure \(P^W\)) of the random vector \((\sigma _1,\ldots ,\sigma _m,Z_{\sigma _1},\ldots ,Z_{\sigma _m})\) is equal to the distribution of the random vector \((\tau _1,\ldots ,\tau _m,\hat{\mathbb S}_{\tau _1},\ldots ,\hat{\mathbb S}_{\tau _m})\) under the measure \(\mathbb {Q}_1\). Namely,

$$\begin{aligned} ((\sigma _1,\ldots ,\sigma _m,Z_{\sigma _1},\ldots ,Z_{\sigma _m}),P^W)= ((\tau _1,\ldots ,\tau _m,\hat{\mathbb {S}}_{\tau _1},\ldots , \hat{\mathbb {S}}_{\tau _m}),\mathbb Q_1). \end{aligned}$$
(5.2)

The construction is done by induction. At each step \(k\) we construct the stopping time \(\sigma _k\) and \(Z_{\sigma _k}\) such that the conditional probability is the same as in the case of the canonical process \(\hat{\mathbb S}\) under the measure \(\mathbb Q_1\).

Construction of the \(\sigma \)'s and \(Z\). For an integer \(n\) and given \(x_1,\ldots ,x_n\), introduce the notation

$$\begin{aligned} \vec {x}_n:=(x_1,\ldots ,x_n). \end{aligned}$$

Also set

$$\begin{aligned} \mathbb {T}:= {\{t_k\}}_{k=1}^\infty . \end{aligned}$$

For \(k=1,\ldots ,m\), define the functions \(\Psi _k,\Phi _k:\mathbb {T}^k\times \{-1,1\}^{k-1}\rightarrow [0,1]\) by

$$\begin{aligned} \Psi _k(\vec {\alpha }_k;\vec {\beta }_{k-1}):=\mathbb {Q}_1 (\tau _k-\tau _{k-1}\ge \alpha _k\ \big |\ A ), \end{aligned}$$
(5.3)

where

$$\begin{aligned} A:= \{ \tau _i-\tau _{i-1}=\alpha _i,\ \hat{\mathbb S}_{\tau _i}-\hat{\mathbb S}_{\tau _{i-1}} =\beta _i/ N, \ \ i\le k-1\}, \end{aligned}$$

and

$$\begin{aligned} \Phi _k(\vec {\alpha }_k;\vec {\beta }_{k-1})= \mathbb {Q}_1 (\hat{\mathbb S}_{\tau _k}-\hat{\mathbb S}_{\tau _{k-1}}=1/N \ \big | \ B), \end{aligned}$$
(5.4)

where

$$\begin{aligned} B= \{ \tau _k<T, \tau _j-\tau _{j-1}=\alpha _j,\ \hat{\mathbb S}_{\tau _i}-\hat{\mathbb S}_{\tau _{i-1}} =\beta _i/N, \ j\le k, i\le k-1\}. \end{aligned}$$

As usual we set \(\mathbb {Q}_1(\cdot |\emptyset )\equiv 0\). Next, for \(k\le m\), we define the maps \(\Gamma _k,\Theta _k:\mathbb {T}^k \times \{-1,1\}^{k-1}\rightarrow [-\infty ,\infty ]\), as the unique solutions of the following equations,

$$\begin{aligned} P^W(W^{(1)}_{\alpha _k}<\Gamma _k( \vec {\alpha }_k;\vec {\beta }_{k-1}))= \Phi _k(\vec {\alpha }_k;\vec {\beta }_{k-1}), \end{aligned}$$
(5.5)

and

$$\begin{aligned} P^W(W^{(1)}_{t_l}-W^{(1)}_{t_{l+1}}<\Theta _k( \vec {\alpha }_k;\vec {\beta }_{k-1}))= \frac{\Psi _k(\vec {\alpha }_{k-1},t_l;\vec {\beta }_{k-1})}{\Psi _k(\vec {\alpha }_{k-1},t_{l+1};\vec {\beta }_{k-1})}, \end{aligned}$$
(5.6)

where \(l\in \mathbb {N}\) is given by \(\alpha _k=t_l\in \mathbb {T}\). From the definitions it follows that \(\Psi _k(\vec {\alpha }_{k-1},t_l;\vec {\beta }_{k-1})\le \Psi _k(\vec {\alpha }_{k-1},t_{l+1};\vec {\beta }_{k-1})\). Thus if \(\Psi _k(\vec {\alpha }_{k-1},t_{l+1};\vec {\beta }_{k-1})=0\) for some \(l\), then also \(\Psi _k(\vec {\alpha }_{k-1},t_{l};\vec {\beta }_{k-1})=0\). We set \(0/0\equiv 0\).

Set \(\sigma _0\equiv 0\) and define the random variables \(\sigma _1,\ldots ,\sigma _m,Y_1,\ldots ,Y_m\) by the following recursive relations

$$\begin{aligned} \begin{aligned} \sigma _1&=\sum _{k=1}^\infty t_k \chi _{\{W^{(1)}_{t_k}-W^{(1)}_{t_{k+1}}>\Theta _1(t_k)\}} \ \prod _{j=k+1}^\infty \chi _{\{W^{(1)}_{t_j}-W^{(1)}_{t_{j+1}}<\Theta _1(t_j)\}},\\ Y_1&=2\chi _{\{W^{(2)}_{\sigma _1}<\Gamma _1(\sigma _1)\}}-1, \end{aligned} \end{aligned}$$
(5.7)

and for \(i>1\)

$$\begin{aligned} \sigma _i&= \sigma _{i-1}+\Delta _i,\\ Y_i&= \chi _{\{\sigma _i<T\}}\left( 2\chi _{\{W^{(i+1)}_{\sigma _i}-W^{(i+1)}_{\sigma _{i-1}}< \Gamma _i(\vec {\Delta \sigma }_{i},\vec {Y}_{i-1})\}}-1\right) , \end{aligned}$$

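where \(\Delta _i = t_k\) on the set \(A_i \cap B_{i,k}\cap C_{i,k}\) and zero otherwise, and \(\vec {\Delta \sigma }_{i}:=(\sigma _1-\sigma _0,\ldots ,\sigma _i-\sigma _{i-1})\) denotes the vector of increments. These sets are given by,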

$$\begin{aligned} A_i&:= {\{|Y_{i-1}|>0\}},\\ B_{i,k}&:= \{W^{(1)}_{t_k+\sigma _{i-1}}-W^{(1)}_{t_{k+1}+\sigma _{i-1}}> \Theta _i(\vec {\Delta \sigma }_{i-1},t_k;\vec {Y}_{i-1})\},\\ C_{i,k}&:= \bigcap _{j=k+1}^\infty \{W^{(1)}_{t_j+\sigma _{i-1}}-W^{(1)}_{t_{j+1}+\sigma _{i-1}}< \Theta _i(\vec {\Delta \sigma }_{i-1},t_j;\vec {Y}_{i-1})\}. \end{aligned}$$

Since \(t_k\) is decreasing with \(t_1=T\), \(\sigma _1\le \sigma _2\le \cdots \le \sigma _m\) and they are stopping times with respect to the Brownian filtration. Let \(k\le m\) and \((\vec {\alpha }_k;\vec {\beta }_{k-1})\in \mathbb {T}^k\times \{-1,1\}^{k-1}\). There exists \(n\in \mathbb {N}\) such that \(\alpha _k=t_n\in \mathbb {T}\). From (5.7), the recursion for \(i>1\), the strong Markov property and the independence of the Brownian motion increments it follows that

$$\begin{aligned}&P^W(\sigma _k-\sigma _{k-1}\ge \alpha _k \big | (\vec {\Delta \sigma }_{k-1};\vec {Y}_{k-1})=(\vec {\alpha }_{k-1};\vec {\beta }_{k-1}))\nonumber \\&\quad = P^W\left( \bigcap _{j=n}^\infty (W^{(1)}_{t_j+\sigma _{k-1}}-W^{(1)}_{t_{j+1}+\sigma _{k-1}} <\Theta _k(\vec {\alpha }_{k-1},t_j;\vec {\beta }_{k-1}))\right) \nonumber \\&\quad =\prod _{j=n}^\infty P^W(W^{(1)}_{t_j+\sigma _{k-1}}-W^{(1)}_{t_{j+1}+\sigma _{k-1}}< \Theta _k(\vec {\alpha }_{k-1},t_j;\vec {\beta }_{k-1}))\nonumber \\&\quad =\Psi _k(\vec {\alpha }_k;\vec {\beta }_{k-1}), \end{aligned}$$
(5.8)

where the last equality follows from (5.6) and the fact that

$$\begin{aligned} \lim _{l\rightarrow \infty } \Psi _k (\alpha _1,\ldots ,\alpha _{k-1},t_l,\beta _1,\ldots ,\beta _{k-1})=1. \end{aligned}$$
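The last equality in (5.8) is a telescoping product. As a sketch, write \(a_j:=\Psi _k(\vec {\alpha }_{k-1},t_j;\vec {\beta }_{k-1})\) and let \(j_0\) be the index with \(\alpha _k=t_{j_0}\) (both are shorthands used only here), and consider the case where \(a_j>0\) for all \(j\ge j_0\). By (5.6) the \(j\)-th factor of the product equals \(a_j/a_{j+1}\), so that for every \(M\ge j_0\),

$$\begin{aligned} \prod _{j=j_0}^{M} \frac{a_j}{a_{j+1}} = \frac{a_{j_0}}{a_{M+1}} \ \longrightarrow \ a_{j_0} = \Psi _k(\vec {\alpha }_{k};\vec {\beta }_{k-1}) \quad \hbox {as} \ M\rightarrow \infty , \end{aligned}$$

because \(a_{M+1}\rightarrow 1\) by the limit above. If \(a_{j_0}=0\), the first factor vanishes under the convention \(0/0\equiv 0\) and both sides are zero.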

Similarly, from (5.5) and (5.8), we have

$$\begin{aligned}&P^W(Y_k=1\big |\sigma _k<T,\vec {\Delta \sigma }_k=\vec {\alpha }_k,\vec {Y}_{k-1}=\vec {\beta }_{k-1}) \nonumber \\&=P^W\left( W^{(k+1)}_{\sum _{i=1}^k \alpha _i}-W^{(k+1)}_{\sum _{i=1}^{k-1} \alpha _i}< \Gamma _k(\vec {\alpha }_k;\vec {\beta }_{k-1})\right) \nonumber \\&=\Phi _k(\vec {\alpha }_k;\vec {\beta }_{k-1}). \end{aligned}$$
(5.9)

Using (5.3)–(5.4) and (5.8)–(5.9), we conclude that

$$\begin{aligned} \left( \left( \vec {\sigma }_m;\frac{1}{N}\vec {Y}_m\right) , P^W\right) =((\vec {\tau }_m; \vec {\Delta \hat{\mathbb S}}_m),\mathbb {Q}_1) \end{aligned}$$

where \({\Delta \hat{\mathbb S}}_k=\hat{\mathbb {S}}_{\tau _k} -\hat{\mathbb {S}}_{\tau _{k-1}}\), \(k\le m\).

Continuous martingale. Set

$$\begin{aligned} Z_t=1+\frac{1}{N}E^W\left( \sum _{i=1}^m Y_i|\mathcal {F}^W_t\right) , \quad \ t\in [0,T]. \end{aligned}$$
(5.10)

Since all Brownian martingales are continuous, so is \(Z\). Moreover, Brownian motion increments are independent and therefore,

$$\begin{aligned} Z_{\sigma _k}=1+\frac{1}{N}\sum _{i=1}^k Y_i, \ \ P^W\hbox {a.s.}, \ \ k\le m. \end{aligned}$$
(5.11)

By the construction of \(Y\) and \(\sigma \)’s, we conclude that (5.2) holds with the process \(Z\).

Measure in \(\mathbb {M}_{\mu }\). The next step in the proof is to modify the martingale \(Z\) in such a way that the distribution of the modified martingale is an element of \(\mathbb {M}_{\mu }\). For any two probability measures \(\nu _1,\nu _2\) on \(\mathbb {R}\), Prokhorov’s metric is defined by

$$\begin{aligned}&d(\nu _1,\nu _2)=\inf \{\delta >0: \nu _1(A)\le \nu _2(A^\delta )+\delta \quad \hbox {and}\\&\quad \ \nu _2(A)\le \nu _1(A^\delta )+\delta , \ \ \forall {A}\in \mathcal {B}(\mathbb {R})\}, \end{aligned}$$

where \(\mathcal {B}(\mathbb {R})\) is the set of all Borel sets \(A\subset \mathbb {R}\) and \(A^\delta :=\bigcup _{x\in A}(x-\delta ,x+\delta )\) is the \(\delta \)–neighborhood of \(A\). It is well known that convergence in the Prokhorov metric is equivalent to weak convergence, (for more details on Prokhorov’s metric see [35], Chapter 3, Section 7).

Let \(\nu _1\) and \(\nu _2\) be the distributions of \(\hat{\mathbb S}_{\tau _m}\) and \(\hat{\mathbb {S}}_T\) respectively, under the measure \(\mathbb {Q}_1\). Let \(\nu _3\) be the distribution of \(\hat{\mathbb S}_T\) under the measure \(\hat{\mathbb Q}\). In view of (5.1), \(d(\nu _1,\nu _2)<\epsilon \). From the definition of the measure \(\mathbb Q_1\) it follows that \(d(\nu _2,\nu _3)<\frac{2}{N}\). Moreover, (4.9) implies that \(d(\nu _3,\mu ^{(N)})<\frac{K}{N}\) and \(\mu ^{(N)}\) converges to \(\mu \) weakly. Hence, the preceding inequalities, together with this convergence, yield that for all sufficiently large \(N\), \(d(\nu _1,\mu )<2\epsilon \). Finally, we observe that in view of (5.2), \((Z_T,P^W)=\nu _1\).
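The bound on \(d(\nu _1,\mu )\) is the triangle inequality for the Prokhorov metric applied along the chain \(\nu _1,\nu _2,\nu _3,\mu ^{(N)},\mu \). Written out as a sketch,

$$\begin{aligned} d(\nu _1,\mu )&\le d(\nu _1,\nu _2)+d(\nu _2,\nu _3)+d(\nu _3,\mu ^{(N)})+d(\mu ^{(N)},\mu )\\&< \epsilon +\frac{2}{N}+\frac{K}{N}+d(\mu ^{(N)},\mu ). \end{aligned}$$

Since weak convergence is equivalent to convergence in the Prokhorov metric, \(d(\mu ^{(N)},\mu )\rightarrow 0\), and the last three terms together are smaller than \(\epsilon \) once \(N\) is large enough for the fixed \(\epsilon \) and \(K\).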

We now use Theorem 4 on page 358 in [35] and Theorem 1 in [36] to construct a measurable function \(\psi :\mathbb {R}^2\rightarrow \mathbb {R}\) such that the random variable \(\Lambda :=\psi (Z_T,W^{(m+2)}_T)\) satisfies

$$\begin{aligned} (\Lambda ,P^W)=\mu \ \ \hbox {and} \ \ P^W(|\Lambda -Z_T|>2\epsilon )<2\epsilon . \end{aligned}$$
(5.12)

We define a martingale by,

$$\begin{aligned} \Gamma _t=E^W(\Lambda |\mathcal {F}^W_t), \quad \ t\in [0,T]. \end{aligned}$$

In view of (5.12), the distribution of the martingale \(\Gamma \) is an element in \(\mathbb {M}_{\mu }\). Hence,

$$\begin{aligned} \sup _{\mathbb Q\in \mathbb {M}_\mu } \mathbb {E}_{\mathbb {Q}}[G(\mathbb S)] \ge E^W(G(\Gamma )). \end{aligned}$$
(5.13)

We continue with the estimate that connects the distribution of \(\Gamma \) to \(\hat{\mathbb {Q}}\in \mathbb {M}^{(K)}_{N}\). Observe that \(E^W\Lambda =E^W Z_T=1\). This together with (5.12), positivity of \(Z+\frac{1}{N}\) and \(\Lambda \), and the Hölder inequality yields

$$\begin{aligned} E^W|\Lambda -Z_T|&= 2 E^W(\Lambda -Z_T)^{+}-E^W(\Lambda -Z_T)\nonumber \\ \nonumber&= 2 E^W(\Lambda -Z_T)^{+} \\ \nonumber&\le 4\epsilon +\frac{2}{N}+ 2E^W(\Lambda \chi _{\{|\Lambda -Z_T|>2\epsilon \}})\\&\le 4\epsilon +\frac{2}{N}+2\left( \int x^p d\mu (x)\right) ^{1/p}(2\epsilon )^{1/q}, \end{aligned}$$
(5.14)

where \(p>1\) is as in (2.4) and \(q=p/({p-1})\). From (5.14) and the Doob inequality for the martingale \(\Gamma _t-Z_t\), \(t\in [0,T]\), we obtain

$$\begin{aligned} E^W(\chi _{\{\Vert \Gamma -Z\Vert >\epsilon ^{1/2q}\}})\le \frac{E^W|\Lambda -Z_T|}{\epsilon ^{1/2q}}\le \frac{4\epsilon +\frac{2}{N}+2\left( \int x^p d\mu (x)\right) ^{1/p}(2\epsilon )^{1/q}}{\epsilon ^{1/2q}}. \end{aligned}$$
(5.15)
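The estimates (5.14) and (5.15) compress several elementary steps. One possible way to spell them out is sketched below. The event \(B_\epsilon \), the martingale \(M\) and the level \(\lambda \) are shorthands introduced only for this sketch, and the bound \(Z_T\ge -1/N\) is read off from the stated positivity of \(Z+\frac{1}{N}\).

$$\begin{aligned}&|x| = 2x^{+}-x,\qquad E^W(\Lambda -Z_T)=1-1=0,\\&(\Lambda -Z_T)^{+}\le 2\epsilon \ \ \hbox {on} \ B_\epsilon ^c,\qquad (\Lambda -Z_T)^{+}\le \Lambda +\frac{1}{N} \ \ \hbox {on} \ B_\epsilon :=\{|\Lambda -Z_T|>2\epsilon \},\\&E^W(\Lambda \chi _{B_\epsilon })\le \left( E^W\Lambda ^p\right) ^{1/p}P^W(B_\epsilon )^{1/q}\le \left( \int x^p d\mu (x)\right) ^{1/p}(2\epsilon )^{1/q},\\&P^W\left( \sup _{0\le t\le T}|M_t|>\lambda \right) \le \frac{E^W|M_T|}{\lambda },\qquad M_t:=\Gamma _t-Z_t, \ \ \lambda :=\epsilon ^{1/2q}. \end{aligned}$$

The first line gives the two equalities in (5.14). The second line yields \(E^W(\Lambda -Z_T)^{+}\le 2\epsilon +\frac{1}{N}+E^W(\Lambda \chi _{B_\epsilon })\). The third line is the Hölder inequality combined with \((\Lambda ,P^W)=\mu \) and (5.12). The fourth line is Doob's maximal inequality with \(M_T=\Lambda -Z_T\), which gives (5.15).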

We now introduce a stochastic process \({(\hat{Z}_t)}_{t=0}^T\) on the Brownian probability space by setting \(\hat{Z}_t=Z_{\sigma _k}\) for \(t\in [\sigma _k,\sigma _{k+1})\), \(k<m\), and \(\hat{Z}_t=Z_{\sigma _m}\) for \(t\in [\sigma _m,T]\). On the space \((\hat{\Omega },\mathbb Q_1)\) let \(\tilde{\mathbb S}_t=\hat{\mathbb S}_{t\wedge \tau _m}\), \(t\in [0,T]\). Recall that \(G\) is bounded by \(K\). We now use Assumption 2.1 together with (5.1) and (5.11) to arrive at

$$\begin{aligned} \begin{aligned}&\mathbb {E}_{\mathbb Q_1}( G(\hat{\mathbb S}))- \mathbb {E}_{\mathbb Q_1} (G(\tilde{\mathbb S}))\le K \epsilon \\&|E^W(G(Z))-E^W (G(\hat{Z}))| \le L E^W \Vert Z-\hat{Z}\Vert \le \frac{L}{N}. \end{aligned} \end{aligned}$$
(5.16)

Recall that by (5.2), \((\hat{Z},P^W)=(\tilde{\mathbb {S}},\mathbb {Q}_1)\). Thus, \(E^W (G(\hat{Z}))= \mathbb {E}_{\mathbb Q_1}(G(\tilde{\mathbb S})).\) This together with Assumption 2.1 and (5.16) yields

$$\begin{aligned} \mathbb {E}_{\hat{\mathbb Q}}(G(\hat{\mathbb S}))\le \frac{L}{N}+\mathbb {E}_{\mathbb Q_1}(G(\hat{\mathbb S})) \le \frac{2L }{N}+K \epsilon + E^W(G(Z)). \end{aligned}$$
(5.17)

From Assumption 2.1, (5.13)–(5.15) and (5.17) we obtain

$$\begin{aligned} \sup _{\mathbb {Q}\in \mathbb {M}_\mu } \mathbb {E}_{\mathbb {Q}}\left[ G(\mathbb S)\right]&\ge E^W(G(\Gamma ))\\&\ge E^W(G(Z))-L \epsilon ^{1/2q}- K E^W(\chi _{\{\Vert \Gamma -Z\Vert >\epsilon ^{1/2q}\}})\\&\ge \mathbb {E}_{\hat{\mathbb Q}}(G(\hat{\mathbb S}))- f_{K} (\epsilon ,N), \end{aligned}$$

where

$$\begin{aligned} f_{K}(\epsilon ,N) =\frac{2L}{N}+ K \epsilon + L\epsilon ^{1/2q}+K \frac{4\epsilon +\frac{2}{N}+2\left( \int x^p d\mu (x)\right) ^{1/p}(2\epsilon )^{1/q}}{\epsilon ^{1/2q}}. \end{aligned}$$
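The limit required in the statement of the lemma can be checked term by term. In the sketch below, \(c_\mu \) is a shorthand (not used elsewhere) for the constant multiplying \(\epsilon ^{1/q}/\epsilon ^{1/2q}=\epsilon ^{1/2q}\), a fixed multiple of \(2^{1/q}\left( \int x^p d\mu (x)\right) ^{1/p}\). Its finiteness is assumed to be what condition (2.4) provides.

$$\begin{aligned} f_{K}(\epsilon ,N)&= \left( \frac{2L}{N}+\frac{2K}{N\epsilon ^{1/2q}}\right) + K\epsilon + L\epsilon ^{1/2q}+4K\epsilon ^{1-1/2q}+c_\mu K\epsilon ^{1/2q},\\ \lim _{N\rightarrow \infty } f_{K}(\epsilon ,N)&= K\epsilon + L\epsilon ^{1/2q}+4K\epsilon ^{1-1/2q}+c_\mu K\epsilon ^{1/2q}. \end{aligned}$$

Since \(q>1\), every exponent of \(\epsilon \) on the right hand side is strictly positive, so the expression tends to zero as \(\epsilon \downarrow 0\). The order of the limits matters: the term \(2K/(N\epsilon ^{1/2q})\) vanishes only if \(N\rightarrow \infty \) is taken first for fixed \(\epsilon \).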

\(\square \)

6 Possible extensions

In this paper, we prove a Kantorovich type duality for a super-replication problem in a financial market with no prior probability structure. The dual is a martingale optimal transport problem.

The main theorem holds for nonlinear path-dependent options satisfying Assumption 2.1. Although this condition is satisfied by most of the examples, it is an interesting question to characterize the class of functions for which the duality holds. A possible procedure for extending the proof is the following. Assumption 2.1 is used in the proofs of Lemmas 4.2, 4.4 and 4.8. In Lemma 4.2, only the linear growth implied by the assumption is used and one may replace this assumption by an appropriate growth condition on the function \(G\). In particular, if \(G\) is bounded no assumption would be required.

Since the inequality (2.10) holds for any measurable function \(G\), we need to extend the proof of the inequality (2.9). We may achieve this by modifying the right hand side of formula (4.7) in Definition 4.6 and using a sequence of functions \(G_n(\hat{\mathbb S})\) satisfying Assumption 2.1 with \(G_n\downarrow G\) as \(n\) tends to \(\infty \). Under this structure, we skip Lemma 4.4 and prove Lemma 4.8 directly. The final step would be a modification of Proposition 5.1 to the following claim

$$\begin{aligned} \limsup _{N\rightarrow \infty }\ \sup _{\mathbb Q\in \mathbb {M}^{(K)}_N}\ \mathbb {E}_{\mathbb Q} [G_N (\hat{\mathbb S})] \ \le \ \sup _{\mathbb {Q}\in \mathbb {M}_\mu } \ \mathbb {E}_{\mathbb {Q}}\left[ G(\mathbb S) \right] \!. \end{aligned}$$

This extension technique also applies to Barrier options. In this case, we use the approximating sequence as the payoffs \(G_n\) of Barrier options with a larger (than the original payoff \(G\)) corridor. The main concern here is to discretize the process in a way adapted to the barriers.

Two other important extensions are to the case of many stocks and the inclusion of the possibility of jumps into the stock price process. We believe that for the multi-dimensional case, a discretization based proof would be possible. The main difficulty here is to appropriately define the crossing times and use them to obtain a piece-wise constant approximation of a generic stock price process.

Finally, the discretization technique developed in this paper also applies to markets with frictions. Indeed, the authors recently proved the duality in a finite time model with proportional transaction costs [19] using an earlier result of [18].