Stochastic Control

This chapter takes techniques from stochastic control and applies them to portfolio management. The portfolio can be of varying type; two possibilities are a portfolio for investment of (personal) wealth, or a hedging portfolio with a short position in a derivative contract. The basic problem involves an investor with a self-financing wealth process and a concave utility function that quantifies their risk aversion; their goal is to maximize the expected utility of terminal wealth and/or consumption. To exemplify the need for hedging obtained from optimal control, recall the price of volatility risk [math]\Lambda(t,s,x)[/math] from Proposition of Chapter. The pricing PDE for stochastic volatility depends on [math]\Lambda[/math], but incompleteness of the market means that [math]\Lambda[/math] may not be uniquely specified. However, an expression can be obtained from the solution to an optimal control problem, which writes [math]\Lambda[/math] as a function of the investor's risk aversion. This chapter starts by considering the basic problem of optimizing (personal) wealth, and later shows how optimal control is used in hedging derivatives.

The Optimal Investment Problem

Consider a standard geometric Brownian motion for the price of a risky asset,

[[math]] \begin{equation} \label{eq:dS_physicalMeasure} \frac{dS_t}{S_t}=\mu dt+\sigma dW_t\ , \end{equation} [[/math]]

where [math]\mu\in\mathbb R[/math], [math]\sigma \gt 0[/math], and [math]W[/math] is a Brownian motion under the statistical measure. There is also a risk-free bank account that pays interest at a rate [math]r\geq 0[/math]. At time [math]t\geq 0[/math] the investor has a portfolio value [math]X_t[/math] with an allocation [math]\pi_t[/math] in the risky asset and a consumption stream [math]c_t[/math]. The dynamics of the portfolio are self-financing,

[[math]] \begin{equation} \label{eq:dX_wealth} dX_t=X_t\left(rdt+ \pi_t\left(\frac{dS_t}{S_t}-rdt\right)-c_tdt\right)\ , \end{equation} [[/math]]

where

[[math]] \begin{align*} \pi_t&=\hbox{proportion of wealth in the risky asset,}\\ c_t&=\hbox{rate of consumption.} \end{align*} [[/math]]

A natural constraint on consumption is [math]c_t\geq 0[/math] for all [math]t\geq 0[/math], and based on \eqref{eq:dX_wealth} the constraint [math]X_t\geq 0[/math] is enforced automatically. Here we have taken [math]X_t\geq 0[/math] almost surely, but in general some finite lower bound on [math]X_t[/math] is necessary to ensure no-arbitrage (i.e. [math]X_t\geq -M \gt -\infty [/math] almost surely with a finite constant [math]M[/math]); otherwise there could be doubling strategies. The optimization problem is then formulated as
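The automatic positivity of [math]X_t[/math] can be illustrated with a small Python sketch. It assumes, purely for simplicity, that [math]\pi[/math] and [math]c[/math] are held constant (the text allows general adapted controls), in which case \eqref{eq:dX_wealth} is a geometric Brownian motion with [math]\log(X_T/X_0)[/math] Gaussian. The function name, parameter values, and seed below are hypothetical choices made for this illustration.

```python
import math
import random


def simulate_terminal_wealth(x0, mu, r, sigma, pi, c, horizon, n_paths, seed=0):
    """Sample X_T for constant controls (pi, c) from the exact lognormal solution.

    With constant pi and c the wealth SDE is a geometric Brownian motion, so
    log(X_T / x0) is Gaussian with the mean and standard deviation below.
    """
    rng = random.Random(seed)
    mean_log = (r + pi * (mu - r) - c - 0.5 * (sigma * pi) ** 2) * horizon
    std_log = sigma * abs(pi) * math.sqrt(horizon)
    return [x0 * math.exp(mean_log + std_log * rng.gauss(0.0, 1.0))
            for _ in range(n_paths)]


# Illustrative (hypothetical) parameter values.
samples = simulate_terminal_wealth(x0=1.0, mu=0.08, r=0.02, sigma=0.2,
                                   pi=0.5, c=0.01, horizon=1.0, n_paths=20000)
sample_mean_log = sum(math.log(x) for x in samples) / len(samples)
print(min(samples) > 0.0, round(sample_mean_log, 3))
```

Every sampled terminal wealth is an exponential and hence strictly positive, and the sample mean of [math]\log X_T[/math] should be close to [math]\left(r+\pi(\mu-r)-c-\tfrac12\sigma^2\pi^2\right)T[/math].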

[[math]] \begin{equation} \label{eq:optimizationProblem} V(t,x)=\max_{\pi,c\geq0}\mathbb E\left[\int_t^TF(u,c_u,X_u)du+U(X_T)\Big|X_t=x\right]\ , \end{equation} [[/math]]

where [math]F[/math] is a concave utility on consumption and wealth, [math]U[/math] is a concave utility on terminal wealth, and the admissible pairs [math](\pi_t,c_t)_{t\geq0}[/math] are non-anticipating, adapted to [math]W[/math], with [math]\int_0^T|\pi_tX_t|^2dt \lt \infty[/math] almost surely. We refer to [math]\mathbb E\left[\int_t^TF(u,c_u,X_u)du+U(X_T)\Big|X_t=x\right][/math] as the objective function, and refer to [math]V[/math] as the optimal value function.

Example \label{ex:optimalLog} Suppose that [math]F=0[/math] and [math]U(x) = \log(x)[/math]. There is no utility of consumption, so the optimal is [math]c_t=0[/math] for all [math]t\geq 0[/math]. Now apply It\^o's lemma,

[[math]]d\log(X_t) = rdt+ \pi_t\left(\frac{dS_t}{S_t}-rdt\right) -\frac{\sigma^2\pi_t^2}{2}dt\ ,[[/math]]

and then taking expectations,

[[math]]\mathbb E[\log(X_T)|X_t=x] =\log(x)+\mathbb E\left[\int_t^T\left(r+ \pi_u\left(\mu-r\right) -\frac{\sigma^2\pi_u^2}{2}\right)du\Big|X_t=x\right]\ ,[[/math]]

where the right-hand side is concave in [math]\pi_t[/math]. Hence, the optimal strategy is

[[math]]\pi_t = \frac{\mu-r}{\sigma^2}\qquad\forall t\in[0,T]\ ,[[/math]]

which is the Sharpe ratio divided by the volatility. The optimal value function is

[[math]]V(t,x)=\max_\pi\mathbb E[\log(X_T)|X_t=x] = \log\left(xe^{\left(r+\frac{(\mu-r)^2}{2\sigma^2}\right)(T-t)}\right)\ ,[[/math]]

and using [math]U^{-1}(v) = e^v[/math], we find the certainty equivalent,

[[math]]X_t^{ce}=e^{-r(T-t)}U^{-1}\left(V(t,X_t)\right)= X_te^{\left(\frac{(\mu-r)^2}{2\sigma^2}\right)(T-t)}\ ,[[/math]]

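that is, before discounting the certainty equivalent grows at a rate equal to the risk-free rate plus [math]\tfrac12[/math] times the squared Sharpe ratio, and the discount factor [math]e^{-r(T-t)}[/math] removes the risk-free part.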
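The closed-form quantities of this example are easy to evaluate numerically. The following Python sketch computes the optimal fraction, the value function, and the certainty equivalent, and runs a crude grid search over constant fractions as a sanity check of the concavity argument. The function names, the grid, and the parameter values are assumptions made for this illustration only.

```python
import math


def log_utility_solution(x, mu, r, sigma, time_to_maturity):
    """Closed-form optimal fraction, value function and certainty equivalent."""
    pi_star = (mu - r) / sigma ** 2
    growth = r + (mu - r) ** 2 / (2.0 * sigma ** 2)
    value = math.log(x) + growth * time_to_maturity
    certainty_equivalent = math.exp(-r * time_to_maturity) * math.exp(value)
    return pi_star, value, certainty_equivalent


def expected_log_growth(pi, mu, r, sigma):
    """Integrand r + pi (mu - r) - sigma^2 pi^2 / 2 for a constant fraction pi."""
    return r + pi * (mu - r) - 0.5 * sigma ** 2 * pi ** 2


mu, r, sigma = 0.08, 0.02, 0.2  # hypothetical parameters
pi_star, value, ce = log_utility_solution(1.0, mu, r, sigma, 1.0)

# Crude grid check that no constant fraction on the grid beats pi_star.
grid = [k / 100.0 for k in range(-300, 301)]
best_on_grid = max(grid, key=lambda p: expected_log_growth(p, mu, r, sigma))
print(pi_star, best_on_grid, ce)
```

The grid search only compares constant fractions, so it illustrates the pointwise maximization of the integrand rather than proving optimality over all admissible strategies.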

Example

Suppose that [math]F(t,c_t,X_t) = e^{-\beta t}\log(c_tX_t)[/math], [math]U(x) = 0[/math], and [math]T=\infty[/math]. The optimization problem is

[[math]]\max_{\pi,c\geq0}\mathbb E\left[\int_t^\infty e^{-\beta(u-t)}\log(c_uX_u)du\Big|X_t=x\right] = V(x)\ ,[[/math]]

which is constant in [math]t[/math]. Now notice for any admissible [math](\pi,c)[/math] on [math][t,t+\Delta t][/math] we have the dynamic programming principle,

[[math]]V(X_t)\geq e^{-\beta\Delta t}\mathbb E_tV(X_{t+\Delta t}) + \mathbb E_t\int_t^{t+\Delta t}e^{-\beta(u-t)}\log(c_uX_u)du\ ,[[/math]]

with equality if and only if [math](\pi,c)[/math] is chosen optimally over [math][t,t+\Delta t][/math], and hence

[[math]] \begin{align*} &\frac{\mathbb E_tV(X_{t+\Delta t}) -V(X_t)}{\Delta t}\\ & \leq \frac{1-e^{-\beta\Delta t}}{\Delta t}\mathbb E_tV(X_{t+\Delta t}) -\frac{1}{\Delta t} \mathbb E_t\int_t^{t+\Delta t}e^{-\beta(u-t)}\log(c_uX_u)du\\ & \rightarrow \beta V(X_t) -\log(c_tX_t)\ , \end{align*} [[/math]]

as [math]\Delta t\rightarrow 0[/math]. On the other hand, from It\^o's lemma we have

[[math]] \begin{align*} dV(X_t) &= \left(\frac{\sigma^2\pi_t^2X_t^2}{2}\frac{\partial^2}{\partial x^2}V(X_t)+\left(r+\pi_t(\mu-r)-c_t\right)X_t\frac{\partial}{\partial x}V(X_t)\right)dt\\ &+\sigma\pi_tX_t\frac{\partial}{\partial x}V(X_t)dW_t\ , \end{align*} [[/math]]

and assuming the Brownian term vanishes under expectations, we have

[[math]] \begin{align*} &\frac{\mathbb E_tV(X_{t+\Delta t}) -V(X_t)}{\Delta t}\\ & =\frac{1}{\Delta t}\mathbb E_t\int_t^{t+\Delta t}\left(\frac{\sigma^2\pi_u^2X_u^2}{2}\frac{\partial^2}{\partial x^2}V(X_u)+\left(r+\pi_u(\mu-r)-c_u\right)X_u\frac{\partial}{\partial x}V(X_u)\right)du\\ & \rightarrow \frac{\sigma^2\pi_t^2X_t^2}{2}\frac{\partial^2}{\partial x^2}V(X_t)+\left(r+\pi_t(\mu-r)-c_t\right)X_t\frac{\partial}{\partial x}V(X_t)\ , \end{align*} [[/math]]

as [math]\Delta t\rightarrow 0[/math]. Hence, for all admissible pairs [math](\pi,c)[/math] the value function [math]V(x)[/math] satisfies

[[math]] \frac{\sigma^2\pi^2x^2}{2}\frac{\partial^2}{\partial x^2}V(x)+\left(r+\pi(\mu-r)-c\right)x\frac{\partial}{\partial x}V(x) - \beta V(x) +\log(cx)\leq 0\ ,[[/math]]

with equality if and only if [math]\pi[/math] and [math]c[/math] are optimal, which leads to the equation

[[math]]\max_{\pi,c\geq 0}\left( \frac{\sigma^2\pi^2x^2}{2}\frac{\partial^2}{\partial x^2}V(x)+\left(r+\pi(\mu-r)-c\right)x\frac{\partial}{\partial x}V(x) - \beta V(x) +\log(cx)\right)=0\ .[[/math]]

Let's assume the ansatz

[[math]]V(x) = a\log(x) +b \ .[[/math]]

Then through first-order optimality conditions (i.e. by differentiating with respect to [math]\pi[/math] and setting equal to zero) we find the optimal

[[math]]\pi(x)= -\frac{\mu-r}{\sigma^2 x}\frac{\frac{\partial}{\partial x}V(x)}{\frac{\partial^2}{\partial x^2}V(x)}=\frac{\mu-r}{\sigma^2 }\ .[[/math]]

Similarly, first-order optimality conditions for [math]c[/math] yield

[[math]]c(x)= \frac{1}{x\frac{\partial}{\partial x}V(x)}=\frac{1}{a}\ .[[/math]]

Putting optimal [math]\pi_t[/math] and [math]c_t[/math] back into the equation for [math]V[/math] along with the ansatz, we find

[[math]]\log(x/a)-\beta(a\log(x)+b)+\left(ar- 1\right)+\frac a2\frac{(\mu-r)^2}{\sigma^2}=0\ ,[[/math]]

and comparing [math]\log(x)[/math] terms and non-[math]x[/math]-dependent terms we find,

[[math]] \begin{align*} a&=\frac1\beta \ ,\\ b&=\frac{1}{2\beta^2}\frac{(\mu-r)^2}{\sigma^2}+\frac{r}{\beta^2}+\frac1\beta\left(\log(\beta)-1\right)\ . \end{align*} [[/math]]
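As a sanity check on the coefficients [math]a[/math] and [math]b[/math], the following Python sketch evaluates the left-hand side of the equation for [math]V[/math] with the ansatz [math]V(x)=a\log(x)+b[/math]. At the optimal controls [math]\pi=(\mu-r)/\sigma^2[/math] and [math]c=1/a=\beta[/math] the residual should vanish for every [math]x[/math], and for other controls it should be negative. The function names and parameter values are hypothetical.

```python
import math


def log_consumption_coefficients(mu, r, sigma, beta):
    """Coefficients of the ansatz V(x) = a log(x) + b."""
    a = 1.0 / beta
    b = ((mu - r) ** 2 / (2.0 * beta ** 2 * sigma ** 2)
         + r / beta ** 2
         + (math.log(beta) - 1.0) / beta)
    return a, b


def hjb_residual(x, pi, c, mu, r, sigma, beta):
    """Expression inside the max of the stationary HJB equation, for given (pi, c)."""
    a, b = log_consumption_coefficients(mu, r, sigma, beta)
    v = a * math.log(x) + b
    v_x = a / x
    v_xx = -a / x ** 2
    return (0.5 * sigma ** 2 * pi ** 2 * x ** 2 * v_xx
            + (r + pi * (mu - r) - c) * x * v_x
            - beta * v
            + math.log(c * x))


mu, r, sigma, beta = 0.08, 0.02, 0.2, 0.05  # hypothetical parameters
pi_opt = (mu - r) / sigma ** 2
c_opt = beta  # equals 1 / a
residual_at_optimum = [hjb_residual(x, pi_opt, c_opt, mu, r, sigma, beta)
                       for x in (0.5, 1.0, 10.0)]
residual_off_optimum = hjb_residual(1.0, 0.5 * pi_opt, 2.0 * c_opt,
                                    mu, r, sigma, beta)
print(residual_at_optimum, residual_off_optimum)
```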

The Hamilton-Jacobi-Bellman (HJB) Equation

The log-utility example above is useful to get started and to get a sense for how an optimal control should look. The infinite-horizon consumption example is more instructive because it shows (i) how the function [math]V[/math] inherits concavity from [math]F[/math] and [math]U[/math], and (ii) how to derive the PDE that [math]V[/math] should satisfy. The derivation starts with the dynamic programming principle,

[[math]]V(t,x)=\max_{\pi,c\geq0}\mathbb E\left[\int_t^{t+\Delta t}F(u,c_u,X_u)du+V(t+\Delta t,X_{t+\Delta t})\Big|X_t=x\right]\ ,[[/math]]

where [math]\max_{\pi,c}[/math] is taken over the interval [math][t,t+\Delta t][/math]. Applying It\^o's lemma to [math]V(t,X_t)[/math], we find

[[math]] \begin{align*} &V(t+\Delta t,X_{t+\Delta t})\\ &= V(t,X_t)+ \int_t^{t+\Delta t}\left(\frac{\partial}{\partial t}+\frac{\sigma^2\pi_u^2X_u^2}{2}\frac{\partial^2}{\partial x^2}+\left(r+\pi_u(\mu-r)-c_u\right)X_u\frac{\partial}{\partial x}\right)V(u,X_u)du\\ &+\sigma\int_t^{t+\Delta t}\pi_uX_u\frac{\partial}{\partial x}V(u,X_u)dW_u\ , \end{align*} [[/math]]

for any admissible [math](\pi,c)[/math] over [math][t,t+\Delta t][/math]. Hence, for any [math](\pi,c)[/math] on [math][t,t+\Delta t][/math] we have

[[math]] \begin{align*} &\mathbb E\left[\int_t^{t+\Delta t}\Bigg(F(u,c_u,X_u)+\Bigg(\frac{\partial}{\partial t}+\frac{\sigma^2\pi_u^2X_u^2}{2}\frac{\partial^2}{\partial x^2}\right.\\ &+\left.\left(r+\pi_u(\mu-r)-c_u\right)X_u\frac{\partial}{\partial x}\Bigg)V(u,X_u)\Bigg)du\Big|X_t=x\right]\leq 0\ , \end{align*} [[/math]]

with equality if and only if an optimal [math](\pi,c)[/math] is chosen. Hence, dividing by [math]\Delta t[/math] and taking the limit [math]\Delta t\rightarrow 0[/math], we obtain the so-called Hamilton-Jacobi-Bellman (HJB) equation:

[[math]] \begin{align} \label{eq:HJB} \max_{\pi,c\geq0}\left(F(t,c,x)+\Bigg(\frac{\partial}{\partial t}+\frac{\sigma^2\pi^2x^2}{2}\frac{\partial^2}{\partial x^2}+\left(r+\pi(\mu-r)-c\right)x\frac{\partial}{\partial x}\Bigg)V(t,x)\right)&=0\ ,\\ \nonumber V(T,x)&=U(x)\ . \end{align} [[/math]]

Merton's Optimal Investment Problem

Let [math]F=0[/math] and consider a power utility function,

[[math]]U(x) = \frac{x^{1-\gamma}}{1-\gamma}\ ,[[/math]]

where [math]\gamma \gt 0[/math], [math]\gamma\neq1[/math] is the risk aversion. The problem is to solve

[[math]]V(t,x)=\max_\pi\mathbb E[U(X_T)|X_t=x]\ .[[/math]]

The HJB equation for this problem is

[[math]] \begin{align} \label{eq:HJBmerton} \left(\frac{\partial}{\partial t}+rx\frac{\partial}{\partial x}\right)V(t,x)+\max_{\pi}\left(\frac{\sigma^2\pi^2x^2}{2}\frac{\partial^2}{\partial x^2}V(t,x)+\pi(\mu-r)x\frac{\partial}{\partial x}V(t,x)\right)&=0\ ,\\ \nonumber V(T,x)&=U(x)\ , \end{align} [[/math]]

for which we find the optimal [math]\pi[/math],

[[math]]\pi_t = -\frac{\mu-r}{x\sigma^2} \frac{\frac{\partial}{\partial x}V(t,x)}{\frac{\partial^2}{\partial x^2}V(t,x)}\ . [[/math]]

Inserting the optimal [math]\pi_t[/math] into \eqref{eq:HJBmerton} we obtain the nonlinear equation,

[[math]] \begin{equation} \label{eq:HJBmerton_nonlinear} \left(\frac{\partial}{\partial t}+rx\frac{\partial}{\partial x}\right)V(t,x)-\frac{\left((\mu-r)\frac{\partial}{\partial x}V(t,x)\right)^2}{2\sigma^2\frac{\partial^2}{\partial x^2}V(t,x)}=0\ . \end{equation} [[/math]]

Then using the ansatz [math]V(t,x) = g(t)U(x)[/math], we find

[[math]] \begin{align*} \frac{\partial}{\partial t}V(t,x)&= g'(t)U(x)\ ,\\ \frac{\partial}{\partial x}V(t,x)&= \frac{1-\gamma}{x}g(t)U(x)\ ,\\ \frac{\partial^2}{\partial x^2}V(t,x)&= -\frac{\gamma(1-\gamma)}{x^2}g(t)U(x)\ , \end{align*} [[/math]]

and inserting in \eqref{eq:HJBmerton_nonlinear} we find an ODE for [math]g[/math],

[[math]]g'(t) +r(1-\gamma)g(t)+\frac{(1-\gamma)(\mu-r)^2g(t)}{2\gamma\sigma^2}=0\ ,[[/math]]

with terminal condition [math]g(T)=1[/math]. The solution is

[[math]]g(t) = e^{(1-\gamma)(T-t)\left(r+\frac{(\mu-r)^2}{2\gamma\sigma^2}\right)}\ ,[[/math]]

and the optimal value function is

[[math]]V(t,x) = U(x)g(t)=\frac{\left(xe^{(T-t)\left(r+\frac{(\mu-r)^2}{2\gamma\sigma^2}\right)}\right)^{1-\gamma}}{1-\gamma}=U\left(xe^{(T-t)\left(r+\frac{(\mu-r)^2}{2\gamma\sigma^2}\right)}\right)\ ,[[/math]]

and the certainty equivalent is

[[math]]X_t^{ce}=e^{-r(T-t)}U^{-1}(V(t,X_t)) = X_te^{(T-t)\left(\frac{(\mu-r)^2}{2\gamma\sigma^2}\right)}\ .[[/math]]
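The Merton solution can be checked numerically with a short Python sketch. It evaluates [math]g(t)[/math], the value function, and the certainty equivalent, and uses a central finite difference to confirm that the exponential [math]g[/math] satisfies a linear ODE of the form [math]g'+(1-\gamma)\left(r+\tfrac{(\mu-r)^2}{2\gamma\sigma^2}\right)g=0[/math]. The constant fraction [math](\mu-r)/(\gamma\sigma^2)[/math] in the code is what the first-order condition gives once the ansatz derivatives are substituted; the function names, step size, and parameter values are assumptions of this sketch.

```python
import math


def merton_power_solution(x, t, maturity, mu, r, sigma, gamma):
    """Optimal fraction, g(t) and V(t, x) for power utility (gamma > 0, gamma != 1)."""
    rate = r + (mu - r) ** 2 / (2.0 * gamma * sigma ** 2)
    g = math.exp((1.0 - gamma) * (maturity - t) * rate)
    value = g * x ** (1.0 - gamma) / (1.0 - gamma)
    pi_star = (mu - r) / (gamma * sigma ** 2)
    return pi_star, g, value


def certainty_equivalent(x, t, maturity, mu, r, sigma, gamma):
    """Discounted inverse utility of the value function."""
    _, _, value = merton_power_solution(x, t, maturity, mu, r, sigma, gamma)
    inverse_utility = ((1.0 - gamma) * value) ** (1.0 / (1.0 - gamma))
    return math.exp(-r * (maturity - t)) * inverse_utility


def g_ode_residual(t, maturity, mu, r, sigma, gamma, h=1e-5):
    """Central-difference check of g' + (1 - gamma) * rate * g = 0."""
    rate = r + (mu - r) ** 2 / (2.0 * gamma * sigma ** 2)
    g_plus = merton_power_solution(1.0, t + h, maturity, mu, r, sigma, gamma)[1]
    g_minus = merton_power_solution(1.0, t - h, maturity, mu, r, sigma, gamma)[1]
    g_mid = merton_power_solution(1.0, t, maturity, mu, r, sigma, gamma)[1]
    return (g_plus - g_minus) / (2.0 * h) + (1.0 - gamma) * rate * g_mid


mu, r, sigma, gamma = 0.08, 0.02, 0.2, 2.0  # hypothetical parameters
pi_star, g_now, value_now = merton_power_solution(2.0, 0.0, 1.0, mu, r, sigma, gamma)
ce_now = certainty_equivalent(2.0, 0.0, 1.0, mu, r, sigma, gamma)
residual = g_ode_residual(0.5, 1.0, mu, r, sigma, gamma)
print(pi_star, ce_now, residual)
```

Compared with the log-utility case, the risk aversion [math]\gamma[/math] scales down both the risky fraction and the excess growth of the certainty equivalent.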

Stochastic Returns

Consider the model

[[math]] \begin{eqnarray} \label{eq:SRM_stochControl} dS_t&=&Y_t S_tdt+\sigma S_tdW_t\\ \label{eq:SRM_dY_stochControl} dY_t &=&\kappa(\mu-Y_t)dt+\beta dB_t\ , \end{eqnarray} [[/math]]

with [math]dW_tdB_t=\rho dt[/math] where [math]\rho\in(-1,1)[/math]. The interpretation of [math]Y_t[/math] could be any of the following: [math]Y_t[/math] is a dividend yield with uncertainty (although this is a somewhat strange model because [math]Y_t[/math] can be negative), or [math]Y_t[/math] is the return rate on a commodities or bond portfolio where there is a roll yield due to contango or backwardation.

Let's assume the simple case [math]\mu=r=0[/math], for which the value function is

[[math]]V(t,x,y) = \max_\pi\mathbb E\left[U(X_T)\Big|X_t=x,Y_t=y\right]\ ,[[/math]]

and has HJB equation

[[math]] \begin{align} \nonumber \left(\frac{\partial}{\partial t}+\frac{\beta^2}{2}\frac{\partial^2}{\partial y^2}-\kappa y\frac{\partial}{\partial y}\right)V(t,x,y)&\\ \nonumber +\max_{\pi}\Bigg(\frac{\sigma^2x^2\pi^2}{2}\frac{\partial^2}{\partial x^2}V(t,x,y)+\pi xy\frac{\partial}{\partial x}V(t,x,y)&\\ \label{eq:HJBstochReturns} +\rho \pi x\beta\sigma\frac{\partial^2}{\partial x\partial y}V(t,x,y)\Bigg)&=0\ ,\\ \nonumber V(T,x,y)&=U(x)\ . \end{align} [[/math]]

The first-order condition for [math]\pi[/math] yields the optimal

[[math]]\pi_t = -\frac{xy\frac{\partial}{\partial x}V(t,x,y)+\rho x\beta\sigma\frac{\partial^2}{\partial x\partial y}V(t,x,y)}{\sigma^2x^2\frac{\partial^2}{\partial x^2}V(t,x,y)}\ .[[/math]]

For the power utility

[[math]]U(x) = \frac{x^{1-\gamma}}{1-\gamma}, [[/math]]

we have the ansatz [math]V(t,x,y) = U(x)g(t,y)[/math] with

[[math]] \begin{align*} \frac{\partial}{\partial t}V& = \frac{\partial}{\partial t}g(t,y)U(x)\ ,\\ \frac{\partial}{\partial x}V& =\frac{1-\gamma}{x} g(t,y)U(x)\ ,\\ \frac{\partial^2}{\partial x^2}V& =-\frac{(1-\gamma)\gamma}{x^2} g(t,y)U(x)\ ,\\ \frac{\partial^2}{\partial x\partial y}V& =\frac{1-\gamma}{x} \frac{\partial}{\partial y}g(t,y)U(x)\ , \end{align*} [[/math]]

all of which are inserted into equation \eqref{eq:HJBstochReturns} to get an equation for [math]g[/math]:

[[math]] \begin{align} \nonumber \left(\frac{\partial}{\partial t}+\frac{\beta^2}{2}\frac{\partial^2}{\partial y^2}-\kappa y\frac{\partial}{\partial y}\right)g(t,y)+\frac{1-\gamma}{2\sigma^2\gamma}\left(y+\frac{\rho \beta\sigma\frac{\partial}{\partial y}g(t,y)}{g(t,y)}\right)^2g(t,y)&=0\ ,\\ \nonumber g(T,y)&=1\ . \end{align} [[/math]]

We now apply another ansatz [math]g(t,y) = e^{a(t)y^2+b(t)}[/math], which when inserted into the equation for [math]g(t,y)[/math] yields the following system:

[[math]] \begin{align*} y^2&:~a'(t)=-2\beta^2\left(1+\frac{(1-\gamma)\rho^2}{\gamma}\right)a^2(t)-2\left(\frac{\rho\beta(1-\gamma)}{\sigma\gamma}-\kappa\right)a(t)-\frac{1-\gamma}{2\sigma^2\gamma}\ ,\\ 1&:~b'(t)=-\beta^2a(t)\ , \end{align*} [[/math]]

with terminal conditions [math]a(T)=b(T)=0[/math]. The solution [math]a(t)[/math] can be written as a ratio,

[[math]]a(t) = \frac{v'(t)}{2\beta^2 \left(1+\frac{(1-\gamma)\rho^2}{\gamma}\right)v(t)}\ ,[[/math]]

where [math]v(t)[/math] is the solution to a 2nd-order ODE,

[[math]]v''(t) +2\left(\frac{\rho\beta(1-\gamma)}{\sigma\gamma}-\kappa\right)v'(t)+\frac{(1-\gamma)\beta^2}{\sigma^2\gamma}\left(1+\frac{(1-\gamma)\rho^2}{\gamma}\right)v(t)=0\ . [[/math]]

The roots of this equation are

[[math]]m_\pm =-\left(\frac{\rho\beta(1-\gamma)}{\sigma\gamma}-\kappa\right)\pm\sqrt{\kappa^2-\frac{\beta(1-\gamma)}{\sigma\gamma}\left(2\kappa\rho+\frac{\beta}{\sigma}\right)}\ ,[[/math]]

which gives the general solution

[[math]]v(t) = C_1e^{-m_+(T-t)}+C_2e^{-m_-(T-t)}\ .[[/math]]

It is not necessary to fully determine constants [math]C_1[/math] and [math]C_2[/math] because we are mainly interested in the ratio [math]v'(t)/v(t)[/math].

Finite-Time Blowup. Complex-valued [math]m_\pm[/math] lead to finite-time blowup for the optimization problem. If the roots are complex then let [math]c = \frac{\rho\beta(1-\gamma)}{\sigma\gamma}-\kappa[/math] (the negative of the real part of [math]m_\pm[/math]) and [math]d = \sqrt{\frac{\beta(1-\gamma)}{\sigma\gamma}\left(2\kappa\rho+\frac{\beta}{\sigma}\right)-\kappa^2}[/math] (the imaginary part of [math]m_+[/math]) so that the general solution is

[[math]]v(t) = e^{c(T-t)}\Big(C_1\cos(d(T-t))+C_2\sin(d(T-t))\Big)\ ,[[/math]]

and with [math]v'(T)=-cC_1-dC_2=0[/math] to satisfy the terminal condition [math]a(T)=0[/math], so that

[[math]]v(t) = C_1e^{c(T-t)}\left(\cos(d(T-t))-\frac{c}{d}\sin(d(T-t))\right)\ .[[/math]]

The solution [math]a(t)[/math] will blow up at the first time [math]t^* \lt T[/math] at which [math]v(t^*)=0[/math], i.e. [math]\cos(d(T-t^*))-\frac{c}{d}\sin(d(T-t^*))=0[/math], that is [math]\tan(d(T-t^*))=\frac{d}{c}[/math] or

[[math]]T-t^* = \frac{1}{d}\left(\pi\mathbf 1_{\{c\leq0\}}+\tan^{-1}\left(\frac{d}{c}\right)\right)\ .[[/math]]
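The blowup can be observed numerically. The Python sketch below works in time to maturity [math]\tau=T-t[/math], where the Riccati equation for [math]a[/math] reads [math]\frac{da}{d\tau}=Aa^2+Ba+C[/math] with [math]a=0[/math] at [math]\tau=0[/math], and [math]A[/math], [math]B[/math], [math]C[/math] read off from the [math]y^2[/math] equation of the system above. The names `A`, `B`, `C`, `omega`, the RK4 integrator, and the parameter values are this sketch's own assumptions. The blowup time in the code comes from the tangent-type closed form [math]a(\tau)=\frac{1}{A}\left(\omega\tan(\omega\tau+\varphi)-\tfrac B2\right)[/math] with [math]\omega=\sqrt{AC-B^2/4}[/math] and [math]\tan\varphi=\tfrac{B}{2\omega}[/math], which covers only the complex-root case.

```python
import math


def riccati_coefficients(gamma, rho, beta, sigma, kappa):
    """Coefficients of da/dtau = A a^2 + B a + C, read off from the y^2 equation."""
    k = (1.0 - gamma) / gamma
    A = 2.0 * beta ** 2 * (1.0 + k * rho ** 2)
    B = 2.0 * (rho * beta * k / sigma - kappa)
    C = k / (2.0 * sigma ** 2)
    return A, B, C


def blowup_time_to_maturity(A, B, C):
    """First tau at which a(tau) explodes, in the complex-root case only."""
    disc = A * C - 0.25 * B * B
    if disc <= 0.0:
        raise ValueError("real roots: not covered by this sketch")
    omega = math.sqrt(disc)
    return (0.5 * math.pi - math.atan(0.5 * B / omega)) / omega


def integrate_riccati(A, B, C, tau_end, n_steps=4000):
    """Classical RK4 for da/dtau = A a^2 + B a + C with a(0) = 0."""
    def rhs(a):
        return A * a * a + B * a + C

    h = tau_end / n_steps
    a = 0.0
    for _ in range(n_steps):
        k1 = rhs(a)
        k2 = rhs(a + 0.5 * h * k1)
        k3 = rhs(a + 0.5 * h * k2)
        k4 = rhs(a + h * k3)
        a += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return a


# Hypothetical parameters with gamma < 1, chosen so that the roots are complex.
A, B, C = riccati_coefficients(gamma=0.5, rho=0.0, beta=1.0, sigma=1.0, kappa=0.5)
tau_star = blowup_time_to_maturity(A, B, C)
a_half = integrate_riccati(A, B, C, 0.5 * tau_star)
a_late = integrate_riccati(A, B, C, 0.9 * tau_star)
print(tau_star, a_half, a_late)
```

In this parameter set the numerically integrated [math]a[/math] grows rapidly as [math]\tau[/math] approaches the computed blowup time, which means the value function is finite only for horizons shorter than that time.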

Stochastic Volatility

Now let's consider the same optimal terminal wealth problem as the Merton problem, with exponential utility

[[math]]U(x) = -\frac1\gamma e^{-\gamma x}\qquad\hbox{where }\gamma \gt 0\ ,[[/math]]

and in the incomplete market of stochastic volatility,

[[math]] \begin{eqnarray} \label{eq:SVM_stochControl} dS_t&=&\mu S_tdt+\sigma(Y_t)S_tdW_t\\ \label{eq:dY_stochControl} dY_t &=&\alpha(Y_t)dt+\beta(Y_t)dB_t\ , \end{eqnarray} [[/math]]

where [math]dW_t\cdot dB_t = \rho dt[/math]. From \eqref{eq:SVM_stochControl} and \eqref{eq:dY_stochControl}, we have the wealth process,

[[math]]dX_t = rX_tdt+\pi_t\left(\frac{dS_t}{S_t}-rdt\right)\ ,[[/math]]

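where [math]\pi_t[/math] now denotes the cash amount held in the risky asset rather than a proportion of wealth, and where the non-negativity constraint on [math]X_t[/math] is no longer enforced. The optimization problem is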

[[math]]V(t,x,y) = \max_\pi\mathbb E\left[U(X_T)\Big|X_t=x,Y_t=y\right]\ ,[[/math]]

but the technique used in Example does not apply because the stochastic integrals may be only local martingales, so their expectations need not vanish. Instead, we arrive at the optimal solution using the HJB equation. The HJB equation is

[[math]] \begin{align} \nonumber \left(\frac{\partial}{\partial t}+rx\frac{\partial}{\partial x}+\frac{\beta^2(y)}{2}\frac{\partial^2}{\partial y^2}+\alpha(y)\frac{\partial}{\partial y}\right)V(t,x,y)&\\ \nonumber +\max_{\pi}\Bigg(\frac{\sigma^2(y)\pi^2}{2}\frac{\partial^2}{\partial x^2}V(t,x,y)+\pi(\mu-r)\frac{\partial}{\partial x}V(t,x,y)&\\ \label{eq:HJBstochVol} +\rho \pi \beta(y)\sigma(y)\frac{\partial^2}{\partial x\partial y}V(t,x,y)\Bigg)&=0\ ,\\ \nonumber V(T,x,y)&=U(x)\ . \end{align} [[/math]]

Using the ansatz [math]V(t,x,y) = U(xe^{r(T-t)})g(t,y)[/math], we have

[[math]] \begin{align*} \frac{\partial}{\partial t}V& = \gamma rx e^{r(T-t)} V+U(xe^{r(T-t)})\frac{\partial}{\partial t}g(t,y)\ ,\\ \frac{\partial}{\partial x}V& = -\gamma e^{r(T-t)} V\ ,\\ \frac{\partial^2}{\partial x^2}V& = \gamma^2 e^{2r(T-t)}V\ ,\\ \frac{\partial^2}{\partial x\partial y}V& = -\gamma e^{r(T-t)} U(xe^{r(T-t)})\frac{\partial}{\partial y}g(t,y)\ ,\\ \end{align*} [[/math]]

which we insert into \eqref{eq:HJBstochVol} to find a PDE for [math]g[/math],

[[math]] \begin{align} \nonumber \left(\frac{\partial}{\partial t}+\frac{\beta^2(y)}{2}\frac{\partial^2}{\partial y^2}+\alpha(y)\frac{\partial}{\partial y}\right)g(t,y)&\\ \label{eq:HJBstochVol_g} +\min_{\pi}\Bigg(\frac{\gamma^2e^{2r(T-t)}\sigma^2(y)\pi^2}{2}g(t,y)-\gamma e^{r(T-t)}\pi\left((\mu-r)g(t,y)+\rho \beta(y)\sigma(y)\frac{\partial}{\partial y}g(t,y)\right)\Bigg)&=0\ ,\\ \nonumber g(T,y)&=1\ , \end{align} [[/math]]

and the optimal strategy is

[[math]]\pi_t = e^{-r(T-t)}\left(\frac{\mu-r}{\gamma\sigma^2(y)}+\rho \frac{ \beta(y)}{\gamma\sigma(y)}\frac{\frac{\partial}{\partial y}g(t,y)}{g(t,y)}\right)\ .[[/math]]

Inserting this optimal [math]\pi_t[/math] into \eqref{eq:HJBstochVol_g} yields the nonlinear equation

[[math]] \begin{align} \label{eq:stochVol_Merton_g} \left(\frac{\partial}{\partial t}+\frac{\beta^2(y)}{2}\frac{\partial^2}{\partial y^2}+\alpha(y)\frac{\partial}{\partial y}\right)g(t,y)-\frac{\sigma^2(y)}{2}\left(\frac{\mu-r}{\sigma^2(y)}+\rho \frac{ \beta(y)}{\sigma(y)}\frac{\frac{\partial}{\partial y}g(t,y)}{g(t,y)}\right)^2 g(t,y)&=0\ . \end{align} [[/math]]

This equation can be reduced to a linear PDE if we look for a function [math]\psi(t,y)[/math] such that

[[math]]g(t,y) = \psi(t,y)^q\ ,[[/math]]

where [math]q[/math] is a parameter. Differentiating yields,

[[math]] \begin{align*} \frac{\partial}{\partial t}g &= \frac{qg}{\psi}\frac{\partial}{\partial t}\psi\\ \frac{\partial}{\partial y}g &= \frac{qg}{\psi}\frac{\partial}{\partial y}\psi\\ \frac{\partial^2}{\partial y^2}g &= qg\left(\frac1\psi\frac{\partial^2}{\partial y^2}\psi+\frac{q-1}{\psi^2}\left(\frac{\partial}{\partial y}\psi\right)^2\right)\ , \end{align*} [[/math]]

and then plugging into \eqref{eq:stochVol_Merton_g} with the parameter chosen as [math]q=1/(1-\rho^2)[/math], which cancels the terms in [math]\left(\frac{\partial}{\partial y}\psi\right)^2[/math], yields a linear equation:

[[math]] \begin{align} \label{eq:linearPDEstochVolControl} \left(\frac{\partial}{\partial t}+\frac{\beta^2(y)}{2}\frac{\partial^2}{\partial y^2}+\left(\alpha(y)-\rho \frac{(\mu-r) \beta(y)}{\sigma(y)}\right)\frac{\partial}{\partial y}\right)\psi(t,y)-\frac{(\mu-r)^2}{2q\sigma^2(y)}\psi(t,y)&=0\ . \end{align} [[/math]]

Example Consider a futures contract [math]F_{t,T}[/math] with settlement date [math]T[/math] and stochastic volatility and returns,

[[math]] \begin{align*} \frac{dF_{t,T}}{F_{t,T}}&=\mu Y_tdt+\sqrt{Y_t}dW_t\\ dY_t&=\kappa(\bar Y-Y_t)dt+\beta\sqrt{Y_t}dB_t \end{align*} [[/math]]

where [math]\beta^2\leq 2\kappa\bar Y[/math] and [math]dW_tdB_t=\rho dt[/math]. The wealth process for futures trading is

[[math]]dX_t = rX_tdt+ \pi_t\frac{dF_{t,T}}{F_{t,T}}\ .[[/math]]

For [math]U(x) = -\frac1\gamma e^{-\gamma x}[/math] the optimal terminal expected utility is

[[math]]V(t,x,y)=U(xe^{r(T-t)})\psi(t,y)^q\ ,[[/math]]

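where [math]\psi(t,y)[/math] solves an equation analogous to \eqref{eq:linearPDEstochVolControl}, except that the excess return [math]\mu-r[/math] is replaced by the futures drift [math]\mu y[/math] (no [math]r[/math] appears because the futures position carries no financing cost),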

[[math]] \begin{align*} \left(\frac{\partial}{\partial t}+\frac{\beta^2y}{2}\frac{\partial^2}{\partial y^2}+\left(\kappa(\bar Y-y)-\rho \mu\beta y\right)\frac{\partial}{\partial y}\right)\psi(t,y)-\frac{\mu^2y}{2q}\psi(t,y)&=0\ . \end{align*} [[/math]]

It can be further shown that the solution to this equation is of the form

[[math]]\psi(t,y) = e^{a(t)y+b(t)}\ ,[[/math]]

with [math]a(T)=b(T)=0[/math], and where [math]a(t)[/math] and [math]b(t)[/math] satisfy ODEs,

[[math]] \begin{align*} a'(t)+\frac{\beta^2}{2}a^2(t)-\left(\kappa+\rho \mu\beta \right)a(t)-\frac{\mu^2}{2q}&=0\\ b'(t)+\kappa\bar Ya(t)&=0\ , \end{align*} [[/math]]

both of which can be solved explicitly.
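Although the ODEs for [math]a(t)[/math] and [math]b(t)[/math] admit explicit solutions, they are also easy to integrate numerically. The Python sketch below uses a classical RK4 scheme in time to maturity [math]\tau=T-t[/math], where the system reads [math]\frac{da}{d\tau}=\frac{\beta^2}{2}a^2-(\kappa+\rho\mu\beta)a-\frac{\mu^2}{2q}[/math] and [math]\frac{db}{d\tau}=\kappa\bar Ya[/math] with [math]a=b=0[/math] at [math]\tau=0[/math]. The exponent [math]q[/math] is passed in as an input; the usage example takes the uncorrelated case [math]\rho=0[/math] with [math]q=1[/math]. The function names, step count, and parameter values are hypothetical.

```python
import math


def solve_affine_odes(mu, kappa, y_bar, beta, rho, q, time_to_maturity, n_steps=2000):
    """RK4 in tau = T - t for (a, b) with a(0) = b(0) = 0."""
    k = kappa + rho * mu * beta

    def rhs(a):
        da = 0.5 * beta ** 2 * a ** 2 - k * a - mu ** 2 / (2.0 * q)
        db = kappa * y_bar * a
        return da, db

    h = time_to_maturity / n_steps
    a, b = 0.0, 0.0
    for _ in range(n_steps):
        k1a, k1b = rhs(a)
        k2a, k2b = rhs(a + 0.5 * h * k1a)
        k3a, k3b = rhs(a + 0.5 * h * k2a)
        k4a, k4b = rhs(a + h * k3a)
        a += h * (k1a + 2.0 * k2a + 2.0 * k3a + k4a) / 6.0
        b += h * (k1b + 2.0 * k2b + 2.0 * k3b + k4b) / 6.0
    return a, b


def psi(y, a, b):
    """Exponential-affine form psi = exp(a y + b)."""
    return math.exp(a * y + b)


# Hypothetical parameters; they satisfy beta^2 <= 2 * kappa * y_bar.
a_long, b_long = solve_affine_odes(mu=1.0, kappa=2.0, y_bar=0.04, beta=0.3,
                                   rho=0.0, q=1.0, time_to_maturity=10.0)
print(a_long, b_long, psi(0.04, a_long, b_long))
```

For long horizons [math]a[/math] settles at the negative root of the quadratic on the right-hand side of its ODE, so [math]\psi[/math] decays roughly exponentially in the time to maturity.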

Indifference Pricing

Stochastic control for terminal wealth can be used to find the price of a call option under stochastic volatility. Consider the value function

[[math]]V^{h}(t,x,y,s) = \max_\pi\mathbb E\left[U(X_T-(S_T-K)^+)\Big|X_t=x,Y_t=y,S_t=s\right]\ ,[[/math]]

where the investor now hedges a short position in a call option with strike [math]K[/math]. Comparing with the value function of the same investor when not short the call,

[[math]]V^0(t,x,y) = \max_\pi\mathbb E\left[U(X_T)\Big|X_t=x,Y_t=y\right]\ ,[[/math]]

we look for the amount of cash [math]\$p[/math] such that

[[math]]V^{h}(t,x+p,y,s) = V^{0}(t,x,y)\ .[[/math]]

The extra cash makes the hedger indifferent, in terms of expected utility, to taking the short position. With exponential utility there is a separation of variables,

[[math]] \begin{align*} V^{h}(t,x,y,s)&=\max_\pi\mathbb E\left[-\frac1\gamma e^{-\gamma(X_T-(S_T-K)^+)}\Big|X_t=x,Y_t=y,S_t=s\right]\\ &=-\frac1\gamma e^{-\gamma xe^{r(T-t)}}\min_\pi\mathbb E\left[e^{-\gamma\left(\int_t^Te^{r(T-u)}\pi_u\left(\frac{dS_u}{S_u}-rdu\right)-(S_T-K)^+\right)} \Big|Y_t=y,S_t=s\right]\\ &=U\left(xe^{r(T-t)}\right)g^h(t,y,s)\ , \end{align*} [[/math]]

where we've used the differential [math]d\left(e^{r(T-t)}X_t\right) = e^{r(T-t)}\pi_t\left(\frac{dS_t}{S_t}-rdt\right)[/math]. Since [math]U\left((x+p)e^{r(T-t)}\right)=U\left(xe^{r(T-t)}\right)e^{-\gamma pe^{r(T-t)}}[/math], the price [math]\$p[/math] solves [math]e^{-\gamma pe^{r(T-t)}} = g(t,y)/g^h(t,y,s)[/math], that is [math]p=\frac{e^{-r(T-t)}}{\gamma}\log\left(g^h(t,y,s)/g(t,y)\right)[/math], where [math]g(t,y)[/math] is the solution from \eqref{eq:HJBstochVol_g}. Depending on the risk-aversion coefficient [math]\gamma[/math], there will be different prices [math]\$p[/math]. This brings us back to the price of volatility risk [math]\Lambda(t,s,x)[/math] from Proposition of Chapter. Namely, investors with different risk aversion will have a different [math]\Lambda[/math] for their martingale evaluation of the call option. If an indifference price is obtained then both optimization problems have a solution, hence there is no arbitrage, and the range of prices [math]\$p[/math] over all levels of risk aversion is a no-arbitrage interval. For complete markets there is a single price [math]\$p[/math] for all levels of risk aversion.

General references

Papanicolaou, Andrew (2015). "Introduction to Stochastic Differential Equations (SDEs) for Finance". arXiv:1504.05309 [q-fin.MF].