Acknowledgements
It is a pleasure to acknowledge the help provided by many colleagues and friends. In particular, the advice and suggestions of Paul Doust, Andrei Pogudin, Jian Chen, Raphael Albrecht, Dhermider Kainth and Michael Dogwood have been of great help. This book is much the better thanks to them.
We are grateful to John Wiley for agreeing to publish this book, and for the enthusiasm they have shown for the project. Caitlin Cornish has been a most efficient and supportive commissioning editor.
Finally, two of us (RR and KM) cannot help feeling some pangs of envy towards our third co-author, Richard. Unfortunately for us, but probably wisely for him, a few months into the project Richard decided to take a year off to tour the world with his girlfriend. We suspect that the pleasures of proofreading and reference checking may have played a part in making trekking through Siberia appear more attractive than it is normally cracked up to be. Be that as it may, his contribution to this book has been so important that, proofreading or no proofreading, he has earned full authorship, and we feel proud to have him as third co-author. (Just don’t do this again, Richard.)
Chapter 1
Introduction
All models are wrong, but some models are useful
We present in this book a financially motivated extension of the LIBOR market model that reproduces for all strikes and maturities the prices of the plain-vanilla hedging instruments (swaptions and caplets) produced by the SABR model. In other words, our extension of the LIBOR market model accurately recovers in a financially motivated manner the whole of the SABR smile surface.
As the SABR model has become the ‘market standard’ for European options, just the recovery of the smile surface by a dynamic model could be regarded as a useful achievement in itself. However, we have tried to do more. As we have stressed in the opening sentences, we have tried to accomplish this task in a way that we consider financially justifiable.
Our reason for insisting on financial reasonableness is not (just) an aesthetic one. We believe that the quality of a derivatives model should be judged not just on the basis of its ability to price today’s hedging instruments, but also on the basis of the quality of the hedges it suggests. We believe that these hedges can be good only if the model is rooted in empirical financial reality. The ‘empirical financial reality’ of relevance for the pricing and hedging of complex derivatives is the dynamics of the smile surface. We explain below why we believe that this is the case.
We are therefore not just offering yet another model. We present a ‘philosophy’ of option pricing that takes into account the realities of the industry needs (e.g., the need to calibrate as accurately as possible to the plain-vanilla reference hedging instruments, the need to obtain prices and hedges in reasonable time) while reproducing a realistic future evolution of the smile surface (our ‘financial reality’).
Until recently choosing between fitting today’s prices very accurately and being respectful of ‘financial reality’ (given our meaning of the term) entailed making hard choices. For instance, some approaches, such as local-volatility modelling (see, e.g., Dupire (1994), Derman and Kani (1994)), fulfilled (by construction) very well the first set of requirements (perfect fitting of today’s smile). This made local volatility models very popular with some traders. Yet, the dynamics of the smile these models implied were completely wrong. Indeed, the SABR model, which constitutes the starting point for our extension, was introduced to remedy the wrong dynamics imposed by the local-volatility framework.
On the other hand, financially much more palatable models, such as the Variance Gamma model (see, e.g., Madan and Seneta (1990)) and its ‘stochastic volatility’ extensions (see, e.g., Madan and Carr (1998)), have failed to gain acceptance in the trading rooms because of their computational cost and, above all, the difficulties in achieving a quick and stable calibration to current market prices. These prices may be ‘wrong’ and the Variance Gamma models ‘right’, but this is not a discussion the complex derivatives trader is interested in entering into - and probably wisely so.
We believe that these hard choices no longer have to be made. The framework we present recovers almost exactly today’s market prices of plain-vanilla options, and at the same time implies a reasonable future evolution for the smile surface. We say ‘reasonable’ and not ‘optimal’. The evolution our model implies is not the ‘best’ from an econometric point of view. Two of us (RR and RW), for instance, believe that a two-state Markov-chain model for the instantaneous volatility does a much better job at describing how smile surfaces evolve, especially in times of market turmoil. We have published extensively in this area (see, e.g., Rebonato and Kainth (2004) and White and Rebonato (2008)), and our ideas have been well received in academic circles. Yet we are aware that the approach, even after all the numerical tricks we have discovered, remains too awkward for daily use on the trading floor. It is destined to remain ‘another interesting model’. This is where the need for realism comes into play. We believe that the extension of the LMM that we present provides a plausible description of our financial reality while retaining tractability, computational speed and ease of calibration.
As we said, we take the SABR model (Hagan et al. (2002)) as the starting point for our extension of the LMM. This is not just because the SABR model has become the market standard to reproduce the price of European options. It is also because it is a good model for European options. Again, pragmatism certainly played a part in its specification as well. A log-normal choice for the volatility process is not ideal, both from a theoretical and (sometimes) from a computational point of view. However, the great advantages afforded by the ability to have an analytic approximation to the true prices, the ease of calibration and the stability of the fitted parameters have more than offset these drawbacks. The main strength of the SABR model, however, is that it is financially justifiable, not just a fitting exercise: the dynamics it implies for the smile evolution when the underlying changes are fundamentally correct - unlike the dynamics suggested by the even-better-fitting local-volatility model.
If the SABR model is so good, why do we need to tinker with it? The problem with the SABR model is that it treats each European option (caplet, swaption) in isolation - in its own measure. The processes for the various underlyings (the forward rates and swap rates) do not ‘talk to each other’. It is not obvious how to link these processes together in a coherent dynamics for the whole yield curve. The situation is strongly reminiscent of the pre-LMM days. In those days market practitioners were using the Black (1976) formula for different caplets and swaptions (each with its own ‘implied volatility’), but did not know how to link the processes together for the various forward rates to a coherent, arbitrage-free evolution for the whole yield curve. This is what the LMM achieved: it brought all the forward rates under a single measure, and specified dynamics that, thanks to the no-arbitrage ‘drift adjustments’, were simultaneously valid for all the underlyings. Complex instruments could then be priced (with a deterministic volatility).
We are trying to do something very similar. With our model we bring the dynamics of the various forward rates and stochastic volatilities under a single measure. To ensure absence of arbitrage we also derive ‘drift adjustments’. Not surprisingly, these have to be applied both to the forward rates and to their volatilities. When this is done, complex derivatives, which depend on the joint realization of all the relevant forward rates, can now be priced.
All of this is not without a price: when the volatilities become stochastic, there is a whole new set of functions to specify (the volatilities of the volatilities). There is also a whole correlation structure to assign: forward-rate/forward-rate correlations, as in the LMM; but also the forward-rate/volatility and volatility/volatility correlations. For, say, a 10-year, quarterly deal, this could provide a fitting junkie with hundreds of parameters to play with. Since implying process parameters from market prices is an inverse problem (which also has to rely on the informational efficiency of the market), we are very wary of this approach. Our philosophy can instead be summarized with the sound bite:
Imply from market prices what you can (really) hedge, and estimate econometrically what you cannot.
This is for us so important that we must explain what we mean. Ultimately, it goes back to our desire to reproduce the dynamics of the smile surface as well as we (realistically) can.
One may say: ‘If the price of an option is equal to the cost of the instruments required for hedging, and if a model, like the local volatility one, reproduces the prices of all of today’s hedging options perfectly, what else should a trader worry about?’ We agree with the first part of the statement (‘the price of an option is equal to the cost of the instruments required for hedging’), but the bit about ‘the cost of the instruments required for hedging’ refers not just to today’s hedging, but to all the hedging costs incurred throughout the life of the complex deal. This, after all, is what pricing by dynamic replication is all about. Since volatility (vega) hedging is essential in complex derivatives trading, future re-hedging costs mean future prices of plain-vanilla options (future caplets and swaptions). Future prices of caplets and swaptions mean future implied volatilities. Future implied volatilities mean future smiles. This is why a plausible evolution of the smile is essential to complex derivatives pricing: it determines the future re-hedging costs that, according to the model, will be incurred during the life of the deal. If a model implies an implausible level or shape for the future smile (as local-volatility models do), it also implies implausible future prices for caplets and swaptions and therefore implausible re-hedging costs.
One of us (RR) has discussed all of this at some length in a recent book (see Rebonato (2004a), Chapter 1 in particular). Since we want to keep this book as concise and to-the-point as possible, we shall not repeat the argument in detail - matters, indeed, are a bit more complex because in a diffusive setting the theoretical status of vega hedging is at the very least dubious. Even here, however, we must say that our argument, despite its plausibility, does not enjoy universal acceptance. There is a school of thought that believes in what we call a ‘fully implied’ approach. In a nutshell, this approach says something like: ‘Fit all the plain-vanilla option prices today with your model, without worrying too much whether your chosen model may imply implausible dynamics for the smile; use all the plain-vanilla instruments you have fitted to for your hedging; future re-hedging costs may indeed be different from what your model believes; but you will make compensating errors in your complex instrument and in the hedges.’
Again, one of us (RR) has argued at length against this view. In brief, the objections are that for the ‘all-implied’ approach to work option markets must either be perfectly informationally efficient or complete. The first requirement is appealing because it suggests that traders can be spared the hard task of carrying out complicated and painstaking econometric analyses, because the market has already done all this work for them: the information, according to this view, is already all in the prices, and we only have to extract it. While this optimistic view about the informational efficiency of the market may hold in the aggregate about very large, liquid and heavily scrutinized markets (such as the equity or bond markets), it is not obvious that it should be true in every corner of the investment landscape. In particular, it appears to us a bit too good to be true in the complex derivatives arena, as it implies, among other things, that supply and demand cannot affect the level of option prices - and hence of implied volatilities (an ‘excess’ supply of volatility by, say, investors should have no effect on the clearing levels of implied volatilities because, if it made options too ‘cheap’, it would entice pseudo-arbitrageurs to come in and restore prices to fundamentals). Again, see the discussion by Rebonato (2004a) about this point.
The second line of defence for the ‘all-implied’ approach is somewhat less ambitious. It simply implies that ‘wrong’ prices can be ‘locked in’ by riskless trades - much as one can lock in a forward rate if one can trade in the underlying discount bonds: if one can trade in discount bonds of, say, six and nine months, one can lock in the future borrowing/lending rate without worrying whether this implied level is statistically plausible or not. This view, however, implies that the market in complex derivatives is complete, i.e., that one can notionally trade, or synthetically construct, a security with a unit payment in every single state of the world of relevance for the payoff of the complex security we want to price. But plain-vanilla instruments (caplets and European swaptions) emphatically do not span all the states of the world that affect the value of realistically complex derivatives products. The relevant question is therefore how much is left out by the completeness assumption. We believe that the answer is ‘far too much’.
Our approach therefore is to calibrate our model as accurately as possible to those instruments we are really going to use in our hedging (this is the ‘hedge what we really can’ part of our sound bite). We then try to ‘guesstimate’ as accurately as possible using econometric analysis the remaining relevant features of the future smile (remember, this ultimately means ‘of the future re-hedging costs’) and to ensure that our calibrated model reflects the gross features of these empirical findings in the whole if not in the detail. This is why we give such great importance to the econometric estimation of the dynamic variables of our models as to devote a whole part of the book (Part III) to the topic.
But, if the future smile is unknown today, what hopes can we have of calibrating our model appropriately, and therefore of guessing correctly the future re-hedging costs? Our hopes lie in the fact that the future smile surface may well be stochastic, but certain regularities are readily identifiable. We may not be able to guess exactly which shape the smile surface will assume in the future, but we should make sure that these identifiable regularities are broadly recovered. An informed guess, we believe, is way better than nothing. If the goal seems too modest, let us not forget that the local-volatility model miserably fails even this entry-level test of statistical acceptability.
So, we do not peddle the snake-oil of the ‘perfect model with the perfect hedge’. After all, if a substantial degree of uncertainty did not remain even after the best model was used, it would be difficult to explain why, in a competitive market, the margins enjoyed by complex derivatives traders are still so much wider than the wafer-thin margins available in less uncertain, or more readily hedgeable, asset classes. The name of the game therefore is not to hope that we can eliminate all uncertainty (perhaps by deluding ourselves that we can ‘lock in’ all the current market prices). A more realistic goal for a good model is to offer the ability to reduce the uncertainty to an acceptable minimum by making as judicious a use as possible of the econometric information available.
This is what we believe our modelling approach can offer. And this is why our book is different from most other books on derivatives pricing, which tend to be heavy on stochastic calculus but worryingly thin on empirical analysis.
Finally, we are well aware that there are conditions of market stress that our model ‘does not know about’. We therefore propose in the last chapter of our book a pragmatic hedging approach, inspired by the work two of us (RR and RW) have done with the two-state Markov-chain approach mentioned above. This approach can ensure a reasonable hedging strategy even in those situations when the (essentially diffusive) assumptions of our model fail miserably. This will be an unashamedly ‘outside-the-model’ hedging methodology, whose strength relies on two essential components: the empirical regularities of the dynamics of the smile surface; and the robustness of the fits we propose. As these are two cornerstones of our approach, we believe that we have a chance of succeeding.
Part I
The Theoretical Set-Up
Chapter 2
The LIBOR Market Model
... When we have contracted a habitude and intimacy with any [pricing model]; tho’ in [using it] we have not been able to discover any very valuable quality, of which [it] is possess’d; yet we cannot forbear preferring [it] to [new models], of whose superior merit we are fully convinc’d ...
Adapted from David Hume, A Treatise of Human Nature, 1740.
In order to make our treatment self-contained, we review in this chapter the ‘standard’ (i.e., deterministic-volatility) LIBOR market model (LMM). The most influential original papers published in refereed journals about the LMM were by Brace, Gatarek and Musiela (1997), Jamshidian (1997) and Rutkowski (1998). For a treatment of the topic conceptually aligned with our way of looking at things, see Rebonato (2002) and Rebonato (2004a). For a discussion of the historical development of interest-rate modelling leading to the LMM and beyond, see Rebonato (2004b). In order to set the LMM in the broader modelling context of term-structure models, a very good discussion and many references can be found in Hughston (2003) and Hughston and Brody (2000).
For the purposes of the following discussion, the most important thing to remember is that, despite the name, the LMM is not a model; rather, it is a set of no-arbitrage conditions among forward rates (or discount bonds). The precise form of these no-arbitrage conditions depends on the chosen ‘unit of account’ (the numeraire). As it turns out, these no-arbitrage conditions are purely a function of the volatilities of, and the correlations among, the state variables (in our case, the forward rates). This is because ‘physically’ the origin of the no-arbitrage condition is the covariance between the payoff and the discounting. In a nutshell the reasoning goes as follows. We can discount cashflows in several different ways (i.e., we can use several different stochastic numeraires to relate a future payoff to its values today). These different stochastic numeraires will in general co-vary (positively or negatively) with the same payoff in different ways. For instance, the stochastic discounting might be high just when the payoff is high, thereby reducing its value today, or vice versa. However, the value today of a payoff must be independent of the arbitrary way we have chosen to discount it. It should therefore be intuitive that, in order to obtain a numeraire-independent price, we must somehow adjust the dynamics of the state variable in order to account and compensate for this co-variation. What is needed to go from this intuition to a specific form for the no-arbitrage conditions is just a moderate amount of stochastic-calculus plumbing. This is what we turn to in the following.
2.1 Definitions
We assume that a discrete set of default-free discount bonds, $P_t^{T_i}$, are traded in the economy. We denote the generic instantaneous forward rate at time $t$, resetting at time $T$ and paying at time $T + \tau$, by $f(t, T, T + \tau)$. The $N$ reset times are indexed and numbered from 1 to $N$: $T_1, T_2, \ldots, T_N$. If we work with spanning forward rates, the payment time for the $i$th forward rate coincides with the reset time for the $(i+1)$th forward rate. The forward rates are then denoted by

$$f_t^i \equiv f(t, T_i, T_{i+1}), \quad i = 1, 2, \ldots, N \qquad (2.1)$$

The instantaneous volatilities of the forward rates are denoted by

$$\sigma_t^i \equiv \sigma^i(t, T_i) \qquad (2.2)$$

The instantaneous correlation between forward rate $i$ and forward rate $j$ is denoted by

$$\rho_{ij} \equiv \rho_{ij}(t) \qquad (2.3)$$

For discounting a numeraire must be chosen. A valid numeraire must be strictly positive in all states of the world. To make life easier, it is much better if it does not pay dividends or coupons. A possible choice can be a discount bond, $P_t^{T_i}$.

The link between the forward rates and the discount bonds introduced above is via the definition

$$f_t^i = \frac{1}{\tau_i}\left(\frac{P_t^{T_i}}{P_t^{T_{i+1}}} - 1\right) \qquad (2.4)$$

with

$$\tau_i = T_{i+1} - T_i \qquad (2.5)$$

We call $\tau_i$ the tenor of the forward rate, but note that this definition is not universal.

The description of the (discrete) yield curve is completed by providing the value of the spot rate, i.e., the rate for lending/borrowing from spot time to $T_1$, given by

$$r_t = \frac{1}{T_1 - t}\left(\frac{1}{P_t^{T_1}} - 1\right) \qquad (2.6)$$
We stress that this set-up provides a description of a discrete set of forward rates indexed by a continuous time index.
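As a minimal numerical sketch of definition (2.4), the snippet below bootstraps forward rates from discount-bond prices. The bond prices are purely illustrative, not market data:

```python
# Hypothetical discount-bond prices P(0, T_i) on an annual grid (illustrative numbers)
bonds = {1.0: 0.9608, 2.0: 0.9216, 3.0: 0.8825}

def forward_rate(P_start, P_end, tau):
    """Forward rate from equation (2.4): f = (P(T_i)/P(T_{i+1}) - 1) / tau_i."""
    return (P_start / P_end - 1.0) / tau

f1 = forward_rate(bonds[1.0], bonds[2.0], tau=1.0)  # resets at T=1, pays at T=2
f2 = forward_rate(bonds[2.0], bonds[3.0], tau=1.0)  # resets at T=2, pays at T=3

# Sanity check: compounding a forward over its tenor recovers the bond-price ratio
assert abs((1.0 + f1 * 1.0) - bonds[1.0] / bonds[2.0]) < 1e-12
assert abs((1.0 + f2 * 1.0) - bonds[2.0] / bonds[3.0]) < 1e-12
```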
In the deterministic-volatility LMM the evolution of these forward rates is described by equations of the form

$$\frac{df_t^i}{f_t^i} = \mu^i(\{f_t\}, \{\sigma_t\}, \rho, t)\,dt + \sigma^i(t, T_i)\,dz_t^i \qquad (2.7)$$

with

$$E[dz_t^i\,dz_t^j] = \rho_{ij}\,dt \qquad (2.8)$$

Here $f_t$ is the vector of spanning forward rates that constitute the yield curve, $\sigma_t$ the vector of the associated volatilities, and $\rho$ the matrix of the associated correlations. Note that, in principle, the functions $\sigma^i(t, T_i)$ need not be the same for different forward rates; we have therefore used a superscript to identify the possibly different volatility functions. If these functions are the same for all the forward rates, and if the dependence on $t$ and $T_i$ of this common function (say, $\sigma(\cdot)$) is of the form

$$\sigma(t, T_i) = g(T_i - t) \qquad (2.9)$$

then the LMM is said to be time homogeneous. This is important because, as explained at length in Rebonato (2002), in this case the future smile surface will exactly ‘look like’ today’s smile surface. If this can be achieved, it is (most of the time) a very desirable feature, for the reasons explained in the Introduction.

Finally, note that, with a slight abuse of notation, we will often denote these time-homogeneous functions as

$$\sigma^i(t) = \sigma^{T_i}(t) = g(T_i - t) \qquad (2.10)$$

In this equation the superscript $i$ or $T_i$ now denotes the dependence on the expiry of the forward rate, $T_i$, of the same volatility function for all the forward rates. So, in the time-homogeneous formulation, at a given time $t$, the volatilities of two forward rates differ only because they have different times to expiry - i.e., they are at different points of their otherwise identical ‘life’. This is what makes the smile surface time invariant.

As for the drifts, $\mu^i(\{f_t\}, \{\sigma_t\}, \rho, t)$, which appear in (2.7), these will be derived in a unified manner when dealing with the LMM-SABR model.
2.2 The Volatility Functions
There are, of course, many financially plausible functions that satisfy condition (2.9) above. One of us (RR) has explained at length in Rebonato (2002) and Rebonato (2004a) why the following specification provides a good choice:

$$g(\tau) = (a + b\tau)e^{-c\tau} + d, \qquad \tau \equiv T_i - t \qquad (2.11)$$

A selection of possible shapes of this functional form is shown in the accompanying figure. Summarizing briefly, this functional form has the following properties.
• It allows for a monotonically decaying or for a humped volatility function. This is desirable because Rebonato (2002) and Rebonato (2004a) explain that a humped volatility should be appropriate for normal trading periods and a monotonically decaying one for excited periods. In a nutshell, the argument points to the fact that, in normal market times, the actions of the monetary authorities are such that the maximum uncertainty in the value of rates is found neither in immediately resetting forward rates, nor in forward rates with very long expiries. It is in the intermediate-maturity range that the uncertainty should be greatest. In Part III we present empirical evidence to buttress the claims made in the references above.
• It is, of course, square-integrable and allows for closed-form solutions of the integrals of its square. As we shall see, this is important because these integrals are linked to the pricing of plain-vanilla and complex instruments.
• Its parameters lend themselves to an easy interpretation. For instance, $a + d$ is the value of the instantaneous volatility of any forward rate as its expiry approaches zero; $d$ is the value of the instantaneous volatility for very long maturities; the maximum of the hump, if the choice of parameters allows for one, is located at $\tau_{\text{hump}} = \frac{1}{c} - \frac{a}{b}$. If we believe in the ‘financial story’ presented in the references above, we can check whether our market-fitted parameters are consistent with it. Also, we can compare the position of the maximum obtained from these market fits with the econometric evidence presented in Part III of this book.
• When market fits are carried out for at-the-money swaptions or caplets, ‘natural’ fits are obtained, with parameters that lend themselves to the financial interpretation above.
• When coupled with a simple correlation function, the functional form (2.11) describes well and in a parsimonious manner the whole at-the-money swaption surface. See, for instance, the studies by Rebonato (2006) and White and Rebonato (2008).
For these reasons, this is the particular functional form that we shall use, and expand upon, in this book. However, there is no loss of generality in doing so, and all of our treatment would still hold if any other form for the time-homogeneous function g(·) were used.
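The parameter interpretation listed above can be verified in a few lines. The sketch below codes the functional form (2.11) with purely hypothetical parameter values (they are not fits from this book) and checks the short-expiry limit, the long-maturity limit, and the hump location:

```python
import math

def g(tau, a, b, c, d):
    """The instantaneous-volatility function (2.11): g(tau) = (a + b*tau)*exp(-c*tau) + d."""
    return (a + b * tau) * math.exp(-c * tau) + d

# Hypothetical parameters producing a humped, 'normal-times' shape
a, b, c, d = 0.02, 0.10, 0.60, 0.14

# Short-expiry volatility is a + d; long-maturity volatility decays to d
assert abs(g(0.0, a, b, c, d) - (a + d)) < 1e-12
assert abs(g(100.0, a, b, c, d) - d) < 1e-6

# Hump location 1/c - a/b (exists when positive); check it is a local maximum
tau_hump = 1.0 / c - a / b
eps = 1e-4
assert g(tau_hump, a, b, c, d) > g(tau_hump - eps, a, b, c, d)
assert g(tau_hump, a, b, c, d) > g(tau_hump + eps, a, b, c, d)
```

Setting $b = 0$ and $a > 0$ instead gives the monotonically decaying, ‘excited-period’ shape discussed in the first bullet point.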
2.3 Separating the Correlation from the Volatility Term
Let us go back to (2.7) and rewrite it as

$$\frac{df_t^i}{f_t^i} = \mu^i\,dt + \sum_{k=1}^{m} \sigma_{ik}\,dz_t^k \qquad (2.12)$$

where we now assume that we are dealing with $m$ ($m \le N$) factors and that the Brownian increments are independent:

$$E[dz_t^i\,dz_t^j] = \delta_{ij}\,dt \qquad (2.13)$$

where $\delta_{ij}$ is the Kronecker delta ($\delta_{ij} = 1$ for $i = j$ and 0 otherwise).

[Figure: Possible shapes of the volatility function (2.11). Note how both ‘excited’ (series 5) and ‘normal’ states (series 1 to 4) can be obtained.]

The quantities $\sigma_{ik}$ can be interpreted as the loadings of the $i$th forward rate onto the $k$th factor. Clearly, because of this independence, the relationship between the volatility $\sigma_i$ and the loadings $\sigma_{ik}$ is given by

$$\sigma_i^2 = \sum_{k=1}^{m} \sigma_{ik}^2 \qquad (2.14)$$

If we have chosen the function $\sigma_i$ in such a way that the relationship

$$\hat{\sigma}_{T_i}^2 T_i = \int_0^{T_i} \sigma_i(u)^2\,du \qquad (2.15)$$

holds true, then the market caplets will be correctly priced. (In the equation above, and everywhere in the book, the quantity $\hat{\sigma}_{T_i}$ represents the Black implied volatility - recall that, for the moment, we are dealing with a world without smiles.) For this reason we call (2.15) the caplet-pricing condition.

Let us now multiply and divide each loading $\sigma_{ik}$ by the volatility, $\sigma_i$, of the $i$th forward rate:

$$\sigma_{ik} = \sigma_i\,\frac{\sigma_{ik}}{\sigma_i} \qquad (2.16)$$

Making use of the caplet-pricing condition, this can be rewritten as

$$\frac{df_t^i}{f_t^i} = \mu^i\,dt + \sigma_i \sum_{k=1}^{m} \frac{\sigma_{ik}}{\sigma_i}\,dz_t^k \qquad (2.17)$$

If we now define the quantity $b_{ik}$ as

$$b_{ik} \equiv \frac{\sigma_{ik}}{\sigma_i} \qquad (2.18)$$

(2.17) can be expressed in a more compact way as

$$\frac{df_t^i}{f_t^i} = \mu^i\,dt + \sigma_i \sum_{k=1}^{m} b_{ik}\,dz_t^k \qquad (2.19)$$

Let us now denote by $b$ the $[N \times m]$ matrix of elements $b_{jk}$. It can be readily shown that the correlation matrix can be expressed as

$$\rho = b\,b^{T} \qquad (2.20)$$
Expression (2.19) is very useful. It tells us that we can decompose the stochastic part of the evolution of the forward rate into a component, σi, that only depends on the volatility (and that we may want to hedge with caplets), and a component, the matrix b, that only affects the correlation (and that we may want to use for historical fitting). The link between the loadings bik and the prices of caplets is shown below.
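A small numerical illustration of the decomposition (2.18)-(2.20) may help. The two-factor angle parametrization below is a standard textbook device (not a prescription from this chapter): writing each row of $b$ as $(\cos\theta_i, \sin\theta_i)$ automatically enforces the unit row norm implied by (2.14) and (2.18), and yields $\rho_{ij} = \cos(\theta_i - \theta_j)$. The angles are purely illustrative:

```python
import math

# Hypothetical two-factor loadings: b_i = (cos theta_i, sin theta_i), one angle
# per forward rate. Each row then has unit norm, as required by
# b_ik = sigma_ik / sigma_i together with sigma_i^2 = sum_k sigma_ik^2.
thetas = [0.0, 0.15, 0.35, 0.60]
b = [[math.cos(th), math.sin(th)] for th in thetas]

# Unit-norm constraint on each row of b
for row in b:
    assert abs(sum(x * x for x in row) - 1.0) < 1e-12

# The model correlation matrix rho = b b^T, equation (2.20)
N, m = len(b), 2
rho = [[sum(b[i][k] * b[j][k] for k in range(m)) for j in range(N)] for i in range(N)]

# With this parametrization, rho_ij = cos(theta_i - theta_j)
for i in range(N):
    for j in range(N):
        assert abs(rho[i][j] - math.cos(thetas[i] - thetas[j])) < 1e-12
```

This separation is exactly the point of (2.19): the $\sigma_i$ can be pinned down by caplet prices while the angles (i.e., the matrix $b$) can be fitted to historical correlations.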
2.4 The Caplet-Pricing Condition Again
We have stated above that in a smile-less world the instantaneous and implied Black volatilities are linked by the relationship

$$\hat{\sigma}_{T_i} = \sqrt{\frac{1}{T_i}\int_0^{T_i} \sigma_i(u, T_i)^2\,du} \qquad (2.21)$$

If (2.21) is satisfied, the Black price of the $i$th caplet is exactly recovered. If we have chosen the function $g(T_i - t)$ in such a way that its root-mean-square is equal to the Black volatility, the caplet-pricing condition

$$\hat{\sigma}_{T_i}^2 T_i = \int_0^{T_i} \sigma_i(u, T_i)^2\,du \qquad (2.22)$$

simply becomes

$$\hat{\sigma}_{T_i}^2 T_i = \int_0^{T_i} g(T_i - u)^2\,du \qquad (2.23)$$

Suppose now that the function $g(\cdot)$ is parametrized by a set of coefficients, perhaps the parameters $\{a, b, c, d\}$ discussed above. For any given parametrization one can check, for each forward rate, whether the integral of the square of the instantaneous volatility out to the expiry of the forward rate does coincide with the total Black variance. For the functional form for $g(\cdot)$ specified above, this means checking whether the relationship

$$\hat{\sigma}_{T_i}^2 T_i = \int_0^{T_i} \left[(a + b(T_i - u))e^{-c(T_i - u)} + d\right]^2 du \qquad (2.24)$$

holds true.

In general, a given set of parameters $\{a, b, c, d\}$ will not allow the exact fulfilment of condition (2.24) for more than four forward rates. Therefore, even in a world without smiles, the same set of parameters $\{a, b, c, d\}$ will not recover the Black caplet prices of all the forward rates. In order to achieve the exact recovery of the prices of all of today’s caplets, we associate to each forward rate a different scaling factor, $k^{T_i}$, defined as

$$\left(k^{T_i}\right)^2 = \frac{\hat{\sigma}_{T_i}^2\,T_i}{\int_0^{T_i} g(T_i - u)^2\,du} \qquad (2.25)$$

and write for the forward-rate-specific instantaneous volatility function

$$\sigma_i(t, T_i) = k^{T_i}\,g(T_i - t) \qquad (2.26)$$
We note that the introduction of the forward-rate-specific scaling factors $k^{T_i}$ makes the evolution of the term structure of volatilities no longer strictly time homogeneous - this is clear, because the scaling factors are functions not of the residual, but of the initial time to maturity. This will be an acceptable price to pay for a perfect (smile-less) fit only to the extent that all the quantities $k^{T_i}$ turn out to be close to 1 (or, actually, to any constant). Indeed, in practice this works very well, and one of us (RR) has shown at length (see, e.g., Rebonato (2002) and Rebonato (2004a)) that imposing the condition that all the scaling terms $k^{T_i}$ should be as close as possible to 1 is a very good way to parametrize the function $g(\cdot)$.

Despite the fact that the scaling terms $k^{T_i}$ are unashamedly ‘fudge factors’ that, in a perfect world, we would gladly do without, these quantities will turn out to be crucial for the stochastic-volatility extension that we present in Chapter 4.
In sum: we have split the stochastic part of the evolution of the forward rates into a component, related to caplet prices, that purely depends on the volatility and a component that purely depends on the correlation. It is to this second part that we now turn.
2.5 The Forward-Rate/Forward-Rate Correlation
In the deterministic-volatility LMM the correlation matrix has always been the poor relation of the volatility function. When we move to the LMM-SABR model a lot more care must be given to the specification of the correlation surface. We therefore begin to look at it in more detail than is usually done with the standard LMM even if we are still in a deterministic-volatility setting.
2.5.1 The Simple Exponential Correlation
The simplest functional form for a correlation function is possibly the following:

$$\rho_{ij} = e^{-\beta\,|T_i - T_j|} \qquad (2.27)$$

with $T_i$, $T_j$ the expiries of the $i$th and $j$th forward rates, and $\beta$ a positive constant. The farther apart two forward rates are, the more decorrelated they are. Furthermore, for any positive $\beta$ one can rest assured that the corresponding matrix $\rho$ will always be an admissible correlation matrix (i.e., a real, symmetric matrix with positive eigenvalues).

These two desirable features are naturally recovered by (2.27). What expression (2.27) does not handle well is the fact that two forward rates, separated by the same ‘distance’, $T_i - T_j$, will decorrelate just as much irrespective of the expiry of the first. So, according to the parametrization (2.27), the decorrelation between the forward rates expiring in 1 and 2 years’ time is the same as the decorrelation between the forward rates expiring in 29 and 30 years’ time. As we show in Part III (Empirical results), this is empirically a very poor approximation.
Why has this functional form been used so often? The reason is that, despite this financial blemish, it has an important computational advantage: in the LIBOR market model the following quantities (covariance elements)

\[ C_{ik}(t,T) = \int_t^T \sigma_i(s)\,\sigma_k(s)\,\rho_{ik}(s)\,ds \quad (2.28) \]

play an essential role, as they enter the drifts of the forward rates and must be calculated, either implicitly or explicitly, to evaluate any complex payoff. Note, however, that if the correlation function is of the form (2.27), there is no explicit dependence on the integration time variable in (2.28): ρ(t, Ti, Tk) = ρ(Ti, Tk) = ρik. Therefore one can write

\[ C_{ik}(t,T) = \rho_{ik} \int_t^T \sigma_i(s)\,\sigma_k(s)\,ds \quad (2.29) \]

If one chooses a sufficiently simple functional form for the instantaneous volatility (such as the one discussed above), the integral (2.29) can be pre-calculated analytically. This can lighten the computational burden considerably.
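For concreteness, the factorization in (2.29) can be checked with a minimal numerical sketch. The humped 'abcd' instantaneous-volatility form and all parameter values below are our illustrative assumptions, not values taken from the text:

```python
import math

def inst_vol(tau, a=0.02, b=0.1, c=1.0, d=0.14):
    # A humped instantaneous-volatility function of time to expiry tau
    # (the popular 'abcd' form; parameters are purely illustrative).
    return (a + b * tau) * math.exp(-c * tau) + d

def trapezoid(f, lo, hi, n=2000):
    # Simple composite trapezoid rule.
    h = (hi - lo) / n
    return h * (0.5 * f(lo) + sum(f(lo + k * h) for k in range(1, n)) + 0.5 * f(hi))

T_i, T_k, beta = 5.0, 7.0, 0.1
rho_ik = math.exp(-beta * abs(T_i - T_k))   # correlation of the form (2.27)

# Covariance element with the (constant-in-s) correlation kept inside ...
cov_inside = trapezoid(lambda s: inst_vol(T_i - s) * inst_vol(T_k - s) * rho_ik, 0.0, 1.0)
# ... and with the correlation pulled out of the integral, as in (2.29).
cov_outside = rho_ik * trapezoid(lambda s: inst_vol(T_i - s) * inst_vol(T_k - s), 0.0, 1.0)
```

Because ρik does not depend on the integration variable, the two computations agree to machine precision; with a time-dependent correlation they would not.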
Whether the unpleasant financial features of the functional form (2.27) are important or not depends on the instrument being priced. Rebonato (2002) argues at length that for European swaptions the dependence of their price on the specific functional form of the correlation function is modest. However, for products like CMS spread options a satisfactory description of the dependence of the decorrelation factor, β, on the expiry of the forward rates can be very important. See again the empirical data presented in Part III. For this reason we present an alternative functional form in the following section.
2.5.2 The Multiplicative Correlation
Several ways to improve on the simple specification (2.27) have been proposed. See, e.g., Schoenmakers and Coffey (2000). We have identified the problem with the simple correlation function to be the fact that the decorrelation (brought about by the exponential decay β) only depends on the distance between two forward rates. To overcome this problem the challenge is therefore to introduce a dependence of the decorrelation factor, β, on the expiries of the forward rates, β = β(Ti,Tj), in such a way that the resulting correlation matrix remains real, symmetric and positive definite.
A simple way to achieve this goal is the following. Consider, for simplicity, a 5 × 5 real symmetric matrix. In general it could have 10 (i.e., 5 × 4/2) independent elements. If we assign these 10 elements as we wish (keeping them, of course, between −1 and 1) there is no guarantee that the resulting matrix will be positive definite, i.e., that it will be a valid correlation matrix. However, if we only specify (5 − 1) quantities, a1, a2, a3 and a4 (keeping them, of course, again between −1 and 1), and these are linked as in the matrix below:
\[
\rho = \begin{pmatrix}
1 & a_1 & a_1 a_2 & a_1 a_2 a_3 & a_1 a_2 a_3 a_4 \\
a_1 & 1 & a_2 & a_2 a_3 & a_2 a_3 a_4 \\
a_1 a_2 & a_2 & 1 & a_3 & a_3 a_4 \\
a_1 a_2 a_3 & a_2 a_3 & a_3 & 1 & a_4 \\
a_1 a_2 a_3 a_4 & a_2 a_3 a_4 & a_3 a_4 & a_4 & 1
\end{pmatrix} \quad (2.30)
\]

then it is easy to show (Doust, 2007) that the resulting real symmetric matrix is always positive definite, i.e., it is a possible correlation matrix. (We note in passing that this result is similar in spirit to the construction in Schoenmakers and Coffey (2000).) This is because the matrix (2.30) admits the Cholesky decomposition ρ = LLᵀ with

\[
L = \begin{pmatrix}
1 & 0 & 0 & 0 & 0 \\
a_1 & \sqrt{1-a_1^2} & 0 & 0 & 0 \\
a_1 a_2 & a_2\sqrt{1-a_1^2} & \sqrt{1-a_2^2} & 0 & 0 \\
a_1 a_2 a_3 & a_2 a_3\sqrt{1-a_1^2} & a_3\sqrt{1-a_2^2} & \sqrt{1-a_3^2} & 0 \\
a_1 a_2 a_3 a_4 & a_2 a_3 a_4\sqrt{1-a_1^2} & a_3 a_4\sqrt{1-a_2^2} & a_4\sqrt{1-a_3^2} & \sqrt{1-a_4^2}
\end{pmatrix} \quad (2.31)
\]

The condition for this to give a valid correlation matrix is that all the elements along the main diagonal be real. This is always going to be the case, for instance, if the ai s are chosen so that

\[ -1 \le a_i \le 1, \qquad i = 1, 2, \ldots, 4 \quad (2.32) \]

This is a very simple condition to satisfy.
In order to be sure that one is constructing a valid [n × n] correlation matrix one can therefore proceed as follows. First of all we fill in the elements on the diagonal:

\[ \rho_{ii} = 1, \qquad i = 1, 2, \ldots, n \quad (2.33) \]

We then define the elements in the first row by the relationship

\[ \rho_{1j} = \prod_{k=1}^{j-1} a_k, \qquad j = 2, 3, \ldots, n \quad (2.34) \]

By inspection (assuming i > j) we then note that

\[ \rho_{ij} = \frac{\rho_{1i}}{\rho_{1j}} = \prod_{k=j}^{i-1} a_k \quad (2.35) \]
Finally, the elements in the upper triangle of the correlation matrix are determined by the symmetry relationship: ρij = ρji. So, given the (n − 1) quantities ai, i = 1, 2, ..., n − 1, we know how to build a valid [n × n] correlation matrix.
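The recipe above is easy to put in code. The sketch below (pure Python, with illustrative ai values of our choosing) builds the matrix from the product rule and confirms its validity by checking that the explicit triangular factor in the pattern of (2.31) reproduces it exactly:

```python
import math

def doust_corr(a):
    # Doust correlation matrix from the (n-1) quantities a_k:
    # rho[i][j] = a_{j+1} * ... * a_i for i > j (0-based), 1 on the diagonal.
    n = len(a) + 1
    rho = [[1.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i):
            p = 1.0
            for k in range(j, i):
                p *= a[k]
            rho[i][j] = rho[j][i] = p
    return rho

def doust_chol(a):
    # Lower-triangular L with rho = L L^T, following the pattern of (2.31).
    n = len(a) + 1
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            if j == i:
                L[i][j] = 1.0 if i == 0 else math.sqrt(1.0 - a[i - 1] ** 2)
            else:
                p = 1.0
                for k in range(j, i):
                    p *= a[k]
                L[i][j] = p if j == 0 else p * math.sqrt(1.0 - a[j - 1] ** 2)
    return L

a = [0.78, 0.80, 0.84, 0.87]                 # illustrative values in (-1, 1)
rho = doust_corr(a)
L = doust_chol(a)
n = len(rho)
recon = [[sum(L[i][k] * L[j][k] for k in range(n)) for j in range(n)] for i in range(n)]
err = max(abs(rho[i][j] - recon[i][j]) for i in range(n) for j in range(n))
```

Since LLᵀ recovers ρ with a real triangular factor, the matrix is positive (semi-)definite by construction, whatever admissible ai we feed in.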
The question that remains outstanding is: how do we choose the quantities ai? The answer depends on how smooth we want our resultant correlation matrix to be: too many parameters may well give a better fit, but we risk chasing unavoidable numerical noise.
When all the elements of the correlation matrix are positive (this is a plausible case for forward rates) the following is a systematic way to proceed.
Let us write

\[ a_k = \exp\left[-\beta_k\,\Delta T\right] \quad (2.36) \]

where ΔT is the spacing between forward rates. Then

\[ \rho_{ij} = \exp\left[-\sum_{k=j}^{i-1} \beta_k\,\Delta T\right], \qquad i > j \quad (2.37) \]

This is nice because when βk = β0 for all k the expression above becomes

\[ \rho_{ij} = \exp\left[-\beta_0 \left| T_i - T_j \right|\right] \quad (2.38) \]

i.e., the 'traditional' exponentially decaying correlation function (2.27). The 'physical' meaning of this choice is that every additional increment ΔT of distance from any forward rate brings about exactly the same degree of decorrelation, exp[−β0 ΔT].
Then, as we give further flexibility to the dependence of the quantities βk on k, we gain the ability to specify that the degree of decorrelation between the jth and kth forward rates should depend on the expiry of the first of the two rates. For instance, if we believe that the decorrelation between, say, the ninth and tenth forward rates should be less pronounced than the decorrelation between the first and second, this will simply be reflected in the requirement that

\[ \beta_9 < \beta_1 \]
We can systematically increase the flexibility of our model correlation function by allowing the dependence of the quantities βk on k to take on a progressively wider range of functional forms. Note, however, that we do not only want βk to be a (weakly) decreasing function of k. We must also require βk > 0 for all k, to ensure that the conditions on the ai are satisfied. A linear or quadratic function does not automatically guarantee this. However, if we choose βk = g0 + g1/k + g2/k² + ... for positive gi, then the βk are both decreasing in k and always positive. Obviously, alternative forms are also possible, with the maximum flexibility, of course, corresponding to the case when all the (n − 1) quantities β1, β2, ..., βn−1 are allowed to be 'fitting parameters'.
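As a sketch (the g coefficients below are our illustrative assumptions), the inverse-power parametrization of the βk and the resulting ak can be checked directly:

```python
import math

def beta_k(k, g=(0.05, 0.15, 0.10)):
    # beta_k = g0 + g1/k + g2/k^2 with positive coefficients: positive and
    # strictly decreasing in k by construction. The g values are illustrative.
    g0, g1, g2 = g
    return g0 + g1 / k + g2 / k ** 2

n, dT = 10, 0.25
betas = [beta_k(k) for k in range(1, n)]      # beta_1, ..., beta_{n-1}
a = [math.exp(-b * dT) for b in betas]        # a_k = exp(-beta_k * dT)
```

Since every βk is positive, each ak lies strictly in (0, 1), so the resulting Doust matrix is automatically admissible.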
We will discuss in Part III how well this functional form can describe real correlation matrices (the answer, as we shall see, is 'very well'; otherwise we would not have bothered presenting the approach ...). For the moment, we present the types of correlation shapes that can be produced by the Doust formulation.
2.6 Possible Shapes of the Doust Correlation Function
We show in this section the shapes of the correlation surface that the Doust correlation function can naturally assume. The first figure displays the case when every ai is set equal to the same value:

\[ a_i = \exp\left[-\beta_1\,\Delta T\right] \quad (2.39) \]

Doust correlation structure with β1 = 1.

It is easy to see that, in this simple case, we indeed recover the exponentially decaying surface more traditionally described by the function

\[ \rho(T_i, T_j) = \exp\left[-\beta_1 \left| T_i - T_j \right|\right] \quad (2.40) \]
The interesting features of the Doust function are displayed in the next figure, obtained with ai = exp[−βi], using the values of βi given in the table at the end of this section (and an equal spacing of ΔT = 1 year).
Here we clearly see that the shape of the correlation surface, when looked at down the main diagonal, changes from convex at the ‘front’, to concave at the ‘back’. This is indeed what intuition and empirical evidence (see Chapter 10) tell us should happen.
Finally, a very simple and very useful generalization of the Doust correlation function can always be introduced at 'no extra cost'. In fact, we can always impose that the long-term decorrelation among forward rates should not go asymptotically to zero with increasing 'distance', but to some finite level, LongCorr, simply by rewriting the original correlation function, ρ̃ij(t), in the form

\[ \rho_{ij}(t) = LongCorr + (1 - LongCorr)\,\widetilde{\rho}_{ij}(t) \quad (2.41) \]

If the matrix ρ̃ij(t) is a valid correlation matrix, one can rest assured that so is the matrix ρij(t), at least as long as LongCorr > 0.
This extension is so simple and so useful that in the following we will implicitly assume that it is always carried out.
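The rescaling is a one-liner in code. A sketch, using a simple exponentially decaying matrix for the original correlation and illustrative parameter values of our choosing:

```python
import math

LongCorr, beta, n = 0.3, 0.1, 10
# Original correlation matrix rho_tilde (simple exponential form, for illustration).
rho_t = [[math.exp(-beta * abs(i - j)) for j in range(n)] for i in range(n)]
# Shifted matrix: LongCorr + (1 - LongCorr) * rho_tilde. This is a convex
# combination of rho_tilde with the all-ones matrix (itself positive
# semi-definite), so validity is preserved for LongCorr between 0 and 1.
rho = [[LongCorr + (1.0 - LongCorr) * rho_t[i][j] for j in range(n)] for i in range(n)]
```

The diagonal stays equal to 1, while the off-diagonal entries now decay towards LongCorr rather than towards zero.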
Doust correlation structure with convexity.
The quantities βi used to obtain the figure above:

| βi | ai = exp[−βi] |
|---|---|
| 0.250 | 0.7788 |
| 0.220 | 0.8025 |
| 0.180 | 0.8353 |
| 0.145 | 0.8650 |
| 0.120 | 0.8869 |
| 0.100 | 0.9048 |
| 0.080 | 0.9231 |
| 0.040 | 0.9608 |
| 0.001 | 0.9990 |
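The right-hand column of the table above is just exp(−βi) applied to the left-hand column, which is easy to verify:

```python
import math

# Values transcribed from the table above.
betas = [0.250, 0.220, 0.180, 0.145, 0.120, 0.100, 0.080, 0.040, 0.001]
tabulated = [0.7788, 0.8025, 0.8353, 0.8650, 0.8869, 0.9048, 0.9231, 0.9608, 0.9990]

# Recompute a_i = exp(-beta_i), rounded to the table's four decimal places.
computed = [round(math.exp(-b), 4) for b in betas]
```

Every entry matches to the four decimal places shown.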
2.7 The Covariance Integral Again
Consider again integrals of the type

\[ C_{ik}(t,T) = \int_t^T \sigma_i(s)\,\sigma_k(s)\,\rho_{ik}(s)\,ds \quad (2.42) \]

As mentioned above, these integrals appear in the drifts of the forward rates in the LIBOR market model, and are the computational bottleneck for the Monte Carlo evolution of the state variables (typically the forward rates). However, the ability to 'pull' the correlation out of the integral is lost as soon as any functional form other than linear is chosen for the dependence of the correlation on the running time to maturity Ti − s. Even if one restricts attention to time-homogeneous functions, for any correlation function expressed as a function of the difference of functions g(·) other than a linear function,

\[ \rho_{ij}(t) = \rho\big(g(T_i - t) - g(T_j - t)\big) \quad (2.43) \]

the dependence on t makes the integral unlikely to be analytically tractable. (Clearly, if g(Ti − t) = α + β(Ti − t), then g(Ti − t) − g(Tj − t) = β(Ti − Tj), and the dependence on t disappears.)
Consider now the Doust correlation (2.35), which describes the correlation between a forward rate of maturity exactly Ti and another forward rate of maturity exactly Tj. As real or simulation time goes by, the distance between the two forward rates remains the same but, since in the Doust model (and in reality) the correlation does not purely depend on their distance, the correlation between the two rates will change. We can therefore no longer pull the correlation out of the integral.
We can see more clearly what happens as follows. First of all note that, as the clock moves from time T0 to time T1, the correlation matrix evolves from the correlation seen at time T0 to the same correlation with the last row and column dropped. If we keep the labels for the forward rates the same as at time T0, then at time T1 the correlation between forward rates, say, 4 and 5 is the same as the correlation between forward rates 3 and 4 at time T0. It is exactly because of this self-similarity that we can call the Doust correlation function time homogeneous. If our simulation uses steps as large as the spacing between the resets of the forward rates (and we 'freeze' all quantities at their values at the start of the time step), it is therefore easy to see that in the discrete-time Euler approximation calendar time does not enter the covariance integral.
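This self-similarity is easy to verify numerically. A sketch (the ai values are illustrative, and the labels are kept as at time T0):

```python
def doust_corr(a):
    # Doust matrix: rho[i][j] = product of the a_k between the two rates (0-based).
    n = len(a) + 1
    rho = [[1.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i):
            p = 1.0
            for k in range(j, i):
                p *= a[k]
            rho[i][j] = rho[j][i] = p
    return rho

a = [0.75, 0.80, 0.85, 0.90, 0.92, 0.95]   # illustrative, steeper at the front
rho_T0 = doust_corr(a)

# At time T1 the first forward rate has reset: by time homogeneity the live
# correlation matrix is rho_T0 with its last row and column dropped, and the
# rate originally labelled m now sits at index m - 2 (0-based).
rho_T1 = [row[:-1] for row in rho_T0[:-1]]
idx = lambda label: label - 2

# Correlation between (originally labelled) rates 4 and 5 at time T1 ...
corr_45_T1 = rho_T1[idx(4)][idx(5)]
# ... equals the correlation between rates 3 and 4 at time T0.
corr_34_T0 = rho_T0[3 - 1][4 - 1]
```

The matrix seen by the surviving rates at T1 is exactly the front block of the matrix seen at T0, which is the time-homogeneity property used in the text.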
With the deterministic-volatility LMM it is easy to use computational 'tricks' (see, e.g., Hunter, Jaeckel and Joshi (2001) or Joshi and Stacey (2006)) that make such long-stepping numerically feasible. With a stochastic-volatility LMM, however, the step size cannot be too long, because it must be sufficiently fine to allow the forward rate to 'feel' the variation of the stochastic volatility. This being the case, one often has to take intermediate Monte Carlo steps in between reset times, the more so the longer the tenor of the forward rate. This raises the question of what happens to the correlation matrix at these 'in-between' times.
To see matters clearly, let us now denote by ai(t) the decorrelation, for times t between Tj and Tj+1, between the forward rates labelled at time T0 by i − 1 and i:

\[ a_i(t) = \rho_{i-1,i}(t), \qquad T_j \le t < T_{j+1} \quad (2.44) \]

So, the quantities ai given by (2.30) to (2.35) are given in the new notation by ai(T0). If we then use the mapping from the ai to the βi, ai = exp[−βi ΔT], the problem can be recast in terms of what happens to the decay constants βi as time moves in between reset times. We therefore similarly define

\[ a_i(t) = \exp\left[-\beta_i(t)\,\Delta T\right] \quad (2.45) \]

With this definition it is reasonable to require that, to first order,

\[ \beta_i(t) = \beta_i(T_j) + \frac{t - T_j}{T_{j+1} - T_j}\,\big[\beta_{i-1}(T_j) - \beta_i(T_j)\big], \qquad T_j \le t \le T_{j+1} \quad (2.46) \]
So, the decay constants, βi(t), and hence the decorrelation factors, ai(t), and hence the whole correlation matrix, ρij(t), can all be expressed for any intermediate time between resets as a function of the initial decay constants and calendar time (and, of course, the reset times). The expression is very simple and linear in the integration time t. Depending on the functional form chosen for the instantaneous volatility function, it may still allow analytic integration of the covariance integral. But even if this is not the case, not everything is lost. If one wanted to avoid the numerical integration of the covariance elements, one could store in memory (computer memory, that is) pre-calculated correlation matrices evaluated at the intermediate time steps. This is not as awful as it sounds, because of the time homogeneity of the correlation matrix. One simply has to store as many correlation matrices as there are time steps per tenor (so, for a quarterly tenor and a monthly time step this would mean storing three correlation matrices). After the first reset, everything will look exactly the same again, at least for evenly spaced forward rates.
Chapter 3
The SABR Model
3.1 The SABR Model (and Why it is a Good Model)
The SABR model is very well explained in the paper by Hagan et al. (2002). This is an excellent and very clearly written paper, and we only report the main results here for completeness. Unlike much of the current literature, the approach by Hagan and colleagues does not just provide a clever approximation for an option pricing problem, but places the topic in a clear context of trading, hedging and modelling relevance. It explains clearly why, even for European options, ‘just fitting’ today’s market prices (no matter how well) is not good enough. One can compare the SABR model with another all-fitting approach (the local-volatility model). Despite the fact that the latter fits (by construction) the market even better, Hagan and his coauthors explain clearly why it is nonetheless not a good idea to hedge even European options using a local-volatility approach. This is because the latter predicts that the smile will move when the underlying moves in ways that are not borne out in market reality. This is an empirical issue, not a theoretical one that can be settled by looking at the quality of today’s fit to the market.
Let us expand on this point a bit, because we think it is a crucial one. The superiority of the SABR over the local-volatility model is not an a priori theoretical one. It could have been that the world was such that option prices change with the underlying as predicted by the local-volatility model. If this had been the case, there would be no value in the SABR model, and it would make sense to use the local-volatility approach. This is not a question that can be settled by looking at today’s smile (that tells us about changes in option prices versus strike, as opposed to changes in option prices as the underlying moves). Hedging has everything to do with smile dynamics and very little to do with the risk-neutral density (which is predicted to be virtually the same by the local-volatility and the SABR models). Once again, the reader would benefit from reading the original work by Hagan et al. where these ideas are clearly discussed. See also Rebonato (2002) and (2004a, especially Chapter 12) where one of us (RR) expands on these concepts at greater length than we want to do here.
3.2 Description of the Model
In the SABR model the underlying (in our case, a forward rate, ftT, of expiry T) follows the dynamics

\[ df_t^T = \left(f_t^T\right)^{\beta^T} \sigma_t^T \, dz_t^T \quad (3.1) \]

\[ d\sigma_t^T = \nu^T \sigma_t^T \, dw_t^T \quad (3.2) \]

\[ \mathbb{E}\left[ dz_t^T \, dw_t^T \right] = \rho^T \, dt \quad (3.3) \]

The model is fully specified once we add to the equations above the initial conditions f0T and σ0T. It is a CEV model augmented by stochastic volatility.
A few observations are in order.
1. We are working in the (terminal) measure, QT, under which both the forward rate and its volatility are martingales (driftless). We can always do this if we work with one forward rate in isolation at a time. Under this same measure, however, the process for another forward rate and for its volatility would not be driftless. We will derive the appropriate drifts in Chapter 4.
2. All the parameters of the model, νT, βT and ρT, are just constants, not functions of time.
3. All the parameters of the model, νT, βT and ρT, are specific to a particular forward rate. To make this clear we have appended the superscript T.
4. We have indicated that the increments dztT and dwtT are the increments of standard Brownian motions under the measure QT by appending the same superscript T. This should be understood as a shorthand notation for the increments under QT itself. Since each forward rate uniquely defines the terminal measure, QT, under which its process is driftless, there is no ambiguity in using the lighter notation.
5. Within the SABR model there is no way for the various forward rates to interact with each other (for instance, we cannot use the SABR model to price a path-dependent option). Each forward rate lives in its own measure and does not know anything about the other forward rates. The SABR model as it stands cannot describe the dynamics of a yield curve.
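For a single forward rate in isolation, however, the dynamics (3.1)–(3.3) are straightforward to simulate under the terminal measure, since both processes are driftless. A minimal Monte Carlo sketch (all parameter values are illustrative; note that σ0 is in CEV units, i.e., it multiplies f^β, and we drop the accrual/numeraire factors of an actual caplet for clarity):

```python
import math, random

def sabr_terminal(f0, sig0, beta, nu, rho, T, n_steps=12, n_paths=20000, seed=7):
    # Euler simulation of (3.1)-(3.3) under the terminal measure Q^T:
    # both the forward rate and its volatility are driftless.
    rng = random.Random(seed)
    dt = T / n_steps
    sq = math.sqrt(dt)
    out = []
    for _ in range(n_paths):
        f, sig = f0, sig0
        for _ in range(n_steps):
            z = rng.gauss(0.0, 1.0)
            # Correlated Brownian increments: E[dz dw] = rho dt.
            w = rho * z + math.sqrt(1.0 - rho * rho) * rng.gauss(0.0, 1.0)
            f = max(f + sig * (f ** beta) * z * sq, 0.0)       # absorb at zero
            sig *= math.exp(nu * w * sq - 0.5 * nu * nu * dt)  # exact lognormal step
        out.append(f)
    return out

# Undiscounted expectation E^T[(f_T - K)^+] for an at-the-money strike.
paths = sabr_terminal(f0=0.03, sig0=0.04, beta=0.5, nu=0.3, rho=-0.2, T=1.0)
caplet = sum(max(f - 0.03, 0.0) for f in paths) / len(paths)
```

As a sanity check, setting ν = 0 and β = 1 collapses the model to lognormal (Black) dynamics, and the simulated price can be compared with the Black formula.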