
Value at Risk for Options

A Dimension Reduction Approach to Model the Volatility Surface Shifts

Fredrik Gunnarsson

A thesis presented for the degree of Master of Science in Engineering Physics

Umeå University Spring 2019


Filtered Historical Simulation Value at Risk for Options
Fredrik Gunnarsson, frgu0031@student.umu.se

© Fredrik Gunnarsson, 2019

Supervisors: Anders Stäring, Cinnober
             Dennis Sundström, Cinnober
             Joakim Ekspong, Department of Physics

Examiner: Markus Ådahl, Department of Mathematics and Mathematical Statistics

Master of Science Thesis in Engineering Physics, 30 ECTS
Department of Physics

Abstract

The 2008 financial crisis was partly a result of ignorance of the risk in complex derivatives. As a consequence, a filtered historical simulation (FHS) Value at Risk (VaR) approach is becoming the standard way to determine the base initial margin for portfolios in cash equity and fixed income, and the strategy is to incorporate this method in the options market as well. Because of the complexity of the options market, the shifts in the volatility surface must be modeled, which is a data-intensive procedure. Therefore, it is important to study dimension reduction techniques that capture the dynamics of the volatility shifts while reducing the size of the data-set to decrease the computation time. This thesis treats the topic of reducing dimensions when modeling the volatility surface shifts, which is applied to scenario creation for estimating the FHS VaR of options. The treated dimension reduction methods consist of a general pivot model that applies point shifts, a Principal Component Analysis (PCA) model that builds the shifted surface from principal components, and an SVI model that shifts the parameters of the SVI parameterization. The methods are evaluated in terms of explanatory power in a PCA analysis, correlation of the profit and losses, and a VaR error estimate. An additional study covers how to establish arbitrage-free scenarios and the implications of doing so. We conclude that a 9-pivot model is an appropriate model for modeling the volatility shifts. The model captures the first two principal components, covering >95% of the explained variance, while keeping a limited VaR error, which is what we want to achieve with dimension reduction when estimating the FHS VaR.

Keywords — Volatility surface, Principal Component Analysis, Implied volatility, Value at Risk, Historical simulation, Volatility surface shifts, Dimension reduction, Options, Risk


Acknowledgements

First, I would like to thank Cinnober for letting me complete my thesis at their office in Stockholm. Cinnober provided me with an excellent working environment, brilliant colleagues and valuable insights into the topic as well as into the area of business. I would like to especially thank my supervisors Dennis Sundström and Anders Stäring for their continuous support and remarks during the thesis.

Finally, I would like to express my gratitude to my supervisor at Umeå University, Joakim Ekspong. He advised me on how to structure my thesis in an excellent manner.

Fredrik Gunnarsson, Umeå, May 2019

Contents

1 Background
2 Introduction
3 Theory
   3.1 Black-Scholes-Merton Model
      3.1.1 Black-Scholes Equation
      3.1.2 Implied Volatility
      3.1.3 Volatility Surface
      3.1.4 Arbitrage
   3.2 Stochastic Volatility Inspired model
      3.2.1 Calibration of parameters
   3.3 Value at Risk
      3.3.1 Value at Risk estimation
      3.3.2 Shifting methodologies
      3.3.3 Absolute shifts
      3.3.4 Logarithmic Shifts
      3.3.5 GARCH modeling
      3.3.6 Filtered Historical Simulation
      3.3.7 FHS VaR for options
   3.4 Principal Component Analysis
   3.5 Interpolation and extrapolation techniques
4 Methodology
   4.1 Shifting Methodologies
   4.2 Volatility surface shifts
      4.2.1 Single grid point/Parallel shift
      4.2.2 Multiple grid point shift
   4.3 Dimension reduction models
      4.3.1 Pivot model
      4.3.2 Parameter model
      4.3.3 PCA model
   4.4 Arbitrage free scenario creation
   4.5 Backtesting techniques
      4.5.1 Using PCA to validate the model
   4.6 Compare P&L correlation
      4.6.1 VaR error estimates
   4.7 Volatility surface dynamics
      4.7.1 Principal component analysis
      4.7.2 VaR Surface
      4.7.3 Contract concentration
      4.7.4 Black-Scholes volatility dependency
   4.8 Data and restrictions
5 Results and Discussion
   5.1 Interpolation Techniques
   5.2 Extrapolation techniques
      5.2.1 Moneyness dimension
      5.2.2 Time to maturity dimension
   5.3 Pivot model
      5.3.1 Specific pivots to apply the shifting on
   5.4 PCA model
   5.5 SVI Parameter shifts
6 Is it worth making the scenarios arbitrage-free?
7 Conclusion

Appendices
   A Mathematical interpretation of PCA
   B Area of extrapolation
   C Estimating implied volatility using Newton-Raphson
   D Interpolation and extrapolation theory
      D.1 Linear interpolation
      D.2 Flat extrapolation/interpolation
      D.3 Flat forward interpolation
      D.4 Cubic Hermite Interpolation
      D.5 Cubic Spline interpolation
      D.6 Nearest neighbour
   E Sticky Strike and sticky delta

Notation

Derivative ∼ Contract between two or more parties whose value is based on an underlying asset
T ∼ Time to maturity, the time remaining until a financial contract expires
K ∼ Strike price, the price at which the underlying instrument can be bought or sold at maturity
S ∼ Stock price
r ∼ Risk-free rate
Moneyness ∼ Stock price / Strike price = S/K
Clearing ∼ All activities from the time a commitment is made for a transaction until it is settled
CCP ∼ Central Counterparty Clearinghouse, the party that handles the clearing process
IM ∼ Initial Margin, covering potential future losses that would occur under adverse price movements while the transaction is being cleared
VS ∼ Volatility Surface, the volatility dependence implied by option prices over time to maturity and moneyness via the Black-Scholes formula
BS ∼ Black-Scholes model, used to calculate the theoretical value of European options
Arbitrage ∼ The simultaneous purchase and sale of financial contracts/assets with the ability to make a risk-free profit
Pivot ∼ The implied volatility change in time for a specific option, pivot = log(σ_t^imp(T, M)) − log(σ_{t−1}^imp(T, M))
OTC ∼ Over-the-counter, private contracts traded between two parties without third-party involvement, e.g., an exchange
Spot ∼ The current market price of an asset
ATM ∼ At the money, options with moneyness = 1, i.e., the stock price is equal to the strike price
Shift ∼ The change in the implied volatility surface over a period in time
P&L ∼ Profit and Loss = Current market value − Scenario value


1 Background

The 2008 financial crisis was to a high extent caused by risk undervaluation in the banks' striving for sophisticated derivatives, which were difficult to evaluate [1]. As a response to this, the Basel Committee on Banking Supervision released an updated version of its regulatory framework, putting pressure on banks to present a more thorough risk analysis. As part of the framework, a stressed Value at Risk (VaR) requirement was introduced as a portion of a comprehensive risk analysis [2]. VaR can be defined as an estimate of the potential worst α-percentile loss in a portfolio or asset (α is usually defined as the 99th percentile).

The post-financial-crisis era has transformed the landscape of risk management: a wider range of contracts are traded on electronic platforms and the importance of financial stability has increased. Risk systems were originally developed by banks, but due to the more complex environment, an external party that can meet the extreme criteria of speed and functionality is useful. From this, Cinnober developed a system for clearing financial transactions, TRADExpress™ RealTime Clearing [3]. This system is used by Central Counterparty Clearinghouses (CCPs), which insert themselves as the counterparty to both the buyer and the seller. The CCP estimates a base Initial Margin (IM) that the buyer and seller must post as collateral while the trade is being processed (cleared).

In the recent decade, a Filtered Historical Simulation (FHS) VaR approach has become the standard way to determine the IM component of a portfolio when clearing transactions. Clearing comprises the activities that occur from the time a transaction is committed until it is settled. A CCP can consequently be considered a middle party that takes care of the risk and the transaction of the securities. The IM then accounts for potential future losses that would occur under adverse price movements until the transaction has been cleared [4]. This margin is thus highly dependent on the market conditions, security types and margin horizon. FHS VaR is one approach to estimate the IM and is mainly used for markets such as cash equity and fixed income. For derivative markets consisting mainly of futures and options, SPAN is the most used method, but the trend is to move to VaR for these markets as well. The move to VaR is supported by Jo Burnham [5], who explains how major CCPs tend to move from calculating the IM with the SPAN approach to the FHS approach. Both Eurex Clearing and the London Metal Exchange are currently establishing these changes, while the regulatory framework is also moving towards FHS VaR. However, since FHS needs reasonable and robust scenarios based on historical data, it is more demanding to apply than SPAN, because the SPAN parameters can be determined in a more arbitrary way by the risk managers at the CCPs.

To model FHS VaR for options, historical volatility surface data over a considerably long look-back period must be considered in order to obtain robust estimates. Working with the full, varying volatility surfaces makes the method unfeasible to use in practice; consequently, dimension reduction is needed.

Dimension reduction is also needed to make the VaR estimation itself feasible in practice, since the estimations with large data-sets are computationally extensive.

In light of the large data-sets, we want to construct a method to model the volatility surface shifts (variations in the implied volatility over time) while capturing the majority of the movements in the data. By capturing the dominant part of the dynamics, we can produce an accurate estimate of FHS VaR for the IM estimations.


2 Introduction

The standard Black-Scholes model is well known for valuing option contracts and depends on the volatility as a parameter quantifying the risk associated with the returns of the underlying asset. The Black-Scholes model depends on the strike price (K) of the option contract, the risk-free rate (r), the time to maturity (T), the underlying stock price (S) and the volatility (σ). The one parameter in the formula that cannot be extracted easily from the contract information is the volatility. However, by using option contracts on the market we can back out the volatility with the Black-Scholes formula, which is therefore called the implied volatility (σ_imp). Considering the implied volatility for various contracts with the same underlying asset reveals variations of the volatility in two dimensions: strike price (which can be normalized to moneyness = S/K) and time to maturity. Before the market crash of 1987 the implied volatility was constant across time to maturity and moneyness, but after the crash the dependency shifted and the implied volatility now varies in both dimensions, which establishes the so-called volatility surface. Volatility surfaces have shapes similar to the one in Figure 1.

Figure 1: Example of an implied volatility surface, illustrating the dependence on time to maturity and moneyness.

Another characteristic of the volatility surface is that the shape changes over time, from day to day. Therefore, if the volatility surface is observed over time, shifts in the volatility can be identified (i.e., the volatility for a specific option, where the same time to maturity and moneyness is considered, changes continuously). The time-dependency can further be used in risk and pricing applications, because a change in the implied volatility impacts the option price and vice versa. Since this time-dependency is a strong factor for volatility surfaces, we will use the FHS VaR approach to estimate the impact that the implied volatility shifts have on option prices.

A problem with the FHS approach is that it requires the implied volatility for a specific option's time to maturity and moneyness over time, because every option changes its time to maturity and moneyness from one day to another. Consequently, you cannot follow a specific contract over time to estimate its VaR; rather, you need to use historical options with the same underlying, moneyness and time to maturity. But to find corresponding option contracts, an extensive number of contracts is needed. One approach to solving this problem is to exploit the dependency of the implied volatility such that a surface can be modeled like the one in Figure 1, where missing data points can be found by interpolation. Even though we could in theory model the full volatility surface and consider every single option in it, the amount of data this method would process implies computational difficulties in both time and storage capacity.

Because of the extensive amount of data needed to model the surface, many CCPs only consider the options with the highest liquidity when estimating the implied volatility shift used as a risk factor. The single point shift is then applied across all options with the same underlying derivative (this can be thought of as a level or parallel shift of the volatility surface). Applying a parallel shift can be considered a naive approach when estimating VaR for options.

Ideally, the complete volatility surface and the respective shifts would be used when estimating VaR and building the scenarios; however, this is a data-consuming process that makes it complex and hard to perform. To put it in perspective, when estimating VaR for equities there is one time series per underlying asset. When considering options, every underlying asset has multiple moneyness points and expiration dates, making the process multidimensional; it is not unusual to have > 100 contracts per underlying asset. The consequence is that a robust method is needed to estimate the return of the implied volatility based on the volatility surface without losing a significant amount of information. Doris Dobi [6] approached this problem using a pivot model and successfully reduced the number of dimensions in the data, which was further backtested with a PCA analysis of the shifts. She advised a 9-pivot model for dimension reduction. However, Dobi never questioned this method from a risk perspective when estimating FHS VaR, and did not examine which options (time to maturity and moneyness) to apply the pivot model on, nor which interpolation/extrapolation techniques to consider. (A pivot is defined in this thesis as a single point shift considering a specific option on the volatility surface.)

The Option Clearing Corporation (OCC) constructed a method similar to Dobi's for dimension reduction, estimating the implied volatility scenarios based on 9 pivot points in the historical volatility surface. The point of interest's (poi's) daily volatility is estimated in the surface by linear interpolation; the scenario can then be estimated by applying the closest shock from the 9-pivot model to the daily volatility, which can be thought of as flat interpolation. Through this approach, the model is able to capture the potential volatility exposure associated with a portfolio given changes in the level, skewness, convexity and term structure of the implied volatility surface. In general, OCC observes an overall 9% increase in risk margin requirements under this approach compared to the SPAN approach. [7]


Another method, similar to OCC's but with some simplifications, is used by NASDAQ, who made a detailed overview of the IM calculation with a historical simulation approach. The approach is to extract a discrete set of volatility points, or risk factors, where the estimated shifts are calculated, and then apply the shifts to today's market volatility level to obtain a grid of scenario volatilities. To find the poi's scenarios they interpolate in the scenario grid; note that this differs from Dobi's approach, as the interpolation is done in the scenario grid and not in the grid of shifts. [8]

Multiple studies also examine the characteristics of and the movement in the volatility surface over time, and show that the volatility surface dynamics can be modeled with a PCA analysis in terms of dimension reduction. The most well-known, by Rama Cont and Jose da Fonseca [9], shows that the variance in the daily observations is well captured within the first principal components to an extent of 95%; the study was done on S&P 500 options. This result is also supported by Michael Kamal and Jim Gatheral, who give an overview of studies by Goldman Sachs, Merrill Lynch and Cont et al. that applied PCA analysis to different indexes. The final conclusion is that the principal components can be explained by a level mode (accounting for >80% of the variation), a term structure variation, and a skew: one where strikes below the spot price move in the opposite direction from those above and where the overall magnitude is attenuated as the term increases. The observations made in the paper confirm that three principal components are enough to characterize the change in the implied volatility. Note that the above studies were performed on major index options, which may be traded differently than equity options. [9]

It is important to note that the Regulatory Technical Standards specify that the IM shall cover the exposures arising from market movements over a minimum historical look-back period of the latest 12 months. CCPs shall ensure that the data used to calculate the IM capture a full range of market conditions, including periods of stress (extreme but plausible market conditions). Exposures observed during the look-back period shall be covered at a minimum confidence level of 99.5% for OTC derivatives and 99.0% for other financial instruments. This adds to the challenge confronting the FHS VaR method, since at least 12 months of historical data are required. [10]

The objective of the thesis is to create robust shifts in the surface that can be used as a historical scenario generator to estimate VaR. These shifts need to be approximated from the volatility surface while considering only a fraction of the full data from the surface. In turn, dimension reduction needs to be applied to reduce the number of points that must be extracted from the volatility surface to obtain a shifted surface containing the majority of the movements. To define a robust method in terms of adaptivity for high/low volatility options, accuracy and efficiency, a framework needs to be established and thoroughly tested. This thesis will go through the currently used methods and establish new ones, interpret the methods from a risk perspective, and consequently suggest an approach to the dimension reduction problem that is adaptive, accurate and efficient.

3 Theory

In this section, we cover the concepts of the Black-Scholes (BS) formula for option pricing, the construction of the volatility surface, PCA, and interpolation techniques, in order to obtain robust theoretical tools to use in the scenario creation and shifting models.

3.1 Black-Scholes-Merton Model

In the early 1970s, Fischer Black, Myron Scholes, and Robert Merton made a breakthrough in option valuation theory and developed what is known as the Black-Scholes model, sometimes known as the Black-Scholes-Merton model, which is widely used by traders and market practitioners to price derivatives. The full derivation can be found in John C. Hull, Options, Futures and Other Derivatives [11].

3.1.1 Black-Scholes Equation

The Black-Scholes equation is given in Equation 1,

∂C_BS/∂T + (1/2) σ² S² ∂²C_BS/∂S² + r S ∂C_BS/∂S = r C_BS,   (1)

where C_BS is the option's call price, S is the current stock price, K is the strike price of the option, r is the risk-free interest rate, σ is the volatility of the underlying asset and T is the time to maturity. Note that Equation 1 is a partial differential equation. The solution to this equation gives us the well-known Black-Scholes formula, which is stated in Equation 2. The theory and the building blocks behind the BS model are not required for the objective of this thesis, so we simply provide Equation 2,

C_BS = Φ(d_1) S − Φ(d_2) K e^{−rT},   (2)

where

d_1 = ( log(S/K) + T (r + σ²/2) ) / (σ√T),

d_2 = ( log(S/K) + T (r − σ²/2) ) / (σ√T),

and Φ is the cumulative normal distribution function,

Φ(x) = (1/√(2π)) ∫_{−∞}^{x} e^{−z²/2} dz.

The Black-Scholes formula is a powerful tool in finance and allows us to calculate the theoretical price of European call and put options for all equity contracts based on known market parameters. If we know the European call option price for a certain strike price and date, we can deduce the value of a European put option with the same strike and date, and vice versa; this relation is known as put-call parity, defined in Equation 3,

C + K e^{−rT} = P + S,   (3)

where P is the European put price.
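
To make the pricing formulas concrete, the sketch below implements Equation 2 and obtains the put price via the put-call parity in Equation 3. It is an illustrative example only; the function names and the sample parameter values are our own, not taken from the thesis.

import numpy as np
from scipy.stats import norm

def bs_call_price(S, K, r, T, sigma):
    """Black-Scholes price of a European call (Equation 2)."""
    d1 = (np.log(S / K) + T * (r + 0.5 * sigma**2)) / (sigma * np.sqrt(T))
    d2 = d1 - sigma * np.sqrt(T)
    return norm.cdf(d1) * S - norm.cdf(d2) * K * np.exp(-r * T)

def bs_put_price(S, K, r, T, sigma):
    """European put via put-call parity (Equation 3): P = C + K e^{-rT} - S."""
    return bs_call_price(S, K, r, T, sigma) + K * np.exp(-r * T) - S

# Example with hypothetical market parameters
S, K, r, T, sigma = 100.0, 105.0, 0.01, 0.5, 0.20
print(bs_call_price(S, K, r, T, sigma), bs_put_price(S, K, r, T, sigma))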


3.1.2 Implied Volatility

From the Black-Scholes formula, every parameter except one can easily be obtained from the daily market data. This is where the implied volatility comes into consideration, which is the volatility implied by the option prices observed in the market when the BS model is used in reverse, as defined in Equation 4,

C_Market = C_BS(T, σ_imp, r, S, K),   (4)

where C_Market is the option price observed in the market.

However, this is not as straightforward as simply inverting the BS formula to find σ_imp, because the formula is not analytically invertible. Therefore, we need to use an iterative method such as Newton-Raphson, which finds the root of the equation to obtain the unknown. Whereas the volatility in the BS model is backward-looking (using historical prices to estimate volatility), the implied volatility is forward-looking. Given that the implied volatility is forward-looking, traders often quote the implied volatility rather than the option price, which is convenient since the implied volatility tends to be less variable than the option price [11]. The theory for estimating the implied volatility with Newton-Raphson can be found in Appendix C.

From now on, the implied volatility σ_imp will be denoted σ in equations and should not be confused with the volatility in the Black-Scholes formula.
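
As an illustration of the inversion just described, the sketch below recovers σ_imp from a market price with a Newton-Raphson iteration, using the Black-Scholes vega as the derivative. The thesis details its own procedure in Appendix C; the function below is a generic sketch with hypothetical names and an arbitrary starting guess.

import numpy as np
from scipy.stats import norm

def bs_call_price(S, K, r, T, sigma):
    d1 = (np.log(S / K) + T * (r + 0.5 * sigma**2)) / (sigma * np.sqrt(T))
    d2 = d1 - sigma * np.sqrt(T)
    return norm.cdf(d1) * S - norm.cdf(d2) * K * np.exp(-r * T)

def implied_vol(C_market, S, K, r, T, sigma0=0.2, tol=1e-8, max_iter=100):
    """Newton-Raphson solve of C_Market = C_BS(T, sigma, r, S, K) for sigma (Equation 4)."""
    sigma = sigma0
    for _ in range(max_iter):
        d1 = (np.log(S / K) + T * (r + 0.5 * sigma**2)) / (sigma * np.sqrt(T))
        vega = S * norm.pdf(d1) * np.sqrt(T)          # dC/dsigma
        diff = bs_call_price(S, K, r, T, sigma) - C_market
        if abs(diff) < tol:
            break
        sigma -= diff / vega                          # Newton-Raphson update
    return sigma

# Round-trip check: price an option at 25% volatility and recover it
price = bs_call_price(100.0, 95.0, 0.01, 0.25, 0.25)
print(implied_vol(price, 100.0, 95.0, 0.01, 0.25))    # approximately 0.25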

3.1.3 Volatility Surface

When analyzing historical option prices for a specific underlying asset and estimating the implied volatility, we can see that the volatility changes in the moneyness and time to maturity dimensions. This means that the BS model does not fully explain the option price, since it fails to capture movements in the time to maturity and moneyness dimensions, i.e., the BS model assumes constant volatility across moneyness and time to maturity.

The volatility surface maps the option volatility in the time to maturity and moneyness dimensions and can therefore be used to price options with any maturity and moneyness. A sample from a volatility surface is shown in Table 1, where the data differs in both dimensions; the dependence of the surface in these dimensions lays the ground for how to model the shifts.


Table 1: Data sample for an implied volatility surface [11], where the columns specify moneyness and the rows specify time to maturity.

T \ M      0.90   0.95   1.00   1.05   1.10
1 month    14.2   13.0   12.0   13.1   14.5
3 month    14.0   13.0   12.0   13.1   14.2
6 month    14.1   13.3   12.5   13.4   14.3
1 year     14.7   14.0   13.5   14.0   14.8
2 year     15.0   14.4   14.0   14.5   15.1
5 year     14.8   14.6   14.4   14.7   15.0

Keeping in mind that option data is far from perfect, robust methods need to be used to interpolate and build such a surface. Building the volatility surface is not part of the objective of this thesis, but the thesis bases its studies on the assumption that robust volatility surfaces exist; consider Adam Öhman's master thesis at Cinnober [12] for insights into the topic. The dependence of the volatility surface in the moneyness dimension is often known in the literature as a volatility "smile" or "skew", names that resemble its appearance. In the time to maturity dimension, the dependency is close to linear, which is easier to model than a "smile/skew", indicating that it is, in theory, easier to reduce dimensions in time to maturity than in moneyness.

3.1.4 Arbitrage

The existence of arbitrage means that a portfolio can be constructed with no initial investment and no downside risk, while the probability of making a profit is > 0. There are two types of arbitrage: static and dynamic. Static arbitrage means that there is a probability > 0 of creating a portfolio today that leads to arbitrage. Dynamic arbitrage means that there is an arbitrage possibility if the portfolio is allowed to move in time. Michael Roper [13] defined sufficient conditions on the volatility surface for the surface to be free of static arbitrage. The conditions are based on the following assumptions:

• Call options are perfectly liquid with non-negative prices defined over the whole spectrum K > 0, T > 0.

• Interest rates and dividend yields are considered to be zero.

• There are no transaction costs.

Let Ξ be a function connected to the implied volatility such that

Ξ(x, T) = √T σ(K, T),   (5)

where x = log(K/S). Six conditions on Ξ can be used to guarantee the absence of static arbitrage; the conditions are stated in IV1-IV6 below.

IV1 (Smoothness) for every T > 0, Ξ(·, T) is twice differentiable.

IV2 (Positivity) for every x ∈ ℝ and T > 0, Ξ(x, T) > 0.

IV3 (Durrleman's condition) for every T > 0 and x ∈ ℝ,

0 ≤ (1 − x ∂_x Ξ / Ξ)² − (1/4) Ξ² (∂_x Ξ)² + Ξ ∂²_{xx} Ξ.   (6)

IV4 (Monotonicity in T) for every x ∈ ℝ, Ξ(x, ·) is non-decreasing.

IV5 (Large moneyness behaviour) for every T > 0, lim sup_{x→∞} Ξ(x, T)/√(2x) ∈ [0, 1).

IV6 (Value at maturity) for every x ∈ ℝ, Ξ(x, 0) = 0.

Fulfilling conditions IV1-IV6 for Ξ implies that the volatility surface is free of static arbitrage.

3.2 Stochastic Volatility Inspired model

Numerous models have attempted to parameterize the volatility surface in such a way that the arbitrage conditions are satisfied while keeping the deviation from the market data to a minimum. One of the most popular models to capture the volatility surface is the Stochastic Volatility Inspired (SVI) parametrization of the implied volatility smile [14].

The SVI parametrization of the implied volatility smile was introduced in 1999 and is commonly used throughout volatility surface construction [14]. Some extensions of the model that preserve the arbitrage relations in the options market have been developed, such as the Surface Stochastic Volatility Inspired (SSVI) model. We will not go into detail on the various models, but what is important to understand here is that each of these models' parameters is fitted from market data; consequently, the data can be optimized to best correspond to the SVI parameters a, b, ρ, σ and m. The SVI model is defined in Equation 7, where T σ(M)² is known as the total implied variance,

T σ(M)² = a + b ( ρ(M − m) + √((M − m)² + s²) ).   (7)

Every parameter has a distinctive effect on the smile: a affects the vertical alignment, m the horizontal alignment by defining the middle point of the smile, b affects the angle between the tails, ρ rotates the smile, s affects the curvature around m, and σ(M) is the volatility for the corresponding moneyness M. With these parameters given for different times to maturity we can build up the volatility surface in an efficient manner and interpolate to find points between the smiles.


3.2.1 Calibration of parameters

The procedure for fitting Equation 7 to market data is to optimize the parameters such that the error between the model and the data remains limited. However, before performing the optimization we need to define a few restrictions on the parameters to keep the model arbitrage-free. There is evidence [13] that the curvature at the money is positive, which implies that

s² > 0.

The same fact also supports that

b ≥ 0.

Furthermore, the total implied variance should not be greater than the largest observed total implied variance, i.e.,

a ≤ max{T σ(M)²}.

Similarly, we also impose the relation

2 min{M_i} ≤ m ≤ 2 max{M_i}.

Finally, since ρ is the correlation parameter it is bounded to [−1, 1] by the definition of correlation. The conditions above need to be satisfied for the SVI model; additional restrictions can be defined to make the optimization more efficient, see [12].
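
As an illustration of the calibration just described, the sketch below fits Equation 7 to a single observed smile by bounded least squares, with box bounds encoding the restrictions on b, ρ, s² and a. Everything here (function names, the synthetic smile, the initial guess) is our own illustrative choice, not the calibration routine used in the thesis.

import numpy as np
from scipy.optimize import least_squares

def svi_total_variance(M, a, b, rho, m, s):
    """Total implied variance T*sigma(M)^2 from the SVI parameterization (Equation 7)."""
    return a + b * (rho * (M - m) + np.sqrt((M - m) ** 2 + s ** 2))

def fit_svi(M_obs, w_obs):
    """Fit the SVI parameters (a, b, rho, m, s) to one smile of observed total variances."""
    def residuals(p):
        a, b, rho, m, s = p
        return svi_total_variance(M_obs, a, b, rho, m, s) - w_obs

    # Bounds reflecting the restrictions in the text: b >= 0, rho in [-1, 1], s^2 > 0,
    # a <= max observed total variance. The additional bound on m relative to the
    # observed moneyness range is omitted here for simplicity.
    lower = [-np.inf, 0.0, -1.0, -np.inf, 1e-6]
    upper = [float(w_obs.max()), np.inf, 1.0, np.inf, np.inf]
    p0 = [float(w_obs.min()), 0.1, 0.0, 1.0, 0.1]
    return least_squares(residuals, p0, bounds=(lower, upper)).x

# Synthetic one-smile data for illustration (T = 0.5 years, volatilities in decimal form)
M_obs = np.array([0.90, 0.95, 1.00, 1.05, 1.10])
w_obs = 0.5 * np.array([0.142, 0.130, 0.120, 0.131, 0.145]) ** 2
print(fit_svi(M_obs, w_obs))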

3.3 Value at Risk

VaR models play a core role in risk management and all the various VaR models have the same aim: estimating the potential losses of an asset or a diverse set of assets. VaR is usually calculated from the historical price returns; one can then model the worst (1 − α)% outcome of the historical returns.

There are different ways of defining VaR, and numerous methods to estimate it. In this thesis, we use the FHS method, which ensures that the Profit and Losses (P&L) follow a Gaussian (normal) distribution over a given time period. VaR is thereby defined as the worst α-percentile of the distribution. The filtering part takes place when modeling the time series for volatility clusters, which is explained in more detail in Section 3.3.6 [15]. The definition of VaR is stated in Definition 1.

Definition 1 VaR is the maximum potential loss that an asset or portfolio X can suffer within a fixed confidence level (cl) during a holding period.

VaR_cl(X) = sup{ x | P(X ≥ x) > cl }   (8)
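
For a concrete picture of Definition 1, the sketch below estimates a historical VaR as an empirical quantile of a P&L vector. The confidence level and the synthetic loss distribution are arbitrary illustrative choices; they are calibrated so that the 95% figure roughly matches the 3.75% loss mentioned for Figure 2.

import numpy as np

def historical_var(pnl, cl=0.99):
    """Empirical VaR: the loss exceeded in only (1 - cl) of the scenarios.

    pnl is a vector of profit-and-loss values where losses are negative.
    """
    return -np.quantile(pnl, 1.0 - cl)

# Synthetic daily P&L in percent
rng = np.random.default_rng(0)
pnl = rng.normal(loc=0.0, scale=2.28, size=10_000)
print(historical_var(pnl, cl=0.95))   # close to 3.75 for this scale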


In Figure 2 we illustrate the VaR estimation from a set of returns; the returns are given as percentage loss per day, and the bell curve reflects the frequency of the returns. For a 95% VaR, we estimate where the limit for the worst 5% of the returns is located. In Figure 2, 5% corresponds to a 3.75% loss, i.e., the number of losses that exceed ≈ 3.75% is 5% of the total number of returns.

Figure 2: Value at Risk visualized. The returns are assumed to be normally distributed and VaR is estimated as the return at, in this example, the 95th worst percentile of the distribution.

3.3.1 Value at Risk estimation

When estimating VaR for a specific option using the implied volatility as the risk factor, the volatility surface is used over time to find the volatility and its returns. The return series needs to fulfill the stationarity conditions (defined in Section 3.3.2) for the estimation to be accurate in different market conditions. To construct the return series such that it satisfies the stationarity conditions, the time series needs to be filtered; the filtering is explained in Section 3.3.6.

The historical price returns can then be applied to the current market condition. Suppose that ∆σ_i = log(σ_i) − log(σ_{i−1}) is the logarithmic return of a specific daily change of the implied volatility, which we can apply to the market volatility of the current day. This procedure creates a scenario for a possible change in the future, see Equation 9,

σ_Scenario,i = σ_T e^{∆σ_i},   (9)


which results in a scenario vector consisting of n log-difference scenarios,

(σ_Scenario,1, σ_Scenario,2, ..., σ_Scenario,n)ᵀ.

The scenario vector is consequently used to create the P&L distribution as

P&L_i = BS(S, K, r, T, σ_T) − BS(S, K, r, T, σ_Scenario,i),

where σ_T is the implied volatility estimate at the valuation day and σ_Scenario,i is the i-th implied volatility scenario. This yields a P&L vector for estimating VaR,

(P&L_1, P&L_2, ..., P&L_n)ᵀ,

where VaR is the α-quantile of the P&L distribution.

3.3.2 Shifting methodologies

One of the fundamental requirements when estimating VaR is that the time series or the P&L distribution is stationary; otherwise, predictions and risk metrics may be over- or underestimated because of the market conditions during the period under consideration. In light of this, we define the requirements of stationarity for time series in Definition 2.

Definition 2 The process {X_t; t ∈ Z} is said to be weakly stationary if

• E(X_t) = µ ∀t

• E|X_t|² < ∞ ∀t

• γ(t, t + h) = γ(h) is independent of t

where γ is the covariance function. This definition states that a stationary process {X_t} must satisfy three properties: a constant first moment, a finite second moment and a covariance function that is independent of t. These requirements imply that the series fluctuates with a constant variation around a fixed level. When considering the implied volatility time series, we require that the shifts are stationary before deploying them as scenarios.

3.3.3 Absolute shifts

To use the 1-day historical VaR scenario generation, the shifts of the risk factors need to be stationary with equal mean and variance. An absolute shift can be applied depending on the dynamics of the data; in the case of absolute shifts, the risk factor dynamics need to be described by a random walk,

x_t = µ + x_{t−1} + ε_t,   ε_t ∼ iid(0, σ²).


The random walk suggests that the future risk factor is equal to the present factor plus a constant drift and a random shock (ε_t) with mean zero and constant variance. To model the scenarios under the requirements in Definition 2, we calculate the scenarios according to Equation 10,

x_Scenario,j = x_T + ∆x_j,   (10)

where j indicates the scenario number, and ∆x_j is defined in Equation 11,

∆x_j = x_{t+1−j} − x_{t−j}.   (11)

3.3.4 Logarithmic Shifts

Instead of using absolute shifts we can use logarithmic shifts for scenario creation. Logarithmic shifts are useful for time series where the shocks of the logarithm of the risk factor have an identical distribution with variance σ². This implies that the logarithm of the risk factor is given by a random walk,

log(x_t) = φ + log(x_{t−1}) + ε_t,   ε_t ∼ iid(0, σ²),

which implies that the future risk factor depends on the present one plus a constant drift and a random shock (ε_t). The logarithmic shifts determine the scenarios for the future risk factors by Equation 12,

x_Scenario,j = x_T e^{r_j},   (12)

where r_j = log(x_{t−j}/x_{t−1−j}). If the risk factor has an identical mean but a time-varying conditional variance, there is an advantage in applying a filtering method such as GARCH(1,1), defined in Section 3.3.5, to obtain stationary data.
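
To make the two shifting schemes concrete, the sketch below builds absolute-shift scenarios (Equations 10-11) and logarithmic-shift scenarios (Equation 12) from a historical series of a risk factor. The array names and the synthetic history are illustrative assumptions only.

import numpy as np

def absolute_shift_scenarios(history, x_today):
    """Equations 10-11: apply historical absolute changes to today's level."""
    deltas = np.diff(history)            # x_{t+1} - x_t for each historical day
    return x_today + deltas

def log_shift_scenarios(history, x_today):
    """Equation 12: apply historical log-returns to today's level."""
    log_returns = np.diff(np.log(history))
    return x_today * np.exp(log_returns)

# Synthetic implied volatility history and today's value (illustrative)
rng = np.random.default_rng(1)
history = 0.20 * np.exp(np.cumsum(rng.normal(0.0, 0.02, size=250)))
sigma_today = history[-1]
print(absolute_shift_scenarios(history, sigma_today)[:5])
print(log_shift_scenarios(history, sigma_today)[:5])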

3.3.5 GARCH modeling

The fundamental concept of GARCH modeling stems from the work of the 2003 Nobel prize winner Robert F. Engle and is built upon the fact that financial data is very particular from a time series analysis point of view [16]. Unlike other types of data, it is highly unpredictable (unfortunately) and cannot be considered to have constant variance, as some other linear time series do. Instead, financial time series are what is called heteroscedastic, meaning that the variance varies over time. GARCH modeling takes this into consideration and produces what is called a conditional variance [16]. Definition 3 defines the GARCH(p,q) model.

Definition 3 {X_t} is called a GARCH(p, q) process if it is stationary and

X_t = h_t Z_t,   {Z_t} ∼ iid(0, 1),   (13)

h_t² = α_0 + α_1 X_{t−1}² + ... + α_p X_{t−p}² + β_1 h_{t−1}² + ... + β_q h_{t−q}²,   (14)

α_0 > 0,   α_i > 0 for i = 1, ..., p,   β_i > 0 for i = 1, ..., q.

(21)

By applying a GARCH(1, 1) model to a set of returns, one can obtain the conditional variance h_t² of the corresponding time series through Equation 14. h_t² can then be used to filter volatility clusters out of the series if there are variations in the conditional variance [16]; this is where the FHS process comes into consideration.
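
The sketch below evaluates the GARCH(1,1) special case of Equation 14 as a simple recursion, given already-estimated coefficients. The coefficient values and the synthetic return series are purely illustrative; in practice the coefficients would be fitted to the data (for example by maximum likelihood), which is outside the scope of this sketch.

import numpy as np

def garch11_conditional_variance(returns, alpha0, alpha1, beta1):
    """GARCH(1,1) recursion (Equation 14 with p = q = 1): h_t^2 = a0 + a1*X_{t-1}^2 + b1*h_{t-1}^2."""
    h2 = np.empty_like(returns)
    h2[0] = returns.var()                      # a common initialization choice
    for t in range(1, len(returns)):
        h2[t] = alpha0 + alpha1 * returns[t - 1] ** 2 + beta1 * h2[t - 1]
    return h2

# Illustrative log-return series and coefficients
rng = np.random.default_rng(2)
returns = rng.normal(0.0, 0.02, size=500)
h2 = garch11_conditional_variance(returns, alpha0=1e-6, alpha1=0.08, beta1=0.90)
print(np.sqrt(h2[-5:]))                        # conditional volatility on the last days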

3.3.6 Filtered Historical Simulation

The plain historical simulation approach is only unproblematic when the correlation and volatility are constant, which is rarely the case for time series of financial contracts. Therefore, for financial contracts, and when the volatility is time dependent, we apply the FHS approach defined in this section.

In FHS we filter the historical returns such that they fulfill the stationarity conditions and the requirements in Definition 2 [17]. The stationarity condition is important for accurately estimating VaR under the current market dynamics, such as the volatility level (the amount the time series fluctuates over time). As an example, an increase in portfolio volatility during high-risk measurement periods strongly affects the outcome of the portfolio's VaR. To grasp the concept of conditional variance, we illustrate the variations in historical returns in Figure 3, where we observe that the returns are clustered in periods and the time series is heteroscedastic, which would affect the VaR estimation. In light of this behavior, we filter the time series to obtain uniform variations in the data and a stationary time series.

Figure 3: Historical return time series (x-axis: time in days; y-axis: daily returns of implied volatility) where clustering patterns can be identified, with low- and high-frequency periods marked.

To obtain stationarity, the first step of the filtering process is to normalize the returns to the current market volatility,

η_t = ε_t / h_t,

where η_t is the return normalized for volatility clusters and h_t² is the conditional variance, usually a GARCH(1,1) estimate for financial time series, defined in Section 3.3.5. ε_t is often defined as

ε_t = r_t − E[r],

to adjust for a mean deviating from zero. In turn, the time series can be brought to the market level of the current day such that

Y_t = E[r] + η_t h_T,

where h_T is the conditional volatility for the current market condition.

3.3.7 FHS VaR for options

The necessary theory is now defined and we can introduce how to estimate FHS VaR for the specific case of options. The procedure for estimating this is stated in steps 1-6:

1. Historical implied volatilities are considered available for a specific time to maturity (T) and moneyness (M) over a period of time.

2. Use the historical volatility to estimate log shifts for the daily changes,

∆_i σ(M, T) = log(σ_i(M, T)) − log(σ_{i−1}(M, T)).

3. Apply a GARCH(1,1) model (Equation 14) to adjust for time-varying variance in the time series of log shifts; this step brings the historical time series to the same variance level as today's market. With h_i the GARCH(1,1) estimate of the conditional volatility of the implied volatility shifts, the filtered shifts are

Y_i = E[∆_{1:N} σ(M, T)] + ( ∆_i σ(M, T) − E[∆_{1:N} σ(M, T)] ) / h_i · h_T.

4. Apply every Y_i to today's market volatility to establish possible scenarios for the next day,

σ_Scenario,i(M, T) = σ_Today(M, T) e^{Y_i}.   (15)

5. Estimate the change in the option price implied by every scenario to create a P&L vector,

P&L_i = BS(σ_Today(M, T)) − BS(σ_Scenario,i(M, T)).   (16)

6. From the P&L vector and the distribution of the returns, an estimate of the VaR α-quantile for a specific option (T, M) can be made, see Definition 1.

Following this procedure we can estimate FHS VaR for options. However, steps 1-2 are not easily accomplished, which introduces the need to reduce the dimensions of the volatility surface. We want to estimate these log-returns (shifts) with a minimal amount of data from the volatility surface.
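
The sketch below strings steps 1-6 together for a single option, assuming a historical implied volatility series for the relevant (T, M) point is already available. It reuses a simple Black-Scholes call price and a fixed-coefficient GARCH(1,1) filter; all names, coefficients and the synthetic history are illustrative assumptions rather than the thesis's exact implementation.

import numpy as np
from scipy.stats import norm

def bs_call(S, K, r, T, sigma):
    d1 = (np.log(S / K) + T * (r + 0.5 * sigma**2)) / (sigma * np.sqrt(T))
    d2 = d1 - sigma * np.sqrt(T)
    return norm.cdf(d1) * S - norm.cdf(d2) * K * np.exp(-r * T)

def fhs_var_option(iv_history, S, K, r, T, cl=0.99,
                   alpha0=1e-6, alpha1=0.08, beta1=0.90):
    """FHS VaR for one option following steps 1-6 (illustrative sketch)."""
    # Step 2: daily log shifts of the implied volatility
    shifts = np.diff(np.log(iv_history))
    # Step 3: GARCH(1,1) conditional variance and rescaling to today's level
    h2 = np.empty_like(shifts)
    h2[0] = shifts.var()
    for t in range(1, len(shifts)):
        h2[t] = alpha0 + alpha1 * shifts[t - 1] ** 2 + beta1 * h2[t - 1]
    h = np.sqrt(h2)
    mean_shift = shifts.mean()
    Y = mean_shift + (shifts - mean_shift) / h * h[-1]
    # Step 4: scenario volatilities from today's implied volatility
    sigma_today = iv_history[-1]
    sigma_scenarios = sigma_today * np.exp(Y)
    # Step 5: P&L vector from repricing under each scenario (P&L = current - scenario value)
    pnl = bs_call(S, K, r, T, sigma_today) - bs_call(S, K, r, T, sigma_scenarios)
    # Step 6: VaR as the alpha-quantile of the P&L (loss) distribution
    return np.quantile(pnl, cl)

# Synthetic implied volatility history for one (T, M) point
rng = np.random.default_rng(3)
iv_history = 0.25 * np.exp(np.cumsum(rng.normal(0.0, 0.02, size=500)))
print(fhs_var_option(iv_history, S=100.0, K=100.0, r=0.01, T=0.25))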

3.4 Principal Component Analysis

PCA is a multivariate technique used to emphasize variation and bring out the strongest explanatory vectors in a data-set; it is commonly used to make data easier to explore and visualize. The benefit of PCA is that the reduced set of variables still contains the majority of the information in the original data-set. PCA maintains the information by transforming a data-set of possibly correlated variables into a set of uncorrelated variables called principal components or factor loadings. The first principal component accounts for the vast majority of the movement in the data, whereas the sequential components explain the largest remaining variations.

The mathematical interpretation of PCA is found in Appendix A.
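
As an illustration of how PCA can be applied to a matrix of daily volatility shifts, the sketch below centers a shift matrix (days × grid points), extracts the principal components with a singular value decomposition, and reports the explained variance ratio of the leading components. The synthetic shift matrix is an illustrative stand-in for real surface data.

import numpy as np

def pca(shift_matrix, n_components=3):
    """PCA via SVD: rows are days, columns are volatility grid points."""
    X = shift_matrix - shift_matrix.mean(axis=0)       # center each grid point's shifts
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    explained = s**2 / np.sum(s**2)                    # explained variance ratio
    loadings = Vt[:n_components]                       # factor loadings (principal components)
    scores = X @ loadings.T                            # projection of each day on the components
    return loadings, scores, explained[:n_components]

# Synthetic shift matrix: 250 days x 9 grid points with a dominant parallel (level) mode
rng = np.random.default_rng(4)
level = rng.normal(0.0, 0.03, size=(250, 1))
shifts = level + rng.normal(0.0, 0.01, size=(250, 9))
loadings, scores, explained = pca(shifts)
print(explained)          # the first component should dominate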

3.5 Interpolation and extrapolation techniques

Now that the volatility and the enclosing theory are defined, we can continue with the more general theory of interpolation and extrapolation. The reason for this section is that interpolation and extrapolation will be used to find the poi's in the shifted surface or matrix, and consequently to grasp the concepts of the shifting models and the scenario applications. The interpolation techniques that are used to find unknown data points are tabulated in Table 2. The mentioned techniques are explained in more theoretical detail in Appendix D.

Table 2: The differences of the interpolation methods in terms of computational effort and continuity.

Interpolation technique        Computational effort                                                  Continuity
Linear                         Requires more memory and computation time than nearest neighbour     C0
Nearest neighbour              Fastest computation time                                              Discontinuous
Next                           Same memory requirements and computation time as nearest neighbour   Discontinuous
Previous                       Same memory requirements and computation time as nearest neighbour   Discontinuous
Flat forward                   Same computation time and memory requirements as linear              C0
Cubic Hermite interpolation    Requires more memory and computation time than linear                C1
Cubic spline interpolation     Most expensive to compute                                             C1
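
The sketch below exercises several of the techniques in Table 2 on a single volatility smile using SciPy; it is only meant to show how the different schemes behave at a point of interest. The sample smile is taken from the 1-month row of Table 1, and the chosen poi is arbitrary.

import numpy as np
from scipy.interpolate import interp1d, PchipInterpolator, CubicSpline

# 1-month smile from Table 1: implied volatility (%) versus moneyness
moneyness = np.array([0.90, 0.95, 1.00, 1.05, 1.10])
iv = np.array([14.2, 13.0, 12.0, 13.1, 14.5])
x = 1.02   # point of interest between the grid points

linear = interp1d(moneyness, iv, kind="linear")          # C0
nearest = interp1d(moneyness, iv, kind="nearest")        # discontinuous
previous = interp1d(moneyness, iv, kind="previous")      # discontinuous
hermite = PchipInterpolator(moneyness, iv)               # C1, shape-preserving cubic Hermite
spline = CubicSpline(moneyness, iv)                      # smooth cubic spline, most expensive

for name, f in [("linear", linear), ("nearest", nearest),
                ("previous", previous), ("pchip", hermite), ("spline", spline)]:
    print(name, float(f(x)))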


4 Methodology

4.1 Shifting Methodologies

As previously explained, shifting is the practice of analyzing the changes in implied volatility over time, which is accomplished by considering the historical volatility surfaces and the corresponding shifts.

To estimate the shifts we need to specify the different methods and how to apply them to create scenarios. The shifts can be estimated with a logarithmic, relative or absolute methodology according to Equations 17-19 [18]:

Logarithmic shift: ∆σ = log(σ_{t+1}/σ_t) ⇒ F = e^{∆σ},   (17)
Relative shift: ∆σ = σ_{t+1}/σ_t − 1 ⇒ F = 1 + ∆σ,   (18)
Absolute shift: Offset = σ_{t+1} − σ_t.   (19)

The scenarios can then be defined by

σ_Scenario = F · σ_T + Offset,   (20)

intuitively explained as applying a return from previous data to today's market volatility, which gives the ability to create series of scenarios for the implied volatility.

The returns are extracted from historical volatility surfaces, which may lack data or may not contain the specific point in question. The discreteness of the data creates a problem; to solve it, we interpolate in the shift matrix to approximate the point. The different techniques are defined in Section 3.5.

In this thesis we consider logarithmic shifts (Equation 17) for their beneficial statistical properties. Iain J. Clark [19] demonstrates the advantage of this choice and performed a rigorous analysis of the optimal shifting type for implied volatility data. The findings suggest a logarithmic shift combined with a GARCH(1,1) process to adjust for heteroscedasticity. This produces a close to perfectly stationary process for implied volatility shifts and can additionally be used to estimate accurate scenarios (Equation 20).

4.2 Volatility surface shifts

When analyzing historical volatility surfaces and using them to create a shifted surface, one may consider a computationally non-feasible process and a feasible process. The non-feasible process consists of shifting every point in the historical volatility surface and creating a shifted surface with all available data. Shifting all points can, however, be used as ideal data against which the developed models are tested.


The computationally feasible process consists of reducing the dimensions of the data-set while still describing the majority of the dynamics in the shifted volatility surface. By utilizing interpolation and extrapolation techniques we can define processes for creating volatility surface shifts with reduced dimensions that nevertheless map the full surface. In the following sections, shifting methods and dimension reduction techniques will be defined, such as parallel shifting, multiple grid-point shifting, dimension reduction using principal components and shifting of the parameters of the SVI parametrization.

4.2.1 Single grid point/Parallel shift

The most basic shifting methodology is to apply the pointwise shift of one single grid point and adopt the same shift across the surface (i.e., a parallel shift of the full surface). The procedure for creating such a shift and estimating the scenarios for every option of interest is defined as:


Single grid point shifting

Require:

(i) Discrete volatility surfaces over time are considered available.

(ii) The grid points in the volatility surface consist of moneyness and time to maturity, σ(T, M).

(iii) A specific poi, with some moneyness and time to maturity (T*, M*), is of interest to estimate scenarios for.

Procedure:

1. Find a grid point (T_shift, M_shift) that contains the volatility we apply the shifting to; a parallel shift usually uses a point close to ATM.

   • If the grid point exists in the historical and today's surfaces, continue; otherwise interpolate/extrapolate the point. Linear interpolation is often used in the moneyness dimension, while flat forward interpolation is used in the time to maturity dimension, defined in Appendix D.

2. Use the chosen grid point to calculate the historical return of the implied volatility,

   ∆σ_i = log(σ_i(T_shift, M_shift)) − log(σ_{i−1}(T_shift, M_shift)).

3. The historical return of the grid point is considered as a shift for all points on the surface: ∆σ_i(T, M) = ∆σ_i(T_shift, M_shift) for all T, M.

4. To apply the return on today's volatility level we need the poi volatility σ_T(T*, M*), which is found with inter-/extrapolation techniques in today's volatility surface (σ_T).

5. The estimated values from the historical returns are applied to today's volatility surface for the poi, and the option's VaR can be estimated: σ_Scenario,i = σ_T(T*, M*) e^{∆σ_i}.
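
The sketch below implements the parallel (single grid point) shift: the log-returns of one near-ATM pivot are applied uniformly to the poi volatility taken from today's surface. The array of pivot volatilities and all names are illustrative assumptions.

import numpy as np

def parallel_shift_scenarios(pivot_history, sigma_poi_today):
    """Apply the log-returns of one pivot point to the poi volatility (parallel shift)."""
    shifts = np.diff(np.log(pivot_history))     # step 2: historical log-returns of the pivot
    return sigma_poi_today * np.exp(shifts)     # steps 3-5: the same shift applied to the poi

# Illustrative data: a history of the near-ATM pivot volatility and today's poi volatility
rng = np.random.default_rng(5)
pivot_history = 0.22 * np.exp(np.cumsum(rng.normal(0.0, 0.015, size=250)))
sigma_poi_today = 0.27      # today's interpolated volatility at the poi (T*, M*)
print(parallel_shift_scenarios(pivot_history, sigma_poi_today)[:5])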

4.2.2 Multiple grid point shift

We have now established a procedure for implementing the parallel shift. To extend the concept to multiple shifting points, we define the actions that need modification compared to the parallel shift. Note that requirements (i)-(iii) are the same as for the parallel shift, and the same goes for procedure steps 1, 4 and 5.


Multiple grid points shifting

Require:

(iv) n chosen grid points where the shifting takes place, specified by moneyness and time to maturity:

(T_shift_1, M_shift_1), (T_shift_2, M_shift_2), (T_shift_3, M_shift_3), ..., (T_shift_n, M_shift_n).

Procedure:

2. Use the chosen grid points to calculate the historical return of the implied volatility, ∆σ_i(T_shift_j, M_shift_j) = log(σ_i(T_shift_j, M_shift_j)) − log(σ_{i−1}(T_shift_j, M_shift_j)) for all the shifting points, where i specifies the day and j specifies the shift point number.

3. To find the volatility shift of the poi, the historical returns of the shifting points, ∆σ_i(T_shift_j, M_shift_j), are interpolated/extrapolated with techniques that depend on where the grid points are located in relation to the poi; parameter shifts (Section 4.3.2) or PCA shifts (Section 4.3.3) can also be applied.

The interpolation and extrapolation in the volatility surface for the multiple grid point shift can be performed in many ways. Linear interpolation is the most reliable to use, but not always the best one. For instance, with a 9-point grid a cubic interpolation method could be fitted and be more accurate than linear interpolation, with the disadvantage of additional computational cost due to the complexity of the cubic method.

4.3 Dimension reduction models

4.3.1 Pivot model

In this approach, we analyze the return of the implied volatility and see if we can model the change in the volatility surface with only a few pivots. To make the dimension reduction useful we need to keep in mind that a rather large reduction in dimensions is required. Several studies also show that in high-volatility eras, such as the financial crisis, the pivot model can be simplified; it has been shown by PCA that the leading eigenvalues become increasingly dominant under adverse price movements [6]. In turn, analyzing 1-15 pivots seems necessary to obtain a thorough study. In addition to this, it is important to examine points where the majority of the explanatory curvature can be found, while keeping in mind where the best market prices are located (the most liquid regions). Liquidity can be interpreted as the options where trades are most frequent, close to ATM, and therefore ATM options have the most accurate prices.

The pivot model follows the same procedure as the multiple grid point shifting defined in Section 4.2.2. The procedure consists of building a shifted surface with a limited amount of data (the chosen points to shift) from the volatility surface; the limited data are then used to build a matrix of log shifts, which can in turn be used to find the shift for a specific poi by interpolation/extrapolation. When the shifts for every day have been extracted for the poi, the FHS VaR procedure presented in Section 3.3.7 can be applied to estimate FHS VaR.

We start with the original n × n volatility data; from the data we extract m × m points; from these m × m points we estimate log shifts to build an m × m shifted matrix; then, if a poi is not one of these m × m shifted points, we interpolate in the matrix to find the specific point. The procedure is illustrated below.

n × n volatility points → m × m volatility points → m × m volatility shifts → interpolate in the m × m shifts to find the poi

By applying the pivot model we end up with historical shifts for a reduced set of options; an illustration of a (3 × 3)-pivot grid is shown in Figure 4. The cross in the figure indicates where a poi may be located and needs to be interpolated. One approach is to first interpolate linearly in the moneyness dimension for both of the cross-sections that the poi is located between. Secondly, when the correct moneyness locations are identified, interpolate in time to maturity to find the poi volatility shift approximation. The interpolation techniques are defined in Section 3.5.


Figure 4: A pivot grid with 9 pivots; the cross indicates a poi where interpolation techniques need to be applied in order to obtain the volatility shift for that specific option.

As mentioned, by applying this method we end up with an approximation of the historical volatility shifts for a specific poi. From the historical shifts we apply the FHS VaR method for options defined in Section 3.3.7.
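
The sketch below illustrates the pivot model's interpolation step: for each historical day, the log shifts at a small pivot grid are interpolated bilinearly to the poi, after which the FHS VaR routine for options could be applied to the resulting shift series. The grid choice, names and synthetic shift data are illustrative assumptions.

import numpy as np
from scipy.interpolate import RegularGridInterpolator

def poi_shifts_from_pivots(pivot_shifts, maturities, moneyness, T_poi, M_poi):
    """Interpolate daily m x m pivot log-shifts to the poi (T*, M*), day by day."""
    poi_shifts = np.empty(pivot_shifts.shape[0])
    for day, grid in enumerate(pivot_shifts):
        interp = RegularGridInterpolator((maturities, moneyness), grid,
                                         bounds_error=False, fill_value=None)
        poi_shifts[day] = interp([[T_poi, M_poi]])[0]
    return poi_shifts

# A 3 x 3 pivot grid (9 pivots) and 250 days of synthetic log shifts
maturities = np.array([0.25, 1.0, 2.0])        # years
moneyness = np.array([0.90, 1.00, 1.10])
rng = np.random.default_rng(6)
pivot_shifts = rng.normal(0.0, 0.02, size=(250, 3, 3))
poi = poi_shifts_from_pivots(pivot_shifts, maturities, moneyness, T_poi=0.5, M_poi=1.05)
print(poi[:5])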

4.3.2 Parameter model

With a parametrization of the volatility surface where the parameters are stored over time (see the SVI parametrization in Equation 7), one approach is to establish parameter shifts. In Table 3, a sample of the parameters over time is given, which can be used to estimate parameter shifts and build parameter scenarios.

Table 3: A sample of the SVI parameters over time.

t_1   a_1   b_1   σ_1   ρ_1   m_1
t_2   a_2   b_2   σ_2   ρ_2   m_2
t_3   a_3   b_3   σ_3   ρ_3   m_3
...   ...   ...   ...   ...   ...
t_n   a_n   b_n   σ_n   ρ_n   m_n

As shown in Table 3, there are 5 parameters that change over time for every smile. By using the change in the parameters we can apply the shifting to the parameters instead of to the volatility,

∆a_i = a_i − a_{i−1},
∆b_i = b_i − b_{i−1},
∆σ_i = σ_i − σ_{i−1},
∆ρ_i = ρ_i − ρ_{i−1},
∆m_i = m_i − m_{i−1},

where a, b, σ, ρ and m are the parameters of the SVI model. We can hence treat every single return vector separately and apply a GARCH(1,1) estimate, so as to bring the parameters to today's market level, and then apply the shifts to today's market parameters, i.e., perform an FHS for the parameters. This yields a sequence of scenario parameters:

a_Scenario,i = a_n + ∆a_i,
b_Scenario,i = b_n + ∆b_i,
σ_Scenario,i = σ_n + ∆σ_i,
ρ_Scenario,i = ρ_n + ∆ρ_i,
m_Scenario,i = m_n + ∆m_i,

which can be used to build up a new surface for each scenario. The parameter shifting model is an interesting approach and would incorporate the arbitrage-free characteristics of the SVI model.

4.3.3 PCA model

When we apply PCA to the cross-sections of the shifted volatility surface (see Table 4), we are able to construct so-called principal components or loadings. If the loadings are approximated from historical volatility surfaces and stored, we can apply them to the volatility surfaces when building up the shift. One benefit of doing this is that we can reduce the number of points to shift and build up a full shifted surface based on the factor loadings. As the factor loadings are established on cross-sections of the volatility surface, we will also consider cross-sections when building the surface.

Table 4: The cross-section of the volatility shifts in the moneyness dimension, to which the PCA is applied; ∆σ is the change in the implied volatility, n is the number of days, k is the number of moneyness data points and t is the time.

        M_1              M_2              M_3       ...   M_k
t_1     ∆σ_{t_1}^{M_1}   ∆σ_{t_1}^{M_2}   ∆σ_{t_1}^{M_3}   ...   ∆σ_{t_1}^{M_k}
t_2     ∆σ_{t_2}^{M_1}   ∆σ_{t_2}^{M_2}   ∆σ_{t_2}^{M_3}   ...   ∆σ_{t_2}^{M_k}
...     ...              ...              ...       ...   ...
t_n     ∆σ_{t_n}^{M_1}   ∆σ_{t_n}^{M_2}   ∆σ_{t_n}^{M_3}   ...   ∆σ_{t_n}^{M_k}

If we study the data in the time to maturity dimension, we can define the shifts with factor loadings as

∆σ = Σ_{i=1}^{Length(L)} ψ_i L_i,   (21)

where L_i denotes the i-th factor loading (see Equation 29) and ψ represents the coefficients that need to be approximated to establish the shifted surface. To approximate ψ, we first form the vector of observed shifts at the chosen moneyness points and the corresponding matrix of factor loadings,


∆σ = ( log(σ_t(T, M_1)) − log(σ_{t−1}(T, M_1)),
       log(σ_t(T, M_2)) − log(σ_{t−1}(T, M_2)),
       log(σ_t(T, M_3)) − log(σ_{t−1}(T, M_3)) )ᵀ
    = ( ∆σ(T, M_1), ∆σ(T, M_2), ∆σ(T, M_3) )ᵀ,   (22)

A = [ L_1(m_1)  L_2(m_1)  L_3(m_1)
      L_1(m_2)  L_2(m_2)  L_3(m_2)
      L_1(m_3)  L_2(m_3)  L_3(m_3) ],   (23)

where L_j(m_i) is the j-th factor loading evaluated at the i-th moneyness point. Using this methodology we can find ψ with Equation 24,

ψ = A^{−1} ∆σ.   (24)

We can then use this approximation of ψ and combine it with the full factor loadings by utilizing Equation 21, which means that we can build up the volatility surface shifts by applying this technique to a number of different times to maturity. We can then interpolate/extrapolate between the cross-sections of the shifts to find the specific shift for the poi. When this is applied over a period of time we can estimate the historical shifts for a poi, and then apply the method described in Section 3.3.7 to estimate FHS VaR from the approximated shifts.

Consequently, we end up with a process that reduces the amount of data that needs to be extracted from the volatility surfaces, while still approximating the volatility surface shifts using principal components that explain the majority of the data.

The dimension reduction process consists of the following: we start with the original n × n volatility data; from the data we extract m × m points; from these m × m points we estimate log shifts to build an m × m shifted matrix; from the m × m shifted matrix we build up new n × n volatility shift data with the help of the principal components; then, if a poi is not one of these n × n shifted points, we interpolate in the matrix. The procedure is illustrated below.

n × n volatility points → m × m volatility points → m × m volatility shifts → reconstruct all volatility shifts with principal components to obtain an n × n shifted matrix → interpolate in the n × n shifts to find the poi
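
The sketch below illustrates the reconstruction step of the PCA model: given factor loadings estimated on the full moneyness grid, the shifts observed at three pivot moneyness points are used to solve Equation 24 for ψ, and Equation 21 then rebuilds the shift at every moneyness point. The loadings, grids and observed shifts are synthetic placeholders, not data from the thesis.

import numpy as np

def reconstruct_shifts(loadings_full, pivot_idx, observed_pivot_shifts):
    """Solve psi = A^{-1} * delta_sigma (Eq. 24) and rebuild the full shift vector (Eq. 21).

    loadings_full: (k, 3) matrix, the first three factor loadings on the full moneyness grid.
    pivot_idx: indices of the three pivot moneyness points within the full grid.
    observed_pivot_shifts: the log shifts observed at those three pivots.
    """
    A = loadings_full[pivot_idx, :]                  # loadings evaluated at the pivots (Eq. 23)
    psi = np.linalg.solve(A, observed_pivot_shifts)  # Eq. 24, without forming A^{-1} explicitly
    return loadings_full @ psi                       # Eq. 21: full cross-section of shifts

# Synthetic setup: 11 moneyness points, 3 loadings with level-, skew- and curvature-like shapes
moneyness = np.linspace(0.90, 1.10, 11)
loadings_full = np.column_stack([
    np.ones_like(moneyness),                         # level
    moneyness - 1.0,                                 # skew
    (moneyness - 1.0) ** 2 - np.mean((moneyness - 1.0) ** 2),  # curvature
])
pivot_idx = [0, 5, 10]                               # pivots at M = 0.90, 1.00, 1.10
observed = np.array([0.030, 0.020, 0.026])           # log shifts observed at the pivots
print(reconstruct_shifts(loadings_full, pivot_idx, observed))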
