Multilevel Change of Measure for Complex Digital Options

Jiaxing Wang
Somerville College
University of Oxford

A thesis submitted in partial fulfillment of the MSc in Mathematical Finance

Trinity 2014
This thesis is dedicated to the people who never stop believing in me, My mother and my father. You have been the brightest light in my darkest night.
Acknowledgements

I would never have been able to finish my dissertation without the guidance of my supervisor, help from friends, and support from my family. I would like to offer my greatest gratitude to my supervisor Prof. Mike Giles, for his guidance, comments and great patience throughout my learning process of this MSc thesis. I would like to thank my good friend Danni, for her helpful suggestions. I would also like to thank my parents, who have always been supportive and encouraging in my hardest times.
Abstract

The multilevel Monte Carlo approach introduced by Giles (Operations Research, 56(3):607-617, 2008) aims to achieve greater accuracy for the same computational cost by combining simulations at different levels of discretisation. For digital options in particular, previous related work has suggested the conditional expectation approach and the technique of splitting in multiple dimensions. In this paper, we suggest the change of measure approach as an alternative to splitting and analyse its efficiency compared with the previous methods in both the scalar and the multi-dimensional case.
Contents

1 Introduction
  1.1 Milstein Scheme
  1.2 Multilevel Monte Carlo method
  1.3 Pathwise Sensitivities
2 Digital Options with scalar SDEs
  2.1 Conditional Expectation Approach
  2.2 Splitting
  2.3 Change of Measure Method
  2.4 Optimal Number of Samples
    2.4.1 Splitting
    2.4.2 Change of Measure Method
  2.5 Numerical Results
    2.5.1 Numerical Results for computing the option value
    2.5.2 Numerical Results for computing the vega
3 Digital options with multi-dimensional SDEs
  3.1 Conditional Expectation Approach
  3.2 Splitting
  3.3 Change of Measure Method
  3.4 Computational Results
    3.4.1 Comparison with Analytical Values without correlation
    3.4.2 Numerical results for computing the option value
    3.4.3 Numerical results for computing the vega
4 Conclusions and future work
A Matlab Codes for 1D problems
  A.1 BS price code for digital call (Jeff Dewynne [3], April 2009)
    A.1.1 bsdigitalcall
    A.1.2 bsrealvalues
    A.1.3 bsturn2arrays
  A.2 Standard normal distributions (p.d.f. and c.d.f.) (Mike Giles)
  A.3 Code for MLMC option values
    A.3.1 1D test functions and level l estimator
    A.3.2 MLMC calculation for conditional expectation method
    A.3.3 MLMC calculation for change of measure method
    A.3.4 MLMC calculation for splitting method
  A.4 Code for MLMC Vega
    A.4.1 1D test functions and level l estimator
    A.4.2 MLMC calculation for conditional expectation method
    A.4.3 MLMC calculation for change of measure method
B Matlab Codes for 2D problems
  B.1 A function for computing bivariate normal probabilities [4]
  B.2 Code for MLMC option values
    B.2.1 2D test functions and level l estimator
    B.2.2 MLMC calculation for conditional expectation method
    B.2.3 MLMC calculation for change of measure method
    B.2.4 MLMC calculation for splitting method
  B.3 Code for MLMC Vega
    B.3.1 2D test functions and level l estimator
    B.3.2 MLMC calculation for conditional expectation method
    B.3.3 MLMC calculation for change of measure method
Bibliography
List of Figures

2.1 Variance for the option price
2.2 Mean for the option price
2.3 Number of samples in each level for the option price
2.4 Accuracy for the option price
2.5 Variance for the vega
2.6 Mean for the vega
2.7 Number of samples in each level for the vega
2.8 Accuracy for the vega
3.1 Variance for the option price
3.2 Mean for the option price
3.3 Number of samples in each level for the option price
3.4 Accuracy for the option price
3.5 Variance for the vega
3.6 Mean for the vega
3.7 Number of samples in each level for the vega
3.8 Accuracy for the vega
Chapter 1

Introduction

In mathematical finance, the Monte Carlo method is used heavily for calculating the expected payoff value of a financial option. As a brute-force method, it has the advantage of simplicity and flexibility. More importantly, it extends naturally to more complex models, such as higher-dimensional problems, which can be very challenging for other methods such as finite difference methods. Monte Carlo is used widely in fixed income markets (LIBOR models), credit markets and equities because of this high-dimensional advantage.

Specifically, we consider an SDE with general drift and volatility terms, in the scalar case

dS(t) = a(S,t)\,dt + b(S,t)\,dW(t), \quad 0 < t < T, \quad (1.1)

and in the vector case, using i as the index over the different asset prices,

dS_i = a_i(S,t)\,dt + \sum_j b_{ij}(S,t)\,dW_j, \quad 0 < t < T, \quad (1.2)

with given initial data S_0. In this paper we are interested in European digital options with payoff P(S(T)). A standard Monte Carlo method using a numerical discretisation with first order weak convergence, such as the Euler-Maruyama method or the Milstein scheme, has mean-square-error O(N^{-1} + h^2). To make this equal to \epsilon^2 requires O(\epsilon^{-2}) independent paths and, due to the first order weak convergence, O(\epsilon^{-1}) timesteps for each path, resulting in a computational complexity of O(\epsilon^{-3}).

At the MCQMC 2006 conference Giles presented a multilevel approach [5] which stimulated a lot of research interest. It reduces the cost to O(\epsilon^{-2}(\ln\epsilon)^2) for option payoffs with a uniform Lipschitz bound, such as European options, and to O(\epsilon^{-2}) for digital options using a conditional expectation treatment and the Milstein discretisation [7].

The technique of conditional expectation has helped to achieve a better multilevel variance convergence rate for digital options. It works well in 1D, where the analytic
value for the conditional expectation can be expressed in a simple form. Moreover, at the coarsest level the path is not simulated at all, so the computational cost is greatly reduced, making it an ideal method for digital options. But in high dimensions, approximating the multiple integral becomes an arduous task. In this case, although one can turn to the technique of splitting [1], which estimates the conditional expectation by a numerical average over sub-samples, we are then unable to calculate pathwise sensitivities.

The change of measure method is a hybrid of the above two methods. Rather than taking sub-samples to estimate the conditional expectation as in the splitting method, we instead take a sample from a third distribution and recover samples for the fine and coarse paths using Radon-Nikodym derivatives. Although this method is not optimal in cases where the conditional expectation method still applies, it has the advantage of handling high-dimensional options and their Greeks.

The remainder of this dissertation is organized as follows. The rest of this chapter reviews some important concepts: the Milstein scheme, the multilevel Monte Carlo method, and pathwise sensitivities. Chapters 2 and 3 explain in detail how to calculate the Radon-Nikodym derivatives in 1D and in multiple dimensions, and compare the method with the conditional expectation and splitting methods. Finally we summarise our results and make several suggestions for future work.

1.1 Milstein Scheme

The Euler-Maruyama discretisation of equation (1.1) is

\hat{S}_{n+1} = \hat{S}_n + a(\hat{S}_n, t_n) h + b(\hat{S}_n, t_n) \Delta W_n.

Provided a and b are sufficiently smooth, and under appropriate conditions on the payoff, we have O(h) weak convergence but only O(h^{1/2}) strong convergence. As strong convergence is important in multilevel Monte Carlo, we use the Milstein method for improvement. The Milstein discretisation for a scalar SDE is

\hat{S}_{n+1} = \hat{S}_n + a h + b \Delta W_n + \frac{1}{2} \frac{\partial b}{\partial S} b \left( \Delta W_n^2 - h \right).
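The scalar Milstein step above can be made concrete in code. The appendix codes of this thesis are in Matlab; the following is an illustrative Python sketch (function name is ours) for the geometric Brownian motion case a = rS, b = \sigma S, where (\partial b/\partial S) b = \sigma^2 S.

```python
import numpy as np

def milstein_path(S0, r, sigma, T, N, dW):
    """One Milstein path of GBM dS = r S dt + sigma S dW.

    For GBM, a = r*S and b = sigma*S, so the Milstein correction
    term (1/2)(db/dS)*b*(dW^2 - h) becomes 0.5*sigma^2*S*(dW^2 - h).
    """
    h = T / N
    S = S0
    for n in range(N):
        S = S + r * S * h + sigma * S * dW[n] \
              + 0.5 * sigma**2 * S * (dW[n]**2 - h)
    return S
```

As a sanity check, with sigma = 0 (and hence zero increments) the scheme reduces to the deterministic Euler recursion S_{n+1} = S_n (1 + rh).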
For vector SDEs (1.2), the Milstein method is

\hat{S}_{i,n+1} = \hat{S}_{i,n} + a_{i,n} h + \sum_j b_{ij,n} \Delta W_{j,n} + \frac{1}{2} \sum_{j,k,l} \frac{\partial b_{ij}}{\partial S_l} b_{lk,n} \left( \Delta W_{j,n} \Delta W_{k,n} - \rho_{jk} h - \Delta A_{jk,n} \right),
where \rho_{jk} is the correlation between the driving Brownian motions W_j and W_k, and \Delta A_{jk,n} is the so-called Lévy area, defined as

\Delta A_{jk,n} = \int_{t_n}^{t_{n+1}} (W_j(t) - W_j(t_n))\,dW_k(t) - \int_{t_n}^{t_{n+1}} (W_k(t) - W_k(t_n))\,dW_j(t).

It is known that the Lévy areas are not needed if the commutativity condition holds, i.e., for all i, j, k,

\sum_l \left( \frac{\partial b_{ij}}{\partial S_l} b_{lk} - \frac{\partial b_{ik}}{\partial S_l} b_{lj} \right) = 0.

In this paper, we only consider the case where b is a non-singular diagonal matrix, and b_{ii} depends only on S_i. In this case, the commutativity condition is satisfied and there is no need to simulate Lévy areas.

1.2 Multilevel Monte Carlo method

Consider multiple sets of simulations with different timesteps h_l = M^{-l} T, l = 0, 1, ..., L. Let P denote the payoff and \hat{P}_l its numerical approximation using timestep h_l. In this paper we choose the refinement factor M = 2, which means that each level has twice as many timesteps as the previous level. Then we have the identity

E[\hat{P}_L] = E[\hat{P}_0] + \sum_{l=1}^{L} E[\hat{P}_l - \hat{P}_{l-1}]. \quad (1.3)

In multilevel Monte Carlo, we use N_l samples, each with fine and coarse paths driven by the same Brownian path, to approximate E[\hat{P}_l - \hat{P}_{l-1}]. That is, on each level l > 0, we have the estimator

\hat{Y}_l = N_l^{-1} \sum_{i=1}^{N_l} \left( \hat{P}_l^{(i)} - \hat{P}_{l-1}^{(i)} \right). \quad (1.4)

Alternatively, we can write the value of each sample as \hat{P}_l^f - \hat{P}_{l-1}^c. The superscripts f and c denote the use of a fine-path estimate or a coarse-path estimate, and the subscript denotes the level, with 2^l timesteps used for each evaluation. It is important to ensure that the telescoping sum in (1.3) is correctly computed. This is achieved by requiring that

E[\hat{P}_l^f] = E[\hat{P}_l^c]. \quad (1.5)

That is, the mean value of the coarse-path estimate used on one level should equal the mean value of the fine-path estimate on the previous level.
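The coupling in (1.4) — fine and coarse paths driven by the same Brownian path, with each coarse increment the sum of two consecutive fine increments — can be sketched as follows. Since the digital-option treatments only appear in later chapters, this illustrative Python sketch (names are ours, not from the thesis appendices) uses a Lipschitz European call payoff.

```python
import numpy as np

def mlmc_level(l, n_samples, S0=1.0, r=0.05, sigma=0.2, T=1.0, K=1.0, seed=0):
    """Samples of the level-l MLMC correction P_l - P_{l-1} for a European
    call under GBM, simulated with the Milstein scheme.  The coarse path
    reuses the fine Brownian increments, summed in consecutive pairs."""
    rng = np.random.default_rng(seed)
    nf = 2 ** l                      # number of fine timesteps
    hf = T / nf
    dWf = np.sqrt(hf) * rng.standard_normal((n_samples, nf))

    Sf = np.full(n_samples, S0)      # fine path
    for n in range(nf):
        Sf = Sf * (1 + r * hf + sigma * dWf[:, n]
                   + 0.5 * sigma**2 * (dWf[:, n] ** 2 - hf))
    Pf = np.exp(-r * T) * np.maximum(Sf - K, 0.0)
    if l == 0:
        return Pf

    hc = 2 * hf                      # coarse path, same Brownian path
    dWc = dWf[:, 0::2] + dWf[:, 1::2]
    Sc = np.full(n_samples, S0)
    for n in range(nf // 2):
        Sc = Sc * (1 + r * hc + sigma * dWc[:, n]
                   + 0.5 * sigma**2 * (dWc[:, n] ** 2 - hc))
    Pc = np.exp(-r * T) * np.maximum(Sc - K, 0.0)
    return Pf - Pc
```

Because of the coupling, the variance of the correction samples on a moderately fine level is far below the variance of the payoff itself, which is exactly what makes the telescoping sum (1.3) cheap to estimate.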
In the case of digital options, the coarse-path and fine-path estimates are obtained in different ways depending on the method used, but we always perform a sanity check to ensure that (1.5) is respected.

Finally, the full estimator for E[\hat{P}_L] is \sum_{l=0}^{L} \hat{Y}_l, where

\hat{Y}_0 = N_0^{-1} \sum_{n=1}^{N_0} \hat{P}_0^{(n)}.

Let C_0, V_0 denote the cost and variance of one sample \hat{P}_0, and C_l, V_l those of one sample \hat{P}_l - \hat{P}_{l-1}. Then the total cost and variance are \sum_{l=0}^{L} N_l C_l and \sum_{l=0}^{L} N_l^{-1} V_l. Our aim is to control the cost for a given mean-square-error (MSE) \epsilon^2. Since

MSE = E\big[(\hat{Y} - E[Y])^2\big] = V[\hat{Y}] + \big(E[\hat{Y}] - E[Y]\big)^2, \quad (1.6)

we need only consider minimizing the cost for a given variance V[\hat{Y}] = \epsilon^2/2. Using the method of Lagrange multipliers, we can minimise the cost for this given variance by choosing

N_l = \lambda \sqrt{V_l / C_l}, \quad \text{where } \lambda = 2 \epsilon^{-2} \sum_{l=0}^{L} \sqrt{V_l C_l}. \quad (1.7)

The total computational cost is thus

C = \epsilon^{-2} \left( \sum_{l=0}^{L} \sqrt{V_l C_l} \right)^2.

For the numerical analysis of the computational cost, we appeal to the MLMC theorem proposed by Giles [5].

Theorem 1.2.1 (MLMC). Let P denote a functional of the solution of the stochastic differential equation (1.1) for a given Brownian path W(t), and let \hat{P}_l denote the corresponding approximation using a numerical discretisation with timestep h_l = M^{-l} T. If there exist independent estimators \hat{Y}_l based on N_l Monte Carlo samples, and positive constants \alpha \geq 1/2, \beta, c_1, c_2, c_3 such that

i) |E[\hat{P}_l - P]| \leq c_1 h_l^\alpha

ii) E[\hat{Y}_l] = E[\hat{P}_0] if l = 0, and E[\hat{P}_l - \hat{P}_{l-1}] if l > 0

iii) V[\hat{Y}_l] \leq c_2 N_l^{-1} h_l^\beta
iv) C_l, the computational complexity of \hat{Y}_l, is bounded by C_l \leq c_3 N_l h_l^{-1},

then there exists a positive constant c_4 such that for any \epsilon < e^{-1} there are values L and N_l for which the multilevel estimator

\hat{Y} = \sum_{l=0}^{L} \hat{Y}_l

has a mean-square-error with bound

MSE := E\big[(\hat{Y} - E[P])^2\big] < \epsilon^2

with a computational complexity C with bound

C \leq c_4 \epsilon^{-2}, if \beta > 1,
C \leq c_4 \epsilon^{-2} (\ln \epsilon)^2, if \beta = 1,
C \leq c_4 \epsilon^{-2-(1-\beta)/\alpha}, if 0 < \beta < 1.

Proof. See [5].

According to the theorem, \alpha is determined by the weak convergence rate, and the computational complexity is determined by \beta.

1.3 Pathwise Sensitivities

Let \hat{S} = (\hat{S}_k)_{k \in [0,N]} denote the approximation to the whole asset path and \Delta\hat{W} = (\Delta\hat{W}_k)_{k \in [0,N-1]} the corresponding Brownian increments. The value of the option before discounting is V = E[P(S)], which is approximated by

\hat{V} = E[\hat{P}(\hat{S})] = \int \hat{P}(\hat{S}(\theta, \hat{W}), \theta)\, p(\hat{W})\, d\hat{W},

where d\hat{W} := \prod_{k=0}^{N-1} d\Delta\hat{W}_k and p(\hat{W}) := \prod_{k=0}^{N-1} p(\Delta\hat{W}_k) is the joint p.d.f. of the normally distributed Brownian increments.
Then the Greek is approximated by

\frac{d\hat{V}}{d\theta} = \int \frac{d}{d\theta} \hat{P}(\hat{S}(\theta,\hat{W}), \theta)\, p(\hat{W})\, d\hat{W}
= \int \left( \frac{\partial \hat{P}}{\partial \hat{S}} \frac{\partial \hat{S}}{\partial \theta} + \frac{\partial \hat{P}}{\partial \theta} \right) p(\hat{W})\, d\hat{W}
= E\left[ \frac{\partial \hat{P}}{\partial \hat{S}} \frac{\partial \hat{S}}{\partial \theta} + \frac{\partial \hat{P}}{\partial \theta} \right].

For multilevel Monte Carlo,

E[\hat{P}] = E[\hat{P}_0] + \sum_{l=1}^{L} E\big[\hat{P}_l^f - \hat{P}_{l-1}^c\big].

Moreover, because a digital option does not depend on the entire path, \hat{P}_0 and \hat{P}_l depend only on the asset value one timestep before maturity when using the conditional expectation approach, as discussed in the next chapter. Therefore the multilevel Greek is the sum of two parts: the derivative of the level 0 estimator and the derivatives of the corrections. In other words, all we have to calculate is

\frac{d\hat{P}_0}{d\theta} = \frac{\partial \hat{P}_0}{\partial S_0} \frac{\partial S_0}{\partial \theta} + \frac{\partial \hat{P}_0}{\partial \theta} \quad (1.8)

and, on each level,

\frac{d\hat{P}^f}{d\theta} = \frac{\partial \hat{P}^f}{\partial S^f_{N-1}} \frac{\partial S^f_{N-1}}{\partial \theta} + \frac{\partial \hat{P}^f}{\partial \theta}, \quad (1.9)

\frac{d\hat{P}^c}{d\theta} = \frac{\partial \hat{P}^c}{\partial S^c_{N-2}} \frac{\partial S^c_{N-2}}{\partial \theta} + \frac{\partial \hat{P}^c}{\partial \theta}. \quad (1.10)
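Before moving on, the sample-size rule (1.7) from the previous section can be made concrete. The following Python sketch (the function name is ours; the appendix codes are in Matlab) computes N_l from estimated per-sample variances V_l and costs C_l, rounding up so that the total variance stays at or below \epsilon^2/2.

```python
import math

def optimal_samples(V, C, eps):
    """N_l = lambda * sqrt(V_l/C_l) with lambda = 2*eps^-2 * sum sqrt(V_l*C_l),
    as in (1.7).  Before rounding, this makes sum(V_l/N_l) exactly eps^2/2;
    rounding up with ceil can only reduce the total variance further."""
    lam = 2.0 / eps**2 * sum(math.sqrt(v * c) for v, c in zip(V, C))
    return [max(1, math.ceil(lam * math.sqrt(v / c))) for v, c in zip(V, C)]
```

Because V_l decays and C_l grows with l, the resulting N_l decreases with the level, which is the behaviour seen later in figures 2.3 and 2.7.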
Chapter 2

Digital Options with scalar SDEs

We consider the digital option with discounted payoff

P = \exp(-rT)\, 1_{\{S(T) > K\}},

where the underlying price follows a geometric Brownian motion under the risk-neutral measure Q with constant parameters r, \sigma:

dS(t) = r S(t)\,dt + \sigma S(t)\,dW(t).

We adopt the Milstein scheme for the discretisation, i.e.,

\hat{D}_n = 1 + rh + \sigma \Delta\hat{W}_n + \frac{1}{2}\sigma^2 (\Delta\hat{W}_n^2 - h),
\hat{S}_{n+1} = \hat{S}_n \hat{D}_n,
\hat{s}_{n+1} = \hat{s}_n \hat{D}_n + \hat{S}_n \left( \Delta\hat{W}_n + \sigma(\Delta\hat{W}_n^2 - h) \right),

where \hat{s}_n := \partial\hat{S}_n/\partial\sigma is carried along in order to calculate the vega.

The standard multilevel Monte Carlo method would be to generate the entire path of S(t) and then compute the estimator for each level according to (1.4). This approach was adopted by Giles when he first considered applying multilevel Monte Carlo to digital options [5]. If we use the Euler-Maruyama discretisation then, because of the O(h^{1/2}) strong convergence, the paths terminating within O(h^{1/2}) of the strike K have an O(1) probability of producing an O(1) difference between the fine and coarse payoffs. An O(h^{1/2}) fraction of the paths lie within this range, giving an O(h^{1/2}) variance for a single sample. In this case \beta = 1/2 and the computational cost would be O(\epsilon^{-5/2}) (\alpha = 1 in the MLMC theorem). On the other hand, the Milstein scheme improves the strong convergence to O(h); repeating the argument above, we see that V_l = O(h). This corresponds to \beta = 1 and computational cost O(\epsilon^{-2}(\ln\epsilon)^2).
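The recursion carrying \hat{s}_n = \partial\hat{S}_n/\partial\sigma along the Milstein path can be sketched as follows (an illustrative Python version of the scheme above; function name is ours). A useful check, used in the test, is that the pathwise derivative agrees with a central finite difference of the terminal value under common random numbers.

```python
import numpy as np

def milstein_with_vega(S0, r, sigma, T, dW):
    """Milstein path for GBM together with the pathwise derivative
    s_n = dS_n/dsigma, following the recursion in the text:
        D_n     = 1 + r h + sigma dW_n + 0.5 sigma^2 (dW_n^2 - h)
        S_{n+1} = S_n D_n
        s_{n+1} = s_n D_n + S_n (dW_n + sigma (dW_n^2 - h))
    """
    N = len(dW)
    h = T / N
    S, s = S0, 0.0                       # s_0 = dS_0/dsigma = 0
    for n in range(N):
        D = 1 + r * h + sigma * dW[n] + 0.5 * sigma**2 * (dW[n]**2 - h)
        s = s * D + S * (dW[n] + sigma * (dW[n]**2 - h))   # uses old S
        S = S * D
    return S, s
```

Note that the sensitivity update must use the pre-update value of S, since \hat{s}_{n+1} depends on \hat{S}_n, not \hat{S}_{n+1}.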
2.1 Conditional Expectation Approach

So far the best way to achieve a better variance convergence rate is the conditional expectation approach [8]. Instead of calculating payoff values based on the final asset value, we terminate the path simulation one step before maturity T, and work out the probability distribution of the final payoff conditional on the simulated value S^f_{N-1}.

For the fine path, switching to an Euler discretisation for the final step, we have

S^f_N = S^f_{N-1}(1 + rh) + \sigma S^f_{N-1} \Delta W_{N-1}, \quad (2.1)

which means that the final state has a normal distribution with mean and standard deviation

\mu^f = S^f_{N-1}(1 + rh), \quad \sigma^f = \sigma S^f_{N-1} \sqrt{h}. \quad (2.2)

Thus the expected payoff of the fine path, conditional on knowing the path up to the penultimate timestep, is

E\big[P^f(S^f_N) \mid W_{N-2}\big] = E\big[1_{\{S^f_N > K\}} \mid W_{N-2}\big]
= P(S^f_N > K \mid W_{N-2})
= \Phi\left( \frac{\mu^f - K}{\sigma^f} \right)
= \Phi\left( \frac{S^f_{N-1}(1+rh) - K}{\sigma S^f_{N-1}\sqrt{h}} \right), \quad (2.3)

where \Phi is the cumulative normal distribution function.

For the coarse path, we also stop one coarse timestep before maturity. But because we are given the Brownian increment \Delta W_{N-2} for the first half of the final coarse timestep, we have

S^c_N = S^c_{N-2}(1 + 2rh + \sigma \Delta W_{N-2}) + \sigma S^c_{N-2} \Delta W_{N-1}. \quad (2.4)

Thus the final state of the coarse path also has a normal distribution, with mean and standard deviation

\mu^c = S^c_{N-2}(1 + 2rh + \sigma \Delta W_{N-2}), \quad \sigma^c = \sigma S^c_{N-2} \sqrt{h}, \quad (2.5)
and the conditional payoff is thus

E\big[P^c(S^c_N) \mid W_{N-2}\big] = E\big[1_{\{S^c_N > K\}} \mid W_{N-2}\big]
= \Phi\left( \frac{\mu^c - K}{\sigma^c} \right)
= \Phi\left( \frac{S^c_{N-2}(1 + 2rh + \sigma\Delta W_{N-2}) - K}{\sigma S^c_{N-2}\sqrt{h}} \right). \quad (2.6)

To compute the Greeks, we follow the same steps as in [2] and adopt the pathwise sensitivity approach. For the fine path, we define A^f = (\mu^f - K)/\sigma^f. Then

\frac{dP^f}{d\sigma} = \phi(A^f) \left[ \frac{\partial A^f}{\partial S^f_{N-1}} \frac{\partial S^f_{N-1}}{\partial \sigma} + \frac{\partial A^f}{\partial \sigma} \right],

where

\frac{\partial A^f}{\partial S^f_{N-1}} = \frac{1}{\sigma^f} \frac{\partial \mu^f}{\partial S^f_{N-1}} - \frac{A^f}{\sigma^f} \frac{\partial \sigma^f}{\partial S^f_{N-1}}, \quad
\frac{\partial A^f}{\partial \sigma} = -\frac{A^f}{\sigma^f} \frac{\partial \sigma^f}{\partial \sigma},

\partial S^f_{N-1}/\partial\sigma is computed along the simulation of the path, and

\frac{\partial \mu^f}{\partial S^f_{N-1}} = 1 + rh, \quad \frac{\partial \sigma^f}{\partial S^f_{N-1}} = \sigma\sqrt{h}, \quad \frac{\partial \sigma^f}{\partial \sigma} = S^f_{N-1}\sqrt{h}.

For the coarse path, we define A^c = (\mu^c - K)/\sigma^c. Then

\frac{dP^c}{d\sigma} = \phi(A^c) \left[ \frac{\partial A^c}{\partial S^c_{N-2}} \frac{\partial S^c_{N-2}}{\partial \sigma} + \frac{\partial A^c}{\partial \sigma} \right],

where

\frac{\partial A^c}{\partial S^c_{N-2}} = \frac{1}{\sigma^c} \frac{\partial \mu^c}{\partial S^c_{N-2}} - \frac{A^c}{\sigma^c} \frac{\partial \sigma^c}{\partial S^c_{N-2}}, \quad
\frac{\partial A^c}{\partial \sigma} = \frac{1}{\sigma^c} \frac{\partial \mu^c}{\partial \sigma} - \frac{A^c}{\sigma^c} \frac{\partial \sigma^c}{\partial \sigma},

\partial S^c_{N-2}/\partial\sigma is computed along the simulation of the path, and

\frac{\partial \mu^c}{\partial S^c_{N-2}} = 1 + 2rh + \sigma\Delta W_{N-2}, \quad \frac{\partial \sigma^c}{\partial S^c_{N-2}} = \sigma\sqrt{h}, \quad
\frac{\partial \mu^c}{\partial \sigma} = S^c_{N-2}\Delta W_{N-2}, \quad \frac{\partial \sigma^c}{\partial \sigma} = S^c_{N-2}\sqrt{h}.

The estimator for the vega on each level is thus

\hat{Y}_l^\sigma = \frac{1}{N_l} \sum_{i=1}^{N_l} \left( \frac{d\hat{P}^{f,(i)}}{d\sigma} - \frac{d\hat{P}^{c,(i)}}{d\sigma} \right).
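The conditional expected payoffs (2.3) and (2.6) are simple closed-form expressions and can be sketched directly (illustrative Python; function names are ours, and the normal c.d.f. is built from the error function):

```python
from math import erf, sqrt

def Phi(x):
    """Standard normal c.d.f. via the error function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def cond_payoff_fine(Sf, r, sigma, h, K):
    """Equation (2.3): expected digital payoff given the fine-path value
    S^f_{N-1} one (fine) timestep before maturity."""
    mu = Sf * (1 + r * h)
    sd = sigma * Sf * sqrt(h)
    return Phi((mu - K) / sd)

def cond_payoff_coarse(Sc, dW, r, sigma, h, K):
    """Equation (2.6): expected digital payoff for the coarse path, given
    the fine increment dW covering the first half of the last coarse step."""
    mu = Sc * (1 + 2 * r * h + sigma * dW)
    sd = sigma * Sc * sqrt(h)
    return Phi((mu - K) / sd)
```

When the conditional mean sits exactly at the strike (\mu = K), the conditional payoff is exactly 1/2, which provides a quick sanity check.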
2.2 Splitting

Splitting is similar to the conditional expectation method, but here the conditional expectation is replaced by a numerical estimate. To be more specific, we first revert to the Milstein discretisation for the final timestep, i.e.,

\hat{D}_{N-1} = 1 + rh + \sigma\Delta\hat{W}_{N-1} + \frac{1}{2}\sigma^2(\Delta\hat{W}_{N-1}^2 - h),
\hat{S}_N = \hat{S}_{N-1}\hat{D}_{N-1},

where h is the size of the timestep, depending on whether we are considering the fine path or the coarse path. Then we generate a number of sub-samples of the final Brownian increment for each path, and average the resulting payoffs for the fine path and the coarse path, respectively. If the number of sub-samples is chosen appropriately, as we explain later, the variance and the computational cost are, to leading order, the same as for the conditional expectation method.

However, because digital options have a discontinuous payoff, this approach does not naturally provide a way to calculate pathwise sensitivities. Alternative methods can be used to calculate the Greeks. For example, vibrato Monte Carlo, a hybrid combination of pathwise and likelihood ratio method (LRM) sensitivity calculations, is a very practical method for discontinuous payoff functions and their Greeks. Giles gave a precise form of the hybrid combination and quantified how much variance reduction it achieves [6]. In this paper we do not discuss further details, and we only consider calculating option values with the splitting method.

2.3 Change of Measure Method

Like the conditional expectation approach, we employ the Euler discretisation (2.1), (2.4) for the final timestep, so the final states are normally distributed with means and standard deviations (2.2), (2.5). The probability density functions at maturity are

p_c(x) = \frac{1}{\sqrt{2\pi}\,\sigma^c} \exp\left\{ -\frac{(x - \mu^c)^2}{2(\sigma^c)^2} \right\}, \quad
p_f(x) = \frac{1}{\sqrt{2\pi}\,\sigma^f} \exp\left\{ -\frac{(x - \mu^f)^2}{2(\sigma^f)^2} \right\}.
For each path, letting W_{N-2} denote the Brownian path up to the penultimate timestep, we wish to compute the conditional expectation

Y_l = E_{W_{N-2}}[P^f - P^c]
= E_{W_{N-2}}\big[1_{\{S^f_N > K\}}\big] - E_{W_{N-2}}\big[1_{\{S^c_N > K\}}\big]
= E_{W_{N-2}}\big[1_{\{S^g_N > K\}} R_f\big] - E_{W_{N-2}}\big[1_{\{S^g_N > K\}} R_c\big]
= E_{W_{N-2}}\big[1_{\{S^g_N > K\}} (R_f - R_c)\big], \quad (2.7)

where S^g_N is a sample from a third distribution, similar (usually normal) to the distributions of S^f_N and S^c_N, and R_f, R_c are the corresponding Radon-Nikodym derivatives.

We define the third distribution as an average of those of S^f_N and S^c_N. First we set

S^g_{N-1} = \frac{1}{2}(S^c_{N-2} + S^f_{N-1}), \quad \mu^g = \frac{1}{2}(\mu^c + \mu^f), \quad \sigma^g = \sigma_b S^g_{N-1}\sqrt{h},

where \sigma_b takes the same value as \sigma but is treated as a constant later on when we calculate the Greeks. Thus the p.d.f. of S^g_N is

p_g(x) = \frac{1}{\sqrt{2\pi}\,\sigma^g} \exp\left\{ -\frac{(x - \mu^g)^2}{2(\sigma^g)^2} \right\}.

The Radon-Nikodym derivatives can be computed as follows:

R_f = \frac{p_f(S^g_N)}{p_g(S^g_N)} = \frac{\sigma_b S^g_{N-1}}{\sigma S^f_{N-1}} \exp\left\{ \frac{(S^g_N - \mu^g)^2}{2\sigma_b^2 (S^g_{N-1})^2 h} - \frac{(S^g_N - \mu^f)^2}{2\sigma^2 (S^f_{N-1})^2 h} \right\} =: S_f v_f,

R_c = \frac{p_c(S^g_N)}{p_g(S^g_N)} = \frac{\sigma_b S^g_{N-1}}{\sigma S^c_{N-2}} \exp\left\{ \frac{(S^g_N - \mu^g)^2}{2\sigma_b^2 (S^g_{N-1})^2 h} - \frac{(S^g_N - \mu^c)^2}{2\sigma^2 (S^c_{N-2})^2 h} \right\} =: S_c v_c,

where S_f, S_c denote the prefactors and v_f, v_c the exponential factors.

To compute the Greeks using the change of measure method, we again calculate the pathwise sensitivity. From (2.7),

\frac{dY_l}{d\sigma} = E_{W_{N-2}}\left[ 1_{\{S^g_N > K\}} \left( \frac{dR_f}{d\sigma} - \frac{dR_c}{d\sigma} \right) \right],

assuming that the distribution of S^g_N does not change with \sigma. For the fine path,

\frac{dR_f}{d\sigma} = \frac{\partial R_f}{\partial \sigma} + \frac{\partial R_f}{\partial S^f_{N-1}} \frac{\partial S^f_{N-1}}{\partial \sigma},

\frac{\partial R_f}{\partial \sigma} = -\frac{S_f v_f}{\sigma} + S_f v_f \frac{(S^g_N - \mu^f)^2}{\sigma^3 (S^f_{N-1})^2 h},
\frac{\partial R_f}{\partial S^f_{N-1}} = -\frac{S_f v_f}{S^f_{N-1}} + S_f v_f \left[ \frac{(S^g_N - \mu^f)}{\sigma^2 (S^f_{N-1})^2 h} \frac{\partial \mu^f}{\partial S^f_{N-1}} + \frac{(S^g_N - \mu^f)^2}{\sigma^2 (S^f_{N-1})^3 h} \right],

where \partial\mu^f/\partial S^f_{N-1} = 1 + rh as in section 2.1, and \partial S^f_{N-1}/\partial\sigma is computed along the path simulation. For the coarse path,

\frac{dR_c}{d\sigma} = \frac{\partial R_c}{\partial \sigma} + \frac{\partial R_c}{\partial S^c_{N-2}} \frac{\partial S^c_{N-2}}{\partial \sigma},

\frac{\partial R_c}{\partial \sigma} = -\frac{S_c v_c}{\sigma} + S_c v_c \left[ \frac{(S^g_N - \mu^c)}{\sigma^2 (S^c_{N-2})^2 h} \frac{\partial \mu^c}{\partial \sigma} + \frac{(S^g_N - \mu^c)^2}{\sigma^3 (S^c_{N-2})^2 h} \right],

\frac{\partial R_c}{\partial S^c_{N-2}} = -\frac{S_c v_c}{S^c_{N-2}} + S_c v_c \left[ \frac{(S^g_N - \mu^c)}{\sigma^2 (S^c_{N-2})^2 h} \frac{\partial \mu^c}{\partial S^c_{N-2}} + \frac{(S^g_N - \mu^c)^2}{\sigma^2 (S^c_{N-2})^3 h} \right],

where

\frac{\partial \mu^c}{\partial S^c_{N-2}} = 1 + 2rh + \sigma\Delta W_{N-2}, \quad \frac{\partial \mu^c}{\partial \sigma} = S^c_{N-2}\Delta W_{N-2}

as in section 2.1, and \partial S^c_{N-2}/\partial\sigma is computed along the path simulation. Finally, with P sub-samples of S^g_N per path, we have the estimator

\hat{Y}_l^\sigma = \frac{1}{N_l} \sum_{i=1}^{N_l} \frac{1}{P} \sum_{j=1}^{P} 1_{\{S^{g,(i,j)}_N > K\}} \left( \frac{d\hat{R}_f^{(i,j)}}{d\sigma} - \frac{d\hat{R}_c^{(i,j)}}{d\sigma} \right).
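The value estimator (2.7) can be sketched in a few lines (illustrative Python; names are ours). For simplicity this sketch takes the reference standard deviation as the average of \sigma^f and \sigma^c rather than the \sigma_b S^g_{N-1}\sqrt{h} used in the text; the importance-sampling identity E[1_{\{S^g>K\}} R_f] = \Phi((\mu^f - K)/\sigma^f), which the test checks, holds either way.

```python
import numpy as np

def normal_pdf(x, mu, sd):
    return np.exp(-0.5 * ((x - mu) / sd) ** 2) / (np.sqrt(2 * np.pi) * sd)

def change_of_measure_correction(mu_f, sd_f, mu_c, sd_c, K, P, rng):
    """Estimate E[1_{S>K} under fine] - E[1_{S>K} under coarse], as in
    (2.7): draw S^g from an 'average' normal and weight the indicator
    by the Radon-Nikodym ratios of the fine/coarse densities to it."""
    mu_g = 0.5 * (mu_f + mu_c)
    sd_g = 0.5 * (sd_f + sd_c)          # simplification; see lead-in
    Sg = mu_g + sd_g * rng.standard_normal(P)
    Rf = normal_pdf(Sg, mu_f, sd_f) / normal_pdf(Sg, mu_g, sd_g)
    Rc = normal_pdf(Sg, mu_c, sd_c) / normal_pdf(Sg, mu_g, sd_g)
    return np.mean((Sg > K) * (Rf - Rc))
```

Note that the same samples S^g feed both weighted indicators, so when the fine and coarse distributions coincide the correction is exactly zero, sample by sample.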
2.4 Optimal Number of Samples

We follow the same argument as in [6]. Let W_{N-2} denote the Brownian path up to the penultimate timestep, and \Delta W_{N-1} the last Brownian increment. Then for any function g(W_{N-2}, \Delta W_{N-1}) we have

\hat{Y}_{M,P} = M^{-1} \sum_{m=1}^{M} P^{-1} \sum_{p=1}^{P} g\big(W^{(m)}_{N-2}, \Delta W^{(m,p)}_{N-1}\big)

as an unbiased estimator for

E[g(W_{N-2}, \Delta W_{N-1})] = E_{W_{N-2}}\big[ E_{\Delta W_{N-1}}[g(W_{N-2}, \Delta W_{N-1}) \mid W_{N-2}] \big].

Now we need an expression for the variance of this estimator. First we define

\bar{g}\big(W^{(m)}_{N-2}\big) := E\big[g\big(W^{(m)}_{N-2}, \Delta W^{(m,p)}_{N-1}\big) \mid W^{(m)}_{N-2}\big], \quad
\delta g\big(W^{(m)}_{N-2}, \Delta W^{(m,p)}_{N-1}\big) := g - \bar{g},

so the estimator can be written as

\hat{Y}_{M,P} = M^{-1} \sum_{m=1}^{M} \left( \bar{g}\big(W^{(m)}_{N-2}\big) + P^{-1} \sum_{p=1}^{P} \delta g\big(W^{(m)}_{N-2}, \Delta W^{(m,p)}_{N-1}\big) \right).

Then

\hat{Y}_{M,P} - E[\hat{Y}] = M^{-1} \sum_{m=1}^{M} (\bar{g} - E[\bar{g}]) + (MP)^{-1} \sum_{m,p} \delta g.

The variance can thus be computed as follows:

V[\hat{Y}_{M,P}] = E\big[(\hat{Y} - E[\hat{Y}])^2\big]
= M^{-2} \sum_{m=1}^{M} E\big[(\bar{g} - E[\bar{g}])^2\big] + (MP)^{-2} \sum_{m,p} E\big[(\delta g)^2\big]
= M^{-1} V[\bar{g}] + (MP)^{-1} E_{W_{N-2}}\big[E_{\Delta W_{N-1}}[(\delta g)^2]\big]
= M^{-1} V\big[E(g \mid W_{N-2})\big] + (MP)^{-1} E_{W_{N-2}}\big[V_{\Delta W_{N-1}}(g \mid W_{N-2})\big]
=: M^{-1} v_1 + (MP)^{-1} v_2.

On the other hand, the cost is proportional to c_1 M + c_2 M P, where c_1 corresponds to the path calculation up to the penultimate timestep and c_2 to the calculation
of the last timestep and the payoff values. That is, c_1 = O(h^{-1}) and c_2 = O(1). For a fixed computational cost, the variance can be minimised by minimising the product

\big(M^{-1} v_1 + (MP)^{-1} v_2\big)(c_1 M + c_2 M P) = v_1 c_2 P + v_1 c_1 + v_2 c_2 + v_2 c_1 P^{-1},

which attains its optimum when

P_{opt} = \sqrt{v_2 c_1 / (v_1 c_2)}. \quad (2.8)

2.4.1 Splitting

First of all, we notice that the coarse and fine paths differ by O(h) due to the first order strong convergence of the Milstein scheme. Then S^c_{N-2} - S^f_{N-1} = O(h^{1/2}), and thus

\mu^f - \mu^c = O(h), \quad \sigma^f - \sigma^c = O(h).

For splitting, g(W_{N-2}, \Delta W_{N-1}) = P^f(S^f_N) - P^c(S^c_N). If we suppose for simplicity that an Euler final step is used, then for paths near the strike K,

E[g(W_{N-2}, \Delta W_{N-1}) \mid W_{N-2}]
= \Phi\left(\frac{\mu^f - K}{\sigma^f}\right) - \Phi\left(\frac{\mu^c - K}{\sigma^c}\right)
= \left[\Phi\left(\frac{\mu^f - K}{\sigma^f}\right) - \Phi\left(\frac{\mu^c - K}{\sigma^f}\right)\right] + \left[\Phi\left(\frac{\mu^c - K}{\sigma^f}\right) - \Phi\left(\frac{\mu^c - K}{\sigma^c}\right)\right]
= \frac{1}{\sigma^f}\,\phi\left(\frac{\mu^* - K}{\sigma^f}\right)(\mu^f - \mu^c) + \phi\left(\frac{\mu^c - K}{\sigma^*}\right)\frac{(\mu^c - K)(\sigma^c - \sigma^f)}{(\sigma^*)^2}
= O(h^{-1/2})\,O(1)\,O(h) + O(1)\,O(h^{-1/2})\,O(h) = O(h^{1/2}),

where \mu^* is some value between \mu^f and \mu^c, \sigma^* some value between \sigma^f and \sigma^c, and (\mu^c - K)/(\sigma^*)^2 = O(h^{-1/2}) for paths near the strike.

Now, because the payoff of a digital option is discontinuous, with an O(h^{1/2}) fraction of paths lying within O(h^{1/2}) of the discontinuity, we have

v_1 \approx E\big[(E[g \mid W_{N-2}])^2\big] \approx \underbrace{P(\text{near strike})}_{O(h^{1/2})} \times \underbrace{E\big[(E[g \mid W_{N-2}])^2 \mid \text{near strike}\big]}_{O(h)} = O(h^{3/2}).

On the other hand, because the probability density functions of S^f_N and S^c_N have width of magnitude O(h^{1/2}), there is an O(h^{1/2}) probability that S_N will fall within
the range of O(h) of the strike, which will give an O(1) difference in payoff, conditional on S^f_{N-1} and S^c_{N-2} being near the strike. Therefore

V_{\Delta W_{N-1}}[g \mid W_{N-2}] \approx E[g^2 \mid W_{N-2}] \approx O(h^{1/2})\,(O(1))^2 = O(h^{1/2}),

and thus

v_2 = \underbrace{P(\text{near strike})}_{O(h^{1/2})} \times \underbrace{E\big[V_{\Delta W_{N-1}}[g \mid W_{N-2}] \mid \text{near strike}\big]}_{O(h^{1/2})} = O(h).

Substituting c_1, c_2, v_1, v_2 into (2.8) we have P_{opt} = O(h^{-3/4}), which means that the optimal number of sub-samples for the splitting method on each level is proportional to h^{-3/4}. Notice that as h \to 0, the variance goes to v_1 M^{-1} and the cost to c_1 M asymptotically. Thus the splitting method does not, to leading order, increase the variance or the computational cost compared with the conditional expectation method, wherever the latter is applicable.

2.4.2 Change of Measure Method

In this case,

g(W_{N-2}, \Delta W_{N-1}) = P^g (R_f - R_c)
= P^g \big[R(\mu^f, \sigma^f) - R(\mu^c, \sigma^c)\big]
= P^g \big[R(\mu^f, \sigma^f) - R(\mu^f, \sigma^c) + R(\mu^f, \sigma^c) - R(\mu^c, \sigma^c)\big]
= P^g \left[ \frac{\partial R}{\partial \sigma}(\mu^f, \sigma^*)\,\Delta\sigma + \frac{\partial R}{\partial \mu}(\mu^*, \sigma^c)\,\Delta\mu \right],

where \mu^* is some value between \mu^f and \mu^c, \sigma^* some value between \sigma^f and \sigma^c, and \Delta\mu = \mu^f - \mu^c = O(h), \Delta\sigma = \sigma^f - \sigma^c = O(h). Since

R(\mu, \sigma) = \frac{\sigma^g}{\sigma} \exp\left\{ \frac{(S^g_N - \mu^g)^2}{2(\sigma^g)^2} - \frac{(S^g_N - \mu)^2}{2\sigma^2} \right\} =: \frac{\sigma^g}{\sigma}\, v,

we have

\frac{\partial R}{\partial \sigma} = \underbrace{-\frac{\sigma^g}{\sigma^2}\, v}_{O(h^{-1/2})} + \underbrace{\frac{\sigma^g}{\sigma}\, v\, \frac{(S^g_N - \mu)^2}{\sigma^3}}_{O(h^{-1/2})} = O(h^{-1/2}),
and, noting that

S^g_N - \mu = \underbrace{(S^g_N - \mu^g)}_{O(h^{1/2})} + \underbrace{(\mu^g - \mu)}_{O(h)} = O(h^{1/2}),

we also have

\frac{\partial R}{\partial \mu} = \frac{\sigma^g}{\sigma}\, v\, \frac{S^g_N - \mu}{\sigma^2} = O(h^{-1/2}).

Thus we can see that

g(W_{N-2}, \Delta W_{N-1}) = P^g \left[ \frac{\partial R}{\partial \sigma}(\mu^f, \sigma^*)\,\Delta\sigma + \frac{\partial R}{\partial \mu}(\mu^*, \sigma^c)\,\Delta\mu \right] = O(h^{-1/2})\,O(h) = O(h^{1/2}),

and therefore

E_{\Delta W_{N-1}}[g \mid W_{N-2}] = O(h^{1/2}), \quad V_{\Delta W_{N-1}}[g \mid W_{N-2}] = O(h).

Following the same argument as for splitting, we get v_1 = v_2 = O(h^{3/2}). Finally P_{opt} = O(h^{-1/2}), so the optimal number of sub-samples for the change of measure method on each level is proportional to h^{-1/2}. As with the splitting method, as h \to 0 the variance goes to v_1 M^{-1} and the cost to c_1 M asymptotically. Thus the change of measure method does not, to leading order, increase the variance or the computational cost compared with the conditional expectation method, wherever the latter is applicable.

2.5 Numerical Results

Table 2.1 shows the values and vegas computed by the different approaches, with parameters T = 1, r = 0.05, \sigma = 0.2, K = 1. The Black-Scholes code is from Jeff Dewynne [3] (see Appendix A.1). All numerical methods use N = 10000 sample paths as an initial sample set on each level, with maximum level L = 5, and N_3 = 100 sub-samples for the change of measure approach and for splitting.

According to equation (1.6), if we want to control the MSE within \epsilon^2, we also need to require |E[\hat{Y}] - E[Y]| < \epsilon/\sqrt{2}, that is,

\left| E\left[ \sum_{l=0}^{L} \hat{Y}_l \right] - E[Y] \right| < \epsilon/\sqrt{2}.
           BS        Cond. Exp.   Radon      Splitting
  value    0.5323    0.5328       0.5327     0.5323
  vega    -0.6567   -0.6578      -0.6799     -

Table 2.1: Prices and vegas for a digital option on one asset

This can be achieved by requiring that the remaining bias after level L is below \epsilon/\sqrt{2}. Since, assuming first order weak convergence (which is verified by the numerical results later),

|E[\hat{P}_l - P]| \approx a h_l = a\,2^{-l}, \quad (2.9)

we also have |E[\hat{P}_{l-1} - P]| \approx a\,2^{-(l-1)}, and hence

|E[\hat{P}_l - \hat{P}_{l-1}]| \approx a\,2^{-l} \approx |E[\hat{P}_l] - E[P]|.

That is, the remaining bias can be estimated from the mean of the latest correction, and we stop adding levels once this estimate falls below \epsilon/\sqrt{2}. This is used in the code in Appendix A.3.2 to determine when to stop increasing the levels.

For standard Monte Carlo methods, in order to achieve an accuracy \epsilon with probability 0.84, we need to generate N = 2(\hat{\sigma}/\epsilon)^2 samples, where \hat{\sigma} is the sample standard deviation of these samples. Thus the cost of plain Monte Carlo with 2^L timesteps is

2(\hat{\sigma}/\epsilon)^2\, 2^L.

On the other hand, the costs of MLMC using the conditional expectation approach, and using the Radon-Nikodym derivative or splitting approach, are

\sum_{l=0}^{L} N_l\, 2^l \quad \text{and} \quad \sum_{l=0}^{L} N_l \big(2^l + N_3^{(l)}\big),

respectively, where N_3^{(l)} denotes the number of sub-samples on each path.
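The level-stopping argument above can be sketched as a small test function (illustrative Python; the actual Matlab test in Appendix A.3.2 may differ in detail — the extra safeguard on the second-to-last correction is our addition, in the spirit of extrapolating the geometric decay a 2^{-l}).

```python
def converged(level_means, eps):
    """Stopping test sketch assuming first order weak convergence:
    the remaining bias is roughly the size of the latest correction
    mean, a*2^{-L}, so stop once |Y_L| (and, as a safeguard against a
    lucky small sample, |Y_{L-1}|/2) falls below eps/sqrt(2)."""
    if len(level_means) < 3:
        return False                  # always simulate a few levels first
    YL, YLm1 = abs(level_means[-1]), abs(level_means[-2])
    return max(YL, 0.5 * YLm1) < eps / 2 ** 0.5
```

For example, a run whose correction means shrink geometrically converges once the last two corrections are both safely below the tolerance, while a run whose corrections stall does not.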
2.5.1 Numerical Results for computing the option value

Compared to standard MC, MLMC greatly improves the variance and the computational cost. In figure 2.1a, a comparison is made between the variance of MLMC radon (change of measure) and standard MC radon at the same level of accuracy. It can be seen that applying multilevel Monte Carlo greatly reduces the variance. Moreover, while the variance for standard Monte Carlo remains approximately constant across levels, the variance of the multilevel corrections (figure 2.1b) decreases as the level increases. This, together with the decrease in h_l, also produces the downward trend in figure 2.3, indicating that fewer paths are needed as the level increases. In figure 2.4a, we can clearly see that MLMC is approximately 20 times more efficient than standard MC when the change of measure method is used.

As shown in figure 2.1b, the change of measure approach and splitting share similar slopes, approximately \beta = 1.25, while the conditional expectation approach has \beta = 1.5. Thus, according to the MLMC theorem 1.2.1, the computational complexity C should be bounded by O(\epsilon^{-2}), which is confirmed by figure 2.4b: the three curves there are roughly constant, as expected. The method of conditional expectation is the most efficient, followed by splitting, and then the change of measure method.

Figure 2.2 shows that for these three approaches E[\hat{P}_l - \hat{P}_{l-1}] is approximately O(h_l), verifying the first order weak convergence (2.9), \alpha = 1. This is used to determine the number of levels to be used.

Using equation (1.7), we can plot the number of samples needed on each level in order to achieve a prescribed accuracy, as in figure 2.3. We see that the maximum level required increases as the desired accuracy \epsilon decreases. It is also worth noticing that for the conditional expectation approach we need only one timestep on l = 0: we do not simulate the path at all, but just use equation (2.3) to evaluate the payoff.
This striking difference makes the conditional expectation approach extremely efficient, and it should be considered the optimal choice whenever applicable. In general, the conditional expectation approach outperforms the other two. The reason for this high efficiency is the availability of an analytic expression for the conditional expectation computed on each path. This may not be available in higher dimensions, where the other two methods can fully display their advantages.
Figure 2.1: Variance for the option price ((a) MLMC radon and MC radon; (b) MLMC radon, con. ex., and splitting; level l vs log2 variance)

Figure 2.2: Mean for the option price ((a) MLMC radon and MC radon; (b) MLMC radon, con. ex., and splitting; level l vs log2 mean)
Figure 2.3: Number of samples in each level for the option price ((a) MLMC con; (b) MLMC radon; (c) MLMC splitting; level l vs N_l for \epsilon = 0.0001 to 0.002)

Figure 2.4: Accuracy for the option price ((a) MLMC radon and MC radon; (b) MLMC radon, con. ex., and splitting; \epsilon vs \epsilon^2 Cost)
2.5.2 Numerical Results for computing the vega

Unlike the case of the option value, when computing the vega the variance for standard MC increases rapidly as the level becomes large, as shown in figure 2.5a. Interestingly, MLMC does not help much with efficiency in this case, as shown in figure 2.8a. The change of measure method requires many more samples on each level (figure 2.7). It is still less efficient than the conditional expectation method (figure 2.8b), and both approaches show a moderate variance slope (figure 2.5b), indicating that $\beta < 1$. Figure 2.6 confirms first order weak convergence; the small variations may result from the limited sample size in this experiment. In conclusion, both methods perform noticeably worse when computing the vega, and conditional expectation remains the most efficient choice.

[Figure 2.5: Variance for the vega. (a) MLMC radon and MC radon; (b) MLMC radon, and con. ex.]
[Figure 2.6: Mean for the vega. (a) MLMC radon and MC radon; (b) MLMC radon, and con. ex.]

[Figure 2.7: Number of samples in each level for the vega, for ε = 0.001 to 0.02. (a) MLMC con; (b) MLMC radon.]
[Figure 2.8: Accuracy for the vega (ε² Cost against ε). (a) MLMC radon and MC radon; (b) MLMC radon, and con. ex.]
Chapter 3

Digital options with multi-dimensional SDEs

We now consider a 2D case. Two stocks follow geometric Brownian motion
\[
dS_1 = r S_1\,dt + \sigma_1 S_1\,dW_1, \qquad
dS_2 = r S_2\,dt + \sigma_2 S_2\,dW_2,
\]
where $W_1$ and $W_2$ are Brownian motions with correlation $\rho$. An option pays 1 unit at maturity if both stocks finish above the strike $K$, and 0 otherwise. We also consider the sensitivity with respect to the parameter $\sigma_1$. Using the Milstein scheme, we have the discretisation ($i = 1, 2$)
\[
\hat{D}_{i,n} = 1 + rh + \sigma_i \Delta\hat{W}_{i,n} + \tfrac{1}{2}\sigma_i^2\big(\Delta\hat{W}_{i,n}^2 - h\big), \qquad
\hat{S}_{i,n+1} = \hat{S}_{i,n}\,\hat{D}_{i,n},
\]
\[
\hat{s}_{1,n+1} = \hat{s}_{1,n}\,\hat{D}_{1,n} + \hat{S}_{1,n}\big[\Delta\hat{W}_{1,n} + \sigma_1\big(\Delta\hat{W}_{1,n}^2 - h\big)\big],
\]
where $\hat{s}_{1,n} = \partial\hat{S}_{1,n}/\partial\sigma_1$. The correlated Brownian increments are easily generated as
\[
\Delta W_n = \begin{pmatrix}\Delta W_{1,n}\\ \Delta W_{2,n}\end{pmatrix}
= \sqrt{h}\begin{pmatrix}1 & 0\\ \rho & \sqrt{1-\rho^2}\end{pmatrix} Z,
\]
where $Z$ is a vector of i.i.d. unit Normal random variables. For the fine path, we stop one step before maturity and use the Euler–Maruyama discretisation for the last step, i.e. ($i = 1, 2$)
\[
\hat{D}^f_{i,N-1} = 1 + rh + \sigma_i \Delta\hat{W}_{i,N-1}, \qquad
\hat{S}^f_{i,N} = \hat{S}^f_{i,N-1}\,\hat{D}^f_{i,N-1}.
\]
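As a concrete illustration, the Milstein discretisation above, together with the Cholesky construction of the correlated increments, can be sketched as follows (a simplified sketch with illustrative names, not the thesis code; the sensitivity recursion and the multilevel fine/coarse coupling are omitted):

```python
import numpy as np

def milstein_terminal(S0, r, sigma, rho, T, n_steps, n_paths, rng):
    """Milstein simulation of the two correlated GBMs; returns terminal values."""
    h = T / n_steps
    # Cholesky factor of the correlation matrix [[1, rho], [rho, 1]]
    L = np.array([[1.0, 0.0], [rho, np.sqrt(1.0 - rho**2)]])
    S = np.full((n_paths, 2), S0, dtype=float)
    for _ in range(n_steps):
        Z = rng.standard_normal((n_paths, 2))
        dW = np.sqrt(h) * Z @ L.T                      # correlated increments
        for i in range(2):
            D = 1.0 + r*h + sigma[i]*dW[:, i] + 0.5*sigma[i]**2*(dW[:, i]**2 - h)
            S[:, i] *= D
    return S

rng = np.random.default_rng(0)
S = milstein_terminal(1.0, 0.05, [0.1, 0.2], 0.5, 1.0, 32, 20000, rng)
```

Two simple sanity checks: the sample mean of each terminal value should be close to $e^{rT}S_0$, and the sample correlation of the log prices should be close to $\rho$.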
Then, given $S^f_{N-1} = \hat{S}^f_{N-1}$, the final state of the fine path has the conditional p.d.f.
\[
p^f(x) = \frac{1}{2\pi\sqrt{|\Omega^f|}}\exp\Big\{-\tfrac{1}{2}(x - a^f)^T(\Omega^f)^{-1}(x - a^f)\Big\},
\]
where
\[
a^f = \begin{pmatrix} S^f_{1,N-1} + r S^f_{1,N-1} h \\ S^f_{2,N-1} + r S^f_{2,N-1} h \end{pmatrix}
:= \begin{pmatrix} a^f_1 \\ a^f_2 \end{pmatrix},
\qquad
\Omega^f = \begin{pmatrix}
\sigma_1^2 (S^f_{1,N-1})^2 h & \sigma_1\sigma_2 S^f_{1,N-1} S^f_{2,N-1}\rho h \\
\sigma_1\sigma_2 S^f_{1,N-1} S^f_{2,N-1}\rho h & \sigma_2^2 (S^f_{2,N-1})^2 h
\end{pmatrix}
:= \begin{pmatrix} (\sigma^f_{11})^2 & \rho\,\sigma^f_{11}\sigma^f_{22} \\ \rho\,\sigma^f_{11}\sigma^f_{22} & (\sigma^f_{22})^2 \end{pmatrix},
\]
with $\sigma^f_{ii} = \sigma_i S^f_{i,N-1}\sqrt{h}$. Note that $\Omega^f$ can be factorised as $\Omega^f = h\,M^f\Sigma M^f$, where
\[
M^f = \begin{pmatrix} S^f_{1,N-1}\sigma_1 & 0 \\ 0 & S^f_{2,N-1}\sigma_2 \end{pmatrix}, \qquad
\Sigma = \begin{pmatrix} 1 & \rho \\ \rho & 1 \end{pmatrix}.
\]
In this way, we can easily calculate the determinant and inverse of the matrix $\Omega^f$:
\[
\det\Omega^f = h^2(\det M^f)^2\det\Sigma = h^2\big(S^f_{1,N-1} S^f_{2,N-1}\sigma_1\sigma_2\big)^2(1-\rho^2),
\]
\[
(\Omega^f)^{-1} = \frac{1}{h}(M^f)^{-1}\Sigma^{-1}(M^f)^{-1}
= \frac{1}{h(1-\rho^2)}\begin{pmatrix}
\dfrac{1}{(S^f_{1,N-1}\sigma_1)^2} & \dfrac{-\rho}{\sigma_1\sigma_2 S^f_{1,N-1} S^f_{2,N-1}} \\[2ex]
\dfrac{-\rho}{\sigma_1\sigma_2 S^f_{1,N-1} S^f_{2,N-1}} & \dfrac{1}{(S^f_{2,N-1}\sigma_2)^2}
\end{pmatrix}.
\]
The p.d.f. for $S^f_N$ is then
\[
p^f(x) = \frac{1}{2\pi h S^f_{1,N-1} S^f_{2,N-1}\sigma_1\sigma_2\sqrt{1-\rho^2}}
\exp\Big\{-\frac{1}{2h(1-\rho^2)}\Big[
\frac{(x_1 - a^f_1)^2}{(S^f_{1,N-1}\sigma_1)^2}
+ \frac{(x_2 - a^f_2)^2}{(S^f_{2,N-1}\sigma_2)^2}
- \frac{2\rho(x_1 - a^f_1)(x_2 - a^f_2)}{\sigma_1\sigma_2 S^f_{1,N-1} S^f_{2,N-1}}
\Big]\Big\}.
\]
Similarly, for the last step of the coarse path, which spans the final two fine timesteps, the Euler scheme gives ($i = 1, 2$)
\[
\hat{D}^c_i = 1 + 2rh + \sigma_i\big(\Delta W_{i,N-2} + \Delta W_{i,N-1}\big), \qquad
\hat{S}^c_{i,N} = \hat{S}^c_{i,N-2}\,\hat{D}^c_i.
\]
Conditioning on the Brownian path up to timestep $N-1$ (so that only $\Delta W_{i,N-1}$ remains random), define
\[
a^c = \begin{pmatrix}
S^c_{1,N-2} + 2r S^c_{1,N-2} h + \sigma_1 S^c_{1,N-2}\,\Delta W_{1,N-2} \\
S^c_{2,N-2} + 2r S^c_{2,N-2} h + \sigma_2 S^c_{2,N-2}\,\Delta W_{2,N-2}
\end{pmatrix}
:= \begin{pmatrix} a^c_1 \\ a^c_2 \end{pmatrix}.
\]
\[
\Omega^c = \begin{pmatrix}
\sigma_1^2 (S^c_{1,N-2})^2 h & \sigma_1\sigma_2 S^c_{1,N-2} S^c_{2,N-2}\rho h \\
\sigma_1\sigma_2 S^c_{1,N-2} S^c_{2,N-2}\rho h & \sigma_2^2 (S^c_{2,N-2})^2 h
\end{pmatrix}
:= \begin{pmatrix} (\sigma^c_{11})^2 & \rho\,\sigma^c_{11}\sigma^c_{22} \\ \rho\,\sigma^c_{11}\sigma^c_{22} & (\sigma^c_{22})^2 \end{pmatrix},
\]
with $\sigma^c_{ii} = \sigma_i S^c_{i,N-2}\sqrt{h}$. Then the p.d.f. for the final state of the coarse path is
\[
p^c(x) = \frac{1}{2\pi\sqrt{|\Omega^c|}}\exp\Big\{-\tfrac{1}{2}(x-a^c)^T(\Omega^c)^{-1}(x-a^c)\Big\}
= \frac{1}{2\pi h S^c_{1,N-2} S^c_{2,N-2}\sigma_1\sigma_2\sqrt{1-\rho^2}}
\exp\Big\{-\frac{1}{2h(1-\rho^2)}\Big[
\frac{(x_1-a^c_1)^2}{(S^c_{1,N-2}\sigma_1)^2}
+ \frac{(x_2-a^c_2)^2}{(S^c_{2,N-2}\sigma_2)^2}
- \frac{2\rho(x_1-a^c_1)(x_2-a^c_2)}{\sigma_1\sigma_2 S^c_{1,N-2} S^c_{2,N-2}}
\Big]\Big\}.
\]

3.1 Conditional Expectation Approach

Given the distributions of both the fine and coarse final states, it is easy to express the option price in terms of bivariate normal probabilities. Generally speaking, the numerical computation of multivariate normal probabilities is no easy task; this is the main reason we usually turn to other methods when dealing with options in multiple dimensions. However, for 2D normal probabilities, Genz [4] has provided a simple and relatively accurate numerical approximation. According to the option pricing formula,
\[
\mathbb{E}\big[\mathbf{1}_{\{S_{1,N}\ge K,\,S_{2,N}\ge K\}}\big]
= P\big(S_{1,N}\ge K,\ S_{2,N}\ge K\big)
= P\Big(\frac{S_{1,N}-a_1}{\sigma_{11}}\ge\frac{K-a_1}{\sigma_{11}},\ \frac{S_{2,N}-a_2}{\sigma_{22}}\ge\frac{K-a_2}{\sigma_{22}}\Big)
= P\Big(Z_1\ge\frac{K-a_1}{\sigma_{11}},\ Z_2\ge\frac{K-a_2}{\sigma_{22}}\Big),
\]
where $Z_1, Z_2$ are unit Normal random variables with correlation $\rho$. This probability can be approximated by the numerical methods described above. To calculate the Greeks, we only need to differentiate this probability with respect to the parameter $\theta$ for both the fine and the coarse path. To this end, we use the notation $s_{1,n} = \partial S_{1,n}/\partial\theta$ and compute $s_{1,n}$ at each timestep. Let
\[
C_1 = \frac{K-a_1}{\sigma_{11}}, \qquad C_2 = \frac{K-a_2}{\sigma_{22}},
\]
and denote by $f(z_1, z_2)$ the joint probability density of the bivariate Normal random variables $Z_1$ and $Z_2$. The price of the option before discounting is thus
\[
C(S_0, 0; T, K; r) = \int_{C_1}^{\infty}\!\!\int_{C_2}^{\infty} f(z_1, z_2)\,dz_2\,dz_1.
\]
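For instance, the upper-tail probability above can be evaluated with SciPy's bivariate normal CDF, which is based on Genz's algorithm; the small wrapper below is our own illustrative helper, not the thesis code:

```python
from scipy.stats import multivariate_normal

def upper_tail_prob(C1, C2, rho):
    """P(Z1 >= C1, Z2 >= C2) for standard normals with correlation rho.
    By symmetry of the centred bivariate normal this equals
    P(Z1 <= -C1, Z2 <= -C2), i.e. a single CDF evaluation."""
    dist = multivariate_normal(mean=[0.0, 0.0], cov=[[1.0, rho], [rho, 1.0]])
    return float(dist.cdf([-C1, -C2]))
```

A quick check against known values: for $C_1 = C_2 = 0$ the orthant probability is $\tfrac14 + \arcsin(\rho)/(2\pi)$, giving $1/4$ at $\rho = 0$ and $1/3$ at $\rho = 0.5$.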
Differentiation gives
\[
\frac{dC}{d\theta} = \frac{\partial C}{\partial\theta}
+ \frac{\partial C}{\partial S_{1,N-1}}\frac{\partial S_{1,N-1}}{\partial\theta}
+ \frac{\partial C}{\partial S_{2,N-1}}\frac{\partial S_{2,N-1}}{\partial\theta}.
\]
When $\theta = \sigma_1$, the second asset does not depend on $\sigma_1$, so we only need
\[
\frac{dC}{d\sigma_1} = \frac{\partial C}{\partial\sigma_1}
+ \frac{\partial C}{\partial S_{1,N-1}}\frac{\partial S_{1,N-1}}{\partial\sigma_1}.
\]
For the fine path, since $a^f_1$ does not depend on $\sigma_1$,
\[
\frac{\partial C^f}{\partial\sigma_1}
= -\frac{\partial C^f_1}{\partial\sigma_1}\int_{C^f_2}^{\infty} f(C^f_1, z_2)\,dz_2
= -\frac{\partial C^f_1}{\partial\sigma_1}\int_{C^f_2}^{\infty}\frac{1}{2\pi\sqrt{1-\rho^2}}
\exp\Big(-\frac{(C^f_1)^2 + z_2^2 - 2\rho C^f_1 z_2}{2(1-\rho^2)}\Big)dz_2
\]
\[
= -\frac{\partial C^f_1}{\partial\sigma_1}\,\frac{1}{\sqrt{2\pi}}e^{-(C^f_1)^2/2}
\int_{C^f_2}^{\infty}\frac{1}{\sqrt{2\pi(1-\rho^2)}}
\exp\Big(-\frac{(z_2-\rho C^f_1)^2}{2(1-\rho^2)}\Big)dz_2
= -\frac{\partial C^f_1}{\partial\sigma_1}\,\phi(C^f_1)\,\Phi\Big(\frac{\rho C^f_1 - C^f_2}{\sqrt{1-\rho^2}}\Big)
\]
\[
= \frac{C^f_1}{\sigma^f_{11}}\frac{\partial\sigma^f_{11}}{\partial\sigma_1}\,
\phi(C^f_1)\,\Phi\Big(\frac{\rho C^f_1 - C^f_2}{\sqrt{1-\rho^2}}\Big),
\]
and similarly
\[
\frac{\partial C^f}{\partial S^f_{1,N-1}}
= -\Big(\frac{\partial C^f_1}{\partial a^f_1}\frac{\partial a^f_1}{\partial S^f_{1,N-1}}
+ \frac{\partial C^f_1}{\partial\sigma^f_{11}}\frac{\partial\sigma^f_{11}}{\partial S^f_{1,N-1}}\Big)
\phi(C^f_1)\,\Phi\Big(\frac{\rho C^f_1 - C^f_2}{\sqrt{1-\rho^2}}\Big),
\]
where
\[
\frac{\partial C^f_1}{\partial\sigma^f_{11}} = -\frac{C^f_1}{\sigma^f_{11}}, \quad
\frac{\partial C^f_1}{\partial a^f_1} = -\frac{1}{\sigma^f_{11}}, \quad
\frac{\partial\sigma^f_{11}}{\partial\sigma_1} = S^f_{1,N-1}\sqrt{h}, \quad
\frac{\partial\sigma^f_{11}}{\partial S^f_{1,N-1}} = \sigma_1\sqrt{h}, \quad
\frac{\partial a^f_1}{\partial S^f_{1,N-1}} = 1 + rh.
\]
Similarly, for the coarse path,
\[
\frac{\partial C^c}{\partial\sigma_1}
= -\frac{\partial C^c_1}{\partial\sigma_1}\int_{C^c_2}^{\infty} f(C^c_1, z_2)\,dz_2
= -\Big(\frac{\partial C^c_1}{\partial a^c_1}\frac{\partial a^c_1}{\partial\sigma_1}
+ \frac{\partial C^c_1}{\partial\sigma^c_{11}}\frac{\partial\sigma^c_{11}}{\partial\sigma_1}\Big)
\phi(C^c_1)\,\Phi\Big(\frac{\rho C^c_1 - C^c_2}{\sqrt{1-\rho^2}}\Big),
\]
\[
\frac{\partial C^c}{\partial S^c_{1,N-2}}
= -\Big(\frac{\partial C^c_1}{\partial a^c_1}\frac{\partial a^c_1}{\partial S^c_{1,N-2}}
+ \frac{\partial C^c_1}{\partial\sigma^c_{11}}\frac{\partial\sigma^c_{11}}{\partial S^c_{1,N-2}}\Big)
\phi(C^c_1)\,\Phi\Big(\frac{\rho C^c_1 - C^c_2}{\sqrt{1-\rho^2}}\Big),
\]
where
\[
\frac{\partial C^c_1}{\partial\sigma^c_{11}} = -\frac{C^c_1}{\sigma^c_{11}}, \quad
\frac{\partial C^c_1}{\partial a^c_1} = -\frac{1}{\sigma^c_{11}}, \quad
\frac{\partial\sigma^c_{11}}{\partial\sigma_1} = S^c_{1,N-2}\sqrt{h}, \quad
\frac{\partial\sigma^c_{11}}{\partial S^c_{1,N-2}} = \sigma_1\sqrt{h},
\]
\[
\frac{\partial a^c_1}{\partial\sigma_1} = S^c_{1,N-2}\,\Delta W_{1,N-2}, \qquad
\frac{\partial a^c_1}{\partial S^c_{1,N-2}} = 1 + 2rh + \sigma_1\Delta W_{1,N-2}.
\]

3.2 Splitting

The conditional expectation method is the most efficient in low dimensions because the option price is a simple conditional expectation. As the dimension grows, this approach gradually becomes impractical owing to the difficulty of computing multivariate normal probabilities. The method of splitting, however, is easily adapted to higher dimensions. In the 2D case, we revert to using the Milstein discretisation for the final timestep, i.e. ($i = 1, 2$)
\[
\hat{D}_{i,N-1} = 1 + rh + \sigma_i\Delta\hat{W}_{i,N-1} + \tfrac{1}{2}\sigma_i^2\big(\Delta\hat{W}_{i,N-1}^2 - h\big), \qquad
\hat{S}_{i,N} = \hat{S}_{i,N-1}\,\hat{D}_{i,N-1}.
\]
We then generate $N_3$ sub-samples for the last timestep and average over them to obtain an average payoff. Again, with an appropriate number of sub-samples, the variance and the computational cost remain the same as in the 1D case to leading order. Pathwise sensitivity cannot be applied in this context to compute the Greeks; other hybrid methods, such as the vibrato Monte Carlo method, can be applied, but their discussion is beyond the scope of this paper.
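A sketch of the splitting estimator for a single fine path (illustrative names, not the thesis code; the pairing with the coarse path and the level-dependent choice of $N_3$ are omitted):

```python
import numpy as np

def split_payoff(S_nm1, r, sigma, rho, h, K, n_sub, rng):
    """Average digital payoff over n_sub re-simulations of the final
    Milstein timestep, all started from the common state S_{N-1}."""
    # Cholesky factor of the correlation matrix [[1, rho], [rho, 1]]
    L = np.array([[1.0, 0.0], [rho, np.sqrt(1.0 - rho**2)]])
    dW = np.sqrt(h) * rng.standard_normal((n_sub, 2)) @ L.T
    S_N = np.empty((n_sub, 2))
    for i in range(2):
        D = 1.0 + r*h + sigma[i]*dW[:, i] + 0.5*sigma[i]**2*(dW[:, i]**2 - h)
        S_N[:, i] = S_nm1[i] * D
    return float(np.mean((S_N[:, 0] >= K) & (S_N[:, 1] >= K)))
```

A path that is far in the money at $t_{N-1}$ should give an average payoff of essentially 1, and a path far out of the money essentially 0, with intermediate values only near the strike.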
3.3 Change of Measure Method

As in the 1D case, the change of measure method is related to both the conditional expectation method and the splitting technique, and it combines the strengths of the two: it is adaptable to high dimensions and it makes the Greeks easy to calculate. As an illustration we again consider the bivariate case. Besides the distributions of the fine and coarse final states, we introduce a third, Gaussian distribution with mean $a^g$ and covariance $\Omega^g$, to be defined below, with p.d.f.
\[
p^g = \frac{1}{2\pi\sqrt{|\Omega^g|}}\exp\Big\{-\tfrac{1}{2}\big(S^g_N - a^g\big)^T(\Omega^g)^{-1}\big(S^g_N - a^g\big)\Big\}.
\]
Then the Radon–Nikodym derivatives for the fine and coarse paths are
\[
R^f = \frac{p^f}{p^g} = \sqrt{\frac{|\Omega^g|}{|\Omega^f|}}
\exp\Big\{\tfrac{1}{2}\big(S^g_N - a^g\big)^T(\Omega^g)^{-1}\big(S^g_N - a^g\big)
- \tfrac{1}{2}\big(S^g_N - a^f\big)^T(\Omega^f)^{-1}\big(S^g_N - a^f\big)\Big\},
\]
\[
R^c = \frac{p^c}{p^g} = \sqrt{\frac{|\Omega^g|}{|\Omega^c|}}
\exp\Big\{\tfrac{1}{2}\big(S^g_N - a^g\big)^T(\Omega^g)^{-1}\big(S^g_N - a^g\big)
- \tfrac{1}{2}\big(S^g_N - a^c\big)^T(\Omega^c)^{-1}\big(S^g_N - a^c\big)\Big\}.
\]
Given the Brownian path up to timestep $N-1$, the estimators for the two paths are
\[
\hat{P}^f = \frac{1}{N_3}\sum_{i=1}^{N_3}\mathbf{1}_{\{\hat{S}^g_{1,N}\ge K,\,\hat{S}^g_{2,N}\ge K\}}\hat{R}^f, \qquad
\hat{P}^c = \frac{1}{N_3}\sum_{i=1}^{N_3}\mathbf{1}_{\{\hat{S}^g_{1,N}\ge K,\,\hat{S}^g_{2,N}\ge K\}}\hat{R}^c.
\]
Taking the mean of the distribution $g$ to be the average of those of $f$ and $c$, we define
\[
a^g = \tfrac{1}{2}\big(a^f + a^c\big) := \begin{pmatrix} a^g_1 \\ a^g_2 \end{pmatrix}, \qquad
S^g_{N-1} = \tfrac{1}{2}\big(S^f_{N-1} + S^c_{N-2}\big) := \begin{pmatrix} S^g_{1,N-1} \\ S^g_{2,N-1} \end{pmatrix},
\]
\[
\Omega^g = \begin{pmatrix}
(\sigma^b_1)^2 (S^g_{1,N-1})^2 h & \sigma^b_1\sigma^b_2 S^g_{1,N-1} S^g_{2,N-1}\rho h \\
\sigma^b_1\sigma^b_2 S^g_{1,N-1} S^g_{2,N-1}\rho h & (\sigma^b_2)^2 (S^g_{2,N-1})^2 h
\end{pmatrix},
\]
where $\sigma^b_1$ and $\sigma^b_2$ are baseline values which take the same values as $\sigma_1$ and $\sigma_2$ but are treated as constants when calculating the Greeks. We can then simulate $S^g_N$ as follows ($i = 1, 2$):
\[
S^g_{i,N} = a^g_i + \sigma^b_i S^g_{i,N-1}\,\Delta\hat{W}_{i,N-1}.
\]
The p.d.f. for $g$ is
\[
p^g = \frac{1}{2\pi h S^g_{1,N-1} S^g_{2,N-1}\sigma^b_1\sigma^b_2\sqrt{1-\rho^2}}
\exp\Big\{-\frac{1}{2h(1-\rho^2)}\Big[
\frac{(S^g_{1,N}-a^g_1)^2}{(S^g_{1,N-1}\sigma^b_1)^2}
+ \frac{(S^g_{2,N}-a^g_2)^2}{(S^g_{2,N-1}\sigma^b_2)^2}
- \frac{2\rho(S^g_{1,N}-a^g_1)(S^g_{2,N}-a^g_2)}{\sigma^b_1\sigma^b_2 S^g_{1,N-1} S^g_{2,N-1}}
\Big]\Big\}.
\]
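Since all three densities are Gaussian, the likelihood ratios are straightforward to evaluate numerically; the sketch below (our own helper, using SciPy's multivariate normal pdf) is a useful cross-check on the closed-form expressions:

```python
from scipy.stats import multivariate_normal

def rn_weights(x, a_f, Omega_f, a_c, Omega_c, a_g, Omega_g):
    """R_f = p_f(x)/p_g(x) and R_c = p_c(x)/p_g(x) for a final state x
    drawn from the Gaussian importance distribution g."""
    p_g = multivariate_normal(a_g, Omega_g).pdf(x)
    return (multivariate_normal(a_f, Omega_f).pdf(x) / p_g,
            multivariate_normal(a_c, Omega_c).pdf(x) / p_g)
```

In the degenerate case where $f$, $c$ and $g$ coincide, both weights must equal 1 for every $x$, which gives a simple unit test of any hand-coded formula.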
Thus we have
\[
R^f = \frac{S^g_{1,N-1} S^g_{2,N-1}\sigma^b_1\sigma^b_2}{S^f_{1,N-1} S^f_{2,N-1}\sigma_1\sigma_2}
\exp\Big\{\frac{1}{2h(1-\rho^2)}\Big[
\frac{(S^g_{1,N}-a^g_1)^2}{(S^g_{1,N-1}\sigma^b_1)^2}
+ \frac{(S^g_{2,N}-a^g_2)^2}{(S^g_{2,N-1}\sigma^b_2)^2}
- \frac{2\rho(S^g_{1,N}-a^g_1)(S^g_{2,N}-a^g_2)}{\sigma^b_1\sigma^b_2 S^g_{1,N-1} S^g_{2,N-1}}
- \frac{(S^g_{1,N}-a^f_1)^2}{(S^f_{1,N-1}\sigma_1)^2}
- \frac{(S^g_{2,N}-a^f_2)^2}{(S^f_{2,N-1}\sigma_2)^2}
+ \frac{2\rho(S^g_{1,N}-a^f_1)(S^g_{2,N}-a^f_2)}{\sigma_1\sigma_2 S^f_{1,N-1} S^f_{2,N-1}}
\Big]\Big\} := S_f\,v_f,
\]
\[
R^c = \frac{S^g_{1,N-1} S^g_{2,N-1}\sigma^b_1\sigma^b_2}{S^c_{1,N-2} S^c_{2,N-2}\sigma_1\sigma_2}
\exp\Big\{\frac{1}{2h(1-\rho^2)}\Big[
\frac{(S^g_{1,N}-a^g_1)^2}{(S^g_{1,N-1}\sigma^b_1)^2}
+ \frac{(S^g_{2,N}-a^g_2)^2}{(S^g_{2,N-1}\sigma^b_2)^2}
- \frac{2\rho(S^g_{1,N}-a^g_1)(S^g_{2,N}-a^g_2)}{\sigma^b_1\sigma^b_2 S^g_{1,N-1} S^g_{2,N-1}}
- \frac{(S^g_{1,N}-a^c_1)^2}{(S^c_{1,N-2}\sigma_1)^2}
- \frac{(S^g_{2,N}-a^c_2)^2}{(S^c_{2,N-2}\sigma_2)^2}
+ \frac{2\rho(S^g_{1,N}-a^c_1)(S^g_{2,N}-a^c_2)}{\sigma_1\sigma_2 S^c_{1,N-2} S^c_{2,N-2}}
\Big]\Big\} := S_c\,v_c,
\]
where $S_f, S_c$ denote the ratio prefactors and $v_f, v_c$ the exponential factors. According to the earlier discussion, the estimator for the Greeks on each level is
\[
\frac{d\hat{P}}{d\theta} = \frac{1}{N_3}\sum_{i=1}^{N_3}
\mathbf{1}_{\{\hat{S}^g_{1,N}\ge K,\,\hat{S}^g_{2,N}\ge K\}}
\Big(\frac{d\hat{R}^f}{d\theta} - \frac{d\hat{R}^c}{d\theta}\Big).
\]
In particular, we calculate the vega with respect to the first asset. For the fine path, since $a^f_1$ does not depend on $\sigma_1$,
\[
\frac{\partial R^f}{\partial\sigma_1}
= \frac{\partial S_f}{\partial\sigma_1}v_f + S_f\frac{\partial v_f}{\partial\sigma_1}
= -\frac{S_f v_f}{\sigma_1}
+ \frac{S_f v_f}{2h(1-\rho^2)}\Big[
\frac{2(S^g_{1,N}-a^f_1)^2}{(S^f_{1,N-1})^2\sigma_1^3}
- \frac{2\rho(S^g_{1,N}-a^f_1)(S^g_{2,N}-a^f_2)}{\sigma_1^2\sigma_2 S^f_{1,N-1} S^f_{2,N-1}}\Big]
\]
\[
= -\frac{R^f}{\sigma_1}
+ \frac{S_f v_f}{h(1-\rho^2)}\Big[
\frac{(S^g_{1,N}-a^f_1)^2}{(S^f_{1,N-1})^2\sigma_1^3}
- \frac{\rho(S^g_{1,N}-a^f_1)(S^g_{2,N}-a^f_2)}{\sigma_1^2\sigma_2 S^f_{1,N-1} S^f_{2,N-1}}\Big].
\]
Similarly,
\[
\frac{\partial R^f}{\partial S^f_{1,N-1}}
= -\frac{S_f v_f}{S^f_{1,N-1}}
+ \frac{S_f v_f}{2h(1-\rho^2)}\Big[
\frac{2(S^g_{1,N}-a^f_1)\,\frac{\partial a^f_1}{\partial S^f_{1,N-1}}}{(S^f_{1,N-1}\sigma_1)^2}
+ \frac{2(S^g_{1,N}-a^f_1)^2}{\sigma_1^2(S^f_{1,N-1})^3}
- \frac{2\rho(S^g_{2,N}-a^f_2)\big(S^f_{1,N-1}\frac{\partial a^f_1}{\partial S^f_{1,N-1}} + S^g_{1,N}-a^f_1\big)}{\sigma_1\sigma_2(S^f_{1,N-1})^2 S^f_{2,N-1}}\Big]
\]
\[
= -\frac{S_f v_f}{S^f_{1,N-1}}
+ \frac{S_f v_f}{h(1-\rho^2)}\Big[
\frac{(S^g_{1,N}-a^f_1)\,\frac{\partial a^f_1}{\partial S^f_{1,N-1}}}{(S^f_{1,N-1}\sigma_1)^2}
+ \frac{(S^g_{1,N}-a^f_1)^2}{\sigma_1^2(S^f_{1,N-1})^3}
- \frac{\rho(S^g_{2,N}-a^f_2)\big(S^f_{1,N-1}\frac{\partial a^f_1}{\partial S^f_{1,N-1}} + S^g_{1,N}-a^f_1\big)}{\sigma_1\sigma_2(S^f_{1,N-1})^2 S^f_{2,N-1}}\Big],
\]
where $\partial a^f_1/\partial S^f_{1,N-1} = 1 + rh$. Likewise, for the coarse path,
\[
\frac{dR^c}{d\sigma_1} = \frac{\partial R^c}{\partial\sigma_1}
+ \frac{\partial R^c}{\partial S^c_{1,N-2}}\frac{\partial S^c_{1,N-2}}{\partial\sigma_1},
\]
with
\[
\frac{\partial R^c}{\partial\sigma_1}
= \frac{\partial S_c}{\partial\sigma_1}v_c + S_c\frac{\partial v_c}{\partial\sigma_1}
= -\frac{S_c v_c}{\sigma_1}
+ \frac{S_c v_c}{2h(1-\rho^2)}\Big[
\frac{2(S^g_{1,N}-a^c_1)\,\frac{\partial a^c_1}{\partial\sigma_1}}{(S^c_{1,N-2}\sigma_1)^2}
+ \frac{2(S^g_{1,N}-a^c_1)^2}{(S^c_{1,N-2})^2\sigma_1^3}
- \frac{2\rho(S^g_{2,N}-a^c_2)\,\frac{\partial a^c_1}{\partial\sigma_1}}{\sigma_1\sigma_2 S^c_{1,N-2} S^c_{2,N-2}}
- \frac{2\rho(S^g_{1,N}-a^c_1)(S^g_{2,N}-a^c_2)}{\sigma_1^2\sigma_2 S^c_{1,N-2} S^c_{2,N-2}}\Big]
\]
\[
= -\frac{S_c v_c}{\sigma_1}
+ \frac{S_c v_c}{h(1-\rho^2)}\Big[
\frac{(S^g_{1,N}-a^c_1)\big(\sigma_1\frac{\partial a^c_1}{\partial\sigma_1} + S^g_{1,N}-a^c_1\big)}{(S^c_{1,N-2})^2\sigma_1^3}
- \frac{\rho(S^g_{2,N}-a^c_2)\big(\sigma_1\frac{\partial a^c_1}{\partial\sigma_1} + S^g_{1,N}-a^c_1\big)}{\sigma_1^2\sigma_2 S^c_{1,N-2} S^c_{2,N-2}}\Big].
\]
\[
\frac{\partial R^c}{\partial S^c_{1,N-2}}
= -\frac{S_c v_c}{S^c_{1,N-2}}
+ \frac{S_c v_c}{2h(1-\rho^2)}\Big[
\frac{2(S^g_{1,N}-a^c_1)\,\frac{\partial a^c_1}{\partial S^c_{1,N-2}}}{(S^c_{1,N-2}\sigma_1)^2}
+ \frac{2(S^g_{1,N}-a^c_1)^2}{\sigma_1^2(S^c_{1,N-2})^3}
- \frac{2\rho(S^g_{2,N}-a^c_2)\big(S^c_{1,N-2}\frac{\partial a^c_1}{\partial S^c_{1,N-2}} + S^g_{1,N}-a^c_1\big)}{\sigma_1\sigma_2(S^c_{1,N-2})^2 S^c_{2,N-2}}\Big]
\]
\[
= -\frac{S_c v_c}{S^c_{1,N-2}}
+ \frac{S_c v_c}{h(1-\rho^2)}\Big[
\frac{(S^g_{1,N}-a^c_1)\,\frac{\partial a^c_1}{\partial S^c_{1,N-2}}}{(S^c_{1,N-2}\sigma_1)^2}
+ \frac{(S^g_{1,N}-a^c_1)^2}{\sigma_1^2(S^c_{1,N-2})^3}
- \frac{\rho(S^g_{2,N}-a^c_2)\big(S^c_{1,N-2}\frac{\partial a^c_1}{\partial S^c_{1,N-2}} + S^g_{1,N}-a^c_1\big)}{\sigma_1\sigma_2(S^c_{1,N-2})^2 S^c_{2,N-2}}\Big],
\]
where
\[
\frac{\partial a^c_1}{\partial\sigma_1} = S^c_{1,N-2}\,\Delta W_{1,N-2}, \qquad
\frac{\partial a^c_1}{\partial S^c_{1,N-2}} = 1 + 2rh + \sigma_1\Delta W_{1,N-2}.
\]

3.4 Computational Results

3.4.1 Comparison with Analytical Values without correlation

Assume $\rho = 0$ for the two assets. First we work out the Black–Scholes price and the sensitivity with respect to the first volatility. Based on our assumptions on the price processes, we have
\[
S^{(1)}_T = S^{(1)}_0\exp\Big\{\big(r - \tfrac{1}{2}\sigma_1^2\big)T + \sigma_1 W^{(1)}_T\Big\}, \qquad
S^{(2)}_T = S^{(2)}_0\exp\Big\{\big(r - \tfrac{1}{2}\sigma_2^2\big)T + \sigma_2 W^{(2)}_T\Big\},
\]
where $W^{(1)}_T$ and $W^{(2)}_T$ are $Q$-Brownian motions. By the risk-neutral pricing formula and the independence of the two assets,
\[
p = e^{-rT}\,\mathbb{E}^Q\big[\mathbf{1}_{\{S^{(1)}_T\ge K\}}\mathbf{1}_{\{S^{(2)}_T\ge K\}}\big]
= e^{-rT}\,P^Q\big(S^{(1)}_T\ge K\big)\,P^Q\big(S^{(2)}_T\ge K\big)
= e^{-rT}\,\Phi\Bigg(\frac{\big(r-\tfrac{1}{2}\sigma_1^2\big)T + \ln\frac{S^{(1)}_0}{K}}{\sigma_1\sqrt{T}}\Bigg)
\Phi\Bigg(\frac{\big(r-\tfrac{1}{2}\sigma_2^2\big)T + \ln\frac{S^{(2)}_0}{K}}{\sigma_2\sqrt{T}}\Bigg).
\]
To calculate the vega, we simply differentiate $p$ with respect to $\sigma_1$:
\[
\frac{\partial p}{\partial\sigma_1}
= -e^{-rT}\,\phi\Bigg(\frac{\big(r-\tfrac{1}{2}\sigma_1^2\big)T + \ln\frac{S^{(1)}_0}{K}}{\sigma_1\sqrt{T}}\Bigg)\,
\Phi\Bigg(\frac{\big(r-\tfrac{1}{2}\sigma_2^2\big)T + \ln\frac{S^{(2)}_0}{K}}{\sigma_2\sqrt{T}}\Bigg)\,
\frac{\big(r+\tfrac{1}{2}\sigma_1^2\big)T + \ln\frac{S^{(1)}_0}{K}}{\sigma_1^2\sqrt{T}}.
\]
The following table shows the option values and vegas for the different approaches, with parameters $T = 1$, $r = 0.05$, $\sigma_1 = 0.1$, $\sigma_2 = 0.2$, $\rho = 0$, $K = 1$. All numerical methods use $N = 10000$ sample paths and level $L = 5$, with $N_3 = 100$ sub-samples for the change of measure approach and for splitting.

            BS         Conditional Expectation   Radon      Splitting
  value     0.3586     0.3588                    0.3586     0.3587
  vega     -1.0555    -1.0563                   -1.0583     -

Table 3.1: Prices and Vegas for a digital on two uncorrelated assets

3.4.2 Numerical results for computing the option value

Now we consider a correlation $\rho = 0.5$ between the two assets. The other parameters are $S_0 = [1, 1]$, $T = 1$, $r = 0.05$, $\sigma_1 = 0.1$, $\sigma_2 = 0.2$, $K = 1$. In the 2D case, all three methods have variance slopes approaching approximately $-1.5$ (figure 3.1b), indicating that $\beta = 1.5$ and hence a complexity of $O(\epsilon^{-2})$. Again we see from figure 3.4b that the ranking in efficiency is conditional expectation, then splitting, then the change of measure approach, although the curves show a slight upward trend when $\epsilon$ is very large. Figure 3.3 shows the number of samples required on each level for the three methods: the change of measure method needs the largest number of samples, but the numbers still decrease with level because the variance decreases. Figure 3.2b shows that the mean correction for all three methods is $O(h_l)$, corresponding to first order weak convergence.
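The closed-form price and vega from Section 3.4.1 can be checked directly (a short sketch; the function name is ours):

```python
from math import exp, log, sqrt
from scipy.stats import norm

def bs_digital_two_assets(S0, K, r, T, sig1, sig2):
    """Price and d(price)/d(sigma_1) of the two-asset digital with rho = 0,
    using the closed-form expressions of Section 3.4.1."""
    d1 = ((r - 0.5 * sig1**2) * T + log(S0[0] / K)) / (sig1 * sqrt(T))
    d2 = ((r - 0.5 * sig2**2) * T + log(S0[1] / K)) / (sig2 * sqrt(T))
    price = exp(-r * T) * norm.cdf(d1) * norm.cdf(d2)
    vega1 = -exp(-r * T) * norm.pdf(d1) * norm.cdf(d2) * \
            ((r + 0.5 * sig1**2) * T + log(S0[0] / K)) / (sig1**2 * sqrt(T))
    return price, vega1

price, vega = bs_digital_two_assets([1.0, 1.0], 1.0, 0.05, 1.0, 0.1, 0.2)
# price ≈ 0.3586 and vega ≈ -1.0555, matching the first column of Table 3.1
```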
[Figure 3.1: Variance for the option price. (a) MLMC radon and MC radon; (b) MLMC radon, con. ex., and splitting.]

[Figure 3.2: Mean for the option price. (a) MLMC radon and MC radon; (b) MLMC radon, con. ex., and splitting.]
[Figure 3.3: Number of samples in each level for the option price, for ε = 0.001 to 0.02. (a) MLMC con; (b) MLMC radon; (c) MLMC splitting.]

[Figure 3.4: Accuracy for the option price (ε² Cost against ε). (a) MLMC radon and MC radon; (b) MLMC radon, con. ex., and splitting.]
3.4.3 Numerical results for computing the vega

Figure 3.6b again confirms first order weak convergence; the few erratic points in the middle of the graph may be due to changes of sign in the mean correction. Figure 3.5 shows that the variance of standard MC increases with level, while the two MLMC approaches have slopes of approximately $-0.5$, corresponding to $\beta = 0.5$. Because we are computing the Greeks, MLMC does not have much advantage over standard MC here. Again, the conditional expectation approach remains the best choice in terms of efficiency.

[Figure 3.5: Variance for the vega. (a) MLMC radon and MC radon; (b) MLMC radon, and con. ex.]
[Figure 3.6: Mean for the vega. (a) MLMC radon and MC radon; (b) MLMC radon, and con. ex.]

[Figure 3.7: Number of samples in each level for the vega, for ε = 0.002 to 0.04. (a) MLMC con; (b) MLMC radon.]