ECON 2021 FINANCIAL ECONOMICS I


Lecture 3 Stochastic Processes & Stochastic Calculus September 24, 2018

STOCHASTIC PROCESSES

Asset prices, asset payoffs, investor wealth, and portfolio strategies can all be viewed as stochastic processes. Definition: A stochastic process is a time-indexed sequence of random variables.

Stochastic processes vs. random variables: a realization of a stochastic process is a path, whereas a realization of a random variable is a real number. A stochastic process must be regarded as a single unit. It is a random variable, but it belongs to a different space, i.e., a function space, and we write it using two arguments: X(t, ω).

A realization is an entire path, not a number, and the population is the set of all possible realizations.

[Figure: three sample price paths, labeled ω1, ω2, ω3, plotted against time.]

Note, in economics we observe only one realization. Law of Large Numbers and Central Limit Theorem arguments are still applicable if observations over time provide enough new information about ensemble averages.

Stationary Stochastic Process: All joint distributions are independent of time (e.g., initial conditions have "worn off"):
f(x_t, x_{t+1}, ..., x_{t+j}) independent of t, for all j.

Markov Process: A restriction on the conditional distribution function:
f(future | present, past) = f(future | present)
or
Prob(x_{t+1} | x_t, x_{t-1}, ..., x_{t-k}) = Prob(x_{t+1} | x_t) for all k ≥ 1
(i.e., no path dependence). Whether a process is Markov depends on the dimensionality of the state. Example: S_t = α_1 S_{t-1} + α_2 S_{t-2} + ε_t appears non-Markov. However, if we define Z_t = (S_t, S_{t-1}), then Z_t is Markov. A process can be Markov but nonstationary. Example: x_t = λx_{t-1} + ε_t with |λ| ≥ 1. (Why?) If a process is stationary and (low-dimensional) Markov, then conventional statistical methods can be applied.
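As a quick illustration, here is a minimal numpy sketch (with hypothetical parameter values) of the stacking trick: the AR(2) process looks non-Markov in S_t alone, but the vector Z_t = (S_t, S_{t-1}) evolves as a first-order, and hence Markov, system.

```python
import numpy as np

# Hypothetical coefficients: S_t = a1*S_{t-1} + a2*S_{t-2} + eps_t, recast as Z_t = A Z_{t-1} + B eps_t
a1, a2, sigma = 0.5, 0.3, 1.0
A = np.array([[a1, a2],
              [1.0, 0.0]])
B = np.array([1.0, 0.0])

rng = np.random.default_rng(0)
Z = np.zeros(2)                  # state Z_t = (S_t, S_{t-1})
S_path = []
for _ in range(1_000):
    Z = A @ Z + B * rng.normal(scale=sigma)   # next state depends only on the current state
    S_path.append(Z[0])
```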

MARTINGALES

The preeminent stochastic process in finance is a martingale. A martingale satisfies the following restriction:
E[X_{t+1} | F_t] = X_t
where F_t = information available at time t. Hence, a martingale is nonstationary. There are 2 key ingredients to any martingale:
1. A family of information sets, F_t (called a "filtration"). A process may be a martingale w.r.t. one filtration but not another.
2. A probability measure, used to define conditional expectations. A process may be a martingale w.r.t. one measure but not another. We exploit this often later on.
A martingale is a mathematical formalization of a "fair game". Betting on the outcomes of an unbiased coin generates a martingale wealth process.

A martingale only restricts the first moment. It should be distinguished from a random walk. A random walk is a martingale, but not vice versa. (Exercise: Provide an example of a martingale that is not a random walk.) If E[X_{t+1} | F_t] > X_t the process is called a submartingale. If E[X_{t+1} | F_t] < X_t it is a supermartingale. Note that if we define prices to include dividends, we can write the Fundamental Equation as follows:
M_t P_t = E_t[M_{t+Δ} P_{t+Δ}]
where, with time-additive utility, M_{t+Δ}/M_t = e^{-ρΔ} U'(C_{t+Δ})/U'(C_t). This says that SDF-scaled prices follow a martingale. We are interested in what happens when Δ → 0 and trading becomes continuous. The continuous-time limit of a (continuous) martingale is a Brownian motion process.
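The fair-coin wealth process is easy to check by simulation. The sketch below (my own illustration, not from the notes) verifies the two fingerprints of a martingale built from fair bets: increments with zero conditional mean and variance that grows linearly with the number of bets.

```python
import numpy as np

# Wealth from repeated $1 bets on a fair coin: a discrete-time martingale (W_0 = 0).
rng = np.random.default_rng(1)
n_paths, n_steps = 100_000, 50
bets = rng.choice([-1.0, 1.0], size=(n_paths, n_steps))   # fair coin: +/- $1 each toss
W = bets.cumsum(axis=1)

increments = W[:, 1:] - W[:, :-1]
print(increments.mean())          # ~ 0: zero-mean changes in wealth
print(W[:, -1].var() / n_steps)   # ~ 1: variance grows linearly with the number of bets
```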

CONTINUOUS- VS. DISCRETE-TIME

The time index in a stochastic process can either be discrete (integers) or continuous (real numbers). Is time really discrete or continuous? [Who knows, ask a physicist.] In economics, the choice between them should be based solely on mathematical and computational convenience. Use whichever is easier for the problem at hand! The fact is, in many asset pricing problems, continuous time is easier. This is especially true in option pricing. Besides, in today's world of computerized, algorithmic trading, continuous time is becoming increasingly accurate even from a descriptive standpoint! The easiest way to think about a continuous-time stochastic process is as a limit of a discrete-time process. However, there are dangers in doing this, and we must be careful in how we take limits.

WIENER PROCESS (BROWNIAN MOTION)

Defining a discrete-time process is simple - just cumulate i.i.d. shocks. The Wold Representation Theorem tells us this is a completely general way to define a (linear) discrete-time process. Example:
x_t = ρx_{t-1} + ε_t,  |ρ| < 1,  ε_t i.i.d.  ⟹  x_t = Σ_{j=0}^∞ ρ^j ε_{t-j}
Likewise, defining a deterministic continuous-time process is also simple - just write down a differential equation. Example:
dx/dt = ax
Here's the question - can we define a stochastic continuous-time process by just adding an i.i.d. shock to a differential equation, as we did in the discrete-time case? Does the following equation make sense?
dx/dt = ax + ε
Answer: Not really, but in a way yes.

It turns out that defining a continuous-time i.i.d. process is a bit tricky, and leads to some surprising results. We will see that the mathematically correct way to define a stochastic differential equation is in terms of the following integral equation:
x_t = x_0 + ∫_0^t ax_s ds + ∫_0^t dW_s
where W_t is a Wiener process, and the 2nd integral is a new type of integral, called an Ito integral. The Wiener process can be viewed as a continuous-time limit of the following random walk process
x_t = x_{t-1} + ε_t,  ε_t i.i.d. N(0, 1)
x_{t+T} = x_t + Σ_{j=1}^T ε_{t+j}
where the step size shrinks to zero in a very particular way as the time between steps shrinks to zero.

Note 3 features of the above random walk:
1. E(x_{t+T} | x_t) = x_t for all T, or equivalently, E(Δx_{t+k} | x_t) = 0 (Markov, with independent, zero-mean increments).
2. Var(x_{t+T} | x_t) = T (variance increases linearly with the number of periods).
3. Although the unconditional distribution is undefined (since the process is nonstationary), all conditional distributions are Gaussian.
We want to derive a continuous-time process with these same properties. In particular, we want the variance to depend on the sample length, T, but not on the number of steps. Unless the step sizes go to zero slower than the step lengths, the Law of Large Numbers would cause the variance to shrink to zero. In analogy to the Wold Representation Theorem, all continuous-sample-path continuous-time processes can be constructed as functions of the Wiener process. (If you want to allow jumps, then you must consider a wider class of innovations, called Lévy processes.)
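A few lines of numpy (my own check, not part of the notes) confirm the first two features for the Gaussian random walk: increments with zero mean and a variance over T periods equal to T.

```python
import numpy as np

# Gaussian random walk: x_{t+T} - x_t is the sum of T i.i.d. N(0,1) shocks.
rng = np.random.default_rng(2)
n_paths, T = 200_000, 100
xT = rng.standard_normal((n_paths, T)).sum(axis=1)   # x_{t+T} - x_t across many paths

print(xT.mean())   # ~ 0   (zero-mean increments)
print(xT.var())    # ~ T = 100 (variance linear in the number of periods)
```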

BINOMIAL TREE

Let's consider a slight generalization of a random walk, which allows the process to drift up or down on average over time. To visualize the path of this process, consider the following binomial tree:

[Figure: two-period binomial tree. From X_0 at t = 0, the process moves to X_0 + Δh (prob. p) or X_0 - Δh (prob. q) at t = Δt, and to X_0 + 2Δh (prob. p²), X_0 (prob. 2pq), or X_0 - 2Δh (prob. q²) at t = 2Δt.]

This depicts the path of the following random walk (with drift) process:
x_{t+Δt} = x_t + ε_{t+Δt},  where ε_t = +Δh with prob. p and ε_t = -Δh with prob. q

This is called a binomial tree because each step is the outcome of a Bernoulli trial, so the path of the process (the sum of the steps) is a binomial r.v. By definition, Δt → 0 when defining a continuous-time process. The question is, what happens to Δh as Δt → 0? Clearly, if the process is to be continuous, then Δh → 0 as Δt → 0. But how fast? Moments:
E(Δx) = (p - q)Δh
E[(Δx)²] = p(Δh)² + q(Δh)² = (Δh)²
var(Δx) = E[(Δx)²] - [E(Δx)]² = [1 - (p - q)²](Δh)² = 4pq(Δh)²

A continuous-time interval of length T can be divided into n = T/Δt discrete time steps. Since each step is independent, we have
E(x_T - x_0) = n(p - q)Δh = (T/Δt)(p - q)Δh   (1)
var(x_T - x_0) = n·4pq(Δh)² = T·4pq·(Δh)²/Δt   (2)
Suppose Δh ∝ Δt (Δh goes to 0 at the same rate as Δt). Then notice from (2) that var(x_T - x_0) → 0. This is just the Law of Large Numbers. However, we want var(x_T - x_0) to be proportional to T. To get this, we assume
Δh = σ√Δt
This implies (Δh)²/Δt = σ². If we also let
p = ½[1 + (α/σ)√Δt]   q = ½[1 - (α/σ)√Δt]
then, as Δt → 0, the above binomial distribution converges to a Normal distribution (i.e., the Central Limit Theorem) and we find
x_T - x_0 ~ N(αT, σ²T)
This is what we want! Note, pq = ¼[1 - (α²/σ²)Δt] → ¼.
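The convergence is easy to see numerically. Here is a minimal sketch (with hypothetical values of α, σ, T, and Δt) that simulates the binomial walk under the scaling Δh = σ√Δt and the drift-adjusted probabilities above, and compares the sample moments of x_T - x_0 with αT and σ²T.

```python
import numpy as np

# Binomial random walk with dh = sigma*sqrt(dt) and p = 0.5*(1 + (alpha/sigma)*sqrt(dt))
alpha, sigma, T, dt = 0.1, 0.2, 1.0, 1e-4
n_steps = int(T / dt)
dh = sigma * np.sqrt(dt)
p = 0.5 * (1 + (alpha / sigma) * np.sqrt(dt))

rng = np.random.default_rng(3)
ups = rng.binomial(n_steps, p, size=200_000)   # number of up-moves on each path
xT = (2 * ups - n_steps) * dh                  # x_T - x_0

print(xT.mean(), alpha * T)        # sample mean ~ alpha*T = 0.1
print(xT.var(), sigma**2 * T)      # sample variance ~ sigma^2*T = 0.04
```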

Comments & Questions:
1. Notice that |Δx/Δt| = Δh/Δt = σ/√Δt → ∞ as Δt → 0.
2. This means that the continuous-time limit process, x(t), is not differentiable. However, it is continuous. (Can you visualize a continuous, but nowhere differentiable process?)
3. Divide the continuous interval [0, T] into discrete subintervals t_0 = 0 < t_1 < ... < t_n = T. Define the pth variation of the Wiener process as
V_p = lim_{Δt_k → 0} Σ_{k=1}^n |X_{t_k} - X_{t_{k-1}}|^p
Using the above results, one can show V_1 = ∞, V_2 = T, and V_p = 0 for p ≥ 3.
4. Thus, the total variation or "length" of the Wiener process is infinite, even over arbitrarily short time intervals. However, its quadratic variation is finite, and equal to the length of the time interval. This reflects a trade-off between the number of steps and the size of each step. As the number of steps goes to ∞, the step sizes do not go to zero fast enough to keep their sum finite, but the squared step sizes do.
5. The fact that variations beyond order 2 equal 0 will be important when developing our new stochastic calculus rules.
6. It turns out that the Wiener process is not special in this regard. All continuous martingales have V_1 = ∞ and V_2 = T (except for the degenerate case of a constant).
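The contrast between total and quadratic variation shows up immediately in simulation. The sketch below (my own illustration) refines the grid on [0, T] and tracks the sum of absolute increments and the sum of squared increments of a Brownian path: the first blows up, the second settles near T.

```python
import numpy as np

# Approximate V_1 and V_2 of a Brownian path on [0, T] for finer and finer grids.
rng = np.random.default_rng(4)
T = 1.0
for n in (1_000, 10_000, 100_000):
    dW = rng.standard_normal(n) * np.sqrt(T / n)     # increments on a grid of n steps
    print(n, np.abs(dW).sum(), (dW**2).sum())        # V_1 estimate grows like sqrt(n); V_2 ~ T
```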

What exactly does it mean for a discrete-time process to converge to a continuous-time process? (They live in different spaces!) We just showed their means and variances converge. What about their sample paths? How do we know a random process is continuous? (It's random!) If W_t is not differentiable, in what sense (if any) can we use its innovations as the underlying i.i.d. shocks driving a stochastic analog of a differential equation? In discrete time, we move interchangeably between sums and differences:
x_t = x_{t-1} + ε_t   (discrete differential eq.)
x_t = Σ_{j=0}^t ε_j   (discrete integral eq.)
This correspondence is like a discrete version of the Fundamental Theorem of Calculus: F(b) - F(a) = ∫_a^b F'(x)dx. Does a similar result apply to stochastic continuous-time processes? It's not obvious, since dW/dt does not exist.

SUMMARY SO FAR

A Wiener process (or standard Brownian motion) is a continuous-time process having the following 3 properties:
1. Continuous sample paths
2. Stationary i.i.d. increments
3. W_t ~ N(0, t) for all t, with W_0 = 0
Therefore, over any discrete time interval, Δt, we can write
ΔW_t = ε_t √Δt,  ε_t ~ N(0, 1)
Notation: As Δt → 0, we write
dW = ε√dt
E[dW] = E[ε√dt] = 0
E[(dW)²] = E[ε² dt] = dt
More generally, we can define a Brownian motion with drift, µ, and volatility, σ, as follows:
x_t = x_0 + µt + σW_t

Upon first encountering this material, one sometimes gets the impression that Wiener processes and Brownian motion are different objects, i.e., that Wiener processes are in some sense more general. A famous theorem of P. Lévy shows that this is not the case, and that we can think of them as being synonymous. Theorem: If a continuous-time process, x_t, has continuous sample paths with stationary i.i.d. increments, then it must be a Brownian motion. That is, continuity + stationary i.i.d. increments ⟹ Normality. Hence, the 3rd part of the previous definition was not really required. Implication: If the data indicate a process is non-Gaussian, then it cannot be a Wiener process. (However, you might be able to convert it into one with a suitable transformation.)

A Caveat

Before proceeding, we need to be aware of an important caveat to all our results. Consider the following silly (but useful) example.

Example: Let U be a r.v. which is uniformly distributed on [0, 1], and consider the following 2 random processes on the continuous time interval [0, 1]:
X_t = 0 for all t ∈ [0, 1]
Y_t = 1 if t = U, and Y_t = 0 otherwise
Since Pr(U = t) = 0 for all t, X_t and Y_t have the same distributions. However, X_t and Y_t are different processes. In particular, Pr(X_t = 0 for all t) = 1 and Pr(Y_t = 0 for all t) = 0.

Implications:
1. Since we are dealing with random things, there are no guarantees.
2. All our statements about the properties of a random process mean they hold "almost everywhere" or "almost surely".
3. In this way, we can say the above 2 processes are equal. (They only differ on a set of measure zero.)
4. This qualifies our earlier claim about the continuity of a Wiener process - more precisely, the sample paths of a Wiener process are almost surely continuous.

WEAK CONVERGENCE / DONSKER'S THEOREM

In what sense do the sample paths of a discrete-time random walk converge to those of a Wiener process? In statistics, you learn that there are many ways we can define the convergence of a random variable. One of them is:

Convergence in Distribution: Let X_n be a sequence of random variables with CDFs F_n(x). We say that X_n converges in distribution to a r.v. X with CDF F(x) if lim_{n→∞} |F_n(x) - F(x)| = 0 at all continuity points of F(x). This is a statement about the probability distributions of X_n and X, not the values of X_n and X.

Weak convergence can be interpreted as a function-space analog of convergence in distribution (i.e., it applies to stochastic processes rather than random variables).

Definition: Let (Ω, F) be a measure space, where Ω is a set of continuous functions, F is a σ-algebra of its subsets, and µ_n is a sequence of probability measures on (Ω, F). Then µ_n converges weakly to µ (denoted µ_n ⇒ µ) if for each continuous, bounded function f we have
∫ f(x) µ_n(dx) → ∫ f(x) µ(dx)
Equivalently, if we let X_n be the random process associated with µ_n (i.e., µ_n(A) = Pr(X_n ∈ A)), then weak convergence implies E f(X_n) → E f(X).

Theorem (Donsker): Let ε_t be i.i.d. with E ε_t = 0 and E ε_t² = 1. Let X_T = ε_1 + ε_2 + ... + ε_T be the Tth partial sum. Form the following linear continuous-time interpolation of X_T on the interval [0, 1]:
W_t^n = (1/√n) X_m if t = m/n (0 ≤ m ≤ n), and linear for t ∈ [m/n, (m+1)/n)
Then, as n → ∞, W_t^n ⇒ W_t, where W_t is a Wiener process.
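To see Donsker's theorem at work, here is a minimal sketch (my own, using scipy for the Normal CDF) that takes f(path) = maximum over [0, 1]. By the reflection principle, the running maximum of a Wiener process satisfies P(max ≤ a) = 2Φ(a) - 1, and the scaled partial sums of non-Gaussian (±1) shocks reproduce this value for large n.

```python
import numpy as np
from scipy.stats import norm

# Functional of the scaled random walk: the running maximum of (1/sqrt(n)) * partial sums.
rng = np.random.default_rng(5)
n, n_paths = 1_000, 20_000
eps = rng.choice([-1.0, 1.0], size=(n_paths, n))            # shocks need not be Gaussian
run_max = eps.cumsum(axis=1).max(axis=1) / np.sqrt(n)       # max of the interpolated path

a = 1.0
print((run_max <= a).mean())    # simulated P(max over [0,1] <= 1)
print(2 * norm.cdf(a) - 1)      # Wiener-process limit from the reflection principle, ~ 0.683
```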

Comments: Note that no assumption was made about the distribution of ε_t. Believe it or not, this is really a useful and practical result. Why? Because the function f can be used to define many useful properties of a path (e.g., the maximum value reached before time T). Donsker's Theorem then tells us that the sample path properties of a random walk can be approximated by those of a Wiener process (and vice versa). This means computer programs can be used to approximate the path properties of W_t or, going the other way, analytically derived results for W_t can be used to approximate those for X_t.

BACKWARD & FORWARD EQUATIONS

Markov processes are fully characterized by their transition probabilities (and an initial condition). Example: A discrete-state/discrete-time process (called a Markov chain) is characterized by its transition probability matrix, P_ij. A standard Wiener process is Markov, and from our results so far, its transition probabilities are given by:
f(y, t | x, s) = [1/√(2π(t - s))] exp{ -(y - x)²/(2(t - s)) }
= prob. (density) that W will be at y at time t given it starts at x at s < t
Taking derivatives, one can verify that f is the solution of the following 2 (partial) differential equations:
∂f/∂s = -½ ∂²f/∂x²   (Backward Eq.)
∂f/∂t = ½ ∂²f/∂y²   (Forward Eq.)
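The "taking derivatives" step is painless to outsource to a computer algebra system. The following sympy sketch (my own check) plugs the Gaussian transition density into both equations and confirms that each residual simplifies to zero.

```python
import sympy as sp

# Gaussian transition density of a standard Wiener process and the two heat equations.
x, y, s, t = sp.symbols('x y s t', real=True)
f = sp.exp(-(y - x)**2 / (2*(t - s))) / sp.sqrt(2*sp.pi*(t - s))

backward = sp.simplify(sp.diff(f, s) + sp.Rational(1, 2)*sp.diff(f, x, 2))   # df/ds + (1/2) f_xx
forward  = sp.simplify(sp.diff(f, t) - sp.Rational(1, 2)*sp.diff(f, y, 2))   # df/dt - (1/2) f_yy
print(backward, forward)   # both 0
```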

The backward eq. conditions on a future time and value, and describes how the expectation of this value changes with different initial conditions. It is solved backwards. Option pricing uses the backward eq., since we know the value at expiration if we know the terminal price of the underlying asset. The forward eq. conditions on an initial distribution, and describes how this distribution evolves over time. It is solved forwards. It can be used to find the long-run stationary distribution of a stochastic process. Mathematically, the backward and forward eqs. are duals (adjoints) of each other. The forward eq. describes the diffusion of particles subject to random molecular motion. It is sometimes called the "heat equation". The backward eq. is just the heat eq. with time running backwards. Later we derive more general versions of these equations using a different method, based on Ito's Lemma.

WHITE NOISE & STOCHASTIC DIFF. EQS.

Discrete-time processes are built from discrete-time i.i.d. random variables, ε_t:
X_t = α_1 X_{t-1} + α_2 X_{t-2} (propagation mechanism) + ε_t (impulse)
It would be nice if the same were true for continuous-time processes. The sum of a continuous-time i.i.d. process makes sense. That's what a Wiener process is: W_t = ∫_0^t ε_s ds. Problem: We can't differentiate this to get dW/dt = ε. W_t is too erratic. The stochastic analog of the Fundamental Theorem of Calculus breaks down. We need to develop new calculus rules (known as "stochastic calculus"). For now, I will just alert you to the fact that there is a mathematically sophisticated way of defining the above derivative by appealing to the notion of generalized functions, which are motivated by the Dirac δ-function:
δ(x) = +∞ if x = 0, and 0 if x ≠ 0, with ∫ f(x)δ(x)dx = f(0) and ∫ δ(x)dx = 1
Engineers & physicists use these all the time. Using δ-functions we can say that dW/dt exists, and engineers call it "white noise".

STOCHASTIC DIFFERENTIAL EQUATIONS

Since dW/dt does not exist (in a conventional sense), we approach SDEs via the following integral eq.:
X_t = X_0 + ∫_0^t µ(X_s) ds + ∫_0^t σ(X_s) dW_s
The first integral is a conventional Riemann integral. The 2nd integral requires special treatment due to the erratic behavior of W_t. If the integrand σ(·) is well behaved (e.g., independent of W), then the 2nd integral can be evaluated as usual. For example, if σ is constant, we have ∫_0^t σ dW_s = σW_t (since W_0 = 0). However, if the integrand is random (as it often is in finance), then we must be careful. Note, since dW is random, we must remember that the integral is a random variable, so when discussing convergence of discrete sums we must adopt a probabilistic notion of convergence.

THE ITO INTEGRAL

Let's start with an example. Suppose we want to evaluate
∫_0^t W_s dW_s
Consider the following two Riemann sum approximations:
I_1 = lim_{Δt_k → 0} Σ_{k=0}^{n-1} W_{t_k} [W_{t_{k+1}} - W_{t_k}]
I_2 = lim_{Δt_k → 0} Σ_{k=0}^{n-1} W_{t_{k+1}} [W_{t_{k+1}} - W_{t_k}]
The only difference here is that I_1 evaluates the integrand at the left-hand endpoint of each interval whereas I_2 evaluates it at the right endpoint. If W were a deterministic function of time, both would converge to the same result: ½W_t² (since W_0 = 0). [To show this, use the result Σ_{k=1}^n k = ½n(n + 1).]

When W_t is random, however, it matters where we evaluate the integrand. Using the fact that W_t has independent, mean-zero increments and the law of iterated expectations, we can see that E(I_1) = 0. At the same time, we can easily see that E(I_2) = t. The Ito integral is defined by the assumption that we evaluate at the left endpoint. As long as the integrand is nonanticipating (i.e., measurable w.r.t. info available at the left endpoint) and square integrable (to make sure sums converge), this will ensure that the Ito integral is a martingale. Evaluating at the left endpoint also makes sense economically. It allows us to use integrals to sum up continuous-time trading profits when agents must choose portfolios before future prices are known. In this simple example, we could actually evaluate the limit of the Riemann sum by brute force. However, it is easier to use the fact that by construction the Ito integral is a martingale. Since E(W_t²) = t, we can infer that ∫_0^t W_s dW_s = ½(W_t² - t). More precisely, remembering that the integral is random, one can show that I_1 converges in mean-square to ½(W_t² - t). (That is, the expected squared deviation between the two goes to zero as the intervals go to zero.)
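A short simulation (my own sketch) makes the endpoint issue concrete: the left-endpoint (Ito) sums have mean near zero and sit on top of ½(W_t² - t), while the right-endpoint sums have mean near t.

```python
import numpy as np

# Left- vs right-endpoint Riemann sums for the stochastic integral of W dW on [0, t].
rng = np.random.default_rng(6)
t, n, n_paths = 1.0, 1_000, 10_000
dW = rng.standard_normal((n_paths, n)) * np.sqrt(t / n)
W = np.hstack([np.zeros((n_paths, 1)), dW.cumsum(axis=1)])   # W_0 = 0, ..., W_t

I1 = (W[:, :-1] * dW).sum(axis=1)   # Ito sum: integrand at the left endpoint
I2 = (W[:, 1:]  * dW).sum(axis=1)   # integrand at the right endpoint
print(I1.mean(), I2.mean())                          # ~ 0 and ~ t = 1
print(np.mean((I1 - 0.5*(W[:, -1]**2 - t))**2))      # mean-square gap ~ 0
```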

ITO'S LEMMA

As in regular calculus, one rarely evaluates stochastic integrals using Riemann sums. Instead, we look for an anti-derivative. Not surprisingly, taking derivatives of a random process also produces some differences from classical calculus. Ito's lemma shows us how to take derivatives of functions of a Wiener process. It can be interpreted as a stochastic analog of the Chain Rule. It has 3 main uses in financial economics:
1. It allows us to calculate expected continuation values, E[dV], in continuous-time dynamic programming problems. We use DP to study optimal consumption/portfolio problems in various settings.
2. In asset pricing problems, the SDF process is a function of underlying macro variables that follow Ito processes [e.g., M_t = e^{-δt} U'(C_t)]. Ito's lemma enables us to infer the stochastic properties of the SDF, given the stochastic properties of the underlying macro variables.
3. Derivatives prices are functions of the underlying price, C(S, t), where S follows an Ito process. To calculate its value we use Ito's lemma.

Rather than proceed in the abstract, let's consider an example. Suppose X_t follows the Brownian motion with drift process
dX = µ dt + σ dW
Remember, we cannot write dX/dt = µ + σ dW/dt. The above equation is simply shorthand notation for the integral
X_t = X_0 + µ ∫_0^t ds + σ ∫_0^t dW_s
(Since σ is constant here, no special treatment of the integral is required.) Suppose we have some function, F(t, X), which depends on X (and t), and we want to approximate dF = total differential. Normally, a 1st-order approximation would be:
dF ≈ (∂F/∂X) dX + (∂F/∂t) dt

However, since dX is of order √dt, we must go out to 2nd order in the Taylor series to pick up all the first-order terms in dt:
dF ≈ (∂F/∂X) dX + (∂F/∂t) dt + ½ (∂²F/∂X²)(dX)² + ½ (∂²F/∂t²)(dt)² + (∂²F/∂X∂t) dX dt
Note that the final two terms are of order higher than dt, and so can be dropped. However, note that
(dX)² = µ²(dt)² + 2µσ dt dW + σ²(dW)² → σ² dt
This gives us Ito's Lemma:
dF = (∂F/∂X) dX + (∂F/∂t) dt + ½σ² (∂²F/∂X²) dt
  = [ µ (∂F/∂X) + ½σ² (∂²F/∂X²) + ∂F/∂t ] dt + σ (∂F/∂X) dW

The extra ½σ² (∂²F/∂X²) dt term in Ito's Lemma can be interpreted as a Jensen's inequality correction, reflecting the fact that the expectation of a nonlinear function ≠ the nonlinear function of the expectation. As always, its importance is increasing in the variance. Note that if F is concave, the Ito/Jensen correction reduces the drift of the process, while if F is convex, it increases the drift. Examples:
1. Suppose dX = µ dt + σ dW and define Z = F(X) = e^X. Note,
F_X = e^X = Z,  F_XX = e^X = Z,  F_t = 0
Plugging into Ito's lemma we get
dZ = dF = F_X dX + ½σ² F_XX dt = Z(µ dt + σ dW) + ½σ² Z dt = (µ + ½σ²)Z dt + σZ dW
This example is used all the time in financial economics. [Z is called geometric Brownian motion.]
2. Suppose dX = µX dt + σX dW and define Z = F(X) = log(X). Note,
F_X = 1/X,  F_XX = -1/X²,  F_t = 0

Plugging into Ito's lemma we get
dZ = dF = F_X dX + ½σ²X² F_XX dt = (1/X)(µX dt + σX dW) - ½σ² dt = (µ - ½σ²) dt + σ dW
This example is also used all the time in financial economics. Suppose dC/C = µ dt + σ dW, and consider the SDF process M = F(t, C) = e^{-δt} U'(C). Ito's lemma allows us to infer the following process for the SDF:
dM = F_t dt + F_C dC + ½σ²C² F_CC dt
  = -δ e^{-δt} U'(C) dt + e^{-δt} U''(C)[µC dt + σC dW] + ½σ²C² e^{-δt} U'''(C) dt
dM/M = [ -δ + (CU''/U')µ + ½σ²(C²U'''/U') ] dt + σ(CU''/U') dW
Once again, this result is used often in financial economics.
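Example 1 above is easy to check by simulation. The sketch below (my own, with hypothetical parameter values) simulates dX = µ dt + σ dW exactly, sets Z = e^X, and compares the sample drift of dZ/Z with the Ito prediction (µ + ½σ²) rather than the naive µ.

```python
import numpy as np

# Verify the Ito/Jensen drift correction for Z = exp(X) when dX = mu*dt + sigma*dW.
mu, sigma, dt, n, n_paths = 0.05, 0.4, 1e-3, 1_000, 10_000
rng = np.random.default_rng(7)
dW = rng.standard_normal((n_paths, n)) * np.sqrt(dt)
X = np.cumsum(mu*dt + sigma*dW, axis=1)                      # X_0 = 0
Z = np.exp(np.hstack([np.zeros((n_paths, 1)), X]))

rel_change = Z[:, 1:] / Z[:, :-1] - 1.0                      # dZ/Z on each step
print(rel_change.mean() / dt)      # ~ mu + 0.5*sigma^2 = 0.13 (not mu = 0.05)
print(rel_change.var() / dt)       # ~ sigma^2 = 0.16
```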

SOLUTIONS OF SDES

Consider the SDE
dX = µ(t, X) dt + σ(t, X) dW   (3)
What does it mean to solve this equation? Unlike solving a deterministic ODE, we cannot expect to find a unique path that satisfies the equation. Instead, we look for a stochastic process whose statistical properties replicate those of equation (3). One possible strategy is to express it in its more appropriate integral form:
X_t = X_0 + ∫_0^t µ(s, X_s) ds + ∫_0^t σ(s, X_s) dW_s
and attempt to evaluate the integrals. In practice, it is more common to search for a function, X_t = F(t, W_t), such that when Ito's Lemma is applied to it, we get back the same drift, µ(t, X), and diffusion, σ(t, X), coefficients as in eq. (3).

Since the drift and diffusion coefficients fully characterize the statistical properties of a process, doing this will solve the equation. However, there is a subtlety here that goes back to our discussion of martingales. We know that Ito integrals are martingales, and that martingales are always defined relative to an information set. Let F_t be the filtration generated by the Brownian motion in eq. (3). Then if our solution is F_t-adapted, we have what is called a strong solution. If it is adapted to some other information set, we have a weak solution. The statistical properties are the same either way, but our assumptions about the available information could be different. Example: Suppose we want to solve for the geometric Brownian motion process
dX/X = µ dt + σ dW   (4)
where µ and σ are constants. It's not clear how we should evaluate the Ito integral ∫ X_s dW_s, since we don't know what X is yet! So let's use the indirect/Ito's lemma approach. Let's posit the form
X_t = X_0 e^{(µ - ½σ²)t + σW_t}   (5)
Applying Ito's lemma to this, one can verify that the resulting drift is µX and the resulting diffusion coefficient is σX. Hence, eq. (5) solves eq. (4). If the W_t process in (5) is the same as in (4), then we have a strong solution.
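Given the closed-form solution (5), a standard sanity check is to discretize the SDE directly (Euler-Maruyama) and drive both with the same Brownian increments; the paths should agree up to discretization error. A minimal sketch, with hypothetical parameter values:

```python
import numpy as np

# Euler-Maruyama for dX = mu*X dt + sigma*X dW versus the exact GBM solution (5).
mu, sigma, X0, T, n = 0.08, 0.25, 1.0, 1.0, 10_000
dt = T / n
rng = np.random.default_rng(8)
dW = rng.standard_normal(n) * np.sqrt(dt)

X_euler = np.empty(n + 1); X_euler[0] = X0
for k in range(n):
    X_euler[k+1] = X_euler[k] + mu*X_euler[k]*dt + sigma*X_euler[k]*dW[k]

W = np.concatenate([[0.0], dW.cumsum()])
t_grid = np.linspace(0.0, T, n + 1)
X_exact = X0 * np.exp((mu - 0.5*sigma**2)*t_grid + sigma*W)

print(np.max(np.abs(X_euler - X_exact)))   # small: the discretized path tracks the exact one
```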

AN EXISTENCE THEOREM

In the case of deterministic ODEs there are well-known conditions that ensure existence (and uniqueness) of solutions. (Before looking for something, it's useful to know that it exists.) The following theorem provides conditions for the existence and uniqueness of solutions to SDEs.
Theorem: Assume the drift and diffusion coefficients are measurable functions satisfying the following two conditions:
|µ(t, X)| + |σ(t, X)| ≤ C_1(1 + |X|)   for all X and all t ∈ [0, T]
|µ(t, X) - µ(t, Y)| + |σ(t, X) - σ(t, Y)| ≤ C_2 |X - Y|   for all X, Y and all t ∈ [0, T]
where C_1 and C_2 are given constants. Then the SDE dX = µ(t, X) dt + σ(t, X) dW has a unique square-integrable solution.
The 1st condition is called a linear growth condition. The 2nd condition is called a Lipschitz condition. These are sufficient conditions, and are often too strict for finance applications. However, one case is noteworthy - if the drift and diffusion coefficients are independent of t and are affine functions of X, then this theorem tells us a unique solution exists.

LOCAL MARTINGALES

Earlier I claimed that Ito integrals produce martingales. That is,
M_t = M_0 + ∫_0^t θ_s dW_s
is a martingale as long as θ_t is adapted to the filtration generated by W_t, (F_t). According to the Martingale Representation Theorem, this is a completely general way to define martingales, i.e., all (continuous) martingales adapted to F_t are Ito integrals for some θ_t process. However, it turns out this is not quite correct. Ito integrals only define local martingales. Without further restrictions on θ_t, there is no guarantee they are global martingales. A sufficient condition that ensures a local martingale is a global martingale is the following:
E[ ∫_0^T θ_s² ds ] < ∞

DOUBLING STRATEGIES

A good way to see the relevance of this distinction is to consider the following doubling strategy. Consider an infinite sequence of discrete dates within some interval [0, T]. For example, 0 = t_0 < t_1 < ... < T with t_n = nT/(n + 1) → T as n → ∞. Suppose you can bet on the toss of a fair coin at each t_n, winning if it shows heads. Suppose your initial bet is $1, and you double your bet each time until you win. If you win, you quit. From the Law of Large Numbers, you win with probability 1, and E[W_T] = W_0 + 1 > W_0. Hence, your wealth is not a martingale, even though each bet is fair and the expected (local) change in your wealth is 0. What's going on here? How can the sum of a bunch of zeros add up to 1? Three things are crucial: (1) you can place an infinite number of bets, (2) your wealth is allowed to become unboundedly negative, and (3) you are allowed to place an infinitely large bet. Relax any of these and the strategy no longer beats fair odds: E(W_T) ≤ W_0. The only way you can generate a positive global mean from a sequence of mean-zero bets is if there is some probability of an infinite loss. Lesson: In continuous-time finance, ruling out arbitrage requires placing (weak) restrictions on an agent's portfolio or wealth.
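The role of the bet cap is easy to see in a simulation. The sketch below (my own illustration) imposes a cap on the number of doubled bets: each round is fair, so terminal wealth has mean approximately zero for any cap, while the probability of ending up with the +$1 "sure" gain approaches one only as the cap, and hence the possible loss, becomes unbounded.

```python
import numpy as np

# Doubling strategy with a cap on the number of bets: quit after the first win (+$1 net).
rng = np.random.default_rng(9)

def doubled_bets(max_bets, n_paths=2_000_000):
    wealth = np.zeros(n_paths)
    for k in range(max_bets):
        playing = wealth <= 0                        # have not yet won
        wins = rng.random(n_paths) < 0.5
        stake = 2.0**k
        wealth = np.where(playing, wealth + np.where(wins, stake, -stake), wealth)
    return wealth

for cap in (4, 8, 12):
    W = doubled_bets(cap)
    print(cap, W.mean(), (W > 0).mean())   # mean ~ 0 for every cap; P(win) = 1 - 2**(-cap)
```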