Introduction to Probability Theory and Stochastic Processes for Finance Lecture Notes

Fabio Trojani
Department of Economics, University of St. Gallen, Switzerland

Correspondence address: Fabio Trojani, Swiss Institute of Banking and Finance, University of St. Gallen, Rosenbergstr. 52, CH-9000 St. Gallen. E-mail: Fabio.Trojani@unisg.ch

Contents

1 Introduction to Probability Theory
  1.1 The Binomial Model
    1.1.1 The Risky Asset
    1.1.2 The Riskless Asset
    1.1.3 A Basic No Arbitrage Condition
    1.1.4 Some Basic Remarks
    1.1.5 Pricing Derivatives: a first Example
  1.2 Finite Probability Spaces
    1.2.1 Measurable Spaces
    1.2.2 Probability measures
    1.2.3 Random Variables
    1.2.4 Expected Value of Random Variables Defined on Finite Measurable Spaces
    1.2.5 Examples of Probability Spaces and Random Variables with Finite Sample Space
  1.3 General Probability Spaces
    1.3.1 Some First Examples of Probability Spaces with non finite Sample Spaces
    1.3.2 Continuity Properties of Probability Measures
    1.3.3 Random Variables
    1.3.4 Expected Value and Lebesgue Integral
    1.3.5 Some Further Examples of Probability Spaces with uncountable Sample Spaces
  1.4 Stochastic Independence
2 Conditional Expectations and Martingales
  2.1 The Binomial Model Once More
  2.2 Sub Sigma Algebras and Partial Information
  2.3 Conditional Expectations
    2.3.1 Motivation
    2.3.2 Definition and Properties
  2.4 Martingale Processes
3 Pricing Principles in the Absence of Arbitrage
  3.1 Stock Prices, Risk Neutral Probability Measures and Martingales
  3.2 Self Financing Strategies, Risk Neutral Probability Measures and Martingales
  3.3 Existence of Risk Neutral Probability Measures and Derivatives Pricing
  3.4 Uniqueness of Risk Neutral Probability Measures and Derivatives Hedging
  3.5 Existence of Risk Neutral Probability Measures and Absence of Arbitrage
4 Introduction to Stochastic Processes
  4.1 Basic Definitions
  4.2 Discrete Time Brownian Motion
  4.3 Girsanov Theorem: Application to a Semicontinuous Pricing Model
    4.3.1 A Semicontinuous Pricing Model
    4.3.2 Risk Neutral Valuation in the Semicontinuous Model
    4.3.3 A Discrete Time Formulation of Girsanov Theorem
    4.3.4 A Discrete Time Derivation of Black and Scholes Formula
  4.4 Continuous Time Brownian Motion
5 Introduction to Stochastic Calculus
  5.1 Starting Point, Motivation
  5.2 The Stochastic Integral
    5.2.1 Some Basic Preliminaries
    5.2.2 Simple Integrands
    5.2.3 Squared Integrable Integrands
    5.2.4 Properties of Stochastic Integrals
  5.3 Itô's Lemma
    5.3.1 Starting Point, Motivation and Some First Examples
    5.3.2 A Simplified Derivation of Itô's Formula
  5.4 An Application of Stochastic Calculus: the Black-Scholes Model
    5.4.1 The Black-Scholes Market
    5.4.2 Self Financing Portfolios and Hedging in the Black-Scholes Model
    5.4.3 Probabilistic Interpretation of Black-Scholes Prices: Girsanov Theorem once more

1 Introduction to Probability Theory

1.1 The Binomial Model

We start with the binomial model to introduce some basic ideas of probability theory related to the pricing of contingent claims, basically for the following reasons:

- It is a simple setting where the arbitrage concept and its relation to risk neutral pricing can be explained
- It is a model used in practice, where binomial trees are calibrated to real data, for instance to price American derivatives
- It is a simple setting in which to introduce the concepts of conditional expectation and martingale, which are at the heart of the theory of derivatives pricing.

1.1.1 The Risky Asset

S_t is the price of a risky stock at time t ∈ I, where we start for simplicity with a discrete time index I = {0, 1, 2}. The dynamics of S_t is defined by

S_t = u S_{t-1} with probability p, S_t = d S_{t-1} with probability 1 - p,

where p ∈ (0, 1). We impose for brevity the further condition u = 1/d > 1, giving a recombining tree.

1.1.2 The Riskless Asset

B_t is the price at time t of a riskless money account. r > 0 is the riskless interest rate on the money account, implying

B_t = (1 + r) B_{t-1}

for any t = 1, 2. For simplicity we impose the normalization B_0 = 1.

1.1.3 A Basic No Arbitrage Condition

A necessary condition for the absence of arbitrage opportunities in our model is

d < 1 + r < u.   (1)

Example 1 In the sequel we will often use a numerical example with parameters S_0 = 4, u = 1/d = 2, r = 0.25.

1.1.4 Some Basic Remarks

Notice that to any trajectory TT, TH, HT, HH in the tree we can associate the corresponding values of S_1 and S_2. Thus, from the perspective of time 0, both S_1 and S_2 are random entities whose value depends on which event/trajectory will be realized in the model. To fully describe the random behaviour of S_1 and S_2 we can make use of the space

Ω = {TT, TH, HT, HH}

of all random sequences that can be realized on the tree. Basically, Ω contains all the information about the single outcomes that can be realized in our model.

Definition 2 (i) The set Ω of all possible outcomes in a random experiment is called the sample space. (ii) Each single element ω ∈ Ω is called an outcome of the random experiment.

Example 3 In the above two period model we had Ω = {TT, TH, HT, HH} and ω = TT or ω = TH or ω = HT or ω = HH.

Exercise 4 Give the sample space and all single outcomes in a binomial tree with three periods.
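Since Ω is finite, the sample space and the price trajectories can be enumerated by brute force. The following minimal Python sketch (not part of the original notes; the parameter values are those of Example 1) prints all outcomes of the two period tree together with the associated price paths; calling sample_space(3) produces the sample space asked for in Exercise 4.

```python
from itertools import product

S0, u, r = 4.0, 2.0, 0.25   # parameters of Example 1
d = 1.0 / u

def sample_space(n):
    """All outcomes of an n-period binomial tree, e.g. 'HT' for n = 2."""
    return ["".join(w) for w in product("HT", repeat=n)]

def stock_path(omega):
    """Stock prices (S_0, S_1, ..., S_n) along the trajectory omega."""
    path = [S0]
    for toss in omega:
        path.append(path[-1] * (u if toss == "H" else d))
    return path

for omega in sample_space(2):
    print(omega, stock_path(omega))
# HH [4.0, 8.0, 16.0], HT [4.0, 8.0, 4.0], TH [4.0, 2.0, 4.0], TT [4.0, 2.0, 1.0]
```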

1.1.5 Pricing Derivatives: a first Example

Definition 5 A European call option with strike price K and maturity T ∈ I is the right to buy at time T the underlying stock for the price K. We denote by c_t the price of the European call option at time t.

From the definition we immediately have for the pay-off at maturity of the call option:

c_T = S_T - K if S_T > K, c_T = 0 if S_T ≤ K,

or, more compactly:

c_T = (S_T - K)^+,

where x^+ := max(x, 0) is the positive part of x.

Remark 6 Notice that c_T depends on ω ∈ Ω only through S_T(ω).

The goal in any pricing model is to determine the time 0 price (as for instance the price c_0) of a derivative pay-off falling due at a later time T (say, for instance, the pay-off c_T = (S_T - K)^+).

Assumption 7 To illustrate the main ideas we start with T = 1.

Definition 8 A perfect hedging portfolio for c_T with value V_0 at time 0 is a position of Δ_0 stocks and V_0 - Δ_0 S_0 money accounts (recall the normalization B_0 = 1), such that

c_1(H) = Δ_0 S_1(H) + (V_0 - Δ_0 S_0)(1 + r),
c_1(T) = Δ_0 S_1(T) + (V_0 - Δ_0 S_0)(1 + r).   (2)

Remark 9 A perfect hedging portfolio replicates exactly the future pay-off of the derivative to be hedged. Therefore, it is a vehicle to fully eliminate the risk intrinsic in the randomness of the future value of a derivative.

Proposition 10 (i) For T = 1, the quantity Δ_0 is given by

Δ_0 = (c_1(H) - c_1(T)) / (S_1(H) - S_1(T)).   (3)

Δ_0 is called the delta of the hedging portfolio. (ii) The risk neutral valuation formula follows:

c_0 = V_0 = (1/(1 + r)) [p̃ c_1(H) + (1 - p̃) c_1(T)],

where p̃ = (1 + r - d)/(u - d).

Proof. (i) Compute the difference between the first and the second equation in (2) and solve for Δ_0. (ii) Insert Δ_0 given by (3) in one of the two equations in (2) and solve for V_0. Absence of arbitrage then implies V_0 = c_0.

Remark 11 (i) The price V_0 = c_0 does not depend on the binomial probability p. (ii) Under the given conditions (cf. (1)) one has p̃ ∈ (0, 1). Therefore the identity

c_0 = (1/(1 + r)) [p̃ c_1(H) + (1 - p̃) c_1(T)]

says that the price c_0 is a discounted expectation of the call's future random pay-offs, computed using the risk adjusted probabilities p̃ and 1 - p̃. More compactly, we could thus write

c_0 = Ẽ(c_1 / (1 + r)),

where Ẽ denotes expectations under (p̃, 1 - p̃). This is a so called risk adjusted or risk neutral valuation formula.

Exercise 12 (i) For the case T = 1 and for the model parameters in Example 1 compute the numerical value of c_0. (ii) For the case T = 2 compute recursively the hedging portfolio of the derivative, starting from Δ_1(H), Δ_1(T), V_1(H), V_1(T), and finishing with Δ_0 and V_0.
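The one period hedge of Proposition 10 and Exercise 12 (i) can be checked numerically. The sketch below is our own illustration; the strike K = 5 is an assumption, since Exercise 12 does not fix it. It computes Δ_0, the risk neutral probability p̃ and c_0, and verifies that the portfolio replicates the call pay-off in both states.

```python
S0, u, r = 4.0, 2.0, 0.25   # Example 1
d = 1.0 / u
K = 5.0                     # assumed strike for the illustration

S1 = {"H": u * S0, "T": d * S0}                # S_1(H) = 8, S_1(T) = 2
c1 = {w: max(S1[w] - K, 0.0) for w in "HT"}    # pay-off (S_1 - K)^+

delta0 = (c1["H"] - c1["T"]) / (S1["H"] - S1["T"])   # equation (3)
p_tilde = (1 + r - d) / (u - d)                      # risk neutral probability
c0 = (p_tilde * c1["H"] + (1 - p_tilde) * c1["T"]) / (1 + r)

for w in "HT":   # replication check: Delta_0 stocks plus money account
    assert abs(delta0 * S1[w] + (c0 - delta0 * S0) * (1 + r) - c1[w]) < 1e-12

print(delta0, p_tilde, c0)   # 0.5 0.5 1.2
```

Note that the binomial probability p never enters the computation, exactly as stated in Remark 11.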

1.2 Finite Probability Spaces

In the sequel we let Ω be a given sample space.

1.2.1 Measurable Spaces

Let F be the family of all subsets of Ω; F is an example of a so called sigma algebra, a concept that we define in the sequel.

Definition 13 (i) A sigma algebra G ⊆ F is a family of subsets of Ω such that:

1. ∅ ∈ G
2. If A ∈ G then it follows A^c ∈ G
3. If (A_i)_{i ∈ N} ⊆ G is a countable sequence in G, then it follows ∪_{i ∈ N} A_i ∈ G

(ii) The couple (Ω, G) is called a measurable space.

Example 14 (i) F is a sigma algebra, the finest one on Ω. Indeed, ∅ ∈ F. Moreover, for any set A ∈ F the complement A^c is a subset of Ω, i.e. is in F. The same holds for any (not only for a countable) union of sets in F. (ii) The subfamily G := {∅, Ω} is the coarsest sigma algebra on Ω. (iii) In the setting of the binomial model of Example 1, it is easy to verify (please do it!) that the subfamily

G := {∅, Ω, {HT, HH}, {TT, TH}}

is a sigma algebra, the sigma algebra generated by the first period price movements in the model.

Remark 15 We make use of sigma algebras to model different information sets at the disposal of the investor in doing her portfolio choices. For instance, in the setting of the binomial model of Example 1, the information available at time 0 (before observing prices) can be modelled by the trivial information set

G_0 := {∅, Ω}.

That is, at time 0 investors only know that the possible realized outcome ω has to be an element of the sample space Ω. At time 1 investors can observe S_1. Thus, depending on the value of S_1, they will know at time 1 that either

ω ∈ {HT, HH}, if and only if S_1(ω) = S_0 u,

or

ω ∈ {TT, TH}, if and only if S_1(ω) = S_0 d.

Thus at time 1 investors do not have full information about ω, since they still do not know the direction of the price movement in period 2. However, they can determine to which specific event of their information set ω belongs. The larger (smaller) this information set, the more precise (the rougher) the information on the realized outcome ω. For instance, while at time 0 investors only know that the outcome will be an element of the sample space, at time 1 they know that the outcome implies either an upward or a downward price movement in the first period. Based on these considerations, a natural sigma algebra G_1 to model investors' price information at time 1 is

G_1 := {∅, Ω, {HT, HH}, {TT, TH}}

(verify that G_1 is indeed a sigma algebra). Similarly, by observing only the price S_2 investors will know at time 2 that either

ω = HH, if and only if S_2(ω) = S_0 u^2, or
ω = TT, if and only if S_2(ω) = S_0 d^2, or
ω ∈ {TH, HT}, if and only if S_2(ω) = S_0 du.

On the other hand, by observing the prices S_1 and S_2 investors will know at time 2 that

ω = HH, if and only if S_2(ω) = S_0 u^2, or
ω = TT, if and only if S_2(ω) = S_0 d^2, or
ω = TH, if and only if S_1(ω) = S_0 d and S_2(ω) = S_0 du, or
ω = HT, if and only if S_1(ω) = S_0 u and S_2(ω) = S_0 du.

Based on these considerations, a natural sigma algebra G_2 to model investors' price information up to time 2 is the smallest one containing the system of subsets of Ω given by

E := {∅, Ω, {HT}, {HH}, {TT}, {TH}}.

We denote this sigma algebra by G_2 = σ(E). Finally, the sigma algebra representing the information obtained by observing only the price S_2 is

G_3 = {∅, Ω, {HH}, {TH, HT, TT}, {TT}, {TH, HT, HH}, {TH, HT}, {TT, HH}}.

Notice that while the relation G_0 ⊆ G_1 ⊆ G_2 implies an information set growing over time, we do not have G_1 ⊆ G_3 (why?). Therefore, the sequence of sigma algebras G_0, G_1, G_3 is not consistent with the idea of an investor's information set growing over time.

Exercise 16 (Borel sigma algebra on R) Let Ω := R and denote by T the set of all open intervals in R:

T = {(a, b) | a ≤ b, a, b ∈ R}.

1. Show with a simple counterexample that T is not a sigma algebra on R.
2. We know that there does exist a sigma algebra over R containing T (which one?). Thus, there also exists a minimal sigma algebra containing T, the so-called Borel sigma algebra over R, denoted by B(R), which has to be of the form

B(R) = ∩ {G : G is a sigma algebra over R, T ⊆ G}.

To show that B(R) is indeed a sigma algebra over R it is thus sufficient to show that intersections of sigma algebras are sigma algebras. Do this, by verifying the corresponding definition.
3. Show, using simple set operations, that the events (-∞, a), (a, ∞), [a, b], (a, b], {a}, where a ≤ b, are elements of B(R).
4. Show that any countable subset {a_i}_{i ∈ N} of R is an element of B(R).

As mentioned, a natural way to model a growing amount of information over time is through increasing sequences of sigma algebras. This is the next definition.

Definition 17 Let (Ω, G) be a measurable space. A sequence (G_i)_{i=0,1,...,n} of sigma algebras over Ω such that

G_0 ⊆ G_1 ⊆ ... ⊆ G_n ⊆ G

is called a filtration.

Example 18 In Remark 15 the sequence (G_i)_{i=0,1,2} is a filtration, while the sequence (G_i)_{i=0,1,3} is not.
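On a finite sample space, the sigma algebra generated by a collection of events can be computed mechanically, by closing the collection under complements and unions until nothing new appears. The sketch below is illustrative only (on a finite Ω, closure under finite unions already gives closure under countable unions); it reproduces the sigma algebra G_1 of Remark 15 from the single event {HH, HT}.

```python
OMEGA = frozenset({"HH", "HT", "TH", "TT"})

def generated_sigma_algebra(collection):
    """Smallest family containing `collection` that is closed under
    complements and unions (Definition 13); Omega here is finite."""
    family = {frozenset(), OMEGA} | {frozenset(A) for A in collection}
    changed = True
    while changed:
        changed = False
        for A in list(family):
            if OMEGA - A not in family:            # close under complements
                family.add(OMEGA - A); changed = True
        for A in list(family):
            for B in list(family):
                if A | B not in family:            # close under unions
                    family.add(A | B); changed = True
    return family

G1 = generated_sigma_algebra([{"HH", "HT"}])
print(sorted(map(sorted, G1)))
# [[], ['HH', 'HT'], ['HH', 'HT', 'TH', 'TT'], ['TH', 'TT']]
```

Applied to the system E of Remark 15, the same function returns all 16 subsets of Ω, i.e. here G_2 coincides with F.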

1.2.2 Probability measures

For the whole section let (Ω, G) be a measurable space.

Definition 19 We say that an event A ∈ G is realized in a random experiment with sample space Ω if ω ∈ A.

Example 20 In the two period binomial model we have

{TH, TT} = {the stock price drops in the first period}.

Thus, if at time 1 we observe T, then {TH, TT} is realized. On the other hand, if we observe H, then {TH, TT} is not realized (i.e. A^c = {HT, HH} is realized).

The next step is to assign in a consistent way probabilities to events that can be realized in a random experiment.

Definition 21 (i) A probability measure on (Ω, G) is a function P : G → [0, 1] such that:

1. P(Ω) = 1
2. For any disjoint sequence (A_i)_{i ∈ N} ⊆ G such that A_i ∩ A_j = ∅ for i ≠ j it follows

P(∪_{i ∈ N} A_i) = Σ_{i ∈ N} P(A_i).

This property is called sigma additivity.

(ii) We call a triplet (Ω, G, P) a probability space.

Example 22 In the two period binomial model we set Ω = {TT, TH, HT, HH}, G = F, and define probabilities with the binomial rule

P({HH}) = p^2, P({TT}) = (1 - p)^2, P({TH}) = P({HT}) = p(1 - p).

Sigma additivity then implies, for instance,

P({HT, HH}) = P({HH}) + P({HT}) = p^2 + p(1 - p).

More generally, we have, in this finite sample space setting:

P(A) = Σ_{ω ∈ A} P({ω}).

Proposition 23 Let (Ω, G, P) be a probability space. We have:

1. P(A\B) = P(A) - P(A ∩ B)
2. P(A ∪ B) = P(A) + P(B) - P(A ∩ B)
3. P(A^c) = 1 - P(A)
4. If A ⊆ B then P(A) ≤ P(B)

Proof. 1. A\B = A ∩ B^c and A = (A ∩ B) ∪ (A ∩ B^c). By sigma additivity it follows:

P(A) = P(A ∩ B) + P(A ∩ B^c) = P(A ∩ B) + P(A\B).

2. A ∪ B = (A\B) ∪ B. Therefore, using 1. and by sigma additivity:

P(A ∪ B) = P(A\B) + P(B) = P(A) + P(B) - P(A ∩ B).

3. This is a particular case of 1. with A = Ω and B = A.

4. By 1. we have, under the given assumption:

P(B) = P(B ∩ A) + P(B\A) = P(A) + P(B\A) ≥ P(A).

Remark 24 In Definition 21, conditions 1. and 2. for a probability measure imply

P(∅) = 0.

In fact, a function µ : G → [0, ∞] satisfying P(∅) = 0 and the sigma additivity condition 2. in Definition 21 is called a measure on the measurable space (Ω, G). Notice that in this case we can have µ(Ω) = ∞.

Exercise 25 The Lebesgue measure on the measurable space (R, B(R)), denoted by µ, is a measure µ : B(R) → [0, ∞] such that

µ((a, b)) = b - a

for any open interval (a, b), a ≤ b. It can be shown that Lebesgue measure exists and is unique (we will not prove this, we will just assume it in the sequel). Show the following properties of Lebesgue measure, using the general definition of a measure.

1. µ(∅) = 0, µ(R) = +∞
2. µ({a}) = 0 for any a ∈ R
3. For any countable subset {a_i}_{i ∈ N} of R one has µ({a_i}_{i ∈ N}) = 0.
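On a finite space a probability measure is determined by its values on the singletons, and the properties of Proposition 23 can be verified exactly with rational arithmetic. A minimal sketch (our own; the value p = 2/3 is an arbitrary choice):

```python
from fractions import Fraction

p = Fraction(2, 3)   # arbitrary binomial parameter for the illustration
P_atom = {"HH": p**2, "HT": p*(1-p), "TH": p*(1-p), "TT": (1-p)**2}
OMEGA = set(P_atom)

def P(event):
    """P(A) as the sum of the atom probabilities P({w}), w in A."""
    return sum(P_atom[w] for w in event)

A, B = {"HT", "HH"}, {"HH", "TH"}
assert P(OMEGA) == 1                          # Definition 21, 1.
assert P(A - B) == P(A) - P(A & B)            # Proposition 23, 1.
assert P(A | B) == P(A) + P(B) - P(A & B)     # Proposition 23, 2.
assert P(OMEGA - A) == 1 - P(A)               # Proposition 23, 3.
```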

1.2.3 Random Variables

For the whole section let (Ω, G) be a measurable space such that the cardinality of Ω is finite (|Ω| < ∞). We will extend the concept of a random variable to non finite sample spaces in a later section.

Definition 26 Let X : Ω → R be a function from Ω to the real line. (i) The sigma algebra

σ(X) := {X^{-1}(B) : B is a subset of R},

where X^{-1}(B) is a short notation for the preimage {ω : X(ω) ∈ B} of B under X, is called the sigma algebra generated by X. (ii) X is called a random variable on (Ω, G) if it is measurable with respect to G, that is if σ(X) ⊆ G.

Remark 27 (i) It is useful to know some properties of preimages. We have for any subset B of R, and for any (not necessarily countable) family (B_α)_{α ∈ A} of subsets of R:

X^{-1}(B^c) = (X^{-1}(B))^c,
X^{-1}(∪_{α ∈ A} B_α) = ∪_{α ∈ A} X^{-1}(B_α),
X^{-1}(∩_{α ∈ A} B_α) = ∩_{α ∈ A} X^{-1}(B_α).

(ii) σ(X) is a sigma algebra. Indeed, ∅ = X^{-1}(∅) ∈ σ(X). Moreover, if A = X^{-1}(B) for some subset B of R, then A^c = (X^{-1}(B))^c = X^{-1}(B^c) ∈ σ(X), because B^c is a subset of R. Similarly, given a sequence (A_i)_{i ∈ N} such that A_i = X^{-1}(B_i) for a sequence of subsets (B_i)_{i ∈ N} of R, we have:

∪_{i ∈ N} A_i = ∪_{i ∈ N} X^{-1}(B_i) = X^{-1}(∪_{i ∈ N} B_i) ∈ σ(X),

because ∪_{i ∈ N} B_i is a subset of R. (iii) σ(X) represents the partial information set that is available about an outcome ω ∈ Ω by observing the values of X.

Example 28 In the two period binomial model S_0, S_1 and S_2 are all trivially measurable with respect to the finest sigma algebra F over Ω. However, since S_0 is constant we have

σ(S_0) = {∅, Ω} = G_0,

and S_0 is G_0 measurable. Further,

σ(S_1) = {∅, Ω, {HT, HH}, {TT, TH}} = G_1,

and S_1 is G_1 but not G_0 measurable. Finally,

σ(S_2) = {∅, Ω, {HH}, {TH, HT, TT}, {TT}, {TH, HT, HH}, {TH, HT}, {TT, HH}} = G_3.

Therefore, S_2 is G_3 but not G_1 measurable. On the other hand, S_1 is G_1 but not G_3 measurable (why?).

1.2.4 Expected Value of Random Variables Defined on Finite Measurable Spaces

For the whole section let (Ω, G, P) be a probability space such that the cardinality of Ω is finite (|Ω| < ∞). We will extend the concept of expected value of a random variable to the non finite sample space setting in a later section. Further, let X : (Ω, G) → R be a random variable.

Definition 29 (i) The expected value E(X) of a random variable X defined on a finite sample space is given by

E(X) := Σ_{ω ∈ Ω} X(ω) P({ω}).

(ii) The variance Var(X) of X is given by

Var(X) := E[(X - E(X))^2] = E(X^2) - E(X)^2.
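On a finite Ω the sigma algebra σ(X) of Definition 26 consists of the preimages of all subsets of the (finite) range of X, so it can be listed explicitly. A small sketch (our own illustration) recovers σ(S_1) = G_1 from Example 28:

```python
from itertools import combinations

OMEGA = ["HH", "HT", "TH", "TT"]
S0, u = 4.0, 2.0
d = 1.0 / u

def S1(w):
    return S0 * (u if w[0] == "H" else d)   # price after the first movement

def sigma(X):
    """sigma(X) = preimages of all subsets of the range of X."""
    values = sorted(set(X(w) for w in OMEGA))
    family = set()
    for k in range(len(values) + 1):
        for vs in combinations(values, k):
            family.add(frozenset(w for w in OMEGA if X(w) in vs))
    return family

print(sorted(map(sorted, sigma(S1))))
# [[], ['HH', 'HT'], ['HH', 'HT', 'TH', 'TT'], ['TH', 'TT']]  =  G_1
```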

Example 30 In the two period binomial model of Example 1 we have:

S_2(HH) = 16, P({HH}) = p^2
S_2(HT) = S_2(TH) = 4, P({TH}) = P({HT}) = p(1 - p)
S_2(TT) = 1, P({TT}) = (1 - p)^2

Therefore,

E(S_2) = 16 p^2 + 4 · 2p(1 - p) + 1 · (1 - p)^2.

1.2.5 Examples of Probability Spaces and Random Variables with Finite Sample Space

Example 31 The Bernoulli distribution with parameter p is a probability measure P on the measurable space (Ω, G) given by Ω := {0, 1}, G := F, such that:

P({1}) = p ∈ (0, 1).

Example 32 The Binomial distribution with parameters n and p is a probability measure P on a measurable space (Ω, G) given below. The sample space is given by

Ω := {n dimensional sequences with components 0 or 1}.

For instance, a possible element of Ω is ω = (0, 1, 1, 0, ..., 1) (n components). Further, we set G := F. Finally, P is given by

P({ω}) = p^{# of 1 in ω} (1 - p)^{# of 0 in ω}.

For instance, using the properties of a probability measure we have:

P(at least a 1 over the n components) = 1 - P(no 1 over the n components) = 1 - (1 - p)^n,

and so forth.

Example 33 A discrete uniform distribution modelling the toss of a fair die is obtained by setting Ω := {1, 2, 3, 4, 5, 6}, G := F, and

P({ω}) = 1/6, ω ∈ Ω.

For instance, using the properties of a probability measure we then have:

P(obtaining an even number) = P({2}) + P({4}) + P({6}) = 1/2,

and so forth.

Example 34 A discrete uniform distribution modelling the toss of two independent fair dice is obtained by setting

Ω := {11, 12, 13, 14, 15, 16, 21, 22, ..., 66},

G := F, and P({ω}) = 1/36, ω ∈ Ω. For instance, using the properties of a probability measure we then have:

P(the sum of the two numbers is larger than 10) = P({66}) + P({56}) + P({65}) = 1/12,

and so forth. Let X : Ω → {2, 3, 4, ..., 12} be the function giving the sum of the numbers on the two dice. Then σ(X) is the sigma algebra generated by the preimages

X^{-1}(2) = {11}, X^{-1}(3) = {12, 21}, X^{-1}(4) = {13, 31, 22}, ...,

and σ(X) ⊆ F; that is, X is a random variable on (Ω, F).
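The computations of Examples 33 and 34, together with the expected value of Definition 29, can be reproduced exactly. A minimal sketch, our own illustration:

```python
from fractions import Fraction
from itertools import product

OMEGA = list(product(range(1, 7), repeat=2))   # two fair dice, 36 outcomes
P_atom = Fraction(1, 36)

X = lambda w: w[0] + w[1]   # sum of the two dice

P_sum_large = sum(P_atom for w in OMEGA if X(w) > 10)
E_X = sum(X(w) * P_atom for w in OMEGA)

print(P_sum_large)   # 1/12, as in Example 34
print(E_X)           # E(X) = 7, by Definition 29
```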

1.3 General Probability Spaces

Definition 21 of a probability space does not require the assumption |Ω| < ∞.

1.3.1 Some First Examples of Probability Spaces with non finite Sample Spaces

A first simple example of a probability space defined on a non finite sample space is the following.

Example 35 Let Ω = R, G = B(R) and define

P(A) = µ(A ∩ [0, 1]).

P is a probability measure, the uniform distribution on the interval [0, 1]. Indeed, we have:

1. P(Ω) = µ(Ω ∩ [0, 1]) = µ([0, 1]) = 1.
2. For any disjoint sequence (A_i)_{i ∈ N} ⊆ B(R) it follows

P(∪_{i ∈ N} A_i) = µ((∪_{i ∈ N} A_i) ∩ [0, 1]) = µ(∪_{i ∈ N} (A_i ∩ [0, 1])) = Σ_{i ∈ N} µ(A_i ∩ [0, 1]) = Σ_{i ∈ N} P(A_i).

More generally, setting

P(A) = µ(A ∩ [a, b]) / µ([a, b])

defines a uniform distribution on the interval [a, b].

A famous example of a probability space with non finite sample space is the one underlying a Poisson distribution on N.

Example 36 Let Ω := N and G := F. Thus in this case Ω is an infinite, countable, sample space. We define for any ω ∈ Ω

P({ω}) := (λ^ω / ω!) e^{-λ}, λ > 0.

Setting for A ∈ F

P(A) := Σ_{ω ∈ A} P({ω}),

one obtains the Poisson distribution on (N, F) with parameter λ. P is a probability measure on (Ω, F). Indeed, we have

P(Ω) = Σ_{ω ∈ Ω} P({ω}) = Σ_{k=0}^∞ (λ^k / k!) e^{-λ} = 1,

and, for any disjoint sequence (A_i)_{i ∈ N} ⊆ F,

P(∪_{i ∈ N} A_i) = Σ_{ω ∈ ∪_i A_i} P({ω}) = Σ_{i ∈ N} Σ_{ω ∈ A_i} P({ω}) = Σ_{i ∈ N} P(A_i).

The last example of a probability space with non finite sample space that we present is the one underlying a binomial experiment where n → ∞.

Example 37 Let Ω := {T, H}^∞ be the space of infinite sequences with components T or H. Thus any outcome ω ∈ Ω is of the form ω = (ω_i)_{i ∈ N}, ω_i ∈ {T, H}. This is an infinite, uncountable, sample space. Therefore, some caution is needed in constructing a suitable sigma algebra on Ω, on which we are enabled in a second step to extend the binomial distribution in a consistent way. We define

G_n := {the sigma algebra generated by the first n tosses},

for any n ∈ N. For instance, we obtain for G_1:

G_1 = {∅, Ω, {ω ∈ Ω : ω_1 = T}, {ω ∈ Ω : ω_1 = H}},

and so on for n > 1. We know that there is a sigma algebra F over Ω such that G_n ⊆ F for all n ∈ N. However, this sigma algebra is too large to assign binomial probabilities on it in a consistent way. Therefore, we work in the sequel with the smallest sigma algebra containing all G_n's. We define

G := ∩ {H : H is a sigma algebra over Ω, ∪_{n ∈ N} G_n ⊆ H},

the sigma algebra generated by ∪_{n ∈ N} G_n. Notice that G contains events that can be quite rich and that do not belong to any G_n, n ∈ N. An example of such an event is

A := {H on every toss} = {ω ∈ Ω : ω_i = H for all i ∈ N} = ∩_{n ∈ N} {ω ∈ Ω : ω_i = H for i ≤ n} ∈ G,

where {ω ∈ Ω : ω_i = H for i ≤ n} = {H on the first n tosses} ∈ G_n. We now define a probability measure P on G whose restriction on any G_n is a binomial distribution with parameters n and p. Precisely, define for any A ∈ G_n and some given n ∈ N

P(A) = p^{# of H in the first n tosses} (1 - p)^{# of T in the first n tosses}.

For instance, for the event

{H on the first 2 tosses} = {ω ∈ Ω : ω_i = H for i ≤ 2},

we obtain

P(H on the first 2 tosses) = p^2,

and so forth. Using the properties of a probability measure we can then uniquely extend P to all of G. For instance, we have

P(H on all tosses) ≤ P(H on the first n tosses) = p^n,

for all n ∈ N. Therefore, for p ∈ (0, 1) it follows P(H on all tosses) = 0.
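The bound P(H on all tosses) ≤ p^n can be made concrete numerically; the geometric collapse below is also an instance of the continuity from above property discussed in the next subsection, since the events {H on the first n tosses} decrease to {H on every toss}.

```python
p = 0.9
for n in (1, 10, 100, 1000):
    print(n, p**n)   # 0.9, 0.3487..., 2.66e-05, 1.75e-46: the bound collapses to 0
```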

1.3.2 Continuity Properties of Probability Measures

Two further continuity properties of a probability measure - in excess of the properties in Proposition 23 - are useful when working with countable set operations over monotone sequences of events. They are given below.

Proposition 38 Let (A_n)_{n ∈ N} ⊆ G be a countable sequence of events. It then follows:

1. If A_1 ⊆ A_2 ⊆ ..., then

P(A_n) ↑ P(∪_{n ∈ N} A_n)

(continuity from below).
2. If A_1 ⊇ A_2 ⊇ ..., then

P(A_n) ↓ P(∩_{n ∈ N} A_n)

(continuity from above).

Proof. 1. Let A := ∪_{n ∈ N} A_n. We have

A = ∪_{n ∈ N} (A_n \ A_{n-1}),

where A_0 := ∅. Thus, under the given assumption the event A is written as a countable, disjoint, union of sets in G. It then follows, using the properties of a probability measure,

P(A) = Σ_{n ∈ N} P(A_n \ A_{n-1}) = Σ_{n ∈ N} [P(A_n) - P(A_{n-1})] = lim_{n → ∞} P(A_n) - P(A_0) = lim_{n → ∞} P(A_n).

2. We have

P(A_n) ↓ P(∩_{n ∈ N} A_n) ⟺ P(A_n^c) ↑ P((∩_{n ∈ N} A_n)^c) = P(∪_{n ∈ N} A_n^c),

by de Morgan's law. The proof now follows from 1.

1.3.3 Random Variables

For the whole section let (Ω, G, P) be a probability space and (R, B(R)) be the Borel measurable space over R. When working with uncountable sample spaces, the measurability requirement behind Definition 26 of a random variable for finite sample spaces has to be modified. Basically, we are going to require measurability only for preimages of any Borel subset of R, rather than measurability for preimages of any subset of R. This is a necessary step, in order to be able to assign consistently probabilities to Borel events determined by the images of some random variable on (Ω, G, P).

Definition 39 Let X : Ω → R be a real valued function. (i) The sigma algebra

σ(X) := {X^{-1}(B) : B ∈ B(R)}

is the sigma algebra generated by X. (ii) X is a random variable on (Ω, G) if σ(X) ⊆ G.

Example 40 For a set A ⊆ Ω let the function 1_A : Ω → {0, 1} be defined by

1_A(ω) = 1 if ω ∈ A, 0 otherwise.

1_A is called the indicator function of the set A. We have (please verify)

σ(1_A) = {∅, Ω, A, A^c}.

Hence, 1_A is a random variable over (Ω, G) if and only if A ∈ G.

The measurability property in Definition 39 allows us to assign in a natural way probabilities also to Borel events that are induced by images of random variables, as is illustrated in the next example.

Example 41 Let X be a random variable on a probability space (Ω, F, P). For any event B ∈ B(R) we define

L_X(B) := P(X^{-1}(B)).   (4)

L_X is a probability measure on B(R), the probability distribution of X (or the probability induced by X on B(R)). Remark that (4) is well defined precisely because of the measurability of the random variable X. Showing that L_X is indeed a probability measure is very simple. In fact, we have:

L_X(R) = P(X^{-1}(R)) = P(X ∈ R) = P(Ω) = 1.

Moreover, for any sequence (B_i)_{i ∈ N} of disjoint events we obtain:

L_X(∪_{i ∈ N} B_i) = P(X^{-1}(∪_{i ∈ N} B_i)) = P(∪_{i ∈ N} X^{-1}(B_i)) = Σ_{i=1}^∞ P(X^{-1}(B_i)) = Σ_{i=1}^∞ L_X(B_i),

using in the third equality the fact that (B_i)_{i ∈ N}, and thus also (X^{-1}(B_i))_{i ∈ N}, is a sequence of disjoint events.
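On a finite space the distribution L_X of Example 41 is obtained by pushing the atom probabilities forward through X. A minimal sketch (our own; p = 1/2 and X = S_2 from the two period tree):

```python
from collections import defaultdict
from fractions import Fraction

p = Fraction(1, 2)
P_atom = {"HH": p**2, "HT": p*(1-p), "TH": (1-p)*p, "TT": (1-p)**2}
S2 = {"HH": 16, "HT": 4, "TH": 4, "TT": 1}

law = defaultdict(Fraction)              # L_X({x}) = P(X^{-1}({x}))
for w, prob in P_atom.items():
    law[S2[w]] += prob

print(dict(law))                         # {16: 1/4, 4: 1/2, 1: 1/4}
assert sum(law.values()) == 1            # L_X is again a probability measure
```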

Checking measurability of a candidate random variable can be, by definition, a quite hard and lengthy task, since we have to check preimages of any Borel subset of R. Fortunately, the next result offers a much easier criterion by which measurability is easy to verify in many applications.

Proposition 42 For a function X : Ω → R let

E := {X^{-1}((-∞, t)) : t ∈ R} = {{X < t} : t ∈ R}

be the set of preimages of open intervals of the form (-∞, t) under X. Then it follows:

E ⊆ G ⟺ σ(X) ⊆ G.

Proof. Define:

H := {B ∈ B(R) : X^{-1}(B) ∈ G} ⊆ B(R).

It is sufficient to show that under the given conditions B(R) ⊆ H, i.e. B(R) = H. We start by showing that H is a sigma algebra. We have first X^{-1}(∅) = ∅ ∈ G, hence ∅ ∈ H. Second, for a set B ∈ H it follows

X^{-1}(B^c) = (X^{-1}(B))^c ∈ G,

since X^{-1}(B) ∈ G. Finally, for a sequence (B_n)_{n ∈ N} ⊆ H we have

X^{-1}(∪_{n ∈ N} B_n) = ∪_{n ∈ N} X^{-1}(B_n) ∈ G,

since each X^{-1}(B_n) ∈ G, showing that H is a sigma algebra as claimed. Since B(R) is by definition the smallest sigma algebra containing all open intervals on the real line, it is sufficient to show that under the given conditions H contains all open intervals on the real line. To this end, recall that all sets of the form (-∞, t) are by assumption elements of H. For a general open interval (a, b), a ≤ b, it then follows:

X^{-1}((a, b)) = X^{-1}((-∞, b) ∩ (∩_{n ∈ N} (-∞, a + 1/n))^c) = X^{-1}((-∞, b)) ∩ (∩_{n ∈ N} X^{-1}((-∞, a + 1/n)))^c ∈ G,

since X^{-1}((-∞, b)) ∈ G and each X^{-1}((-∞, a + 1/n)) ∈ G. This concludes the proof of the proposition.

Example 43 Let (X_n)_{n ∈ N} be an arbitrary sequence of random variables on (Ω, G). It then follows:

1. aX_1 + bX_2 is a random variable for any a, b ∈ R
2. sup_{n ∈ N} X_n and inf_{n ∈ N} X_n are random variables
3. lim sup X_n := lim_{n → ∞} sup_{k ≥ n} X_k and lim inf X_n := lim_{n → ∞} inf_{k ≥ n} X_k are random variables.

Proof. We apply several times Proposition 42. 1. For a, b > 0 we have

{aX_1 + bX_2 < t} = ∪_{r ∈ Q} ({aX_1 < r} ∩ {bX_2 < t - r}) = ∪_{r ∈ Q} ({X_1 < r/a} ∩ {X_2 < (t - r)/b}) ∈ G,

since each {X_1 < r/a} ∈ G and {X_2 < (t - r)/b} ∈ G; the remaining sign cases for a, b are handled similarly.

2. For statement 2. first note that for any random variable X,

{X ≤ t} = ∩_{m ∈ N} {X < t + 1/m} ∈ G and, conversely, {X < t} = ∪_{m ∈ N} {X ≤ t - 1/m},

so that the criterion of Proposition 42 can equivalently be verified on sets of the form {X ≤ t}. Then

{sup_{n ∈ N} X_n ≤ t} = ∩_{n ∈ N} {X_n ≤ t} ∈ G, {inf_{n ∈ N} X_n < t} = ∪_{n ∈ N} {X_n < t} ∈ G.

3. For any n ∈ N it follows that Y_n := sup_{k ≥ n} X_k and Z_n := inf_{k ≥ n} X_k are random variables, by 2. Moreover, the sequences (Y_n)_{n ∈ N} and (Z_n)_{n ∈ N} are monotonically decreasing and increasing, respectively. Therefore

lim sup X_n = lim_{n → ∞} Y_n = inf_{n ∈ N} Y_n, lim inf X_n = lim_{n → ∞} Z_n = sup_{n ∈ N} Z_n,

which are again random variables by 2. This concludes the proof.

1.3.4 Expected Value and Lebesgue Integral

For the whole section let (Ω, G, P) be a probability space and (R, B(R)) be the Borel measurable space over R. The expected value of a general random variable is defined as its Lebesgue integral with respect to some probability measure P on (Ω, G). More generally, Lebesgue integrals of measurable functions can be defined with respect to some measure (as for instance the Lebesgue measure µ) defined on a corresponding measurable space (as for instance the measurable space (R, B(R))). The construction of the Lebesgue integral for a general random variable X starts by defining the value of the Lebesgue integral for linear combinations of indicator functions, goes over to extend the integral to functions that are pointwise monotonic limits of sequences of simple functions, and finally defines the integral for the more general case of an integrable random variable (see below for the precise definition).

Definition 44 (i) A random variable X is simple if

X = Σ_{i=1}^n c_i 1_{A_i},

where n ∈ N, c_1, .., c_n ∈ R, and A_1, .., A_n ∈ G are mutually disjoint events. The vector space of simple random variables on (Ω, G) is denoted by S(G). The expected value E(X) of a simple function X is defined by

E(X) := ∫_Ω X dP := Σ_{i=1}^n c_i P(A_i).

(ii) Let X be a non negative random variable. The expected value E(X) of X is defined by

E(X) := ∫_Ω X dP := sup {∫_Ω Y dP : Y ≤ X and Y ∈ S(G)}.

(iii) A random variable X is integrable if

E(X^+) < ∞, E(X^-) < ∞,

where X^+ := max(X, 0) and X^- := max(-X, 0) are the positive and negative part of X, respectively. We denote the vector space of integrable random variables by L^1(P). For any X ∈ L^1(P) the expected value E(X) of X is defined by

E(X) = E(X^+) - E(X^-).

(iv) Finally, for a random variable X ∈ L^1(P) and a set A ∈ G we define

∫_A X dP := ∫_Ω 1_A X dP.

Remark 45 (i) The key point in the definition of E(X) is (ii). In fact, (ii) is a quite reasonable definition because for any non negative random variable X there always exists a sequence (X_n)_{n ∈ N} of simple random variables converging monotonically pointwise to X from below. Such a sequence is obtained for instance by setting, for any ω ∈ Ω,

X_n(ω) = Σ_{k=1}^{n 2^n} ((k - 1)/2^n) 1_{{(k-1)/2^n < X ≤ k/2^n}}(ω) + n 1_{{X > n}}(ω).

Moreover, it can be shown that the limit of the sequence of integrals E(X_n) does not depend on the choice of the specific approximating sequence. Therefore, (ii) in Definition 44 could also be equivalently written as

E(X) := lim_{n → ∞} E(X_n) := lim_{n → ∞} ∫_Ω X_n dP,

for a given approximating sequence (X_n)_{n ∈ N}. (ii) As mentioned, expected values are by definition just integrals of measurable functions with respect to some probability measure. In fact, the definition of the Lebesgue integral of a measurable function with respect to some measure µ, say, follows exactly the same steps as above, readily by replacing everywhere the probability measure P with the measure µ in (i), (ii), (iii) and (iv).
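The staircase approximation of Remark 45 is easy to visualize numerically. The sketch below uses the standard dyadic variant X_n = min(floor(2^n X)/2^n, n), a close relative of the formula above that also converges to X from below, for X(ω) = ω^2 under the uniform distribution on [0, 1] of Example 35; the expected values increase towards E(X) = 1/3.

```python
from math import floor

def X(w):
    return w * w   # a non negative random variable on [0, 1]

def X_n(w, n):
    """Simple staircase approximation from below (cf. Remark 45)."""
    return min(floor(X(w) * 2**n) / 2**n, n)

M = 200_000   # fine grid on [0, 1] approximating the uniform expectation
for n in (1, 2, 4, 8):
    E_Xn = sum(X_n((i + 0.5) / M, n) for i in range(M)) / M
    print(n, E_Xn)   # monotonically increasing towards E(X) = 1/3
```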

Let us discuss some first very simple examples of expected values computed using the above definitions.

Example 46 Let Ω := R, G := B(R) and set for any A ∈ G

P(A) = µ(A ∩ [0, 1]).

The expected value of X := 1_Q is

E(X) = 1 · µ(Q ∩ [0, 1]) = 0,

because Q is a countable set. Notice that this function is not Riemann integrable in the usual sense. The expected value of

Y(ω) := ∞ if ω = 0, 0 otherwise,

can be computed as the limit of the expected values in an approximating sequence (X_n)_{n ∈ N} of simple functions given by

X_n(ω) := n if ω = 0, 0 otherwise.

Hence:

E(Y) = lim_{n → ∞} E(X_n) = lim_{n → ∞} n µ({0} ∩ [0, 1]) = lim_{n → ∞} n µ({0}) = 0.

Notice that also Y is not Riemann integrable in the usual sense.

The basic properties of the above integral definition are collected in the next proposition.

Proposition 47 Let X, Y ∈ L^1(P) and a, b ∈ R; it then follows:

1. E(aX + bY) = a E(X) + b E(Y)
2. If X ≤ Y pointwise, then E(X) ≤ E(Y)
3. For two sets A, B ∈ G such that A ∩ B = ∅ it follows

∫_{A ∪ B} X dP = ∫_Ω 1_{A ∪ B} X dP = ∫_Ω (1_A + 1_B) X dP = ∫_A X dP + ∫_B X dP.

Proof. 1. For brevity we show this property only for indicator functions X = 1_A, Y = 1_B, where A, B ∈ G are disjoint events. We have, by Definition 44 (i),

E(aX + bY) = E(a 1_A + b 1_B) = a P(A) + b P(B) = a E(1_A) + b E(1_B) = a E(X) + b E(Y).

2. If Y ≥ X, then there exists a sequence of simple approximating functions (X_n) converging monotonically to the non negative random variable Y - X. This implies, using 1.:

E(Y) - E(X) = E(Y - X) = lim_{n → ∞} E(X_n) = lim_{n → ∞} Σ_{i=1}^{k_n} c_{in} P(A_{in}) ≥ 0,

say, because for any n ∈ N we have c_{1n}, .., c_{k_n n} ≥ 0.

1.3.5 Some Further Examples of Probability Spaces with uncountable Sample Spaces

For the whole section let (Ω, G, P) be a probability space and (R, B(R)) be the Borel measurable space over R. Using Lebesgue integrals we are also able to construct probability measures by integrating a suitable density function over events A ∈ G. A well-known example in this respect arises by integrating the density function of a standard normal distribution.

Example 48 Let (Ω, G) := (R, B(R)); φ : R → R_+ is defined by

φ(x) = (1/√(2π)) exp(-x^2/2), x ∈ R.

φ is the density function of a standard normally distributed random variable and is such that

∫_R φ(x) dµ(x) = ∫_{-∞}^{∞} φ(x) dx = 1,

i.e. φ ∈ L^1(µ). A standard normal probability distribution P on (R, B(R)) is obtained by setting for any A ∈ G:

P(A) := ∫_A φ(x) dµ(x).

It is straightforward to verify, using the basic properties of Lebesgue integrals together with some monotone convergence property, that P is indeed a probability measure.
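A quick numerical sanity check of Example 48 (a sketch using a plain midpoint rule; only the standard math module is assumed): the density integrates to 1, and integrating it over [0, 1] recovers the familiar one sided one sigma probability.

```python
from math import exp, pi, sqrt

def phi(x):
    return exp(-x * x / 2) / sqrt(2 * pi)   # standard normal density

def integrate(f, lo, hi, N=100_000):
    """Midpoint rule approximation of the integral of f over [lo, hi]."""
    h = (hi - lo) / N
    return sum(f(lo + (i + 0.5) * h) for i in range(N)) * h

print(integrate(phi, -8.0, 8.0))   # ~ 1.0 (the tails beyond +-8 are negligible)
print(integrate(phi, 0.0, 1.0))    # ~ 0.3413 = P([0, 1]) under the normal law
```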

More generally, densities can also be defined on abstract probability spaces, as is demonstrated in the next final example.

Example 49 Let X ≥ 0 be a random variable on (Ω, G) such that X ∈ L^1(P), and define

Q(A) := E(1_A X) / E(X).

It is easy to verify, using the basic properties of Lebesgue integrals together with some monotone convergence property, that Q is a further probability measure on (Ω, G). Moreover, the absolute continuity property

P(A) = 0 ⟹ Q(A) = 0

follows from the definition. If, moreover, Q(A) = 0 ⟹ P(A) = 0, the probabilities Q and P are called equivalent. This property holds when X > 0. The random variable

Z := X / E(X)

is called the Radon-Nikodym derivative of Q with respect to P, denoted by dQ/dP. By construction, dQ/dP is a density function on (Ω, G), because dQ/dP ≥ 0 and

E(dQ/dP) = E(X / E(X)) = 1.
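Example 49 is transparent on a finite space, where E(1_A X) is a finite sum. A minimal sketch (our own; the uniform P and the strictly positive X below are arbitrary choices):

```python
from fractions import Fraction

P_atom = {w: Fraction(1, 4) for w in ("HH", "HT", "TH", "TT")}
X = {"HH": Fraction(3), "HT": Fraction(1), "TH": Fraction(1), "TT": Fraction(3)}

E_X = sum(X[w] * P_atom[w] for w in P_atom)       # E(X) = 2
Z = {w: X[w] / E_X for w in P_atom}               # Radon-Nikodym derivative dQ/dP
Q_atom = {w: Z[w] * P_atom[w] for w in P_atom}    # Q({w}) = Z(w) P({w})

assert sum(Q_atom.values()) == 1                  # Q is a probability measure
assert sum(Z[w] * P_atom[w] for w in P_atom) == 1 # E(dQ/dP) = 1
print(Q_atom)   # {'HH': 3/8, 'HT': 1/8, 'TH': 1/8, 'TT': 3/8}
```

Since X > 0 here, P and Q are equivalent: they share the same null events.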

1.4 Stochastic Independence

For the whole section let (Ω, G, P) be a probability space.

Definition 50 Two events A, B ∈ G are stochastically independent if

P(A ∩ B) = P(A) P(B).   (5)

We use the notation A ⊥ B to denote two independent events.

Remark 51 Condition (5) states that two events are independent if and only if their conditional and unconditional probabilities are the same, i.e.:

P(A|B) := P(A ∩ B) / P(B) = P(A) P(B) / P(B) = P(A) ⟺ A ⊥ B,

provided of course P(B) > 0. This property is symmetric in A, B.

Example 52 Stochastic independence is a feature determined by the structure of the underlying probability P. As an illustration of this fact consider again the two period binomial model of Example 1. We have there:

P({HH, HT}) P({HT, TH}) = (p^2 + p(1 - p)) · 2p(1 - p) = 2p^2(1 - p),   (6)

and

P({HH, HT} ∩ {HT, TH}) = P({HT}) = p(1 - p).   (7)

Therefore, (6) and (7) are equal if and only if p = 1/2, that is, the only binomial probability under which the above events are independent is the one implied by p = 1/2.

The concept of stochastic independence between events can be naturally extended to stochastic independence between information sets, i.e. sigma algebras.

Definition 53 Two sigma algebras G_1, G_2 ⊆ G are stochastically independent if for all A ∈ G_1 and B ∈ G_2 one has A ⊥ B. We use the notation G_1 ⊥ G_2 to denote independent sigma algebras.

Example 54 In the two period binomial model of Example 1 we define the two following sigma algebras:

G_1 := {∅, Ω, {HT, HH}, {TT, TH}},

the sigma algebra generated by the first price movement, and

G_2 := {∅, Ω, {HH, TH}, {TT, HT}},

the sigma algebra generated by the second price movement. We then have, for any p ∈ [0, 1]: G_1 ⊥ G_2. For instance, for the sets {HT, HH} and {HH, TH} one obtains

P({HT, HH}) P({HH, TH}) = (p^2 + p(1 - p)) (p^2 + p(1 - p)) = p^2,

and

P({HT, HH} ∩ {HH, TH}) = P({HH}) = p^2.

These features derive directly from the way probabilities are assigned by a binomial distribution, where

P({ω}) = p^{# of H in ω} (1 - p)^{# of T in ω}.

Finally, we can also define independence between random variables as independence of the information sets they generate.

Definition 55 Two random variables X, Y on (Ω, G, P) are independent if σ(X) ⊥ σ(Y). We use the notation X ⊥ Y to denote independence between random variables.

Example 56 We already discussed that the two sigma algebras G_1, G_2 of Example 54 are independent in the binomial model. Notice that we have (please verify!)

G_1 = {∅, Ω, {HT, HH}, {TT, TH}} = σ(S_1/S_0),

and

G_2 = {∅, Ω, {HH, TH}, {TT, HT}} = σ(S_2/S_1).

Therefore, the stock price returns S_1/S_0 and S_2/S_1 in a binomial model are stochastically independent.

Example 57 Let A, B ∈ G be two independent events and let the functions

1_A(ω) = 1 if ω ∈ A, 0 otherwise, 1_B(ω) = 1 if ω ∈ B, 0 otherwise,

be the indicator functions of the sets A and B, respectively. We then have (please verify):

σ(1_A) = {∅, Ω, A, A^c}, σ(1_B) = {∅, Ω, B, B^c}.

Therefore, 1_A ⊥ 1_B if and only if A ⊥ B (please verify).

Some properties related to independence are important. The first one says that independence is maintained under measurable transformations.

Proposition 58 Let X, Y be independent random variables on (Ω, G, P) and h, g : R → R be two measurable functions. It then follows: h(X) ⊥ g(Y).

Proof. We give a graphical proof of this statement, which makes use of the fact that the sigma algebra generated by a composite mapping is contained in the sigma algebra generated by the inner function of the composition:

σ(h(X)) ⊆ σ(X) ⊥ σ(Y) ⊇ σ(g(Y)),

where σ(X) ⊥ σ(Y) holds by assumption.

The second important property of stochastic independence is related to the expectation of a product of random variables.

Proposition 59 Let X, Y be independent random variables on (Ω, G, P). It then follows:

E(XY) = E(X) E(Y).

Proof. For the sake of brevity we give the proof for the simplest case where X = 1_A, Y = 1_B, for events A, B ∈ G such that A ⊥ B. As usual, the extension of this result to more general settings requires considering linear combinations of indicator functions, i.e. simple functions, and pointwise limits of simple functions. For the given simplified setting we have:

E(XY) = E(1_A 1_B) = E(1_{A ∩ B}) = 1 · P(A ∩ B) + 0 · P((A ∩ B)^c) = P(A ∩ B) = P(A) P(B) = E(1_A) E(1_B) = E(X) E(Y),

using A ⊥ B in the fifth equality. This concludes the proof.
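The computation in Example 52 can be scanned over several values of p, confirming that independence of these two events is a property of the measure, not of the events alone (a sketch with exact rational arithmetic):

```python
from fractions import Fraction

A, B = {"HH", "HT"}, {"HT", "TH"}   # the two events of Example 52

def P(event, p):
    atoms = {"HH": p*p, "HT": p*(1-p), "TH": p*(1-p), "TT": (1-p)*(1-p)}
    return sum(atoms[w] for w in event)

for p in (Fraction(1, 4), Fraction(1, 2), Fraction(2, 3)):
    independent = P(A & B, p) == P(A, p) * P(B, p)
    print(p, independent)   # True only for p = 1/2
```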

2 Conditional Expectations and Martingales

For the whole section let (Ω, G, P) be a probability space.

2.1 The Binomial Model Once More

For later reference, we summarize the structure of a general n period binomial model, since it will be used to illustrate some of the concepts introduced below.

- I := {0, 1, 2, .., n} is a discrete time index representing the available transaction dates in the model
- The sample space is given by Ω := {sequences of n coordinates H or T}, with single outcomes ω of the form, for instance, ω = TTTH..HT (n coordinates)
- G := F, the sigma algebra of all subsets of Ω
- Dynamics of the stock price and money account:

S_t = u S_{t-1} with probability p, S_t = d S_{t-1} with probability 1 - p, B_t = (1 + r) B_{t-1},

for given B_0 = 1, S_0, and where u = 1/d, u > 1 + r > d.

The sequence (S_t)_{t=0,..,n} is a sequence of random variables defined on a single probability space (Ω, G, P). This is an example of a so called stochastic process on (Ω, G, P). Associated with stochastic processes are flows of information sets (i.e. sigma algebras) generated by the process history up to a given time. For instance, for any t ∈ I we can define

G_t := σ(σ(S_0), σ(S_1), .., σ(S_t)) := σ(∪_{k=0}^t σ(S_k)),

the smallest sigma algebra containing all sigma algebras generated by S_0, S_1, ..., S_t. G_t represents the information about a single outcome ω ∈ Ω which can be obtained exclusively by observing the price process up to time t. Clearly, G_t ⊆ G_s for t ≤ s. Therefore, the sequence (G_t)_{t=0,..,n} constitutes a filtration, the filtration generated by the process (S_t)_{t=0,..,n}.

2.2 Sub Sigma Algebras and Partial Information

We model partial information about single outcomes ω ∈ Ω or about single events A ∈ G using sub sigma algebras of G.

Example 60 Let X be a random variable on (Ω, G). Then σ(X) is by definition a sub sigma algebra of G. σ(X) represents the partial information about an outcome ω ∈ Ω which can be obtained by observing X(ω). For instance, set n = 3 in the above binomial model and consider the outcome ω = TTT. By observing S_1, i.e. using σ(S_1) as the available information set, we can only conclude

ω ∈ {TTT, THH, THT, TTH} ⟺ S_1(ω) = S_0 d.

However, when observing all price movements from t = 0 to t = 3 we can make use of the sigma algebra

G_3 := σ(∪_{t=0}^3 σ(S_t))

to fully identify ω ∈ Ω. Both σ(S_1) and G_3 are sub sigma algebras of G, which however represent different pieces of information about ω ∈ Ω.

Based on the above simple considerations we can now formally define what it means for an event to be realized.

Definition 61 (i) An event A ∈ G is realized by means of a sub sigma algebra G' ⊆ G if A ∈ G'. (ii) Let G_t be a sigma algebra generated by some price process up to time t (see for instance the above examples). We say that A is realized by means of the price information up to time t if A ∈ G_t.

Remark 62 By definition, realization of an event A ∈ G by means of G' is precisely measurability of that event with respect to the sub sigma algebra G'. Precisely, given an event A ∈ G we can determine it uniquely using G', i.e. we can say that A has been realized, if and only if A ∈ G'. For instance, in the above 3 period binomial model we can consider the event A = {TTT}. Clearly, A ∉ σ(S_1), since we do not know, using σ(S_1), the value of the second and the third coin tosses. Therefore, A is not realized by means of σ(S_1), i.e. it is not realized by means of the price information up to time 1. However,

A ∈ G_3 := σ(∪_{t=0}^3 σ(S_t)),

i.e. A is realized by means of the whole price information available up to time 3.

Example 63 The event {the first two price returns are both positive} is realized by means of the price information up to time 2, while the event {the total number of positive price returns is 2} is not.

2.3 Conditional Expectations

For the whole section let X be a random variable on (Ω, G).

2.3.1 Motivation

Given an event A = X^{-1}(a) ∈ G, for some a ∈ R, we are always able to identify for any ω ∈ A the corresponding value X(ω) of the random variable X using the information set G. Indeed, we then have by definition

ω ∈ X^{-1}(a) ∈ σ(X) ⊆ G, i.e. X(ω) = a,

for all ω ∈ A. However, using a coarser information set G' ⊆ σ(X) it may happen that we are not able to fully determine the value X(ω) that a random variable X associates to a given single outcome ω ∈ A. Specifically, it may happen that based on the information available in G' we can only state, for some non singleton set B ∈ B(R),

ω ∈ X^{-1}(B), i.e. X(ω) ∈ B.   (8)

In this case, the information set G' is not sufficiently fine to fully determine the precise value X(ω) associated with a specific ω ∈ A. Thus, the goal in such a situation is to define a suitable candidate prediction E(X|G')(ω) for the unknown value X(ω), based on the information G'. We will call E(X|G') the conditional expectation of X conditionally on G'. Notice that a first necessary requirement on E(X|G') is that it can be fully determined using the information G', that is, it has to be G' measurable. Further, a natural idea is to compute the prediction E(X|G') as an unbiased forecast, such that the expectations of E(X|G') and X agree on all sets A ∈ G':

∫_A E(X|G') dP = ∫_A X dP

(see below for the precise definition).

2.3.2 Definition and Properties

Definition 64 Let G' ⊆ G be a sub sigma algebra. The conditional expectation E(X|G') of X conditioned on the sigma algebra G' is a random variable satisfying:

1. E(X|G') is G' measurable
2. For any A ∈ G':

∫_A E(X|G') dP = ∫_A X dP

(partial averaging property).

In the sequel, we write for any further random variable Y on (Ω, G):

E(X|Y) := E(X|σ(Y)).

Remark 65 (i) E(X|G') exists, provided X ∈ L^1(P); this is a consequence of the so called Radon-Nikodym Theorem. (ii) The random variable E(X|G') is unique, up to events of zero probability. Precisely, if Y and Z are two candidate G' measurable random variables satisfying 2. of the above definition, then:

P(Y = Z) = 1.

Example 66 (i) If G' = {∅, Ω} then E(X|G') = E(X) 1_Ω, that is, conditional expectations conditioned on trivial information sets are unconditional expectations. Indeed, E(X) 1_Ω is G' measurable, and

∫_Ω E(X) 1_Ω dP = E(X) P(Ω) = E(X) = ∫_Ω X dP.

(ii) If X is G' measurable then E(X|G') = X, that is, if the conditioning information set is sufficiently fine to determine X completely, then the conditional expectation of a random variable is the random variable itself. Indeed, in this case we trivially have:

∫_A E(X|G') dP = ∫_A X dP, for any set A ∈ G'.
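On a finite space, Definition 64 takes a very concrete form: on each cell of the partition generating G', the conditional expectation equals the probability weighted average of X over that cell. A minimal sketch (our own; p = 1/2, conditioning S_2 on G_1):

```python
from fractions import Fraction

p = Fraction(1, 2)
P_atom = {"HH": p*p, "HT": p*(1-p), "TH": (1-p)*p, "TT": (1-p)*(1-p)}
S2 = {"HH": 16, "HT": 4, "TH": 4, "TT": 1}
partition_G1 = [{"HH", "HT"}, {"TH", "TT"}]   # the atoms of G_1

def cond_exp(X, partition):
    """E(X | G'), constant on each cell of the generating partition."""
    out = {}
    for cell in partition:
        mass = sum(P_atom[w] for w in cell)
        avg = sum(Fraction(X[w]) * P_atom[w] for w in cell) / mass
        out.update({w: avg for w in cell})
    return out

E_S2_G1 = cond_exp(S2, partition_G1)
print(E_S2_G1)   # {'HH': 10, 'HT': 10, 'TH': 5/2, 'TT': 5/2}

A = {"HH", "HT"}   # partial averaging check on A in G_1
assert sum(E_S2_G1[w] * P_atom[w] for w in A) == sum(S2[w] * P_atom[w] for w in A)
```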

Proposition 67 Let G' ⊆ G be a sub sigma algebra and X, Y ∈ L^1(P). It then follows:

1. E(E(X|G')) = E(X) (Law of Iterated Expectations)
2. For any a, b ∈ R: E(aX + bY | G') = a E(X|G') + b E(Y|G') (Linearity)
3. If X ≥ 0 then E(X|G') ≥ 0 with probability 1 (Monotonicity)
4. For any sub sigma algebra H ⊆ G': E(E(X|G') | H) = E(X|H) (Tower Property)
5. If σ(X) ⊥ G' then E(X|G') = E(X) 1_Ω (Independence)
6. If V is a G' measurable random variable such that VX ∈ L^1(P), then E(VX|G') = V E(X|G').

Proof. 1. Set A = Ω ∈ G'; by definition it then follows

E(X) = ∫_Ω X dP = ∫_Ω E(X|G') dP = E(E(X|G')).

2. By construction a E(X|G') + b E(Y|G') is G' measurable. Moreover, for any A ∈ G':

∫_A (a E(X|G') + b E(Y|G')) dP = a ∫_A E(X|G') dP + b ∫_A E(Y|G') dP = a ∫_A X dP + b ∫_A Y dP = ∫_A (aX + bY) dP,

using in the first and the third equality the linearity of Lebesgue integrals and in the second equality the definition of conditional expectations.

3. Let A := {E(X|G') < 0} ∈ G'. Then

∫_A E(X|G') dP = ∫_A X dP ≥ 0,

since X ≥ 0, and by the monotonicity of Lebesgue integrals. Further, the monotonicity of Lebesgue integrals also implies ∫_A E(X|G') dP ≤ 0, since 1_A E(X|G') ≤ 0. Therefore, ∫_A E(X|G') dP = 0, implying P(A) = 0.

4. E(X|H) is by definition H measurable. Further, for any A ∈ H:

∫_A E(X|H) dP = ∫_A X dP = ∫_A E(X|G') dP =: ∫_A Y dP,

since A ∈ G' (because H ⊆ G'). By definition, this implies that E(X|H) is the conditional expectation of the random variable Y := E(X|G') conditioned on the sigma algebra H.

5. E(X) 1_Ω is trivially G' measurable. We show the statement for the case X = 1_B, where B is independent of any A ∈ G'. We have for any A ∈ G':

∫_A E(X) 1_Ω dP = E(X) P(A) = E(1_B) P(A) = P(B) P(A) = P(A ∩ B) = E(1_A 1_B) = ∫_A X dP,

using in the fourth equality the independence assumption, in the fifth the properties of indicator functions and in the sixth the definition of X. The extension to the general case follows by standard arguments.

6. V E(X|G') is G' measurable. Again, we show the statement for the simpler case V = 1_B, where B ∈ G'. We have, for any A ∈ G',

∫_A V E(X|G') dP = ∫_A 1_B E(X|G') dP = ∫_{A ∩ B} E(X|G') dP = ∫_{A ∩ B} X dP = ∫_A 1_B X dP = ∫_A VX dP,

using in the third equality the definition of conditional expectations, and otherwise the properties of indicator functions.

Example 68 In the n period binomial model we have

E(S_1 | σ(S_1)) = S_1,

by the σ(S_1) measurability of S_1. S_2 is not σ(S_1) measurable. However, we know that σ(S_2/S_1) ⊥ σ(S_1). Therefore,

E(S_2 | σ(S_1)) = E(S_1 (S_2/S_1) | σ(S_1)) = S_1 E(S_2/S_1 | σ(S_1)) = S_1 E(S_2/S_1) = S_1 (pu + (1 - p)d).

More generally, we have

σ(S_t/S_{t-1}) ⊥ G_{t-1}, t = 1, ..., n,

where G_{t-1} := σ(∪_{k=0}^{t-1} σ(S_k)). Therefore, by the same arguments:

E(S_t | G_{t-1}) = S_{t-1} (pu + (1 - p)d).

Finally, the tower property gives after some iterations:

E(S_{t+k} | G_{t-1}) = E(E(S_{t+k} | G_{t+k-1}) | G_{t-1}) = E(S_{t+k-1} (pu + (1 - p)d) | G_{t-1}) = (pu + (1 - p)d) E(S_{t+k-1} | G_{t-1}) = ... = (pu + (1 - p)d)^k E(S_t | G_{t-1}) = (pu + (1 - p)d)^{k+1} S_{t-1}.
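Example 68 can be verified by brute force enumeration in a small tree. The sketch below (our own; a 3-period tree with p = 1/3, S_0 = 4, u = 2 as arbitrary choices) checks E(S_2 | G_1) = S_1 (pu + (1 - p)d) exactly on each atom of G_1:

```python
from fractions import Fraction
from itertools import product

p, u, S0 = Fraction(1, 3), Fraction(2), Fraction(4)
d = 1 / u

outcomes = ["".join(w) for w in product("HT", repeat=3)]

def prob(w):   # binomial probability of a full path
    return p**w.count("H") * (1 - p)**w.count("T")

def S(w, t):   # price after the first t movements of path w
    s = S0
    for toss in w[:t]:
        s *= u if toss == "H" else d
    return s

for first in "HT":   # atoms of G_1: paths sharing the first movement
    cell = [w for w in outcomes if w[0] == first]
    mass = sum(prob(w) for w in cell)
    lhs = sum(S(w, 2) * prob(w) for w in cell) / mass   # E(S_2 | G_1) on the atom
    rhs = S(cell[0], 1) * (p * u + (1 - p) * d)
    assert lhs == rhs
print("E(S_t | G_{t-1}) = S_{t-1}(pu + (1-p)d) verified")
```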

2.4 Martingale Processes

We now introduce a class of stochastic processes that is particularly important in finance: the class of martingale processes. Indeed, it will turn out in a later chapter that the price processes of many financial instruments are martingale processes after a suitable change of probability. In this section we give the necessary definitions and present some first examples of martingale processes.

Definition 69 (i) Let G := (G_t)_{t=0,..,n} be a filtration over (Ω, G, P). The quadruplet (Ω, G, G, P) is called a filtered probability space. (ii) A stochastic process X := (X_t)_{t=0,..,n} on a filtered probability space (Ω, G, G, P) is adapted (is G adapted) if for any t = 0, .., n the random variable X_t is G_t measurable. (iii) A G adapted process is a martingale if for any t = 0, .., n - 1 one has

X_t = E(X_{t+1} | G_t)   (9)

(martingale condition). The process is a submartingale (a supermartingale) if in (9) the sign ≤ (the sign ≥) holds.

Remark 70 Notice that in Definition 69 both the filtration G and the relevant probability P are crucial in determining the validity of the martingale condition (9) for an adapted process. Indeed, different probabilities and filtrations can imply (9) to be satisfied or not. For instance, in the n period binomial model we obtained, using the filtration generated by the stock price process,

E(S_t | G_{t-1}) = S_{t-1} (pu + (1 - p)d).

Therefore, the only binomial probability measure under which the stock price process is a martingale is the one satisfying pu + (1 - p)d = 1, i.e.

p = (1 - d) / (u - d).   (10)

The binomial probabilities such that p > (1 - d)/(u - d) (p < (1 - d)/(u - d)) imply a stock price process that is a submartingale (a supermartingale).

Being a martingale is a quite strong condition on a stochastic process, which strongly relates future process coordinates to current ones. This is made more explicit below.

Proposition 71 Let (X_t)_{t=0,..,n} be a martingale on the filtered probability space (Ω, G, G, P).

1. It then follows for any t, s ∈ {0, 1, ..., n} such that s ≥ t:

X_t = E(X_s | G_t).

2. If (Y_t)_{t=0,..,n} is a further martingale on the filtered probability space (Ω, G, G, P) such that Y_n = X_n, then Y_t = X_t almost surely for all t ∈ {0, 1, ..., n}.

Proof. 1. The tower property combined with the martingale property implies

X_t = E(X_{t+1} | G_t) = E(E(X_{t+2} | G_{t+1}) | G_t) = E(X_{t+2} | G_t) = ... = E(X_{t+k} | G_t),

for k = s - t.

2. From 1. we have

X_t = E(X_n | G_t) = E(Y_n | G_t) = Y_t.

This concludes the proof.

Example 72 (AR(1) process) Let (ε_t)_{t=1,...,n} be an identically distributed, zero mean, adapted process on a filtered probability space (Ω, G, G, P), such that for any t the random variable ε_t is independent from the process history up to time t - 1, i.e.:

σ(ε_t) ⊥ σ(∪_{i=1}^{t-1} σ(ε_i)), t = 1, ..., n.   (11)

An Autoregressive Process of Order 1 (AR(1)) is defined by

X_t = 0 for t = 0, X_t = ρ X_{t-1} + ε_t for t > 0,

where ρ ∈ R. It is easily seen that (X_t)_{t=0,..,n} is G adapted. Furthermore, for any t = 1, ..., n,

E(X_t | G_{t-1}) = E(ρ X_{t-1} + ε_t | G_{t-1}) = ρ E(X_{t-1} | G_{t-1}) + E(ε_t | G_{t-1}) = ρ X_{t-1} + E(ε_t) = ρ X_{t-1},

using in the second equality the linearity of conditional expectations, in the third the G_{t-1} measurability of X_{t-1} and the independence assumption (11), and in the fourth the zero mean property of ε_t (E(ε_t) = 0). Therefore, an AR(1) process is a martingale if and only if ρ = 1. The process resulting for ρ = 1 is called a Random Walk process.

Example 73 (MA(1) process) Let (ε_t)_{t=0,..,n} be the same process as in Example 72. A Moving Average Process of Order 1 (MA(1)) is defined by

X_t = 0 for t = 0, X_t = ε_1 for t = 1, X_t = ε_t + ρ ε_{t-1} for t > 1,

where ρ ∈ R. It is easily seen that (X_t)_{t=0,..,n} is G adapted. Furthermore, for any t = 2, .., n we have, similarly to above,

E(X_t | G_{t-1}) = E(ε_t + ρ ε_{t-1} | G_{t-1}) = ρ E(ε_{t-1} | G_{t-1}) + E(ε_t | G_{t-1}) = ρ ε_{t-1} + E(ε_t) = ρ ε_{t-1}.

Therefore,

X_{t-1} = E(X_t | G_{t-1}) ⟺ ε_{t-1} + ρ ε_{t-2} = ρ ε_{t-1},

a condition that fails for a generic noise process: an MA(1) is in general not a martingale.
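A Monte Carlo sketch (our own construction) illustrating Example 72 through the partial averaging property: with A = {X_1 > 0} ∈ G_1, Definition 64 requires ∫_A X_2 dP = ∫_A ρ X_1 dP, and for ρ = 1 (the random walk) this is exactly the martingale condition (9).

```python
import random

random.seed(1)

def conditional_averages(rho, n_paths=200_000):
    """Compare the average of X_2 and of rho*X_1 on the event {X_1 > 0}."""
    sum_x1 = sum_x2 = count = 0.0
    for _ in range(n_paths):
        x1 = random.gauss(0.0, 1.0)               # X_1 = eps_1
        x2 = rho * x1 + random.gauss(0.0, 1.0)    # X_2 = rho X_1 + eps_2
        if x1 > 0:                                # A = {X_1 > 0} in G_1
            sum_x1 += x1; sum_x2 += x2; count += 1
    return sum_x2 / count, rho * sum_x1 / count

for rho in (1.0, 0.5):
    print(rho, conditional_averages(rho))   # the two averages agree for each rho
```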