Probability Basics
Probability and Random Variables
Probability

Probability of union: $P[A \cup B] = P[A] + P[B] - P[A \cap B]$

Conditional probability: $P[A \mid B] = \dfrac{P[A \cap B]}{P[B]}$

Bayes' theorem: since $P[A \cap B] = P[B \cap A]$,
$$P[B \mid A] = \frac{P[B \cap A]}{P[A]} = \frac{P[A \mid B]\,P[B]}{P[A]}$$

Joint probability: $P[A \cap B] = P[A \mid B]\,P[B] = P[B \mid A]\,P[A]$

Independence: $P[A \mid B] = P[A] \;\Rightarrow\; P[A \cap B] = P[A]\,P[B]$

Example: if A and B are independent with $P[A] = 0.5$ and $P[B] = 0.5$, then $P[A \cap B] = 0.25$.
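These rules can be checked numerically; a minimal Python sketch using the slide's example values, assuming two independent events with $P[A] = P[B] = 0.5$:

```python
# Numeric check of the probability rules, assuming independent events
# with P[A] = P[B] = 0.5 (the slide's example values).
p_a, p_b = 0.5, 0.5
p_a_and_b = p_a * p_b                 # independence: P[A n B] = P[A] P[B]
p_a_or_b = p_a + p_b - p_a_and_b      # union rule
p_a_given_b = p_a_and_b / p_b         # conditional probability
p_b_given_a = p_a_given_b * p_b / p_a # Bayes' theorem

print(p_a_and_b)    # 0.25
print(p_a_or_b)     # 0.75
print(p_a_given_b)  # 0.5 = P[A], as independence requires
print(p_b_given_a)  # 0.5
```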
Probability

Mutually exclusive events: $P[A \cup B] = P[A] + P[B] \;\Rightarrow\; P[A \cap B] = 0$

Mutually exclusive events are not independent!

Collectively exhaustive events: $P[A \cup B] = 1$, $P[A] + P[B] \ge 1$

Mutually exclusive and collectively exhaustive events: $P[A \cup B] = 1$ and $P[A] + P[B] = 1$, so $P[B] = P[\bar{A}]$
Discrete Random Variables

The value of a discrete random variable, X, is determined by the outcome (event) of a random experiment that generates discrete outcomes, e.g. a coin toss.

The sample space of a random experiment is the set of all possible outcomes (events), $\Omega$. For a coin toss, $\Omega = \{T, H\}$.

A rule for assigning values to X:
$$X = \begin{cases} 0 & \text{if } T, \\ 1 & \text{if } H. \end{cases}$$

The sample space is a countable set of events (finite or infinite).
Discrete Random Variables

Range / support: $x_i \in \{x_1, x_2, \ldots, x_n\}$

Probability mass function: $P[X = x_i] = f(x_i) = p_i$

Cumulative distribution function: $P[X \le x_i] = F(x_i) = \sum_{j=1}^{i} p_j$, with $0 \le F(x_i) \le 1$
Discrete Random Variables

Question
Consider a toss of a fair coin. Define the random variable, X:
$$X = \begin{cases} 0 & \text{if } T, \\ 1 & \text{if } H. \end{cases}$$
Write down the probability mass function and the (cumulative) distribution function that describe this probability model.

Answer
Probability mass function: $f(0) = 0.5$, $f(1) = 0.5$
Cumulative distribution function: $F(0) = 0.5$, $F(1) = 1$
Discrete Random Variables

Question
Consider tossing two fair coins. Define Y as the number of heads observed. Write down the sample space for the experiment. Determine the mass function and distribution function for Y. Calculate the probability of observing at least one head.

Answer
Sample space: $\{T,T\}, \{T,H\}, \{H,T\}, \{H,H\}$
Support: $y \in \{0, 1, 2\}$
Discrete Random Variables

Answer (continued)
Probability functions:

y    f(y)    F(y)
0    0.25    0.25
1    0.50    0.75
2    0.25    1.00
     1.00

Probability of at least one head:
$$P[Y > 0] = 1 - F(0) = 1 - 0.25 = 0.75$$
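One way to verify these numbers is to enumerate the sample space directly; a minimal Python sketch:

```python
from itertools import product
from collections import Counter

# Enumerate the sample space for two fair coin tosses and build the
# mass and distribution functions for Y = number of heads.
outcomes = list(product("TH", repeat=2))            # (T,T), (T,H), (H,T), (H,H)
counts = Counter(o.count("H") for o in outcomes)

f = {y: counts[y] / len(outcomes) for y in sorted(counts)}  # mass function
F, running = {}, 0.0                                        # distribution function
for y in sorted(f):
    running += f[y]
    F[y] = running

print(f)          # {0: 0.25, 1: 0.5, 2: 0.25}
print(F)          # {0: 0.25, 1: 0.75, 2: 1.0}
print(1 - F[0])   # P[Y > 0] = 0.75
```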
Continuous Random Variables

Range / support: $x \in [x_l, x_u]$, $x_l < x_u$

(Probability) density function, $f(x)$:
$$P[X = a] = \int_a^a f(x)\,dx = 0$$
$$P[a < X \le b] = \int_a^b f(x)\,dx, \qquad x_l < a < b < x_u, \qquad 0 < P[a < X \le b] < 1$$
$$P[x_l < X \le x_u] = \int_{x_l}^{x_u} f(x)\,dx = 1$$
$$P[a < X < b] = P[a \le X \le b]$$

[Figure: a density curve $f$ with the shaded area between a and b representing $P(a \le X \le b)$]
Continuous Random Variables

(Cumulative) distribution function, $F(x)$:
$$F(x) = \int_{-\infty}^{x} f(u)\,du, \qquad f(x) = \frac{dF(x)}{dx}$$

[Figure: the probability density function $f(x)$ and the cumulative distribution function $F(x)$, which rises from 0 to 1]

$$P[a < X \le b] = \int_a^b f(x)\,dx = F(b) - F(a)$$
Continuous Random Variables

Question
A continuous random variable, X, is uniformly distributed over the interval $x \in [0, 1]$. Determine the probability $P[0.1 < X \le 0.6]$.

Answer
Density function:
$$f(x) = \begin{cases} 1 & 0 \le x \le 1, \\ 0 & \text{otherwise.} \end{cases}$$

[Figure: a uniform density on the [0, 1] interval, $f(x) = 1$ for $x \in [0, 1]$, with the region from 0.1 to 0.6 marked]
Continuous Random Variables

Answer (continued)
Distribution function:
$$F(x) = \int_0^x f(u)\,du = x, \qquad x \in [0, 1]$$

[Figure: the distribution function rising linearly from (0, 0) to (1, 1)]

Probability:
$$P[0.1 < X \le 0.6] = F(0.6) - F(0.1) = 0.6 - 0.1 = 0.5$$
Continuous Random Variables

Question
Consider the following distribution function: $F(x) = x^2$, $x \in [0, 1]$. Find $x_0$ such that $P[X \le x_0] = \frac{1}{4}$.

Answer
$$P[X \le x_0] = \tfrac{1}{4} \;\Rightarrow\; F(x_0) = \tfrac{1}{4} \;\Rightarrow\; x_0^2 = \tfrac{1}{4} \;\Rightarrow\; x_0 = \sqrt{\tfrac{1}{4}} = \tfrac{1}{2}$$
Continuous Random Variables

Answer (continued)
Inverse distribution function: $F^{-1}(F(x)) = x$

If $F(x) = x^2$, then $x = \sqrt{F(x)}$, so $F^{-1}(F(x)) = \sqrt{F(x)}$ and
$$F^{-1}\!\left(\tfrac{1}{4}\right) = \sqrt{\tfrac{1}{4}} = \tfrac{1}{2}$$

For any probability, p:
$$p = F(x) \;\Leftrightarrow\; F^{-1}(p) = x$$
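The inverse distribution function also underpins inverse-transform sampling: applying $F^{-1}$ to a Uniform(0, 1) draw yields a draw from $F$. A minimal Python sketch for $F(x) = x^2$, where $F^{-1}(p) = \sqrt{p}$ (the function name is illustrative):

```python
import random

# Inverse-transform sampling for F(x) = x^2 on [0, 1]: F^{-1}(p) = sqrt(p).
def inverse_cdf(p: float) -> float:
    return p ** 0.5

print(inverse_cdf(0.25))   # 0.5, matching x0 = 1/2 above

random.seed(42)
draws = [inverse_cdf(random.random()) for _ in range(100_000)]
# Empirical check: P[X <= 0.5] should be F(0.5) = 0.25.
print(sum(d <= 0.5 for d in draws) / len(draws))   # ~0.25
```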
Joint Probability Functions

Discrete random variables, X and Y: $x_i \in \{x_1, \ldots, x_n\}$, $y_j \in \{y_1, \ldots, y_m\}$

Joint mass function: $P[X = x_i \cap Y = y_j] = f(x_i, y_j)$

Joint distribution function:
$$P[X \le x_k \cap Y \le y_l] = \sum_{i=1}^{k} \sum_{j=1}^{l} f(x_i, y_j) = F(x_k, y_l)$$
Joint Probability Functions

Continuous random variables, X and Y: $x \in [x_l, x_u]$, $y \in [y_l, y_u]$

Joint density function: $f(x, y)$

Joint distribution function:
$$P[X \le x_0 \cap Y \le y_0] = \int_{x_l}^{x_0} \int_{y_l}^{y_0} f(x, y)\,dy\,dx = F(x_0, y_0)$$
Moments
Expectation

Discrete random variables:
$$E[X] = \sum_{i=1}^{n} x_i f(x_i) = \mu, \qquad x_i \in \{x_1, x_2, \ldots, x_n\}, \qquad \sum_{i=1}^{n} f(x_i) = 1$$

Continuous random variables:
$$E[X] = \int_{x_l}^{x_u} x f(x)\,dx = \mu, \qquad x \in [x_l, x_u], \qquad \int_{x_l}^{x_u} f(x)\,dx = 1$$
Expectations of functions of random variables

Expectation of a linear function of a random variable, X:
$$Y = a + bX \;\Rightarrow\; E[Y] = a + bE[X] \qquad (a, b \text{ constants})$$

Expectation of a non-linear function of a random variable, X (Jensen's inequality):

Convex function: $Y = X^2 \;\Rightarrow\; E[Y] > E[X]^2$, so for $b > 0$,
$$E[a + bX^2] = a + bE[X^2] > a + bE[X]^2$$

Concave function: $Y = \sqrt{X} \;\Rightarrow\; E[Y] < \sqrt{E[X]}$
Moments

Moments can be defined about any parameter (constant), c. The n-th moment about c is $E[(X - c)^n]$.

The n-th ordinary moment of a random variable is a moment about zero (c = 0): $E[X^n]$

The first ordinary moment (n = 1) is the mean of a random variable, $E[X]$. It is a measure of location or central tendency.

Moments defined about the mean of a random variable are central moments:
$$E\!\left[\left(X - E[X]\right)^n\right] = \mu_n$$
Moments

The first (n = 1) central moment of a random variable is zero:
$$\mu_1 = E\!\left[X - E[X]\right] = E[X] - E[X] = 0$$

The second central moment (n = 2) of a random variable is the variance. By Jensen's inequality we know that the variance is always non-negative (and strictly positive unless X is constant). Variance is a measure of dispersion.
$$\mu_2 = E\!\left[\left(X - E[X]\right)^2\right] = E\!\left[X^2 - 2XE[X] + E[X]^2\right] = E[X^2] - E[X]^2 = V[X] = \sigma^2$$
Variance

Variance of a linear function of a random variable:
$$V[a + bX] = b^2 V[X]$$

Variance of a non-linear function of a random variable, $Y = X^2$:

x    f(x)         y    f(y)
0    0.25         0    0.25
1    0.50         1    0.50
2    0.25         4    0.25

$E[X] = 1$, $V[X] = \tfrac{1}{2}$; $E[Y] = \tfrac{3}{2}$, $V[Y] = \tfrac{9}{4}$

$$V[Y] = E[Y^2] - E[Y]^2 = E[X^4] - E[X^2]^2 = V[X^2]$$
$$V[X^2] \ne V[X]^2, \qquad V[a + bX^2] = b^2 V[X^2]$$
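A quick numerical check of this example with NumPy (array names are illustrative):

```python
import numpy as np

# Recompute the slide's example: X takes values 0, 1, 2 with probabilities
# 0.25, 0.50, 0.25, and Y = X^2.
x = np.array([0.0, 1.0, 2.0])
f = np.array([0.25, 0.50, 0.25])

e_x = np.sum(x * f)                # E[X] = 1
v_x = np.sum((x - e_x) ** 2 * f)   # V[X] = 0.5

y = x ** 2
e_y = np.sum(y * f)                # E[Y] = E[X^2] = 1.5
v_y = np.sum((y - e_y) ** 2 * f)   # V[Y] = E[X^4] - E[X^2]^2 = 2.25

print(e_x, v_x)   # 1.0 0.5
print(e_y, v_y)   # 1.5 2.25 -> V[X^2] != V[X]^2 = 0.25
```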
Standardized Moments

The first standardized central moment of a random variable is zero:
$$\frac{\mu_1}{\sigma} = \frac{E\!\left[X - E[X]\right]}{\sqrt{\sigma^2}} = 0$$

The second standardized central moment of a random variable is one:
$$\frac{\mu_2}{\sigma^2} = \frac{E\!\left[\left(X - E[X]\right)^2\right]}{\sigma^2} = 1$$
Skewness

The third central moment of a random variable is a measure of the asymmetry of the density/mass function of the random variable:
$$\mu_3 = E\!\left[\left(X - E[X]\right)^3\right] = E[X^3] - 3E[X]V[X] - E[X]^3$$

Symmetry is implied by $E\!\left[\left(X - E[X]\right)^3\right] = 0$.

Skewness is defined as the third standardized central moment, where $\sigma = \sqrt{V[X]}$ is the standard deviation:
$$S[X] = \frac{\mu_3}{\sigma^3} = \frac{E\!\left[\left(X - E[X]\right)^3\right]}{\sigma^3}$$

[Figure: probability density functions illustrating zero, positive, and negative skewness]
Kurtosis

The fourth central moment of a random variable is a measure of the dispersion of the distribution of the random variable:
$$\mu_4 = E\!\left[\left(X - E[X]\right)^4\right]$$

Compared to the second central moment (variance), the fourth central moment places more weight on extreme values of the random variable.

Kurtosis is defined as the fourth standardized central moment of the random variable:
$$K[X] = \frac{\mu_4}{\sigma^4} = \frac{E\!\left[\left(X - E[X]\right)^4\right]}{V[X]^2}$$
Skewness and Kurtosis

Question
Determine the skewness and kurtosis for the random variable, X:

x    f(x)
0    0.50
1    0.50

Answer
$$E[X] = 0(0.5) + 1(0.5) = 0.5$$
$$V[X] = (0 - 0.5)^2(0.5) + (1 - 0.5)^2(0.5) = 0.25$$
$$S[X] = \frac{(0 - 0.5)^3(0.5) + (1 - 0.5)^3(0.5)}{0.5^3} = 0$$
$$K[X] = \frac{(0 - 0.5)^4(0.5) + (1 - 0.5)^4(0.5)}{0.5^4} = 1$$

Note: Kurtosis = 1 is the minimum kurtosis for any random variable!
Kurtosis

Question
Compare the kurtosis for the following random variables:

x    f(x)        y    f(y)         z          f(z)
0    0.25        0    0.125        -0.4142    0.125
1    0.50        1    0.750         1.0000    0.750
2    0.25        2    0.125         2.4142    0.125
Kurtosis

Answer
$$V[X] = (0 - 1)^2(0.25) + (2 - 1)^2(0.25) = 0.5, \qquad K[X] = \frac{(0 - 1)^4(0.25) + (2 - 1)^4(0.25)}{0.5^2} = 2$$
$$V[Y] = (0 - 1)^2(0.125) + (2 - 1)^2(0.125) = 0.25, \qquad K[Y] = \frac{(0 - 1)^4(0.125) + (2 - 1)^4(0.125)}{0.25^2} = 4$$
$$V[Z] = (-0.4142 - 1)^2(0.125) + (2.4142 - 1)^2(0.125) \approx 0.5, \qquad K[Z] = \frac{(-0.4142 - 1)^4(0.125) + (2.4142 - 1)^4(0.125)}{0.5^2} \approx 4$$
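All three calculations can be reproduced with a small helper; a minimal NumPy sketch (the `moments` helper is for illustration, not a library routine):

```python
import numpy as np

def moments(values, probs):
    """Return the variance and kurtosis of a discrete random variable."""
    values, probs = np.asarray(values), np.asarray(probs)
    mu = np.sum(values * probs)
    var = np.sum((values - mu) ** 2 * probs)
    kurt = np.sum((values - mu) ** 4 * probs) / var ** 2
    return var, kurt

print(moments([0, 1, 2], [0.25, 0.50, 0.25]))                  # (0.5, 2.0)
print(moments([0, 1, 2], [0.125, 0.750, 0.125]))               # (0.25, 4.0)
print(moments([-0.4142, 1.0, 2.4142], [0.125, 0.750, 0.125]))  # (~0.5, ~4.0)
```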
Kurtosis

Kurtosis is best thought of as a measure of the heaviness of the tails of the density (mass) function of a continuous (discrete) random variable, and of the peakedness of the density (mass) function.

The effects of kurtosis on tails and peaks across two or more distributions should only be compared if the variances are equal.

Excess kurtosis = kurtosis - 3. Excess kurtosis for a normal random variable = 0.

[Figure: excess kurtosis for a selection of distributions with mean = 0 and variance = 1]

Source: http://en.wikipedia.org/wiki/Kurtosis
Kurtosis

[Figure: normal density (mean = 0, variance = 5/3, kurtosis = 3) versus Student-t density (mean = 0, degrees of freedom = 5, variance = 5/3, kurtosis = 9)]

[Figure: normal density (mean = 0, variance = 1, kurtosis = 3) versus Student-t density (mean = 0, degrees of freedom = 5, variance (scaled) = 1, kurtosis = 9)]
Combinations of random variables

Combinations: functions (sums, products, etc.) of random variables

Linear combinations: $Z = X + Y$, $V = aX + bY$, $W = c + dX - eY$

Moments: $E[Z]$, $V[Z]$, $S[Z]$, $K[Z]$

Joint probability (mass/density), $f(x, y)$:

f(x,y)    y = 0   y = 1   y = 2   f(x)
x = 0     0.30    0.05    0.05    0.40
x = 1     0.10    0.20    0.05    0.35
x = 2     0.05    0.05    0.15    0.25
f(y)      0.45    0.30    0.25    1.00
Combinations of random variables

Expectation of linear combinations, $Z = X + Y$, using the joint table from the previous slide.

Values of z = x + y:

z = x + y   y = 0   y = 1   y = 2
x = 0       0       1       2
x = 1       1       2       3
x = 2       2       3       4

Each value weighted by its probability, $z\,f(x, y)$:

z f(x,y)    y = 0   y = 1   y = 2
x = 0       0.00    0.05    0.10
x = 1       0.10    0.40    0.15
x = 2       0.10    0.15    0.60

Summing gives $E[Z] = 1.65$.

From the marginals, $E[X] = 0(0.40) + 1(0.35) + 2(0.25) = 0.85$ and $E[Y] = 0(0.45) + 1(0.30) + 2(0.25) = 0.80$, so $E[Z] = E[X] + E[Y] = 1.65$.

$$E[X + Y] = E[X] + E[Y], \qquad E[X - Y] = E[X] - E[Y], \qquad E[aX + bY - cZ] = aE[X] + bE[Y] - cE[Z]$$
Combinations of random variables

Variance of linear combinations: $V[Z] = E\!\left[(Z - E[Z])^2\right]$

Squared deviations $(z - E[Z])^2$, with $E[Z] = 1.65$:

(z - E[Z])^2    y = 0    y = 1    y = 2
x = 0           2.7225   0.4225   0.1225
x = 1           0.4225   0.1225   1.8225
x = 2           0.1225   1.8225   5.5225

Weighting by $f(x, y)$ gives $V[Z] = 1.9275$.

Equivalently, using $Z^2 = (X + Y)^2$:

z^2    y = 0   y = 1   y = 2
x = 0  0       1       4
x = 1  1       4       9
x = 2  4       9       16

$$V[Z] = E[Z^2] - E[Z]^2 = 4.65 - 2.7225 = 1.9275$$
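Both routes to $E[Z]$ and $V[Z]$ can be verified directly from the joint table; a minimal NumPy sketch:

```python
import numpy as np

# Moments of Z = X + Y from the joint mass function f(x, y) on the slides.
f = np.array([[0.30, 0.05, 0.05],
              [0.10, 0.20, 0.05],
              [0.05, 0.05, 0.15]])   # rows: x = 0,1,2; cols: y = 0,1,2
x = np.arange(3).reshape(3, 1)
y = np.arange(3).reshape(1, 3)
z = x + y

e_z = np.sum(z * f)                  # E[Z] = 1.65
v_z = np.sum((z - e_z) ** 2 * f)     # V[Z] = 1.9275
print(e_z, v_z)

# Marginals confirm E[Z] = E[X] + E[Y] = 0.85 + 0.80.
print(np.sum(x * f.sum(axis=1, keepdims=True)),
      np.sum(y * f.sum(axis=0, keepdims=True)))
```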
Combinations of random variables

Variance of linear combinations:
$$V[X + Y] = E\!\left[(X + Y - E[X + Y])^2\right] = E[(X + Y)^2] - E[X + Y]^2$$
$$= E[X^2 + Y^2 + 2XY] - \left(E[X]^2 + E[Y]^2 + 2E[X]E[Y]\right)$$
$$= V[X] + V[Y] + 2\left(E[XY] - E[X]E[Y]\right)$$

Covariance:
$$COV[X, Y] = E[XY] - E[X]E[Y]$$
Combinations of random variables

Covariance is a second-order cross central moment:
$$COV[X, Y] = E\!\left[\left(X - E[X]\right)\left(Y - E[Y]\right)\right] = E[XY] - E[X]E[Y] = \sigma_{X,Y}$$

[Figure: scatter plots of (x, y) around the means $(m_X, m_Y)$ illustrating $COV[X, Y] > 0$, $COV[X, Y] < 0$, and $COV[X, Y] = 0$]

$$COV[X, X] = V[X]$$
Combinations of random variables

Variance of linear combinations:
$$V[X + Y] = E[(X + Y)^2] - E[X + Y]^2 = V[X] + V[Y] + 2\,COV[X, Y]$$
$$V[aX + bY] = a^2 V[X] + b^2 V[Y] + 2ab\,COV[X, Y]$$
$$V[aX - bY] = a^2 V[X] + b^2 V[Y] - 2ab\,COV[X, Y]$$

Correlation:
$$C[X, Y] = \frac{COV[X, Y]}{\sqrt{V[X]\,V[Y]}} = \frac{\sigma_{X,Y}}{\sigma_X \sigma_Y} = \rho_{X,Y}, \qquad -1 \le \rho_{X,Y} \le 1$$

Expectation of products:
$$E[XY] = E[X]E[Y] + COV[X, Y]$$

Independence:
$$E[XY] = E[X]E[Y] \;\Rightarrow\; COV[X, Y] = 0$$
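A short NumPy sketch computing $COV[X, Y]$ and $\rho_{X,Y}$ from the same joint table, and confirming the variance identity:

```python
import numpy as np

# Covariance and correlation of X and Y from the joint table, plus a
# check that V[X + Y] = V[X] + V[Y] + 2 COV[X, Y].
f = np.array([[0.30, 0.05, 0.05],
              [0.10, 0.20, 0.05],
              [0.05, 0.05, 0.15]])
x = np.arange(3).reshape(3, 1)
y = np.arange(3).reshape(1, 3)

e_x, e_y = np.sum(x * f), np.sum(y * f)
e_xy = np.sum(x * y * f)
cov = e_xy - e_x * e_y               # 0.32 > 0: positive dependence

v_x = np.sum((x - e_x) ** 2 * f)
v_y = np.sum((y - e_y) ** 2 * f)
rho = cov / np.sqrt(v_x * v_y)

print(cov, rho)              # 0.32, ~0.497
print(v_x + v_y + 2 * cov)   # 1.9275 = V[Z] from the previous slide
```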
Probability Models
Bernoulli Distribution

Range (support): $x \in \{0, 1\}$

Mass function:
$$f(x) = \begin{cases} 1 - p & \text{if } x = 0 \\ p & \text{if } x = 1 \end{cases} \qquad \text{or} \qquad f(x) = p^x (1 - p)^{1 - x}, \qquad 0 < p < 1$$

Moments:
$$E[X] = p, \qquad V[X] = p(1 - p), \qquad S[X] = \frac{1 - 2p}{\sqrt{p(1 - p)}}, \qquad K[X] = \frac{1}{p(1 - p)} - 3$$

K[X] = 1 if p = 0.5: the minimum kurtosis!
Binomial Distribution

Sum of n independent and identically distributed (same p) Bernoulli random variables:
$$Y = \sum_{i=1}^{n} X_i$$

Range (support): $y \in \{0, 1, 2, \ldots, n\}$

Mass function:
$$f(y) = \binom{n}{y} p^y (1 - p)^{n - y}, \qquad 0 < p < 1, \qquad \binom{n}{y} = \frac{n!}{y!\,(n - y)!}, \qquad n! = n(n-1)(n-2)\cdots 1$$

Moments:
$$E[Y] = np, \qquad V[Y] = np(1 - p), \qquad S[Y] = \frac{1 - 2p}{\sqrt{np(1 - p)}}, \qquad K[Y] = \frac{1}{np(1 - p)} - \frac{6}{n} + 3$$

Source: http://en.wikipedia.org/wiki/Binomial_distribution
Binomial Distribution

Question
A multiple choice exam has 10 questions with 5 choices per question. If you need at least 3 correct answers to pass the exam, what is the probability that you will pass simply by guessing?

Answer
The number of correct answers is a binomial random variable with n = 10 and p = 0.2.
$$P[Y \ge 3] = 1 - P[Y \le 2] = 1 - \binom{10}{2} 0.2^2 (0.8)^8 - \binom{10}{1} 0.2\,(0.8)^9 - 0.8^{10}$$
$$= 1 - 0.8^8 \left[45(0.04) + 10(0.16) + 0.64\right] = 1 - 0.8^8 (4.04) \approx 0.3222$$
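Assuming SciPy is available, the answer can be confirmed in one line:

```python
from scipy.stats import binom

# Exam question: n = 10 questions, p = 0.2 per guess, pass with >= 3 correct.
print(1 - binom.cdf(2, n=10, p=0.2))   # ~0.3222
```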
Poisson Distribution

Range (support): $x \in \{0, 1, 2, \ldots\}$

Mass function:
$$f(x) = \frac{(\lambda t)^x}{x!}\, e^{-\lambda t}$$
where $\lambda$ is the rate parameter and t is a scale (time) parameter.

Moments:
$$E[X] = V[X] = \lambda t, \qquad S[X] = \frac{1}{\sqrt{\lambda t}}, \qquad K[X] = \frac{1}{\lambda t} + 3$$

Source: http://en.wikipedia.org/wiki/Poisson_distribution
Poisson Distribution

Question
The number of defaults per month in a large bond portfolio follows a Poisson process. On average, there are two defaults per month. The number of defaults is independent from one month to the next. Calculate the probability that the number of defaults will not exceed 3 over the next two months.

Answer
$\lambda = 2$, $t = 2$, so $\lambda t = 4$:
$$P[X \le 3] = P[X = 0] + P[X = 1] + P[X = 2] + P[X = 3]$$
$$= \frac{4^0}{0!} e^{-4} + \frac{4^1}{1!} e^{-4} + \frac{4^2}{2!} e^{-4} + \frac{4^3}{3!} e^{-4}$$
$$= e^{-4} \left(1 + 4 + 8 + \frac{32}{3}\right) = e^{-4} \cdot \frac{71}{3} \approx 0.4335$$
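Again assuming SciPy, a one-line confirmation:

```python
from scipy.stats import poisson

# Default question: rate 2 per month over t = 2 months, Poisson mean = 4.
print(poisson.cdf(3, mu=4))   # ~0.4335
```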
Poisson Approximation to Binomial

Binomial probability:
$$P[Y = y] = \binom{n}{y} p^y (1 - p)^{n - y}$$

Poisson probability:
$$P[Y = y] = e^{-\lambda} \frac{\lambda^y}{y!}$$

$$\lim_{\substack{n \to \infty \\ p \to 0}} \binom{n}{y} p^y (1 - p)^{n - y} = e^{-\lambda} \frac{\lambda^y}{y!}, \qquad np = \lambda \ \text{(constant)}$$

For large n and small p:
$$\binom{n}{y} p^y (1 - p)^{n - y} \approx e^{-\lambda} \frac{\lambda^y}{y!}, \qquad np = \lambda$$
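A small numerical illustration of the approximation, with hypothetical values n = 1000 and p = 0.004 chosen so that np = 4:

```python
from scipy.stats import binom, poisson

# Poisson approximation to the binomial: large n, small p, lambda = np fixed.
n, p = 1000, 0.004   # hypothetical values, np = 4
lam = n * p
for y in range(5):
    print(y, binom.pmf(y, n, p), poisson.pmf(y, lam))
# The two columns agree closely (to roughly three decimal places here).
```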
Exponential Distribution

Range (support): $t \in [0, \infty)$

Density function: $f(t) = \lambda e^{-\lambda t}$

Distribution function: $F(t) = 1 - e^{-\lambda t}$

Moments:
$$E[T] = 1/\lambda, \qquad V[T] = 1/\lambda^2, \qquad S[T] = 2, \qquad K[T] = 9$$

Source: http://en.wikipedia.org/wiki/Exponential_distribution
Poisson and Exponential Distributions

Question
The number of defaults per month in a large bond portfolio follows a Poisson process. On average, there are two defaults per month. The number of defaults is independent from one month to the next. Calculate the probability that there will be at least one default over the next month.

Answer
Poisson, with $\lambda = 2$ and $t = 1$:
$$P[X \ge 1] = 1 - P[X = 0] = 1 - \frac{2^0}{0!} e^{-2} = 1 - e^{-2} \approx 0.8647$$

Exponential, with $\lambda = 2$:
$$P[T < 1] = 1 - e^{-2(1)} \approx 0.8647$$
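Both routes give the same number; a minimal SciPy sketch (note that `expon` is parameterized by scale = 1/λ):

```python
from scipy.stats import expon, poisson

# P[at least one default within a month], rate lambda = 2: either count
# defaults over t = 1 (Poisson) or wait for the first default (exponential).
lam = 2.0
print(1 - poisson.pmf(0, mu=lam * 1.0))   # 1 - e^{-2} ~ 0.8647
print(expon.cdf(1.0, scale=1.0 / lam))    # P[T < 1] ~ 0.8647
```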
Uniform Distribution

Range (support): $x \in [a, b]$, $-\infty < a < b < \infty$

Density function: $f(x) = \dfrac{1}{b - a}$

Distribution function: $F(x) = \dfrac{x - a}{b - a}$

Moments:
$$E[X] = \frac{a + b}{2}, \qquad V[X] = \frac{(b - a)^2}{12}, \qquad S[X] = 0, \qquad K[X] = \frac{9}{5}$$

Source: http://en.wikipedia.org/wiki/Uniform_distribution_(continuous)
Standard Normal (Gaussian) Distribution

Range (support): $z \in (-\infty, \infty)$

Density function:
$$\phi(z) = \frac{1}{\sqrt{2\pi}}\, e^{-z^2/2}$$

Distribution function:
$$\Phi(z) = \int_{-\infty}^{z} \frac{1}{\sqrt{2\pi}}\, e^{-u^2/2}\,du$$

About 68% of the distribution is between -1 and +1; about 95% is between -2 and +2.

[Figure: realizations of the standard normal random variable, from -4 to +4]

Moments:
$$E[Z] = 0, \qquad V[Z] = 1, \qquad S[Z] = 0, \qquad K[Z] = 3$$

Source: Jorion (2011), p. 44
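Assuming SciPy, the exact coverage figures behind the rounded 68%/95% rule:

```python
from scipy.stats import norm

# Coverage of the standard normal: the 68% and 95% figures are rounded.
print(norm.cdf(1) - norm.cdf(-1))   # ~0.6827
print(norm.cdf(2) - norm.cdf(-2))   # ~0.9545
```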
Normal (Gaussian) Distribution

$$X = \mu + \sigma Z, \qquad Z \sim N(0, 1), \qquad Z = \frac{X - \mu}{\sigma}$$

Range (support): $x \in (-\infty, \infty)$

Density function:
$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x - \mu}{\sigma}\right)^2}$$

Moments:
$$E[X] = \mu, \qquad V[X] = \sigma^2, \qquad S[X] = 0, \qquad K[X] = 3$$

The sum of independent normal random variables is a normal random variable.

Source: http://en.wikipedia.org/wiki/Normal_distribution
Log-Normal Distribution

$$X \sim N(\mu, \sigma), \qquad Y = e^X, \qquad \ln Y \sim N(\mu, \sigma)$$

Range (support): $y \in (0, \infty)$

[Figure: the transformation Y = exp(X)]

Density function:
$$f(y) = \frac{1}{y\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{\ln y - \mu}{\sigma}\right)^2}$$

Moments:
$$E[Y] = e^{\mu + \sigma^2/2}, \qquad V[Y] = \left(e^{\sigma^2} - 1\right) e^{2\mu + \sigma^2}$$
$$S[Y] = \left(e^{\sigma^2} + 2\right)\sqrt{e^{\sigma^2} - 1}, \qquad K[Y] = e^{4\sigma^2} + 2e^{3\sigma^2} + 3e^{2\sigma^2} - 3$$

Source: http://en.wikipedia.org/wiki/Log-normal_distribution
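A simulation sketch of the mean formula, with hypothetical parameters µ = 0 and σ = 0.5:

```python
import numpy as np

# Simulation check of E[Y] = exp(mu + sigma^2 / 2) for Y = e^X.
rng = np.random.default_rng(0)
mu, sigma = 0.0, 0.5   # hypothetical parameters
y = np.exp(rng.normal(mu, sigma, size=1_000_000))
print(y.mean())                     # ~1.133
print(np.exp(mu + sigma**2 / 2))    # 1.1331...
```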
Chi-Squared Distribution

$$X = \sum_{i=1}^{k} Z_i^2, \qquad Z_i \sim N(0, 1) \text{ iid}, \qquad X \sim \chi^2(k)$$

Range (support): $x \in [0, \infty)$

Density function:
$$f(x) = \frac{x^{k/2 - 1}\, e^{-x/2}}{2^{k/2}\,\Gamma\!\left(\frac{k}{2}\right)}, \qquad \Gamma(a) = \int_0^\infty e^{-y} y^{a - 1}\,dy, \qquad \Gamma(a) = (a - 1)! \ \text{(for integral values of a)}$$

Moments:
$$E[X] = k, \qquad V[X] = 2k, \qquad S[X] = \sqrt{8/k}, \qquad K[X] = 12/k + 3$$

Source: http://en.wikipedia.org/wiki/Chi-squared_distribution
Student's t Distribution

$$X = \frac{Z}{\sqrt{U/k}}, \qquad Z \sim N(0, 1), \qquad U \sim \chi^2(k)$$

Range (support): $x \in (-\infty, \infty)$

Density function:
$$f(x) = \frac{\Gamma\!\left(\frac{k+1}{2}\right)}{\sqrt{k\pi}\,\Gamma\!\left(\frac{k}{2}\right)} \left(1 + \frac{x^2}{k}\right)^{-\frac{k+1}{2}}, \qquad \Gamma(a) = \int_0^\infty e^{-y} y^{a - 1}\,dy$$

Moments:
$$E[X] = 0 \ (k > 1), \qquad V[X] = \frac{k}{k - 2} \ (k > 2), \qquad S[X] = 0 \ (k > 3), \qquad K[X] = \frac{6}{k - 4} + 3 \ (k > 4)$$

Source: http://en.wikipedia.org/wiki/Student's_t-distribution
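A simulation sketch of this construction, checking that the sample variance approaches $k/(k - 2)$ for k = 5:

```python
import numpy as np

# Build t-distributed draws as X = Z / sqrt(U / k); the sample variance
# should approach k / (k - 2) = 5/3 for k = 5.
rng = np.random.default_rng(1)
k, n = 5, 1_000_000
z = rng.standard_normal(n)
u = rng.chisquare(k, n)
x = z / np.sqrt(u / k)
print(x.var())   # ~1.667 = k / (k - 2)
```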