Week 1 Quantitative Analysis of Financial Markets Basic Statistics A

Christopher Ting
http://www.mysmu.edu/faculty/christophert/
Email: christopherting@smu.edu.sg, Phone: 6828 0364, Office: LKCSB 5036
QF 603, October 14, 2017

Table of Contents
1. Introduction
2. Central Tendency
3. Dispersion
4. Portfolio Variance and Hedging
5. Takeaways

Introduction
Statistics as a discipline refers to the methods we use to analyze data. Statistical methods fall into one of two categories: descriptive statistics or inferential statistics.
Descriptive statistics summarize the important characteristics of large data sets. The goal is to consolidate numerical data into useful information.
Inferential statistics pertain to the procedures used to make forecasts, estimates, or judgments about a large set of data on the basis of the statistical characteristics of a smaller set (a sample).
In risk management, we often need to describe the relationship between two random variables. For example, is there a relationship between the returns of an equity and the returns of a market index?

Learning Outcomes of QA02
Chapter 3. Michael Miller, Mathematics and Statistics for Financial Risk Management, 2nd Edition (Hoboken, NJ: John Wiley & Sons, 2013).
Interpret and apply the mean, standard deviation, and variance of a random variable.
Calculate the mean, standard deviation, and variance of a discrete random variable.
Calculate and interpret the covariance and correlation between two random variables.
Calculate the mean and variance of sums of variables.

Population vs. Sample
A population is defined as the set of all possible members of a stated group.
Examples
1. Cross Section: All stocks listed on the Nasdaq
2. Time Series: Dow Jones Industrial Average Index
It is frequently too costly or time consuming to obtain measurements for every member of a population, if it is even possible.
A sample is a subset randomly drawn from the population.

Central Tendency: Mean
Population mean of N entities:
$$E(X) := \mu = \frac{1}{N}\sum_{i=1}^{N} X_i, \quad i = 1, 2, \ldots, N.$$
The sample mean is an estimate of the true mean $\mu$:
$$\bar{X}_n = \frac{1}{n}\sum_{j} X_j, \quad j = i_1, i_2, \ldots, i_n.$$
The law of large numbers for a random variable $X$ states that, for a sample of independently realized values $x_1, x_2, \ldots, x_n$,
$$\lim_{n\to\infty} \frac{1}{n}\sum_{i=1}^{n} x_i = \mu.$$
What is the point of calculating the sample mean?
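The law of large numbers can be checked numerically. The following is a minimal sketch, not part of the original slides: it assumes X is uniform on [0, 1] (so the true mean is 0.5), and the seed and the helper name `sample_mean` are illustrative choices.

```python
import random


def sample_mean(values):
    """Sample mean: (1/n) times the sum of the realized values."""
    return sum(values) / len(values)


# Assumption for illustration only: X ~ Uniform(0, 1), true mean 0.5.
rng = random.Random(603)  # fixed seed so the run is repeatable
draws = [rng.random() for _ in range(100_000)]

# The sample mean of the first n draws should settle near 0.5 as n grows.
for n in (10, 1_000, 100_000):
    print(n, sample_mean(draws[:n]))
```

With only 10 draws the sample mean can sit noticeably away from 0.5; with 100,000 draws it should be very close.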

Estimator
The (functional) form of the sample mean $\bar{X}_n$ in Slide 6 is an estimator.
You can also define an alternative sample average as
$$\overbrace{X} := \frac{1}{n+1}\sum_{i=1}^{n} x_i.$$
Which is better, $\bar{X}_n$ or $\overbrace{X}$?

Unbiasedness and Consistency
Unbiasedness: An estimator $\hat{\theta}_n$ of a population statistic $\theta$ is said to be unbiased when $E(\hat{\theta}_n) = \theta$.
Consistency: An estimator $\hat{\theta}_n$ of a population statistic $\theta$ is said to be consistent when $\hat{\theta}_n \to \theta$ in probability as $n \to \infty$. Namely, for any $\varepsilon > 0$,
$$\lim_{n\to\infty} P\left(|\hat{\theta}_n - \theta| > \varepsilon\right) = 0.$$
Exercise: Show that the estimator $\bar{X}_n$ for the mean $\mu$ is unbiased but the estimator $\overbrace{X} = \frac{1}{n+1}\sum_{i=1}^{n} x_i$ is biased.
Exercise: Show that the estimators $\bar{X}_n$ and $\overbrace{X}$ are consistent.
Is $\hat{\theta}_7$ unbiased? Consistent?
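A small simulation can make the bias of the alternative estimator (divisor n + 1) visible. This is only a sketch under assumptions that are not in the slides: Gaussian data with mean 10 and standard deviation 2, samples of size 5, and the hypothetical helper names `mean_estimator`, `alt_estimator`, and `average_estimate`.

```python
import random


def mean_estimator(xs):
    """Usual sample mean: divide the sum by n."""
    return sum(xs) / len(xs)


def alt_estimator(xs):
    """Alternative estimator: divide the sum by n + 1."""
    return sum(xs) / (len(xs) + 1)


def average_estimate(estimator, n, trials, mu=10.0, sigma=2.0, seed=1):
    """Average the estimator over many independent samples of size n."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        total += estimator(sample)
    return total / trials


# Assumed true mean is 10. The first average should be near 10,
# the second near 10 * n / (n + 1) = 8.33 for n = 5.
avg_mean = average_estimate(mean_estimator, n=5, trials=20_000)
avg_alt = average_estimate(alt_estimator, n=5, trials=20_000)
print(avg_mean, avg_alt)
```

The factor n/(n + 1) tends to 1 as n grows, which is why an estimator can be biased at every finite n and still be consistent.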

Independently and Identically Distributed (i.i.d.)
The concept of independence is a strong condition. Two random variables are independent when they are not related in any way.
Identical means that the two random variables are the same from the statistical standpoint. They follow the same probability distribution, hence the same descriptive statistics.
$X$ is a random variable, but it has many copies. With the subscript $i$, $X_i$ is a copy of $X$ when it will realize a value for the $i$-th time.
Question: Is the sample mean $\bar{X}_n$ a random variable?

Expected Value: Discrete and Continuous
For a discrete random variable with $n$ possible outcomes, suppose the probabilities are
$$P(X = x_i) = p_i, \quad i = 1, 2, \ldots, n.$$
The mean is
$$\mu = E(X) = \sum_{i=1}^{n} p_i x_i.$$
For a continuous random variable with probability density function $f(x)$ and cumulative distribution function $F(x)$, the mean is given by the integration:
$$\mu = E(X) = \int_{-\infty}^{\infty} x f(x)\,dx = \int_{-\infty}^{\infty} x\,dF.$$

Linearity of Expectation
The expectation operator $E(\cdot)$ is linear. That is, for two random variables, $X$ and $Y$, and a constant, $c$, the following two equations are true:
$$E(cX) = c\,E(X)$$
$$E(X + Y) = E(X) + E(Y)$$
Having introduced two constants $a$ and $b$, show that
$$E(aX + bY) = a\,E(X) + b\,E(Y).$$

Probability and Expected Value
Tossing a fair coin and the random variable $X$:
$$X = \begin{cases} 1, & \text{if } \omega = \text{Head}; \\ 0, & \text{if } \omega = \text{Tail}. \end{cases}$$
Suppose $P(\omega = \text{Head}) = \frac{1}{2} = P(\omega = \text{Tail})$.
Quiz: What is the value of $E(X)$?
Suppose the coin is not fair and $P(\omega = \text{Head}) = 0.6$.
1. What is the value of $E(X)$?
2. Suppose $x_1, x_2, \ldots, x_n$ are the results of tossing the unfair coin $n$ times. What is the value of $\frac{1}{n}\sum_{i=1}^{n} x_i$ if $n$ is very large?

Probability and Expected Value (cont'd)
Consider the indicator variable $1_A$, which is defined as
$$1_A = \begin{cases} 1, & \text{if } \omega \in A; \\ 0, & \text{if } \omega \in A^c. \end{cases}$$
What is $E(1_A)$?

Central Tendency: Median and Mode
The median of a discrete random variable is the value $m$ such that the probability that a value is less than or equal to the median is equal to 50%:
$$P(X \le m) = P(X \ge m) = \frac{1}{2}.$$
The median is found by first ordering the data and then separating the ordered data into two halves.
The mode of a sample is the value that has the highest frequency of occurrences.
Question: Calculate the mean, median, and mode of the following data set.
-20%, -10%, -5%, -5%, 0%, 10%, 10%, 10%, 19%

Sample Problem
At the start of the year, a bond portfolio consists of two bonds, each worth $100. At the end of the year, if a bond defaults, it will be worth $20. If it does not default, the bond will be worth $100. The probability that both bonds default is 20%. The probability that neither bond defaults is 45%. What are the mean, median, and mode of the year-end portfolio value?
Solution
1. If both bonds default, then the portfolio value $V$ will be \$20 + \$20 = \$40. The problem says that $P(V = \$40) = 20\%$.
2. If neither bond defaults, then $V$ will be \$100 + \$100 = \$200. The problem says that $P(V = \$200) = 45\%$.
3. So the probability that exactly one of the two bonds defaults is $P(V = \$120) = 1 - 0.2 - 0.45 = 35\%$.
4. Hence, $E(V) = 0.2 \times \$40 + 0.35 \times \$120 + 0.45 \times \$200 = \$140$.
5. The mode is \$200, as it occurs with the highest probability of 45%.
6. The median is \$120: the cumulative probability first reaches 50% at \$120, since $P(V \le \$120) = 55\%$ while $P(V \le \$40) = 20\%$.
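The three statistics in this solution can be reproduced with a few lines of code. The sketch below is illustrative only: the helper names are hypothetical, and the median rule used (the smallest value whose cumulative probability reaches 50%) is one common convention assumed here.

```python
def mean(dist):
    """Probability-weighted average of the outcomes."""
    return sum(value * prob for value, prob in dist.items())


def mode(dist):
    """Outcome with the highest probability."""
    return max(dist, key=dist.get)


def median(dist):
    """Smallest outcome whose cumulative probability reaches 50% (assumed convention)."""
    cumulative = 0.0
    for value in sorted(dist):
        cumulative += dist[value]
        if cumulative >= 0.5:
            return value
    raise ValueError("probabilities sum to less than 0.5")


# Year-end portfolio values (in dollars) and their probabilities from the problem.
portfolio = {40: 0.20, 120: 0.35, 200: 0.45}

print(mean(portfolio), median(portfolio), mode(portfolio))
```

The printed mean may differ from 140 by floating-point rounding; the median and mode come out as 120 and 200.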

Another Sample Problem
Recall the probability density function $f(x) = \frac{8}{9}x$ for $x \in (0, \frac{3}{2}]$.
To calculate the median, we need to find $m$ such that the integral of $f(x)$ from the lower bound of $f(x)$, zero, to $m$ is equal to 0.50:
$$\int_0^m f(x)\,dx = 0.5.$$
Solving for $m$, we find $m = \frac{3}{2\sqrt{2}} \approx 1.06$.
To find the mean, we compute
$$\mu = \int_0^{3/2} x f(x)\,dx$$
to find that $\mu = 1$.
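The intermediate steps of both integrals are short enough to write out. The following derivation is a sketch that reads the density as f(x) = (8/9)x on (0, 3/2]:

```latex
\begin{align*}
\int_0^m \frac{8}{9}\,x\,dx &= \frac{4}{9}\,m^2 = \frac{1}{2}
  \;\Longrightarrow\; m^2 = \frac{9}{8}
  \;\Longrightarrow\; m = \frac{3}{2\sqrt{2}} \approx 1.06, \\
\mu = \int_0^{3/2} x \cdot \frac{8}{9}\,x\,dx &= \frac{8}{27}\,x^3 \,\Big|_0^{3/2}
  = \frac{8}{27}\cdot\frac{27}{8} = 1.
\end{align*}
```

The median exceeds the mean here because the density increases with x, so more probability mass sits near the upper bound.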

A Measure of Dispersion
Variance is defined as the expected value of the squared difference between the variable and its mean:
$$\sigma^2 := E\left((X - \mu)^2\right) =: V(X)$$
The symbol $\sigma^2$ is often used to denote the variance of the random variable $X$ with mean $\mu$.
The square root of variance, $\sigma$, is the standard deviation.
The mean $\mu$ of investment return is often referred to as the expected return.
The standard deviation of investment return $R$ is referred to as volatility. Volatility is not risk.
Exercise: Show that $\sigma^2 = E(X^2) - \mu^2$.
Exercise: Compute the variance of $X$ in Slide 12.

A Property of Variance
Prove that, with $c$ being a constant,
$$V(cX) = c^2\,V(X).$$
Proof:
1. Let $Y = cX$.
2. $\mu_Y = E(cX) = c\,E(X) =: c\mu_X$.
3. By definition, $\sigma_Y^2 = E\left((Y - \mu_Y)^2\right)$.
4. Therefore,
$$\sigma_Y^2 = E\left((cX - c\mu_X)^2\right) = E\left(c^2 (X - \mu_X)^2\right) = c^2\,E\left((X - \mu_X)^2\right) = c^2\,V(X).$$
5. Since $\sigma_Y^2 = V(cX)$, the proof is complete.

Sample Variance
The sample average or sample mean of a random variable $X$ is
$$\bar{X}_n = \frac{1}{n}\sum_{i=1}^{n} X_i.$$
The sample variance, as an estimator, is defined as
$$\hat{\sigma}_n^2 = \frac{1}{n-1}\sum_{i=1}^{n} \left(X_i - \bar{X}_n\right)^2.$$
Why divide by $n - 1$ and not $n$?
An alternative estimator of the variance is
$$\tilde{\sigma}_n^2 = \frac{1}{n}\sum_{i=1}^{n} \left(X_i - \bar{X}_n\right)^2.$$
Which is the correct one?

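Sample Variance $\hat{\sigma}_n^2$ is Unbiased
First we show that
$$\sum_{i=1}^{n} \left(X_i - \bar{X}_n\right)^2 = \sum_{i=1}^{n} \left(X_i^2 - 2X_i\bar{X}_n + \bar{X}_n^2\right) = \sum_{i=1}^{n} X_i^2 - 2\bar{X}_n\sum_{i=1}^{n} X_i + n\bar{X}_n^2.$$
Since $n\bar{X}_n = \sum_{i=1}^{n} X_i$, we have
$$\sum_{i=1}^{n} \left(X_i - \bar{X}_n\right)^2 = \sum_{i=1}^{n} X_i^2 - n\bar{X}_n^2.$$
Note that $E(X_i^2) = \sigma^2 + \mu^2$.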

Sample Variance $\hat{\sigma}_n^2$ is Unbiased (cont'd)
Next, we need to calculate $E(\bar{X}_n^2)$. Let $Y := \bar{X}_n$.
Note that for any random variable $Y$ with mean $\mu$, $E(Y^2) = V(Y) + \mu^2$; here $E(\bar{X}_n) = \mu$.
The variance of the sample average is, by the assumption of the independence of the $X_i$,
$$V(Y) = V\left(\frac{1}{n}\sum_{i=1}^{n} X_i\right) = \frac{1}{n^2}\,V\left(\sum_{i=1}^{n} X_i\right) = \frac{1}{n^2}\sum_{i=1}^{n} V(X_i) = \frac{1}{n^2}\,n\sigma^2 = \frac{\sigma^2}{n}.$$

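Sample Variance $\hat{\sigma}_n^2$ is Unbiased (cont'd)
It follows that
$$E(Y^2) = E\left(\bar{X}_n^2\right) = \frac{\sigma^2}{n} + \mu^2.$$
To show unbiasedness, we need to prove that $E(\hat{\sigma}_n^2) = \sigma^2$. Noting that $E(X_i^2) = \sigma^2 + \mu^2$, we find
$$E\left(\hat{\sigma}_n^2\right) = \frac{1}{n-1}\left(\sum_{i=1}^{n} E\left(X_i^2\right) - n\,E\left(\bar{X}_n^2\right)\right) = \frac{1}{n-1}\left(n\left(\sigma^2 + \mu^2\right) - n\left(\frac{\sigma^2}{n} + \mu^2\right)\right) = \frac{1}{n-1}\,(n-1)\,\sigma^2 = \sigma^2.$$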
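The unbiasedness result can also be seen in a simulation that compares the divisor n - 1 with the divisor n. This is a hedged sketch: the Gaussian data with true variance 4, the sample size of 5, and the parameter name `ddof` (borrowed from common numerical libraries) are assumptions made for illustration.

```python
import random


def sample_variance(xs, ddof=1):
    """Sum of squared deviations from the sample mean, divided by n - ddof."""
    n = len(xs)
    xbar = sum(xs) / n
    return sum((x - xbar) ** 2 for x in xs) / (n - ddof)


def average_variance_estimate(ddof, n=5, trials=40_000, sigma=2.0, seed=7):
    """Average the variance estimator over many independent samples."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        xs = [rng.gauss(0.0, sigma) for _ in range(n)]
        total += sample_variance(xs, ddof)
    return total / trials


# Assumed true variance is sigma**2 = 4.
unbiased = average_variance_estimate(ddof=1)  # divisor n - 1, should be near 4
biased = average_variance_estimate(ddof=0)    # divisor n, near 4 * (n - 1) / n = 3.2
print(unbiased, biased)
```

Dividing by n underestimates the variance on average by the factor (n - 1)/n, which matters most for small samples.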

Variance for a Continuous Random Variable
Definition
$$\sigma^2 = \int_{-\infty}^{\infty} (x - \mu)^2 f(x)\,dx$$
Exercise: Suppose the probability density function is $f(x) = \frac{8}{9}x$ for $x \in (0, \frac{3}{2}]$. Compute the variance.

Standardized Variables
Suppose $X$ is a random variable with constant mean $\mu$ and variance $\sigma^2$.
Since volatility $\sigma > 0$ for a random variable, we can define
$$Y := \frac{X - \mu}{\sigma}.$$
The variable $Y$ has mean zero and variance 1.
Quiz: If $X$ is a stochastic process $dX_t = \mu\,dt + \sigma\,dB_t$, what is the stochastic process for $Y_t$?
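Standardization is easy to verify on data. In the sketch below, the return series is hypothetical, and the mean and standard deviation of that series stand in for the population values µ and σ (an assumption of this example, since the slide works with the true parameters).

```python
import statistics


def standardize(xs):
    """Subtract the mean and divide by the (population) standard deviation."""
    mu = statistics.fmean(xs)
    sigma = statistics.pstdev(xs)
    if sigma == 0:
        raise ValueError("cannot standardize a constant series")
    return [(x - mu) / sigma for x in xs]


# Hypothetical returns, used only to illustrate the transformation.
returns = [0.02, -0.01, 0.03, 0.00, -0.04]
z = standardize(returns)

# Up to floating-point rounding, the mean is 0 and the variance is 1.
print(statistics.fmean(z), statistics.pvariance(z))
```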

Covariance
Covariance is a generalized version of variance. It is defined as
$$C(X, Y) \equiv \sigma_{XY} := E\left((X - \mu_X)(Y - \mu_Y)\right).$$
Variance is a special case: $C(X, X) = \sigma_{XX} = V(X)$.
Whereas variance is never negative, covariance can be positive, negative, or zero.
If $X$ and $Y$ are independent, then it must be that $C(X, Y) = 0$.
If $C(X, Y) = 0$, it is not necessarily true that $X$ and $Y$ are independent.
Exercise: Show that
1. $\sigma_{XY} = E(XY) - \mu_X\mu_Y$.
2. $C(X, Y) = C(Y, X)$.
3. $C(X + Y, Z) = C(X, Z) + C(Y, Z)$.

Estimators of Covariance
Given the paired data $(x_i, y_i)$, $i = 1, 2, \ldots, n$, the sample covariance is defined as
$$\hat{\sigma}_{XY} = \frac{1}{n-1}\sum_{i=1}^{n} \left(x_i - \hat{\mu}_X\right)\left(y_i - \hat{\mu}_Y\right),$$
where $\hat{\mu}_X = \bar{X}_n$ and $\hat{\mu}_Y = \bar{Y}_n$ are the sample means.
Claim: $C(X_i, Y_j) = 0$ if $i \ne j$.
Proof: Suppose $Y$ and $X$ are related by a mapping $f(\cdot)$, i.e., $Y_j = f(X_j)$. Note that the mapping involves only the paired copies because each $Y_j$ is independent of $Y_i$ and does not relate to it at all. Otherwise, if $Y_j = f(X_i, X_j)$, then $Y_j$ may depend on $Y_i$ indirectly through $X_i$, since $Y_i = f(X_h, X_i)$.
Homework Assignment:
1. Show that $\sum_{i=1}^{n} \left(X_i - \hat{\mu}_X\right)\left(Y_i - \hat{\mu}_Y\right) = \sum_{i=1}^{n} X_i Y_i - n\bar{X}_n\bar{Y}_n$.
2. Use these results to show that the sample covariance is unbiased.
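The sample covariance and the identity in the homework can be checked numerically before attempting the proof. The sketch below uses a small hypothetical data set and a hypothetical helper `sample_covariance`; a numerical check on one data set is of course not a proof.

```python
def sample_covariance(xs, ys):
    """Sample covariance with divisor n - 1, using the sample means."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equally long series with at least 2 points")
    n = len(xs)
    xbar = sum(xs) / n
    ybar = sum(ys) / n
    return sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys)) / (n - 1)


# Hypothetical paired data, for illustration only.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 1.0, 4.0, 3.0]
n = len(xs)
xbar, ybar = sum(xs) / n, sum(ys) / n

# Both sides of the identity: sum of cross deviations vs. sum(x*y) - n*xbar*ybar.
lhs = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
rhs = sum(x * y for x, y in zip(xs, ys)) - n * xbar * ybar

print(sample_covariance(xs, ys), lhs, rhs)
```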

Linear Combination of Two Random Variables
Suppose $X$ and $Y$ are a pair of random variables with means $\mu_X = E(X)$ and $\mu_Y = E(Y)$, respectively. Also, suppose $a$ and $b$ are two constants. Prove that
$$V(aX + bY) = a^2\,V(X) + b^2\,V(Y) + 2ab\,C(X, Y).$$
Proof
1. $V(aX + bY) = E\left((aX + bY)^2\right) - (a\mu_X + b\mu_Y)^2$.
2. Expanding the two quadratic terms and collecting the expanded terms accordingly, we obtain
$$a^2 E(X^2) - a^2\mu_X^2 + b^2 E(Y^2) - b^2\mu_Y^2 + 2ab\,E(XY) - 2ab\,\mu_X\mu_Y,$$
which is
$$a^2\left(E(X^2) - \mu_X^2\right) + b^2\left(E(Y^2) - \mu_Y^2\right) + 2ab\left(E(XY) - \mu_X\mu_Y\right),$$
that is, $a^2\,V(X) + b^2\,V(Y) + 2ab\,C(X, Y)$.

Correlation: Normalized Covariance
The normalization of covariance gives rise to correlation, which is defined as
$$\rho_{XY} := \frac{\sigma_{XY}}{\sigma_X\sigma_Y}.$$
Correlation has the nice property that it varies between -1 and +1.
If two variables have a correlation of +1 (-1), then we say they are perfectly correlated (anti-correlated).
If one random variable causes the other random variable, or both variables share a common underlying driver, then they tend to be highly correlated. But in general, high correlation does not imply that one variable causes the other.
If two variables are uncorrelated, it does not necessarily follow that they are unrelated.
So what does correlation really tell us?

Sample Problem
If $X$ has an equal probability of being -1, 0, or +1, what is the correlation between $X$ and $Y$ if $Y = X^2$?
First, we calculate the respective means of both variables:
$$E(X) = \frac{1}{3}(-1) + \frac{1}{3}(0) + \frac{1}{3}(1) = 0.$$
$$E(Y) = \frac{1}{3}\left((-1)^2\right) + \frac{1}{3}\left(0^2\right) + \frac{1}{3}\left(1^2\right) = \frac{2}{3}.$$
The covariance can be found as follows:
$$\sigma_{XY} = \frac{1}{3}\left((-1 - 0)\left((-1)^2 - \tfrac{2}{3}\right) + (0 - 0)\left(0^2 - \tfrac{2}{3}\right) + (1 - 0)\left(1^2 - \tfrac{2}{3}\right)\right) = 0.$$
So, even though $X$ and $Y$ are clearly related ($Y = X^2$), their correlation is zero!
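The same calculation can be done exactly with rational arithmetic, which avoids any floating-point doubt about whether the covariance is truly zero. The variable names below are illustrative.

```python
from fractions import Fraction

# X takes the values -1, 0, +1 with equal probability, and Y = X**2.
outcomes = [-1, 0, 1]
p = Fraction(1, 3)

mean_x = sum(p * x for x in outcomes)
mean_y = sum(p * x**2 for x in outcomes)

# Covariance as the probability-weighted product of deviations.
cov_xy = sum(p * (x - mean_x) * (x**2 - mean_y) for x in outcomes)

print(mean_x, mean_y, cov_xy)
```

Because the covariance is exactly zero, the correlation is zero as well, even though Y is a deterministic function of X.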

Portfolio Variance
If we have two securities with random returns $X_A$ and $X_B$, with means $\mu_A$ and $\mu_B$ and standard deviations $\sigma_A$ and $\sigma_B$, respectively, we can calculate the variance of $X_A$ plus $X_B$ as follows:
$$\sigma_{A+B}^2 = \sigma_A^2 + \sigma_B^2 + 2\rho_{AB}\sigma_A\sigma_B,$$
where $\rho_{AB}$ is the correlation between $X_A$ and $X_B$.
If the securities are uncorrelated, then $\sigma_{A+B}^2 = \sigma_A^2 + \sigma_B^2$.
In general, suppose $Y = \sum_{i=1}^{n} X_i$. The portfolio's variance is
$$\sigma_Y^2 = \sum_{i=1}^{n}\sum_{j=1}^{n} \rho_{ij}\sigma_i\sigma_j.$$
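The general double-sum formula reduces to the two-security formula when n = 2, which a short computation can confirm. The volatilities and correlation below are hypothetical, and `portfolio_variance` is an illustrative helper name.

```python
def portfolio_variance(sigmas, corr):
    """Double sum of rho_ij * sigma_i * sigma_j over all pairs (i, j)."""
    n = len(sigmas)
    return sum(
        corr[i][j] * sigmas[i] * sigmas[j]
        for i in range(n)
        for j in range(n)
    )


# Hypothetical inputs: two securities with volatilities 20% and 10%, correlation 0.5.
sigmas = [0.20, 0.10]
rho = 0.5
corr = [[1.0, rho], [rho, 1.0]]  # rho_ii = 1 on the diagonal

# Two-security formula for comparison.
two_asset = sigmas[0] ** 2 + sigmas[1] ** 2 + 2 * rho * sigmas[0] * sigmas[1]

print(portfolio_variance(sigmas, corr), two_asset)
```

The diagonal terms of the double sum supply the individual variances, and each off-diagonal pair appears twice, which is where the factor of 2 comes from.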

Square Root Rule
Suppose $X_i$ is a copy of $X$ such that $\sigma_i = \sigma$ for all $i$, and that all of the $X_i$'s are uncorrelated, i.e., $\rho_{ij} = 0$ for $i \ne j$. Then,
$$\sigma_Y = \sqrt{n}\,\sigma.$$
Consider a time series of weekly i.i.d. returns. The volatility is $\sigma = 2.06\%$. What is the annualized volatility?
Answer: Assume that one year has 52 weeks. Using the square root rule, we obtain
$$2.06\% \times \sqrt{52} = 14.85\%.$$
If the i.i.d. assumption fails to hold, the square root rule may give a misleading value.
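The annualization in this example is a one-line calculation. The helper name `scale_volatility` is hypothetical, and the function is only valid under the i.i.d. assumption stated above.

```python
import math


def scale_volatility(sigma_per_period, periods):
    """Square root rule: volatility over `periods` i.i.d. periods."""
    return sigma_per_period * math.sqrt(periods)


# Weekly volatility of 2.06%, 52 weeks per year.
annual = scale_volatility(0.0206, 52)
print(f"{annual:.2%}")  # prints 14.85%
```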

Application: Static Hedging
If the portfolio $P$ is a linear combination of $X_A$ and $X_B$, i.e., $P = aX_A + bX_B$, then
$$\sigma_P^2 = a^2\sigma_A^2 + b^2\sigma_B^2 + 2ab\,\rho_{AB}\sigma_A\sigma_B.$$
Correlation is central to the problem of hedging.
Let $a = 1$, i.e., $X_A$ is our primary asset. What should the hedge ratio $b$ be such that the portfolio variance $\sigma_P^2$ is the smallest possible?
The first-order condition with respect to $b$ is
$$\frac{d\sigma_P^2}{db} = 2b\sigma_B^2 + 2\rho_{AB}\sigma_A\sigma_B = 0.$$

Application: Static Hedging (cont'd)
The optimal hedge ratio is
$$b^* = -\rho_{AB}\frac{\sigma_A}{\sigma_B} = -\frac{C(X_A, X_B)}{V(X_B)}.$$
If $b^*$ is positive (negative), long (short) the asset B.
Substituting $b^*$ back into our original equation, the smallest volatility you can achieve for the hedged portfolio is
$$\sigma_P = \sigma_A\sqrt{1 - \rho_{AB}^2}.$$
When $\rho_{AB}$ equals zero (i.e., when the two securities are uncorrelated), the optimal hedge ratio is zero. You cannot hedge one security with another security if they are uncorrelated.
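The hedge ratio and the minimum volatility can be cross-checked against the portfolio variance formula with a = 1. The sketch below uses hypothetical inputs (correlation 0.8, volatilities 20% and 25%) and illustrative function names.

```python
import math


def optimal_hedge_ratio(rho, sigma_a, sigma_b):
    """Minimum-variance hedge ratio b* = -rho * sigma_A / sigma_B."""
    return -rho * sigma_a / sigma_b


def portfolio_volatility(b, rho, sigma_a, sigma_b):
    """Volatility of P = X_A + b * X_B (the case a = 1)."""
    variance = sigma_a**2 + b**2 * sigma_b**2 + 2 * b * rho * sigma_a * sigma_b
    return math.sqrt(variance)


# Hypothetical inputs, for illustration only.
rho, sigma_a, sigma_b = 0.8, 0.20, 0.25

b_star = optimal_hedge_ratio(rho, sigma_a, sigma_b)
hedged = portfolio_volatility(b_star, rho, sigma_a, sigma_b)
closed_form = sigma_a * math.sqrt(1 - rho**2)

print(b_star, hedged, closed_form)
```

With a positive correlation the ratio is negative, so asset B is shorted; moving b away from the optimal value in either direction raises the portfolio volatility.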

Puzzling?
Adding an uncorrelated security to a portfolio will always increase its variance! For example, $100 of Security A plus $20 of uncorrelated Security B will have a higher dollar standard deviation.
But if Security A and Security B are uncorrelated and have the same standard deviation, then replacing some of Security A with Security B will decrease the dollar standard deviation of the portfolio. For example, $80 of Security A plus $20 of uncorrelated Security B will have a lower dollar standard deviation than $100 of Security A.

Demystifying the Puzzle
Let $R_A$ and $R_B$ be the returns of Security A and Security B, respectively.
Let $\sigma_A^2 \left(= V(R_A)\right)$ and $\sigma_B^2 \left(= V(R_B)\right)$ be the variances of these returns. Moreover, suppose $\sigma_A = \sigma_B = \sigma$.
The dollar value of Security A will become $\$100R_A$.
If the portfolio is constructed by investing \$100 in Security A, then the volatility of the portfolio value in dollars is
$$\sqrt{V(100R_A)} = \$100\sigma.$$
But if the portfolio is made by having \$80 invested in Security A and \$20 invested in uncorrelated Security B, then the volatility of the portfolio value in dollars is
$$\sqrt{V(80R_A) + V(20R_B)},$$
which is
$$\sqrt{6{,}400\sigma_A^2 + 400\sigma_B^2} = \sqrt{6{,}800}\,\sigma < 100\sigma.$$
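Both halves of the puzzle can be reproduced with one helper. This is a sketch under the slide's assumption of uncorrelated returns with equal volatility; the value σ = 10% and the function name `dollar_volatility` are hypothetical.

```python
import math


def dollar_volatility(positions, sigmas):
    """Dollar volatility of a portfolio of uncorrelated positions.

    With zero correlations the variances simply add, so the result is
    the square root of the sum of (dollar position * volatility) squared.
    """
    return math.sqrt(sum((w * s) ** 2 for w, s in zip(positions, sigmas)))


sigma = 0.10  # hypothetical common return volatility

only_a = dollar_volatility([100], [sigma])              # $100 in A
mix = dollar_volatility([80, 20], [sigma, sigma])       # $80 in A, $20 in B
added = dollar_volatility([100, 20], [sigma, sigma])    # $100 in A plus $20 in B

print(only_a, mix, added)
```

Replacing part of A with B lowers the dollar volatility, while adding B on top of the full position in A raises it, which is the distinction the puzzle turns on.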

Important Lessons
Mean-variance analysis is a cornerstone of investment, even trading.
All sample estimators such as sample average and sample variance are random variables due to sampling randomness.
Sample mean, sample variance, and sample covariance
Unbiasedness of the three sample estimates
Diversification is more subtle than you thought.