Week 1 Quantitative Analysis of Financial Markets Basic Statistics A


Christopher Ting
http://www.mysmu.edu/faculty/christophert/
Email: christopherting@smu.edu.sg, Tel: 6828 0364, Office: LKCSB 5036
QF 603, October 14, 2017

Table of Contents
1. Introduction
2. Central Tendency
3. Dispersion
4. Portfolio Variance and Hedging
5. Takeaways

Introduction
Statistics as a discipline refers to the methods we use to analyze data. Statistical methods fall into one of two categories: descriptive statistics or inferential statistics.
Descriptive statistics summarize the important characteristics of large data sets. The goal is to consolidate numerical data into useful information.
Inferential statistics pertain to the procedures used to make forecasts, estimates, or judgments about a large set of data on the basis of the statistical characteristics of a smaller set (a sample).
In risk management, we often need to describe the relationship between two random variables. Is there a relationship between the returns of an equity and the returns of a market index?

Learning Outcomes of QA02
Chapter 3 of Michael Miller, Mathematics and Statistics for Financial Risk Management, 2nd Edition (Hoboken, NJ: John Wiley & Sons, 2013).
Interpret and apply the mean, standard deviation, and variance of a random variable.
Calculate the mean, standard deviation, and variance of a discrete random variable.
Calculate and interpret the covariance and correlation between two random variables.
Calculate the mean and variance of sums of variables.

Population vs. Sample
A population is defined as the set of all possible members of a stated group. Examples:
1. Cross section: all stocks listed on the Nasdaq
2. Time series: Dow Jones Industrial Average Index
It is frequently too costly or time consuming to obtain measurements for every member of a population, if it is even possible.
A sample is a subset randomly drawn from the population.

Central Tendency: Mean
Population mean of \(N\) entities:
\[ E(X) := \mu = \frac{1}{N} \sum_{i=1}^{N} X_i, \quad i = 1, 2, \ldots, N. \]
The sample mean is an estimate of the true mean \(\mu\):
\[ \bar{X}_n = \frac{1}{n} \sum_{j \in \{i_1, i_2, \ldots, i_n\}} X_j, \]
where \(i_1, i_2, \ldots, i_n\) index the \(n\) sampled members.
The law of large numbers for a random variable \(X\) states that, for a sample of independently realized values \(x_1, x_2, \ldots, x_n\),
\[ \lim_{n \to \infty} \frac{1}{n} \sum_{i=1}^{n} x_i = \mu. \]
What is the point of calculating the sample mean?

Estimator
The (functional) form of the sample mean \(\bar{X}_n\) on the Central Tendency: Mean slide (Slide 6) is an estimator.
You can also define an alternative sample average as
\[ \overbrace{X} := \frac{1}{n+1} \sum_{i=1}^{n} x_i. \]
Which is better, \(\bar{X}\) or \(\overbrace{X}\)?

Unbiasedness and Consistency
Unbiasedness: An estimator \(\hat{\theta}_n\) of a population statistic \(\theta\) is said to be unbiased when \(E(\hat{\theta}_n) = \theta\).
Consistency: An estimator \(\hat{\theta}_n\) of a population statistic \(\theta\) is said to be consistent when \(\hat{\theta}_n \to \theta\) in probability as \(n \to \infty\). Namely, for any \(\varepsilon > 0\),
\[ \lim_{n \to \infty} P\left( \left| \hat{\theta}_n - \theta \right| \ge \varepsilon \right) = 0. \]
Exercise: Show that the estimator \(\bar{X}\) for the mean \(\mu\) is unbiased but the estimator \(\overbrace{X}\) is biased.
Exercise: Show that the estimators \(\bar{X}\) and \(\overbrace{X}\) are consistent.
Is \(\hat{\theta}_7\) unbiased? Consistent?
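The bias of the divide-by-(n + 1) average can be checked exactly on a toy case. The sketch below is only an illustration, not part of the original slides: it assumes a hypothetical coin with P(X = 1) = 3/5 tossed n = 3 times and enumerates every possible sample with exact fractions.

```python
from fractions import Fraction
from itertools import product

# Hypothetical example: a coin with P(X = 1) = 3/5, tossed N = 3 times.
P_HEAD = Fraction(3, 5)
N = 3


def expected_estimate(divisor: int) -> Fraction:
    """Exact expectation of sum(x_i) / divisor over all 2**N samples."""
    total = Fraction(0)
    for sample in product((0, 1), repeat=N):
        prob = Fraction(1)
        for x in sample:
            prob *= P_HEAD if x == 1 else 1 - P_HEAD
        total += prob * Fraction(sum(sample), divisor)
    return total


mean_of_sample_mean = expected_estimate(N)      # estimator that divides by n
mean_of_alternative = expected_estimate(N + 1)  # estimator that divides by n + 1

print(mean_of_sample_mean)  # matches the true mean 3/5
print(mean_of_alternative)  # n/(n+1) times the true mean, i.e. 9/20
```

The factor n/(n + 1) tends to 1 as n grows, which is consistent with the alternative estimator being biased yet consistent.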

Independently and Identically Distributed (i.i.d.)
The concept of independence is a strong condition. Two random variables are independent when they are not related in any way.
Identical means that the two random variables are the same from the statistical standpoint. They follow the same probability distribution, hence the same descriptive statistics.
\(X\) is a random variable, but it has many copies. With the subscript \(i\), \(X_i\) is a copy of \(X\) when it will realize a value for the \(i\)-th time.
Question: Is the sample mean \(\bar{X}_n\) a random variable?

Expected Value: Discrete and Continuous
For a discrete random variable with \(n\) possible outcomes, suppose the probabilities are
\[ P(X = x_i) = p_i, \quad i = 1, 2, \ldots, n. \]
The mean is
\[ \mu = E(X) = \sum_{i=1}^{n} p_i x_i. \]
For a continuous random variable with probability density function \(f(x)\) and cumulative distribution function \(F(x)\), the mean is given by the integration:
\[ \mu = E(X) = \int_{-\infty}^{\infty} x f(x)\, dx = \int_{-\infty}^{\infty} x\, dF. \]

Linearity of Expectation
The expectation operator \(E(\cdot)\) is linear. That is, for two random variables, \(X\) and \(Y\), and a constant, \(c\), the following two equations are true:
\[ E(cX) = c\,E(X) \]
\[ E(X + Y) = E(X) + E(Y) \]
Exercise: For two constants \(a\) and \(b\), show that
\[ E(aX + bY) = a\,E(X) + b\,E(Y). \]

Probability and Expected Value
Tossing a fair coin and the random variable \(X\):
\[ X = \begin{cases} 1, & \text{if } \omega = \text{Head}; \\ 0, & \text{if } \omega = \text{Tail}. \end{cases} \]
Suppose \(P(\omega = \text{Head}) = \tfrac{1}{2} = P(\omega = \text{Tail})\).
Quiz: What is the value of \(E(X)\)?
Suppose the coin is not fair and \(P(\omega = \text{Head}) = 0.6\).
1. What is the value of \(E(X)\)?
2. Suppose \(x_1, x_2, \ldots, x_n\) are the results of tossing the unfair coin \(n\) times. What is the value of \(\frac{1}{n} \sum_{i=1}^{n} x_i\) if \(n\) is very large?
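One way to explore the second question is to simulate the tosses. The following is a minimal sketch, assuming Python's standard pseudo-random generator with a fixed seed; the function name `running_average` and the chosen sample sizes are illustrative, not from the slides.

```python
import random


def running_average(p_head: float, n_tosses: int, seed: int = 0) -> float:
    """Average of n_tosses simulated coin flips, where X = 1 on a head."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    heads = sum(1 for _ in range(n_tosses) if rng.random() < p_head)
    return heads / n_tosses


# The average should drift toward P(Head) as the number of tosses grows.
for n in (10, 1_000, 100_000):
    print(n, running_average(0.6, n))
```

With a small number of tosses the average can sit far from 0.6; with many tosses it should settle close to it, as the law of large numbers suggests.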

Probability and Expected Value (cont'd)
Consider the indicator variable \(\mathbf{1}_A\), which is defined as
\[ \mathbf{1}_A = \begin{cases} 1, & \text{if } \omega \in A; \\ 0, & \text{if } \omega \in A^c. \end{cases} \]
What is \(E(\mathbf{1}_A)\)?

Central Tendency: Median and Mode
The median of a discrete random variable is the value \(m\) such that the probability that a value is less than or equal to the median is equal to 50%:
\[ P(X \ge m) = P(X \le m) = \tfrac{1}{2}. \]
The median is found by first ordering the data and then separating the ordered data into two halves.
The mode of a sample is the value that has the highest frequency of occurrences.
Question: Calculate the mean, median, and mode of the following data set.
-20%, -10%, -5%, -5%, 0%, 10%, 10%, 10%, 19%

Sample Problem
At the start of the year, a bond portfolio consists of two bonds, each worth $100. At the end of the year, if a bond defaults, it will be worth $20. If it does not default, the bond will be worth $100. The probability that both bonds default is 20%. The probability that neither bond defaults is 45%. What are the mean, median, and mode of the year-end portfolio value?
Solution
1. If both bonds default, then the portfolio value \(V\) will be $20 + $20 = $40. The problem says that P(V = $40) = 20%.
2. If neither bond defaults, then \(V\) will be $100 + $100 = $200. The problem says that P(V = $200) = 45%.
3. So the probability that exactly one of the two bonds defaults is P(V = $120) = 1 - 0.2 - 0.45 = 35%.
4. Hence, E(V) = 0.2 × $40 + 0.35 × $120 + 0.45 × $200 = $140.
5. The mode is $200, as it occurs with the highest probability of 45%.
6. The median is $120; half of the outcomes are less than or equal to $120.
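The three statistics in this solution can be reproduced from the probability table alone. This is a small sketch using exact fractions; the helper names `mean`, `median`, and `mode` are illustrative, and the median is taken here as the smallest value whose cumulative probability reaches one half.

```python
from fractions import Fraction

# Year-end portfolio values and their probabilities, taken from the solution.
dist = {40: Fraction(20, 100), 120: Fraction(35, 100), 200: Fraction(45, 100)}


def mean(d):
    """Probability-weighted average of the outcomes."""
    return sum(v * p for v, p in d.items())


def median(d):
    """Smallest value m with P(V <= m) >= 1/2."""
    cumulative = Fraction(0)
    for v in sorted(d):
        cumulative += d[v]
        if cumulative >= Fraction(1, 2):
            return v
    raise ValueError("probabilities do not sum to at least 1/2")


def mode(d):
    """Outcome with the highest probability."""
    return max(d, key=d.get)


print(mean(dist), median(dist), mode(dist))  # 140 120 200
```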

Another Sample Problem
Recall the probability density function
\[ f(x) = \frac{8}{9} x \quad \text{for } x \in \left(0, \tfrac{3}{2}\right]. \]
To calculate the median, we need to find \(m\) such that the integral of \(f(x)\) from the lower bound of \(f(x)\), zero, to \(m\) is equal to 0.50:
\[ \int_{0}^{m} f(x)\, dx = 0.5. \]
Solving for \(m\), we find
\[ m = \frac{3}{2\sqrt{2}}. \]
To find the mean, we compute
\[ \mu = \int_{0}^{3/2} x f(x)\, dx \]
to find that \(\mu = 1\).
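The two integrals can also be checked numerically. The sketch below is an illustration under simple assumptions: it uses a midpoint-rule integrator and bisection on the cumulative probability, both written by hand here rather than taken from any library.

```python
def pdf(x: float) -> float:
    """f(x) = (8/9) x on [0, 3/2], zero elsewhere."""
    return 8.0 * x / 9.0 if 0.0 <= x <= 1.5 else 0.0


def integrate(g, a: float, b: float, steps: int = 20_000) -> float:
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / steps
    return h * sum(g(a + (k + 0.5) * h) for k in range(steps))


def median(lo: float = 0.0, hi: float = 1.5, tol: float = 1e-9) -> float:
    """Bisection: find m such that the integral of f from 0 to m is 0.5."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if integrate(pdf, 0.0, mid, 2_000) < 0.5:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


mu = integrate(lambda x: x * pdf(x), 0.0, 1.5)
m = median()
print(round(mu, 6), round(m, 6))
```

The numerical median should agree with the closed form 3 / (2 * sqrt(2)), which is roughly 1.06, and the mean with 1.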

A Measure of Dispersion
Variance is defined as the expected value of the squared difference between the variable and its mean:
\[ \sigma^2 := E\left( (X - \mu)^2 \right) =: V(X). \]
The symbol \(\sigma^2\) is often used to denote the variance of the random variable \(X\) with mean \(\mu\).
The square root of variance, \(\sigma\), is the standard deviation.
The mean \(\mu\) of investment return is often referred to as the expected return.
The standard deviation of investment return \(R\) is referred to as volatility. Volatility is not risk.
Exercise: Show that \(\sigma^2 = E(X^2) - \mu^2\).
Exercise: Compute the variance of the coin-toss variable \(X\) on the Probability and Expected Value slide (Slide 12).
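As a numerical companion to the first exercise, the variance can be computed both from the definition and from the shortcut formula. The sketch below reuses the bond-portfolio distribution from the earlier sample problem (values $40, $120, $200 with probabilities 20%, 35%, 45%); the variable names are illustrative.

```python
from fractions import Fraction

# Bond-portfolio distribution from the sample problem: value -> probability.
dist = {40: Fraction(1, 5), 120: Fraction(7, 20), 200: Fraction(9, 20)}

mu = sum(v * p for v, p in dist.items())

# Definition: expected squared deviation from the mean.
var_definition = sum(p * (v - mu) ** 2 for v, p in dist.items())

# Shortcut: E(X^2) minus the squared mean.
var_shortcut = sum(p * v**2 for v, p in dist.items()) - mu**2

# Standard deviation (volatility) in dollars.
volatility = float(var_definition) ** 0.5

print(mu, var_definition, var_shortcut, round(volatility, 2))
```

Both routes give the same variance, which is what the identity in the exercise asserts.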

A Property of Variance
Prove that, with \(c\) being a constant,
\[ V(cX) = c^2\, V(X). \]
Proof:
1. Let \(Y = cX\).
2. \(\mu_Y = E(cX) = c\,E(X) =: c\mu_X\).
3. By definition, \(\sigma_Y^2 = E\left( (Y - \mu_Y)^2 \right)\).
4. Therefore,
\[ \sigma_Y^2 = E\left( (cX - c\mu_X)^2 \right) = E\left( c^2 (X - \mu_X)^2 \right) = c^2\, E\left( (X - \mu_X)^2 \right) = c^2\, V(X). \]
5. Since \(\sigma_Y^2 = V(cX)\), the proof is complete.

Sample Variance
The sample average or sample mean of a random variable \(X\) is
\[ \bar{X}_n = \frac{1}{n} \sum_{i=1}^{n} X_i. \]
The sample variance, as an estimator, is defined as
\[ \hat{\sigma}_n^2 = \frac{1}{n-1} \sum_{i=1}^{n} \left( X_i - \bar{X}_n \right)^2. \]
Why divide by \(n - 1\) and not \(n\)?
An alternative estimator of the variance is
\[ \tilde{\sigma}_n^2 = \frac{1}{n} \sum_{i=1}^{n} \left( X_i - \bar{X}_n \right)^2. \]
Which is the correct one?

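Sample Variance \(\hat{\sigma}_n^2\) is Unbiased
First we show that
\[ \sum_{i=1}^{n} \left( X_i - \bar{X}_n \right)^2 = \sum_{i=1}^{n} \left( X_i^2 - 2 X_i \bar{X}_n + \bar{X}_n^2 \right) = \sum_{i=1}^{n} X_i^2 - 2 \bar{X}_n \sum_{i=1}^{n} X_i + n \bar{X}_n^2. \]
Since \(n \bar{X}_n = \sum_{i=1}^{n} X_i\), we have
\[ \sum_{i=1}^{n} \left( X_i - \bar{X}_n \right)^2 = \sum_{i=1}^{n} X_i^2 - n \bar{X}_n^2. \]
Note that \(E(X_i^2) = \sigma^2 + \mu^2\).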

Sample Variance \(\hat{\sigma}_n^2\) is Unbiased (cont'd)
Next, we need to calculate \(E(\bar{X}_n^2)\). Let \(Y := \bar{X}_n\), so that \(E(Y) = \mu\).
Note that for any random variable \(Y\) with mean \(\mu\),
\[ E(Y^2) = V(Y) + \mu^2. \]
The variance of the sample average is, by the assumption of the independence of the \(X_i\),
\[ V(Y) = V\left( \frac{1}{n} \sum_{i=1}^{n} X_i \right) = \frac{1}{n^2} V\left( \sum_{i=1}^{n} X_i \right) = \frac{1}{n^2} \sum_{i=1}^{n} V(X_i) = \frac{1}{n^2}\, n\sigma^2 = \frac{\sigma^2}{n}. \]

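Sample Variance \(\hat{\sigma}_n^2\) is Unbiased (cont'd)
It follows that
\[ E(Y^2) = E\left( \bar{X}_n^2 \right) = \frac{\sigma^2}{n} + \mu^2. \]
To show unbiasedness, we need to prove that \(E(\hat{\sigma}_n^2) = \sigma^2\). Noting that \(E(X_i^2) = \sigma^2 + \mu^2\), we find
\[ E\left( \hat{\sigma}_n^2 \right) = \frac{1}{n-1} \left( \sum_{i=1}^{n} E\left( X_i^2 \right) - n\, E\left( \bar{X}_n^2 \right) \right) = \frac{1}{n-1} \left( n\left( \sigma^2 + \mu^2 \right) - n\left( \frac{\sigma^2}{n} + \mu^2 \right) \right) = \frac{1}{n-1} \left( (n-1)\sigma^2 \right) = \sigma^2. \]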
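The result can be verified exactly on a small case by enumerating every possible sample. The sketch below is illustrative only: it assumes a hypothetical Bernoulli variable with P(X = 1) = 3/5 and a sample size of 3, and compares dividing the sum of squared deviations by n - 1 against dividing by n.

```python
from fractions import Fraction
from itertools import product

P_ONE = Fraction(3, 5)          # hypothetical Bernoulli parameter
N = 3                           # hypothetical sample size
TRUE_VAR = P_ONE * (1 - P_ONE)  # population variance p(1 - p) = 6/25


def expected_variance_estimate(divisor: int) -> Fraction:
    """Exact expectation of sum((x_i - x_bar)^2) / divisor over all samples."""
    total = Fraction(0)
    for sample in product((0, 1), repeat=N):
        prob = Fraction(1)
        for x in sample:
            prob *= P_ONE if x == 1 else 1 - P_ONE
        x_bar = Fraction(sum(sample), N)
        squared_dev = sum((x - x_bar) ** 2 for x in sample)
        total += prob * squared_dev / divisor
    return total


print(expected_variance_estimate(N - 1))  # equals the true variance 6/25
print(expected_variance_estimate(N))      # smaller: (n-1)/n times 6/25 = 4/25
```

Dividing by n underestimates the variance on average by the factor (n - 1)/n, which is why the n - 1 divisor is used.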

Variance for a Continuous Random Variable
Definition:
\[ \sigma^2 = \int_{-\infty}^{\infty} (x - \mu)^2 f(x)\, dx \]
Exercise: Suppose the probability density function is
\[ f(x) = \frac{8}{9} x \quad \text{for } x \in \left(0, \tfrac{3}{2}\right]. \]
Compute the variance.

Standardized Variables
Suppose \(X\) is a random variable with constant mean \(\mu\) and variance \(\sigma^2\).
Since volatility \(\sigma > 0\) for a random variable, we can define
\[ Y := \frac{X - \mu}{\sigma}. \]
The variable \(Y\) has mean zero and variance 1.
Quiz: If \(X\) is a stochastic process \(dX_t = \mu\, dt + \sigma\, dB_t\), what is the stochastic process for \(Y_t\)?

Covariance
Covariance is a generalized version of variance. It is defined as
\[ C(X, Y) \equiv \sigma_{XY} := E\left( (X - \mu_X)(Y - \mu_Y) \right). \]
Variance is a special case: \(C(X, X) = \sigma_{XX} = V(X)\).
Whereas variance is never negative, covariance can be positive, negative, or zero.
If \(X\) and \(Y\) are independent, then it must be that \(C(X, Y) = 0\).
If \(C(X, Y) = 0\), it is not necessarily true that \(X\) and \(Y\) are independent.
Exercise: Show that
1. \(\sigma_{XY} = E(XY) - \mu_X \mu_Y\).
2. \(C(X, Y) = C(Y, X)\).
3. \(C(X + Y, Z) = C(X, Z) + C(Y, Z)\).

Estimators of Covariance
Given the paired data \((x_i, y_i)\), \(i = 1, 2, \ldots, n\), the sample covariance is defined as
\[ \hat{\sigma}_{XY} = \frac{1}{n-1} \sum_{i=1}^{n} \left( x_i - \hat{\mu}_X \right)\left( y_i - \hat{\mu}_Y \right), \]
where \(\hat{\mu}_X\) and \(\hat{\mu}_Y\) are the sample means.
\(C(X_i, Y_j) = 0\) if \(i \ne j\).
Proof: Suppose \(Y\) and \(X\) are related by a mapping \(f(\cdot)\), i.e., \(Y_j = f(X_j)\). Note that the mapping involves only the paired copies, because each \(Y_j\) is independent of, and not related at all to, \(Y_i\). Otherwise, if \(Y_j = f(X_i, X_j)\), then \(Y_j\) may depend on \(Y_i\) indirectly through \(X_i\), since \(Y_i = f(X_h, X_i)\).
Homework Assignment:
1. Show that
\[ \sum_{i=1}^{n} \left( X_i - \hat{\mu}_X \right)\left( Y_i - \hat{\mu}_Y \right) = \sum_{i=1}^{n} X_i Y_i - n \bar{X}_n \bar{Y}_n. \]
2. Use these results to show that the sample covariance is unbiased.

Linear Combination of Two Random Variables
Suppose \(X\) and \(Y\) are a pair of random variables with means \(\mu_X = E(X)\) and \(\mu_Y = E(Y)\), respectively. Also, suppose \(a\) and \(b\) are two constants. Prove that
\[ V(aX + bY) = a^2\, V(X) + b^2\, V(Y) + 2ab\, C(X, Y). \]
Proof:
1. \(V(aX + bY) = E\left( (aX + bY)^2 \right) - (a\mu_X + b\mu_Y)^2\).
2. Expanding the two quadratic terms and collecting the expanded terms accordingly, we obtain
\[ a^2 E(X^2) - a^2 \mu_X^2 + b^2 E(Y^2) - b^2 \mu_Y^2 + 2ab\, E(XY) - 2ab\, \mu_X \mu_Y, \]
which is
\[ a^2 \left( E(X^2) - \mu_X^2 \right) + b^2 \left( E(Y^2) - \mu_Y^2 \right) + 2ab \left( E(XY) - \mu_X \mu_Y \right). \]
3. Since \(E(X^2) - \mu_X^2 = V(X)\), \(E(Y^2) - \mu_Y^2 = V(Y)\), and \(E(XY) - \mu_X \mu_Y = C(X, Y)\), the result follows.

Correlation: Normalized Covariance
The normalization of covariance gives rise to correlation, which is defined as
\[ \rho_{XY} := \frac{\sigma_{XY}}{\sigma_X \sigma_Y}. \]
Correlation has the nice property that it varies between -1 and +1.
If two variables have a correlation of +1 (-1), then we say they are perfectly correlated (anti-correlated).
If one random variable causes the other random variable, or both variables share a common underlying driver, then they are highly correlated. But in general, high correlation does not imply causation of one variable on the other.
If two variables are uncorrelated, it does not necessarily follow that they are unrelated.
So what does correlation really tell us?

Sample Problem
If \(X\) has an equal probability of being -1, 0, or +1, what is the correlation between \(X\) and \(Y\) if \(Y = X^2\)?
First, we calculate the respective means of both variables:
\[ E(X) = \tfrac{1}{3}(-1) + \tfrac{1}{3}(0) + \tfrac{1}{3}(1) = 0. \]
\[ E(Y) = \tfrac{1}{3}\left((-1)^2\right) + \tfrac{1}{3}\left(0^2\right) + \tfrac{1}{3}\left(1^2\right) = \tfrac{2}{3}. \]
The covariance can be found as follows:
\[ \sigma_{XY} = \tfrac{1}{3}\left( (-1 - 0)\left((-1)^2 - \tfrac{2}{3}\right) + (0 - 0)\left(0^2 - \tfrac{2}{3}\right) + (1 - 0)\left(1^2 - \tfrac{2}{3}\right) \right) = 0. \]
So, even though \(X\) and \(Y\) are clearly related (\(Y = X^2\)), their correlation is zero!
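The same calculation can be written as a few lines of exact arithmetic. This is a minimal sketch of the sample problem as stated; the variable names are illustrative.

```python
from fractions import Fraction

outcomes = [-1, 0, 1]   # X takes each value with probability 1/3
prob = Fraction(1, 3)

mean_x = sum(prob * x for x in outcomes)
mean_y = sum(prob * x**2 for x in outcomes)  # Y = X^2

# Covariance from the definition E[(X - mu_X)(Y - mu_Y)].
cov_xy = sum(prob * (x - mean_x) * (x**2 - mean_y) for x in outcomes)

print(mean_x, mean_y, cov_xy)  # 0 2/3 0
```

A zero covariance gives a zero correlation here even though Y is fully determined by X, because correlation only captures linear dependence.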

Portfolio Variance
If we have two securities with random returns \(X_A\) and \(X_B\), with means \(\mu_A\) and \(\mu_B\) and standard deviations \(\sigma_A\) and \(\sigma_B\), respectively, we can calculate the variance of \(X_A\) plus \(X_B\) as follows:
\[ \sigma_{A+B}^2 = \sigma_A^2 + \sigma_B^2 + 2\rho_{AB}\sigma_A\sigma_B, \]
where \(\rho_{AB}\) is the correlation between \(X_A\) and \(X_B\).
If the securities are uncorrelated, then \(\sigma_{A+B}^2 = \sigma_A^2 + \sigma_B^2\).
In general, suppose \(Y = \sum_{i=1}^{n} X_i\). The portfolio's variance is
\[ \sigma_Y^2 = \sum_{i=1}^{n} \sum_{j=1}^{n} \rho_{ij}\sigma_i\sigma_j. \]
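The double sum translates directly into code. The sketch below is an illustration with made-up inputs: the volatilities 0.2 and 0.1 and the correlation 0.5 are hypothetical, and `portfolio_variance` is a name chosen here, not one from the slides.

```python
def portfolio_variance(sigmas, corr):
    """Variance of a sum of returns: sum over i, j of rho_ij * sigma_i * sigma_j."""
    n = len(sigmas)
    return sum(
        corr[i][j] * sigmas[i] * sigmas[j]
        for i in range(n)
        for j in range(n)
    )


# Hypothetical two-security example; the diagonal of corr is 1 by definition.
sigmas = [0.2, 0.1]
corr = [[1.0, 0.5],
        [0.5, 1.0]]

two_asset = portfolio_variance(sigmas, corr)
closed_form = 0.2**2 + 0.1**2 + 2 * 0.5 * 0.2 * 0.1
print(two_asset, closed_form)
```

For two securities the double sum collapses to the two-asset formula above, since the diagonal terms give the variances and the two off-diagonal terms give the cross term.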

Square Root Rule
Suppose \(X_i\) is a copy of \(X\) such that \(\sigma_i = \sigma\) for all \(i\), and that all of the \(X_i\)'s are uncorrelated, i.e., \(\rho_{ij} = 0\) for \(i \ne j\). Then,
\[ \sigma_Y = \sqrt{n}\, \sigma. \]
Consider the time series of weekly i.i.d. returns. The volatility is \(\sigma = 2.06\%\). What is the annualized volatility?
Answer: Assume that one year has 52 weeks. Using the square root rule, we obtain
\[ 2.06\% \times \sqrt{52} = 14.85\%. \]
If the i.i.d. assumption fails to hold, the square root rule may lead to a misleading value.
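The annualization step is a one-line calculation. A minimal sketch, assuming i.i.d. returns as stated above; the function name `annualize` is illustrative.

```python
import math


def annualize(vol_per_period: float, periods_per_year: int) -> float:
    """Square root rule: scale per-period volatility by sqrt(number of periods)."""
    return vol_per_period * math.sqrt(periods_per_year)


weekly_vol = 0.0206  # 2.06% weekly volatility from the example
print(f"{annualize(weekly_vol, 52):.2%}")  # prints 14.85%
```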

Application: Static Hedging
If the portfolio \(P\) is a linear combination of \(X_A\) and \(X_B\), i.e., \(P = aX_A + bX_B\), then
\[ \sigma_P^2 = a^2\sigma_A^2 + b^2\sigma_B^2 + 2ab\rho_{AB}\sigma_A\sigma_B. \]
Correlation is central to the problem of hedging.
Let \(a = 1\), i.e., \(X_A\) is our primary asset. What should the hedge ratio \(b\) be such that the portfolio variance \(\sigma_P^2\) is the smallest possible?
The first-order condition with respect to \(b\) is
\[ \frac{d\sigma_P^2}{db} = 2b\sigma_B^2 + 2\rho_{AB}\sigma_A\sigma_B = 0. \]

Application: Static Hedging (cont'd)
Solving the first-order condition \(2b\sigma_B^2 + 2\rho_{AB}\sigma_A\sigma_B = 0\), the optimal hedge ratio is
\[ b^* = -\rho_{AB}\frac{\sigma_A}{\sigma_B} = -\frac{C(X_A, X_B)}{V(X_B)}. \]
If \(b^*\) is positive (negative), long (short) the asset B.
Substituting \(b^*\) back into our original equation, the smallest volatility you can achieve for the hedged portfolio is
\[ \sigma_P = \sigma_A \sqrt{1 - \rho_{AB}^2}. \]
When \(\rho_{AB}\) equals zero (i.e., when the two securities are uncorrelated), the optimal hedge ratio is zero. You cannot hedge one security with another security if they are uncorrelated.
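Setting the derivative of the portfolio variance with respect to b to zero gives b* = -rho_AB * sigma_A / sigma_B. The sketch below checks this numerically under hypothetical inputs (sigma_A = 0.20, sigma_B = 0.10, rho_AB = 0.5); the function names are illustrative.

```python
import math


def hedged_variance(b: float, sigma_a: float, sigma_b: float, rho: float) -> float:
    """Variance of X_A + b * X_B (the case a = 1)."""
    return sigma_a**2 + b**2 * sigma_b**2 + 2 * b * rho * sigma_a * sigma_b


def optimal_hedge_ratio(sigma_a: float, sigma_b: float, rho: float) -> float:
    """Solution of the first-order condition 2*b*sigma_b^2 + 2*rho*sigma_a*sigma_b = 0."""
    return -rho * sigma_a / sigma_b


# Hypothetical inputs, chosen only for illustration.
sigma_a, sigma_b, rho = 0.20, 0.10, 0.5

b_star = optimal_hedge_ratio(sigma_a, sigma_b, rho)
min_vol = math.sqrt(hedged_variance(b_star, sigma_a, sigma_b, rho))

print(b_star)                               # negative: short asset B
print(min_vol)                              # volatility of the hedged portfolio
print(sigma_a * math.sqrt(1 - rho**2))      # closed form for comparison
```

With a positive correlation the ratio is negative, so the hedge is a short position in B, and moving b away from b* in either direction raises the variance.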

Puzzling?
Adding an uncorrelated security to a portfolio will always increase its variance! For example, $100 of Security A plus $20 of uncorrelated Security B will have a higher dollar standard deviation.
But if Security A and Security B are uncorrelated and have the same standard deviation, then replacing some of Security A with Security B will decrease the dollar standard deviation of the portfolio. For example, $80 of Security A plus $20 of uncorrelated Security B will have a lower dollar standard deviation than $100 of Security A.

Demystifying the Puzzle
Let \(R_A\) and \(R_B\) be the returns of Security A and Security B, respectively.
Let \(\sigma_A^2\ (= V(R_A))\) and \(\sigma_B^2\ (= V(R_B))\) be the variances of these returns. Moreover, suppose \(\sigma_A = \sigma_B = \sigma\).
The dollar value of a $100 position in Security A will become \(100R_A\) dollars.
If the portfolio is constructed by investing $100 in Security A, then the volatility of the portfolio value in dollars is
\[ \sqrt{V(100R_A)} = 100\sigma. \]
But if the portfolio is made by having $80 invested in Security A and $20 invested in uncorrelated Security B, then the volatility of the portfolio value in dollars is
\[ \sqrt{V(80R_A) + V(20R_B)}, \]
which is
\[ \sqrt{6{,}400\sigma_A^2 + 400\sigma_B^2} = \sqrt{6{,}800}\,\sigma < 100\sigma. \]
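The three portfolios in the puzzle can be compared side by side. This is a small sketch assuming, as in the slide, uncorrelated securities with a common return volatility; sigma is set to 1 so the outputs read as multiples of sigma, and `dollar_volatility` is an illustrative name.

```python
import math


def dollar_volatility(positions, sigma: float) -> float:
    """Dollar volatility of uncorrelated positions that share the same sigma."""
    return sigma * math.sqrt(sum(w**2 for w in positions))


sigma = 1.0  # report results as multiples of sigma

only_a = dollar_volatility([100], sigma)        # $100 in A
added_b = dollar_volatility([100, 20], sigma)   # $100 in A plus $20 in B
replaced = dollar_volatility([80, 20], sigma)   # $80 in A and $20 in B

print(only_a, round(added_b, 2), round(replaced, 2))  # 100.0 101.98 82.46
```

Adding B on top of A raises the dollar volatility, while replacing part of A with B lowers it, which resolves the apparent contradiction.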

Important Lessons
Mean-variance analysis is a cornerstone of investment, even trading.
All sample estimators such as sample average and sample variance are random variables due to sampling randomness.
Sample mean, sample variance, and sample covariance
Unbiasedness of the three sample estimates
Diversification is more subtle than you thought.