Business Statistics 41000: Probability 4


Business Statistics 41000: Probability 4
Drew D. Creal
University of Chicago, Booth School of Business
February 14 and 15, 2014

Class information
Drew D. Creal
Email: dcreal@chicagobooth.edu
Office: 404 Harper Center
Office hours: email me for an appointment
Office phone: 773.834.5249
http://faculty.chicagobooth.edu/drew.creal/teaching/index.html

Course schedule
Week # 1: Plotting and summarizing univariate data
Week # 2: Plotting and summarizing bivariate data
Week # 3: Probability 1
Week # 4: Probability 2
Week # 5: Probability 3
Week # 6: In-class exam and Probability 4
Week # 7: Statistical inference 1
Week # 8: Statistical inference 2
Week # 9: Simple linear regression
Week # 10: Multiple linear regression

Outline of today's topics
I. Standardization
II. Histograms and i.i.d. draws
III. The Law of Large Numbers
IV. The Central Limit Theorem

Standardization

Standardization

To standardize a random variable means to subtract the mean and divide by the standard deviation. What does this do to the mean and variance? Let E[X] = µ and V[X] = σ². Then

Y = (X - µ)/σ = X/σ - µ/σ

Our formulas for linear functions tell us that E[Y] = 0 and V[Y] = 1.

Standardizing a numeric variable

In many practical situations, it is also useful to standardize the data. To standardize a numeric variable means to subtract the sample mean and divide by the sample standard deviation. What are the sample mean and sample variance of the new variable?

Standardization

Standardizing a random variable creates a new random variable with mean equal to zero and variance equal to 1. Standardizing a numeric variable in your dataset creates a new variable with sample mean equal to zero and sample variance equal to 1. The new variable is unitless. In both cases, it can be interpreted as the number of standard deviations away from the mean. Let's see an example!
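
As an aside (not from the original slides, which work in Excel), here is a minimal Python sketch of both operations; the data and variable names are illustrative:

    import numpy as np

    rng = np.random.default_rng(0)
    x = rng.normal(loc=5.0, scale=2.0, size=10_000)  # fake data with mean 5, sd 2

    # Standardize the numeric variable: subtract the sample mean,
    # divide by the sample standard deviation.
    z = (x - x.mean()) / x.std(ddof=1)

    print(z.mean(), z.var(ddof=1))  # approximately 0 and 1

Each z[i] is the number of (sample) standard deviations x[i] lies from the sample mean.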

Standardization: How unusual are some events?

Sometimes something weird or unusual happens and we want to quantify just how weird it is. A typical example is a market crash.

[Figure: daily returns on U.S. equities, 1973-1988, with the Black Monday crash of 10/19/1987 standing out as a large negative return.]

Standardization

How unusual is the crash?
1. The data up until the crash looks (approximately) normal.
2. Suppose we model it as N(0.03796, 0.7893).
3. The mean and variance were estimated using the data before the day of the crash.
4. The return on the day of the crash: -20.69%.

[Figure: the fitted normal density for daily returns.]

Standardization

The crash return was way out in the left tail.

[Figure: density of returns from -20 to 5, with Black Monday (-20.69%) marked far out in the left tail.]

We want to know the probability of this crash assuming the data is normal. To do this we will standardize the data. We want to ask: if the value were from a standard normal, what would it be?

Standardization

We can think of our returns as:

R_t = 0.03796 + 0.7893 Z_t, where Z_t ~ N(0, 1)

The value Z_t corresponding to a generic R_t value is:

Z_t = (R_t - µ)/σ = (R_t - 0.03796)/0.7893

The values of Z_t for t = 1, ..., T should look standard normal. Why?

Standardization

So, how unusual is the crash return?

Z_t = (-20.69 - 0.03796)/0.7893 = -21.84

Its z-value is -21.84. It is like drawing a value of -21.84 from the standard normal. No way!

[Figure: z-values of the preceding months plotted against the standard normal density; there are more returns farther out in the tails than a standard normal would produce.]

Standardization

For X ~ N(µ, σ²), the z-value corresponding to a value x is

z = (x - µ)/σ

Any time someone says z-value or z-score, they are just talking about how many standard deviations we are away from the mean under a bell curve.
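
In code the z-value is a one-liner; this small helper (the function name is my own) reproduces the return example used below:

    def z_value(x, mu, sigma):
        """Number of standard deviations x lies from the mean mu."""
        return (x - mu) / sigma

    # For R ~ N(0.01, 0.04^2), the endpoints 0 and 0.05 standardize to:
    print(z_value(0.00, 0.01, 0.04))  # -0.25
    print(z_value(0.05, 0.01, 0.04))  #  1.0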

Normal Probabilities and Standardization

Suppose a return is distributed R ~ N(0.01, 0.04²). What is the probability of a return between 0 and 0.05? In lecture #5, we calculated this as:

P(0 < R < 0.05) = F_R(0.05) - F_R(0) = 0.8413 - 0.4013 = 0.44

where we used =NORMDIST(0.0, 0.01, 0.04, TRUE) = 0.4013.

Standardization

For X ~ N(µ, σ²),

Pr(a < X < b) = Pr((a - µ)/σ < Z < (b - µ)/σ), where Z ~ N(0, 1).

For a normal r.v., we can always calculate the probability of an interval (a, b) by transforming the interval to ((a - µ)/σ, (b - µ)/σ) and comparing it to a standard normal r.v. Before computers were common, we looked up probabilities in the tables at the back of a stats book!

Normal Probabilities and Standardization

An alternative way to do this is to first standardize the values 0 and 0.05. This is equivalent to Z being between (0 - 0.01)/0.04 = -0.25 and (0.05 - 0.01)/0.04 = 1. Using the normal CDF in Excel:

=NORMDIST(-0.25, 0, 1, TRUE) = 0.4013
=NORMDIST(1, 0, 1, TRUE) = 0.8413

Pr(0 < R < 0.05) = Pr(-0.25 < Z < 1) = 0.84 - 0.40 = 0.44
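
The slides use Excel's NORMDIST; the same check in Python with scipy (my translation, not part of the course materials) gives identical numbers:

    from scipy.stats import norm

    # Directly, with R ~ N(0.01, 0.04^2):
    p_direct = norm.cdf(0.05, loc=0.01, scale=0.04) - norm.cdf(0.0, loc=0.01, scale=0.04)

    # After standardizing the endpoints to -0.25 and 1:
    p_std = norm.cdf(1.0) - norm.cdf(-0.25)

    print(p_direct, p_std)  # both approximately 0.44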

Histograms and IID Draws


Histograms and IID Draws

Here is a histogram of 1000 draws from the standard normal distribution, i.e. Z ~ N(0, 1). The height of each bar tells us the number of observations in each interval. All the intervals have the same width.

[Figure: histogram of 1000 standard normal draws, counts on the vertical axis, z from -4 to 4.]

Histograms and IID Draws

If we divide the height of each bar by the width times 1000, the picture looks the same, but now the area of each bar equals the % of observations in the interval.

[Figure: the same histogram rescaled to density, z from -4 to 4.]

This is just a fancy way of scaling the histogram so that the total area of all the bars equals 1. It looks the same, but the vertical scale is different.

Histograms and IID Draws

For a large number of i.i.d. draws, the observed percent in an interval should be close to the probability. Note two things:
1. For the pdf, the area is the probability of the interval.
2. In the histogram, the area is the observed percent in the interval.

[Figure: density-scaled histogram with the standard normal pdf overlaid.]

Histograms and IID Draws

As the number of draws gets larger, the histogram gets closer to the pdf! It looks like a bell curve.

[Figure: four density-scaled histograms of standard normal draws, for n = 100, n = 500, n = 2000, and n = 1 million; the last is nearly indistinguishable from the bell curve.]

Histograms and IID Draws

The (normalized) histogram of a large number of i.i.d. draws from any continuous distribution should look like the p.d.f.
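
A quick way to check this claim yourself, sketched in Python (assuming numpy, scipy, and matplotlib are available; this is not the course's code):

    import numpy as np
    import matplotlib.pyplot as plt
    from scipy.stats import norm

    rng = np.random.default_rng(1)
    z = rng.standard_normal(1_000_000)  # i.i.d. N(0,1) draws

    # density=True rescales bar heights so the total bar area equals 1,
    # making the histogram directly comparable to the pdf.
    plt.hist(z, bins=100, density=True)
    grid = np.linspace(-4, 4, 400)
    plt.plot(grid, norm.pdf(grid))      # overlay the N(0,1) pdf
    plt.show()

Swapping rng.standard_normal for rng.uniform(2, 5, size=1_000_000) reproduces the uniform example on the next slide.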

Histograms and IID Draws

Here is another example for uniform random variables, X ~ U(2, 5).

[Figure: density-scaled histograms of U(2, 5) draws for n = 100, 500, 2000, and 1 million; the histogram flattens toward the uniform pdf on (2, 5).]

Histograms and IID Draws

Here is another example from a random variable with a skewed distribution.

[Figure: density-scaled histograms of draws from a right-skewed distribution on roughly (0, 10) for n = 100, 500, 2000, and 1 million.]

Histograms and IID Draws

Can we use this to do probability calculations?... Yes! For example, suppose Z ~ N(0, 1) and we want to know P(Z < -1.5).

Step 1. Using Excel, simulate 1,000 i.i.d. draws from the standard normal distribution.
Step 2. Determine the percentage of these draws that are less than -1.5.
Step 3. This is (approximately) the probability we are looking for.

(NOTE: This is also true for discrete random variables. And, the approximation gets better the larger the number of draws.)
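
The same three steps in Python rather than Excel (a sketch; the seed and sample size are arbitrary):

    import numpy as np
    from scipy.stats import norm

    rng = np.random.default_rng(2)
    z = rng.standard_normal(1000)  # Step 1: 1,000 i.i.d. N(0,1) draws
    p_hat = np.mean(z < -1.5)      # Step 2: fraction of draws below -1.5
    print(p_hat, norm.cdf(-1.5))   # Step 3: compare to the exact value, about 0.0668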

The Law of Large Numbers

The Law of Large Numbers

In lecture #3, we learned that one possible interpretation of probability is long-run frequency. In other words, if we were to repeat a random experiment over and over and over again, the probability of an event happening is the frequency with which it happens after a large number of identical experiments.

The Law of Large Numbers

Consider tossing a fair coin repeatedly. Let Y_i = 1 if the i-th toss is a head and zero otherwise.

Let X_1 = Y_1.
Let X_2 = (1/2)(Y_1 + Y_2).
Let X_3 = (1/3)(Y_1 + Y_2 + Y_3).
...
Let X_n = (1/n)(Y_1 + Y_2 + Y_3 + ... + Y_n) = (1/n) Σ_{i=1}^n Y_i.

Remember that X_n = (1/n) Σ_{i=1}^n Y_i. This is the plot of X_j for j = 1, ..., 5000.

[Figure: running sample mean X_n against the number of coin tosses, 0 to 5000; the path settles down near 0.5.]

Notice that the sample mean X_n is getting closer to the true mean p = 0.5 as we increase n.
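
A plot like this is easy to reproduce (a sketch, not the course's code; the seed is arbitrary):

    import numpy as np
    import matplotlib.pyplot as plt

    rng = np.random.default_rng(3)
    y = rng.integers(0, 2, size=5000)          # fair-coin tosses: 1 = heads
    x_bar = np.cumsum(y) / np.arange(1, 5001)  # running sample mean X_n

    plt.plot(x_bar)
    plt.axhline(0.5, linestyle="--")           # the true mean p = 0.5
    plt.xlabel("Number of Coin Tosses")
    plt.show()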

The Law of Large Numbers

As n becomes large,

E[X] ≈ (1/n) Σ_{i=1}^n x_i

where the x_i's are outcomes from i.i.d. draws all having the same distribution as X. This is an example of the Law of Large Numbers.

Law of Large Numbers: why it works

Remember our example from Lecture #3 where we tossed two coins. Let X equal the number of heads in two tosses. Suppose we toss two coins ten times. Each time we record the number of heads:

1 0 2 1 0 1 2 0 2 0

Question: what is the average value?

x̄ = (4·0 + 3·1 + 3·2)/10 = 0.9

Law of Large Numbers: why it works

Now suppose we toss two coins 1000 times. What is the sample mean?

2 1 1 2 2 2 1 2 2 0 2 1 1 2 1 2 0 0 1 0 1 0 1 2 1 1 1 1 2 2 1 1 1 1 1 1 1 1 0 0 1 0 2 1 1 0 2 1 2 2 1 2 1 1 0 1 2 1 1 1 1 1 2 1 1 1 1 1 2 1 1 2 2 1 2 1 1 2 1 1 1 0 0 2 2 0 1 1 0 1 2 1 1 0 1 1 1 1 1 2 2 0 2 1 1 1 0 1 1 1 1 0 2 2 0 0 1 0 2 2 2 1 1 0 1 1 1 0 2 2 0 1 0 2 1 0 1 0 0 2 1 2 1 1 0 0 2 1 1 1 1 1 2 1 1 1 1 0 1 0 0 1 1 0 2 1 0 1 0 1 1 2 0 1 1 1 0 1 1 1 1 1 0 0 1 1 2 1 0 0 1 0 2 1 1 2 1 1 1 1 1 1 0 1 1 1 1 0 2 1 1 2 2 1 2 2 2 2 0 1 1 0 2 0 1 0 2 1 1 1 1 1 1 0 2 2 1 1 ...

Law of Large Numbers: why it works

Well, of course we can just have the computer figure it out, but let us think about this for a minute. What should the mean be? Let n_0, n_1, and n_2 be the number of 0's, 1's, and 2's, respectively. Then, the average would be:

(n_0/n)·0 + (n_1/n)·1 + (n_2/n)·2

This appears similar to the expectation E[X], but we're just weighting each outcome by its frequency instead of its probability!

Law of Large Numbers: why it works

(n_0/n)·0 + (n_1/n)·1 + (n_2/n)·2

Now note that the possible outcomes of each experiment are i.i.d. draws from the discrete distribution:

x    P(x)
0    0.25
1    0.50
2    0.25

Law of Large Numbers: why it works

As the number of draws n gets larger, we should have:

n_0/n ≈ 0.25    n_1/n ≈ 0.5    n_2/n ≈ 0.25

Hence, the average should be about:

0.25·0 + 0.5·1 + 0.25·2 = 1

But this is just the expected value of the random variable X.

Law of Large Numbers: why it works

The actual sample mean from the 1000 tosses was x̄ = 1.0110. Hence, with a very, very large number of tosses we would expect the sample mean to be very close to 1 (the expected value).

To summarize, we can think of the expected value, which in this case is equal to:

p_X(0)·0 + p_X(1)·1 + p_X(2)·2 = 1

as the long-run average of i.i.d. draws.
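
The frequency argument is easy to verify by simulation (my sketch; the counts will differ run to run):

    import numpy as np

    rng = np.random.default_rng(4)
    x = rng.binomial(2, 0.5, size=1000)  # heads in two fair tosses, 1000 repeats

    counts = np.bincount(x, minlength=3) # how many 0's, 1's, and 2's we saw
    print(counts / 1000)                 # frequencies near (0.25, 0.50, 0.25)
    print(x.mean())                      # sample mean near E[X] = 1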

The Law of Large Numbers

The Law of Large Numbers is also true for functions f(·) of X:

E[f(X)] ≈ (1/n) Σ_{i=1}^n f(x_i)

Example: Consider the function f(x) = (x - µ)². Then

V[X] = E[f(X)] ≈ (1/n) Σ_{i=1}^n (x_i - µ)²

This implies that we can use the sample variance s²_x as an approximation of the true variance!

The Law of Large Numbers

Example: Let's return to the example where we tossed 2 coins 1000 times. The sample mean from the 1000 tosses was x̄ = 1.0110. The sample variance from the 1000 tosses was s²_x = 0.51. If X is the number of heads out of two coin tosses:

V[X] = 0.25·(0 - 1)² + 0.50·(1 - 1)² + 0.25·(2 - 1)² = 0.5

The Law of Large Numbers

Thus, for large samples the sample quantities that we can compute from our observed data should be similar to the quantities we talked about for random variables:

V[X] ≈ (1/n) Σ_{i=1}^n (x_i - x̄)² ≈ (1/(n-1)) Σ_{i=1}^n (x_i - x̄)²

This is true if we are taking i.i.d. draws!
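
Continuing the coin simulation above, the sample variance settles on the true V[X] = 0.5 as n grows (a sketch; the seed is arbitrary):

    import numpy as np

    rng = np.random.default_rng(5)
    x = rng.binomial(2, 0.5, size=1_000_000)

    # The 1/n and 1/(n-1) versions are essentially identical for large n,
    # and both are close to the true variance 0.5.
    print(x.var(ddof=0), x.var(ddof=1))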

The Central Limit Theorem

The Central Limit Theorem

The central limit theorem (CLT) says that the average of a large number of independent random variables is (approximately) normally distributed. Another way of saying this is: suppose that X_1, X_2, ..., X_n are i.i.d. random variables and let Y = (X_1 + X_2 + ... + X_n)/n. As n gets large,

Y is approximately N(µ_Y, σ²_Y)

The Central Limit Theorem

What is so special about this? Notice that although we did assume that the X_i's are i.i.d., we DID NOT say what distribution they have. That's right! The CLT says: the average of a large number of independent random variables is (approximately) normally distributed, no matter what distribution the individual random variables have!

The Central Limit Theorem

Example: Consider the binomial distribution. Define Y = X_1 + X_2 + ... + X_n, where the X_i are i.i.d. Bernoulli(p).

[Figure: four panels showing the Binomial(5, 0.2), Binomial(25, 0.2), Binomial(100, 0.2), and Binomial(500, 0.2) distributions; the shape becomes increasingly bell-shaped as n grows.]

The Central Limit Theorem

1. As we increase n, the distribution of Y gets closer and closer to a normal distribution with the same mean and variance as the binomial.
2. In the graph on the right, I have plotted the binomial distribution (blue) on top of the normal distribution (red) with p = 0.2 and n = 100.

[Figure: Binomial(n, p) pmf overlaid with the N(np, np(1 - p)) density for n = 100, p = 0.2.]
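
Redrawing that comparison in Python (a sketch; the colors and labels follow the slide's description):

    import numpy as np
    import matplotlib.pyplot as plt
    from scipy.stats import binom, norm

    n, p = 100, 0.2
    k = np.arange(0, n + 1)

    plt.bar(k, binom.pmf(k, n, p), color="blue", label="Binomial(n,p)")
    grid = np.linspace(0, n, 500)
    plt.plot(grid, norm.pdf(grid, loc=n * p, scale=np.sqrt(n * p * (1 - p))),
             color="red", label="N(np, np(1-p))")
    plt.legend()
    plt.show()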

How good is the approximation?

Your company is about to manufacture 100 parts. Suppose defects are i.i.d. X_i ~ Bernoulli(0.1). Let Y = X_1 + X_2 + ... + X_100 be the number of defects. Then Y ~ Binomial(100, 0.1).

E[Y] = np = 100 · 0.1 = 10
σ_Y = sqrt(np(1 - p)) = sqrt(100 · 0.1 · 0.9) = 3

How good is the approximation?

Even though Y ~ Binomial(100, 0.1), let us use the normal approximation first. Let the normal distribution have the same mean and variance:

Y ~ N(10, 9)

Based on the normal approximation, there is a 95% chance that the number of defects is in the interval:

(µ - 2σ_Y, µ + 2σ_Y) = 10 ± 6 = (4, 16)

Example: We can compare that to the exact answer based on the binomial probabilities. What is the correct binomial probability of obtaining between 4 and 16 defective parts? If the normal approximation is good, the exact number should be close to 0.95. Let us see if this is the case...

P(4 < Y < 16) = F(16) - F(4) = 0.9794 - 0.0237 = 0.9557

The normal approximation appears to be pretty good.
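
The same comparison in Python (a sketch; the slides compute the exact answer from binomial CDF values):

    from scipy.stats import binom, norm

    n, p = 100, 0.1

    # Exact: P(4 < Y <= 16) from the Binomial(100, 0.1) CDF.
    p_exact = binom.cdf(16, n, p) - binom.cdf(4, n, p)

    # Normal approximation with matching mean 10 and sd 3.
    p_approx = norm.cdf(16, loc=10, scale=3) - norm.cdf(4, loc=10, scale=3)

    print(p_exact, p_approx)  # about 0.9557 vs. 0.9545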