Central Limit Theorem, Joint Distributions Spring 2018

Central Limit Theorem, Joint Distributions
18.05 Spring 2018
[Figure: standard normal density curve, horizontal axis from -4 to 4]

Exam next Wednesday
Exam 1 on Wednesday March 7, regular room and time.
Designed for 1 hour. You will have the full 80 minutes.
Class on Monday will be review.
Practice materials posted.
Learn to use the standard normal table for the exam.
No books or calculators.
You may have one 4 × 6 notecard with any information you like.
February 27, 2018

The bell-shaped curve
[Figure: graph of φ(z) for z from -4 to 4, vertical axis from 0 to 0.5]
This is the standard normal distribution N(0, 1): $\varphi(z) = \frac{1}{\sqrt{2\pi}} e^{-z^2/2}$
N(0, 1) means that the mean is µ = 0 and the std deviation is σ = 1.
Normal with mean µ, std deviation σ is N(µ, σ): $\varphi_{\mu,\sigma}(z) = \frac{1}{\sigma\sqrt{2\pi}} e^{-(z-\mu)^2/2\sigma^2}$

Lots of normal distributions
[Figure: densities of N(0, 1), N(4.5, 0.5), N(4.5, 2.25), N(6.5, 1.0) and N(8.0, 0.5), horizontal axis from -4 to 10, vertical axis from 0 to 0.7]

Standardization
Random variable X with mean µ, standard deviation σ.
Standardization: $Y = \frac{X - \mu}{\sigma}$.
Y has mean 0 and standard deviation 1.
Standardizing any normal random variable produces the standard normal.
If X ≈ normal then standardized X ≈ standard normal.
We reserve Z to mean a standard normal random variable.

Board Question: Standardization
Here are the pmfs for four (binomial) random variables X. Standardize them, and make bar graphs of the standardized distributions. Each bar should have area equal to the probability of that value.
(Each bar has width 1/σ, so each bar has height pmf · σ.)

X    n = 0   n = 1   n = 4   n = 9
0    1       1/2     1/16    1/512
1    0       1/2     4/16    9/512
2    0       0       6/16    36/512
3    0       0       4/16    84/512
4    0       0       1/16    126/512
5    0       0       0       126/512
6    0       0       0       84/512
7    0       0       0       36/512
8    0       0       0       9/512
9    0       0       0       1/512
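The standardization step can be sketched in Python. This is a minimal illustration, assuming the four variables are Binomial(n, 0.5) as the table suggests; the function name `standardized_binomial_bars` is made up for this example. The case n = 0 is left out because σ = 0 there, so the standardization is undefined.

```python
from math import comb, sqrt

def standardized_binomial_bars(n, p=0.5):
    """Return (z, height) pairs for the bar graph of a standardized Binomial(n, p)."""
    mu = n * p
    sigma = sqrt(n * p * (1 - p))
    bars = []
    for k in range(n + 1):
        pmf = comb(n, k) * p**k * (1 - p)**(n - k)
        z = (k - mu) / sigma            # standardized location of the bar
        bars.append((z, pmf * sigma))   # bar width is 1/sigma, so height = pmf * sigma
    return bars

for n in (1, 4, 9):
    print(n, [(round(z, 2), round(h, 3)) for z, h in standardized_binomial_bars(n)])
```

Each bar's area is height × (1/σ), which recovers the original probability, so the bar areas still sum to 1.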

Concept Question: Normal Distribution
X has normal distribution, standard deviation σ.
[Figure: normal pdf with horizontal axis marked -3σ, -2σ, -σ, σ, 2σ, 3σ; within 1·σ ≈ 68%, within 2·σ ≈ 95%, within 3·σ ≈ 99%]
1. P(-σ < X < σ) is
(a) 0.025 (b) 0.16 (c) 0.68 (d) 0.84 (e) 0.95
2. P(X > 2σ)
(a) 0.025 (b) 0.16 (c) 0.68 (d) 0.84 (e) 0.95
answer: 1c, 2a
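The rule-of-thumb percentages can be checked numerically. The sketch below assumes X has mean 0, as the figure suggests, so that P(-kσ < X < kσ) = P(|Z| < k); the helper name `Phi` is chosen for this example and computes the standard normal cdf from the error function in Python's standard library.

```python
from math import erf, sqrt

def Phi(z):
    """Standard normal cdf, written in terms of the error function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

for k in (1, 2, 3):
    print(f"P(|Z| < {k}) = {Phi(k) - Phi(-k):.4f}")
print(f"P(Z > 2)   = {1 - Phi(2):.4f}")
```

The computed values are close to, but not exactly, the rounded figures 68%, 95%, 99% and 0.025 used in the question.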

Central Limit Theorem
Setting: $X_1, X_2, \ldots$ i.i.d. with mean µ and standard dev. σ.
For each n:
$\bar{X}_n = \frac{1}{n}(X_1 + X_2 + \ldots + X_n)$ (average)
$S_n = X_1 + X_2 + \ldots + X_n$ (sum)
Conclusion: For large n:
$\bar{X}_n \approx N\left(\mu, \frac{\sigma^2}{n}\right)$
$S_n \approx N(n\mu, n\sigma^2)$
Standardized ($S_n$ or $\bar{X}_n$) ≈ N(0, 1)
That is, $\frac{S_n - n\mu}{\sqrt{n}\,\sigma} = \frac{\bar{X}_n - \mu}{\sigma/\sqrt{n}} \approx N(0, 1)$.
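A small simulation can make the conclusion concrete. This is only a sketch: it assumes Exponential(1) summands (so µ = σ = 1), n = 64, and a fixed seed, none of which come from the statement itself. It checks that the standardized average has mean near 0, standard deviation near 1, and about 68% of its mass within one unit of 0.

```python
import random
from statistics import mean, pstdev

def standardized_average(n, rng):
    """Standardized average of n i.i.d. Exponential(1) draws (mu = 1, sigma = 1)."""
    xbar = sum(rng.expovariate(1.0) for _ in range(n)) / n
    return (xbar - 1.0) / (1.0 / n ** 0.5)

rng = random.Random(0)  # fixed seed so the run is repeatable
samples = [standardized_average(64, rng) for _ in range(10000)]
frac_within_1 = sum(abs(z) < 1 for z in samples) / len(samples)
print(mean(samples), pstdev(samples), frac_within_1)
```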

CLT: pictures
The standardized average of n i.i.d. Bernoulli(0.5) random variables with n = 1, 2, 12, 64.
[Figure: four density plots, one for each n]

CLT: pictures 2
Standardized average of n i.i.d. uniform random variables with n = 1, 2, 4, 12.
[Figure: four density plots, one for each n]

CLT: pictures 3
The standardized average of n i.i.d. exponential random variables with n = 1, 2, 8, 64.
[Figure: four density plots, one for each n]

CLT: pictures
The non-standardized average of n Bernoulli(0.5) random variables, with n = 4, 12, 64. Spikier.
[Figure: three density plots, one for each n]

Table Question: Sampling from the standard normal distribution
As a table, produce two random samples from (an approximate) standard normal distribution.
To make each sample, the table is allowed nine rolls of the 10-sided die.
Hint: µ = 5.5 and σ² ≈ 8 for a single 10-sided die. CLT is about averages.
answer: The average of 9 rolls is a sample from the average of 9 independent random variables. The CLT says this average is approximately normal with µ = 5.5 and $\sigma = \sqrt{8.25}/\sqrt{9} \approx 0.96$.
If $\bar{x}$ is the average of 9 rolls then standardizing we get $z = \frac{\bar{x} - 5.5}{0.96}$, which is (approximately) a sample from N(0, 1).
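The recipe can be sketched in Python. The sketch assumes a fair die with faces 1 through 10 (which is what µ = 5.5 implies) and computes the standard deviation of the average as the square root of variance/n; the seed and function name are illustrative only.

```python
import random
from statistics import mean, pvariance

faces = list(range(1, 11))       # assumed: fair 10-sided die with faces 1..10
mu = mean(faces)                 # mean of a single roll
var = pvariance(faces)           # variance of a single roll
n = 9                            # rolls per sample
sigma_avg = (var / n) ** 0.5     # standard deviation of the average of n rolls

def approx_standard_normal_sample(rng):
    """Average n rolls, then standardize the average."""
    rolls = [rng.randint(1, 10) for _ in range(n)]
    return (mean(rolls) - mu) / sigma_avg

rng = random.Random(1)  # fixed seed so the run is repeatable
print(mu, var, round(sigma_avg, 4))
print([round(approx_standard_normal_sample(rng), 3) for _ in range(2)])
```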

Board Question: CLT
1. Carefully write the statement of the central limit theorem.
2. To head the newly formed US Dept. of Statistics, suppose that 50% of the population supports Ani, 25% supports Ruthi, and the remaining 25% is split evenly between Efrat, Elan, David and Jerry.
A poll asks 400 random people who they support. What is the probability that at least 55% of those polled prefer Ani?
3. What is the probability that less than 20% of those polled prefer Ruthi?
answer: On next slide.

Solution
answer: 2. Let A be the fraction polled who support Ani. So A is the average of 400 Bernoulli(0.5) random variables. That is, let $X_i = 1$ if the ith person polled prefers Ani and 0 if not, so A = average of the $X_i$.
The question asks for the probability A > 0.55.
Each $X_i$ has µ = 0.5 and σ² = 0.25. So, E(A) = 0.5 and $\sigma_A^2 = 0.25/400$, or $\sigma_A = 1/40 = 0.025$.
Because A is the average of 400 Bernoulli(0.5) variables the CLT says it is approximately normal and standardizing gives
$\frac{A - 0.5}{0.025} \approx Z$
So
$P(A > 0.55) \approx P(Z > 2) \approx 0.025$
Continued on next slide

Solution continued
3. Let R be the fraction polled who support Ruthi.
The question asks for the probability that R < 0.2.
Similar to problem 2, R is the average of 400 Bernoulli(0.25) random variables. So
E(R) = 0.25 and $\sigma_R^2 = (0.25)(0.75)/400 \Rightarrow \sigma_R = \sqrt{3}/80$.
So $\frac{R - 0.25}{\sqrt{3}/80} \approx Z$. So,
$P(R < 0.2) \approx P(Z < -4/\sqrt{3}) \approx 0.0105$
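Both poll answers can be reproduced without a normal table. The following sketch assumes a poll of 400 people and uses the error function for the standard normal cdf; the helper names `Phi` and `clt_fraction_params` are made up for this example.

```python
from math import erf, sqrt

def Phi(z):
    """Standard normal cdf, written in terms of the error function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

def clt_fraction_params(p, n):
    """Mean and standard deviation of the fraction of successes in n Bernoulli(p) trials."""
    return p, sqrt(p * (1 - p) / n)

n = 400
mu_A, sd_A = clt_fraction_params(0.5, n)    # Ani
mu_R, sd_R = clt_fraction_params(0.25, n)   # Ruthi

p_ani = 1 - Phi((0.55 - mu_A) / sd_A)       # P(A > 0.55), i.e. P(Z > 2)
p_ruthi = Phi((0.20 - mu_R) / sd_R)         # P(R < 0.20), i.e. P(Z < -4/sqrt(3))
print(round(p_ani, 4), round(p_ruthi, 4))
```

The first value differs slightly from 0.025 because the slide uses the rule of thumb that 95% of the mass lies within 2σ.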

Bonus problem
Not for class. Solution will be posted with the slides.
An accountant rounds to the nearest dollar. We'll assume the error in rounding is uniform on [-0.5, 0.5]. Estimate the probability that the total error in 300 entries is more than $5.
answer: Let $X_j$ be the error in the jth entry, so $X_j \sim U(-0.5, 0.5)$.
We have $E(X_j) = 0$ and $\mathrm{Var}(X_j) = 1/12$.
The total error $S = X_1 + \ldots + X_{300}$ has E(S) = 0, Var(S) = 300/12 = 25, and $\sigma_S = 5$.
Standardizing we get, by the CLT, S/5 is approximately standard normal. That is, S/5 ≈ Z.
So P(S < -5 or S > 5) ≈ P(Z < -1 or Z > 1) ≈ 0.32.
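As a sanity check on the CLT estimate, the total rounding error can be simulated directly. This is a rough Monte Carlo sketch, assuming 300 independent Uniform(-0.5, 0.5) errors; the seed, trial count and function name are arbitrary choices for illustration.

```python
import random

def total_rounding_error(n_entries, rng):
    """Sum of n_entries independent Uniform(-0.5, 0.5) rounding errors."""
    return sum(rng.uniform(-0.5, 0.5) for _ in range(n_entries))

rng = random.Random(2018)  # fixed seed so the run is repeatable
trials = 5000
hits = sum(abs(total_rounding_error(300, rng)) > 5 for _ in range(trials))
estimate = hits / trials   # Monte Carlo estimate of P(|S| > 5)
print(estimate)
```

The estimate should land near the CLT value of about 0.32, up to sampling noise.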

Joint Distributions
X and Y are jointly distributed random variables.
Discrete: probability mass function (pmf): $p(x_i, y_j)$
Continuous: probability density function (pdf): $f(x, y)$
Both: cumulative distribution function (cdf): $F(x, y) = P(X \le x, Y \le y)$

Discrete joint pmf: example 1
Roll two dice: X = # on first die, Y = # on second die
X takes values in 1, 2, ..., 6, Y takes values in 1, 2, ..., 6
Joint probability table:

X\Y   1      2      3      4      5      6
1     1/36   1/36   1/36   1/36   1/36   1/36
2     1/36   1/36   1/36   1/36   1/36   1/36
3     1/36   1/36   1/36   1/36   1/36   1/36
4     1/36   1/36   1/36   1/36   1/36   1/36
5     1/36   1/36   1/36   1/36   1/36   1/36
6     1/36   1/36   1/36   1/36   1/36   1/36

pmf: p(i, j) = 1/36 for any i and j between 1 and 6.

Discrete joint pmf: example 2
Roll two dice: X = # on first die, T = total on both dice

X\T   2      3      4      5      6      7      8      9      10     11     12
1     1/36   1/36   1/36   1/36   1/36   1/36   0      0      0      0      0
2     0      1/36   1/36   1/36   1/36   1/36   1/36   0      0      0      0
3     0      0      1/36   1/36   1/36   1/36   1/36   1/36   0      0      0
4     0      0      0      1/36   1/36   1/36   1/36   1/36   1/36   0      0
5     0      0      0      0      1/36   1/36   1/36   1/36   1/36   1/36   0
6     0      0      0      0      0      1/36   1/36   1/36   1/36   1/36   1/36

Continuous joint distributions
X takes values in [a, b], Y takes values in [c, d].
(X, Y) takes values in [a, b] × [c, d].
Joint probability density function (pdf): f(x, y).
f(x, y) dx dy is the probability of being in the small square.
[Figure: the rectangle [a, b] × [c, d] in the xy-plane, with a small dx-by-dy square labeled Prob. = f(x, y) dx dy]

Properties of the joint pmf and pdf
Discrete case: probability mass function (pmf)
1. $0 \le p(x_i, y_j) \le 1$
2. Total probability is 1: $\sum_{i=1}^{n} \sum_{j=1}^{m} p(x_i, y_j) = 1$
Continuous case: probability density function (pdf)
1. $0 \le f(x, y)$
2. Total probability is 1: $\int_c^d \int_a^b f(x, y)\,dx\,dy = 1$
Note: f(x, y) can be greater than 1: it is a density, not a probability.

Example: discrete events
Roll two dice: X = # on first die, Y = # on second die.
Consider the event: A = 'Y - X ≥ 2'
Describe the event A and find its probability.
answer: We can describe A as a set of (X, Y) pairs:
A = {(1, 3), (1, 4), (1, 5), (1, 6), (2, 4), (2, 5), (2, 6), (3, 5), (3, 6), (4, 6)}.
Or we can visualize it by shading the table (shaded cells are marked with *):

X\Y   1      2      3      4      5      6
1     1/36   1/36   1/36*  1/36*  1/36*  1/36*
2     1/36   1/36   1/36   1/36*  1/36*  1/36*
3     1/36   1/36   1/36   1/36   1/36*  1/36*
4     1/36   1/36   1/36   1/36   1/36   1/36*
5     1/36   1/36   1/36   1/36   1/36   1/36
6     1/36   1/36   1/36   1/36   1/36   1/36

P(A) = sum of probabilities in shaded cells = 10/36.
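The event can also be found by brute-force enumeration. This short sketch assumes two fair six-sided dice and uses exact fractions; the variable names are illustrative.

```python
from fractions import Fraction

# All 36 equally likely (X, Y) outcomes for two fair dice.
outcomes = [(x, y) for x in range(1, 7) for y in range(1, 7)]

# The event A = 'Y - X >= 2' as a list of outcomes.
A = [(x, y) for (x, y) in outcomes if y - x >= 2]

p_A = Fraction(len(A), len(outcomes))
print(A)
print(p_A)  # Fraction reduces 10/36 to lowest terms
```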

Example: continuous events
Suppose (X, Y) takes values in [0, 1] × [0, 1].
Uniform density f(x, y) = 1.
Visualize the event 'X > Y' and find its probability.
answer:
[Figure: the unit square with the region below the line y = x shaded and labeled 'X > Y']
The event takes up half the square. Since the density is uniform this is half the probability. That is, P(X > Y) = 0.5.

Cumulative distribution function
$F(x, y) = P(X \le x, Y \le y) = \int_c^y \int_a^x f(u, v)\,du\,dv$.
$f(x, y) = \frac{\partial^2 F}{\partial x\,\partial y}(x, y)$.
Properties
1. F(x, y) is non-decreasing. That is, as x or y increases F(x, y) increases or remains constant.
2. F(x, y) = 0 at the lower left of its range. If the lower left is (-∞, -∞) then this means $\lim_{(x,y) \to (-\infty,-\infty)} F(x, y) = 0$.
3. F(x, y) = 1 at the upper right of its range.

Marginal pmf and pdf
Roll two dice: X = # on first die, T = total on both dice.
The marginal pmf of X is found by summing the rows. The marginal pmf of T is found by summing the columns.

X\T      2      3      4      5      6      7      8      9      10     11     12     p(x_i)
1        1/36   1/36   1/36   1/36   1/36   1/36   0      0      0      0      0      1/6
2        0      1/36   1/36   1/36   1/36   1/36   1/36   0      0      0      0      1/6
3        0      0      1/36   1/36   1/36   1/36   1/36   1/36   0      0      0      1/6
4        0      0      0      1/36   1/36   1/36   1/36   1/36   1/36   0      0      1/6
5        0      0      0      0      1/36   1/36   1/36   1/36   1/36   1/36   0      1/6
6        0      0      0      0      0      1/36   1/36   1/36   1/36   1/36   1/36   1/6
p(t_j)   1/36   2/36   3/36   4/36   5/36   6/36   5/36   4/36   3/36   2/36   1/36   1

For continuous distributions the marginal pdf $f_X(x)$ is found by integrating out the y. Likewise for $f_Y(y)$.
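Summing rows and columns is easy to mirror in code. The sketch below assumes two fair dice, builds the joint pmf of (X, T) as a dictionary, and accumulates the two marginals with exact fractions; the names `joint`, `p_X` and `p_T` are chosen for this example.

```python
from collections import defaultdict
from fractions import Fraction

# Joint pmf of (X, T): first die x, total t = x + y, each (x, y) pair has probability 1/36.
joint = {(x, x + y): Fraction(1, 36) for x in range(1, 7) for y in range(1, 7)}

p_X = defaultdict(Fraction)  # marginal pmf of X (row sums)
p_T = defaultdict(Fraction)  # marginal pmf of T (column sums)
for (x, t), p in joint.items():
    p_X[x] += p
    p_T[t] += p

print({x: str(p) for x, p in sorted(p_X.items())})
print({t: str(p) for t, p in sorted(p_T.items())})
```

Fractions are printed in lowest terms, so for instance 6/36 appears as 1/6.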

Board question
Suppose X and Y are random variables and
(X, Y) takes values in [0, 1] × [0, 1],
the pdf is $f(x, y) = \frac{3}{2}(x^2 + y^2)$.
1. Show f(x, y) is a valid pdf.
2. Visualize the event A = 'X > 0.3 and Y > 0.5'. Find its probability.
3. Find the cdf F(x, y).
4. Find the marginal pdf $f_X(x)$. Use this to find P(X < 0.5).
5. Use the cdf F(x, y) to find the marginal cdf $F_X(x)$ and P(X < 0.5).
6. See next slide

Board question continued
6. (New scenario) From the following table compute F(3.5, 4).

X\Y   1      2      3      4      5      6
1     1/36   1/36   1/36   1/36   1/36   1/36
2     1/36   1/36   1/36   1/36   1/36   1/36
3     1/36   1/36   1/36   1/36   1/36   1/36
4     1/36   1/36   1/36   1/36   1/36   1/36
5     1/36   1/36   1/36   1/36   1/36   1/36
6     1/36   1/36   1/36   1/36   1/36   1/36

answer: See next slide

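Solution
answer: 1. Validity: Clearly f(x, y) is nonnegative. Next we must show that total probability = 1:
$\int_0^1 \int_0^1 \frac{3}{2}(x^2 + y^2)\,dx\,dy = \int_0^1 \left[\frac{1}{2}x^3 + \frac{3}{2}xy^2\right]_0^1 dy = \int_0^1 \left(\frac{1}{2} + \frac{3}{2}y^2\right) dy = 1.$
2. Here's the visualization
[Figure: the unit square with the rectangle A = [0.3, 1] × [0.5, 1] shaded]
The pdf is not constant so we must compute an integral
$P(A) = \int_{0.3}^{1} \int_{0.5}^{1} \frac{3}{2}(x^2 + y^2)\,dy\,dx = \int_{0.3}^{1} \left[\frac{3}{2}x^2 y + \frac{1}{2}y^3\right]_{0.5}^{1} dx$
(continued)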

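Solutions 2, 3, 4, 5
2. (continued) $= \int_{0.3}^{1} \left(\frac{3x^2}{4} + \frac{7}{16}\right) dx = 0.5495$
3. $F(x, y) = \int_0^y \int_0^x \frac{3}{2}(u^2 + v^2)\,du\,dv = \frac{x^3 y}{2} + \frac{x y^3}{2}.$
4. $f_X(x) = \int_0^1 \frac{3}{2}(x^2 + y^2)\,dy = \left[\frac{3}{2}x^2 y + \frac{y^3}{2}\right]_0^1 = \frac{3}{2}x^2 + \frac{1}{2}$
$P(X < 0.5) = \int_0^{0.5} f_X(x)\,dx = \int_0^{0.5} \left(\frac{3}{2}x^2 + \frac{1}{2}\right) dx = \left[\frac{1}{2}x^3 + \frac{1}{2}x\right]_0^{0.5} = \frac{5}{16}.$
5. To find the marginal cdf $F_X(x)$ we simply take y to be the top of the y-range and evaluate F: $F_X(x) = F(x, 1) = \frac{1}{2}(x^3 + x)$.
Therefore $P(X < 0.5) = F_X(0.5) = \frac{1}{2}\left(\frac{1}{8} + \frac{1}{2}\right) = \frac{5}{16}$.
6. On next slide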
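The integrals above can be double-checked numerically. The sketch below is an assumption-laden illustration rather than part of the solution: it uses a simple midpoint rule on a 200 × 200 grid (the helper `double_integral` is written for this example) and compares against the closed-form cdf from part 3.

```python
def f(x, y):
    """Joint pdf from the board question on [0, 1] x [0, 1]."""
    return 1.5 * (x * x + y * y)

def F(x, y):
    """Closed-form joint cdf from part 3."""
    return 0.5 * (x**3 * y + x * y**3)

def double_integral(g, x0, x1, y0, y1, m=200):
    """Midpoint-rule approximation of the integral of g over [x0, x1] x [y0, y1]."""
    hx = (x1 - x0) / m
    hy = (y1 - y0) / m
    total = 0.0
    for i in range(m):
        x = x0 + (i + 0.5) * hx
        for j in range(m):
            y = y0 + (j + 0.5) * hy
            total += g(x, y)
    return total * hx * hy

total_prob = double_integral(f, 0, 1, 0, 1)        # part 1: should be close to 1
p_A = double_integral(f, 0.3, 1, 0.5, 1)           # part 2: should be close to 0.5495
p_X_less_half = double_integral(f, 0, 0.5, 0, 1)   # part 4: should be close to 5/16
print(total_prob, p_A, p_X_less_half, F(0.5, 1))
```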

Solution 6
6. F(3.5, 4) = P(X ≤ 3.5, Y ≤ 4).
Shaded cells are marked with *:

X\Y   1      2      3      4      5      6
1     1/36*  1/36*  1/36*  1/36*  1/36   1/36
2     1/36*  1/36*  1/36*  1/36*  1/36   1/36
3     1/36*  1/36*  1/36*  1/36*  1/36   1/36
4     1/36   1/36   1/36   1/36   1/36   1/36
5     1/36   1/36   1/36   1/36   1/36   1/36
6     1/36   1/36   1/36   1/36   1/36   1/36

Add the probability in the shaded squares: F(3.5, 4) = 12/36 = 1/3.
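The same sum can be written as a small function. This sketch assumes two fair dice with the uniform joint pmf from the table; the name `joint_cdf` is made up for this example.

```python
from fractions import Fraction

def joint_cdf(x, y):
    """F(x, y) = P(X <= x, Y <= y) for two fair six-sided dice."""
    return sum(
        Fraction(1, 36)
        for i in range(1, 7)
        for j in range(1, 7)
        if i <= x and j <= y
    )

print(joint_cdf(3.5, 4))
```

Only the cells with X in {1, 2, 3} and Y in {1, 2, 3, 4} contribute, which matches the 12 shaded squares.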