Examples of continuous probability distributions: The normal and standard normal


The Normal Distribution
Changing μ shifts the distribution left or right. Changing σ increases or decreases the spread.
[Figure: normal density f(x) plotted against X]

The Normal Distribution: as mathematical function (pdf)
$f(x) = \frac{1}{\sigma\sqrt{2\pi}} e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2}$
Note constants: $\pi = 3.14159$, $e = 2.71828$
This is a bell-shaped curve with different centers and spreads depending on μ and σ.

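The Normal PDF
It's a probability function, so no matter what the values of μ and σ, it must integrate to 1!
$\int_{-\infty}^{+\infty} \frac{1}{\sigma\sqrt{2\pi}} e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2} dx = 1$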

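The normal distribution is defined by its mean and standard deviation.
$E(X) = \mu = \int_{-\infty}^{+\infty} x \, \frac{1}{\sigma\sqrt{2\pi}} e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2} dx$
$Var(X) = \sigma^2 = \left( \int_{-\infty}^{+\infty} x^2 \, \frac{1}{\sigma\sqrt{2\pi}} e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2} dx \right) - \mu^2$
Standard Deviation(X) = σ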
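The three integrals above (total area, mean, variance) can be checked numerically. The following is a minimal Python sketch, not part of the original slides; the helper names `normal_pdf` and `integrate` and the parameters μ = 500, σ = 50 are illustrative assumptions, and a simple midpoint rule stands in for exact integration.

```python
import math

def normal_pdf(x, mu, sigma):
    """Normal density f(x) for mean mu and standard deviation sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))

def integrate(f, a, b, steps=20_000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    width = (b - a) / steps
    return sum(f(a + (i + 0.5) * width) for i in range(steps)) * width

# Assumed parameters; any mu and any sigma > 0 should behave the same way.
mu, sigma = 500.0, 50.0
# mu +/- 10 sigma captures essentially all of the area under the curve.
lo, hi = mu - 10 * sigma, mu + 10 * sigma

total_area = integrate(lambda x: normal_pdf(x, mu, sigma), lo, hi)
mean_est = integrate(lambda x: x * normal_pdf(x, mu, sigma), lo, hi)
var_est = integrate(lambda x: x * x * normal_pdf(x, mu, sigma), lo, hi) - mean_est ** 2

print(total_area, mean_est, var_est)
```

The printed values should land very close to 1, μ, and σ² respectively.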

The beauty of the normal curve: No matter what μ and σ are, the area between μ-σ and μ+σ is about 68%; the area between μ-2σ and μ+2σ is about 95%; and the area between μ-3σ and μ+3σ is about 99.7%. Almost all values fall within 3 standard deviations.

68-95-99.7 Rule
[Figure: three normal curves with shaded regions covering 68%, 95%, and 99.7% of the data]

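68-95-99.7 Rule in Math terms
$\int_{\mu-\sigma}^{\mu+\sigma} \frac{1}{\sigma\sqrt{2\pi}} e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2} dx = .68$
$\int_{\mu-2\sigma}^{\mu+2\sigma} \frac{1}{\sigma\sqrt{2\pi}} e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2} dx = .95$
$\int_{\mu-3\sigma}^{\mu+3\sigma} \frac{1}{\sigma\sqrt{2\pi}} e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2} dx = .997$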
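As a quick check of the rule, here is a short Python sketch (an illustration added for this write-up, using the standard library's `statistics.NormalDist`; the function name `area_within` is an assumption). It computes the area within k standard deviations and shows that the answer does not depend on μ or σ.

```python
from statistics import NormalDist

def area_within(k, mu=0.0, sigma=1.0):
    """Area under Normal(mu, sigma) between mu - k*sigma and mu + k*sigma."""
    dist = NormalDist(mu, sigma)
    return dist.cdf(mu + k * sigma) - dist.cdf(mu - k * sigma)

for k in (1, 2, 3):
    # Same k, two different (mu, sigma) pairs: the areas should agree.
    print(k, round(area_within(k), 4), round(area_within(k, 500, 50), 4))
```

The areas come out near .68, .95, and .997, which is where the rounded rule comes from.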

How good is the rule for real data?
Check some example data: the mean of the weight of the women = 127.8; the standard deviation (SD) = 15.5.

68% of 120 = .68 x 120 = ~82 runners. In fact, 79 runners fall within 1 SD (15.5 lbs) of the mean.
[Figure: histogram of weight, percent vs. pounds, marked at 112.3, 127.8, and 143.3]

95% of 120 = .95 x 120 = ~114 runners. In fact, 115 runners fall within 2 SDs of the mean.
[Figure: histogram of weight, percent vs. pounds, marked at 96.8, 127.8, and 158.8]

99.7% of 120 = .997 x 120 = 119.6 runners. In fact, all 120 runners fall within 3 SDs of the mean.
[Figure: histogram of weight, percent vs. pounds, marked at 81.3, 127.8, and 174.3]

Example
Suppose SAT scores roughly follow a normal distribution in the U.S. population of college-bound students (with range restricted to 200-800), and the average math SAT is 500 with a standard deviation of 50. Then:
68% of students will have scores between 450 and 550
95% will be between 400 and 600
99.7% will be between 350 and 650

Example
BUT what if you wanted to know the math SAT score corresponding to the 90th percentile (= 90% of students are lower)?
$P(X \le Q) = .90$
$\int_{200}^{Q} \frac{1}{50\sqrt{2\pi}} e^{-\frac{1}{2}\left(\frac{x-500}{50}\right)^2} dx = .90$
Solve for Q? Yikes!

The Standard Normal (Z): Universal Currency
The formula for the standardized normal probability density function is
$p(Z) = \frac{1}{(1)\sqrt{2\pi}} e^{-\frac{1}{2}\left(\frac{Z-0}{1}\right)^2} = \frac{1}{\sqrt{2\pi}} e^{-\frac{1}{2}Z^2}$

The Standard Normal Distribution (Z)
All normal distributions can be converted into the standard normal curve by subtracting the mean and dividing by the standard deviation:
$Z = \frac{X - \mu}{\sigma}$
Somebody calculated all the integrals for the standard normal and put them in a table! So we never have to integrate! Even better, computers now do all the integration.

Comparing X and Z units
[Figure: paired axes. The X scale (μ = 100, σ = 50) is marked at 100 and 200; the Z scale (μ = 0, σ = 1) is marked at 0 and 2.0]

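Example
What's the probability of getting a math SAT score of 575 or less, with μ = 500 and σ = 50?
$Z = \frac{575 - 500}{50} = 1.5$
i.e., a score of 575 is 1.5 standard deviations above the mean.
$P(X \le 575) = \int_{200}^{575} \frac{1}{50\sqrt{2\pi}} e^{-\frac{1}{2}\left(\frac{x-500}{50}\right)^2} dx \approx \int_{-\infty}^{1.5} \frac{1}{\sqrt{2\pi}} e^{-\frac{1}{2}Z^2} dZ$
Yikes! But looking up Z = 1.5 in the standard normal chart (or entering it into SAS) is no problem: the answer is .9332.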
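The same lookup can be sketched in Python instead of a printed chart. This is an illustrative addition, assuming the slide's μ = 500 and σ = 50; the variable names are hypothetical. It shows that working in X units and working in Z units give the same probability.

```python
from statistics import NormalDist

sat = NormalDist(mu=500, sigma=50)   # math SAT model from the example

# Standardize: how many standard deviations is 575 above the mean?
z_score = (575 - sat.mean) / sat.stdev

# Left-tail area computed two ways: directly in X units, and via Z.
p_direct = sat.cdf(575)
p_via_z = NormalDist().cdf(z_score)

print(z_score, round(p_direct, 4), round(p_via_z, 4))
```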

Practice problem
If birth weights in a population are normally distributed with a mean of 109 oz and a standard deviation of 13 oz:
a. What is the chance of obtaining a birth weight of 141 oz or heavier when sampling birth records at random?
b. What is the chance of obtaining a birth weight of 120 oz or lighter?

Answer
a. What is the chance of obtaining a birth weight of 141 oz or heavier when sampling birth records at random?
$Z = \frac{141 - 109}{13} = 2.46$
From the chart or SAS, a Z of 2.46 corresponds to a right tail (greater than) area of:
P(Z ≥ 2.46) = 1 - (.9931) = .0069 or .69%

Answer
b. What is the chance of obtaining a birth weight of 120 oz or lighter?
$Z = \frac{120 - 109}{13} = .85$
From the chart or SAS, a Z of .85 corresponds to a left tail area of:
P(Z ≤ .85) = .8023 = 80.23%
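Both answers can be reproduced with a few lines of Python. This sketch is an addition for illustration; it assumes the part (b) cutoff is 120 oz, which is the value consistent with the stated Z of .85. Because it uses the unrounded Z, the results differ slightly in the later decimals from the table values.

```python
from statistics import NormalDist

birth = NormalDist(mu=109, sigma=13)   # birth weight in ounces

# (a) right-tail area: 141 oz or heavier
z_a = (141 - 109) / 13
p_a = 1 - birth.cdf(141)

# (b) left-tail area: 120 oz or lighter (cutoff assumed, see lead-in)
z_b = (120 - 109) / 13
p_b = birth.cdf(120)

print(round(z_a, 2), round(p_a, 4))
print(round(z_b, 2), round(p_b, 4))
```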

Looking up probabilities in the standard normal table
What is the area to the left of Z = 1.51 in a standard normal curve?
Area is 93.45%.
[Figure: standard normal table lookup and curve shaded to the left of Z = 1.51]

Normal probabilities in SAS

    data _null_;
      thearea = probnorm(1.5);  /* left-tail area below Z = 1.5 */
      put thearea;              /* write the result to the log */
    run;

Output: 0.9331927987

The probnorm(z) function gives you the probability from negative infinity to Z (here 1.5) in a standard normal curve.

If you want to go the other direction, from the area to the Z score, use the probit function:

    data _null_;
      thezvalue = probit(.93);  /* Z with left-tail area .93 */
      put thezvalue;
    run;

Output: 1.475791028

The probit(p) function gives you the Z-value that corresponds to a left-tail area of p (here .93) from a standard normal curve. The probit function is also known as the inverse standard normal function.
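For readers without SAS, the same two lookups can be sketched with Python's standard library. This is an added illustration, not part of the slides: `NormalDist().cdf` plays the role of probnorm and `NormalDist().inv_cdf` plays the role of probit.

```python
from statistics import NormalDist

std_normal = NormalDist()            # mean 0, standard deviation 1

# Counterpart of probnorm(1.5): area from negative infinity up to Z = 1.5
left_area = std_normal.cdf(1.5)

# Counterpart of probit(.93): the Z value whose left-tail area is .93
z_value = std_normal.inv_cdf(0.93)

print(left_area, z_value)
```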

Probit function: the inverse
$\Phi^{-1}(\text{area}) = Z$: gives the Z-value that goes with the probability you want.
For example, recall the SAT math scores example. What's the score that corresponds to the 90th percentile?
In the table, find the Z-value that corresponds to an area of .90: Z = 1.28.
Or use SAS:

    data _null_;
      thezvalue = probit(.90);  /* Z with left-tail area .90 */
      put thezvalue;
    run;

Output: 1.2815515655

If Z = 1.28, convert back to a raw SAT score:
$1.28 = \frac{X - 500}{50}$
X - 500 = 1.28 (50)
X = 1.28(50) + 500 = 564 (1.28 standard deviations above the mean!)
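The percentile-to-score conversion can also be sketched in Python. This is an illustrative addition that assumes the slide's μ = 500 and σ = 50; the variable names are hypothetical. It finds Z first and then converts back to X, and compares that with asking the fitted distribution for the quantile directly.

```python
from statistics import NormalDist

mu, sigma = 500, 50

# Step 1: Z value with 90% of the area to its left (the probit of .90)
z_90 = NormalDist().inv_cdf(0.90)

# Step 2: convert back to raw SAT units, X = Z * sigma + mu
score_90 = z_90 * sigma + mu

# Shortcut: ask the SAT distribution for its own 90th percentile
score_direct = NormalDist(mu, sigma).inv_cdf(0.90)

print(round(z_90, 2), round(score_90), round(score_direct))
```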

Are my data normal?
Not all continuous random variables are normally distributed! It is important to evaluate how well the data are approximated by a normal distribution.

Are my data normally distributed?
1. Look at the histogram! Does it appear bell shaped?
2. Compute descriptive summary measures: are mean, median, and mode similar?
3. Do 2/3 of observations lie within 1 std dev of the mean? Do 95% of observations lie within 2 std dev of the mean?
4. Look at a normal probability plot: is it approximately linear?
5. Run tests of normality (such as Kolmogorov-Smirnov). But be cautious: these are highly influenced by sample size!
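Checks 2 and 3 in the list above are easy to automate. The sketch below is an added illustration in Python; the function name `normality_quick_checks` and the `sample` values are made up for the example and are not the class data.

```python
import statistics

def normality_quick_checks(values):
    """Summary measures plus the share of points within 1 and 2 SDs of the mean."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)        # sample standard deviation
    n = len(values)
    return {
        "mean": mean,
        "median": statistics.median(values),
        "sd": sd,
        "within_1_sd": sum(abs(v - mean) <= sd for v in values) / n,
        "within_2_sd": sum(abs(v - mean) <= 2 * sd for v in values) / n,
    }

# Hypothetical right-skewed sample (illustration only).
sample = [0, 0, 0, 2, 4, 6, 6, 8, 10, 12, 16, 24]
checks = normality_quick_checks(sample)
for name, value in checks.items():
    print(name, round(value, 3))
```

For roughly normal data the two shares should be near 2/3 and 95%, and the mean and median should be close; large departures are a hint to look at the histogram and the normal probability plot.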

Data from our class: Median = 6, Mean = 7.1, Mode = 0, SD = 6.8, Range = 0 to 24 (= 3.5σ)

Data from our class: Median = 5, Mean = 5.4, Mode = none, SD = 1.8, Range = 2 to 9 (~4σ)

Data from our class: Median = 3, Mean = 3.4, Mode = 3, SD = 2.5, Range = 0 to 12 (~5σ)

Data from our class: Median = 7:00, Mean = 7:04, Mode = 7:00, SD = :55, Range = 5:30 to 9:00 (~4σ)

Data from our class: 7.1 +/- 6.8 = 0.3 to 13.9

Data from our class: 7.1 +/- 2*6.8 = 0 to 20.7

Data from our class: 7.1 +/- 3*6.8 = 0 to 27.5

Data from our class: 5.4 +/- 1.8 = 3.6 to 7.2

Data from our class: 5.4 +/- 2*1.8 = 1.8 to 9.0

Data from our class: 5.4 +/- 3*1.8 = 0 to 10

Data from our class: 3.4 +/- 2.5 = 0.9 to 5.9

Data from our class: 3.4 +/- 2*2.5 = 0 to 8.4

Data from our class: 3.4 +/- 3*2.5 = 0 to 10.9

Data from our class: 7:04 +/- 0:55 = 6:09 to 7:59

Data from our class: 7:04 +/- 2*0:55 = 5:14 to 8:54

Data from our class: 7:04 +/- 3*0:55 = 4:19 to 9:49

The Normal Probability Plot
1. Order the data.
2. Find the corresponding standardized normal quantile values: the i-th quantile is $\Phi^{-1}\left(\frac{i}{n+1}\right)$, where $\Phi^{-1}$ is the probit function, which gives the Z value that corresponds to a particular left-tail area.
3. Plot the observed data values against the normal quantile values.
4. Evaluate the plot for evidence of linearity.
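The plotting recipe can be sketched in Python as follows. This is an added illustration: the function name `normal_plot_points` and the five sample values are hypothetical, and the code only builds the (normal quantile, observed value) pairs, leaving the actual drawing to whatever plotting tool is at hand.

```python
from statistics import NormalDist

def normal_plot_points(values):
    """Pair each ordered observation with its normal quantile, probit(i / (n + 1))."""
    ordered = sorted(values)                      # step 1: order the data
    n = len(ordered)
    std_normal = NormalDist()
    quantiles = [std_normal.inv_cdf(i / (n + 1))  # step 2: i-th normal quantile
                 for i in range(1, n + 1)]
    return list(zip(quantiles, ordered))          # step 3: points to plot

# Hypothetical observations (illustration only).
points = normal_plot_points([7.5, 6.0, 7.0, 8.25, 6.5])
for q, x in points:
    print(round(q, 3), x)
```

If the points fall close to a straight line, the data are roughly normal; a concave-up pattern suggests right skew.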

Normal probability plot: coffee. Right-skewed! (concave up)

Normal probability plot: love of writing. Neither right-skewed nor left-skewed, but big gap at 6.

Normal probability plot: exercise. Right-skewed! (concave up)

Normal probability plot: wake-up time. Closest to a straight line.

Formal tests for normality
Results:
Coffee: Strong evidence of non-normality (p<.01)
Writing love: Moderate evidence of non-normality (p=.01)
Exercise: Weak to no evidence of non-normality (p>.10)
Wake-up time: No evidence of non-normality (p>.5)

Normal approximation to the binomial
When you have a binomial distribution where n is large and p is middle-of-the-road (not too small, not too big, closer to .5), then the binomial starts to look like a normal distribution. In fact, this doesn't even take a particularly large n.
Recall: If the probability of being a smoker among a group of cases with lung cancer is .6, what's the probability that in a group of 8 cases you have fewer than 2 smokers?

Normal approximation to the binomial
When you have a binomial distribution where n is large and p isn't too small (rule of thumb: mean > 5), then the binomial starts to look like a normal distribution.
Recall the smoking example.
[Figure: bar chart of the binomial probabilities for 0 to 8 smokers]
It is starting to have a normal shape even with fairly small n. You can imagine that if n got larger, the bars would get thinner and thinner and this would look more and more like a continuous function, with a bell curve shape. Here np = 4.8.

Normal approximation to binomial
[Figure: bar chart of the binomial probabilities for 0 to 8 smokers]
What is the probability of fewer than 2 smokers?
Exact binomial probability (from before) = .00065 + .008 = .00865
Normal approximation probability: μ = 4.8, σ = 1.39
$Z \approx \frac{2 - 4.8}{1.39} = \frac{-2.8}{1.39} = -2$
P(Z < -2) = .022

A little off, but in the right ballpark. We could also use the value to the left of 1.5 (as we really wanted to know less than but not including 2; this is called the "continuity correction"):
$Z \approx \frac{1.5 - 4.8}{1.39} = \frac{-3.3}{1.39} = -2.37$
P(Z ≤ -2.37) = .0089
A fairly good approximation of the exact probability, .00865.
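The comparison between the exact binomial answer and the two normal approximations can be sketched in Python. This is an added illustration that assumes n = 8, p = .6, and the event "fewer than 2 smokers"; the variable names are hypothetical. Because it works with unrounded terms, its exact value differs slightly in the later decimals from the hand-rounded sum on the slide.

```python
import math
from statistics import NormalDist

n, p = 8, 0.6   # 8 lung cancer cases, P(smoker) = .6

# Exact binomial: P(X < 2) = P(X = 0) + P(X = 1)
exact = sum(math.comb(n, k) * p**k * (1 - p)**(n - k) for k in range(2))

# Normal curve with the same mean and standard deviation as the binomial
mu = n * p
sigma = math.sqrt(n * p * (1 - p))
approx = NormalDist(mu, sigma)

plain = approx.cdf(2)        # no continuity correction
corrected = approx.cdf(1.5)  # continuity correction: area to the left of 1.5

print(round(exact, 5), round(plain, 4), round(corrected, 4))
```

The continuity-corrected value should land much closer to the exact probability than the uncorrected one.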

Practice problem
1. You are performing a cohort study. If the probability of developing disease in the exposed group is .25 for the study duration, then if you (randomly) sample 500 exposed people, what's the probability that at most 120 people develop the disease?

Answer
By hand (yikes!):
P(X ≤ 120) = P(X=0) + P(X=1) + P(X=2) + P(X=3) + P(X=4) + ... + P(X=120)
$= \binom{500}{120}(.25)^{120}(.75)^{380} + \dots + \binom{500}{2}(.25)^{2}(.75)^{498} + \binom{500}{1}(.25)^{1}(.75)^{499} + \binom{500}{0}(.25)^{0}(.75)^{500}$

OR use SAS:

    data _null_;
      Cohort = cdf('binomial', 120, .25, 500);  /* P(X <= 120), p = .25, n = 500 */
      put Cohort;
    run;

Output: 0.323504227

OR use the normal approximation:
μ = np = 500(.25) = 125 and σ² = np(1-p) = 93.75; σ = 9.68
$Z = \frac{120 - 125}{9.68} = -.52$
P(Z < -.52) = .3015
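The "by hand" sum is tedious on paper but short in code. The following Python sketch is an added illustration; it assumes p = .25, n = 500, and a cutoff of 120, the values that reproduce the stated variance of 93.75, and the variable names are hypothetical.

```python
import math
from statistics import NormalDist

n, p, cutoff = 500, 0.25, 120

# Exact answer: add up the binomial probabilities for 0, 1, ..., 120 cases
exact = sum(math.comb(n, k) * p**k * (1 - p)**(n - k)
            for k in range(cutoff + 1))

# Normal approximation with matching mean and standard deviation
mu = n * p
variance = n * p * (1 - p)
sigma = math.sqrt(variance)
z = (cutoff - mu) / sigma
approx = NormalDist().cdf(z)

print(round(exact, 4), round(z, 2), round(approx, 4))
```

The approximation uses the unrounded Z here, so it can differ in the third decimal from a table lookup at Z = -.52.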

Proportions
The binomial distribution forms the basis of statistics for proportions. A proportion is just a binomial count divided by n. For example, if we sample 200 cases and find 60 smokers, X = 60 but the observed proportion = .30. Statistics for proportions are similar to those for binomial counts, but differ by a factor of n.

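Stats for proportions
For binomial:
$\mu_x = np$
$\sigma_x^2 = np(1-p)$
$\sigma_x = \sqrt{np(1-p)}$
For proportion:
$\mu_{\hat{p}} = p$ (differs by a factor of n)
$\sigma_{\hat{p}}^2 = \frac{np(1-p)}{n^2} = \frac{p(1-p)}{n}$
$\sigma_{\hat{p}} = \sqrt{\frac{p(1-p)}{n}}$ (differs by a factor of n)
P-hat ($\hat{p}$) stands for "sample proportion."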
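The factor-of-n relationship can be seen in a small Python sketch. This is an added illustration: the function names are hypothetical, and the values n = 200 and p = .30 are assumed for the example (treating the observed proportion from the smokers example as if it were the true p).

```python
import math

def binomial_count_stats(n, p):
    """Mean and standard deviation of the binomial count X."""
    return n * p, math.sqrt(n * p * (1 - p))

def sample_proportion_stats(n, p):
    """Mean and standard deviation of the sample proportion p-hat = X / n."""
    return p, math.sqrt(p * (1 - p) / n)

n, p = 200, 0.30   # assumed values, see lead-in
count_mean, count_sd = binomial_count_stats(n, p)
prop_mean, prop_sd = sample_proportion_stats(n, p)

# Dividing the count statistics by n recovers the proportion statistics.
print(count_mean, round(count_sd, 3))
print(prop_mean, round(prop_sd, 4))
print(round(count_mean / n, 4), round(count_sd / n, 4))
```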

It all comes back to Z
Statistics for proportions are based on a normal distribution, because the binomial can be approximated as normal if np > 5.