BIOL The Normal Distribution and the Central Limit Theorem


BIOL 300 - The Normal Distribution and the Central Limit Theorem

In the first week of the course, we introduced a few measures of center and spread, and discussed how the mean and standard deviation are good summaries when the histogram (or distribution) is symmetric and unimodal. When it is not symmetric, we may want to use the median and IQR as summaries, although for most of the course we will deal with things that are approximately symmetric and unimodal.

Understanding what a standard deviation (SD) is matters a great deal, as almost all statistical methods rely on it, and we will see it come up again and again throughout the course (in all of statistics, actually).

Recall: The SD can be thought of as a measure of the average deviation from the mean.

You've spent your whole life being told that you can't compare apples and oranges; well, we're going to present a method for doing just that!

Example: I'm at a crucial point in my life where I'm trying to decide what to do with it: teach or research? For my Masters I graduated with a grade of 87%, and the mean Masters grade is 83% with a standard deviation of 5%. For my course evaluations I have a mean rating of 4.65 (out of 5), and the mean evaluation score is typically 3.5 with a standard deviation of 0.4. Which one am I better at? (Note: These are made-up numbers!)

If we just look at the values, it is hard to compare the two. 87 is 4 larger than 83, and 4.65 is only 1.15 larger than 3.5, but 87 is a lot bigger than 4.65, and you can only go 1.5 over the average of 3.5 but 17 over the average grade of 83. So how do we compare the two?

The answer is to use the standard deviation as a measuring stick, as it summarizes the average deviation from the mean. Essentially, we want to find out how far each one is from its respective mean, in terms of its average deviation from the mean.

The Masters grade is 87 - 83 = 4% above the mean grade. In terms of standard deviation, it is 4% / 5% = 0.8 standard deviations above the mean.

The evaluation score is 4.65 - 3.5 = 1.15 above the mean score. In terms of standard deviation, it is 1.15 / 0.4 = 2.875 standard deviations above the mean.

In terms of their own respective means and average deviations from the mean (or SDs), the evaluation score is much farther above its mean than the Masters grade is.

In your everyday life you are essentially using statistical tools to make decisions, without even knowing it. In my opinion (so don't generalize this to every statistician), statistics is simply a discipline that takes the way a person thinks about things and makes logical decisions based on what they observe in everyday life, and formalizes it into a set of objective rules.

Adding or Multiplying each value by a constant:

1. Adding a constant (shifting)
If we add a constant (c) to each observation in the data, then:
The measures of center (mean, median, midrange) will all have the constant (c) added to them, and so will the quartiles.
The measures of spread (variance, SD, range, IQR) will all remain the same.

2. Multiplying by a constant (scaling)
If we multiply each observation by a positive constant (c), then:
The measures of center (mean, median, midrange), the measures of spread (SD, range, IQR), and the quartiles will all be multiplied by the constant (c).
The variance will be multiplied by $c^2$.
(If c is negative, the SD, range, and IQR are multiplied by |c| instead.)

In short, adding changes the center but not the spread. Multiplying changes both the spread and the center. Multiplying by a constant is how we change measurement units, (eg) kg to lbs.
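The shifting and scaling rules can be checked numerically. The following is only a small sketch: the weights, the shift of 10, and the rounded conversion factor of 2.2 are made-up values chosen for illustration, and the sample (n - 1) formulas from Python's standard library are assumed.

```python
import statistics

# Hypothetical weights in kg, made up for illustration only.
weights_kg = [52.0, 61.5, 70.0, 74.5, 88.0]

SHIFT = 10.0       # constant added to every observation (assumed value)
KG_TO_LBS = 2.2    # rounded kg-to-lbs factor, used as the scaling constant

shifted = [w + SHIFT for w in weights_kg]
scaled = [w * KG_TO_LBS for w in weights_kg]

def summarize(data):
    """Return a few center and spread summaries as a dict."""
    return {
        "mean": statistics.mean(data),
        "median": statistics.median(data),
        "sd": statistics.stdev(data),
        "variance": statistics.variance(data),
    }

original = summarize(weights_kg)
after_shift = summarize(shifted)   # center moves by SHIFT, spread is unchanged
after_scale = summarize(scaled)    # center and SD scale by c, variance by c squared

for label, summary in [("original", original),
                       ("shifted", after_shift),
                       ("scaled", after_scale)]:
    print(label, {name: round(value, 3) for name, value in summary.items()})
```

Comparing the three printed rows shows the mean and median moving with the shift while the SD and variance stay put, and everything except the variance scaling by the same factor.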

Standardizing (Z-scores):

Question: How can we compare observations that were measured on different scales or that come from two different distributions?

Answer: By summarizing how far away each of the observations is from the mean, in terms of its standard deviation (or average deviation from the mean)!

The Z-score summarizes how far a given observation ($x_i$) is from its mean ($\mu$), in terms of its SD ($\sigma$).

$$Z = \frac{x_i - \mu}{\sigma} = \frac{\text{difference between observation and mean}}{\text{standard deviation}}$$

Exercise: A flight from Vancouver to Toronto usually takes 4.5 hours with an SD of 15 minutes. If my last flight took 4 hours and 10 minutes, how far is this from the mean in standard units?

When we standardize, we are adding (actually subtracting) a constant from every observation, and then multiplying (actually dividing) every observation by a constant. (Check the rules above for adding and multiplying by a constant.)

If we let $M = x_i - \bar{x}$, then the mean of M is $\bar{x} - \bar{x} = 0$, and the SD of M is unchanged.

If we now let $Z = M / SD$, then the mean of Z is the mean of M times the constant, which equals $0 / SD = 0$. The SD of Z is the SD of M times the constant, which is $SD / SD = 1$.

So, Z-scores have a mean of 0 and an SD of 1.

A positive Z-score means that the observation is above the mean, and a negative Z-score means it is below the mean. The farther an observation is from the mean, the larger the Z-score will be in absolute value.
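A minimal sketch of standardizing in code, using the made-up teaching-versus-research numbers from the earlier example. The helper name `z_score` and the small data set are hypothetical, introduced only for illustration.

```python
import statistics

def z_score(x, mean, sd):
    """How many SDs the value x lies above (+) or below (-) the mean."""
    return (x - mean) / sd

# Made-up numbers from the teach-or-research example.
z_grade = z_score(87, 83, 5)        # Masters grade: about 0.8 SDs above the mean
z_eval = z_score(4.65, 3.5, 0.4)    # evaluations: about 2.875 SDs above the mean

# Standardizing every value in a (hypothetical) data set
# should give Z-scores with mean 0 and SD 1.
data = [2.0, 4.0, 4.0, 5.0, 7.0, 9.0]
x_bar = statistics.mean(data)
s = statistics.stdev(data)
z_values = [z_score(x, x_bar, s) for x in data]

print("z for grade:", round(z_grade, 3))
print("z for evaluations:", round(z_eval, 3))
print("mean of z-scores:", round(statistics.mean(z_values), 6))
print("SD of z-scores:", round(statistics.stdev(z_values), 6))
```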

The Normal Distribution (Bell Curve):

This is where we take a small step into the theoretical world of statistics. Many types of data one collects have a distribution that is bell-shaped and roughly symmetric, and the Normal Distribution is appropriate for summarizing these (note that we are dealing with only quantitative variables here). (eg) weight, IQ scores, ...

Characteristics of the Normal Model:

1. It is bell-shaped, unimodal, and perfectly symmetric about the mean ($\mu_x$).
2. The spread of the distribution is determined by the standard deviation ($\sigma$).
3. This model is denoted by $N(\mu, \sigma^2)$, where $\mu$ = mean, $\sigma^2$ = variance, and $\sigma$ is the SD.
4. The total area under the curve is 100% (just as the total area of the bars of a histogram is 100%).

[Figure: Theoretical Normal Models. Density curves for N(2, 36), N(-4, 9), and N(2, 4); x-axis: Values (from -15 to 15), y-axis: Probability (%).]

$$f(x) = \frac{1}{\sqrt{2\pi\sigma^2}} \, e^{-\frac{(x - \mu_x)^2}{2\sigma^2}}$$
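As a rough illustration of the density formula, the sketch below evaluates it directly and cross-checks one value against Python's `statistics.NormalDist`. The function name `normal_pdf` is hypothetical. Note that the $N(\mu, \sigma^2)$ notation takes the variance as its second argument, while `NormalDist` expects the SD.

```python
import math
from statistics import NormalDist

def normal_pdf(x, mu, variance):
    """Height of the N(mu, variance) curve at x (second parameter is the variance)."""
    exponent = -((x - mu) ** 2) / (2 * variance)
    return math.exp(exponent) / math.sqrt(2 * math.pi * variance)

# Peak heights (at x = mu) for two models with the same mean but different spread.
peak_narrow = normal_pdf(2, mu=2, variance=4)    # N(2, 4), SD = 2
peak_wide = normal_pdf(2, mu=2, variance=36)     # N(2, 36), SD = 6

# Cross-check with the standard library, which is parameterized by the SD.
library_value = NormalDist(mu=2, sigma=2).pdf(2)

# Symmetry about the mean: equal heights at equal distances on either side.
left = normal_pdf(0, mu=2, variance=4)
right = normal_pdf(4, mu=2, variance=4)

print(round(peak_narrow, 4), round(peak_wide, 4), round(library_value, 4))
print(round(left, 4), round(right, 4))
```

The larger the variance, the lower and wider the curve, since the total area has to stay at 100%.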

Notes: For the Normal Distribution, we use ($\mu_x$) for the mean instead of ($\bar{x}$), and ($\sigma$) for the SD instead of (s). Why?

The ($\bar{x}$) and (s) are statistics, or sample estimates: numerical summaries of the observed data. (sample)

The ($\mu_x$) and ($\sigma$) are parameters, or population parameters, that specify the theoretical model. (population)

Standardized Values (for the Normal Model):

$$Z = \frac{x - \mu_x}{\sigma}$$

When we standardize an observation from a Normal Distribution, the Z-score follows N(0, 1).

What we do is use a theoretical Normal Distribution to describe the distribution of an observed variable. One must check the histogram to make sure that such a model is appropriate (symmetric and unimodal).

We take the observed estimates of the mean and SD, and if a Normal Distribution seems appropriate, then we use the Normal Distribution (with the same mean and SD) to approximate the observed data.

We then standardize the value(s) of interest, so that we can use a Standard Normal variable (N(0, 1)).

We can then answer questions such as: What proportion of males have weights above 190 lbs? How many between 210 and 220? And so on.

The 68-95-99.7 Rule:

Approximately 68% of the data will be within ±1 SD of the mean.
Approximately 95% of the data will be within ±2 SDs of the mean.
Approximately 99.7% of the data will be within ±3 SDs of the mean.

(eg) If a class has a mean grade of 70% and an SD of 5%, then approximately 68% of students will receive grades between 65% and 75%, approximately 95% will receive grades between 60% and 80%, and approximately 99.7% will receive grades between 55% and 85%.

Let's Draw a Picture:

Finding Percentages Under the Normal Model:

1. Draw a Normal Model and label where the mean is. Then shade the area of interest.
2. Standardize the x-value(s) that are at the boundaries of the area of interest.
3. Use the Normal Table (provided online) to find the area of the shaded region.

I have posted a different Normal table than the textbook's on the course website, which I will use in lectures.

Example: What is the area (probability) below a Z-score of Z = 1.5? What is the area (probability) between Z-scores of -1.21 and 1.21?

Summary:

1. We estimate the mean and SD for our observed data.
2. Check if a Normal Model is appropriate (symmetric, unimodal).
3. If it is, then we standardize the values of interest.
4. Use the Normal Table to find the percentages we are interested in.

(The Normal Model is HUGE in statistics, so make sure to practice many of these problems.)
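One way to double-check a table lookup is to compute the same areas in software. The sketch below uses `statistics.NormalDist` from Python's standard library in place of the Normal Table. The helper names `area_below` and `area_between` are hypothetical, and this is not meant to replace practicing with the table.

```python
from statistics import NormalDist

Z = NormalDist(mu=0, sigma=1)   # the Standard Normal, N(0, 1)

def area_below(z):
    """Area under the Standard Normal curve to the left of z."""
    return Z.cdf(z)

def area_between(z_low, z_high):
    """Area under the Standard Normal curve between z_low and z_high."""
    return Z.cdf(z_high) - Z.cdf(z_low)

# The two example questions above.
below_1_5 = area_below(1.5)                 # about 0.9332
between_1_21 = area_between(-1.21, 1.21)    # about 0.7737

# The 68-95-99.7 rule: area within 1, 2, and 3 SDs of the mean.
rule = {k: area_between(-k, k) for k in (1, 2, 3)}

print("below 1.5:", round(below_1_5, 4))
print("between -1.21 and 1.21:", round(between_1_21, 4))
for k, area in rule.items():
    print(f"within {k} SD:", round(area, 4))
```

An area above a Z-score is one minus the area below it, which is the same subtraction one does by hand with a table of left-tail areas.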

Exercises:

1. Suppose that math SAT scores follow the normal model. The past results of the math SAT exams show that males and females have mean scores of 500 and 455 and standard deviations of 100 and 120, respectively. Steve and Nikki took the math SAT exam, and they both scored 620.
(a) Compare their scores using the z-score.
(b) What percentage of males score over 600 on the math SAT test?
(c) What percentage of females score between 255 and 555 on the math SAT test?

2. Find the area under the Normal Model for the following Z-scores.
(a) smaller than -1.10
(b) bigger than -1.10
(c) bigger than 2.15
(d) between 0 and 1.18
(e) between -1.10 and 1.62
(f) smaller than -4.50

3. Find the z-scores corresponding to the following percentiles:
(a) 50th
(b) 70th
(c) 15th

4. The Mercury missions from NASA allowed no astronauts to be taller than 180.3 cm. The mean height of males is 175.6 cm with a standard deviation of 7.1 cm.
(a) What percentage of males would be ineligible to be astronauts, based on height alone?
(b) Find the interquartile range for the height of males.

** So far, we have talked about examining a single observation. In the last section we discussed how to work out probabilities when our variable follows a normal distribution. In previous sections, we discussed how to work out probabilities for things that follow a binomial or Poisson distribution. But we only talked about examining single observations.

Often, we take a sample of (n) observations, and we are interested in the mean or the sum of the n observations.

(eg) We may be interested in taking a sample of 10 new fluorescent lightbulbs and making some statements about the mean lifetime of the lightbulbs, or maybe even the sum of the lifetimes of the 10 bulbs. Maybe we've developed a new drug, and we want to make some statements about the mean decrease in blood pressure due to the drug.

So, how do we deal with these? Well, the Central Limit Theorem (CLT) makes things easy for us. This is probably the single most important theorem in statistics.

Central Limit Theorem

This is the fundamental theorem of statistics, an early version of which was proved by Pierre-Simon Laplace.

The CLT says that if our random sample ($x_1, x_2, \ldots, x_n$) comes from any particular distribution (it may not be Normal, but the observations must all come from the same distribution) with a mean of $\mu$ and a variance of $\sigma^2$, then when n is large enough, the sample mean $\bar{X}$ and the sample sum S approximately follow a normal distribution. We also require that the sample observations are independent and random.

(ie) If the $X_i$ are independent observations from any distribution with mean $\mu$ and variance $\sigma^2$, and n is large, then approximately

$$\bar{X} \sim N\left(\mu, \frac{\sigma^2}{n}\right) \qquad S \sim N(n\mu, n\sigma^2)$$
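The two approximating distributions can be written down directly in code. This is a minimal sketch using `statistics.NormalDist`. The helper names `clt_mean` and `clt_sum` and the population values (mean 50, SD 12, n = 36) are hypothetical, chosen only so the arithmetic is easy to follow. Since `NormalDist` takes an SD rather than a variance, the variances $\sigma^2/n$ and $n\sigma^2$ become the SDs $\sigma/\sqrt{n}$ and $\sigma\sqrt{n}$.

```python
import math
from statistics import NormalDist

def clt_mean(mu, sigma, n):
    """Approximate distribution of the sample mean: N(mu, sigma^2 / n)."""
    return NormalDist(mu, sigma / math.sqrt(n))

def clt_sum(mu, sigma, n):
    """Approximate distribution of the sample sum: N(n * mu, n * sigma^2)."""
    return NormalDist(n * mu, sigma * math.sqrt(n))

# Hypothetical population values, for illustration only.
mu, sigma, n = 50.0, 12.0, 36

mean_model = clt_mean(mu, sigma, n)   # mean 50, SD 12 / 6 = 2
sum_model = clt_sum(mu, sigma, n)     # mean 1800, SD 12 * 6 = 72

# Probability that the sample mean exceeds 53, i.e. a Z-score of (53 - 50) / 2 = 1.5.
p_mean_above_53 = 1 - mean_model.cdf(53)

print("mean model:", mean_model.mean, mean_model.stdev)
print("sum model:", sum_model.mean, sum_model.stdev)
print("P(sample mean > 53):", round(p_mean_above_53, 4))
```

The same question can be asked about the sum instead: the sample mean exceeding 53 is the same event as the sum exceeding 36 x 53 = 1908, and both models give the same probability.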

It is not so intuitive at first, but even if the distribution we are sampling from is skewed or even bimodal, the sampling distribution of the mean will be approximately normal when n is large enough. This result is very important and is used extensively throughout statistics, as it tells us that no matter what distribution our random sample (the $x_i$'s) comes from, the sample mean $\bar{X}$ and the sample sum S approximately follow a normal distribution as long as a few assumptions are met.

The CLT is an asymptotic result, meaning that $\bar{X}$ and S are exactly normally distributed only in the limit as $n \to \infty$; for any finite n, $\bar{X}$ and S are only approximately normally distributed.

This raises the question: when is n large enough? There is no quick answer to this. The more symmetric the distribution of the $x_i$'s, the smaller n can be. A generally accepted rule is $n \ge 25$.

First let's draw a picture (and explain) what we mean by this, then we will take a look at a simulation example, and then we will discuss the CLT for proportions and for the binomial and Poisson distributions.

Picture:

Instead of using math, let's try to convince ourselves that this is true using some intuition and simulations.

Simulating The Sampling Distribution of a Mean:

Below is a picture of histograms of some simulated rolls of dice. I will talk about these in class.

Note: I used dice as the example, as these cover many areas. (ie) A die can lead to proportions, it can be binomial, and it is an ordinal variable, which is similar to a quantitative/continuous variable.

[Figure: five density histograms titled "10000 Rolls of 1 Die", "The Mean of 10000 Rolls of 2 Dice", "The Mean of 10000 Rolls of 3 Dice", "The Mean of 10000 Rolls of 5 Dice", and "The Mean of 10000 Rolls of 25 Dice"; x-axis: value of the die or of the mean, y-axis: Density.]

We can see that as the sample size increases (1, 2, 3, 5, 25), the sampling distribution of the mean begins to look like a normal model.
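A simulation along these lines can be sketched in a few lines of Python. This is not the code used to produce the histograms above. It is a minimal re-creation under the assumption of fair six-sided dice, with an arbitrary seed, and it compares only the center and spread of the simulated means with what the CLT predicts rather than drawing the histograms. A single fair die has mean 3.5 and variance 35/12.

```python
import math
import random
import statistics

DIE_MEAN = 3.5
DIE_SD = math.sqrt(35 / 12)   # SD of a single fair six-sided die

def simulate_means(n_dice, n_reps, rng):
    """Roll n_dice dice n_reps times; return the mean of each set of rolls."""
    return [
        sum(rng.randint(1, 6) for _ in range(n_dice)) / n_dice
        for _ in range(n_reps)
    ]

rng = random.Random(300)   # arbitrary seed so the run is repeatable
results = {}

for n_dice in (1, 2, 3, 5, 25):
    means = simulate_means(n_dice, 10_000, rng)
    simulated_mean = statistics.fmean(means)
    simulated_sd = statistics.stdev(means)
    results[n_dice] = (simulated_mean, simulated_sd)
    predicted_sd = DIE_SD / math.sqrt(n_dice)   # sigma / sqrt(n) from the CLT
    print(f"n = {n_dice:2d}  mean = {simulated_mean:.3f}  "
          f"SD = {simulated_sd:.3f}  predicted SD = {predicted_sd:.3f}")
```

The simulated means should stay near 3.5 for every sample size while their SD shrinks roughly like $\sigma/\sqrt{n}$. Passing each list of means to a plotting library would show the shape becoming more bell-like as well.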

Note: For now, we answer questions like: If the true mean and standard deviation are some known values, then what is the probability of getting certain estimated values? Later, we will use these ideas for testing hypotheses.

** The CLT for continuous variables

Example: Suppose we will sample and test 25 lightbulbs to measure their lifetimes. Suppose we know that the lifetime of each lightbulb has a mean of 1000 hours and a standard deviation of 1000 hours, and follows an exponential distribution (a continuous distribution we have not talked about).
(a) What is the probability that the sample mean of the 25 lightbulbs will be more than 1100 hours?
(b) What is the probability that the sample sum of the 25 lightbulbs will be more than 23,000 hours?
(c) In what range of lifetimes can we be approximately 95% sure that $\bar{X}$ will fall? Center this interval at the true mean.

** Now let's start talking about proportions.

The CLT and Proportions

If we are dealing with something that is categorical/discrete, we often end up examining a proportion. Here, we will see how we can use the normal distribution to approximate the sampling distribution of a proportion. By sampling distribution we mean the following: if we knew the true proportion in the population, then what sample proportions would be likely to arise in our sample? We use these ideas to test hypotheses later.

It turns out that if the true population proportion is p, then the standard deviation of the sample proportion $\hat{p}$ is

$$\sigma_{\hat{p}} = \sqrt{\frac{p(1 - p)}{n}}$$

The sample proportion $\hat{p}$ follows an approximately normal distribution as long as $np$ and $n(1 - p)$ are both $\ge 10$.

Examples:

1. Suppose I will flip a fair coin 100 times.
(a) What is the probability that I get 60% or more heads? (We can and will also answer this question treating it as a binomial.)
(b) What range of the percentage of heads should I expect to get 95% of the time? Center this interval around the true proportion.

2. There is what is known as the basic strategy for playing Blackjack. Playing this strategy gives the player a 45% chance of winning a given hand. If you play 50 hands, what is the probability that you win more than 50% of the hands (and win $)?
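The coin-flip question can be set up in code as a check on hand calculations. This is a sketch under the normal approximation for a proportion, with no continuity correction. The helper name `proportion_model` is hypothetical, and the center of the normal model is taken to be the true proportion p.

```python
import math
from statistics import NormalDist

def proportion_model(p, n):
    """Approximate sampling distribution of the sample proportion.

    Raises an error when the np >= 10 and n(1 - p) >= 10 rule of thumb fails.
    """
    if min(n * p, n * (1 - p)) < 10:
        raise ValueError("np and n(1 - p) should both be at least 10")
    return NormalDist(p, math.sqrt(p * (1 - p) / n))

# Fair coin flipped 100 times: p = 0.5, n = 100, so the SD of p-hat is 0.05.
coin = proportion_model(0.5, 100)

# (a) Probability of 60% or more heads: Z = (0.60 - 0.50) / 0.05 = 2.
p_60_or_more = 1 - coin.cdf(0.60)

# (b) Central 95% range for the proportion of heads.
z_95 = NormalDist().inv_cdf(0.975)   # roughly 1.96
low = coin.mean - z_95 * coin.stdev
high = coin.mean + z_95 * coin.stdev

print("SD of p-hat:", round(coin.stdev, 4))
print("P(p-hat >= 0.60):", round(p_60_or_more, 4))
print("95% range:", round(low, 3), "to", round(high, 3))
```

The Blackjack question follows the same pattern with `proportion_model(0.45, 50)`, where the rule-of-thumb check passes because np = 22.5 and n(1 - p) = 27.5.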

Normal Approximation to the Binomial

Recall: If $X \sim \mathrm{BIN}(n, p)$, then $\mu_x = np$ and $\sigma_x^2 = np(1 - p)$.

When n is large and p is not too close to 0 or 1, then

$$\mathrm{BIN}(n, p) \approx N(np, \; np(1 - p))$$

The rule-of-thumb for this approximation to work is that $\min\{np, \; n(1 - p)\} \ge 10$.

Continuity Correction: Because we are approximating a discrete distribution with a continuous distribution, we must make a continuity correction. (ie) In the discrete case, $P(X \ge x) \ne P(X > x)$. For a count X that takes only whole-number values:

$P(X = k) = P(k - 0.5 \le X \le k + 0.5)$
$P(a \le X \le b) = P(a - 0.5 \le X \le b + 0.5)$
$P(a < X < b) = P(a + 0.5 \le X \le b - 0.5)$
$P(X < a) = P(X \le a - 0.5)$
$P(X \le a) = P(X \le a + 0.5)$
$P(X > a) = P(X \ge a + 0.5)$
$P(X \ge a) = P(X \ge a - 0.5)$

The right-hand side of each line is the form we evaluate using the normal distribution.

Note: The continuity correction makes little difference when n is large.

Examples:

1. Suppose I will flip a fair coin 100 times.
(a) What is the probability that I get 60 or more heads?
(b) What range of the number of heads should I expect to get 95% of the time?

2. Basic strategy for playing Blackjack. Playing this strategy gives the player a 45% chance of winning a given hand. If you play 50 hands, what is the probability that you win more than half of the hands (and win $)?
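To see what the continuity correction buys us, the sketch below compares the exact binomial probability of 60 or more heads in 100 fair flips with the normal approximation, with and without the correction. It uses only Python's standard library. The exact value is computed by summing binomial probabilities directly, which is an extra check and not a method covered in this section.

```python
import math
from statistics import NormalDist

n, p = 100, 0.5

# Approximating normal: mean np = 50, variance np(1 - p) = 25, so SD = 5.
approx = NormalDist(n * p, math.sqrt(n * p * (1 - p)))

# Exact binomial P(X >= 60), summing the probabilities for k = 60, ..., 100.
exact = sum(
    math.comb(n, k) * p ** k * (1 - p) ** (n - k)
    for k in range(60, n + 1)
)

# Normal approximation with the continuity correction: P(X >= 60) = P(X >= 59.5).
with_correction = 1 - approx.cdf(59.5)

# Normal approximation without the correction, for comparison.
without_correction = 1 - approx.cdf(60)

print("exact binomial:", round(exact, 4))
print("normal, with correction:", round(with_correction, 4))
print("normal, without correction:", round(without_correction, 4))
```

In this case the corrected approximation lands noticeably closer to the exact binomial answer than the uncorrected one.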

Normal Approximation to the Poisson

Recall: If $X \sim \mathrm{POISSON}(\lambda)$, then $\mu_x = \lambda$ and $\sigma_x^2 = \lambda$.

When $\lambda$ is large, then

$$\mathrm{POISSON}(\lambda) \approx N(\lambda, \lambda)$$

The rule-of-thumb for this approximation to work is that $\lambda \ge 20$.

As in the Binomial case, here we are approximating a discrete distribution using a continuous distribution, so we must use the same continuity correction as in the Binomial case.

Example:

1. Recall the example from the section on Poisson processes. We were monitoring the number of earthquakes in California over 6.7 in magnitude, and there were an average of 1.5 per year.
(a) What is the probability of having more than 28 large earthquakes in the next 15 years?
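The earthquake example can be set up as follows. This is a sketch assuming the yearly rate simply scales to 15 years, so that $\lambda = 1.5 \times 15 = 22.5$, which satisfies the rule-of-thumb. The exact Poisson tail is included only as a sanity check on the approximation, and the helper name `poisson_pmf` is hypothetical.

```python
import math
from statistics import NormalDist

rate_per_year = 1.5
years = 15
lam = rate_per_year * years   # expected count over 15 years: 22.5

def poisson_pmf(k, lam):
    """Exact Poisson probability of observing exactly k events."""
    return math.exp(-lam) * lam ** k / math.factorial(k)

# Exact P(X > 28) = 1 - P(X <= 28).
exact_tail = 1 - sum(poisson_pmf(k, lam) for k in range(0, 29))

# Normal approximation N(lam, lam); NormalDist takes the SD, sqrt(lam).
approx = NormalDist(lam, math.sqrt(lam))

# Continuity correction: P(X > 28) = P(X >= 28.5).
normal_tail = 1 - approx.cdf(28.5)

print("lambda:", lam)
print("exact Poisson tail:", round(exact_tail, 4))
print("normal approximation:", round(normal_tail, 4))
```

The two printed probabilities should be close but not identical, since $\lambda = 22.5$ is only just above the rule-of-thumb threshold and the Poisson distribution is still somewhat right-skewed there.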

Examples:

1. You have designed a new satellite that is planned to orbit in space for the next 150 years. It is set up in the following way. It has 25 battery packs. One powers the satellite, and when it burns out the next battery takes over, and when that one burns out the next takes over, and so on. The lifetime of each battery follows an exponential distribution with a mean of 8 years and a standard deviation of 8 years. What is the probability that the satellite runs out of batteries before the 150 years is up?

2. A standard bottle of beer advertises that it contains 341 mL of beer. In fact, the machine that pours the beer into the bottle pours a mean amount of 343 mL with a standard deviation of 2 mL. The amount of beer poured follows a normal distribution.
(a) What is the probability that a randomly selected bottle of beer is underfilled?
(b) If you buy a two-four (a case of 24), what is the probability that no more than 4 bottles are underfilled?
(c) If you buy a 6-pack, what is the probability that the average amount of liquid is less than 341 mL?

3. An elevator has a limit of 10 people or 2000 lbs. If 10 people get on the elevator, what is the probability that they surpass the weight limit? Suppose that the weights of people follow a normal distribution with a mean of 170 lbs and a standard deviation of 30 lbs.

4. It is believed that 4% of children have a gene that may be linked to juvenile diabetes. Researchers are hoping to track 20 or more of these children (with the defect) for several years. They will test 732 newborn babies for the presence of this gene, and if the gene is present, they will track the child for several years. What is the probability that they find 20 or more subjects to be in the study? (You can answer in terms of a proportion, or as a binomial.)