Sampling Distributions and the Central Limit Theorem


1 Sampling Distributions and the Central Limit Theorem February 18

2 Data distributions and sampling distributions
So far, we have discussed the distribution of data (i.e., of random variables in our sample, such as smoking status or height). For the rest of the course, we will be more interested in the distribution of statistics. For example: we go out, collect a sample of 10 people, measure their heights, and then take the average. Is this average a random variable?

3 Sources of variability
Yes, for several reasons. There is bound to be measurement error. A person's height is not perfectly constant: people are (slightly) taller in the morning than in the evening. And most importantly, the average height depends on the specific ten people who make up our sample. For all of these reasons, every time we repeat this procedure, we will get a different answer.

4 Sampling distributions
Therefore, our statistic is a random variable, and like any random variable, it will have a distribution. To reflect the fact that the distribution depends on the random sample, the distribution of a statistic is called a sampling distribution. In practice, no one obtains sampling distributions directly. Investigators do not collect 10 samples of 10 individuals and report 10 different means; if they can afford to sample 100 people, they collect a single sample of 100 people and report a single mean.

5 What's the point?
So why do we study sampling distributions? We study them to understand how, in theory, our statistic would be distributed. The ability to reproduce research is a cornerstone of the scientific method; sampling distributions describe the likely and unlikely outcomes of these replications. The variability of this distribution is the key to answering the question: how accurate is my generalization to the population likely to be?

6 Seeing sampling distributions
As I said before, investigators do not collect multiple samples in order to look at their sampling distributions. Since we cannot observe sampling distributions in practice, we need to do one of the following: (1) argue or prove that the statistic will have a certain distribution, like the normal or binomial distribution; or (2) conduct computer simulations in which we artificially create multiple samples. We will spend a large portion of our time in the second half of this course doing each of the above.
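As a rough illustration of the simulation approach, the sketch below repeatedly draws samples of 10 heights and records each sample's average. The population used here (normally distributed heights with mean 170 cm and standard deviation 10 cm) and the function name `simulate_sample_means` are assumptions made purely for illustration; they do not come from the slides.

```python
import random
import statistics

def simulate_sample_means(n=10, reps=1_000, seed=3):
    """Draw `reps` samples of size n and return each sample's average."""
    rng = random.Random(seed)  # seeded so the run is reproducible
    # Hypothetical population: heights ~ Normal(170 cm, 10 cm). Not from the source.
    return [statistics.fmean(rng.gauss(170, 10) for _ in range(n))
            for _ in range(reps)]

means = simulate_sample_means()
print(f"first five averages: {[round(m, 1) for m in means[:5]]}")
print(f"spread of the averages (SD): {statistics.pstdev(means):.2f}")
```

The list `means` is a simulated sampling distribution: each entry is the single number one investigator would have reported.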

7 Kerrich's experiment
Section outline: the law of averages; the expected value of the mean; the standard error of the mean; the distribution of the mean; the central limit theorem.
A South African mathematician named John Kerrich was visiting Copenhagen in 1940 when Germany invaded Denmark. Kerrich spent the next five years in an internment camp. To pass the time, he carried out a series of experiments in probability theory. One of them involved flipping a coin 10,000 times.

8 The law of averages
We all know that a coin lands heads with probability 50%. Thus, after many tosses, the law of averages says that the number of heads should be about the same as the number of tails. Or does it?

9 Kerrich's results
[Table: columns are number of tosses (n), number of heads, and number of heads minus 0.5 × number of tosses. Only the final row is legible: 10,000 tosses, 5,067 heads, difference 67.]

10 Kerrich's results plotted
[Figure: number of heads minus half the number of tosses (y-axis) against number of tosses (x-axis).]
Instead of getting closer, the numbers of heads and tails are getting farther apart.

11 Repeating the experiment 50 times
This is no fluke.
[Figure: number of heads minus half the number of tosses (y-axis) against number of tosses (x-axis), for 50 repetitions of the experiment.]

12 Where's the law of averages?
So where's the law of averages? As it turns out, the law of averages does not say that as n increases the number of heads will be close to the number of tails. What it says instead is that, as n increases, the average number of heads will get closer and closer to the population average of 0.5. The technical term for this is that the sample average, which is a statistic, converges to the population average, which is a parameter.

13 Repeating the experiment 50 times, Part II
[Figure: percentage of heads (y-axis) against number of tosses (x-axis), for 50 repetitions of the experiment.]
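The contrast between the two plots can be reproduced with a short simulation. This is only a sketch using a seeded pseudo-random coin, not Kerrich's data; the function name `coin_toss_path` and the chosen checkpoints are illustrative assumptions.

```python
import random

def coin_toss_path(n_tosses, checkpoints, seed=0):
    """Flip a fair coin n_tosses times; record results at each checkpoint.

    Returns a list of (n, heads - n/2, heads / n) tuples.
    """
    rng = random.Random(seed)  # seeded so the run is reproducible
    wanted = set(checkpoints)
    heads = 0
    rows = []
    for n in range(1, n_tosses + 1):
        heads += rng.random() < 0.5  # True counts as 1
        if n in wanted:
            rows.append((n, heads - n / 2, heads / n))
    return rows

for n, diff, prop in coin_toss_path(10_000, [10, 100, 1_000, 10_000]):
    print(f"n={n:>6}  heads - n/2 = {diff:+7.1f}  proportion = {prop:.4f}")
```

In a typical run the difference column wanders away from zero while the proportion column settles near 0.5, which is the distinction the law of averages actually draws.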

14 Trends in Kerrich's experiment
There are three very important trends going on in this experiment. These trends can be observed visually from the computer simulations or proven via the binomial distribution. We'll work with both approaches so that you can get a sense of how they both work and reinforce each other. Before we do so, I'll introduce two additional, important facts about the binomial distribution: its mean and standard deviation.

15 The mean and standard deviation of the binomial distribution
The mean of a binomial distribution is $np$. This makes intuitive sense: if the average number of times the event occurs on each trial is $p$, then it stands to reason that the average number of times it would occur with $n$ trials is $np$. When talking about the mean of a distribution, the term expected value is sometimes used, to avoid confusion with sample means. The standard deviation of a binomial distribution is $\sqrt{np(1-p)}$. This also makes sense: when $p$ is close to 0 or 1, there is less variability than when $p$ is close to 0.5, and as we have seen, variability goes up with $n$.
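One way to check the two formulas is to compare them with the mean and standard deviation computed directly from the binomial probabilities. The sketch below does this; the function names are illustrative and the values n = 100, p = 0.5 and p = 0.1 are arbitrary choices.

```python
import math

def binomial_mean_sd(n, p):
    """Closed-form mean and standard deviation of a Binomial(n, p) count."""
    return n * p, math.sqrt(n * p * (1 - p))

def binomial_mean_sd_from_pmf(n, p):
    """Same quantities computed directly from the probability mass function."""
    pmf = [math.comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]
    mean = sum(k * q for k, q in enumerate(pmf))
    var = sum((k - mean) ** 2 * q for k, q in enumerate(pmf))
    return mean, math.sqrt(var)

print(binomial_mean_sd(100, 0.5))           # formula
print(binomial_mean_sd_from_pmf(100, 0.5))  # brute force, should agree
print(binomial_mean_sd(100, 0.1))           # smaller SD when p is far from 0.5
```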

16 The expected value of the mean
Getting back to Kerrich's experiment, the number of heads that we will have after flipping a coin $n$ times follows a binomial distribution with number of trials $n$ and probability of the event occurring on any given trial 0.5. Thus, the expected number of heads after $n$ flips is $0.5n$. Furthermore, the expected value of the average is $0.5n/n = 0.5$.

17 The expected value of the mean (cont'd)
Thus, for all $n$, the expected value of the average is equal to the population parameter $p = 0.5$. Putting it another way, we can always expect the distribution of the sample average to be centered around the population average. Such statistics are called unbiased estimators.

18 The standard error of the mean
To avoid confusion with the sample standard deviation, the standard deviation of the sampling distribution of a statistic is called its standard error. In Kerrich's experiment, the standard deviation of the number of heads is $\sqrt{np(1-p)} = \sqrt{n(0.5)(0.5)} = 0.5\sqrt{n}$. The standard error (the standard deviation of the average number of heads) is therefore $\sqrt{np(1-p)}/n = 0.5\sqrt{n}/n = 0.5/\sqrt{n}$.

19 Variability in the population, the sum, and the mean
Note that, in the original population, the standard deviation was $\sqrt{p(1-p)} = 0.5$. Denoting this standard deviation as SD, the variability (standard deviation or standard error) of the original population, the sum, and the mean are related as follows:
Population: $SD$
Sum: $SD \cdot \sqrt{n}$
Mean: $SD/\sqrt{n}$
So, for example, as we double $n$, it is true that the variability of the sum will increase. However, the variability doesn't double; it only goes up by a factor of $\sqrt{2}$, because the variability of the sum grows like $\sqrt{n}$. Thus, when we divide by $n$ to obtain the average, we still have an expression that goes to 0 as $n$ increases.
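The relationships between population, sum, and mean can be derived in a few lines, assuming the $n$ observations are independent and share the same standard deviation SD. The symbols $X_i$, $S_n$, and $\bar{X}_n$ are notation introduced here for the individual observations, their sum, and their average; they are not taken from the slides.

```latex
\begin{align*}
S_n &= X_1 + \cdots + X_n, &
\operatorname{Var}(S_n) &= n\,SD^2, &
\operatorname{SD}(S_n) &= SD\sqrt{n}, \\
\bar{X}_n &= S_n / n, &
\operatorname{Var}(\bar{X}_n) &= \frac{n\,SD^2}{n^2} = \frac{SD^2}{n}, &
\operatorname{SD}(\bar{X}_n) &= \frac{SD}{\sqrt{n}}.
\end{align*}
```

Independence is what lets the variances add in the first line; dividing by $n$ then scales the variance by $1/n^2$.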

20 The square root law
The relationship between variability in the population and variability in the mean is a very important relationship, sometimes called the square root law: $SE = SD/\sqrt{n}$. We will see variations on this theme many times in the second half of this course. Once again, we see this phenomenon visually in our simulation results.
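The square root law can be checked numerically for the coin-flipping setting, where SD = 0.5. The sketch below estimates the standard error by simulation and compares it with $0.5/\sqrt{n}$; the function name, the number of replications, and the sample sizes are illustrative assumptions.

```python
import random
import statistics

def simulated_se_of_mean(n, reps=2_000, seed=1):
    """Estimate the standard error of the proportion of heads in n fair flips."""
    rng = random.Random(seed)
    means = [
        sum(rng.random() < 0.5 for _ in range(n)) / n  # one replication's average
        for _ in range(reps)
    ]
    return statistics.pstdev(means)

for n in (4, 16, 64, 256):
    print(f"n={n:>3}  simulated SE = {simulated_se_of_mean(n):.4f}  "
          f"0.5/sqrt(n) = {0.5 / n ** 0.5:.4f}")
```

Each fourfold increase in n should roughly halve the simulated standard error.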

21 The distribution of the mean
Finally, let's look at the distribution of the mean by creating histograms of the mean in our simulation.
[Figure: histograms of the mean for 2 flips, 7 flips, and 25 flips; x-axis: mean, y-axis: frequency.]

22 The central limit theorem
In summary, there are three very important phenomena going on here concerning the sampling distribution of the sample average:
#1 The expected value is always equal to the population average.
#2 The standard error is always equal to the population standard deviation divided by the square root of n.
#3 As n gets larger, the sampling distribution looks more and more like the normal distribution.
Furthermore, it can be proven that these three properties of the sampling distribution of the sample average hold for any distribution in the underlying population (strictly, any distribution with a finite standard deviation).

23 The central limit theorem (cont'd)
This result is called the central limit theorem, and it is one of the most important, remarkable, and powerful results in all of statistics. In the real world, we rarely know the distribution of our data. But the central limit theorem says: we don't have to.

24 The central limit theorem (cont'd)
Furthermore, as we have seen, knowing the mean and standard deviation of a distribution that is approximately normal allows us to calculate anything we wish to know with tremendous accuracy, and the sampling distribution of the mean is always approximately normal. The only caveats: observations must be independently drawn from and representative of the population, and the central limit theorem applies to the sampling distribution of the mean, not necessarily to the sampling distribution of other statistics. How large does n have to be before the distribution becomes close enough in shape to the normal distribution?

25 Rules of thumb frequently recommend n = 20 or n = 30 as large enough to be sure that the central limit theorem is working. There is some truth to such rules, but in reality, whether n is large enough for the central limit theorem to provide an accurate approximation to the true sampling distribution depends on how close to normal the population distribution is. If the original distribution is close to normal, n = 2 might be enough. If the underlying distribution is highly skewed or strange in some other way, n = 50 might not be enough.

26 Example #1
[Figures: density of the population (x-axis: x) and density of the sample means for n=10.]

27 Example #2
For example, imagine an urn containing the numbers 1, 2, and 9.
[Figures: density of the sample mean for n=20 and for n=50.]

28 Example #2 (cont'd)
[Figure: density of the sample mean for n=100.]
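The urn example is easy to simulate. The sketch below draws with replacement from the numbers 1, 2, and 9 (drawing with replacement is an assumption; the slides do not say) and uses skewness as a rough numerical stand-in for "how non-normal the histogram looks". The function names and replication count are illustrative.

```python
import random
import statistics

URN = (1, 2, 9)  # population from the example; each draw is with replacement

def sample_means(n, reps=5_000, seed=2):
    """Return `reps` sample means, each from n draws out of the urn."""
    rng = random.Random(seed)
    return [statistics.fmean(rng.choices(URN, k=n)) for _ in range(reps)]

def skewness(values):
    """Standardized third moment; 0 for a symmetric distribution."""
    mu = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return sum((v - mu) ** 3 for v in values) / (len(values) * sd ** 3)

for n in (1, 20, 50, 100):
    means = sample_means(n)
    print(f"n={n:>3}  mean={statistics.fmean(means):.2f}  "
          f"SE={statistics.pstdev(means):.3f}  skewness={skewness(means):+.3f}")
```

The mean of the sample means should stay near the urn's average of 4, while the skewness shrinks toward 0 as n grows.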

29 Example #3
Weight tends to be skewed to the right (far more people are overweight than underweight). Let's perform an experiment in which the NHANES sample of adult men is the population. I am going to randomly draw twenty-person samples from this population (i.e., I am re-sampling the original sample).

30 Example #3 (cont'd)
[Figure: density of the sample mean for n=20.]

31 Sampling distribution of serum cholesterol
According to the National Center for Health Statistics, the distribution of serum cholesterol levels for 20- to 74-year-old males living in the United States has a mean of 211 mg/dl and a standard deviation of 46 mg/dl. We are planning to collect a sample of 25 individuals and measure their cholesterol levels. What is the probability that our sample average will be above 230?

32 Procedure: Probabilities using the central limit theorem
Calculating probabilities using the central limit theorem is quite similar to calculating them from the normal distribution, with one extra step:
#1 Calculate the standard error: $SE = SD/\sqrt{n}$, where SD is the population standard deviation.
#2 Draw a picture of the normal approximation to the sampling distribution and shade in the appropriate probability.
#3 Convert to standard units: $z = (x - \mu)/SE$, where $\mu$ is the population mean.
#4 Determine the area under the normal curve using a table or computer.

33 Example #1: Solution
We begin by calculating the standard error: $SE = SD/\sqrt{n} = 46/\sqrt{25} = 9.2$. Note that it is smaller than the standard deviation by a factor of $\sqrt{n}$.

34 Example #1: Solution
After drawing a picture, we would determine how many standard errors away from the mean 230 is: $(230 - 211)/9.2 = 2.07$. What is the probability that a normally distributed random variable is more than 2.07 standard deviations above the mean? About 1.9%.
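The same calculation can be done by computer rather than with a table. The sketch below uses `statistics.NormalDist` from the Python standard library; the function name `prob_sample_mean_above` is an illustrative choice, and step #2 (drawing the picture) has no code counterpart.

```python
from math import sqrt
from statistics import NormalDist

def prob_sample_mean_above(x, mu, sd, n):
    """Normal approximation to P(sample mean > x) for a sample of size n."""
    se = sd / sqrt(n)                      # step 1: standard error
    z = (x - mu) / se                      # step 3: convert to standard units
    return se, z, 1 - NormalDist().cdf(z)  # step 4: upper-tail area

se, z, p = prob_sample_mean_above(230, mu=211, sd=46, n=25)
print(f"SE = {se:.1f}, z = {z:.2f}, P = {p:.1%}")  # SE = 9.2, z = 2.07, P = 1.9%
```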

35 Comparison with population
Note that this is a very different number from the percentage of the population that has a cholesterol level above 230. That number is 34.0% (230 is 0.41 standard deviations above the mean). The mean of a group is much less variable than individuals. As Sherlock Holmes says in The Sign of the Four: "While the individual man is an insoluble puzzle, in the aggregate he becomes a mathematical certainty. You can, for example, never foretell what any one man will do, but you can say with precision what an average number will be up to. Individuals vary, but percentages remain constant. So says the statistician."

36 Procedure: Central limit theorem percentiles
We can also use the central limit theorem to approximate percentiles of the sampling distribution:
#1 Calculate the standard error: $SE = SD/\sqrt{n}$.
#2 Draw a picture of the normal curve and shade in the appropriate area under the curve.
#3 Determine the percentiles of the normal curve corresponding to the shaded region using a table or computer.
#4 Convert from standard units back to the original units: $x = \mu + z(SE)$.

37 Percentiles
We can use that procedure to answer the question, "95% of our sample averages will fall between what two numbers?" Note that the standard error is the same as it was before: 9.2. What two values of the normal distribution contain 95% of the data? The 2.5th percentile of the normal distribution is $-1.96$. Thus, a normally distributed random variable will lie within 1.96 standard deviations of its mean 95% of the time.

38 Example #2: Solution
Which numbers are 1.96 standard errors away from the expected value of the sampling distribution? $211 - 1.96(9.2) = 193.0$ and $211 + 1.96(9.2) = 229.0$. Therefore, 95% of our sample averages will fall between 193 mg/dl and 229 mg/dl.
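The percentile procedure can be sketched the same way, with `NormalDist().inv_cdf` standing in for the normal table. The function name `central_interval` and its `coverage` parameter are illustrative assumptions.

```python
from math import sqrt
from statistics import NormalDist

def central_interval(mu, sd, n, coverage=0.95):
    """Interval expected to contain `coverage` of sample means (normal approx.)."""
    se = sd / sqrt(n)
    z = NormalDist().inv_cdf(0.5 + coverage / 2)  # e.g. 97.5th percentile, about 1.96
    return mu - z * se, mu + z * se

low, high = central_interval(mu=211, sd=46, n=25)
print(f"({low:.1f}, {high:.1f})")  # (193.0, 229.0)
```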

39 Example #3
What if we had only collected samples of size 10? Now, the standard error is $SE = 46/\sqrt{10} = 14.5$. Now what is the probability that our sample average will be above 230?

40 Example #3: Solution
Now 230 is only $(230 - 211)/14.5 = 1.31$ standard deviations away from the expected value. The probability of being more than 1.31 standard deviations above the mean is 9.6%. This is almost 5 times higher than the 1.9% we calculated earlier for the larger sample size.

41 Example #4
What about the values that would contain 95% of our sample averages? The values 1.96 standard errors away from the expected value are now $211 - 1.96(14.5) = 182.5$ and $211 + 1.96(14.5) = 239.5$. Note how much wider this interval is than the interval (193, 229) for the larger sample size.

42 Example #5
What if we'd increased the sample size to 50? Now the standard error is 6.5, and the values $211 - 1.96(6.5) = 198.2$ and $211 + 1.96(6.5) = 223.8$ contain 95% of the sample averages.

43 Summary
n    SE     Interval          Width of interval
10   14.5   (182.5, 239.5)    57.0
25   9.2    (193.0, 229.0)    36.0
50   6.5    (198.2, 223.8)    25.6
The width of the interval is going down by what factor?
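The summary table can be regenerated with a few lines of code, which also makes the pattern in the widths easier to see. This is a sketch with illustrative names; the printed widths may differ from the table in the last digit, since the table appears to subtract endpoints that were already rounded.

```python
from math import sqrt
from statistics import NormalDist

MU, SD = 211, 46                 # population values from the cholesterol example
Z = NormalDist().inv_cdf(0.975)  # about 1.96

def summary_row(n):
    """Return (n, SE, lower limit, upper limit, width) for sample size n."""
    se = SD / sqrt(n)
    low, high = MU - Z * se, MU + Z * se
    return n, se, low, high, high - low

for n, se, low, high, width in map(summary_row, (10, 25, 50)):
    print(f"{n:>3} {se:>5.1f}  ({low:.1f}, {high:.1f})  "
          f"width = {width:.1f}  width * sqrt(n) = {width * sqrt(n):.1f}")
```

The last column is constant across rows: the width is proportional to $1/\sqrt{n}$, so quadrupling the sample size halves the interval.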

44 Example #6
Finally, we ask a slightly harder question: How large would the sample size need to be in order to ensure a 95% probability that the sample average will be within 5 mg/dl of the population mean? As we saw earlier, 95% of observations fall within 1.96 standard deviations of the mean. Thus, we need to get the standard error to satisfy $1.96(SE) = 5$, so $SE = 5/1.96 = 2.55$.

45 Example #6: Solution
The standard error is equal to the standard deviation over the square root of $n$, so $2.55 = SD/\sqrt{n} = 46/\sqrt{n}$, which gives $n = (46/2.55)^2 \approx 325.4$. In the real world, we of course cannot sample a fraction of a person, so we would round up and sample 326 to be safe.

46 Example #7
How large would the sample size need to be in order to ensure a 90% probability that the sample average will be within 10 mg/dl of the population mean? There is a 90% probability that a normally distributed random variable will fall within 1.645 standard deviations of the mean. Thus, we want $1.645(SE) = 10$, so $10/1.645 = 46/\sqrt{n}$, which gives $n = 57.3$. Thus, we would sample 58 people.
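Both sample-size questions follow the same recipe: solve $z \cdot SD/\sqrt{n} = \text{margin}$ for $n$ and round up. A minimal sketch, with an illustrative function name and parameter names:

```python
from math import ceil
from statistics import NormalDist

def required_sample_size(sd, margin, coverage=0.95):
    """Smallest n with z * sd / sqrt(n) <= margin under the normal approximation."""
    z = NormalDist().inv_cdf(0.5 + coverage / 2)
    return ceil((z * sd / margin) ** 2)

print(required_sample_size(46, 5))         # 95% within 5 mg/dl  -> 326
print(required_sample_size(46, 10, 0.90))  # 90% within 10 mg/dl -> 58
```

Rounding up rather than to the nearest integer keeps the probability at or above the target.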


Figure 1: 2πσ is said to have a normal distribution with mean µ and standard deviation σ. This is also denoted Figure 1: Math 223 Lecture Notes 4/1/04 Section 4.10 The normal distribution Recall that a continuous random variable X with probability distribution function f(x) = 1 µ)2 (x e 2σ 2πσ is said to have a

More information

Chapter 14. From Randomness to Probability. Copyright 2010 Pearson Education, Inc.

Chapter 14. From Randomness to Probability. Copyright 2010 Pearson Education, Inc. Chapter 14 From Randomness to Probability Copyright 2010 Pearson Education, Inc. Dealing with Random Phenomena A random phenomenon is a situation in which we know what outcomes could happen, but we don

More information

Business Statistics 41000: Probability 4

Business Statistics 41000: Probability 4 Business Statistics 41000: Probability 4 Drew D. Creal University of Chicago, Booth School of Business February 14 and 15, 2014 1 Class information Drew D. Creal Email: dcreal@chicagobooth.edu Office:

More information

MAKING SENSE OF DATA Essentials series

MAKING SENSE OF DATA Essentials series MAKING SENSE OF DATA Essentials series THE NORMAL DISTRIBUTION Copyright by City of Bradford MDC Prerequisites Descriptive statistics Charts and graphs The normal distribution Surveys and sampling Correlation

More information

8.1 Estimation of the Mean and Proportion

8.1 Estimation of the Mean and Proportion 8.1 Estimation of the Mean and Proportion Statistical inference enables us to make judgments about a population on the basis of sample information. The mean, standard deviation, and proportions of a population

More information

Sampling Distributions For Counts and Proportions

Sampling Distributions For Counts and Proportions Sampling Distributions For Counts and Proportions IPS Chapter 5.1 2009 W. H. Freeman and Company Objectives (IPS Chapter 5.1) Sampling distributions for counts and proportions Binomial distributions for

More information

Chapter 8 Estimation

Chapter 8 Estimation Chapter 8 Estimation There are two important forms of statistical inference: estimation (Confidence Intervals) Hypothesis Testing Statistical Inference drawing conclusions about populations based on samples

More information

1. Statistical problems - a) Distribution is known. b) Distribution is unknown.

1. Statistical problems - a) Distribution is known. b) Distribution is unknown. Probability February 5, 2013 Debdeep Pati Estimation 1. Statistical problems - a) Distribution is known. b) Distribution is unknown. 2. When Distribution is known, then we can have either i) Parameters

More information

The topics in this section are related and necessary topics for both course objectives.

The topics in this section are related and necessary topics for both course objectives. 2.5 Probability Distributions The topics in this section are related and necessary topics for both course objectives. A probability distribution indicates how the probabilities are distributed for outcomes

More information

Chapter 9: Sampling Distributions

Chapter 9: Sampling Distributions Chapter 9: Sampling Distributions 9. Introduction This chapter connects the material in Chapters 4 through 8 (numerical descriptive statistics, sampling, and probability distributions, in particular) with

More information

. 13. The maximum error (margin of error) of the estimate for μ (based on known σ) is:

. 13. The maximum error (margin of error) of the estimate for μ (based on known σ) is: Statistics Sample Exam 3 Solution Chapters 6 & 7: Normal Probability Distributions & Estimates 1. What percent of normally distributed data value lie within 2 standard deviations to either side of the

More information

ECON 214 Elements of Statistics for Economists

ECON 214 Elements of Statistics for Economists ECON 214 Elements of Statistics for Economists Session 7 The Normal Distribution Part 1 Lecturer: Dr. Bernardin Senadza, Dept. of Economics Contact Information: bsenadza@ug.edu.gh College of Education

More information

STAT 201 Chapter 6. Distribution

STAT 201 Chapter 6. Distribution STAT 201 Chapter 6 Distribution 1 Random Variable We know variable Random Variable: a numerical measurement of the outcome of a random phenomena Capital letter refer to the random variable Lower case letters

More information

Stat 213: Intro to Statistics 9 Central Limit Theorem

Stat 213: Intro to Statistics 9 Central Limit Theorem 1 Stat 213: Intro to Statistics 9 Central Limit Theorem H. Kim Fall 2007 2 unknown parameters Example: A pollster is sure that the responses to his agree/disagree questions will follow a binomial distribution,

More information

Homework: (Due Wed) Chapter 10: #5, 22, 42

Homework: (Due Wed) Chapter 10: #5, 22, 42 Announcements: Discussion today is review for midterm, no credit. You may attend more than one discussion section. Bring 2 sheets of notes and calculator to midterm. We will provide Scantron form. Homework:

More information

STAT 157 HW1 Solutions

STAT 157 HW1 Solutions STAT 157 HW1 Solutions http://www.stat.ucla.edu/~dinov/courses_students.dir/10/spring/stats157.dir/ Problem 1. 1.a: (6 points) Determine the Relative Frequency and the Cumulative Relative Frequency (fill

More information

Sampling & Confidence Intervals

Sampling & Confidence Intervals Sampling & Confidence Intervals Mark Lunt Arthritis Research UK Epidemiology Unit University of Manchester 24/10/2017 Principles of Sampling Often, it is not practical to measure every subject in a population.

More information

Week 1 Variables: Exploration, Familiarisation and Description. Descriptive Statistics.

Week 1 Variables: Exploration, Familiarisation and Description. Descriptive Statistics. Week 1 Variables: Exploration, Familiarisation and Description. Descriptive Statistics. Convergent validity: the degree to which results/evidence from different tests/sources, converge on the same conclusion.

More information

Elementary Statistics

Elementary Statistics Chapter 7 Estimation Goal: To become familiar with how to use Excel 2010 for Estimation of Means. There is one Stat Tool in Excel that is used with estimation of means, T.INV.2T. Open Excel and click on

More information

CHAPTER 4 DISCRETE PROBABILITY DISTRIBUTIONS

CHAPTER 4 DISCRETE PROBABILITY DISTRIBUTIONS CHAPTER 4 DISCRETE PROBABILITY DISTRIBUTIONS A random variable is the description of the outcome of an experiment in words. The verbal description of a random variable tells you how to find or calculate

More information

MLLunsford 1. Activity: Central Limit Theorem Theory and Computations

MLLunsford 1. Activity: Central Limit Theorem Theory and Computations MLLunsford 1 Activity: Central Limit Theorem Theory and Computations Concepts: The Central Limit Theorem; computations using the Central Limit Theorem. Prerequisites: The student should be familiar with

More information

STAB22 section 1.3 and Chapter 1 exercises

STAB22 section 1.3 and Chapter 1 exercises STAB22 section 1.3 and Chapter 1 exercises 1.101 Go up and down two times the standard deviation from the mean. So 95% of scores will be between 572 (2)(51) = 470 and 572 + (2)(51) = 674. 1.102 Same idea

More information

The Binomial and Geometric Distributions. Chapter 8

The Binomial and Geometric Distributions. Chapter 8 The Binomial and Geometric Distributions Chapter 8 8.1 The Binomial Distribution A binomial experiment is statistical experiment that has the following properties: The experiment consists of n repeated

More information

The probability of having a very tall person in our sample. We look to see how this random variable is distributed.

The probability of having a very tall person in our sample. We look to see how this random variable is distributed. Distributions We're doing things a bit differently than in the text (it's very similar to BIOL 214/312 if you've had either of those courses). 1. What are distributions? When we look at a random variable,

More information

Lecture 12. Some Useful Continuous Distributions. The most important continuous probability distribution in entire field of statistics.

Lecture 12. Some Useful Continuous Distributions. The most important continuous probability distribution in entire field of statistics. ENM 207 Lecture 12 Some Useful Continuous Distributions Normal Distribution The most important continuous probability distribution in entire field of statistics. Its graph, called the normal curve, is

More information

Lecture Data Science

Lecture Data Science Web Science & Technologies University of Koblenz Landau, Germany Lecture Data Science Statistics Foundations JProf. Dr. Claudia Wagner Learning Goals How to describe sample data? What is mode/median/mean?

More information

ME3620. Theory of Engineering Experimentation. Spring Chapter III. Random Variables and Probability Distributions.

ME3620. Theory of Engineering Experimentation. Spring Chapter III. Random Variables and Probability Distributions. ME3620 Theory of Engineering Experimentation Chapter III. Random Variables and Probability Distributions Chapter III 1 3.2 Random Variables In an experiment, a measurement is usually denoted by a variable

More information

8.2 The Standard Deviation as a Ruler Chapter 8 The Normal and Other Continuous Distributions 8-1

8.2 The Standard Deviation as a Ruler Chapter 8 The Normal and Other Continuous Distributions 8-1 8.2 The Standard Deviation as a Ruler Chapter 8 The Normal and Other Continuous Distributions For Example: On August 8, 2011, the Dow dropped 634.8 points, sending shock waves through the financial community.

More information

But suppose we want to find a particular value for y, at which the probability is, say, 0.90? In other words, we want to figure out the following:

But suppose we want to find a particular value for y, at which the probability is, say, 0.90? In other words, we want to figure out the following: More on distributions, and some miscellaneous topics 1. Reverse lookup and the normal distribution. Up until now, we wanted to find probabilities. For example, the probability a Swedish man has a brain

More information

Homework: Due Wed, Feb 20 th. Chapter 8, # 60a + 62a (count together as 1), 74, 82

Homework: Due Wed, Feb 20 th. Chapter 8, # 60a + 62a (count together as 1), 74, 82 Announcements: Week 5 quiz begins at 4pm today and ends at 3pm on Wed If you take more than 20 minutes to complete your quiz, you will only receive partial credit. (It doesn t cut you off.) Today: Sections

More information

The Accuracy of Percentages. Confidence Intervals

The Accuracy of Percentages. Confidence Intervals The Accuracy of Percentages Confidence Intervals 1 Review: a 0-1 Box Box average = fraction of tickets which equal 1 Box SD = (fraction of 0 s) x (fraction of 1 s) 2 With a simple random sample, the expected

More information

Math 227 (Statistics) Chapter 6 Practice Test MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question.

Math 227 (Statistics) Chapter 6 Practice Test MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. Math 227 (Statistics) Chapter 6 Practice Test MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. Using the following uniform density curve, answer the

More information

STAT 241/251 - Chapter 7: Central Limit Theorem

STAT 241/251 - Chapter 7: Central Limit Theorem STAT 241/251 - Chapter 7: Central Limit Theorem In this chapter we will introduce the most important theorem in statistics; the central limit theorem. What have we seen so far? First, we saw that for an

More information

Value (x) probability Example A-2: Construct a histogram for population Ψ.

Value (x) probability Example A-2: Construct a histogram for population Ψ. Calculus 111, section 08.x The Central Limit Theorem notes by Tim Pilachowski If you haven t done it yet, go to the Math 111 page and download the handout: Central Limit Theorem supplement. Today s lecture

More information

CHAPTER 5 SAMPLING DISTRIBUTIONS

CHAPTER 5 SAMPLING DISTRIBUTIONS CHAPTER 5 SAMPLING DISTRIBUTIONS Sampling Variability. We will visualize our data as a random sample from the population with unknown parameter μ. Our sample mean Ȳ is intended to estimate population mean

More information

When we look at a random variable, such as Y, one of the first things we want to know, is what is it s distribution?

When we look at a random variable, such as Y, one of the first things we want to know, is what is it s distribution? Distributions 1. What are distributions? When we look at a random variable, such as Y, one of the first things we want to know, is what is it s distribution? In other words, if we have a large number of

More information

Lecture 6: Chapter 6

Lecture 6: Chapter 6 Lecture 6: Chapter 6 C C Moxley UAB Mathematics 3 October 16 6.1 Continuous Probability Distributions Last week, we discussed the binomial probability distribution, which was discrete. 6.1 Continuous Probability

More information

4: Probability. What is probability? Random variables (RVs)

4: Probability. What is probability? Random variables (RVs) 4: Probability b binomial µ expected value [parameter] n number of trials [parameter] N normal p probability of success [parameter] pdf probability density function pmf probability mass function RV random

More information

Week 2 Quantitative Analysis of Financial Markets Hypothesis Testing and Confidence Intervals

Week 2 Quantitative Analysis of Financial Markets Hypothesis Testing and Confidence Intervals Week 2 Quantitative Analysis of Financial Markets Hypothesis Testing and Confidence Intervals Christopher Ting http://www.mysmu.edu/faculty/christophert/ Christopher Ting : christopherting@smu.edu.sg :

More information

MidTerm 1) Find the following (round off to one decimal place):

MidTerm 1) Find the following (round off to one decimal place): MidTerm 1) 68 49 21 55 57 61 70 42 59 50 66 99 Find the following (round off to one decimal place): Mean = 58:083, round off to 58.1 Median = 58 Range = max min = 99 21 = 78 St. Deviation = s = 8:535,

More information

Statistics, Measures of Central Tendency I

Statistics, Measures of Central Tendency I Statistics, Measures of Central Tendency I We are considering a random variable X with a probability distribution which has some parameters. We want to get an idea what these parameters are. We perfom

More information

IOP 201-Q (Industrial Psychological Research) Tutorial 5

IOP 201-Q (Industrial Psychological Research) Tutorial 5 IOP 201-Q (Industrial Psychological Research) Tutorial 5 TRUE/FALSE [1 point each] Indicate whether the sentence or statement is true or false. 1. To establish a cause-and-effect relation between two variables,

More information

Sampling Distributions Chapter 18

Sampling Distributions Chapter 18 Sampling Distributions Chapter 18 Parameter vs Statistic Example: Identify the population, the parameter, the sample, and the statistic in the given settings. a) The Gallup Poll asked a random sample of

More information

Chapter 7 presents the beginning of inferential statistics. The two major activities of inferential statistics are

Chapter 7 presents the beginning of inferential statistics. The two major activities of inferential statistics are Chapter 7 presents the beginning of inferential statistics. Concept: Inferential Statistics The two major activities of inferential statistics are 1 to use sample data to estimate values of population

More information

STAT:2010 Statistical Methods and Computing. Using density curves to describe the distribution of values of a quantitative

STAT:2010 Statistical Methods and Computing. Using density curves to describe the distribution of values of a quantitative STAT:10 Statistical Methods and Computing Normal Distributions Lecture 4 Feb. 6, 17 Kate Cowles 374 SH, 335-0727 kate-cowles@uiowa.edu 1 2 Using density curves to describe the distribution of values of

More information

The Binomial Probability Distribution

The Binomial Probability Distribution The Binomial Probability Distribution MATH 130, Elements of Statistics I J. Robert Buchanan Department of Mathematics Fall 2017 Objectives After this lesson we will be able to: determine whether a probability

More information

Lecture Slides. Elementary Statistics Tenth Edition. by Mario F. Triola. and the Triola Statistics Series

Lecture Slides. Elementary Statistics Tenth Edition. by Mario F. Triola. and the Triola Statistics Series Lecture Slides Elementary Statistics Tenth Edition and the Triola Statistics Series by Mario F. Triola Slide 1 Chapter 5 Probability Distributions 5-1 Overview 5-2 Random Variables 5-3 Binomial Probability

More information

Chapter 17. The. Value Example. The Standard Error. Example The Short Cut. Classifying and Counting. Chapter 17. The.

Chapter 17. The. Value Example. The Standard Error. Example The Short Cut. Classifying and Counting. Chapter 17. The. Context Short Part V Chance Variability and Short Last time, we learned that it can be helpful to take real-life chance processes and turn them into a box model. outcome of the chance process then corresponds

More information

work to get full credit.

work to get full credit. Chapter 18 Review Name Date Period Write complete answers, using complete sentences where necessary.show your work to get full credit. MULTIPLE CHOICE. Choose the one alternative that best completes the

More information

The "bell-shaped" curve, or normal curve, is a probability distribution that describes many real-life situations.

The bell-shaped curve, or normal curve, is a probability distribution that describes many real-life situations. 6.1 6.2 The Standard Normal Curve The "bell-shaped" curve, or normal curve, is a probability distribution that describes many real-life situations. Basic Properties 1. The total area under the curve is.

More information

Lecture 2. Probability Distributions Theophanis Tsandilas

Lecture 2. Probability Distributions Theophanis Tsandilas Lecture 2 Probability Distributions Theophanis Tsandilas Comment on measures of dispersion Why do common measures of dispersion (variance and standard deviation) use sums of squares: nx (x i ˆµ) 2 i=1

More information

Comparing Estimators

Comparing Estimators Comparing Estimators The Median For the sake of discussion, assume that we are measuring the heights of randomly selected adult men from the U.S. Also for the sake of discussion, let's suppose that this

More information

The Binomial Distribution

The Binomial Distribution The Binomial Distribution January 31, 2019 Contents The Binomial Distribution The Normal Approximation to the Binomial The Binomial Hypothesis Test Computing Binomial Probabilities in R 30 Problems The

More information

Essential Question: What is a probability distribution for a discrete random variable, and how can it be displayed?

Essential Question: What is a probability distribution for a discrete random variable, and how can it be displayed? COMMON CORE N 3 Locker LESSON Distributions Common Core Math Standards The student is expected to: COMMON CORE S-IC.A. Decide if a specified model is consistent with results from a given data-generating

More information