Statistics, Their Distributions, and the Central Limit Theorem

MATH 3342, Sections 5.3 and 5.4

Sample Means
Suppose you sample from a population 10 times. You record the following sample means:
10.1 9.5 9.6 10.2 9.5 9.2 10.4 9.3 8.5 11.0
Why aren't the values all the same?

Statistics
A statistic is any quantity whose value can be calculated from sample data.
Before obtaining data:
- It is uncertain what value a statistic will take.
- A statistic is a random variable (RV) and is denoted in uppercase.
After obtaining data:
- It is known what value the statistic takes for that data.
- An observed value of a statistic is denoted in lowercase.

Parameters
A parameter is any characteristic of a population.
- For a given population, the value that a parameter takes is fixed.
- The value of a parameter is usually unknown to us in practice.
We use statistics to estimate parameters!

Review
Parameters:
- Describe populations
- Have fixed values for a given population
- Have values that are unknown in practice
Statistics:
- Describe samples
- Change from sample to sample
- Have values that are calculated for a given sample

Random Samples
The RVs $X_1, X_2, \dots, X_n$ are said to form a (simple) random sample of size n if:
- The $X_i$'s are independent RVs
- Every $X_i$ has the same probability distribution
This is the same as saying that the $X_i$'s are independent and identically distributed (iid).

Random Sampling Error
The deviation between the statistic and the parameter.
- Caused by chance in selecting a random sample.
- This includes only random sampling error, NOT errors associated with choosing bad samples.

A Population Distribution
For a given variable, this is the probability distribution of the values the RV can take among all of the individuals in the population.
IMPORTANT: Describes the individuals in the population.

A Sampling Distribution
The probability distribution of a statistic over all possible samples of the same size from the same population.
IMPORTANT: Describes a statistic calculated from samples from a given population.

Developing a Sampling Distribution
Assume there is a population:
- Population size N = 4 (individuals A, B, C, D)
- Measurement of interest is the age of the individuals
- Values: 18, 20, 22, 24 (years)

Consider All Possible Samples of Size n = 2

             2nd Observation
1st Obs    18      20      22      24
18         18,18   18,20   18,22   18,24
20         20,18   20,20   20,22   20,24
22         22,18   22,20   22,22   22,24
24         24,18   24,20   24,22   24,24

16 possible samples (sampling with replacement)

16 Sample Means

             2nd Observation
1st Obs    18    20    22    24
18         18    19    20    21
20         19    20    21    22
22         20    21    22    23
24         21    22    23    24

Displaying the Sampling Distribution
[Figure: Sample Mean Distribution, n = 2. Bar chart of $P(\bar{X})$ against $\bar{X}$ for the values 18, 19, 20, 21, 22, 23, 24.]
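The bar chart did not survive extraction, but its heights can be recomputed from the table of 16 sample means. The following is a minimal sketch, not part of the original slides; the variable names are illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

ages = [18, 20, 22, 24]  # population values from the slide

# All ordered samples of size 2, drawn with replacement (4 * 4 = 16).
samples = list(product(ages, repeat=2))

# Sample mean of each sample, kept exact with Fraction.
means = [Fraction(a + b, 2) for a, b in samples]

# Sampling distribution: P(sample mean = m) = (number of samples with mean m) / 16.
counts = Counter(means)
sampling_dist = {m: Fraction(c, len(samples)) for m, c in sorted(counts.items())}

for m, p in sampling_dist.items():
    print(f"mean = {float(m):4.1f}   P = {p}")
```

The most likely sample mean is 21, which is also the population mean; the extreme values 18 and 24 each occur in only one of the 16 samples.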

Population Distribution vs. Sampling Distribution
[Figure: two bar charts side by side. Left: population distribution, P(X) against X for individuals A, B, C, D with values 18, 20, 22, 24. Right: sample mean distribution for n = 2, $P(\bar{X})$ against $\bar{X}$ for the values 18 through 24.]

The Law of Large Numbers
A sample is drawn at random from any population with mean $\mu$. As the number of observations goes up, the sample mean $\bar{x}$ gets closer to the population mean $\mu$.

Example: Class of 20 Students
Suppose there are 20 people in a class and you are interested in the average height of the class.
The heights (in inches):
72 64 75 63 62 61 68 76 59 73 67 66 65 64 60 65 70 56 71 62
The average height is 65.95 in.

[Figure: Simulation, n = 5]

[Figure: Simulation, n = 10]

[Figure: Simulation, n = 15]
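The simulation plots are missing from this transcript. A rough way to reproduce the idea is sketched below. It assumes samples are drawn without replacement from the 20 students, uses 1000 repetitions per sample size, and fixes a seed for reproducibility; none of these choices are stated in the slides.

```python
import random
from statistics import mean, pstdev

# Heights (in inches) of the 20 students, as listed on the slide.
heights = [72, 64, 75, 63, 62, 61, 68, 76, 59, 73,
           67, 66, 65, 64, 60, 65, 70, 56, 71, 62]
population_mean = mean(heights)  # 65.95

rng = random.Random(0)  # fixed seed: an assumption, not from the slides


def sample_means(n, reps=1000):
    """Means of `reps` random samples of size n, drawn without replacement."""
    return [mean(rng.sample(heights, n)) for _ in range(reps)]


for n in (5, 10, 15):
    means = sample_means(n)
    print(f"n = {n:2d}: average of sample means = {mean(means):.2f}, "
          f"spread (sd) of sample means = {pstdev(means):.2f}")
```

The spread of the sample means should shrink as n grows, while their average stays near the class mean.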

What the Law of Large Numbers Tells Us
It tells us that our estimate of the population mean will get better and better as we take bigger and bigger samples.
This means the variability of the sample mean decreases as n increases.
However, it is often misused by gamblers and sports analysts, among others.

Sampling Distribution of a Mean
An SRS of size n is taken from a population with mean $\mu$ and standard deviation $\sigma$. Then:
$E(\bar{X}) = \mu_{\bar{X}} = \mu$
$V(\bar{X}) = \sigma^2_{\bar{X}} = \dfrac{\sigma^2}{n}$
$\sigma_{\bar{X}} = \dfrac{\sigma}{\sqrt{n}}$
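These formulas can be checked exactly on the four-person age population (18, 20, 22, 24) with samples of size n = 2 drawn with replacement. This is a sketch for illustration only; the helper names `mean` and `variance` are made up here, and the variance is the population variance (dividing by the number of values).

```python
from fractions import Fraction
from itertools import product

ages = [18, 20, 22, 24]  # population from the earlier slide
n = 2                    # sample size


def mean(xs):
    """Exact arithmetic mean."""
    return sum(Fraction(x) for x in xs) / len(xs)


def variance(xs):
    """Exact population variance (divide by the number of values)."""
    m = mean(xs)
    return sum((Fraction(x) - m) ** 2 for x in xs) / len(xs)


mu = mean(ages)          # population mean
sigma2 = variance(ages)  # population variance

# Sample mean of every equally likely ordered sample of size n.
xbars = [mean(s) for s in product(ages, repeat=n)]

print("E(X-bar) =", mean(xbars), "  mu =", mu)
print("V(X-bar) =", variance(xbars), "  sigma^2 / n =", sigma2 / n)
```

Both lines print matching pairs: the mean of the sample means equals the population mean, and their variance equals the population variance divided by n.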

Example: Sodium Measurements
The standard deviation of sodium content measurements is 10 mg. The sodium content is measured 3 times and the mean of these 3 measurements is recorded.
- What is the standard deviation of the mean?
- How many measurements are needed to get a standard deviation of the mean equal to 5?

Sampling Distribution of a Mean
If the distribution of the population is $N(\mu, \sigma^2)$, then the sample mean of n independent observations has the distribution:
$N\left(\mu, \dfrac{\sigma^2}{n}\right)$
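The slides pose the sodium questions without showing the answers. One way to work them, using only $\sigma_{\bar{X}} = \sigma / \sqrt{n}$, is sketched below; the variable names are illustrative.

```python
import math

sigma = 10.0  # standard deviation of a single measurement (mg)

# Standard deviation of the mean of n = 3 measurements: sigma / sqrt(n).
sd_mean_3 = sigma / math.sqrt(3)

# Solve sigma / sqrt(n) = target for n, i.e. n = (sigma / target)^2,
# rounding up to a whole number of measurements.
target = 5.0
n_needed = math.ceil((sigma / target) ** 2)

print(f"sd of the mean of 3 measurements: {sd_mean_3:.2f} mg")
print(f"measurements needed for sd of {target:g} mg: {n_needed}")
```

Halving the standard deviation of the mean requires four times as many measurements, since the standard deviation shrinks with the square root of n.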

Graphical Depiction
[Figure: distribution of $\bar{x}$, centered at $\mu$, drawn for increasing values of n.]

The Central Limit Theorem
As the sample size gets large enough, the sampling distribution of the sample mean becomes almost Normal, regardless of the shape of the population.

The Central Limit Theorem
Let $X_1, \dots, X_n$ be a random sample from a distribution with mean $\mu$ and variance $\sigma^2$. Then if n is sufficiently large (rule of thumb: n > 30):
$\bar{X}$ is approximately $N\left(\mu, \dfrac{\sigma^2}{n}\right)$
The larger the value of n, the better the approximation.

Simulating 500 Rolls of n Dice
[Figure: n = 1 die. Histogram of frequency against roll, for rolls 1 through 6.]
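The dice histograms in this part of the slides are lost in the transcript. A simulation in the same spirit can be sketched as follows, assuming each histogram shows the average of n fair dice over 500 repetitions; the seed and function name are assumptions made for this example.

```python
import math
import random
from statistics import mean, pstdev

rng = random.Random(1)  # fixed seed: an assumption, for reproducibility

MU = 3.5                      # mean of one fair die
SIGMA = math.sqrt(35 / 12)    # standard deviation of one fair die


def simulate_means(n_dice, reps=500):
    """Average of n_dice fair dice, repeated `reps` times."""
    return [sum(rng.randint(1, 6) for _ in range(n_dice)) / n_dice
            for _ in range(reps)]


for n_dice in (1, 2, 5, 10):
    means = simulate_means(n_dice)
    print(f"n = {n_dice:2d}: mean = {mean(means):.2f} (theory {MU}), "
          f"sd = {pstdev(means):.2f} (theory {SIGMA / math.sqrt(n_dice):.2f})")
```

Plotting a histogram of `means` for each n should show the flat shape for one die giving way to a bell shape concentrated around 3.5 as n increases.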

Simulating 500 Rolls of n Dice
[Figure: n = 2 dice. Histogram of frequency against the average roll, in bins from 1 to 6 in steps of 0.5.]

Simulating 500 Rolls of n Dice
[Figure: n = 5 dice. Histogram of frequency against the average roll, in bins from 1 to 6 in steps of 0.5.]

Simulating 500 Rolls of n Dice
[Figure: n = 10 dice. Histogram of frequency against the average roll, in bins from 1 to 6 in steps of 0.5.]

Z-Score for the Distribution of the Mean
$z = \dfrac{\bar{x} - \mu}{\sigma / \sqrt{n}}$
where:
$\bar{x}$ = sample mean
$\mu$ = population mean
$\sigma$ = population standard deviation
n = sample size

Example Calculation
What is the probability that a sample of 100 automobile insurance claim files will yield an average claim of $4,527.77 or less if the average claim for the population is $4,560 with a standard deviation of $600?
$z = \dfrac{\bar{x} - \mu}{\sigma / \sqrt{n}} = \dfrac{4{,}527.77 - 4{,}560}{600 / \sqrt{100}} = \dfrac{-32.23}{60} = -0.537$
$P(Z < -0.54) = 0.2946$

Example: ACT Exam
Scores on the ACT exam are distributed $N(18.6, 5.9^2)$.
- What is the probability that a single student scores 21 or higher?
- What is the probability that the mean score of 50 students is 21 or higher?
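Both examples can be checked numerically. The sketch below computes the standard Normal CDF from `math.erf` instead of a printed table; the helper names `normal_cdf` and `z_for_mean` are made up for this example, and the ACT answers are not given in the slides.

```python
import math


def normal_cdf(z):
    """Standard Normal CDF, P(Z <= z), via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def z_for_mean(xbar, mu, sigma, n):
    """Z-score of a sample mean: (xbar - mu) / (sigma / sqrt(n))."""
    return (xbar - mu) / (sigma / math.sqrt(n))


# Insurance claims: P(mean of 100 claims <= 4527.77)
z_claims = z_for_mean(4527.77, 4560, 600, 100)
p_claims = normal_cdf(z_claims)

# ACT: one student (n = 1) versus the mean of 50 students, both >= 21
z_single = z_for_mean(21, 18.6, 5.9, 1)
p_single = 1 - normal_cdf(z_single)
z_mean50 = z_for_mean(21, 18.6, 5.9, 50)
p_mean50 = 1 - normal_cdf(z_mean50)

print(f"claims: z = {z_claims:.3f}, P = {p_claims:.4f}")
print(f"ACT single student: z = {z_single:.3f}, P = {p_single:.4f}")
print(f"ACT mean of 50:     z = {z_mean50:.3f}, P = {p_mean50:.4f}")
```

The slide rounds z to -0.54 before the table lookup, so the unrounded probability differs slightly from 0.2946. In the ACT example the same cutoff of 21 is far less likely for a mean of 50 students than for a single student, because the mean has a much smaller standard deviation.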

Summary
- Means of random samples are less variable than individual observations.
- Means of random samples are more Normal than individual observations.