Introduction to Alternative Statistical Methods. Or Stuff They Didn't Teach You in STAT 101


2 Classical Statistics For the most part, classical statistics assumes normality, i.e., if all experimental units of interest were measured and those measurements were plotted, then the distribution would look bell-shaped.

3 Normal Distribution

4 Not to be Confused with

5 Normal Distribution The standard normal is the iconic normal distribution but there are as many normal distributions as there are combinations of finite means and finite, positive variances.

6 Normal Distributions

7 Central Limit Theorem All this is well and good, you say, but there is no reason, in general, to assume normality. How do we proceed? The famous French mathematician Laplace (1749-1827) discovered in the early 19th century that the sample mean, X̄, is approximately normally distributed for sufficiently large sample size, provided the sample is random and the underlying population variance is finite.

8 Central Limit Theorem How large is sufficiently large? For light-tailed distributions, a sample size of n ≥ 30 or so is usually sufficient. What do I mean by light-tailed? The tail or tails of the distribution must approach zero at least as fast as 1/e^x, in math-speak. (The tails of the normal distribution approach zero at the rate of 1/e^(x^2), which is faster still.)

9 Limitations of the CLT So where can things come off of the rails? It seems like the Central Limit Theorem takes care of just about everything.

10 What Manner of Distribution is this?

11 The Cauchy Distribution This is the Cauchy distribution. It sort of looks normal, and some people might mistake it for a normal distribution (present company excepted, of course!) but observe the tails.

12 Cauchy vs. Normal

13 Cauchy vs. Normal The Cauchy distribution has heavy tails. What is meant by heavy tails here? The tails of the Cauchy distribution approach zero at the rate of 1/x^2. How does that compare with the normal distribution? Example: For x = 10, 1/x^2 = 1/100 = 0.01. By way of contrast, 1/e^(x^2) = 1/e^100 = 3.72 x 10^-44 (i.e., a decimal with 43 leading zeroes). Quite a difference!
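The comparison on this slide can be reproduced with a few lines of Python. This is only a sketch of the two decay rates quoted above (1/x^2 and 1/e^(x^2)), not of the full density functions; the function names are made up for illustration.

```python
import math

def cauchy_tail_rate(x: float) -> float:
    """Polynomial decay 1/x^2, the rate quoted for the Cauchy tails."""
    return 1.0 / x**2

def normal_tail_rate(x: float) -> float:
    """Decay 1/e^(x^2), the rate quoted for the normal tails."""
    return math.exp(-(x**2))

# Print both rates side by side for a few values of x.
for x in (1, 2, 5, 10):
    print(f"x={x:>2}  1/x^2={cauchy_tail_rate(x):.3e}  "
          f"1/e^(x^2)={normal_tail_rate(x):.3e}")
```

At x = 10 the first rate is 0.01 and the second is on the order of 10^-44, which is the gap described on the slide.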

14 A Pathological Distribution In fact, the tails of the Cauchy are so heavy that the mean and the variance do not exist. The average of a random sample from a Cauchy distribution has exactly the same distribution as a single observation, so averaging, however large the sample, estimates the center no better than one observation does! The Cauchy distribution represents an extreme situation and is good for testing where our methods break down. Will you encounter the Cauchy distribution in practice? Probably not, but it is a distinct possibility that you will encounter the next problematic example.
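Why averaging does not help here can be seen from a short derivation. The sketch below assumes a standard Cauchy distribution (center 0, scale 1) and uses its characteristic function, a standard result that is not stated on the slides. Here X_1, ..., X_n are independent observations and X̄_n is their average.

```latex
\varphi_X(t) = E\left[e^{itX}\right] = e^{-|t|}
\quad\Longrightarrow\quad
\varphi_{\bar{X}_n}(t)
  = \left[\varphi_X\!\left(\tfrac{t}{n}\right)\right]^{n}
  = \left(e^{-|t|/n}\right)^{n}
  = e^{-|t|}.
```

The characteristic function of the average is the same as that of a single observation, so the average of n Cauchy observations is itself standard Cauchy and is no more concentrated around the center than one draw.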

15 Mixed Normal A mixed normal distribution occurs when a population of interest actually contains two subpopulations that are normally distributed but with different means and variances within each subpopulation. For whatever reason, these subpopulations cannot be easily isolated, and the resulting distribution is not normal, even though it consists of two normal subpopulations.

16 Mixed Normal A specific example (from Rand Wilcox's Applying Contemporary Statistical Techniques): Assume a population consists of both dieters and non-dieters in a ratio of 1:9, i.e., 10% have dieted and 90% have not. Let X represent the amount of weight loss observed for an individual during the previous year. Further assume that X is distributed N(0, 100) for dieters and X is distributed N(0, 1) for non-dieters.

17 Mixed Normal The resulting distribution can be represented as (0.9)N(0, 1) + (0.1)N(0, 100), which has mean = (0.9)(0) + (0.1)(0) = 0 and, because both subpopulations share the same mean, variance = (0.9)(1) + (0.1)(100) = 0.9 + 10 = 10.9. Thus, even though non-dieters represent 90% of the population and their variance is 1, we observe much greater variability in the resulting mixture. This presents a problem for inferences about the population mean, for example. How do we proceed in such problem cases?
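The arithmetic for the dieter example can be checked with a small helper. This is a minimal sketch: the function name `mixture_mean_var` is hypothetical, and it uses the general second-moment formula so that it also works when the component means differ, which goes slightly beyond the equal-means case on the slide.

```python
def mixture_mean_var(weights, means, variances):
    """Mean and variance of a finite mixture with the given components."""
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    # The mixture mean is the weighted average of the component means.
    mean = sum(w * m for w, m in zip(weights, means))
    # E[X^2] of each component is its variance plus its squared mean.
    second_moment = sum(w * (v + m * m)
                        for w, m, v in zip(weights, means, variances))
    return mean, second_moment - mean**2

# 90% non-dieters ~ N(0, 1), 10% dieters ~ N(0, 100)
mean, var = mixture_mean_var([0.9, 0.1], [0.0, 0.0], [1.0, 100.0])
print(mean, round(var, 2))  # 0.0 10.9
```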

18 Trimming and Winsorizing Data A trimmed mean is obtained by deleting a certain percentage of the smallest and largest values and then calculating the mean based on the remaining values. For example, a 10% trimmed mean is obtained by deleting 10% of the highest values and 10% of the lowest values, leaving you with 80% of the original values upon which a mean is then calculated.

19 A concrete example: Trimmed Mean {3.54, 6.61, 2.88, 2.20, 8.04, 5.31, 6.51, 6.37, 3.86, 7.82, 0.967, 1.12, 7.00, 4.87, 8.39, 4.15, 3.11, 7.48, 16.62, 2.77} The mean of this sample is 5.48. 10% trimming removes 0.967, 1.12, 8.39, and 16.62. The 10% trimmed mean is based on the 20 - 4 = 16 remaining values and, in this example, is 5.16.
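The trimming step in this example can be sketched in plain Python. The helper name `trimmed_mean` is made up for illustration, and rounding the number of trimmed values down to a whole number is an assumption for sample sizes where the percentage does not divide evenly.

```python
import math

DATA = [3.54, 6.61, 2.88, 2.20, 8.04, 5.31, 6.51, 6.37, 3.86, 7.82,
        0.967, 1.12, 7.00, 4.87, 8.39, 4.15, 3.11, 7.48, 16.62, 2.77]

def trimmed_mean(values, proportion):
    """Mean after dropping `proportion` of the values from each end."""
    if not 0 <= proportion < 0.5:
        raise ValueError("proportion must be in [0, 0.5)")
    ordered = sorted(values)
    n = len(ordered)
    # Number of values to drop from each end (small guard against
    # floating-point error in proportion * n).
    k = math.floor(proportion * n + 1e-9)
    kept = ordered[k:n - k]
    return sum(kept) / len(kept)

print(round(sum(DATA) / len(DATA), 2))     # 5.48 (ordinary mean)
print(round(trimmed_mean(DATA, 0.10), 2))  # 5.16 (10% trimmed mean)
```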

20 Trimmed Mean Isn't throwing out data a bad idea? Not necessarily. In the 1960s and 1970s Lehmann and Bickel (two famous statisticians) showed that the 10% trimmed mean is nearly as good as X̄ for approximately normal data and a much safer bet than X̄ for heavy-tailed data. (A. DasGupta, Asymptotic Theory of Statistics and Probability, p. 271)

21 Trimmed Mean Incidentally, you have encountered trimmed means before, even if you have not recognized them as such. The median of a distribution is a 50% trimmed mean, i.e., you remove 50% of the lowest data values and 50% of the highest data values and are left with one number as an estimate of the location parameter or center of the distribution.

22 Trimmed Mean Moreover, removing transparently erroneous data or data collected on an unsuitable experimental unit is not trimming. Trimming occurs when you remove data values that are legitimate (or cannot be identified as illegitimate) but small or large with respect to the sample as a whole.

23 Winsorizing Data Winsorizing is distinct from trimming in that a certain proportion of the lowest and highest values are not discarded but are instead replaced by the nearest remaining values, i.e., the lowest and highest values among the data that would survive trimming, so that the sample size remains the same.

24 Winsorizing Data Returning to my previous example: {3.54, 6.61, 2.88, 2.20, 8.04, 5.31, 6.51, 6.37, 3.86, 7.82, 0.967, 1.12, 7.00, 4.87, 8.39, 4.15, 3.11, 7.48, 16.62, 2.77} If this data is winsorized at 10%, the 2 lowest and 2 highest values (2 = 10% of 20) are set aside, and the smallest and largest of the remaining values are 2.20 and 8.04, respectively. The winsorized data set becomes:

25 Winsorizing Data {3.54, 6.61, 2.88, 2.20, 8.04, 5.31, 6.51, 6.37, 3.86, 7.82, 2.20, 2.20, 7.00, 4.87, 8.04, 4.15, 3.11, 7.48, 8.04, 2.77} The 10% winsorized mean is the average of these 20 values with the repeats, which is 5.15 as compared to the sample mean of 5.48. The winsorized standard deviation is just the standard deviation of the winsorized data. In this example, it is 2.21 (as compared to 3.5 for the original sample).
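Winsorizing amounts to clamping each value to the range left after setting aside the extremes. The sketch below uses only the standard library; the helper name `winsorize` is illustrative, and `statistics.stdev` computes the usual sample standard deviation (n - 1 in the denominator), which is assumed to be what the slide means.

```python
import math
import statistics

DATA = [3.54, 6.61, 2.88, 2.20, 8.04, 5.31, 6.51, 6.37, 3.86, 7.82,
        0.967, 1.12, 7.00, 4.87, 8.39, 4.15, 3.11, 7.48, 16.62, 2.77]

def winsorize(values, proportion):
    """Replace the lowest/highest `proportion` of values by the nearest kept value."""
    if not 0 <= proportion < 0.5:
        raise ValueError("proportion must be in [0, 0.5)")
    n = len(values)
    k = math.floor(proportion * n + 1e-9)
    ordered = sorted(values)
    low, high = ordered[k], ordered[n - k - 1]
    # Clamp every value into [low, high]; the original order is preserved.
    return [min(max(v, low), high) for v in values]

wins = winsorize(DATA, 0.10)
print(round(statistics.mean(wins), 2))   # 5.15 (winsorized mean)
print(statistics.stdev(wins))            # winsorized standard deviation
print(statistics.stdev(DATA))            # standard deviation of the raw sample
```

The output keeps all 20 values, with 2.20 and 8.04 each appearing three times, matching the winsorized data set listed on the slide.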

26 (1-α)% CI for Trimmed Mean You cannot use the standard method of constructing confidence intervals about means for trimmed means! It would be unsound for a number of reasons, including that the remaining values in the trimmed data set are no longer independent or identically distributed.

27 (1-α)% CI for Trimmed Mean How can the leftover data points be dependent when they were independent just a few minutes ago before trimming? I did not trim at random. I ordered the data first, then trimmed the lowest and highest 10%. For observations to be independent, one cannot tell you anything about another. When you order the data and observe that the second-largest value is 8, say, then you know the largest value cannot be 7 or, indeed, any value less than 8. Hence, the values in the trimmed data set are neither independent nor identically distributed.

28 (1-α)% CI for Trimmed Mean The correct standard error for a 10% trimmed mean is $s_w / (0.8\sqrt{n})$, where $s_w$ is the standard deviation of the 10% winsorized sample.

29 (1-α)% CI for Trimmed Mean The correct standard error for the 10% trimmed mean of my example data is 0.62, and a 95% confidence interval for the 10% trimmed mean is 5.16 ± 2.13(0.62) = (3.84, 6.47), where 2.13 is the appropriate critical value from the t distribution with 16 - 1 = 15 degrees of freedom (16 = number of values left after deleting the two smallest and two largest), i.e., t(15) = 2.13.
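The whole calculation (trim, winsorize, standard error, interval) can be put together as follows. This is a sketch under stated assumptions: the function name is hypothetical, the divisor (1 - 2 × proportion) generalizes the 0.8 used on the slide for 10% trimming, and the critical value 2.13 is taken from the slide and passed in by hand because the Python standard library has no t quantile function.

```python
import math
import statistics

DATA = [3.54, 6.61, 2.88, 2.20, 8.04, 5.31, 6.51, 6.37, 3.86, 7.82,
        0.967, 1.12, 7.00, 4.87, 8.39, 4.15, 3.11, 7.48, 16.62, 2.77]

def trimmed_mean_ci(values, proportion, t_crit):
    """Trimmed mean, its standard error, and a CI using the winsorized SD."""
    n = len(values)
    k = math.floor(proportion * n + 1e-9)   # values dropped per tail
    ordered = sorted(values)
    kept = ordered[k:n - k]
    tmean = sum(kept) / len(kept)
    # Winsorized sample: clamp to the smallest/largest kept value.
    low, high = kept[0], kept[-1]
    wins = [min(max(v, low), high) for v in ordered]
    s_w = statistics.stdev(wins)
    # Standard error: s_w / ((1 - 2*proportion) * sqrt(n)); 0.8 when trimming 10%.
    se = s_w / ((1 - 2 * proportion) * math.sqrt(n))
    return tmean, se, (tmean - t_crit * se, tmean + t_crit * se)

# t_crit = 2.13 is the slide's critical value for 16 - 1 = 15 degrees of freedom.
tmean, se, (lower, upper) = trimmed_mean_ci(DATA, 0.10, t_crit=2.13)
print(round(tmean, 2), round(se, 2), round(lower, 2), round(upper, 2))
```

If SciPy is available, the critical value could instead be looked up with a t quantile function for len(kept) - 1 degrees of freedom; that dependency is left out here to keep the sketch self-contained.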

30 CIs for Binomial Proportions Everyone is taught the Wald confidence interval for binomial proportions, i.e., $\hat{p} \pm z_{\alpha/2}\sqrt{\hat{p}(1-\hat{p})/n}$, which is based on a normal approximation. They are not taught how poorly it performs in general, however, even for large n.

31 CIs for Binomial Proportions What do I mean by performs poorly? The coverage probability of the Wald CI is often less than you intend when you choose your α. (For example, you might intend to have a 95% CI but end up with an 89% CI.) What is the coverage probability? It is the probability that the interval you construct contains the true value of the parameter. A 95% confidence interval has (or should have) a coverage probability of 0.95.
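Coverage probability can be computed exactly for a binomial proportion by summing the probabilities of every outcome whose interval captures the true p. The sketch below does this for the usual normal-approximation interval, p̂ ± z·sqrt(p̂(1 - p̂)/n); the function names and the choice of n = 20, p = 0.05 are illustrative assumptions, not taken from the slides.

```python
import math

def wald_interval(successes, n, z=1.96):
    """Normal-approximation interval for a binomial proportion."""
    p_hat = successes / n
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - half_width, p_hat + half_width

def exact_coverage(n, p, z=1.96):
    """Probability that the interval contains p, summed over all outcomes."""
    total = 0.0
    for k in range(n + 1):
        lower, upper = wald_interval(k, n, z)
        if lower <= p <= upper:
            # Binomial probability of observing exactly k successes.
            total += math.comb(n, k) * p**k * (1 - p) ** (n - k)
    return total

for n, p in [(20, 0.05), (50, 0.05), (100, 0.05), (100, 0.5)]:
    print(f"n={n:>3}  p={p}  coverage={exact_coverage(n, p):.3f}")
```

With n = 20 and p = 0.05 the interval collapses to a single point whenever zero successes are observed, which happens about 36% of the time, so the actual coverage falls far short of the nominal 95%.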

32 CIs for Binomial Proportions But what does this mean? For the frequentist statistician, it means that were she to replicate her experiment 20 times, she would expect about 19 of the 20 confidence intervals she constructs (at the 95% level) to contain the true value of the parameter she is estimating. For the frequentist, the parameter is a fixed constant of nature and once the CI is constructed it either contains the true value or it does not.

33 CIs for Binomial Proportions By way of contrast, for the Bayesian statistician, the parameter is not fixed; there is a probability distribution associated with it, and even after the CI is constructed he can speak of there being a probability that the parameter is contained in that interval. The distinction between frequentist and Bayesian statistics is not particularly important here and would, in any event, require a talk of its own.

34 CIs for Binomial Proportions What should you use instead of the Wald confidence interval? There are a number of alternatives, but I recommend the Wilson score interval.


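36 Wilson Score Interval The most general form of the Wilson score interval is as follows: $$\frac{\hat{p} + \frac{z^2}{2n} \pm z\sqrt{\frac{\hat{p}(1-\hat{p})}{n} + \frac{z^2}{4n^2}}}{1 + \frac{z^2}{n}}$$ where $z = z_{\alpha/2}$ is the standard normal critical value.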

37 Wilson Score Interval A 95% Wilson Interval is (approximately): $$\frac{\hat{p} + \frac{2}{n} \pm 2\sqrt{\frac{\hat{p}(1-\hat{p})}{n} + \frac{1}{n^2}}}{1 + \frac{4}{n}}$$

38 Wilson Score Interval As an example, suppose we observe 7 successes in 100 trials. An (approximate) 95% Wilson interval would be (0.034, 0.14). I have been writing "approximate" in parentheses because I used 2 in place of the correct critical value z = 1.96 to make the formula on the previous slide look cleaner. The difference is slight. If you want to be extra fastidious, you can use the critical value from the t distribution, t(n-1), which in this example would be approximately 1.98.
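The example of 7 successes in 100 trials can be reproduced with a short function. This is a sketch of the standard Wilson score formula with the critical value left as a parameter, so that both the rounded value z = 2 and z = 1.96 can be tried; the function name is made up for illustration.

```python
import math

def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion."""
    p_hat = successes / n
    denom = 1 + z * z / n
    # The center is pulled from p_hat toward 1/2.
    center = (p_hat + z * z / (2 * n)) / denom
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / n
                               + z * z / (4 * n * n)) / denom
    return center - half_width, center + half_width

# z = 2 is the rounded critical value used on the slides.
lower, upper = wilson_interval(7, 100, z=2)
print(round(lower, 3), round(upper, 2))  # 0.034 0.14

# Same data with the usual z = 1.96.
print(wilson_interval(7, 100))
```

Unlike the normal-approximation interval, the lower limit stays above zero even when the observed proportion is small.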

39 In Summary Bell-shaped distributions need not have nice properties. Do not simply assume the underlying distribution of your data is normal and apply standard statistical methods. To do so is sort of like the statistical equivalent of running with scissors. Instead, investigate the data until you are satisfied of its approximate normality.

40 In Summary Even when the data appears to be approximately normal, you might want to consider trimming and/or winsorizing it. The efficiency of the 10% trimmed mean is such that it is competitive with the sample mean even under normality and is a better bet for heavy-tailed data. (This situation is not unlike the Wilcoxon-Mann-Whitney test vs. the two-sample t test.)

41 In Summary The Wald confidence interval for binomial proportions, like fast food, probably should be avoided.


Chapter 11: Inference for Distributions Inference for Means of a Population 11.2 Comparing Two Means Chapter 11: Inference for Distributions 11.1 Inference for Means of a Population 11.2 Comparing Two Means 1 Population Standard Deviation In the previous chapter, we computed confidence intervals and performed

More information

Lecture 16: Estimating Parameters (Confidence Interval Estimates of the Mean)

Lecture 16: Estimating Parameters (Confidence Interval Estimates of the Mean) Statistics 16_est_parameters.pdf Michael Hallstone, Ph.D. hallston@hawaii.edu Lecture 16: Estimating Parameters (Confidence Interval Estimates of the Mean) Some Common Sense Assumptions for Interval Estimates

More information

Probability Weighted Moments. Andrew Smith

Probability Weighted Moments. Andrew Smith Probability Weighted Moments Andrew Smith andrewdsmith8@deloitte.co.uk 28 November 2014 Introduction If I asked you to summarise a data set, or fit a distribution You d probably calculate the mean and

More information

Biostatistics and Design of Experiments Prof. Mukesh Doble Department of Biotechnology Indian Institute of Technology, Madras

Biostatistics and Design of Experiments Prof. Mukesh Doble Department of Biotechnology Indian Institute of Technology, Madras Biostatistics and Design of Experiments Prof. Mukesh Doble Department of Biotechnology Indian Institute of Technology, Madras Lecture - 05 Normal Distribution So far we have looked at discrete distributions

More information

The Central Limit Theorem

The Central Limit Theorem The Central Limit Theorem Patrick Breheny March 1 Patrick Breheny University of Iowa Introduction to Biostatistics (BIOS 4120) 1 / 29 Kerrich s experiment Introduction The law of averages Mean and SD of

More information

Lecture 6: Chapter 6

Lecture 6: Chapter 6 Lecture 6: Chapter 6 C C Moxley UAB Mathematics 3 October 16 6.1 Continuous Probability Distributions Last week, we discussed the binomial probability distribution, which was discrete. 6.1 Continuous Probability

More information

Chapter 3 Discrete Random Variables and Probability Distributions

Chapter 3 Discrete Random Variables and Probability Distributions Chapter 3 Discrete Random Variables and Probability Distributions Part 4: Special Discrete Random Variable Distributions Sections 3.7 & 3.8 Geometric, Negative Binomial, Hypergeometric NOTE: The discrete

More information

value BE.104 Spring Biostatistics: Distribution and the Mean J. L. Sherley

value BE.104 Spring Biostatistics: Distribution and the Mean J. L. Sherley BE.104 Spring Biostatistics: Distribution and the Mean J. L. Sherley Outline: 1) Review of Variation & Error 2) Binomial Distributions 3) The Normal Distribution 4) Defining the Mean of a population Goals:

More information

Exam 2 Spring 2015 Statistics for Applications 4/9/2015

Exam 2 Spring 2015 Statistics for Applications 4/9/2015 18.443 Exam 2 Spring 2015 Statistics for Applications 4/9/2015 1. True or False (and state why). (a). The significance level of a statistical test is not equal to the probability that the null hypothesis

More information

MATH 10 INTRODUCTORY STATISTICS

MATH 10 INTRODUCTORY STATISTICS MATH 10 INTRODUCTORY STATISTICS Tommy Khoo Your friendly neighbourhood graduate student. It is Time for Homework Again! ( ω `) Please hand in your homework. Third homework will be posted on the website,

More information

The Assumption(s) of Normality

The Assumption(s) of Normality The Assumption(s) of Normality Copyright 2000, 2011, 2016, J. Toby Mordkoff This is very complicated, so I ll provide two versions. At a minimum, you should know the short one. It would be great if you

More information

Econ 6900: Statistical Problems. Instructor: Yogesh Uppal

Econ 6900: Statistical Problems. Instructor: Yogesh Uppal Econ 6900: Statistical Problems Instructor: Yogesh Uppal Email: yuppal@ysu.edu Lecture Slides 4 Random Variables Probability Distributions Discrete Distributions Discrete Uniform Probability Distribution

More information

Chapter 14 : Statistical Inference 1. Note : Here the 4-th and 5-th editions of the text have different chapters, but the material is the same.

Chapter 14 : Statistical Inference 1. Note : Here the 4-th and 5-th editions of the text have different chapters, but the material is the same. Chapter 14 : Statistical Inference 1 Chapter 14 : Introduction to Statistical Inference Note : Here the 4-th and 5-th editions of the text have different chapters, but the material is the same. Data x

More information

Section 7.5 The Normal Distribution. Section 7.6 Application of the Normal Distribution

Section 7.5 The Normal Distribution. Section 7.6 Application of the Normal Distribution Section 7.6 Application of the Normal Distribution A random variable that may take on infinitely many values is called a continuous random variable. A continuous probability distribution is defined by

More information

Probability Models.S2 Discrete Random Variables

Probability Models.S2 Discrete Random Variables Probability Models.S2 Discrete Random Variables Operations Research Models and Methods Paul A. Jensen and Jonathan F. Bard Results of an experiment involving uncertainty are described by one or more random

More information

Week 2 Quantitative Analysis of Financial Markets Hypothesis Testing and Confidence Intervals

Week 2 Quantitative Analysis of Financial Markets Hypothesis Testing and Confidence Intervals Week 2 Quantitative Analysis of Financial Markets Hypothesis Testing and Confidence Intervals Christopher Ting http://www.mysmu.edu/faculty/christophert/ Christopher Ting : christopherting@smu.edu.sg :

More information

MATH 10 INTRODUCTORY STATISTICS

MATH 10 INTRODUCTORY STATISTICS MATH 10 INTRODUCTORY STATISTICS Ramesh Yapalparvi Week 3 Chapter 5 Probability Chapter 7 Normal Distribution Chapter 8 Advanced Graphs Chapter 9 Sampling Distributions ß today s lecture Sampling distributions

More information

For more information about how to cite these materials visit

For more information about how to cite these materials visit Author(s): Kerby Shedden, Ph.D., 2010 License: Unless otherwise noted, this material is made available under the terms of the Creative Commons Attribution Share Alike 3.0 License: http://creativecommons.org/licenses/by-sa/3.0/

More information

AP Stats Review. Mrs. Daniel Alonzo & Tracy Mourning Sr. High

AP Stats Review. Mrs. Daniel Alonzo & Tracy Mourning Sr. High AP Stats Review Mrs. Daniel Alonzo & Tracy Mourning Sr. High sdaniel@dadeschools.net Agenda 1. AP Stats Exam Overview 2. AP FRQ Scoring & FRQ: 2016 #1 3. Distributions Review 4. FRQ: 2015 #6 5. Distribution

More information

Chapter 7: Estimation Sections

Chapter 7: Estimation Sections 1 / 31 : Estimation Sections 7.1 Statistical Inference Bayesian Methods: 7.2 Prior and Posterior Distributions 7.3 Conjugate Prior Distributions 7.4 Bayes Estimators Frequentist Methods: 7.5 Maximum Likelihood

More information

Key Objectives. Module 2: The Logic of Statistical Inference. Z-scores. SGSB Workshop: Using Statistical Data to Make Decisions

Key Objectives. Module 2: The Logic of Statistical Inference. Z-scores. SGSB Workshop: Using Statistical Data to Make Decisions SGSB Workshop: Using Statistical Data to Make Decisions Module 2: The Logic of Statistical Inference Dr. Tom Ilvento January 2006 Dr. Mugdim Pašić Key Objectives Understand the logic of statistical inference

More information

STAT 201 Chapter 6. Distribution

STAT 201 Chapter 6. Distribution STAT 201 Chapter 6 Distribution 1 Random Variable We know variable Random Variable: a numerical measurement of the outcome of a random phenomena Capital letter refer to the random variable Lower case letters

More information

MA131 Lecture 9.1. = µ = 25 and σ X P ( 90 < X < 100 ) = = /// σ X

MA131 Lecture 9.1. = µ = 25 and σ X P ( 90 < X < 100 ) = = /// σ X The Central Limit Theorem (CLT): As the sample size n increases, the shape of the distribution of the sample means taken with replacement from the population with mean µ and standard deviation σ will approach

More information

Properties of Probability Models: Part Two. What they forgot to tell you about the Gammas

Properties of Probability Models: Part Two. What they forgot to tell you about the Gammas Quality Digest Daily, September 1, 2015 Manuscript 285 What they forgot to tell you about the Gammas Donald J. Wheeler Clear thinking and simplicity of analysis require concise, clear, and correct notions

More information

SAMPLING DISTRIBUTIONS. Chapter 7

SAMPLING DISTRIBUTIONS. Chapter 7 SAMPLING DISTRIBUTIONS Chapter 7 7.1 How Likely Are the Possible Values of a Statistic? The Sampling Distribution Statistic and Parameter Statistic numerical summary of sample data: p-hat or xbar Parameter

More information

Review: Population, sample, and sampling distributions

Review: Population, sample, and sampling distributions Review: Population, sample, and sampling distributions A population with mean µ and standard deviation σ For instance, µ = 0, σ = 1 0 1 Sample 1, N=30 Sample 2, N=30 Sample 100000000000 InterquartileRange

More information

Chapter Seven: Confidence Intervals and Sample Size

Chapter Seven: Confidence Intervals and Sample Size Chapter Seven: Confidence Intervals and Sample Size A point estimate is: The best point estimate of the population mean µ is the sample mean X. Three Properties of a Good Estimator 1. Unbiased 2. Consistent

More information

Lecture 8. The Binomial Distribution. Binomial Distribution. Binomial Distribution. Probability Distributions: Normal and Binomial

Lecture 8. The Binomial Distribution. Binomial Distribution. Binomial Distribution. Probability Distributions: Normal and Binomial Lecture 8 The Binomial Distribution Probability Distributions: Normal and Binomial 1 2 Binomial Distribution >A binomial experiment possesses the following properties. The experiment consists of a fixed

More information

σ 2 : ESTIMATES, CONFIDENCE INTERVALS, AND TESTS Business Statistics

σ 2 : ESTIMATES, CONFIDENCE INTERVALS, AND TESTS Business Statistics σ : ESTIMATES, CONFIDENCE INTERVALS, AND TESTS Business Statistics CONTENTS Estimating other parameters besides μ Estimating variance Confidence intervals for σ Hypothesis tests for σ Estimating standard

More information

Non-Inferiority Tests for the Odds Ratio of Two Proportions

Non-Inferiority Tests for the Odds Ratio of Two Proportions Chapter Non-Inferiority Tests for the Odds Ratio of Two Proportions Introduction This module provides power analysis and sample size calculation for non-inferiority tests of the odds ratio in twosample

More information

MATH 10 INTRODUCTORY STATISTICS

MATH 10 INTRODUCTORY STATISTICS MATH 10 INTRODUCTORY STATISTICS Tommy Khoo Your friendly neighbourhood graduate student. Midterm Exam ٩(^ᴗ^)۶ In class, next week, Thursday, 26 April. 1 hour, 45 minutes. 5 questions of varying lengths.

More information

Chapter 5: Probability models

Chapter 5: Probability models Chapter 5: Probability models 1. Random variables: a) Idea. b) Discrete and continuous variables. c) The probability function (density) and the distribution function. d) Mean and variance of a random variable.

More information

Chapter 7. Confidence Intervals and Sample Sizes. Definition. Definition. Definition. Definition. Confidence Interval : CI. Point Estimate.

Chapter 7. Confidence Intervals and Sample Sizes. Definition. Definition. Definition. Definition. Confidence Interval : CI. Point Estimate. Chapter 7 Confidence Intervals and Sample Sizes 7. Estimating a Proportion p 7.3 Estimating a Mean µ (σ known) 7.4 Estimating a Mean µ (σ unknown) 7.5 Estimating a Standard Deviation σ In a recent poll,

More information

Confidence Intervals and Sample Size

Confidence Intervals and Sample Size Confidence Intervals and Sample Size Chapter 6 shows us how we can use the Central Limit Theorem (CLT) to 1. estimate a population parameter (such as the mean or proportion) using a sample, and. determine

More information

Introduction to Statistical Data Analysis II

Introduction to Statistical Data Analysis II Introduction to Statistical Data Analysis II JULY 2011 Afsaneh Yazdani Preface Major branches of Statistics: - Descriptive Statistics - Inferential Statistics Preface What is Inferential Statistics? Preface

More information

Data Analysis and Statistical Methods Statistics 651

Data Analysis and Statistical Methods Statistics 651 Data Analysis and Statistical Methods Statistics 651 http://www.stat.tamu.edu/~suhasini/teaching.html Lecture 14 (MWF) The t-distribution Suhasini Subba Rao Review of previous lecture Often the precision

More information