The Normal Distribution


Stat 226: Introduction to Business Statistics I
Spring 2009
Professor: Dr. Petrutza Caragea
Section A, Tuesdays and Thursdays 9:30-10:50 a.m.

Chapter 1, Section 1.3: The Normal Distribution

Density Curves

So far we have:
- graphically displayed data: histogram, stemplot, boxplot
- described the overall pattern and identified deviations and outliers
- numerically quantified the center and spread of the distribution

If the distribution (as displayed by the histogram) appears sufficiently regular, we can approximate it with a smooth curve, a so-called density curve. The density curve is a simplified and idealized version of reality, but it can still be useful!

The Density Curve: Properties

A density curve is a curve that
- is always on or above the horizontal axis, and
- has an area of exactly 1 underneath it.

A density curve describes the overall pattern of a distribution. The area under the curve and above any range of values is the proportion of all observations that fall in that range.

Example: gas mileage example from the textbook.
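The two density-curve properties can be checked numerically. The sketch below is only an illustration: the density f(x) = 2x on [0, 1] is a hypothetical curve chosen for simplicity (it is not the textbook's gas mileage example), and the helper names `density` and `area` are made up for this note.

```python
def density(x: float) -> float:
    """Hypothetical density curve: f(x) = 2x on [0, 1], zero elsewhere."""
    return 2.0 * x if 0.0 <= x <= 1.0 else 0.0


def area(lo: float, hi: float, steps: int = 100_000) -> float:
    """Approximate the area under the density between lo and hi (midpoint rule)."""
    width = (hi - lo) / steps
    return sum(density(lo + (i + 0.5) * width) for i in range(steps)) * width


total = area(0.0, 1.0)        # whole curve: should be 1
upper_half = area(0.5, 1.0)   # proportion of observations between 0.5 and 1
print(round(total, 4), round(upper_half, 4))
```

Under this assumed density, the total area comes out as 1 and the area above the range 0.5 to 1 is 0.75, i.e. 75% of the observations would fall in that range.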

Median and Mean of a Density Curve

Median: the equal-areas point, with 50% of the mass on either side.
Mean: the balancing point of the curve, if it were a solid mass.

Introduction to Normal Distributions

- The Normal (or Gaussian) distribution is the single most important distribution in statistics.
- Many variables can be modeled (described) using the Normal distribution, e.g. height of humans, SAT scores, length of human pregnancies, etc.
- It is characterized by two parameters: the mean µ and the standard deviation σ.
- The overall shape is symmetric, unimodal, and bell-shaped.

Figure: Normal Distribution (by Carl Friedrich Gauss, 1777-1855).

Figure: pictures of various normal distributions.

Notation: to denote the normal distribution we use N(µ, σ).

Example: N(___, ___) denotes a normal distribution with mean ___ and standard deviation ___, while N(___, ___) denotes a normal distribution with mean ___ and standard deviation ___.

To denote that a variable X (e.g. heights, SAT scores, etc.) follows a normal distribution we write X ~ N(µ, σ).

The 68-95-99.7 Rule

The rule holds for all normal distributions (i.e. for any choice of µ and σ). For a variable that follows a N(µ, σ) distribution we have that
- approx. 68% of the data fall within 1 standard deviation of the mean, i.e. within µ ± σ
- approx. 95% of all the data fall within 2 standard deviations of the mean, i.e. within µ ± 2σ
- approx. 99.7% of all the data fall within 3 standard deviations of the mean, i.e. within µ ± 3σ

Figure: normal curve with cut points µ - 3σ, µ - 2σ, µ - σ, µ, µ + σ, µ + 2σ, µ + 3σ. The areas of the successive regions, from the far left tail to the far right tail, are 0.15%, 2.35%, 13.5%, 34%, 34%, 13.5%, 2.35%, 0.15%.
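The claim that the rule holds for any choice of µ and σ can be verified with a short computation. This is a minimal sketch using Python's standard-library `statistics.NormalDist`; the helper name `within_k_sd` and the two parameter pairs are illustrative choices, not part of the slides.

```python
from statistics import NormalDist


def within_k_sd(mu: float, sigma: float, k: float) -> float:
    """Proportion of a N(mu, sigma) distribution within k standard deviations of the mean."""
    dist = NormalDist(mu, sigma)
    return dist.cdf(mu + k * sigma) - dist.cdf(mu - k * sigma)


# Two arbitrary normal distributions: the shares do not depend on mu or sigma.
for mu, sigma in [(0.0, 1.0), (70.0, 3.0)]:
    shares = [round(within_k_sd(mu, sigma, k), 4) for k in (1, 2, 3)]
    print(f"N({mu}, {sigma}): {shares}")
```

Both distributions give roughly 0.6827, 0.9545 and 0.9973, which the rule rounds to 68%, 95% and 99.7%.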

Example: The length of human pregnancies follows a normal distribution with mean µ = 266 days and standard deviation σ = 16 days.
1. How long do the middle 95% of all pregnancies last?
2. How long do the shortest 16% of all pregnancies last (at most)?
3. How long do the longest 0.15% of all pregnancies last (at least)?

The Standard Normal Distribution

The standard normal distribution
- is a special normal distribution.
- has a mean of 0 and a standard deviation of 1.
- is denoted by N(0, 1).

Nearly all the area is between -3 and 3.

Figure: density curve of the standard normal distribution.

Knowing the mean and the standard deviation of a normal distribution allows us to determine
1. what proportion of individuals fall in a specified range.
2. what percentile a given individual falls at if you know their data value.
3. what data value corresponds to a given percentile.

For the standard normal distribution, the proportion of observations falling into a specified range is tabulated. This is the only normal distribution for which we have tabulated values. We therefore need to convert any given normal distribution to a standard normal distribution, i.e. the values from any N(µ, σ) are transformed to the corresponding values from a N(0, 1). This is called standardizing.
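The pregnancy-length questions need nothing more than the 68-95-99.7 rule. The sketch below takes µ = 266 days and σ = 16 days as the parameters of the example; the names `RULE` and `cutoffs` are hypothetical helpers introduced only for this illustration.

```python
MU, SIGMA = 266, 16  # assumed mean and standard deviation of pregnancy length, in days

# Central share (in percent) within k standard deviations, per the 68-95-99.7 rule.
RULE = {1: 68.0, 2: 95.0, 3: 99.7}


def cutoffs(k: int) -> tuple[int, int]:
    """Lower and upper bounds of the interval mu +/- k * sigma."""
    return MU - k * SIGMA, MU + k * SIGMA


for k, central in RULE.items():
    low, high = cutoffs(k)
    tail = (100.0 - central) / 2  # share in each tail beyond k standard deviations
    print(f"middle {central}%: {low} to {high} days; "
          f"each tail holds {tail:.2f}% (below {low} or above {high})")
```

Read this way, the middle 95% of pregnancies last between 234 and 298 days, the shortest 16% last at most 250 days, and the longest 0.15% last at least 314 days.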

Standardizing, z-score

If x is an observation from a normal distribution that has mean µ and standard deviation σ, the standardized value of x is given by

    z = (x - µ) / σ

A standardized value is often called a z-score. A z-score tells us how many standard deviations the original observation is away from the mean, and in which direction. Observations larger than the mean are positive when standardized (i.e. have a positive z-score), and observations smaller than the mean are negative when standardized (i.e. have a negative z-score).

Example: (length of human pregnancies continued)

Finding z-scores and corresponding proportions/areas under the normal curve

Why are z-scores helpful?
- IQs follow a normal distribution with mean µ = 100 and standard deviation σ = 16.
- Heights of males follow approximately a normal distribution with mean µ = 70 inches and σ = 3 inches.

Who is more unusual: a man who is 73 inches tall or a man who has an IQ of ___?

Once we know the corresponding z-score of an observation, we can look up the overall proportion (percentage) of men in that population having a height of 73 inches or more. For this we need to know how to read Table A (Table of the Standard Normal Distribution) in your textbook.

Note: in the following, the terms proportion, probability, percentage, and area are all interchangeable, i.e. proportion = probability = percentage = area.
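Standardizing puts observations from different normal distributions on a common scale, which is what makes the "who is more unusual" question answerable. In the sketch below the height of 73 inches and the two sets of parameters come from the slide, while the IQ value of 132 is an assumption made up for illustration (the slide leaves that value blank); `z_score` is a hypothetical helper name.

```python
def z_score(x: float, mu: float, sigma: float) -> float:
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mu) / sigma


height_z = z_score(73, 70, 3)     # male heights, N(70, 3)
iq_z = z_score(132, 100, 16)      # IQ, N(100, 16); 132 is an assumed value

more_unusual = "IQ" if abs(iq_z) > abs(height_z) else "height"
print(height_z, iq_z, more_unusual)
```

With these inputs the height is 1 standard deviation above its mean and the assumed IQ is 2 standard deviations above its mean, so the IQ would be the more unusual observation.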

To find the proportion (corresponding to the area under the normal curve) of observations that fall into a given range, e.g. between -z and z, we use Table A. The first column gives the z-score values correct to one decimal place, and the first row gives the second decimal place of the z-score. For example, if we want to find the area below z = ___, we find z = ___ in the first column, then look for 0.0_ along the first row. The entry where the corresponding row and column intersect gives the value 0.05.

Using Table A to find proportions under the normal curve

Consider the following situations:
- What proportion of observations is below z = _.67, i.e. what is the probability of observing a z-score of _.67 or less?
- What proportion of observations is greater than z = _.67?
- What proportion is less than z = _.00 and greater than z = _.00?
- What proportion is below z = _.67?
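A Table A lookup is an evaluation of the standard normal cumulative proportion, so the three kinds of question (below, above, between) can be sketched in a few lines. The z-values 1.67 and 2.00 used here are assumptions chosen for illustration, and `below`, `above` and `between` are hypothetical helper names; `statistics.NormalDist` is part of the Python standard library.

```python
from statistics import NormalDist

Z = NormalDist()  # the standard normal distribution, N(0, 1)


def below(z: float) -> float:
    """Area to the left of z (the value Table A reports)."""
    return Z.cdf(z)


def above(z: float) -> float:
    """Area to the right of z: total area 1 minus the area to the left."""
    return 1.0 - Z.cdf(z)


def between(a: float, b: float) -> float:
    """Area between a and b: area left of b minus area left of a."""
    return Z.cdf(b) - Z.cdf(a)


print(round(below(1.67), 4), round(above(1.67), 4), round(between(-2.0, 2.0), 4))
```

For these assumed values the areas are about 0.9525, 0.0475 and 0.9545. The "above" and "between" cases are not separate tables: both are derived from areas to the left.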

- What is the area between z = ___ and z = ___?
- What z-score does the 30th percentile correspond to?
- What proportion is between z = 0.96 and z = _.33?
- What z-scores bound the middle 60%?

Applications of the Normal Distribution

1. State the problem, i.e. state the mean µ, the standard deviation σ and the value of the observation x.
2. Standardize x, i.e. find the corresponding z-score using z = (x - µ) / σ.
3. Draw a picture, i.e. locate the z-score under the normal curve and shade the area of interest.
4. Use Table A to find the shaded area.

Example: male heights, N(70, 3)
1. What proportion of men is shorter than 7_ inches?
2. What proportion of men is taller than 65 inches?
3. What proportion of men is taller than 73 inches?
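The standardize-then-look-up procedure can be sketched for the male heights example, N(70, 3). The code covers the two questions whose cut-off values are fully given (65 and 73 inches); the function names are hypothetical, and the call to `NormalDist().cdf` stands in for reading Table A.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()  # N(0, 1), the distribution tabulated in Table A


def proportion_below(x: float, mu: float, sigma: float) -> float:
    """Standardize x, then return the area to the left of its z-score."""
    z = (x - mu) / sigma
    return STANDARD_NORMAL.cdf(z)


def proportion_above(x: float, mu: float, sigma: float) -> float:
    """Area to the right of x: 1 minus the area to the left."""
    return 1.0 - proportion_below(x, mu, sigma)


taller_than_65 = proportion_above(65, 70, 3)  # z is about -1.67
taller_than_73 = proportion_above(73, 70, 3)  # z = 1.00
print(round(taller_than_65, 4), round(taller_than_73, 4))
```

About 95% of men are taller than 65 inches and about 16% are taller than 73 inches under this model. Table A gives slightly different last digits when z is first rounded to two decimal places.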

What proportion of men has an IQ of ___ or more? (IQ ~ N(100, 16))

Backwards Calculations

We can also work backwards: given a certain percentile (or proportion), what is the corresponding value of x?

Example: Heights ~ N(70, 3)
- What value does the 50th percentile of men's height correspond to?
- What value does the _0th percentile of men's height correspond to?

In general, to do backward calculations use the following formula:

    x = zσ + µ

What value does the 85th percentile correspond to?
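A backward calculation first finds the z-score that has the given proportion to its left, then converts it with x = zσ + µ. In this sketch the standard library's `NormalDist().inv_cdf` replaces the reverse lookup in Table A; `value_at_percentile` is a hypothetical helper name.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()  # N(0, 1)


def value_at_percentile(percentile: float, mu: float, sigma: float) -> float:
    """Data value x with `percentile` percent of a N(mu, sigma) distribution below it."""
    z = STANDARD_NORMAL.inv_cdf(percentile / 100.0)  # reverse Table A lookup
    return z * sigma + mu                            # x = z * sigma + mu


x50 = value_at_percentile(50, 70, 3)
x85 = value_at_percentile(85, 70, 3)
print(round(x50, 2), round(x85, 2))
```

For men's heights the 50th percentile is the mean, 70 inches (z = 0), and the 85th percentile is roughly 73.1 inches (z is about 1.04).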

Assessing Normality of Data

- Based on experience and/or past data, the assumption of normality might be justified.
- In general, though, it is quite risky to assume normality without looking at the data and verifying normality.
- Normally distributed data allow the application of further statistical procedures, which enable us to learn more about the data and to derive additional information about the variable we are interested in. (We will learn about such procedures in Chapters 6 & 7.)
- If data are not normally distributed and we still apply statistical procedures that require the assumption of normality, the derived information can be wrong and misleading.

How to Assess Normality

Histogram/stemplot or boxplot: these reveal non-normal features, such as
- skewness
- multiple modes
- outliers

If the above graphical displays appear somewhat normal, i.e. they indicate a symmetric, unimodal, bell-shaped distribution, we can use a so-called normal quantile plot. Normal quantile plots are a more sensitive tool, allowing us to take a closer look to judge the adequacy of normality.

Normal quantile plots:
- are hard to construct by hand (use JMP)
- for the main idea see pages 67 & 68 of the textbook
- If the distribution is close to a normal distribution, the plotted points in a normal quantile plot will lie close to a straight line.

Some Caution: Real data almost always show some departure from normality (i.e. from a perfect normal distribution). It is important to restrict the examination of a normal quantile plot to searching for clear departures from normality. We can ignore minor wiggles in the plot; most common methods will work well as long as the data are reasonably close to a normal distribution with no extreme outliers.

Figure: normal quantile plots of observations from a standard normal distribution for various sample sizes (n = 50, n = _00).
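The idea behind a normal quantile plot can be sketched without plotting software: pair the sorted observations with the standard normal quantiles expected at the same ranks, then see how close the pairs are to a straight line. Several things here are assumptions rather than the slides' method: the plotting positions (i - 0.5)/n (JMP and the textbook may use a different convention), the use of a correlation coefficient as a numeric stand-in for judging straightness by eye, and the two artificial data sets.

```python
from math import exp, sqrt
from statistics import NormalDist, mean


def normal_quantiles(n: int) -> list[float]:
    """Standard normal quantiles at the assumed plotting positions (i - 0.5) / n."""
    std = NormalDist()
    return [std.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]


def straightness(data: list[float]) -> float:
    """Correlation between the sorted data and the matching normal quantiles."""
    ys = sorted(data)
    xs = normal_quantiles(len(ys))
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


q = normal_quantiles(50)
normal_like = [70 + 3 * z for z in q]   # artificial data that are exactly normal-shaped
right_skewed = [exp(z) for z in q]      # artificial right-skewed data
r_normal = straightness(normal_like)
r_skewed = straightness(right_skewed)
print(round(r_normal, 3), round(r_skewed, 3))
```

The normal-shaped data lie on a straight line (correlation 1), while the right-skewed data bend away from it and give a clearly lower value. With real data, the caution above still applies: look for clear departures and ignore minor wiggles.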

Figure: normal quantile plots for small sample sizes (n = _0, n = _5).

Figure: normal quantile plots of observations from a right-skewed and a triangular distribution.