Shifting and rescaling data distributions

It is useful to consider the effect of systematic alterations of all the values in a data set. The simplest such alteration is a shift by a fixed constant. Suppose a data set is given, and a second data set is obtained from the first by adding the same number c (positive or negative) to each value. Then any measure of center (median or mean) of the new data set is shifted by the same constant c; any measure of spread (IQR or standard deviation) is unchanged by the shift; and any measure of relative standing (percentile or z-score) is unchanged by the shift.

Another common alteration is a rescaling of the data. Suppose a data set is given, and a second data set is obtained from the first by converting each value to a different unit of measure (every original value x is replaced with a scaled value kx, where k > 0 is the scale factor). Then any measure of center (median or mean) of the new data set is rescaled by the same factor k; any measure of spread (IQR or standard deviation) is rescaled by the same factor k; and any measure of relative standing (percentile or z-score) is unchanged by the rescaling.
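The shifting and rescaling rules above can be checked numerically. The following is a minimal sketch using Python's standard statistics module; the data values, the shift c, and the scale factor k are made up for illustration, and summarize is a hypothetical helper, not part of the original notes.

```python
import statistics

def summarize(values):
    """Return center, spread, and z-scores for a list of numbers."""
    q1, _, q3 = statistics.quantiles(values, n=4)  # quartiles
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    return {
        "mean": mean,
        "median": statistics.median(values),
        "sd": sd,
        "iqr": q3 - q1,
        "z": [(x - mean) / sd for x in values],
    }

data = [61.0, 64.0, 66.0, 67.0, 70.0, 72.0, 75.0]  # made-up heights in inches
c = 2.0    # hypothetical shift added to every value
k = 2.54   # scale factor converting inches to centimeters

original = summarize(data)
shifted = summarize([x + c for x in data])
rescaled = summarize([k * x for x in data])

for name, result in [("original", original), ("shifted", shifted), ("rescaled", rescaled)]:
    print(f"{name:9s} mean={result['mean']:.2f} median={result['median']:.2f} "
          f"sd={result['sd']:.2f} iqr={result['iqr']:.2f}")
```

In the printed summary, the shifted row should differ from the original only in the mean and median, while every entry of the rescaled row should be k times the original; the z-scores agree across all three data sets.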

The Normal Model

Many of the distributions that occur in practice are roughly symmetric and bell-shaped. Mathematicians have devised a theoretical model for such distributions, the normal model. It faithfully describes many real data sets and is the basis for most statistical inference techniques. The normal curve, a curve meant to describe the contour of a symmetric and bell-shaped histogram, is characterized by its mean, labeled µ (the Greek letter "m"), and its standard deviation, labeled σ (the Greek letter "s"). These two numbers determine all the information about the distribution; they are called the parameters of the model. We generally use Roman characters (like x̄, s) to represent statistics, which are computed from the actual data measurements, while we use Greek letters (like µ, σ) to represent parameters, which are theoretical quantities representing our assumptions about what happens in general.

The mean µ of the distribution lies on the scale axis at the position of the central peak of the curve. The points on either side of the mean at which the curve changes concavity are located exactly one standard deviation σ away from the mean; that is, they are located on the axis at the values µ − σ and µ + σ. The normal distribution with mean µ and standard deviation σ is denoted N(µ, σ).
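The claim about concavity can be illustrated numerically. The sketch below assumes Python's statistics.NormalDist as the model and uses hypothetical parameters µ = 100 and σ = 15; second_difference is an illustrative helper that estimates the second derivative of the curve by finite differences.

```python
from statistics import NormalDist

def second_difference(f, x, h=0.01):
    """Central finite-difference estimate of the second derivative f''(x)."""
    return (f(x + h) - 2.0 * f(x) + f(x - h)) / (h * h)

mu, sigma = 100.0, 15.0          # hypothetical parameters
model = NormalDist(mu, sigma)

# Just inside mu + sigma the curve bends downward (negative second derivative);
# just outside it bends upward (positive second derivative).
inside = second_difference(model.pdf, mu + 0.9 * sigma)
outside = second_difference(model.pdf, mu + 1.1 * sigma)

print("concave down inside mu + sigma:", inside < 0)
print("concave up outside mu + sigma:", outside > 0)
```

By symmetry, the same sign change occurs at µ − σ.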

The 68-95-99.7 Rule

For the normal model, the following approximations are useful: about 68% of the data will lie within one standard deviation of the mean (between µ − σ and µ + σ); about 95% of the data will lie within two standard deviations of the mean (between µ − 2σ and µ + 2σ); nearly all (about 99.7%) of the data will lie within three standard deviations of the mean (between µ − 3σ and µ + 3σ).
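The three percentages in the rule can be reproduced from the normal model itself. This is a small sketch using Python's statistics.NormalDist; the function name share_within is an assumption introduced here for illustration.

```python
from statistics import NormalDist

def share_within(k, mu=0.0, sigma=1.0):
    """Proportion of N(mu, sigma) lying within k standard deviations of mu."""
    model = NormalDist(mu, sigma)
    return model.cdf(mu + k * sigma) - model.cdf(mu - k * sigma)

for k in (1, 2, 3):
    print(f"within {k} SD: {share_within(k):.4f}")
# within 1 SD: 0.6827
# within 2 SD: 0.9545
# within 3 SD: 0.9973
```

The result does not depend on the particular values of µ and σ, which is why the rule applies to every normal model.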

Working with the Normal Model

When the normal model is applied to a given situation, sketch a graph of the model and identify the appropriate scale by marking on the horizontal axis the seven-number summary:

µ − 3σ, µ − 2σ, µ − σ, µ, µ + σ, µ + 2σ, µ + 3σ

More specific percentages associated with the normal model N(µ, σ) can be found with your calculator: the percentage of the data lying between two particular values a and b (a < b) is computed as

DISTR normalcdf( a, b, µ, σ )

(If no upper bound b is given, it is understood that b = ∞; use 1E99 for ∞. If no lower bound is given, it is understood that a = −∞; use -1E99 for −∞.)
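For readers working without the calculator, the same computation can be sketched in Python. The function normal_cdf_between below is a hypothetical stand-in for the calculator's normalcdf, built on statistics.NormalDist, and the model N(100, 15) is made up for illustration; math.inf plays the role of 1E99.

```python
import math
from statistics import NormalDist

def normal_cdf_between(a, b, mu=0.0, sigma=1.0):
    """Proportion of N(mu, sigma) lying between a and b (a < b)."""
    model = NormalDist(mu, sigma)
    return model.cdf(b) - model.cdf(a)

mu, sigma = 100.0, 15.0   # hypothetical model N(100, 15)

middle = normal_cdf_between(85.0, 115.0, mu, sigma)          # between mu - sigma and mu + sigma
upper_tail = normal_cdf_between(130.0, math.inf, mu, sigma)  # no upper bound
lower_tail = normal_cdf_between(-math.inf, 70.0, mu, sigma)  # no lower bound

print(f"between 85 and 115: {middle:.4f}")
print(f"above 130:          {upper_tail:.4f}")
print(f"below 70:           {lower_tail:.4f}")
```

The result is a proportion between 0 and 1; multiply by 100 to express it as a percentage.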

The standard normal distribution N(0, 1) has mean 0 and standard deviation 1. If a normal model N(µ, σ) applies to a data set, then the corresponding standardized values z will follow the standard normal distribution N(0, 1). Percentages associated with the standard normal model N(0, 1) can also be found with your calculator by omitting the values of µ and σ: the percentage of the data lying between two particular values a and b (a < b) of a standard normal model is computed as

DISTR normalcdf( a, b )

Technology can also be used to work with the inverse problem: to determine the critical z-score, labeled z, that lies at the pth percentile of the data in the standard normal model (with p entered as a proportion between 0 and 1), compute

DISTR invnorm( p )

More generally, to determine the critical value, labeled x, that lies at the pth percentile of the data in the normal model N(µ, σ), compute

DISTR invnorm( p, µ, σ )
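The inverse problem can be sketched the same way in Python. The example assumes statistics.NormalDist, whose inv_cdf method plays the role of the calculator's invnorm; the percentile p = 0.90 and the model N(100, 15) are hypothetical.

```python
from statistics import NormalDist

p = 0.90                   # hypothetical: the 90th percentile, as a proportion
mu, sigma = 100.0, 15.0    # hypothetical model N(100, 15)

standard = NormalDist()    # the standard normal model N(0, 1)
model = NormalDist(mu, sigma)

z_crit = standard.inv_cdf(p)   # critical z-score at the 90th percentile
x_crit = model.inv_cdf(p)      # critical value in N(mu, sigma)

print(f"z = {z_crit:.4f}")
print(f"x = {x_crit:.2f}")

# Standardizing x_crit recovers z_crit, so the two models agree.
z_from_x = (x_crit - mu) / sigma
```

The last line reflects the statement above that standardized values from N(µ, σ) follow N(0, 1): the critical value x equals µ + zσ.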

The Nearly Normal Condition

The normal model is appropriate only for a data set whose distribution is unimodal and symmetric; check this by consulting a histogram or boxplot to verify the relevant features of the distribution.
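A quick way to look at the shape of a data set is a rough text histogram. The sketch below is only an illustration: text_histogram is a hypothetical helper, and the data values and bin width are made up.

```python
from collections import Counter

def text_histogram(values, bin_width):
    """Return one line of text per bin, with one star per data value."""
    counts = Counter(int(x // bin_width) for x in values)
    lines = []
    for b in range(min(counts), max(counts) + 1):
        low = b * bin_width
        # Counter returns 0 for empty bins, so gaps in the data stay visible.
        lines.append(f"{low:6.1f} to {low + bin_width:6.1f} | {'*' * counts[b]}")
    return lines

data = [61.0, 64.0, 66.0, 67.0, 67.0, 68.0, 70.0, 72.0, 75.0]  # made-up values
lines = text_histogram(data, bin_width=5.0)
print("\n".join(lines))
```

A single central peak with roughly balanced tails is consistent with the condition; two peaks or a long tail on one side suggests the normal model should not be used.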