Data Analysis and Statistical Methods Statistics 651

Size: px
Start display at page:

Download "Data Analysis and Statistical Methods Statistics 651"

Transcription

1 Data Analysis and Statistical Methods Statistics Lecture 10 (MWF) Checking for normality of the data using the QQplot Suhasini Subba Rao

2 Review of previous lecture We calculated probabilities of a normal distribution by standardisation. Example Suppose X N( 3, 0.5), what is P (X 3.5)? Standardise: P (X 3.5) = P ( X ) = P (Z 0.707) (by using the normal tables). We note that when we calculate we are going from a nonstandard normal X N( 3, 0.5) to a standard normal, hence Z = , where Z N(0, 1). We also did the reverse of this finding the values on the x-axis where P (X x) = 0.8, when X N(6, 7) (for example). In this case we had to standardise: P ( X 6 7 x 9 7 ) = 0.8. Look up in the tables the z-value that corresponds to 0.8. This is Therefore x 9 7 = 0.85 and solve for x. 1

3 Checking for Normality (a very rough check) Suppose x 1,..., x n is a sample from a normal distribution with mean µ and variance σ 2. First we order them from the smallest number to the largest number: x (1),..., x (n). Estimate the mean and standard deviations from the data; x and s. Plot all the observations on a number line. Locate the mean x on this line and also the intervals: [ x s, x + s], [ x 2s, x + 2s] and [ x 3s, x + 3s]. If the observations came from a normal, then Roughly 68% of the observations should lie in the interval [ x s, x+s]. 2

4 95% of the observations should lie in the interval [ x 2s, x + 2s]. 99.7% of the observations should lie in the interval [ x 3s, x + 3s]. Remember this means counting the number of points in each interval, and dividing it by the total number of observations. This is an extremely rough way to check for normality. There can exist weird non-normal distributions where the following: Roughly 68% of the observations should lie in the interval [ x s, x+s]. 95% of the observations should lie in the interval [ x 2s, x + 2s]. 99.7% of the observations should lie in the interval [ x 3s, x + 3s]. could be true! 3

5 Motivating the QQplot Lecture 10 (MWF) QQplots We need to find a more accurate method (which is close in idea to the counting in an interval). This motivates the idea of the QQplot. Roughly speaking the QQplots orders the data from the smallest to the largest and plots the data against corresponding normal quantile. Data X 1,..., X n ordered from smallest to largest X (1),..., X (n). Plot X (i) against the i/n quantile of the normal distribution (omitting the first and last observations). If the data comes from a normal distribution (with the mean and variance estimated from the data) the data (empirical quantiles) will match the normal quantiles, and plot should lie on a straightline (on the x = y line). This is the QQplot. 4

6 Checking for normality: The QQ plot This plots what has been described above. The QQplot consists of points and a straight 45 degree line. X X X X X (5) (4) (3) (2) (1)..... x=y line y y y y y (1) (2) (3) (4) (5) If the points tend to lie on the straightline, then this suggests the observations come from a normal distribution. 5

7 Example: Antarctic maximum temperature QQplot It would appear that the maximum temperatures are close to normal. The mean of this data is about 4.5 and the standard deviation is

8 Using this information we can calculate the probabilities. This months maximum temperature is 7 degrees, what is its percentile? Answer P (X 7) = P (Z (7 4.5)/2.16) = This tells me the temperature is in the 87% percentile (using the normal approximation). Based on the data the proportion of temperatures less than 7 degrees is about 86.5% which fits the calculation made using normal approximation of the data very well. 7

9 Example: Antarctic minimum temperature QQplot The minimum temperatures appear to be far from normal. We know that the mean and standard deviation of this data is 13.8 degrees and 9.3 respectively. 8

10 If we use normality of the data to calculate the chance of the temperature being less than -10 we have P (X 10) = P (Z = ) = (about 65.4%). Based on the data the proportion of temperatures less than -10 degrees is about 55%, which is quite different to the proportion calculated using the normal approximation. 9

11 Interpretating a QQ-plot Lecture 10 (MWF) QQplots Some experienced statisticans have shaman like powers when it comes to interpretating QQ-plots. You don t need them, but it is good to have a feel of them. There are two main features you need to look for; Left Skew. This means the distribution is not symmetric. Find the mode (the heightest point of the distribution). The right of the mode should be shorter than the left of the mode. Right Skew. This means the distribution is not symmetric. Find the mode (the heightest point of the distribution). The right of the mode should be longer than the left of the mode. Heavy tails. This means that the probability of large numbers if much more likely than a normal distribution. For example for a 10

12 normal distribution most the observations 98% lie within the interval [ x 3s, x + 3s]. For a heavy tail distribution a far smaller proportion lie in this interval. 11

13 Skewed distributions Lecture 10 (MWF) QQplots A right skewed distribution (red) has a long right tail (green is normal). For a left skewed distribution the QQ-plot is the mirror image along the 45 degree line (arch going upwards and towards the left). 12

14 A right skewed distribution and it s QQplot This is right skewed and we see the qqplots looks like a U. 13

15 QQ-plot of a left skewed distribution The above is indicates a left skewed distribution. The points are arched, going from the below the 45 degree line across it and down again. 14

16 Heavy tail distribution Lecture 10 (MWF) QQplots Has much thicker tails than a normal distribution (the blue are the tails of a normal and red are the tails of a thick tail). 15

17 QQ-plot of a heavy tailed distribution The plot is like an S. On the left of the plot it is left of the 45 degree line and then towards the right it goes to being right of the 45 degree line. 16

18 What does thick tailed distribution mean?? Look at the histogram of the following data set (size 200 observations). Look at the proportion of points outside one/two and three standard deviations of the mean (compare with 68%, 95% and 99.8%). It is a lot more than the normal distribution. Look at the tails, it is higher (thicker) than the normal distribution. 17

19 The corresponding QQplot Below we make a QQplot of the above data set. Lecture 10 (MWF) QQplots The S shape suggests the distribution has thick tails. 18

20 QQplot of the original M&M data It is clearly non-normal. First the horizontal lines that we see is because the data is integer valued (not normal), second it has a strange, shape that does not look at all like it lies on the x=y line. 19

21 QQplot of the average of 5 M&M bags Does not look normal, but certainly the qqplot looks closer x=y then the previous plot. 20

22 QQplot of the average of 10 M&M bags Again it does not look normal, but it does look more normal than the original data. Remember we only have 17 averages, which may explain why the histogram of the averages looks more flat than bell-shaped. 21

23 QQplot of binary data Let us return to the example of people liking apple juice. 100 people were interviewed and each person was asked whether they like apple juice or not (1=yes, 0 = no). Here is the data 22

24 34% of this sample liked apple juice. This data is binary (not normal!), this is why you see the two lines. It is clearly not normal, and you cannot make it more normal by increasing the sample size. What does become normal is the sample proportion (which in this case is 34%) - this is due to the CLT, which we discuss in lecture 12. But only when the sample size is relatively large. 23

25 Transforming Data Lecture 10 (MWF) QQplots If the data is far from normal we often do a transformation of it to make it have less outliers and less skewed. Standard transforms are (for positive data); The log transform; X i log X i = Y i. The variance of the transformed observation tends to be less than the variance of the original observation (sometimes this transformation is called variance stablisation ). Often used when the sample mean and sample variance of X are similar. The power transform; X i X β i = Y i (where 0 < β < 1). This transformation tends to control outliers and unskews the data. 24

26 Left is a QQplot of the original data and the right is the QQplot of the square root of the data (ie. X i X i = X 1/2 i ). Observe how the square root of the data is still skewed - but it is less skewed than the original data. Reducing skewness in data is very useful way of making the CLT work for smaller sample sizes (see later). 25

27 QQ plots and testing for normality There are statistical tests (I have not defined this yet) for checking normality. One of the most famous ones is called the Kolmogorov- Smirnov test. QQ plots for other distributions It is possible by make a QQplot for other distributions. That is to check whether the observations are drawn from another distribution of interest. The QQplot must be modified to the new distribution (where the quantiles of distribution are compared with the ordered data). If you want to know how please ask me. Again the Kolmogorov-Smirnov test can be used to check whether the observations come from the distribution of interest. 26

Data Analysis and Statistical Methods Statistics 651

Data Analysis and Statistical Methods Statistics 651 Data Analysis and Statistical Methods Statistics 651 http://www.stat.tamu.edu/~suhasini/teaching.html Lecture 10 (MWF) Checking for normality of the data using the QQplot Suhasini Subba Rao Checking for

More information

Data Analysis and Statistical Methods Statistics 651

Data Analysis and Statistical Methods Statistics 651 Review of previous lecture: Why confidence intervals? Data Analysis and Statistical Methods Statistics 651 http://www.stat.tamu.edu/~suhasini/teaching.html Suhasini Subba Rao Suppose you want to know the

More information

Data Analysis and Statistical Methods Statistics 651

Data Analysis and Statistical Methods Statistics 651 Data Analysis and Statistical Methods Statistics 651 http://www.stat.tamu.edu/~suhasini/teaching.html Lecture 14 (MWF) The t-distribution Suhasini Subba Rao Review of previous lecture Often the precision

More information

Data Analysis and Statistical Methods Statistics 651

Data Analysis and Statistical Methods Statistics 651 Data Analysis and Statistical Methods Statistics 651 http://www.stat.tamu.edu/~suhasini/teaching.html Lecture 14 (MWF) The t-distribution Suhasini Subba Rao Review of previous lecture Often the precision

More information

Data Analysis and Statistical Methods Statistics 651

Data Analysis and Statistical Methods Statistics 651 Data Analysis and Statistical Methods Statistics 651 http://www.stat.tamu.edu/~suhasini/teaching.html Suhasini Subba Rao The binomial: mean and variance Recall that the number of successes out of n, denoted

More information

Data Analysis and Statistical Methods Statistics 651

Data Analysis and Statistical Methods Statistics 651 Data Analysis and Statistical Methods Statistics 651 http://wwwstattamuedu/~suhasini/teachinghtml Suhasini Subba Rao Review of previous lecture The main idea in the previous lecture is that the sample

More information

Math 2311 Bekki George Office Hours: MW 11am to 12:45pm in 639 PGH Online Thursdays 4-5:30pm And by appointment

Math 2311 Bekki George Office Hours: MW 11am to 12:45pm in 639 PGH Online Thursdays 4-5:30pm And by appointment Math 2311 Bekki George bekki@math.uh.edu Office Hours: MW 11am to 12:45pm in 639 PGH Online Thursdays 4-5:30pm And by appointment Class webpage: http://www.math.uh.edu/~bekki/math2311.html Math 2311 Class

More information

IOP 201-Q (Industrial Psychological Research) Tutorial 5

IOP 201-Q (Industrial Psychological Research) Tutorial 5 IOP 201-Q (Industrial Psychological Research) Tutorial 5 TRUE/FALSE [1 point each] Indicate whether the sentence or statement is true or false. 1. To establish a cause-and-effect relation between two variables,

More information

QQ Plots Stat 342, Spring 2014 Prof. Guttorp - TA Aaron Zimmerman

QQ Plots Stat 342, Spring 2014 Prof. Guttorp - TA Aaron Zimmerman QQ Plots Stat 342, Spring 2014 Prof. Guttorp - TA Aaron Zimmerman To get you started, remember that that a q-q-plot plots (Fn 1 (p), F0 1 (p)) for p (0, 1), where Fn 1 (p) = inf{y : F n (y) p}, where F

More information

CH 5 Normal Probability Distributions Properties of the Normal Distribution

CH 5 Normal Probability Distributions Properties of the Normal Distribution Properties of the Normal Distribution Example A friend that is always late. Let X represent the amount of minutes that pass from the moment you are suppose to meet your friend until the moment your friend

More information

Business Statistics 41000: Probability 4

Business Statistics 41000: Probability 4 Business Statistics 41000: Probability 4 Drew D. Creal University of Chicago, Booth School of Business February 14 and 15, 2014 1 Class information Drew D. Creal Email: dcreal@chicagobooth.edu Office:

More information

Putting Things Together Part 2

Putting Things Together Part 2 Frequency Putting Things Together Part These exercise blend ideas from various graphs (histograms and boxplots), differing shapes of distributions, and values summarizing the data. Data for, and are in

More information

Basic Procedure for Histograms

Basic Procedure for Histograms Basic Procedure for Histograms 1. Compute the range of observations (min. & max. value) 2. Choose an initial # of classes (most likely based on the range of values, try and find a number of classes that

More information

Review of commonly missed questions on the online quiz. Lecture 7: Random variables] Expected value and standard deviation. Let s bet...

Review of commonly missed questions on the online quiz. Lecture 7: Random variables] Expected value and standard deviation. Let s bet... Recap Review of commonly missed questions on the online quiz Lecture 7: ] Statistics 101 Mine Çetinkaya-Rundel OpenIntro quiz 2: questions 4 and 5 September 20, 2011 Statistics 101 (Mine Çetinkaya-Rundel)

More information

Normal Probability Distributions

Normal Probability Distributions Normal Probability Distributions Properties of Normal Distributions The most important probability distribution in statistics is the normal distribution. Normal curve A normal distribution is a continuous

More information

Statistics and Probability

Statistics and Probability Statistics and Probability Continuous RVs (Normal); Confidence Intervals Outline Continuous random variables Normal distribution CLT Point estimation Confidence intervals http://www.isrec.isb-sib.ch/~darlene/geneve/

More information

Numerical Descriptive Measures. Measures of Center: Mean and Median

Numerical Descriptive Measures. Measures of Center: Mean and Median Steve Sawin Statistics Numerical Descriptive Measures Having seen the shape of a distribution by looking at the histogram, the two most obvious questions to ask about the specific distribution is where

More information

Overview/Outline. Moving beyond raw data. PSY 464 Advanced Experimental Design. Describing and Exploring Data The Normal Distribution

Overview/Outline. Moving beyond raw data. PSY 464 Advanced Experimental Design. Describing and Exploring Data The Normal Distribution PSY 464 Advanced Experimental Design Describing and Exploring Data The Normal Distribution 1 Overview/Outline Questions-problems? Exploring/Describing data Organizing/summarizing data Graphical presentations

More information

1 Small Sample CI for a Population Mean µ

1 Small Sample CI for a Population Mean µ Lecture 7: Small Sample Confidence Intervals Based on a Normal Population Distribution Readings: Sections 7.4-7.5 1 Small Sample CI for a Population Mean µ The large sample CI x ± z α/2 s n was constructed

More information

Elementary Statistics

Elementary Statistics Chapter 7 Estimation Goal: To become familiar with how to use Excel 2010 for Estimation of Means. There is one Stat Tool in Excel that is used with estimation of means, T.INV.2T. Open Excel and click on

More information

Moments and Measures of Skewness and Kurtosis

Moments and Measures of Skewness and Kurtosis Moments and Measures of Skewness and Kurtosis Moments The term moment has been taken from physics. The term moment in statistical use is analogous to moments of forces in physics. In statistics the values

More information

Lecture 2 Describing Data

Lecture 2 Describing Data Lecture 2 Describing Data Thais Paiva STA 111 - Summer 2013 Term II July 2, 2013 Lecture Plan 1 Types of data 2 Describing the data with plots 3 Summary statistics for central tendency and spread 4 Histograms

More information

The topics in this section are related and necessary topics for both course objectives.

The topics in this section are related and necessary topics for both course objectives. 2.5 Probability Distributions The topics in this section are related and necessary topics for both course objectives. A probability distribution indicates how the probabilities are distributed for outcomes

More information

Chapter 7 Sampling Distributions and Point Estimation of Parameters

Chapter 7 Sampling Distributions and Point Estimation of Parameters Chapter 7 Sampling Distributions and Point Estimation of Parameters Part 1: Sampling Distributions, the Central Limit Theorem, Point Estimation & Estimators Sections 7-1 to 7-2 1 / 25 Statistical Inferences

More information

1 Describing Distributions with numbers

1 Describing Distributions with numbers 1 Describing Distributions with numbers Only for quantitative variables!! 1.1 Describing the center of a data set The mean of a set of numerical observation is the familiar arithmetic average. To write

More information

Chapter 7 Study Guide: The Central Limit Theorem

Chapter 7 Study Guide: The Central Limit Theorem Chapter 7 Study Guide: The Central Limit Theorem Introduction Why are we so concerned with means? Two reasons are that they give us a middle ground for comparison and they are easy to calculate. In this

More information

Chapter ! Bell Shaped

Chapter ! Bell Shaped Chapter 6 6-1 Business Statistics: A First Course 5 th Edition Chapter 7 Continuous Probability Distributions Learning Objectives In this chapter, you learn:! To compute probabilities from the normal distribution!

More information

Lecture 1: Review and Exploratory Data Analysis (EDA)

Lecture 1: Review and Exploratory Data Analysis (EDA) Lecture 1: Review and Exploratory Data Analysis (EDA) Ani Manichaikul amanicha@jhsph.edu 16 April 2007 1 / 40 Course Information I Office hours For questions and help When? I ll announce this tomorrow

More information

STAB22 section 1.3 and Chapter 1 exercises

STAB22 section 1.3 and Chapter 1 exercises STAB22 section 1.3 and Chapter 1 exercises 1.101 Go up and down two times the standard deviation from the mean. So 95% of scores will be between 572 (2)(51) = 470 and 572 + (2)(51) = 674. 1.102 Same idea

More information

Business Statistics 41000: Probability 3

Business Statistics 41000: Probability 3 Business Statistics 41000: Probability 3 Drew D. Creal University of Chicago, Booth School of Business February 7 and 8, 2014 1 Class information Drew D. Creal Email: dcreal@chicagobooth.edu Office: 404

More information

Data Analysis and Statistical Methods Statistics 651

Data Analysis and Statistical Methods Statistics 651 Data Analysis and Statistical Methods Statistics 651 http://www.stat.tamu.edu/~suhasini/teaching.html Lecture 13 (MWF) Designing the experiment: Margin of Error Suhasini Subba Rao Terminology: The population

More information

Chapter 7. Sampling Distributions

Chapter 7. Sampling Distributions Chapter 7 Sampling Distributions Section 7.1 Sampling Distributions and the Central Limit Theorem Sampling Distributions Sampling distribution The probability distribution of a sample statistic. Formed

More information

Biostatistics and Design of Experiments Prof. Mukesh Doble Department of Biotechnology Indian Institute of Technology, Madras

Biostatistics and Design of Experiments Prof. Mukesh Doble Department of Biotechnology Indian Institute of Technology, Madras Biostatistics and Design of Experiments Prof. Mukesh Doble Department of Biotechnology Indian Institute of Technology, Madras Lecture - 05 Normal Distribution So far we have looked at discrete distributions

More information

Assessing Normality. Contents. 1 Assessing Normality. 1.1 Introduction. Anthony Tanbakuchi Department of Mathematics Pima Community College

Assessing Normality. Contents. 1 Assessing Normality. 1.1 Introduction. Anthony Tanbakuchi Department of Mathematics Pima Community College Introductory Statistics Lectures Assessing Normality Department of Mathematics Pima Community College Redistribution of this material is prohibited without written permission of the author 2009 (Compile

More information

STAT 201 Chapter 6. Distribution

STAT 201 Chapter 6. Distribution STAT 201 Chapter 6 Distribution 1 Random Variable We know variable Random Variable: a numerical measurement of the outcome of a random phenomena Capital letter refer to the random variable Lower case letters

More information

1. Variability in estimates and CLT

1. Variability in estimates and CLT Unit3: Foundationsforinference 1. Variability in estimates and CLT Sta 101 - Fall 2015 Duke University, Department of Statistical Science Dr. Çetinkaya-Rundel Slides posted at http://bit.ly/sta101_f15

More information

As you draw random samples of size n, as n increases, the sample means tend to be normally distributed.

As you draw random samples of size n, as n increases, the sample means tend to be normally distributed. The Central Limit Theorem The central limit theorem (clt for short) is one of the most powerful and useful ideas in all of statistics. The clt says that if we collect samples of size n with a "large enough

More information

Skewness and the Mean, Median, and Mode *

Skewness and the Mean, Median, and Mode * OpenStax-CNX module: m46931 1 Skewness and the Mean, Median, and Mode * OpenStax This work is produced by OpenStax-CNX and licensed under the Creative Commons Attribution License 3.0 Consider the following

More information

Simple Descriptive Statistics

Simple Descriptive Statistics Simple Descriptive Statistics These are ways to summarize a data set quickly and accurately The most common way of describing a variable distribution is in terms of two of its properties: Central tendency

More information

The Normal Distribution

The Normal Distribution Stat 6 Introduction to Business Statistics I Spring 009 Professor: Dr. Petrutza Caragea Section A Tuesdays and Thursdays 9:300:50 a.m. Chapter, Section.3 The Normal Distribution Density Curves So far we

More information

AP Statistics Chapter 6 - Random Variables

AP Statistics Chapter 6 - Random Variables AP Statistics Chapter 6 - Random 6.1 Discrete and Continuous Random Objective: Recognize and define discrete random variables, and construct a probability distribution table and a probability histogram

More information

CHAPTER 2 Describing Data: Numerical

CHAPTER 2 Describing Data: Numerical CHAPTER Multiple-Choice Questions 1. A scatter plot can illustrate all of the following except: A) the median of each of the two variables B) the range of each of the two variables C) an indication of

More information

DATA SUMMARIZATION AND VISUALIZATION

DATA SUMMARIZATION AND VISUALIZATION APPENDIX DATA SUMMARIZATION AND VISUALIZATION PART 1 SUMMARIZATION 1: BUILDING BLOCKS OF DATA ANALYSIS 294 PART 2 PART 3 PART 4 VISUALIZATION: GRAPHS AND TABLES FOR SUMMARIZING AND ORGANIZING DATA 296

More information

Numerical Descriptions of Data

Numerical Descriptions of Data Numerical Descriptions of Data Measures of Center Mean x = x i n Excel: = average ( ) Weighted mean x = (x i w i ) w i x = data values x i = i th data value w i = weight of the i th data value Median =

More information

Copyright 2011 Pearson Education, Inc. Publishing as Addison-Wesley.

Copyright 2011 Pearson Education, Inc. Publishing as Addison-Wesley. Appendix: Statistics in Action Part I Financial Time Series 1. These data show the effects of stock splits. If you investigate further, you ll find that most of these splits (such as in May 1970) are 3-for-1

More information

QQ PLOT Yunsi Wang, Tyler Steele, Eva Zhang Spring 2016

QQ PLOT Yunsi Wang, Tyler Steele, Eva Zhang Spring 2016 QQ PLOT INTERPRETATION: Quantiles: QQ PLOT Yunsi Wang, Tyler Steele, Eva Zhang Spring 2016 The quantiles are values dividing a probability distribution into equal intervals, with every interval having

More information

Exam 2 Spring 2015 Statistics for Applications 4/9/2015

Exam 2 Spring 2015 Statistics for Applications 4/9/2015 18.443 Exam 2 Spring 2015 Statistics for Applications 4/9/2015 1. True or False (and state why). (a). The significance level of a statistical test is not equal to the probability that the null hypothesis

More information

Unit2: Probabilityanddistributions. 3. Normal distribution

Unit2: Probabilityanddistributions. 3. Normal distribution Announcements Unit: Probabilityanddistributions 3 Normal distribution Sta 101 - Spring 015 Duke University, Department of Statistical Science February, 015 Peer evaluation 1 by Friday 11:59pm Office hours:

More information

Chapter 3. Numerical Descriptive Measures. Copyright 2016 Pearson Education, Ltd. Chapter 3, Slide 1

Chapter 3. Numerical Descriptive Measures. Copyright 2016 Pearson Education, Ltd. Chapter 3, Slide 1 Chapter 3 Numerical Descriptive Measures Copyright 2016 Pearson Education, Ltd. Chapter 3, Slide 1 Objectives In this chapter, you learn to: Describe the properties of central tendency, variation, and

More information

Stat 101 Exam 1 - Embers Important Formulas and Concepts 1

Stat 101 Exam 1 - Embers Important Formulas and Concepts 1 1 Chapter 1 1.1 Definitions Stat 101 Exam 1 - Embers Important Formulas and Concepts 1 1. Data Any collection of numbers, characters, images, or other items that provide information about something. 2.

More information

NCSS Statistical Software. Reference Intervals

NCSS Statistical Software. Reference Intervals Chapter 586 Introduction A reference interval contains the middle 95% of measurements of a substance from a healthy population. It is a type of prediction interval. This procedure calculates one-, and

More information

Financial Econometrics (FinMetrics04) Time-series Statistics Concepts Exploratory Data Analysis Testing for Normality Empirical VaR

Financial Econometrics (FinMetrics04) Time-series Statistics Concepts Exploratory Data Analysis Testing for Normality Empirical VaR Financial Econometrics (FinMetrics04) Time-series Statistics Concepts Exploratory Data Analysis Testing for Normality Empirical VaR Nelson Mark University of Notre Dame Fall 2017 September 11, 2017 Introduction

More information

ECON 214 Elements of Statistics for Economists 2016/2017

ECON 214 Elements of Statistics for Economists 2016/2017 ECON 214 Elements of Statistics for Economists 2016/2017 Topic The Normal Distribution Lecturer: Dr. Bernardin Senadza, Dept. of Economics bsenadza@ug.edu.gh College of Education School of Continuing and

More information

Terms & Characteristics

Terms & Characteristics NORMAL CURVE Knowledge that a variable is distributed normally can be helpful in drawing inferences as to how frequently certain observations are likely to occur. NORMAL CURVE A Normal distribution: Distribution

More information

Measures of Dispersion (Range, standard deviation, standard error) Introduction

Measures of Dispersion (Range, standard deviation, standard error) Introduction Measures of Dispersion (Range, standard deviation, standard error) Introduction We have already learnt that frequency distribution table gives a rough idea of the distribution of the variables in a sample

More information

Statistics for Business and Economics: Random Variables:Continuous

Statistics for Business and Economics: Random Variables:Continuous Statistics for Business and Economics: Random Variables:Continuous STT 315: Section 107 Acknowledgement: I d like to thank Dr. Ashoke Sinha for allowing me to use and edit the slides. Murray Bourne (interactive

More information

SYSM 6304 Risk and Decision Analysis Lecture 2: Fitting Distributions to Data

SYSM 6304 Risk and Decision Analysis Lecture 2: Fitting Distributions to Data SYSM 6304 Risk and Decision Analysis Lecture 2: Fitting Distributions to Data M. Vidyasagar Cecil & Ida Green Chair The University of Texas at Dallas Email: M.Vidyasagar@utdallas.edu September 5, 2015

More information

Statistical Intervals (One sample) (Chs )

Statistical Intervals (One sample) (Chs ) 7 Statistical Intervals (One sample) (Chs 8.1-8.3) Confidence Intervals The CLT tells us that as the sample size n increases, the sample mean X is close to normally distributed with expected value µ and

More information

the display, exploration and transformation of the data are demonstrated and biases typically encountered are highlighted.

the display, exploration and transformation of the data are demonstrated and biases typically encountered are highlighted. 1 Insurance data Generalized linear modeling is a methodology for modeling relationships between variables. It generalizes the classical normal linear model, by relaxing some of its restrictive assumptions,

More information

Chapter 4. The Normal Distribution

Chapter 4. The Normal Distribution Chapter 4 The Normal Distribution 1 Chapter 4 Overview Introduction 4-1 Normal Distributions 4-2 Applications of the Normal Distribution 4-3 The Central Limit Theorem 4-4 The Normal Approximation to the

More information

Chapter 6. y y. Standardizing with z-scores. Standardizing with z-scores (cont.)

Chapter 6. y y. Standardizing with z-scores. Standardizing with z-scores (cont.) Starter Ch. 6: A z-score Analysis Starter Ch. 6 Your Statistics teacher has announced that the lower of your two tests will be dropped. You got a 90 on test 1 and an 85 on test 2. You re all set to drop

More information

STAT:2010 Statistical Methods and Computing. Using density curves to describe the distribution of values of a quantitative

STAT:2010 Statistical Methods and Computing. Using density curves to describe the distribution of values of a quantitative STAT:10 Statistical Methods and Computing Normal Distributions Lecture 4 Feb. 6, 17 Kate Cowles 374 SH, 335-0727 kate-cowles@uiowa.edu 1 2 Using density curves to describe the distribution of values of

More information

Frequency Distribution and Summary Statistics

Frequency Distribution and Summary Statistics Frequency Distribution and Summary Statistics Dongmei Li Department of Public Health Sciences Office of Public Health Studies University of Hawai i at Mānoa Outline 1. Stemplot 2. Frequency table 3. Summary

More information

Lecture Slides. Elementary Statistics Tenth Edition. by Mario F. Triola. and the Triola Statistics Series. Slide 1

Lecture Slides. Elementary Statistics Tenth Edition. by Mario F. Triola. and the Triola Statistics Series. Slide 1 Lecture Slides Elementary Statistics Tenth Edition and the Triola Statistics Series by Mario F. Triola Slide 1 Chapter 6 Normal Probability Distributions 6-1 Overview 6-2 The Standard Normal Distribution

More information

We will also use this topic to help you see how the standard deviation might be useful for distributions which are normally distributed.

We will also use this topic to help you see how the standard deviation might be useful for distributions which are normally distributed. We will discuss the normal distribution in greater detail in our unit on probability. However, as it is often of use to use exploratory data analysis to determine if the sample seems reasonably normally

More information

CHAPTER 6. ' From the table the z value corresponding to this value Z = 1.96 or Z = 1.96 (d) P(Z >?) =

CHAPTER 6. ' From the table the z value corresponding to this value Z = 1.96 or Z = 1.96 (d) P(Z >?) = Solutions to End-of-Section and Chapter Review Problems 225 CHAPTER 6 6.1 (a) P(Z < 1.20) = 0.88493 P(Z > 1.25) = 1 0.89435 = 0.10565 P(1.25 < Z < 1.70) = 0.95543 0.89435 = 0.06108 (d) P(Z < 1.25) or Z

More information

If the distribution of a random variable x is approximately normal, then

If the distribution of a random variable x is approximately normal, then Confidence Intervals for the Mean (σ unknown) In many real life situations, the standard deviation is unknown. In order to construct a confidence interval for a random variable that is normally distributed

More information

Lecture 6: Normal distribution

Lecture 6: Normal distribution Lecture 6: Normal distribution Statistics 101 Mine Çetinkaya-Rundel February 2, 2012 Announcements Announcements HW 1 due now. Due: OQ 2 by Monday morning 8am. Statistics 101 (Mine Çetinkaya-Rundel) L6:

More information

Chapter 5. Sampling Distributions

Chapter 5. Sampling Distributions Lecture notes, Lang Wu, UBC 1 Chapter 5. Sampling Distributions 5.1. Introduction In statistical inference, we attempt to estimate an unknown population characteristic, such as the population mean, µ,

More information

Financial Time Series and Their Characteristics

Financial Time Series and Their Characteristics Financial Time Series and Their Characteristics Egon Zakrajšek Division of Monetary Affairs Federal Reserve Board Summer School in Financial Mathematics Faculty of Mathematics & Physics University of Ljubljana

More information

Chapter 4: Commonly Used Distributions. Statistics for Engineers and Scientists Fourth Edition William Navidi

Chapter 4: Commonly Used Distributions. Statistics for Engineers and Scientists Fourth Edition William Navidi Chapter 4: Commonly Used Distributions Statistics for Engineers and Scientists Fourth Edition William Navidi 2014 by Education. This is proprietary material solely for authorized instructor use. Not authorized

More information

What was in the last lecture?

What was in the last lecture? What was in the last lecture? Normal distribution A continuous rv with bell-shaped density curve The pdf is given by f(x) = 1 2πσ e (x µ)2 2σ 2, < x < If X N(µ, σ 2 ), E(X) = µ and V (X) = σ 2 Standard

More information

3.1 Measures of Central Tendency

3.1 Measures of Central Tendency 3.1 Measures of Central Tendency n Summation Notation x i or x Sum observation on the variable that appears to the right of the summation symbol. Example 1 Suppose the variable x i is used to represent

More information

Week 1 Variables: Exploration, Familiarisation and Description. Descriptive Statistics.

Week 1 Variables: Exploration, Familiarisation and Description. Descriptive Statistics. Week 1 Variables: Exploration, Familiarisation and Description. Descriptive Statistics. Convergent validity: the degree to which results/evidence from different tests/sources, converge on the same conclusion.

More information

Description of Data I

Description of Data I Description of Data I (Summary and Variability measures) Objectives: Able to understand how to summarize the data Able to understand how to measure the variability of the data Able to use and interpret

More information

5-1 pg ,4,5, EOO,39,47,50,53, pg ,5,9,13,17,19,21,22,25,30,31,32, pg.269 1,29,13,16,17,19,20,25,26,28,31,33,38

5-1 pg ,4,5, EOO,39,47,50,53, pg ,5,9,13,17,19,21,22,25,30,31,32, pg.269 1,29,13,16,17,19,20,25,26,28,31,33,38 5-1 pg. 242 3,4,5, 17-37 EOO,39,47,50,53,56 5-2 pg. 249 9,10,13,14,17,18 5-3 pg. 257 1,5,9,13,17,19,21,22,25,30,31,32,34 5-4 pg.269 1,29,13,16,17,19,20,25,26,28,31,33,38 5-5 pg. 281 5-14,16,19,21,22,25,26,30

More information

Section 5 3 The Mean and Standard Deviation of a Binomial Distribution!

Section 5 3 The Mean and Standard Deviation of a Binomial Distribution! Section 5 3 The Mean and Standard Deviation of a Binomial Distribution! Previous sections required that you to find the Mean and Standard Deviation of a Binomial Distribution by using the values from a

More information

Module 4: Probability

Module 4: Probability Module 4: Probability 1 / 22 Probability concepts in statistical inference Probability is a way of quantifying uncertainty associated with random events and is the basis for statistical inference. Inference

More information

Examples of continuous probability distributions: The normal and standard normal

Examples of continuous probability distributions: The normal and standard normal Examples of continuous probability distributions: The normal and standard normal The Normal Distribution f(x) Changing μ shifts the distribution left or right. Changing σ increases or decreases the spread.

More information

Both the quizzes and exams are closed book. However, For quizzes: Formulas will be provided with quiz papers if there is any need.

Both the quizzes and exams are closed book. However, For quizzes: Formulas will be provided with quiz papers if there is any need. Both the quizzes and exams are closed book. However, For quizzes: Formulas will be provided with quiz papers if there is any need. For exams (MD1, MD2, and Final): You may bring one 8.5 by 11 sheet of

More information

6.1, 7.1 Estimating with confidence (CIS: Chapter 10)

6.1, 7.1 Estimating with confidence (CIS: Chapter 10) Objectives 6.1, 7.1 Estimating with confidence (CIS: Chapter 10) Statistical confidence (CIS gives a good explanation of a 95% CI) Confidence intervals Choosing the sample size t distributions One-sample

More information

appstats5.notebook September 07, 2016 Chapter 5

appstats5.notebook September 07, 2016 Chapter 5 Chapter 5 Describing Distributions Numerically Chapter 5 Objective: Students will be able to use statistics appropriate to the shape of the data distribution to compare of two or more different data sets.

More information

Data that can be any numerical value are called continuous. These are usually things that are measured, such as height, length, time, speed, etc.

Data that can be any numerical value are called continuous. These are usually things that are measured, such as height, length, time, speed, etc. Chapter 8 Measures of Center Data that can be any numerical value are called continuous. These are usually things that are measured, such as height, length, time, speed, etc. Data that can only be integer

More information

FEEG6017 lecture: The normal distribution, estimation, confidence intervals. Markus Brede,

FEEG6017 lecture: The normal distribution, estimation, confidence intervals. Markus Brede, FEEG6017 lecture: The normal distribution, estimation, confidence intervals. Markus Brede, mb8@ecs.soton.ac.uk The normal distribution The normal distribution is the classic "bell curve". We've seen that

More information

Normal Approximation to Binomial Distributions

Normal Approximation to Binomial Distributions Normal Approximation to Binomial Distributions Charlie Vollmer Department of Statistics Colorado State University Fort Collins, CO charlesv@rams.colostate.edu May 19, 2017 Abstract This document is a supplement

More information

DESCRIBING DATA: MESURES OF LOCATION

DESCRIBING DATA: MESURES OF LOCATION DESCRIBING DATA: MESURES OF LOCATION A. Measures of Central Tendency Measures of Central Tendency are used to pinpoint the center or average of a data set which can then be used to represent the typical

More information

MA 1125 Lecture 05 - Measures of Spread. Wednesday, September 6, Objectives: Introduce variance, standard deviation, range.

MA 1125 Lecture 05 - Measures of Spread. Wednesday, September 6, Objectives: Introduce variance, standard deviation, range. MA 115 Lecture 05 - Measures of Spread Wednesday, September 6, 017 Objectives: Introduce variance, standard deviation, range. 1. Measures of Spread In Lecture 04, we looked at several measures of central

More information

10/1/2012. PSY 511: Advanced Statistics for Psychological and Behavioral Research 1

10/1/2012. PSY 511: Advanced Statistics for Psychological and Behavioral Research 1 PSY 511: Advanced Statistics for Psychological and Behavioral Research 1 Pivotal subject: distributions of statistics. Foundation linchpin important crucial You need sampling distributions to make inferences:

More information

Homework: Due Wed, Feb 20 th. Chapter 8, # 60a + 62a (count together as 1), 74, 82

Homework: Due Wed, Feb 20 th. Chapter 8, # 60a + 62a (count together as 1), 74, 82 Announcements: Week 5 quiz begins at 4pm today and ends at 3pm on Wed If you take more than 20 minutes to complete your quiz, you will only receive partial credit. (It doesn t cut you off.) Today: Sections

More information

Chapter 7 1. Random Variables

Chapter 7 1. Random Variables Chapter 7 1 Random Variables random variable numerical variable whose value depends on the outcome of a chance experiment - discrete if its possible values are isolated points on a number line - continuous

More information

Statistics 431 Spring 2007 P. Shaman. Preliminaries

Statistics 431 Spring 2007 P. Shaman. Preliminaries Statistics 4 Spring 007 P. Shaman The Binomial Distribution Preliminaries A binomial experiment is defined by the following conditions: A sequence of n trials is conducted, with each trial having two possible

More information

STAT 113 Variability

STAT 113 Variability STAT 113 Variability Colin Reimer Dawson Oberlin College September 14, 2017 1 / 48 Outline Last Time: Shape and Center Variability Boxplots and the IQR Variance and Standard Deviaton Transformations 2

More information

ECON 214 Elements of Statistics for Economists

ECON 214 Elements of Statistics for Economists ECON 214 Elements of Statistics for Economists Session 7 The Normal Distribution Part 1 Lecturer: Dr. Bernardin Senadza, Dept. of Economics Contact Information: bsenadza@ug.edu.gh College of Education

More information

Lecture 5: Fundamentals of Statistical Analysis and Distributions Derived from Normal Distributions

Lecture 5: Fundamentals of Statistical Analysis and Distributions Derived from Normal Distributions Lecture 5: Fundamentals of Statistical Analysis and Distributions Derived from Normal Distributions ELE 525: Random Processes in Information Systems Hisashi Kobayashi Department of Electrical Engineering

More information

χ 2 distributions and confidence intervals for population variance

χ 2 distributions and confidence intervals for population variance χ 2 distributions and confidence intervals for population variance Let Z be a standard Normal random variable, i.e., Z N(0, 1). Define Y = Z 2. Y is a non-negative random variable. Its distribution is

More information

Web Science & Technologies University of Koblenz Landau, Germany. Lecture Data Science. Statistics and Probabilities JProf. Dr.

Web Science & Technologies University of Koblenz Landau, Germany. Lecture Data Science. Statistics and Probabilities JProf. Dr. Web Science & Technologies University of Koblenz Landau, Germany Lecture Data Science Statistics and Probabilities JProf. Dr. Claudia Wagner Data Science Open Position @GESIS Student Assistant Job in Data

More information

Introduction to Computational Finance and Financial Econometrics Descriptive Statistics

Introduction to Computational Finance and Financial Econometrics Descriptive Statistics You can t see this text! Introduction to Computational Finance and Financial Econometrics Descriptive Statistics Eric Zivot Summer 2015 Eric Zivot (Copyright 2015) Descriptive Statistics 1 / 28 Outline

More information

Statistical Methods in Practice STAT/MATH 3379

Statistical Methods in Practice STAT/MATH 3379 Statistical Methods in Practice STAT/MATH 3379 Dr. A. B. W. Manage Associate Professor of Mathematics & Statistics Department of Mathematics & Statistics Sam Houston State University Overview 6.1 Discrete

More information

Probability. An intro for calculus students P= Figure 1: A normal integral

Probability. An intro for calculus students P= Figure 1: A normal integral Probability An intro for calculus students.8.6.4.2 P=.87 2 3 4 Figure : A normal integral Suppose we flip a coin 2 times; what is the probability that we get more than 2 heads? Suppose we roll a six-sided

More information

Descriptive Statistics (Devore Chapter One)

Descriptive Statistics (Devore Chapter One) Descriptive Statistics (Devore Chapter One) 1016-345-01 Probability and Statistics for Engineers Winter 2010-2011 Contents 0 Perspective 1 1 Pictorial and Tabular Descriptions of Data 2 1.1 Stem-and-Leaf

More information