Most of the transformations we will deal with will be in the families of powers and roots: X -> (X^p - 1)/p.

1 Powers and Roots

Quite often when we're dealing with quantitative data, it turns out that, for the purposes of analysis, it is useful to carry out a transformation of one of the variables of interest. This could occur for any one of a number of reasons:

- A distribution may be skewed. This can be problematic for several reasons.
  1) Skewed distributions are difficult to examine because many observations may be piled up in one place.
  2) Unusual observations in the direction opposite from that of the skew may be hidden.
  3) Summary measures like the mean and median don't necessarily make much sense for a skewed distribution.
- The relationship between two variables may be nonlinear. Again, this can be a problem for several reasons.
  1) Linear relationships are easier to interpret than nonlinear ones. With a linear relationship, after all, a one-unit change in the independent variable leads to the same change in the dependent variable, whatever the starting level.
  2) The statistical theory for linear models is simple and well developed. This is what we will be learning for the rest of the quarter.
  3) Nonparametric regression is not feasible when there is more than one independent variable.
- Spread may be nonconstant. There are cases where the spread of a variable will depend on its level. In that situation, comparing two distributions of the same variable from populations that are at different levels may be difficult.
- One or both of the variables may be a proportion.

Most of the transformations we will deal with will be in the families of powers and roots: X -> (X^p - 1)/p. In the case that p is zero, we will substitute the log transformation, X -> log(X). The relationships of these transformed variables to the original are summarized in a figure:

[Figure: transformed variables plotted against the original variable]

Note that you will not always be using the specific formula listed above. You may need to fiddle a bit so that all of the values being transformed are positive. Also, as a general rule, carrying out a transformation only makes sense when the ratio of the largest to the smallest value is quite large.
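The family described above can be sketched in a few lines. This is an illustrative Python sketch rather than STATA; the function name power_transform is hypothetical, and it assumes the (X^p - 1)/p form with the natural log substituted at p = 0.

```python
import math

def power_transform(x, p):
    """Return (x**p - 1) / p, or ln(x) when p == 0 (x must be positive)."""
    if x <= 0:
        raise ValueError("x must be positive")
    if p == 0:
        return math.log(x)
    return (x ** p - 1) / p

# Walk along the family for a single value; every member maps x = 1 to 0
for p in (-1, 0, 0.5, 1, 2, 3):
    print(p, round(power_transform(2.0, p), 4))
```

As p gets close to zero, (x^p - 1)/p gets close to ln(x), which is why the log fills the p = 0 slot in the family.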

2 Creating transformed variables in STATA with the generate command

An important command you will need to use in STATA to create transformed variables is the generate command. The format is

    generate newvariable = expression involving existing variables

For example, the curves in the previous transparency were generated from an original variable x by the following commands:

    generate y0 = ln(x)
    generate y1 = x - 1
    generate y2 = x^2 - 1
    generate y3 = x^3 - 1
    generate ym1 = (x^-1 - 1)/-1

The original variable x had been created to consist of 1000 observations ranging from 0 to 4. Note that ln(x) is the natural logarithm of x, log10(x) is the log base 10 of x, and x^3 is x cubed. (In current versions of STATA, log(x) is a synonym for ln(x).)

The range of functions you can include in your expression is quite wide. help functions will summarize them for you. For example, normd(x) is the height of the normal distribution at x, normprob(x) is the value of the cumulative normal at x, and so on. (Current versions of STATA call these normalden() and normal().)

[Figure: example curves, including y = normd(x), y = normprob(x), and y = sin(1/x)]
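For readers without STATA at hand, the same columns can be built as follows. This is a rough Python equivalent, not STATA syntax: the list x is a hypothetical stand-in for the original variable (it starts just above 0, because the log of 0 is undefined), and statistics.NormalDist is used here in place of the normal density and cumulative normal functions.

```python
import math
from statistics import NormalDist

# Hypothetical stand-in for the original variable: 1000 values in (0, 4]
x = [4 * i / 1000 for i in range(1, 1001)]

y0 = [math.log(v) for v in x]            # generate y0 = ln(x)
y1 = [v - 1 for v in x]                  # generate y1 = x - 1
y2 = [v ** 2 - 1 for v in x]             # generate y2 = x^2 - 1
y3 = [v ** 3 - 1 for v in x]             # generate y3 = x^3 - 1
ym1 = [(v ** -1 - 1) / -1 for v in x]    # generate ym1 = (x^-1 - 1)/-1

std_normal = NormalDist()                     # mean 0, standard deviation 1
density = [std_normal.pdf(v) for v in x]      # height of the normal curve at x
cumulative = [std_normal.cdf(v) for v in x]   # cumulative normal at x
```

All five transformed columns are zero where x equals 1, so the curves cross at a single point.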

4 Dealing with skewness

If you have a variable that is skewed, you may use a power/root or log transformation to try to make it look more normal, or at least more symmetric. In general, if a distribution is skewed to the right, you may need to experiment with a log or a root transformation, while if it is skewed to the left, you may need to square or cube it. Here are some examples.

First, a situation where a variable is skewed to the left. On the left is a smoothed univariate distribution (produced with STATA's kdensity) of countries according to the percentage of males with a primary school education in 1993 (prim93m). Note that it is skewed to the left, indicating that while most countries have a fairly high proportion of males with some primary school education, some have very little. On the right is a smoothed distribution of a variable prim93m2 generated by squaring prim93m. Note that it is much more symmetric, and even sort of normal.

[Figure: smoothed distributions of "Primary 93 M (World Bank)" (left) and prim93m2 (right)]

Here's a situation where transformation of a left-skewed distribution improves symmetry, but not normality. On the left is a distribution of life expectancies in 1995; on the right is a distribution of life expectancy to the 4th power.

[Figure: smoothed distributions of "e0 in 1995 (World Bank)" (left) and of its 4th power (right)]

The smoothed distribution is the univariate equivalent of the bivariate smoothing we saw earlier.

5

Just as we can raise to a power to adjust for a left skew, we can log or take a root to correct for a right skew. On the left is a smoothed distribution of per capita GDP; on the right is the equivalent for logged per capita GDP.

[Figure: smoothed distributions of "United Nations per capita GDP" (left) and lpcgdp95 (right)]

Below is an extreme example. On the left is a distribution of surface areas of countries. It is right skewed, indicating that while most countries have a small surface area, some have very large ones. On the right is a distribution of the tenth root of surface areas, the result of some experimentation.

[Figure: smoothed distributions of "Surface Area in sq. km." (World Bank) (left) and AreaRt2 (right)]
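One way to put a number on "more symmetric" is the moment coefficient of skewness, which is zero for a symmetric distribution and positive for a right skew. The following Python sketch uses made-up data, not the World Bank or United Nations variables discussed above, and the helper name skewness is hypothetical.

```python
import math

def skewness(values):
    """Moment coefficient of skewness: mean cubed deviation / sd**3."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

# Made-up right-skewed data: most values small, a few very large
raw = [1, 2, 2, 3, 4, 5, 8, 15, 40, 200]
rooted = [v ** 0.5 for v in raw]
logged = [math.log(v) for v in raw]

for label, data in (("raw", raw), ("square root", rooted), ("log", logged)):
    print(label, round(skewness(data), 2))
```

For this data the square root reduces the skewness somewhat and the log reduces it further, in line with the advice to move further down the family of powers when a right skew is severe.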

6 Transforming to deal with nonlinearity

Often we will want to transform a variable to linearize a relationship. An important rule of thumb applies when we plot the dependent variable on the vertical axis against the independent variable on the horizontal axis. If the plot exhibits a bulge up and to the left, we will probably need to either transform the independent variable downward or the dependent variable upward. If the bulge is down and to the right, we may need to transform the independent variable upward or the dependent variable downward.

Here's an example where, on the left, male life expectancy is plotted against per capita GDP, and on the right, it is plotted against logged per capita GDP.

[Figure: "Male e from U.N." plotted against "United Nations per capita GDP" (left) and against lpcgdp95 (right)]

Here's an example where the dependent variable is TFR.

[Figure: TFR plotted against "United Nations per capita GDP" (left) and against lpcgdp95 (right)]
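The effect of moving the independent variable down the ladder can be checked with a correlation coefficient, which measures how close a relationship is to a straight line. This Python sketch uses made-up numbers, not the United Nations data; the curve is constructed as y = 40 + 5 ln(x) purely for illustration, and the helper name correlation is hypothetical.

```python
import math

def correlation(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    sxx = sum((a - mx) ** 2 for a in xs)
    syy = sum((b - my) ** 2 for b in ys)
    return sxy / math.sqrt(sxx * syy)

# Made-up data: y rises quickly at low x and then flattens out
# (a bulge up and to the left), built here as y = 40 + 5 * ln(x)
gdp = [200, 500, 1000, 2000, 5000, 10000, 20000]
life = [40 + 5 * math.log(g) for g in gdp]
log_gdp = [math.log(g) for g in gdp]

r_raw = correlation(gdp, life)       # curved relationship
r_log = correlation(log_gdp, life)   # straight line after logging x
print(round(r_raw, 3), round(r_log, 3))
```

Because the made-up curve is exactly logarithmic, logging the independent variable gives a correlation of essentially 1; real data would improve less cleanly.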

7 Nonconstant spread

Sometimes the spread of a variable depends on the level of another variable. For example, according to the figure on the left below, the spread of the infant mortality rate varies with the level of fertility, even though the relationship between the two is linear. That can be a problem if you want to compare the distribution of the variable at different levels. Outliers may be harder to locate at one of the levels, for example. When spread increases with level, working down the ladder of powers can stabilize it. The figure on the right below plots the cube root of the infant mortality rate as a function of the TFR.

[Figure: IMR plotted against TFR (left) and cube root of IMR plotted against TFR (right)]

8 Proportions

An important transformation when one of the variables represents a proportion p is the logit transformation, ln(p/(1-p)). This is the log of the odds, p/(1-p). Proportions are unusual beasts because they are bounded by 0 and 1, and their distributions often behave differently as a result. Often you will see examples of nonconstant spread when you are looking at proportions as a dependent variable, and the logit transformation is another way of dealing with this. Here is the IMR/TFR relationship from the previous page, followed by one where the IMR is transformed into the logit.

[Figure: IMR plotted against TFR (left) and logitimr plotted against TFR (right)]
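The logit formula above is simple to compute directly. This is a minimal Python sketch with hypothetical function names; it assumes the variable is already expressed as a proportion strictly between 0 and 1, so a rate quoted per 1,000 or a percentage would need to be rescaled first.

```python
import math

def logit(p):
    """Log of the odds p / (1 - p); defined only for 0 < p < 1."""
    if not 0 < p < 1:
        raise ValueError("p must be strictly between 0 and 1")
    return math.log(p / (1 - p))

def inverse_logit(z):
    """Map a logit back to a proportion."""
    return 1 / (1 + math.exp(-z))

# The logit stretches out proportions near 0 and 1 and is 0 at p = 0.5
for p in (0.01, 0.1, 0.5, 0.9, 0.99):
    print(p, round(logit(p), 3))
```

The transformation is symmetric about p = 0.5 and unbounded in both directions, which is what removes the floor and ceiling that proportions otherwise have.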

9

Here's an example where we're looking at a distribution of countries by the percentage of dwellings with running water. It's right skewed, but squaring the percentage doesn't do much that's useful; indeed, it produces a bimodal distribution that still looks a bit skewed.

[Figure: histograms (vertical axis: Fraction) of "Water (World Bank)" and mwater95]

Here's what happens when we look at a logit transformation:

[Figure: histogram (vertical axis: Fraction) of lwater95]

Notice the distribution is much better behaved.


Statistical Evidence and Inference

Statistical Evidence and Inference Statistical Evidence and Inference Basic Methods of Analysis Understanding the methods used by economists requires some basic terminology regarding the distribution of random variables. The mean of a distribution

More information

SOLUTIONS TO THE LAB 1 ASSIGNMENT

SOLUTIONS TO THE LAB 1 ASSIGNMENT SOLUTIONS TO THE LAB 1 ASSIGNMENT Question 1 Excel produces the following histogram of pull strengths for the 100 resistors: 2 20 Histogram of Pull Strengths (lb) Frequency 1 10 0 9 61 63 6 67 69 71 73

More information

Random variables The binomial distribution The normal distribution Other distributions. Distributions. Patrick Breheny.

Random variables The binomial distribution The normal distribution Other distributions. Distributions. Patrick Breheny. Distributions February 11 Random variables Anything that can be measured or categorized is called a variable If the value that a variable takes on is subject to variability, then it the variable is a random

More information

Basic Data Analysis. Stephen Turnbull Business Administration and Public Policy Lecture 3: April 25, Abstract

Basic Data Analysis. Stephen Turnbull Business Administration and Public Policy Lecture 3: April 25, Abstract Basic Data Analysis Stephen Turnbull Business Administration and Public Policy Lecture 3: April 25, 2013 Abstract Review summary statistics and measures of location. Discuss the placement exam as an exercise

More information

Sampling Distributions and the Central Limit Theorem

Sampling Distributions and the Central Limit Theorem Sampling Distributions and the Central Limit Theorem February 18 Data distributions and sampling distributions So far, we have discussed the distribution of data (i.e. of random variables in our sample,

More information

Section 6-1 : Numerical Summaries

Section 6-1 : Numerical Summaries MAT 2377 (Winter 2012) Section 6-1 : Numerical Summaries With a random experiment comes data. In these notes, we learn techniques to describe the data. Data : We will denote the n observations of the random

More information

NCSS Statistical Software. Reference Intervals

NCSS Statistical Software. Reference Intervals Chapter 586 Introduction A reference interval contains the middle 95% of measurements of a substance from a healthy population. It is a type of prediction interval. This procedure calculates one-, and

More information

STATISTICAL DISTRIBUTIONS AND THE CALCULATOR

STATISTICAL DISTRIBUTIONS AND THE CALCULATOR STATISTICAL DISTRIBUTIONS AND THE CALCULATOR 1. Basic data sets a. Measures of Center - Mean ( ): average of all values. Characteristic: non-resistant is affected by skew and outliers. - Median: Either

More information

3.1 Measures of Central Tendency

3.1 Measures of Central Tendency 3.1 Measures of Central Tendency n Summation Notation x i or x Sum observation on the variable that appears to the right of the summation symbol. Example 1 Suppose the variable x i is used to represent

More information

Frequency Distribution Models 1- Probability Density Function (PDF)

Frequency Distribution Models 1- Probability Density Function (PDF) Models 1- Probability Density Function (PDF) What is a PDF model? A mathematical equation that describes the frequency curve or probability distribution of a data set. Why modeling? It represents and summarizes

More information

GRAPHS IN ECONOMICS. Appendix. Key Concepts. Graphing Data

GRAPHS IN ECONOMICS. Appendix. Key Concepts. Graphing Data Appendix GRAPHS IN ECONOMICS Key Concepts Graphing Data Graphs represent quantity as a distance on a line. On a graph, the horizontal scale line is the x-axis, the vertical scale line is the y-axis, and

More information

Descriptive Statistics (Devore Chapter One)

Descriptive Statistics (Devore Chapter One) Descriptive Statistics (Devore Chapter One) 1016-345-01 Probability and Statistics for Engineers Winter 2010-2011 Contents 0 Perspective 1 1 Pictorial and Tabular Descriptions of Data 2 1.1 Stem-and-Leaf

More information

Multiple regression - a brief introduction

Multiple regression - a brief introduction Multiple regression - a brief introduction Multiple regression is an extension to regular (simple) regression. Instead of one X, we now have several. Suppose, for example, that you are trying to predict

More information

Unit2: Probabilityanddistributions. 3. Normal distribution

Unit2: Probabilityanddistributions. 3. Normal distribution Announcements Unit: Probabilityanddistributions 3 Normal distribution Sta 101 - Spring 015 Duke University, Department of Statistical Science February, 015 Peer evaluation 1 by Friday 11:59pm Office hours:

More information

Financial Econometrics

Financial Econometrics Financial Econometrics Value at Risk Gerald P. Dwyer Trinity College, Dublin January 2016 Outline 1 Value at Risk Introduction VaR RiskMetrics TM Summary Risk What do we mean by risk? Dictionary: possibility

More information

Frequency Distribution and Summary Statistics

Frequency Distribution and Summary Statistics Frequency Distribution and Summary Statistics Dongmei Li Department of Public Health Sciences Office of Public Health Studies University of Hawai i at Mānoa Outline 1. Stemplot 2. Frequency table 3. Summary

More information

Fundamentals of Statistics

Fundamentals of Statistics CHAPTER 4 Fundamentals of Statistics Expected Outcomes Know the difference between a variable and an attribute. Perform mathematical calculations to the correct number of significant figures. Construct

More information

2 DESCRIPTIVE STATISTICS

2 DESCRIPTIVE STATISTICS Chapter 2 Descriptive Statistics 47 2 DESCRIPTIVE STATISTICS Figure 2.1 When you have large amounts of data, you will need to organize it in a way that makes sense. These ballots from an election are rolled

More information

Introduction to Population Modeling

Introduction to Population Modeling Introduction to Population Modeling In addition to estimating the size of a population, it is often beneficial to estimate how the population size changes over time. Ecologists often uses models to create

More information

Simple Descriptive Statistics

Simple Descriptive Statistics Simple Descriptive Statistics These are ways to summarize a data set quickly and accurately The most common way of describing a variable distribution is in terms of two of its properties: Central tendency

More information

Data Analysis and Statistical Methods Statistics 651

Data Analysis and Statistical Methods Statistics 651 Data Analysis and Statistical Methods Statistics 651 http://www.stat.tamu.edu/~suhasini/teaching.html Lecture 10 (MWF) Checking for normality of the data using the QQplot Suhasini Subba Rao Review of previous

More information

STAT Chapter 5: Continuous Distributions. Probability distributions are used a bit differently for continuous r.v. s than for discrete r.v. s.

STAT Chapter 5: Continuous Distributions. Probability distributions are used a bit differently for continuous r.v. s than for discrete r.v. s. STAT 515 -- Chapter 5: Continuous Distributions Probability distributions are used a bit differently for continuous r.v. s than for discrete r.v. s. Continuous distributions typically are represented by

More information

The following content is provided under a Creative Commons license. Your support

The following content is provided under a Creative Commons license. Your support MITOCW Recitation 6 The following content is provided under a Creative Commons license. Your support will help MIT OpenCourseWare continue to offer high quality educational resources for free. To make

More information

Business Statistics 41000: Probability 4

Business Statistics 41000: Probability 4 Business Statistics 41000: Probability 4 Drew D. Creal University of Chicago, Booth School of Business February 14 and 15, 2014 1 Class information Drew D. Creal Email: dcreal@chicagobooth.edu Office:

More information

Sampling Distributions

Sampling Distributions AP Statistics Ch. 7 Notes Sampling Distributions A major field of statistics is statistical inference, which is using information from a sample to draw conclusions about a wider population. Parameter:

More information

Business Statistics 41000: Probability 3

Business Statistics 41000: Probability 3 Business Statistics 41000: Probability 3 Drew D. Creal University of Chicago, Booth School of Business February 7 and 8, 2014 1 Class information Drew D. Creal Email: dcreal@chicagobooth.edu Office: 404

More information