STA218 Analysis of Variance

1 STA218 Analysis of Variance Al Nosedal. University of Toronto. Fall 2017 November 27, 2017

2 The Data Matrix The following table shows last year's sales data for a small business. The sample is arranged as a matrix in which each of the four rows corresponds to one of the company's four salespersons, and each of the three columns corresponds to one of the three countries in which the company does business. A cell in the matrix therefore corresponds to one of 12 salesperson/country combinations. The numbers in a cell are the sales (in units of $1000) made by that salesperson in that country last year. These data are used throughout the chapter to develop the theory underlying Analysis of Variance or, for short, ANOVA.

3 The Data Matrix

               Country A   Country B   Country C    Average
Salesperson 1  6, 7, 8     10, 10      12, 13, 14   10
Salesperson 2  10, 15      10, 10      11, 16       12
Salesperson 3  10, 15      7, 13       11, 16       12
Salesperson 4  2, 3, 4     10, 10      8, 9, 10     7
Average        8           10          12           10

4 Altogether, there were 28 sales last year that totaled $280, so the average sale was $10. The row (salesperson) averages are: Row 1 (Salesperson 1) $10. Row 2 (Salesperson 2) $12. Row 3 (Salesperson 3) $12. Row 4 (Salesperson 4) $7.

5 The column (country) averages are: Column 1 (Country A) $8. Column 2 (Country B) $10. Column 3 (Country C) $12.
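The row and column averages above can be reproduced in R. This is only an illustrative sketch, not part of the original slides: the names `person` and `country` are our own labels, and the country labels are typed in by hand following the layout of the data matrix.

```r
# Sales in units of $1000, entered row by row (Salesperson 1 to 4),
# and within each row in the order Country A, B, C.
sales <- c(6, 7, 8, 10, 10, 12, 13, 14,
           10, 15, 10, 10, 11, 16,
           10, 15, 7, 13, 11, 16,
           2, 3, 4, 10, 10, 8, 9, 10)

# Row labels: Salespersons 1 to 4 made 8, 6, 6 and 8 sales.
person <- factor(rep(1:4, times = c(8, 6, 6, 8)))

# Column labels: number of sales per country within each row.
country <- factor(c(rep(c("A", "B", "C"), times = c(3, 2, 3)),
                    rep(c("A", "B", "C"), times = c(2, 2, 2)),
                    rep(c("A", "B", "C"), times = c(2, 2, 2)),
                    rep(c("A", "B", "C"), times = c(3, 2, 3))))

grand_mean <- mean(sales)                   # overall average sale
row_means  <- tapply(sales, person, mean)   # one average per salesperson
col_means  <- tapply(sales, country, mean)  # one average per country

print(grand_mean)
print(row_means)
print(col_means)
```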

6 Now we begin our study of how to make a statistically valid prediction of the next sales figure. Four situations can occur:
1. Neither the country nor the salesperson of the next sale (observation) is known.
2. The country of the next sale is known, but the salesperson is not known.
3. The salesperson of the next sale is known, but the country is not known.
4. Both the country and the salesperson of the next sale are known.

7 Situation 1. Without any additional information, the best prediction is the sample mean, $10. This prediction is best in the least squares sense: if $10 had been used to predict each of the 28 observations in the sample, then the total of the squared errors, SS_TOTAL, would be as small as possible. In our data set, SS_TOTAL equals 354. That figure can be verified by calculating \((x_i - 10)^2\) for each observation \(x_i\) of the sample and adding up the results.
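The least squares claim can be checked numerically. The sketch below is an illustration under our own naming (the helper `sse` and the grid `candidates` are not from the slides): it computes the total squared error for several constant predictions and confirms that the sample mean gives the smallest total.

```r
# Sales in units of $1000 (same 28 observations as in the data matrix).
sales <- c(6, 7, 8, 10, 10, 12, 13, 14,
           10, 15, 10, 10, 11, 16,
           10, 15, 7, 13, 11, 16,
           2, 3, 4, 10, 10, 8, 9, 10)

# Total squared error when every observation is predicted by `pred`.
sse <- function(pred) sum((sales - pred)^2)

ss_total <- sse(mean(sales))   # squared error of the sample-mean prediction

# Compare with other constant predictions around the mean.
candidates <- seq(8, 12, by = 0.5)
comparison <- data.frame(prediction    = candidates,
                         squared_error = sapply(candidates, sse))
print(ss_total)
print(comparison)
```

Moving the prediction away from the mean by d raises the total squared error by 28·d², which is why no other constant can do better.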

8 Situation 2. One-factor ANOVA Model. If only the country of the next sale is known, then two different predictions are possible for the next sales figure:
- The sample mean, $10.
- The mean of the sales of the country in which the next sale will occur. (In this case, $8 if the next sale will occur in Country A, $10 in Country B, or $12 in Country C.) This prediction ignores the information present in the sales figures from the other two countries.

9 Situation 3. One-factor ANOVA Model. If only the salesperson of the next sale is known, then two different predictions are possible for the next sales figure:
- The sample mean, $10.
- The mean of the previous sales of the salesperson who will make the next sale. (In this case, $10 if the next sale will be made by Salesperson 1, etc.) This prediction ignores the information present in the sales figures from the other three salespersons.

10 Situations 2 and 3 are called one-factor ANOVA models because only one factor is known about the next sale. As noted, a prediction that is either a row mean or a column mean ignores the observations in the other rows or columns. To make that kind of prediction, it's necessary to verify statistically that the ignored observations do come from different populations and are therefore not relevant to the prediction.

11 Situation 4. Two-factor ANOVA Model. We are not covering this kind of model in our course.

12 The Null Hypothesis for One-Factor ANOVA We have discussed the prediction possibilities for one-factor ANOVA models. Now, we will learn how to test the statistical significance of a one-factor ANOVA model. Let's suppose that we want to predict the next sales figure, and that we know the country in which this sale will occur but the identity of the salesperson is NOT known. Without any statistical testing, we can always by default use the sample mean, $10, to predict the next sale. The default prediction, the sample mean, doesn't use any information about the country (column) in which the sale will occur.

13 The Null Hypothesis for One-Factor ANOVA However, if instead we use the mean of the observations in only one column (the column that corresponds to the particular country in which we know the next sale will occur), then we have to test the null hypothesis
\[ H_0: \mu_{\mathrm{COL1}} = \mu_{\mathrm{COL2}} = \mu_{\mathrm{COL3}} \]
and reject it in favor of the alternative hypothesis
\[ H_a: \mu_{\mathrm{COL1}},\ \mu_{\mathrm{COL2}},\ \mu_{\mathrm{COL3}} \text{ are NOT all equal.} \]

14 The Null Hypothesis for One-Factor ANOVA If the null hypothesis is rejected, then we can be statistically confident that the column means are not all equal, and therefore that the individual column means (i.e., $8, $10, $12) can be used to predict the amount of the next sale. If the next sale was going to occur in Country A, then the prediction would be $8. If the next sale was going to occur in Country B, then the prediction would be $10. If the next sale was going to occur in Country C, then the prediction would be $12.

15 The One-Factor ANOVA F Test To test the null hypothesis stated above, we have to calculate an F-statistic. If \(F_{STAT} > F_{(c-1,\,n-c),\,\alpha}\), where c is the number of columns (groups) and n is the total number of observations, then reject \(H_0\) and use the sample column means to predict future observations. Otherwise, do not reject \(H_0\) and use the overall sample mean to predict future observations.

16 ANOVA Table To see how this F_STAT is calculated, see the ANOVA Table below.

Source of     Degrees of     Sum of Squares   Mean Sum of Squares               F Ratio
Variation     Freedom (df)   (SS)             (MSS)
Explained     c - 1          SS_EXP           MSS_EXP = SS_EXP / (c - 1)        F = MSS_EXP / MSS_UNEXP
Unexplained   n - c          SS_UNEXP         MSS_UNEXP = SS_UNEXP / (n - c)
Total         n - 1          SS_TOTAL

17 Calculation of SS_TOTAL If no model is used, then the prediction for each of the 28 observations (in dollar amounts) will be 10. If these predictions are used, the squared errors of the 28 predictions are given in the table below.

               Country A    Country B   Country C
Salesperson 1  16, 9, 4     0, 0        4, 9, 16
Salesperson 2  0, 25        0, 0        1, 36
Salesperson 3  0, 25        9, 9        1, 36
Salesperson 4  64, 49, 36   0, 0        4, 1, 0

Prediction Errors Squared when NO Factor is used (Total) = 354.

18 Calculation of SS_UNEXPLAINED If the column model is used, then the 28 observations would have the following 28 predictions, where $8 is the average for the first column, $10 is the average for the second column, and $12 is the average for the third column.

               Country A   Country B   Country C
Salesperson 1  8, 8, 8     10, 10      12, 12, 12
Salesperson 2  8, 8        10, 10      12, 12
Salesperson 3  8, 8        10, 10      12, 12
Salesperson 4  8, 8, 8     10, 10      12, 12, 12

19 Calculation of SS_UNEXPLAINED Using the above 28 predictions, the squared errors are shown in the table below.

               Country A    Country B   Country C
Salesperson 1  4, 1, 0      0, 0        0, 1, 4
Salesperson 2  4, 49        0, 0        1, 16
Salesperson 3  4, 49        9, 9        1, 16
Salesperson 4  36, 25, 16   0, 0        16, 9, 4

Errors Squared when the Column Factor is used (Total) = 274.

20 Calculation of SS_EXPLAINED The units explained by the column model are calculated by finding the square of each prediction change when moving from NO model to the column model. The following table presents the square of each prediction change:

               Country A   Country B   Country C
Salesperson 1  4, 4, 4     0, 0        4, 4, 4
Salesperson 2  4, 4        0, 0        4, 4
Salesperson 3  4, 4        0, 0        4, 4
Salesperson 4  4, 4, 4     0, 0        4, 4, 4

Table of the Square of the Prediction Change when Moving from NO Model to the Column Model (Total) = 80.
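The three sums of squares for the column model can be recomputed in a few lines of R. This is a sketch with our own variable names (`fitted_col`, `ss_unexp`, `ss_exp`); it also shows that the total splits exactly into an unexplained and an explained part.

```r
# Sales in units of $1000, row by row, with hand-entered country labels.
sales <- c(6, 7, 8, 10, 10, 12, 13, 14,
           10, 15, 10, 10, 11, 16,
           10, 15, 7, 13, 11, 16,
           2, 3, 4, 10, 10, 8, 9, 10)
country <- factor(c(rep(c("A", "B", "C"), times = c(3, 2, 3)),
                    rep(c("A", "B", "C"), times = c(2, 2, 2)),
                    rep(c("A", "B", "C"), times = c(2, 2, 2)),
                    rep(c("A", "B", "C"), times = c(3, 2, 3))))

# Column-model prediction: each observation is replaced by its country mean.
fitted_col <- ave(sales, country)

ss_total <- sum((sales - mean(sales))^2)       # no model
ss_unexp <- sum((sales - fitted_col)^2)        # errors left by the column model
ss_exp   <- sum((fitted_col - mean(sales))^2)  # squared prediction changes

print(c(total = ss_total, unexplained = ss_unexp, explained = ss_exp))
```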

21 ANOVA Table The ANOVA Table for the column factor can now be filled in as shown below:

Source of     Degrees of     Sum of Squares   Mean Sum of Squares   F Ratio
Variation     Freedom (df)   (SS)             (MSS)
Explained     2              80               80/2 = 40             40/10.96 = 3.65
Unexplained   25             274              274/25 = 10.96
Total         27             354

So for this one-factor ANOVA model, F_STAT = 3.65.

22 Conclusion If the null hypothesis is true, then the F-statistic should be a value from the \(F_{2,25}\) distribution. Referring to the table that contains the upper 0.05 cut-off points of F distributions, we see that \(F_{(2,25),\,0.05} = 3.39\). Since 3.65 is greater than 3.39, the F-statistic is in the upper 0.05 of the \(F_{2,25}\) distribution. Therefore we can reject the null hypothesis at the 0.05 significance level, and we conclude that the country means are not all the same. Thus, the prediction for the next sale in a known country is the mean of all the previous sales in that country.
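Instead of a printed F table, the cut-off point and a p-value can be obtained from R's `qf()` and `pf()`. The sketch below is illustrative; the variable names are ours, and the sums of squares are simply the values 80 and 274 derived on the earlier slides.

```r
n        <- 28    # total number of sales
n_groups <- 3     # number of countries (columns)
ss_exp   <- 80    # explained sum of squares (column model)
ss_unexp <- 274   # unexplained sum of squares (column model)

mss_exp   <- ss_exp / (n_groups - 1)
mss_unexp <- ss_unexp / (n - n_groups)
f_stat    <- mss_exp / mss_unexp

# Upper 0.05 cut-off point of the F distribution with (2, 25) df.
f_crit <- qf(0.95, df1 = n_groups - 1, df2 = n - n_groups)

# Probability of an F value at least this large when H0 is true.
p_value <- pf(f_stat, df1 = n_groups - 1, df2 = n - n_groups,
              lower.tail = FALSE)

print(round(c(F = f_stat, cutoff = f_crit), 2))
print(f_stat > f_crit)   # TRUE means H0 is rejected at the 0.05 level
```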

23 This time for the Row Factor We have just performed the F test to verify that the country (column) one-factor ANOVA model is statistically significant. There is another one-factor ANOVA model that also could be examined: the salesperson (row) factor model. Let's test the null hypothesis
\[ H_0: \mu_{\mathrm{ROW1}} = \mu_{\mathrm{ROW2}} = \mu_{\mathrm{ROW3}} = \mu_{\mathrm{ROW4}} \]
against the alternative hypothesis
\[ H_a: \mu_{\mathrm{ROW1}},\ \mu_{\mathrm{ROW2}},\ \mu_{\mathrm{ROW3}},\ \mu_{\mathrm{ROW4}} \text{ are NOT all equal.} \]

24 ANOVA Table The resulting ANOVA table for the salesperson (row) factor is shown below:

Source of     Degrees of     Sum of Squares   Mean Sum of Squares   F Ratio
Variation     Freedom (df)   (SS)             (MSS)
Explained     3              120              120/3 = 40            40/9.75 = 4.10
Unexplained   24             234              234/24 = 9.75
Total         27             354

So for this one-factor ANOVA model, F_STAT = 4.10.

25 Conclusion Consulting the upper 0.05 cut-off table for the F distribution, we find that \(F_{(3,24),\,0.05} = 3.01\). Since the F-statistic = 4.10 > 3.01, the null hypothesis can once again be rejected at the 0.05 level, and we can use the salesperson factor to predict sales, concluding that it is statistically valid to predict $10, $12, $12, or $7, respectively, for Salespersons 1, 2, 3, or 4.
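As a cross-check of the row-factor table, R's `aov()` produces the same decomposition. This sketch is an addition to the slides, which use `oneway.test()`; the object names `fit` and `tab` are our own.

```r
# Sales in units of $1000, entered row by row (Salesperson 1 to 4).
sales <- c(6, 7, 8, 10, 10, 12, 13, 14,
           10, 15, 10, 10, 11, 16,
           10, 15, 7, 13, 11, 16,
           2, 3, 4, 10, 10, 8, 9, 10)
person <- factor(rep(1:4, times = c(8, 6, 6, 8)))

fit <- aov(sales ~ person)
tab <- summary(fit)[[1]]   # data frame: Df, Sum Sq, Mean Sq, F value, Pr(>F)
print(tab)

f_stat <- tab[["F value"]][1]
f_crit <- qf(0.95, df1 = 3, df2 = 24)   # upper 0.05 cut-off point
print(f_stat > f_crit)                  # TRUE means H0 is rejected
```

The first row of `tab` corresponds to the "Explained" line of the slide's table and the `Residuals` row to the "Unexplained" line.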

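26 # R Code
# Sales of each salesperson (units of $1000)
sales1 <- c(6, 7, 8, 10, 10, 12, 13, 14)
sales2 <- c(10, 15, 10, 10, 11, 16)
sales3 <- c(10, 15, 7, 13, 11, 16)
sales4 <- c(2, 3, 4, 10, 10, 8, 9, 10)
sales <- c(sales1, sales2, sales3, sales4)
# Group label (salesperson) for every observation
person <- c(rep(1, 8), rep(2, 6), rep(3, 6), rep(4, 8))
# var.equal = TRUE requests the classical one-way ANOVA F test
oneway.test(sales ~ person, var.equal = TRUE)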

27
##
##  One-way analysis of means
##
## data:  sales and person
## F = , num df = 3, denom df = 24, p-value =

28 Underlying Assumptions Officially, to use the predictions from an ANOVA model, three assumptions about the populations from which the sample was taken must be satisfied:
1. Each population has a Normal distribution.
2. Each population has the same standard deviation σ.
3. The observations are mutually independent.
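An informal look at assumption 2 is to compare the sample standard deviations of the groups. The sketch below does this for the salesperson groups; it is an addition to the slides, and the rule of thumb it uses (largest sample standard deviation less than about twice the smallest) is a commonly quoted informal guideline that we assume here, not something the slides state.

```r
# Sales in units of $1000, entered row by row (Salesperson 1 to 4).
sales <- c(6, 7, 8, 10, 10, 12, 13, 14,
           10, 15, 10, 10, 11, 16,
           10, 15, 7, 13, 11, 16,
           2, 3, 4, 10, 10, 8, 9, 10)
person <- factor(rep(1:4, times = c(8, 6, 6, 8)))

n_j  <- tapply(sales, person, length)  # group sizes
s2_j <- tapply(sales, person, var)     # sample variances per salesperson
sd_j <- sqrt(s2_j)

# Informal check: ratio of the largest to the smallest group sd.
sd_ratio <- max(sd_j) / min(sd_j)
print(round(sd_j, 2))
print(sd_ratio < 2)

# The same variances rebuild the unexplained sum of squares for the row model.
ss_unexp <- sum((n_j - 1) * s2_j)
print(ss_unexp)
```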

29 Formulas
Sum of Squares for Treatments (a.k.a. between-treatments variation or Explained):
\[ SST = \sum_{j=1}^{k} n_j \left(\bar{x}_j - \bar{\bar{x}}\right)^2 \]
Sum of Squares for Error (a.k.a. within-treatments variation or Unexplained):
\[ SSE = \sum_{j=1}^{k} \sum_{i=1}^{n_j} \left(x_{ij} - \bar{x}_j\right)^2 = (n_1 - 1)s_1^2 + \cdots + (n_k - 1)s_k^2 \]

30 Formulas
Mean Square for Treatments:
\[ MST = \frac{SST}{k - 1} \]
Mean Square for Error:
\[ MSE = \frac{SSE}{n - k} \]

31 Formulas
Test Statistic:
\[ F = \frac{MST}{MSE} \]
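These formulas are the Explained and Unexplained sums of squares from the sales example under another name, with k playing the role of c. As a short worked step (our own addition, writing \(\bar{\bar{x}}\) for the grand mean), the total sum of squares splits into the two pieces because the cross term vanishes within every group:

```latex
\begin{align*}
\sum_{j=1}^{k}\sum_{i=1}^{n_j} \left(x_{ij}-\bar{\bar{x}}\right)^2
  &= \sum_{j=1}^{k}\sum_{i=1}^{n_j}
     \left[\left(x_{ij}-\bar{x}_j\right) + \left(\bar{x}_j-\bar{\bar{x}}\right)\right]^2 \\
  &= \underbrace{\sum_{j=1}^{k}\sum_{i=1}^{n_j} \left(x_{ij}-\bar{x}_j\right)^2}_{SSE}
   + \underbrace{\sum_{j=1}^{k} n_j \left(\bar{x}_j-\bar{\bar{x}}\right)^2}_{SST}
   + 2\sum_{j=1}^{k} \left(\bar{x}_j-\bar{\bar{x}}\right)
     \underbrace{\sum_{i=1}^{n_j} \left(x_{ij}-\bar{x}_j\right)}_{=\,0}
\end{align*}
```

For the sales data this reads 354 = 274 + 80 for the country factor and 354 = 234 + 120 for the salesperson factor.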

32 Exercise 14.1 A statistics practitioner calculated the following statistics:

Statistic   Treatment 1   Treatment 2   Treatment 3
n           5             5             5
x̄           10            15            20
s²          50            50            50

Complete the ANOVA table.

33 Solution
\[ \bar{\bar{x}} = \frac{5(10) + 5(15) + 5(20)}{15} = 15 \]
\[ SST = 5(10 - 15)^2 + 5(15 - 15)^2 + 5(20 - 15)^2 = 250 \]
\[ SSE = (5 - 1)(50) + (5 - 1)(50) + (5 - 1)(50) = 600 \]

34 ANOVA Table

Source of     Degrees of     Sum of Squares   Mean Sum of Squares   F Ratio
Variation     Freedom (df)   (SS)             (MSS)
Treatments    2              250              250/2 = 125           125/50 = 2.50
Error         12             600              600/12 = 50
Total         14             850

35 Exercise 14.2 A statistics practitioner calculated the following statistics:

Statistic   Treatment 1   Treatment 2   Treatment 3
n           4             4             4
x̄           20            22            25
s²          10            10            10

Complete the ANOVA table.

36 Solution
\[ \bar{\bar{x}} = \frac{4(20) + 4(22) + 4(25)}{12} = 22.33 \]
\[ SST = 4(20 - 22.33)^2 + 4(22 - 22.33)^2 + 4(25 - 22.33)^2 = 50.67 \]
\[ SSE = (4 - 1)(10) + (4 - 1)(10) + (4 - 1)(10) = 90 \]

37 ANOVA Table

Source of     Degrees of     Sum of Squares   Mean Sum of Squares   F Ratio
Variation     Freedom (df)   (SS)             (MSS)
Treatments    2              50.67            50.67/2 = 25.33       25.33/10 = 2.53
Error         9              90               90/9 = 10
Total         11             140.67
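Both exercises start from summary statistics rather than raw data, so the formulas for SST, SSE, MST, MSE and F can be wrapped in a small helper. The function `anova_from_summary` below is a hypothetical name of our own, not an R built-in; the inputs are the group sizes, group means and group variances used in the two solutions (5, 5, 5 with means 10, 15, 20 and variances 50; 4, 4, 4 with means 20, 22, 25 and variances 10).

```r
# One-way ANOVA quantities from group sizes, means and variances.
anova_from_summary <- function(n, xbar, s2) {
  k     <- length(n)                  # number of treatments
  N     <- sum(n)                     # total sample size
  grand <- sum(n * xbar) / N          # grand mean
  sst   <- sum(n * (xbar - grand)^2)  # between-treatments variation
  sse   <- sum((n - 1) * s2)          # within-treatments variation
  mst   <- sst / (k - 1)
  mse   <- sse / (N - k)
  list(df = c(k - 1, N - k), SST = sst, SSE = sse,
       MST = mst, MSE = mse, F = mst / mse)
}

ex1 <- anova_from_summary(n = c(5, 5, 5), xbar = c(10, 15, 20), s2 = c(50, 50, 50))
ex2 <- anova_from_summary(n = c(4, 4, 4), xbar = c(20, 22, 25), s2 = c(10, 10, 10))

print(unlist(ex1))
print(round(unlist(ex2), 2))
```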

38 Exercise 14.5 A consumer organization was concerned about the differences between the advertised sizes of containers and the actual amount of product. In a preliminary study, six packages of each of three different brands of margarine that are supposed to contain 500 ml were measured. The differences from 500 ml are listed here. Do these data provide sufficient evidence to conclude that differences exist between the three brands? Use α =

Brand 1   Brand 2   Brand 3
1         2         1
3         2         2
3         4         4
0         3         2
1         0         3
0         4         4

39 Solution Step 1. State Hypotheses. \(\mu_i\) = population mean of the differences from 500 ml for brand i (i = 1, 2, 3). \(H_0: \mu_1 = \mu_2 = \mu_3\). \(H_a\): At least two means differ.

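40 Solution Step 2. Compute test statistic.

           Brand 1   Brand 2   Brand 3
Mean       1.33      2.50      2.67
Variance   1.87      2.30      1.47

Grand mean = \(\bar{\bar{x}}\) = 2.17
Using the means rounded to two decimals:
\[ SST = 6(1.33 - 2.17)^2 + 6(2.50 - 2.17)^2 + 6(2.67 - 2.17)^2 = 6.39 \]
\[ SSE = (6 - 1)(1.87) + (6 - 1)(2.30) + (6 - 1)(1.47) = 28.20 \]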


42 ANOVA Table

Source of     Degrees of     Sum of Squares   Mean Sum of Squares   F Ratio
Variation     Freedom (df)   (SS)             (MSS)
Treatments    2              6.39             6.39/2 = 3.19         3.19/1.88 = 1.70
Error         15             28.20            28.20/15 = 1.88
Total         17             34.59

43 Solution Step 3. Find Rejection Region. We reject the null hypothesis only if \(F > F_{\alpha,\,k-1,\,n-k}\). If we let α = 0.05, the rejection region for this exercise is \(F > F_{0.05,\,2,\,15} = 3.682\).

44 Solution Step 4. Conclusion. We found the value of the test statistic to be F = 1.70. Since F = 1.70 < \(F_{0.05,\,2,\,15}\) = 3.682, we can't reject \(H_0\). Thus, there is not enough evidence to infer that the average differences differ between the three brands.
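The four steps can also be carried out directly on the raw differences. The sketch below uses our own variable names and unrounded means, so its F value comes out near 1.69 rather than the 1.70 quoted above, which reflects rounding in the hand calculation; the conclusion is the same.

```r
# Differences from 500 ml for six packages of each brand.
brand1 <- c(1, 3, 3, 0, 1, 0)
brand2 <- c(2, 2, 4, 3, 0, 4)
brand3 <- c(1, 2, 4, 2, 3, 4)
differences <- c(brand1, brand2, brand3)
brand <- factor(rep(1:3, each = 6))

k <- nlevels(brand)        # number of brands
n <- length(differences)   # total number of packages

means <- tapply(differences, brand, mean)
vars  <- tapply(differences, brand, var)

sst <- sum(6 * (means - mean(differences))^2)  # between-brands variation
sse <- sum((6 - 1) * vars)                     # within-brands variation

f_stat  <- (sst / (k - 1)) / (sse / (n - k))
f_crit  <- qf(0.95, df1 = k - 1, df2 = n - k)  # rejection region: F > f_crit
p_value <- pf(f_stat, df1 = k - 1, df2 = n - k, lower.tail = FALSE)

print(round(c(F = f_stat, cutoff = f_crit), 3))
print(f_stat > f_crit)   # FALSE means H0 is not rejected
```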

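45 # R Code
# Differences from 500 ml for each brand
brand1 <- c(1, 3, 3, 0, 1, 0)
brand2 <- c(2, 2, 4, 3, 0, 4)
brand3 <- c(1, 2, 4, 2, 3, 4)
differences <- c(brand1, brand2, brand3)
# Group label (brand) for every observation
brand <- c(rep(1, 6), rep(2, 6), rep(3, 6))
# var.equal = TRUE requests the classical one-way ANOVA F test
oneway.test(differences ~ brand, var.equal = TRUE)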

46
##
##  One-way analysis of means
##
## data:  differences and brand
## F = , num df = 2, denom df = 15, p-value =

47 Exercise The friendly folks at the Internal Revenue Service (IRS) in the United States and the Canada Revenue Agency (CRA) are always looking for ways to improve the wording and format of their tax return forms. Three new forms have been developed recently. To determine which, if any, are superior to the current form, 120 individuals were asked to participate in an experiment. Each of the three new forms and the currently used form were filled out by 30 different people. The amount of time (in minutes) taken by each person to complete the task was recorded. What conclusions can be drawn from these data?

48 R Code
# Step 1. Entering data
# importing data
# url of tax return forms (the address did not survive in this transcription)
forms_url <- ""
forms_data <- read.table(forms_url, header = TRUE)
names(forms_data)
forms_data[1:4, ]

49 R Code
## [1] "Form1" "Form2" "Form3" "Form4"
##   Form1 Form2 Form3 Form4
##
##
##
##

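50 R Code
# Step 2. ANOVA
# Column names are case-sensitive and must match names(forms_data)
time1 <- forms_data$Form1
time2 <- forms_data$Form2
time3 <- forms_data$Form3
time4 <- forms_data$Form4
length(forms_data$Form1)
times <- c(time1, time2, time3, time4)
# Group label (form) for every observation, 30 people per form
forms <- c(rep(1, 30), rep(2, 30), rep(3, 30), rep(4, 30))
oneway.test(times ~ forms, var.equal = TRUE)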

51 R Code
## [1] 30
##
##  One-way analysis of means
##
## data:  times and forms
## F = , num df = 3, denom df = 116, p-value =

STA258 Analysis of Variance

STA258 Analysis of Variance STA258 Analysis of Variance Al Nosedal. University of Toronto. Winter 2017 The Data Matrix The following table shows last year s sales data for a small business. The sample is put into a matrix format

More information

Lecture note 8 Spring Lecture note 8. Analysis of Variance (ANOVA)

Lecture note 8 Spring Lecture note 8. Analysis of Variance (ANOVA) Lecture note 8 Analysis of Variance (ANOVA) 1 Overview of ANOVA Analysis of variance (ANOVA) is a comparison of means. ANOVA allows you to compare more than two means simultaneously. Proper experimental

More information

1.017/1.010 Class 19 Analysis of Variance

1.017/1.010 Class 19 Analysis of Variance .07/.00 Class 9 Analysis of Variance Concepts and Definitions Objective: dentify factors responsible for variability in observed data Specify one or more factors that could account for variability (e.g.

More information

Study of one-way ANOVA with a fixed-effect factor

Study of one-way ANOVA with a fixed-effect factor Study of one-way ANOVA with a fixed-effect factor In the last blog on Introduction to ANOVA, we mentioned that in the oneway ANOVA study, the factor contributing to a possible source of variation that

More information

Chapter 8 Student Lecture Notes 8-1. Department of Quantitative Methods & Information Systems. Business Statistics

Chapter 8 Student Lecture Notes 8-1. Department of Quantitative Methods & Information Systems. Business Statistics Chapter 8 Student Lecture Notes 8-1 Department of Quantitative Methods & Information Systems Business Statistics Chapter 11 One Way analysis of Variance QMIS 0 Dr. Mohammad Zainal Chapter Goals After completing

More information

Lecture 8: Single Sample t test

Lecture 8: Single Sample t test Lecture 8: Single Sample t test Review: single sample z-test Compares the sample (after treatment) to the population (before treatment) You HAVE to know the populational mean & standard deviation to use

More information

STA258H5. Al Nosedal and Alison Weir. Winter Al Nosedal and Alison Weir STA258H5 Winter / 42

STA258H5. Al Nosedal and Alison Weir. Winter Al Nosedal and Alison Weir STA258H5 Winter / 42 STA258H5 Al Nosedal and Alison Weir Winter 2017 Al Nosedal and Alison Weir STA258H5 Winter 2017 1 / 42 CONFIDENCE INTERVALS FOR σ 2 Al Nosedal and Alison Weir STA258H5 Winter 2017 2 / 42 Background We

More information

Topic 30: Random Effects Modeling

Topic 30: Random Effects Modeling Topic 30: Random Effects Modeling Outline One-way random effects model Data Model Inference Data for one-way random effects model Y, the response variable Factor with levels i = 1 to r Y ij is the j th

More information

Statistics & Statistical Tests: Assumptions & Conclusions

Statistics & Statistical Tests: Assumptions & Conclusions Degrees of Freedom Statistics & Statistical Tests: Assumptions & Conclusions Kinds of degrees of freedom Kinds of Distributions Kinds of Statistics & assumptions required to perform each Normal Distributions

More information

Non-Inferiority Tests for Two Means in a 2x2 Cross-Over Design using Differences

Non-Inferiority Tests for Two Means in a 2x2 Cross-Over Design using Differences Chapter 510 Non-Inferiority Tests for Two Means in a 2x2 Cross-Over Design using Differences Introduction This procedure computes power and sample size for non-inferiority tests in 2x2 cross-over designs

More information

Hypothesis Tests: One Sample Mean Cal State Northridge Ψ320 Andrew Ainsworth PhD

Hypothesis Tests: One Sample Mean Cal State Northridge Ψ320 Andrew Ainsworth PhD Hypothesis Tests: One Sample Mean Cal State Northridge Ψ320 Andrew Ainsworth PhD MAJOR POINTS Sampling distribution of the mean revisited Testing hypotheses: sigma known An example Testing hypotheses:

More information

Econ 3790: Business and Economics Statistics. Instructor: Yogesh Uppal

Econ 3790: Business and Economics Statistics. Instructor: Yogesh Uppal Econ 3790: Business and Economics Statistics Instructor: Yogesh Uppal Email: yuppal@ysu.edu Chapter 12 Goodness of Fit Test: A Multinomial Population Test of Independence Hypothesis (Goodness of Fit) Test

More information

STA2601. Tutorial letter 105/2/2018. Applied Statistics II. Semester 2. Department of Statistics STA2601/105/2/2018 TRIAL EXAMINATION PAPER

STA2601. Tutorial letter 105/2/2018. Applied Statistics II. Semester 2. Department of Statistics STA2601/105/2/2018 TRIAL EXAMINATION PAPER STA2601/105/2/2018 Tutorial letter 105/2/2018 Applied Statistics II STA2601 Semester 2 Department of Statistics TRIAL EXAMINATION PAPER Define tomorrow. university of south africa Dear Student Congratulations

More information

Lecture 39 Section 11.5

Lecture 39 Section 11.5 on Lecture 39 Section 11.5 Hampden-Sydney College Mon, Nov 10, 2008 Outline 1 on 2 3 on 4 on Exercise 11.27, page 715. A researcher was interested in comparing body weights for two strains of laboratory

More information

LESSON 7 INTERVAL ESTIMATION SAMIE L.S. LY

LESSON 7 INTERVAL ESTIMATION SAMIE L.S. LY LESSON 7 INTERVAL ESTIMATION SAMIE L.S. LY 1 THIS WEEK S PLAN Part I: Theory + Practice ( Interval Estimation ) Part II: Theory + Practice ( Interval Estimation ) z-based Confidence Intervals for a Population

More information

SLIDES. BY. John Loucks. St. Edward s University

SLIDES. BY. John Loucks. St. Edward s University . SLIDES. BY John Loucks St. Edward s University 1 Chapter 10, Part A Inference About Means and Proportions with Two Populations n Inferences About the Difference Between Two Population Means: σ 1 and

More information

Chapter 6 Confidence Intervals Section 6-1 Confidence Intervals for the Mean (Large Samples) Estimating Population Parameters

Chapter 6 Confidence Intervals Section 6-1 Confidence Intervals for the Mean (Large Samples) Estimating Population Parameters Chapter 6 Confidence Intervals Section 6-1 Confidence Intervals for the Mean (Large Samples) Estimating Population Parameters VOCABULARY: Point Estimate a value for a parameter. The most point estimate

More information

Two-Sample T-Test for Superiority by a Margin

Two-Sample T-Test for Superiority by a Margin Chapter 219 Two-Sample T-Test for Superiority by a Margin Introduction This procedure provides reports for making inference about the superiority of a treatment mean compared to a control mean from data

More information

CHAPTER 6 DATA ANALYSIS AND INTERPRETATION

CHAPTER 6 DATA ANALYSIS AND INTERPRETATION 208 CHAPTER 6 DATA ANALYSIS AND INTERPRETATION Sr. No. Content Page No. 6.1 Introduction 212 6.2 Reliability and Normality of Data 212 6.3 Descriptive Analysis 213 6.4 Cross Tabulation 218 6.5 Chi Square

More information

Lecture 35 Section Wed, Mar 26, 2008

Lecture 35 Section Wed, Mar 26, 2008 on Lecture 35 Section 10.2 Hampden-Sydney College Wed, Mar 26, 2008 Outline on 1 2 3 4 5 on 6 7 on We will familiarize ourselves with the t distribution. Then we will see how to use it to test a hypothesis

More information

STA215 Confidence Intervals for Proportions

STA215 Confidence Intervals for Proportions STA215 Confidence Intervals for Proportions Al Nosedal. University of Toronto. Summer 2017 June 14, 2017 Pepsi problem A market research consultant hired by the Pepsi-Cola Co. is interested in determining

More information

Probability & Statistics

Probability & Statistics Probability & Statistics BITS Pilani K K Birla Goa Campus Dr. Jajati Keshari Sahoo Department of Mathematics Statistics Descriptive statistics Inferential statistics /38 Inferential Statistics 1. Involves:

More information

Week 2 Quantitative Analysis of Financial Markets Hypothesis Testing and Confidence Intervals

Week 2 Quantitative Analysis of Financial Markets Hypothesis Testing and Confidence Intervals Week 2 Quantitative Analysis of Financial Markets Hypothesis Testing and Confidence Intervals Christopher Ting http://www.mysmu.edu/faculty/christophert/ Christopher Ting : christopherting@smu.edu.sg :

More information

Homework Assignment Section 3

Homework Assignment Section 3 Homework Assignment Section 3 Tengyuan Liang Business Statistics Booth School of Business Problem 1 A company sets different prices for a particular stereo system in eight different regions of the country.

More information

Chapter 6 Confidence Intervals

Chapter 6 Confidence Intervals Chapter 6 Confidence Intervals Section 6-1 Confidence Intervals for the Mean (Large Samples) VOCABULARY: Point Estimate A value for a parameter. The most point estimate of the population parameter is the

More information

Key Objectives. Module 2: The Logic of Statistical Inference. Z-scores. SGSB Workshop: Using Statistical Data to Make Decisions

Key Objectives. Module 2: The Logic of Statistical Inference. Z-scores. SGSB Workshop: Using Statistical Data to Make Decisions SGSB Workshop: Using Statistical Data to Make Decisions Module 2: The Logic of Statistical Inference Dr. Tom Ilvento January 2006 Dr. Mugdim Pašić Key Objectives Understand the logic of statistical inference

More information

μ: ESTIMATES, CONFIDENCE INTERVALS, AND TESTS Business Statistics

μ: ESTIMATES, CONFIDENCE INTERVALS, AND TESTS Business Statistics μ: ESTIMATES, CONFIDENCE INTERVALS, AND TESTS Business Statistics CONTENTS Estimating parameters The sampling distribution Confidence intervals for μ Hypothesis tests for μ The t-distribution Comparison

More information

Two-Sample T-Test for Non-Inferiority

Two-Sample T-Test for Non-Inferiority Chapter 198 Two-Sample T-Test for Non-Inferiority Introduction This procedure provides reports for making inference about the non-inferiority of a treatment mean compared to a control mean from data taken

More information

7.1 Comparing Two Population Means: Independent Sampling

7.1 Comparing Two Population Means: Independent Sampling University of California, Davis Department of Statistics Summer Session II Statistics 13 September 4, 01 Lecture 7: Comparing Population Means Date of latest update: August 9 7.1 Comparing Two Population

More information

Tests for One Variance

Tests for One Variance Chapter 65 Introduction Occasionally, researchers are interested in the estimation of the variance (or standard deviation) rather than the mean. This module calculates the sample size and performs power

More information

M1 M1 A1 M1 A1 M1 A1 A1 A1 11 A1 2 B1 B1. B1 M1 Relative efficiency (y) = M1 A1 BEWARE PRINTED ANSWER. 5

M1 M1 A1 M1 A1 M1 A1 A1 A1 11 A1 2 B1 B1. B1 M1 Relative efficiency (y) = M1 A1 BEWARE PRINTED ANSWER. 5 Q L e σ π ( W μ e σ π ( W μ M M A Product form. Two Normal terms. Fully correct. (ii ln L const ( W ( W d ln L ( W + ( W dμ 0 σ W σ μ W σ W W ˆ μ σ Chec this is a maximum. d ln L E.g. < 0 dμ σ σ σ μ σ

More information

Review: Population, sample, and sampling distributions

Review: Population, sample, and sampling distributions Review: Population, sample, and sampling distributions A population with mean µ and standard deviation σ For instance, µ = 0, σ = 1 0 1 Sample 1, N=30 Sample 2, N=30 Sample 100000000000 InterquartileRange

More information

σ 2 : ESTIMATES, CONFIDENCE INTERVALS, AND TESTS Business Statistics

σ 2 : ESTIMATES, CONFIDENCE INTERVALS, AND TESTS Business Statistics σ : ESTIMATES, CONFIDENCE INTERVALS, AND TESTS Business Statistics CONTENTS Estimating other parameters besides μ Estimating variance Confidence intervals for σ Hypothesis tests for σ Estimating standard

More information

7. For the table that follows, answer the following questions: x y 1-1/4 2-1/2 3-3/4 4

7. For the table that follows, answer the following questions: x y 1-1/4 2-1/2 3-3/4 4 7. For the table that follows, answer the following questions: x y 1-1/4 2-1/2 3-3/4 4 - Would the correlation between x and y in the table above be positive or negative? The correlation is negative. -

More information

As you draw random samples of size n, as n increases, the sample means tend to be normally distributed.

As you draw random samples of size n, as n increases, the sample means tend to be normally distributed. The Central Limit Theorem The central limit theorem (clt for short) is one of the most powerful and useful ideas in all of statistics. The clt says that if we collect samples of size n with a "large enough

More information

Chapter 7. Inferences about Population Variances

Chapter 7. Inferences about Population Variances Chapter 7. Inferences about Population Variances Introduction () The variability of a population s values is as important as the population mean. Hypothetical distribution of E. coli concentrations from

More information

12.1 One-Way Analysis of Variance. ANOVA - analysis of variance - used to compare the means of several populations.

12.1 One-Way Analysis of Variance. ANOVA - analysis of variance - used to compare the means of several populations. 12.1 One-Way Analysis of Variance ANOVA - analysis of variance - used to compare the means of several populations. Assumptions for One-Way ANOVA: 1. Independent samples are taken using a randomized design.

More information

Final Exam Suggested Solutions

Final Exam Suggested Solutions University of Washington Fall 003 Department of Economics Eric Zivot Economics 483 Final Exam Suggested Solutions This is a closed book and closed note exam. However, you are allowed one page of handwritten

More information

CHAPTER 8. Confidence Interval Estimation Point and Interval Estimates

CHAPTER 8. Confidence Interval Estimation Point and Interval Estimates CHAPTER 8. Confidence Interval Estimation Point and Interval Estimates A point estimate is a single number, a confidence interval provides additional information about the variability of the estimate Lower

More information

A Test of the Normality Assumption in the Ordered Probit Model *

A Test of the Normality Assumption in the Ordered Probit Model * A Test of the Normality Assumption in the Ordered Probit Model * Paul A. Johnson Working Paper No. 34 March 1996 * Assistant Professor, Vassar College. I thank Jahyeong Koo, Jim Ziliak and an anonymous

More information

One sample z-test and t-test

One sample z-test and t-test One sample z-test and t-test January 30, 2017 psych10.stanford.edu Announcements / Action Items Install ISI package (instructions in Getting Started with R) Assessment Problem Set #3 due Tu 1/31 at 7 PM

More information

Power in Mixed Effects

Power in Mixed Effects Power in Mixed Effects Gary W. Oehlert School of Statistics University of Minnesota December 1, 2014 Power is an important aspect of designing an experiment; we now return to power in mixed effects. We

More information

The Two-Sample Independent Sample t Test

The Two-Sample Independent Sample t Test Department of Psychology and Human Development Vanderbilt University 1 Introduction 2 3 The General Formula The Equal-n Formula 4 5 6 Independence Normality Homogeneity of Variances 7 Non-Normality Unequal

More information

Chapter 8 Estimation

Chapter 8 Estimation Chapter 8 Estimation There are two important forms of statistical inference: estimation (Confidence Intervals) Hypothesis Testing Statistical Inference drawing conclusions about populations based on samples

More information

Two-Sample Z-Tests Assuming Equal Variance

Two-Sample Z-Tests Assuming Equal Variance Chapter 426 Two-Sample Z-Tests Assuming Equal Variance Introduction This procedure provides sample size and power calculations for one- or two-sided two-sample z-tests when the variances of the two groups
