ST 350 Lecture Worksheet #33 Reiland

SOLUTIONS

Lotteries: Good Idea or Scam?

Lotteries have become important sources of revenue for many state governments. However, people have criticized lotteries for several reasons. To establish the lottery, politicians promise that the lottery revenues will be used for education; however, they frequently cut back an equal amount on the state's funding for education when the lottery revenues begin appearing in the state's coffers. Another attack has been made by those who claim that lotteries are a tax on the poor and uneducated.

To examine the validity of the second claim, a random sample of 100 adults was asked how much they spend on lottery tickets and interviewed about various socioeconomic variables. The purpose of the study was to test the following beliefs:

i) Relatively uneducated people spend more on lotteries than do relatively educated people
ii) Older people buy more lottery tickets than younger people
iii) People with more children spend more on lottery tickets than people with fewer children
iv) Relatively poor people spend a greater proportion of their income on lottery tickets than relatively rich people

The following data were collected:

\(y\) = amount spent on lottery tickets as a percentage of total household income
\(x_1\) = number of years of education
\(x_2\) = age
\(x_3\) = number of children
\(x_4\) = personal income (in thousands of dollars)

Output from Excel is shown below.

Correlation Matrix:

            Education   Age       Children   Income
Education    1
Age         -0.1782      1
Children     0.1073      0.1072    1
Income       0.7339     -0.0418    0.0801     1

SUMMARY OUTPUT

Regression Statistics
Multiple R           0.6584
R Square             0.4335
Adjusted R Square    0.4096
Standard Error       2.91
Observations         100

ANOVA
             df    SS       MS      F       Significance F
Regression    4    615.4    153.9   18.17   0.0000
Residual     95    804.3    8.47
Total        99    1419.8

             Coefficients   Standard Error   t Stat   P-value
Intercept    11.91          1.79              6.67    0.0000
Education    -0.430         0.132            -3.26    0.0016
Age           0.0292        0.0252            1.16    0.2501
Children      0.0934        0.224             0.42    0.6780
Income       -0.0745        0.0277           -2.69    0.0085

1. Write the least squares prediction equation and interpret the coefficients of the model.

\[\hat{y} = 11.91 - 0.430(\text{Education}) + 0.0292(\text{Age}) + 0.0934(\text{Children}) - 0.0745(\text{Income})\]

Interpretation:

i) Intercept: the intercept value 11.91 does not have a meaningful interpretation in the context of this data (if Education, Age, Children, and Income are all 0, then the model predicts that 11.91% of total family income is spent on the lottery), but its value is still important for making accurate predictions when using the model.

ii) Education: computationally, after allowing for the linear effects of Age, Children, and Income, 1 additional year of Education results in the regression model predicting a decrease of 0.43% of total household income spent on the lottery. Note that it is NOT correct to say that 1 additional year of education causes a 0.43% reduction in the percentage of total household income spent on the lottery.

iii) Age: computationally, after allowing for the linear effects of Education, Children, and Income, 1 additional year of Age results in the regression model predicting an increase of 0.0292% of total household income spent on the lottery. Note that it is NOT correct to say that 1 additional year of Age causes a 0.0292% increase in the percentage of total household income spent on the lottery.

iv) Children: computationally, after allowing for the linear effects of Education, Age, and Income, 1 additional child results in the regression model predicting an increase of 0.0934% of total household income spent on the lottery. Note that it is NOT correct to say that 1 additional child causes a 0.0934% increase in the percentage of total household income spent on the lottery.

v) Income: computationally, after allowing for the linear effects of Education, Age, and Children, an additional $1,000 of personal income (1 unit of \(x_4\), the Income variable) results in the regression model predicting a decrease of 0.0745% of total household income spent on the lottery. Note that it is NOT correct to say that an additional $1,000 of personal income causes a 0.0745% decrease in the percentage of total household income spent on the lottery.

2. What percent of the variation in the percentage of total household income spent on the lottery is explained by the differences in the explanatory variables? (express as a percent; use 1 decimal place)

From the regression output, \(R^2 = 0.4335\); this means that 43.35% of the variation in the percent of total household income spent on the lottery is explained by differences in the explanatory variables Education, Age, Children, and Income.

3. Is the complete model useful for predicting percent of household income spent on the lottery?

Global F test:
\(H_0: \beta_1 = \beta_2 = \beta_3 = \beta_4 = 0\)
\(H_a:\) at least one \(\beta_i \neq 0,\ i = 1, 2, 3, 4\)

Test statistic (from output): \(F = \frac{MS_{Regression}}{MS_{Error}} = \frac{153.9}{8.47} = 18.17\)

P-value (from output, Significance F, to 4 decimal places): 0.0000

Conclusion: Reject the null hypothesis and conclude that at least one \(\beta_i \neq 0\). The complete model is useful for predicting the percent of household income spent on the lottery. NOTE that this does NOT imply that this is the BEST model.

4. Test each of the individual beliefs i) - iv) at the 5% significance level.
i) Relatively uneducated people spend more on lotteries than do relatively educated people

\(H_0: \beta_1 = 0,\quad H_a: \beta_1 \neq 0\)

\(t = \frac{b_1 - 0}{s_{b_1}} = \frac{-0.43}{0.132} = -3.26\); P-value = 0.0016 (from the above regression output)

Conclusion: reject \(H_0: \beta_1 = 0\) in favor of the alternative \(H_a: \beta_1 \neq 0\). Since \(b_1 < 0\), it appears that people with more years of education spend LESS on the lottery when Age, Children, and Income are accounted for; that is, relatively uneducated people spend MORE on lotteries than do relatively educated people.

ii) Older people buy more lottery tickets than younger people

\(H_0: \beta_2 = 0,\quad H_a: \beta_2 \neq 0\)

\(t = \frac{b_2 - 0}{s_{b_2}} = \frac{0.0292}{0.0252} = 1.16\); P-value = 0.2501 (from the above regression output)

Conclusion: do not reject \(H_0: \beta_2 = 0\). When Education, Children, and Income are accounted for, Age is not linearly related to the percentage of household income spent on the lottery.

iii) People with more children spend more on lottery tickets than people with fewer children

\(H_0: \beta_3 = 0,\quad H_a: \beta_3 \neq 0\)

\(t = \frac{b_3 - 0}{s_{b_3}} = \frac{0.0934}{0.224} = 0.42\); P-value = 0.6780 (from the above regression output)

Conclusion: do not reject \(H_0: \beta_3 = 0\). When Education, Age, and Income are accounted for, the number of Children is not linearly related to the percentage of household income spent on the lottery.

iv) Relatively poor people spend a greater proportion of their income on lottery tickets than relatively rich people

\(H_0: \beta_4 = 0,\quad H_a: \beta_4 \neq 0\)

\(t = \frac{b_4 - 0}{s_{b_4}} = \frac{-0.0745}{0.0277} = -2.69\); P-value = 0.0085 (from the above regression output)

Conclusion: reject \(H_0: \beta_4 = 0\). When Education, Age, and Children are accounted for, personal Income is linearly related to the percentage of household income spent on the lottery.

5. Are the required conditions satisfied?

[Figures: histogram of the residuals (Frequency versus Residuals); plot of Residuals versus Predicted values]

The regression model is

\[y_i = \beta_0 + \beta_1 x_{1i} + \beta_2 x_{2i} + \beta_3 x_{3i} + \beta_4 x_{4i} + \varepsilon_i,\quad i = 1, \ldots, n\]

with the following assumptions on the error terms \(\varepsilon_i\):

i) For all \(i\), \(E(\varepsilon_i) = 0\)
ii) For all values of \(x_1, x_2, x_3, x_4\), \(SD(\varepsilon_i) = \sigma_\varepsilon\) for all \(i\)
iii) The distribution of the error \(\varepsilon_i\) is normal, \(i = 1, \ldots, n\)
iv) The errors associated with different values of the response variable \(y\) are independent

Summary of the assumptions: \(\varepsilon_i \sim \text{iid } N(0, \sigma_\varepsilon)\) for all \(x_1, x_2, x_3, x_4\), where iid denotes independent and identically distributed.
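
As a rough check on the individual t tests in question 4, the sketch below recomputes each t statistic from the coefficient table and approximates the two-sided P-value. It is only an illustration: the helper names `t_pdf` and `two_sided_p` are made up for this example, the P-value is obtained by numerically integrating the t density with Simpson's rule (not necessarily how Excel computes it), and the inputs are the rounded values printed in the output, so small differences in the last digit are expected.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def two_sided_p(t, df, steps=2000):
    """Two-sided P-value via Simpson's rule on [0, |t|] (steps must be even)."""
    a = abs(t)
    if a == 0:
        return 1.0
    h = a / steps
    total = t_pdf(0, df) + t_pdf(a, df)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * t_pdf(k * h, df)
    area = total * h / 3              # approximates P(0 < T < |t|)
    return max(0.0, 1.0 - 2.0 * area)

# (coefficient, standard error) copied from the regression output above
full_model = {
    "Education": (-0.430, 0.132),
    "Age": (0.0292, 0.0252),
    "Children": (0.0934, 0.224),
    "Income": (-0.0745, 0.0277),
}
DF_ERROR = 95   # residual degrees of freedom from the ANOVA table
ALPHA = 0.05    # significance level used in question 4

results = {}
for name, (b, se) in full_model.items():
    t = (b - 0) / se                  # test statistic for H0: beta = 0
    p = two_sided_p(t, DF_ERROR)
    results[name] = (t, p, p < ALPHA)
    print(f"{name:9s} t = {t:6.2f}  P-value = {p:.4f}  reject H0: {p < ALPHA}")
```

The decisions (reject for Education and Income, do not reject for Age and Children) depend only on whether each P-value falls below 0.05, so they are not sensitive to the rounding of the inputs.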

The histogram of the residuals shows an approximately mound-shaped (normal) pattern with some left skewness and a mean of approximately 0, so we can be reasonably comfortable with assumptions i) and iii). To check assumptions ii) and iv) we can examine residual plots to look for patterns. The cone-shaped pattern of the residual plot calls into question assumption ii) but not assumption iv).

6. a. Estimate the percentage of income spent on the lottery by adults with 11 years of education who are 45 years old with 3 children and annual income of $30,000.

\[\hat{y} = 11.91 - 0.43(11) + 0.0292(45) + 0.0934(3) - 0.0745(30) = 6.54\]

b. Estimate the percentage of income spent on the lottery by adults with 16 years of education who are 45 years old with 3 children and annual income of $60,000.

\[\hat{y} = 11.91 - 0.43(16) + 0.0292(45) + 0.0934(3) - 0.0745(60) = 2.15\]

7. Since education and income are highly correlated, let's eliminate income from the model and observe how the estimates change. The output is shown below.

SUMMARY OUTPUT

Regression Statistics
Multiple R           0.62486173
R Square             0.39045218
Adjusted R Square    0.37140381
Standard Error       3.00248144
Observations         100

ANOVA
             df    SS             MS          F          Significance F
Regression    3    554.3600991    184.7867    20.49793   2.39533E-10
Residual     96    865.4299009    9.014895
Total        99    1419.79

             Coefficients   Standard Error   t Stat      P-value
Intercept    13.1714062     1.77677665        7.413091   4.85E-11
Education    -0.6913761     0.092145484      -7.503093   3.15E-11
Age           0.02011054    0.025796646       0.77958    0.437556
Children      0.10280375    0.231431694       0.444208   0.657892

a. Is the complete model useful for predicting percent of income spent on the lottery?

Global F test:
\(H_0: \beta_1 = \beta_2 = \beta_3 = 0\)
\(H_a:\) at least one \(\beta_i \neq 0,\ i = 1, 2, 3\)

Test statistic (from output): \(F = \frac{MS_{Regression}}{MS_{Error}} = \frac{184.79}{9.01} = 20.498\)

P-value (from output, Significance F, to 4 decimal places): 0.0000

Conclusion: Reject the null hypothesis and conclude that at least one \(\beta_i \neq 0\). The complete model is useful for predicting the percent of household income spent on the lottery. NOTE that this does NOT imply that this is the BEST model.

b. Test each of the individual beliefs i) - iii) at the 5% significance level.

i) Relatively uneducated people spend more on lotteries than do relatively educated people

\(H_0: \beta_1 = 0,\quad H_a: \beta_1 \neq 0\)

\(t = \frac{b_1 - 0}{s_{b_1}} = \frac{-0.69}{0.092} = -7.5\); P-value = 0.0000 (to 4 decimal places, from the above regression output)

Conclusion: reject \(H_0: \beta_1 = 0\) in favor of the alternative \(H_a: \beta_1 \neq 0\). Since \(b_1 < 0\), it appears that people with more years of education spend LESS on the lottery when Age and Children are accounted for; that is, relatively uneducated people spend MORE on lotteries than do relatively educated people.

ii) Older people buy more lottery tickets than younger people

\(H_0: \beta_2 = 0,\quad H_a: \beta_2 \neq 0\)

\(t = \frac{b_2 - 0}{s_{b_2}} = \frac{0.0201}{0.0258} = 0.7796\); P-value = 0.4376 (from the above regression output)

Conclusion: do not reject \(H_0: \beta_2 = 0\). When Education and Children are accounted for, Age is not linearly related to the percentage of household income spent on the lottery.

iii) People with more children spend more on lottery tickets than people with fewer children

\(H_0: \beta_3 = 0,\quad H_a: \beta_3 \neq 0\)

\(t = \frac{b_3 - 0}{s_{b_3}} = \frac{0.1028}{0.2314} = 0.4442\); P-value = 0.6579 (from the above regression output)

Conclusion: do not reject \(H_0: \beta_3 = 0\). When Education and Age are accounted for, the number of Children is not linearly related to the percentage of household income spent on the lottery.
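
The arithmetic behind questions 2, 3, 6, and 7a can be retraced with a short script. The sketch below is an illustration only: the function names `anova_summary` and `predict` are invented for this example, and the inputs are the sums of squares and coefficients as printed in the two Excel outputs, so results for the four-variable model agree with the output only up to rounding.

```python
import math

def anova_summary(ss_reg, ss_res, k, n):
    """R^2, adjusted R^2, F, and standard error from ANOVA sums of squares.

    k is the number of explanatory variables, n the number of observations.
    """
    ss_total = ss_reg + ss_res
    ms_reg = ss_reg / k
    ms_err = ss_res / (n - k - 1)
    r2 = ss_reg / ss_total
    adj_r2 = 1 - (1 - r2) * (n - 1) / (n - k - 1)
    return {"R2": r2, "adj_R2": adj_r2, "F": ms_reg / ms_err,
            "s": math.sqrt(ms_err)}

# Sums of squares copied from the two ANOVA tables
full = anova_summary(615.4, 804.3, k=4, n=100)               # with Income
reduced = anova_summary(554.3600991, 865.4299009, k=3, n=100)  # without Income

for label, m in (("4 variables", full), ("3 variables", reduced)):
    print(f"{label}: R2 = {m['R2']:.4f}, adj R2 = {m['adj_R2']:.4f}, "
          f"F = {m['F']:.2f}, s = {m['s']:.2f}")

# Coefficients of the four-variable model (question 1)
COEF = {"Intercept": 11.91, "Education": -0.430, "Age": 0.0292,
        "Children": 0.0934, "Income": -0.0745}

def predict(education, age, children, income_thousands):
    """Predicted percentage of household income spent on the lottery."""
    return (COEF["Intercept"]
            + COEF["Education"] * education
            + COEF["Age"] * age
            + COEF["Children"] * children
            + COEF["Income"] * income_thousands)

pred_a = predict(11, 45, 3, 30)   # question 6a
pred_b = predict(16, 45, 3, 60)   # question 6b
print(f"6a: {pred_a:.2f}   6b: {pred_b:.2f}")
```

Dropping Income lowers both R Square and Adjusted R Square, which is consistent with the significant Income coefficient in the four-variable model.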