Predicting Charitable Contributions


By Lauren Meyer

Executive Summary

Charitable contributions depend on many factors, from financial security to personal characteristics. This report focuses on demographic and financial information from 2,000 families who participated in the 2004 Survey of Consumer Finances. Specifically, two regression models are introduced. The first indicates whether or not a family will make any charitable contribution, while the second predicts the amount of the contribution. The models pinpoint certain financial and demographic information as important determinants of charitable contributions. Among other things, both models find that older, more educated families with greater incomes, savings, and inheritances are likely to contribute more money to charity.

Section 1. Introduction

Charitable contributions are one thing that society values. As a society, we almost expect people making over some threshold to contribute large amounts to charity. Many ask: why does one person need all that wealth? Even the average citizen usually enjoys contributing to a cause in which they believe. Yet people's behaviors tend to vary a great deal, even in similar situations. Despite this, is it possible to accurately predict charitable contributions of individuals and families?

It is logical to think that one's personal values will influence the amount they contribute to charity. There are some very generous people who find ways to donate even when they themselves are not in the best financial position. Personal characteristics, such as generosity, selflessness, and the degree to which one is materialistic, could all be influential in predicting charitable contributions. However, there must be some more objective financial evidence that can serve as a better predictor of charitable contributions. Personal characteristics aside, we would expect a billionaire to contribute more to charity than someone with an income of $30,000 per year. Income, along with numerous other financial measures, could be considered an important predictor of charitable contributions.

This report uses data from the 2004 Survey of Consumer Finances (SCF) to estimate the charitable contributions of families, given some demographic and financial information about them. The SCF is conducted by the Federal Reserve Board and collects information concerning the balance sheet, pension, income, and other demographic characteristics of U.S. families. One question asked of these U.S. families is the amount of their charitable contributions in 2003, which is the key variable of interest for this report.

The remainder of the report is organized as follows. Section 2 explains important characteristics of the data. Section 3 explores the models chosen to represent the data in hopes of explaining charitable contributions as well as possible. Section 4 offers concluding thoughts. Finally, more detailed analysis is provided in the appendices.

Section 2. Data Characteristics

The SCF is a cross-sectional survey that is nationally representative. Observational data is collected every three years by the National Opinion Research Center, a national organization for research and computing at the University of Chicago, via on-site and phone interviews. In order to study the factors affecting charitable contributions, two thousand observations were randomly selected from the SCF database. Two thousand observations are enough to ensure a representative sample, yet are easier to work with than the original forty-five hundred observations.

When examining charitable contributions, it is expected that many families will not contribute anything to charity. On the other hand, for those who do contribute to charity, there is a wide range of amounts contributed. Based on this understanding, I found it best to first predict whether or not the family made a charitable contribution, using all two thousand data points. Next, I would predict how much a given charitable contribution would be, using only observations with positive charitable contributions.

Out of the two thousand observations from the SCF, one thousand fifty households contributed something to charity. The smallest contribution amount reported was five hundred dollars, while the largest amount contributed to charity was $9,070,000. Figure 1 below shows the distribution of charitable contributions, after removing the observations with no contribution. The numbers above the bars indicate how many observations fell into that range.

Figure 1.

From Figure 1, notice that 1032 of the 1050 observations, or more than 98% of positive charitable contributions, were for amounts of $1000 or less. This graph indicates that the data is skewed to the right. A logarithmic transformation will be used for charitable contributions in order to symmetrize this skewed distribution. This transformation is useful, as it pulls in extreme contribution values yet retains the original ordering, so that the largest contribution amounts are still the largest amounts on the logarithmic scale. Figure 2 shows the new distribution of charitable contributions after the logarithmic transformation. While this transformation does not perfectly symmetrize the data, it does serve well in spreading out the observations.

Figure 2. Distribution of Logarithmic Charitable Contributions (axis label: Logarithmic Charitable Contribution)

Up to this point, I have not yet introduced the variables that will be used to predict charitable contributions. The SCF has more than 5,000 questions that were answered by participating families. Of these variables, there are many that could impact charitable contributions. The details of variable selection for this report are discussed in section three. There are, of course, some variables that we expect to have a great influence on charitable contributions. For example, it seems reasonable that income would be a strong predictor of charitable contributions. Figure 3 shows the relationship between income and logarithmic charitable contributions, which was chosen as the dependent variable of interest. Notice that this plot indicates a quadratic component for the explanatory variable INCOME. Charitable contributions are lower at the three highest income values compared to the contributions at lower, but still very high, incomes. While this may be surprising, it is a fact of the data that will be taken into account during the model selection process. Figure 4 confirms that some very high income families contributed relatively little to charity. These three points are both outliers and high leverage points. However, I think it is important for these points to be left in the analysis because they reflect a fact of the real world that would likely be seen in other random samples from the SCF as well. Appendix A provides additional information about the outliers and high leverage points.

Figure 3. Logarithmic Charitable Contributions vs. Income in Thousands

Figure 4. Charitable Contributions in Thousands vs. Income in Thousands

Section 3. Model Selection and Interpretation

Section two established that logarithmic charitable contributions will be the dependent variable and that income will be a strong predictor of contributions. It also foreshadowed the fact that a squared income term could be useful in the model. This section introduces the rest of the explanatory variables and the models built from these variables. First, the logistic regression model is introduced, which predicts whether or not a charitable contribution was made. This model is not concerned with the amount of a contribution. Next, the linear regression model, which predicts the amount of the contribution given that one occurred, is introduced and explained. For clarity, the first model will be referred to as the contribution indication model, and the second model will be referred to as the contribution amount model. This will help remind the reader of the purpose of each model. After introducing both models, section three continues with an explanation of how these models were concluded to be the best, along with a discussion of alternative models.
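The two-stage setup described above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions: the `contributions` list is made-up data (only the smallest and largest amounts, $500 and $9,070,000, come from Section 2), and the helper name `split_two_part` is hypothetical rather than anything used in the original analysis.

```python
import math

def split_two_part(contributions):
    """Build the two dependent variables of a two-part analysis.

    Returns (indicator, log_amounts):
      indicator   - 1 if a contribution was made, else 0 (one entry per family)
      log_amounts - natural log of the contribution, positive amounts only
    """
    indicator = [1 if c > 0 else 0 for c in contributions]
    log_amounts = [math.log(c) for c in contributions if c > 0]
    return indicator, log_amounts

# Hypothetical sample; only 500 and 9_070_000 are taken from the report.
contributions = [0, 500, 0, 2_000, 75_000, 9_070_000]
indicator, log_amounts = split_two_part(contributions)

positive = [c for c in contributions if c > 0]

# The logarithm is monotonic, so the ordering of the amounts is preserved.
order_raw = sorted(range(len(positive)), key=lambda i: positive[i])
order_log = sorted(range(len(log_amounts)), key=lambda i: log_amounts[i])
same_order = order_raw == order_log

# Extreme values are pulled in: compare the spread on each scale.
raw_ratio = max(positive) / min(positive)          # largest / smallest amount
log_range = max(log_amounts) - min(log_amounts)    # width on the log scale

print(indicator, same_order, raw_ratio, round(log_range, 2))
```

The indicator list would feed the contribution indication model (all families), while the log amounts would feed the contribution amount model (contributors only).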

Contribution Indication Model

The contribution indication model is a nonlinear, logistic regression model with the following explanatory variables and regression coefficients:

Explanatory Variable             Coefficient    Z Value   Significance
(Intercept)                      [...]          [...]     ***
Income                           1.327E-06      [...]     **
Age                              1.537E-02      [...]     ***
Pension, Annuities Income        1.005E[...]    [...]     *
Life Insurance (FACE AMT)        7.622E-07      [...]     ***
Spouse Age                       1.054E-02      [...]     ***
# Businesses Managed             4.700E[...]    [...]     ***
Savings Bond Value               6.252E-05      [...]     **
Support to Family/Friends        2.485E[...]    [...]     *
Spouse Education (=Bachelors)    6.145E-01      [...]     ***
Spouse Education (=Masters)      1.355E[...]    [...]     ***
Amount Owed on Mortgages         3.136E-06      [...]     ***
Total Savings                    1.939E-06      [...]
Total Inherited                  4.717E-07      [...]
Assets                           -7.811E-08     [...]     *
Saving Habits (Not regularly)    -5.326E[...]   [...]     ***

The coefficients in the above table provide information about how the explanatory variables affect the dependent variable, which is a binary variable with a value of 1 if a charitable contribution occurred. For instance, the negative coefficient associated with saving habits indicates that a family who does not save regularly will be less likely to make a charitable contribution. The z-value is another important statistic associated with each variable. The larger the z-value in absolute value, the more significant that variable is in explaining charitable contributions. The asterisks also serve as a significance indicator. For example, *** indicates that the variable is significant at the one-tenth of a percent level.

Many of the variables and coefficients in this model seem reasonable, maybe even expected. It makes sense that a family with more income would be more likely to make a charitable contribution. Other variables, such as AGE and SPOUSE AGE, have a positive coefficient, indicating that older families are more likely to contribute to charity. This could be because older households no longer need to support children. LIFE INSURANCE amount and number of BUSINESSES_MANAGED are also positively related to charitable contributions. This is partly because they are positively correlated with income. Appendix B contains a table of correlation coefficients for the contribution indication model. Also, business managers may be more likely to contribute to charity because it creates a good image for their company.
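Because a logistic regression is linear on the log-odds scale, each coefficient can also be read as a multiplicative change in the odds of contributing. The sketch below illustrates this reading with three coefficients whose full values appear in the worked example for the Smith family later in the report; the function name `odds_multiplier` and the chosen increments (ten years, $100,000, $1,000,000) are assumptions made for illustration only.

```python
import math

def odds_multiplier(coefficient, change):
    """Factor by which the odds of contributing are multiplied when one
    explanatory variable rises by `change`, all other variables held fixed."""
    return math.exp(coefficient * change)

# Coefficients as reported for the contribution indication model.
AGE_COEF = 1.537e-02
INCOME_COEF = 1.327e-06
ASSETS_COEF = -7.811e-08

per_decade_of_age = odds_multiplier(AGE_COEF, 10)          # ten more years of age
per_100k_income = odds_multiplier(INCOME_COEF, 100_000)    # $100,000 more income
per_1m_assets = odds_multiplier(ASSETS_COEF, 1_000_000)    # $1,000,000 more assets

print(f"{per_decade_of_age:.3f} {per_100k_income:.3f} {per_1m_assets:.3f}")
```

A multiplier above 1 means higher odds of making a contribution and a multiplier below 1 means lower odds, which matches the signs discussed in the text: positive for age and income, negative for assets.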

The variable ASSETS has a surprising negative coefficient. This negative coefficient indicates that we would expect a family with more assets to be less likely to make charitable contributions. This could be because these families do not have as much liquid cash available, which is usually the form of payment for charitable contributions. The positive coefficient for MORTGAGE_OWED is also surprising. I originally expected families with large outstanding mortgages to be less likely to make charitable contributions. However, a mortgage is not generally considered bad debt, so perhaps families with these large mortgages are more affluent and still able to make contributions, despite mortgage debt.

This model had an Akaike's Information Criterion (AIC) of [...]. A lower AIC implies a better model fit. This AIC was relatively low compared to the other models considered. Further summary data, from the statistical program R, can be found in appendix C. Summary statistics are available in appendix D.

Contribution Amount Model

The contribution amount model uses many of the same variables that were seen in the contribution indication model. The table below presents each explanatory variable as well as the linear regression coefficient and t-value associated with it.

Explanatory Variable             Coefficient    T Value   Significance
(Intercept)                      6.263E+00      [...]     ***
Income                           2.614E-07      [...]     ***
Squared Income                   -3.767E-15     [...]     ***
Age                              1.107E-02      [...]     ***
Pension, Annuities Income        4.648E[...]    [...]     **
Life Insurance (FACE AMT)        8.580E-08      [...]     ***
Spouse Age                       1.031E-02      [...]     ***
Education (=Bachelors Degree)    2.476E-01      [...]
Education (=Masters Degree)      5.843E[...]    [...]     **
# Businesses Managed             1.102E[...]    [...]     ***
CD Value                         2.288E-07      [...]     ***
Support to Family/Friends        4.269E[...]    [...]     ***
Value of Stocks                  1.115E-08      [...]     **
Spouse Education (=Bachelors)    3.374E-01      [...]     *
Spouse Education (=Masters)      4.065E[...]    [...]     *
Amount Owed on Mortgages         2.009E-07      [...]     *
Credit Line Available            3.369E-07      [...]     **
Total Savings                    9.519E-08      [...]     *
Total Inherited                  4.303E-08      [...]     ***
Total in Checking                2.738E-07      [...]     ***

One important variable in the model is SQUARED INCOME. This variable was added based on the plot of LOG(CHARITABLE_CONT) vs. INCOME that was shown in section two. SQUARED INCOME has a negative coefficient, as expected, indicating that as INCOME increases, LOG(CHARITABLE_CONT) increases at a decreasing rate. This squared term also explains the decreasing contribution amounts at very large income levels.

Interestingly, CD_VALUE and STOCK_VALUE were useful in predicting a contribution amount, rather than BOND_VALUE, which was important in the contribution indication model. Both CD and stock value have positive coefficients, indicating that higher amounts in CDs and stocks are generally associated with a larger charitable contribution.

In this model, EDUCATION of the respondent is now an important explanatory variable in addition to SPOUSE_EDUCATION, which was seen in the contribution indication model. Higher education levels are generally associated with high income levels, although the correlation coefficients between education and income for this data set were not strong. Appendix E has correlation coefficients for the contribution amount model.

T-values, like z-values, indicate how significant each variable is in predicting LOG(CHARITABLE_CONT). Every variable except EDUCATION (=BACHELORS) has a t-value greater than two in absolute value. The coefficient of determination for this model, R² = 0.554, indicates that the model explains 55.4% of the variability in logarithmic charitable contributions. The coefficient of determination adjusted for degrees of freedom, adjusted R² = 0.5458, was not much lower. The size of the typical error, s, was [...]. Further summary data from the statistical program R can be found in appendix F. Summary statistics for this model are in appendix G.

One concern with this model was related to the presence of collinearity, which occurs when one explanatory variable is nearly a linear combination of the other explanatory variables. To examine the presence of collinearity, variance inflation factors (VIF) were calculated for each variable. The highest VIF was that of INCOME, with a value of [...]. While this is somewhat large, it is not greater than 10, at which point severe collinearity would exist. The VIFs for the other variables can be found in appendix H.
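The summary measures discussed above follow from short formulas, sketched here in Python. The function names are hypothetical. The inputs are taken from the report: 1,050 observations and 19 estimated slopes (the F-statistic in appendix F is on 19 and 1030 degrees of freedom), and the INCOME and SQUARED INCOME coefficients as written out in the Smith family example. The turning-point calculation is an added illustration of what the negative squared term implies, not a figure stated in the report.

```python
def adjusted_r_squared(r2, n_obs, n_predictors):
    """R-squared adjusted for degrees of freedom."""
    return 1 - (1 - r2) * (n_obs - 1) / (n_obs - n_predictors - 1)

def variance_inflation_factor(r2_j):
    """VIF of one explanatory variable, where r2_j is the R-squared from
    regressing that variable on all the other explanatory variables."""
    return 1 / (1 - r2_j)

def quadratic_turning_point(linear_coef, squared_coef):
    """Value of x at which b1*x + b2*x**2 stops rising (for b2 < 0)."""
    return -linear_coef / (2 * squared_coef)

# Adjusted R-squared from the reported R-squared of 0.554.
adj_r2 = adjusted_r_squared(0.554, n_obs=1050, n_predictors=19)

# The rule-of-thumb cutoff of VIF = 10 corresponds to r2_j = 0.9.
vif_at_cutoff = variance_inflation_factor(0.9)

# Income level at which the fitted log contribution peaks, all else fixed.
income_peak = quadratic_turning_point(2.614e-07, -3.767e-15)

print(round(adj_r2, 4), round(vif_at_cutoff, 6), round(income_peak))
```

Under these inputs the adjusted value rounds to the 0.5458 reported above, and the fitted peak falls at an income in the tens of millions of dollars, which is consistent with the lower contributions observed at the three highest incomes.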

An Example Using These Models

An example will help clarify how to use the two models. Suppose we have the following information about the Smith family: Mr. Smith was the survey respondent. Family income is $140,000 per year. The family is receiving no pension or annuity income. The family holds a life insurance policy on Mr. Smith with a FACE amount of $500,000. Mr. Smith is 46 and Mrs. Smith is 42. Both Mr. and Mrs. Smith have earned bachelor's degrees. They do not manage any businesses and do not support family or friends monetarily. They have CDs worth $15,000, stocks worth $20,000, and savings bonds worth $10,000. They still owe $100,000 on their mortgage. They own a yacht worth $40,000, which is counted under ASSETS. Their available credit line is $50,000. The Smiths have $50,000 in savings and $25,000 in checking. They recently inherited $100,000 when Mr. Smith's father passed away. Their savings habits are defined as regular by the SCF.

First, using the contribution indication model, we can find the probability that the Smiths make a charitable contribution. Let z denote the fitted linear predictor and Π(z) the corresponding probability for the logit regression case.

z = [...] + (1.327E-06)(140,000) + (1.537E-02)(46) + (7.622E-07)(500,000) + (1.054E-02)(42) + (6.252E-05)(10,000) + 6.145E-01 + (3.136E-06)(100,000) + (1.939E-06)(50,000) + (4.717E-07)(100,000) + (-7.811E-08)(40,000) = [...]

For the logit case, Π(z) = e^z / (1 + e^z). The probability that the Smiths make some contribution to charity is 0.805.

Next, let's see how much we would expect the Smiths to contribute to charity, given that they make a contribution.

log(Contribution) = 6.263 + (2.614E-07)(140,000) + (-3.767E-15)(140,000^2) + (1.107E-02)(46) + (8.580E-08)(500,000) + (1.031E-02)(42) + (2.476E-01) + (2.288E-07)(15,000) + (1.115E-08)(20,000) + 3.374E-01 + (2.009E-07)(100,000) + (3.369E-07)(50,000) + (9.519E-08)(50,000) + (4.303E-08)(100,000) + (2.738E-07)(25,000) = 7.92

Since log(Contribution) = 7.92 on the natural logarithm scale, we expect the Smiths to contribute $2,767 to charity.
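The arithmetic of the Smith example can be retraced with a short Python sketch. Several inputs are assumptions and are labeled as such: the intercept of the contribution indication model is not legible in this copy of the report, so it is backed out from the stated probability of 0.805 rather than read from a table; the spouse-education terms are taken to be 6.145E-01 and 3.374E-01; and the contribution amount intercept is taken as 6.263, the estimate shown in appendix F with its exponent assumed to be zero.

```python
import math

def logistic(z):
    """Inverse logit: converts a linear predictor into a probability."""
    return math.exp(z) / (1 + math.exp(z))

def logit(p):
    """Log-odds of a probability."""
    return math.log(p / (1 - p))

# Contribution indication model: terms that are legible in the example.
indication_terms = [
    1.327e-06 * 140_000,   # income
    1.537e-02 * 46,        # age
    7.622e-07 * 500_000,   # life insurance face amount
    1.054e-02 * 42,        # spouse age
    6.252e-05 * 10_000,    # savings bond value
    6.145e-01,             # spouse education = bachelors (assumed exponent)
    3.136e-06 * 100_000,   # amount owed on mortgages
    1.939e-06 * 50_000,    # total savings
    4.717e-07 * 100_000,   # total inherited
    -7.811e-08 * 40_000,   # assets
]
partial_z = sum(indication_terms)
implied_z = logit(0.805)                    # z consistent with the stated 0.805
implied_intercept = implied_z - partial_z   # back-calculated, not from the report

# Contribution amount model.
AMOUNT_INTERCEPT = 6.263   # appendix F estimate; exponent assumed to be e+00
amount_terms = [
    2.614e-07 * 140_000,        # income
    -3.767e-15 * 140_000 ** 2,  # squared income
    1.107e-02 * 46,             # age
    8.580e-08 * 500_000,        # life insurance face amount
    1.031e-02 * 42,             # spouse age
    2.476e-01,                  # education = bachelors
    2.288e-07 * 15_000,         # CD value
    1.115e-08 * 20_000,         # value of stocks
    3.374e-01,                  # spouse education = bachelors (assumed exponent)
    2.009e-07 * 100_000,        # amount owed on mortgages
    3.369e-07 * 50_000,         # credit line available
    9.519e-08 * 50_000,         # total savings
    4.303e-08 * 100_000,        # total inherited
    2.738e-07 * 25_000,         # total in checking
]
log_contribution = AMOUNT_INTERCEPT + sum(amount_terms)
expected_contribution = math.exp(log_contribution)

print(round(implied_intercept, 3), round(log_contribution, 2),
      round(expected_contribution))
```

With these inputs the log contribution rounds to 7.92, and exponentiating it gives a dollar amount within a few dollars of the $2,767 quoted above; the small gap is what one would expect from coefficients rounded to four significant digits.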

Determining the Final Models

Both models are fairly complex due to the large number of variables. However, this was expected, as charitable contributions are complex to predict. Many other models were considered, but the two chosen models provide the best estimates. Goodness of fit measures were discussed earlier in this section.

Stepwise regression was used as a first step in determining both models. This was important because the original data set had over five thousand variables. After narrowing these down to 70 variables, which can be seen in appendix I, stepwise regression was run for the logistic and linear regression models. The final models are not exactly what was recommended by the stepwise regression. They were created after looking at scatter plots, correlations, and variable coefficients. As section two demonstrated, the scatter plot of LOG(CHARITABLE_CONT) vs. INCOME suggested the squared income term, which would not be identified by stepwise regression.

In section two, it was also noted that three points were both outliers and high leverage points. After removing the three most unusual points, as identified in the residuals vs. leverages plot, the adjusted R² increased to 56.2% from 54.6%. Appendix J has more information about the regression model after removing these three points. These points were left in the data during model selection.

One drawback of the contribution amount model is evident in the residuals vs. fitted plot in appendix K. This plot indicates that heteroscedasticity is present. The variance of the residuals starts out relatively small, grows, and then slightly decreases. The plot also shows residuals becoming more consistently negative as fitted values increase. In other words, the model is overestimating charitable contributions for observations with large fitted values. Despite this, the chosen model is still the best, as all other models considered had this same problem. I feel more qualitative variables are necessary to solve this issue. This idea is expanded upon in the recommendations portion of section four. Also relating to residuals, the histogram of standardized residuals appears normally distributed.

Alternative Models

One alternative model to be considered involves adding an INCOME^3 term. Doing this increases the adjusted R² to 59.7% and also improves the residuals vs. fitted plot discussed earlier. I did not use this term because it increases the model complexity. Also, I questioned whether an INCOME^3 term made economic sense and whether the term would be significant if a different data sample from the SCF were taken. As mentioned earlier, the data used for this report was a subset of 2000 of the 4500 observations in the SCF. I would suggest rerunning the contribution amount model introduced above with all 4500 observations. At that point, it would be easier to consider whether INCOME^3 would actually be useful. The reason is that my subset of 2000 had only three data points with INCOME well over $40,000,000. These data points are influential in the analysis, and it would be useful to have more data points with such extremely high INCOME.

Section 4. Summary and Concluding Remarks

Despite the wide range of variability associated with charitable contributions, this report shows that it is possible to predict both whether a family will make a contribution and how much that contribution would be. The recommended explanatory variables for indicating the probability that a contribution will be made are: the family's income level, the age of the respondent and their spouse, amount of life insurance purchased, amount of income from pensions or annuities, number of businesses managed, amount of support provided to others, amount of money in bonds, savings, and assets, total amount inherited, spouse's education level, amount owed on mortgages, and saving habits of the family. To predict the amount of money likely to be contributed to charity, the following additional variables are recommended: income squared, education level of the respondent, value of CDs and stocks, the total amount in checking, and the credit line available to the family.

The study looked at cross-sectional data from the Survey of Consumer Finances, which mainly has demographic and financial information about families. I believe the coefficient of determination, R², could be dramatically improved if additional data sources could be brought into the study. Mainly, subjective information would improve the study, such as how important one feels it is to donate to charity and other personal attributes that reflect how caring, giving, and materialistic the respondent is. It would also be interesting to consider a longitudinal study, looking at how much charitable contributions change for the same families year after year. This would allow personal characteristics to be held steadier, and the focus could be turned to financial variables, as we would expect salary and net worth to increase with time.

Appendices: Table of Contents

Appendix A: Standardized residuals vs. Leverages for the Contribution Amount Model
Appendix B: Table of Correlations for the Contribution Indication Model
Appendix C: Summary data for predicting whether a contribution was made
Appendix D: Summary Statistics for the Contribution Indication Model
Appendix E: Table of Correlations for the Contribution Amount Model
Appendix F: Summary data for predicting the amount of a charitable contribution
Appendix G: Summary Statistics for the Contribution Amount Model
Appendix H: Variance Inflation Factors
Appendix I: Variables Considered from the SCF and Variables Created from the SCF
Appendix J: R output after removing three unusual points
Appendix K: Residuals vs. Fitted Values for the Contribution Amount Model

Appendix A: Residuals vs. Leverage for the Contribution Amount Model

Observation Number    Standardized Residual    Leverage

Appendix B: Table of Correlations for the Contribution Indication Model

Columns: Income, Age, Spouse Age, Pension_Annuity_Income, Saving_Habits, Businesses_Managed, Support, SavingBond_Value, Life_Insurance, Spouse_Educ, OwedOnMortgage, TotalSavings, Assets
Rows: Age, Spouse Age, Pension_Annuity_Income, Saving_Habits, Businesses_Managed, Support, SavingBond_Value, Life_Insurance, Spouse_Educ, OwedOnMortgage, TotalSavings, Assets, Total Inherit

Appendix C. Summary Data for Model 1: Predicting whether a contribution was made

Call:
glm(formula = CharityInd ~ AGE + SPOUSEAGE + INCOME + Pension_Annuity_Income + factor(saving_habits) + Businesses_Managed + Support + SavingBond_Value + Life_Insurance + factor(spouse_educ) + OwedOnMortgages + TotalSavings + Assets + TotalInherit, family = binomial(link = logit), data = mydata)

Deviance Residuals:
Min 1Q Median 3Q Max
e e e e e+00

Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) e e < 2e-16 ***
AGE 1.537e e ***
SPOUSEAGE 1.054e e e-05 ***
INCOME 1.327e e **
Pension_Annuity_Income 1.005e e *
factor(saving_habits) e e e-06 ***
Businesses_Managed 4.700e e e-05 ***
Support 2.485e e *
SavingBond_Value 6.252e e **
Life_Insurance 7.622e e ***
factor(spouse_educ) e e e-06 ***
factor(spouse_educ) e e e-08 ***
OwedOnMortgages 3.136e e e-06 ***
TotalSavings 1.939e e
Assets e e *
TotalInherit 4.717e e

Null deviance: on 1999 degrees of freedom
Residual deviance: on 1984 degrees of freedom
AIC:
Number of Fisher Scoring iterations: 9

Appendix D: Summary Statistics for the Contribution Indication Model

Variable Mean Median SD Min Max
INCOME 701, , ,440, , ,000,
AGE
Pension_Annuity_Income 13, , ,000,
Life_Insurance 459, , ,880, ,000,
SPOUSEAGE
Businesses_Managed
Support 8, , ,
SavingBond_Value 2, , ,
OwedOnMortgages 106, , ,000,
Total Savings 100, , ,000,
Assets 251, ,070, , ,000,
TotalInherit 196, ,120, ,000,000.00

Appendix E: Table of Correlations for the Contribution Amount Model

Columns: log(char), Income, SqIncome, Age, Pension_Annuity_Income, Life_Insurance, SpouseAge, Education, Businesses_Managed, CD_Value, Support, Stock_Value, Spouse_Educ, Mortgages Owed, Credit Line, Total Savings, Total Inherit
Rows: Income, SqIncome, Age, Pension_Annuity_Income, Life_Insurance, SpouseAge, Education, Businesses_Managed, CD_Value, Support, Stock_Value, Spouse_Educ, Mortgages Owed, Credit Line, Total Savings, Total Inherit, Total Checking

Appendix F. Summary Data for Model 2: Predicting the amount of a charitable contribution

Call:
lm(formula = lnchar ~ INCOME + SqIncome + AGE + Pension_Annuity_Income + Life_Insurance + SPOUSEAGE + factor(education) + Businesses_Managed + CD_Value + Support + Stock_Value + factor(spouse_educ) + OwedOnMortgages + CreditLine + TotalSavings + TotalInherit + TotalChecking, data = PositiveCont)

Residuals:
Min 1Q Median 3Q Max

Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 6.263e e < 2e-16 ***
INCOME 2.614e e < 2e-16 ***
SqIncome e e < 2e-16 ***
AGE 1.107e e ***
Pension_Annuity_Income 4.648e e **
Life_Insurance 8.580e e e-07 ***
SPOUSEAGE 1.031e e e-07 ***
factor(education) e e
factor(education) e e **
Businesses_Managed 1.102e e e-05 ***
CD_Value 2.288e e ***
Support 4.269e e e-07 ***
Stock_Value 1.115e e **
factor(spouse_educ) e e *
factor(spouse_educ) e e *
OwedOnMortgages 2.009e e *
CreditLine 3.369e e **
TotalSavings 9.519e e *
TotalInherit 4.303e e e-05 ***
TotalChecking 2.738e e e-11 ***

Residual standard error: on 1030 degrees of freedom
Multiple R-Squared: , Adjusted R-squared:
F-statistic: on 19 and 1030 DF, p-value: < 2.2e-16

Appendix G: Summary Statistics for the Contribution Amount Model

Variable Mean Median SD Min Max
CHARITYCONT 89, , , ,070,
lnchar
INCOME 1,290, , ,670, , ,000,
AGE
Pension_Annuity_Income 20, , ,
Life_Insurance 808, , ,530, ,000,
SPOUSEAGE
Businesses_Managed
Support 15, , ,
CD_Value 81, , ,000,
OwedOnMortgages 167, , , ,500,
CreditLine 83, , ,000,
Total Savings 184, , , ,000,
TotalInherit 358, ,300, ,000,
TotalChecking 199, , ,130, ,500,000.00

Appendix H: Variance Inflation Factors

Variable    Variance Inflation Factor
INCOME
SqIncome
AGE
Pension_Annuity_Income
Life_Insurance
SPOUSEAGE
EDUCATION
Businesses_Managed
CD_Value
Support
Stock_Value
Spouse_Educ
OwedOnMortgages
CreditLine
TotalSavings
TotalInherit
TotalChecking

Appendix I: Variables Created from the Survey of Consumer Finances

OwedOnMortgages = X805 + X905+ X X1044
CreditLine = X X X1126
PBusIncome = X X X3332+ X3337+ X X X X X X3430
CurrentPensionBal = X X6467 +X X X X6487
TotalSavings = X X X X X X X3765
Assets = X4022 + X4026 + X4030 + X4018 - X4032
CashSettle = X5504 + X5507 + X5510 + X5513 + X5516 + X5519
FuturePensionAMT = X5608 + X5616 + X5624 + X5632 + X5640 + X5648
TotalInherit = X5804 + X5809 + X5814 + X5818
TotalChecking = X3506 + X3510 + X3514 + X3518 + X3522 + X3526 + X3529
TotalAnnuities = X X6580
PropertyWorth = X X X X2002+ X2012
OtherLoans = X X X X X X2940
PensionReceived = X X X X X X5434

Appendix I: Variables Considered from the SCF

Age X14
Spouse Age X19
Income (as Wage or Salary) X5702
Number of people in Household X101
Gender X8021
Education Level X5901
Expectations for Economy X301
Amt Business Income X5704
Nontax Investment Income X5706
Interest income X5708
Dividend Income X5710
Stock, Bond, Real Estate Income X5712
Rent, Trust, Royalties Income X5714
Pension, Annuities Income X5722
Child Support, Alimony Income X5718
Credit: Turned down in last 5 years X407
CC Bank: Amount of new charges X412
CC Bank: Amount still owed X413
CC Bank: Credit Limit X414
Mortgage1: # years X806
# Lines of Credit X1102
# Loans owed to Respondent X1403
# vehicles owned X2202
Foreseeable Major Expenses X3010
How Much Financial Risk will they take on X3014
Don't Save- spend more than income X3015
Save regularly X3020
# business actively managed X3105
Total Value of CDs X3721
# of Savings/Money Market accounts X3728
Value of Cash/Call Money Account X3930
Total MKT Value of Stock funds X3822
Respondent AMT Earned before Taxes X4112
Respondent: Any pensions through jobs X4135
Amt to support friends/relatives X5734
Expected to Inherit X5821
Respondent Race X6809
# people in PEU X7001
Respondent Marital Status X8023
Use Computer to Manage $ X6497
Currently Smoke? X7380
R: How old you'll live to be? X7381
How they Rate Retirement Income X3023
Total Number Mutual Funds X3820
Value of Savings Bonds X3902
Total MKT Value Bonds X6706
Total Value Mutual Funds X6704
Total MKT Value Stocks X3915
Currently Receiving Pension PMTs X4140
Face AMT of Life Insurance Policies X4003
Spouse earnings before taxes X4712
Spouse grade completed X6101
Spouse year of birth X6108
NPEU: Total Amount Owed on Mortgages X6437
NPEU: Amt in Debt X6439
Number of Properties X6688
Number of Checking Accounts X6695

Appendix J: R output after removing three unusual points

Call:
lm(formula = lnchar ~ INCOME + SqIncome + AGE + Pension_Annuity_Income + Life_Insurance + SPOUSEAGE + factor(education) + Businesses_Managed + CD_Value + Support + Stock_Value + factor(spouse_educ) + OwedOnMortgages + CreditLine + TotalSavings + TotalInherit + TotalChecking, data = PositiveCont, subset = -c(1008, 775, 424))

Residuals:
Min 1Q Median 3Q Max

Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 6.26e e < 2e-16 ***
INCOME 3.15e e < 2e-16 ***
SqIncome -5.76e e < 2e-16 ***
AGE 1.06e e **
Pension_Annuity_Income 4.34e e **
Life_Insurance 7.23e e e-05 ***
SPOUSEAGE 1.07e e e-08 ***
factor(education)2 2.74e e *
factor(education)3 6.23e e ***
Businesses_Managed 9.56e e ***
CD_Value 1.98e e **
Support 3.88e e e-06 ***
Stock_Value 1.21e e ***
factor(spouse_educ)2 2.67e e *
factor(spouse_educ)3 3.44e e
OwedOnMortgages 1.97e e *
CreditLine 3.16e e **
TotalSavings 2.05e e ***
TotalInherit 4.00e e e-05 ***
TotalChecking 3.63e e e-07 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 1.3 on 1027 degrees of freedom
Multiple R-Squared: 0.57, Adjusted R-squared:
F-statistic: 71.6 on 19 and 1027 DF, p-value: <2e-16

Appendix K: Residuals vs. Fitted Values for the Contribution Amount Model


Multiple regression - a brief introduction Multiple regression - a brief introduction Multiple regression is an extension to regular (simple) regression. Instead of one X, we now have several. Suppose, for example, that you are trying to predict

More information

Generalized Linear Models

Generalized Linear Models Generalized Linear Models Scott Creel Wednesday, September 10, 2014 This exercise extends the prior material on using the lm() function to fit an OLS regression and test hypotheses about effects on a parameter.

More information

Regression and Simulation

Regression and Simulation Regression and Simulation This is an introductory R session, so it may go slowly if you have never used R before. Do not be discouraged. A great way to learn a new language like this is to plunge right

More information

Intro to GLM Day 2: GLM and Maximum Likelihood

Intro to GLM Day 2: GLM and Maximum Likelihood Intro to GLM Day 2: GLM and Maximum Likelihood Federico Vegetti Central European University ECPR Summer School in Methods and Techniques 1 / 32 Generalized Linear Modeling 3 steps of GLM 1. Specify the

More information

Multiple Regression. Review of Regression with One Predictor

Multiple Regression. Review of Regression with One Predictor Fall Semester, 2001 Statistics 621 Lecture 4 Robert Stine 1 Preliminaries Multiple Regression Grading on this and other assignments Assignment will get placed in folder of first member of Learning Team.

More information

Stat 101 Exam 1 - Embers Important Formulas and Concepts 1

Stat 101 Exam 1 - Embers Important Formulas and Concepts 1 1 Chapter 1 1.1 Definitions Stat 101 Exam 1 - Embers Important Formulas and Concepts 1 1. Data Any collection of numbers, characters, images, or other items that provide information about something. 2.

More information

CREDIT RISK MODELING IN R. Logistic regression: introduction

CREDIT RISK MODELING IN R. Logistic regression: introduction CREDIT RISK MODELING IN R Logistic regression: introduction Final data structure > str(training_set) 'data.frame': 19394 obs. of 8 variables: $ loan_status : Factor w/ 2 levels "0","1": 1 1 1 1 1 1 1 1

More information

Dummy Variables. 1. Example: Factors Affecting Monthly Earnings

Dummy Variables. 1. Example: Factors Affecting Monthly Earnings Dummy Variables A dummy variable or binary variable is a variable that takes on a value of 0 or 1 as an indicator that the observation has some kind of characteristic. Common examples: Sex (female): FEMALE=1

More information

STAT 113 Variability

STAT 113 Variability STAT 113 Variability Colin Reimer Dawson Oberlin College September 14, 2017 1 / 48 Outline Last Time: Shape and Center Variability Boxplots and the IQR Variance and Standard Deviaton Transformations 2

More information

CHAPTER V. PRESENTATION OF RESULTS

CHAPTER V. PRESENTATION OF RESULTS CHAPTER V. PRESENTATION OF RESULTS This study is designed to develop a conceptual model that describes the relationship between personal financial wellness and worker job productivity. A part of the model

More information

Regression. Lecture Notes VII

Regression. Lecture Notes VII Regression Lecture Notes VII Statistics 112, Fall 2002 Outline Predicting based on Use of the conditional mean (the regression function) to make predictions. Prediction based on a sample. Regression line.

More information

Regression Review and Robust Regression. Slides prepared by Elizabeth Newton (MIT)

Regression Review and Robust Regression. Slides prepared by Elizabeth Newton (MIT) Regression Review and Robust Regression Slides prepared by Elizabeth Newton (MIT) S-Plus Oil City Data Frame Monthly Excess Returns of Oil City Petroleum, Inc. Stocks and the Market SUMMARY: The oilcity

More information

Homework Assignment Section 3

Homework Assignment Section 3 Homework Assignment Section 3 Tengyuan Liang Business Statistics Booth School of Business Problem 1 A company sets different prices for a particular stereo system in eight different regions of the country.

More information

STATISTICS 110/201, FALL 2017 Homework #5 Solutions Assigned Mon, November 6, Due Wed, November 15

STATISTICS 110/201, FALL 2017 Homework #5 Solutions Assigned Mon, November 6, Due Wed, November 15 STATISTICS 110/201, FALL 2017 Homework #5 Solutions Assigned Mon, November 6, Due Wed, November 15 For this assignment use the Diamonds dataset in the Stat2Data library. The dataset is used in examples

More information

9. Logit and Probit Models For Dichotomous Data

9. Logit and Probit Models For Dichotomous Data Sociology 740 John Fox Lecture Notes 9. Logit and Probit Models For Dichotomous Data Copyright 2014 by John Fox Logit and Probit Models for Dichotomous Responses 1 1. Goals: I To show how models similar

More information

ORDERED MULTINOMIAL LOGISTIC REGRESSION ANALYSIS. Pooja Shivraj Southern Methodist University

ORDERED MULTINOMIAL LOGISTIC REGRESSION ANALYSIS. Pooja Shivraj Southern Methodist University ORDERED MULTINOMIAL LOGISTIC REGRESSION ANALYSIS Pooja Shivraj Southern Methodist University KINDS OF REGRESSION ANALYSES Linear Regression Logistic Regression Dichotomous dependent variable (yes/no, died/

More information

Introduction to General and Generalized Linear Models

Introduction to General and Generalized Linear Models Introduction to General and Generalized Linear Models Generalized Linear Models - IIIb Henrik Madsen March 18, 2012 Henrik Madsen () Chapman & Hall March 18, 2012 1 / 32 Examples Overdispersion and Offset!

More information

Lecture 2 Describing Data

Lecture 2 Describing Data Lecture 2 Describing Data Thais Paiva STA 111 - Summer 2013 Term II July 2, 2013 Lecture Plan 1 Types of data 2 Describing the data with plots 3 Summary statistics for central tendency and spread 4 Histograms

More information

Negative Binomial Model for Count Data Log-linear Models for Contingency Tables - Introduction

Negative Binomial Model for Count Data Log-linear Models for Contingency Tables - Introduction Negative Binomial Model for Count Data Log-linear Models for Contingency Tables - Introduction Statistics 149 Spring 2006 Copyright 2006 by Mark E. Irwin Negative Binomial Family Example: Absenteeism from

More information

Jaime Frade Dr. Niu Interest rate modeling

Jaime Frade Dr. Niu Interest rate modeling Interest rate modeling Abstract In this paper, three models were used to forecast short term interest rates for the 3 month LIBOR. Each of the models, regression time series, GARCH, and Cox, Ingersoll,

More information

T. Rowe Price 2015 FAMILY FINANCIAL TRADE-OFFS SURVEY

T. Rowe Price 2015 FAMILY FINANCIAL TRADE-OFFS SURVEY T. Rowe Price 2015 FAMILY FINANCIAL TRADE-OFFS SURVEY Contents Perceptions About Saving for Retirement & College Education Respondent College Experience Family Financial Profile Saving for College Paying

More information

CHAPTER 2 Describing Data: Numerical

CHAPTER 2 Describing Data: Numerical CHAPTER Multiple-Choice Questions 1. A scatter plot can illustrate all of the following except: A) the median of each of the two variables B) the range of each of the two variables C) an indication of

More information

DATA SUMMARIZATION AND VISUALIZATION

DATA SUMMARIZATION AND VISUALIZATION APPENDIX DATA SUMMARIZATION AND VISUALIZATION PART 1 SUMMARIZATION 1: BUILDING BLOCKS OF DATA ANALYSIS 294 PART 2 PART 3 PART 4 VISUALIZATION: GRAPHS AND TABLES FOR SUMMARIZING AND ORGANIZING DATA 296

More information

6 Multiple Regression

6 Multiple Regression More than one X variable. 6 Multiple Regression Why? Might be interested in more than one marginal effect Omitted Variable Bias (OVB) 6.1 and 6.2 House prices and OVB Should I build a fireplace? The following

More information

Some estimates of the height of the podium

Some estimates of the height of the podium Some estimates of the height of the podium 24 36 40 40 40 41 42 44 46 48 50 53 65 98 1 5 number summary Inter quartile range (IQR) range = max min 2 1.5 IQR outlier rule 3 make a boxplot 24 36 40 40 40

More information

Statistical Models of Stocks and Bonds. Zachary D Easterling: Department of Economics. The University of Akron

Statistical Models of Stocks and Bonds. Zachary D Easterling: Department of Economics. The University of Akron Statistical Models of Stocks and Bonds Zachary D Easterling: Department of Economics The University of Akron Abstract One of the key ideas in monetary economics is that the prices of investments tend to

More information

Lecture 13: Identifying unusual observations In lecture 12, we learned how to investigate variables. Now we learn how to investigate cases.

Lecture 13: Identifying unusual observations In lecture 12, we learned how to investigate variables. Now we learn how to investigate cases. Lecture 13: Identifying unusual observations In lecture 12, we learned how to investigate variables. Now we learn how to investigate cases. Goal: Find unusual cases that might be mistakes, or that might

More information

Monetary Economics Risk and Return, Part 2. Gerald P. Dwyer Fall 2015

Monetary Economics Risk and Return, Part 2. Gerald P. Dwyer Fall 2015 Monetary Economics Risk and Return, Part 2 Gerald P. Dwyer Fall 2015 Reading Malkiel, Part 2, Part 3 Malkiel, Part 3 Outline Returns and risk Overall market risk reduced over longer periods Individual

More information

Chapter 18: The Correlational Procedures

Chapter 18: The Correlational Procedures Introduction: In this chapter we are going to tackle about two kinds of relationship, positive relationship and negative relationship. Positive Relationship Let's say we have two values, votes and campaign

More information

Booth School of Business, University of Chicago Business 41202, Spring Quarter 2014, Mr. Ruey S. Tsay. Solutions to Midterm

Booth School of Business, University of Chicago Business 41202, Spring Quarter 2014, Mr. Ruey S. Tsay. Solutions to Midterm Booth School of Business, University of Chicago Business 41202, Spring Quarter 2014, Mr. Ruey S. Tsay Solutions to Midterm Problem A: (30 pts) Answer briefly the following questions. Each question has

More information

Maximum Likelihood Estimation Richard Williams, University of Notre Dame, https://www3.nd.edu/~rwilliam/ Last revised January 13, 2018

Maximum Likelihood Estimation Richard Williams, University of Notre Dame, https://www3.nd.edu/~rwilliam/ Last revised January 13, 2018 Maximum Likelihood Estimation Richard Williams, University of otre Dame, https://www3.nd.edu/~rwilliam/ Last revised January 3, 208 [This handout draws very heavily from Regression Models for Categorical

More information

Final Exam Suggested Solutions

Final Exam Suggested Solutions University of Washington Fall 003 Department of Economics Eric Zivot Economics 483 Final Exam Suggested Solutions This is a closed book and closed note exam. However, you are allowed one page of handwritten

More information

2 Exploring Univariate Data

2 Exploring Univariate Data 2 Exploring Univariate Data A good picture is worth more than a thousand words! Having the data collected we examine them to get a feel for they main messages and any surprising features, before attempting

More information

CHAPTER 4 DATA ANALYSIS Data Hypothesis

CHAPTER 4 DATA ANALYSIS Data Hypothesis CHAPTER 4 DATA ANALYSIS 4.1. Data Hypothesis The hypothesis for each independent variable to express our expectations about the characteristic of each independent variable and the pay back performance

More information

Step 1: Load the appropriate R package. Step 2: Fit a separate mixed model for each independence claim in the basis set.

Step 1: Load the appropriate R package. Step 2: Fit a separate mixed model for each independence claim in the basis set. Step 1: Load the appropriate R package. You will need two libraries: nlme and lme4. Step 2: Fit a separate mixed model for each independence claim in the basis set. For instance, in Table 2 the first basis

More information

Study 2: data analysis. Example analysis using R

Study 2: data analysis. Example analysis using R Study 2: data analysis Example analysis using R Steps for data analysis Install software on your computer or locate computer with software (e.g., R, systat, SPSS) Prepare data for analysis Subjects (rows)

More information

Graduate School of Business, University of Chicago Business 41202, Spring Quarter 2007, Mr. Ruey S. Tsay. Midterm

Graduate School of Business, University of Chicago Business 41202, Spring Quarter 2007, Mr. Ruey S. Tsay. Midterm Graduate School of Business, University of Chicago Business 41202, Spring Quarter 2007, Mr. Ruey S. Tsay Midterm GSB Honor Code: I pledge my honor that I have not violated the Honor Code during this examination.

More information

COMMUNITY ADVANTAGE PANEL SURVEY: DATA COLLECTION UPDATE AND ANALYSIS OF PANEL ATTRITION

COMMUNITY ADVANTAGE PANEL SURVEY: DATA COLLECTION UPDATE AND ANALYSIS OF PANEL ATTRITION COMMUNITY ADVANTAGE PANEL SURVEY: DATA COLLECTION UPDATE AND ANALYSIS OF PANEL ATTRITION Technical Report: March 2011 By Sarah Riley HongYu Ru Mark Lindblad Roberto Quercia Center for Community Capital

More information

Estimation of a credit scoring model for lenders company

Estimation of a credit scoring model for lenders company Estimation of a credit scoring model for lenders company Felipe Alonso Arias-Arbeláez Juan Sebastián Bravo-Valbuena Francisco Iván Zuluaga-Díaz November 22, 2015 Abstract Historically it has seen that

More information

SOLUTIONS TO THE LAB 1 ASSIGNMENT

SOLUTIONS TO THE LAB 1 ASSIGNMENT SOLUTIONS TO THE LAB 1 ASSIGNMENT Question 1 Excel produces the following histogram of pull strengths for the 100 resistors: 2 20 Histogram of Pull Strengths (lb) Frequency 1 10 0 9 61 63 6 67 69 71 73

More information

Cognitive Constraints on Valuing Annuities. Jeffrey R. Brown Arie Kapteyn Erzo F.P. Luttmer Olivia S. Mitchell

Cognitive Constraints on Valuing Annuities. Jeffrey R. Brown Arie Kapteyn Erzo F.P. Luttmer Olivia S. Mitchell Cognitive Constraints on Valuing Annuities Jeffrey R. Brown Arie Kapteyn Erzo F.P. Luttmer Olivia S. Mitchell Under a wide range of assumptions people should annuitize to guard against length-of-life uncertainty

More information

MODEL SELECTION CRITERIA IN R:

MODEL SELECTION CRITERIA IN R: 1. R 2 statistics We may use MODEL SELECTION CRITERIA IN R R 2 = SS R SS T = 1 SS Res SS T or R 2 Adj = 1 SS Res/(n p) SS T /(n 1) = 1 ( ) n 1 (1 R 2 ). n p where p is the total number of parameters. R

More information

Multiple linear regression

Multiple linear regression Multiple linear regression Business Statistics 41000 Spring 2017 1 Topics 1. Including multiple predictors 2. Controlling for confounders 3. Transformations, interactions, dummy variables OpenIntro 8.1,

More information

COMMUNITY ADVANTAGE PANEL SURVEY: DATA COLLECTION UPDATE AND ANALYSIS OF PANEL ATTRITION

COMMUNITY ADVANTAGE PANEL SURVEY: DATA COLLECTION UPDATE AND ANALYSIS OF PANEL ATTRITION COMMUNITY ADVANTAGE PANEL SURVEY: DATA COLLECTION UPDATE AND ANALYSIS OF PANEL ATTRITION Technical Report: February 2012 By Sarah Riley HongYu Ru Mark Lindblad Roberto Quercia Center for Community Capital

More information

STATISTICAL DISTRIBUTIONS AND THE CALCULATOR

STATISTICAL DISTRIBUTIONS AND THE CALCULATOR STATISTICAL DISTRIBUTIONS AND THE CALCULATOR 1. Basic data sets a. Measures of Center - Mean ( ): average of all values. Characteristic: non-resistant is affected by skew and outliers. - Median: Either

More information

Influence of Personal Factors on Health Insurance Purchase Decision

Influence of Personal Factors on Health Insurance Purchase Decision Influence of Personal Factors on Health Insurance Purchase Decision INFLUENCE OF PERSONAL FACTORS ON HEALTH INSURANCE PURCHASE DECISION The decision in health insurance purchase include decisions about

More information

Copyright 2011 Pearson Education, Inc. Publishing as Addison-Wesley.

Copyright 2011 Pearson Education, Inc. Publishing as Addison-Wesley. Appendix: Statistics in Action Part I Financial Time Series 1. These data show the effects of stock splits. If you investigate further, you ll find that most of these splits (such as in May 1970) are 3-for-1

More information

COMMUNITY ADVANTAGE PANEL SURVEY: DATA COLLECTION UPDATE AND ANALYSIS OF PANEL ATTRITION

COMMUNITY ADVANTAGE PANEL SURVEY: DATA COLLECTION UPDATE AND ANALYSIS OF PANEL ATTRITION COMMUNITY ADVANTAGE PANEL SURVEY: DATA COLLECTION UPDATE AND ANALYSIS OF PANEL ATTRITION Technical Report: February 2013 By Sarah Riley Qing Feng Mark Lindblad Roberto Quercia Center for Community Capital

More information

Random Effects ANOVA

Random Effects ANOVA Random Effects ANOVA Grant B. Morgan Baylor University This post contains code for conducting a random effects ANOVA. Make sure the following packages are installed: foreign, lme4, lsr, lattice. library(foreign)

More information

boxcox() returns the values of α and their loglikelihoods,

boxcox() returns the values of α and their loglikelihoods, Solutions to Selected Computer Lab Problems and Exercises in Chapter 11 of Statistics and Data Analysis for Financial Engineering, 2nd ed. by David Ruppert and David S. Matteson c 2016 David Ruppert and

More information

Booth School of Business, University of Chicago Business 41202, Spring Quarter 2016, Mr. Ruey S. Tsay. Solutions to Midterm

Booth School of Business, University of Chicago Business 41202, Spring Quarter 2016, Mr. Ruey S. Tsay. Solutions to Midterm Booth School of Business, University of Chicago Business 41202, Spring Quarter 2016, Mr. Ruey S. Tsay Solutions to Midterm Problem A: (30 pts) Answer briefly the following questions. Each question has

More information

starting on 5/1/1953 up until 2/1/2017.

starting on 5/1/1953 up until 2/1/2017. An Actuary s Guide to Financial Applications: Examples with EViews By William Bourgeois An actuary is a business professional who uses statistics to determine and analyze risks for companies. In this guide,

More information

To be two or not be two, that is a LOGISTIC question

To be two or not be two, that is a LOGISTIC question MWSUG 2016 - Paper AA18 To be two or not be two, that is a LOGISTIC question Robert G. Downer, Grand Valley State University, Allendale, MI ABSTRACT A binary response is very common in logistic regression

More information

Percentiles, STATA, Box Plots, Standardizing, and Other Transformations

Percentiles, STATA, Box Plots, Standardizing, and Other Transformations Percentiles, STATA, Box Plots, Standardizing, and Other Transformations Lecture 3 Reading: Sections 5.7 54 Remember, when you finish a chapter make sure not to miss the last couple of boxes: What Can Go

More information

Cost of Capital (represents risk)

Cost of Capital (represents risk) Cost of Capital (represents risk) Cost of Equity Capital - From the shareholders perspective, the expected return is the cost of equity capital E(R i ) is the return needed to make the investment = the

More information

COMPREHENSIVE WRITTEN EXAMINATION, PAPER III FRIDAY AUGUST 18, 2006, 9:00 A.M. 1:00 P.M. STATISTICS 174 QUESTIONS

COMPREHENSIVE WRITTEN EXAMINATION, PAPER III FRIDAY AUGUST 18, 2006, 9:00 A.M. 1:00 P.M. STATISTICS 174 QUESTIONS COMPREHENSIVE WRITTEN EXAMINATION, PAPER III FRIDAY AUGUST 18, 2006, 9:00 A.M. 1:00 P.M. STATISTICS 174 QUESTIONS Answer all parts. Closed book, calculators allowed. It is important to show all working,

More information

Chapter 14. Descriptive Methods in Regression and Correlation. Copyright 2016, 2012, 2008 Pearson Education, Inc. Chapter 14, Slide 1

Chapter 14. Descriptive Methods in Regression and Correlation. Copyright 2016, 2012, 2008 Pearson Education, Inc. Chapter 14, Slide 1 Chapter 14 Descriptive Methods in Regression and Correlation Copyright 2016, 2012, 2008 Pearson Education, Inc. Chapter 14, Slide 1 Section 14.1 Linear Equations with One Independent Variable Copyright

More information

Both the quizzes and exams are closed book. However, For quizzes: Formulas will be provided with quiz papers if there is any need.

Both the quizzes and exams are closed book. However, For quizzes: Formulas will be provided with quiz papers if there is any need. Both the quizzes and exams are closed book. However, For quizzes: Formulas will be provided with quiz papers if there is any need. For exams (MD1, MD2, and Final): You may bring one 8.5 by 11 sheet of

More information

Random variables The binomial distribution The normal distribution Sampling distributions. Distributions. Patrick Breheny.

Random variables The binomial distribution The normal distribution Sampling distributions. Distributions. Patrick Breheny. Distributions September 17 Random variables Anything that can be measured or categorized is called a variable If the value that a variable takes on is subject to variability, then it the variable is a

More information

Fannie Mae Own-Rent Analysis Theme 1: Persistence of the Homeownership Aspiration

Fannie Mae Own-Rent Analysis Theme 1: Persistence of the Homeownership Aspiration Fannie Mae Own-Rent Analysis Theme 1: Persistence of the Homeownership Aspiration Copyright 2010 by Fannie Mae Release Date: December 9, 2010 Overview of Fannie Mae Own-Rent Analysis Objective Fannie Mae

More information

Lecture Note: Analysis of Financial Time Series Spring 2017, Ruey S. Tsay

Lecture Note: Analysis of Financial Time Series Spring 2017, Ruey S. Tsay Lecture Note: Analysis of Financial Time Series Spring 2017, Ruey S. Tsay Seasonal Time Series: TS with periodic patterns and useful in predicting quarterly earnings pricing weather-related derivatives

More information

The following content is provided under a Creative Commons license. Your support

The following content is provided under a Creative Commons license. Your support MITOCW Recitation 6 The following content is provided under a Creative Commons license. Your support will help MIT OpenCourseWare continue to offer high quality educational resources for free. To make

More information

SFSU FIN822 Project 1

SFSU FIN822 Project 1 SFSU FIN822 Project 1 This project can be done in a team of up to 3 people. Your project report must be accompanied by printouts of programming outputs. You could use any software to solve the problems.

More information

Simple Fuzzy Score for Russian Public Companies Risk of Default

Simple Fuzzy Score for Russian Public Companies Risk of Default Simple Fuzzy Score for Russian Public Companies Risk of Default By Sergey Ivliev April 2,2. Introduction Current economy crisis of 28 29 has resulted in severe credit crunch and significant NPL rise in

More information

And The Winner Is? How to Pick a Better Model

And The Winner Is? How to Pick a Better Model And The Winner Is? How to Pick a Better Model Part 2 Goodness-of-Fit and Internal Stability Dan Tevet, FCAS, MAAA Goodness-of-Fit Trying to answer question: How well does our model fit the data? Can be

More information

Chapter 3. Numerical Descriptive Measures. Copyright 2016 Pearson Education, Ltd. Chapter 3, Slide 1

Chapter 3. Numerical Descriptive Measures. Copyright 2016 Pearson Education, Ltd. Chapter 3, Slide 1 Chapter 3 Numerical Descriptive Measures Copyright 2016 Pearson Education, Ltd. Chapter 3, Slide 1 Objectives In this chapter, you learn to: Describe the properties of central tendency, variation, and

More information

Credit Risk Modelling

Credit Risk Modelling Credit Risk Modelling Tiziano Bellini Università di Bologna December 13, 2013 Tiziano Bellini (Università di Bologna) Credit Risk Modelling December 13, 2013 1 / 55 Outline Framework Credit Risk Modelling

More information

Stat3011: Solution of Midterm Exam One

Stat3011: Solution of Midterm Exam One 1 Stat3011: Solution of Midterm Exam One Fall/2003, Tiefeng Jiang Name: Problem 1 (30 points). Choose one appropriate answer in each of the following questions. 1. (B ) The mean age of five people in a

More information

Probability & Statistics Modular Learning Exercises

Probability & Statistics Modular Learning Exercises Probability & Statistics Modular Learning Exercises About The Actuarial Foundation The Actuarial Foundation, a 501(c)(3) nonprofit organization, develops, funds and executes education, scholarship and

More information

Jamie Wagner Ph.D. Student University of Nebraska Lincoln

Jamie Wagner Ph.D. Student University of Nebraska Lincoln An Empirical Analysis Linking a Person s Financial Risk Tolerance and Financial Literacy to Financial Behaviors Jamie Wagner Ph.D. Student University of Nebraska Lincoln Abstract Financial risk aversion

More information

Your Retirement Lifestyle Workbook

Your Retirement Lifestyle Workbook Your Retirement Lifestyle Workbook Purpose of This Workbook and Helpful Checklist This lifestyle workbook is designed to help you collect and organize the information needed to develop your Retirement

More information

Impact of Household Income on Poverty Levels

Impact of Household Income on Poverty Levels Impact of Household Income on Poverty Levels ECON 3161 Econometrics, Fall 2015 Prof. Shatakshee Dhongde Group 8 Annie Strothmann Anne Marsh Samuel Brown Abstract: The relationship between poverty and household

More information

Window Width Selection for L 2 Adjusted Quantile Regression

Window Width Selection for L 2 Adjusted Quantile Regression Window Width Selection for L 2 Adjusted Quantile Regression Yoonsuh Jung, The Ohio State University Steven N. MacEachern, The Ohio State University Yoonkyung Lee, The Ohio State University Technical Report

More information

Maximum Likelihood Estimation Richard Williams, University of Notre Dame, https://www3.nd.edu/~rwilliam/ Last revised January 10, 2017

Maximum Likelihood Estimation Richard Williams, University of Notre Dame, https://www3.nd.edu/~rwilliam/ Last revised January 10, 2017 Maximum Likelihood Estimation Richard Williams, University of otre Dame, https://www3.nd.edu/~rwilliam/ Last revised January 0, 207 [This handout draws very heavily from Regression Models for Categorical

More information

Diploma in Business Administration Part 2. Quantitative Methods. Examiner s Suggested Answers

Diploma in Business Administration Part 2. Quantitative Methods. Examiner s Suggested Answers Cumulative frequency Diploma in Business Administration Part Quantitative Methods Examiner s Suggested Answers Question 1 Cumulative Frequency Curve 1 9 8 7 6 5 4 3 1 5 1 15 5 3 35 4 45 Weeks 1 (b) x f

More information

Economics 424/Applied Mathematics 540. Final Exam Solutions

Economics 424/Applied Mathematics 540. Final Exam Solutions University of Washington Summer 01 Department of Economics Eric Zivot Economics 44/Applied Mathematics 540 Final Exam Solutions I. Matrix Algebra and Portfolio Math (30 points, 5 points each) Let R i denote

More information

SEX DISCRIMINATION PROBLEM

SEX DISCRIMINATION PROBLEM SEX DISCRIMINATION PROBLEM 5. Displaying Relationships between Variables In this section we will use scatterplots to examine the relationship between the dependent variable (starting salary) and each of

More information

Final Exam, section 1. Thursday, May hour, 30 minutes

Final Exam, section 1. Thursday, May hour, 30 minutes San Francisco State University Michael Bar ECON 312 Spring 2018 Final Exam, section 1 Thursday, May 17 1 hour, 30 minutes Name: Instructions 1. This is closed book, closed notes exam. 2. You can use one

More information

Data screening, transformations: MRC05

Data screening, transformations: MRC05 Dale Berger Data screening, transformations: MRC05 This is a demonstration of data screening and transformations for a regression analysis. Our interest is in predicting current salary from education level

More information

Booth School of Business, University of Chicago Business 41202, Spring Quarter 2012, Mr. Ruey S. Tsay. Solutions to Midterm

Booth School of Business, University of Chicago Business 41202, Spring Quarter 2012, Mr. Ruey S. Tsay. Solutions to Midterm Booth School of Business, University of Chicago Business 41202, Spring Quarter 2012, Mr. Ruey S. Tsay Solutions to Midterm Problem A: (34 pts) Answer briefly the following questions. Each question has

More information

ST 350 Lecture Worksheet #33 Reiland

ST 350 Lecture Worksheet #33 Reiland ST 350 Lecture Worksheet #33 Reiland SOLUTIONS Name Lotteries: Good Idea or Scam? Lotteries have become important sources of revenue for many state governments. However, people have criticized lotteries

More information

Topic 8: Model Diagnostics

Topic 8: Model Diagnostics Topic 8: Model Diagnostics Outline Diagnostics to check model assumptions Diagnostics concerning X Diagnostics using the residuals Diagnostics and remedial measures Diagnostics: look at the data to diagnose

More information

Survey Sampling, Fall, 2006, Columbia University Homework assignments (2 Sept 2006)

Survey Sampling, Fall, 2006, Columbia University Homework assignments (2 Sept 2006) Survey Sampling, Fall, 2006, Columbia University Homework assignments (2 Sept 2006) Assignment 1, due lecture 3 at the beginning of class 1. Lohr 1.1 2. Lohr 1.2 3. Lohr 1.3 4. Download data from the CBS

More information

The Brattle Group 1 st Floor 198 High Holborn London WC1V 7BD

The Brattle Group 1 st Floor 198 High Holborn London WC1V 7BD UPDATED ESTIMATE OF BT S EQUITY BETA NOVEMBER 4TH 2008 The Brattle Group 1 st Floor 198 High Holborn London WC1V 7BD office@brattle.co.uk Contents 1 Introduction and Summary of Findings... 3 2 Statistical

More information

The SAS System 11:03 Monday, November 11,

The SAS System 11:03 Monday, November 11, The SAS System 11:3 Monday, November 11, 213 1 The CONTENTS Procedure Data Set Name BIO.AUTO_PREMIUMS Observations 5 Member Type DATA Variables 3 Engine V9 Indexes Created Monday, November 11, 213 11:4:19

More information

Generalized Multilevel Regression Example for a Binary Outcome

Generalized Multilevel Regression Example for a Binary Outcome Psy 510/610 Multilevel Regression, Spring 2017 1 HLM Generalized Multilevel Regression Example for a Binary Outcome Specifications for this Bernoulli HLM2 run Problem Title: no title The data source for

More information

Section3-2: Measures of Center

Section3-2: Measures of Center Chapter 3 Section3-: Measures of Center Notation Suppose we are making a series of observations, n of them, to be exact. Then we write x 1, x, x 3,K, x n as the values we observe. Thus n is the total number

More information

Logit Models for Binary Data

Logit Models for Binary Data Chapter 3 Logit Models for Binary Data We now turn our attention to regression models for dichotomous data, including logistic regression and probit analysis These models are appropriate when the response

More information

Benchmarking Credit ratings

Benchmarking Credit ratings Benchmarking Credit ratings September 2013 Project team: Tom Hird Annabel Wilton CEG Asia Pacific 234 George St Sydney NSW 2000 Australia T +61 2 9881 5750 www.ceg-ap.com Table of Contents Executive summary...

More information

Stat 328, Summer 2005

Stat 328, Summer 2005 Stat 328, Summer 2005 Exam #2, 6/18/05 Name (print) UnivID I have neither given nor received any unauthorized aid in completing this exam. Signed Answer each question completely showing your work where

More information

1. (9; 3ea) The table lists the survey results of 100 non-senior students. Math major Art major Biology major

1. (9; 3ea) The table lists the survey results of 100 non-senior students. Math major Art major Biology major Math 54 Test #2(Chapter 4, 5, 6, 7) Name: Show all necessary work for full credit. You may use graphing calculators for your calculation, but you must show all detail and use the proper notations. Total

More information