The SAS System 11:03 Monday, November 11,

Transcription:

The SAS System                          11:03 Monday, November 11, 2013

The CONTENTS Procedure

Data Set Name           BIO.AUTO_PREMIUMS
Member Type             DATA
Engine                  V9
Created                 Monday, November 11, 2013 11:4:19 AM
Last Modified           Monday, November 11, 2013 11:4:19 AM
Protection
Data Set Type
Label
Data Representation     WINDOWS_64
Encoding                wlatin1 Western (Windows)
Observations            50
Variables               3
Indexes                 0
Observation Length      24
Deleted Observations    0
Compressed              NO
Sorted                  NO

Engine/Host Dependent Information

Data Set Page Size             4096
Number of Data Set Pages       1
First Data Page                1
Max Obs per Page               168
Obs in First Data Page         50
Number of Data Set Repairs     0
Filename                       H:\_Amy Docs\UF\ Fall 213\ 65-652\SAS Library\auto_premiums.sas7bdat
Release Created                9.0301M0
Host Created                   X64_7PRO

Alphabetic List of Variables and Attributes

#    Variable      Type    Len
1    Experience    Num       8
2    Gender        Num       8
3    Premium       Num       8

[Figure: scatter plot of Premium against Experience, with a Gender legend.]

Analysis for Males

[Figure: scatter plot of Premium against Experience, with a Gender legend.]

Analysis for Males

The MEANS Procedure

                        Lower               Upper                                 Lower 95%     Upper 95%
Variable     Minimum    Quartile   Median   Quartile   Maximum    Mean   Std Dev  CL for Mean   CL for Mean
Experience      1.00       6.00    11.00      15.00     20.00    10.90     5.79       8.69        13.10
Premium        45.00      62.00    68.00      80.00     92.00    69.03    11.94      64.49        73.58
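The confidence limits for the mean in the MEANS output follow from mean ± t × s/√n. The following is a minimal Python sketch for the male Premium row, assuming the unrounded mean and standard deviation reported by the CORR output further down (69.03448 and 11.93878, n = 29). The critical value `T_CRIT_28` is an assumption copied from a t table (two-sided 95%, 28 degrees of freedom), not something SAS prints, and the function name is illustrative only.

```python
import math


def mean_confidence_limits(mean, std_dev, n, t_crit):
    """Return (lower, upper) limits for the mean: mean +/- t * s / sqrt(n)."""
    half_width = t_crit * std_dev / math.sqrt(n)
    return mean - half_width, mean + half_width


# Assumption: two-sided 95% critical value of Student's t with 28 df,
# taken from a t table rather than computed here.
T_CRIT_28 = 2.048407

lower, upper = mean_confidence_limits(69.03448, 11.93878, 29, T_CRIT_28)
print(f"{lower:.2f} {upper:.2f}")
```

Rounded to two decimals, the result should agree with the 64.49 and 73.58 limits shown for Premium.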

[Figure: histogram of Experience (vertical axis: Percent) with Normal and Kernel density overlays.]

[Figure: histogram of Premium (vertical axis: Percent) with Normal and Kernel density overlays.]

[Figure: one-variable plot of Experience (vertical axis: Experience).]

[Figure: one-variable plot of Premium (vertical axis: Premium).]

The UNIVARIATE Procedure

[Figure: Q-Q Plot for Experience against Normal Quantiles. Normal Line: Mu=10.897, Sigma=5.79.]

[Figure: Q-Q Plot for Premium against Normal Quantiles. Normal Line: Mu=69.034, Sigma=11.939.]

Analysis for Males

The CORR Procedure

2 Variables: Experience Premium

Simple Statistics

Variable       N        Mean     Std Dev      Median     Minimum     Maximum
Experience    29    10.89655     5.79005    11.00000     1.00000    20.00000
Premium       29    69.03448    11.93878    68.00000    45.00000    92.00000

Pearson Correlation Coefficients, N = 29
Prob > |r| under H0: Rho=0

              Experience     Premium
Experience       1.00000    -0.65558
                              0.0001
Premium         -0.65558     1.00000
                  0.0001

Spearman Correlation Coefficients, N = 29
Prob > |r| under H0: Rho=0

              Experience     Premium
Experience       1.00000    -0.61285
                              0.0004
Premium         -0.61285     1.00000
                  0.0004

Pearson Correlation Statistics (Fisher's z Transformation)

            With            Sample                     Bias        Correlation                            p Value for
Variable    Variable   N    Correlation   Fisher's z   Adjustment  Estimate     95% Confidence Limits     H0:Rho=0
Experience  Premium    29   -0.65558      -0.78502     -0.01171    -0.64885     -0.820288   -0.370443     <.0001

Spearman Correlation Statistics (Fisher's z Transformation)

            With            Sample                     Bias        Correlation                            p Value for
Variable    Variable   N    Correlation   Fisher's z   Adjustment  Estimate     95% Confidence Limits     H0:Rho=0
Experience  Premium    29   -0.61285      -0.71347     -0.01094    -0.60597     -0.795746   -0.307829     0.0003
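The Fisher's z columns can be reproduced from the sample correlation and N alone. The sketch below is a minimal Python illustration, assuming the bias adjustment r / (2(n - 1)) and a normal-theory interval of half-width z(0.975) / √(n - 3) on the z scale. The function name and the dictionary keys are illustrative, not SAS names.

```python
import math
from statistics import NormalDist


def fisher_z_interval(r, n, level=0.95):
    """Bias-adjusted Fisher z confidence interval for a correlation."""
    z = math.atanh(r)                      # Fisher's z transformation
    bias = r / (2 * (n - 1))               # bias adjustment
    center = z - bias                      # adjusted value on the z scale
    half = NormalDist().inv_cdf(0.5 + level / 2) / math.sqrt(n - 3)
    return {
        "z": z,
        "bias": bias,
        "estimate": math.tanh(center),     # back-transformed estimate
        "lower": math.tanh(center - half),
        "upper": math.tanh(center + half),
    }


# Pearson correlation for males: r = -0.65558, N = 29
males = fisher_z_interval(-0.65558, 29)
for name, value in males.items():
    print(f"{name:>8}: {value:.6f}")
```

Under these assumptions the values should land close to the Pearson row above (Fisher's z about -0.785, limits about -0.820 and -0.370); small differences in the last digit can come from the rounded input correlation.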

Analysis for Males

Number of Observations Read    29
Number of Observations Used    29

Analysis of Variance

                            Sum of          Mean
Source             DF      Squares        Square    F Value    Pr > F
Model               1   1715.26175    1715.26175      20.35    0.0001
Error              27   2275.70377      84.28532
Corrected Total    28   3990.96552

Root MSE           9.18070    R-Square    0.4298
Dependent Mean    69.03448    Adj R-Sq    0.4087
Coeff Var         13.29872

Parameter Estimates

                   Parameter    Standard
Variable     DF     Estimate       Error    t Value    Pr > |t|    95% Confidence Limits
Intercept     1     83.76416     3.68343      22.74      <.0001    76.20639     91.32193
Experience    1     -1.35177     0.29965      -4.51      0.0001    -1.96661     -0.73694
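The summary statistics in the regression output are all functions of the two sums of squares and the sample size. As a rough check, the Python sketch below recomputes them for the male fit, taking the Model and Error sums of squares as 1715.26175 and 2275.70377 with n = 29 and one predictor. The function name and dictionary keys are illustrative only.

```python
import math


def anova_summary(ss_model, ss_error, n, p=1):
    """Derive ANOVA summary statistics for a regression with p predictors."""
    df_error = n - p - 1
    ss_total = ss_model + ss_error
    mse = ss_error / df_error
    r_square = ss_model / ss_total
    return {
        "ss_total": ss_total,
        "mse": mse,
        "f_value": (ss_model / p) / mse,
        "root_mse": math.sqrt(mse),
        "r_square": r_square,
        # Adjusted R-square penalizes for the number of parameters.
        "adj_r_square": 1 - (1 - r_square) * (n - 1) / df_error,
    }


males = anova_summary(1715.26175, 2275.70377, n=29)
for name, value in males.items():
    print(f"{name:>12}: {value:.5f}")
```

The coefficient of variation reported by SAS is then 100 × Root MSE / Dependent Mean, which with a dependent mean of 69.03448 gives about 13.30.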

[Figure: Distribution of Residuals for Premium. Histogram (vertical axis: Percent) with Normal and Kernel overlays.]

[Figure: Residual by Predicted for Premium.]

[Figure: RStudent by Predicted for Premium.]

[Figure: Observed by Predicted for Premium.]

[Figure: Cook's D for Premium, by Observation.]

[Figure: Outlier and Leverage Diagnostics for Premium. RStudent against Leverage; legend: Outlier, Leverage, Outlier and Leverage.]

[Figure: Q-Q Plot of Residuals for Premium. Residual against Quantile.]

[Figure: Residual-Fit Spread Plot for Premium. Panels: Fit-Mean and Residual; horizontal axis: Proportion Less.]

[Figure: Residuals for Premium against Experience.]

[Figure: Fit Plot for Premium against Experience, showing Fit, 95% Confidence Limits, and 95% Prediction Limits. Observations 29, Parameters 2, Error DF 27, MSE 84.285, R-Square 0.4298, Adj R-Square 0.4087.]

Analysis for Females

[Figure: scatter plot of Premium against Experience, with a Gender legend.]

Analysis for Females

The MEANS Procedure

                        Lower               Upper                                 Lower 95%     Upper 95%
Variable     Minimum    Quartile   Median   Quartile   Maximum    Mean   Std Dev  CL for Mean   CL for Mean
Experience      1.00       5.00     9.00      12.00     16.00     8.43     4.78       6.25        10.60
Premium        36.00      45.00    50.00      60.00     88.00    54.62    15.44      47.59        61.65

[Figure: histogram of Experience (vertical axis: Percent) with Normal and Kernel density overlays.]

[Figure: histogram of Premium (vertical axis: Percent) with Normal and Kernel density overlays.]

[Figure: one-variable plot of Experience (vertical axis: Experience).]

[Figure: one-variable plot of Premium (vertical axis: Premium).]

The UNIVARIATE Procedure

[Figure: Q-Q Plot for Experience against Normal Quantiles. Normal Line: Mu=8.4286, Sigma=4.7809.]

[Figure: Q-Q Plot for Premium against Normal Quantiles. Normal Line: Mu=54.619, Sigma=15.439.]

Analysis for Females

The CORR Procedure

2 Variables: Experience Premium

Simple Statistics

Variable       N        Mean     Std Dev      Median     Minimum     Maximum
Experience    21     8.42857     4.78091     9.00000     1.00000    16.00000
Premium       21    54.61905    15.43851    50.00000    36.00000    88.00000

Pearson Correlation Coefficients, N = 21
Prob > |r| under H0: Rho=0

              Experience     Premium
Experience       1.00000    -0.87696
                              <.0001
Premium         -0.87696     1.00000
                  <.0001

Spearman Correlation Coefficients, N = 21
Prob > |r| under H0: Rho=0

              Experience     Premium
Experience       1.00000    -0.86749
                              <.0001
Premium         -0.86749     1.00000
                  <.0001

Pearson Correlation Statistics (Fisher's z Transformation)

            With            Sample                     Bias        Correlation                            p Value for
Variable    Variable   N    Correlation   Fisher's z   Adjustment  Estimate     95% Confidence Limits     H0:Rho=0
Experience  Premium    21   -0.87696      -1.36245     -0.02192    -0.87180     -0.947064   -0.705695     <.0001

Spearman Correlation Statistics (Fisher's z Transformation)

            With            Sample                     Bias        Correlation                            p Value for
Variable    Variable   N    Correlation   Fisher's z   Adjustment  Estimate     95% Confidence Limits     H0:Rho=0
Experience  Premium    21   -0.86749      -1.32286     -0.02169    -0.86202     -0.942853   -0.685388     <.0001

Analysis for Females

Number of Observations Read    21
Number of Observations Used    21

Analysis of Variance

                            Sum of          Mean
Source             DF      Squares        Square    F Value    Pr > F
Model               1   3666.06446    3666.06446      63.27    <.0001
Error              19   1100.88792      57.94147
Corrected Total    20   4766.95238

Root MSE           7.61193    R-Square    0.7691
Dependent Mean    54.61905    Adj R-Sq    0.7569
Coeff Var         13.93640

Parameter Estimates

                   Parameter    Standard
Variable     DF     Estimate       Error    t Value    Pr > |t|    95% Confidence Limits
Intercept     1     78.48771     3.42977      22.88      <.0001    71.30912     85.66630
Experience    1     -2.83187     0.35602      -7.95      <.0001    -3.57702     -2.08673
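The two Parameter Estimates tables define one fitted line per group, Premium = Intercept + slope × Experience. The Python sketch below evaluates both lines at a chosen experience level. The coefficients are those read from the male and female regression output (intercepts 83.76416 and 78.48771, slopes -1.35177 and -2.83187); the `FITS` dictionary and `predict_premium` function are hypothetical names introduced for illustration.

```python
# Coefficients as read from the two Parameter Estimates tables.
FITS = {
    "males": {"intercept": 83.76416, "slope": -1.35177},
    "females": {"intercept": 78.48771, "slope": -2.83187},
}


def predict_premium(group, experience):
    """Predicted Premium from the fitted simple linear regression for a group."""
    fit = FITS[group]
    return fit["intercept"] + fit["slope"] * experience


for group in FITS:
    print(group, round(predict_premium(group, 10), 2))
```

Predictions are only meaningful within the observed range of Experience for each group (1 to 20 for males, 1 to 16 for females).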

[Figure: Distribution of Residuals for Premium. Histogram (vertical axis: Percent) with Normal and Kernel overlays.]

[Figure: Residual by Predicted for Premium.]

[Figure: RStudent by Predicted for Premium.]

[Figure: Observed by Predicted for Premium.]

[Figure: Cook's D for Premium, by Observation.]

[Figure: Outlier and Leverage Diagnostics for Premium. RStudent against Leverage; legend: Outlier, Leverage, Outlier and Leverage.]

[Figure: Q-Q Plot of Residuals for Premium. Residual against Quantile.]

[Figure: Residual-Fit Spread Plot for Premium. Panels: Fit-Mean and Residual; horizontal axis: Proportion Less.]

[Figure: Residuals for Premium against Experience.]

[Figure: Fit Plot for Premium against Experience, showing Fit, 95% Confidence Limits, and 95% Prediction Limits. Observations 21, Parameters 2, Error DF 19, MSE 57.941, R-Square 0.7691, Adj R-Square 0.7569.]