Regression and Simulation

Size: px
Start display at page:

Download "Regression and Simulation"

Transcription

1 Regression and Simulation This is an introductory R session, so it may go slowly if you have never used R before. Do not be discouraged. A great way to learn a new language like this is to plunge right in. We will simulate some data suitable for a regression. This will get you started in R and will teach you some useful tricks. I want you to build a script in RStudio (File/New File/R Script) and execute the lines from the script as you proceed. 1. Generate data for simple regression The first task is to generate some data for fitting a simple regression model. By simple, this means a single predictor X 1 and a response Y. We will use a sample size of N=100, and generate N pairs (X 1i, Y i ) from the model Y i = β 0 + X 1i β 1 + e i. In R we will call X 1 simply, since R does not like subscripts in names. Now for some details: The X 1i should come from a Gaussian distribution with mean 20 and standard deviation 3. Take β 0 = 25 and β 1 = 2 to be the true parameters, and assume that e i comes from a normal distribution with mean 0 and standard deviation 15. Generate your data, make a histogram of X 1, and plot the values of Y against those of X 1. Your plot should look something like this: Y Useful functions are rnorm, plot, and hist. If you type?rnorm, for example, you will see a helpfile. In the plot, include the true regression line; abline is useful here. 1

2 Answers First we will generate some data from a Gaussian distribution and call it. N=100?rnorm =rnorm(n,20,3) hist() Histogram of Frequency Now we will make a response Y that depends on plus some error. We will make the error Gaussian as well. Y= 25 + *2 + rnorm(n,0,15) plot(,y) abline(25,2,col="red") 2

3 Y Notice how we included the true regression line in the plot. 2. Fit a linear regression Let s now fit a linear regression of Y on X 1 and look at its properties. For this you will use the function lm in the following manner. Use the summary() command on your fitted object fit to see what is inside. Try coef as well. We can include the fitted regression line in the plot. The abline function knows about fitted lm models, so you can give it as a single argument to abline(). 3

4 Y Once you get this working, see what happens if you execute your entire script (including the data generation) a second time. You should get a somewhat different result, since different data are generated. You can prevent this, if you like, by using the set.seed command. Type?set.seed to learn how. Try this and see that your data generation is now reproducible. Answers Here is our initial regression fit. summary(fit) Call: lm(formula = Y ~ ) Residuals: Min 1Q Median 3Q Max Coefficients: Estimate Std. Error t value Pr(> t ) (Intercept) e-05 *** --- Signif. codes: 0 '***' '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 Residual standard error: on 98 degrees of freedom Multiple R-squared: , Adjusted R-squared: F-statistic: on 1 and 98 DF, p-value: 1.036e-05 4

5 coef(fit) (Intercept) plot(,y) abline(25,2,col="red") abline(fit,col="blue",lwd=2) Y If we repeat this process, we get something slightly different =rnorm(n,20,3) Y= 25 + *2 + rnorm(n,0,15) plot(,y) summary(fit) Call: lm(formula = Y ~ ) Residuals: Min 1Q Median 3Q Max Coefficients: Estimate Std. Error t value Pr(> t ) (Intercept) *** ** --- Signif. codes: 0 '***' '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 5

6 Residual standard error: on 98 degrees of freedom Multiple R-squared: , Adjusted R-squared: F-statistic: on 1 and 98 DF, p-value: abline(25,2,col="red") abline(fit,col="blue",lwd=2) Y If we want to be able to repeat an analysis, including the random number generation, then we have to set the random number seed. This is easy to do in R. set.seed(101) # any number will do =rnorm(n,20,3) Y= 25 + *2 + rnorm(n,0,15) plot(,y) summary(fit) Call: lm(formula = Y ~ ) Residuals: Min 1Q Median 3Q Max Coefficients: Estimate Std. Error t value Pr(> t ) (Intercept) e-06 *** --- Signif. codes: 0 '***' '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 6

7 Residual standard error: on 98 degrees of freedom Multiple R-squared: 0.189, Adjusted R-squared: F-statistic: on 1 and 98 DF, p-value: 6.206e-06 coef(fit) (Intercept) abline(25,2,col="red") abline(fit,col="blue",lwd=2) Y Simulation to check aspects of regression We learn from the theory that the variance of the slope estimate in simple regression is given by the expression Var( ˆβ 1 ) = σ 2 N i=1 (X 1i X 1 ) 2, where X 1 is the mean of the X 1i, and σ is the standard deviation of the e i. (Here the hat on top of ˆβ 1 means it is estimated from our data.) What this theory is really telling us is that if we fix our X 1 values and repeatedly generate new e i and hence Y i and refit the regressions each time, the variance of the series of ˆβ 1 s that we get will be given by this expression. Since we typically don t know σ, we need to estimate it as well, using the the expression ˆσ 2 = 1 N 2 N (Y i Ŷi) 2. i=1 7

8 The linear model summary tells us the estimated standard error of the slope is 0.54 for these data (at least mine did). Let s check it out! Run a simulation where you generate a new response vector each time, fit the regression, extract the slope coefficient, and store it. Do this 1000 times using a for loop (?'for'), keeping the same values for X 1. Summarize the results and in particular compute the standard deviation of the 1000 values you get. How closely does it match the theoretical value? Answers We will generate 1000 different response vectors, compute the slope each time, and then claculate the standard deviation of the estimated slopes. set.seed(101) #any number will do beta=double(1000) for(i in 1:1000) { Y = 25 + *2 + rnorm(n,0,15) beta[i]=coef(fit)[2] } Now let s look at these betas (slope estimates). hist(beta) Histogram of beta Frequency beta mean(beta) [1]

9 sd(beta) [1] Pretty close! 4. Slightly different simulation Here we held the original X 1 values fixed. What if we generate new versions of them as well when we do our simulations. Repeat your simulations, adding this little twist. What do you learn about the distribution and standard error of ˆβ 1 in this setting? Any surprises? What would you have expected? Answers set.seed(101) #any number will do beta2=double(1000) for(i in 1:1000){ =rnorm(n,20,3) Y= 25 + *2 + rnorm(n,0,15) beta2[i]=coef(fit)[2] } mean(beta2) [1] sd(beta2) [1] It s smaller! What happened? 5. Multiple Regression We now generate an X 2 and create a multiple regression model Y i = β 0 + X 1i β 1 + X 2i β 2 + e i. Let X 2 have the same distribution as X 1 (but be independent of X 1 ), and let β 2 = 1 in the simulation. Fit the multiple linear regression model the formula is now Y~+X2 and use summary to summarize the results. What do you observe about the standard errors of each coefficient and in particular the coefficient of X 1? For your current version of X 1 and X 2, what is their sample correlation? (cor()). Is your sample correlation zero? Why or why not? Answers We now generate an X 2 and create a multiple regression model. 9

10 set.seed(109) =rnorm(n,20,3) X2=rnorm(N,20,3) Y= 25 + *2 + X2*1+ rnorm(n,0,15) fit2=lm(y~+x2) summary(fit) Call: lm(formula = Y ~ ) Residuals: Min 1Q Median 3Q Max Coefficients: Estimate Std. Error t value Pr(> t ) (Intercept) *** e-07 *** --- Signif. codes: 0 '***' '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 Residual standard error: on 98 degrees of freedom Multiple R-squared: , Adjusted R-squared: F-statistic: on 1 and 98 DF, p-value: 3.141e-07 summary(fit2) Call: lm(formula = Y ~ + X2) Residuals: Min 1Q Median 3Q Max Coefficients: Estimate Std. Error t value Pr(> t ) (Intercept) e-07 *** X Signif. codes: 0 '***' '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 Residual standard error: on 97 degrees of freedom Multiple R-squared: , Adjusted R-squared: F-statistic: on 2 and 97 DF, p-value: 4.447e-07 In this case, X 1 and X 2 are realizations of two independent random variables, each Gaussian, so the presence of one does not really effect the other in the regression. Here their sample correlation is 10

11 cor(,x2) [1] Suppose we wanted the distribution of X 2 to be correlated with X 1. How might we do that? Figure out a way to do this, and experiment until your X 2 has sample correlation about 0.6 with X 1. Now generate a Y from your regression model, and use summary to examine the coefficients. What do you observe? Now try to produce X 2 with sample correlation about 0.8 with X 1, and repeat the regression experiment above. What do you observe? Answers How do we generate an X 2 that has correlation with X 1? One simple way is to generate a Z independent of X 1, and then make X 2 = αx 1 + Z for some value of α (below we use α = 0.7). Z=X2 X2=0.7*+Z plot(,x2) X cor(,x2) [1] Now when we fit the linear model, the variance of the coefficients increases in the multiple linear regression versus the simple linear regression, because of the correlation. Y= 25 + *2 + X2*1+ rnorm(n,0,15) fit2=lm(y~+x2) summary(fit) 11

12 Call: lm(formula = Y ~ ) Residuals: Min 1Q Median 3Q Max Coefficients: Estimate Std. Error t value Pr(> t ) (Intercept) e-06 *** e-09 *** --- Signif. codes: 0 '***' '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 Residual standard error: 15.1 on 98 degrees of freedom Multiple R-squared: , Adjusted R-squared: F-statistic: on 1 and 98 DF, p-value: 6.404e-09 summary(fit2) Call: lm(formula = Y ~ + X2) Residuals: Min 1Q Median 3Q Max Coefficients: Estimate Std. Error t value Pr(> t ) (Intercept) *** X * --- Signif. codes: 0 '***' '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 Residual standard error: on 97 degrees of freedom Multiple R-squared: , Adjusted R-squared: F-statistic: on 2 and 97 DF, p-value: 3.799e-09 12

Multiple regression - a brief introduction

Multiple regression - a brief introduction Multiple regression - a brief introduction Multiple regression is an extension to regular (simple) regression. Instead of one X, we now have several. Suppose, for example, that you are trying to predict

More information

Let us assume that we are measuring the yield of a crop plant on 5 different plots at 4 different observation times.

Let us assume that we are measuring the yield of a crop plant on 5 different plots at 4 different observation times. Mixed-effects models An introduction by Christoph Scherber Up to now, we have been dealing with linear models of the form where ß0 and ß1 are parameters of fixed value. Example: Let us assume that we are

More information

R is a collaborative project with many contributors. Type contributors() for more information.

R is a collaborative project with many contributors. Type contributors() for more information. R is free software and comes with ABSOLUTELY NO WARRANTY. You are welcome to redistribute it under certain conditions. Type license() or licence() for distribution details. R is a collaborative project

More information

Final Exam Suggested Solutions

Final Exam Suggested Solutions University of Washington Fall 003 Department of Economics Eric Zivot Economics 483 Final Exam Suggested Solutions This is a closed book and closed note exam. However, you are allowed one page of handwritten

More information

Non-linearities in Simple Regression

Non-linearities in Simple Regression Non-linearities in Simple Regression 1. Eample: Monthly Earnings and Years of Education In this tutorial, we will focus on an eample that eplores the relationship between total monthly earnings and years

More information

6 Multiple Regression

6 Multiple Regression More than one X variable. 6 Multiple Regression Why? Might be interested in more than one marginal effect Omitted Variable Bias (OVB) 6.1 and 6.2 House prices and OVB Should I build a fireplace? The following

More information

MODEL SELECTION CRITERIA IN R:

MODEL SELECTION CRITERIA IN R: 1. R 2 statistics We may use MODEL SELECTION CRITERIA IN R R 2 = SS R SS T = 1 SS Res SS T or R 2 Adj = 1 SS Res/(n p) SS T /(n 1) = 1 ( ) n 1 (1 R 2 ). n p where p is the total number of parameters. R

More information

COMPREHENSIVE WRITTEN EXAMINATION, PAPER III FRIDAY AUGUST 18, 2006, 9:00 A.M. 1:00 P.M. STATISTICS 174 QUESTIONS

COMPREHENSIVE WRITTEN EXAMINATION, PAPER III FRIDAY AUGUST 18, 2006, 9:00 A.M. 1:00 P.M. STATISTICS 174 QUESTIONS COMPREHENSIVE WRITTEN EXAMINATION, PAPER III FRIDAY AUGUST 18, 2006, 9:00 A.M. 1:00 P.M. STATISTICS 174 QUESTIONS Answer all parts. Closed book, calculators allowed. It is important to show all working,

More information

Dummy Variables. 1. Example: Factors Affecting Monthly Earnings

Dummy Variables. 1. Example: Factors Affecting Monthly Earnings Dummy Variables A dummy variable or binary variable is a variable that takes on a value of 0 or 1 as an indicator that the observation has some kind of characteristic. Common examples: Sex (female): FEMALE=1

More information

Regression Review and Robust Regression. Slides prepared by Elizabeth Newton (MIT)

Regression Review and Robust Regression. Slides prepared by Elizabeth Newton (MIT) Regression Review and Robust Regression Slides prepared by Elizabeth Newton (MIT) S-Plus Oil City Data Frame Monthly Excess Returns of Oil City Petroleum, Inc. Stocks and the Market SUMMARY: The oilcity

More information

Homework Assignment Section 3

Homework Assignment Section 3 Homework Assignment Section 3 Tengyuan Liang Business Statistics Booth School of Business Problem 1 A company sets different prices for a particular stereo system in eight different regions of the country.

More information

Generalized Linear Models

Generalized Linear Models Generalized Linear Models Scott Creel Wednesday, September 10, 2014 This exercise extends the prior material on using the lm() function to fit an OLS regression and test hypotheses about effects on a parameter.

More information

The Norwegian State Equity Ownership

The Norwegian State Equity Ownership The Norwegian State Equity Ownership B A Ødegaard 15 November 2018 Contents 1 Introduction 1 2 Doing a performance analysis 1 2.1 Using R....................................................................

More information

Homework Assignment Section 3

Homework Assignment Section 3 Homework Assignment Section 3 Tengyuan Liang Business Statistics Booth School of Business Problem 1 A company sets different prices for a particular stereo system in eight different regions of the country.

More information

Study 2: data analysis. Example analysis using R

Study 2: data analysis. Example analysis using R Study 2: data analysis Example analysis using R Steps for data analysis Install software on your computer or locate computer with software (e.g., R, systat, SPSS) Prepare data for analysis Subjects (rows)

More information

State Ownership at the Oslo Stock Exchange. Bernt Arne Ødegaard

State Ownership at the Oslo Stock Exchange. Bernt Arne Ødegaard State Ownership at the Oslo Stock Exchange Bernt Arne Ødegaard Introduction We ask whether there is a state rebate on companies listed on the Oslo Stock Exchange, i.e. whether companies where the state

More information

Lecture Note: Analysis of Financial Time Series Spring 2017, Ruey S. Tsay

Lecture Note: Analysis of Financial Time Series Spring 2017, Ruey S. Tsay Lecture Note: Analysis of Financial Time Series Spring 2017, Ruey S. Tsay Seasonal Time Series: TS with periodic patterns and useful in predicting quarterly earnings pricing weather-related derivatives

More information

Jaime Frade Dr. Niu Interest rate modeling

Jaime Frade Dr. Niu Interest rate modeling Interest rate modeling Abstract In this paper, three models were used to forecast short term interest rates for the 3 month LIBOR. Each of the models, regression time series, GARCH, and Cox, Ingersoll,

More information

Economics 424/Applied Mathematics 540. Final Exam Solutions

Economics 424/Applied Mathematics 540. Final Exam Solutions University of Washington Summer 01 Department of Economics Eric Zivot Economics 44/Applied Mathematics 540 Final Exam Solutions I. Matrix Algebra and Portfolio Math (30 points, 5 points each) Let R i denote

More information

STATISTICS 110/201, FALL 2017 Homework #5 Solutions Assigned Mon, November 6, Due Wed, November 15

STATISTICS 110/201, FALL 2017 Homework #5 Solutions Assigned Mon, November 6, Due Wed, November 15 STATISTICS 110/201, FALL 2017 Homework #5 Solutions Assigned Mon, November 6, Due Wed, November 15 For this assignment use the Diamonds dataset in the Stat2Data library. The dataset is used in examples

More information

Review: Population, sample, and sampling distributions

Review: Population, sample, and sampling distributions Review: Population, sample, and sampling distributions A population with mean µ and standard deviation σ For instance, µ = 0, σ = 1 0 1 Sample 1, N=30 Sample 2, N=30 Sample 100000000000 InterquartileRange

More information

NHY examples. Bernt Arne Ødegaard. 23 November Estimating dividend growth in Norsk Hydro 8

NHY examples. Bernt Arne Ødegaard. 23 November Estimating dividend growth in Norsk Hydro 8 NHY examples Bernt Arne Ødegaard 23 November 2017 Abstract Finance examples using equity data for Norsk Hydro (NHY) Contents 1 Calculating Beta 4 2 Cost of Capital 7 3 Estimating dividend growth in Norsk

More information

Monetary Economics Risk and Return, Part 2. Gerald P. Dwyer Fall 2015

Monetary Economics Risk and Return, Part 2. Gerald P. Dwyer Fall 2015 Monetary Economics Risk and Return, Part 2 Gerald P. Dwyer Fall 2015 Reading Malkiel, Part 2, Part 3 Malkiel, Part 3 Outline Returns and risk Overall market risk reduced over longer periods Individual

More information

Subject CS1 Actuarial Statistics 1 Core Principles. Syllabus. for the 2019 exams. 1 June 2018

Subject CS1 Actuarial Statistics 1 Core Principles. Syllabus. for the 2019 exams. 1 June 2018 ` Subject CS1 Actuarial Statistics 1 Core Principles Syllabus for the 2019 exams 1 June 2018 Copyright in this Core Reading is the property of the Institute and Faculty of Actuaries who are the sole distributors.

More information

State Ownership at the Oslo Stock Exchange

State Ownership at the Oslo Stock Exchange State Ownership at the Oslo Stock Exchange Bernt Arne Ødegaard 1 Introduction We ask whether there is a state rebate on companies listed on the Oslo Stock Exchange, i.e. whether companies where the state

More information

Linear regression model

Linear regression model Regression Model Assumptions (Solutions) STAT-UB.0003: Regression and Forecasting Models Linear regression model 1. Here is the least squares regression fit to the Zagat restaurant data: 10 15 20 25 10

More information

The histogram should resemble the uniform density, the mean should be close to 0.5, and the standard deviation should be close to 1/ 12 =

The histogram should resemble the uniform density, the mean should be close to 0.5, and the standard deviation should be close to 1/ 12 = Chapter 19 Monte Carlo Valuation Question 19.1 The histogram should resemble the uniform density, the mean should be close to.5, and the standard deviation should be close to 1/ 1 =.887. Question 19. The

More information

σ e, which will be large when prediction errors are Linear regression model

σ e, which will be large when prediction errors are Linear regression model Linear regression model we assume that two quantitative variables, x and y, are linearly related; that is, the population of (x, y) pairs are related by an ideal population regression line y = α + βx +

More information

MCMC Package Example

MCMC Package Example MCMC Package Example Charles J. Geyer April 4, 2005 This is an example of using the mcmc package in R. The problem comes from a take-home question on a (take-home) PhD qualifying exam (School of Statistics,

More information

MixedModR2 Erika Mudrak Thursday, August 30, 2018

MixedModR2 Erika Mudrak Thursday, August 30, 2018 MixedModR Erika Mudrak Thursday, August 3, 18 Generate the Data Generate data points from a population with one random effect: levels of Factor A, each sampled 5 times set.seed(39) siga

More information

Chapter 8 Statistical Intervals for a Single Sample

Chapter 8 Statistical Intervals for a Single Sample Chapter 8 Statistical Intervals for a Single Sample Part 1: Confidence intervals (CI) for population mean µ Section 8-1: CI for µ when σ 2 known & drawing from normal distribution Section 8-1.2: Sample

More information

GGraph. Males Only. Premium. Experience. GGraph. Gender. 1 0: R 2 Linear = : R 2 Linear = Page 1

GGraph. Males Only. Premium. Experience. GGraph. Gender. 1 0: R 2 Linear = : R 2 Linear = Page 1 GGraph 9 Gender : R Linear =.43 : R Linear =.769 8 7 6 5 4 3 5 5 Males Only GGraph Page R Linear =.43 R Loess 9 8 7 6 5 4 5 5 Explore Case Processing Summary Cases Valid Missing Total N Percent N Percent

More information

Loss Simulation Model Testing and Enhancement

Loss Simulation Model Testing and Enhancement Loss Simulation Model Testing and Enhancement Casualty Loss Reserve Seminar By Kailan Shang Sept. 2011 Agenda Research Overview Model Testing Real Data Model Enhancement Further Development Enterprise

More information

Stat 401XV Exam 3 Spring 2017

Stat 401XV Exam 3 Spring 2017 Stat 40XV Exam Spring 07 I have neither given nor received unauthorized assistance on this exam. Name Signed Date Name Printed ATTENTION! Incorrect numerical answers unaccompanied by supporting reasoning

More information

STA2601. Tutorial letter 105/2/2018. Applied Statistics II. Semester 2. Department of Statistics STA2601/105/2/2018 TRIAL EXAMINATION PAPER

STA2601. Tutorial letter 105/2/2018. Applied Statistics II. Semester 2. Department of Statistics STA2601/105/2/2018 TRIAL EXAMINATION PAPER STA2601/105/2/2018 Tutorial letter 105/2/2018 Applied Statistics II STA2601 Semester 2 Department of Statistics TRIAL EXAMINATION PAPER Define tomorrow. university of south africa Dear Student Congratulations

More information

Multiple linear regression

Multiple linear regression Multiple linear regression Business Statistics 41000 Spring 2017 1 Topics 1. Including multiple predictors 2. Controlling for confounders 3. Transformations, interactions, dummy variables OpenIntro 8.1,

More information

Determination of the Optimal Stratum Boundaries in the Monthly Retail Trade Survey in the Croatian Bureau of Statistics

Determination of the Optimal Stratum Boundaries in the Monthly Retail Trade Survey in the Croatian Bureau of Statistics Determination of the Optimal Stratum Boundaries in the Monthly Retail Trade Survey in the Croatian Bureau of Statistics Ivana JURINA (jurinai@dzs.hr) Croatian Bureau of Statistics Lidija GLIGOROVA (gligoroval@dzs.hr)

More information

> attach(grocery) > boxplot(sales~discount, ylab="sales",xlab="discount")

> attach(grocery) > boxplot(sales~discount, ylab=sales,xlab=discount) Example of More than 2 Categories, and Analysis of Covariance Example > attach(grocery) > boxplot(sales~discount, ylab="sales",xlab="discount") Sales 160 200 240 > tapply(sales,discount,mean) 10.00% 15.00%

More information

Econometric Methods for Valuation Analysis

Econometric Methods for Valuation Analysis Econometric Methods for Valuation Analysis Margarita Genius Dept of Economics M. Genius (Univ. of Crete) Econometric Methods for Valuation Analysis Cagliari, 2017 1 / 26 Correlation Analysis Simple Regression

More information

Linear Regression with One Regressor

Linear Regression with One Regressor Linear Regression with One Regressor Michael Ash Lecture 9 Linear Regression with One Regressor Review of Last Time 1. The Linear Regression Model The relationship between independent X and dependent Y

More information

Chapter 7 Sampling Distributions and Point Estimation of Parameters

Chapter 7 Sampling Distributions and Point Estimation of Parameters Chapter 7 Sampling Distributions and Point Estimation of Parameters Part 1: Sampling Distributions, the Central Limit Theorem, Point Estimation & Estimators Sections 7-1 to 7-2 1 / 25 Statistical Inferences

More information

Basic Procedure for Histograms

Basic Procedure for Histograms Basic Procedure for Histograms 1. Compute the range of observations (min. & max. value) 2. Choose an initial # of classes (most likely based on the range of values, try and find a number of classes that

More information

Random Effects ANOVA

Random Effects ANOVA Random Effects ANOVA Grant B. Morgan Baylor University This post contains code for conducting a random effects ANOVA. Make sure the following packages are installed: foreign, lme4, lsr, lattice. library(foreign)

More information

Introduction to General and Generalized Linear Models

Introduction to General and Generalized Linear Models Introduction to General and Generalized Linear Models Generalized Linear Models - IIIb Henrik Madsen March 18, 2012 Henrik Madsen () Chapman & Hall March 18, 2012 1 / 32 Examples Overdispersion and Offset!

More information

Business Statistics: A First Course

Business Statistics: A First Course Business Statistics: A First Course Fifth Edition Chapter 12 Correlation and Simple Linear Regression Business Statistics: A First Course, 5e 2009 Prentice-Hall, Inc. Chap 12-1 Learning Objectives In this

More information

Chapter 7: SAMPLING DISTRIBUTIONS & POINT ESTIMATION OF PARAMETERS

Chapter 7: SAMPLING DISTRIBUTIONS & POINT ESTIMATION OF PARAMETERS Chapter 7: SAMPLING DISTRIBUTIONS & POINT ESTIMATION OF PARAMETERS Part 1: Introduction Sampling Distributions & the Central Limit Theorem Point Estimation & Estimators Sections 7-1 to 7-2 Sample data

More information

ORDERED MULTINOMIAL LOGISTIC REGRESSION ANALYSIS. Pooja Shivraj Southern Methodist University

ORDERED MULTINOMIAL LOGISTIC REGRESSION ANALYSIS. Pooja Shivraj Southern Methodist University ORDERED MULTINOMIAL LOGISTIC REGRESSION ANALYSIS Pooja Shivraj Southern Methodist University KINDS OF REGRESSION ANALYSES Linear Regression Logistic Regression Dichotomous dependent variable (yes/no, died/

More information

Econometric Methods for Valuation Analysis

Econometric Methods for Valuation Analysis Econometric Methods for Valuation Analysis Margarita Genius Dept of Economics M. Genius (Univ. of Crete) Econometric Methods for Valuation Analysis Cagliari, 2017 1 / 25 Outline We will consider econometric

More information

Stat 101 Exam 1 - Embers Important Formulas and Concepts 1

Stat 101 Exam 1 - Embers Important Formulas and Concepts 1 1 Chapter 1 1.1 Definitions Stat 101 Exam 1 - Embers Important Formulas and Concepts 1 1. Data Any collection of numbers, characters, images, or other items that provide information about something. 2.

More information

boxcox() returns the values of α and their loglikelihoods,

boxcox() returns the values of α and their loglikelihoods, Solutions to Selected Computer Lab Problems and Exercises in Chapter 11 of Statistics and Data Analysis for Financial Engineering, 2nd ed. by David Ruppert and David S. Matteson c 2016 David Ruppert and

More information

Step 1: Load the appropriate R package. Step 2: Fit a separate mixed model for each independence claim in the basis set.

Step 1: Load the appropriate R package. Step 2: Fit a separate mixed model for each independence claim in the basis set. Step 1: Load the appropriate R package. You will need two libraries: nlme and lme4. Step 2: Fit a separate mixed model for each independence claim in the basis set. For instance, in Table 2 the first basis

More information

Key Objectives. Module 2: The Logic of Statistical Inference. Z-scores. SGSB Workshop: Using Statistical Data to Make Decisions

Key Objectives. Module 2: The Logic of Statistical Inference. Z-scores. SGSB Workshop: Using Statistical Data to Make Decisions SGSB Workshop: Using Statistical Data to Make Decisions Module 2: The Logic of Statistical Inference Dr. Tom Ilvento January 2006 Dr. Mugdim Pašić Key Objectives Understand the logic of statistical inference

More information

Business Statistics 41000: Probability 4

Business Statistics 41000: Probability 4 Business Statistics 41000: Probability 4 Drew D. Creal University of Chicago, Booth School of Business February 14 and 15, 2014 1 Class information Drew D. Creal Email: dcreal@chicagobooth.edu Office:

More information

Tests for the Difference Between Two Linear Regression Intercepts

Tests for the Difference Between Two Linear Regression Intercepts Chapter 853 Tests for the Difference Between Two Linear Regression Intercepts Introduction Linear regression is a commonly used procedure in statistical analysis. One of the main objectives in linear regression

More information

PASS Sample Size Software

PASS Sample Size Software Chapter 850 Introduction Cox proportional hazards regression models the relationship between the hazard function λ( t X ) time and k covariates using the following formula λ log λ ( t X ) ( t) 0 = β1 X1

More information

INSTITUTE OF ACTUARIES OF INDIA EXAMINATIONS. 20 th May Subject CT3 Probability & Mathematical Statistics

INSTITUTE OF ACTUARIES OF INDIA EXAMINATIONS. 20 th May Subject CT3 Probability & Mathematical Statistics INSTITUTE OF ACTUARIES OF INDIA EXAMINATIONS 20 th May 2013 Subject CT3 Probability & Mathematical Statistics Time allowed: Three Hours (10.00 13.00) Total Marks: 100 INSTRUCTIONS TO THE CANDIDATES 1.

More information

Copyright 2011 Pearson Education, Inc. Publishing as Addison-Wesley.

Copyright 2011 Pearson Education, Inc. Publishing as Addison-Wesley. Appendix: Statistics in Action Part I Financial Time Series 1. These data show the effects of stock splits. If you investigate further, you ll find that most of these splits (such as in May 1970) are 3-for-1

More information

Two Hours. Mathematical formula books and statistical tables are to be provided THE UNIVERSITY OF MANCHESTER. 22 January :00 16:00

Two Hours. Mathematical formula books and statistical tables are to be provided THE UNIVERSITY OF MANCHESTER. 22 January :00 16:00 Two Hours MATH38191 Mathematical formula books and statistical tables are to be provided THE UNIVERSITY OF MANCHESTER STATISTICAL MODELLING IN FINANCE 22 January 2015 14:00 16:00 Answer ALL TWO questions

More information

Graduate School of Business, University of Chicago Business 41202, Spring Quarter 2007, Mr. Ruey S. Tsay. Midterm

Graduate School of Business, University of Chicago Business 41202, Spring Quarter 2007, Mr. Ruey S. Tsay. Midterm Graduate School of Business, University of Chicago Business 41202, Spring Quarter 2007, Mr. Ruey S. Tsay Midterm GSB Honor Code: I pledge my honor that I have not violated the Honor Code during this examination.

More information

Models of Patterns. Lecture 3, SMMD 2005 Bob Stine

Models of Patterns. Lecture 3, SMMD 2005 Bob Stine Models of Patterns Lecture 3, SMMD 2005 Bob Stine Review Speculative investing and portfolios Risk and variance Volatility adjusted return Volatility drag Dependence Covariance Review Example Stock and

More information

Stat3011: Solution of Midterm Exam One

Stat3011: Solution of Midterm Exam One 1 Stat3011: Solution of Midterm Exam One Fall/2003, Tiefeng Jiang Name: Problem 1 (30 points). Choose one appropriate answer in each of the following questions. 1. (B ) The mean age of five people in a

More information

Problem Set 4 Answer Key

Problem Set 4 Answer Key Economics 31 Menzie D. Chinn Fall 4 Social Sciences 7418 University of Wisconsin-Madison Problem Set 4 Answer Key This problem set is due in lecture on Wednesday, December 1st. No late problem sets will

More information

Stat 328, Summer 2005

Stat 328, Summer 2005 Stat 328, Summer 2005 Exam #2, 6/18/05 Name (print) UnivID I have neither given nor received any unauthorized aid in completing this exam. Signed Answer each question completely showing your work where

More information

Case Study: Applying Generalized Linear Models

Case Study: Applying Generalized Linear Models Case Study: Applying Generalized Linear Models Dr. Kempthorne May 12, 2016 Contents 1 Generalized Linear Models of Semi-Quantal Biological Assay Data 2 1.1 Coal miners Pneumoconiosis Data.................

More information

Milestone2. Zillow House Price Prediciton. Group: Lingzi Hong and Pranali Shetty

Milestone2. Zillow House Price Prediciton. Group: Lingzi Hong and Pranali Shetty Milestone2 Zillow House Price Prediciton Group Lingzi Hong and Pranali Shetty MILESTONE 2 REPORT Data Collection The following additional features were added 1. Population, Number of College Graduates

More information

Introduction to Population Modeling

Introduction to Population Modeling Introduction to Population Modeling In addition to estimating the size of a population, it is often beneficial to estimate how the population size changes over time. Ecologists often uses models to create

More information

GARCH Models. Instructor: G. William Schwert

GARCH Models. Instructor: G. William Schwert APS 425 Fall 2015 GARCH Models Instructor: G. William Schwert 585-275-2470 schwert@schwert.ssb.rochester.edu Autocorrelated Heteroskedasticity Suppose you have regression residuals Mean = 0, not autocorrelated

More information

Ordinal Multinomial Logistic Regression. Thom M. Suhy Southern Methodist University May14th, 2013

Ordinal Multinomial Logistic Regression. Thom M. Suhy Southern Methodist University May14th, 2013 Ordinal Multinomial Logistic Thom M. Suhy Southern Methodist University May14th, 2013 GLM Generalized Linear Model (GLM) Framework for statistical analysis (Gelman and Hill, 2007, p. 135) Linear Continuous

More information

Chapter 6 Part 3 October 21, Bootstrapping

Chapter 6 Part 3 October 21, Bootstrapping Chapter 6 Part 3 October 21, 2008 Bootstrapping From the internet: The bootstrap involves repeated re-estimation of a parameter using random samples with replacement from the original data. Because the

More information

Window Width Selection for L 2 Adjusted Quantile Regression

Window Width Selection for L 2 Adjusted Quantile Regression Window Width Selection for L 2 Adjusted Quantile Regression Yoonsuh Jung, The Ohio State University Steven N. MacEachern, The Ohio State University Yoonkyung Lee, The Ohio State University Technical Report

More information

MVE051/MSG Lecture 7

MVE051/MSG Lecture 7 MVE051/MSG810 2017 Lecture 7 Petter Mostad Chalmers November 20, 2017 The purpose of collecting and analyzing data Purpose: To build and select models for parts of the real world (which can be used for

More information

Generalized Multilevel Regression Example for a Binary Outcome

Generalized Multilevel Regression Example for a Binary Outcome Psy 510/610 Multilevel Regression, Spring 2017 1 HLM Generalized Multilevel Regression Example for a Binary Outcome Specifications for this Bernoulli HLM2 run Problem Title: no title The data source for

More information

Monte Carlo Simulation and Resampling

Monte Carlo Simulation and Resampling Monte Carlo Simulation and Resampling Tom Carsey (Instructor) Jeff Harden (TA) ICPSR Summer Course Summer, 2011 Monte Carlo Simulation and Resampling 1/114 Introductions and Overview What do I plan for

More information

Time Observations Time Period, t

Time Observations Time Period, t Operations Research Models and Methods Paul A. Jensen and Jonathan F. Bard Time Series and Forecasting.S1 Time Series Models An example of a time series for 25 periods is plotted in Fig. 1 from the numerical

More information

Lecture 3: Factor models in modern portfolio choice

Lecture 3: Factor models in modern portfolio choice Lecture 3: Factor models in modern portfolio choice Prof. Massimo Guidolin Portfolio Management Spring 2016 Overview The inputs of portfolio problems Using the single index model Multi-index models Portfolio

More information

Statistics 431 Spring 2007 P. Shaman. Preliminaries

Statistics 431 Spring 2007 P. Shaman. Preliminaries Statistics 4 Spring 007 P. Shaman The Binomial Distribution Preliminaries A binomial experiment is defined by the following conditions: A sequence of n trials is conducted, with each trial having two possible

More information

Predicting Charitable Contributions

Predicting Charitable Contributions Predicting Charitable Contributions By Lauren Meyer Executive Summary Charitable contributions depend on many factors from financial security to personal characteristics. This report will focus on demographic

More information

Gov 2001: Section 5. I. A Normal Example II. Uncertainty. Gov Spring 2010

Gov 2001: Section 5. I. A Normal Example II. Uncertainty. Gov Spring 2010 Gov 2001: Section 5 I. A Normal Example II. Uncertainty Gov 2001 Spring 2010 A roadmap We started by introducing the concept of likelihood in the simplest univariate context one observation, one variable.

More information

SAS Simple Linear Regression Example

SAS Simple Linear Regression Example SAS Simple Linear Regression Example This handout gives examples of how to use SAS to generate a simple linear regression plot, check the correlation between two variables, fit a simple linear regression

More information

Business Statistics Midterm Exam Fall 2013 Russell

Business Statistics Midterm Exam Fall 2013 Russell Name Business Statistics Midterm Exam Fall 2013 Russell Do not turn over this page until you are told to do so. You will have 2 hours to complete the exam. There are a total of 100 points divided into

More information

4.1 Introduction Estimating a population mean The problem with estimating a population mean with a sample mean: an example...

4.1 Introduction Estimating a population mean The problem with estimating a population mean with a sample mean: an example... Chapter 4 Point estimation Contents 4.1 Introduction................................... 2 4.2 Estimating a population mean......................... 2 4.2.1 The problem with estimating a population mean

More information

Panel Data. November 15, The panel is balanced if all individuals have a complete set of observations, otherwise the panel is unbalanced.

Panel Data. November 15, The panel is balanced if all individuals have a complete set of observations, otherwise the panel is unbalanced. Panel Data November 15, 2018 1 Panel data Panel data are obsevations of the same individual on different dates. time Individ 1 Individ 2 Individ 3 individuals The panel is balanced if all individuals have

More information

The misleading nature of correlations

The misleading nature of correlations The misleading nature of correlations In this note we explain certain subtle features of calculating correlations between time-series. Correlation is a measure of linear co-movement, to be contrasted with

More information

Economics 413: Economic Forecast and Analysis Department of Economics, Finance and Legal Studies University of Alabama

Economics 413: Economic Forecast and Analysis Department of Economics, Finance and Legal Studies University of Alabama Problem Set #1 (Linear Regression) 1. The file entitled MONEYDEM.XLS contains quarterly values of seasonally adjusted U.S.3-month ( 3 ) and 1-year ( 1 ) treasury bill rates. Each series is measured over

More information

Financial Econometrics Jeffrey R. Russell. Midterm 2014 Suggested Solutions. TA: B. B. Deng

Financial Econometrics Jeffrey R. Russell. Midterm 2014 Suggested Solutions. TA: B. B. Deng Financial Econometrics Jeffrey R. Russell Midterm 2014 Suggested Solutions TA: B. B. Deng Unless otherwise stated, e t is iid N(0,s 2 ) 1. (12 points) Consider the three series y1, y2, y3, and y4. Match

More information

Statistics vs. statistics

Statistics vs. statistics Statistics vs. statistics Question: What is Statistics (with a capital S)? Definition: Statistics is the science of collecting, organizing, summarizing and interpreting data. Note: There are 2 main ways

More information

Statistics for Business and Economics

Statistics for Business and Economics Statistics for Business and Economics Chapter 7 Estimation: Single Population Copyright 010 Pearson Education, Inc. Publishing as Prentice Hall Ch. 7-1 Confidence Intervals Contents of this chapter: Confidence

More information

CHAPTER 2 Describing Data: Numerical

CHAPTER 2 Describing Data: Numerical CHAPTER Multiple-Choice Questions 1. A scatter plot can illustrate all of the following except: A) the median of each of the two variables B) the range of each of the two variables C) an indication of

More information

Spring, Beta and Regression

Spring, Beta and Regression Spring, 2000-1 - Administrative Items Getting help See me Monday 3-5:30 or tomorrow after 2:30. Send me an e-mail with your question. (stine@wharton) Visit the StatLab/TAs, particularly for help using

More information

Analysis of Variance in Matrix form

Analysis of Variance in Matrix form Analysis of Variance in Matrix form The ANOVA table sums of squares, SSTO, SSR and SSE can all be expressed in matrix form as follows. week 9 Multiple Regression A multiple regression model is a model

More information

CHAPTER 8. Confidence Interval Estimation Point and Interval Estimates

CHAPTER 8. Confidence Interval Estimation Point and Interval Estimates CHAPTER 8. Confidence Interval Estimation Point and Interval Estimates A point estimate is a single number, a confidence interval provides additional information about the variability of the estimate Lower

More information

Handout 4: Gains from Diversification for 2 Risky Assets Corporate Finance, Sections 001 and 002

Handout 4: Gains from Diversification for 2 Risky Assets Corporate Finance, Sections 001 and 002 Handout 4: Gains from Diversification for 2 Risky Assets Corporate Finance, Sections 001 and 002 Suppose you are deciding how to allocate your wealth between two risky assets. Recall that the expected

More information

Introduction to R (2)

Introduction to R (2) Introduction to R (2) Boxplots Boxplots are highly efficient tools for the representation of the data distributions. The five number summary can be located in boxplots. Additionally, we can distinguish

More information

CHAPTER 7 INTRODUCTION TO SAMPLING DISTRIBUTIONS

CHAPTER 7 INTRODUCTION TO SAMPLING DISTRIBUTIONS CHAPTER 7 INTRODUCTION TO SAMPLING DISTRIBUTIONS Note: This section uses session window commands instead of menu choices CENTRAL LIMIT THEOREM (SECTION 7.2 OF UNDERSTANDABLE STATISTICS) The Central Limit

More information

Variance clustering. Two motivations, volatility clustering, and implied volatility

Variance clustering. Two motivations, volatility clustering, and implied volatility Variance modelling The simplest assumption for time series is that variance is constant. Unfortunately that assumption is often violated in actual data. In this lecture we look at the implications of time

More information

Model 0: We start with a linear regression model: log Y t = β 0 + β 1 (t 1980) + ε, with ε N(0,

Model 0: We start with a linear regression model: log Y t = β 0 + β 1 (t 1980) + ε, with ε N(0, Stat 534: Fall 2017. Introduction to the BUGS language and rjags Installation: download and install JAGS. You will find the executables on Sourceforge. You must have JAGS installed prior to installing

More information

ECE 295: Lecture 03 Estimation and Confidence Interval

ECE 295: Lecture 03 Estimation and Confidence Interval ECE 295: Lecture 03 Estimation and Confidence Interval Spring 2018 Prof Stanley Chan School of Electrical and Computer Engineering Purdue University 1 / 23 Theme of this Lecture What is Estimation? You

More information

book 2014/5/6 15:21 page 261 #285

book 2014/5/6 15:21 page 261 #285 book 2014/5/6 15:21 page 261 #285 Chapter 10 Simulation Simulations provide a powerful way to answer questions and explore properties of statistical estimators and procedures. In this chapter, we will

More information

Lecture Note of Bus 41202, Spring 2010: Analysis of Multiple Series with Applications. x 1t x 2t. holdings (OIH) and energy select section SPDR (XLE).

Lecture Note of Bus 41202, Spring 2010: Analysis of Multiple Series with Applications. x 1t x 2t. holdings (OIH) and energy select section SPDR (XLE). Lecture Note of Bus 41202, Spring 2010: Analysis of Multiple Series with Applications Focus on two series (i.e., bivariate case) Time series: Data: x 1, x 2,, x T. X t = Some examples: (a) U.S. quarterly

More information

University of Zürich, Switzerland

University of Zürich, Switzerland University of Zürich, Switzerland RE - general asset features The inclusion of real estate assets in a portfolio has proven to bring diversification benefits both for homeowners [Mahieu, Van Bussel 1996]

More information