Regression and Simulation
|
|
- Lauren Lloyd
- 5 years ago
- Views:
Transcription
1 Regression and Simulation This is an introductory R session, so it may go slowly if you have never used R before. Do not be discouraged. A great way to learn a new language like this is to plunge right in. We will simulate some data suitable for a regression. This will get you started in R and will teach you some useful tricks. I want you to build a script in RStudio (File/New File/R Script) and execute the lines from the script as you proceed. 1. Generate data for simple regression The first task is to generate some data for fitting a simple regression model. By simple, this means a single predictor X 1 and a response Y. We will use a sample size of N=100, and generate N pairs (X 1i, Y i ) from the model Y i = β 0 + X 1i β 1 + e i. In R we will call X 1 simply, since R does not like subscripts in names. Now for some details: The X 1i should come from a Gaussian distribution with mean 20 and standard deviation 3. Take β 0 = 25 and β 1 = 2 to be the true parameters, and assume that e i comes from a normal distribution with mean 0 and standard deviation 15. Generate your data, make a histogram of X 1, and plot the values of Y against those of X 1. Your plot should look something like this: Y Useful functions are rnorm, plot, and hist. If you type?rnorm, for example, you will see a helpfile. In the plot, include the true regression line; abline is useful here. 1
2 Answers First we will generate some data from a Gaussian distribution and call it. N=100?rnorm =rnorm(n,20,3) hist() Histogram of Frequency Now we will make a response Y that depends on plus some error. We will make the error Gaussian as well. Y= 25 + *2 + rnorm(n,0,15) plot(,y) abline(25,2,col="red") 2
3 Y Notice how we included the true regression line in the plot. 2. Fit a linear regression Let s now fit a linear regression of Y on X 1 and look at its properties. For this you will use the function lm in the following manner. Use the summary() command on your fitted object fit to see what is inside. Try coef as well. We can include the fitted regression line in the plot. The abline function knows about fitted lm models, so you can give it as a single argument to abline(). 3
4 Y Once you get this working, see what happens if you execute your entire script (including the data generation) a second time. You should get a somewhat different result, since different data are generated. You can prevent this, if you like, by using the set.seed command. Type?set.seed to learn how. Try this and see that your data generation is now reproducible. Answers Here is our initial regression fit. summary(fit) Call: lm(formula = Y ~ ) Residuals: Min 1Q Median 3Q Max Coefficients: Estimate Std. Error t value Pr(> t ) (Intercept) e-05 *** --- Signif. codes: 0 '***' '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 Residual standard error: on 98 degrees of freedom Multiple R-squared: , Adjusted R-squared: F-statistic: on 1 and 98 DF, p-value: 1.036e-05 4
5 coef(fit) (Intercept) plot(,y) abline(25,2,col="red") abline(fit,col="blue",lwd=2) Y If we repeat this process, we get something slightly different =rnorm(n,20,3) Y= 25 + *2 + rnorm(n,0,15) plot(,y) summary(fit) Call: lm(formula = Y ~ ) Residuals: Min 1Q Median 3Q Max Coefficients: Estimate Std. Error t value Pr(> t ) (Intercept) *** ** --- Signif. codes: 0 '***' '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 5
6 Residual standard error: on 98 degrees of freedom Multiple R-squared: , Adjusted R-squared: F-statistic: on 1 and 98 DF, p-value: abline(25,2,col="red") abline(fit,col="blue",lwd=2) Y If we want to be able to repeat an analysis, including the random number generation, then we have to set the random number seed. This is easy to do in R. set.seed(101) # any number will do =rnorm(n,20,3) Y= 25 + *2 + rnorm(n,0,15) plot(,y) summary(fit) Call: lm(formula = Y ~ ) Residuals: Min 1Q Median 3Q Max Coefficients: Estimate Std. Error t value Pr(> t ) (Intercept) e-06 *** --- Signif. codes: 0 '***' '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 6
7 Residual standard error: on 98 degrees of freedom Multiple R-squared: 0.189, Adjusted R-squared: F-statistic: on 1 and 98 DF, p-value: 6.206e-06 coef(fit) (Intercept) abline(25,2,col="red") abline(fit,col="blue",lwd=2) Y Simulation to check aspects of regression We learn from the theory that the variance of the slope estimate in simple regression is given by the expression Var( ˆβ 1 ) = σ 2 N i=1 (X 1i X 1 ) 2, where X 1 is the mean of the X 1i, and σ is the standard deviation of the e i. (Here the hat on top of ˆβ 1 means it is estimated from our data.) What this theory is really telling us is that if we fix our X 1 values and repeatedly generate new e i and hence Y i and refit the regressions each time, the variance of the series of ˆβ 1 s that we get will be given by this expression. Since we typically don t know σ, we need to estimate it as well, using the the expression ˆσ 2 = 1 N 2 N (Y i Ŷi) 2. i=1 7
8 The linear model summary tells us the estimated standard error of the slope is 0.54 for these data (at least mine did). Let s check it out! Run a simulation where you generate a new response vector each time, fit the regression, extract the slope coefficient, and store it. Do this 1000 times using a for loop (?'for'), keeping the same values for X 1. Summarize the results and in particular compute the standard deviation of the 1000 values you get. How closely does it match the theoretical value? Answers We will generate 1000 different response vectors, compute the slope each time, and then claculate the standard deviation of the estimated slopes. set.seed(101) #any number will do beta=double(1000) for(i in 1:1000) { Y = 25 + *2 + rnorm(n,0,15) beta[i]=coef(fit)[2] } Now let s look at these betas (slope estimates). hist(beta) Histogram of beta Frequency beta mean(beta) [1]
9 sd(beta) [1] Pretty close! 4. Slightly different simulation Here we held the original X 1 values fixed. What if we generate new versions of them as well when we do our simulations. Repeat your simulations, adding this little twist. What do you learn about the distribution and standard error of ˆβ 1 in this setting? Any surprises? What would you have expected? Answers set.seed(101) #any number will do beta2=double(1000) for(i in 1:1000){ =rnorm(n,20,3) Y= 25 + *2 + rnorm(n,0,15) beta2[i]=coef(fit)[2] } mean(beta2) [1] sd(beta2) [1] It s smaller! What happened? 5. Multiple Regression We now generate an X 2 and create a multiple regression model Y i = β 0 + X 1i β 1 + X 2i β 2 + e i. Let X 2 have the same distribution as X 1 (but be independent of X 1 ), and let β 2 = 1 in the simulation. Fit the multiple linear regression model the formula is now Y~+X2 and use summary to summarize the results. What do you observe about the standard errors of each coefficient and in particular the coefficient of X 1? For your current version of X 1 and X 2, what is their sample correlation? (cor()). Is your sample correlation zero? Why or why not? Answers We now generate an X 2 and create a multiple regression model. 9
10 set.seed(109) =rnorm(n,20,3) X2=rnorm(N,20,3) Y= 25 + *2 + X2*1+ rnorm(n,0,15) fit2=lm(y~+x2) summary(fit) Call: lm(formula = Y ~ ) Residuals: Min 1Q Median 3Q Max Coefficients: Estimate Std. Error t value Pr(> t ) (Intercept) *** e-07 *** --- Signif. codes: 0 '***' '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 Residual standard error: on 98 degrees of freedom Multiple R-squared: , Adjusted R-squared: F-statistic: on 1 and 98 DF, p-value: 3.141e-07 summary(fit2) Call: lm(formula = Y ~ + X2) Residuals: Min 1Q Median 3Q Max Coefficients: Estimate Std. Error t value Pr(> t ) (Intercept) e-07 *** X Signif. codes: 0 '***' '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 Residual standard error: on 97 degrees of freedom Multiple R-squared: , Adjusted R-squared: F-statistic: on 2 and 97 DF, p-value: 4.447e-07 In this case, X 1 and X 2 are realizations of two independent random variables, each Gaussian, so the presence of one does not really effect the other in the regression. Here their sample correlation is 10
11 cor(,x2) [1] Suppose we wanted the distribution of X 2 to be correlated with X 1. How might we do that? Figure out a way to do this, and experiment until your X 2 has sample correlation about 0.6 with X 1. Now generate a Y from your regression model, and use summary to examine the coefficients. What do you observe? Now try to produce X 2 with sample correlation about 0.8 with X 1, and repeat the regression experiment above. What do you observe? Answers How do we generate an X 2 that has correlation with X 1? One simple way is to generate a Z independent of X 1, and then make X 2 = αx 1 + Z for some value of α (below we use α = 0.7). Z=X2 X2=0.7*+Z plot(,x2) X cor(,x2) [1] Now when we fit the linear model, the variance of the coefficients increases in the multiple linear regression versus the simple linear regression, because of the correlation. Y= 25 + *2 + X2*1+ rnorm(n,0,15) fit2=lm(y~+x2) summary(fit) 11
12 Call: lm(formula = Y ~ ) Residuals: Min 1Q Median 3Q Max Coefficients: Estimate Std. Error t value Pr(> t ) (Intercept) e-06 *** e-09 *** --- Signif. codes: 0 '***' '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 Residual standard error: 15.1 on 98 degrees of freedom Multiple R-squared: , Adjusted R-squared: F-statistic: on 1 and 98 DF, p-value: 6.404e-09 summary(fit2) Call: lm(formula = Y ~ + X2) Residuals: Min 1Q Median 3Q Max Coefficients: Estimate Std. Error t value Pr(> t ) (Intercept) *** X * --- Signif. codes: 0 '***' '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 Residual standard error: on 97 degrees of freedom Multiple R-squared: , Adjusted R-squared: F-statistic: on 2 and 97 DF, p-value: 3.799e-09 12
Multiple regression - a brief introduction
Multiple regression - a brief introduction Multiple regression is an extension to regular (simple) regression. Instead of one X, we now have several. Suppose, for example, that you are trying to predict
More informationLet us assume that we are measuring the yield of a crop plant on 5 different plots at 4 different observation times.
Mixed-effects models An introduction by Christoph Scherber Up to now, we have been dealing with linear models of the form where ß0 and ß1 are parameters of fixed value. Example: Let us assume that we are
More informationR is a collaborative project with many contributors. Type contributors() for more information.
R is free software and comes with ABSOLUTELY NO WARRANTY. You are welcome to redistribute it under certain conditions. Type license() or licence() for distribution details. R is a collaborative project
More informationFinal Exam Suggested Solutions
University of Washington Fall 003 Department of Economics Eric Zivot Economics 483 Final Exam Suggested Solutions This is a closed book and closed note exam. However, you are allowed one page of handwritten
More informationNon-linearities in Simple Regression
Non-linearities in Simple Regression 1. Eample: Monthly Earnings and Years of Education In this tutorial, we will focus on an eample that eplores the relationship between total monthly earnings and years
More information6 Multiple Regression
More than one X variable. 6 Multiple Regression Why? Might be interested in more than one marginal effect Omitted Variable Bias (OVB) 6.1 and 6.2 House prices and OVB Should I build a fireplace? The following
More informationMODEL SELECTION CRITERIA IN R:
1. R 2 statistics We may use MODEL SELECTION CRITERIA IN R R 2 = SS R SS T = 1 SS Res SS T or R 2 Adj = 1 SS Res/(n p) SS T /(n 1) = 1 ( ) n 1 (1 R 2 ). n p where p is the total number of parameters. R
More informationCOMPREHENSIVE WRITTEN EXAMINATION, PAPER III FRIDAY AUGUST 18, 2006, 9:00 A.M. 1:00 P.M. STATISTICS 174 QUESTIONS
COMPREHENSIVE WRITTEN EXAMINATION, PAPER III FRIDAY AUGUST 18, 2006, 9:00 A.M. 1:00 P.M. STATISTICS 174 QUESTIONS Answer all parts. Closed book, calculators allowed. It is important to show all working,
More informationDummy Variables. 1. Example: Factors Affecting Monthly Earnings
Dummy Variables A dummy variable or binary variable is a variable that takes on a value of 0 or 1 as an indicator that the observation has some kind of characteristic. Common examples: Sex (female): FEMALE=1
More informationRegression Review and Robust Regression. Slides prepared by Elizabeth Newton (MIT)
Regression Review and Robust Regression Slides prepared by Elizabeth Newton (MIT) S-Plus Oil City Data Frame Monthly Excess Returns of Oil City Petroleum, Inc. Stocks and the Market SUMMARY: The oilcity
More informationHomework Assignment Section 3
Homework Assignment Section 3 Tengyuan Liang Business Statistics Booth School of Business Problem 1 A company sets different prices for a particular stereo system in eight different regions of the country.
More informationGeneralized Linear Models
Generalized Linear Models Scott Creel Wednesday, September 10, 2014 This exercise extends the prior material on using the lm() function to fit an OLS regression and test hypotheses about effects on a parameter.
More informationThe Norwegian State Equity Ownership
The Norwegian State Equity Ownership B A Ødegaard 15 November 2018 Contents 1 Introduction 1 2 Doing a performance analysis 1 2.1 Using R....................................................................
More informationHomework Assignment Section 3
Homework Assignment Section 3 Tengyuan Liang Business Statistics Booth School of Business Problem 1 A company sets different prices for a particular stereo system in eight different regions of the country.
More informationStudy 2: data analysis. Example analysis using R
Study 2: data analysis Example analysis using R Steps for data analysis Install software on your computer or locate computer with software (e.g., R, systat, SPSS) Prepare data for analysis Subjects (rows)
More informationState Ownership at the Oslo Stock Exchange. Bernt Arne Ødegaard
State Ownership at the Oslo Stock Exchange Bernt Arne Ødegaard Introduction We ask whether there is a state rebate on companies listed on the Oslo Stock Exchange, i.e. whether companies where the state
More informationLecture Note: Analysis of Financial Time Series Spring 2017, Ruey S. Tsay
Lecture Note: Analysis of Financial Time Series Spring 2017, Ruey S. Tsay Seasonal Time Series: TS with periodic patterns and useful in predicting quarterly earnings pricing weather-related derivatives
More informationJaime Frade Dr. Niu Interest rate modeling
Interest rate modeling Abstract In this paper, three models were used to forecast short term interest rates for the 3 month LIBOR. Each of the models, regression time series, GARCH, and Cox, Ingersoll,
More informationEconomics 424/Applied Mathematics 540. Final Exam Solutions
University of Washington Summer 01 Department of Economics Eric Zivot Economics 44/Applied Mathematics 540 Final Exam Solutions I. Matrix Algebra and Portfolio Math (30 points, 5 points each) Let R i denote
More informationSTATISTICS 110/201, FALL 2017 Homework #5 Solutions Assigned Mon, November 6, Due Wed, November 15
STATISTICS 110/201, FALL 2017 Homework #5 Solutions Assigned Mon, November 6, Due Wed, November 15 For this assignment use the Diamonds dataset in the Stat2Data library. The dataset is used in examples
More informationReview: Population, sample, and sampling distributions
Review: Population, sample, and sampling distributions A population with mean µ and standard deviation σ For instance, µ = 0, σ = 1 0 1 Sample 1, N=30 Sample 2, N=30 Sample 100000000000 InterquartileRange
More informationNHY examples. Bernt Arne Ødegaard. 23 November Estimating dividend growth in Norsk Hydro 8
NHY examples Bernt Arne Ødegaard 23 November 2017 Abstract Finance examples using equity data for Norsk Hydro (NHY) Contents 1 Calculating Beta 4 2 Cost of Capital 7 3 Estimating dividend growth in Norsk
More informationMonetary Economics Risk and Return, Part 2. Gerald P. Dwyer Fall 2015
Monetary Economics Risk and Return, Part 2 Gerald P. Dwyer Fall 2015 Reading Malkiel, Part 2, Part 3 Malkiel, Part 3 Outline Returns and risk Overall market risk reduced over longer periods Individual
More informationSubject CS1 Actuarial Statistics 1 Core Principles. Syllabus. for the 2019 exams. 1 June 2018
` Subject CS1 Actuarial Statistics 1 Core Principles Syllabus for the 2019 exams 1 June 2018 Copyright in this Core Reading is the property of the Institute and Faculty of Actuaries who are the sole distributors.
More informationState Ownership at the Oslo Stock Exchange
State Ownership at the Oslo Stock Exchange Bernt Arne Ødegaard 1 Introduction We ask whether there is a state rebate on companies listed on the Oslo Stock Exchange, i.e. whether companies where the state
More informationLinear regression model
Regression Model Assumptions (Solutions) STAT-UB.0003: Regression and Forecasting Models Linear regression model 1. Here is the least squares regression fit to the Zagat restaurant data: 10 15 20 25 10
More informationThe histogram should resemble the uniform density, the mean should be close to 0.5, and the standard deviation should be close to 1/ 12 =
Chapter 19 Monte Carlo Valuation Question 19.1 The histogram should resemble the uniform density, the mean should be close to.5, and the standard deviation should be close to 1/ 1 =.887. Question 19. The
More informationσ e, which will be large when prediction errors are Linear regression model
Linear regression model we assume that two quantitative variables, x and y, are linearly related; that is, the population of (x, y) pairs are related by an ideal population regression line y = α + βx +
More informationMCMC Package Example
MCMC Package Example Charles J. Geyer April 4, 2005 This is an example of using the mcmc package in R. The problem comes from a take-home question on a (take-home) PhD qualifying exam (School of Statistics,
More informationMixedModR2 Erika Mudrak Thursday, August 30, 2018
MixedModR Erika Mudrak Thursday, August 3, 18 Generate the Data Generate data points from a population with one random effect: levels of Factor A, each sampled 5 times set.seed(39) siga
More informationChapter 8 Statistical Intervals for a Single Sample
Chapter 8 Statistical Intervals for a Single Sample Part 1: Confidence intervals (CI) for population mean µ Section 8-1: CI for µ when σ 2 known & drawing from normal distribution Section 8-1.2: Sample
More informationGGraph. Males Only. Premium. Experience. GGraph. Gender. 1 0: R 2 Linear = : R 2 Linear = Page 1
GGraph 9 Gender : R Linear =.43 : R Linear =.769 8 7 6 5 4 3 5 5 Males Only GGraph Page R Linear =.43 R Loess 9 8 7 6 5 4 5 5 Explore Case Processing Summary Cases Valid Missing Total N Percent N Percent
More informationLoss Simulation Model Testing and Enhancement
Loss Simulation Model Testing and Enhancement Casualty Loss Reserve Seminar By Kailan Shang Sept. 2011 Agenda Research Overview Model Testing Real Data Model Enhancement Further Development Enterprise
More informationStat 401XV Exam 3 Spring 2017
Stat 40XV Exam Spring 07 I have neither given nor received unauthorized assistance on this exam. Name Signed Date Name Printed ATTENTION! Incorrect numerical answers unaccompanied by supporting reasoning
More informationSTA2601. Tutorial letter 105/2/2018. Applied Statistics II. Semester 2. Department of Statistics STA2601/105/2/2018 TRIAL EXAMINATION PAPER
STA2601/105/2/2018 Tutorial letter 105/2/2018 Applied Statistics II STA2601 Semester 2 Department of Statistics TRIAL EXAMINATION PAPER Define tomorrow. university of south africa Dear Student Congratulations
More informationMultiple linear regression
Multiple linear regression Business Statistics 41000 Spring 2017 1 Topics 1. Including multiple predictors 2. Controlling for confounders 3. Transformations, interactions, dummy variables OpenIntro 8.1,
More informationDetermination of the Optimal Stratum Boundaries in the Monthly Retail Trade Survey in the Croatian Bureau of Statistics
Determination of the Optimal Stratum Boundaries in the Monthly Retail Trade Survey in the Croatian Bureau of Statistics Ivana JURINA (jurinai@dzs.hr) Croatian Bureau of Statistics Lidija GLIGOROVA (gligoroval@dzs.hr)
More information> attach(grocery) > boxplot(sales~discount, ylab="sales",xlab="discount")
Example of More than 2 Categories, and Analysis of Covariance Example > attach(grocery) > boxplot(sales~discount, ylab="sales",xlab="discount") Sales 160 200 240 > tapply(sales,discount,mean) 10.00% 15.00%
More informationEconometric Methods for Valuation Analysis
Econometric Methods for Valuation Analysis Margarita Genius Dept of Economics M. Genius (Univ. of Crete) Econometric Methods for Valuation Analysis Cagliari, 2017 1 / 26 Correlation Analysis Simple Regression
More informationLinear Regression with One Regressor
Linear Regression with One Regressor Michael Ash Lecture 9 Linear Regression with One Regressor Review of Last Time 1. The Linear Regression Model The relationship between independent X and dependent Y
More informationChapter 7 Sampling Distributions and Point Estimation of Parameters
Chapter 7 Sampling Distributions and Point Estimation of Parameters Part 1: Sampling Distributions, the Central Limit Theorem, Point Estimation & Estimators Sections 7-1 to 7-2 1 / 25 Statistical Inferences
More informationBasic Procedure for Histograms
Basic Procedure for Histograms 1. Compute the range of observations (min. & max. value) 2. Choose an initial # of classes (most likely based on the range of values, try and find a number of classes that
More informationRandom Effects ANOVA
Random Effects ANOVA Grant B. Morgan Baylor University This post contains code for conducting a random effects ANOVA. Make sure the following packages are installed: foreign, lme4, lsr, lattice. library(foreign)
More informationIntroduction to General and Generalized Linear Models
Introduction to General and Generalized Linear Models Generalized Linear Models - IIIb Henrik Madsen March 18, 2012 Henrik Madsen () Chapman & Hall March 18, 2012 1 / 32 Examples Overdispersion and Offset!
More informationBusiness Statistics: A First Course
Business Statistics: A First Course Fifth Edition Chapter 12 Correlation and Simple Linear Regression Business Statistics: A First Course, 5e 2009 Prentice-Hall, Inc. Chap 12-1 Learning Objectives In this
More informationChapter 7: SAMPLING DISTRIBUTIONS & POINT ESTIMATION OF PARAMETERS
Chapter 7: SAMPLING DISTRIBUTIONS & POINT ESTIMATION OF PARAMETERS Part 1: Introduction Sampling Distributions & the Central Limit Theorem Point Estimation & Estimators Sections 7-1 to 7-2 Sample data
More informationORDERED MULTINOMIAL LOGISTIC REGRESSION ANALYSIS. Pooja Shivraj Southern Methodist University
ORDERED MULTINOMIAL LOGISTIC REGRESSION ANALYSIS Pooja Shivraj Southern Methodist University KINDS OF REGRESSION ANALYSES Linear Regression Logistic Regression Dichotomous dependent variable (yes/no, died/
More informationEconometric Methods for Valuation Analysis
Econometric Methods for Valuation Analysis Margarita Genius Dept of Economics M. Genius (Univ. of Crete) Econometric Methods for Valuation Analysis Cagliari, 2017 1 / 25 Outline We will consider econometric
More informationStat 101 Exam 1 - Embers Important Formulas and Concepts 1
1 Chapter 1 1.1 Definitions Stat 101 Exam 1 - Embers Important Formulas and Concepts 1 1. Data Any collection of numbers, characters, images, or other items that provide information about something. 2.
More informationboxcox() returns the values of α and their loglikelihoods,
Solutions to Selected Computer Lab Problems and Exercises in Chapter 11 of Statistics and Data Analysis for Financial Engineering, 2nd ed. by David Ruppert and David S. Matteson c 2016 David Ruppert and
More informationStep 1: Load the appropriate R package. Step 2: Fit a separate mixed model for each independence claim in the basis set.
Step 1: Load the appropriate R package. You will need two libraries: nlme and lme4. Step 2: Fit a separate mixed model for each independence claim in the basis set. For instance, in Table 2 the first basis
More informationKey Objectives. Module 2: The Logic of Statistical Inference. Z-scores. SGSB Workshop: Using Statistical Data to Make Decisions
SGSB Workshop: Using Statistical Data to Make Decisions Module 2: The Logic of Statistical Inference Dr. Tom Ilvento January 2006 Dr. Mugdim Pašić Key Objectives Understand the logic of statistical inference
More informationBusiness Statistics 41000: Probability 4
Business Statistics 41000: Probability 4 Drew D. Creal University of Chicago, Booth School of Business February 14 and 15, 2014 1 Class information Drew D. Creal Email: dcreal@chicagobooth.edu Office:
More informationTests for the Difference Between Two Linear Regression Intercepts
Chapter 853 Tests for the Difference Between Two Linear Regression Intercepts Introduction Linear regression is a commonly used procedure in statistical analysis. One of the main objectives in linear regression
More informationPASS Sample Size Software
Chapter 850 Introduction Cox proportional hazards regression models the relationship between the hazard function λ( t X ) time and k covariates using the following formula λ log λ ( t X ) ( t) 0 = β1 X1
More informationINSTITUTE OF ACTUARIES OF INDIA EXAMINATIONS. 20 th May Subject CT3 Probability & Mathematical Statistics
INSTITUTE OF ACTUARIES OF INDIA EXAMINATIONS 20 th May 2013 Subject CT3 Probability & Mathematical Statistics Time allowed: Three Hours (10.00 13.00) Total Marks: 100 INSTRUCTIONS TO THE CANDIDATES 1.
More informationCopyright 2011 Pearson Education, Inc. Publishing as Addison-Wesley.
Appendix: Statistics in Action Part I Financial Time Series 1. These data show the effects of stock splits. If you investigate further, you ll find that most of these splits (such as in May 1970) are 3-for-1
More informationTwo Hours. Mathematical formula books and statistical tables are to be provided THE UNIVERSITY OF MANCHESTER. 22 January :00 16:00
Two Hours MATH38191 Mathematical formula books and statistical tables are to be provided THE UNIVERSITY OF MANCHESTER STATISTICAL MODELLING IN FINANCE 22 January 2015 14:00 16:00 Answer ALL TWO questions
More informationGraduate School of Business, University of Chicago Business 41202, Spring Quarter 2007, Mr. Ruey S. Tsay. Midterm
Graduate School of Business, University of Chicago Business 41202, Spring Quarter 2007, Mr. Ruey S. Tsay Midterm GSB Honor Code: I pledge my honor that I have not violated the Honor Code during this examination.
More informationModels of Patterns. Lecture 3, SMMD 2005 Bob Stine
Models of Patterns Lecture 3, SMMD 2005 Bob Stine Review Speculative investing and portfolios Risk and variance Volatility adjusted return Volatility drag Dependence Covariance Review Example Stock and
More informationStat3011: Solution of Midterm Exam One
1 Stat3011: Solution of Midterm Exam One Fall/2003, Tiefeng Jiang Name: Problem 1 (30 points). Choose one appropriate answer in each of the following questions. 1. (B ) The mean age of five people in a
More informationProblem Set 4 Answer Key
Economics 31 Menzie D. Chinn Fall 4 Social Sciences 7418 University of Wisconsin-Madison Problem Set 4 Answer Key This problem set is due in lecture on Wednesday, December 1st. No late problem sets will
More informationStat 328, Summer 2005
Stat 328, Summer 2005 Exam #2, 6/18/05 Name (print) UnivID I have neither given nor received any unauthorized aid in completing this exam. Signed Answer each question completely showing your work where
More informationCase Study: Applying Generalized Linear Models
Case Study: Applying Generalized Linear Models Dr. Kempthorne May 12, 2016 Contents 1 Generalized Linear Models of Semi-Quantal Biological Assay Data 2 1.1 Coal miners Pneumoconiosis Data.................
More informationMilestone2. Zillow House Price Prediciton. Group: Lingzi Hong and Pranali Shetty
Milestone2 Zillow House Price Prediciton Group Lingzi Hong and Pranali Shetty MILESTONE 2 REPORT Data Collection The following additional features were added 1. Population, Number of College Graduates
More informationIntroduction to Population Modeling
Introduction to Population Modeling In addition to estimating the size of a population, it is often beneficial to estimate how the population size changes over time. Ecologists often uses models to create
More informationGARCH Models. Instructor: G. William Schwert
APS 425 Fall 2015 GARCH Models Instructor: G. William Schwert 585-275-2470 schwert@schwert.ssb.rochester.edu Autocorrelated Heteroskedasticity Suppose you have regression residuals Mean = 0, not autocorrelated
More informationOrdinal Multinomial Logistic Regression. Thom M. Suhy Southern Methodist University May14th, 2013
Ordinal Multinomial Logistic Thom M. Suhy Southern Methodist University May14th, 2013 GLM Generalized Linear Model (GLM) Framework for statistical analysis (Gelman and Hill, 2007, p. 135) Linear Continuous
More informationChapter 6 Part 3 October 21, Bootstrapping
Chapter 6 Part 3 October 21, 2008 Bootstrapping From the internet: The bootstrap involves repeated re-estimation of a parameter using random samples with replacement from the original data. Because the
More informationWindow Width Selection for L 2 Adjusted Quantile Regression
Window Width Selection for L 2 Adjusted Quantile Regression Yoonsuh Jung, The Ohio State University Steven N. MacEachern, The Ohio State University Yoonkyung Lee, The Ohio State University Technical Report
More informationMVE051/MSG Lecture 7
MVE051/MSG810 2017 Lecture 7 Petter Mostad Chalmers November 20, 2017 The purpose of collecting and analyzing data Purpose: To build and select models for parts of the real world (which can be used for
More informationGeneralized Multilevel Regression Example for a Binary Outcome
Psy 510/610 Multilevel Regression, Spring 2017 1 HLM Generalized Multilevel Regression Example for a Binary Outcome Specifications for this Bernoulli HLM2 run Problem Title: no title The data source for
More informationMonte Carlo Simulation and Resampling
Monte Carlo Simulation and Resampling Tom Carsey (Instructor) Jeff Harden (TA) ICPSR Summer Course Summer, 2011 Monte Carlo Simulation and Resampling 1/114 Introductions and Overview What do I plan for
More informationTime Observations Time Period, t
Operations Research Models and Methods Paul A. Jensen and Jonathan F. Bard Time Series and Forecasting.S1 Time Series Models An example of a time series for 25 periods is plotted in Fig. 1 from the numerical
More informationLecture 3: Factor models in modern portfolio choice
Lecture 3: Factor models in modern portfolio choice Prof. Massimo Guidolin Portfolio Management Spring 2016 Overview The inputs of portfolio problems Using the single index model Multi-index models Portfolio
More informationStatistics 431 Spring 2007 P. Shaman. Preliminaries
Statistics 4 Spring 007 P. Shaman The Binomial Distribution Preliminaries A binomial experiment is defined by the following conditions: A sequence of n trials is conducted, with each trial having two possible
More informationPredicting Charitable Contributions
Predicting Charitable Contributions By Lauren Meyer Executive Summary Charitable contributions depend on many factors from financial security to personal characteristics. This report will focus on demographic
More informationGov 2001: Section 5. I. A Normal Example II. Uncertainty. Gov Spring 2010
Gov 2001: Section 5 I. A Normal Example II. Uncertainty Gov 2001 Spring 2010 A roadmap We started by introducing the concept of likelihood in the simplest univariate context one observation, one variable.
More informationSAS Simple Linear Regression Example
SAS Simple Linear Regression Example This handout gives examples of how to use SAS to generate a simple linear regression plot, check the correlation between two variables, fit a simple linear regression
More informationBusiness Statistics Midterm Exam Fall 2013 Russell
Name Business Statistics Midterm Exam Fall 2013 Russell Do not turn over this page until you are told to do so. You will have 2 hours to complete the exam. There are a total of 100 points divided into
More information4.1 Introduction Estimating a population mean The problem with estimating a population mean with a sample mean: an example...
Chapter 4 Point estimation Contents 4.1 Introduction................................... 2 4.2 Estimating a population mean......................... 2 4.2.1 The problem with estimating a population mean
More informationPanel Data. November 15, The panel is balanced if all individuals have a complete set of observations, otherwise the panel is unbalanced.
Panel Data November 15, 2018 1 Panel data Panel data are obsevations of the same individual on different dates. time Individ 1 Individ 2 Individ 3 individuals The panel is balanced if all individuals have
More informationThe misleading nature of correlations
The misleading nature of correlations In this note we explain certain subtle features of calculating correlations between time-series. Correlation is a measure of linear co-movement, to be contrasted with
More informationEconomics 413: Economic Forecast and Analysis Department of Economics, Finance and Legal Studies University of Alabama
Problem Set #1 (Linear Regression) 1. The file entitled MONEYDEM.XLS contains quarterly values of seasonally adjusted U.S.3-month ( 3 ) and 1-year ( 1 ) treasury bill rates. Each series is measured over
More informationFinancial Econometrics Jeffrey R. Russell. Midterm 2014 Suggested Solutions. TA: B. B. Deng
Financial Econometrics Jeffrey R. Russell Midterm 2014 Suggested Solutions TA: B. B. Deng Unless otherwise stated, e t is iid N(0,s 2 ) 1. (12 points) Consider the three series y1, y2, y3, and y4. Match
More informationStatistics vs. statistics
Statistics vs. statistics Question: What is Statistics (with a capital S)? Definition: Statistics is the science of collecting, organizing, summarizing and interpreting data. Note: There are 2 main ways
More informationStatistics for Business and Economics
Statistics for Business and Economics Chapter 7 Estimation: Single Population Copyright 010 Pearson Education, Inc. Publishing as Prentice Hall Ch. 7-1 Confidence Intervals Contents of this chapter: Confidence
More informationCHAPTER 2 Describing Data: Numerical
CHAPTER Multiple-Choice Questions 1. A scatter plot can illustrate all of the following except: A) the median of each of the two variables B) the range of each of the two variables C) an indication of
More informationSpring, Beta and Regression
Spring, 2000-1 - Administrative Items Getting help See me Monday 3-5:30 or tomorrow after 2:30. Send me an e-mail with your question. (stine@wharton) Visit the StatLab/TAs, particularly for help using
More informationAnalysis of Variance in Matrix form
Analysis of Variance in Matrix form The ANOVA table sums of squares, SSTO, SSR and SSE can all be expressed in matrix form as follows. week 9 Multiple Regression A multiple regression model is a model
More informationCHAPTER 8. Confidence Interval Estimation Point and Interval Estimates
CHAPTER 8. Confidence Interval Estimation Point and Interval Estimates A point estimate is a single number, a confidence interval provides additional information about the variability of the estimate Lower
More informationHandout 4: Gains from Diversification for 2 Risky Assets Corporate Finance, Sections 001 and 002
Handout 4: Gains from Diversification for 2 Risky Assets Corporate Finance, Sections 001 and 002 Suppose you are deciding how to allocate your wealth between two risky assets. Recall that the expected
More informationIntroduction to R (2)
Introduction to R (2) Boxplots Boxplots are highly efficient tools for the representation of the data distributions. The five number summary can be located in boxplots. Additionally, we can distinguish
More informationCHAPTER 7 INTRODUCTION TO SAMPLING DISTRIBUTIONS
CHAPTER 7 INTRODUCTION TO SAMPLING DISTRIBUTIONS Note: This section uses session window commands instead of menu choices CENTRAL LIMIT THEOREM (SECTION 7.2 OF UNDERSTANDABLE STATISTICS) The Central Limit
More informationVariance clustering. Two motivations, volatility clustering, and implied volatility
Variance modelling The simplest assumption for time series is that variance is constant. Unfortunately that assumption is often violated in actual data. In this lecture we look at the implications of time
More informationModel 0: We start with a linear regression model: log Y t = β 0 + β 1 (t 1980) + ε, with ε N(0,
Stat 534: Fall 2017. Introduction to the BUGS language and rjags Installation: download and install JAGS. You will find the executables on Sourceforge. You must have JAGS installed prior to installing
More informationECE 295: Lecture 03 Estimation and Confidence Interval
ECE 295: Lecture 03 Estimation and Confidence Interval Spring 2018 Prof Stanley Chan School of Electrical and Computer Engineering Purdue University 1 / 23 Theme of this Lecture What is Estimation? You
More informationbook 2014/5/6 15:21 page 261 #285
book 2014/5/6 15:21 page 261 #285 Chapter 10 Simulation Simulations provide a powerful way to answer questions and explore properties of statistical estimators and procedures. In this chapter, we will
More informationLecture Note of Bus 41202, Spring 2010: Analysis of Multiple Series with Applications. x 1t x 2t. holdings (OIH) and energy select section SPDR (XLE).
Lecture Note of Bus 41202, Spring 2010: Analysis of Multiple Series with Applications Focus on two series (i.e., bivariate case) Time series: Data: x 1, x 2,, x T. X t = Some examples: (a) U.S. quarterly
More informationUniversity of Zürich, Switzerland
University of Zürich, Switzerland RE - general asset features The inclusion of real estate assets in a portfolio has proven to bring diversification benefits both for homeowners [Mahieu, Van Bussel 1996]
More information