Measures of Broad Sense Heritability from multi-location and multi-year trials.


ANOVA

The Analysis of Variance (ANOVA) is a statistical tool that we rely on for measuring variation associated with a trait, ranging from yield in a field trial to gene expression on a microarray. ANOVA techniques are used extensively in the analysis of molecular marker associations with traits. The purpose of this exercise is to review simple ANOVA models and to demonstrate how variance partitioning can be used to estimate heritability. ANOVA assigns total variation to known causes, leaving a residual portion allocated to uncontrolled or unexplained variation (experimental error). The ability of ANOVA to partition variance is inextricably linked to experimental design. An excellent review of ANOVA can be found in McIntosh (1983).

Table 1. Expected mean squares for randomized complete block experiments combined over locations.

Sources of Var.       df              Mean Squares
Locations             L-1             M1
Blocks(Locations)     L(r-1)          M2
Treatment             T-1             M3
L x T                 (L-1)(T-1)      M4
Error                 L(r-1)(T-1)     M5

L = locations, r = blocks (replicates), T = treatments

Table 2. Expected mean squares for experiments with Random (R) and Fixed (F) effects.

Sources of Var.   RL-RT                                                             RL-FT                                              FL-FT
Locations         $\sigma^2_e + r\sigma^2_{TL} + T\sigma^2_{r(L)} + rT\sigma^2_L$   $\sigma^2_e + T\sigma^2_{r(L)} + rT\sigma^2_L$     $\sigma^2_e + T\sigma^2_{r(L)} + rT\sigma^2_L$
Blocks(L)         $\sigma^2_e + T\sigma^2_{r(L)}$                                   $\sigma^2_e + T\sigma^2_{r(L)}$                    $\sigma^2_e + T\sigma^2_{r(L)}$
Treatment         $\sigma^2_e + r\sigma^2_{TL} + rL\sigma^2_T$                      $\sigma^2_e + r\sigma^2_{TL} + rL\sigma^2_T$       $\sigma^2_e + rL\sigma^2_T$
L x T             $\sigma^2_e + r\sigma^2_{TL}$                                     $\sigma^2_e + r\sigma^2_{TL}$                      $\sigma^2_e + r\sigma^2_{TL}$
Error             $\sigma^2_e$                                                      $\sigma^2_e$                                       $\sigma^2_e$

RL-RT = random locations, random treatments; RL-FT = random locations, fixed treatments; FL-FT = fixed locations, fixed treatments.

Table 3. F-ratios used to test effects for randomized complete block experiments combined over locations.

                                        F-Test
Sources of Var.       Mean Squares    RL-RT               RL-FT     FL-FT
Locations             M1              (M1+M5)/(M2+M4)     M1/M2     M1/M2
Blocks(Locations)     M2
Treatment             M3              M3/M4               M3/M4     M3/M5
L x T                 M4              M4/M5               M4/M5     M4/M5
Error                 M5

In a genetic or breeding experiment, Treatment would likely be genotype or variety. In an experiment designed to test for associations between a marker and a trait, Treatment would be the marker.
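The F-ratios in Table 3 can be sketched in Python as a small lookup. This is only an illustration: the function name `f_ratios` and the mean squares in `example_ms` are hypothetical and do not come from McIntosh (1983) or from any data set in this handout.

```python
def f_ratios(ms, model):
    """Return the Table 3 F-ratios for one combined-locations RCB analysis.

    ms    : dict with mean squares under the keys 'M1' .. 'M5'
    model : 'RL-RT', 'RL-FT' or 'FL-FT' (R = random, F = fixed;
            L = locations, T = treatments)
    """
    m1, m2, m3, m4, m5 = (ms[k] for k in ("M1", "M2", "M3", "M4", "M5"))

    if model == "RL-RT":
        # Locations need a synthesized ratio when both factors are random.
        locations = (m1 + m5) / (m2 + m4)
    elif model in ("RL-FT", "FL-FT"):
        locations = m1 / m2
    else:
        raise ValueError(f"unknown model: {model!r}")

    # Treatments are tested against L x T unless both factors are fixed.
    treatment = m3 / m5 if model == "FL-FT" else m3 / m4

    return {"Locations": locations, "Treatment": treatment, "L x T": m4 / m5}


# Hypothetical mean squares, for illustration only.
example_ms = {"M1": 40.0, "M2": 4.0, "M3": 30.0, "M4": 6.0, "M5": 2.0}

for model in ("RL-RT", "RL-FT", "FL-FT"):
    print(model, f_ratios(example_ms, model))
```

With the same mean squares, the treatment F-ratio changes from M3/M4 to M3/M5 once locations are treated as fixed, which is why the fixed or random status of each effect has to be decided before testing.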

Take home messages: McIntosh (1983) serves as a set of reference tables for use during experimental design and analysis. The paper reinforces two important points: (1) unaccounted sources of variation are pooled into the error term, resulting in an inflated error; (2) the appropriate F-tests differ depending on whether the effects are fixed or random.

Estimation of heritability from ANOVA

Broad sense heritability can be estimated through estimates of variance obtained from replication of a population in time and space. Experiments designed to estimate the variance components used in broad sense heritability must be grown in an adequate sample of environments.

An example from a 2-year (97 and 98) replicated trial (replicates 1 and 2).

Data: var rep year hplc uv

ANOVA

Genotype sums for hplc: 32.74, 33.44, 41.84, 31.62, 47.59, 49.57, 36.12 (overall sum 272.92, overall average 9.75).

Sum of squares for genotype:

$$SS_{gen} = \frac{(32.74)^2 + (33.44)^2 + (41.84)^2 + (31.62)^2 + (47.59)^2 + (49.57)^2 + (36.12)^2}{4} - (272.92 \times 9.75)$$

Sums of squares are calculated for main effects, interactions, and error (the total sum of squares minus the sums of squares for main effects and interactions).

Source      DF   Sum Squares   Mean Square   F Value   Pr > F
gen
rep
year
year*gen
rep*gen
year*rep
Error

Note: Sum Squares / df = Mean Square

Source          Expected MS
gen             Var(Error) + 2 Var(rep*gen) + 2 Var(year*gen) + 4 Var(gen)
rep             Var(Error) + 7 Var(year*rep) + 2 Var(rep*gen) + 14 Var(rep)
year            Var(Error) + 7 Var(year*rep) + 2 Var(year*gen) + 14 Var(year)
year*gen        Var(Error) + 2 Var(year*gen)
rep*gen         Var(Error) + 2 Var(rep*gen)
year*rep        Var(Error) + 7 Var(year*rep)
gen*year*rep    Var(Error)

To estimate Var(gen), subtract expected mean squares until only the Var(gen) term remains:

MS(gen) = Var(Error) + 2 Var(rep*gen) + 2 Var(year*gen) + 4 Var(gen)
MS(year*gen) = Var(Error) + 2 Var(year*gen)
MS(gen) - MS(year*gen) = 2 Var(rep*gen) + 4 Var(gen)
MS(rep*gen) - MS(Error) = 2 Var(rep*gen)
MS(gen) - MS(year*gen) - MS(rep*gen) + MS(Error) = 4 Var(gen)
2.55 = Var(gen)
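The hand calculation above can be checked with a short Python sketch. The genotype sums are the ones quoted in the sum of squares formula; the function names and the mean squares passed to `var_gen_from_ms` at the end are hypothetical, because the ANOVA table values are not reproduced here. The correction factor is computed as (grand total)^2 / N rather than with the rounded mean 9.75, so the result can differ slightly from a hand calculation.

```python
# Genotype totals for hplc, each summed over 2 reps x 2 years = 4 observations.
genotype_sums = [32.74, 33.44, 41.84, 31.62, 47.59, 49.57, 36.12]
obs_per_genotype = 4


def ss_from_totals(totals, n_per_total):
    """Sum of squares for a main effect computed from its level totals."""
    grand_total = sum(totals)
    n_obs = len(totals) * n_per_total
    correction_factor = grand_total ** 2 / n_obs  # grand total x grand mean
    return sum(t * t for t in totals) / n_per_total - correction_factor


def var_gen_from_ms(ms_gen, ms_year_gen, ms_rep_gen, ms_error, n_rep=2, n_year=2):
    """Method-of-moments estimate of Var(gen) from the expected mean squares.

    MS(gen) - MS(year*gen) - MS(rep*gen) + MS(Error) = (n_rep * n_year) Var(gen)
    """
    return (ms_gen - ms_year_gen - ms_rep_gen + ms_error) / (n_rep * n_year)


ss_gen = ss_from_totals(genotype_sums, obs_per_genotype)
ms_gen = ss_gen / (len(genotype_sums) - 1)  # df for gen = 7 - 1
print(f"SS(gen) = {ss_gen:.3f}, MS(gen) = {ms_gen:.3f}")

# Hypothetical mean squares for the interaction and error lines.
print(var_gen_from_ms(ms_gen=13.6, ms_year_gen=2.0, ms_rep_gen=1.5, ms_error=0.9))
```

Because the estimate is a difference of mean squares, nothing prevents it from being negative when the true component is close to zero.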

Compare to the estimate obtained from MIVQUE0 in SAS:

Component        Estimate
Var(gen)
Var(rep)
Var(year)
Var(year*gen)
Var(rep*gen)
Var(year*rep)
Var(Error)

SAS code (2yearlyc.txt):

    /* Read the lycopene data; the first line of the file holds column names */
    data cdata;
      infile 'a:2yrlyc.csv' delimiter=',' firstobs=2;
      input gen rep year hplc uv;
    run;

    /* ANOVA with expected mean squares and F-tests for random effects */
    proc glm data=cdata;
      class year rep gen;
      model hplc uv = gen year rep gen*year gen*rep rep*year / ss3;
      random gen year rep gen*year gen*rep rep*year / test;
      title 'ANOVA for lycopene data with expected MS';
    run;

    /* Variance components (default method MIVQUE0) */
    proc varcomp data=cdata;
      class year rep gen;
      model hplc uv = gen year rep gen*year gen*rep rep*year;
      title 'Variance Components for lycopene data';
    run;

Notes on the SAS code

The program above uses a new procedure (to us), PROC VARCOMP. PROC VARCOMP is one of two methods we can use to estimate variance components (the other, PROC MIXED, will be introduced later). You can specify effects as fixed by putting them first in the MODEL statement and indicating the number of fixed effects with the FIXED= option. Variance components are estimated for the remaining (random) effects. Four methods of estimation can be specified in the PROC VARCOMP statement by using the METHOD= option:

TYPE1      Based on computation of the Type 1 sums of squares.
MIVQUE0    The default. Similar to TYPE1, but computationally faster.
ML         Maximum likelihood.
REML       Restricted maximum likelihood (favored in breeding work).

Estimating Heritability from Variance Components of lines

When using variance components from ANOVA to estimate broad sense heritability, the practical application determines the appropriate denominator. For example, it is common to exclude variance components due to blocks, years, and locations because it is assumed that selection will occur on means across replicates, locations, and years. This practice is based on the assumption that means will be corrected for differences between locations, blocks, and years (i.e. expressed as deviations from the mean). The rationale for this approach is that heritability should be defined based on the variance associated with the selection unit:

$$BSH = \sigma^2_G / \sigma^2_x$$

and $\sigma^2_x = \sigma^2_P$ when the selection unit is an individual. When the selection unit is not an individual but a family, inbred line, or clone for which replicated phenotypic data have been collected, the expression for phenotypic variation is adjusted to represent the expected phenotypic variation among family (or clone, or inbred line) means.

Rules of thumb:
First, define the selection unit.
Second, correct phenotypic measurements for the appropriate effects (using least squares, best linear unbiased prediction (BLUP), or simple corrections).
Third, drop the main effects of year, location, and rep from the denominator.
Fourth, adjust the variance estimates for the selection unit.

A simple correction expresses each measurement as a deviation from its block mean (Equation 4 from Cotterill, 1987):

Deviation(JK) = (phenotypic measurement for individual J in block K) - (mean of block K)

$$d_{JK} = Y_{JK} - \bar{Y}_K$$

For a one-year, one-location, randomized complete block design with r replications and n individuals measured per plot, and assuming we are selecting the best family or line, the heritability is:

$$H_{family} = \frac{\sigma^2_G}{\sigma^2_G + \sigma^2_{error}/r + \sigma^2_{within\ family}/(rn)}$$

When the plot is measured as a group (rather than n measurements per plot):

$$H_{family} = \frac{\sigma^2_G}{\sigma^2_G + \sigma^2_{error}/r + \sigma^2_{within\ family}/r}$$

However, if the goal is to select the best individual from each line:

$$H = \frac{\sigma^2_G}{\sigma^2_G + \sigma^2_{error} + \sigma^2_{within\ family}}$$
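The three heritability expressions above differ only in how the error and within-family variances are divided. A minimal Python sketch, with hypothetical function names and hypothetical variance components chosen purely for illustration:

```python
def h_family_mean(var_g, var_error, var_within, r, n=None):
    """Heritability on a family-mean basis (one year, one location, RCB).

    r : number of replications
    n : individuals measured per plot; None means the plot was measured
        as a group, so the within-family variance is divided by r only.
    """
    within_divisor = r if n is None else r * n
    return var_g / (var_g + var_error / r + var_within / within_divisor)


def h_individual(var_g, var_error, var_within):
    """Heritability when the selection unit is a single individual."""
    return var_g / (var_g + var_error + var_within)


# Hypothetical variance components (not taken from the lycopene data).
var_g, var_error, var_within = 2.0, 4.0, 6.0

print(h_family_mean(var_g, var_error, var_within, r=2, n=3))  # n plants per plot
print(h_family_mean(var_g, var_error, var_within, r=2))       # plot measured as a group
print(h_individual(var_g, var_error, var_within))             # individual selection
```

With the same variance components, heritability on a family-mean basis is higher than on an individual basis because replication divides the non-genetic terms in the denominator.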

Justification for ignoring main effects (variation due to location, year, or blocks) is based on the fundamental assumption that corrections will be made for these effects before phenotypic measurements are used for selection. This is an important point if significant main effects exist. If main effects for location, year, or blocks are small, they will contribute little to the denominator.

The equation can be generalized as:

$$H = \frac{\sigma^2_G}{\sigma^2_G + \sigma^2_{error}/(r\,y\,l) + \sigma^2_{GYL}/(y\,l) + \sigma^2_{GL}/l + \sigma^2_{GY}/y}$$

where r = number of reps, y = number of years, and l = number of locations.

Hallauer and Miranda further generalize the equation as:

$$H = \frac{\sigma^2_G}{\sigma^2_G + \sigma^2_{ge}/e + \sigma^2_{error}/(re)}$$

where r = number of reps and e = number of environments.

The standard error for H is:

$$SE(H) = \frac{SE(\sigma^2_G)}{\sigma^2_G + \sigma^2_{ge}/e + \sigma^2_{error}/(re)}$$

For the lycopene data:

Component        Estimate
Var(gen)
Var(rep)         (omit)
Var(year)        (omit)
Var(year*gen)
Var(rep*gen)
Var(year*rep)
Var(Error)

H = Var(gen) / [Var(gen) + Var(year*gen)/2 + Var(rep*gen)/2 + Var(year*rep)/4 + Var(Error)/4]

The calculation is then repeated with negative estimates set to 0.

Notes on negative estimates of variance components

For the exercise using the 2yrlyc.xls data set, we noted that some variance components were negative. This possibility is noted in Chapter 18 of Lynch and Walsh, and is basically a problem associated with small n and variance components that are close to zero. The estimates were derived using the PROC VARCOMP default, MIVQUE0, which works by solving the set of expected mean square equations. The solution is obtained by simple arithmetic, so a negative estimate can result when variances are very low (i.e. when one small mean square is subtracted from another that happens to be smaller). I noted earlier that there are multiple ways to estimate variance components in PROC VARCOMP by adding the METHOD= option. These methods are described again below.

PROC VARCOMP can use the following four estimation procedures for variance components:

Type1 computes the Type 1 sum of squares for each effect, equates each MS involving only random effects to its expected value, and solves the set of equations.

MIVQUE0 is similar to Type1, but is computationally simpler (and is therefore the default).

Maximum likelihood (ML) estimation uses a W-transformation of the expected mean squares equation and computes initial estimates using MIVQUE0. The program iterates until convergence.

Restricted maximum likelihood (REML) is similar to ML, but separates the likelihood into two parts (one with fixed effects, one without). Initial estimates are obtained using MIVQUE0, then iteration is performed until convergence for the part that does not contain fixed effects. REML is the method of choice for genetic studies. The syntax would be proc varcomp method=reml.

The SAS procedure PROC MIXED can also be used for REML estimation. An advantage of using PROC MIXED over PROC VARCOMP is that the output returns estimates of the error associated with the variance components. These estimates of error can then be used to place a standard error on our estimate of heritability. PROC MIXED uses the following syntax:

    proc mixed data=cdata covtest;
      class year rep gen;
      model hplc = / ddfm=satterth;
      random gen year rep(year) gen*year gen*rep(year);
      title 'Variance components using Proc mixed';
    run;

Notes on the PROC MIXED code:
cdata is the data set name given in the DATA statement.
The COVTEST option calculates standard errors for the variance components.
A blank after "model var =" means that all effects are random and are listed in the RANDOM statement. Fixed effects should be added after the equals sign in the MODEL statement.
Degrees of freedom are estimated by Satterthwaite's procedure (ddfm=satterth).

Using ANOVA to test for an association between a marker and a trait

The application of simple F-tests or simple linear regression provides an intuitive example of the statistical approaches used to establish linkage between a marker and a quantitative trait. For these approaches, the qualitative trait (or marker) is used to classify the progeny. The question is then whether the groups defined by marker classification have significantly different means.

[Figure: scatter plot of the quantitative trait (vertical axis) against the genotypic classes of a backcross (horizontal axis).]
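The marker classification test described above amounts to a one-way ANOVA with the marker genotype as the grouping factor. A minimal Python sketch under simple assumptions: one measurement per progeny, two backcross marker classes, and made-up trait values. The function name, the class labels "Aa" and "aa", and the data are all hypothetical.

```python
def marker_f_test(trait_by_class):
    """One-way ANOVA F-ratio for a trait classified by marker genotype.

    trait_by_class : dict mapping marker class -> list of trait values
    Returns (F, df_marker, df_within).
    """
    groups = list(trait_by_class.values())
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total

    # Between-class (marker) and within-class sums of squares.
    ss_marker = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((y - sum(g) / len(g)) ** 2 for g in groups for y in g)

    df_marker = len(groups) - 1
    df_within = n_total - len(groups)
    f_ratio = (ss_marker / df_marker) / (ss_within / df_within)
    return f_ratio, df_marker, df_within


# Hypothetical backcross progeny: heterozygous vs. homozygous at the marker.
progeny = {"Aa": [4.0, 5.0, 6.0], "aa": [7.0, 8.0, 9.0]}

f_ratio, df1, df2 = marker_f_test(progeny)
print(f"F = {f_ratio:.2f} on {df1} and {df2} df")
```

The F-ratio would then be compared with the F distribution on (df_marker, df_within) degrees of freedom; a large value suggests that the marker classes differ in mean trait value.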

Source         DF        Expected MS
Genotypes      N-1       $\sigma^2 + b\sigma^2_g$
Marker         1         $\sigma^2 + b[\sigma^2_{g-QTL} + 4r(1-r)g^2] + bc(1-2r)^2 g^2$
Gen(marker)    N-2       $\sigma^2 + b[\sigma^2_{g-QTL} + 4r(1-r)g^2]$
Error          N(b-1)    $\sigma^2$

Where:
b is the number of replicates.
r is the recombination fraction separating the marker from the QTL.
c is a coefficient related to the population size, $c = N - (n_1^2 + n_2^2)/N$, where $n_1 + n_2 = N$ and $n_1$, $n_2$ are the numbers in each marker class.
g is the genetic effect (in backcross populations, additive and dominance effects are confounded).
$\sigma^2_{g-QTL}$ is the part of the variance among genotypes that cannot be explained by the QTL.

When b = 1, Gen(marker) becomes the error term. If there are repeated measures on each genotype (b > 1), the proper error term must be specified. The F-test for significance is MS(Marker)/MS(Gen(marker)); the expected mean square of the numerator exceeds that of the denominator by $bc(1-2r)^2 g^2$. Thus the significance of a marker depends on population size, recombination, the strength of the genetic effect relative to the error variance, and the part of the variance that cannot be explained by the QTL.

References

McIntosh, M.S. 1983. Analysis of combined experiments. Agronomy Journal 75.
Hallauer, A.R. and J.B. Miranda. Quantitative Genetics in Maize Breeding (second edition). Iowa State University Press, Ames, Iowa.
Cotterill, P.P. 1987. On estimating heritability according to practical applications. Silvae Genetica 36:46-48.

Topic 30: Random Effects Modeling

Topic 30: Random Effects Modeling Topic 30: Random Effects Modeling Outline One-way random effects model Data Model Inference Data for one-way random effects model Y, the response variable Factor with levels i = 1 to r Y ij is the j th

More information

A pragmatic approach to formulating linear mixed models for randomized experiments

A pragmatic approach to formulating linear mixed models for randomized experiments A pragmatic approach to formulating linear mixed models for randomized experiments Prof. Dr. Hans-Peter Piepho FG Biostatistics Universität Hohenheim 1 Content 1. Introduction 2. Randomized complete block

More information

Approximating the Confidence Intervals for Sharpe Style Weights

Approximating the Confidence Intervals for Sharpe Style Weights Approximating the Confidence Intervals for Sharpe Style Weights Angelo Lobosco and Dan DiBartolomeo Style analysis is a form of constrained regression that uses a weighted combination of market indexes

More information

SAS/STAT 14.1 User s Guide. The LATTICE Procedure

SAS/STAT 14.1 User s Guide. The LATTICE Procedure SAS/STAT 14.1 User s Guide The LATTICE Procedure This document is an individual chapter from SAS/STAT 14.1 User s Guide. The correct bibliographic citation for this manual is as follows: SAS Institute

More information

The FREQ Procedure. Table of Sex by Gym Sex(Sex) Gym(Gym) No Yes Total Male Female Total

The FREQ Procedure. Table of Sex by Gym Sex(Sex) Gym(Gym) No Yes Total Male Female Total Jenn Selensky gathered data from students in an introduction to psychology course. The data are weights, sex/gender, and whether or not the student worked-out in the gym. Here is the output from a 2 x

More information

New SAS Procedures for Analysis of Sample Survey Data

New SAS Procedures for Analysis of Sample Survey Data New SAS Procedures for Analysis of Sample Survey Data Anthony An and Donna Watts, SAS Institute Inc, Cary, NC Abstract Researchers use sample surveys to obtain information on a wide variety of issues Many

More information

Maximum Likelihood Estimation

Maximum Likelihood Estimation Maximum Likelihood Estimation EPSY 905: Fundamentals of Multivariate Modeling Online Lecture #6 EPSY 905: Maximum Likelihood In This Lecture The basics of maximum likelihood estimation Ø The engine that

More information

Study of one-way ANOVA with a fixed-effect factor

Study of one-way ANOVA with a fixed-effect factor Study of one-way ANOVA with a fixed-effect factor In the last blog on Introduction to ANOVA, we mentioned that in the oneway ANOVA study, the factor contributing to a possible source of variation that

More information

Studies on Frequency Distribution of Sorghum Downy Mildew Resistant BC 3 F 1 Progenies in Maize

Studies on Frequency Distribution of Sorghum Downy Mildew Resistant BC 3 F 1 Progenies in Maize Int.J.Curr.Microbiol.App.Sci (1) 7(): 35-3 International Journal of Current Microbiology and Applied Sciences ISSN: 319-77 Volume 7 Number (1) Journal homepage: http://www.ijcmas.com Original Research

More information

Let us assume that we are measuring the yield of a crop plant on 5 different plots at 4 different observation times.

Let us assume that we are measuring the yield of a crop plant on 5 different plots at 4 different observation times. Mixed-effects models An introduction by Christoph Scherber Up to now, we have been dealing with linear models of the form where ß0 and ß1 are parameters of fixed value. Example: Let us assume that we are

More information

Session 178 TS, Stats for Health Actuaries. Moderator: Ian G. Duncan, FSA, FCA, FCIA, FIA, MAAA. Presenter: Joan C. Barrett, FSA, MAAA

Session 178 TS, Stats for Health Actuaries. Moderator: Ian G. Duncan, FSA, FCA, FCIA, FIA, MAAA. Presenter: Joan C. Barrett, FSA, MAAA Session 178 TS, Stats for Health Actuaries Moderator: Ian G. Duncan, FSA, FCA, FCIA, FIA, MAAA Presenter: Joan C. Barrett, FSA, MAAA Session 178 Statistics for Health Actuaries October 14, 2015 Presented

More information

Lecture note 8 Spring Lecture note 8. Analysis of Variance (ANOVA)

Lecture note 8 Spring Lecture note 8. Analysis of Variance (ANOVA) Lecture note 8 Analysis of Variance (ANOVA) 1 Overview of ANOVA Analysis of variance (ANOVA) is a comparison of means. ANOVA allows you to compare more than two means simultaneously. Proper experimental

More information

Chapter 8 Student Lecture Notes 8-1. Department of Quantitative Methods & Information Systems. Business Statistics

Chapter 8 Student Lecture Notes 8-1. Department of Quantitative Methods & Information Systems. Business Statistics Chapter 8 Student Lecture Notes 8-1 Department of Quantitative Methods & Information Systems Business Statistics Chapter 11 One Way analysis of Variance QMIS 0 Dr. Mohammad Zainal Chapter Goals After completing

More information

R & R Study. Chapter 254. Introduction. Data Structure

R & R Study. Chapter 254. Introduction. Data Structure Chapter 54 Introduction A repeatability and reproducibility (R & R) study (sometimes called a gauge study) is conducted to determine if a particular measurement procedure is adequate. If the measurement

More information

EXST7015: Multiple Regression from Snedecor & Cochran (1967) RAW DATA LISTING

EXST7015: Multiple Regression from Snedecor & Cochran (1967) RAW DATA LISTING Multiple (Linear) Regression Introductory example Page 1 1 options ps=256 ls=132 nocenter nodate nonumber; 3 DATA ONE; 4 TITLE1 ''; 5 INPUT X1 X2 X3 Y; 6 **** LABEL Y ='Plant available phosphorus' 7 X1='Inorganic

More information

Introduction to QTL (Quantitative Trait Loci) & LOD analysis Steven M. Carr / Biol 4241 / Winter Study Design of Hamer et al.

Introduction to QTL (Quantitative Trait Loci) & LOD analysis Steven M. Carr / Biol 4241 / Winter Study Design of Hamer et al. Introduction to QTL (Quantitative Trait Loci) & LOD analysis Steven M. Carr / Biol 4241 / Winter 2016 Quantitative Trait Loci: contribution of multiple genes to a single trait Linkage between phenotypic

More information

Random Effects ANOVA

Random Effects ANOVA Random Effects ANOVA Grant B. Morgan Baylor University This post contains code for conducting a random effects ANOVA. Make sure the following packages are installed: foreign, lme4, lsr, lattice. library(foreign)

More information

M1 M1 A1 M1 A1 M1 A1 A1 A1 11 A1 2 B1 B1. B1 M1 Relative efficiency (y) = M1 A1 BEWARE PRINTED ANSWER. 5

M1 M1 A1 M1 A1 M1 A1 A1 A1 11 A1 2 B1 B1. B1 M1 Relative efficiency (y) = M1 A1 BEWARE PRINTED ANSWER. 5 Q L e σ π ( W μ e σ π ( W μ M M A Product form. Two Normal terms. Fully correct. (ii ln L const ( W ( W d ln L ( W + ( W dμ 0 σ W σ μ W σ W W ˆ μ σ Chec this is a maximum. d ln L E.g. < 0 dμ σ σ σ μ σ

More information

is the bandwidth and controls the level of smoothing of the estimator, n is the sample size and

is the bandwidth and controls the level of smoothing of the estimator, n is the sample size and Paper PH100 Relationship between Total charges and Reimbursements in Outpatient Visits Using SAS GLIMMIX Chakib Battioui, University of Louisville, Louisville, KY ABSTRACT The purpose of this paper is

More information

Evaluation of a New Variance Components Estimation Method Modi ed Henderson s Method 3 With the Application of Two Way Mixed Model

Evaluation of a New Variance Components Estimation Method Modi ed Henderson s Method 3 With the Application of Two Way Mixed Model Evaluation of a New Variance Components Estimation Method Modi ed Henderson s Method 3 With the Application of Two Way Mixed Model Author: Weigang Qie; Chenfan Xu Supervisor: Lars Rönnegård June 0th, 009

More information

Numerical Descriptions of Data

Numerical Descriptions of Data Numerical Descriptions of Data Measures of Center Mean x = x i n Excel: = average ( ) Weighted mean x = (x i w i ) w i x = data values x i = i th data value w i = weight of the i th data value Median =

More information

Statistical Consulting Topics. Repeated Measures Analysis Client: Ricardo Pena

Statistical Consulting Topics. Repeated Measures Analysis Client: Ricardo Pena Statistical Consulting Topics Repeated Measures Analysis Client: Ricardo Pena Consider a simplified version of Ricardo s analysis that includes the following factors of interest (essentially, fixing ACh-dosage

More information

To be two or not be two, that is a LOGISTIC question

To be two or not be two, that is a LOGISTIC question MWSUG 2016 - Paper AA18 To be two or not be two, that is a LOGISTIC question Robert G. Downer, Grand Valley State University, Allendale, MI ABSTRACT A binary response is very common in logistic regression

More information

Random Effects... and more about pigs G G G G G G G G G G G

Random Effects... and more about pigs G G G G G G G G G G G et s examine the random effects model in terms of the pig weight example. This had eight litters, and in the first analysis we were willing to think of as fixed effects. This means that we might want to

More information

Chapter 6 Confidence Intervals

Chapter 6 Confidence Intervals Chapter 6 Confidence Intervals Section 6-1 Confidence Intervals for the Mean (Large Samples) VOCABULARY: Point Estimate A value for a parameter. The most point estimate of the population parameter is the

More information

Homework Assignment Section 3

Homework Assignment Section 3 Homework Assignment Section 3 Tengyuan Liang Business Statistics Booth School of Business Problem 1 A company sets different prices for a particular stereo system in eight different regions of the country.

More information

REML Estimation and Linear Mixed Models 2. Variance components models

REML Estimation and Linear Mixed Models 2. Variance components models REML Estimation and Linear Mixed s 2. Variance components models Sue Welham Rothamsted Research Harpenden UK AL5 2JQ November 18, 2008 1 Issues in variance modelling Fixed or random? Analysis of data &

More information

Mixed models in R using the lme4 package Part 3: Inference based on profiled deviance

Mixed models in R using the lme4 package Part 3: Inference based on profiled deviance Mixed models in R using the lme4 package Part 3: Inference based on profiled deviance Douglas Bates Department of Statistics University of Wisconsin - Madison Madison January 11, 2011

More information

Week 7 Quantitative Analysis of Financial Markets Simulation Methods

Week 7 Quantitative Analysis of Financial Markets Simulation Methods Week 7 Quantitative Analysis of Financial Markets Simulation Methods Christopher Ting http://www.mysmu.edu/faculty/christophert/ Christopher Ting : christopherting@smu.edu.sg : 6828 0364 : LKCSB 5036 November

More information

Quantitative Techniques Term 2

Quantitative Techniques Term 2 Quantitative Techniques Term 2 Laboratory 7 2 March 2006 Overview The objective of this lab is to: Estimate a cost function for a panel of firms; Calculate returns to scale; Introduce the command cluster

More information

ILO-IPEC Interactive Sampling Tools No. 7

ILO-IPEC Interactive Sampling Tools No. 7 ILO-IPEC Interactive Sampling Tools No. 7 Version 1 December 2014 International Programme on the Elimination of Child Labour (IPEC) Fundamental Principles and Rights at Work (FPRW) Branch Governance and

More information

Design of Engineering Experiments Part 9 Experiments with Random Factors

Design of Engineering Experiments Part 9 Experiments with Random Factors Design of ngineering xperiments Part 9 xperiments with Random Factors Text reference, Chapter 13, Pg. 484 Previous chapters have considered fixed factors A specific set of factor levels is chosen for the

More information

Copyright 2011 Pearson Education, Inc. Publishing as Addison-Wesley.

Copyright 2011 Pearson Education, Inc. Publishing as Addison-Wesley. Appendix: Statistics in Action Part I Financial Time Series 1. These data show the effects of stock splits. If you investigate further, you ll find that most of these splits (such as in May 1970) are 3-for-1

More information

Problem max points points scored Total 120. Do all 6 problems.

Problem max points points scored Total 120. Do all 6 problems. Solutions to (modified) practice exam 4 Statistics 224 Practice exam 4 FINAL Your Name Friday 12/21/07 Professor Michael Iltis (Lecture 2) Discussion section (circle yours) : section: 321 (3:30 pm M) 322

More information

1.017/1.010 Class 19 Analysis of Variance

1.017/1.010 Class 19 Analysis of Variance .07/.00 Class 9 Analysis of Variance Concepts and Definitions Objective: dentify factors responsible for variability in observed data Specify one or more factors that could account for variability (e.g.

More information

Genetic analysis of yield and yield attributing characters in linseed (Linum usitatissimum L.)

Genetic analysis of yield and yield attributing characters in linseed (Linum usitatissimum L.) RESEARCH PAPER Asian Journal of Bio Science Vol. 6 Issue 1 (April, 2011) : 16-22 Genetic analysis of yield and yield attributing characters in linseed (Linum usitatissimum L.) DEEPAK GAURAHA, S.S. RAO

More information

Appendix A (Pornprasertmanit & Little, in press) Mathematical Proof

Appendix A (Pornprasertmanit & Little, in press) Mathematical Proof Appendix A (Pornprasertmanit & Little, in press) Mathematical Proof Definition We begin by defining notations that are needed for later sections. First, we define moment as the mean of a random variable

More information

Mixed Models Tests for the Slope Difference in a 3-Level Hierarchical Design with Random Slopes (Level-3 Randomization)

Mixed Models Tests for the Slope Difference in a 3-Level Hierarchical Design with Random Slopes (Level-3 Randomization) Chapter 375 Mixed Models Tests for the Slope Difference in a 3-Level Hierarchical Design with Random Slopes (Level-3 Randomization) Introduction This procedure calculates power and sample size for a three-level

More information

2.1 Mathematical Basis: Risk-Neutral Pricing

2.1 Mathematical Basis: Risk-Neutral Pricing Chapter Monte-Carlo Simulation.1 Mathematical Basis: Risk-Neutral Pricing Suppose that F T is the payoff at T for a European-type derivative f. Then the price at times t before T is given by f t = e r(t

More information

When determining but for sales in a commercial damages case,

When determining but for sales in a commercial damages case, JULY/AUGUST 2010 L I T I G A T I O N S U P P O R T Choosing a Sales Forecasting Model: A Trial and Error Process By Mark G. Filler, CPA/ABV, CBA, AM, CVA When determining but for sales in a commercial

More information

WEB APPENDIX 8A 7.1 ( 8.9)

WEB APPENDIX 8A 7.1 ( 8.9) WEB APPENDIX 8A CALCULATING BETA COEFFICIENTS The CAPM is an ex ante model, which means that all of the variables represent before-the-fact expected values. In particular, the beta coefficient used in

More information

A Brief Illustration of Regression Analysis in Economics John Bucci. Okun s Law

A Brief Illustration of Regression Analysis in Economics John Bucci. Okun s Law Okun s Law The following regression exercise measures the original relationship between unemployment and real output, as established first by the economist Arthur Okun in the 1960s. Brief History Arthur

More information

Intro to GLM Day 2: GLM and Maximum Likelihood

Intro to GLM Day 2: GLM and Maximum Likelihood Intro to GLM Day 2: GLM and Maximum Likelihood Federico Vegetti Central European University ECPR Summer School in Methods and Techniques 1 / 32 Generalized Linear Modeling 3 steps of GLM 1. Specify the

More information

NEWCASTLE UNIVERSITY. School SEMESTER /2013 ACE2013. Statistics for Marketing and Management. Time allowed: 2 hours

NEWCASTLE UNIVERSITY. School SEMESTER /2013 ACE2013. Statistics for Marketing and Management. Time allowed: 2 hours NEWCASTLE UNIVERSITY School SEMESTER 2 2012/2013 Statistics for Marketing and Management Time allowed: 2 hours Candidates should attempt ALL questions. Marks for each question are indicated. However you

More information

Calculator Advanced Features. Capital Budgeting. Contents. Net Present Value (NPV) Net Present Value (NPV) Net Present Value (NPV) Capital Budgeting

Calculator Advanced Features. Capital Budgeting. Contents. Net Present Value (NPV) Net Present Value (NPV) Net Present Value (NPV) Capital Budgeting Capital Budgeting Contents TI BAII Plus Calculator Advanced Features Uneven Cash Flows Mean, Variance, and Standard Deviation Covariance, Correlation, and Regression Deprecation Net Present Value (NPV)

More information

22S:105 Statistical Methods and Computing. Two independent sample problems. Goal of inference: to compare the characteristics of two different

22S:105 Statistical Methods and Computing. Two independent sample problems. Goal of inference: to compare the characteristics of two different 22S:105 Statistical Methods and Computing Two independent-sample t-tests Lecture 17 Apr. 5, 2013 1 2 Two independent sample problems Goal of inference: to compare the characteristics of two different populations

More information

Tests for the Difference Between Two Linear Regression Intercepts

Tests for the Difference Between Two Linear Regression Intercepts Chapter 853 Tests for the Difference Between Two Linear Regression Intercepts Introduction Linear regression is a commonly used procedure in statistical analysis. One of the main objectives in linear regression

More information

Power in Mixed Effects
