Mendelian Randomization with a Continuous Outcome

Chapter 85

Introduction

This module computes sample size and power for tests of the causal effect in Mendelian randomization studies with a continuous outcome. This analysis is used in observational studies where clinical trials are not possible.

Analogous to randomized clinical trials (RCT), Mendelian randomization (MR) divides subjects into two or more groups. However, MR uses a genetic variable, such as the state of a certain gene, to form the groups. The state of the gene is assumed to be random. Using two-stage least squares and making several assumptions, the causal impact of an exposure variable on the outcome variable can be measured.

For further reading, we recommend the book by Burgess and Thompson (2015), which is completely devoted to this topic. We have used the paper by Brion, Shakhbazov, and Visscher (2013) for sample size formulas. We also recommend the papers by Burgess (2014) and Freeman, Cowling, and Schooling (2013).

Technical Details

Causal Relationship Test

The following details follow closely the results in Brion, Shakhbazov, and Visscher (2013).

Assume that we are interested in assessing the causal relationship between an outcome variable Y and an exposure variable X. A genetic variable G is available to use as an instrumental variable. A sample of n subjects will be selected. The basic models are

$$Y = \beta_{YX} X + e_Y$$

$$X = \beta_{XG} G + e_X$$

An estimate of $\beta_{YX}$ will be obtained using two-stage least squares and called $b_{2SLS}$. The mean and variance of $b_{2SLS}$ are given by

$$E(b_{2SLS}) = \beta_{YX} + \frac{\operatorname{cov}(e_Y, e_X)}{n \rho^2_{XG}}$$

$$\sigma^2_{b_{2SLS}} = \frac{\sigma^2_{e_Y}}{n \rho^2_{XG} \sigma^2_X}$$

We are using $\rho^2_{XG}$ to represent the proportion of the variation in X explained by G in the population.
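To make the two-stage least squares estimate concrete, the following is a minimal simulation sketch in Python. It is not part of PASS. The function names (`cov`, `simulate_mr`), the allele frequency of 0.3, the coefficient values, and the confounder `u` are all illustrative assumptions. With a single instrument G, the 2SLS estimate reduces to the ratio cov(G, Y) / cov(G, X), which is what the sketch computes. The instrument here is deliberately stronger than is typical in practice so that a modest sample shows the pattern.

```python
import random


def cov(a, b):
    """Sample covariance of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    return sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b)) / (n - 1)


def simulate_mr(n=20000, beta_yx=1.0, beta_xg=0.5, seed=1):
    """Return (b_ols, b_2sls) for one simulated sample (hypothetical setup)."""
    rng = random.Random(seed)
    g, x, y = [], [], []
    for _ in range(n):
        gi = (rng.random() < 0.3) + (rng.random() < 0.3)  # allele count 0, 1, or 2
        u = rng.gauss(0.0, 1.0)                      # unobserved confounder
        xi = beta_xg * gi + u + rng.gauss(0.0, 1.0)  # e_X = u + noise
        yi = beta_yx * xi + u + rng.gauss(0.0, 1.0)  # e_Y = u + noise
        g.append(gi)
        x.append(xi)
        y.append(yi)
    b_ols = cov(x, y) / cov(x, x)    # confounded, because cov(e_Y, e_X) != 0
    b_2sls = cov(g, y) / cov(g, x)   # single-instrument 2SLS (ratio) estimate
    return b_ols, b_2sls


b_ols, b_2sls = simulate_mr()
print(f"OLS slope:  {b_ols:.3f}")
print(f"2SLS slope: {b_2sls:.3f}")
```

Under these assumed settings the OLS slope should land well above the true value of 1.0, because the confounder enters both error terms, while the 2SLS slope should fall close to 1.0.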

Power Calculation

The power is given by

$$\text{Power} = P\left(\chi^2_{df,NCP} > \chi^2_{df,1-\alpha}\right) = 1 - P\left(\chi^2_{df,NCP} \le \chi^2_{df,1-\alpha}\right)$$

where $\chi^2_{df,NCP}$ is a non-central chi-square random variable with df = 1 and non-centrality parameter NCP. Also, $\chi^2_{df,1-\alpha}$ is the quantile of a central chi-square distribution with df = 1 that leaves probability α in the upper tail. Note that this is a two-sided test. The result for a one-sided test is obtained by the usual adjustment.

The value of NCP is given by

$$NCP = \frac{\left[E(b_{2SLS})\right]^2}{\sigma^2_{b_{2SLS}}}$$

This can be evaluated if we note that

$$\operatorname{cov}(e_Y, e_X) = (\beta_{OLS} - \beta_{YX})\,\sigma^2_X$$

$$\sigma^2_{e_Y} = \sigma^2_Y - \sigma^2_X\,\beta_{YX}\,(2\beta_{OLS} - \beta_{YX})$$

where $\beta_{OLS}$ is the expected value of the ordinary least squares (OLS) regression coefficient of the regression of Y on X.

Procedure Options

This section describes the options that are specific to this procedure. These are located on the Design tab. For more information about the options of other tabs, go to the Procedure Window chapter.

Design Tab

The Design tab contains most of the parameters and options that you will be concerned with. This chapter covers four procedures, each of which has different effect size options. However, many of the options are common to all four procedures. These common options will be displayed first, followed by the various effect size options.

Solve For

Solve For
This option specifies the parameter to be solved for from the other parameters. The parameters that may be selected are Power and Sample Size.

Test

Alternative Hypothesis
Specify whether the statistical test is two-sided or one-sided.

Two-Sided
This option gives the two-sided test.

One-Sided
This option gives the approximate result for a one-sided test by making an appropriate substitution for α.
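Because df = 1, the non-central chi-square probability in the power formula can be written in terms of the standard normal distribution: a non-central chi-square with one degree of freedom is the square of a normal variable with mean √NCP and unit variance. The Python sketch below uses that identity. The function name `power_from_ncp` and the `sides` argument are hypothetical, and treating the one-sided case by replacing α with 2α is our assumption about the substitution; the manual does not spell it out.

```python
from math import sqrt
from statistics import NormalDist


def power_from_ncp(ncp, alpha=0.05, sides=2):
    """Power of a 1-df chi-square test with non-centrality parameter ncp."""
    z = NormalDist()
    # Assumed one-sided substitution: use 2*alpha in the two-sided formula.
    a = alpha if sides == 2 else 2.0 * alpha
    crit = z.inv_cdf(1.0 - a / 2.0)  # square root of the central chi-square quantile
    root = sqrt(ncp)
    # P(chi2_{1,ncp} > crit^2) = P(|Z + root| > crit) for standard normal Z
    return z.cdf(root - crit) + z.cdf(-root - crit)


print(round(power_from_ncp(7.849), 4))
```

An NCP near 7.849 corresponds to a power of roughly 0.80 at α = 0.05, which is why the NCP column in the examples of this chapter is nearly constant across rows. With NCP = 0 the function returns α, as it should for a test with no effect.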

Power and Alpha

Power
This option specifies one or more values for power. Power is the probability of rejecting a false null hypothesis. Values must be between zero and one. Historically, the value of 0.80 was used for power. Now, 0.90 is also commonly used.
A single value may be entered here or a range of values such as 0.8 to 0.95 by 0.05 may be entered.

Alpha
This option specifies one or more values for the probability of a type-I error. A type-I error occurs when you reject the null hypothesis when in fact it is true. Values of alpha must be between zero and one. Historically, the value of 0.05 has been used for a two-sided test and 0.025 has been used for a one-sided test. You should pick a value for alpha that represents the risk of a type-I error you are willing to take in your experimental situation.
You may enter a range of values such as 0.01 0.05 0.10 or 0.01 to 0.10 by 0.01.

Sample Size

N (Sample Size)
Enter a value for the sample size, N. This is the number of subjects in the study. You can enter one or more positive integers greater than or equal to 3. Note that in these types of studies, very large sample sizes (50000+) are often needed.
You may also enter a range such as 10000 to 50000 by 10000 or a list of values separated by commas or blanks such as 2000 4000 6000 8000.

Effect Size

β1 (2SLS of Y on X|G)
Enter one or more values for β1, the parameter of interest. This is the value of β(YX). If all assumptions are met, β1 is a measure of the causal effect of X on Y. It is estimated using two-stage least squares (2SLS), where G is the genetic, or instrumental, variable.
You should keep this value in the same scale as your entries for σ(x) and σ(y). This value can be any value greater than zero. The results for a negative value will be identical to those for its absolute value.

β0 (OLS of Y on X)
Enter one or more values for β0, the regression parameter from the OLS regression of Y on X. This is the value of β(OLS).
You should keep this value in the same scale as your entries for σ(x) and σ(y). This value can be any value greater than zero. The results for a negative value will be identical to those for its absolute value.

ρ²(xg)
Enter one or more values for ρ²(xg), the proportion of the variance of X explained by the regression of X on G. In practice, it is estimated by the R² of this regression. It is routinely very small. Values of 0.01 and 0.02 are common.
This value varies between zero and one. For mathematical reasons, it cannot be exactly 0.

σ(x)
Enter one or more values for σ(x), the standard deviation of the X values. This value can be any number greater than 0.

σ(y)
Enter one or more values for σ(y), the standard deviation of the Y values. This value can be any number greater than 0.

Example 1: Finding the Sample Size

Researchers are planning an observational study to determine the causal effect of an exposure X on a continuous outcome variable Y using an instrumental variable G. They want to determine the sample sizes necessary to have 80% power and 5% significance using a two-sided test when β1 is 1.05 to 1.3 by 0.05, β0 is 1.4, ρ²(xg) is 0.01 or 0.02, σ(x) is 1, and σ(y) is 10.

Setup

This section presents the values of each of the parameters needed to run this example. First, from the PASS Home window, load the procedure window by expanding Regression and then selecting this procedure. You may then make the appropriate entries as listed below, or open Example 1 by going to the File menu and choosing Open Example Template.

Option                            Value
Design Tab
Solve For ....................... Sample Size
Alternative Hypothesis .......... Two-Sided
Power ........................... 0.80
Alpha ........................... 0.05
β1 (2SLS of Y on X|G) ........... 1.05 to 1.30 by 0.05
β0 (OLS of Y on X) .............. 1.40
ρ²(xg) .......................... 0.01 0.02
σ(x) ............................ 1
σ(y) ............................ 10

Annotated Output

Click the Calculate button to perform the calculations and generate the following output.
Numeric Results

Numeric Results for the Two-Sided, Mendelian Randomization Test of a Continuous Outcome

Power    N      β1(2SLS)  β0(OLS)  ρ²(xg)  σ(x)    σ(y)     σ(b1)   NCP     Alpha
0.8000   69817  1.0500    1.4000   0.0100  1.0000  10.0000  0.3750  7.8490  0.0500
0.8000   34908  1.0500    1.4000   0.0200  1.0000  10.0000  0.3750  7.8489  0.0500
0.8000   63600  1.1000    1.4000   0.0100  1.0000  10.0000  0.3928  7.8489  0.0500
0.8000   31800  1.1000    1.4000   0.0200  1.0000  10.0000  0.3928  7.8491  0.0500
0.8000   58180  1.1500    1.4000   0.0100  1.0000  10.0000  0.4106  7.8490  0.0500
0.8000   29090  1.1500    1.4000   0.0200  1.0000  10.0000  0.4106  7.8490  0.0500
0.8000   53427  1.2000    1.4000   0.0100  1.0000  10.0000  0.4285  7.8489  0.0500
0.8000   26713  1.2000    1.4000   0.0200  1.0000  10.0000  0.4285  7.8490  0.0500
0.8000   49236  1.2500    1.4000   0.0100  1.0000  10.0000  0.4463  7.8490  0.0500
0.8000   24618  1.2500    1.4000   0.0200  1.0000  10.0000  0.4463  7.8490  0.0500
0.8000   45523  1.3000    1.4000   0.0100  1.0000  10.0000  0.4641  7.8490  0.0500
0.8000   22761  1.3000    1.4000   0.0200  1.0000  10.0000  0.4641  7.8490  0.0500
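The sample sizes in a report like this one can be approximated by searching for the smallest N whose power reaches the target. The Python sketch below combines the mean, variance, and NCP formulas from the Technical Details section with a 1-df power calculation. It is an illustration under stated assumptions, not the PASS implementation: the function names (`ncp_2sls`, `power_2sls`, `smallest_n`) are hypothetical, and the doubling-then-bisection search is our choice. Results may differ from the report by a subject or so because of differences in search tolerance.

```python
from math import sqrt
from statistics import NormalDist


def ncp_2sls(n, beta1, beta0, rho2_xg, sd_x, sd_y):
    """NCP = E(b_2SLS)^2 / var(b_2SLS), following the chapter's formulas."""
    var_x, var_y = sd_x ** 2, sd_y ** 2
    cov_e = (beta0 - beta1) * var_x                       # cov(e_Y, e_X)
    var_ey = var_y - var_x * beta1 * (2 * beta0 - beta1)  # var(e_Y)
    mean_b = beta1 + cov_e / (n * rho2_xg)                # E(b_2SLS) as stated in the text
    var_b = var_ey / (n * rho2_xg * var_x)                # var(b_2SLS)
    return mean_b ** 2 / var_b


def power_2sls(n, alpha, **effect):
    """Two-sided power using the 1-df non-central chi-square identity."""
    z = NormalDist()
    crit = z.inv_cdf(1.0 - alpha / 2.0)
    root = sqrt(ncp_2sls(n, **effect))
    return z.cdf(root - crit) + z.cdf(-root - crit)


def smallest_n(target_power, alpha, **effect):
    """Smallest n with power >= target (assumes power at n = 3 is below target)."""
    lo, hi = 3, 6
    while power_2sls(hi, alpha, **effect) < target_power:
        lo, hi = hi, hi * 2          # grow the bracket until the target is reached
    while hi - lo > 1:               # bisection on the integer bracket
        mid = (lo + hi) // 2
        if power_2sls(mid, alpha, **effect) >= target_power:
            hi = mid
        else:
            lo = mid
    return hi


first_row = dict(beta1=1.05, beta0=1.4, rho2_xg=0.01, sd_x=1.0, sd_y=10.0)
print(smallest_n(0.80, 0.05, **first_row))
```

For the first scenario (β1 = 1.05, ρ²(xg) = 0.01) the search should land at or within one subject of the reported N. Note that the NCP depends on N and ρ²(xg) only through their product, which is why doubling ρ²(xg) roughly halves the required sample size in the report.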

References

Burgess, S. and Thompson, S.G. 2015. Mendelian Randomization: Methods for Using Genetic Variants in Causal Estimation. Chapman & Hall/CRC Press. New York.

Burgess, S. 2014. 'Sample size and power calculations in Mendelian randomization with a single instrumental variable and a binary outcome.' International Journal of Epidemiology, 43, pages 922-929.

Freeman, G., Cowling, B.J., Schooling, C.M. 2013. 'Power and sample size calculations for Mendelian randomization studies using one genetic instrument.' International Journal of Epidemiology, 42, pages 1157-1163.

Brion, M.J.A., Shakhbazov, K., Visscher, P.M. 2013. 'Calculating statistical power in Mendelian randomization studies.' International Journal of Epidemiology, 42, pages 1497-1501.

Report Definitions

Y is the continuous outcome variable.
X is the exposure variable.
G is the genetic variant or instrumental variable used to divide the subjects up into groups.
Power is the probability of rejecting a false null hypothesis.
N is the sample size, the number of subjects in the study.
β1(2SLS) is the parameter of interest. It is a measure of the causal effect of X on Y. It can be estimated using two-stage least squares (2SLS) where G is the genetic, or instrumental, variable.
β0(OLS) is the regression parameter (slope) of the regression of Y on X which is estimated by ordinary least squares.
ρ²(xg) is the proportion of the variance of X explained by the regression of X on G.
σ(x) is the standard deviation of the X variable.
σ(y) is the standard deviation of the Y variable.
σ(b1) is the standard error of the estimate of β1.
NCP is the non-centrality parameter for testing the significance of β1.
Alpha is the probability of rejecting a true null hypothesis.
Summary Statements

A two-sided test in a Mendelian randomization study for a continuous outcome variable that was based on a sample of 69817 subjects achieves 80.0% power at a 0.0500 significance level to detect a causal effect (β1) of 1.0500 when the ordinary least squares estimate of the regression slope (β0) is 1.4000. The standard deviations of X and Y are 1.0000 and 10.0000, respectively. The proportion of the variance of X explained by its regression on G, the instrumental variable, is 0.0100.

This report presents the calculated sample sizes for each scenario as well as the values of the other parameters.

Plots Section

These plots show the relationship among the varying parameters.

Example 2: Validation using Brion, et al. (2013)

Brion, et al. (2013) give an example in the online tool (cnsgenomics.com/shiny/mrnd/) associated with their paper in which the power is 0.80, alpha = 0.05, β1 is 1.3, β0 is 1.41, ρ²(xg) is 0.01, σ(x) is 1, and σ(y) is 10.79815. They obtain N = 53218.

Setup

This section presents the values of each of the parameters needed to run this example. First, from the PASS Home window, load the procedure window by expanding Regression and then selecting this procedure. You may then make the appropriate entries as listed below, or open Example 2 by going to the File menu and choosing Open Example Template.

Option                            Value
Design Tab
Solve For ....................... Sample Size
Alternative Hypothesis .......... Two-Sided
Power ........................... 0.80
Alpha ........................... 0.05
β1 (2SLS of Y on X|G) ........... 1.3
β0 (OLS of Y on X) .............. 1.41
ρ²(xg) .......................... 0.01
σ(x) ............................ 1
σ(y) ............................ 10.79815

Output

Click the Calculate button to perform the calculations and generate the following output.

Numeric Results

Numeric Results for the Two-Sided, Mendelian Randomization Test of a Continuous Outcome

Power    N      β1(2SLS)  β0(OLS)  ρ²(xg)  σ(x)    σ(y)     σ(b1)   NCP     Alpha
0.8000   53219  1.3000    1.4100   0.0100  1.0000  10.7982  0.4641  7.8490  0.0500

PASS calculated N as 53219 because 53218 gives a power slightly less than the goal of 0.80.
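As a cross-check on the σ(b1) and NCP columns of this validation row, the quantities can be recomputed by hand from the formulas in the Technical Details section. The Python sketch below does so for the reported N. The function name `se_and_ncp` is hypothetical, and the calculation follows the formulas as written in this chapter rather than any PASS source code.

```python
from math import sqrt


def se_and_ncp(n, beta1, beta0, rho2_xg, sd_x, sd_y):
    """Return (standard error of b_2SLS, NCP) from the chapter's formulas."""
    var_x, var_y = sd_x ** 2, sd_y ** 2
    cov_e = (beta0 - beta1) * var_x                       # cov(e_Y, e_X)
    var_ey = var_y - var_x * beta1 * (2 * beta0 - beta1)  # var(e_Y)
    mean_b = beta1 + cov_e / (n * rho2_xg)                # E(b_2SLS)
    var_b = var_ey / (n * rho2_xg * var_x)                # var(b_2SLS)
    return sqrt(var_b), mean_b ** 2 / var_b


# Inputs taken from the validation example; N is the value reported by PASS.
se_b1, ncp = se_and_ncp(n=53219, beta1=1.3, beta0=1.41,
                        rho2_xg=0.01, sd_x=1.0, sd_y=10.79815)
print(f"sigma(b1) = {se_b1:.4f}, NCP = {ncp:.4f}")
```

Both values should agree with the σ(b1) and NCP entries in the row above to about the displayed precision. The bias term in E(b_2SLS) is small at this sample size, so the NCP is driven almost entirely by β1² divided by the variance.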