Taming the Beast Workshop. Priors and starting values

Veronika Bošková & Chi Zhang, June 28, 2016

What is a prior?
- The distribution of a parameter before the data are collected and analysed.
- In contrast, the POSTERIOR distribution combines the information from the prior and the data.

What is a prior?
Using Bayes' theorem, we can decompose the posterior:

$$P(T, \theta, Q, \mu \mid D) = \frac{P(D \mid T, Q, \mu)\, P(T \mid \theta)\, P(\theta)\, P(Q)\, P(\mu)}{P(D)}$$

Here D = genetic sequences, T = genealogy, θ = demographic model, Q = substitution model, μ = molecular clock model (the slide draws each term as a pictogram, so these symbols are stand-ins). Every factor in the numerator other than the likelihood P(D | T, Q, μ) carries prior information.
Figure adapted from [du Plessis and Stadler, 2015].

Prior
- Allows us to include any information we have on the process before looking at the data.
- Do not be afraid of using it in the inference.
- It does not have to be, and is not expected to be, exactly the same as the posterior.

Prior
- Should not be, and is not, universal for all the analyses you will ever do in your research.
- Should incorporate prior (before looking at the data) knowledge about the parameter or the underlying process:
  - use results of previous independent experiments
  - use other independent evidence
- Should not be too restrictive if prior knowledge or assumptions are weak; one can use diffuse priors.
- Must not be adjusted after the run to give higher and higher posterior support.

Prior
- Is a choice of model: tree-generating models, nucleotide/amino acid/codon substitution models, ...
- and of the distribution of plausible values for a parameter of interest: Uniform, Normal, Beta, ...

Tree-generating model
- Have to pick one from the coalescent or the birth-death process framework.
- Have to put priors on the parameters of the chosen model, e.g. growth rate of the population, R0, extinction rate, ...

Substitution model
- The selection is big: JC69, HKY85, ..., GTR.
- Use a model that has previously been identified as the best for your type of data, e.g. HKY85:
  - prior for the transition/transversion rate ratio (κ)
  - prior for the base frequencies
- To choose the best model:
  - use model comparison to choose the one that best fits the data, or
  - use rjMCMC directly in BEAST2 to sample from the posterior distribution across different substitution models. The model in which the rjMCMC spends the most time (from which it samples most often) is the best-fitting model.
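The "spends the most time" criterion amounts to counting how often each model appears in the sampled chain. Below is a minimal sketch of that bookkeeping, assuming the model indicator has already been extracted from a log into a list of labels. The function name `model_posterior_support` and the short trace are invented for illustration; this is not BEAST2's log format or API.

```python
from collections import Counter


def model_posterior_support(indicator_trace, burn_in=0):
    """Fraction of post-burn-in samples spent in each model."""
    kept = indicator_trace[burn_in:]
    counts = Counter(kept)
    return {model: n / len(kept) for model, n in counts.items()}


# Made-up trace of sampled substitution models (illustration only).
trace = ["HKY85"] * 2 + ["JC69"] + ["HKY85"] * 5 + ["GTR"] * 2

support = model_posterior_support(trace)
best_model = max(support, key=support.get)
print(support, best_model)
```

In a real analysis the trace would be far longer and the burn-in would be discarded before counting.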

Molecular clock model
- Strict clock: all branches have the same clock rate.
- Relaxed clock:
  - Uncorrelated: branches have independent clock rate distributions.
  - Correlated: the clock rate distribution of a child branch is correlated with that of its parent branch.

Parameter priors
- A parameter can be fixed to a given value (though this is generally not recommended).
- A prior can have upper and lower limits. If we know that any infected individual recovers after 5-10 days, we can set the distribution of the infectious period to have e.g. min 4 days and max 11 days.
- If the prior is specified by a parametric distribution, the parameters of this distribution can also be assigned a prior (hyperprior).
- You can visualise the distribution in BEAUti.

Examples - Normal distribution
[Figure: PDF curves for µ=0, σ=0.5; µ=0.2, σ=0.2; µ=0, σ=0.1; µ=0, σ=0.2]
- Parameters: mean µ ∈ ℝ, standard deviation σ > 0
- Range of values: (-∞, ∞)

Examples - LogNormal distribution
[Figure: PDF curves for M=0, S=1; M=0, S=0.5; M=2, S=1; M=1, S=0.75]
- Parameters: mean M ∈ ℝ, standard deviation S > 0
- Range of values: [0, ∞)
- Long tail, always positive

Examples - Beta distribution
[Figure: PDF curves for α=0.5, β=0.5; α=2, β=2; α=2, β=5; α=5, β=1]
- Parameters: shape α > 0, shape β > 0
- Range of values: [0, 1]
- Good e.g. as a prior for a sampling probability

Examples - Uniform distribution
[Figure: PDF curves for l=-0.5, u=0.5; l=0, u=1.7; l=-1, u=1; l=-1.5, u=1.5]
- Parameters: lower bound l, upper bound u
- Range of values: [l, u], where the bounds can lie anywhere in (-∞, ∞)
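The four example families can be written down directly from their textbook density formulas. The sketch below uses only the Python standard library. It assumes that M and S are the mean and standard deviation of the logarithm of the variable (the usual LogNormal parameterisation), and the function names are illustrative rather than taken from BEAST2 or BEAUti.

```python
import math


def normal_pdf(x, mu, sigma):
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def lognormal_pdf(x, M, S):
    # Assumption: M and S describe log(x), not x itself.
    if x <= 0.0:
        return 0.0
    z = (math.log(x) - M) / S
    return math.exp(-0.5 * z * z) / (x * S * math.sqrt(2.0 * math.pi))


def beta_pdf(x, alpha, beta):
    # Evaluated on the open interval only; the density can be unbounded
    # at 0 or 1 when a shape parameter is below 1.
    if not 0.0 < x < 1.0:
        return 0.0
    norm = math.gamma(alpha + beta) / (math.gamma(alpha) * math.gamma(beta))
    return norm * x ** (alpha - 1.0) * (1.0 - x) ** (beta - 1.0)


def uniform_pdf(x, lower, upper):
    return 1.0 / (upper - lower) if lower <= x <= upper else 0.0


# Density at a few points, using parameter values from the example figures.
print(normal_pdf(0.0, 0.0, 0.1))
print(lognormal_pdf(1.0, 0.0, 1.0))
print(beta_pdf(0.5, 2.0, 2.0))
print(uniform_pdf(0.0, -0.5, 0.5))
```

Evaluating these functions on a grid gives a quick way to see how much weight a candidate prior puts on each region before committing to it.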

Is the uniform distribution a non-informative prior?
- Not really.
- Imagine setting a Uniform(0, 100) prior for the transition/transversion rate ratio (κ). You also know that the most likely values of κ are between 0 and 10. But you now put 9/10 of the weight on values > 10.
[Figure: flat density f(κ) over κ from 0 to 100, with the region κ > 10 marked "9/10 of all weight"]
- In fact, there is no such thing as a non-informative prior.
- If little or no information on the parameter is available, use diffuse priors.
- Try to avoid Uniform(-∞, ∞) or Uniform(0, ∞).
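The 9/10 figure follows from the interval lengths alone. A minimal sketch of the arithmetic, with a hypothetical helper `uniform_mass` introduced only for this illustration:

```python
def uniform_mass(lower, upper, a, b):
    """Prior mass that Uniform(lower, upper) assigns to the interval [a, b]."""
    lo = max(lower, a)
    hi = min(upper, b)
    return max(0.0, hi - lo) / (upper - lower)


# Uniform(0, 100) prior on kappa.
mass_plausible = uniform_mass(0.0, 100.0, 0.0, 10.0)    # values believed likely
mass_implausible = uniform_mass(0.0, 100.0, 10.0, 100.0)  # values believed unlikely
print(mass_plausible, mass_implausible)
```

Widening the upper bound makes the imbalance worse: the share of prior mass on the plausible range shrinks in proportion to the width of the interval.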

Proper vs improper priors
- Sometimes the sum or the integral of the prior distribution does not converge; such a prior is called an IMPROPER prior.
- Examples: 1/x, Uniform(-∞, ∞)
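For both examples the divergence can be checked directly. The following assumes the 1/x prior is taken over (0, ∞) and writes the unbounded uniform as a constant density c > 0:

```latex
\int_{0}^{\infty} \frac{1}{x}\,dx
  = \lim_{a \to 0^{+}} \lim_{b \to \infty} \bigl(\ln b - \ln a\bigr)
  = \infty,
\qquad
\int_{-\infty}^{\infty} c\,dx
  = \lim_{b \to \infty} 2bc
  = \infty \quad (c > 0).
```

Neither function can be rescaled to integrate to 1, so neither is a probability density.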

Are my priors what I set them to be?
- Not always.
- Induced priors may change the picture: if the parameters interact, the marginal prior distribution of each individual parameter may differ from the originally specified prior.
- Use sampling from the prior to see what your real prior is.
[Figure adapted from [Heled and Drummond, 2012]: density against time in Myears. The marginal prior distributions that result from the multiplicative construction (gray) versus the calibration densities (black line) specified for the calibrated nodes.]
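A toy version of an induced prior can be reproduced in a few lines. The sketch below assumes two node ages that are each given a Uniform(0, 10) calibration, plus the tree constraint that the parent must be older than the child. It samples from this joint prior by simple rejection. This is an illustration of the effect only, not the multiplicative construction discussed by Heled and Drummond, and not how BEAST2 samples from the prior.

```python
import random


def sample_constrained_ages(n_draws, seed=1):
    """Draw (child, parent) ages from Uniform(0, 10) each, keeping child < parent."""
    rng = random.Random(seed)
    kept = []
    for _ in range(n_draws):
        child = rng.uniform(0.0, 10.0)
        parent = rng.uniform(0.0, 10.0)
        if child < parent:  # topology constraint: parent node is older
            kept.append((child, parent))
    return kept


draws = sample_constrained_ages(100_000)
child_mean = sum(c for c, _ in draws) / len(draws)
parent_mean = sum(p for _, p in draws) / len(draws)

# Each calibration on its own has mean 5; the constraint pulls the marginals apart.
print(f"child mean  = {child_mean:.2f}")
print(f"parent mean = {parent_mean:.2f}")
```

Under the constraint the child's marginal prior is skewed towards young ages and the parent's towards old ages (means near 10/3 and 20/3), even though both were specified as uniform.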

How to choose priors?
- Use all the prior knowledge you have to choose models and set appropriate parameter priors.
- Sample from the prior distribution before using your data, to check that you really have the priors you want.
- Check your posterior distribution against the prior.

Word of caution
- In practice, it is important to evaluate the impact of the prior on the posterior in a Bayesian robustness analysis.
- Ideally, the posterior should be dominated by your data, so that the choice of prior has little influence on the result.
- If this is not the case, the choice of prior is very important and should be reported.
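The idea behind a robustness analysis can be sketched with a conjugate toy model in which the posterior is available in closed form. The example assumes a binomial likelihood with a Beta(α, β) prior on the success probability, so the posterior mean is (α + k) / (α + β + n) for k successes in n trials. The data sets and the choice of priors are invented for illustration and have nothing to do with a phylogenetic analysis.

```python
def posterior_mean(alpha, beta, successes, trials):
    """Posterior mean of a Beta(alpha, beta) prior after binomial data."""
    return (alpha + successes) / (alpha + beta + trials)


# Three candidate priors (shapes taken from the Beta example slide).
PRIORS = {"Beta(1,1)": (1.0, 1.0), "Beta(2,5)": (2.0, 5.0), "Beta(5,1)": (5.0, 1.0)}


def prior_sensitivity(successes, trials):
    """Largest gap between posterior means across the candidate priors."""
    means = [posterior_mean(a, b, successes, trials) for a, b in PRIORS.values()]
    return max(means) - min(means)


small_data_gap = prior_sensitivity(3, 10)      # hypothetical small data set
large_data_gap = prior_sensitivity(300, 1000)  # same proportion, 100x the data
print(f"gap with n=10:   {small_data_gap:.3f}")
print(f"gap with n=1000: {large_data_gap:.3f}")
```

With little data the posterior mean moves noticeably with the prior; with much more data the three priors give nearly the same answer, which is the "dominated by your data" situation described above.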

Starting values
- Are just starting values.
- Have to lie within the prior distribution you chose for the parameter, including its upper and lower limits.
- Use your best guess.
- BEAST2 makes at most 10 attempts (this can be changed) to initialize the run, but if the starting values are unreasonable, the runs may keep failing.
- Start from different starting values to make sure the chains converge to the same distribution.
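Both points, that a starting value must lie inside the prior and that chains started in different places should agree, can be illustrated with a toy Metropolis sampler. The sketch assumes a Uniform(-10, 10) prior and a standard-normal likelihood term, both chosen purely for illustration. It rejects an invalid start immediately rather than imitating BEAST2's repeated initialization attempts.

```python
import math
import random

LOWER, UPPER = -10.0, 10.0  # hypothetical prior bounds


def log_posterior(x):
    if not LOWER <= x <= UPPER:
        return -math.inf      # outside the prior: zero density
    return -0.5 * x * x       # toy likelihood, up to a constant


def run_chain(start, n_steps, seed):
    if log_posterior(start) == -math.inf:
        raise ValueError("starting value lies outside the prior bounds")
    rng = random.Random(seed)
    x, samples = start, []
    for _ in range(n_steps):
        proposal = x + rng.gauss(0.0, 1.0)
        log_ratio = log_posterior(proposal) - log_posterior(x)
        if rng.random() < math.exp(min(0.0, log_ratio)):
            x = proposal      # accept; otherwise keep the current value
        samples.append(x)
    return samples


BURN_IN = 2_000
starts = [-8.0, 0.0, 8.0]
chain_means = []
for seed, start in enumerate(starts):
    samples = run_chain(start, 20_000, seed)[BURN_IN:]
    chain_means.append(sum(samples) / len(samples))
print(chain_means)
```

If the post-burn-in summaries from the different starts disagree, the chains have not yet converged to the same distribution and should be run longer or re-examined.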

References
- du Plessis, L. and Stadler, T. (2015). Getting to the root of epidemic spread with phylodynamic analysis of genomic data. Trends in Microbiology, 23(7):383-386.
- Heled, J. and Drummond, A. J. (2012). Calibrated tree priors for relaxed phylogenetics and divergence time estimation. Systematic Biology, 61(1):138-149.