STAT 425: Introduction to Bayesian Analysis
1 STAT 425: Introduction to Bayesian Analysis
Marina Vannucci, Rice University, USA, Fall 2018
Marina Vannucci (Rice University, USA) Bayesian Analysis (Part 1) Fall 2018
2 Lectures 9-11: Multi-parameter models. The Normal model
3 Parameterizations of the Normal Distribution
Mean and standard deviation: $f(x \mid \mu, \sigma^2) = \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}$, $x \in \mathbb{R}$, $\sigma > 0$.
Mean and precision: $f(x \mid \mu, \tau) = \sqrt{\frac{\tau}{2\pi}}\, e^{-\frac{\tau (x-\mu)^2}{2}}$, $x \in \mathbb{R}$, $\tau = 1/\sigma^2 > 0$.
The latter has advantages in numerical computations when $\sigma \to 0$, and it simplifies formulas.
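The equivalence of the two parameterizations can be checked numerically. The following is a minimal sketch in R; `dnorm_prec` is a hypothetical helper name (not a base R function), and the values of `mu` and `sigma2` are arbitrary.

```r
# Normal density in the mean/precision parameterization, tau = 1 / sigma^2.
# dnorm_prec is an illustrative helper name, not a base R function.
dnorm_prec <- function(x, mu, tau) {
  sqrt(tau / (2 * pi)) * exp(-0.5 * tau * (x - mu)^2)
}

mu <- 1        # arbitrary mean
sigma2 <- 4    # arbitrary variance
x <- c(-1, 0, 2.5)

# The two parameterizations should agree up to floating-point error.
same <- all.equal(dnorm_prec(x, mu, tau = 1 / sigma2),
                  dnorm(x, mean = mu, sd = sqrt(sigma2)))
print(same)
```

Working with `tau` avoids dividing by a variance that is close to zero, which is the numerical advantage mentioned on the slide.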
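4 Summary
Distribution | pdf/pmf | Domain | Mean | Variance
Bern | $P(x) = p^x (1-p)^{1-x}$ | $\{0, 1\}$ | $p$ | $p(1-p)$
Bin | $P(x) = \binom{N}{x} p^x (1-p)^{N-x}$ | $\{0, \dots, N\}$ | $Np$ | $Np(1-p)$
Poi | $P(x) = e^{-\lambda} \frac{\lambda^x}{x!}$ | $\mathbb{N}$ | $\lambda$ | $\lambda$
NB | $P(x) = \binom{r+x-1}{x} p^r (1-p)^x$ | $\mathbb{N}$ | $\frac{r(1-p)}{p}$ | $\frac{r(1-p)}{p^2}$
M | $P(x_1, \dots, x_K) = \frac{N!}{\prod_k x_k!} \prod_k p_k^{x_k}$ | $\{0, \dots, N\}^K$ | $N p_k$ | $N p_k (1-p_k)$
U | $f(x) = \frac{1}{b-a}$ | $[a, b]$ | $\frac{a+b}{2}$ | $\frac{(b-a)^2}{12}$
Be | $f(x) = \frac{\Gamma(a+b)}{\Gamma(a)\Gamma(b)} x^{a-1} (1-x)^{b-1}$ | $[0, 1]$ | $\frac{a}{a+b}$ | $\frac{ab}{(a+b)^2 (a+b+1)}$
Ga | $f(x) = \frac{b^a}{\Gamma(a)} x^{a-1} e^{-bx}$ | $\mathbb{R}^+$ | $\frac{a}{b}$ | $\frac{a}{b^2}$
N | $f(x) = \frac{1}{\sqrt{2\pi\sigma^2}} e^{-\frac{(x-\mu)^2}{2\sigma^2}}$ | $\mathbb{R}$ | $\mu$ | $\sigma^2$
MN | $f(x) = (2\pi)^{-p/2} \lvert\Sigma\rvert^{-1/2} e^{-\frac{1}{2}(X-\mu)^T \Sigma^{-1} (X-\mu)}$ | $\mathbb{R}^p$ | $\mu$ | $\Sigma$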
5 Summary
Model | parameters | MOM | MLE | UMVUE
Bern p X X X X n 1 Bin p n S X X X nn nn Poi λ X X X NB r p X X n 1 with known r n S ˆr ˆr+ X r r+ X U a X 3 n 1 n S X (1) with a = 0 Ga b a b X + 3 n 1 n S n+1 X (n) n X (n) X n 1 with known a n S X ā n 1 n S X N µ X X X σ n 1 n 1 S n n S with known σ
6 Related Distributions
Normal distribution $X \sim N(\mu, \sigma^2)$:
Truncated normal distribution: $f(x \mid \mu, \sigma^2, a, b) = \dfrac{f(x \mid \mu, \sigma^2)}{\Phi\left(\frac{b-\mu}{\sigma}\right) - \Phi\left(\frac{a-\mu}{\sigma}\right)}$;
Standardized t-distribution: $\dfrac{\bar X - \mu}{s/\sqrt{n}} \sim t_{n-1}(0, 1)$, with $\bar X = \frac{1}{n}\sum_{i=1}^n X_i$ and $s^2 = \frac{1}{n-1}\sum_{i=1}^n (X_i - \bar X)^2$.
Standard normal distribution $X \sim N(0, 1)$:
Log-normal distribution: $e^{\mu + \sigma X} \sim LN(\mu, \sigma^2)$;
Cauchy distribution: $X_1 / X_2 \sim \mathrm{Cauchy}(0, 1)$.
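7 Bell-shaped Distributions
Laplace distribution (double exponential distribution): $f(x \mid \mu, b) = \frac{1}{2b}\, e^{-\frac{|x-\mu|}{b}}$, $x \in \mathbb{R}$, $b > 0$.
Cauchy distribution: $f(x \mid \mu, \gamma) = \dfrac{1}{\pi\gamma\left[1 + \left(\frac{x-\mu}{\gamma}\right)^2\right]}$, $x \in \mathbb{R}$, $\gamma > 0$.
t-distribution: $f(x \mid \nu, \mu, \sigma) = \dfrac{\Gamma\left(\frac{\nu+1}{2}\right)}{\sqrt{\nu\pi}\,\sigma\,\Gamma\left(\frac{\nu}{2}\right)} \left[1 + \frac{1}{\nu}\left(\frac{x-\mu}{\sigma}\right)^2\right]^{-\frac{\nu+1}{2}}$, $x \in \mathbb{R}$, $\nu > 0$, $\sigma > 0$.
Logistic distribution: $f(x \mid \mu, s) = \dfrac{e^{-\frac{x-\mu}{s}}}{s\left(1 + e^{-\frac{x-\mu}{s}}\right)^2}$, $x \in \mathbb{R}$, $s > 0$.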
8 [Figure: densities of the Laplace, Cauchy, standardized t and logistic distributions]
9 The Gamma distribution - a refresher
The Gamma distribution is often used to model parameters that can only take positive values. This choice is largely motivated by the fact that the Gamma distribution acts as a conjugate prior in many models.
$\theta \sim \mathrm{Gamma}(\alpha, \beta)$: $p(\theta) = \frac{\beta^\alpha}{\Gamma(\alpha)}\, \theta^{\alpha-1} e^{-\beta\theta}$, $\alpha, \beta > 0$.
$\mathrm{Gamma}(1, \beta) \equiv \mathrm{Exp}(\beta)$ (exponential density)
$\mathrm{Gamma}(\nu/2, 1/2) \equiv \chi^2_\nu$ (chi-square density)
[Figure: Gamma(5, 1) density; y-axis dgamma(sort(x), shape = 5, rate = 1), x-axis x]
10 The Gamma distribution
$\theta \sim \mathrm{Gamma}(\alpha, \beta)$: $p(\theta) = \frac{\beta^\alpha}{\Gamma(\alpha)}\, \theta^{\alpha-1} e^{-\beta\theta}$, $\alpha, \beta > 0$.
$E(\theta) = \frac{\alpha}{\beta}$
$\mathrm{Mode}(\theta) = \frac{\alpha - 1}{\beta}$, $\alpha > 1$
$V(\theta) = \frac{\alpha}{\beta^2}$
[Figure: Gamma(5, 2) density; y-axis dgamma(sort(x), shape = 5, rate = 2), x-axis x]
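The closed-form mean, mode and variance of the Gamma density can be confirmed numerically. This is a small sketch in R under the shape/rate parameterization used by `dgamma`; the values `alpha = 5` and `beta = 2` are chosen for illustration.

```r
alpha <- 5   # shape (illustrative value)
beta <- 2    # rate (illustrative value)
dens <- function(t) dgamma(t, shape = alpha, rate = beta)

# Mean and second moment by numerical integration of the density.
m1 <- integrate(function(t) t * dens(t), 0, Inf)$value
m2 <- integrate(function(t) t^2 * dens(t), 0, Inf)$value
# Mode by numerical maximisation of the density.
mode_num <- optimize(dens, interval = c(0, 20), maximum = TRUE)$maximum

print(c(mean = m1, closed_form = alpha / beta))
print(c(variance = m2 - m1^2, closed_form = alpha / beta^2))
print(c(mode = mode_num, closed_form = (alpha - 1) / beta))
```

Note that R's `dgamma` also accepts a `scale` argument (the reciprocal of the rate), so it is worth naming `rate =` explicitly, as the slides do.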
11 Possible models
Data likelihood:
$f(x_1, \dots, x_n \mid \mu, \sigma^2) = \prod_{i=1}^n f(x_i \mid \mu, \sigma^2) = \prod_{i=1}^n \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-\frac{(x_i-\mu)^2}{2\sigma^2}} = (2\pi\sigma^2)^{-n/2}\, e^{-\frac{\sum_{i=1}^n (x_i-\mu)^2}{2\sigma^2}}$.
Models:
$\mu$ is unknown, $\sigma^2$ is known;
$\mu$ is known, $\sigma^2$ is unknown;
Both $\mu$ and $\sigma^2$ are unknown: $\mu$ is dependent on $\sigma^2$; $\mu$ and $\sigma^2$ are independent.
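12 Useful facts for derivations
Normal component: if $\pi(\theta) \propto e^{-\frac{1}{2}(a\theta^2 - 2b\theta)}$, then $\theta \sim N\left(\frac{b}{a}, \frac{1}{a}\right)$ and
$\int \frac{1}{\sqrt{2\pi a^{-1}}}\, e^{-\frac{b^2}{2a}}\, e^{-\frac{1}{2}(a\theta^2 - 2b\theta)}\, d\theta = 1$.
Gamma component: if $\pi(\theta) \propto \theta^{a-1} e^{-b\theta}$, then $\theta \sim Ga(a, b)$ and
$\int \frac{b^a}{\Gamma(a)}\, \theta^{a-1} e^{-b\theta}\, d\theta = 1$.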
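13 Student component: if $\pi(\theta) \propto \left(\delta + \frac{(\theta - l)^2}{S}\right)^{-\frac{\delta+1}{2}}$, then $\theta \sim t_\delta(l, S)$ and
$\int \frac{1}{\sqrt{\pi S}}\, \frac{\Gamma\left(\frac{\delta+1}{2}\right)}{\Gamma\left(\frac{\delta}{2}\right)}\, \delta^{\delta/2} \left(\delta + \frac{(\theta - l)^2}{S}\right)^{-\frac{\delta+1}{2}} d\theta = 1$.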
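14 The Normal Model
$x = (x_1, \dots, x_n) \sim N(\mu, \sigma^2)$ i.i.d., with both $\mu$ and $\sigma^2$ unknown. The likelihood is:
$L(\mu, \sigma^2) \propto \prod_{i=1}^n \frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\frac{1}{2\sigma^2}(x_i - \mu)^2\right) \propto \left(\frac{1}{\sigma^2}\right)^{n/2} \exp\left(-\frac{1}{2\sigma^2}\sum_i (x_i - \mu)^2\right)$
For inference, focus is on $p(\mu, \sigma^2 \mid x) = p(\mu \mid \sigma^2, x)\, p(\sigma^2 \mid x)$.
From a Bayesian perspective, it is easier to work with the precision, $\tau = 1/\sigma^2$. The likelihood becomes:
$L(\mu, \tau) \propto \prod_{i=1}^n \frac{1}{\sqrt{2\pi}}\, \tau^{1/2} \exp\left(-\frac{1}{2}\tau (x_i - \mu)^2\right) \propto \tau^{n/2} \exp\left(-\frac{1}{2}\tau \sum_i (x_i - \mu)^2\right)$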
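15 Likelihood factorization:
$L(\mu, \tau) \propto \tau^{n/2} \exp\left(-\frac{1}{2}\tau \sum_i (x_i - \mu)^2\right)$
$\propto \tau^{n/2} \exp\left(-\frac{1}{2}\tau \sum_i \left[(x_i - \bar x) - (\mu - \bar x)\right]^2\right)$
$\propto \tau^{n/2} \exp\left(-\frac{1}{2}\tau \left[\sum_i (x_i - \bar x)^2 + n(\mu - \bar x)^2\right]\right)$
$\propto \tau^{n/2} \exp\left(-\frac{1}{2}\tau s^2 (n-1)\right) \exp\left(-\frac{1}{2}\tau n (\mu - \bar x)^2\right)$
$\propto \tau^{n/2} \exp\left(-\frac{1}{2}\tau\, SS\right) \exp\left(-\frac{1}{2}\tau n (\mu - \bar x)^2\right)$
with $s^2 = \sum_i (x_i - \bar x)^2 / (n-1)$ the sample variance and $SS = \sum_i (x_i - \bar x)^2$ the sum of squares [$SS$ and $\bar x$ are sufficient statistics].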
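The factorization rests on the identity $\sum_i (x_i - \mu)^2 = SS + n(\mu - \bar x)^2$ (the cross term vanishes because $\sum_i (x_i - \bar x) = 0$). A short R sketch, using simulated data and arbitrary values of `mu` and `tau` as assumptions, checks both the identity and the resulting log-likelihood:

```r
set.seed(1)
x <- rnorm(13, mean = 2, sd = 0.7)   # simulated stand-in data
n <- length(x)
xbar <- mean(x)
SS <- sum((x - xbar)^2)              # sum of squares about the sample mean

mu <- 1.3    # arbitrary parameter values at which to evaluate the likelihood
tau <- 2.0

# Identity behind the factorization.
lhs <- sum((x - mu)^2)
rhs <- SS + n * (mu - xbar)^2

# Log-likelihood computed directly and from the sufficient statistics (SS, xbar).
loglik_direct <- sum(dnorm(x, mean = mu, sd = 1 / sqrt(tau), log = TRUE))
loglik_fact <- (n / 2) * log(tau) - (n / 2) * log(2 * pi) -
  0.5 * tau * SS - 0.5 * tau * n * (mu - xbar)^2

print(all.equal(lhs, rhs))
print(all.equal(loglik_direct, loglik_fact))
```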
16 Non-informative Prior
Non-informative prior: $\pi(\mu, \sigma^2) \propto \frac{1}{\sigma^2}$.
This arises by considering $\mu$ and $\sigma^2$ a priori independent and taking the product of the standard non-informative priors.
This is not a conjugate setting (the posterior does not factor into a product of two independent distributions).
The prior is improper but the posterior is proper.
This is also the Jeffreys prior.
Joint posterior distribution of $\mu$ and $\sigma^2$ is
$p(\mu, \sigma^2 \mid x) \propto (\sigma^2)^{-(n/2+1)} \exp\left\{-\frac{1}{2\sigma^2}\left[(n-1)s^2 + n(\bar x - \mu)^2\right]\right\}$
where $s^2 = \frac{1}{n-1}\sum_{i=1}^n (x_i - \bar x)^2$.
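17 Deriving the conditional posterior distribution, $p(\mu \mid \sigma^2, x)$, is equivalent to deriving the posterior for $\mu$ when $\sigma^2$ is known:
$\mu \mid \sigma^2, x \sim N\left(\bar x, \frac{\sigma^2}{n}\right)$
The marginal posterior, $p(\sigma^2 \mid x)$, is obtained by integrating $p(\mu, \sigma^2 \mid x)$ over $\mu$
[Hint: integral of a Gaussian function, $c\sqrt{2\pi} = \int \exp\left(-\frac{1}{2c^2}(\mu + b)^2\right) d\mu$]
$p(\sigma^2 \mid x) \propto \int (\sigma^2)^{-(n/2+1)} \exp\left\{-\frac{1}{2\sigma^2}\left[(n-1)s^2 + n(\bar x - \mu)^2\right]\right\} d\mu \propto (\sigma^2)^{-[(n-1)/2+1]} \exp\left\{-\frac{(n-1)s^2}{2\sigma^2}\right\}$
which is an inverse-gamma density, i.e.
$\sigma^2 \mid x \sim \text{Inv-Gamma}\left(\frac{n-1}{2}, \frac{n-1}{2}s^2\right) \equiv \text{Inv-}\chi^2(n-1, s^2)$
or, equivalently, $\tau \mid x \sim Ga\left(\frac{n-1}{2}, \frac{n-1}{2}s^2\right)$.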
18 Sampling from the joint posterior distribution
One can simulate a value of $(\mu, \sigma^2)$ from the joint posterior density by
1. simulating $\sigma^2$ from an inverse-gamma$\left(\frac{n-1}{2}, \frac{n-1}{2}s^2\right)$ distribution [take the inverse of random samples from a Gamma$\left(\frac{n-1}{2}, \frac{n-1}{2}s^2\right)$];
2. then simulating $\mu$ from the $N\left(\bar x, \frac{\sigma^2}{n}\right)$ distribution.
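The two-step scheme can be sketched in R as follows. The data `y` are simulated stand-ins (an assumption, since the raw observations are not given here), and `S` is an arbitrary number of draws.

```r
set.seed(425)
y <- rnorm(13, mean = 2, sd = 0.7)   # simulated stand-in data
n <- length(y)
ybar <- mean(y)
s2 <- var(y)                         # sample variance s^2
S <- 10000                           # number of posterior draws

# Step 1: sigma^2 | y ~ Inv-Gamma((n-1)/2, (n-1) s^2 / 2),
# obtained by inverting Gamma draws for the precision tau.
tau <- rgamma(S, shape = (n - 1) / 2, rate = (n - 1) * s2 / 2)
sigma2 <- 1 / tau

# Step 2: mu | sigma^2, y ~ N(ybar, sigma^2 / n), one draw per sigma^2.
mu <- rnorm(S, mean = ybar, sd = sqrt(sigma2 / n))

# Posterior summaries of mu from the joint draws.
print(quantile(mu, c(0.025, 0.5, 0.975)))
```

Each pair `(mu[i], sigma2[i])` is one draw from the joint posterior, so any function of both parameters can be summarized directly from the draws.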
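19 Marginal posterior distribution $p(\mu \mid x)$ of $\mu$
As $\mu$ is typically the parameter of interest ($\sigma^2$ being a nuisance parameter), it is useful to calculate its marginal posterior distribution
[Hint: integral of a Gamma function, $\frac{\Gamma(a)}{b^a} = \int_0^\infty z^{a-1} \exp(-zb)\, dz$]
$p(\mu \mid x) = \int_0^\infty p(\mu, \sigma^2 \mid x)\, d\sigma^2 \propto \int_0^\infty (\sigma^2)^{-(n/2+1)} \exp\left\{-\frac{1}{2\sigma^2}\left[(n-1)s^2 + n(\bar x - \mu)^2\right]\right\} d\sigma^2$
$\propto A^{-n/2} \int_0^\infty z^{(n-2)/2} \exp(-z)\, dz$, with $A = (n-1)s^2 + n(\bar x - \mu)^2$, $z = \frac{A}{2\sigma^2}$
$\propto A^{-n/2} \propto \left[1 + \frac{1}{n-1}\left(\frac{\mu - \bar x}{s/\sqrt{n}}\right)^2\right]^{-[(n-1)+1]/2}$
that is, $\mu \mid x \sim t(n-1, \bar x, s^2/n)$, or $\frac{\mu - \bar x}{s/\sqrt{n}} \,\Big|\, x \sim t_{n-1}$, with $t_{n-1}$ the standard t-distribution with $n-1$ degrees of freedom.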
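Because the marginal posterior of $\mu$ is a location-scale t, an equal-tailed credible interval follows directly from t quantiles. The sketch below is an illustration in R; `mu_interval` is a hypothetical helper name and the summary statistics passed to it are made up.

```r
# Equal-tailed posterior interval for mu under the prior proportional to 1/sigma^2,
# using mu | x ~ t_{n-1}(xbar, s^2 / n).
# mu_interval is an illustrative helper name.
mu_interval <- function(xbar, s2, n, level = 0.95) {
  alpha <- 1 - level
  xbar + qt(c(alpha / 2, 1 - alpha / 2), df = n - 1) * sqrt(s2 / n)
}

# Made-up summary statistics, for illustration only.
ci <- mu_interval(xbar = 2.0, s2 = 0.5, n = 13)
print(ci)
```

Numerically this interval coincides with the classical t confidence interval for a normal mean, although its interpretation is a posterior probability statement about $\mu$.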
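20 Conjugate Prior Model
A conjugate prior must be of the form $\pi(\mu, \sigma^2) = \pi(\mu \mid \sigma^2)\, \pi(\sigma^2)$, e.g.,
$\mu \mid \sigma^2 \sim N(\mu_0, \sigma^2/\tau_0)$, $\sigma^2 \sim IG\left(\frac{\nu_0}{2}, \frac{SS_0}{2}\right)$ [or $\tau \sim Ga\left(\frac{\nu_0}{2}, \frac{SS_0}{2}\right)$],
which corresponds to the joint prior density
$p(\mu, \sigma^2) \propto \left(\frac{\sigma^2}{\tau_0}\right)^{-1/2} \exp\left\{-\frac{\tau_0}{2\sigma^2}(\mu - \mu_0)^2\right\} (\sigma^2)^{-(\nu_0/2+1)} \exp\left\{-\frac{SS_0}{2\sigma^2}\right\}$
$\propto (\sigma^2)^{-\left(\frac{\nu_0+1}{2}+1\right)} \exp\left\{-\frac{\tau_0}{2\sigma^2}\left(\frac{SS_0}{\tau_0} + (\mu - \mu_0)^2\right)\right\}$
We call this a Normal-Inverse-Gamma prior, $(\mu, \sigma^2) \sim NIG(\mu_0, \tau_0, \nu_0/2, SS_0/2)$.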
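21 Joint Posterior $p(\mu, \sigma^2 \mid x)$
$p(\mu, \sigma^2 \mid x) \propto (\sigma^2)^{-\left(\frac{\nu_0+1}{2}+1\right)} \exp\left\{-\frac{1}{2\sigma^2}\left(SS_0 + \tau_0(\mu - \mu_0)^2\right)\right\} (\sigma^2)^{-n/2} \exp\left\{-\frac{1}{2\sigma^2}\sum_{i=1}^n (x_i - \mu)^2\right\}$
$\propto (\sigma^2)^{-\left(\frac{\nu_n+1}{2}+1\right)} \exp\left\{-\frac{\tau_n}{2\sigma^2}\left(\frac{SS_n}{\tau_n} + (\mu - \mu_n)^2\right)\right\}$
with
$\mu \mid \sigma^2, x \sim N(\mu_n, \sigma^2/\tau_n)$, $\mu_n = \dfrac{\mu_0 \frac{\tau_0}{\sigma^2} + \bar x \frac{n}{\sigma^2}}{\frac{\tau_0}{\sigma^2} + \frac{n}{\sigma^2}} = \dfrac{\tau_0\mu_0 + n\bar x}{\tau_n}$, $\tau_n = \tau_0 + n$
$\sigma^2 \mid x \sim IG\left(\frac{\nu_n}{2}, \frac{SS_n}{2}\right)$, $\nu_n = \nu_0 + n$, $SS_n = SS_0 + SS + \frac{\tau_0 n}{\tau_n}(\bar x - \mu_0)^2$
Thus, $\mu, \sigma^2 \mid x \sim$ Normal-Inverse-Gamma$(\mu_n, \tau_n; \nu_n/2, SS_n/2)$.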
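The hyperparameter update can be written as a small R function. This is a sketch: `nig_update` and its argument names are hypothetical, and the inputs in the usage line are toy numbers rather than values from the slides.

```r
# Normal-Inverse-Gamma update from sufficient statistics (xbar, SS, n).
# nig_update is an illustrative helper name.
nig_update <- function(xbar, SS, n, mu0, tau0, nu0, SS0) {
  tau_n <- tau0 + n                                   # posterior "sample size"
  mu_n <- (tau0 * mu0 + n * xbar) / tau_n             # weighted average of mu0 and xbar
  nu_n <- nu0 + n                                     # posterior degrees of freedom
  SS_n <- SS0 + SS + tau0 * n * (xbar - mu0)^2 / tau_n
  list(mu_n = mu_n, tau_n = tau_n, nu_n = nu_n, SS_n = SS_n)
}

# Toy inputs, chosen so the arithmetic is easy to follow by hand.
post <- nig_update(xbar = 2, SS = 3, n = 3, mu0 = 0, tau0 = 1, nu0 = 1, SS0 = 1)
print(unlist(post))
```

With these toy inputs, $\tau_n = 4$, $\mu_n = 6/4 = 1.5$, $\nu_n = 4$ and $SS_n = 1 + 3 + (1 \cdot 3 / 4) \cdot 2^2 = 7$.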
22 Also $\mu \mid x \sim t_{\nu_n}(\mu_n, \sigma_n^2/\tau_n)$, $\sigma_n^2 = SS_n/\nu_n$
[Note: Again $\int N(m, \sigma^2/\tau)\, IG(\nu/2, SS/2)\, d\sigma^2 = t_\nu(m, SS/(\nu\tau))$]
Comments:
$\mu_n$: expected value for $\mu$ after seeing the data, $\mu_n = \frac{n}{\tau_n}\bar x + \frac{\tau_0}{\tau_n}\mu_0$, a weighted average.
$\tau_n$: precision for estimating $\mu$ after $n$ observations.
$\nu_n$: degrees of freedom [$\tau \sim Ga(\alpha/2, \beta/2) \Rightarrow \beta\tau \sim \chi^2_\alpha$, with $\alpha$ degrees of freedom].
$SS_n$: posterior variation as prior variation + observed variation + variation between prior mean and sample mean.
Limiting case $\tau_0 \to 0$, $\nu_0 \to -1$ (and $SS_0 \to 0$): then $\mu \mid x \sim t_{n-1}(\bar x, s^2/n)$ (same as improper prior!)
23 Example on SPF (from Merlise Clyde)
A Sunlight Protection Factor (SPF) of 5 means that an individual who can tolerate X minutes of sunlight without any sunscreen can tolerate 5X minutes with sunscreen.
Data on 13 individuals (tolerance, in min, with and without sunscreen).
The analysis should take into account the pairing, which induces dependence between observations (take differences and use ratios or log(ratios) = difference in logs).
Ratios make more sense given the goal: how much longer can a person be exposed to the sun relative to their baseline.
Model: $Y = \log(\mathrm{TRT}) - \log(\mathrm{CONTROL}) \sim N(\mu, \sigma^2)$. Then $E(\log(\mathrm{TRT}/\mathrm{CONTROL})) = \mu = \log(\mathrm{SPF})$. Interested in $\exp(\mu) = \mathrm{SPF}$.
Summary statistics: ȳ = 1.998, s = 0.55, n = 13
[make boxplots and Q-Q normal plots to check on normality]
24 Model formulation: $Y = \log(\mathrm{TRT}) - \log(\mathrm{CONTROL}) \sim N(\mu, \sigma^2)$, n = 13, ȳ = 1.998, SS =
Question: $\pi(\mu \mid y_1, \dots, y_n) = ?$
Bayesian model:
Data likelihood: $f(y_1, \dots, y_n \mid \mu, \sigma^2) = \prod_{i=1}^n N(y_i; \mu, \sigma^2)$;
Non-informative prior: $\pi(\mu, \sigma^2) \propto 1/\sigma^2$;
Posterior: $(\mu, \sigma^2 \mid y_1, \dots, y_n) \sim N(\bar y, \sigma^2/n)\, IG\left(\frac{n-1}{2}, \frac{n-1}{2}s^2\right)$
Posterior: $\mu \mid y_1, \dots, y_n \sim t_{n-1}\left(\bar y, \frac{s^2}{n}\right)$;
Prediction: $y_f \mid y_1, \dots, y_n \sim t_{n-1}(\bar y, s^2 (n+1)/n)$.
Coding in R: rgamma(), rnorm() and rt().
25 With non-informative prior.
Posterior: $(\mu, \sigma^2 \mid y_1, \dots, y_n) \sim N(\bar y, \sigma^2/n)\, IG\left(\frac{n-1}{2}, \frac{n-1}{2}s^2\right)$
Posterior: $\mu \mid y_1, \dots, y_n \sim t_{n-1}\left(\bar y, \frac{s^2}{n}\right)$
Define: vn = n - 1 = 12, SSn = s^2 (n - 1) = 0.55, mn =
Sampling from posterior:
Draw τ | Y:
    tau = rgamma(10000, vn/2, rate=SSn/2)
Draw µ | τ, Y:
    mu = rnorm(10000, mn, 1/sqrt(tau*n))
or draw µ | Y directly:
    mu = rt(10000, vn)*sqrt(SSn/(n*vn)) + mn
26 Model formulation: $Y = \log(\mathrm{TRT}) - \log(\mathrm{CONTROL}) \sim N(\mu, \sigma^2)$, n = 13, ȳ = 1.998, SS =
Question: $\pi(\mu \mid y_1, \dots, y_n) = ?$
Bayesian model:
Data likelihood: $f(y_1, \dots, y_n \mid \mu, \sigma^2) = \prod_{i=1}^n N(y_i; \mu, \sigma^2)$;
Conjugate prior: $\mu \mid \sigma^2 \sim N\left(\mu_0, \frac{\sigma^2}{\tau_0}\right)$, $\sigma^2 \sim IG\left(\frac{\nu_0}{2}, \frac{SS_0}{2}\right)$;
Posterior: $(\mu, \sigma^2 \mid y_1, \dots, y_n) \sim NIG(\mu_n, \tau_n; \nu_n/2, SS_n/2)$
Posterior: $\mu \mid y_1, \dots, y_n \sim t_{\nu_n}\left(\mu_n, \frac{SS_n}{\tau_n\nu_n}\right)$;
Prediction: $y_f \mid y_1, \dots, y_n \sim t_{\nu_n}\left(\mu_n, \frac{SS_n}{\nu_n}\,\frac{\tau_n+1}{\tau_n}\right)$.
Coding in R: rgamma(), rnorm() and rt().
27 Expert opinions on µ:
Best guess on median SPF is 16
P(µ > 64) = 0.01
Information in prior is worth 25 observations
Possible subjective prior:
$\mu_0 = \log(16)$, $\tau_0 = 25$, $\nu_0 = \tau_0 - 1$
$P(\mu < \log(64)) = 0.99$ implies $SS_0$ =
Posterior hyperparameters: $\tau_n = 38$, $\mu_n = 2.508$, $\nu_n = 37$, $SS_n$ =
Sampling from posterior:
Draw τ | Y:
    tau = rgamma(10000, vn/2, rate=SSn/2)
Draw µ | τ, Y:
    mu = rnorm(10000, mn, 1/sqrt(tau*tn))
or draw µ | Y directly:
    mu = rt(10000, vn)*sqrt(SSn/(tn*vn)) + mn
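The posterior hyperparameters that do not depend on the sums of squares can be reproduced from the quoted inputs. The sketch below assumes $\tau_0 = 25$, which is the value consistent with $\tau_n = \tau_0 + n = 38$ and $n = 13$; $SS_0$ and $SS_n$ are not recomputed here because their values are not available in this transcription.

```r
n <- 13
ybar <- 1.998        # sample mean of the log ratios, as quoted in the example
mu0 <- log(16)       # prior guess for mu
tau0 <- 25           # prior "sample size" (assumed: tau_n = 38 with n = 13)
nu0 <- tau0 - 1

tau_n <- tau0 + n
mu_n <- (tau0 * mu0 + n * ybar) / tau_n
nu_n <- nu0 + n

print(round(c(tau_n = tau_n, mu_n = mu_n, nu_n = nu_n), 3))
```

The result matches the quoted $\tau_n = 38$, $\mu_n = 2.508$ and $\nu_n = 37$, with the posterior mean pulled from $\bar y = 1.998$ toward the prior guess $\log(16) \approx 2.773$.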
28 Transform to exp(µ). Find 95% C.I. of 4.54 to
29 Predictive Distribution of future z
Posterior predictive distribution (given $x = (x_1, \dots, x_n)$):
$p(z \mid x) = \iint p(z \mid \mu, \sigma^2, x)\, p(\mu, \sigma^2 \mid x)\, d\mu\, d\sigma^2$
[Use the assumption that z is independent of x given µ and σ², then integrate out µ using the normal integral, then integrate out σ² using the Gamma integral]
Reference prior: $z \mid x \sim t_{n-1}\left(\bar x, s^2 (n+1)/n\right)$
Conjugate prior: $z \mid x \sim t_{\nu_n}\left(\mu_n, \sigma_n^2 (\tau_n + 1)/\tau_n\right)$, $\sigma_n^2 = SS_n/\nu_n$
[Can use the normal trick to integrate out µ: if $z \sim N(\mu, \sigma^2)$ and $\mu \sim N(\mu_0, \sigma^2/\tau_0)$, then $y = \frac{z - \mu}{\sigma} \sim N(0, 1)$, that is $z \overset{d}{=} \sigma y + \mu$, and therefore $z \mid \sigma^2 \sim N\left(\mu_0, \sigma^2\left(1 + \frac{1}{\tau_0}\right)\right)$, since a linear combination of independent normals is normal, with means and variances adding.]
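Under the reference prior, the predictive distribution can be sampled either by composition (draw $\sigma^2$, then $\mu$, then $z$) or directly from the scaled t. The R sketch below compares the two; the data `y` are simulated stand-ins and the number of draws `S` is arbitrary.

```r
set.seed(37)
y <- rnorm(13, mean = 2, sd = 0.7)   # simulated stand-in data
n <- length(y)
ybar <- mean(y)
s2 <- var(y)
S <- 20000

# Composition: sigma^2 | y, then mu | sigma^2, y, then z | mu, sigma^2.
sigma2 <- 1 / rgamma(S, shape = (n - 1) / 2, rate = (n - 1) * s2 / 2)
mu <- rnorm(S, mean = ybar, sd = sqrt(sigma2 / n))
z_comp <- rnorm(S, mean = mu, sd = sqrt(sigma2))

# Direct: z | y ~ t_{n-1}(ybar, s^2 (n + 1) / n).
z_direct <- ybar + sqrt(s2 * (n + 1) / n) * rt(S, df = n - 1)

# The two sets of draws should have similar quantiles, up to Monte Carlo error.
probs <- c(0.05, 0.25, 0.5, 0.75, 0.95)
print(rbind(composition = quantile(z_comp, probs),
            direct = quantile(z_direct, probs)))
```

The composition route generalizes to models where the predictive distribution has no closed form, as long as posterior draws of the parameters are available.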
30 Prior predictive distribution: what we expect the distribution to be before we observe the data,
$p(z) = \iint p(z \mid \mu, \sigma^2)\, \pi(\mu, \sigma^2)\, d\mu\, d\sigma^2$
$z \sim t_{\nu_0}\left(\mu_0, \frac{SS_0}{\nu_0}\left(1 + \frac{1}{\tau_0}\right)\right)$ [as above]
[$\iint N(\mu, \sigma^2)\, N(\mu_0, \sigma^2/\tau_0)\, IG(\nu/2, SS/2)\, d\mu\, d\sigma^2 = t_\nu\left(\mu_0, \frac{SS}{\nu}\left(1 + \frac{1}{\tau_0}\right)\right)$]
Note: This is what we used in the example to specify our subjective prior.
31 Back to example
Prior predictive distribution: $z \sim t_{24}\left(\log(16), \frac{187.5}{24}\left(1 + \frac{1}{25}\right)\right)$
Posterior predictive distribution: $z \sim t_{37}\left(2.5, 5.3\left(1 + \frac{1}{38}\right)\right)$
    Y = rt(10000, 24)*sqrt((1 + 1/25)*187.5/24) + log(16)
    quantile(exp(Y))
    0% 25% 50% 75% 100%
    4.57e
Sampling from the posterior predictive leads to a 50% C.I. of (0.0003,1.4): with sunscreen, there is a 50% chance that the next individual can be exposed from 0 to 1 times longer than without sunscreen.
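32 Semi-conjugate prior
A semi-conjugate setting is obtained with independent priors $\pi(\mu, \sigma^2) = \pi(\mu)\, \pi(\sigma^2)$:
$\mu \sim N(\mu_0, \sigma_0^2)$, $\sigma^2 \sim IG\left(\frac{\nu_0}{2}, \frac{SS_0}{2}\right)$
then
$\mu \mid \sigma^2, x \sim N(\mu_n, \tau_n^2)$, $\mu_n = \dfrac{\frac{\mu_0}{\sigma_0^2} + \frac{\bar x\, n}{\sigma^2}}{\frac{1}{\sigma_0^2} + \frac{n}{\sigma^2}}$, $\tau_n^2 = \dfrac{1}{\frac{1}{\sigma_0^2} + \frac{n}{\sigma^2}}$
$\sigma^2 \mid x$ is not in closed form.
We will solve this with MCMC methods!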
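One MCMC approach that fits this setting is a Gibbs sampler that alternates between the two full conditionals. The slides do not say which MCMC method is used, so the following R sketch is an assumption: it uses the conditional for $\mu$ given above together with the standard full conditional $\sigma^2 \mid \mu, x \sim IG\left(\frac{\nu_0 + n}{2}, \frac{SS_0 + \sum_i (x_i - \mu)^2}{2}\right)$. The function name `gibbs_normal`, the simulated data and the hyperparameter values are all hypothetical.

```r
# Gibbs sampler sketch for independent priors
#   mu ~ N(mu0, sigma02),  sigma^2 ~ IG(nu0 / 2, SS0 / 2).
# gibbs_normal is an illustrative name, not a function from any package.
gibbs_normal <- function(x, mu0, sigma02, nu0, SS0, S = 5000) {
  n <- length(x)
  xbar <- mean(x)
  mu <- numeric(S)
  sigma2 <- numeric(S)
  s2_cur <- var(x)                       # starting value for sigma^2
  for (s in seq_len(S)) {
    # mu | sigma^2, x ~ N(mu_n, 1 / prec_n)
    prec_n <- 1 / sigma02 + n / s2_cur
    mu_n <- (mu0 / sigma02 + n * xbar / s2_cur) / prec_n
    mu_cur <- rnorm(1, mean = mu_n, sd = sqrt(1 / prec_n))
    # sigma^2 | mu, x ~ IG((nu0 + n) / 2, (SS0 + sum((x - mu)^2)) / 2)
    s2_cur <- 1 / rgamma(1, shape = (nu0 + n) / 2,
                         rate = (SS0 + sum((x - mu_cur)^2)) / 2)
    mu[s] <- mu_cur
    sigma2[s] <- s2_cur
  }
  data.frame(mu = mu, sigma2 = sigma2)
}

set.seed(1)
x <- rnorm(50, mean = 3, sd = 1)         # simulated stand-in data
draws <- gibbs_normal(x, mu0 = 0, sigma02 = 100, nu0 = 1, SS0 = 1, S = 2000)
print(colMeans(draws))
```

In practice one would discard an initial burn-in portion of the chain and check convergence before summarizing the draws; both steps are omitted here for brevity.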
33 Summary of Conjugate Priors for the Normal Model
Conjugate priors for normal data with unknown precision are
$\tau \sim \mathrm{Gamma}(a, b)$
$\mu \mid \tau \sim N\left(\mu_0, \frac{1}{\tau_0\tau}\right)$
Here $a$, $b$, $\mu_0$, and $\tau_0$ are known hyper-parameters chosen to characterize the prior information.
The problem with using this prior in practical data analysis is the difficulty of specifying a distribution for µ that is conditional on τ (which is also unknown).
34 Summary of Independence prior
Here we assume that information about µ can be elicited independently of information on τ or σ, so
$p(\mu, \tau) = p(\mu)\, p(\tau)$
This makes elicitation relatively easy.
Although the primary goal is to get a prior that reasonably captures the expert's information, independence priors generally work well.
Usually, one considers Gamma priors for τ, since they are conjugate. But there's really no need, as long as the prior is defined on the positive real line.
35 Proper (semi-conjugate) Reference Priors
More recently, priors such as
$\mu \sim N(0, b)$, $\tau \sim \mathrm{Gamma}(c, c)$
have been used as proper reference priors.
In this case, b and c are chosen so that the prior precision for µ, 1/b, and both hyperparameters c in the Gamma distribution are near zero.
Such priors are seen as approximations of the improper default prior $p(\mu, \tau) \propto 1/\tau$.
Common choices are $b = 10^6$ and $c = 0.001$.
36 Back to the example
We need to identify a prior distribution that gives information/no-information about the unknown parameters µ and τ = 1/σ².
$\mu \sim N(0, 10^6)$ as proper non-informative prior.
Expert opinion that µ should be centered at 16. Then, $\mu \sim N(16, 10^6)$ as diffuse prior.
Expert is 95% certain that the mean SPF µ should be between 10 and 75, that is, $\Pr(10 < \mu < 75) = 0.95$. Then µ ~ N(10, )
37 Back to the example
We have no good information on σ², the variance of an observation.
So we can specify a reference (vague) prior on τ, which is independent of µ:
$\tau \sim \mathrm{Gamma}(0.001, 0.001)$
What was in the last lecture? Normal distribution A continuous rv with bell-shaped density curve The pdf is given by f(x) = 1 2πσ e (x µ)2 2σ 2, < x < If X N(µ, σ 2 ), E(X) = µ and V (X) = σ 2 Standard
More informationCommonly Used Distributions
Chapter 4: Commonly Used Distributions 1 Introduction Statistical inference involves drawing a sample from a population and analyzing the sample data to learn about the population. We often have some knowledge
More informationNormal Distribution. Definition A continuous rv X is said to have a normal distribution with. the pdf of X is
Normal Distribution Normal Distribution Definition A continuous rv X is said to have a normal distribution with parameter µ and σ (µ and σ 2 ), where < µ < and σ > 0, if the pdf of X is f (x; µ, σ) = 1
More information**BEGINNING OF EXAMINATION** A random sample of five observations from a population is:
**BEGINNING OF EXAMINATION** 1. You are given: (i) A random sample of five observations from a population is: 0.2 0.7 0.9 1.1 1.3 (ii) You use the Kolmogorov-Smirnov test for testing the null hypothesis,
More informationMachine Learning for Quantitative Finance
Machine Learning for Quantitative Finance Fast derivative pricing Sofie Reyners Joint work with Jan De Spiegeleer, Dilip Madan and Wim Schoutens Derivative pricing is time-consuming... Vanilla option pricing
More informationECE 340 Probabilistic Methods in Engineering M/W 3-4:15. Lecture 10: Continuous RV Families. Prof. Vince Calhoun
ECE 340 Probabilistic Methods in Engineering M/W 3-4:15 Lecture 10: Continuous RV Families Prof. Vince Calhoun 1 Reading This class: Section 4.4-4.5 Next class: Section 4.6-4.7 2 Homework 3.9, 3.49, 4.5,
More informationCSC 411: Lecture 08: Generative Models for Classification
CSC 411: Lecture 08: Generative Models for Classification Richard Zemel, Raquel Urtasun and Sanja Fidler University of Toronto Zemel, Urtasun, Fidler (UofT) CSC 411: 08-Generative Models 1 / 23 Today Classification
More informationدرس هفتم یادگیري ماشین. (Machine Learning) دانشگاه فردوسی مشهد دانشکده مهندسی رضا منصفی
یادگیري ماشین توزیع هاي نمونه و تخمین نقطه اي پارامترها Sampling Distributions and Point Estimation of Parameter (Machine Learning) دانشگاه فردوسی مشهد دانشکده مهندسی رضا منصفی درس هفتم 1 Outline Introduction
More informationIEOR 165 Lecture 1 Probability Review
IEOR 165 Lecture 1 Probability Review 1 Definitions in Probability and Their Consequences 1.1 Defining Probability A probability space (Ω, F, P) consists of three elements: A sample space Ω is the set
More informationComputer Statistics with R
MAREK GAGOLEWSKI KONSTANCJA BOBECKA-WESO LOWSKA PRZEMYS LAW GRZEGORZEWSKI Computer Statistics with R 5. Point Estimation Faculty of Mathematics and Information Science Warsaw University of Technology []
More information1. You are given the following information about a stationary AR(2) model:
Fall 2003 Society of Actuaries **BEGINNING OF EXAMINATION** 1. You are given the following information about a stationary AR(2) model: (i) ρ 1 = 05. (ii) ρ 2 = 01. Determine φ 2. (A) 0.2 (B) 0.1 (C) 0.4
More informationHigh-Frequency Data Analysis and Market Microstructure [Tsay (2005), chapter 5]
1 High-Frequency Data Analysis and Market Microstructure [Tsay (2005), chapter 5] High-frequency data have some unique characteristics that do not appear in lower frequencies. At this class we have: Nonsynchronous
More informationValuing volatility and variance swaps for a non-gaussian Ornstein-Uhlenbeck stochastic volatility model
Valuing volatility and variance swaps for a non-gaussian Ornstein-Uhlenbeck stochastic volatility model 1(23) Valuing volatility and variance swaps for a non-gaussian Ornstein-Uhlenbeck stochastic volatility
More informationLecture 2. Probability Distributions Theophanis Tsandilas
Lecture 2 Probability Distributions Theophanis Tsandilas Comment on measures of dispersion Why do common measures of dispersion (variance and standard deviation) use sums of squares: nx (x i ˆµ) 2 i=1
More informationComparison of Pricing Approaches for Longevity Markets
Comparison of Pricing Approaches for Longevity Markets Melvern Leung Simon Fung & Colin O hare Longevity 12 Conference, Chicago, The Drake Hotel, September 30 th 2016 1 / 29 Overview Introduction 1 Introduction
More informationThe Bernoulli distribution
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike License. Your use of this material constitutes acceptance of that license and the conditions of use of materials on this
More informationChapter 3 Common Families of Distributions. Definition 3.4.1: A family of pmfs or pdfs is called exponential family if it can be expressed as
Lecture 0 on BST 63: Statistical Theory I Kui Zhang, 09/9/008 Review for the previous lecture Definition: Several continuous distributions, including uniform, gamma, normal, Beta, Cauchy, double exponential
More informationUQ, STAT2201, 2017, Lectures 3 and 4 Unit 3 Probability Distributions.
UQ, STAT2201, 2017, Lectures 3 and 4 Unit 3 Probability Distributions. Random Variables 2 A random variable X is a numerical (integer, real, complex, vector etc.) summary of the outcome of the random experiment.
More informationQuantitative Introduction ro Risk and Uncertainty in Business Module 5: Hypothesis Testing Examples
Quantitative Introduction ro Risk and Uncertainty in Business Module 5: Hypothesis Testing Examples M. Vidyasagar Cecil & Ida Green Chair The University of Texas at Dallas Email: M.Vidyasagar@utdallas.edu
More informationSOCIETY OF ACTUARIES EXAM STAM SHORT-TERM ACTUARIAL MATHEMATICS EXAM STAM SAMPLE QUESTIONS
SOCIETY OF ACTUARIES EXAM STAM SHORT-TERM ACTUARIAL MATHEMATICS EXAM STAM SAMPLE QUESTIONS Questions 1-307 have been taken from the previous set of Exam C sample questions. Questions no longer relevant
More informationPoint Estimation. Principle of Unbiased Estimation. When choosing among several different estimators of θ, select one that is unbiased.
Point Estimation Point Estimation Definition A point estimate of a parameter θ is a single number that can be regarded as a sensible value for θ. A point estimate is obtained by selecting a suitable statistic
More informationEquity correlations implied by index options: estimation and model uncertainty analysis
1/18 : estimation and model analysis, EDHEC Business School (joint work with Rama COT) Modeling and managing financial risks Paris, 10 13 January 2011 2/18 Outline 1 2 of multi-asset models Solution to
More informationBivariate Birnbaum-Saunders Distribution
Department of Mathematics & Statistics Indian Institute of Technology Kanpur January 2nd. 2013 Outline 1 Collaborators 2 3 Birnbaum-Saunders Distribution: Introduction & Properties 4 5 Outline 1 Collaborators
More informationStatistical Intervals. Chapter 7 Stat 4570/5570 Material from Devore s book (Ed 8), and Cengage
7 Statistical Intervals Chapter 7 Stat 4570/5570 Material from Devore s book (Ed 8), and Cengage Confidence Intervals The CLT tells us that as the sample size n increases, the sample mean X is close to
More informationGenerating Random Numbers
Generating Random Numbers Aim: produce random variables for given distribution Inverse Method Let F be the distribution function of an univariate distribution and let F 1 (y) = inf{x F (x) y} (generalized
More informationAn Information Based Methodology for the Change Point Problem Under the Non-central Skew t Distribution with Applications.
An Information Based Methodology for the Change Point Problem Under the Non-central Skew t Distribution with Applications. Joint with Prof. W. Ning & Prof. A. K. Gupta. Department of Mathematics and Statistics
More informationApplied Statistics I
Applied Statistics I Liang Zhang Department of Mathematics, University of Utah July 14, 2008 Liang Zhang (UofU) Applied Statistics I July 14, 2008 1 / 18 Point Estimation Liang Zhang (UofU) Applied Statistics
More informationQualifying Exam Solutions: Theoretical Statistics
Qualifying Exam Solutions: Theoretical Statistics. (a) For the first sampling plan, the expectation of any statistic W (X, X,..., X n ) is a polynomial of θ of degree less than n +. Hence τ(θ) cannot have
More informationModel 0: We start with a linear regression model: log Y t = β 0 + β 1 (t 1980) + ε, with ε N(0,
Stat 534: Fall 2017. Introduction to the BUGS language and rjags Installation: download and install JAGS. You will find the executables on Sourceforge. You must have JAGS installed prior to installing
More informationApplication of MCMC Algorithm in Interest Rate Modeling
Application of MCMC Algorithm in Interest Rate Modeling Xiaoxia Feng and Dejun Xie Abstract Interest rate modeling is a challenging but important problem in financial econometrics. This work is concerned
More information# generate data num.obs <- 100 y <- rnorm(num.obs,mean = theta.true, sd = sqrt(sigma.sq.true))
Posterior Sampling from Normal Now we seek to create draws from the joint posterior distribution and the marginal posterior distributions and Note the marginal posterior distributions would be used to
More informationSimple Random Sampling. Sampling Distribution
STAT 503 Sampling Distribution and Statistical Estimation 1 Simple Random Sampling Simple random sampling selects with equal chance from (available) members of population. The resulting sample is a simple
More informationStatistical Inference and Methods
Department of Mathematics Imperial College London d.stephens@imperial.ac.uk http://stats.ma.ic.ac.uk/ das01/ 14th February 2006 Part VII Session 7: Volatility Modelling Session 7: Volatility Modelling
More informationContinuous random variables
Continuous random variables probability density function (f(x)) the probability distribution function of a continuous random variable (analogous to the probability mass function for a discrete random variable),
More informationECSE B Assignment 5 Solutions Fall (a) Using whichever of the Markov or the Chebyshev inequalities is applicable, estimate
ECSE 304-305B Assignment 5 Solutions Fall 2008 Question 5.1 A positive scalar random variable X with a density is such that EX = µ
More information2 of PU_2015_375 Which of the following measures is more flexible when compared to other measures?
PU M Sc Statistics 1 of 100 194 PU_2015_375 The population census period in India is for every:- quarterly Quinqennial year biannual Decennial year 2 of 100 105 PU_2015_375 Which of the following measures
More informationConfidence Intervals for an Exponential Lifetime Percentile
Chapter 407 Confidence Intervals for an Exponential Lifetime Percentile Introduction This routine calculates the number of events needed to obtain a specified width of a confidence interval for a percentile
More information