Estimation after Model Selection
1 Estimation after Model Selection
Vanja M. Dukić, Department of Health Studies, University of Chicago
Edsel A. Peña*, Department of Statistics, University of South Carolina
ENAR 2003 Talk, March 31, 2003, Tampa Bay, FL
Research support from NSF
2 Motivating Situations
Suppose you have a random sample $X = (X_1, X_2, \ldots, X_n)$ (possibly censored) from an unknown distribution F which belongs to either the Weibull class or the gamma class. What is the best way to estimate F(t) or some other parameter of interest?
Suppose it is known that the unknown df F belongs to one of p models $M_1, M_2, \ldots, M_p$, which are possibly nested. What is the best way of estimating a parameter common to all of these models?
3 Intuitive Strategies
Strategy I: Utilize estimators developed under the larger model M, or implement a fully nonparametric approach.
Strategy II (Classical): [Step 1 (Model Selection):] Choose the most plausible model using the data, possibly via information measures. [Step 2 (Inference):] Use estimators from the chosen sub-model, with these estimators still using the same data X.
Strategy III (Bayesian): Determine adaptively (i.e., using X) the plausibility of each of the sub-models, and form a weighted combination of the sub-model estimators or tests. Also referred to as model averaging.
4 Relevance and Issues
What are the consequences of first selecting a sub-model and then performing inference, such as estimation or hypothesis testing, with these two steps utilizing the same sample data (i.e., double-dipping)?
Is it always better to do model averaging, that is, to adopt a Bayesian framework? Equivalently, under what circumstances is model averaging preferable to a classical two-step approach?
When the number of possible models increases, would it be better to simply utilize a wider, possibly nonparametric, model?
5 A Concrete Gaussian Model
Data: $X = (X_1, X_2, \ldots, X_n)$ IID $F \in M = \{ N(\mu, \sigma^2) : \mu \in \mathbb{R}, \sigma^2 > 0 \}$.
The uniformly minimum variance unbiased (UMVU) estimator of $\sigma^2$ is the sample variance
$\hat\sigma^2_{UMVU} = S^2 = \frac{1}{n-1}\sum_{i=1}^n (X_i - \bar X)^2.$
Decision-theoretic framework with loss function
$L_1(\hat\sigma^2, (\mu, \sigma^2)) = \left(\frac{\hat\sigma^2 - \sigma^2}{\sigma^2}\right)^2.$
6 Risk function: For the quadratic loss $L_1$,
$\mathrm{Risk}(\hat\sigma^2) = \mathrm{Var}\left(\frac{\hat\sigma^2}{\sigma^2}\right) + \left[\mathrm{Bias}\left(\frac{\hat\sigma^2}{\sigma^2}\right)\right]^2.$
$S^2$ is not the best: it is dominated by the ML and the minimum risk equivariant (MRE) estimators,
$\hat\sigma^2_{MLE} = \frac{1}{n}\sum_{i=1}^n (X_i - \bar X)^2, \qquad \hat\sigma^2_{MRE} = \left(\frac{n}{n+1}\right)\hat\sigma^2_{MLE}.$
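The dominance claim above is easy to check numerically. A minimal Monte Carlo sketch (our own code, not from the talk; the estimator and loss definitions follow the slides, all function names are ours) comparing the risks of $S^2$, the MLE, and the MRE under the $L_1$ loss:

```python
import math
import random

def risk(estimator, mu=0.0, sigma2=1.0, n=10, reps=20000, seed=1):
    """Monte Carlo risk under L1(s2, (mu, sigma2)) = ((s2 - sigma2)/sigma2)^2."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(reps):
        x = [rng.gauss(mu, math.sqrt(sigma2)) for _ in range(n)]
        xbar = sum(x) / n
        ss = sum((xi - xbar) ** 2 for xi in x)  # sum of squared deviations
        total += ((estimator(ss, n) - sigma2) / sigma2) ** 2
    return total / reps

# The three competing estimators, each a function of the deviation sum and n.
umvu = lambda ss, n: ss / (n - 1)  # S^2
mle = lambda ss, n: ss / n
mre = lambda ss, n: ss / (n + 1)

n = 10
print(risk(umvu, n=n), 2 / (n - 1))  # simulated vs. theoretical 2/(n-1)
print(risk(mre, n=n), 2 / (n + 1))   # simulated vs. theoretical 2/(n+1)
```

With n = 10, the simulated risks land close to the theoretical values 2/9 and 2/11, and the MRE risk is the smallest of the three.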
7 Model $M_p$: Our Test Model
Suppose we do not know the exact value of $\mu$, but we do know it is one of p possible values. This leads to the model
$M_p = \{ N(\mu, \sigma^2) : \mu \in \{\mu_1, \ldots, \mu_p\}, \sigma^2 > 0 \},$
where $\mu_1, \mu_2, \ldots, \mu_p$ are known constants.
Under $M_p$, how should we estimate $\sigma^2$? What are the consequences of using the estimators developed under M? Can we exploit the structure of $M_p$ to obtain better estimators of $\sigma^2$?
8 Classical Estimators Under $M_p$
Sub-model MLEs and MREs:
$\hat\sigma^2_i = \frac{1}{n}\sum_{j=1}^n (X_j - \mu_i)^2; \qquad \hat\sigma^2_{MRE,i} = \frac{1}{n+2}\sum_{j=1}^n (X_j - \mu_i)^2.$
Model selector: $\hat M = \hat M(X)$ with
$\hat M = \arg\min_{1 \le i \le p} \hat\sigma^2_i = \arg\min_{1 \le i \le p} |\bar X - \mu_i|.$
$\hat M$ chooses the sub-model leading to the smallest estimate of $\sigma^2$, equivalently the sub-model whose mean is closest to the sample mean.
9 MLE of $\sigma^2$ under $M_p$ (a two-step adaptive estimator):
$\hat\sigma^2_{p,MLE} = \hat\sigma^2_{\hat M} = \sum_{i=1}^p I\{\hat M = i\}\,\hat\sigma^2_i.$
An alternative estimator: use the sub-model's MRE to obtain
$\hat\sigma^2_{p,MRE} = \hat\sigma^2_{MRE,\hat M} = \sum_{i=1}^p I\{\hat M = i\}\,\hat\sigma^2_{MRE,i}.$
Properties of adaptive estimators are not easily obtainable due to the interplay between the model selector $\hat M$ and the sub-model estimators.
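The two-step (select-then-estimate) procedure can be sketched in a few lines; this is our own illustration (function names are ours), with the selector and estimator definitions taken from slides 8-9:

```python
import random

def two_step_sigma2(x, mus, kind="mle"):
    """Two-step adaptive estimate of sigma^2 under M_p: first select the
    sub-model whose mean is closest to the sample mean (the selector M-hat),
    then apply that sub-model's MLE (divide by n) or MRE (divide by n + 2)."""
    n = len(x)
    xbar = sum(x) / n
    i_hat = min(range(len(mus)), key=lambda i: abs(xbar - mus[i]))
    ss = sum((xj - mus[i_hat]) ** 2 for xj in x)
    return ss / n if kind == "mle" else ss / (n + 2)

rng = random.Random(0)
x = [rng.gauss(0.0, 1.0) for _ in range(200)]  # true mean 0, sigma^2 = 1
print(two_step_sigma2(x, mus=[-1.0, 0.0, 1.0]))            # p = 3, MLE version
print(two_step_sigma2(x, mus=[-1.0, 0.0, 1.0], kind="mre"))
```

Note that both the selection and the estimation use the same sample x: this is exactly the double-dipping whose consequences the talk analyzes.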
10 Bayes Estimators Under $M_p$
Joint prior for $(\mu, \sigma^2)$: independent priors.
Prior for $\mu$: Multinomial$(1, \theta)$.
Prior for $\sigma^2$: Inverted Gamma$(\kappa, \beta)$.
Posterior probabilities of the sub-models:
$\theta_i(x) = \frac{\theta_i \left( n\hat\sigma^2_i/2 + \beta \right)^{-(n/2+\kappa-1)}}{\sum_{j=1}^p \theta_j \left( n\hat\sigma^2_j/2 + \beta \right)^{-(n/2+\kappa-1)}}.$
11 Posterior density of $\sigma^2$:
$\pi(\sigma^2 \mid x) = C \sum_{i=1}^p \theta_i \left(\frac{1}{\sigma^2}\right)^{\kappa+n/2} \exp\left[-\frac{1}{\sigma^2}\left(n\hat\sigma^2_i/2 + \beta\right)\right].$
Bayes (weighted) estimator of $\sigma^2$:
$\hat\sigma^2_{p,Bayes}(X) = \sum_{i=1}^p \theta_i(X) \left\{ \left(\frac{n}{n+2(\kappa-2)}\right)\hat\sigma^2_i + \left(\frac{2(\kappa-2)}{n+2(\kappa-2)}\right)\left(\frac{\beta}{\kappa-2}\right) \right\}.$
Non-informative priors: uniform prior for the sub-models, $\theta_i = 1/p$, $i = 1, 2, \ldots, p$; $\beta \to 0$.
12 One particular limiting Bayes estimator is
$\hat\sigma^2_{p,LB1} = \sum_{i=1}^p \frac{(\hat\sigma^2_i)^{-n/2}}{\sum_{j=1}^p (\hat\sigma^2_j)^{-n/2}}\,\hat\sigma^2_i,$
an adaptively weighted estimator formed from the sub-model estimators. But, based on the simulation studies, a better one is formed from the sub-model MREs:
$\hat\sigma^2_{p,PLB1} = \left(\frac{n}{n+2}\right)\hat\sigma^2_{p,LB1}.$
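As a sketch (our own code and naming, not from the talk), the adaptively weighted estimator and its MRE-corrected version can be written as:

```python
def submodel_mles(x, mus):
    """Sub-model MLEs of sigma^2: (1/n) * sum (x_j - mu_i)^2 for each candidate mean."""
    n = len(x)
    return [sum((xj - mu) ** 2 for xj in x) / n for mu in mus]

def sigma2_lb1(sub_estimates, n):
    """Limiting Bayes estimator: weights proportional to (sigma_i^2)^(-n/2),
    so sub-models with smaller estimated variance get larger weight."""
    w = [s ** (-n / 2) for s in sub_estimates]
    tot = sum(w)
    return sum((wi / tot) * si for wi, si in zip(w, sub_estimates))

def sigma2_plb1(sub_estimates, n):
    """MRE-corrected version: (n / (n + 2)) * sigma2_lb1."""
    return n / (n + 2) * sigma2_lb1(sub_estimates, n)
```

Since the weights form a convex combination, $\hat\sigma^2_{p,LB1}$ always lies between the smallest and largest sub-model estimate, and as n grows the weight concentrates on the smallest one.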
13 Comparing the Estimators
$R(\hat\sigma^2_{UMVU}, (\mu, \sigma^2)) = \frac{2}{n-1}, \qquad R(\hat\sigma^2_{MRE}, (\mu, \sigma^2)) = \frac{2}{n+1}.$
Efficiency measure relative to $\hat\sigma^2_{UMVU}$:
$\mathrm{Eff}(\hat\sigma^2 : \hat\sigma^2_{UMVU}) = \frac{R(\hat\sigma^2_{UMVU}, (\mu, \sigma^2))}{R(\hat\sigma^2, (\mu, \sigma^2))}.$
$\mathrm{Eff}(\hat\sigma^2_{MRE} : \hat\sigma^2_{UMVU}) = \frac{n+1}{n-1} > 1.$
14 Properties of $M_p$-Based Estimators
Notation: Let $Z \sim N(0, 1)$ and, with $\mu_{i_0}$ the true mean, define
$\Delta = \frac{\mu - \mu_{i_0}\mathbf{1}}{\sigma}.$
Proposition: Under $M_p$,
$\frac{\hat\sigma^2_i}{\sigma^2} \stackrel{d}{=} \frac{1}{n}\left(W + V_i^2\right), \quad i = 1, 2, \ldots, p;$
with W and V independent, $W \sim \chi^2_{n-1}$, and $V = Z\mathbf{1} - \sqrt{n}\,\Delta \sim N_p(-\sqrt{n}\,\Delta,\, J = \mathbf{1}\mathbf{1}')$.
15 Notation: Given $\Delta$, let $\Delta_{(1)} < \Delta_{(2)} < \ldots < \Delta_{(p)}$ be the ordered values. $\Delta$ always has a zero component.
Theorem: Under $M_p$,
$\frac{\hat\sigma^2_{p,MLE}}{\sigma^2} \stackrel{d}{=} \frac{1}{n}\left\{ W + \sum_{i=1}^p I\{L(\Delta_{(i)}, \Delta) < Z \le U(\Delta_{(i)}, \Delta)\}\,(Z - \sqrt{n}\,\Delta_{(i)})^2 \right\},$
with
$L(\Delta_{(i)}, \Delta) = \frac{\sqrt{n}\,[\Delta_{(i)} + \Delta_{(i-1)}]}{2}; \qquad U(\Delta_{(i)}, \Delta) = \frac{\sqrt{n}\,[\Delta_{(i)} + \Delta_{(i+1)}]}{2}.$
16 Mean: $\mathrm{EpMLE}(\Delta) \equiv E\left[\hat\sigma^2_{p,MLE}/\sigma^2\right]$:
$\mathrm{EpMLE}(\Delta) = 1 - \frac{2}{\sqrt{n}} \sum_{i=1}^p \Delta_{(i)}\left[\phi(L(\Delta_{(i)}, \Delta)) - \phi(U(\Delta_{(i)}, \Delta))\right] + \sum_{i=1}^p \Delta^2_{(i)}\left[\Phi(U(\Delta_{(i)}, \Delta)) - \Phi(L(\Delta_{(i)}, \Delta))\right].$
Case of p = 2:
$\mathrm{EpMLE}(\Delta) = 1 - \left\{ \frac{2\Delta}{\sqrt{n}}\,\phi\!\left(\frac{\sqrt{n}\,\Delta}{2}\right) - \Delta^2\left[1 - \Phi\!\left(\frac{\sqrt{n}\,\Delta}{2}\right)\right] \right\}.$
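The p = 2 mean formula is easy to check by simulation. A sketch under our own conventions (true mean 0, second candidate mean $\Delta$, $\sigma^2 = 1$; all function names are ours):

```python
import math
import random

def phi(z):  # standard normal pdf
    return math.exp(-z * z / 2) / math.sqrt(2 * math.pi)

def Phi(z):  # standard normal cdf
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

def epmle_p2(delta, n):
    """Closed-form E[sigma2_pMLE / sigma^2] for p = 2 (slide 16),
    with candidate means 0 (true) and delta, in sigma units."""
    a = math.sqrt(n) * delta / 2
    return 1 - (2 * delta / math.sqrt(n)) * phi(a) + delta ** 2 * (1 - Phi(a))

def epmle_p2_mc(delta, n, reps=40000, seed=7):
    """Monte Carlo check: select the closer mean, then average the two-step MLE."""
    rng = random.Random(seed)
    mus, total = [0.0, delta], 0.0
    for _ in range(reps):
        x = [rng.gauss(0.0, 1.0) for _ in range(n)]
        xbar = sum(x) / n
        mu_hat = min(mus, key=lambda m: abs(xbar - m))
        total += sum((xi - mu_hat) ** 2 for xi in x) / n
    return total / reps
```

For moderate $\Delta$ both routines return a value below 1, illustrating the negative bias noted on the next slide.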
17 [Figure: EpMLE plotted against $\sqrt{n}\,\Delta/2$.]
$\hat\sigma^2_{p,MLE}$ is negatively biased for $\sigma^2$ (even though each sub-model estimator is unbiased): an effect of double-dipping.
18 Variance: $\mathrm{VpMLE}(\Delta) \equiv \mathrm{Var}\left[\hat\sigma^2_{p,MLE}/\sigma^2\right]$:
$\mathrm{VpMLE}(\Delta) = \frac{2}{n}\left(1 - \frac{1}{n}\right) + \frac{1}{n^2}\left[ \sum_{i=1}^p \zeta_{(i)}(4) - \left(\sum_{i=1}^p \zeta_{(i)}(2)\right)^2 \right];$
$\zeta_{(i)}(m) \equiv E\left\{ I\{L(\Delta_{(i)}, \Delta) < Z \le U(\Delta_{(i)}, \Delta)\}\,(Z - \sqrt{n}\,\Delta_{(i)})^m \right\}.$
These formulas enable computation of the theoretical risk functions of the classical $M_p$-based estimators.
19 An Iterative Estimator
Consider the class $C = \{ \tilde\sigma^2(c) \equiv c\,\hat\sigma^2_{p,MLE} : c \ge 0 \}$.
The risk function of $\tilde\sigma^2(c)$, which is quadratic in c, can be minimized with respect to c. The minimizing value is
$c^*(\Delta) = \mathrm{EpMLE}(\Delta) \,/\, \{\mathrm{VpMLE}(\Delta) + [\mathrm{EpMLE}(\Delta)]^2\}.$
Given a $c^*$, the quantity $\Delta = (\mu - \mu_{i_0}\mathbf{1}_p)/\sigma$ can be estimated via
$\hat\Delta = (\mu - \mu_{\hat M}\mathbf{1}_p)/\tilde\sigma(c^*).$
This in turn can be used to obtain a new estimate of $c^*(\Delta)$.
20 Algorithm for $\hat\sigma^2_{p,ITER}$
Step 0 (Initialization): Set a tolerance tol (say, tol = $10^{-8}$) and set $c_{old} = 1$.
Step 1: Define $\tilde\sigma^2 = c_{old}\,\hat\sigma^2_{p,MLE}$.
Step 2: Compute $\hat\Delta = (\mu - \mu_{\hat M}\mathbf{1}_p)/\tilde\sigma$.
Step 3: Compute $c_{new} = \mathrm{EpMLE}(\hat\Delta) \,/\, \{\mathrm{VpMLE}(\hat\Delta) + [\mathrm{EpMLE}(\hat\Delta)]^2\}$.
Step 4: If $|c_{old} - c_{new}| < \mathrm{tol}$, set $\hat\sigma^2_{p,ITER} = \tilde\sigma^2$ and stop; else set $c_{old} = c_{new}$ and go back to Step 1.
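The loop can be sketched directly from the steps above. EpMLE and VpMLE are passed in as user-supplied functions (their closed forms are on slides 16 and 18); everything else here, including the function names, is our own illustration:

```python
import math

def iterative_sigma2(x, mus, epmle, vpmle, tol=1e-8, max_iter=100):
    """Sketch of the iterative estimator. epmle(delta) and vpmle(delta)
    must return EpMLE(Delta) and VpMLE(Delta) for a standardized
    mean-offset vector Delta; they are assumed supplied by the caller."""
    n = len(x)
    xbar = sum(x) / n
    i_hat = min(range(len(mus)), key=lambda i: abs(xbar - mus[i]))  # model selector
    sigma2_pmle = sum((xj - mus[i_hat]) ** 2 for xj in x) / n       # two-step MLE
    c_old = 1.0
    for _ in range(max_iter):
        sigma2 = c_old * sigma2_pmle                                 # Step 1
        delta = [(m - mus[i_hat]) / math.sqrt(sigma2) for m in mus]  # Step 2
        e, v = epmle(delta), vpmle(delta)
        c_new = e / (v + e * e)                                      # Step 3
        if abs(c_old - c_new) < tol:                                 # Step 4
            return c_new * sigma2_pmle
        c_old = c_new
    return c_old * sigma2_pmle
```

As a sanity check, plugging in the constant functions epmle = 1 and vpmle = 2/n makes $c^* = n/(n+2)$, so the loop converges immediately to $(n/(n+2))\,\hat\sigma^2_{p,MLE}$.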
21 Impact of Number of Sub-Models
Theorem: With n > 1 fixed, if as $p \to \infty$, $\max_{2 \le i \le p} |\Delta_{(i)} - \Delta_{(i-1)}| \to 0$, $\Delta_{(1)} \to -\infty$, and $\Delta_{(p)} \to \infty$, then
$\mathrm{Eff}\left(\hat\sigma^2_{p,MRE} : \hat\sigma^2_{MRE}\right) \to \frac{2(n+2)^2}{(n+1)(2n+7)} < 1.$
Therefore, the advantage of exploiting the structure of $M_p$ could be lost forever when p increases!
22 Representation: Weighted Estimators
Umbrella estimator: For $\alpha > 0$, define
$\hat\sigma^2_{p,LB}(\alpha) = \sum_{i=1}^p \frac{(\hat\sigma^2_i)^{-\alpha}}{\sum_{j=1}^p (\hat\sigma^2_j)^{-\alpha}}\,\hat\sigma^2_i.$
Theorem: Under $M_p$,
$\frac{\hat\sigma^2_{p,LB}(\alpha)}{\sigma^2} \stackrel{d}{=} \frac{W}{n}\{1 + H(T; \alpha)\}; \qquad T = (T_1, T_2, \ldots, T_p) = V/\sqrt{W};$
23 $H(T; \alpha) = \sum_{i=1}^p \theta_i(T; \alpha)\,T_i^2; \qquad \theta_i(T; \alpha) = \frac{(1 + T_i^2)^{-\alpha}}{\sum_{j=1}^p (1 + T_j^2)^{-\alpha}}.$
Even with this representation, it is still difficult to obtain exact expressions for the mean and variance. We developed second-order approximations, but they were not satisfactory for small n (n ≤ 15). In the comparisons, we resorted to simulations to approximate the risk functions of the weighted estimators.
24 Some Simulation Results
Figures 1 and 2: Simulated and theoretical risk curves for n = 3 and n = 10 (based on replications per $\Delta$).
25 [Figure: Theoretical and/or simulated relative (to UMVU) efficiency curves versus $\Delta$. Legend: pmle simulated, pmle theoretical, pmre simulated, pmre theoretical, pplb1 simulated, piter simulated.]
26 [Figure: Theoretical and/or simulated relative (to UMVU) efficiency curves versus $\Delta$, second case. Same legend as the previous figure.]
27 Table: Relative efficiency (wrt UMVU) for symmetric and increasing p with limits [−1, 1] and n = 3, 10, 30, using 1000 replications. Except for the first set, denoted by (*), where the mean vector is {0, 1}, the mean vectors are of the form [−1 : 2^{−k} : 1], for which p = 2^{k+1} + 1. A final letter of 't' on a label means theoretical, whereas 's' means simulated.
28 [Table of relative efficiencies with columns n, k, p, pmles, pmlet, pmres, pmret, pplb1s, piters; the numerical entries, including the (*) rows for n = 3, were not preserved in the transcription.]
29 Concluding Remarks
In models with sub-models, where the interest is to infer about a common parameter, possible approaches are:
Approach I: Use a wider model subsuming the sub-models, possibly a fully nonparametric model. Possibly inefficient, though its properties might be easier to ascertain.
Approach II: A two-step approach: select a sub-model using the data; then use the procedure for the chosen sub-model, again using the same data.
30 Approach III: Utilize a Bayesian framework. Assign a prior to the sub-models, and (conditional) priors on the parameters within the sub-models. This leads to model averaging.
Approaches (II) and (III) are generally preferable to approach (I); but when the number of sub-models is large, approach (I) may provide better estimators and a simpler determination of their properties.
Approach (II) appears preferable if the sub-models are quite different, so that the model selector can easily choose the correct model, or if the sub-models are so similar that an erroneous choice by the selector will not matter much.
31 In the in-between situation, approach (III) seems preferable.
For the specific Gaussian model considered, the iterative estimator actually performed in a robust fashion.
To conclude: observe caution when doing inference after model selection, especially when double-dipping on the data!
The Multinomial Logit Model Revisited: A Semiparametric Approach in Discrete Choice Analysis Dr. Baibing Li, Loughborough University Wednesday, 02 February 2011-16:00 Location: Room 610, Skempton (Civil
More informationMonetary Economics Final Exam
316-466 Monetary Economics Final Exam 1. Flexible-price monetary economics (90 marks). Consider a stochastic flexibleprice money in the utility function model. Time is discrete and denoted t =0, 1,...
More informationQuantitative Risk Management
Quantitative Risk Management Asset Allocation and Risk Management Martin B. Haugh Department of Industrial Engineering and Operations Research Columbia University Outline Review of Mean-Variance Analysis
More informationMachine Learning for Quantitative Finance
Machine Learning for Quantitative Finance Fast derivative pricing Sofie Reyners Joint work with Jan De Spiegeleer, Dilip Madan and Wim Schoutens Derivative pricing is time-consuming... Vanilla option pricing
More informationParameter uncertainty for integrated risk capital calculations based on normally distributed subrisks
Parameter uncertainty for integrated risk capital calculations based on normally distributed subrisks Andreas Fröhlich and Annegret Weng March 7, 017 Abstract In this contribution we consider the overall
More informationAn Improved Skewness Measure
An Improved Skewness Measure Richard A. Groeneveld Professor Emeritus, Department of Statistics Iowa State University ragroeneveld@valley.net Glen Meeden School of Statistics University of Minnesota Minneapolis,
More informationConjugate Models. Patrick Lam
Conjugate Models Patrick Lam Outline Conjugate Models What is Conjugacy? The Beta-Binomial Model The Normal Model Normal Model with Unknown Mean, Known Variance Normal Model with Known Mean, Unknown Variance
More informationChapter 4 Variability
Chapter 4 Variability PowerPoint Lecture Slides Essentials of Statistics for the Behavioral Sciences Seventh Edition by Frederick J Gravetter and Larry B. Wallnau Chapter 4 Learning Outcomes 1 2 3 4 5
More informationCalibration of Interest Rates
WDS'12 Proceedings of Contributed Papers, Part I, 25 30, 2012. ISBN 978-80-7378-224-5 MATFYZPRESS Calibration of Interest Rates J. Černý Charles University, Faculty of Mathematics and Physics, Prague,
More informationSTAT 111 Recitation 4
STAT 111 Recitation 4 Linjun Zhang http://stat.wharton.upenn.edu/~linjunz/ September 29, 2017 Misc. Mid-term exam time: 6-8 pm, Wednesday, Oct. 11 The mid-term break is Oct. 5-8 The next recitation class
More informationGPD-POT and GEV block maxima
Chapter 3 GPD-POT and GEV block maxima This chapter is devoted to the relation between POT models and Block Maxima (BM). We only consider the classical frameworks where POT excesses are assumed to be GPD,
More informationSampling Distribution
MAT 2379 (Spring 2012) Sampling Distribution Definition : Let X 1,..., X n be a collection of random variables. We say that they are identically distributed if they have a common distribution. Definition
More informationBack to estimators...
Back to estimators... So far, we have: Identified estimators for common parameters Discussed the sampling distributions of estimators Introduced ways to judge the goodness of an estimator (bias, MSE, etc.)
More informationWorst-Case Value-at-Risk of Non-Linear Portfolios
Worst-Case Value-at-Risk of Non-Linear Portfolios Steve Zymler Daniel Kuhn Berç Rustem Department of Computing Imperial College London Portfolio Optimization Consider a market consisting of m assets. Optimal
More informationHomework Problems Stat 479
Chapter 10 91. * A random sample, X1, X2,, Xn, is drawn from a distribution with a mean of 2/3 and a variance of 1/18. ˆ = (X1 + X2 + + Xn)/(n-1) is the estimator of the distribution mean θ. Find MSE(
More informationStrategies for Improving the Efficiency of Monte-Carlo Methods
Strategies for Improving the Efficiency of Monte-Carlo Methods Paul J. Atzberger General comments or corrections should be sent to: paulatz@cims.nyu.edu Introduction The Monte-Carlo method is a useful
More information