18.443 Exam 1 Spring 2015 Statistics for Applications 3/5/2015


1. Log-Normal Distribution: A random variable $X$ follows a Lognormal$(\theta, \sigma^2)$ distribution if $Y = \ln(X)$ follows a Normal$(\theta, \sigma^2)$ distribution. For the normal random variable $Y = \ln(X)$, the probability density function of $Y$ is
$$f(y \mid \theta, \sigma^2) = \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-\frac{(y-\theta)^2}{2\sigma^2}}, \quad -\infty < y < \infty.$$
The moment-generating function of $Y$ is
$$M_Y(t) = E[e^{tY} \mid \theta, \sigma^2] = e^{t\theta + \sigma^2 t^2/2}.$$

(a). Compute the first two moments of a random variable $X \sim$ Lognormal$(\theta, \sigma^2)$:
$$\mu_1 = E[X \mid \theta, \sigma^2] \quad \text{and} \quad \mu_2 = E[X^2 \mid \theta, \sigma^2].$$
Hint: Note that $X = e^Y$ and $X^2 = e^{2Y}$, where $Y \sim N(\theta, \sigma^2)$, and use the moment-generating function of $Y$.

(b). Suppose that $X_1, \ldots, X_n$ is an i.i.d. sample of size $n$ from the Lognormal$(\theta, \sigma^2)$ distribution. Find the method-of-moments estimates of $\theta$ and $\sigma^2$. Hint: evaluate $\mu_2/\mu_1^2$ and find a method-of-moments estimate for $\sigma^2$ first.

(c). For the log-normal random variable $X = e^Y$, where $Y \sim$ Normal$(\theta, \sigma^2)$, prove that the probability density of $X$ is
$$f(x \mid \theta, \sigma^2) = \frac{1}{\sqrt{2\pi\sigma^2}\, x}\, e^{-\frac{(\ln(x)-\theta)^2}{2\sigma^2}}, \quad 0 < x < \infty.$$

(d). Suppose that $X_1, \ldots, X_n$ is an i.i.d. sample of size $n$ from the Lognormal$(\theta, \sigma^2)$ distribution. Find the MLE for $\theta$, assuming that $\sigma^2$ is known to equal $\sigma_0^2$.

(e). Find the asymptotic variance of the MLE for $\theta$ in (d).

Solution:

(a).
$$\mu_1 = E[X] = E[e^Y] = M_Y(1) = e^{\theta + \sigma^2/2}$$
$$\mu_2 = E[X^2] = E[e^{2Y}] = M_Y(2) = e^{2\theta + 2\sigma^2}$$

(b). First, note that
$$\mu_2/\mu_1^2 = e^{\sigma^2}.$$
It follows that a method-of-moments estimate for $\sigma^2$ is
$$\hat\sigma^2 = \ln(\hat\mu_2/\hat\mu_1^2), \quad \text{where} \quad \hat\mu_1 = \frac{1}{n}\sum_{i=1}^n X_i \quad \text{and} \quad \hat\mu_2 = \frac{1}{n}\sum_{i=1}^n X_i^2.$$
Substituting $\hat\sigma^2$ for $\sigma^2$ in the formula for $\mu_1$, we get
$$\hat\mu_1 = e^{\hat\theta + \hat\sigma^2/2} \quad \Longrightarrow \quad \hat\theta = \ln(\hat\mu_1) - \hat\sigma^2/2.$$

(c). Consider the transformation $X = e^Y$, which has the inverse $y = \ln(x)$ and $dy/dx = 1/x$. It follows that
$$f_X(x) = f_Y(\ln(x))\, |dy/dx| = \frac{1}{\sqrt{2\pi\sigma^2}\, x}\, e^{-\frac{(\ln(x)-\theta)^2}{2\sigma^2}}.$$

(d). The log of the density function for a single realization $x$ is
$$\ln[f(x \mid \theta, \sigma_0^2)] = -\tfrac{1}{2}\ln(2\pi\sigma_0^2) - \ln(x) - \frac{(\ln(x)-\theta)^2}{2\sigma_0^2}.$$
For a sample $x_1, \ldots, x_n$, the log-likelihood function is
$$\ell(\theta) = \sum_{i=1}^n \ln[f(x_i \mid \theta, \sigma_0^2)] = -\frac{1}{2\sigma_0^2}\sum_{i=1}^n (\ln(x_i) - \theta)^2 + \text{(terms not depending on } \theta\text{)}.$$
$\ell(\theta)$ is maximized (the sum of squares is minimized) by
$$\hat\theta = \frac{1}{n}\sum_{i=1}^n \ln(x_i),$$
the MLE from the sample of $n$ values.

(e). The asymptotic variance satisfies
$$1/\mathrm{Var}(\hat\theta) = E\!\left[-\frac{d^2\ell(\theta)}{d\theta^2}\right] = \frac{n}{\sigma_0^2}.$$
Since $-\frac{d^2\ell(\theta)}{d\theta^2} = n/\sigma_0^2$ is constant,
$$\mathrm{Var}(\hat\theta) = \sigma_0^2/n.$$
This asymptotic variance is in fact the actual variance of $\hat\theta$.
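A quick numerical check of parts (b) and (d): the Python sketch below (not part of the original exam) simulates Lognormal$(\theta, \sigma^2)$ data and evaluates the method-of-moments and maximum-likelihood formulas derived above. The parameter values, sample size, and seed are illustrative assumptions.

```python
# Numerical check of the lognormal method-of-moments and MLE formulas (a sketch).
import numpy as np

rng = np.random.default_rng(0)
theta, sigma2, n = 1.5, 0.25, 10_000               # illustrative true values
x = rng.lognormal(mean=theta, sigma=np.sqrt(sigma2), size=n)

# Method of moments (part (b)): first two sample moments.
mu1_hat = np.mean(x)
mu2_hat = np.mean(x**2)
sigma2_mom = np.log(mu2_hat / mu1_hat**2)          # sigma^2_hat = ln(mu2_hat / mu1_hat^2)
theta_mom = np.log(mu1_hat) - sigma2_mom / 2       # theta_hat = ln(mu1_hat) - sigma^2_hat / 2

# MLE with sigma^2 known (part (d)): sample mean of the log data.
theta_mle = np.mean(np.log(x))

print(theta_mom, sigma2_mom)    # should be close to (1.5, 0.25)
print(theta_mle, sigma2 / n)    # MLE and its exact variance sigma0^2 / n
```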

2. The Pareto distribution is used in economics to model values exceeding a threshold (e.g., liability losses greater than $100 million for a consumer products company). For a fixed, known threshold value $x_0 > 0$, the density function is
$$f(x \mid x_0, \theta) = \theta x_0^{\theta} x^{-\theta-1}, \quad x \ge x_0, \ \text{and } \theta > 1.$$
Note that the cumulative distribution function of $X$ is
$$P(X \le x) = F_X(x) = 1 - \left(\frac{x_0}{x}\right)^{\theta}.$$

(a). Find the method-of-moments estimate of $\theta$.
(b). Find the MLE of $\theta$.
(c). Find the asymptotic variance of the MLE.
(d). What is the large-sample asymptotic distribution of the MLE?

Solution:

(a). Compute the first moment of a Pareto random variable $X$:
$$\mu_1 = \int_{x_0}^{\infty} x\, f(x \mid x_0, \theta)\, dx = \int_{x_0}^{\infty} x\, \theta x_0^{\theta} x^{-\theta-1}\, dx = \theta x_0^{\theta} \int_{x_0}^{\infty} x^{-\theta}\, dx = \theta x_0^{\theta} \left[\frac{x^{1-\theta}}{1-\theta}\right]_{x_0}^{\infty} = \frac{\theta x_0}{\theta - 1}.$$
Solving $\mu_1 = \hat\mu_1 = \bar X$ for $\theta$ gives:
$$\hat\theta = \frac{\bar X}{\bar X - x_0}.$$

(b). For a single observation $X = x$, we can write
$$\log[f(x \mid \theta)] = \ln(\theta) + \theta \ln(x_0) - (\theta + 1)\ln(x)$$
$$\frac{\partial}{\partial\theta}\log[f(x \mid \theta)] = \frac{1}{\theta} + \ln(x_0) - \ln(x)$$
$$\frac{\partial^2}{\partial\theta^2}\log[f(x \mid \theta)] = -\frac{1}{\theta^2}$$
The MLE for $\theta$ solves $\ell'(\theta) = 0$:
$$\frac{\partial}{\partial\theta}\left(\sum_{i=1}^n \ln[f(x_i \mid \theta)]\right) = \sum_{i=1}^n \left[\frac{1}{\theta} + \ln(x_0) - \ln(x_i)\right] = \frac{n}{\theta} + n\ln(x_0) - \sum_{i=1}^n \ln(x_i) = 0$$
$$\hat\theta = \frac{n}{\sum_{i=1}^n \ln(x_i/x_0)} = \frac{n}{\sum_{i=1}^n \ln(x_i) - n\ln(x_0)}.$$

(c). The asymptotic variance of $\hat\theta$ is
$$\mathrm{Var}(\hat\theta) \approx \frac{1}{n\, I(\theta)} = \frac{\theta^2}{n},$$
because
$$I(\theta) = E\!\left[-\frac{\partial^2}{\partial\theta^2}\log[f(X \mid \theta)]\right] = \frac{1}{\theta^2}.$$

(d). The asymptotic distribution of $\hat\theta$ is
$$\sqrt{n}\,(\hat\theta - \theta) \xrightarrow{D} N(0, I(\theta)^{-1}) = N(0, \theta^2), \quad \text{or} \quad \hat\theta \xrightarrow{D} N(\theta, \theta^2/n).$$
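As an illustrative check of parts (a) through (c), the sketch below (not from the exam) simulates Pareto data by inverting the CDF and compares the method-of-moments and maximum-likelihood estimates; the values of $x_0$, $\theta$, $n$, and the seed are arbitrary assumptions.

```python
# Sketch: simulate Pareto(x0, theta) data via inverse-CDF sampling and compare
# the method-of-moments and maximum-likelihood estimates of theta.
import numpy as np

rng = np.random.default_rng(1)
x0, theta, n = 1.0, 3.0, 10_000                # illustrative values, theta > 1
u = rng.uniform(size=n)
x = x0 / (1.0 - u) ** (1.0 / theta)            # F^{-1}(u) = x0 * (1 - u)^(-1/theta)

xbar = x.mean()
theta_mom = xbar / (xbar - x0)                 # part (a): xbar / (xbar - x0)
theta_mle = n / np.sum(np.log(x / x0))         # part (b): n / sum log(x_i / x0)

print(theta_mom, theta_mle)                    # both near 3.0
print(theta_mle**2 / n)                        # part (c): asymptotic variance theta^2 / n
```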

3. Distributions derived from Normal random variables. Consider two independent random samples from two normal distributions:
$X_1, \ldots, X_n$ are $n$ i.i.d. Normal$(\mu_1, \sigma_1^2)$ random variables.
$Y_1, \ldots, Y_m$ are $m$ i.i.d. Normal$(\mu_2, \sigma_2^2)$ random variables.

(a). If $\mu_1 = \mu_2 = 0$, find two statistics
$$T_1(X_1, \ldots, X_n, Y_1, \ldots, Y_m) \quad \text{and} \quad T_2(X_1, \ldots, X_n, Y_1, \ldots, Y_m),$$
each of which is a $t$ random variable and which are statistically independent. Explain in detail why your answers have a $t$ distribution and why they are independent.

(b). If $\sigma_1 = \sigma_2 > 0$, define a statistic $T_3(X_1, \ldots, X_n, Y_1, \ldots, Y_m)$ which has an $F$ distribution. An $F$ distribution is determined by the numerator and denominator degrees of freedom. State the degrees of freedom for your statistic $T_3$.

(c). For your answer in (b), define the statistic
$$T_4(X_1, \ldots, X_n, Y_1, \ldots, Y_m) = \frac{1}{T_3(X_1, \ldots, X_n, Y_1, \ldots, Y_m)}.$$
What is the distribution of $T_4$ under the conditions of (b)?

(d). Suppose that $\sigma_1 \ne \sigma_2$. If $S_X^2 = \frac{1}{n-1}\sum_{i=1}^n (X_i - \bar X)^2$ and $S_Y^2 = \frac{1}{m-1}\sum_{i=1}^m (Y_i - \bar Y)^2$ are the sample variances of the two samples, show how to use the $F$ distribution to find $P(S_X^2/S_Y^2 > c)$.

(e). Repeat question (d) if it is known that $\sigma_1 = \sigma_2$.

Solution:

(a). Consider
$$T_1 = \frac{\sqrt{n}\,\bar X/\sigma_1}{S_Y/\sigma_2} \quad \text{and} \quad T_2 = \frac{\sqrt{m}\,\bar Y/\sigma_2}{S_X/\sigma_1},$$
where
$$\bar X = \frac{1}{n}\sum_{i=1}^n X_i \sim N(\mu_1, \sigma_1^2/n), \qquad \bar Y = \frac{1}{m}\sum_{i=1}^m Y_i \sim N(\mu_2, \sigma_2^2/m),$$
$$S_X^2 = \frac{1}{n-1}\sum_{i=1}^n (X_i - \bar X)^2 \ \text{with} \ \frac{(n-1)S_X^2}{\sigma_1^2} \sim \chi^2_{n-1}, \qquad S_Y^2 = \frac{1}{m-1}\sum_{i=1}^m (Y_i - \bar Y)^2 \ \text{with} \ \frac{(m-1)S_Y^2}{\sigma_2^2} \sim \chi^2_{m-1}.$$
We know from theory that $\bar X$ and $S_X^2$ are independent, and $\bar Y$ and $S_Y^2$ are independent, and all 4 are mutually independent because they depend on independent samples.
For $\mu_1 = 0$, we can write
$$T_1 = \frac{\sqrt{n}\,\bar X/\sigma_1}{S_Y/\sigma_2} \sim t_{m-1},$$
a $t$ distribution with $(m-1)$ degrees of freedom, because the numerator is an $N(0,1)$ random variable independent of the denominator, which is $\sqrt{\chi^2_{m-1}/(m-1)}$.
And for $\mu_2 = 0$, we can write
$$T_2 = \frac{\sqrt{m}\,\bar Y/\sigma_2}{S_X/\sigma_1} \sim t_{n-1},$$
a $t$ distribution with $(n-1)$ degrees of freedom, because the numerator is an $N(0,1)$ random variable independent of the denominator, which is $\sqrt{\chi^2_{n-1}/(n-1)}$.

(b). For $\sigma_1 = \sigma_2$, consider the statistic:
$$T_3 = \frac{S_X^2}{S_Y^2} = \frac{S_X^2/\sigma_1^2}{S_Y^2/\sigma_2^2}.$$
The numerator is a $\chi^2_{n-1}$ random variable divided by its degrees of freedom $(n-1)$, and the denominator is an independent $\chi^2_{m-1}$ random variable divided by its degrees of freedom $(m-1)$. By definition the distribution of such a ratio is an $F$ distribution with $(n-1)$ and $(m-1)$ degrees of freedom in the numerator/denominator.

(c). The inverse of an $F$ random variable is also an $F$ random variable; the degrees of freedom for numerator and denominator reverse. So $T_4 = 1/T_3$ has an $F$ distribution with $(m-1)$ and $(n-1)$ degrees of freedom.

(d). In general we know:
$$\frac{(n-1)S_X^2}{\sigma_1^2} \sim \chi^2_{n-1} \qquad \text{and} \qquad \frac{(m-1)S_Y^2}{\sigma_2^2} \sim \chi^2_{m-1},$$
which are independent.

So we can develop the expression:
$$P\!\left(\frac{S_X^2}{S_Y^2} > c\right) = P\!\left(\frac{\left[(n-1)S_X^2/\sigma_1^2\right]/(n-1)}{\left[(m-1)S_Y^2/\sigma_2^2\right]/(m-1)} > \frac{\sigma_2^2}{\sigma_1^2}\, c\right) = P\!\left(F_{(n-1),(m-1)} > \frac{\sigma_2^2}{\sigma_1^2}\, c\right).$$
The answer is the upper-tail probability of an $F$ distribution with $(n-1)$, $(m-1)$ degrees of freedom, equal to the probability of exceeding
$$\left(\frac{\sigma_2^2}{\sigma_1^2}\right) c.$$
For (d), keep the ratio $\sigma_2^2/\sigma_1^2$ since $\sigma_1 \ne \sigma_2$; for (e), $\sigma_1 = \sigma_2$, so $\sigma_2^2/\sigma_1^2 = 1$ and the probability is simply $P(F_{(n-1),(m-1)} > c)$.
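For a concrete evaluation of the upper-tail probability above, the following sketch (not from the exam) uses the $F$ survival function from scipy; the sample sizes, threshold $c$, and variances are illustrative assumptions.

```python
# Sketch of parts (d)/(e): P(S_X^2 / S_Y^2 > c) as an upper-tail F probability.
from scipy.stats import f

def prob_ratio_exceeds(c, n, m, sigma1_sq=1.0, sigma2_sq=1.0):
    """P(S_X^2 / S_Y^2 > c), where (n-1)S_X^2/sigma1^2 ~ chi2_{n-1} and
    (m-1)S_Y^2/sigma2^2 ~ chi2_{m-1} are independent."""
    # S_X^2 / S_Y^2 > c  <=>  F_{n-1, m-1} > (sigma2^2 / sigma1^2) * c
    return f.sf((sigma2_sq / sigma1_sq) * c, dfn=n - 1, dfd=m - 1)

print(prob_ratio_exceeds(c=2.0, n=10, m=12, sigma1_sq=4.0, sigma2_sq=1.0))  # part (d): sigma1 != sigma2
print(prob_ratio_exceeds(c=2.0, n=10, m=12))                                # part (e): sigma1 = sigma2, threshold is just c
```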

4. Hardy-Weinberg (Multinomial) Model of Gene Frequencies. For a certain population, gene frequencies are in equilibrium: the genotypes AA, Aa, and aa occur with probabilities $(1-\theta)^2$, $2\theta(1-\theta)$, and $\theta^2$. A random sample of 50 people from the population yielded the following data:

Genotype:  AA   Aa   aa
Count:     35   10    5

The table counts can be modeled as the multinomial distribution:
$$(X_1, X_2, X_3) \sim \text{Multinomial}(n = 50,\ p = ((1-\theta)^2,\ 2\theta(1-\theta),\ \theta^2)).$$

(a). Find the MLE of $\theta$.

(b). Find the asymptotic variance of the MLE.

(c). What is the large-sample asymptotic distribution of the MLE?

(d). Find an approximate 90% confidence interval for $\theta$. To construct the interval you may use the following table of cumulative probabilities for a standard normal $N(0,1)$ random variable $Z$:

P(Z < z):  0.99   0.975   0.950   0.90
z:         2.326  1.960   1.645   1.282

(e). Using the MLE $\hat\theta$ in (a), 1000 samples from the Multinomial$(n = 50,\ p = ((1-\hat\theta)^2,\ 2\hat\theta(1-\hat\theta),\ \hat\theta^2))$ distribution were randomly generated, and MLE estimates were computed for each sample: $\hat\theta_j,\ j = 1, \ldots, 1000$. For the true parameter $\theta_0$, the sampling distribution of $\Delta = \hat\theta - \theta_0$ is approximated by that of $\hat\Delta = \hat\theta_j - \hat\theta$. The 50-th largest value of $\hat\Delta$ was $+0.065$ and the 50-th smallest value was $-0.067$. Use this information and the estimate in (a) to construct a (parametric) bootstrap confidence interval for the true $\theta_0$. What is the confidence level of the interval? (If you do not have an answer to part (a), assume the MLE is $\hat\theta = 0.5$.)

Solution:

(a). Find the MLE of $\theta$.

$$(X_1, X_2, X_3) \sim \text{Multinomial}(n,\ p = ((1-\theta)^2,\ 2\theta(1-\theta),\ \theta^2))$$
Log-likelihood for $\theta$:
$$\ell(\theta) = \log(f(x_1, x_2, x_3 \mid p_1(\theta), p_2(\theta), p_3(\theta))) = \log\!\left(\frac{n!}{x_1!\, x_2!\, x_3!}\ p_1(\theta)^{x_1} p_2(\theta)^{x_2} p_3(\theta)^{x_3}\right)$$
$$= x_1 \log((1-\theta)^2) + x_2 \log(2\theta(1-\theta)) + x_3 \log(\theta^2) + \text{(non-}\theta\text{ terms)}$$
$$= (2x_1 + x_2)\log(1-\theta) + (2x_3 + x_2)\log(\theta) + \text{(non-}\theta\text{ terms)}$$
First derivative of the log-likelihood:
$$\ell'(\theta) = -\frac{2x_1 + x_2}{1-\theta} + \frac{2x_3 + x_2}{\theta}$$
Setting $\ell'(\theta) = 0$ gives
$$\hat\theta = \frac{2x_3 + x_2}{2(x_1 + x_2 + x_3)} = \frac{2(5) + 10}{2(50)} = 0.2.$$

(b). Find the asymptotic variance of the MLE.
$$\mathrm{Var}(\hat\theta) \approx \frac{1}{E[-\ell''(\theta)]}$$
Second derivative of the log-likelihood:
$$\ell''(\theta) = \frac{d}{d\theta}\left[-\frac{2x_1 + x_2}{1-\theta} + \frac{2x_3 + x_2}{\theta}\right] = -\frac{2x_1 + x_2}{(1-\theta)^2} - \frac{2x_3 + x_2}{\theta^2}$$
Each of the $X_i$ is Binomial$(n, p_i(\theta))$, so
$$E[X_1] = n p_1(\theta) = n(1-\theta)^2, \quad E[X_2] = n p_2(\theta) = 2n\theta(1-\theta), \quad E[X_3] = n p_3(\theta) = n\theta^2$$
$$E[-\ell''(\theta)] = \frac{2n(1-\theta)^2 + 2n\theta(1-\theta)}{(1-\theta)^2} + \frac{2n\theta^2 + 2n\theta(1-\theta)}{\theta^2} = \frac{2n}{1-\theta} + \frac{2n}{\theta} = \frac{2n}{\theta(1-\theta)}$$
so
$$\mathrm{Var}(\hat\theta) \approx \frac{\theta(1-\theta)}{2n}, \qquad \hat\sigma^2_{\hat\theta} = \frac{\hat\theta(1-\hat\theta)}{2(50)} = \frac{0.2(1-0.2)}{100} = \frac{0.16}{100} = (0.04)^2.$$

(c). The asymptotic distribution of $\hat\theta$ is $N\!\left(\theta,\ \dfrac{\theta(1-\theta)}{2n}\right)$.

(d). An approximate 90% confidence interval for $\theta$ is given by
$$\left\{\theta : \hat\theta - z(\alpha/2)\sqrt{\mathrm{Var}(\hat\theta)} < \theta < \hat\theta + z(\alpha/2)\sqrt{\mathrm{Var}(\hat\theta)}\right\},$$
where $1 - \alpha = 0.90$, $z(\alpha/2) = z(0.05) = 1.645$, and $\sqrt{\mathrm{Var}(\hat\theta)} = 0.04$.

So the approximate 90% confidence interval is:
$$\{\theta : 0.2 - 0.0658 < \theta < 0.2 + 0.0658\} = (0.1342,\ 0.2658).$$

(e). For the bootstrap distribution of the errors $\Delta = \hat\theta - \theta_0$ (where $\theta_0$ is the true value), the approximate 5% and 95% quantiles are $\underline\delta = -0.067$ and $\bar\delta = +0.065$. The approximate 90% confidence interval is
$$\{\theta : \hat\theta - \bar\delta < \theta < \hat\theta - \underline\delta\} = [0.2 - 0.065,\ 0.2 + 0.067] = [0.135,\ 0.267].$$
The confidence level of the interval is 90%.
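The following sketch (illustrative, not part of the exam) reproduces parts (a), (d), and (e): the Hardy-Weinberg MLE, the normal-approximation interval, and a parametric bootstrap interval. The counts come from the problem; the bootstrap quantiles will differ slightly from the exam's $\pm 0.065/0.067$ because the resampling seed is arbitrary.

```python
# Sketch: Hardy-Weinberg MLE, asymptotic 90% CI, and parametric bootstrap CI.
import numpy as np

counts = np.array([35, 10, 5])                  # AA, Aa, aa
n = counts.sum()                                # 50

def mle(x):
    # theta_hat = (2*x3 + x2) / (2*n), from setting the score to zero
    return (2 * x[2] + x[1]) / (2 * x.sum())

theta_hat = mle(counts)                                     # 0.2
se = np.sqrt(theta_hat * (1 - theta_hat) / (2 * n))         # 0.04
print(theta_hat - 1.645 * se, theta_hat + 1.645 * se)       # approx (0.134, 0.266)

# Parametric bootstrap: resample from Multinomial(n, p(theta_hat)).
rng = np.random.default_rng(2)
p = np.array([(1 - theta_hat) ** 2, 2 * theta_hat * (1 - theta_hat), theta_hat ** 2])
boot = np.array([mle(rng.multinomial(n, p)) for _ in range(1000)])
delta_lo, delta_hi = np.quantile(boot - theta_hat, [0.05, 0.95])
print(theta_hat - delta_hi, theta_hat - delta_lo)           # bootstrap 90% CI
```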

MIT OpenCourseWare
http://ocw.mit.edu

18.443 Statistics for Applications
Spring 2015

For information about citing these materials or our Terms of Use, visit: http://ocw.mit.edu/terms.