Actuarial Mathematics and Statistics Statistics 5 Part 2: Statistical Inference Tutorial Problems


Spring 2005

1. Which of the following statements relate to probabilities that can be interpreted as frequencies? For those that can, describe the corresponding sequence of experiments.
(a) The probability that a 6 is scored with a given die is 1/6.
(b) The probability that a randomly chosen 20-year-old male survives until age 60 is 0.8.
(c) The probability that Wayne Rooney (famous footballer) survives until age 60 is 0.8.
(d) The probability that Jack the Ripper was a member of the Royal Family is 0.05.
(e) The probability that the next pint of milk you buy goes off within 1 week of purchase is 0.6.

2. Which standard distributions might be appropriate to model the following experimental outcomes?
(a) Select 10 individuals randomly from the student population of Heriot-Watt and count the number of 1st-year students in your sample.
(b) Throw a dart at a dartboard repeatedly until you score a treble 20. Count the number of throws required.
(c) Monitor a stretch of road continuously for 1 year and count the number of accidents that occur.
(d) Select a male student at random from the HW population and measure their height.
(e) Put a new battery in a watch and measure the time elapsed before the watch stops.

3. Let $X$ denote a random sample of size $n$ from a distribution with parameter $\theta$ and let $g(X)$ denote an estimator for $\theta$.
(a) Show that $\mathrm{MSE}(g(X)) = \mathrm{Var}(g(X)) + (\mathrm{bias}(g(X)))^2$.

(b) Suppose the mean and variance of $X$ are $\mu$ and $\sigma^2$ respectively and that $\bar{X}$ and $S^2$ denote the sample mean and variance. Show that (i) $E(\bar{X}) = \mu$, (ii) $\mathrm{Var}(\bar{X}) = \sigma^2/n$, (iii) $E(S^2) = \sigma^2$.

4. Let $X$ denote a random sample of size $n$ from a $\mathrm{Poisson}(\lambda)$ distribution. Consider the estimator $\hat{\lambda} = \bar{X}$.
(a) Is $\hat{\lambda}$ an unbiased estimator for $\lambda$? Give reasons.
(b) Calculate the MSE of $\hat{\lambda}$.
(c) Calculate the Cramér-Rao lower bound and determine whether $\hat{\lambda}$ is the most efficient unbiased estimator.
(d) Consider the family of estimators $\hat{\lambda} = a\bar{X}$ where $a > 0$. Calculate the MSE of such an estimator as a function of $a$. Which value of $a$ gives the estimator with the minimum MSE?

5. Consider Example 2.2.1 from the notes.
(a) Prove carefully that $a^*$ and $b^*$ are given by the stated expressions (i.e. minimise the expression for the MSE with respect to $a$ and $b$).
(b) For the case where $\theta = 100$, $\sigma_1^2 = \sigma_2^2 = 1$, calculate the MSE for the optimal estimator (given by $a^*$ and $b^*$) and the MSE for the optimal unbiased estimator.

6. Let $X$ be a random sample from $\exp(1/\theta)$ (i.e. with mean $\theta$).
(a) Write down $E(\bar{X})$, $\mathrm{Var}(\bar{X})$ and $\mathrm{MSE}(\bar{X})$.
(b) Now consider the estimator $Y = a\bar{X}$. Write down $\mathrm{MSE}(Y)$ and determine the value of $a$ which minimises it. Compare $\bar{X}$ and the optimum $Y$ for both small and large values of $n$. Comment.
(c) For the case where $n = 200$, use Chebyshev's inequality to obtain an upper bound for the frequency with which the estimator $\bar{X}$ will not fall within 10% of the true value of $\theta$ (i.e. between $0.9\theta$ and $1.1\theta$). (A simulation check of this question is sketched after question 7 below.)

7. In the situation described in question 6, for a sample of size $n$, determine the Cramér-Rao lower bound for the variance of an unbiased estimator of $\theta$. Comment.
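Question 6 lends itself to a quick Monte Carlo sanity check. The following is a minimal sketch, not part of the question: the values $\theta = 2$, $n = 200$ and the number of replications are arbitrary illustrative choices, and numpy is assumed to be available.

```python
# Monte Carlo sketch for question 6: estimate the MSE of the scaled estimator
# a*Xbar over a grid of a values, and estimate the frequency with which Xbar
# misses theta by more than 10% (part (c)).
import numpy as np

rng = np.random.default_rng(0)
theta, n, reps = 2.0, 200, 50_000          # illustrative choices only

# sample means of `reps` independent samples of size n from exp(1/theta)
xbar = rng.exponential(scale=theta, size=(reps, n)).mean(axis=1)

print(f"empirical MSE of Xbar: {np.mean((xbar - theta) ** 2):.6f}")

# scan the scaling constant a and report the empirically best one (part (b))
grid = np.linspace(0.9, 1.1, 201)
mse = [np.mean((a * xbar - theta) ** 2) for a in grid]
print(f"a minimising the empirical MSE: {grid[int(np.argmin(mse))]:.3f}")

# part (c): frequency with which Xbar falls outside (0.9*theta, 1.1*theta),
# compared with the Chebyshev bound Var(Xbar) / (0.1*theta)^2
outside = np.mean(np.abs(xbar - theta) > 0.1 * theta)
print(f"P(Xbar outside 10% of theta) ≈ {outside:.4f}, "
      f"Chebyshev bound = {(theta ** 2 / n) / (0.1 * theta) ** 2:.2f}")
```

The empirical minimiser of the MSE and the gap between the observed miss frequency and the Chebyshev bound can be compared with your algebraic answers.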

8. Let $X \sim \mathrm{Bin}(n, p)$. Show that $X/n$ is a consistent estimator for $p$.

9. Let $X$ be a random sample from $N(0, \sigma^2)$.
(a) Show that the Cramér-Rao lower bound for unbiased estimators of $\sigma^2$ is $2\sigma^4/n$.
(b) Investigate the following estimators for $\sigma^2$ (i.e. comment on bias and efficiency):
$$S^2 = \frac{1}{n-1}\sum_{i=1}^{n}(X_i - \bar{X})^2, \qquad S_n^2 = \frac{1}{n}\sum_{i=1}^{n}X_i^2, \qquad S_{n+2}^2 = \frac{1}{n+2}\sum_{i=1}^{n}X_i^2.$$
Note: For a random sample from $N(\mu, \sigma^2)$, we have $\mathrm{Var}(S^2) = \frac{2\sigma^4}{n-1}$ (we will prove this later). Also, for a $N(0, \sigma^2)$ random variable $X$, $\mathrm{Var}(X^2) = 2\sigma^4$ (try to prove this using moment generating functions!).

10. A population consists of 100 individuals of whom some unknown proportion $p$ carry a certain gene. Suppose you randomly select $n$ of these without replacement and test whether they carry the gene, generating observations $Y_1, Y_2, \ldots, Y_n$, where $Y_i = 1$ if the $i$th individual carries the gene and is equal to zero otherwise.
(a) Prove that the marginal distribution of $Y_i$ is $\mathrm{Bernoulli}(p)$ for all $i = 1, \ldots, n$.
(b) Explain why the $Y_i$ are not independent of each other.
(c) If $p = 0.4$ and $n = 50$, calculate the MSE of the estimator $\hat{p} = \frac{1}{n}\sum_{i=1}^{n} Y_i$.
(d) Use Chebyshev's inequality to calculate an upper bound for the frequency with which $|\hat{p} - p|$ exceeds 0.05, if $p = 0.4$ and $n = 50$.
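Parts (c) and (d) of question 10 can be checked by simulation. A minimal sketch follows, assuming numpy is available; the number of replications is an arbitrary choice. It uses the fact that the number of carriers in a without-replacement sample of 50 from 40 carriers and 60 non-carriers is hypergeometric.

```python
# Monte Carlo sketch for question 10(c)-(d): sampling n = 50 individuals
# without replacement from a population of 100 with 40 gene carriers.
import numpy as np

rng = np.random.default_rng(0)
p, n, reps = 0.4, 50, 500_000              # reps is an illustrative choice

# number of carriers in each simulated sample
carriers_in_sample = rng.hypergeometric(ngood=40, nbad=60, nsample=n, size=reps)
p_hat = carriers_in_sample / n

print("empirical MSE of p_hat          :", np.mean((p_hat - p) ** 2))
print("empirical P(|p_hat - p| > 0.05) :", np.mean(np.abs(p_hat - p) > 0.05))
```

The empirical MSE should agree with your answer to (c), and the empirical exceedance frequency should sit well below the Chebyshev bound from (d).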

11. A health-and-safety officer believes that the number of industrial accidents occurring each month comes from a $\mathrm{Poisson}(\lambda)$ distribution, with the numbers being independent between months. In 3 successive months they observe 3, 6, and 2 accidents, respectively.
(a) Construct the likelihood $L(\lambda)$ and the log-likelihood $\ell(\lambda)$ for this data set.
(b) Maximise the log-likelihood to obtain the MLE of $\lambda$ for this data set.
(c) Find the method of moments estimator for $\lambda$. How does it compare with the MLE?

12. For random (i.i.d.) samples from the following distributions, find the MLE and the MME of the indicated parameters:
(a) $p$ in $\mathrm{Bin}(n, p)$ with $n$ known and sample size 1.
(b) $\lambda$ in $\exp(\lambda)$, sample size $n$.
(c) $\theta$ in $\mathrm{Beta}(\theta, 1)$, sample size $n$. (The density is given by $f(x; \theta) = \theta x^{\theta - 1}$ for $0 < x < 1$.)

13. A bag contains 10 balls in total, $r$ of which are black with the remainder white, where $r$ is unknown. A random selection of 4 balls (without replacement) consists of 2 black and 2 white balls.
(a) Identify the range of values of $r$ for which the likelihood $L(r)$ is non-zero.
(b) Show that the likelihood $L(r)$ satisfies $L(r) \propto r(r-1)(10-r)(10-r-1)$.
(c) Evaluate the right-hand side of the expression in (b) for different values of $r$ and hence find the MLE of $r$. Is it what you expected?
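If you want to check a hand calculation for question 13(c), the evaluation can be done in a few lines of plain Python; this is only a computational check, not a substitute for the argument asked for in (a) and (b).

```python
# Sketch for question 13(c): evaluate r(r-1)(10-r)(10-r-1) for every r from
# 0 to 10 and report the value of r that maximises it.
values = {r: r * (r - 1) * (10 - r) * (10 - r - 1) for r in range(11)}
for r, v in values.items():
    print(f"r = {r:2d}:  r(r-1)(10-r)(10-r-1) = {v}")
print("value of r maximising the expression:", max(values, key=values.get))
```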

14. A random sample of size 2, i.e. $X = (X_1, X_2)$, is taken from a population with density given by
$$f(x) = \frac{2}{\theta}\left(1 - \frac{x}{\theta}\right) \quad \text{for } 0 < x < \theta,$$
where $\theta > 0$ (note that the range of $X$ depends on $\theta$). Draw a rough graph of the likelihood function $L(\theta)$ and determine the MLE $\hat{\theta}$. Determine the MME of $\theta$ and compare this to $\hat{\theta}$.

15. Let $x_1, \ldots, x_n$ be the values in a random sample of size $n$ from a $U(a, b)$ distribution (where $b > a$) and let $x_{(1)}$ and $x_{(n)}$ denote the smallest and largest values in the sample respectively.
(a) Sketch the region of parameter space where the likelihood is non-zero and, for points $(a, b)$ lying in this region, determine the value of the likelihood $L(a, b)$.
(b) Hence show that the maximum likelihood estimate is given by $(\hat{a}, \hat{b}) = (x_{(1)}, x_{(n)})$.
(c) (Harder) Consider the sampling properties of $\hat{a}$ and $\hat{b}$. Do you think that they are independent of each other? Calculate $\mathrm{Cov}(\hat{a}, \hat{b})$ for the case where $n = 2$, $a = 0$ and $b = 1$. (Hint: Consider the joint density of $(X_1, X_2)$; can you identify the joint density of $(X_{(1)}, X_{(2)})$?)

16. The number of cars that exceed the speed limit on a stretch of road on any given day is believed to be $\mathrm{Poisson}(\lambda)$, where $\lambda$ is unknown. Furthermore, it is believed that the numbers on different days are independent of each other. To estimate $\lambda$ you install a speed camera to record the number of cars on 10 successive days. However, the camera is defective and only records the first speeding car each day. At the end of the trial you find that there were no speeding cars on 4 of the days, with at least one culprit on the other 6 days.
(a) Find the likelihood $L(\lambda)$ by writing the probability of the observations as a function of $\lambda$. Maximise it to obtain the MLE of $\lambda$.
(b) What is the distribution of the number of days out of the 10 on which there are no speeding cars? (Hint: Think of a day with no speeding cars as a success.)

(c) Use the result from (b) and properties of MLEs to give a quicker derivation of the MLE of $\lambda$.

17. In a game involving a biased coin with $P(H) = p$ (where $p$ is unknown), players toss the coin 3 times and win if they obtain at least one Head. Suppose that 20 players play the game, resulting in 10 winners. Find the MLE of $p$.

18. The diameters of tubers (measured in cm) of a certain variety of potato follow an $\mathrm{Exp}(\lambda)$ distribution, where $\lambda$ is unknown. You are provided with a random sample of 100 tubers and two sorting grids. Any tuber less than 6 cm in diameter will pass through the first grid, while any tuber less than 3 cm will pass through the second grid. Out of the 100 tubers, 20 pass through neither grid and 50 pass through both grids.
(a) Show carefully that the likelihood $L(\lambda)$ is given by
$$L(\lambda) = (1 - e^{-3\lambda})^{80}\, e^{-210\lambda}.$$
(b) Find the value of $\lambda$ which maximises this expression. (Hint: Make the substitution $p = e^{-3\lambda}$ and find the MLE of $p$ first of all.)

19. Let $X$ be a random sample of size $n$ from $\mathrm{Poisson}(\mu)$. Of the $n$ observations, $n_0$ equal 0, $n_1$ equal 1, and the remaining $n - n_0 - n_1$ observations are greater than 1.
(a) Obtain an equation satisfied by the MLE $\hat{\mu}$.
(b) In the case $n = 20$, $n_0 = 8$, $n_1 = 7$, use numerical methods to find $\hat{\mu}$ correct to 4 decimal places. Note: Use Poisson tables to find a good starting value for your numerical calculations. (A numerical sketch for this part is given after question 21 below.)

20. Let $X$ be a random sample from $\Gamma(t, \lambda)$.
(a) Obtain a set of equations satisfied by the MLEs $\hat{t}$ and $\hat{\lambda}$. Suggest how these might be solved.
(b) Determine the MMEs for $t$ and $\lambda$. Which method (MLE or MME) seems easier?

21. Independent samples of size $n_1$ and $n_2$ are taken from $N(\mu_1, \sigma^2)$ and $N(\mu_2, \sigma^2)$ respectively. Determine the MLEs $\hat{\mu}_1$, $\hat{\mu}_2$, and $\hat{\sigma}^2$. (Hint: $L = L_1 L_2$.)
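The numerical part of question 19(b) can be checked with an off-the-shelf optimiser rather than hand iteration. A minimal sketch, assuming numpy and scipy are available: it maximises the grouped-data log-likelihood directly instead of solving the score equation from part (a).

```python
# Sketch for question 19(b): numerically maximise the log-likelihood of the
# grouped Poisson data (8 zeros, 7 ones, 5 observations greater than 1).
import numpy as np
from scipy.optimize import minimize_scalar

n0, n1, n_big = 8, 7, 5

def negative_loglik(mu):
    p0 = np.exp(-mu)           # P(X = 0) under Poisson(mu)
    p1 = mu * np.exp(-mu)      # P(X = 1)
    p_big = 1.0 - p0 - p1      # P(X > 1)
    return -(n0 * np.log(p0) + n1 * np.log(p1) + n_big * np.log(p_big))

result = minimize_scalar(negative_loglik, bounds=(1e-6, 10.0),
                         method="bounded", options={"xatol": 1e-8})
print(f"numerical MLE of mu: {result.x:.4f}")
```

The value printed should agree (to 4 decimal places) with the root of your equation from part (a).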

21. Independent samples of size n 1 and n 2 are taken from N(µ 1, σ 2 ) and N(µ 2, σ 2 ) respectively. Determine the MLEs ˆµ 1, ˆµ 2, and ˆσ 2. (Hint: L = L 1 L 2 ). 22. Let X be a random sample from a truncated N(µ, 1) distribution with density given by ( ) 1 1 (x µ)2 f(x; µ) = exp( ) for 0 < x <. 1 Φ( µ) 2π 2 to (a) Show that both the MLE and MME are given by the solution x µ φ( µ) 1 Φ( µ) = 0 where φ and Φ are the density and distribution functions, respectively, of N(0, 1). (b) If x = 1.42, find ˆµ to 1 decimal place. 7

23. The time till relapse (in months) of patients treated with a certain drug is believed to follow a $\Gamma(2, \beta)$ distribution, where $\beta$ is unknown. To estimate $\beta$ you take a random sample of 10 patients at time $t = 0.0$ and monitor them over the following 6 months. In this time 6 patients relapse, at times $t_1, \ldots, t_6$ for which $\sum t_i = 13.2$. The remaining 4 patients do not relapse.
(a) Show that the probability that a given patient does not relapse (as a function of $\beta$) is given by
$$p(\beta) = (1 + 6\beta)\, e^{-6\beta}.$$
(b) Explain carefully why the likelihood satisfies
$$L(\beta) \propto \beta^{12} (1 + 6\beta)^4 e^{-37.2\beta}.$$
Hint: You need to include a factor corresponding to each of the 10 patients.
(c) Find a quadratic equation satisfied by the MLE of $\beta$ and hence find the MLE. (Remember that $\beta$ must be positive!)
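For question 23(c), the positive root of your quadratic can be cross-checked by maximising the log of the likelihood kernel from part (b) numerically. A minimal sketch, assuming numpy and scipy are available; the search interval is an arbitrary choice.

```python
# Sketch for question 23(c): maximise
#   log L(beta) = 12*log(beta) + 4*log(1 + 6*beta) - 37.2*beta + const
# numerically, using the kernel given in part (b).
import numpy as np
from scipy.optimize import minimize_scalar

def negative_loglik(beta):
    return -(12 * np.log(beta) + 4 * np.log(1 + 6 * beta) - 37.2 * beta)

result = minimize_scalar(negative_loglik, bounds=(1e-6, 5.0),
                         method="bounded", options={"xatol": 1e-10})
print(f"numerical MLE of beta: {result.x:.4f}")
```

The printed value should match the admissible (positive) root of the quadratic you derive by hand.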