Quantitative Introduction to Risk and Uncertainty in Business Module 5: Hypothesis Testing Examples


M. Vidyasagar, Cecil & Ida Green Chair, The University of Texas at Dallas
Email: M.Vidyasagar@utdallas.edu
October 13, 2012

Outline
1. Kolmogorov-Smirnov (K-S) Tests


Take the height and weight data given earlier. Compute the mean $\mu$ and standard deviation $\sigma$ of the raw data without binning. For heights we get $\mu_h = 67.9639$, $\sigma_h = 1.9318$; for weights, $\mu_w = 127.2639$, $\sigma_w = 11.5442$. Then we plot the actual empirical distribution against the Gaussian distribution corresponding to these choices. The next slides depict the results.

Empirical vs. Fitted Height Density

Empirical vs. Fitted Height Distribution

Application of One-Sample K-S Test to Height Data

We have 5000 samples, so $n = 5000$. Let us look at the 95% confidence level, so $\delta = 0.05$. Compute the threshold

$$\theta(n, \delta) = \left( \frac{1}{2n} \log \frac{2}{\delta} \right)^{1/2}.$$

For the chosen values, this turns out to be $\theta = 0.0192$. So if the actual maximum difference between the empirical and fitted distribution functions for heights is larger than this threshold, then we can say with 95% confidence that the samples are not generated by a Gaussian distribution.
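The threshold computation above can be sketched in a few lines of Python. This is only an illustration: the function name `ks_threshold` is hypothetical, and the logarithm is assumed to be the natural logarithm, which is the choice that reproduces the value 0.0192 quoted on the slide.

```python
import math


def ks_threshold(n, delta):
    """Threshold theta(n, delta) = sqrt(log(2 / delta) / (2 n)).

    n     : number of samples
    delta : one minus the confidence level (0.05 for 95% confidence)
    The natural logarithm is assumed.
    """
    return math.sqrt(math.log(2.0 / delta) / (2.0 * n))


# Values from the slide: n = 5000 samples, 95% confidence.
theta_95 = ks_threshold(5000, 0.05)
print(round(theta_95, 4))  # 0.0192
```

Because the function takes only `n` and `delta`, the same call applies to any data set of the same size at the same confidence level.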

Application of One-Sample K-S Test to Height Data (2)

The actual maximum difference is 0.0103. So can we accept the hypothesis that heights follow a Gaussian distribution with mean $\mu_h$ and standard deviation $\sigma_h$? The fact that the actual maximum is less than $\theta$ says only that we cannot reject the null hypothesis. So let us see if we can make a less wishy-washy statement.
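The height data itself is not reproduced in these slides, so the following is only a sketch of how such a maximum difference might be computed. It uses synthetic Gaussian samples (an assumption made purely for illustration, with a fixed seed), fits the mean and standard deviation from the samples, and takes the largest gap between the empirical distribution function and the fitted Gaussian distribution function. The names `normal_cdf` and `ks_statistic` are hypothetical.

```python
import math
import random
import statistics


def normal_cdf(x, mu, sigma):
    """Distribution function of a Gaussian with mean mu and std sigma."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def ks_statistic(samples, mu, sigma):
    """Maximum gap between the empirical and the Gaussian distribution function."""
    xs = sorted(samples)
    n = len(xs)
    d_max = 0.0
    for i, x in enumerate(xs):
        f = normal_cdf(x, mu, sigma)
        # The empirical distribution function jumps from i/n to (i+1)/n at x,
        # so both sides of the jump have to be checked.
        d_max = max(d_max, abs((i + 1) / n - f), abs(f - i / n))
    return d_max


# Synthetic stand-in for the height data (NOT the data used on the slides).
random.seed(0)
samples = [random.gauss(68.0, 1.9) for _ in range(5000)]

mu_hat = statistics.mean(samples)
sigma_hat = statistics.stdev(samples)
d = ks_statistic(samples, mu_hat, sigma_hat)
theta = math.sqrt(math.log(2.0 / 0.05) / (2.0 * len(samples)))

print("K-S statistic:", d)
print("threshold    :", theta)
print("reject null  :", d > theta)
```

With the real data, `samples` would simply be replaced by the recorded heights; the rest of the sketch would stay the same.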

Application of One-Sample K-S Test to Height Data (3)

The K-S test above shows only that we cannot assert with 95% confidence that the samples do not come from a Gaussian distribution. That might not be a good enough reason to accept the null hypothesis. So let us ask: what is the threshold for asserting with only 50% confidence that the samples do not come from a Gaussian distribution? To find this threshold, substitute $\delta = 0.5$ (not $\delta = 0.05$ as earlier) into

$$\theta(n, \delta) = \left( \frac{1}{2n} \log \frac{2}{\delta} \right)^{1/2}.$$

This gives $\theta = 0.0118$. Since the actual maximum difference of 0.0103 is less than even this number, the test cannot reject the null hypothesis even at the 50% confidence level, so we accept it.
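One way to avoid trying values of $\delta$ one at a time is to invert the threshold formula: setting $\theta(n, \delta) = d$ and solving for $\delta$ gives $\delta = 2 \exp(-2 n d^2)$. The sketch below applies this to the height data. The inversion and the function name `delta_at_threshold` are not part of the original slides, and the natural logarithm is assumed.

```python
import math


def delta_at_threshold(n, d):
    """Solve theta(n, delta) = d for delta, giving delta = 2 * exp(-2 * n * d**2)."""
    return 2.0 * math.exp(-2.0 * n * d * d)


# Height data from the slides: n = 5000, observed maximum difference 0.0103.
delta_heights = delta_at_threshold(5000, 0.0103)
print(round(delta_heights, 2))  # 0.69
```

Under these assumptions, the observed difference exceeds the threshold only for $\delta$ above roughly 0.69, which is consistent with the test failing to reject at both $\delta = 0.05$ and $\delta = 0.5$.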

Empirical vs. Fitted Weight Density

Empirical vs. Fitted Weight Distribution

Application of One-Sample K-S Test to Weight Data

We have 5000 samples, so $n = 5000$. Let us look at the 95% confidence level, so $\delta = 0.05$. Compute the threshold

$$\theta(n, \delta) = \left( \frac{1}{2n} \log \frac{2}{\delta} \right)^{1/2}.$$

For the chosen values, this turns out to be $\theta = 0.0192$ if $\delta = 0.05$ and $\theta = 0.0118$ if $\delta = 0.5$. These are the same numbers as before! Is this a coincidence? No! The threshold $\theta$ depends only on the number of samples $n$ and on $\delta$, that is, on the confidence level $1 - \delta$.

Application of One-Sample K-S Test to Weight Data (2)

The actual maximum difference between the empirical and fitted distributions of weights is 0.0060, which is lower than the threshold $\theta = 0.0118$ for $\delta = 0.5$. So we can accept the hypothesis that weights follow a Gaussian distribution with mean $\mu_w$ and standard deviation $\sigma_w$.


We fitted a Gaussian to the logarithm of the prices of homes sold in the UK during June 2012. The results are shown on the next slide.

Home Prices: Empirical vs. Fitted Density

Application of One-Sample K-S Test to Home Prices Data

Recall that the K-S threshold is

$$\theta(n, \delta) = \left( \frac{1}{2n} \log \frac{2}{\delta} \right)^{1/2}.$$

There are 54675 homes sold, so $n = 54675$. If we take $\delta = 0.05$ as usual, then the threshold is $\theta = 0.0058$.
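As a check on the arithmetic, the threshold for the home price data can be worked out step by step. The natural logarithm is assumed here, since that is the choice that reproduces the quoted value.

```latex
\theta(54675,\, 0.05)
  = \left( \frac{1}{2 \times 54675} \log \frac{2}{0.05} \right)^{1/2}
  = \left( \frac{\log 40}{109350} \right)^{1/2}
  \approx \left( \frac{3.6889}{109350} \right)^{1/2}
  \approx \left( 3.3735 \times 10^{-5} \right)^{1/2}
  \approx 0.0058
```

The threshold is much smaller than the 0.0192 obtained for 5000 samples because it shrinks in proportion to $1/\sqrt{n}$.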

Application of K-S Test to Home Prices Data (2)

The mean of the log home price is $\mu_l = 12.1175$, and the standard deviation is $\sigma_l = 0.6608$. The next slide shows the empirical and fitted distribution functions.

Home Prices: Empirical vs. Fitted Distribution

Application of K-S Test to Home Prices Data (3)

Now if we compute the K-S statistic, which is the maximum difference between the empirical and fitted distribution functions, it works out to 0.0565, which is far higher than $\theta = 0.0058$. So with 95% confidence, we can reject the null hypothesis that home prices follow a log-normal (that is, log-Gaussian) distribution.


Example of Student t Test

Problem: From the 200 samples of height, determine whether the mean of the first 120 samples differs in a statistically significant way from that of the last 80 samples. The null hypothesis is that the two means are the same. We compute the test statistic $d_t$ and compare it against the t test threshold. If $d_t$ is larger than the threshold, we reject the null hypothesis; otherwise we accept it.

Example of Student t Test (Cont'd)

We have

$$\bar{x}_1 = 68.0703, \quad \bar{x}_2 = 67.7694, \quad S_1 = 3.3581, \quad S_2 = 4.3730, \quad S_{12} = 3.7957.$$

Now the test statistic is

$$d_t = \frac{\bar{x}_1 - \bar{x}_2}{S_{12} \sqrt{(1/m_1) + (1/m_2)}} = 0.5492.$$

Let $\Phi_{t,198}$ denote the distribution function of the t distribution with $m_1 + m_2 - 2 = 198$ degrees of freedom. Then the threshold is $\Phi_{t,198}^{-1}(0.95) = 1.6526$. Since the test statistic is smaller than the t test threshold, we cannot reject the null hypothesis that both means are the same.
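The numbers on this slide can be reproduced from the summary statistics alone. The sketch below assumes that $S_{12}$ is the usual pooled standard deviation, $S_{12}^2 = \frac{(m_1 - 1) S_1^2 + (m_2 - 1) S_2^2}{m_1 + m_2 - 2}$; the slide does not state this definition, but it is consistent with the quoted value 3.7957. The function names are hypothetical, and the threshold 1.6526 is taken from the slide rather than recomputed.

```python
import math


def pooled_std(s1, m1, s2, m2):
    """Pooled standard deviation of two samples (assumed definition of S_12)."""
    return math.sqrt(((m1 - 1) * s1 ** 2 + (m2 - 1) * s2 ** 2) / (m1 + m2 - 2))


def two_sample_t(xbar1, xbar2, s12, m1, m2):
    """Test statistic d_t = (xbar1 - xbar2) / (S_12 * sqrt(1/m1 + 1/m2))."""
    return (xbar1 - xbar2) / (s12 * math.sqrt(1.0 / m1 + 1.0 / m2))


# Summary statistics quoted on the slide.
m1, m2 = 120, 80
s12 = pooled_std(3.3581, m1, 4.3730, m2)
d_t = two_sample_t(68.0703, 67.7694, s12, m1, m2)

# 0.95 quantile of the t distribution with 198 degrees of freedom,
# copied from the slide (not computed here).
threshold = 1.6526
reject = d_t > threshold

print("S_12   =", round(s12, 4))
print("d_t    =", round(d_t, 4))
print("reject =", reject)
```

Keeping the quantile as a constant keeps the sketch free of third-party dependencies; a statistics library with a t-distribution quantile function could supply it instead.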


Example

Again take the height data, with $m_1 = 100$, $m_2 = 10$. So we are testing whether the next 10 samples have the same variance as the first 100 samples. In this case

$$V_1 = 3.3413, \quad S_2 = 26.6335, \quad \frac{S_2}{V_1} = 7.9711,$$

while

$$x_l = \Phi_{\chi^2,9}^{-1}(0.05) = 3.3251, \quad x_u = \Phi_{\chi^2,9}^{-1}(0.95) = 16.9190.$$

Since the chi-squared test statistic $S_2 / V_1$ lies within the interval $[x_l, x_u]$, we accept the null hypothesis.
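The decision rule in this example is a two-sided interval check, which can be sketched as follows. The function name `within_interval` is hypothetical, and the two chi-squared quantiles for 9 degrees of freedom are copied from the slide rather than recomputed.

```python
def within_interval(stat, lower, upper):
    """Return True when the test statistic lies inside [lower, upper]."""
    return lower <= stat <= upper


# Values quoted on the slide.
v1 = 3.3413    # V_1, from the first 100 samples
s2 = 26.6335   # S_2, from the next 10 samples
stat = s2 / v1

# 0.05 and 0.95 quantiles of the chi-squared distribution with 9 degrees
# of freedom, copied from the slide (not computed here).
x_l, x_u = 3.3251, 16.9190

accept = within_interval(stat, x_l, x_u)
print("statistic =", round(stat, 4))
print("accept    =", accept)
```

A statistic below `x_l` or above `x_u` would lead to rejecting the null hypothesis of equal variances.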