Lecture 12: The Bootstrap


Reading: Chapter 5
STATS 202: Data mining and analysis
October 20, 2017

Announcements
- Midterm is on Monday, Oct 30.
- Topics: chapters 1-5 and 10 of the book; everything up to and including today's lecture.
- We will post two practice exams soon.
- Closed book, no notes. All hard equations will be provided.
- SCPD students: if you haven't chosen your proctor already, you must do it ASAP. For guidelines see: http://scpd.stanford.edu/programs/courses/graduate-courses/exam-monitor-information

The learning curve and choosing k in k-fold cross validation
[Figure: The learning curve. x-axis: Size of Training Set (0 to 200); y-axis: 1-Err (0.0 to 0.8).]
Recall that as we increase k, we decrease the bias but increase the variance of the cross validation error.
How does the test error change as we increase the size n of the training set? Consider the learning curve in the figure:
- If n = 200, then 5-fold CV estimates error using a dataset of size $\frac{4}{5} \cdot 200 = 160$: introduces little bias!
- If n = 50, then 5-fold CV estimates error using a dataset of size $\frac{4}{5} \cdot 50 = 40$: introduces more bias.
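The fold-size arithmetic above can be sketched in a few lines. This is only an illustration: the function name `cv_training_size` is hypothetical, and it assumes each held-out fold has n // k observations, which is exact when k divides n.

```python
def cv_training_size(n: int, k: int) -> int:
    """Observations used for fitting in each round of k-fold CV.

    Assumption: the held-out fold has n // k observations
    (exact when k divides n).
    """
    if not 2 <= k <= n:
        raise ValueError("need 2 <= k <= n")
    return n - n // k


# The two cases discussed on the slide, with k = 5.
for n in (200, 50):
    print(n, cv_training_size(n, k=5))
```

Reading the resulting sizes (160 and 40) off the learning curve shows why the smaller dataset suffers more bias: the curve is still rising steeply near 40 but is nearly flat near 160.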

Cross-validation vs. the Bootstrap
Cross-validation: principally used to estimate prediction error.
The Bootstrap: principally used to estimate various measures of error or uncertainty of parameter estimates, e.g.
- standard error (SE) of parameter estimates,
- confidence intervals for parameters.
One of the most important techniques in all of Statistics.
- Widely applicable, extremely powerful, computer-intensive method.
- Popularized by Brad Efron, from Stanford.

Standard errors in linear regression
Standard error: SD of an estimate from a sample of size n.

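Classical way to compute Standard Errors
Example: Estimate the variance of a sample $x_1, x_2, \dots, x_n$:
$$\hat\sigma^2 = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2.$$
What is the Standard Error of $\hat\sigma^2$?
1. Assume that $x_1, \dots, x_n$ are i.i.d. normally distributed.
2. From that assumption one can derive that $\mathrm{Var}(\hat\sigma^2) = \frac{2\sigma^4}{n-1}$, therefore $\mathrm{SE}(\hat\sigma^2) = \frac{\sqrt{2}\,\sigma^2}{\sqrt{n-1}}$.
3. Problem: We typically don't know $\sigma$!
4. So assume $\frac{\hat\sigma^2}{\sqrt{n-1}}$ is reasonably close to $\frac{\sigma^2}{\sqrt{n-1}}$.
5. Then we can use the estimate $\widehat{\mathrm{SE}}(\hat\sigma^2) = \frac{\sqrt{2}\,\hat\sigma^2}{\sqrt{n-1}}$.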
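A minimal sketch of the classical recipe, using only the Python standard library. The function names and the simulation settings (n = 20, sigma = 1, 2000 repetitions) are illustrative assumptions, not part of the lecture. The second function checks the formula by brute force: it draws many normal samples and measures how much the sample variance actually fluctuates.

```python
import math
import random
import statistics


def classical_se_of_variance(xs):
    """Plug-in SE of the sample variance: sqrt(2 / (n - 1)) * s^2.

    Only justified when the data are i.i.d. normal.
    """
    n = len(xs)
    s2 = statistics.variance(xs)  # divides by n - 1
    return math.sqrt(2.0 / (n - 1)) * s2


def simulated_sd_of_variance(n, sigma, reps, rng):
    """SD of the sample variance across `reps` simulated normal samples."""
    variances = [
        statistics.variance([rng.gauss(0.0, sigma) for _ in range(n)])
        for _ in range(reps)
    ]
    return statistics.stdev(variances)


# Illustrative settings (assumptions): n = 20, sigma = 1.
rng = random.Random(0)
n, sigma = 20, 1.0
print("formula with true sigma:", math.sqrt(2.0 / (n - 1)) * sigma**2)
print("simulated SD of s^2:    ", simulated_sd_of_variance(n, sigma, 2000, rng))
```

If the normality assumption holds, the two printed numbers should be close. Swapping `rng.gauss` for a heavy-tailed generator is a quick way to see the formula break down.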

Limitations of the classical approach
The classical approach works for certain statistics under specific modeling assumptions. However, what happens if:
- The modeling assumptions (for example, $x_1, \dots, x_n$ being normal) break down?
- The estimator does not have a simple form and its sampling distribution cannot be derived analytically?

Example. Investing in two assets
Suppose that $X$ and $Y$ are the returns of two assets. These returns are observed every day: $(x_1, y_1), \dots, (x_n, y_n)$.

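Example. Investing in two assets
We have a fixed amount of money to invest and we will invest a fraction $\alpha$ in $X$ and a fraction $(1 - \alpha)$ in $Y$. Therefore, our return will be
$$\alpha X + (1 - \alpha) Y.$$
Our goal will be to minimize the variance of our return as a function of $\alpha$. One can show that the optimal $\alpha$ is:
$$\alpha = \frac{\sigma_Y^2 - \mathrm{Cov}(X, Y)}{\sigma_X^2 + \sigma_Y^2 - 2\,\mathrm{Cov}(X, Y)}.$$
Proposal: Use an estimate:
$$\hat\alpha = \frac{\hat\sigma_Y^2 - \widehat{\mathrm{Cov}}(X, Y)}{\hat\sigma_X^2 + \hat\sigma_Y^2 - 2\,\widehat{\mathrm{Cov}}(X, Y)}.$$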
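The plug-in estimate can be computed directly from the observed pairs. The sketch below is one possible implementation; the helper names `sample_cov` and `alpha_hat` are hypothetical, and the choice of the n - 1 divisor is an assumption (any common divisor cancels in the ratio).

```python
def sample_cov(us, vs):
    """Sample covariance with divisor n - 1 (the variance when us is vs)."""
    n = len(us)
    mean_u = sum(us) / n
    mean_v = sum(vs) / n
    return sum((u - mean_u) * (v - mean_v) for u, v in zip(us, vs)) / (n - 1)


def alpha_hat(xs, ys):
    """Plug-in estimate of the variance-minimizing fraction invested in X."""
    var_x = sample_cov(xs, xs)
    var_y = sample_cov(ys, ys)
    cov_xy = sample_cov(xs, ys)
    return (var_y - cov_xy) / (var_x + var_y - 2.0 * cov_xy)


# Toy data (made up for illustration): uncorrelated returns where
# Y has four times the variance of X.
xs = [1.0, -1.0, 1.0, -1.0]
ys = [2.0, 2.0, -2.0, -2.0]
print(alpha_hat(xs, ys))
```

On this toy data the estimate is about 0.8, which matches intuition: with no correlation, most of the money goes into the less volatile asset.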

Example. Investing in two assets
Suppose we compute the estimate $\hat\alpha = 0.6$ using the samples $(x_1, y_1), \dots, (x_n, y_n)$.
- How sure can we be of this value?
- If we sampled another set of observations $(x_1, y_1), \dots, (x_n, y_n)$, would we get a wildly different $\hat\alpha$?
In this thought experiment, we know the actual joint distribution $P(X, Y)$, so we can resample the n observations to our heart's content.

Resampling the data from the true distribution
[Figure: datasets resampled from the true distribution.]

Computing the standard error of $\hat\alpha$
Suppose we can sample as many data as we want. For each resampling of the data,
$$(x_1^{(1)}, y_1^{(1)}), \dots, (x_n^{(1)}, y_n^{(1)})$$
$$(x_1^{(2)}, y_1^{(2)}), \dots, (x_n^{(2)}, y_n^{(2)})$$
$$\dots$$
we can compute a value of the estimate $\hat\alpha^{(1)}, \hat\alpha^{(2)}, \dots$. The standard deviation of these values approximates the Standard Error of $\hat\alpha$.
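The thought experiment can be simulated once a true distribution is fixed. The slides do not state one, so the sketch below assumes a bivariate normal with Var(X) = 1, Var(Y) = 1.25 and Cov(X, Y) = 0.5, for which the optimal fraction is 0.75 / 1.25 = 0.6. The sample size n = 100, the 1000 repetitions and all function names are likewise assumptions.

```python
import random
import statistics


def alpha_hat(xs, ys):
    """Plug-in estimate of the variance-minimizing fraction invested in X."""
    var_x = statistics.variance(xs)
    var_y = statistics.variance(ys)
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)
    return (var_y - cov_xy) / (var_x + var_y - 2.0 * cov_xy)


def draw_returns(n, rng):
    """Draw n pairs from the ASSUMED true distribution.

    X = Z1 and Y = 0.5 * Z1 + Z2 with independent standard normals,
    so Var(X) = 1, Var(Y) = 1.25, Cov(X, Y) = 0.5.
    """
    xs, ys = [], []
    for _ in range(n):
        z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        xs.append(z1)
        ys.append(0.5 * z1 + z2)
    return xs, ys


def alpha_estimates_from_truth(n, reps, rng):
    """One alpha-hat per fresh dataset drawn from the true distribution."""
    return [alpha_hat(*draw_returns(n, rng)) for _ in range(reps)]


rng = random.Random(0)
estimates = alpha_estimates_from_truth(n=100, reps=1000, rng=rng)
print("mean of estimates:", statistics.mean(estimates))
print("SE of alpha-hat:  ", statistics.stdev(estimates))
```

The standard deviation printed on the last line is the quantity the bootstrap will try to approximate without access to `draw_returns`.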

In reality, we only have one dataset of size n!
However, this dataset can be used to approximate the joint distribution $P$ of $X$ and $Y$ by forming the empirical distribution $\hat{P}(X, Y)$, which gives probability $\frac{1}{n}$ to each pair $(x_i, y_i)$.
The Bootstrap: Instead of sampling new datasets from the unknown distribution $P$, resample from the empirical distribution $\hat{P}$.
Equivalently, resample the data by drawing n samples with replacement from the actual observations.
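Drawing n observations with replacement is short to write down. The following is a minimal sketch using the standard library; `bootstrap_se` and `alpha_hat` are hypothetical names, B = 1000 replicates is an arbitrary choice, and the single "observed" dataset is simulated from an assumed distribution purely so the example runs.

```python
import random
import statistics


def alpha_hat(xs, ys):
    """Plug-in estimate of the variance-minimizing fraction invested in X."""
    var_x = statistics.variance(xs)
    var_y = statistics.variance(ys)
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)
    return (var_y - cov_xy) / (var_x + var_y - 2.0 * cov_xy)


def bootstrap_se(xs, ys, statistic, B, rng):
    """SD of `statistic` over B resamples of the pairs (x_i, y_i)."""
    n = len(xs)
    replicates = []
    for _ in range(B):
        # n indices drawn with replacement: pairs stay together.
        idx = rng.choices(range(n), k=n)
        replicates.append(statistic([xs[i] for i in idx], [ys[i] for i in idx]))
    return statistics.stdev(replicates)


# Stand-in for the one observed dataset (assumed distribution, n = 100).
rng = random.Random(0)
xs = [rng.gauss(0.0, 1.0) for _ in range(100)]
ys = [0.5 * x + rng.gauss(0.0, 1.0) for x in xs]

print("alpha-hat:   ", alpha_hat(xs, ys))
print("bootstrap SE:", bootstrap_se(xs, ys, alpha_hat, B=1000, rng=rng))
```

Resampling indices rather than the two lists separately matters: it keeps each $(x_i, y_i)$ pair intact, so the dependence between the two returns is preserved in every replicate.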

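A schematic of the Bootstrap

Original Data (Z):
Obs   X     Y
1     4.3   2.4
2     2.1   1.1
3     5.3   2.8

Bootstrap replicate $Z^{*1}$, giving $\hat\alpha^{*1}$:
Obs   X     Y
3     5.3   2.8
1     4.3   2.4
3     5.3   2.8

Bootstrap replicate $Z^{*2}$, giving $\hat\alpha^{*2}$:
Obs   X     Y
2     2.1   1.1
3     5.3   2.8
1     4.3   2.4

Bootstrap replicate $Z^{*B}$, giving $\hat\alpha^{*B}$:
Obs   X     Y
2     2.1   1.1
2     2.1   1.1
1     4.3   2.4

Each resampled dataset $Z^{*b}$ is called a bootstrap replicate.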

Comparing Bootstrap resamplings to resamplings from the true distribution
[Figure: distribution of $\hat\alpha$ (axis: $\alpha$, roughly 0.3 to 0.9) under resampling from the true distribution ("True") and under bootstrap resampling ("Bootstrap").]

Bootstrapping your favorite statistics
The bootstrap is broadly applicable and can be used to estimate the SE of a wide variety of statistics, including
- linear regression coefficients,
- model predictions $\hat{f}(x_0)$,
- principal component loadings,
- ...
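As one illustration of this generality, the same resampling loop can be pointed at a regression coefficient. The sketch below resamples (x, y) pairs and recomputes the least-squares slope each time; the function names, the simulated data (y = 2x + noise) and B = 500 are all assumptions made for the example, and resampling residuals would be a different, equally valid variant.

```python
import random
import statistics


def ols_slope(xs, ys):
    """Least-squares slope of y on x (simple linear regression)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


def bootstrap_se(xs, ys, statistic, B, rng):
    """SD of `statistic` over B resamples of the pairs (x_i, y_i)."""
    n = len(xs)
    replicates = []
    for _ in range(B):
        idx = rng.choices(range(n), k=n)  # draw with replacement
        replicates.append(statistic([xs[i] for i in idx], [ys[i] for i in idx]))
    return statistics.stdev(replicates)


# Simulated data (assumption): true slope 2, unit-variance normal noise.
rng = random.Random(0)
xs = [rng.uniform(0.0, 10.0) for _ in range(50)]
ys = [2.0 * x + rng.gauss(0.0, 1.0) for x in xs]

print("slope estimate:", ols_slope(xs, ys))
print("bootstrap SE:  ", bootstrap_se(xs, ys, ols_slope, B=500, rng=rng))
```

Only the `statistic` argument changed relative to the $\hat\alpha$ example, which is the practical appeal of the method: any function of the dataset can be plugged in.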