SELECTING THE NUMBER OF CHANGE-POINTS IN SEGMENTED LINE REGRESSION

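Hyune-Ju Kim (1), Binbing Yu (2), and Eric J. Feuer (3)

(1) Syracuse University, (2) National Institute of Aging, and (3) National Cancer Institute

Supplementary Material

This note contains proofs for Theorems 3.2.1 and 3.2.2, and this is the version revised in 2012.

Appendix A: Proof of Theorem 3.2.1

Lemma A.1. Suppose that conditions (A1) and (A2) in Assumption 3.2.1 are satisfied. Then, for $\alpha$ fixed and $j > i$, there exists $c = c_n = c_n(i, j; \alpha) = o(1)$ that asymptotically achieves the level $\alpha$.

Lemma A.2. Suppose that the assumptions in Lemma A.1 are satisfied and $c_n = o(1)$. Then, for $i < k$, $P(A_{i,k;\alpha} \mid \kappa = k)$ converges to zero as $n \to \infty$.

Lemma A.3. Suppose that the assumptions in Lemma A.1 are satisfied, $c_n = o(1)$, and $n c_n/(\ln n)^2 \to \infty$ as $n \to \infty$. Then, for $j > k$, $P(R_{k,j;\alpha} \mid \kappa = k)$ converges to zero as $n \to \infty$.

Proof of Theorem 3.2.1. First, note from (3.1) that

\[
\begin{aligned}
P(\hat\kappa < k \mid \kappa = k)
&= \sum_{j=0}^{k-1} P(\hat\kappa = j \mid \kappa = k) \\
&\le \sum_{k_0} d_{k_0}\, P(A_{k_0,k;\alpha} \mid \kappa = k) \\
&\le \sum_{k_0} d_{k_0} \max_{i=0,\dots,k-1} P(A_{i,k;\alpha} \mid \kappa = k) \\
&= g_1(k, M) \max_{i=0,\dots,k-1} P(A_{i,k;\alpha} \mid \kappa = k),
\end{aligned}
\]

where $g_1(k, M)$ is a positive function of $k$ and $M$. Lemma A.2 then provides the result that the under-fitting probability converges to zero. Since $P(\hat\kappa > k \mid \kappa = k) \le \alpha_0$ by the design of the permutation procedure, in general, we obtain that

\[
\lim_{n\to\infty} P(\hat\kappa = k \mid \kappa = k) \ge 1 - \alpha_0 .
\]

If $c = c_n = o(1)$ is chosen such that $n c_n/(\ln n)^2 \to \infty$, then we achieve the desired result by Lemma A.3.

Proof of Lemma A.1. Since, for $j > i\,(= \kappa)$, $0 < \hat\sigma_i^2 - \hat\sigma_j^2 = O_p((\ln n)^2/n)$ and $\hat\sigma_j^2$ converges to $\sigma_0^2$ in probability from Lemma 5.4 of Liu et al. (1997), where $\hat\sigma_i^2 = RSS(i)/n$ as in Liu et al., there exist $B_\alpha$ and $N_\alpha$ such that

\[
P\left( \frac{\hat\sigma_i^2 - \hat\sigma_j^2}{\hat\sigma_j^2} \ge B_\alpha \frac{(\ln n)^2}{n} \,\Big|\, \kappa = i \right) \le \alpha
\quad \text{for all } n > N_\alpha .
\]

Thus for $n > N_\alpha$, there exists $c = c_n \le B_\alpha (\ln n)^2/n$ such that

\[
\alpha = P\big(RSS(i) \ge (1 + c)\,RSS(j) \mid \kappa = i\big)
= P\left( \frac{\hat\sigma_i^2 - \hat\sigma_j^2}{\hat\sigma_j^2} \ge c \,\Big|\, \kappa = i \right).
\]

Proof of Lemma A.2. For $i < k$,

\[
\begin{aligned}
P(A_{i,k;\alpha} \mid \kappa = k)
&= P\big(\hat\sigma_i^2 < (1 + c_n)\hat\sigma_k^2 \mid \kappa = k\big) \\
&= P_k\big(\hat\sigma_i^2 > \sigma_0^2 + C,\ \hat\sigma_i^2 < (1 + c_n)\hat\sigma_k^2\big)
 + P_k\big(\hat\sigma_i^2 \le \sigma_0^2 + C,\ \hat\sigma_i^2 < (1 + c_n)\hat\sigma_k^2\big) \\
&= P_1 + P_2,
\end{aligned}
\]

where $C$ is a positive constant in Lemma 5.4 of Liu et al. (1997) for which $P_k(\hat\sigma_i^2 > \sigma_0^2 + C) \to 1$ as $n \to \infty$. Since $\hat\sigma_k^2 - \sigma_0^2 = o_p(1)$, $c_n = o(1)$ and $C > 0$, we get for $\kappa = k$,

\[
P_1 = P_k\big(\hat\sigma_i^2 > \sigma_0^2 + C,\ \hat\sigma_i^2 < (1 + c_n)\hat\sigma_k^2\big)
\le P_k\big(\hat\sigma_k^2 - \sigma_0^2 > C - c_n\hat\sigma_k^2\big),
\]

which converges to zero. Also,

\[
P_2 = P_k\big(\hat\sigma_i^2 \le \sigma_0^2 + C,\ \hat\sigma_i^2 < (1 + c_n)\hat\sigma_k^2\big)
\le P_k\big(\hat\sigma_i^2 \le \sigma_0^2 + C\big),
\]

and thus $P_2$ converges to zero by Lemma 5.4 of Liu et al.

Proof of Lemma A.3. Note that

\[
P(R_{k,j;\alpha} \mid \kappa = k)
= P\big(\hat\sigma_k^2 \ge (1 + c_n)\hat\sigma_j^2 \mid \kappa = k\big)
= P_k\left( \frac{\hat\sigma_k^2 - \hat\sigma_j^2}{\hat\sigma_j^2} \ge c_n \right).
\]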

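From Lemma 5.4 of Liu et al. (1997), for $j > k$, $0 < \hat\sigma_k^2 - \hat\sigma_j^2 = O_p((\ln n)^2/n)$ and $\hat\sigma_j^2 = \sigma_0^2 + o_p(1)$. If $c_n = o(1)$ is chosen such that $(\ln n)^2/(n c_n) \to 0$ as $n \to \infty$, then

\[
P_k\big(\hat\sigma_k^2 - \hat\sigma_j^2 \ge c_n \hat\sigma_j^2\big)
= P_k\left( \frac{n}{(\ln n)^2}\,\frac{\hat\sigma_k^2 - \hat\sigma_j^2}{\hat\sigma_j^2} \ge \frac{n c_n}{(\ln n)^2} \right)
\to 0 \quad \text{as } n \to \infty .
\]

Appendix B: Proof of Theorem 3.2.2

Note that in this revision, the conditions (C1) and (C2) in Assumption 3.2.2 are replaced by (A1) and (A2) of Assumption 3.2.1.

Lemma B.1. Suppose that conditions (C1), (C2) and (C3) in Assumption 3.2.2 are satisfied. Then the $\eta_i = \mu^T (I - H_i(\tau_k)) \mu$ satisfy the following:
(i) $\eta_i$ is a decreasing function of $i$.
(ii) $1/\eta = 1/\eta_{k-1} = O(\ln n / n)$.

Lemma B.2. Suppose that the assumptions in Lemma B.1 are satisfied. For $\alpha_0$ fixed and $j > i$, there exists $c = c_n = c_n(i, j; \alpha_0/M)$ that asymptotically achieves the level $\alpha_0/M$, where $M/\sqrt{\eta} \to 0$ as $n \to \infty$.

Lemma B.3. Suppose that the assumptions in Lemma B.1 are satisfied. For $i < k$, $H_k(\tau_k) - H_i(\tau_k)$ is idempotent.

Lemma B.4. Suppose that the assumptions in Lemma B.1 are satisfied. For $i < k$,

\[
P(A_{i,k;\alpha} \mid \kappa = k)
\le P\left( Z_{i,n} + \frac{y^T (B_1 + B_2 + B_3)\, y}{2\sigma_0\sqrt{\eta_i}} > \frac{\sqrt{\eta_i}}{2\sigma_0} \right),
\]

where $B_1 = H_k(\tau_k) - H_k(\hat\tau_k)$, $B_2 = c_n (I - H_k(\hat\tau_k))$, $B_3 = H_i(\hat\tau_i) - H_i(\tau_k)$, and

\[
Z_{i,n} = \frac{-2\mu^T (I - H_i(\tau_k))\, \epsilon}{2\sigma_0\sqrt{\eta_i}},
\]

for $\epsilon = y - E(y \mid x, \kappa = k)$.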
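As a quick check on the scaling of $Z_{i,n}$ in Lemma B.4, the following worked step assumes (this is not stated in the note itself) that, given $x$ and $\kappa = k$, the errors $\epsilon$ have mean zero and covariance $\sigma_0^2 I$, and that $H_i(\tau_k)$ is the usual least-squares projection matrix. The shorthand $P_i = I - H_i(\tau_k)$ is introduced here only for illustration; under the stated assumption it is symmetric and idempotent.

\[
E\big(2\mu^T P_i\,\epsilon\big) = 0,
\qquad
\mathrm{Var}\big(2\mu^T P_i\,\epsilon\big)
= 4\sigma_0^2\, \mu^T P_i P_i^T \mu
= 4\sigma_0^2\, \mu^T P_i\, \mu
= 4\sigma_0^2\, \eta_i .
\]

Under these assumptions the denominator $2\sigma_0\sqrt{\eta_i}$ is exactly the standard deviation of the numerator, so $Z_{i,n}$ has mean zero and unit variance whichever sign convention is used for the numerator.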

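Lemma B.5. Suppose that the assumptions in Lemma B.1 are satisfied. For $i < k$,

\[
V_{i,n} = \frac{y^T (B_1 + B_2 + B_3)\, y}{2\sigma_0\sqrt{\eta_i}} = O_p\big(\sqrt{\ln n}\big) + h_{i,n},
\]

where $\sqrt{n}\, c_n = O(1)$ and $h_{i,n} \le \gamma_{i,n}\sqrt{\eta_i}/(2\sigma_0)$ for $\gamma_{i,n}$ such that $0 < \lim_{n\to\infty} (1 - \gamma_{i,n}) \le 1$.

Proof of Theorem 3.2.2. We first show that $P(\hat\kappa < k \mid \kappa = k) \to 0$ as $n \to \infty$. Note that for $V_{i,n} = y^T (B_1 + B_2 + B_3)\, y/(2\sigma_0\sqrt{\eta_i})$ $(i < k)$,

\[
\begin{aligned}
P(A_{i,k;\alpha} \mid \kappa = k)
&\le P\big(Z_{i,n} + V_{i,n} - h_{i,n} \ge (1 - \gamma_{i,n})\sqrt{\eta_i}/(2\sigma_0)\big) \\
&\le P\big(e^{\tilde Z_{i,n} + \tilde V_{i,n}} \ge e^{\sqrt{\tilde\eta_i}/(2\sigma_0)}\big) \\
&\le E\big(e^{\tilde Z_{i,n} + \tilde V_{i,n}}\big) \big/ e^{\sqrt{\tilde\eta_i}/(2\sigma_0)},
\end{aligned}
\]

where $\tilde Z_{i,n} = Z_{i,n}/((1 - \gamma_{i,n}) \ln n)$, $\tilde V_{i,n} = (V_{i,n} - h_{i,n})/((1 - \gamma_{i,n}) \ln n)$, and $\sqrt{\tilde\eta_i} = \sqrt{\eta_i}/\ln n$, and the last inequality is obtained by Markov's inequality. Then,

\[
\begin{aligned}
P(\hat\kappa < k \mid \kappa = k)
&= \sum_{j=0}^{k-1} P(\hat\kappa = j \mid \kappa = k) \\
&\le \sum_{k_0} d_{k_0}\, P(A_{k_0,k;\alpha} \mid \kappa = k)
 \le \sum_{k_0} d_{k_0} \max_{i=0,\dots,k-1} P(A_{i,k;\alpha} \mid \kappa = k) \\
&\le g_2(k) \max_{j=0,\dots,k-1} \binom{M}{j} \max_{i=0,\dots,k-1} \frac{E\big(e^{\tilde Z_{i,n} + \tilde V_{i,n}}\big)}{e^{\sqrt{\tilde\eta_i}/(2\sigma_0)}} \\
&\le g_2(k)\, M^{k-1}\, \frac{\max_{i=0,\dots,k-1} E\big(e^{\tilde Z_{i,n} + \tilde V_{i,n}}\big)}{\min_{i=0,\dots,k-1} e^{\sqrt{\tilde\eta_i}/(2\sigma_0)}} \\
&= g_2(k)\, M^{k-1}\, e^{-\sqrt{\tilde\eta}/(2\sigma_0)} \max_{i=0,\dots,k-1} E\big(e^{\tilde Z_{i,n} + \tilde V_{i,n}}\big) \\
&\le g_2(k) \left( \frac{M}{\sqrt{\eta}} \right)^{k-1} \left( \frac{(\ln n)^2}{\sqrt{\eta}} \right)^{k-1} \max_{i=0,\dots,k-1} E\big(e^{\tilde Z_{i,n} + \tilde V_{i,n}}\big),
\end{aligned}
\]

where $\sqrt{\tilde\eta} = \sqrt{\eta}/\ln n$ and $g_2(k)$ is a positive function of $k$. Since $\tilde Z_{i,n} + \tilde V_{i,n} = o_p(1)$ and $(\ln n)^2/\sqrt{\eta} = o(1)$, the upper bound will converge to zero under a mild condition on $M$ such as the one described in Assumption 3.2.2 (C3). Then, by using $P(\hat\kappa > k \mid \kappa = k) \le \alpha_0$, we can show that

\[
\lim_{n\to\infty} P(\hat\kappa = k \mid \kappa = k) \ge 1 - \alpha_0 .
\]

Similarly as in Theorem 3.2.1, by choosing $c = c_n$ such that $\sqrt{n}\, c_n = O(1)$ and the corresponding $\alpha_0$ approaches zero, we can achieve the desired result.

Proof of Lemma B.1. Let $X_{i+1}(t) = (X_i(t) \;\; x_{i+1}(t))$, where $x_{i+1}(t) = ((x_1 - t_{i+1})^+, \dots, (x_n - t_{i+1})^+)^T$. Note that $\eta_i = \mu^T (I - H_i(\tau_k)) \mu$ is a decreasing function of $i$, which can be proved by showing that

\[
(I - H_i(t)) - (I - H_{i+1}(t))
= (I - H_i(t)) \left[ \frac{x_{i+1}(t)\, x_{i+1}^T(t)}{a^{22}_{i+1}} \right] (I - H_i(t)) > 0,
\]

where $a^{22}_{i+1} = x_{i+1}^T(t) (I - H_i(t))\, x_{i+1}(t)$. Thus, for $X_{k-1} = X_{k-1}(\tau_k)$, $x_k = x_k(\tau_k)$, $\mu = \mu(\tau_k)$ and $H_i = H_i(\tau_k)$,

\[
\begin{aligned}
\eta = \min_{i<k} \eta_i = \eta_{k-1}
&= \mu^T (I - H_{k-1}) \mu \\
&= \mu^T \left[ (I - H_k) + (I - H_{k-1}) \frac{x_k x_k^T}{a^{22}_k} (I - H_{k-1}) \right] \mu \\
&= \beta^T (X_{k-1} \;\; x_k)^T (I - H_{k-1}) \left[ \frac{x_k x_k^T}{a^{22}_k} \right] (I - H_{k-1}) (X_{k-1} \;\; x_k)\, \beta \\
&= \delta_k\, a^{22}_k\, \delta_k = \delta_k^2 \left[ x_k^T (I - H_{k-1})\, x_k \right] \\
&= \delta_k^2 \sum_{j=l_k+1}^{n} \sum_{m=l_k+1}^{n} (x_j - \tau_k)\, b_{mj}\, (x_m - \tau_k),
\end{aligned}
\]

where $(x_{l_k+1}, \dots, x_n)$ are the observations in $(\tau_k, 1]$ and $I - H_{k-1} = (b_{mj})$. Under (C1), it can be shown that for large $n$, $\eta \ge D_1\, n/\ln n$, where $D_1$ is a positive constant.

Proofs of Lemma B.2 and Lemma B.3. The proof of Lemma B.3, which is based on lengthy and straightforward matrix algebra, is omitted, and the proof of Lemma B.2 is sketched below.

Suppose that for some $a_n > 0$ such that $a_n \to \infty$ as $n \to \infty$, $Z_n = a_n (\hat\sigma_i^2 - \hat\sigma_j^2)/\hat\sigma_j^2$, under the null hypothesis of $\kappa = i$, converges in distribution to $Z$ with a cumulative distribution function $F(\cdot)$ and the probability density function $f(\cdot)$. We then see that for $j > i$,

\[
\frac{\alpha_0}{M} = P\big(RSS(i) \ge (1 + c_n)\, RSS(j) \mid \kappa = i\big) = P\big(Z_n \ge \tilde c_n\big) \approx 1 - F(\tilde c_n),
\]

where $\tilde c_n = a_n c_n$. Since $\frac{d}{dn}\frac{1}{M}$ is proportional to $f(\tilde c_n)\, \frac{d\tilde c_n}{dn}$ and $\frac{d g_n}{dn}$ is proportional to $\sqrt{\ln n}/(n\sqrt{n})$, where $1/\sqrt{\eta} \le \sqrt{\ln n/(D_1 n)} = g_n$, a slowly increasing function of $n$, $\tilde c_n$, such that

\[
\frac{\sqrt{\ln n}}{n\sqrt{n}} \Big/ \left( f(\tilde c_n)\, \frac{d\tilde c_n}{dn} \right) \to 0 \quad \text{as } n \to \infty
\]

satisfies the condition of $M = M_n$ such that $M/\sqrt{\eta} \to 0$ as $n \to \infty$. Using that $Z_n/a_n = O_p(M (\ln n)^2/n)$, it can also be shown that for appropriately chosen $\tilde c_n$, $\sqrt{n}\, c_n = O(1)$ since $c_n = \tilde c_n/a_n$ where $\tilde c_n$ is slowly increasing and $a_n$ increases at least as fast as $n/\{M (\ln n)^2\}$ does as $n \to \infty$. For example, if $f$ is a chi-square density with finite degrees of freedom, then $c_n$ such that $\tilde c_n = a_n c_n = D_2 \ln n$ for $0 < D_2 < 1$ can be used.

Proof of Lemma B.4.

\[
\begin{aligned}
P(A_{i,k;\alpha} \mid \kappa = k)
&= P_k\big[\, y^T (I - H_i(\hat\tau_i))\, y < (1 + c_n)\, y^T (I - H_k(\hat\tau_k))\, y \,\big] \\
&= P_k\big[\, y^T (I - H_i(\tau_k))\, y + y^T (H_i(\tau_k) - H_i(\hat\tau_i))\, y < (1 + c_n) \{ y^T (I - H_k(\hat\tau_k))\, y \} \,\big].
\end{aligned}
\]

Noting that $y = \mu + \epsilon$ when $\kappa = k$ and $(I - H_k(\tau_k)) \mu = 0$, the right-hand side is equivalent to

\[
\begin{aligned}
P_k\big[\, 2\mu^T (I - H_i(\tau_k))\, \epsilon <{}& -\mu^T (I - H_i(\tau_k))\, \mu - \epsilon^T (H_k(\tau_k) - H_i(\tau_k))\, \epsilon + y^T (H_k(\tau_k) - H_k(\hat\tau_k))\, y \\
& + c_n\, y^T (I - H_k(\hat\tau_k))\, y + y^T (H_i(\hat\tau_i) - H_i(\tau_k))\, y \,\big].
\end{aligned}
\]

Since $\epsilon^T (H_k(\tau_k) - H_i(\tau_k))\, \epsilon > 0$ by Lemma B.3,

\[
\begin{aligned}
P(A_{i,k;\alpha} \mid \kappa = k)
&\le P\big( -2\mu^T (I - H_i(\tau_k))\, \epsilon + y^T (B_1 + B_2 + B_3)\, y > \mu^T (I - H_i(\tau_k))\, \mu \big) \\
&= P\left( Z_{i,n} + \frac{y^T (B_1 + B_2 + B_3)\, y}{2\sigma_0\sqrt{\eta_i}} > \frac{\sqrt{\eta_i}}{2\sigma_0} \right).
\end{aligned}
\]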
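The projection facts used above (the rank-one update in the proof of Lemma B.1, the idempotency claimed in Lemma B.3, and the monotonicity of $\eta_i$) can be checked numerically. The sketch below is only an illustration under assumptions that are not taken from the note: the change-points are fixed and known, the model with $i$ change-points uses the first $i$ of them (so the designs are nested), and the design points, change-points and coefficients are arbitrary made-up values. The helper names `design`, `hat` and `eta_sequence` are hypothetical.

```python
import numpy as np

def design(x, taus):
    """Columns 1, x, (x - tau_1)^+, ..., (x - tau_i)^+ for fixed change-points."""
    cols = [np.ones_like(x), x] + [np.clip(x - t, 0.0, None) for t in taus]
    return np.column_stack(cols)

def hat(X):
    """Orthogonal projector onto the column space of X, via a thin QR."""
    Q, _ = np.linalg.qr(X)
    return Q @ Q.T

def eta_sequence(x, taus, beta):
    """eta_i = mu^T (I - H_i) mu for i = 0, ..., k, with mu from the full model."""
    mu = design(x, taus) @ beta
    I = np.eye(len(x))
    return [float(mu @ (I - hat(design(x, taus[:i]))) @ mu)
            for i in range(len(taus) + 1)]

# Illustrative (made-up) inputs: n = 200 equally spaced points, k = 3.
x = np.linspace(0.0, 1.0, 200)
taus = [0.25, 0.5, 0.75]
beta = np.array([1.0, 2.0, -3.0, 4.0, -2.0])

etas = eta_sequence(x, taus, beta)

# Lemma B.3 analogue: H_k - H_i is idempotent when the designs are nested.
H1, H2, H3 = (hat(design(x, taus[:i])) for i in (1, 2, 3))
D = H3 - H1
idempotent = np.allclose(D @ D, D, atol=1e-8)

# Rank-one update: (I - H_2) - (I - H_3) = R x3 x3^T R / (x3^T R x3), R = I - H_2.
R2 = np.eye(len(x)) - H2
x3 = np.clip(x - taus[2], 0.0, None)
a22 = float(x3 @ R2 @ x3)
update_ok = np.allclose(H3 - H2, np.outer(R2 @ x3, R2 @ x3) / a22, atol=1e-8)

print(etas, idempotent, update_ok)
```

The projector is built from a QR factorization rather than an explicit inverse of $X^T X$ purely for numerical stability; with estimated change-points, as in $H_k(\hat\tau_k)$, the designs are no longer nested and these identities need not hold exactly.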

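Proof of Lemma B.5.

(i)
\[
\frac{y^T B_1\, y}{2\sigma_0\sqrt{\eta_i}} = \frac{y^T (H_k(\tau_k) - H_k(\hat\tau_k))\, y}{2\sigma_0\sqrt{\eta_i}} = O_p\big(\sqrt{\ln n}\big).
\]
This can be obtained by using $\hat\sigma_k^2 - \sigma_0^2 = O_p(1/\sqrt{n})$ and $1/\sqrt{\eta_i} \le 1/\sqrt{\eta} = O(\sqrt{\ln n/n})$.

(ii)
\[
\frac{y^T B_2\, y}{2\sigma_0\sqrt{\eta_i}} = \frac{c_n\, y^T (I - H_k(\hat\tau_k))\, y}{2\sigma_0\sqrt{\eta_i}} = O_p\big(\sqrt{\ln n}\big)
\]
for a choice of $c = c_n$ such that $\sqrt{n}\, c_n = O(1)$. This can be shown because $\sqrt{n/\eta_i} = O(\sqrt{\ln n})$ and $\hat\sigma_k^2$ is a consistent estimator of $\sigma_0^2$.

(iii)
\[
\begin{aligned}
\frac{y^T B_3\, y}{2\sigma_0\sqrt{\eta_i}}
&= \frac{y^T (I - H_i(\tau_k))\, y}{2\sigma_0\sqrt{\eta_i}} - \frac{y^T (I - H_i(\hat\tau_i))\, y}{2\sigma_0\sqrt{\eta_i}} \\
&= \frac{\sigma_0 (Z_{1,n} - Z_{2,n})}{\sqrt{2\eta_i/n}} + \frac{E_k[Q_1] - E_k[Q_2]}{2\sqrt{\eta_i}/\sigma_0},
\end{aligned}
\]
where $Q_1 = y^T (I - H_i(\tau_k))\, y/\sigma_0^2$, $Q_2 = y^T (I - H_i(\hat\tau_i))\, y/\sigma_0^2$, $Z_{1,n} = (Q_1 - E_k[Q_1])/\sqrt{2n}$, and $Z_{2,n} = (Q_2 - E_k[Q_2])/\sqrt{2n}$. Matrix algebra shows that

\[
\frac{E_k[Q_1] - E_k[Q_2]}{2\sqrt{\eta_i}/\sigma_0} = h_{i,n} + O\big(\sqrt{\ln n}\big),
\]

where $h_{i,n} \le \gamma_{i,n}\sqrt{\eta_i}/(2\sigma_0)$ for $\gamma_{i,n}$ such that $0 < \lim_{n\to\infty} (1 - \gamma_{i,n}) \le 1$. Since $Z_{1,n} - Z_{2,n} = O_p(1)$ and $\sqrt{n/\eta_i} = O(\sqrt{\ln n})$,

\[
\frac{y^T B_3\, y}{2\sigma_0\sqrt{\eta_i}} = O_p\big(\sqrt{\ln n}\big) + h_{i,n}.
\]

Combining (i), (ii) and (iii), we obtain that $V_{i,n} = O_p(\sqrt{\ln n}) + h_{i,n}$.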
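The chi-square example in the proof sketch of Lemma B.2 can be made concrete with a small rate calculation. The sketch below makes several assumptions that are not in the note: the limiting law is taken to be chi-square with 2 degrees of freedom (chosen only because its survival function $e^{-x/2}$ is available in closed form), $M$ is treated as a real-valued rate rather than an integer, $\eta$ is replaced by its lower bound $D_1 n/\ln n$ with $D_1 = 1$, and the function names are hypothetical. With $\tilde c_n = D_2 \ln n$ this gives $M = \alpha_0 n^{D_2/2}$, so the bound on $M/\sqrt{\eta}$ shrinks when $D_2 < 1$ and grows when $D_2 > 1$.

```python
import math

def implied_M(n, alpha0, D2):
    """M solving alpha0 / M = 1 - F(D2 * ln n) when F is the chi-square(2) CDF.

    The chi-square(2) survival function is exp(-x / 2), so M = alpha0 * n**(D2 / 2).
    """
    c_tilde = D2 * math.log(n)
    return alpha0 / math.exp(-c_tilde / 2.0)

def ratio_bound(n, alpha0, D2, D1=1.0):
    """Upper bound on M / sqrt(eta), using the lower bound eta >= D1 * n / ln n."""
    eta_lower = D1 * n / math.log(n)
    return implied_M(n, alpha0, D2) / math.sqrt(eta_lower)

# Compare a choice inside the range 0 < D2 < 1 with one outside it.
for n in (10**2, 10**4, 10**6, 10**8):
    print(n, ratio_bound(n, 0.05, 0.5), ratio_bound(n, 0.05, 1.5))
```

This only illustrates the growth rates involved; it does not reproduce the permutation procedure or the actual limiting distribution of $Z_n$.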