SELECTING THE NUMBER OF CHANGE-POINTS IN SEGMENTED LINE REGRESSION

Hyune-Ju Kim$^1$, Binbing Yu$^2$, and Eric J. Feuer$^3$

$^1$Syracuse University, $^2$National Institute of Aging, and $^3$National Cancer Institute

Supplementary Material

This note contains proofs for Theorems 3.2.1 and 3.2.2, and this is the version revised in 2012.

Appendix A: Proof of Theorem 3.2.1

Lemma A.1. Suppose that conditions (A1) and (A2) in Assumption 3.2.1 are satisfied. Then, for $\alpha$ fixed and $j > i$, there exists $c = c_n = c_n(i, j; \alpha) = o(1)$ that asymptotically achieves the level $\alpha$.

Lemma A.2. Suppose that the assumptions in Lemma A.1 are satisfied and $c_n = o(1)$. Then, for $i < k$, $P(A_{i,k;\alpha} \mid \kappa = k)$ converges to zero as $n \to \infty$.

Lemma A.3. Suppose that the assumptions in Lemma A.1 are satisfied, $c_n = o(1)$, and $n c_n/(\ln n)^2 \to \infty$ as $n \to \infty$. Then, for $j > k$, $P(R_{k,j;\alpha} \mid \kappa = k)$ converges to zero as $n \to \infty$.

Proof of Theorem 3.2.1. First, note from (3.1) that
\[
\begin{aligned}
P(\hat\kappa < k \mid \kappa = k)
&= \sum_{j=0}^{k-1} P(\hat\kappa = j \mid \kappa = k)
 \le \sum_{k_0} d_{k_0}\, P(A_{k_0,k;\alpha} \mid \kappa = k) \\
&\le \sum_{k_0} d_{k_0} \max_{i=0,\ldots,k-1} P(A_{i,k;\alpha} \mid \kappa = k)
 = g_1(k, M) \max_{i=0,\ldots,k-1} P(A_{i,k;\alpha} \mid \kappa = k),
\end{aligned}
\]
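where $g_1(k, M)$ is a positive function of $k$ and $M$. Lemma A.2 then provides the result that the under-fitting probability converges to zero. Since $P(\hat\kappa > k \mid \kappa = k) \le \alpha_0$ by the design of the permutation procedure, in general, we obtain that
\[
\lim_{n\to\infty} P(\hat\kappa = k \mid \kappa = k) \ge 1 - \alpha_0 .
\]
If $c = c_n = o(1)$ is chosen such that $n c_n/(\ln n)^2 \to \infty$, then we achieve the desired result by Lemma A.3.

Proof of Lemma A.1. Since, for $j > i\,(= \kappa)$, $0 < \hat\sigma_i^2 - \hat\sigma_j^2 = O_p((\ln n)^2/n)$ and $\hat\sigma_j^2$ converges to $\sigma_0^2$ in probability from Lemma 5.4 of Liu et al. (1997), where $\hat\sigma_i^2 = RSS(i)/n$ as in Liu et al., there exist $B_\alpha$ and $N_\alpha$ such that
\[
P\left( \frac{\hat\sigma_i^2 - \hat\sigma_j^2}{\hat\sigma_j^2} \ge B_\alpha \frac{(\ln n)^2}{n} \;\Big|\; \kappa = i \right) \le \alpha
\quad \text{for all } n > N_\alpha .
\]
Thus for $n > N_\alpha$, there exists $c = c_n \le B_\alpha (\ln n)^2/n$ such that
\[
\alpha = P\big(RSS(i) \ge (1 + c)\,RSS(j) \mid \kappa = i\big)
= P\left( \frac{\hat\sigma_i^2 - \hat\sigma_j^2}{\hat\sigma_j^2} \ge c \;\Big|\; \kappa = i \right).
\]

Proof of Lemma A.2. For $i < k$,
\[
\begin{aligned}
P(A_{i,k;\alpha} \mid \kappa = k)
&= P\big(\hat\sigma_i^2 < (1 + c_n)\hat\sigma_k^2 \mid \kappa = k\big) \\
&= P_k\big(\hat\sigma_i^2 > \sigma_0^2 + C,\ \hat\sigma_i^2 < (1 + c_n)\hat\sigma_k^2\big)
 + P_k\big(\hat\sigma_i^2 \le \sigma_0^2 + C,\ \hat\sigma_i^2 < (1 + c_n)\hat\sigma_k^2\big) \\
&= P_1 + P_2,
\end{aligned}
\]
where $C$ is a positive constant in Lemma 5.4 of Liu et al. (1997) for which $P_k(\hat\sigma_i^2 > \sigma_0^2 + C) \to 1$ as $n \to \infty$. Since $\hat\sigma_k^2 - \sigma_0^2 = o_p(1)$, $c_n = o(1)$ and $C > 0$, we get for $\kappa = k$,
\[
P_1 = P_k\big(\hat\sigma_i^2 > \sigma_0^2 + C,\ \hat\sigma_i^2 < (1 + c_n)\hat\sigma_k^2\big)
\le P_k\big(\hat\sigma_k^2 - \sigma_0^2 > C - c_n\hat\sigma_k^2\big),
\]
which converges to zero. Also,
\[
P_2 = P_k\big(\hat\sigma_i^2 \le \sigma_0^2 + C,\ \hat\sigma_i^2 < (1 + c_n)\hat\sigma_k^2\big)
\le P_k\big(\hat\sigma_i^2 \le \sigma_0^2 + C\big),
\]
and thus $P_2$ converges to zero by Lemma 5.4 of Liu et al.

Proof of Lemma A.3. Note that
\[
P(R_{k,j;\alpha} \mid \kappa = k)
= P\big(\hat\sigma_k^2 \ge (1 + c_n)\hat\sigma_j^2 \mid \kappa = k\big)
= P_k\left( \frac{\hat\sigma_k^2 - \hat\sigma_j^2}{\hat\sigma_j^2} \ge c_n \right).
\]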
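From Lemma 5.4 of Liu et al. (1997), for $j > k$, $0 < \hat\sigma_k^2 - \hat\sigma_j^2 = O_p((\ln n)^2/n)$ and $\hat\sigma_j^2 = \sigma_0^2 + o_p(1)$. If $c_n = o(1)$ is chosen such that $n c_n/(\ln n)^2 \to \infty$, then
\[
P_k\left( \frac{\hat\sigma_k^2 - \hat\sigma_j^2}{\hat\sigma_j^2} \ge c_n \right)
= P_k\left( \frac{n}{(\ln n)^2} \cdot \frac{\hat\sigma_k^2 - \hat\sigma_j^2}{\hat\sigma_j^2} \ge \frac{n c_n}{(\ln n)^2} \right) \to 0
\quad \text{as } n \to \infty .
\]

Appendix B: Proof of Theorem 3.2.2

Note that in this revision, the conditions (C1) and (C2) in Assumption 3.2.2 are replaced by (A1) and (A2) of Assumption 3.2.1.

Lemma B.1. Suppose that conditions (C1), (C2) and (C3) in Assumption 3.2.2 are satisfied. Then the $\eta_i = \mu^T(I - H_i(\tau_k))\mu$ satisfy the following:
(i) $\eta_i$ is a decreasing function of $i$.
(ii) $1/\eta_n = 1/\eta_{k-1} = O(\ln n / n)$.

Lemma B.2. Suppose that the assumptions in Lemma B.1 are satisfied. For $\alpha_0$ fixed and $j > i$, there exists $c = c_n = c_n(i, j; \alpha_0/M)$ that asymptotically achieves the level $\alpha_0/M$, where $M/\sqrt{\eta_n} \to 0$ as $n \to \infty$.

Lemma B.3. Suppose that the assumptions in Lemma B.1 are satisfied. For $i < k$, $H_k(\tau_k) - H_i(\tau_k)$ is idempotent.

Lemma B.4. Suppose that the assumptions in Lemma B.1 are satisfied. For $i < k$,
\[
P(A_{i,k;\alpha} \mid \kappa = k)
\le P\left( Z_{i,n} + \frac{y^T(B_1 + B_2 + B_3)y}{2\sigma_0\sqrt{\eta_i}} > \frac{\sqrt{\eta_i}}{2\sigma_0} \right),
\]
where $B_1 = H_k(\tau_k) - H_k(\hat\tau_k)$, $B_2 = c_n(I - H_k(\hat\tau_k))$, $B_3 = H_i(\hat\tau_i) - H_i(\tau_k)$, and
\[
Z_{i,n} = \frac{-2\mu^T(I - H_i(\tau_k))\epsilon}{2\sigma_0\sqrt{\eta_i}},
\]
for $\epsilon = y - E(y \mid x, \kappa = k)$.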
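As a quick check on the scaling of $Z_{i,n}$ in Lemma B.4, the following sketch computes its variance. It assumes (this is not stated in this note) that the errors $\epsilon$ have mean zero and covariance $\sigma_0^2 I$ and that the design is treated as fixed; it uses only the fact that $I - H_i(\tau_k)$ is a symmetric idempotent projection matrix. Under those assumptions, dividing by $2\sigma_0\sqrt{\eta_i}$ standardizes $Z_{i,n}$ to mean zero and unit variance.

```latex
\begin{align*}
E\{-2\mu^T(I - H_i(\tau_k))\epsilon\} &= 0, \\
\operatorname{Var}\{-2\mu^T(I - H_i(\tau_k))\epsilon\}
  &= 4\,\mu^T(I - H_i(\tau_k))\,\operatorname{Cov}(\epsilon)\,(I - H_i(\tau_k))\,\mu \\
  &= 4\sigma_0^2\,\mu^T(I - H_i(\tau_k))\mu
   = 4\sigma_0^2\,\eta_i, \\
\operatorname{Var}(Z_{i,n})
  &= \frac{4\sigma_0^2\,\eta_i}{(2\sigma_0\sqrt{\eta_i})^2} = 1 .
\end{align*}
```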
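Lemma B.5. Suppose that the assumptions in Lemma B.1 are satisfied. For $i < k$,
\[
V_{i,n} = \frac{y^T(B_1 + B_2 + B_3)y}{2\sigma_0\sqrt{\eta_i}} = O_p(\sqrt{\ln n}) + h_{i,n},
\]
where $\sqrt{n}\,c_n = O(1)$ and $h_{i,n} \le \gamma_{i,n}\sqrt{\eta_i}/(2\sigma_0)$ for $\gamma_{i,n}$ such that $0 < \lim_{n\to\infty}(1 - \gamma_{i,n}) \le 1$.

Proof of Theorem 3.2.2. We first show that $P(\hat\kappa < k \mid \kappa = k) \to 0$ as $n \to \infty$. Note that for $V_{i,n} = y^T(B_1 + B_2 + B_3)y/(2\sigma_0\sqrt{\eta_i})$ $(i < k)$,
\[
\begin{aligned}
P(A_{i,k;\alpha} \mid \kappa = k)
&\le P\big( Z_{i,n} + V_{i,n} - h_{i,n} \ge (1 - \gamma_{i,n})\sqrt{\eta_i}/(2\sigma_0) \big) \\
&\le P\big( e^{\tilde Z_{i,n} + \tilde V_{i,n}} \ge e^{\sqrt{\tilde\eta_i}/(2\sigma_0)} \big)
 \le E\big(e^{\tilde Z_{i,n} + \tilde V_{i,n}}\big) \big/ e^{\sqrt{\tilde\eta_i}/(2\sigma_0)},
\end{aligned}
\]
where $\tilde Z_{i,n} = Z_{i,n}/((1 - \gamma_{i,n})\sqrt{\ln n})$, $\tilde V_{i,n} = (V_{i,n} - h_{i,n})/((1 - \gamma_{i,n})\sqrt{\ln n})$, and $\tilde\eta_i = \eta_i/\ln n$, and the last inequality is obtained by Markov's inequality. Then, writing $\tilde\eta_n = \min_{i<k}\tilde\eta_i = \eta_n/\ln n$,
\[
\begin{aligned}
P(\hat\kappa < k \mid \kappa = k)
&= \sum_{j=0}^{k-1} P(\hat\kappa = j \mid \kappa = k)
 \le \sum_{k_0} d_{k_0}\, P(A_{k_0,k;\alpha} \mid \kappa = k)
 \le \sum_{k_0} d_{k_0} \max_{i=0,\ldots,k-1} P(A_{i,k;\alpha} \mid \kappa = k) \\
&\le g_2(k) \max_{j=0,\ldots,k-1} \binom{M}{j} \max_{i=0,\ldots,k-1}
   \frac{E\big(e^{\tilde Z_{i,n} + \tilde V_{i,n}}\big)}{e^{\sqrt{\tilde\eta_i}/(2\sigma_0)}} \\
&\le g_2(k)\, M^{k-1}\,
   \frac{\max_{i=0,\ldots,k-1} E\big(e^{\tilde Z_{i,n} + \tilde V_{i,n}}\big)}{\min_{i=0,\ldots,k-1} e^{\sqrt{\tilde\eta_i}/(2\sigma_0)}}
 = g_2(k)\, M^{k-1}\, e^{-\sqrt{\tilde\eta_n}/(2\sigma_0)} \max_{i=0,\ldots,k-1} E\big(e^{\tilde Z_{i,n} + \tilde V_{i,n}}\big) \\
&\le g_2(k) \left( \frac{M}{\sqrt{\eta_n}} \right)^{k-1}
   \left( \sqrt{\frac{(\ln n)^2}{\eta_n}} \right)^{k-1}
   \max_{i=0,\ldots,k-1} E\big(e^{\tilde Z_{i,n} + \tilde V_{i,n}}\big),
\end{aligned}
\]
where $g_2(k)$ is a positive function of $k$. Since $\tilde Z_{i,n} + \tilde V_{i,n} = o_p(1)$ and $\sqrt{(\ln n)^2/\eta_n} = o(1)$, the upper bound will converge to zero under a mild condition on $M$ such as the one described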
in Assumption 3.2.2 (C3). Then, by using $P(\hat\kappa > k \mid \kappa = k) \le \alpha_0$, we can show that
\[
\lim_{n\to\infty} P(\hat\kappa = k \mid \kappa = k) \ge 1 - \alpha_0 .
\]
As in Theorem 3.2.1, by choosing $c = c_n$ such that $\sqrt{n}\,c_n = O(1)$ and the corresponding $\alpha_0$ approaches zero, we can achieve the desired result.

Proof of Lemma B.1. Let $X_{i+1}(t) = (X_i(t) \mid x_{i+1}(t))$, where $x_{i+1}(t) = ((x_1 - t_{i+1})^+, \ldots, (x_n - t_{i+1})^+)^T$. Note that $\eta_i = \mu^T(I - H_i(\tau_k))\mu$ is a decreasing function of $i$, which can be proved by showing that
\[
(I - H_i(t)) - (I - H_{i+1}(t))
= (I - H_i(t)) \left[ \frac{x_{i+1}(t)\,x_{i+1}^T(t)}{a^{22}_{i+1}} \right] (I - H_i(t)) > 0,
\]
where $a^{22}_{i+1} = x_{i+1}^T(t)(I - H_i(t))x_{i+1}(t)$.

Thus, for $X_{k-1} = X_{k-1}(\tau_k)$, $x_k = x_k(\tau_k)$, $\mu = \mu(\tau_k)$ and $H_i = H_i(\tau_k)$,
\[
\begin{aligned}
\eta_n = \min_{i<k} \eta_i = \eta_{k-1}
&= \mu^T(I - H_{k-1})\mu \\
&= \mu^T \left[ (I - H_k) + (I - H_{k-1}) \frac{x_k x_k^T}{a^{22}_k} (I - H_{k-1}) \right] \mu \\
&= \beta^T (X_{k-1} \mid x_k)^T (I - H_{k-1}) \left[ \frac{x_k x_k^T}{a^{22}_k} \right] (I - H_{k-1}) (X_{k-1} \mid x_k)\,\beta \\
&= \delta_k\, a^{22}_k\, \delta_k
 = \delta_k^2 \left[ x_k^T (I - H_{k-1}) x_k \right] \\
&= \delta_k^2 \sum_{j=l_k+1}^{n} \sum_{m=l_k+1}^{n} (x_j - \tau_k)\, b_{mj}\, (x_m - \tau_k),
\end{aligned}
\]
where $(x_{l_k+1}, \ldots, x_n)$ are the observations in $(\tau_k, 1]$ and $I - H_{k-1} = (b_{mj})$. Under (C1), it can be shown that for large $n$, $\eta_n \ge D_1\, n/\ln n$, where $D_1$ is a positive constant.

Proofs of Lemma B.2 and Lemma B.3. The proof of Lemma B.3, which is based on lengthy but straightforward matrix algebra, is omitted, and the proof of Lemma B.2 is sketched below.

Suppose that for some $a_n > 0$ such that $a_n \to \infty$ as $n \to \infty$, $Z_n = a_n(\hat\sigma_i^2 - \hat\sigma_j^2)/\hat\sigma_j^2$, under the null hypothesis of $\kappa = i$, converges in distribution to $Z$ with a cumulative distribution
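function $F(\cdot)$ and the probability density function $f(\cdot)$. We then see that for $j > i$,
\[
\frac{\alpha_0}{M} = P\big(RSS(i) \ge (1 + c_n)RSS(j) \mid \kappa = i\big) = P(Z_n \ge \tilde c_n) \approx 1 - F(\tilde c_n),
\]
where $\tilde c_n = a_n c_n$. Since $\frac{d}{dn}\frac{1}{M}$ is proportional to $f(\tilde c_n)\frac{d\tilde c_n}{dn}$, and $1/\sqrt{\eta_n} \le g_n/\sqrt{D_1}$ with $g_n = \sqrt{\ln n / n}$, a slowly increasing function of $n$, $\tilde c_n$, such that $\frac{dg_n}{dn} \big/ \big\{ f(\tilde c_n)\frac{d\tilde c_n}{dn} \big\} \to 0$ as $n \to \infty$ satisfies the condition of $M = M_n$ such that $M/\sqrt{\eta_n} \to 0$ as $n \to \infty$. Using that $Z_n/a_n = O_p\big(M(\ln n)^2/n\big)$, it can also be shown that for appropriately chosen $\tilde c_n$, $\sqrt{n}\,c_n = O(1)$ since $c_n = \tilde c_n/a_n$, where $\tilde c_n$ is slowly increasing and $a_n \to \infty$ at least as fast as $n/\{M(\ln n)^2\}$ does as $n \to \infty$. For example, if $f$ is a chi-square density with finite degrees of freedom, then $c_n$ such that $\tilde c_n = a_n c_n = D_2 \ln n$ for $0 < D_2 < 1$ can be used.

Proof of Lemma B.4.
\[
\begin{aligned}
P(A_{i,k;\alpha} \mid \kappa = k)
&= P_k\big[\, y^T(I - H_i(\hat\tau_i))y < (1 + c_n)\, y^T(I - H_k(\hat\tau_k))y \,\big] \\
&= P_k\big[\, y^T(I - H_i(\tau_k))y + y^T(H_i(\tau_k) - H_i(\hat\tau_i))y < (1 + c_n)\{ y^T(I - H_k(\hat\tau_k))y \} \,\big].
\end{aligned}
\]
Noting that $y = \mu + \epsilon$ when $\kappa = k$ and $(I - H_k(\tau_k))\mu = 0$, the right hand side is equivalent to
\[
\begin{aligned}
P_k\big[\, 2\mu^T(I - H_i(\tau_k))\epsilon <{}& -\mu^T(I - H_i(\tau_k))\mu - \epsilon^T(H_k(\tau_k) - H_i(\tau_k))\epsilon
 + y^T(H_k(\tau_k) - H_k(\hat\tau_k))y \\
& + c_n\, y^T(I - H_k(\hat\tau_k))y + y^T(H_i(\hat\tau_i) - H_i(\tau_k))y \,\big].
\end{aligned}
\]
Since $\epsilon^T(H_k(\tau_k) - H_i(\tau_k))\epsilon > 0$ by Lemma B.3,
\[
\begin{aligned}
P(A_{i,k;\alpha} \mid \kappa = k)
&\le P\big( -2\mu^T(I - H_i(\tau_k))\epsilon + y^T(B_1 + B_2 + B_3)y > \mu^T(I - H_i(\tau_k))\mu \big) \\
&= P\left( Z_{i,n} + \frac{y^T(B_1 + B_2 + B_3)y}{2\sigma_0\sqrt{\eta_i}} > \frac{\sqrt{\eta_i}}{2\sigma_0} \right).
\end{aligned}
\]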
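The substitution step in the proof of Lemma B.4 can be written out term by term. The sketch below uses only $y = \mu + \epsilon$, $(I - H_k(\tau_k))\mu = 0$, and the symmetry of the projection matrices. The shorthand $P_i = I - H_i(\tau_k)$ and $P_k = I - H_k(\tau_k)$ is introduced here for brevity and is not notation from the paper.

```latex
\begin{align*}
y^T P_i\, y &= \mu^T P_i\,\mu + 2\mu^T P_i\,\epsilon + \epsilon^T P_i\,\epsilon
             = \eta_i + 2\mu^T P_i\,\epsilon + \epsilon^T P_i\,\epsilon, \\
y^T P_k\, y &= \epsilon^T P_k\,\epsilon
             \qquad \text{since } P_k\,\mu = 0, \\
y^T P_i\, y - y^T P_k\, y
            &= \eta_i + 2\mu^T P_i\,\epsilon
               + \epsilon^T\big(H_k(\tau_k) - H_i(\tau_k)\big)\epsilon .
\end{align*}
```

The last quadratic form is the one that Lemma B.3 controls: because $H_k(\tau_k) - H_i(\tau_k)$ is idempotent, dropping this term can only enlarge the event, which is what turns the equality into the upper bound.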
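Proof of Lemma B.5.

(i) $y^T B_1 y/(2\sigma_0\sqrt{\eta_i}) = y^T(H_k(\tau_k) - H_k(\hat\tau_k))y/(2\sigma_0\sqrt{\eta_i}) = O_p(\sqrt{\ln n})$. This can be obtained by using $\hat\sigma_k^2 - \sigma_0^2 = O_p(1/\sqrt{n})$ and $1/\sqrt{\eta_i} \le 1/\sqrt{\eta_n} = O(\sqrt{\ln n / n})$.

(ii) $y^T B_2 y/(2\sigma_0\sqrt{\eta_i}) = c_n\, y^T(I - H_k(\hat\tau_k))y/(2\sigma_0\sqrt{\eta_i}) = O_p(\sqrt{\ln n})$ for a choice of $c = c_n$ such that $\sqrt{n}\,c_n = O(1)$. This can be shown because $n/\eta_i = O(\ln n)$ and $\hat\sigma_k^2$ is a consistent estimator of $\sigma_0^2$.

(iii)
\[
\begin{aligned}
\frac{y^T B_3 y}{2\sigma_0\sqrt{\eta_i}}
&= \frac{y^T(I - H_i(\tau_k))y}{2\sigma_0\sqrt{\eta_i}} - \frac{y^T(I - H_i(\hat\tau_i))y}{2\sigma_0\sqrt{\eta_i}} \\
&= \sigma_0 \sqrt{\frac{n}{2\eta_i}}\,(Z_{1,n} - Z_{2,n}) + \frac{E_k[Q_1] - E_k[Q_2]}{2\sqrt{\eta_i}/\sigma_0},
\end{aligned}
\]
where $Q_1 = y^T(I - H_i(\tau_k))y/\sigma_0^2$, $Q_2 = y^T(I - H_i(\hat\tau_i))y/\sigma_0^2$, $Z_{1,n} = (Q_1 - E_k[Q_1])/\sqrt{2n}$, and $Z_{2,n} = (Q_2 - E_k[Q_2])/\sqrt{2n}$. Matrix algebra shows that
\[
\frac{E_k[Q_1] - E_k[Q_2]}{2\sqrt{\eta_i}/\sigma_0} = h_{i,n} + O(\sqrt{\ln n}),
\]
where $h_{i,n} \le \gamma_{i,n}\sqrt{\eta_i}/(2\sigma_0)$ for $\gamma_{i,n}$ such that $0 < \lim_{n\to\infty}(1 - \gamma_{i,n}) \le 1$. Since $Z_{1,n} - Z_{2,n} = O_p(1)$ and $n/\eta_i = O(\ln n)$, $y^T B_3 y/(2\sigma_0\sqrt{\eta_i}) = O_p(\sqrt{\ln n}) + h_{i,n}$.

Combining (i), (ii) and (iii), we obtain that $V_{i,n} = O_p(\sqrt{\ln n}) + h_{i,n}$.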
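The orders claimed in parts (ii) and (iii) come down to simple rate arithmetic, which the following sketch spells out. It assumes only the rates quoted in the proof, with the condition on $c_n$ read as $\sqrt{n}\,c_n = O(1)$, together with $y^T(I - H_k(\hat\tau_k))y = n\hat\sigma_k^2$, which follows from the definition $\hat\sigma_k^2 = RSS(k)/n$ used in Appendix A.

```latex
\begin{align*}
\text{(ii)}\quad
\frac{c_n\, y^T(I - H_k(\hat\tau_k))y}{2\sigma_0\sqrt{\eta_i}}
 &= \frac{c_n\, n\,\hat\sigma_k^2}{2\sigma_0\sqrt{\eta_i}}
  = \underbrace{\sqrt{n}\,c_n}_{O(1)}
    \cdot \underbrace{\frac{\hat\sigma_k^2}{2\sigma_0}}_{O_p(1)}
    \cdot \underbrace{\sqrt{\frac{n}{\eta_i}}}_{O(\sqrt{\ln n})}
  = O_p(\sqrt{\ln n}), \\
\text{(iii)}\quad
\sigma_0\sqrt{\frac{n}{2\eta_i}}\,(Z_{1,n} - Z_{2,n})
 &= \underbrace{\frac{\sigma_0}{\sqrt{2}}\sqrt{\frac{n}{\eta_i}}}_{O(\sqrt{\ln n})}
    \cdot \underbrace{(Z_{1,n} - Z_{2,n})}_{O_p(1)}
  = O_p(\sqrt{\ln n}) .
\end{align*}
```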