Asymptotic results for discrete time martingales and stochastic algorithms

Bernard Bercu
Bordeaux University, France
IFCAM Summer School, Bangalore, India, July 2015

Outline

1 Introduction
   Definition and Examples
   On Doob's convergence theorem
   On the stopping time theorem
   Kolmogorov-Doob martingale inequalities
2 Asymptotic results
   Two useful Lemmas
   Square integrable martingales
   Robbins-Siegmund Theorem
   Strong law of large numbers for martingales
   Central limit theorem for martingales
3 Statistical applications
   Autoregressive processes
   Stochastic algorithms
   Kernel density estimation

1 Introduction

Definition and Examples

Let $(\Omega, \mathcal{A}, \mathbb{P})$ be a probability space with a filtration $\mathbb{F} = (\mathcal{F}_n)$ where $\mathcal{F}_n$ is the σ-algebra of events occurring up to time $n$.

Definition. Let $(M_n)$ be a sequence of integrable random variables defined on $(\Omega, \mathcal{A}, \mathbb{P})$ such that, for all $n \geq 0$, $M_n$ is $\mathcal{F}_n$-measurable.
1 $(M_n)$ is a martingale (MG) if for all $n \geq 0$, $\mathbb{E}[M_{n+1} \mid \mathcal{F}_n] = M_n$ a.s.
2 $(M_n)$ is a submartingale (smg) if for all $n \geq 0$, $\mathbb{E}[M_{n+1} \mid \mathcal{F}_n] \geq M_n$ a.s.
3 $(M_n)$ is a supermartingale (SMG) if for all $n \geq 0$, $\mathbb{E}[M_{n+1} \mid \mathcal{F}_n] \leq M_n$ a.s.

Martingales with sums

Example (Sums). Let $(X_n)$ be a sequence of integrable and independent random variables such that, for all $n \geq 1$, $\mathbb{E}[X_n] = m$. Denote
\[
S_n = \sum_{k=1}^{n} X_k.
\]
We clearly have $S_{n+1} = S_n + X_{n+1}$. Consequently, $(S_n)$ is a sequence of integrable random variables with
\[
\mathbb{E}[S_{n+1} \mid \mathcal{F}_n] = S_n + \mathbb{E}[X_{n+1} \mid \mathcal{F}_n] = S_n + \mathbb{E}[X_{n+1}] = S_n + m.
\]

Martingales with sums

Example (Sums). Since $\mathbb{E}[S_{n+1} \mid \mathcal{F}_n] = S_n + m$,
$(S_n)$ is a martingale if $m = 0$,
$(S_n)$ is a submartingale if $m \geq 0$,
$(S_n)$ is a supermartingale if $m \leq 0$.
This holds for the Rademacher $\mathcal{R}(p)$ distribution with $0 < p < 1$, where $m = 2p - 1$.
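The three regimes of the Rademacher example can be explored with a short simulation. The sketch below is only an illustration: the function names `rademacher_sum_path` and `drift` are hypothetical, and the step is taken to be $+1$ with probability $p$ and $-1$ with probability $1 - p$, which gives the mean $m = 2p - 1$ stated above.

```python
import random

def rademacher_sum_path(p, n, rng):
    """Return [S_0, ..., S_n] with S_0 = 0 and steps +1 w.p. p, -1 w.p. 1 - p."""
    path = [0]
    for _ in range(n):
        step = 1 if rng.random() < p else -1
        path.append(path[-1] + step)
    return path

def drift(p):
    """Mean m = 2p - 1 of a single Rademacher R(p) step."""
    return 2 * p - 1

if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed, chosen arbitrarily for reproducibility
    for p in (0.5, 0.6, 0.4):  # martingale, submartingale, supermartingale
        path = rademacher_sum_path(p, 100, rng)
        print(f"p = {p}: m = {drift(p):+.1f}, S_100 = {path[-1]}")
```

With $p = 1/2$ the drift vanishes and $(S_n)$ is a martingale. Any $p > 1/2$ gives a submartingale and any $p < 1/2$ a supermartingale.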

[Figure: Martingales with Rademacher sums. Sample paths over 100 steps; legend: Martingale, Submartingale, Supermartingale.]

Martingales with products

Example (Products). Let $(X_n)$ be a sequence of positive, integrable and independent random variables such that, for all $n \geq 1$, $\mathbb{E}[X_n] = m$. Denote
\[
P_n = \prod_{k=1}^{n} X_k.
\]
We clearly have $P_{n+1} = P_n X_{n+1}$. Consequently, $(P_n)$ is a sequence of integrable random variables with
\[
\mathbb{E}[P_{n+1} \mid \mathcal{F}_n] = P_n \mathbb{E}[X_{n+1} \mid \mathcal{F}_n] = P_n \mathbb{E}[X_{n+1}] = m P_n.
\]

Martingales with products

Example (Products). Since $\mathbb{E}[P_{n+1} \mid \mathcal{F}_n] = m P_n$,
$(P_n)$ is a martingale if $m = 1$,
$(P_n)$ is a submartingale if $m \geq 1$,
$(P_n)$ is a supermartingale if $m \leq 1$.
This holds for the Exponential $\mathcal{E}(\lambda)$ distribution with $\lambda > 0$, where $m = 1/\lambda$.

Stability

Theorem (Stability).
1 If $(M_n)$ is a SMG, then $(-M_n)$ is a smg.
2 If $(M_n)$ and $(N_n)$ are two smg and $S_n = \sup(M_n, N_n)$, then $(S_n)$ is a smg.
3 If $(M_n)$ and $(N_n)$ are two SMG and $I_n = \inf(M_n, N_n)$, then $(I_n)$ is a SMG.

Stability, continued

Theorem (Stability).
1 If $(M_n)$ and $(N_n)$ are two MG, $a, b \in \mathbb{R}$ and $S_n = a M_n + b N_n$, then $(S_n)$ is a MG.
2 If $(M_n)$ is a MG and $F$ is a convex real function such that, for all $n \geq 1$, $F(M_n) \in L^1(\mathbb{R})$, and if $F_n = F(M_n)$, then $(F_n)$ is a smg.

Doob's convergence theorem

Every increasing sequence that is bounded above converges to its supremum, and every decreasing sequence that is bounded below converges to its infimum. The stochastic analogue of this result is due to Doob.

Theorem (Doob).
1 If $(M_n)$ is a smg bounded above by some constant $M$, then $(M_n)$ converges a.s.
2 If $(M_n)$ is a SMG bounded below by some constant $m$, then $(M_n)$ converges a.s.

Doob's convergence theorem, continued

Theorem (Doob). Let $(M_n)$ be a MG, smg, or SMG bounded in $L^1$, which means
\[
\sup_{n \geq 0} \mathbb{E}[|M_n|] < +\infty.
\]
Then $(M_n)$ converges a.s. to an integrable random variable $M_\infty$.

[Photograph: Joseph Leo Doob]

Convergence of martingales

Theorem. Let $(M_n)$ be a MG bounded in $L^p$ with $p \geq 1$, which means that
\[
\sup_{n \geq 0} \mathbb{E}[|M_n|^p] < +\infty.
\]
1 If $p > 1$, $(M_n)$ converges a.s. to a random variable $M_\infty$. The convergence is also true in $L^p$.
2 If $p = 1$, $(M_n)$ converges a.s. to a random variable $M_\infty$. The convergence holds in $L^1$ as soon as $(M_n)$ is uniformly integrable, that is
\[
\lim_{a \to \infty} \sup_{n \geq 0} \mathbb{E}\big[|M_n| \, \mathrm{I}_{\{|M_n| \geq a\}}\big] = 0.
\]

Chow's Theorem

Theorem (Chow). Let $(M_n)$ be a MG such that, for $1 \leq a \leq 2$ and for all $n \geq 1$, $\mathbb{E}[|M_n|^a] < \infty$. Denote, for all $n \geq 1$, $\Delta M_n = M_n - M_{n-1}$ and assume that
\[
\sum_{n=1}^{\infty} \mathbb{E}\big[|\Delta M_n|^a \mid \mathcal{F}_{n-1}\big] < \infty \quad \text{a.s.}
\]
Then $(M_n)$ converges a.s. to a random variable $M_\infty$.

Exponential Martingale

Example (Exponential Martingale). Let $(X_n)$ be a sequence of independent random variables sharing the same $\mathcal{N}(0, 1)$ distribution. For all $t \in \mathbb{R}$, let $S_n = X_1 + \cdots + X_n$ and denote
\[
M_n(t) = \exp\Big(t S_n - \frac{n t^2}{2}\Big).
\]
It is clear that $(M_n(t))$ is a MG which, for $t \neq 0$, converges a.s. to zero. However, $\mathbb{E}[M_n(t)] = \mathbb{E}[M_1(t)] = 1$, which means that $(M_n(t))$ does not converge in $L^1$.
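The almost sure convergence to zero is easier to see on the logarithmic scale, where $\log M_n(t) = t S_n - n t^2 / 2$ drifts to $-\infty$ for $t \neq 0$. The following is a minimal sketch under that reading; the function name `log_exponential_martingale` and the seed are illustrative choices, not part of the lecture.

```python
import random

def log_exponential_martingale(t, n, rng):
    """Return [log M_0(t), ..., log M_n(t)] with log M_k(t) = t * S_k - k * t^2 / 2."""
    s = 0.0
    logs = [0.0]  # M_0(t) = 1
    for k in range(1, n + 1):
        s += rng.gauss(0.0, 1.0)  # X_k ~ N(0, 1)
        logs.append(t * s - k * t * t / 2)
    return logs

if __name__ == "__main__":
    rng = random.Random(0)  # arbitrary seed
    logs = log_exponential_martingale(1.0, 2000, rng)
    for n in (10, 100, 1000, 2000):
        print(n, logs[n])
```

The term $t S_n$ is of order $\sqrt{n}$ while $n t^2 / 2$ grows linearly, so the path of $\log M_n(t)$ is eventually very negative even though $\mathbb{E}[M_n(t)] = 1$ for every $n$.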

Autoregressive Martingale

Example (Autoregressive Martingale). Let $(X_n)$ be the autoregressive process given, for all $n \geq 0$, by
\[
X_{n+1} = \theta X_n + (1 - \theta) \varepsilon_{n+1}
\]
where $X_0 = p$ with $0 < p < 1$ and the parameter $0 < \theta < 1$. Assume that $\mathcal{L}(\varepsilon_{n+1} \mid \mathcal{F}_n)$ is the Bernoulli $\mathcal{B}(X_n)$ distribution. We can show that $0 < X_n < 1$ and that $(X_n)$ is a MG such that
\[
\lim_{n \to \infty} X_n = X_\infty \quad \text{a.s.}
\]
The convergence also holds in $L^p$ for all $p \geq 1$. Finally, $X_\infty$ has the Bernoulli $\mathcal{B}(p)$ distribution.

Stopping time theorem

Definition. We shall say that a random variable $T$ is a stopping time if $T$ takes its values in $\mathbb{N} \cup \{+\infty\}$ and, for all $n \geq 0$, the event $\{T = n\} \in \mathcal{F}_n$.

Theorem. Assume that $(M_n)$ is a MG and let $T$ be a stopping time adapted to $\mathbb{F} = (\mathcal{F}_n)$. Then, $(M_{n \wedge T})$ is also a MG.

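Proof of the stopping time theorem

Proof. First of all, it is clear that for all $n \geq 0$, $M_{n \wedge T}$ is integrable as
\[
M_{n \wedge T} = M_T \mathrm{I}_{\{T < n\}} + M_n \mathrm{I}_{\{T \geq n\}}.
\]
In addition, $\{T \geq n\} \in \mathcal{F}_{n-1}$ as its complement $\{T < n\} \in \mathcal{F}_{n-1}$. Then, for all $n \geq 0$,
\[
\begin{aligned}
\mathbb{E}[M_{(n+1) \wedge T} \mid \mathcal{F}_n]
&= \mathbb{E}[M_T \mathrm{I}_{\{T < n+1\}} + M_{n+1} \mathrm{I}_{\{T \geq n+1\}} \mid \mathcal{F}_n], \\
&= M_T \mathrm{I}_{\{T < n+1\}} + \mathrm{I}_{\{T \geq n+1\}} \mathbb{E}[M_{n+1} \mid \mathcal{F}_n], \\
&= M_T \mathrm{I}_{\{T < n+1\}} + M_n \mathrm{I}_{\{T \geq n+1\}}, \\
&= M_T \mathrm{I}_{\{T < n\}} + M_n \mathrm{I}_{\{T = n\}} + M_n \mathrm{I}_{\{T \geq n\}} - M_n \mathrm{I}_{\{T = n\}}, \\
&= M_T \mathrm{I}_{\{T < n\}} + M_n \mathrm{I}_{\{T \geq n\}}, \\
&= M_{n \wedge T}.
\end{aligned}
\]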
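A consequence of the theorem is that $\mathbb{E}[M_{n \wedge T}] = \mathbb{E}[M_0]$ for every $n$. This can be checked exactly on a small case. The sketch below assumes the symmetric $\pm 1$ random walk started at $0$ and takes for $T$ the first time the walk reaches a given level; the helper names `stopped_value` and `expected_stopped_value` are hypothetical.

```python
from fractions import Fraction
from itertools import product

def stopped_value(steps, level):
    """Value of S_{n ∧ T}, where T is the first time the walk reaches `level`."""
    s = 0
    for x in steps:
        if s >= level:  # T has already occurred: the stopped walk is frozen
            break
        s += x
    return s

def expected_stopped_value(n, level):
    """Exact E[S_{n ∧ T}] for the symmetric ±1 walk, by enumerating all 2^n paths."""
    total = sum(stopped_value(steps, level) for steps in product((-1, 1), repeat=n))
    return Fraction(total, 2 ** n)

if __name__ == "__main__":
    for n in range(1, 11):
        print(n, expected_stopped_value(n, level=1))
```

For $n = 2$ and level $1$, the four paths give the stopped values $1, 1, 0, -2$, whose average is $0$, in agreement with the martingale property of $(M_{n \wedge T})$.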

Kolmogorov's inequality

Theorem (Kolmogorov's inequality). Assume that $(M_n)$ is a MG. Then, for all $a > 0$,
\[
\mathbb{P}(M_n^{\#} > a) \leq \frac{1}{a} \mathbb{E}\big[|M_n| \, \mathrm{I}_{\{M_n^{\#} > a\}}\big]
\qquad \text{where} \qquad
M_n^{\#} = \max_{0 \leq k \leq n} |M_k|.
\]
As $(M_n)$ is a MG, we clearly have that $(|M_n|)$ is a smg. The proof relies on the entry time $T_a$ of the smg $(|M_n|)$ into the interval $]a, +\infty[$,
\[
T_a = \inf\{ n \geq 0, \ |M_n| > a \}.
\]

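Proof of Kolmogorov's inequality

Proof. First of all, we clearly have for all $n \geq 0$,
\[
\{T_a \leq n\} = \Big\{ \max_{0 \leq k \leq n} |M_k| > a \Big\} = \{ M_n^{\#} > a \}.
\]
Since $|M_{T_a}| > a$ on $\{T_a \leq n\}$, it leads to
\[
\mathbb{P}(M_n^{\#} > a) = \mathbb{P}(T_a \leq n) = \mathbb{E}\big[\mathrm{I}_{\{T_a \leq n\}}\big] \leq \frac{1}{a} \mathbb{E}\big[|M_{T_a}| \, \mathrm{I}_{\{T_a \leq n\}}\big].
\]
However, we have for all $k \leq n$, $|M_k| \leq \mathbb{E}[|M_n| \mid \mathcal{F}_k]$ a.s. Therefore,
\[
\begin{aligned}
\mathbb{E}\big[|M_{T_a}| \, \mathrm{I}_{\{T_a \leq n\}}\big]
&= \sum_{k=0}^{n} \mathbb{E}\big[|M_k| \, \mathrm{I}_{\{T_a = k\}}\big]
\leq \sum_{k=0}^{n} \mathbb{E}\big[\mathbb{E}[|M_n| \mid \mathcal{F}_k] \, \mathrm{I}_{\{T_a = k\}}\big], \\
&= \sum_{k=0}^{n} \mathbb{E}\big[|M_n| \, \mathrm{I}_{\{T_a = k\}}\big]
= \mathbb{E}\big[|M_n| \, \mathrm{I}_{\{T_a \leq n\}}\big],
\end{aligned}
\]
which completes the proof of Kolmogorov's inequality.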
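Kolmogorov's inequality can be verified exactly on a toy martingale. The sketch below assumes the symmetric $\pm 1$ random walk $M_n = S_n$ with $M_0 = 0$ and enumerates all $2^n$ paths; the function name `kolmogorov_sides` is illustrative only.

```python
from fractions import Fraction
from itertools import product

def kolmogorov_sides(n, a):
    """Return (P(M#_n > a), E[|M_n| 1{M#_n > a}] / a) for the symmetric ±1 walk."""
    hits = 0      # number of paths with max_k |S_k| > a
    weighted = 0  # sum of |S_n| over those paths
    for steps in product((-1, 1), repeat=n):
        s, running_max = 0, 0
        for x in steps:
            s += x
            running_max = max(running_max, abs(s))
        if running_max > a:
            hits += 1
            weighted += abs(s)
    return Fraction(hits, 2 ** n), Fraction(weighted, 2 ** n) / a

if __name__ == "__main__":
    for n in (2, 4, 8):
        lhs, rhs = kolmogorov_sides(n, 1)
        print(n, lhs, rhs, lhs <= rhs)
```

For $n = 2$ and $a = 1$, only the two monotone paths exceed the level, so the left-hand side is $1/2$ and the right-hand side is $1$.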

Doob's inequality

Theorem (Doob's inequality). Assume that $(M_n)$ is a MG bounded in $L^p$ with $p > 1$. Then, we have
\[
\mathbb{E}\big[|M_n|^p\big] \leq \mathbb{E}\big[(M_n^{\#})^p\big] \leq \Big(\frac{p}{p-1}\Big)^p \mathbb{E}\big[|M_n|^p\big].
\]
In particular, for $p = 2$,
\[
\mathbb{E}\big[|M_n|^2\big] \leq \mathbb{E}\big[(M_n^{\#})^2\big] \leq 4 \, \mathbb{E}\big[|M_n|^2\big].
\]
The proof relies on the elementary fact that for any positive random variable $X$ and for all $p \geq 1$,
\[
\mathbb{E}\big[X^p\big] = \int_0^{\infty} p a^{p-1} \mathbb{P}(X > a) \, da.
\]

Proof of Doob's inequality

Proof. It follows from Kolmogorov's inequality and Fubini's theorem that
\[
\begin{aligned}
\mathbb{E}\big[(M_n^{\#})^p\big]
&= \int_0^{\infty} p a^{p-1} \mathbb{P}(M_n^{\#} > a) \, da, \\
&\leq \int_0^{\infty} p a^{p-2} \mathbb{E}\big[|M_n| \, \mathrm{I}_{\{M_n^{\#} > a\}}\big] \, da, \\
&= \mathbb{E}\Big[|M_n| \int_0^{\infty} p a^{p-2} \mathrm{I}_{\{M_n^{\#} > a\}} \, da\Big], \\
&= \Big(\frac{p}{p-1}\Big) \mathbb{E}\big[|M_n| (M_n^{\#})^{p-1}\big].
\end{aligned}
\]
Finally, via Hölder's inequality,
\[
\mathbb{E}\big[|M_n| (M_n^{\#})^{p-1}\big] \leq \big(\mathbb{E}\big[|M_n|^p\big]\big)^{1/p} \big(\mathbb{E}\big[(M_n^{\#})^p\big]\big)^{(p-1)/p},
\]
which completes the proof of Doob's inequality.
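The case $p = 2$ of Doob's inequality can also be checked by exhaustive enumeration. This is a sketch under the same assumption as before, namely the symmetric $\pm 1$ random walk started at $0$, for which $\mathbb{E}[M_n^2] = n$; the function name `doob_l2_terms` is hypothetical.

```python
from fractions import Fraction
from itertools import product

def doob_l2_terms(n):
    """Exact (E[M_n^2], E[(M#_n)^2]) for the symmetric ±1 random walk M_n = S_n."""
    end_sq = 0  # sum over paths of S_n^2
    max_sq = 0  # sum over paths of (max_k |S_k|)^2
    for steps in product((-1, 1), repeat=n):
        s, running_max = 0, 0
        for x in steps:
            s += x
            running_max = max(running_max, abs(s))
        end_sq += s * s
        max_sq += running_max * running_max
    return Fraction(end_sq, 2 ** n), Fraction(max_sq, 2 ** n)

if __name__ == "__main__":
    for n in range(1, 11):
        second_moment, max_moment = doob_l2_terms(n)
        print(n, second_moment, max_moment, 4 * second_moment)
```

For $n = 2$ the three quantities $\mathbb{E}[M_n^2]$, $\mathbb{E}[(M_n^{\#})^2]$ and $4\,\mathbb{E}[M_n^2]$ are $2$, $5/2$ and $8$.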

2 Asymptotic results

Two useful Lemmas

We start with two useful lemmas in stochastic analysis.

Lemma (Toeplitz). Let $(a_n)$ be a sequence of positive real numbers satisfying
\[
\sum_{n=1}^{\infty} a_n = +\infty.
\]
In addition, let $(x_n)$ be a sequence of real numbers such that
\[
\lim_{n \to \infty} x_n = x.
\]
Then, we have
\[
\lim_{n \to \infty} \Big( \sum_{k=1}^{n} a_k \Big)^{-1} \sum_{k=1}^{n} a_k x_k = x.
\]

Kronecker's Lemma

Lemma (Kronecker). Let $(a_n)$ be a sequence of positive real numbers strictly increasing to infinity. Moreover, let $(x_n)$ be a sequence of real numbers such that
\[
\sum_{n=1}^{\infty} \frac{x_n}{a_n} = \ell
\]
exists and is finite. Then, we have
\[
\lim_{n \to \infty} a_n^{-1} \sum_{k=1}^{n} x_k = 0.
\]
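Both lemmas are easy to observe numerically. The sketch below uses illustrative sequences that are not taken from the lecture: $a_k = k$ with $x_k = 1 + 1/k$ for Toeplitz, and $a_k = k$ with $x_k = (-1)^{k+1} \sqrt{k}$ for Kronecker, so that $\sum x_k / a_k$ is a convergent alternating series. The function names are hypothetical.

```python
def toeplitz_average(a, x):
    """Weighted average (sum_k a_k)^{-1} sum_k a_k x_k from the lemma of Toeplitz."""
    return sum(ak * xk for ak, xk in zip(a, x)) / sum(a)

def kronecker_terms(a, x):
    """Return (sum_k x_k / a_k, a_n^{-1} sum_k x_k) from the lemma of Kronecker."""
    series = sum(xk / ak for ak, xk in zip(a, x))
    return series, sum(x) / a[-1]

if __name__ == "__main__":
    n = 10000
    a = [float(k) for k in range(1, n + 1)]

    # Toeplitz: x_k -> 1, so the weighted average also tends to 1.
    x = [1.0 + 1.0 / k for k in range(1, n + 1)]
    print(toeplitz_average(a, x))

    # Kronecker: the series converges, so the normalized partial sum tends to 0.
    y = [(-1) ** (k + 1) * k ** 0.5 for k in range(1, n + 1)]
    print(kronecker_terms(a, y))
```

In the Toeplitz example the average equals $1 + 2/(n+1)$ exactly, which makes the rate of convergence explicit.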

Increasing process

Definition. Let $(M_n)$ be a square integrable MG, that is, for all $n \geq 1$, $\mathbb{E}[M_n^2] < \infty$. The increasing process associated with $(M_n)$ is given by $\langle M \rangle_0 = 0$ and, for all $n \geq 1$,
\[
\langle M \rangle_n = \sum_{k=1}^{n} \mathbb{E}\big[\Delta M_k^2 \mid \mathcal{F}_{k-1}\big]
\]
where $\Delta M_k = M_k - M_{k-1}$.

If $(M_n)$ is a square integrable MG and $N_n = M_n^2 - \langle M \rangle_n$, then $(N_n)$ is a MG.

Increasing process, continued

Example (Increasing Process). Let $(X_n)$ be a sequence of square integrable and independent random variables such that, for all $n \geq 1$, $\mathbb{E}[X_n] = m$ and $\mathrm{Var}(X_n) = \sigma^2 > 0$. Denote
\[
M_n = \sum_{k=1}^{n} (X_k - m).
\]
Then, $(M_n)$ is a martingale and its increasing process is $\langle M \rangle_n = \sigma^2 n$. Moreover, if $N_n = M_n^2 - \sigma^2 n$, then $(N_n)$ is a MG.

Robbins-Siegmund Theorem

Theorem (Robbins-Siegmund). Let $(V_n)$, $(A_n)$ and $(B_n)$ be three positive sequences adapted to $\mathbb{F} = (\mathcal{F}_n)$. Assume that $V_0$ is integrable and, for all $n \geq 0$,
\[
\mathbb{E}[V_{n+1} \mid \mathcal{F}_n] \leq V_n + A_n - B_n \quad \text{a.s.}
\]
Denote
\[
\Gamma = \Big\{ \sum_{n=0}^{\infty} A_n < +\infty \Big\}.
\]
1 On $\Gamma$, $(V_n)$ converges a.s. to a finite random variable $V_\infty$.
2 On $\Gamma$, we also have
\[
\sum_{n=0}^{\infty} B_n < +\infty \quad \text{a.s.}
\]
If $A_n = 0$ and $B_n = 0$, then $(V_n)$ is a positive SMG which converges a.s. to $V_\infty$ thanks to Doob's theorem.

Proof of Robbins-Siegmund's theorem

Proof. For all $n \geq 1$, denote
\[
M_n = V_n - \sum_{k=0}^{n-1} (A_k - B_k).
\]
We clearly have, for all $n \geq 0$, $\mathbb{E}[M_{n+1} \mid \mathcal{F}_n] \leq M_n$. For any positive $a$, let $T_a$ be the stopping time
\[
T_a = \inf\Big\{ n \geq 0, \ \sum_{k=0}^{n} (A_k - B_k) \geq a \Big\}.
\]
We deduce from the stopping time theorem that $(M_{n \wedge T_a})$ is a SMG bounded below by $-a$. It follows from Doob's theorem that $(M_{n \wedge T_a})$ converges a.s. to $M_\infty$. Consequently, on the set $\{T_a = +\infty\}$, $(M_n)$ converges a.s. to $M_\infty$. In addition, we also have
\[
M_{n+1} + \sum_{k=0}^{n} A_k = V_{n+1} + \sum_{k=0}^{n} B_k \geq \sum_{k=0}^{n} B_k.
\]

Proof of Robbins-Siegmund's theorem, continued

Proof. Hence, on the set $\Gamma \cap \{T_a = +\infty\}$, we obtain that
\[
\sum_{n=0}^{\infty} B_n < +\infty \quad \text{a.s.}
\]
and $(V_n)$ converges a.s. to a finite random variable $V_\infty$. Finally, as $(B_n)$ is a sequence of positive random variables, we have on $\Gamma$,
\[
\sum_{k=0}^{n} (A_k - B_k) \leq \sum_{k=0}^{n} A_k < +\infty \quad \text{a.s.}
\]
It means that
\[
\Gamma \subset \bigcup_{p=0}^{\infty} \{T_p = +\infty\}, \qquad \Gamma = \bigcup_{p=0}^{\infty} \Gamma \cap \{T_p = +\infty\},
\]
which completes the proof of Robbins-Siegmund's theorem.

Corollary

Let $(V_n)$, $(A_n)$ and $(B_n)$ be three positive sequences adapted to $\mathbb{F} = (\mathcal{F}_n)$. Let $(a_n)$ be a positive increasing sequence adapted to $\mathbb{F} = (\mathcal{F}_n)$. Assume that $V_0$ is integrable and, for all $n \geq 0$,
\[
\mathbb{E}[V_{n+1} \mid \mathcal{F}_n] \leq V_n + A_n - B_n \quad \text{a.s.}
\]
Denote
\[
\Lambda = \Big\{ \sum_{n=0}^{\infty} \frac{A_n}{a_n} < +\infty \Big\}.
\]
1 On $\Lambda \cap \{a_n \to a_\infty\}$, $(V_n)$ converges a.s. to $V_\infty$.
2 On $\Lambda \cap \{a_n \to +\infty\}$, $V_n = o(a_n)$ a.s., $V_{n+1} = o(a_n)$ a.s. and
\[
\sum_{k=0}^{n} B_k = o(a_n) \quad \text{a.s.}
\]
This result is the keystone for the SLLN for martingales.

Strong law of large numbers for martingales

Theorem (Strong law of large numbers). Let $(M_n)$ be a square integrable MG and denote by $\langle M \rangle_n$ its increasing process.
1 On $\{\langle M \rangle_n \to \langle M \rangle_\infty\}$, that is, where the limit $\langle M \rangle_\infty$ is finite, $(M_n)$ converges a.s. to a square integrable random variable $M_\infty$.
2 On $\{\langle M \rangle_n \to +\infty\}$, we have
\[
\lim_{n \to \infty} \frac{M_n}{\langle M \rangle_n} = 0 \quad \text{a.s.}
\]
More precisely, for any positive $\gamma$,
\[
\Big( \frac{M_n}{\langle M \rangle_n} \Big)^2 = o\Big( \frac{(\log \langle M \rangle_n)^{1+\gamma}}{\langle M \rangle_n} \Big) \quad \text{a.s.}
\]
If there exists a positive sequence $(a_n)$ increasing to infinity such that $\langle M \rangle_n = O(a_n)$, then we have $M_n = o(a_n)$ a.s.

Easy example

Let $(X_n)$ be a sequence of square integrable and independent random variables such that, for all $n \geq 1$, $\mathbb{E}[X_n] = m$ and $\mathrm{Var}(X_n) = \sigma^2 > 0$. We already saw that
\[
M_n = \sum_{k=1}^{n} (X_k - m)
\]
is a square integrable MG with $\langle M \rangle_n = \sigma^2 n$. It follows from the SLLN for martingales that $M_n = o(n)$ a.s., which means that
\[
\lim_{n \to \infty} \frac{1}{n} \sum_{k=1}^{n} X_k = m \quad \text{a.s.}
\]
More precisely, for any positive $\gamma$,
\[
\Big( \frac{M_n}{n} \Big)^2 = \Big( \frac{1}{n} \sum_{k=1}^{n} X_k - m \Big)^2 = o\Big( \frac{(\log n)^{1+\gamma}}{n} \Big) \quad \text{a.s.}
\]

Proof of the strong law of large numbers

Proof. For any positive $a$, let $T_a$ be the stopping time
\[
T_a = \inf\{ n \geq 0, \ \langle M \rangle_{n+1} \geq a \}.
\]
It follows from the stopping time theorem that $(M_{n \wedge T_a})$ is a MG. It is bounded in $L^2$ as
\[
\sup_{n \geq 0} \mathbb{E}\big[(M_{n \wedge T_a})^2\big] = \sup_{n \geq 0} \mathbb{E}\big[\langle M \rangle_{n \wedge T_a}\big] \leq a.
\]
We deduce from Doob's convergence theorem that $(M_{n \wedge T_a})$ converges a.s. to a square integrable random variable $M_\infty$. Hence, on the set $\{T_a = +\infty\}$, $(M_n)$ converges a.s. to $M_\infty$. However,
\[
\{\langle M \rangle_\infty < +\infty\} = \bigcup_{p=1}^{\infty} \{T_p = +\infty\},
\]
which completes the proof of the first part of the theorem.

Proof of the strong law of large numbers, continued

Proof. Let $V_n = M_n^2$, $A_n = \langle M \rangle_{n+1} - \langle M \rangle_n$ and $B_n = 0$. We clearly have
\[
\mathbb{E}[V_{n+1} \mid \mathcal{F}_n] \leq V_n + A_n - B_n \quad \text{a.s.}
\]
For any positive $\gamma$, denote
\[
a_n = \langle M \rangle_{n+1} (\log \langle M \rangle_{n+1})^{1+\gamma}.
\]
On $\{\langle M \rangle_n \to +\infty\}$, $(a_n)$ is a positive increasing sequence adapted to $\mathbb{F} = (\mathcal{F}_n)$, which goes to infinity a.s. Hence, for $n$ large enough, $a_n \geq \alpha > 1$ and there exists a positive finite random variable $\beta$ such that
\[
\sum_{n=0}^{\infty} \frac{A_n}{a_n} \leq \int_{\alpha}^{\infty} \frac{1}{x (\log x)^{1+\gamma}} \, dx + \beta < +\infty \quad \text{a.s.}
\]
Finally, $V_{n+1} = o(a_n)$ a.s., which completes the proof of the theorem.

Central limit theorem for martingales

Theorem (Central Limit Theorem). Let $(M_n)$ be a square integrable MG and let $(a_n)$ be a sequence of positive real numbers increasing to infinity. Assume that
1 There exists a deterministic limit $\ell \geq 0$ such that
\[
\frac{\langle M \rangle_n}{a_n} \xrightarrow{\ \mathbb{P}\ } \ell.
\]
2 Lindeberg's condition. For all $\varepsilon > 0$,
\[
\frac{1}{a_n} \sum_{k=1}^{n} \mathbb{E}\big[|\Delta M_k|^2 \, \mathrm{I}_{\{|\Delta M_k| \geq \varepsilon \sqrt{a_n}\}} \mid \mathcal{F}_{k-1}\big] \xrightarrow{\ \mathbb{P}\ } 0
\]
where $\Delta M_k = M_k - M_{k-1}$.

Central limit theorem for martingales, continued

Theorem (Central Limit Theorem). Then, we have
\[
\frac{1}{\sqrt{a_n}} M_n \xrightarrow{\ \mathcal{L}\ } \mathcal{N}(0, \ell).
\]
Moreover, if $\ell > 0$, we also have
\[
\sqrt{a_n} \Big( \frac{M_n}{\langle M \rangle_n} \Big) \xrightarrow{\ \mathcal{L}\ } \mathcal{N}(0, \ell^{-1}).
\]
Lyapunov's condition implies Lindeberg's condition:
\[
\exists \, \alpha > 2, \qquad \sum_{k=1}^{n} \mathbb{E}\big[|\Delta M_k|^{\alpha} \mid \mathcal{F}_{k-1}\big] = O(a_n) \quad \text{a.s.}
\]

3 Statistical applications

Stable autoregressive processes

Consider the stable autoregressive process
\[
X_{n+1} = \theta X_n + \varepsilon_{n+1}, \qquad |\theta| < 1
\]
where $(\varepsilon_n)$ is a sequence of iid $\mathcal{N}(0, \sigma^2)$ random variables. Assume that $X_0$ is independent of $(\varepsilon_n)$ with $\mathcal{N}(0, \sigma^2 / (1 - \theta^2))$ distribution.
$(X_n)$ is a centered stationary Gaussian process,
$(X_n)$ is a positive recurrent process.

Goal: Estimate the unknown parameter $\theta$.

Least squares estimator

Let $\hat{\theta}_n$ be the least squares estimator of the unknown parameter $\theta$,
\[
\hat{\theta}_n = \frac{\sum_{k=1}^{n} X_k X_{k-1}}{\sum_{k=1}^{n} X_{k-1}^2}.
\]
We have
\[
\begin{aligned}
\hat{\theta}_n - \theta
&= \frac{\sum_{k=1}^{n} X_k X_{k-1} - \theta \sum_{k=1}^{n} X_{k-1}^2}{\sum_{k=1}^{n} X_{k-1}^2}, \\
&= \frac{\sum_{k=1}^{n} X_{k-1} (X_k - \theta X_{k-1})}{\sum_{k=1}^{n} X_{k-1}^2}, \\
&= \frac{\sum_{k=1}^{n} X_{k-1} \varepsilon_k}{\sum_{k=1}^{n} X_{k-1}^2}.
\end{aligned}
\]

Least squares estimator, continued

Consequently,
\[
\hat{\theta}_n - \theta = \sigma^2 \frac{M_n}{\langle M \rangle_n}
\qquad \text{where} \qquad
M_n = \sum_{k=1}^{n} X_{k-1} \varepsilon_k
\quad \text{and} \quad
\langle M \rangle_n = \sigma^2 \sum_{k=1}^{n} X_{k-1}^2.
\]
The sequence $(M_n)$ is a square integrable martingale such that
\[
\lim_{n \to \infty} \frac{\langle M \rangle_n}{n} = \ell \quad \text{a.s.}
\qquad \text{where} \qquad
\ell = \frac{\sigma^4}{1 - \theta^2}.
\]

Stable autoregressive processes

Theorem. We have the almost sure convergence
\[
\lim_{n \to \infty} \hat{\theta}_n = \theta \quad \text{a.s.}
\]
In addition, we also have the asymptotic normality
\[
\sqrt{n} \, (\hat{\theta}_n - \theta) \xrightarrow{\ \mathcal{L}\ } \mathcal{N}(0, 1 - \theta^2).
\]
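The almost sure convergence of the least squares estimator can be observed on simulated data. The following is a minimal sketch, assuming the stationary Gaussian model above; the function names `simulate_ar1` and `least_squares_estimator`, the parameter values and the seed are illustrative choices.

```python
import math
import random

def simulate_ar1(theta, sigma, n, rng):
    """Return [X_0, ..., X_n] for X_{k+1} = theta * X_k + eps_{k+1}, started at stationarity."""
    x = [rng.gauss(0.0, sigma / math.sqrt(1 - theta ** 2))]
    for _ in range(n):
        x.append(theta * x[-1] + rng.gauss(0.0, sigma))
    return x

def least_squares_estimator(x):
    """sum_k X_k X_{k-1} / sum_k X_{k-1}^2 over k = 1, ..., n."""
    num = sum(x[k] * x[k - 1] for k in range(1, len(x)))
    den = sum(x[k - 1] ** 2 for k in range(1, len(x)))
    return num / den

if __name__ == "__main__":
    rng = random.Random(0)  # arbitrary seed
    theta = 0.5
    x = simulate_ar1(theta, 1.0, 20000, rng)
    for n in (100, 1000, 20000):
        print(n, least_squares_estimator(x[: n + 1]))
```

By the theorem, the error at sample size $n$ is of order $\sqrt{(1 - \theta^2)/n}$, so about $0.006$ for $\theta = 0.5$ and $n = 20000$.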

[Figure: stable autoregressive process. Two panels: "Almost sure convergence" (horizontal axis from 0 to 1000) and "Asymptotic normality" (horizontal axis from -5 to 5).]

Stochastic approximation

[Photograph: Herbert Robbins]

Stochastic approximation

[Photographs: Jack Kiefer, Jacob Wolfowitz]

Stochastic approximation

[Figure: graph of $f$ on $[0, 1]$ with the level $\alpha = f(\theta)$ marked.]

Goal: Find the value $\theta$ without any knowledge of the function $f$.

Stochastic approximation

[Figure: graph of $f$ on $[0, 1]$ with the level $\alpha = f(\theta)$ and a current value $f(\hat{\theta}_n)$ above it.]

Basic Idea: At time $n$, if you are able to say that $f(\hat{\theta}_n) > \alpha$, then increase the value of $\hat{\theta}_n$.

Stochastic approximation

[Figure: graph of $f$ on $[0, 1]$ with the level $\alpha = f(\theta)$ and a current value $f(\hat{\theta}_n)$ below it.]

Basic Idea: At time $n$, if you are able to say that $f(\hat{\theta}_n) < \alpha$, then decrease the value of $\hat{\theta}_n$.

Stochastic approximation

Let $(\gamma_n)$ be a decreasing sequence of positive real numbers such that
\[
\sum_{n=1}^{\infty} \gamma_n = +\infty \qquad \text{and} \qquad \sum_{n=1}^{\infty} \gamma_n^2 < +\infty.
\]
For the sake of simplicity, we shall make use of $\gamma_n = 1/n$.

Robbins-Monro algorithm:
\[
\hat{\theta}_{n+1} = \hat{\theta}_n + \gamma_{n+1} \big( T_{n+1} - \alpha \big)
\]
where $T_{n+1}$ is a random variable such that $\mathbb{E}[T_{n+1} \mid \mathcal{F}_n] = f(\hat{\theta}_n)$.

Stochastic approximation

Theorem (Robbins-Monro, 1951). Assume that $f$ is a decreasing function. Then, we have the almost sure convergence
\[
\lim_{n \to \infty} \hat{\theta}_n = \theta \quad \text{a.s.}
\]
In addition, as soon as $-2 f'(\theta) > 1$, we also have the asymptotic normality
\[
\sqrt{n} \, (\hat{\theta}_n - \theta) \xrightarrow{\ \mathcal{L}\ } \mathcal{N}(0, \xi^2(\theta))
\]
where the asymptotic variance $\xi^2(\theta)$ can be explicitly calculated.
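The Robbins-Monro recursion with $\gamma_n = 1/n$ takes only a few lines. The sketch below is an illustration under stated assumptions: the response model `example_response`, which observes the decreasing function $f(\theta) = 1 - \theta$ with additive standard Gaussian noise, is hypothetical and not taken from the lecture, as are the function names and the seed.

```python
import random

def robbins_monro(noisy_f, alpha, theta0, n, rng):
    """Run theta_k = theta_{k-1} + gamma_k * (T_k - alpha) with gamma_k = 1 / k."""
    theta = theta0
    for k in range(1, n + 1):
        t_obs = noisy_f(theta, rng)   # T_k, with E[T_k | F_{k-1}] = f(theta_{k-1})
        theta += (t_obs - alpha) / k  # move up when T_k > alpha, down otherwise
    return theta

def example_response(theta, rng):
    """Hypothetical noisy observation of the decreasing function f(theta) = 1 - theta."""
    return 1.0 - theta + rng.gauss(0.0, 1.0)

if __name__ == "__main__":
    rng = random.Random(0)  # arbitrary seed
    # Target level alpha = 0.5, so the root of f(theta) = alpha is theta = 0.5.
    print(robbins_monro(example_response, 0.5, 0.0, 10000, rng))
```

In this example $f'(\theta) = -1$, so the condition $-2 f'(\theta) > 1$ of the theorem holds. The sign of the update matches the basic idea: since $f$ is decreasing, an observation above $\alpha$ suggests that the current value is too small.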

Kernel density estimation

Let $(X_n)$ be a sequence of iid random variables with unknown density function $f$. Let $K$ be a positive and bounded function, called kernel, such that
\[
\int_{\mathbb{R}} K(x) \, dx = 1, \qquad \int_{\mathbb{R}} x K(x) \, dx = 0, \qquad \int_{\mathbb{R}} K^2(x) \, dx = \sigma^2.
\]

Goal: Estimate the unknown density function $f$.

Choice of the Kernel

Uniform kernel
\[
K_a(x) = \frac{1}{2a} \, \mathrm{I}_{\{|x| \leq a\}},
\]
Epanechnikov kernel
\[
K_b(x) = \frac{3}{4b} \Big( 1 - \frac{x^2}{b^2} \Big) \mathrm{I}_{\{|x| \leq b\}},
\]
Gaussian kernel
\[
K_c(x) = \frac{1}{c \sqrt{2\pi}} \exp\Big( -\frac{x^2}{2c^2} \Big).
\]
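The three kernels can be checked numerically against the normalization conditions on $K$. The sketch below uses a plain midpoint rule; the function names and the integration range $[-10, 10]$ are illustrative assumptions, adequate for the default scale parameters $a = b = c = 1$.

```python
import math

def uniform_kernel(x, a=1.0):
    return 1.0 / (2 * a) if abs(x) <= a else 0.0

def epanechnikov_kernel(x, b=1.0):
    return 3.0 / (4 * b) * (1 - x * x / (b * b)) if abs(x) <= b else 0.0

def gaussian_kernel(x, c=1.0):
    return math.exp(-x * x / (2 * c * c)) / (c * math.sqrt(2 * math.pi))

def integrate(g, lo=-10.0, hi=10.0, steps=20000):
    """Midpoint rule for the integral of g over [lo, hi]."""
    h = (hi - lo) / steps
    return h * sum(g(lo + (i + 0.5) * h) for i in range(steps))

if __name__ == "__main__":
    for kernel in (uniform_kernel, epanechnikov_kernel, gaussian_kernel):
        mass = integrate(kernel)                            # should be close to 1
        mean = integrate(lambda x: x * kernel(x))           # should be close to 0
        sigma2 = integrate(lambda x: kernel(x) ** 2)        # the constant sigma^2
        print(kernel.__name__, mass, mean, sigma2)
```

For reference, a direct computation gives $\sigma^2 = 1/(2a)$ for the uniform kernel, $3/(5b)$ for the Epanechnikov kernel and $1/(2c\sqrt{\pi})$ for the Gaussian kernel.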

The Wolverton-Wagner estimator

We estimate the density function $f$ by the Wolverton-Wagner estimator
\[
\hat{f}_n(x) = \frac{1}{n} \sum_{k=1}^{n} W_k(x)
\qquad \text{where} \qquad
W_k(x) = \frac{1}{h_k} K\Big( \frac{X_k - x}{h_k} \Big).
\]
The bandwidth $(h_n)$ is a sequence of positive real numbers such that
\[
h_n \searrow 0, \qquad n h_n \to \infty.
\]
For $0 < \alpha < 1$, we can make use of $h_n = 1/n^{\alpha}$.
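Since each $W_k(x)$ depends only on $X_k$ and $h_k$, the estimator can be updated recursively as new observations arrive. The sketch below assumes a standard Gaussian kernel and the bandwidth $h_k = k^{-\alpha}$; the function names, the value $\alpha = 0.3$ and the simulated $\mathcal{N}(0, 1)$ sample are illustrative choices.

```python
import math
import random

def gaussian_kernel(u):
    """Standard Gaussian kernel."""
    return math.exp(-u * u / 2) / math.sqrt(2 * math.pi)

def wolverton_wagner(sample, x, alpha=0.5, kernel=gaussian_kernel):
    """f_n(x) = (1/n) sum_k W_k(x) with W_k(x) = K((X_k - x) / h_k) / h_k, h_k = k^(-alpha)."""
    estimate = 0.0
    for k, xk in enumerate(sample, start=1):
        h = k ** (-alpha)
        w = kernel((xk - x) / h) / h
        estimate += (w - estimate) / k  # recursive form of the running mean of W_1, ..., W_k
    return estimate

if __name__ == "__main__":
    rng = random.Random(0)  # arbitrary seed
    sample = [rng.gauss(0.0, 1.0) for _ in range(20000)]
    # The true density at 0 is 1 / sqrt(2 * pi), about 0.3989.
    print(wolverton_wagner(sample, 0.0, alpha=0.3))
```

The recursive update avoids storing the whole sample, which is the practical appeal of this estimator over a fixed-bandwidth kernel estimator.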

Kernel density estimation

We have
\[
\begin{aligned}
\hat{f}_n(x) - f(x)
&= \frac{1}{n} \sum_{k=1}^{n} W_k(x) - f(x), \\
&= \frac{1}{n} \sum_{k=1}^{n} \big( W_k(x) - \mathbb{E}[W_k(x)] \big) + \frac{1}{n} \sum_{k=1}^{n} \big( \mathbb{E}[W_k(x)] - f(x) \big).
\end{aligned}
\]
Consequently,
\[
\hat{f}_n(x) - f(x) = \frac{M_n(x)}{n} + \frac{R_n(x)}{n}
\]
where
\[
M_n(x) = \sum_{k=1}^{n} \big( W_k(x) - \mathbb{E}[W_k(x)] \big)
\qquad \text{and} \qquad
R_n(x) = \sum_{k=1}^{n} \big( \mathbb{E}[W_k(x)] - f(x) \big).
\]

Kernel density estimation

We have
\[
M_n(x) = \sum_{k=1}^{n} \big( W_k(x) - \mathbb{E}[W_k(x)] \big), \qquad
\langle M(x) \rangle_n = \sum_{k=1}^{n} \mathrm{Var}(W_k(x)).
\]
The sequence $(M_n(x))$ is a square integrable martingale such that
\[
\lim_{n \to \infty} \frac{\langle M(x) \rangle_n}{n^{1+\alpha}} = \ell \quad \text{a.s.}
\qquad \text{where} \qquad
\ell = \frac{\sigma^2 f(x)}{1 + \alpha}.
\]

Kernel density estimation

Theorem. For all $x \in \mathbb{R}$, we have the pointwise almost sure convergence
\[
\lim_{n \to \infty} \hat{f}_n(x) = f(x) \quad \text{a.s.}
\]
In addition, as soon as $1/5 < \alpha < 1$, we have, for all $x \in \mathbb{R}$, the asymptotic normality
\[
\sqrt{n h_n} \, \big( \hat{f}_n(x) - f(x) \big) \xrightarrow{\ \mathcal{L}\ } \mathcal{N}\Big( 0, \frac{\sigma^2 f(x)}{1 + \alpha} \Big).
\]

[Figure: kernel density estimation. Two panels: "Almost sure convergence" (horizontal axis from 0 to 1000) and "Asymptotic normality" (horizontal axis from -5 to 5).]

Many thanks for your attention!