Monte Carlo and Empirical Methods for Stochastic Inference (MASM11/FMSN50)


Monte Carlo and Empirical Methods for Stochastic Inference (MASM11/FMSN50). Magnus Wiktorsson, Centre for Mathematical Sciences, Lund University, Sweden. Lecture 5: Sequential Monte Carlo methods I. January 30, 2018.

Plan of today's lecture: (1) variance reduction reconsidered; (2) sequential estimation problems over spaces of increasing dimension; (3) examples: rare events for Markov chains, hidden Markov models, and self-avoiding walks.


Last time: Variance reduction. Last time we discussed how to reduce the variance of the standard MC sampler by introducing correlation between the variables of the sample. More specifically, we used

1. a control variate $Y$ such that $E(Y) = m$ is known: $Z = \phi(X) + \beta(Y - m)$, where $\beta$ was tuned optimally to $\beta^* = -C(\phi(X), Y)/V(Y)$;

2. antithetic variables $V$ and $\tilde{V}$ such that $E(V) = E(\tilde{V}) = \tau$ and $C(V, \tilde{V}) < 0$: $W = (V + \tilde{V})/2$.

Last time: Variance reduction. The following theorem turned out to be useful when constructing antithetic variables.

Theorem. Let $V = \varphi(U)$, where $\varphi : \mathbb{R} \to \mathbb{R}$ is a monotone function. Moreover, assume that there exists a non-increasing transform $T : \mathbb{R} \to \mathbb{R}$ such that $U \stackrel{d}{=} T(U)$. Then $V = \varphi(U)$ and $\tilde{V} = \varphi(T(U))$ are identically distributed and

$$C(V, \tilde{V}) = C(\varphi(U), \varphi(T(U))) \leq 0.$$

An important application of this theorem is the following: let $F$ be a distribution function and $\phi$ a monotone function. Then, letting $U \sim U(0, 1)$, $T(u) = 1 - u$, and $\varphi(u) = \phi(F^{-1}(u))$ yields, for $V = \phi(F^{-1}(U))$ and $\tilde{V} = \phi(F^{-1}(1 - U))$, $V \stackrel{d}{=} \tilde{V}$ and $C(V, \tilde{V}) \leq 0$.

Last time: Variance reduction. Example:

$$\tau = 2 \int_0^{\pi/2} \exp(\cos^2(x))\,dx, \quad V = 2(\pi/2)\exp(\cos^2(X)), \quad \tilde{V} = 2(\pi/2)\exp(\sin^2(X)), \quad W = \frac{V + \tilde{V}}{2},$$

with $X \sim U(0, \pi/2)$.

[Figure: running estimates of $\tau$ versus sample size $N$, antithetic sampling compared with standard MC.]
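The comparison can be reproduced with a few lines of MATLAB (a minimal sketch, assuming $X = (\pi/2)U$ with $U \sim U(0,1)$ and the antithetic pair obtained from $T(u) = 1 - u$; note that $\cos^2(\pi/2 - x) = \sin^2(x)$):

% Antithetic estimate of tau = 2*int_0^{pi/2} exp(cos^2 x) dx
N = 1000;
U = rand(N,1);
V  = 2*(pi/2)*exp(cos((pi/2)*U).^2);      % V = 2*(pi/2)*exp(cos^2 X)
Va = 2*(pi/2)*exp(cos((pi/2)*(1-U)).^2);  % Vtilde, via T(u) = 1 - u
tau_AS = mean((V + Va)/2)                 % antithetic estimator
tau_MC = mean(V)                          % standard MC, for comparison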

Control variates reconsidered. A problem with the control variate approach is that the optimal coefficient, i.e.

$$\beta^* = -\frac{C(\phi(X), Y)}{V(Y)},$$

is generally not known explicitly. Thus, it was suggested to

1. draw $(X_i)_{i=1}^N$,
2. draw $(Y_i)_{i=1}^N$,
3. estimate, via MC, $\beta^*$ using the drawn samples, and
4. use this to optimally construct $(Z_i)_{i=1}^N$ (see the sketch below).

This yields a so-called batch estimator of $\beta^*$. However, this procedure is computationally somewhat complex.
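A minimal MATLAB sketch of the batch estimator, using the tricky-integral example treated later in this lecture (the sample size N = 1500 is an arbitrary choice):

% Batch control-variate estimator for tau = 2*int_0^{pi/2} exp(cos^2 x) dx
N = 1500;
X = (pi/2)*rand(N,1);                         % 1. draw (X_i), X ~ U(0,pi/2)
Y = cos(X).^2;                                % 2. draw (Y_i); control variate
m = 1/2;                                      % m = E(Y)
phiX = 2*(pi/2)*exp(Y);                       % phi(X_i); exp(cos^2 X) = exp(Y)
beta = -mean(phiX.*(Y - m))/mean((Y - m).^2); % 3. MC estimate of beta*
Z = phiX + beta*(Y - m);                      % 4. construct (Z_i)
tau_batch = mean(Z)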

An online approach to optimal control variates. The estimators

$$C_N \stackrel{\text{def}}{=} \frac{1}{N} \sum_{i=1}^N \phi(X_i)(Y_i - m), \qquad V_N \stackrel{\text{def}}{=} \frac{1}{N} \sum_{i=1}^N (Y_i - m)^2$$

of $C(\phi(X), Y)$ and $V(Y)$, respectively, can be implemented recursively according to

$$C_{l+1} = \frac{l}{l+1} C_l + \frac{1}{l+1} \phi(X_{l+1})(Y_{l+1} - m)$$

and

$$V_{l+1} = \frac{l}{l+1} V_l + \frac{1}{l+1} (Y_{l+1} - m)^2,$$

with $C_0 = V_0 = 0$.

An online approach to optimal control variates (cont.). Inspired by this we set, for $l = 0, 1, 2, \ldots, N - 1$,

$$Z_{l+1} = \phi(X_{l+1}) + \beta_l (Y_{l+1} - m), \qquad \tau_{l+1} = \frac{l}{l+1} \tau_l + \frac{1}{l+1} Z_{l+1}, \qquad (*)$$

where $\beta_0 \stackrel{\text{def}}{=} 1$, $\beta_l \stackrel{\text{def}}{=} -C_l/V_l$ for $l > 0$, and $\tau_0 \stackrel{\text{def}}{=} 0$, yielding an online estimator. One may then establish the following convergence results.

Theorem. Let $\tau_N$ be obtained through $(*)$. Then, as $N \to \infty$,
(i) $\tau_N \to \tau$ (a.s.),
(ii) $\sqrt{N}(\tau_N - \tau) \stackrel{d}{\to} N(0, \sigma_*^2)$, where $\sigma_*^2 \stackrel{\text{def}}{=} V(\phi(X))\{1 - \rho(\phi(X), Y)^2\}$ is the optimal variance.

Example: the tricky integral again. We estimate

$$\tau = \int_{-\pi/2}^{\pi/2} \exp(\cos^2(x))\,dx \stackrel{\text{sym}}{=} 2 \int_0^{\pi/2} \exp(\cos^2(x))\,dx = \int_0^{\pi/2} \underbrace{2(\pi/2)\exp(\cos^2(x))}_{=\phi(x)} \underbrace{\frac{2}{\pi}}_{=f(x)}\,dx = E_f(\phi(X))$$

using $Z = \phi(X) + \beta^*(Y - m)$, where $Y = \cos^2(X)$ is a control variate with

$$m = E(Y) = \int_0^{\pi/2} \cos^2(x)\,\frac{2}{\pi}\,dx = \{\text{use integration by parts}\} = \frac{1}{2}.$$

However, the optimal coefficient $\beta^*$ is in general not known explicitly (tedious calculations give $\beta^* = -4 e^{1/2} \pi I_1(1/2) \approx -5.3432$, with $I_1$ the modified Bessel function of the first kind).

Example: the tricky integral again cos2 = @(x) cos(x).^2; phi = @(x) 2(pi/2)*exp(cos2(x)); m = 1/2; X = (pi/2)*rand; Y = cos2(x); c = phi(x)*(y m); v = (Y m)^2; tau_cv = phi(x) + (Y m); beta = c/v; for k = 2:N, X = (pi/2)*rand; Y = cos2(x); Z = phi(x) + beta*(y m); tau_cv = (k 1)*tau_CV/k + Z/k; c = (k 1)*c/k + phi(x)*(y m)/k; v = (k 1)*v/k + (Y m)^2/k; beta = c/v; end M. Wiktorsson Monte Carlo and Empirical Methods for Stochastic Inference, L5 (11) 11 / 35

Example: the tricky integral again.

[Figure: left panel, running estimates of $\tau$ (crude MC, adaptive CV, batch CV, exact) versus sample size $N$; right panel, running estimates of $\beta$ (adaptive CV, batch CV, exact $\approx -5.34$) versus sample size $N$.]


We will now (and for the coming two lectures) extend the principal goal of the course to the problem of estimating sequentially sequences $(\tau_n)_{n \geq 0}$ of expectations

$$\tau_n = E_{f_n}(\phi(X_{0:n})) = \int_{\mathsf{X}_n} \phi(x_{0:n}) f_n(x_{0:n})\,dx_{0:n}$$

over spaces $\mathsf{X}_n$ of increasing dimension, where again the densities $(f_n)_{n \geq 0}$ are known up to normalizing constants only; i.e., for every $n \geq 0$,

$$f_n(x_{0:n}) = \frac{z_n(x_{0:n})}{c_n},$$

where $c_n$ is an unknown constant and $z_n$ is a known positive function on $\mathsf{X}_n$. As we will see, such sequences appear in many applications in statistics and numerical analysis.


A Markov chain on $\mathsf{X} \subseteq \mathbb{R}^d$ is a family of random variables (= stochastic process) $(X_k)_{k \geq 0}$ taking values in $\mathsf{X}$ such that

$$P(X_{k+1} \in B \mid X_0, X_1, \ldots, X_k) = P(X_{k+1} \in B \mid X_k)$$

for all $B \subseteq \mathsf{X}$. We call the chain time homogeneous if the conditional distribution of $X_{k+1}$ given $X_k$ does not depend on $k$. The distribution of $X_{k+1}$ given $X_k = x$ determines completely the dynamics of the process, and the density $q$ of this distribution is called the transition density of $(X_k)$. Consequently,

$$P(X_{k+1} \in B \mid X_k = x_k) = \int_B q(x_{k+1} \mid x_k)\,dx_{k+1}.$$

Markov chains (cont.) The following theorem provides the joint density $f_n(x_0, x_1, \ldots, x_n)$ of $X_0, X_1, \ldots, X_n$.

Theorem. Let $(X_k)$ be Markov with initial distribution $\chi$. Then for $n > 0$,

$$f_n(x_0, x_1, \ldots, x_n) = \chi(x_0) \prod_{k=0}^{n-1} q(x_{k+1} \mid x_k).$$

Corollary (Chapman-Kolmogorov equation). Let $(X_k)$ be Markov. Then for $n > 1$,

$$f_n(x_n \mid x_0) = \int \cdots \int \left( \prod_{k=0}^{n-1} q(x_{k+1} \mid x_k) \right) dx_1 \cdots dx_{n-1}.$$

Example: The AR(1) process. As a first example we consider a first-order autoregressive process (AR(1)) in $\mathbb{R}$. Set $X_0 = 0$ and

$$X_{k+1} = \alpha X_k + \epsilon_{k+1},$$

where $\alpha$ is a constant and the variables $(\epsilon_k)_{k \geq 1}$ of the noise sequence are i.i.d. with density function $f$. In this case,

$$P(X_{k+1} \leq x_{k+1} \mid X_k = x_k) = P(\alpha X_k + \epsilon_{k+1} \leq x_{k+1} \mid X_k = x_k) = P(\epsilon_{k+1} \leq x_{k+1} - \alpha x_k \mid X_k = x_k) = P(\epsilon_{k+1} \leq x_{k+1} - \alpha x_k),$$

implying that

$$q(x_{k+1} \mid x_k) = \frac{\partial}{\partial x_{k+1}} P(X_{k+1} \leq x_{k+1} \mid X_k = x_k) = \frac{\partial}{\partial x_{k+1}} P(\epsilon_{k+1} \leq x_{k+1} - \alpha x_k) = f(x_{k+1} - \alpha x_k).$$
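Simulating such a chain is straightforward; a minimal MATLAB sketch with standard normal noise, so that $f$ is the $N(0,1)$ density (the values of alpha and n are illustrative):

% Simulate an AR(1) chain: X_{k+1} = alpha*X_k + eps_{k+1}, X_0 = 0
alpha = 0.8; n = 1000;
X = zeros(n+1,1);                  % X(1) holds X_0 = 0
for k = 1:n
    X(k+1) = alpha*X(k) + randn;   % here q(x'|x) is the N(alpha*x,1) density
end
plot(0:n, X)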


Simulation of rare events for Markov chains. Let $(X_k)$ be a Markov chain on $\mathsf{X} = \mathbb{R}$ and consider the rectangle

$$B = B_0 \times B_1 \times \cdots \times B_n \subseteq \mathbb{R}^{n+1},$$

where every $B_l = (a_l, b_l)$ is an interval. Here $B$ can be a possibly extreme event. Say that we wish to compute, sequentially as $n$ increases, some expectation under the conditional distribution $f_{n|B}$ of the states $X_{0:n} = (X_0, X_1, X_2, \ldots, X_n)$ given $X_{0:n} \in B$, i.e.

$$\tau_n = E_{f_n}(\phi(X_{0:n}) \mid X_{0:n} \in B) = E_{f_{n|B}}(\phi(X_{0:n})) = \int_B \phi(x_{0:n}) \underbrace{\frac{f(x_{0:n})}{P(X_{0:n} \in B)}}_{= f_{n|B}(x_{0:n}) = z_n(x_{0:n})/c_n} dx_{0:n}.$$

Here the unknown probability $c_n = P(X_{0:n} \in B)$ of the rare event $B$ is often the quantity of interest.

Simulation of rare events for Markov chains (cont.). As

$$c_n = P(X_{0:n} \in B) = \int \mathbb{1}_B(x_{0:n}) f(x_{0:n})\,dx_{0:n},$$

a first naive approach could of course be to use standard MC and simply

1. simulate the Markov chain $N$ times, yielding trajectories $(X_{0:n}^i)_{i=1}^N$,
2. count the number $N_B$ of trajectories that fall into $B$, and
3. estimate $c_n$ using the MC estimator $c_n^N = N_B/N$ (see the sketch below).

Problem: if $c_n = 10^{-9}$ we may expect to produce a billion draws before obtaining a single draw belonging to $B$! As we will see, SMC methods solve the problem efficiently.
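For the AR(1) chain above, a minimal sketch of the naive estimator, with $B$ taken as the band $(-b, b)^{n+1}$ (the values of alpha, b, n, and N are illustrative):

% Naive MC estimate of c_n = P(X_{0:n} in B), B = (-b,b)^{n+1}
alpha = 0.8; b = 2; n = 50; N = 1e5;
NB = 0;
for i = 1:N
    x = 0; inB = abs(x) < b;           % X_0 = 0
    for k = 1:n
        x = alpha*x + randn;
        inB = inB && (abs(x) < b);
        if ~inB, break; end            % trajectory already left B
    end
    NB = NB + inB;
end
c_hat = NB/N                           % useless when c_n is very small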


Estimation in general hidden Markov models (HMMs). A hidden Markov model (HMM) comprises two stochastic processes:

1. A Markov chain $(X_k)_{k \geq 0}$ with transition density $q$: $X_{k+1} \mid X_k = x_k \sim q(x_{k+1} \mid x_k)$. The Markov chain is not seen by us (hidden) but partially observed through

2. an observation process $(Y_k)_{k \geq 0}$ such that, conditionally on the chain $(X_k)_{k \geq 0}$, (i) the $Y_k$'s are independent, with (ii) the conditional distribution of each $Y_k$ depending on the corresponding $X_k$ only.

The density of the conditional distribution $Y_k \mid (X_k)_{k \geq 0} \stackrel{d}{=} Y_k \mid X_k$ will be denoted by $p(y_k \mid x_k)$.

Estimation in general HMMs (cont.) Graphically:

[Diagram: the observations $\ldots, Y_{k-1}, Y_k, Y_{k+1}, \ldots$ each hang off the corresponding hidden state of the Markov chain $\ldots, X_{k-1}, X_k, X_{k+1}, \ldots$]

$Y_k \mid X_k = x_k \sim p(y_k \mid x_k)$ (observation density)
$X_{k+1} \mid X_k = x_k \sim q(x_{k+1} \mid x_k)$ (transition density)
$X_0 \sim \chi(x_0)$ (initial distribution)

Example HMM: A stochastic volatility model. The following dynamical system is used in financial economics (see Taylor, 1982). Let

$$X_{k+1} = \alpha X_k + \sigma \epsilon_{k+1}, \qquad Y_k = \beta \exp(X_k/2)\,\varepsilon_k,$$

where $\alpha \in (0, 1)$, $\sigma > 0$, and $\beta > 0$ are constants and $(\epsilon_k)_{k \geq 1}$ and $(\varepsilon_k)_{k \geq 0}$ are sequences of i.i.d. standard normally distributed noise variables. In this model the values of the observation process $(Y_k)$ are observed daily log-returns (from e.g. the Swedish OMXS30 index) and the hidden chain $(X_k)$ is the unobserved log-volatility (modeled by a stationary AR(1) process). The strength of this model is that it allows for volatility clustering, a phenomenon that is often observed in real financial time series.

Example HMM: A stochastic volatility model. A realization of the model looks as follows (here $\alpha = 0.975$, $\sigma = 0.16$, and $\beta = 0.63$).

[Figure: simulated log-returns $(Y_k)$ and log-volatility process $(X_k)$ over 1000 days.]
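Such a realization can be generated with a few lines of MATLAB (a sketch; initializing $X_0$ from the stationary $N(0, \sigma^2/(1 - \alpha^2))$ distribution of the AR(1) chain is an assumption):

% Simulate the SV model (alpha = 0.975, sigma = 0.16, beta = 0.63)
alpha = 0.975; sigma = 0.16; beta = 0.63; n = 1000;
X = zeros(n,1); Y = zeros(n,1);
X(1) = sigma/sqrt(1 - alpha^2)*randn;   % X_0 from stationary distribution
Y(1) = beta*exp(X(1)/2)*randn;
for k = 2:n
    X(k) = alpha*X(k-1) + sigma*randn;  % log-volatility (hidden)
    Y(k) = beta*exp(X(k)/2)*randn;      % log-return (observed)
end
plot(1:n, Y, 1:n, X)                    % log-returns and log-volatility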

Example HMM: A stochastic volatility model. Daily log-returns from the Swedish stock index OMXS30, from 2005-03-30 to 2007-03-30.

[Figure: observed daily log-returns $y_k$, $k = 0, \ldots, 500$.]

The smoothing distribution. When operating on HMMs, one is most often interested in the smoothing distribution $f_n(x_{0:n} \mid y_{0:n})$, i.e. the conditional distribution of a set $X_{0:n}$ of hidden states given $Y_{0:n} = y_{0:n}$.

Theorem (Smoothing distribution).

$$f_n(x_{0:n} \mid y_{0:n}) = \frac{\chi(x_0) p(y_0 \mid x_0) \prod_{k=1}^n p(y_k \mid x_k) q(x_k \mid x_{k-1})}{L_n(y_{0:n})},$$

where $L_n(y_{0:n})$ is the likelihood function, given by the density of the observations $y_{0:n}$:

$$L_n(y_{0:n}) = \int \cdots \int \chi(x_0) p(y_0 \mid x_0) \prod_{k=1}^n p(y_k \mid x_k) q(x_k \mid x_{k-1})\,dx_0 \cdots dx_n.$$

Estimation of smoothed expectations. Being a high-dimensional (say $n = 1000$ or $10{,}000$) integral over complicated integrands, $L_n(y_{0:n})$ is in general unknown. However, by writing

$$\tau_n = E(\phi(X_{0:n}) \mid Y_{0:n} = y_{0:n}) = \int \phi(x_{0:n}) f_n(x_{0:n} \mid y_{0:n})\,dx_0 \cdots dx_n = \int \phi(x_{0:n}) \frac{z_n(x_{0:n})}{c_n}\,dx_0 \cdots dx_n,$$

with

$$z_n(x_{0:n}) = \chi(x_0) p(y_0 \mid x_0) \prod_{k=1}^n p(y_k \mid x_k) q(x_k \mid x_{k-1}), \qquad c_n = L_n(y_{0:n}),$$

we may cast the problem of computing $\tau_n$ into the framework of self-normalized IS. In particular we would like to update the approximation sequentially, in $n$, as new data $(Y_k)$ appear.
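Recall the mechanics of self-normalized IS: the same weighted sample estimates both $\tau_n$ and $c_n$. A minimal runnable MATLAB sketch on a toy problem ($z$, $\phi$, and the instrumental $g$ are chosen purely for illustration, not taken from the slides):

% Self-normalized IS on a toy problem: z(x) = exp(-x^2/2), so f = N(0,1)
% and c = sqrt(2*pi); phi(x) = x^2; instrumental g = N(0,2^2)
N = 1e4;
X = 2*randn(N,1);                        % draws from g
g = exp(-X.^2/8)/(2*sqrt(2*pi));         % instrumental density at X
w = exp(-X.^2/2)./g;                     % unnormalized weights z(X)/g(X)
tau_hat = sum(w.*X.^2)/sum(w)            % self-normalized estimate, approx 1
c_hat = mean(w)                          % estimate of c, approx sqrt(2*pi)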


Self-avoiding walks (SAWs) in the two-dimensional integer lattice $\mathbb{Z}^2$. Let

$$S_n \stackrel{\text{def}}{=} \{x_{0:n} \in \mathbb{Z}^{2n} : x_0 = 0,\ |x_{k+1} - x_k| = 1,\ x_k \neq x_l,\ 0 \leq l < k \leq n\}$$

be the set of $n$-step self-avoiding walks in $\mathbb{Z}^2$.

[Figure: a self-avoiding walk of length 50 in $\mathbb{Z}^2$.]

Self-avoiding walks (SAWs) in the honeycomb lattice (HCL). Let

$$S_n \stackrel{\text{def}}{=} \{x_{0:n} \in \text{HCL} : x_0 = 0,\ |x_{k+1} - x_k| = 1,\ x_k \neq x_l,\ 0 \leq l < k \leq n\}$$

be the set of $n$-step self-avoiding walks in the HCL.

[Figure: a self-avoiding walk of length 50 in the HCL.]

Application of SAWs. SAWs are used in polymer science, for describing long chain polymers, with the self-avoidance condition modeling the excluded volume effect, and in statistical mechanics and the theory of critical phenomena in equilibrium. In addition, let $c_n = |S_n|$, the number of possible SAWs of length $n$. However, computing $c_n$ (and analyzing how $c_n$ depends on $n$) is known to be a very challenging combinatorial problem!

An MC approach to SAWs. Trick: let $f_n(x_{0:n})$ be the uniform distribution on $S_n$:

$$f_n(x_{0:n}) = \frac{1}{c_n} \underbrace{\mathbb{1}_{S_n}(x_{0:n})}_{= z_n(x_{0:n})}, \qquad x_{0:n} \in \mathbb{Z}^{2n}.$$

We may thus cast the problem of computing the number $c_n$ (= the normalizing constant of $f_n$) into the framework of self-normalized IS based on some convenient instrumental distribution $g_n$ on $\mathbb{Z}^{2n}$. In addition, solving this problem for $n = 1, 2, 3, \ldots, 508, 509, \ldots$ calls for a sequential implementation of IS. This will be the topic of HA2!
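As a naive warm-up (not the sequential scheme of HA2), one may take $g_n$ uniform over all $4^n$ nearest-neighbour walks; then $c_n = 4^n\,P(\text{walk is self-avoiding})$, which a short MATLAB sketch can estimate for small $n$:

% Naive estimate of c_n = |S_n| via uniform nearest-neighbour walks
n = 10; N = 1e5;
steps = [1 0; -1 0; 0 1; 0 -1];
NSA = 0;
for i = 1:N
    x = zeros(n+1,2);                           % x(1,:) is the origin
    for k = 1:n
        x(k+1,:) = x(k,:) + steps(randi(4),:);  % uniform step
    end
    NSA = NSA + (size(unique(x,'rows'),1) == n+1);  % all sites distinct?
end
c_hat = 4^n*NSA/N
% For large n almost no uniform walk is self-avoiding,
% which is exactly why a sequential IS implementation is needed.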