Monte Carlo and Empirical Methods for Stochastic Inference (MASM11/FMS091)
Magnus Wiktorsson, Centre for Mathematical Sciences, Lund University, Sweden
Lecture 3: Importance sampling
January 27, 2015
Last time: MC output analysis
We used the CLT $\sqrt{N}(\tau_N - \tau) \overset{d}{\to} N(0, \sigma^2(\phi))$ to target $\tau$ by the approximate two-sided confidence interval
$\left( \tau_N - \lambda_{\alpha/2} \frac{\sigma(\phi)}{\sqrt{N}},\ \tau_N + \lambda_{\alpha/2} \frac{\sigma(\phi)}{\sqrt{N}} \right)$.
In addition, we discussed how to estimate $\varphi(\tau)$ for some function $\varphi : \mathbb{R} \to \mathbb{R}$, having at hand an estimator $\tau_N$ of $\tau$. If $\varphi \in C^1$ one may prove the CLT
$\sqrt{N}(\varphi(\tau_N) - \varphi(\tau)) \overset{d}{\to} N(0, \varphi'(\tau)^2 \sigma^2(\phi))$.
Consequently, the natural estimator $\varphi(\tau_N)$ works fine, at least asymptotically (but suffers in general from bias for finite N).
Example: Buffon's needle
Consider a wooden floor with parallel boards of width d on which we randomly drop a needle of length l, with l ≤ d. Let
X = distance from the lower needlepoint to the upper board edge line,
θ = angle between the needle and the board edge normal.
Then
$\tau = P(\text{needle intersects board edge}) = P(X \le l \cos\theta) = \dots = \frac{2l}{\pi d}$,
or, equivalently, $\pi = \frac{2l}{\tau d}$.
Example: Buffon's needle (cont.)
Since $\tau = P(\text{needle intersects board edge}) = E(\mathbf{1}\{X \le l \cos\theta\})$ can be easily estimated by means of MC, an approximation of $\pi = \varphi(\tau) = 2l/(\tau d)$ can be obtained via the delta method:
Figure: The estimate of π plotted against the number of samples N (N = 1 to 1000).
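A minimal MATLAB sketch of this estimator (the parameter choice l = d = 1 and the variable names are my own assumptions; the slide does not fix them): simulate X ~ U(0, d) and θ ~ U(0, π/2), estimate τ by the running fraction of intersections, and plug it into π ≈ 2l/(τ_N d).
% Sketch: MC estimate of pi via Buffon's needle (assumed l = d = 1)
l = 1; d = 1;                     % needle length and board width, l <= d
N = 1000;
X = d*rand(1,N);                  % distance to the upper board edge line, U(0,d)
theta = (pi/2)*rand(1,N);         % angle to the board edge normal, U(0,pi/2)
hits = (X <= l*cos(theta));       % indicators of intersection
tauN = cumsum(hits)./(1:N);       % running MC estimate of tau = 2l/(pi*d)
piEst = 2*l./(tauN*d);            % delta-method plug-in estimate of pi (Inf until the first hit)
plot(1:N, piEst)
xlabel('Number of samples N'); ylabel('\pi estimate')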
Last time: pseudo-random number generation
We discussed (briefly) how to generate pseudo-random, uniformly distributed numbers $(U_n)$ using the linear congruential generator
$U_n = (a U_{n-1} + c) \bmod m$.
Having at hand such U(0, 1)-distributed numbers U, we also looked at how to generate random numbers X from an arbitrary distribution F by means of the inversion method, i.e., by letting
$X = F^{-1}(U) = \inf\{x \in \mathbb{R} : F(x) \ge U\}$ (the generalized inverse).
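As a quick illustration (my own example, not from the slide), the inversion method for the Exp(λ) distribution, where $F(x) = 1 - e^{-\lambda x}$ has the explicit inverse $F^{-1}(u) = -\log(1-u)/\lambda$:
% Sketch: inversion method for Exp(lambda), F^{-1}(u) = -log(1-u)/lambda
lambda = 2;
N = 1e4;
U = rand(1,N);                    % U(0,1) draws
X = -log(1-U)/lambda;             % Exp(lambda)-distributed draws
[counts, centers] = hist(X, 50);  % normalized histogram vs. the true density
bar(centers, counts/(N*(centers(2)-centers(1))), 1); hold on
x = linspace(0, max(X), 200);
plot(x, lambda*exp(-lambda*x))    % true Exp(lambda) density for comparison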
Conditional methods
Let f be a multivariate density on $\mathbb{R}^d$. By decomposing f into conditional densities according to
$f(x_1, \dots, x_d) = f(x_1) \prod_{l=2}^{d} f(x_l \mid x_1, \dots, x_{l-1})$,
the problem of sampling from a multivariate density can be reduced to that of sampling from several univariate densities:
draw $X_1 \sim f(x_1)$
for l = 2 → d do
  draw $X_l \sim f(x_l \mid X_1, \dots, X_{l-1})$
end for
return $X = (X_1, \dots, X_d)$
Trivially, the resulting draw X has the correct distribution f. This method presumes that the conditional densities are easily obtained, which is not always the case.
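As a concrete illustration (my own example; the slide leaves the densities abstract), a bivariate normal can be sampled this way since its conditionals are again normal: if $(X_1, X_2)$ has unit variances and correlation ρ, then $X_2 \mid X_1 = x_1 \sim N(\rho x_1, 1 - \rho^2)$.
% Sketch: conditional sampling of a standard bivariate normal with correlation rho
rho = 0.7;
N = 1e4;
X1 = randn(1,N);                          % X1 ~ N(0,1), the marginal
X2 = rho*X1 + sqrt(1-rho^2)*randn(1,N);   % X2 | X1 ~ N(rho*X1, 1-rho^2)
X = [X1; X2];                             % each column is a draw from f
corrcoef(X1, X2)                          % empirical correlation, approx. rho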
Last time: pseudo-random number generation
In many cases we do not know the inverse of F or not even the normalizing constant of the density f. However, if g is another density such that $f(x) \le K g(x)$ for all $x \in \mathbb{R}^d$ and some constant $K < \infty$, we may use rejection sampling:
repeat
  draw $X^* \sim g$
  draw $U \sim U(0, 1)$
until $U \le \dfrac{f(X^*)}{K g(X^*)}$
$X \leftarrow X^*$
return X
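A minimal generic MATLAB sketch of this loop (the function name and handle arguments are my own; the slide gives only pseudocode):
% Sketch (my own wrapper): generic rejection sampler, saved as rejsample.m.
% f and gpdf are density handles, grnd() draws one sample from g,
% and K satisfies f(x) <= K*g(x) for all x.
function X = rejsample(f, gpdf, grnd, K)
    while true
        Xcand = grnd();                       % candidate X* ~ g
        if rand <= f(Xcand)/(K*gpdf(Xcand))   % accept with probability f/(K*g)
            X = Xcand;
            return
        end
    end
end
Note that the same loop works with an unnormalized density z(x) = c f(x) in place of f, as long as K is chosen so that z(x) ≤ K g(x); the acceptance ratio is unchanged up to the constant.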
Last time: pseudo-random number generation
Theorem (Rejection sampling)
The output X of the rejection sampling algorithm is a random variable with density function f. Moreover, the expected number of trials needed before acceptance is K.
Example
We wish to simulate $f(x) = \exp(\cos^2(x))/c$, $x \in (-\pi/2, \pi/2)$, where
$c = \int_{-\pi/2}^{\pi/2} \exp(\cos^2(z))\, dz = \pi e^{1/2} I_0(1/2)$
is the normalizing constant. However, since for all $x \in (-\pi/2, \pi/2)$,
$f(x) = \frac{\exp(\cos^2(x))}{c} \le \frac{e}{c} = \underbrace{\frac{e\pi}{c}}_{K} \cdot \underbrace{\frac{1}{\pi}}_{g(x)}$,
where g is the density of $U(-\pi/2, \pi/2)$, we may use rejection sampling where a candidate $X^* \sim U(-\pi/2, \pi/2)$ is accepted if
$U \le \frac{f(X^*)}{K g(X^*)} = \frac{\exp(\cos^2(X^*))/c}{e/c} = \exp(\cos^2(X^*) - 1)$.
prob = @(x) exp(cos(x).^2 - 1);     % acceptance probability exp(cos^2(x) - 1)
trial = 1; accepted = false;
while ~accepted,
    Xcand = -pi/2 + pi*rand;        % candidate from U(-pi/2, pi/2)
    if rand < prob(Xcand),
        accepted = true;
        X = Xcand;
    else
        trial = trial + 1;
    end
end
Figure: Histogram of 20,000 accept-reject draws together with the true density $f(x) = \exp(\cos^2(x))/c$. The average number of trials was 1.5555 ($K = e^{1/2}/I_0(1/2) \approx 1.5503$).
Plan of today's lecture
1 Importance sampling (IS)
2 Self-normalized importance sampling
3 Home assignment 1
We are here
1 Importance sampling (IS)
2 Self-normalized importance sampling
3 Home assignment 1
Advantages of the MC method
The MC method is more efficient than deterministic methods in high dimensions, does in general not require knowledge of the normalizing constant of a density for computing expectations, and handles efficiently strange integrands that may cause problems for deterministic methods.
Figure: Plot of the irregular integrand $h(x) = \sin^2(1/\cos(\log(1 + 2\pi x)))$ on (0, 1).
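A quick sketch of what "handles strange integrands" means in practice (my own illustration, under the assumption that we want the integral of h over (0, 1)): plain MC integration only needs pointwise evaluations of h and no smoothness.
% Sketch: plain MC integration of the irregular integrand h on (0,1)
h = @(x) sin(1./cos(log(1 + 2*pi*x))).^2;
N = 1e5;
U = rand(1,N);                        % U(0,1) draws
tauN = mean(h(U));                    % MC estimate of the integral of h over (0,1)
se = std(h(U))/sqrt(N);               % MC standard error
fprintf('integral approx %.4f (std err %.4f)\n', tauN, se);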
Problems with MC integration
OK, MC integration looks promising. We may however run into problems if it is hard to sample from f or if the integrand φ and the density f are dissimilar; in this case we will end up with a lot of draws where the integrand is small, and consequently only a few draws will contribute to the estimate. This gives a large variance.
Figure: Plot of a mismatched integrand φ(x) and density f(x).
These problems can often be solved using importance sampling.
Importance sampling (IS, Ch. 6.4.1)
The basis of importance sampling is to take an instrumental density g on X such that $g(x) = 0 \Rightarrow f(x) = 0$ and rewrite the integral as
$\tau = E_f(\phi(X)) = \int_X \phi(x) f(x)\, dx = \int_{f(x)>0} \phi(x) f(x)\, dx = \int_{g(x)>0} \phi(x) \frac{f(x)}{g(x)}\, g(x)\, dx = E_g\!\left( \phi(X) \frac{f(X)}{g(X)} \right) = E_g(\phi(X)\omega(X))$,
where
$\omega : \{x \in X : g(x) > 0\} \ni x \mapsto \frac{f(x)}{g(x)}$
is the so-called importance weight function.
Importance sampling (cont.)
We may now estimate $\tau = E_g(\phi(X)\omega(X))$ using standard MC:
for i = 1 → N do
  draw $X_i \sim g$
end for
set $\tau_N \leftarrow \frac{1}{N} \sum_{i=1}^{N} \phi(X_i)\omega(X_i)$
return $\tau_N$
Here, trivially, $V(\tau_N) = \frac{1}{N} V_g(\phi(X)\omega(X))$, and we should thus aim at choosing g so that the function $x \mapsto \phi(x)\omega(x)$ is close to constant on the support of g. This gives a minimal variance.
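A small self-contained sketch of this recipe (the integrand and the two normal densities below are my own toy choices, not from the slide; the true answer is $E_f(X^2) = 1$):
% Sketch of the generic IS recipe with assumed toy choices of phi, f and g
phi  = @(x) x.^2;                      % example integrand (assumed)
f    = @(x) normpdf(x, 0, 1);          % example target density (assumed)
g    = @(x) normpdf(x, 0, 2);          % example instrumental density (assumed)
N = 1e5;
X = 2*randn(1, N);                     % N i.i.d. draws from g = N(0,4)
omega = f(X)./g(X);                    % importance weights f(X)/g(X)
tauN = mean(phi(X).*omega)             % IS estimate of E_f(phi(X)), approx. 1
varEst = var(phi(X).*omega)/N;         % estimated variance of tauN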
Example: A tricky normal expectation
Let X have a N(2, 1) distribution and try to compute
$\tau = E\!\left( \mathbf{1}\{X \ge 0\} \sqrt{X} \exp(-X^3) \right) = \int \underbrace{\mathbf{1}\{x \ge 0\} \sqrt{x} \exp(-x^3)}_{=\phi(x)} \underbrace{N(x; 2, 1)}_{=f(x)}\, dx$,
where $N(x; \mu, \sigma^2)$ denotes the density of the normal distribution.
Figure: Plot of φ(x) and the N(2, 1) density f(x).
Here the support of f is significantly larger than that of φ.
Example: A tricky normal expectation (cont.)
Thus, standard MC will lead to a waste of computational power. Better is to use IS with g being a scale-location-transformed normal distribution:
Figure: Plot of φ(x), f(x), the instrumental density g(x), and the product φ(x)f(x).
Example: A tricky normal expectation (cont.) phi = @(x) (x >= 0).*sqrt(x).*exp( x.^3); mu = 0.8; sigma = 0.4; omega = @(x) normpdf(x,2,1)./normpdf(x,mu,sigma); X = sigma*randn(1,n)+mu; tau = mean(f(x).*omega(x)); 0.13 0.12 0.11 Importance Sampling 0.1 0.09 0.08 0.07 Standard MC 0.06 0.05 0.04 0.03 0 500 1000 1500 M. Wiktorsson Sample size N Monte Carlo and Empirical Methods for Stochastic Inference, L3 (19)
We are here
1 Importance sampling (IS)
2 Self-normalized importance sampling
3 Home assignment 1
Self-normalized IS (Ch. 6.4.1)
Often f(x) is known only up to a normalizing constant c > 0, i.e. f(x) = z(x)/c, where we can evaluate z(x) = c f(x) but not f(x). Then, as before,
$\tau = E_f(\phi(X)) = \int_X \phi(x) f(x)\, dx = \frac{\int_{f(x)>0} \phi(x)\, c f(x)\, dx}{\int_{f(x)>0} c f(x)\, dx} = \frac{\int_{g(x)>0} \phi(x) \frac{c f(x)}{g(x)}\, g(x)\, dx}{\int_{g(x)>0} \frac{c f(x)}{g(x)}\, g(x)\, dx} = \frac{E_g(\phi(X)\omega(X))}{E_g(\omega(X))}$,
where we are able to evaluate
$\omega : \{x \in X : g(x) > 0\} \ni x \mapsto \frac{z(x)}{g(x)}$.
Self-normalized IS (cont.)
Thus, having generated a sample $X_1, \dots, X_N$ from g we may estimate the numerator $E_g(\phi(X)\omega(X))$ as well as the denominator $E_g(\omega(X))$ using standard MC:
$\tau = \frac{E_g(\phi(X)\omega(X))}{E_g(\omega(X))} \approx \frac{\frac{1}{N}\sum_{i=1}^{N} \phi(X_i)\omega(X_i)}{\frac{1}{N}\sum_{l=1}^{N} \omega(X_l)} = \sum_{i=1}^{N} \underbrace{\frac{\omega(X_i)}{\sum_{l=1}^{N} \omega(X_l)}}_{\text{normalized weight}} \phi(X_i)$.
Note that the denominator yields an estimate of the normalizing constant c:
$c = E_g(\omega(X)) \approx \frac{1}{N}\sum_{l=1}^{N} \omega(X_l)$.
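A small self-contained sketch of the recipe (my own toy target, not from the slide): take the unnormalized density $z(x) = \exp(-x^4/4)$ and the instrumental $g = N(0, 1)$, and estimate both $E_f(X^2)$ and the normalizing constant c.
% Sketch: self-normalized IS for the assumed unnormalized target z(x) = exp(-x^4/4)
z = @(x) exp(-x.^4/4);                 % unnormalized target density
g = @(x) normpdf(x, 0, 1);             % instrumental density N(0,1)
N = 1e5;
X = randn(1, N);                       % draws from g
omega = z(X)./g(X);                    % weights z(X)/g(X)
tauN = sum(omega.*X.^2)/sum(omega)     % self-normalized estimate of E_f(X^2)
cEst = mean(omega)                     % estimate of the normalizing constant c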
Example
We reconsider the density $f(x) = \exp(\cos^2(x))/c$, $x \in (-\pi/2, \pi/2)$, treated last time and estimate its variance as well as the normalizing constant c > 0 using self-normalized IS. Let the instrumental distribution g be the uniform distribution $U(-\pi/2, \pi/2)$.
Example (cont.)
N = 800;                                              % sample size (the figure uses N up to 800)
z = @(x) exp(cos(x).^2);                              % unnormalized target density
X = -pi/2 + pi*rand(1, N);                            % draws from g = U(-pi/2, pi/2)
omega = @(x) pi*z(x);                                 % weights z(x)/g(x), with g(x) = 1/pi
tau = cumsum(X.^2.*omega(X))./cumsum(omega(X));       % running variance estimate (E_f(X) = 0 by symmetry)
c = cumsum(omega(X))./(1:N);                          % running estimate of the normalizing constant
subplot(2,1,1); plot(1:N, c);
subplot(2,1,2); plot(1:N, tau);
Figure: Running estimates of the normalizing constant (top) and the variance (bottom) against the sample size N.
The weighted sample $(X_i, \omega(X_i))$ can be viewed as a MC representation of the target distribution f.
Figure: Left: the target density $f(x) = \exp(\cos^2(x))/c$ on $(-\pi/2, \pi/2)$ together with the instrumental density g(x). Right: the MC representation, i.e. the weighted sample $(X_i, \omega(X_i))$.
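One way to make this concrete (my own sketch, continuing the code from the previous slide): a weighted histogram of the draws approximates f without ever knowing c.
% Sketch: weighted histogram of (X_i, omega(X_i)) approximating f
edges = linspace(-pi/2, pi/2, 41);
centers = edges(1:end-1) + diff(edges)/2;
w = omega(X)/sum(omega(X));                   % normalized weights
fHat = zeros(size(centers));
for k = 1:length(centers)                     % total normalized weight per bin
    inBin = X >= edges(k) & X < edges(k+1);
    fHat(k) = sum(w(inBin));
end
bar(centers, fHat/(edges(2)-edges(1)), 1);    % density-scaled weighted histogram, compare with f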
We are here
1 Importance sampling (IS)
2 Self-normalized importance sampling
3 Home assignment 1
HA1: Simulation and Monte Carlo integration
HA1 comprises one question on random number generation and two larger questions on IS (one- and two-dimensional problems) containing one subquestion (2(c)) on variance reduction (which we will discuss next time).
Submission:
A written report in PDF format (no MS Word files). The pair Errol Norqvist and Lillemor Dahlstedt name their report file proj1-dlne.pdf.
A printed and stapled copy of the report is given to the lecturer at the very beginning of the lecture on Tuesday 10 Feb.
An email containing the report file as well as all your m-files, with a file proj1.m that runs your analysis. This email has to be sent to fms091@matstat.lu.se before Tuesday 10 Feb, 15:00:00 (that is, 15 minutes before the beginning of the lecture). Use your STIL-IDs in the subject line of the email. Set the subject line to: Project 1 by STILID1 and STILID2
Late submissions do not qualify for marks higher than 3.
Instructions on report writing
Explain carefully all introduced notation ("X = ?"). Describe/explain the model.
The text should be readable without access to the Matlab code; write plain text instead of including Matlab code in the report. Include your solutions in the text; do not write "the calculations can be found in the Matlab code" or similar.
When referring to the lecture notes or the book, be specific (i.e. refer to which chapter/lecture).
Refer to your figures in the text. Explain colors etc. in the figure captions (a figure caption is almost never too long).
Write clear motivations and reasonings when it concerns the choice of instrumental distributions etc.