Machine Learning in Computer Vision Markov Random Fields Part II


1 Machine Learning in Computer Vision: Markov Random Fields, Part II. Oren Freifeld. Computer Science, Ben-Gurion University. March 22, 2018.

2 Outline: 1. Some MRF Computations

3 Few Things
Plan for what is coming up next (not just today):
Computations on MRFs (i.e., inference):
- where we can compute things exactly, using Dynamic Programming;
- Markov Chain Monte Carlo (MCMC) and Gibbs sampling;
- a brief discussion of additional methods for inference.
Several specific MRF models (e.g., the Ising model) and their applications in computer vision.
HW #2 will focus on implementing exact sampling from the Ising-model prior (but DP is also applicable to the posterior).
HW #3 will focus on implementing MCMC sampling (particularly Gibbs sampling) from the Ising-model prior and posterior (given observations corrupted by Gaussian noise).

4 Some MRF Computations: Computing on General Graphs
Things we may want to compute:
The most likely state (possibly conditioned on some observations). Example: the most probable binary segmentation given a noisy image.
Marginals, normalizing constants, and expectations. Example: for $p(x) = \frac{1}{Z} \exp\left(-\sum_{c \in \mathcal{C}} E_c(x_c)\right)$, compute $Z = \sum_x \exp\left(-\sum_{c \in \mathcal{C}} E_c(x_c)\right)$.
Sampling from $p$. Example: sampling an image from a distribution over images (Ulf Grenander's Pattern Theory: "model synthesis = model analysis").
In all cases, exploiting the conditional independencies captured by $G$ is key.

5 Some MRF Computations: Many Approaches for Computing on MRFs
If $R$ (the range of each $x_s$) is finite and $G$ is special (e.g., linear, or a tree), there are efficient methods with theoretical guarantees; in fact, these methods are often used even when there is no special structure (hence no guarantees).
In Gaussian MRFs, things can often be done either in closed form or very efficiently (exploiting sparse linear algebra and the sparsity of $Q = \Sigma^{-1}$).
If $R$ is finite and $G$ is either small or big-but-not-too-complicated, things can be done exactly using Dynamic Programming. If the clique functions have special structure, this can often be exploited.
If $R$ is finite, graph-theoretic methods can often efficiently (locally) maximize $p$.
Markov Chain Monte Carlo (MCMC).
Methods that approximate $G$ and/or $p$ (e.g., using a simpler graph).
We will touch upon only some of these.

6 Dynamic Programming
Not directly applicable to continuous RVs.
Impractical if the associated complexity of the MRF is too high.
When applicable, the computations are exact.
Consider, as a prototype problem, working with $p(x_A \mid y_B)$ (more generally, working with $p(x)$, an MRF w.r.t. some graph $G$).
Let us start with finding most-likely (argmax) configurations.

7 Dynamic Programming for Argmax: Example
$S = \{s, t, u, v, w\}$, $x = (x_s, x_t, x_u, x_v, x_w)$
$p(x) = H_{uvw}(x_u, x_v, x_w) H_u(x_u) H_{ut}(x_u, x_t) H_{vt}(x_v, x_t) H_t(x_t) H_{us}(x_u, x_s)$
$\mathcal{C} = \{\{u,v,w\}, \{u\}, \{u,t\}, \{v,t\}, \{t\}, \{u,s\}\}$
[Figure: dependency graph for $p(x)$ over the nodes $x_s, x_t, x_u, x_v, x_w$]

8 Dynamic Programming for Argmax: Example
Step 1: Absorb sub-cliques (optional; facilitates bookkeeping) and redefine $\mathcal{C}$ (the set of relevant cliques) accordingly. E.g., with
$F_{uvw} = H_{uvw}$, $F_{ut} = H_u H_{ut}$, $F_{vt} = H_{vt} H_t$, $F_{us} = H_{us}$,
we convert
$p(x) = H_{uvw}(x_u, x_v, x_w) H_u(x_u) H_{ut}(x_u, x_t) H_{vt}(x_v, x_t) H_t(x_t) H_{us}(x_u, x_s)$, $\mathcal{C} = \{\{u,v,w\}, \{u\}, \{u,t\}, \{v,t\}, \{t\}, \{u,s\}\}$
to
$p(x) = F_{uvw}(x_u, x_v, x_w) F_{ut}(x_u, x_t) F_{vt}(x_v, x_t) F_{us}(x_u, x_s)$, $\mathcal{C} = \{\{u,v,w\}, \{u,t\}, \{v,t\}, \{u,s\}\}$

9 Dynamic Programming for Argmax: Example
$S = \{s, t, u, v, w\}$, $x = (x_s, x_t, x_u, x_v, x_w)$
$p(x) = F_{uvw}(x_u, x_v, x_w) F_{ut}(x_u, x_t) F_{vt}(x_v, x_t) F_{us}(x_u, x_s)$
$\mathcal{C} = \{\{u,v,w\}, \{u,t\}, \{v,t\}, \{u,s\}\}$
[Figure: dependency graph for $p(x)$]
Remark: we could have done better with $\mathcal{C} = \{\{u,v,w\}, \{u,v,t\}, \{u,s\}\}$, but that would have hidden some of the structure of the factorization.

10 Dynamic Programming for Argmax: Example
Step 2: Pick a site-visitation schedule, i.e., an ordering (and rename the sites accordingly):
$S = \{1, 2, 3, 4, 5\}$, $x = (x_1, x_2, x_3, x_4, x_5)$
$p(x) = F_{123}(x_1, x_2, x_3) F_{24}(x_2, x_4) F_{34}(x_3, x_4) F_{35}(x_3, x_5)$
$\mathcal{C} = \{\{1,2,3\}, \{2,4\}, \{3,4\}, \{3,5\}\}$
[Figure: dependency graph for $p(x)$ over the nodes $x_1, \ldots, x_5$]

11 Dynamic Programming for Argmax: Example
During the blood and gore of the next steps, keep in mind that the only thing that is really happening here is this:
$$\max_x p(x) = \max_{x_5} \max_{x_4} \max_{x_3} \max_{x_2} \max_{x_1} p(x) = \max_{x_5} \max_{x_4} \max_{x_3} \max_{x_2} \max_{x_1} F_{123} F_{24} F_{34} F_{35}$$
$$= \underbrace{\max_{x_5} \underbrace{\max_{x_4} \underbrace{\max_{x_3} F_{34} F_{35} \underbrace{\max_{x_2} F_{24} \underbrace{\max_{x_1} F_{123}}_{C_1(x_2, x_3)}}_{C_2(x_3, x_4)}}_{C_3(x_4, x_5)}}_{C_4(x_5)}}_{C_5}$$
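To make the identity concrete, here is a minimal numpy sketch (numpy and the random stand-in factor tables are additions for illustration, not part of the slides): each $C_k$ is just a table over the current boundary variables, and eliminating innermost-first reproduces the brute-force maximum.

```python
import numpy as np

rng = np.random.default_rng(0)
r = 3                         # |R|: size of the common state space (toy choice)
F123 = rng.random((r, r, r))  # F123[x1, x2, x3]
F24 = rng.random((r, r))      # F24[x2, x4]
F34 = rng.random((r, r))      # F34[x3, x4]
F35 = rng.random((r, r))      # F35[x3, x5]

# Brute force: materialize all r**5 products, then max over everything.
full = (F123[:, :, :, None, None] * F24[None, :, None, :, None]
        * F34[None, None, :, :, None] * F35[None, None, :, None, :])

# Eliminate sites innermost-first, exactly as in the display above.
C1 = F123.max(axis=0)                                # C1[x2, x3] = max_{x1} F123
C2 = (C1[:, :, None] * F24[:, None, :]).max(axis=0)  # C2[x3, x4]
C3 = (C2[:, :, None] * F34[:, :, None] * F35[:, None, :]).max(axis=0)  # C3[x4, x5]
C4 = C3.max(axis=0)                                  # C4[x5]
C5 = C4.max()                                        # = max_x of the product
assert np.isclose(full.max(), C5)
```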

12 Dynamic Programming for Argmax: Example
Step 3: Define boundaries and innovation cliques, $k \in \{1, 2, \ldots, |S|\}$.
Boundaries: $b_k$ = boundary of $\{1, 2, \ldots, k\}$ in $G$ $= \{l \in S : l > k \text{ and } l \in c \text{ for some } c \in \mathcal{C} \text{ s.t. } c \cap \{1, 2, \ldots, k\} \neq \emptyset\}$.
E.g., for $\mathcal{C} = \{\{1,2,3\}, \{2,4\}, \{3,4\}, \{3,5\}\}$ (instantiating the definition with $k = 1, \ldots, 5$):
$b_1 = \{2, 3\}$, $b_2 = \{3, 4\}$, $b_3 = \{4, 5\}$, $b_4 = \{5\}$, $b_5 = \emptyset$.

13 Dynamic Programming for Argmax: Example
Step 3 (cont.): Innovation cliques: $\mathcal{C}_k$ = the relevant new cliques at step $k$ $= \{c \in \mathcal{C} : k \in c \text{ and } l \notin c \ \forall l < k\}$.
E.g., for $\mathcal{C} = \{\{1,2,3\}, \{2,4\}, \{3,4\}, \{3,5\}\}$ (instantiating the definition with $k = 1, \ldots, 5$):
$\mathcal{C}_1 = \{\{1,2,3\}\}$, $\mathcal{C}_2 = \{\{2,4\}\}$, $\mathcal{C}_3 = \{\{3,4\}, \{3,5\}\}$, $\mathcal{C}_4 = \emptyset$, $\mathcal{C}_5 = \emptyset$.

14 Dynamic Programming for Argmax: Example
Step 4: Loop through the sites, computing conditional optima. Initialize:
$$R_1(x_{b_1}) = \operatorname*{arg\,max}_{x_1} \prod_{c \in \mathcal{C}_1} F_c(x_c), \qquad C_1(x_{b_1}) = \max_{x_1} \prod_{c \in \mathcal{C}_1} F_c(x_c) = \left.\prod_{c \in \mathcal{C}_1} F_c(x_c)\right|_{x_1 = R_1(x_{b_1})}$$
E.g., for $b_1 = \{2, 3\}$ and $\mathcal{C}_1 = \{\{1,2,3\}\}$:
$$R_1(x_2, x_3) = \operatorname*{arg\,max}_{x_1} F_{123}(x_1, x_2, x_3), \qquad C_1(x_2, x_3) = F_{123}(x_1, x_2, x_3)\big|_{x_1 = R_1(x_2, x_3)}$$

15 Dynamic Programming for Argmax: Example
Step 4 (cont.): Iterate, for $k = 2, 3, \ldots, |S|$ (breaking ties arbitrarily):
$$R_k(x_{b_k}) = \operatorname*{arg\,max}_{x_k} C_{k-1}(x_{b_{k-1}}) \prod_{c \in \mathcal{C}_k} F_c(x_c)$$
$$C_k(x_{b_k}) = \max_{x_k} C_{k-1}(x_{b_{k-1}}) \prod_{c \in \mathcal{C}_k} F_c(x_c) = \left. C_{k-1}(x_{b_{k-1}}) \prod_{c \in \mathcal{C}_k} F_c(x_c) \right|_{x_k = R_k(x_{b_k})}$$

16 Dynamic Programming for Argmax: Example
E.g.:
$$R_2(x_3, x_4) = \operatorname*{arg\,max}_{x_2} C_1(x_2, x_3) F_{24}(x_2, x_4), \qquad C_2(x_3, x_4) = C_1(R_2(x_3, x_4), x_3)\, F_{24}(R_2(x_3, x_4), x_4)$$
$$R_3(x_4, x_5) = \operatorname*{arg\,max}_{x_3} C_2(x_3, x_4) F_{34}(x_3, x_4) F_{35}(x_3, x_5)$$
$$C_3(x_4, x_5) = C_2(R_3(x_4, x_5), x_4)\, F_{34}(R_3(x_4, x_5), x_4)\, F_{35}(R_3(x_4, x_5), x_5)$$
$$R_4(x_5) = \operatorname*{arg\,max}_{x_4} C_3(x_4, x_5), \qquad C_4(x_5) = C_3(R_4(x_5), x_5)$$
$$R_5 = \operatorname*{arg\,max}_{x_5} C_4(x_5), \qquad C_5 = C_4(R_5)$$

17 Dynamic Programming for Argmax: Example
Step 5: Compute the global optimum, backwards:
$$\hat{x}_{|S|} = R_{|S|}, \qquad \hat{x}_{k-1} = R_{k-1}(\hat{x}_{b_{k-1}}), \quad k = |S|, \ldots, 2$$
E.g.:
$$\hat{x}_5 = R_5, \quad \hat{x}_4 = R_4(\hat{x}_5), \quad \hat{x}_3 = R_3(\hat{x}_4, \hat{x}_5), \quad \hat{x}_2 = R_2(\hat{x}_3, \hat{x}_4), \quad \hat{x}_1 = R_1(\hat{x}_2, \hat{x}_3)$$
$(\hat{x}_1, \hat{x}_2, \hat{x}_3, \hat{x}_4, \hat{x}_5)$ maximizes $p$ (i.e., it is the most likely state).
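An end-to-end sketch of the forward and backward passes on this toy example (again with made-up random factor tables as an assumption): the forward pass stores the argmax tables $R_k$ alongside the $C_k$ tables, and the backward pass reads off the maximizer, verified here against brute force.

```python
import itertools
import numpy as np

rng = np.random.default_rng(0)
r = 3
F123, F24 = rng.random((r, r, r)), rng.random((r, r))
F34, F35 = rng.random((r, r)), rng.random((r, r))

# Forward pass: for each k, R_k = argmax table, C_k = max table (axis 0 = x_k).
A1 = F123                                                 # A1[x1, x2, x3]
R1, C1 = A1.argmax(axis=0), A1.max(axis=0)                # tables over (x2, x3)
A2 = C1[:, :, None] * F24[:, None, :]                     # A2[x2, x3, x4]
R2, C2 = A2.argmax(axis=0), A2.max(axis=0)                # tables over (x3, x4)
A3 = C2[:, :, None] * F34[:, :, None] * F35[:, None, :]   # A3[x3, x4, x5]
R3, C3 = A3.argmax(axis=0), A3.max(axis=0)                # tables over (x4, x5)
R4, C4 = C3.argmax(axis=0), C3.max(axis=0)                # tables over (x5,)
R5, C5 = int(C4.argmax()), C4.max()                       # scalars

# Backward pass: plug the optimizers in, last site first.
x5 = R5
x4 = int(R4[x5])
x3 = int(R3[x4, x5])
x2 = int(R2[x3, x4])
x1 = int(R1[x2, x3])

# Sanity check against brute force over all r**5 configurations.
def score(x):
    a, b, c, d, e = x
    return F123[a, b, c] * F24[b, d] * F34[c, d] * F35[c, e]

best = max(itertools.product(range(r), repeat=5), key=score)
assert (x1, x2, x3, x4, x5) == best and np.isclose(C5, score(best))
```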

18 Dynamic Programming for Argmax: Example
Computational cost: assume a common state space $R$ (i.e., $x_k \in R\ \forall k$). Step $k$ involves $O(|R|^{|b_k| + 1})$ operations. Hence the cost of the entire procedure is no more than $O(|S| \cdot |R|^{B+1})$, where $B = \max_k |b_k|$. E.g., $B = 2$ in the example.
If the visitation schedule were reordered, exchanging 3 and 4, $B$ would still be 2, but the third step would require $O(|R|^2)$ instead of $O(|R|^3)$ operations. In either case, the cost is about $5|R|^3$.
Finding the ordering that minimizes $B$ is NP-hard.
Recall that in an HMM, $p(x_A \mid y_B)$ is a Markov chain. Using the left-to-right visitation schedule, $B = 1$, and the number of computations is $O(n|R|^2)$. Brute force is $O(n|R|^n)$.

19 Dynamic Programming for Normalization-constant: Example
$S = \{s, t, u, v, w\}$, $x = (x_s, x_t, x_u, x_v, x_w)$
$p(x) = \frac{1}{Z} H_{uvw}(x_u, x_v, x_w) H_u(x_u) H_{ut}(x_u, x_t) H_{vt}(x_v, x_t) H_t(x_t) H_{us}(x_u, x_s)$
$\mathcal{C} = \{\{u,v,w\}, \{u\}, \{u,t\}, \{v,t\}, \{t\}, \{u,s\}\}$
Same example, but now compute the normalizer, $Z = \sum_x \prod_{c \in \mathcal{C}} H_c(x_c)$.
[Figure: dependency graph for $p(x)$]

20 Dynamic Programming for Normalization-constant: Example
Similar to the argmax case.
Step 1: Absorb sub-cliques (optional; facilitates bookkeeping) and redefine $\mathcal{C}$ (the set of relevant cliques) accordingly. E.g., with
$F_{uvw} = H_{uvw}$, $F_{ut} = H_u H_{ut}$, $F_{vt} = H_{vt} H_t$, $F_{us} = H_{us}$,
we convert
$p(x) \propto H_{uvw}(x_u, x_v, x_w) H_u(x_u) H_{ut}(x_u, x_t) H_{vt}(x_v, x_t) H_t(x_t) H_{us}(x_u, x_s)$, $\mathcal{C} = \{\{u,v,w\}, \{u\}, \{u,t\}, \{v,t\}, \{t\}, \{u,s\}\}$
to
$p(x) \propto F_{uvw}(x_u, x_v, x_w) F_{ut}(x_u, x_t) F_{vt}(x_v, x_t) F_{us}(x_u, x_s)$, $\mathcal{C} = \{\{u,v,w\}, \{u,t\}, \{v,t\}, \{u,s\}\}$

21 Dynamic Programming for Normalization-constant: Example
Similar to the argmax case.
$S = \{s, t, u, v, w\}$, $x = (x_s, x_t, x_u, x_v, x_w)$
$p(x) \propto F_{uvw}(x_u, x_v, x_w) F_{ut}(x_u, x_t) F_{vt}(x_v, x_t) F_{us}(x_u, x_s)$
$\mathcal{C} = \{\{u,v,w\}, \{u,t\}, \{v,t\}, \{u,s\}\}$
[Figure: dependency graph for $p(x)$]
Remark: we could have done better with $\mathcal{C} = \{\{u,v,w\}, \{u,v,t\}, \{u,s\}\}$, but that would have hidden some of the structure of the factorization.

22 Dynamic Programming for Normalization-constant: Example
Similar to the argmax case.
Step 2: Pick a site-visitation schedule, i.e., an ordering (and rename the sites accordingly):
$S = \{1, 2, 3, 4, 5\}$, $x = (x_1, x_2, x_3, x_4, x_5)$
$p(x) \propto F_{123}(x_1, x_2, x_3) F_{24}(x_2, x_4) F_{34}(x_3, x_4) F_{35}(x_3, x_5)$
$\mathcal{C} = \{\{1,2,3\}, \{2,4\}, \{3,4\}, \{3,5\}\}$
[Figure: dependency graph for $p(x)$]

23 Dynamic Programming for Normalization-constant: Example
Similar to the argmax case. During the blood and gore of the next steps, keep in mind that the only thing that is really happening here is this:
$$\sum_x \prod_{c \in \mathcal{C}} F_c(x_c) = \sum_{x_5} \sum_{x_4} \sum_{x_3} \sum_{x_2} \sum_{x_1} F_{123} F_{24} F_{34} F_{35} = \underbrace{\sum_{x_5} \underbrace{\sum_{x_4} \underbrace{\sum_{x_3} F_{34} F_{35} \underbrace{\sum_{x_2} F_{24} \underbrace{\sum_{x_1} F_{123}}_{T_1(x_2, x_3)}}_{T_2(x_3, x_4)}}_{T_3(x_4, x_5)}}_{T_4(x_5)}}_{T_5}$$
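In code, this whole chain of sums is a single tensor contraction. A minimal sketch, with the same random stand-in factor tables as before (an assumption for illustration), checks np.einsum against the brute-force sum:

```python
import numpy as np

rng = np.random.default_rng(0)
r = 3
F123, F24 = rng.random((r, r, r)), rng.random((r, r))
F34, F35 = rng.random((r, r)), rng.random((r, r))

# Indices a,b,c,d,e stand for x1,...,x5; an empty output means "sum out everything".
Z = np.einsum('abc,bd,cd,ce->', F123, F24, F34, F35)

# Brute force: build the full joint table of products, then sum it.
full = (F123[:, :, :, None, None] * F24[None, :, None, :, None]
        * F34[None, None, :, :, None] * F35[None, None, :, None, :])
assert np.isclose(Z, full.sum())
```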

24 Dynamic Programming for Normalization-constant: Example
Identical to the argmax case.
Step 3: Define boundaries and innovation cliques, $k \in \{1, 2, \ldots, |S|\}$.
Boundaries: $b_k = \{l \in S : l > k \text{ and } l \in c \text{ for some } c \in \mathcal{C} \text{ s.t. } c \cap \{1, 2, \ldots, k\} \neq \emptyset\}$.
E.g., for $\mathcal{C} = \{\{1,2,3\}, \{2,4\}, \{3,4\}, \{3,5\}\}$: $b_1 = \{2, 3\}$, $b_2 = \{3, 4\}$, $b_3 = \{4, 5\}$, $b_4 = \{5\}$, $b_5 = \emptyset$.

25 Dynamic Programming for Normalization-constant: Example
Identical to the argmax case.
Step 3 (cont.): Innovation cliques: $\mathcal{C}_k$ = the relevant new cliques at step $k$ $= \{c \in \mathcal{C} : k \in c \text{ and } l \notin c\ \forall l < k\}$.
E.g., for $\mathcal{C} = \{\{1,2,3\}, \{2,4\}, \{3,4\}, \{3,5\}\}$: $\mathcal{C}_1 = \{\{1,2,3\}\}$, $\mathcal{C}_2 = \{\{2,4\}\}$, $\mathcal{C}_3 = \{\{3,4\}, \{3,5\}\}$, $\mathcal{C}_4 = \emptyset$, $\mathcal{C}_5 = \emptyset$.

26 Dynamic Programming for Normalization-constant: Example
Slightly simpler than the argmax case (less bookkeeping).
Step 4: Loop through the sites, computing partial sums. Initialize:
$$T_1(x_{b_1}) = \sum_{x_1} \prod_{c \in \mathcal{C}_1} F_c(x_c)$$
E.g., for $b_1 = \{2, 3\}$ and $\mathcal{C}_1 = \{\{1,2,3\}\}$:
$$T_1(x_2, x_3) = \sum_{x_1} F_{123}(x_1, x_2, x_3)$$

27 Dynamic Programming for Normalization-constant: Example
Slightly simpler than the argmax case (less bookkeeping).
Step 4 (cont.): Iterate, for $k = 2, 3, \ldots, |S|$:
$$T_k(x_{b_k}) = \sum_{x_k} T_{k-1}(x_{b_{k-1}}) \prod_{c \in \mathcal{C}_k} F_c(x_c)$$
Then $T_{|S|} = Z$.

28 Dynamic Programming for Normalization-constant: Example
E.g.:
$$T_2(x_3, x_4) = \sum_{x_2} T_1(x_2, x_3) F_{24}(x_2, x_4)$$
$$T_3(x_4, x_5) = \sum_{x_3} T_2(x_3, x_4) F_{34}(x_3, x_4) F_{35}(x_3, x_5)$$
$$T_4(x_5) = \sum_{x_4} T_3(x_4, x_5), \qquad T_5 = \sum_{x_5} T_4(x_5) = Z$$
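The same $T_k$ recursion, written out as numpy tables (a sketch with the same made-up stand-in factors as before); each line corresponds to one equation above:

```python
import numpy as np

rng = np.random.default_rng(0)
r = 3
F123, F24 = rng.random((r, r, r)), rng.random((r, r))
F34, F35 = rng.random((r, r)), rng.random((r, r))

T1 = F123.sum(axis=0)                                   # T1[x2, x3]
T2 = (T1[:, :, None] * F24[:, None, :]).sum(axis=0)     # T2[x3, x4]
T3 = (T2[:, :, None] * F34[:, :, None] * F35[:, None, :]).sum(axis=0)  # T3[x4, x5]
T4 = T3.sum(axis=0)                                     # T4[x5]
Z = T4.sum()                                            # T5 = Z
```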

29 Dynamic Programming for Normalization-constant: Example
The computational-complexity analysis is similar to the argmax case.

30 Dynamic Programming for Expectation
If $p(x) = \prod_{c \in \mathcal{C}} F_c(x_c)$ and $J(x) = J(x_A)$, for some function $J$ and some $A \subset S$, then computing
$$E(J(X_A)) = \sum_x J(x_A) \prod_{c \in \mathcal{C}} F_c(x_c)$$
is identical to the normalization-constant case, except that $\mathcal{C}$ is augmented with the clique $A$.
The computational-complexity analysis is similar to the argmax case.
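A sketch of the augmentation trick on the toy example, with a hypothetical table J[x3, x5] playing the role of $J(x_A)$ for $A = \{3, 5\}$ (both J and the random factors are illustrative assumptions); since the factors here are unnormalized, the augmented sum is divided by $Z$ at the end:

```python
import numpy as np

rng = np.random.default_rng(0)
r = 3
F123, F24 = rng.random((r, r, r)), rng.random((r, r))
F34, F35 = rng.random((r, r)), rng.random((r, r))
J = rng.random((r, r))  # hypothetical J[x3, x5] = J(x_A), A = {3, 5}

def eliminate(extra=None):
    """The T_k recursion; `extra`, if given, is absorbed as the clique {3, 5}."""
    T1 = F123.sum(axis=0)
    T2 = (T1[:, :, None] * F24[:, None, :]).sum(axis=0)
    A3 = T2[:, :, None] * F34[:, :, None] * F35[:, None, :]
    if extra is not None:
        A3 = A3 * extra[:, None, :]   # treat J(x3, x5) like any other factor
    return A3.sum(axis=0).sum()       # sum out x3, then x4 and x5

Z = eliminate()
EJ = eliminate(J) / Z   # E[J(X_A)] = (1/Z) sum_x J(x_A) prod_c F_c(x_c)
```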

31 Dynamic Programming for Marginals
Let $n = |S|$ and $p = \frac{1}{Z} \prod_{c \in \mathcal{C}} F_c(x_c)$. By induction,
$$T_k(x_{b_k}) = \sum_{x_{1:k}} \prod_{c \in \mathcal{C} : c \cap \{1, \ldots, k\} \neq \emptyset} F_c(x_c), \quad k = 1, 2, \ldots, n$$
$$\Rightarrow\ p(x_{(k+1):n}) = \sum_{x_{1:k}} p(x) = \frac{1}{Z} \sum_{x_{1:k}} \prod_{c \in \mathcal{C}} F_c(x_c) = \frac{1}{Z} \prod_{c \in \mathcal{C} : c \cap \{1, \ldots, k\} = \emptyset} F_c(x_c) \sum_{x_{1:k}} \prod_{c \in \mathcal{C} : c \cap \{1, \ldots, k\} \neq \emptyset} F_c(x_c) = \frac{1}{Z} \prod_{c \in \mathcal{C} : c \cap \{1, \ldots, k\} = \emptyset} F_c(x_c)\, T_k(x_{b_k})$$
There are ways to organize the computations so that any set of marginals can be computed while avoiding repeated sub-computations.
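For instance, in the toy example the $k = 4$ case of this identity gives $p(x_5) = T_4(x_5)/Z$, since every clique intersects $\{1, \ldots, 4\}$. A sketch (same random stand-in factors), checked against marginalizing the explicitly normalized joint table:

```python
import numpy as np

rng = np.random.default_rng(0)
r = 3
F123, F24 = rng.random((r, r, r)), rng.random((r, r))
F34, F35 = rng.random((r, r)), rng.random((r, r))

T1 = F123.sum(axis=0)
T2 = (T1[:, :, None] * F24[:, None, :]).sum(axis=0)
T3 = (T2[:, :, None] * F34[:, :, None] * F35[:, None, :]).sum(axis=0)
T4 = T3.sum(axis=0)        # T4[x5]
Z = T4.sum()
p_x5 = T4 / Z              # the marginal p(x5): no clique avoids {1,...,4}

# Check against the marginal of the full normalized joint table.
full = (F123[:, :, :, None, None] * F24[None, :, None, :, None]
        * F34[None, None, :, :, None] * F35[None, None, :, None, :])
assert np.allclose(p_x5, full.sum(axis=(0, 1, 2, 3)) / full.sum())
```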

32 Dynamic Programming for Marginals: Example
[Figure: dependency graph for $p(x_2, x_3, x_4, x_5) = \sum_{x_1} p(x)$]
$\eta_1 = \{2, 3\}$ was already fully connected.

33 Dynamic Programming for Marginals: Example
[Figure: dependency graph for $p(x_3, x_4, x_5) = \sum_{x_2} \sum_{x_1} p(x)$]
$\eta_2 = \{3, 4\}$ was already fully connected.

34 Dynamic Programming for Marginals: Example
[Figure: dependency graph for $p(x_4, x_5) = \sum_{x_3} \sum_{x_2} \sum_{x_1} p(x)$]
$\eta_3 = \{4, 5\}$ was not fully connected, so we added an edge.

35 Dynamic Programming for Marginals: Example
[Figure: dependency graph for $p(x_5) = \sum_{x_4} \sum_{x_3} \sum_{x_2} \sum_{x_1} p(x)$]
$\eta_4 = \{5\}$ is (trivially) already fully connected.

36 Sampling from p
Assume we know how to sample $u \sim U(0, 1)$ (a continuous uniform RV).
Suppose we want to sample from some pmf $p$ over a finite set of states. If we can enumerate all the states according to some ordering, there is a simple way to do it using $u \sim U(0, 1)$ and the inverse of the CDF. But this is impractical if there are too many states (e.g., exponentially many, as in distributions over images).
If $p$ is an MRF, we can again resort to Dynamic Programming, with the same caveats as we had for argmax, normalization constants, marginals, etc.
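For a small, enumerable state space, the inverse-CDF recipe is just a few lines; a minimal sketch with a made-up 3-state pmf (the numbers are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)
pmf = np.array([0.2, 0.5, 0.3])        # a made-up pmf over states {0, 1, 2}
cdf = np.cumsum(pmf)
u = rng.uniform()                      # u ~ U(0, 1)
state = int(np.searchsorted(cdf, u))   # smallest state whose cumulative prob. reaches u
```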

37 Dynamic Programming for Sampling
Note that
$$p(x) = p(x_n)\, p(x_{n-1} \mid x_n)\, p(x_{n-2} \mid x_{(n-1):n})\, p(x_{n-3} \mid x_{(n-2):n}) \cdots p(x_1 \mid x_{2:n})$$
If $p = \frac{1}{Z} \prod_{c \in \mathcal{C}} F_c(x_c)$, then, using what we saw for marginals,
$$p(x_k \mid x_{(k+1):n}) = \frac{p(x_{k:n})}{p(x_{(k+1):n})} = \frac{\prod_{c \in \mathcal{C} : c \cap \{1, \ldots, k-1\} = \emptyset} F_c(x_c)\, T_{k-1}(x_{b_{k-1}})}{\prod_{c \in \mathcal{C} : c \cap \{1, \ldots, k\} = \emptyset} F_c(x_c)\, T_k(x_{b_k})} = \prod_{c \in \mathcal{C}_k} F_c(x_c)\, \frac{T_{k-1}(x_{b_{k-1}})}{T_k(x_{b_k})}, \quad k = 2, 3, \ldots, n$$
$$p(x_n) = \frac{1}{Z} \left( \prod_{c \in \mathcal{C}_n} F_c(x_c) \right) T_{n-1}(x_{b_{n-1}}) = \frac{1}{Z} \left( \prod_{c \in \mathcal{C}_n} F_c(x_c) \right) T_{n-1}(x_n)$$

38 Dynamic Programming for Sampling
Since we have explicit representations for $p(x_n)$ and $p(x_k \mid x_{(k+1):n})$, $k = 1, 2, \ldots, n-1$, we can sample from $p$:
$$x_n \sim p(x_n), \qquad x_k \sim p(x_k \mid x_{(k+1):n}), \quad k = n-1, n-2, \ldots, 1$$
If all the variables are scalars, these sampling operations are particularly easy.
Note that $p(x_k \mid x_{(k+1):n})$ depends only on $x_k$ and $x_{b_k}$. Thus $p(x_k \mid x_{(k+1):n}) = p(x_k \mid x_{b_k})$.
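Putting the pieces together for the toy example: a sketch (random stand-in factors again, not from the slides) that draws $x_5$ from $T_4/Z$ and then each earlier site from its explicit conditional; per the remark above, each conditional involves only $x_{b_k}$.

```python
import numpy as np

rng = np.random.default_rng(0)
r = 3
F123, F24 = rng.random((r, r, r)), rng.random((r, r))
F34, F35 = rng.random((r, r)), rng.random((r, r))

T1 = F123.sum(axis=0)
T2 = (T1[:, :, None] * F24[:, None, :]).sum(axis=0)
T3 = (T2[:, :, None] * F34[:, :, None] * F35[:, None, :]).sum(axis=0)
T4 = T3.sum(axis=0)
Z = T4.sum()

def draw(pmf):
    """Inverse-CDF draw from a 1D pmf (each conditional below sums to 1)."""
    return int(np.searchsorted(np.cumsum(pmf), rng.uniform()))

x5 = draw(T4 / Z)                                             # p(x5)
x4 = draw(T3[:, x5] / T4[x5])                                 # p(x4 | x5)
x3 = draw(F34[:, x4] * F35[:, x5] * T2[:, x4] / T3[x4, x5])   # p(x3 | x4, x5)
x2 = draw(F24[:, x4] * T1[:, x3] / T2[x3, x4])                # p(x2 | x3, x4)
x1 = draw(F123[:, x2, x3] / T1[x2, x3])                       # p(x1 | x2, x3)
sample = (x1, x2, x3, x4, x5)                                 # an exact draw from p
```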

39 Dynamic Programming
Excellent bookkeeping methods exist for reusing calculations.
Example (HMM):
$$p(x \mid y) \propto \prod_{k=2}^n F_{k,k-1}(x_k, x_{k-1}) \prod_{k=1}^n G_k(x_k, y_k) \propto \prod_{k=2}^n \widetilde{F}_{k,k-1}(x_k, x_{k-1})$$
For a fixed $y$ (absorbing the $G_k$'s into the pairwise factors) and the visit order $1, 2, \ldots, n$:
$$T_k(x_{k+1}) = \sum_{x_{1:k}} \prod_{i=2}^{k+1} F_{i-1,i}(x_{i-1}, x_i) \quad \text{(forward)}$$
$$\widetilde{T}_k(x_k) = \sum_{x_{(k+1):n}} \prod_{i=k+1}^{n} F_{i-1,i}(x_{i-1}, x_i) \quad \text{(backward)}$$
$$p(x_i \mid y) = \frac{T_{i-1}(x_i)\, \widetilde{T}_i(x_i)}{T_n}, \quad i = n-1, n-2, \ldots, 1$$
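A minimal sketch of this chain (forward/backward) bookkeeping under the reconstruction above, with random stand-in pairwise tables (the observation factors assumed already absorbed); every single-site marginal comes out of one forward and one backward sweep:

```python
import numpy as np

rng = np.random.default_rng(0)
n, r = 6, 3
F = [None, None] + [rng.random((r, r)) for _ in range(n - 1)]  # F[i][u, v] ~ F_{i-1,i}(u, v), i = 2..n

fwd = [None] * (n + 1)   # fwd[i][v]: sum over x_{1:i-1} of the prefix factors, with x_i = v
fwd[1] = np.ones(r)
for i in range(2, n + 1):
    fwd[i] = fwd[i - 1] @ F[i]       # sum_{x_{i-1}} fwd[i-1](x_{i-1}) F_i(x_{i-1}, x_i)

bwd = [None] * (n + 1)   # bwd[i][v]: sum over x_{(i+1):n} of the suffix factors, with x_i = v
bwd[n] = np.ones(r)
for i in range(n - 1, 0, -1):
    bwd[i] = F[i + 1] @ bwd[i + 1]   # sum_{x_{i+1}} F_{i+1}(x_i, x_{i+1}) bwd[i+1](x_{i+1})

Z = fwd[n].sum()                     # the normalizer (T_n in the slide's notation)
marg = [fwd[i] * bwd[i] / Z for i in range(1, n + 1)]  # marg[i-1][v] = p(x_i = v | y)
assert all(np.isclose(m.sum(), 1.0) for m in marg)     # each marginal sums to 1
```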

40 Sampling When Dynamic Programming is Inapplicable
Examples:
The RVs are continuous (easy for Gaussians, but usually not in other cases).
Discrete RVs with a countably infinite state space.
Discrete RVs where the graph is too large and complicated.
The good news: we can resort to MCMC procedures that, asymptotically, produce a sample that behaves like a sample from the $p$ we want. This is, in fact, true in general for a complicated $p$, regardless of whether it is an MRF or not. In the MRF case, we can do it while exploiting the local structure. This is what a method called Gibbs sampling (and its variants) is based on.
