Cogsci 118B. Virginia de Sa. Maximum Likelihood estimation, Bayesian Parameter estimation


1 Cogsci 118B 1 Virginia de Sa Maximum Likelihood estimation, Bayesian Parameter estimation

2 Density Estimation 2 Consider a classification task. If we know the densities of each class, it is easy to pick a decision boundary. [Figure: hypothetical class-conditional probability density functions show the probability density of measuring a particular feature value x given a pattern in category ω_i. From Richard O. Duda, Peter E. Hart, and David G. Stork, Pattern Classification, copyright 2001 by John Wiley & Sons.] But how do we learn the densities? One choice is to do some sort of histogram method. These methods are called non-parametric. Another choice is to assume that the density is of a certain form (e.g. Gaussian) and find the best-fitting parameters. These methods are called parametric.

3 How do we fit the parameters? 3 We will consider two different schools of thought. Maximum likelihood estimation: there is a fixed but unknown parameter vector, and our best estimate (the maximum likelihood estimate) of the unknown parameter vector is the one that has the highest probability of generating the data. Bayesian estimation: treat the parameter vector as a random variable. We have a prior distribution for the parameters, and then after looking at the data, compute a posterior density over the parameters. We do not pick a most likely parameter vector but a density over parameter vectors. The whole density is used to make classification and other inference decisions.

4 Review of Bayes Theorem 4 So far we have talked about the probability of a class given the data:
P(ω_j | x) = p(x | ω_j) P(ω_j) / p(x)
P(ω_j) = prior probability of ω_j
p(x) = evidence
P(ω_j | x) = posterior probability of ω_j
p(x | ω_j) = likelihood of ω_j with respect to x
Here we talk about the probability of a parameter vector given the data:
p(θ | D) = p(D | θ) p(θ) / p(D)
p(θ) = prior probability of θ

5 p(D) = evidence 5
p(θ | D) = posterior probability of θ
p(D | θ) = likelihood of θ with respect to D

6 Parametric Estimation 6 Assume we know the form of a probability density but not some or all of the parameters of the functional form. We estimate the parameter vector θ from samples of the data. The maximum likelihood estimate of θ is the θ̂ that maximizes p(data | θ). Usually we assume
p(D | θ) = Π_{k=1}^n p(x^(k) | θ) = p(x^(1) | θ) p(x^(2) | θ) ... p(x^(n) | θ)
by assuming independence of the samples x^(k).
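To make the factorization concrete, here is a minimal numerical sketch (my own illustration, not from the slides, assuming NumPy/SciPy and a Gaussian form for p(x | θ); the data values are made up). Independence of the samples turns the product over k into a sum of per-sample log densities:

import numpy as np
from scipy.stats import norm

def log_likelihood(theta, data):
    # theta = (mu, sigma); sum of log densities = log of the product
    mu, sigma = theta
    return np.sum(norm.logpdf(data, loc=mu, scale=sigma))

data = np.array([1.2, 0.7, 1.9, 1.1])   # made-up samples
print(log_likelihood((1.0, 0.5), data))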

8 Maximum Likelihood Estimation 7 [Figure: the top graph shows several training points in one dimension, known or assumed to be drawn from a Gaussian of a particular variance but unknown mean; four of the infinite number of candidate source distributions are shown in dashed lines. The middle figure shows the likelihood p(D | θ) as a function of the mean; the bottom figure shows the log-likelihood, which is maximized at the same value of θ. From Richard O. Duda, Peter E. Hart, and David G. Stork, Pattern Classification, copyright 2001 by John Wiley & Sons.]

9 Maximum Likelihood Estimation 8 The log-likelihood function is l(θ) ≡ ln p(D | θ), and θ̂ = argmax_θ l(θ). To find θ̂, solve ∇_θ l = 0, where
∇_θ l = Σ_{k=1}^n ∇_θ ln p(x^(k) | θ)
The log-likelihood is the logarithm of the probability density function, but it is interpreted as a function of θ for given data, whereas the probability density function is thought of as a function over the sample space for a given parameter θ. [Hofmann class notes]
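Where no closed form is available, θ̂ can be found numerically by minimizing the negative log-likelihood. A hedged sketch using scipy.optimize (illustrative data and starting values; in the Gaussian case derived below the maximizer also has a closed form, which this should match):

import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

data = np.array([1.2, 0.7, 1.9, 1.1])   # made-up samples

def neg_log_likelihood(theta):
    mu, log_sigma = theta                # optimize log sigma so sigma stays positive
    return -np.sum(norm.logpdf(data, loc=mu, scale=np.exp(log_sigma)))

result = minimize(neg_log_likelihood, x0=np.array([0.0, 0.0]))
mu_hat, sigma_hat = result.x[0], np.exp(result.x[1])
print(mu_hat, sigma_hat)                 # matches the closed-form estimates below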

10 Maximum Likelihood Estimation: Example 9 Bernoulli random variable with two possible outcomes 0, 1:
P(x = 1) = ρ, P(x = 0) = 1 − ρ
p(x = s) = ρ^s (1 − ρ)^(1−s)
l(ρ) = Σ_{k=1}^n [ s^(k) ln ρ + (1 − s^(k)) ln(1 − ρ) ]
∂l/∂ρ = Σ_{k=1}^n [ s^(k)/ρ − (1 − s^(k))/(1 − ρ) ]
Setting ∂l/∂ρ = 0 gives ρ̂ = (Σ_{k=1}^n s^(k))/n
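A quick sketch (my own, assuming NumPy; the sample values are made up) checking the closed-form estimate against a brute-force grid search over l(ρ):

import numpy as np

s = np.array([1, 0, 1, 1, 0, 1])        # made-up Bernoulli samples
rho_hat = s.mean()                       # closed-form MLE: sum(s)/n

# sanity check against a grid search over the log-likelihood
grid = np.linspace(0.01, 0.99, 999)
ll = s.sum() * np.log(grid) + (len(s) - s.sum()) * np.log(1 - grid)
print(rho_hat, grid[np.argmax(ll)])      # both about 0.667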

16 Maximum Likelihood Estimation: Example 10 Gaussian with unknown µ and σ²:
l(µ, σ) = Σ_{k=1}^n [ −(1/2) ln 2π − ln σ − (x^(k) − µ)²/(2σ²) ]
∂l/∂µ = Σ_{k=1}^n (1/σ²)(x^(k) − µ)
∂l/∂σ = Σ_{k=1}^n [ −1/σ + (x^(k) − µ)²/σ³ ]
Setting these to zero gives
µ̂ = (Σ_{k=1}^n x^(k))/n
σ̂² = (1/n) Σ_{k=1}^n (x^(k) − µ̂)²
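In code (a sketch assuming NumPy; note the maximum likelihood variance divides by n, not the unbiased n − 1):

import numpy as np

x = np.random.default_rng(0).normal(loc=5.0, scale=2.0, size=1000)  # synthetic data
mu_hat = x.mean()             # sum(x)/n
sigma2_hat = x.var(ddof=0)    # (1/n) * sum((x - mu_hat)**2): 1/n, not 1/(n-1)
print(mu_hat, sigma2_hat)     # close to the true 5.0 and 4.0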

21 Bayesian Parameter Estimation 11 Assume the form of the density p(x | θ) is known but θ is not known exactly. Initial knowledge about θ is represented as a prior density p(θ). We have n samples drawn independently from the unknown true probability density p(x).
p(x | D) = ∫ p(x, θ | D) dθ = ∫ p(x | θ) p(θ | D) dθ
p(θ | D) = p(D | θ) p(θ) / ∫ p(D | θ) p(θ) dθ
p(D | θ) = Π_{k=1}^n p(x^(k) | θ)
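For a one-dimensional θ these integrals can be approximated on a grid. A sketch (my own illustration, assuming a Gaussian likelihood with known σ and a Gaussian prior on the mean; all numbers are made up):

import numpy as np
from scipy.stats import norm

sigma = 1.0                                   # known
data = np.array([0.8, 1.3, 0.9])              # made-up samples
theta = np.linspace(-5, 5, 2001)              # grid over the unknown mean
dtheta = theta[1] - theta[0]
prior = norm.pdf(theta, loc=0.0, scale=2.0)   # p(theta)

# p(D|theta) = product over k of p(x^(k)|theta), evaluated on the grid
lik = np.prod(norm.pdf(data[:, None], loc=theta[None, :], scale=sigma), axis=0)

post = lik * prior
post /= post.sum() * dtheta                   # normalize: divide by the evidence

# predictive density p(x|D) = integral of p(x|theta) p(theta|D) dtheta, at x = 0.5
x = 0.5
print(np.sum(norm.pdf(x, loc=theta, scale=sigma) * post) * dtheta)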

23 Context/Aside 12 What will we do with p(x | D, ω_i)?
P(ω_i | x, D) = p(x | ω_i, D) P(ω_i | D) / Σ_{j=1}^c p(x | ω_j, D) P(ω_j | D)
= p(x | ω_i, D) P(ω_i) / Σ_{j=1}^c p(x | ω_j, D) P(ω_j)

24 Example of Bayesian learning 13 MAP estimate of the parameter; full Bayesian inference.

25 Bayesian Learning 14 A conjugate prior is a prior density for which the posterior density has the same functional form as the prior.
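The slides develop the Gaussian conjugate example next; an even simpler illustration of conjugacy (my own, not from the slides) is the Beta prior with a Bernoulli likelihood, where the posterior is again a Beta density:

import numpy as np
from scipy.stats import beta

a, b = 2.0, 2.0                  # Beta(a, b) prior on the Bernoulli parameter rho
s = np.array([1, 0, 1, 1, 0, 1]) # made-up Bernoulli samples

# posterior is Beta(a + #ones, b + #zeros): same family as the prior
a_post = a + s.sum()
b_post = b + len(s) - s.sum()
print(beta(a_post, b_post).mean())   # posterior mean of rho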

26 Bayesian Learning Example Gaussian density 15 Assume p(x | µ) ~ N(µ, σ²) where σ² is known, and p(µ) ~ N(µ₀, σ₀²).
p(µ | D) = p(D | µ) p(µ) / p(D)
= α p(D | µ) p(µ)
= α Π_{k=1}^n p(x^(k) | µ) p(µ)
= α Π_{k=1}^n (1/(√(2π) σ)) e^(−.5((x^(k) − µ)/σ)²) (1/(√(2π) σ₀)) e^(−.5((µ − µ₀)/σ₀)²)
= α′ e^(−.5 [ Σ_{k=1}^n ((µ − x^(k))/σ)² + ((µ − µ₀)/σ₀)² ])
= α″ e^(−.5 [ (n/σ² + 1/σ₀²) µ² − 2 ((1/σ²) Σ_{k=1}^n x^(k) + µ₀/σ₀²) µ ])

32 p(µ | D) = (1/(√(2π) σₙ)) e^(−.5((µ − µₙ)/σₙ)²) 16
where 1/σₙ² = n/σ² + 1/σ₀² and µₙ = σₙ² ((n/σ²) µ̂ₙ + µ₀/σ₀²), with µ̂ₙ the sample mean. This gives
µₙ = (n σ₀² µ̂ₙ + σ² µ₀)/(n σ₀² + σ²) and σₙ² = (σ₀² σ²)/(n σ₀² + σ²)
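A sketch computing µₙ and σₙ² directly from these formulas (assuming NumPy; the prior and data values are made up):

import numpy as np

sigma = 1.0                      # known data standard deviation
mu0, sigma0 = 0.0, 2.0           # prior N(mu0, sigma0^2)
x = np.array([0.8, 1.3, 0.9])    # made-up samples
n, xbar = len(x), x.mean()       # xbar plays the role of mu-hat_n on the slide

sigma_n2 = (sigma0**2 * sigma**2) / (n * sigma0**2 + sigma**2)
mu_n = (n * sigma0**2 * xbar + sigma**2 * mu0) / (n * sigma0**2 + sigma**2)
print(mu_n, sigma_n2)            # about 0.92 and 0.31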

33 Bayesian Learning 17 [Figure: Bayesian learning of the mean of normal distributions in one and two dimensions; the posterior distribution estimates are labeled by the number of training samples used in the estimation. From Richard O. Duda, Peter E. Hart, and David G. Stork, Pattern Classification, copyright 2001 by John Wiley & Sons.]

34 Computing the Class-conditional Density 18
p(x | D) = ∫ p(x | µ) p(µ | D) dµ
= ∫ (1/(√(2π) σ)) e^(−.5((x − µ)/σ)²) (1/(√(2π) σₙ)) e^(−.5((µ − µₙ)/σₙ)²) dµ
= (1/(2π σ σₙ)) e^(−.5 (x − µₙ)²/(σ² + σₙ²)) f(σ, σₙ)
so p(x | D) is normal with mean µₙ and variance σ² + σₙ².
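A quick Monte Carlo check of this result: sampling µ from the posterior and then x from p(x | µ) should give draws whose variance is about σ² + σₙ² (the posterior values below are of the kind computed in the earlier sketch, purely illustrative):

import numpy as np

rng = np.random.default_rng(0)
sigma, mu_n, sigma_n2 = 1.0, 0.92, 0.31   # illustrative posterior values

# sample mu from the posterior, then x from p(x|mu): draws from p(x|D)
mu_draws = rng.normal(mu_n, np.sqrt(sigma_n2), size=200000)
x_draws = rng.normal(mu_draws, sigma)

print(x_draws.mean(), x_draws.var())      # about mu_n and sigma^2 + sigma_n^2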

35 Bayesian Method vs Maximum Likelihood 19 Maximum likelihood: compute θ̂ = argmax_θ p(data | θ), then p(x | data) = p(x | θ̂). Bayesian estimation: compute p(θ | data), then p(x | data) = ∫ p(θ | data) p(x | θ) dθ. The maximum likelihood approach is usually simpler, less computationally expensive, and gives a more easily understood solution. Maximum likelihood returns a conditional density of the assumed parametric form. Bayesian methods use more of the available information (and are likely to be more useful when training data is very sparse).
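A small sketch contrasting the two predictive densities when data is sparse (my own illustration, with n = 1 and the same Gaussian assumptions as above): the plug-in predictive ignores the uncertainty in µ, while the Bayesian predictive is wider and shrunk toward the prior mean:

import numpy as np

sigma, mu0, sigma0 = 1.0, 0.0, 2.0
x = np.array([0.8])                  # very sparse, made-up data: n = 1
n, xbar = len(x), x.mean()

sigma_n2 = (sigma0**2 * sigma**2) / (n * sigma0**2 + sigma**2)
mu_n = (n * sigma0**2 * xbar + sigma**2 * mu0) / (n * sigma0**2 + sigma**2)

# ML plug-in predictive N(xbar, sigma^2) vs Bayesian predictive N(mu_n, sigma^2 + sigma_n^2)
print(xbar, sigma**2)                # plug-in mean and variance
print(mu_n, sigma**2 + sigma_n2)     # Bayesian predictive mean and (larger) variance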

36 Sources of Classification Error 20 Bayes error: due to overlap between the classes in the input space. Model error: due to not picking the correct class of models. Estimation error: due to insufficient training data.

37 21 Assume: You are an intellectual snob. You have a child. [This slide and the following ones are from Andrew W. Moore's Gaussians slides 50 to 61, copyright 2001, Andrew W. Moore.]

38 22 Intellectual snobs with children are obsessed with IQ. In the world as a whole, IQs are drawn from a Gaussian N(100, 15²).

39 23 IQ tests: If you take an IQ test you'll get a score that, on average (over many tests), will be your IQ. But because of noise on any one test, the score will often be a few points lower or higher than your true IQ: SCORE | IQ ~ N(IQ, 10²).

40 24 Assume: You drag your kid off to get tested. She gets a score of 130. "Yippee!" you screech, and start deciding how to casually refer to her membership of the top 2% of IQs in your Christmas newsletter.
P(X < 130 | µ = 100, σ² = 15²) = P(X < 2 | µ = 0, σ² = 1) = Φ(2) ≈ 0.977
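This probability can be checked with the standard normal CDF (a one-liner assuming SciPy):

from scipy.stats import norm
# P(X < 130 | mu=100, sigma=15) = P(Z < 2) for a standard normal Z
print(norm.cdf(130, loc=100, scale=15))   # about 0.977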

41 Assume: You drag your kid off to get tested. She gets a score of 130. You are thinking: "Well, sure, the test isn't accurate, so she might have an IQ of 120 or she might have an IQ of 140, but the most likely IQ given the evidence score = 130 is, of course, 130."
P(X < 130 | µ = 100, σ² = 15²) = P(X < 2 | µ = 0, σ² = 1) = Φ(2) ≈ 0.977
Can we trust this reasoning?

42 26 Maximum Likelihood IQ: IQ ~ N(100, 15²), S | IQ ~ N(IQ, 10²), S = 130. The MLE is the value of the hidden parameter that makes the observed data most likely: IQ_mle = argmax_iq p(s = 130 | iq). In this case, IQ_mle = 130.

43 27 IQ ~ N(100, 15²), S | IQ ~ N(IQ, 10²), S = 130. BUT... The MLE is the value of the hidden parameter that makes the observed data most likely: IQ_mle = argmax_iq p(s = 130 | iq) = 130. This is not the same as the most likely value of the parameter given the observed data.

44 28 What we really want: IQ ~ N(100, 15²), S | IQ ~ N(IQ, 10²), S = 130. Question: what is IQ | (S = 130)? This is called the posterior distribution of IQ.

45 29 IQ ~ N(100, 15²), S | IQ ~ N(IQ, 10²), S = 130. Question: what is IQ | (S = 130)? Which tool or tools? [Diagram of the available Gaussian tools: Chain Rule, Conditionalize, Marginalize, and Matrix Multiply.]

46 30 IQ ~ N(100, 15²), S | IQ ~ N(IQ, 10²), S = 130. Question: what is IQ | (S = 130)? Plan: combine S | IQ with IQ using the chain rule to get the joint (S, IQ); swap to (IQ, S); conditionalize to get IQ | S.

47 Working 31 IQ ~ N(100, 15²), S | IQ ~ N(IQ, 10²), S = 130. Question: what is IQ | (S = 130)?
IF V ~ N(µ_v, Σ_vv) and U | V ~ N(AV, Σ_{u|v}), THEN (U, V) is jointly Gaussian with mean (A µ_v, µ_v) and covariance blocks A Σ_vv Aᵀ + Σ_{u|v} (for U), A Σ_vv (cross term), and Σ_vv (for V).
IF (U, V) ~ N((µ_u, µ_v), [[Σ_uu, Σ_uv], [Σ_uvᵀ, Σ_vv]]), THEN U | V ~ N(µ_{u|v}, Σ_{u|v}) with µ_{u|v} = µ_u + Σ_uv Σ_vv⁻¹ (V − µ_v).

48 32 Your pride and joy's posterior IQ: If you did the working, you now have p(iq | S = 130). If you have to give the most likely IQ given the score, you should give IQ_map = argmax_iq p(iq | s = 130), where MAP means Maximum A Posteriori.
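Closing the loop numerically: for this Gaussian prior and Gaussian noise model the posterior is Gaussian, so the MAP estimate equals the posterior mean. A sketch (assuming NumPy) using the precision-weighted combination from the working above:

import numpy as np

mu0, sigma0 = 100.0, 15.0    # prior: IQ ~ N(100, 15^2)
sigma = 10.0                 # test noise: S | IQ ~ N(IQ, 10^2)
s = 130.0

# precision-weighted combination of prior mean and observed score
post_var = 1.0 / (1.0 / sigma0**2 + 1.0 / sigma**2)
post_mean = post_var * (mu0 / sigma0**2 + s / sigma**2)
print(post_mean, np.sqrt(post_var))   # about 120.8 and 8.3

# for a Gaussian posterior the MAP estimate equals the posterior mean,
# so IQ_map is about 120.8, noticeably below the MLE of 130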
