Introduction to PGMs: Discrete Variables
Sargur srihari@cedar.buffalo.edu

Topics
1. What are graphical models (or PGMs)
2. Use of PGMs in Engineering and AI
3. Directionality in graphs
4. Bayesian Networks
5. Generative Models and Sampling
6. Using PGMs with fully Bayesian Models
7. Discrete Case
8. Complexity Issues

Discrete Variables
Graphical models are useful when constructing more complex probability distributions from simpler (exponential family) distributions
Graphical models have nice properties when each parent-child pair is conjugate
Two cases of interest:
  Parent and child both correspond to discrete variables
  Parent and child both correspond to Gaussian variables

Discrete Case
Probability distribution for a single discrete variable $\mathbf{x}$ having $K$ states
Using the 1-of-K representation: for $K=6$, when $x_3=1$, $\mathbf{x}$ is represented as $\mathbf{x}=(0,0,1,0,0,0)^T$
Note that $\sum_{k=1}^{K} x_k = 1$
If the probability of $x_k=1$ is given by parameter $\mu_k$, then the distribution is
  $p(\mathbf{x} \mid \boldsymbol{\mu}) = \prod_{k=1}^{K} \mu_k^{x_k}$, where $\boldsymbol{\mu} = (\mu_1,..,\mu_K)^T$
The distribution is normalized: $\sum_{k=1}^{K} \mu_k = 1$
There are $K-1$ independent values of $\mu_k$ needed to define the distribution

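The 1-of-K representation and the product form of $p(\mathbf{x} \mid \boldsymbol{\mu})$ can be sketched in a few lines of Python. This is only an illustration: the helper names `one_hot` and `categorical_prob` and the values in `mu` are assumptions made for the example, not part of the slides.

```python
import numpy as np

def one_hot(k, K):
    """Return the 1-of-K vector with a 1 in position k (1-indexed, as on the slide)."""
    x = np.zeros(K, dtype=int)
    x[k - 1] = 1
    return x

def categorical_prob(x, mu):
    """Evaluate p(x | mu) = prod_k mu_k ** x_k for a 1-of-K vector x."""
    return float(np.prod(np.asarray(mu) ** np.asarray(x)))

# Illustrative parameter values (assumed); they must sum to 1.
mu = np.array([0.1, 0.2, 0.3, 0.1, 0.2, 0.1])
x = one_hot(3, 6)             # state 3 of K=6
p = categorical_prob(x, mu)   # the product picks out mu_3
```

Because exactly one component of x equals 1, the product reduces to the single parameter of the active state.
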
Two Discrete Variables
$\mathbf{x}_1$ and $\mathbf{x}_2$, each with $K$ states
Denote the probability of both $x_{1k}=1$ and $x_{2l}=1$ by $\mu_{kl}$, where $x_{1k}$ denotes the $k$th component of $\mathbf{x}_1$
Joint distribution is
  $p(\mathbf{x}_1, \mathbf{x}_2 \mid \boldsymbol{\mu}) = \prod_{k=1}^{K} \prod_{l=1}^{K} \mu_{kl}^{x_{1k} x_{2l}}$
Since the parameters are subject to the constraint $\sum_k \sum_l \mu_{kl} = 1$, there are $K^2-1$ parameters
For an arbitrary distribution over $M$ variables there are $K^M-1$ parameters

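Graphical Models for Two Discrete Variables
Joint distribution $p(\mathbf{x}_1, \mathbf{x}_2)$, using the product rule, is factored as $p(\mathbf{x}_2 \mid \mathbf{x}_1)\,p(\mathbf{x}_1)$
Has a two-node graph with a link from $\mathbf{x}_1$ to $\mathbf{x}_2$
Marginal distribution $p(\mathbf{x}_1)$ has $K-1$ parameters
Conditional distribution $p(\mathbf{x}_2 \mid \mathbf{x}_1)$ also requires $K-1$ parameters for each of the $K$ values of $\mathbf{x}_1$
Total number of parameters is $(K-1)+K(K-1) = K^2-1$, as before
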
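As a rough check of the counting argument, the sketch below builds the joint table $\mu_{kl}$ from a marginal and a conditional table and compares the two parameter counts. The table values and variable names are assumed purely for illustration.

```python
import numpy as np

K = 3
# p(x1): K entries that sum to 1, hence K-1 free parameters (values assumed).
p_x1 = np.array([0.5, 0.3, 0.2])
# Row k holds p(x2 | x1 = k); each row sums to 1, hence K-1 free parameters per row.
p_x2_given_x1 = np.array([[0.7, 0.2, 0.1],
                          [0.1, 0.8, 0.1],
                          [0.3, 0.3, 0.4]])

# Product rule: mu_kl = p(x1 = k) * p(x2 = l | x1 = k).
joint = p_x1[:, None] * p_x2_given_x1

n_factored = (K - 1) + K * (K - 1)   # marginal plus conditional table
n_joint = K**2 - 1                   # full joint table with one sum constraint
```

Both counts agree, which reflects that the two-node graph with a link places no restriction on the joint distribution.
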
Two Independent Discrete Variables
$\mathbf{x}_1$ and $\mathbf{x}_2$ are independent
Has a graphical model with two nodes and no link
Each variable is described by a separate multinomial distribution
Total no. of parameters is $2(K-1)$
For $M$ variables the no. of parameters is $M(K-1)$
  Reduced number of parameters by dropping links in the graph
  Grows linearly with the no. of variables

Fully connected has high complexity
General case of $M$ discrete variables $\mathbf{x}_1,..,\mathbf{x}_M$
If the BN is fully connected
  Completely general distribution with $K^M-1$ parameters
If there are no links
  Joint distribution factorizes into a product of marginals
  Total no. of parameters is $M(K-1)$
Graphs of intermediate levels of connectivity
  More general distributions than fully factorized ones
  Require fewer parameters than the general joint distribution
  Example: chain of nodes

Special Case: Chain of Nodes
Marginal distribution $p(\mathbf{x}_1)$ requires $K-1$ parameters
Each of the $M-1$ conditional distributions $p(\mathbf{x}_i \mid \mathbf{x}_{i-1})$, for $i=2,..,M$, requires $K(K-1)$ parameters
Total parameter count is $K-1+(M-1)K(K-1)$
  Which is quadratic in $K$
  Grows linearly (not exponentially) with the length of the chain

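The three parameter counts discussed so far (fully connected, no links, chain) can be compared directly. The function names below are hypothetical helpers introduced for this sketch.

```python
def n_params_fully_connected(K, M):
    """General joint over M variables with K states each: K^M - 1."""
    return K**M - 1

def n_params_no_links(K, M):
    """M independent variables: M(K - 1)."""
    return M * (K - 1)

def n_params_chain(K, M):
    """Chain x_1 -> x_2 -> ... -> x_M: (K - 1) + (M - 1) K (K - 1)."""
    return (K - 1) + (M - 1) * K * (K - 1)

# Example: M = 10 binary variables (K = 2).
counts = (n_params_fully_connected(2, 10),
          n_params_no_links(2, 10),
          n_params_chain(2, 10))
```

For ten binary variables the chain needs 19 parameters, compared with 1023 for the fully connected graph and 10 for the graph with no links.
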
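Alternative: Sharing Parameters
Reduce the number of parameters by sharing (tying) parameters
In the chain above, all conditional distributions $p(\mathbf{x}_i \mid \mathbf{x}_{i-1})$, for $i=2,..,M$, share the same set of $K(K-1)$ parameters
Together with the $K-1$ parameters governing the distribution of $\mathbf{x}_1$, a total of $K^2-1$ parameters is needed to specify the distribution
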
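A minimal sketch of a chain with tied parameters follows: one transition table `A` is reused at every step, so the parameter count no longer depends on the chain length. The function names and the numerical values are assumptions for illustration only.

```python
import numpy as np

def chain_marginals(p_x1, A, M):
    """Marginals p(x_1), ..., p(x_M) for a chain with one shared table A.

    A[k, l] = p(x_i = l | x_{i-1} = k) is reused for every i = 2..M.
    """
    marginals = [np.asarray(p_x1, dtype=float)]
    for _ in range(M - 1):
        # Sum over the previous state: p(x_i) = sum_k p(x_{i-1} = k) A[k, :].
        marginals.append(marginals[-1] @ A)
    return np.stack(marginals)

def n_params_tied_chain(K):
    """(K - 1) for p(x_1) plus K(K - 1) for the single shared table."""
    return (K - 1) + K * (K - 1)

p_x1 = np.array([1.0, 0.0])     # assumed initial distribution
A = np.array([[0.9, 0.1],
              [0.5, 0.5]])      # assumed shared transition table
marg = chain_marginals(p_x1, A, M=3)
```

Here `marg[2]` is the marginal of the third node, obtained by applying the same table twice.
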
Conversion into Bayesian Model
Given a graph over discrete variables, we can turn it into a Bayesian model by introducing Dirichlet priors for the parameters
Each discrete node acquires an additional parent representing the Dirichlet prior over its parameters
Tie the parameters governing the conditional distributions $p(\mathbf{x}_i \mid \mathbf{x}_{i-1})$
[Figures: chain of nodes with priors; chain of nodes with shared parameters]

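Binomial: Beta Prior
Bernoulli: $p(x=1 \mid \mu) = \mu$
Likelihood of Bernoulli with $D=\{x_1,..,x_N\}$:
  $p(D \mid \mu) = \prod_{n=1}^{N} \mu^{x_n} (1-\mu)^{1-x_n}$
Binomial:
  $\mathrm{Bin}(m \mid N, \mu) = \binom{N}{m} \mu^{m} (1-\mu)^{N-m}$
Conjugate prior:
  $\mathrm{Beta}(\mu \mid a, b) = \frac{\Gamma(a+b)}{\Gamma(a)\Gamma(b)} \mu^{a-1} (1-\mu)^{b-1}$
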
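Conjugacy means the posterior stays in the Beta family: multiplying the Bernoulli likelihood by $\mathrm{Beta}(\mu \mid a, b)$ gives $\mathrm{Beta}(\mu \mid a+m, b+N-m)$, where $m$ is the number of observations with $x_n=1$. The sketch below illustrates this standard result; the function names and the data are assumed for the example.

```python
def beta_posterior(a, b, data):
    """Update a Beta(a, b) prior with binary observations (0 or 1)."""
    m = sum(data)        # number of ones
    n = len(data)        # total number of observations N
    return a + m, b + (n - m)

def beta_mean(a, b):
    """Mean of Beta(a, b), equal to a / (a + b)."""
    return a / (a + b)

data = [1, 0, 1, 1]                        # assumed observations
a_post, b_post = beta_posterior(2, 2, data)
post_mean = beta_mean(a_post, b_post)
```

The hyperparameters a and b act like counts of previously seen ones and zeros, which is why the update is simple addition.
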
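Multinomial: Dirichlet Prior
Generalized Bernoulli (1-of-K), e.g. $\mathbf{x}=(0,0,1,0,0,0)^T$ for $K=6$:
  $p(\mathbf{x} \mid \boldsymbol{\mu}) = \prod_{k=1}^{K} \mu_k^{x_k}$, where $\boldsymbol{\mu} = (\mu_1,..,\mu_K)^T$
  $K=2$ is Bernoulli
Multinomial:
  $\mathrm{Mult}(m_1, m_2, .., m_K \mid \boldsymbol{\mu}, N) = \binom{N}{m_1\, m_2\, ..\, m_K} \prod_{k=1}^{K} \mu_k^{m_k}$
  $K=2$ is Binomial
  The normalization coefficient is the no. of ways of partitioning $N$ objects into $K$ groups of sizes $m_1, m_2, .., m_K$:
  $\binom{N}{m_1\, m_2\, ..\, m_K} = \frac{N!}{m_1!\, m_2!\, ..\, m_K!}$
Conjugate prior distribution for the parameters $\mu_k$:
  $p(\boldsymbol{\mu} \mid \boldsymbol{\alpha}) \propto \prod_{k=1}^{K} \mu_k^{\alpha_k - 1}$, where $0 \le \mu_k \le 1$ and $\sum_k \mu_k = 1$
Normalized form is
  $\mathrm{Dir}(\boldsymbol{\mu} \mid \boldsymbol{\alpha}) = \frac{\Gamma(\alpha_0)}{\Gamma(\alpha_1) \cdots \Gamma(\alpha_K)} \prod_{k=1}^{K} \mu_k^{\alpha_k - 1}$, where $\alpha_0 = \sum_{k=1}^{K} \alpha_k$
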
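The multinomial coefficient and the normalized Dirichlet density can be evaluated with the Python standard library alone. The sketch below also includes the standard conjugate update (add the observed counts to the Dirichlet parameters); the function names and example values are assumptions for illustration.

```python
from math import exp, factorial, lgamma, log

def multinomial_coefficient(counts):
    """N! / (m_1! m_2! ... m_K!) with N = sum of the counts."""
    coeff = factorial(sum(counts))
    for m in counts:
        coeff //= factorial(m)   # each intermediate quotient is an integer
    return coeff

def dirichlet_pdf(mu, alpha):
    """Density Dir(mu | alpha); assumes every mu_k > 0 and sum(mu) == 1."""
    log_norm = lgamma(sum(alpha)) - sum(lgamma(a) for a in alpha)
    log_kernel = sum((a - 1.0) * log(m) for a, m in zip(alpha, mu))
    return exp(log_norm + log_kernel)

def dirichlet_posterior(alpha, counts):
    """Conjugate update: posterior parameters are alpha_k + m_k."""
    return [a + m for a, m in zip(alpha, counts)]

coeff = multinomial_coefficient([2, 1, 1])
density = dirichlet_pdf([0.2, 0.3, 0.5], [1.0, 1.0, 1.0])
alpha_post = dirichlet_posterior([1, 1, 1], [2, 1, 1])
```

With all $\alpha_k = 1$ the Dirichlet is uniform over the simplex, so the density is the constant $\Gamma(3) = 2$ for $K=3$.
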
Controlling Number of Parameters in Models: Parameterized Conditional Distributions
Control the exponential growth of parameters in models of discrete variables
Use parameterized models for the conditional distributions instead of complete tables of conditional probability values

Parameterized Conditional Distributions
Consider a graph with binary variables: parents $x_1,..,x_M$ and child $y$
Each parent variable $x_i$ is governed by a single parameter $\mu_i$ representing the probability $p(x_i=1)$
  $M$ parameters in total for the parent nodes
Conditional distribution $p(y \mid x_1,..,x_M)$ requires $2^M$ parameters
  Representing the probability $p(y=1)$ for each of the $2^M$ settings of the parent variables, from all zeros (000000000) to all ones

Conditional Distribution Using Logistic Sigmoid
Parsimonious form of the conditional distribution
Logistic sigmoid acting on a linear combination of the parent variables:
  $p(y=1 \mid x_1,..,x_M) = \sigma\left(w_0 + \sum_{i=1}^{M} w_i x_i\right) = \sigma(\mathbf{w}^T \mathbf{x})$
where $\sigma(a) = (1+\exp(-a))^{-1}$ is the logistic sigmoid
$\mathbf{x}=(x_0,x_1,..,x_M)^T$ is the vector of parent states, augmented with $x_0=1$
No. of parameters grows linearly with $M$
Analogous to the choice of a restrictive form of covariance matrix in a multivariate Gaussian

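A small sketch of the sigmoid parameterization follows, contrasting its $M+1$ parameters with the $2^M$ entries of a full conditional table. The function names and the weight values are assumed for illustration.

```python
import numpy as np

def sigmoid(a):
    """Logistic sigmoid sigma(a) = 1 / (1 + exp(-a))."""
    return 1.0 / (1.0 + np.exp(-a))

def p_y1_given_parents(x_parents, w):
    """p(y = 1 | x_1..x_M) = sigma(w^T x), with x_0 = 1 prepended for the bias w_0."""
    x = np.concatenate(([1.0], np.asarray(x_parents, dtype=float)))
    return float(sigmoid(w @ x))

M = 9
w = np.zeros(M + 1)                 # assumed weights: w_0 plus one weight per parent
p = p_y1_given_parents(np.ones(M), w)

n_sigmoid_params = len(w)           # M + 1, linear in M
n_table_params = 2**M               # one entry per setting of the parents
```

With nine binary parents the table needs 512 probabilities, while the sigmoid form needs 10 weights, at the cost of restricting which conditional distributions can be represented.
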
Linear Gaussian Models
Expressing a multivariate Gaussian as a directed graph
Corresponds to a linear Gaussian model over the component variables
  Mean of a conditional distribution is a linear function of the conditioning variable
Allows expressing interesting structure of the distribution
  General Gaussian case and diagonal covariance case represent opposite extremes

Graph with Continuous Random Variables
Arbitrary directed acyclic graph over $D$ variables
Node $i$ represents a single continuous random variable $x_i$ having a Gaussian distribution
Mean of the distribution is a linear combination of the states of the parent nodes $\mathrm{pa}_i$ of node $i$:
  $p(x_i \mid \mathrm{pa}_i) = \mathcal{N}\left(x_i \,\Big|\, \sum_{j \in \mathrm{pa}_i} w_{ij} x_j + b_i,\; v_i\right)$
where $w_{ij}$ and $b_i$ are parameters governing the mean
$v_i$ is the variance of the conditional distribution for $x_i$

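Samples from such a network can be drawn by ancestral sampling: visit the nodes in an order where parents come first and draw each $x_i$ from its conditional Gaussian. The sketch below assumes the nodes are already numbered so that parents have lower indices (so the weight matrix is strictly lower triangular); the function name and all numerical values are illustrative assumptions.

```python
import numpy as np

def ancestral_sample(W, b, v, n_samples, rng):
    """Draw samples from a linear-Gaussian network.

    W[i, j] is the weight w_ij from parent j to node i (zero if no link);
    W is assumed strictly lower triangular, so parents precede children.
    """
    D = len(b)
    X = np.zeros((n_samples, D))
    for i in range(D):
        # Conditional mean: sum_j w_ij x_j + b_i over the lower-numbered nodes.
        mean_i = X[:, :i] @ W[i, :i] + b[i]
        X[:, i] = mean_i + np.sqrt(v[i]) * rng.standard_normal(n_samples)
    return X

# Assumed three-node chain x1 -> x2 -> x3.
W = np.array([[0.0, 0.0, 0.0],
              [0.5, 0.0, 0.0],
              [0.0, -1.0, 0.0]])
b = np.array([1.0, 2.0, 0.5])
v = np.array([1.0, 0.5, 2.0])

rng = np.random.default_rng(0)
X = ancestral_sample(W, b, v, n_samples=200_000, rng=rng)
```

The sample means should approach the means obtained by propagating the $b_i$ through the graph, here $(1, 2.5, -2)$.
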
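Joint Distribution
Log of the joint distribution:
  $\ln p(\mathbf{x}) = \sum_{i=1}^{D} \ln p(x_i \mid \mathrm{pa}_i) = -\sum_{i=1}^{D} \frac{1}{2 v_i} \left(x_i - \sum_{j \in \mathrm{pa}_i} w_{ij} x_j - b_i\right)^2 + \text{const}$
where $\mathbf{x}=(x_1,..,x_D)^T$ and const denotes terms independent of $\mathbf{x}$
This is a quadratic function of $\mathbf{x}$
Hence the joint distribution $p(\mathbf{x})$ is a multivariate Gaussian
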
Mean and Covariance of Joint Distribution: Recursive Formulation
Since each variable $x_i$ has, conditional on the states of its parents, a Gaussian distribution, we can write
  $x_i = \sum_{j \in \mathrm{pa}_i} w_{ij} x_j + b_i + \sqrt{v_i}\, \epsilon_i$
where $\epsilon_i$ is a zero-mean, unit-variance Gaussian random variable satisfying $E[\epsilon_i]=0$ and $E[\epsilon_i \epsilon_j]=I_{ij}$, and $I_{ij}$ is the $i,j$ element of the identity matrix
Taking the expectation:
  $E[x_i] = \sum_{j \in \mathrm{pa}_i} w_{ij} E[x_j] + b_i$
Thus we can find the components of $E[\mathbf{x}]=(E[x_1],..,E[x_D])^T$ by starting at the lowest numbered node and working recursively through the graph
Similarly, the elements of the covariance matrix are
  $\mathrm{cov}[x_i, x_j] = \sum_{k \in \mathrm{pa}_j} w_{jk}\, \mathrm{cov}[x_i, x_k] + I_{ij} v_j$

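The two recursions can be sketched as a single pass over the nodes in numbered order. As before, the code assumes parents have lower indices than their children, stores the weights in a strictly lower triangular matrix, and uses a hypothetical function name and illustrative values.

```python
import numpy as np

def joint_mean_cov(W, b, v):
    """Mean and covariance of p(x) via the recursions for E[x_i] and cov[x_i, x_j].

    W[i, j] = w_ij for parent j of node i (zero if no link), strictly lower triangular.
    """
    D = len(b)
    mean = np.zeros(D)
    cov = np.zeros((D, D))
    for j in range(D):
        # E[x_j] = sum_k w_jk E[x_k] + b_j
        mean[j] = W[j, :j] @ mean[:j] + b[j]
        # cov[x_i, x_j] = sum_k w_jk cov[x_i, x_k] for i < j
        for i in range(j):
            cov[i, j] = cov[j, i] = W[j, :j] @ cov[i, :j]
        # cov[x_j, x_j] = sum_k w_jk cov[x_j, x_k] + v_j
        cov[j, j] = W[j, :j] @ cov[j, :j] + v[j]
    return mean, cov

# Assumed three-node chain x1 -> x2 -> x3.
W = np.array([[0.0, 0.0, 0.0],
              [0.5, 0.0, 0.0],
              [0.0, -1.0, 0.0]])
b = np.array([1.0, 2.0, 0.5])
v = np.array([1.0, 0.5, 2.0])
mean, cov = joint_mean_cov(W, b, v)
```

One way to cross-check the result is the closed form implied by the same linear equations, $\mathrm{cov} = (I-W)^{-1}\,\mathrm{diag}(v)\,(I-W)^{-T}$.
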
Three Cases for No. of Parameters
No links in the graph: $2D$ parameters
Fully connected graph: $D(D+1)/2$ parameters in the covariance matrix
Graphs with an intermediate level of complexity, e.g. a chain
[Figures: example graphs over nodes $x_1,..,x_6$]

Extreme Case with No Links
$D$ isolated nodes
There are no parameters $w_{ij}$
  Only $D$ parameters $b_i$ and $D$ parameters $v_i$
Mean of $p(\mathbf{x})$ is given by $(b_1,..,b_D)^T$
Covariance matrix is diagonal, of the form $\mathrm{diag}(v_1,..,v_D)$
Joint distribution has a total of $2D$ parameters
Represents a set of $D$ independent univariate Gaussian distributions

Extreme Case with All Links
Fully connected graph
Each node has all lower numbered nodes as parents
Matrix $w_{ij}$ has $i-1$ entries on the $i$th row and hence is a lower triangular matrix (with no entries on the leading diagonal)
Total no. of parameters $w_{ij}$: take the $D^2$ elements of a $D \times D$ matrix, subtract $D$ to account for the diagonal, and divide by 2 to keep only the elements below the diagonal, giving $D(D-1)/2$
Adding the $D$ variances $v_i$ gives $D(D+1)/2$ independent parameters in the covariance matrix

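The counting argument for the fully connected graph can be checked by enumerating the entries below the diagonal. The function name is a hypothetical helper for this sketch.

```python
def count_weights_fully_connected(D):
    """Count the entries w_ij with j < i, i.e. strictly below the diagonal."""
    return sum(1 for i in range(D) for j in range(i))

D = 5
n_weights = count_weights_fully_connected(D)   # D(D-1)/2
n_covariance_params = n_weights + D            # add the D variances v_i
n_no_links = 2 * D                             # D biases b_i plus D variances v_i
```

For $D=5$ this gives 10 weights and 15 covariance parameters, matching $D(D-1)/2$ and $D(D+1)/2$.
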
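Graph with Intermediate Complexity
Three-node chain with the link missing between variables $x_1$ and $x_3$
Mean and covariance of the joint distribution are
  $\boldsymbol{\mu} = (b_1,\; b_2 + w_{21} b_1,\; b_3 + w_{32} b_2 + w_{32} w_{21} b_1)^T$
  $\boldsymbol{\Sigma} = \begin{pmatrix} v_1 & w_{21} v_1 & w_{32} w_{21} v_1 \\ w_{21} v_1 & v_2 + w_{21}^2 v_1 & w_{32}(v_2 + w_{21}^2 v_1) \\ w_{32} w_{21} v_1 & w_{32}(v_2 + w_{21}^2 v_1) & v_3 + w_{32}^2 (v_2 + w_{21}^2 v_1) \end{pmatrix}$
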
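For reference, these entries follow from applying the recursions for $E[x_i]$ and $\mathrm{cov}[x_i, x_j]$ node by node along the chain $x_1 \to x_2 \to x_3$. The steps below are a worked sketch, using $\mathrm{pa}_2 = \{x_1\}$ and $\mathrm{pa}_3 = \{x_2\}$:

```latex
\begin{align*}
E[x_1] &= b_1 \\
E[x_2] &= w_{21} E[x_1] + b_2 = b_2 + w_{21} b_1 \\
E[x_3] &= w_{32} E[x_2] + b_3 = b_3 + w_{32} b_2 + w_{32} w_{21} b_1 \\[4pt]
\mathrm{cov}[x_1, x_1] &= v_1 \\
\mathrm{cov}[x_1, x_2] &= w_{21}\, \mathrm{cov}[x_1, x_1] = w_{21} v_1 \\
\mathrm{cov}[x_2, x_2] &= w_{21}\, \mathrm{cov}[x_2, x_1] + v_2 = v_2 + w_{21}^2 v_1 \\
\mathrm{cov}[x_1, x_3] &= w_{32}\, \mathrm{cov}[x_1, x_2] = w_{32} w_{21} v_1 \\
\mathrm{cov}[x_2, x_3] &= w_{32}\, \mathrm{cov}[x_2, x_2] = w_{32} (v_2 + w_{21}^2 v_1) \\
\mathrm{cov}[x_3, x_3] &= w_{32}\, \mathrm{cov}[x_3, x_2] + v_3 = v_3 + w_{32}^2 (v_2 + w_{21}^2 v_1)
\end{align*}
```
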
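Extension to Multivariate Gaussian Variables
Nodes in the graph represent multivariate Gaussian variables
Write the conditional distribution for node $i$ in the form
  $p(\mathbf{x}_i \mid \mathrm{pa}_i) = \mathcal{N}\left(\mathbf{x}_i \,\Big|\, \sum_{j \in \mathrm{pa}_i} \mathbf{W}_{ij} \mathbf{x}_j + \mathbf{b}_i,\; \boldsymbol{\Sigma}_i\right)$
where $\mathbf{W}_{ij}$ is a matrix (non-square if $\mathbf{x}_i$ and $\mathbf{x}_j$ have different dimensionalities)
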
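The role of the non-square matrices $\mathbf{W}_{ij}$ can be seen in a short sketch that computes the conditional mean of a 2-dimensional node with one 3-dimensional parent. The function name, shapes, and values are assumptions chosen for illustration.

```python
import numpy as np

def conditional_mean(parent_values, W_list, b):
    """Mean of p(x_i | pa_i): sum_j W_ij x_j + b_i for vector-valued nodes."""
    return sum(W @ x for W, x in zip(W_list, parent_values)) + b

# One parent x_j of dimension 3, child x_i of dimension 2, so W_ij is 2 x 3.
W_ij = np.array([[1.0, 0.0, 2.0],
                 [0.0, 1.0, 0.0]])
x_j = np.array([1.0, 2.0, 3.0])
b_i = np.array([0.5, -0.5])

m_i = conditional_mean([x_j], [W_ij], b_i)
```

Each $\mathbf{W}_{ij}$ maps from the parent's dimensionality to the child's, so its shape is (dim of $\mathbf{x}_i$) by (dim of $\mathbf{x}_j$).
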
Summary
1. PGMs allow visualizing probabilistic models
   Joint distributions are represented as directed/undirected PGMs
2. PGMs can be used to generate samples
   Ancestral sampling with directed PGMs is simple
3. PGMs are useful for Bayesian statistics
   Discrete variable PGM represented using Dirichlet priors
4. Parameter explosion controlled by tying parameters
5. Multivariate Gaussian expressed as PGM
   Graph is a linear Gaussian model over components