RECONCILING ATTRIBUTE VALUES FROM MULTIPLE DATA SOURCES

Zhengrui Jiang
School of Management, University of Texas at Dallas, Richardson, TX, U.S.A.

Prabuddha De
Krannert School of Management, Purdue University, West Lafayette, IN, U.S.A.

Sumit Sarkar
School of Management, University of Texas at Dallas, Richardson, TX, U.S.A.

Debabrata Dey
University of Washington Business School, Seattle, WA, U.S.A.

2004 Twenty-Fifth International Conference on Information Systems

Abstract

Because of the heterogeneous nature of multiple data sources, data integration is often one of the most challenging tasks of today's information systems. While the existing literature has focused on problems such as schema integration and entity identification, our current study attempts to answer a basic question: When an attribute value for a real-world entity is recorded differently in two databases, how should the "best" value be chosen from the set of possible values? We first show how probabilities for attribute values can be derived, and then propose a framework for deciding the cost-minimizing value based on the total cost of type I, type II, and misrepresentation errors.

Keywords: Data integration, heterogeneous databases, probabilistic databases, misclassification errors, misrepresentation errors

Introduction

Business decisions often require data from multiple sources. As has been widely documented, integrating data from several existing independent databases poses a variety of complex problems. Data integration problems arising from heterogeneous data sources can be divided into two broad categories: schema-level problems and instance-level problems (Rahm and Do 2000). Topics such as schema integration (Batini et al. 1986) and semantic conflict resolution (Ram and Park 2004) belong to the first category, while problems such as entity identification and matching (Dey et al. 1998b) and data cleaning and duplication removal (Hernandez and Stolfo 1998) belong to the second. All of these problems have been extensively studied and various solutions have been proposed.

However, after schema integration and entity matching, another problem emerges: What should be done if, once all schema-level problems have been resolved, all real-world entities optimally matched, and duplicates removed, we still face two conflicting data values for the same attribute of a real-world entity? How should we deal with the conflicting attribute values when we merge, for example, alumni data stored separately by a university and by one of its departments and encounter two different work addresses for the same person? One solution is to store all conflicting values with associated probabilities; probabilistic relational models (Dekhtyar et al. 2001; Dey and Sarkar 1996) have been proposed in that context. However, despite the theoretical progress on the probabilistic database model, it is not commercially available as yet. Even if it becomes readily available in the future, because of the significant overhead associated with the storing and handling of the probabilistic data, it remains to be seen whether it is cost-justifiable to implement a probabilistic database model. Therefore, storing the most likely value or the best value based on some given criteria seems to be a more practical solution at this point.

To choose a single deterministic value, we need to first evaluate the probabilities assigned to each conflicting value. Various pieces of information, such as values of related attributes, time stamps of stored values in different data sources, and data source reliabilities, may be utilized to estimate the probability distributions associated with all possible true attribute values. The approach we propose in this study is as follows: We first estimate the probability for each conflicting attribute value based on source data (attribute) reliability, and then determine the best value to store, based on total expected error costs associated with each candidate value. We examine stochastic attributes with only discrete domains in this study.

The paper is organized as follows. In the next section, we derive the probability associated with each possible attribute value based on source data reliability. We then classify queries based on possible errors that may result from incorrect attribute values. Such classifications are used in computing the total expected cost associated with incorrect values being stored in the database. We demonstrate how the cost-minimizing attribute values can be determined for a discrete attribute. We extend the solution to multiple discrete attributes. The last section provides concluding remarks and discusses possible extensions.

Computing Attribute Value Probabilities

We first derive the probabilities for a single discrete attribute. Consider data sources $S_1$ and $S_2$. We denote by $A^1$ the value of attribute $A$ for a particular entity instance as observed in $S_1$, and by $A^2$ the value of attribute $A$ for the same entity instance as observed in $S_2$. For example, we may find $A = a_1$ in $S_1$ ($A^1 = a_1$) and $A = a_2$ in $S_2$ ($A^2 = a_2$).
For any number of reasons, the data in these data sources may be inaccurate (Dey et al. 1998b). We would like to determine the probability that a specific value (which may or may not be the value observed from a data source) is indeed the true value of an attribute. When multiple sources are involved, it would require us to consider the reliability of the different data sources. In general, the required probability terms can be expressed as $P(A = a_k \mid A^1 = a_i, A^2 = a_j)$. We identify the following situations that cover all the possibilities:

Case 1a: $k = i = j$; $P(A = a_i \mid A^1 = a_i, A^2 = a_i)$,
Case 1b: $k \neq i = j$; $P(A = a_k \mid A^1 = a_i, A^2 = a_i)$; and
Case 2a: $k = i \neq j$; $P(A = a_i \mid A^1 = a_i, A^2 = a_j)$,
Case 2b: $k \neq i \neq j$; $P(A = a_k \mid A^1 = a_i, A^2 = a_j)$.

What is Available?

We can sample $S_1$ and $S_2$ to find what proportion of values of attribute $A$ is accurate in $S_1$, and what proportion of values of attribute $A$ is accurate in $S_2$, in general. For example, we may sample $S_1$ and $S_2$, and find that attribute $A$ is accurate in $S_1$ 80 percent of the time (i.e., 80 percent of the values for attribute $A$ are correct in the sample from $S_1$), and that it is accurate in $S_2$ 90 percent of the time (Morey 1982). Then, $P(A = a_i \mid A^1 = a_i) = 0.8$, $P(A \neq a_i \mid A^1 = a_i) = 0.2$; and $P(A = a_j \mid A^2 = a_j) = 0.9$, $P(A \neq a_j \mid A^2 = a_j) = 0.1$.

Assumptions

Our objective is to determine the desired probabilities for the attribute values based on sample estimates from each individual data source. In order to do that, we need to make a few assumptions. These are listed next.

Assumption 1: The value of $A$ recorded in $S_1$ (i.e., $A^1$) is not dependent on the value of $A$ recorded in $S_2$, once we know the true value of $A$ (and vice versa). This implicitly assumes that the causes of errors are independent in the two data sources. Mathematically, this implies

$$P(A^1 = a_i \mid A = a_k, A^2 = a_j) = P(A^1 = a_i \mid A = a_k) \quad \forall\, i, j, k.$$

The above assumption would, of course, be violated if the data in one source are derived from the data in the other source.

Assumption 2: Our priors for $P(A = a_i)$, $P(A = a_j)$, etc. are the same if we have no reason a priori to believe that one value is more likely to occur than another. This would be true in general when the domain is large or quite unpredictable. Mathematically, this implies

$$P(A = a_k) = 1 / |A| \quad \forall\, k,$$

where $|A|$ is the number of possible realizations of attribute $A$. If the domain is restricted to just a few values, and one (or a subset of those values) is known to be predominant, then the appropriate priors should be used. These priors can be incorporated in our estimate (we omit this analysis here for space considerations).

Assumption 3: All possible values of $A$ other than that observed from a particular data source are assumed to be equally likely. Mathematically, this implies

$$P(A = a_k \mid A^1 = a_i) = \frac{P(A \neq a_i \mid A^1 = a_i)}{|A| - 1} = \frac{1 - P(A = a_i \mid A^1 = a_i)}{|A| - 1} \quad \forall\, k \neq i,$$

where $|A|$ is the number of possible realizations of attribute $A$. The implications are similar to those of the previous assumption.

Probability Estimates for Attribute Values: Single Attribute Case

Based on the reliability information about an attribute in two data sources and the assumptions discussed above, we derive the probabilities of true values in various situations as follows (detailed derivations are provided in the appendix).

Case 1a: $k = i = j$;

$$P(A = a_i \mid A^1 = a_i, A^2 = a_i) = \frac{P(A = a_i \mid A^1 = a_i)\, P(A = a_i \mid A^2 = a_i)}{P(A = a_i \mid A^1 = a_i)\, P(A = a_i \mid A^2 = a_i) + P(A \neq a_i \mid A^1 = a_i)\, P(A \neq a_i \mid A^2 = a_i) / [|A| - 1]}$$

The above expression illustrates that, as the number of possible values of an attribute increases, the likelihood that both sources have incorrectly captured $A$ goes down.

Case 1b: $k \neq i = j$;

$$P(A = a_k \mid A^1 = a_i, A^2 = a_i) = \frac{P(A \neq a_i \mid A^1 = a_i)\, P(A \neq a_i \mid A^2 = a_i) / [|A| - 1]^2}{P(A = a_i \mid A^1 = a_i)\, P(A = a_i \mid A^2 = a_i) + P(A \neq a_i \mid A^1 = a_i)\, P(A \neq a_i \mid A^2 = a_i) / [|A| - 1]}$$

Case 2a: $k = i \neq j$;

$$P(A = a_i \mid A^1 = a_i, A^2 = a_j) = \frac{P(A = a_i \mid A^1 = a_i)\, P(A \neq a_j \mid A^2 = a_j)}{P(A = a_i \mid A^1 = a_i)\, P(A \neq a_j \mid A^2 = a_j) + P(A \neq a_i \mid A^1 = a_i)\, P(A = a_j \mid A^2 = a_j) + P(A \neq a_i \mid A^1 = a_i)\, P(A \neq a_j \mid A^2 = a_j)\, [|A| - 2] / [|A| - 1]}$$

In the situation where $A$ can have only two values $a_i$ and $a_j$, the above expression becomes

$$P(A = a_i \mid A^1 = a_i, A^2 = a_j) = \frac{P(A = a_i \mid A^1 = a_i)\, P(A \neq a_j \mid A^2 = a_j)}{P(A = a_i \mid A^1 = a_i)\, P(A \neq a_j \mid A^2 = a_j) + P(A \neq a_i \mid A^1 = a_i)\, P(A = a_j \mid A^2 = a_j)}$$

The expression for $P(A = a_j \mid A^1 = a_i, A^2 = a_j)$ is analogous.

Case 2b: $k \neq i \neq j$;

$$P(A = a_k \mid A^1 = a_i, A^2 = a_j) = \frac{P(A \neq a_i \mid A^1 = a_i)\, P(A \neq a_j \mid A^2 = a_j) / [|A| - 1]}{P(A = a_i \mid A^1 = a_i)\, P(A \neq a_j \mid A^2 = a_j) + P(A \neq a_i \mid A^1 = a_i)\, P(A = a_j \mid A^2 = a_j) + P(A \neq a_i \mid A^1 = a_i)\, P(A \neq a_j \mid A^2 = a_j)\, [|A| - 2] / [|A| - 1]}$$

Probability Estimates for Attribute Values: Multiple Attributes Case

In the above derivation, only one common attribute is considered. The solutions can, however, be easily extended to situations where multiple attributes are common across the databases. If the reliabilities of the attributes are independent of one another, the analysis presented in the previous subsection applies for each attribute individually. When the reliabilities across attributes are dependent, we treat that group of attributes as one composite attribute and all possible combinations of the multiple attributes' values as the possible realizations of the composite attribute. If two attributes $A$ and $B$ form a composite attribute, then the number of realizations for this composite attribute is $|A|\,|B|$. For example, if the values stored for a composite attribute are the same in two locations, then analogous to Case 1a, we have

$$P(A = a_i, B = b_j \mid A^1 = a_i, B^1 = b_j, A^2 = a_i, B^2 = b_j) = \frac{N}{N + D},$$

where

$$N = P(A = a_i, B = b_j \mid A^1 = a_i, B^1 = b_j)\, P(A = a_i, B = b_j \mid A^2 = a_i, B^2 = b_j),$$

$$D = P\big((A, B) \neq (a_i, b_j) \mid A^1 = a_i, B^1 = b_j\big)\, P\big((A, B) \neq (a_i, b_j) \mid A^2 = a_i, B^2 = b_j\big) / [|A|\,|B| - 1].$$

The desired probabilities for the other cases can be obtained in the same manner.

Alumni Database Example

Consider alumni data that are collected independently in both a university database ($S_1$) and a department database ($S_2$), and suppose we want to merge them into one database. Some attribute values for the same person could be different in the two sources.

Binary Attribute

Suppose there exists a binary attribute Self-Employed (SE for brevity), which can take a value of $a_1$ = Yes or $a_2$ = No. Assume that, based on sampling, we have found that this attribute is accurate in $S_1$ 80 percent of the time and that it is accurate in $S_2$ 90 percent of the time, i.e., $P(SE = a_i \mid SE^1 = a_i) = 0.8$, $P(SE \neq a_i \mid SE^1 = a_i) = 0.2$, $\forall\, i = 1, 2$; and $P(SE = a_j \mid SE^2 = a_j) = 0.9$, $P(SE \neq a_j \mid SE^2 = a_j) = 0.1$, $\forall\, j = 1, 2$.
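As a rough illustration of the single-attribute estimates derived above, Cases 1a to 2b can be read as a single normalization: multiply the two sources' beliefs about each candidate value and divide by the sum of those products over the whole domain. The Python sketch below is only a minimal rendering of that reading under Assumptions 1 to 3. The names `source_belief` and `reconcile` are hypothetical and do not come from the paper.

```python
def source_belief(value, observed, reliability, domain_size):
    """P(A = value | one source recorded `observed`), using Assumption 3."""
    if value == observed:
        return reliability
    # The remaining probability mass is spread evenly over the other values.
    return (1.0 - reliability) / (domain_size - 1)


def reconcile(domain, observed_1, observed_2, reliability_1, reliability_2):
    """Return {value: P(A = value | A^1 = observed_1, A^2 = observed_2)}."""
    n = len(domain)
    weights = {
        v: source_belief(v, observed_1, reliability_1, n)
        * source_belief(v, observed_2, reliability_2, n)
        for v in domain
    }
    total = sum(weights.values())
    return {v: w / total for v, w in weights.items()}


# Binary attribute Self-Employed with the reliabilities quoted in the text
# (0.8 for S1 and 0.9 for S2).
conflict = reconcile(["Yes", "No"], "Yes", "No", 0.8, 0.9)
agreement = reconcile(["Yes", "No"], "Yes", "Yes", 0.8, 0.9)
print(round(conflict["Yes"], 3), round(agreement["Yes"], 3))  # prints: 0.308 0.973
```

The same function handles larger domains: passing a 50-element list as `domain` applies the $[|A| - 2] / [|A| - 1]$ correction of Cases 2a and 2b implicitly, because the 48 unobserved values each contribute their own small term to the normalizing sum.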
When the stored values of SE in the two sources are different for an alumnus, i.e., $SE^1 = a_i$, $SE^2 = a_j$, and $a_i \neq a_j$, we have:

$$P(SE = a_i \mid SE^1 = a_i, SE^2 = a_j) = (0.8 \times 0.1) / (0.8 \times 0.1 + 0.2 \times 0.9) = 8 / 26 = 0.308,$$

$$P(SE = a_j \mid SE^1 = a_i, SE^2 = a_j) = (0.2 \times 0.9) / (0.8 \times 0.1 + 0.2 \times 0.9) = 18 / 26 = 0.692.$$

In case the stored values in the two sources are the same for a particular person, we have

$$P(SE = a_i \mid SE^1 = a_i, SE^2 = a_i) = (0.8 \times 0.9) / (0.8 \times 0.9 + 0.2 \times 0.1) = 72 / 74 = 0.973,$$

$$P(SE = a_k \mid SE^1 = a_i, SE^2 = a_i) = (0.2 \times 0.1) / (0.8 \times 0.9 + 0.2 \times 0.1) = 2 / 74 = 0.027.$$

Multi-Valued Attribute

Suppose the original alumni database also stores the current home location of the alumni and the attribute Home_Location (HL for brevity) can be any of the 50 states in the United States. We may find that, for instance, the Home_Location for an alumnus Robert Black is stored as TX in the university database $S_1$ and as LA in the department database $S_2$. Assuming that the attribute is accurate in $S_1$ 80 percent of the time and accurate in $S_2$ 90 percent of the time, we are able to calculate the distribution of the true home location values for Robert Black as follows, with $|HL| = 50$ (i.e., there are 50 possible state values):

$$P(HL = TX \mid HL^1 = TX, HL^2 = LA) = (0.8 \times 0.1) / (0.8 \times 0.1 + 0.2 \times 0.9 + 0.2 \times 0.1 \times 48/49) = 0.286,$$

$$P(HL = LA \mid HL^1 = TX, HL^2 = LA) = (0.2 \times 0.9) / (0.8 \times 0.1 + 0.2 \times 0.9 + 0.2 \times 0.1 \times 48/49) = 0.644,$$

$$P(HL = HL_k \mid HL^1 = TX, HL^2 = LA) = (0.2 \times 0.1 / 49) / (0.8 \times 0.1 + 0.2 \times 0.9 + 0.2 \times 0.1 \times 48/49) \approx 0.0015$$

(for each $HL_k$ other than TX or LA).

On the other hand, if the Home_Location shown for Timothy Earnest in both data sources is NY, the value distribution is as follows:

$$P(HL = NY \mid HL^1 = NY, HL^2 = NY) = (0.8 \times 0.9) / (0.8 \times 0.9 + 0.2 \times 0.1 / 49) \approx 0.9994,$$

$$P(HL = HL_k \mid HL^1 = NY, HL^2 = NY) = (0.2 \times 0.1 / 2401) / (0.8 \times 0.9 + 0.2 \times 0.1 / 49) \approx 0.0000116$$

(for each $HL_k$ other than NY).

Table 1 summarizes the Home_Location value distribution under the two different cases.

Table 1. Alumni Data

| A_ID | FName | LName | Employer | Title | Home_Location Value | Prob |
|---|---|---|---|---|---|---|
| | Robert | Black | Walmart | Sales Manager | TX | 0.286 |
| | | | | | LA | 0.644 |
| | | | | | AOV* | 0.0015 |
| | Timothy | Earnest | GTE | Accountant | NY | 0.9994 |
| | | | | | AOV* | 0.0000116 |

* AOV: Any Other Value. The probability for AOV reflects the probability that any other specific attribute value except those listed separately is true. For example, from Table 1 we know that the probability that California is the true Home_Location for Robert Black is 0.0015.

Classification of Queries and Errors

As in prior research (Dey et al. 1998a; Mendelson and Saharia 1986), we assume that all relevant queries have been identified. Based on where the attribute being examined appears in a query, we categorize the relevant queries into three classes. If the stochastic attribute(s) appear only in the selection condition of a query, we call this query a class C (Conditioning) query. If the attribute(s) appear only in the projection list of a query, we call this query a class T (Targeting) query. We call a query a class CT query if the attribute(s) being examined appear in both the selection condition and the projection list. In the alumni example, if Home_Location is a stochastic attribute, then query Q1 is of class C, Q2 is of class T, and Q3 is of class CT.
Q1: Display ID of those alumni who live in LA.
Q2: Display Name and Home_Location of all alumni who work for GTE.
Q3: Display ID, Name, and Home_Location of all alumni who live in TX.

If the stored value of an attribute is not the true value, three types of errors can occur. A type I error occurs when an object should have been selected by a query based on the true value of an attribute, but was not selected because the stored value was different from the true value. A type II error occurs when an object that should not have been selected based on the true attribute value was selected because of the incorrectly recorded value. A misrepresentation error occurs when the value displayed for an attribute in a query output is not the true value.

The following parameters are applicable to all classes of queries:

(1) $f(q)$: Frequency of query $q$.
(2) $\alpha(q)$: Cost of type I error for query $q$.
(3) $\beta(q)$: Cost of type II error for query $q$.
(4) $\gamma(a, q)$: Average cost of one occurrence of misrepresenting attribute $a$ in the query output of $q$. The parameter $a$ is omitted in the single attribute problem.
(5) $\tau(q)$: Expected percentage of objects in a relation that may be selected by query $q$. For the simple query "display all alumni who live in Texas," $\tau(q)$ equals the expected percentage of alumni who live in Texas. For the complex query "display all alumni who live in Texas AND are at least 50 years old," $\tau(q)$ equals the product of the expected percentage of alumni who live in Texas and the expected percentage of alumni who are at least 50 years old, assuming that attributes Home_Location and AGE are independent of each other.

The three cost parameters of a query are estimated based on the utilization of the query output. Consider a direct marketing firm that runs a query based on some given criteria to identify potential customers. In this case, the expected net profit per potential customer and the average cost of sending a promotion to each potential customer constitute the type I error cost and the type II error cost, respectively. While both the type I error cost and the type II error cost are unique for a particular query, the misrepresentation cost is specific to a query as well as to one of its attributes displayed in the query output. In the direct marketing example, if the potential customer's street address is in the query output, then the cost of misrepresenting the street address of a potential customer equals the expected net profit per potential customer times the probability that the mail would be lost or returned due to the incorrect address information.

Table 2. Cost Matrix for Three Classes of Queries

| Query class | Type I | Type II | Misrepresentation |
|---|---|---|---|
| (C) Conditioning | $\alpha(q)$ | $\beta(q)$ | N/A |
| (T) Targeting | N/A | N/A | $\gamma(q)$ |
| (CT) Conditioning and Targeting | $\alpha(q)$ | $\beta(q)$ | $\gamma(q)$ |
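The query classes and the selectivity parameter defined above lend themselves to a small sketch. The Python below is one possible way, under stated assumptions, to classify a query with respect to a stochastic attribute and to compute $\tau(q)$ for a conjunctive query over independent attributes. The `Query` type, the function names, and the two fractions passed to `selectivity` are hypothetical illustrations, not part of the paper.

```python
from dataclasses import dataclass
from math import prod


@dataclass(frozen=True)
class Query:
    name: str
    selection_attrs: frozenset   # attributes used in the selection condition
    projection_attrs: frozenset  # attributes shown in the query output


def query_class(query, attribute):
    """Classify `query` with respect to one stochastic attribute."""
    in_selection = attribute in query.selection_attrs
    in_projection = attribute in query.projection_attrs
    if in_selection and in_projection:
        return "CT"
    if in_selection:
        return "C"
    if in_projection:
        return "T"
    return None  # the attribute is irrelevant to this query


# Error types that apply to each class (the cost matrix for the three classes).
RELEVANT_ERRORS = {
    "C": ("type I", "type II"),
    "T": ("misrepresentation",),
    "CT": ("type I", "type II", "misrepresentation"),
}


def selectivity(clause_fractions):
    """tau(q) for a conjunctive query whose clauses use independent attributes."""
    return prod(clause_fractions)


q1 = Query("Q1", frozenset({"Home_Location"}), frozenset({"A_ID"}))
q2 = Query("Q2", frozenset({"Employer"}),
           frozenset({"FName", "LName", "Home_Location"}))
q3 = Query("Q3", frozenset({"Home_Location"}),
           frozenset({"A_ID", "FName", "LName", "Home_Location"}))

classes = [query_class(q, "Home_Location") for q in (q1, q2, q3)]
# Hypothetical fractions: 8% live in Texas, 25% are at least 50 years old.
tau_texas_and_50 = selectivity([0.08, 0.25])
```

Classifying per attribute rather than per query matters once several stochastic attributes are involved, since the same query can then fall into different classes for different attributes.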
The three classes of queries and their relevant error types are summarized in Table 2.

Attribute Reconciliation: Single Stochastic Attribute

We start our analysis with the simplest case where there is only one stochastic attribute in a relation. We use the alumni example shown in Table 1 to illustrate how the cost-minimizing value for the attribute Home_Location can be determined for Robert Black, given the value distribution listed in Table 1. We start our analysis using a set of simple queries, follow it by some more complex queries, and finally provide the standardized procedure for the analysis.

An Example with Queries Having a Single Clause in the Condition

For the purpose of illustration, we assume that the three queries discussed in the previous section are the only queries relevant to the Home_Location attribute, i.e., only these three queries have the Home_Location attribute either in the selection condition or in the projection list. All three queries have a single clause in the condition. We first calculate the expected type I, type II, and misrepresentation error costs when different values are chosen to be stored in the merged table, and then compare the total costs to determine the best value to store.

Cost of Type I and Type II Errors. We first examine the cost of type I and type II errors when TX is the stored Home_Location value for Robert Black. Only Q1 and Q3 need to be considered for type I and type II errors. Obviously, Q1 will not select Robert Black if the stored Home_Location value for him is TX. Given that he is not selected, if Robert Black's true Home_Location is indeed LA, then a type I error occurs. The frequency of this occurrence equals the product of the frequency $f(Q1)$ and the probability that the true value is LA, denoted by $P(LA)$. On the other hand, when Q3 is processed, Robert Black will be selected. Given that Robert Black has been selected based on the stored deterministic value TX, if the true value is not TX, but LA or any other value, then a type II error occurs. The frequency of this occurrence equals the product of the frequency $f(Q3)$ and the probability that the true value is not TX, which equals $1 - P(TX)$. The above analysis is summarized in Table 3.

Table 3. Cost of Type I and Type II Errors (If TX is Stored)

| Query | Retrieval Criterion | Result | If True Value is | Type I Error Cost | Type II Error Cost |
|---|---|---|---|---|---|
| Q1 (C) | LA | Not selected | Not LA | 0 | N/A |
| | | | LA | $\alpha(Q1) f(Q1) P(LA)$ | N/A |
| Q3 (CT) | TX | Selected | TX | N/A | 0 |
| | | | Not TX | N/A | $\beta(Q3) f(Q3) [1 - P(TX)]$ |

By multiplying the error frequencies with the cost parameters for each query, we obtain the total of type I and type II error costs resulting from choosing TX as the stored value:

$$C_{I,II}(TX) = \alpha(Q1) f(Q1) P(LA) + \beta(Q3) f(Q3) [1 - P(TX)]. \tag{1}$$

Similarly, the type I and type II error costs resulting from choosing LA as the stored value for Robert Black (shown below) are obtained based on the analysis presented in Table 4.

$$C_{I,II}(LA) = \beta(Q1) f(Q1) [1 - P(LA)] + \alpha(Q3) f(Q3) P(TX). \tag{2}$$

Table 4. Cost of Type I and Type II Errors (If LA is Stored)

| Query | Retrieval Criterion | Result | If True Value is | Type I Error Cost | Type II Error Cost |
|---|---|---|---|---|---|
| Q1 (C) | LA | Selected | LA | N/A | 0 |
| | | | Not LA | N/A | $\beta(Q1) f(Q1) [1 - P(LA)]$ |
| Q3 (CT) | TX | Not selected | Not TX | 0 | N/A |
| | | | TX | $\alpha(Q3) f(Q3) P(TX)$ | N/A |

Table 5 shows the type I and type II error costs incurred if any value other than TX and LA is stored, and equation (3) shows the resulting cost expression:

$$C_{I,II}(AOV) = \alpha(Q1) f(Q1) P(LA) + \alpha(Q3) f(Q3) P(TX). \tag{3}$$

In this example, the cost analysis shown in Table 5 is also valid if NULL is chosen. Therefore, we have

$$C_{I,II}(NULL) = \alpha(Q1) f(Q1) P(LA) + \alpha(Q3) f(Q3) P(TX). \tag{4}$$

Table 5. Cost of Type I and Type II Errors (If Any Value Other than TX and LA is Stored)

| Query | Retrieval Criterion | Result | If True Value is | Type I Error Cost | Type II Error Cost |
|---|---|---|---|---|---|
| Q1 (C) | LA | Not selected | Not LA | 0 | N/A |
| | | | LA | $\alpha(Q1) f(Q1) P(LA)$ | N/A |
| Q3 (CT) | TX | Not selected | TX | $\alpha(Q3) f(Q3) P(TX)$ | N/A |
| | | | Not TX | 0 | N/A |

Cost of Misrepresentation Errors. A misrepresentation error occurs when a value in a query output is not the true attribute value, and this type of error is relevant to only class T and class CT queries. In our example, Q2 is a class T query and Q3 is a class CT query. We first assume that TX is chosen to be the deterministic value for Robert Black. Therefore, whenever Robert Black is selected by a query and Home_Location is in the projection list, the value displayed will be TX. Given that TX is displayed in the query output, if the actual true value is not TX, a misrepresentation error occurs. The frequency of this occurrence equals the frequency that Robert Black is selected by a class T or class CT query times the probability that TX is not the true value. To calculate the frequency that Robert Black is selected, we examine Q2 and Q3 separately. Since Q3 always selects Robert Black if TX is chosen to be the deterministic value, the frequency that Robert Black is selected by Q3 equals the frequency of the query, $f(Q3)$. For the class T query Q2, we assume that all objects are equally likely to be selected by this query. Therefore, the frequency that Robert Black is selected by Q2 equals $f(Q2)\tau(Q2)$. Multiplying the error frequencies by $\gamma$, the unit cost of a misrepresentation error, and summing over all class T and class CT queries, we obtain the total misrepresentation cost when TX is chosen to be the stored value as follows:

$$C_m(TX) = [\gamma(Q3) f(Q3) + \gamma(Q2) f(Q2) \tau(Q2)] [1 - P(TX)]. \tag{5}$$

If LA had been chosen to be the stored value, Robert Black would never appear in the query output of Q3. The total misrepresentation cost, denoted by $C_m(LA)$, thus equals the following:

$$C_m(LA) = \gamma(Q2) f(Q2) \tau(Q2) [1 - P(LA)]. \tag{6}$$

Now consider when a value other than TX or LA is chosen for Robert Black. We can ignore Q3 since it will not select Robert Black. The misrepresentation cost is straightforward:

$$C_m(AOV) = \gamma(Q2) f(Q2) \tau(Q2) [1 - P(AOV)]. \tag{7}$$

Finally, we examine the misrepresentation cost if NULL is stored. For simplicity, we assume that NULL is never the true value and the unit misrepresentation cost when NULL or any other incorrect value is stored is the same. Then the resulting misrepresentation cost is

$$C_m(NULL) = \gamma(Q2) f(Q2) \tau(Q2). \tag{8}$$

If the cost of misrepresentation is different when NULL is stored, then we can estimate another misrepresentation parameter $\gamma'$ specifically for NULL and replace $\gamma$ in (8) by $\gamma'$. All other analyses remain the same.

Minimizing Total Error Cost. The best deterministic Home_Location value for Robert Black is chosen by minimizing the total expected error cost, obtained by summing up the cost of type I and type II errors and the cost of misrepresentation errors:

$$TC(TX) = \alpha(Q1) f(Q1) P(LA) + [\beta(Q3) f(Q3) + \gamma(Q3) f(Q3) + \gamma(Q2) f(Q2) \tau(Q2)] [1 - P(TX)], \tag{9}$$

$$TC(LA) = \alpha(Q3) f(Q3) P(TX) + [\beta(Q1) f(Q1) + \gamma(Q2) f(Q2) \tau(Q2)] [1 - P(LA)], \tag{10}$$

$$TC(AOV) = \alpha(Q1) f(Q1) P(LA) + \alpha(Q3) f(Q3) P(TX) + \gamma(Q2) f(Q2) \tau(Q2) [1 - P(AOV)], \tag{11}$$

$$TC(NULL) = \alpha(Q1) f(Q1) P(LA) + \alpha(Q3) f(Q3) P(TX) + \gamma(Q2) f(Q2) \tau(Q2). \tag{12}$$
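Equations (9) to (12) can be evaluated directly once the parameters are known. The Python sketch below does so for Robert Black. Every frequency, unit cost, and selectivity in it is a made-up illustrative number, and `total_costs` is a hypothetical helper rather than anything defined in the paper. The probability of one specific other value is taken here as the remaining mass spread evenly over the other 48 states.

```python
def total_costs(p, f, alpha, beta, gamma, tau_q2):
    """Total expected error cost for each candidate, per equations (9)-(12)."""
    mis_q2 = gamma["Q2"] * f["Q2"] * tau_q2    # class T exposure through Q2
    type1_q1 = alpha["Q1"] * f["Q1"] * p["LA"]  # Q1 misses him if truth is LA
    type1_q3 = alpha["Q3"] * f["Q3"] * p["TX"]  # Q3 misses him if truth is TX
    return {
        "TX": type1_q1
        + (beta["Q3"] * f["Q3"] + gamma["Q3"] * f["Q3"] + mis_q2) * (1 - p["TX"]),
        "LA": type1_q3 + (beta["Q1"] * f["Q1"] + mis_q2) * (1 - p["LA"]),
        "AOV": type1_q1 + type1_q3 + mis_q2 * (1 - p["AOV"]),
        "NULL": type1_q1 + type1_q3 + mis_q2,
    }


# Probabilities of the true value, using the figures quoted for Robert Black.
p = {"TX": 0.286, "LA": 0.644}
p["AOV"] = (1 - p["TX"] - p["LA"]) / 48  # one specific other state

# Hypothetical query parameters, chosen only for illustration.
f = {"Q1": 10, "Q2": 5, "Q3": 20}
alpha = {"Q1": 4, "Q3": 3}
beta = {"Q1": 1, "Q3": 2}
gamma = {"Q2": 1, "Q3": 1}

costs = total_costs(p, f, alpha, beta, gamma, tau_q2=0.1)
best = min(costs, key=costs.get)
```

With these particular numbers the minimum falls on LA. Raising the type II costs in `beta` far enough shifts the minimum to AOV, because storing a value that no query retrieves removes every type II term.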

The value that minimizes the total cost should be the one stored in the merged table. Depending on the cost parameters, any of the values can be chosen. In normal situations, TX or LA should be the likely best value. However, in cases where the cost of type II errors is significantly higher than the cost of type I errors, AOV could be the cost-minimizing option. This is due to the fact that if AOV or NULL is stored, Robert Black will not be selected by Q1 and Q3 and hence type II errors will never occur.

Additional Queries with Disjunctive Clauses in the Condition

The example discussed above involves three simple queries with a single clause in the condition. To generalize the solution, we consider three additional queries Q4, Q5, and Q6 that have disjunctive clauses in the selection condition. Among them, Q4 is of class CT and Q5 and Q6 are of class C.

Q4: Display ID, Name, and Home_Location of those alumni who live in OK or TX.
Q5: Display ID of those alumni who live in CA, NY, or TX.
Q6: Display ID, Name, and Employer of those alumni who live in IN, MN, or PA.

We use the example data for Robert Black to illustrate how the costs associated with these new queries can be determined. Assume that TX is the stored value. We first derive the misrepresentation cost since it is relatively simple. As discussed earlier, for the misrepresentation cost, we only need to consider the class T queries and those class CT queries that select Robert Black. Therefore, only Q4 needs to be considered for the misrepresentation cost. Based on the discussion presented above, the misrepresentation cost associated with Q4 is $[1 - P(TX)][\gamma(Q4) f(Q4)]$. The costs of type I and type II errors are summarized in Table 6.
Based on Table 6 and Table 3, we make the following observations:

Observation 1: Given that the chosen deterministic value is included in the retrieval criterion of a query (such as Q3, Q4, and Q5), the probability of type II error equals the probability that a value other than those included in the retrieval criterion is the true value.

Observation 2: If the chosen deterministic value is not included in the retrieval criterion of a query (e.g., Q1 or Q6), then the probability of type I error equals the probability that one of the values included in the query's retrieval criterion is the true value.

To determine which value is the best choice, the total cost associated with other possible values also needs to be calculated. The value that results in the smallest cost should be stored.

Table 6. Cost of Type I and Type II Errors (If TX is chosen to be the deterministic value for Robert Black)

| No. | Retrieval Criterion | Result | If True Value is | Type I Error Cost | Type II Error Cost |
|---|---|---|---|---|---|
| Q4 (CT) | OK, TX | Selected | OK or TX | N/A | 0 |
| | | | Others | N/A | $\beta(Q4) f(Q4) [1 - P(OK) - P(TX)]$ |
| Q5 (C) | CA, NY, TX | Selected | CA, NY, or TX | N/A | 0 |
| | | | Others | N/A | $\beta(Q5) f(Q5) [1 - P(CA) - P(NY) - P(TX)]$ |
| Q6 (C) | IN, MN, PA | Not selected | IN, MN, or PA | $\alpha(Q6) f(Q6) [P(IN) + P(MN) + P(PA)]$ | N/A |
| | | | Others | 0 | N/A |
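Observations 1 and 2 reduce the type I and type II cost of any class C or class CT query to a single membership test on its retrieval criterion. The following Python sketch shows that rule for one query at a time. The function name `type_i_ii_cost` and all frequencies and unit costs are hypothetical, and the probability of each state other than TX and LA is assumed to be the remaining mass divided evenly among the other 48 states.

```python
P_OTHER = (1 - 0.286 - 0.644) / 48  # one specific state other than TX or LA
PROB = {s: P_OTHER for s in ("CA", "IN", "MN", "NY", "OK", "PA")}
PROB.update({"TX": 0.286, "LA": 0.644})


def type_i_ii_cost(stored, criterion, prob, freq, alpha, beta):
    """Return (type I cost, type II cost) for one class C or CT query."""
    # Probability that the true value satisfies the retrieval criterion.
    ps = sum(prob[v] for v in criterion)
    if stored in criterion:
        # Observation 1: the object is selected, so the only possible error
        # is a type II error, when the true value lies outside the criterion.
        return 0.0, beta * freq * (1.0 - ps)
    # Observation 2: the object is not selected, so the only possible error
    # is a type I error, when the true value lies inside the criterion.
    return alpha * freq * ps, 0.0


# TX stored; parameters (freq, alpha, beta) are illustrative only.
q4 = type_i_ii_cost("TX", {"OK", "TX"}, PROB, freq=8, alpha=2, beta=2)
q5 = type_i_ii_cost("TX", {"CA", "NY", "TX"}, PROB, freq=6, alpha=1, beta=3)
q6 = type_i_ii_cost("TX", {"IN", "MN", "PA"}, PROB, freq=4, alpha=5, beta=1)
```

Each returned pair has exactly one non-zero entry, which mirrors the N/A cells in the cost tables: a query either selects the object or it does not, so only one of the two error types is possible for a given stored value.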

A Standardized Procedure Based on Coverage Bitmap

As we can see from the previous analyses, if the number of possible realizations of the attribute or the number of relevant queries is large, the error cost calculation can be a tedious process. To simplify the computation, we construct a query coverage bitmap as shown in Table 7. The query coverage bitmap summarizes the values included in the retrieval criterion of each query. For example, for Q5, the columns for CA, NY, and TX are marked as 1 since these three state values are included in the retrieval criterion of Q5. The last column, labeled as PS, represents the probability sum that any one of the attribute values included in the query's retrieval criterion is the true attribute value.

Table 7. Coverage Bitmap

| Queries | CA | IN | LA | MN | NY | OK | PA | TX | PS |
|---|---|---|---|---|---|---|---|---|---|
| Q1 (C) | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | $P(LA)$ |
| Q3 (CT) | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | $P(TX)$ |
| Q4 (CT) | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 1 | $P(OK) + P(TX)$ |
| Q5 (C) | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | $P(CA) + P(NY) + P(TX)$ |
| Q6 (C) | 0 | 1 | 0 | 1 | 0 | 0 | 1 | 0 | $P(IN) + P(MN) + P(PA)$ |

    Input: Probabilistic value vector V and corresponding probability vector P.
    1. Find the corresponding column index numbers in the coverage bitmap for
       all components of V and keep them in vector J.
    2. Let C_min = a very large number; BV = V_0.
       For all j in J (representing all probabilistic values):
       Begin
         TC = 0; C_I = 0; C_II = 0;
         C_m = (1 - P_j) * SUM over class T queries q of f(q) tau(q) gamma(q).
         For each query q_i with row index i: Do
           If QCBitmap[i][j] equals 1,
             /* selects the object; possible type II and misrepresentation errors */
             Then increase cost of type II error C_II by f(q_i) beta(q_i) [1 - PS(q_i)];
                  and if q_i is a class CT query, then increase misrepresentation
                  cost C_m by (1 - P_j) f(q_i) gamma(q_i);
           Else /* will not select the object; possible type I errors */
             increase cost of type I error C_I by f(q_i) alpha(q_i) PS(q_i).
           Endif
         End
         TC = C_I + C_II + C_m.
         If (TC < C_min) Then C_min = TC, BV = V_j.
       End
    Output: The cost-minimizing value BV and the associated total cost.

Figure 1. Procedure for Determining the Best Value
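The procedure in Figure 1 translates fairly directly into code. The Python below is a minimal sketch of it, not the authors' implementation: the `Query` record, the `row` helper, and every numeric parameter are hypothetical, candidates without a bitmap column (such as AOV or NULL) are treated as an all-zero column, and NULL is assumed never to be the true value.

```python
from dataclasses import dataclass

COLUMNS = ["CA", "IN", "LA", "MN", "NY", "OK", "PA", "TX"]


@dataclass(frozen=True)
class Query:
    name: str
    qclass: str         # "C", "T", or "CT"
    freq: float         # f(q)
    alpha: float = 0.0  # unit cost of a type I error
    beta: float = 0.0   # unit cost of a type II error
    gamma: float = 0.0  # unit cost of a misrepresentation error
    tau: float = 0.0    # selectivity, used for class T queries only
    bitmap: tuple = ()  # one 0/1 flag per entry of COLUMNS


def row(*covered):
    """Build a coverage-bitmap row from the values in a retrieval criterion."""
    return tuple(1 if c in covered else 0 for c in COLUMNS)


def best_value(candidates, prob, queries):
    """Return (value, total cost) minimizing the total expected error cost."""
    class_t = sum(q.freq * q.tau * q.gamma for q in queries if q.qclass == "T")
    best, best_cost = None, float("inf")
    for value in candidates:
        p = prob.get(value, 0.0)  # NULL: assumed never the true value
        col = COLUMNS.index(value) if value in COLUMNS else None
        c1 = c2 = 0.0
        cm = (1.0 - p) * class_t
        for q in queries:
            if q.qclass == "T":
                continue  # class T queries have no row in the bitmap
            ps = sum(prob[c] for c, bit in zip(COLUMNS, q.bitmap) if bit)
            if col is not None and q.bitmap[col] == 1:
                c2 += q.freq * q.beta * (1.0 - ps)        # Observation 1
                if q.qclass == "CT":
                    cm += (1.0 - p) * q.freq * q.gamma
            else:
                c1 += q.freq * q.alpha * ps               # Observation 2
        total = c1 + c2 + cm
        if total < best_cost:
            best, best_cost = value, total
    return best, best_cost


P_OTHER = (1 - 0.286 - 0.644) / 48
PROB = {c: P_OTHER for c in COLUMNS}
PROB.update({"TX": 0.286, "LA": 0.644, "AOV": P_OTHER})

QUERIES = [  # all frequencies and unit costs are illustrative only
    Query("Q1", "C", 10, alpha=4, beta=1, bitmap=row("LA")),
    Query("Q2", "T", 5, gamma=1, tau=0.1),
    Query("Q3", "CT", 20, alpha=3, beta=2, gamma=1, bitmap=row("TX")),
    Query("Q4", "CT", 8, alpha=2, beta=2, gamma=1, bitmap=row("OK", "TX")),
    Query("Q5", "C", 6, alpha=1, beta=3, bitmap=row("CA", "NY", "TX")),
    Query("Q6", "C", 4, alpha=5, beta=1, bitmap=row("IN", "MN", "PA")),
]

best, cost = best_value(["TX", "LA", "CA", "OK", "IN", "AOV", "NULL"], PROB, QUERIES)
```

The nested loop visits each query once per candidate value, so the running time grows with the product of the number of queries and the number of candidates, apart from the per-row probability sum, which could be precomputed as the PS column.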

Based on the bitmap, we can automate the cost calculation and value determination process. Figure 1 shows the algorithm for determining the best deterministic value for an object with a probabilistic value vector V and a corresponding probability distribution P. The total cost associated with each chosen value is determined as follows: First, if there are class T queries, the associated misrepresentation cost is calculated. Second, the column in the query coverage bitmap that corresponds to the chosen value is identified. Third, for each query in the bitmap, the value in the cell that corresponds to the row of the query and the column of the chosen value is checked. If the value is 1, the cost of type II errors is calculated based on Observation 1 discussed in the previous subsection, and the misrepresentation cost associated with this query is determined if the query is of class CT; if the value is 0, the cost of type I error is calculated based on Observation 2 in the previous subsection. Fourth, the total costs associated with the chosen value are determined by summing up all three types of costs. Finally, the best deterministic value is chosen based on the total error costs. If the number of queries is $n$ and the number of probabilistic values is $r$, then the complexity of this standardized procedure is $O(nr)$.

Discussion

For the above procedure, not all possible values need to be explicitly examined to decide which one is the best. If, in a group of candidate values, all have the same probability of being the true value, and one of the following two conditions holds: (1) none of the candidate values appears in any query or (2) if one value appears in a query, then all other candidate values in the group also appear in the query in exactly the same manner, then the total expected cost associated with each value in the group is always the same.
In the given example, the expected cost when either CA or NY is chosen is the same; the expected cost associated with choosing IN, MN, or PA is the same; and the costs associated with all other values except TX, LA, CA, NY, IN, MN, PA, OK, and NULL are also the same.

Attribute Reconciliation: Multiple Stochastic Attributes

In the previous section, we have shown how the cost-minimizing value can be determined for a single stochastic attribute based on query parameters. In this section, we extend the analysis to multiple stochastic attributes with conflicting values from different data sources. There are two possible cases:

(1) The reliabilities for the stochastic attributes are mutually independent. In this case, the cost-minimizing value for each attribute can be determined individually without considering other attributes. The probability derivation for a single attribute shown in the second section and the value determination process discussed in the previous section can be applied.

(2) The reliabilities are dependent. In that scenario, we have to consider the combinations of attribute values and their joint probabilities. As shown earlier, we can estimate the joint probabilities for all feasible value combinations. The cost-minimizing value combination can be determined based on the total expected error cost, which includes, as in the single attribute case, the costs of type I errors, type II errors, and misrepresentation errors.

To illustrate how the cost-minimizing value combination can be determined, consider a modified alumni database example as shown in Table 8. In this example, we assume that the values of both attributes Employer and Home_Location are different for Robert Black in the two data sources. Further assume that the probabilities are as shown in Table 8 for the combination of Employer and Home_Location. The number of realizations for Employer is assumed to be 200 and the number of possible Home_Locations is again assumed to be 50. Thus, the total number of possible realizations of the composite attribute is 200 × 50 = 10,000.

Table 8. Modified Probabilistic Alumni Data

| A_ID | FName | LName | Title | (Employer, H_L) | Prob |
|---|---|---|---|---|---|
| | Robert | Black | Sales Manager | (Walmart, TX) | 0.6 |
| | | | | (Nortel, LA) | 0.3 |
| | | | | AOV | |

Consider the situation when the following are the only queries that have the above two attributes either in their projection lists or in their selection conditions:

Q7: Display Name, Employer, and Home_Location of those alumni who are managers.
Q8: Display ID, Name, and Home_Location of all alumni who work for Walmart.
Q9: Display ID and Name of those alumni who work for Walmart OR live in LA.
Q10: Display Name and Home_Location of those alumni who work for Nortel AND live in TX.

Among the four queries, since Q7 has both Employer and Home_Location in its projection list, it is of class T with respect to both Employer and Home_Location. Similarly, Q8 is of class C with respect to Employer and of class T with respect to Home_Location; Q9 is of class C with respect to both Employer and Home_Location; and Q10 is of class C with respect to Employer and of class CT with respect to Home_Location. Among the four queries, Q9 and Q10 deserve special attention since both attributes appear in the selection condition of the two queries. The difference is that the two parts of the selection condition in Q9 are connected by the OR operator and those in Q10 are connected by the AND operator.

We illustrate how the costs of type I, type II, and misrepresentation errors associated with each query are determined for the example shown in Table 8. We first consider the case when (Walmart, TX) is the stored value for Robert Black in the merged table.

Cost of Misrepresentation Errors. For misrepresentation errors, only those class T and class CT queries with respect to either Employer or Home_Location, i.e., queries that include either Employer or Home_Location or both in their projection list, need to be considered. In our example, Q7 and Q8 are the only class T queries and Q10 is the only class CT query.
The misrepresentation cost associated with each query equals the product of the unit cost of misrepresentation error, the frequency of the query, the probability that the target object is selected by the query, and the marginal probability that the stored value is not the true value. If more than one examined attribute is in the projection list (e.g., Q7), the misrepresentation cost equals the sum of the misrepresentation costs computed separately for each attribute. For example, the misrepresentation costs associated with Q7, Q8, and Q10 are as follows:

\[ C_{m,Q7}(\text{Walmart}, \text{TX}) = \gamma(Q7, \text{Home\_Location})\, f(Q7)\, \tau(Q7)\, [1 - P(\text{TX})] + \gamma(Q7, \text{Employer})\, f(Q7)\, \tau(Q7)\, [1 - P(\text{Walmart})], \]

\[ C_{m,Q8}(\text{Walmart}, \text{TX}) = \gamma(Q8, \text{Home\_Location})\, f(Q8)\, [1 - P(\text{TX})], \]

\[ C_{m,Q10}(\text{Walmart}, \text{TX}) = 0, \]

where $\gamma$, $f$, and $\tau$ denote, respectively, the unit misrepresentation cost, the query frequency, and the probability that the target object is selected by the query. In the above expressions, $P(\text{TX})$ and $P(\text{Walmart})$ are marginal probabilities. The expression for $C_{m,Q8}$ does not contain $\tau(Q8)$ since this query always selects Robert Black, given that (Walmart, TX) is stored. $C_{m,Q10}$ equals zero because Q10 never selects Robert Black based on the stored values (Walmart, TX). From the above example, we can see that, for misrepresentation error costs, the solution is similar to that for the single-attribute case, except that the probability of a single value is replaced by the marginal distribution in the multiple-attribute case.

Cost of Type I and Type II Errors. For type I and type II errors, we only need to consider class C or class CT queries with respect to Employer or Home_Location. Therefore, we can ignore Q7. The costs of type I and type II errors associated with Q8, Q9, and Q10 are summarized in Table 9. We observe that, if only one of the multiple attributes being examined is in the selection condition of a query (e.g., Q8), the same solution that we derived for a single stochastic attribute can be used.
For queries with more than one examined attribute in their selection conditions, the same rules still apply: if the object is selected based on the stored attribute values, the cost of type II errors equals the product of the unit type II error cost, the frequency of the query, and the probability that the selection condition is violated; if the object is not selected, then the cost of type I errors equals the product of the unit type I error cost, the frequency of the query, and the probability that the selection condition is satisfied. Although the probability that a selection condition is satisfied or violated is slightly more complex with multiple attributes being considered, as shown in Table 9, it can be derived without much difficulty.

To decide the cost-minimizing value for both Employer and Home_Location for Robert Black, the total expected cost associated with other value combinations also needs to be examined. Since the values that need to be separately examined for Employer include Walmart, Nortel, NULL, and any other value, and those for Home_Location include TX, LA, NULL, and any other value, there are a total of only 16 cases, instead of 10,000 potential cases, to be examined in order to decide the cost-minimizing value for both attributes.
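The selection-error rule and the misrepresentation-cost product described above can be sketched in a few lines. This is only an illustration under stated assumptions: the unit costs, the query frequency, the placeholder domain values, and the uniform spread of the unlisted probability mass are all hypothetical and are not taken from the source; the function names are likewise invented for the example.

```python
from fractions import Fraction

# Placeholder domains: 200 employers and 50 home locations, as in the example.
EMPLOYERS = ["Walmart", "Nortel"] + [f"employer_{n}" for n in range(198)]
LOCATIONS = ["TX", "LA"] + [f"state_{n}" for n in range(48)]

# Joint probabilities listed in Table 8. ASSUMPTION: the remaining mass is
# spread uniformly over the other 9,998 combinations.
LISTED = {("Walmart", "TX"): Fraction(6, 10), ("Nortel", "LA"): Fraction(3, 10)}
OTHER = (1 - sum(LISTED.values())) / (len(EMPLOYERS) * len(LOCATIONS) - len(LISTED))


def prob(condition):
    """Probability that the true (employer, location) pair satisfies condition."""
    return sum(
        LISTED.get((e, l), OTHER)
        for e in EMPLOYERS
        for l in LOCATIONS
        if condition(e, l)
    )


def selection_error_costs(stored, condition, alpha, beta, freq):
    """Return (type I cost, type II cost) of one query for a stored pair."""
    if condition(*stored):
        # Object is retrieved: an error occurs if the true values violate the condition.
        return Fraction(0), beta * freq * (1 - prob(condition))
    # Object is not retrieved: an error occurs if the true values satisfy the condition.
    return alpha * freq * prob(condition), Fraction(0)


def misrepresentation_cost(gamma, freq, select_prob, p_stored_is_true):
    """Unit cost x frequency x selection probability x P(stored value is wrong)."""
    return gamma * freq * select_prob * (1 - p_stored_is_true)


# Selection conditions of the three queries that select on the examined attributes.
QUERIES = {
    "Q8": lambda e, l: e == "Walmart",
    "Q9": lambda e, l: e == "Walmart" or l == "LA",
    "Q10": lambda e, l: e == "Nortel" and l == "TX",
}

STORED = ("Walmart", "TX")
ALPHA, BETA, GAMMA, FREQ = 5, 3, 2, 10  # hypothetical unit costs and frequency
COSTS = {
    name: selection_error_costs(STORED, cond, ALPHA, BETA, FREQ)
    for name, cond in QUERIES.items()
}
```

Evaluating the same functions for each candidate pair of stored values, rather than only for (Walmart, TX), would give the totals that have to be compared to find the cost-minimizing combination.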

Table 9. Cost of Type I and Type II Errors for Two Stochastic Attributes (If (Walmart, TX) Is Stored)

          Retrieval Criteria
No.       Employer  H_L  Result        If True Values Are  Type I Error Cost            Type II Error Cost
Q8        Walmart   -    Selected      (Walmart, -)        N/A                          0
                                       Others              N/A                          β(Q8) f(Q8) [1 - P(Walmart)]
Q9 (OR)   Walmart   LA   Selected      (Walmart, -)        N/A                          0
                                       (-, LA)             N/A                          0
                                       Others              N/A                          β(Q9) f(Q9) [1 - P(Walmart) - P(LA) + P(Walmart, LA)]
Q10       Nortel    TX   Not selected  (Nortel, TX)        α(Q10) f(Q10) P(Nortel, TX)  N/A
                                       Others              0                            N/A

Here α and β denote the unit type I and type II error costs, and f denotes the query frequency.

Discussions and Future Research

We have shown how the cost-minimizing values can be determined for discrete attributes based on source reliability information and query information. Based on the proposed procedure, when conflicting data values for a real-world entity are encountered in the data integration process, the cost-minimizing value can be determined and stored in the merged table. Subsequently, queries can be directly executed on the merged table. We call this approach deterministic integration. Compared with probabilistic integration, i.e., storing all probabilistic values based on the probabilistic database model, the main disadvantage of the deterministic approach is the loss of potentially useful distribution information.

The advantages of deterministic integration include the following: First, data storage and subsequent query processing can be efficiently handled by existing commercial database systems. With probabilistic integration, since the probabilistic algebra is not currently supported by standard database packages, the cost of implementing such a probabilistic model could be prohibitively high. Second, the storage cost is lower with deterministic integration. This is because with the probabilistic relational model (e.g., Dey and Sarkar 1996), a new column has to be added even if only one object in the table has uncertain values for only one attribute. In addition, a row has to be added to the table for every probabilistic value associated with each object. Third, with deterministic integration, the operational performance of the resulting database is better.
This is because the computational overhead associated with a probabilistic database is avoided with a deterministic table.

The procedures we propose here are for discrete attributes only. An extension of this study is to examine stochastic attributes with continuous domains and a mixture of discrete and continuous attributes. Computationally, in the multiple-attribute scenario, if the number of realizations of each attribute or the number of queries being considered is large, the computational overhead could increase significantly. We are trying to identify rules or patterns that can help reduce the computational effort.

References

Batini, C., Lenzerini, M., and Navathe, S. B. "A Comparative Analysis of Methodologies for Database Schema Integration," ACM Computing Surveys (18:4), December 1986.
Dekhtyar, A., Ross, R., and Subrahmanian, V. S. "Probabilistic Temporal Databases, I: Algebra," ACM Transactions on Database Systems (26:1), March 2001.
Dey, D., Barron, T. M., and Saharia, A. N. "A Decision Model for Choosing the Optimal Level of Storage in Temporal Databases," IEEE Transactions on Knowledge and Data Engineering (10:2), February 1998a.
Dey, D., and Sarkar, S. "A Probabilistic Relational Model and Algebra," ACM Transactions on Database Systems (TODS) (21:3), September 1996.
Dey, D., Sarkar, S., and De, P. "A Probabilistic Decision Model for Entity Matching in Heterogeneous Databases," Management Science (44:10), October 1998b.
Hernandez, M. A., and Stolfo, S. J. "Real-World Data is Dirty: Data Cleaning and the Merge/Purge Problem," Data Mining and Knowledge Discovery (2:1), January 1998.

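Mendelson, H., and Saharia, A. N. "Incomplete Information Costs and Database Design," ACM Transactions on Database Systems (TODS) (11:2), June 1986.
Morey, R. C. "Estimating and Improving the Quality of Information in a MIS," Communications of the ACM (25:5), May 1982.
Rahm, E., and Do, H. H. "Data Cleaning: Problems and Current Approaches," IEEE Bulletin of the Technical Committee on Data Engineering (23:4), December 2000.
Ram, S., and Park, J. "Semantic Conflict Resolution Ontology (SCROL): An Ontology for Detecting and Resolving Data and Schema Level Conflicts," IEEE Transactions on Knowledge and Data Engineering (16:2), February 2004.

Appendix. Probability Derivations for One Attribute, Two Data Sources

We first show some simplifications for the general expression that apply to all of the cases.

I.
\[
\begin{aligned}
P(A'_i, A''_j) &= \sum_m P(A'_i, A''_j, A_m) \\
&= \sum_m P(A'_i, A''_j \mid A_m)\, P(A_m) \\
&= \sum_m P(A'_i \mid A_m)\, P(A''_j \mid A_m)\, P(A_m) \\
&= \sum_m \frac{P(A_m \mid A'_i)\, P(A'_i)}{P(A_m)} \cdot \frac{P(A_m \mid A''_j)\, P(A''_j)}{P(A_m)} \cdot P(A_m) \\
&= \sum_m \frac{P(A_m \mid A'_i)\, P(A_m \mid A''_j)\, P(A'_i)\, P(A''_j)}{P(A_m)} \\
&= P(A'_i)\, P(A''_j) \sum_m \frac{P(A_m \mid A'_i)\, P(A_m \mid A''_j)}{P(A_m)}.
\end{aligned}
\]

II.
\[ P(A_k \mid A'_i, A''_j) = \frac{P(A'_i, A''_j \mid A_k)\, P(A_k)}{P(A'_i, A''_j)}. \]

Analogous to I, we can show:
\[ P(A'_i, A''_j \mid A_k)\, P(A_k) = \frac{P(A_k \mid A'_i)\, P(A_k \mid A''_j)\, P(A'_i)\, P(A''_j)}{P(A_k)}. \]

Therefore,
\[ P(A_k \mid A'_i, A''_j) = \frac{P(A_k \mid A'_i)\, P(A_k \mid A''_j)\, P(A'_i)\, P(A''_j)}{P(A_k)\, P(A'_i, A''_j)}. \]

Substituting for $P(A'_i, A''_j)$ from I, we get:
\[ P(A_k \mid A'_i, A''_j) = \frac{P(A_k \mid A'_i)\, P(A_k \mid A''_j) / P(A_k)}{\sum_m P(A_m \mid A'_i)\, P(A_m \mid A''_j) / P(A_m)}. \]

From our second assumption, $P(A_m) = P(A_k)$, and $P(A_m) = 1/|A|$ for all $m$. Therefore,
\[ P(A_k \mid A'_i, A''_j) = \frac{P(A_k \mid A'_i)\, P(A_k \mid A''_j)}{\sum_m P(A_m \mid A'_i)\, P(A_m \mid A''_j)}. \]

We now show how the desired probability estimates may be obtained for each case.

Case 1a: $k = i = j$; $P(A_i \mid A'_i, A''_i)$

\[ P(A_i \mid A'_i, A''_i) = \frac{P(A_i \mid A'_i)\, P(A_i \mid A''_i)}{\sum_m P(A_m \mid A'_i)\, P(A_m \mid A''_i)}. \]

For $m \neq i$, and assumption three, we have $P(A_m \mid A'_i) = P(A \neq a_i \mid A'_i) / [|A| - 1]$ $\forall m \neq i$. Similarly, $P(A_m \mid A''_i) = P(A \neq a_i \mid A''_i) / [|A| - 1]$ $\forall m \neq i$.

Therefore,
\[
\begin{aligned}
\sum_m P(A_m \mid A'_i)\, P(A_m \mid A''_i) &= P(A_i \mid A'_i)\, P(A_i \mid A''_i) + \sum_{m \neq i} P(A_m \mid A'_i)\, P(A_m \mid A''_i) \\
&= P(A_i \mid A'_i)\, P(A_i \mid A''_i) + \sum_{m \neq i} \frac{P(A \neq a_i \mid A'_i)\, P(A \neq a_i \mid A''_i)}{[|A| - 1]^2} \\
&= P(A_i \mid A'_i)\, P(A_i \mid A''_i) + \frac{P(A \neq a_i \mid A'_i)\, P(A \neq a_i \mid A''_i)}{|A| - 1}.
\end{aligned}
\]

Hence,
\[ P(A_i \mid A'_i, A''_i) = \frac{P(A_i \mid A'_i)\, P(A_i \mid A''_i)}{P(A_i \mid A'_i)\, P(A_i \mid A''_i) + P(A \neq a_i \mid A'_i)\, P(A \neq a_i \mid A''_i) / [|A| - 1]}. \]

Case 1b: $k \neq i = j$; $P(A_k \mid A'_i, A''_i)$

\[ P(A_k \mid A'_i, A''_i) = \frac{P(A_k \mid A'_i)\, P(A_k \mid A''_i)}{\sum_m P(A_m \mid A'_i)\, P(A_m \mid A''_i)}. \]

Since $k \neq i$, from assumption 3, we have $P(A_k \mid A'_i) = P(A \neq a_i \mid A'_i) / [|A| - 1]$, and $P(A_k \mid A''_i) = P(A \neq a_i \mid A''_i) / [|A| - 1]$. Therefore, $P(A_k \mid A'_i)\, P(A_k \mid A''_i) = P(A \neq a_i \mid A'_i)\, P(A \neq a_i \mid A''_i) / [|A| - 1]^2$.

Hence,
\[ P(A_k \mid A'_i, A''_i) = \frac{P(A \neq a_i \mid A'_i)\, P(A \neq a_i \mid A''_i) / [|A| - 1]^2}{P(A_i \mid A'_i)\, P(A_i \mid A''_i) + P(A \neq a_i \mid A'_i)\, P(A \neq a_i \mid A''_i) / [|A| - 1]}. \]

Case 2a: $k = i \neq j$; $P(A_i \mid A'_i, A''_j)$

\[ P(A_i \mid A'_i, A''_j) = \frac{P(A_i \mid A'_i)\, P(A_i \mid A''_j)}{\sum_m P(A_m \mid A'_i)\, P(A_m \mid A''_j)}. \]

Here, we have, from assumption 3, $P(A_i \mid A''_j) = P(A \neq a_j \mid A''_j) / [|A| - 1]$, and $P(A_j \mid A'_i) = P(A \neq a_i \mid A'_i) / [|A| - 1]$. Now,
\[
\begin{aligned}
\sum_m P(A_m \mid A'_i)\, P(A_m \mid A''_j) &= P(A_i \mid A'_i)\, P(A_i \mid A''_j) + P(A_j \mid A'_i)\, P(A_j \mid A''_j) + \sum_{m \neq i, j} P(A_m \mid A'_i)\, P(A_m \mid A''_j) \\
&= \frac{P(A_i \mid A'_i)\, P(A \neq a_j \mid A''_j)}{|A| - 1} + \frac{P(A \neq a_i \mid A'_i)\, P(A_j \mid A''_j)}{|A| - 1} + \sum_{m \neq i, j} \frac{P(A \neq a_i \mid A'_i)\, P(A \neq a_j \mid A''_j)}{[|A| - 1]^2} \\
&= \frac{P(A_i \mid A'_i)\, P(A \neq a_j \mid A''_j)}{|A| - 1} + \frac{P(A \neq a_i \mid A'_i)\, P(A_j \mid A''_j)}{|A| - 1} + \frac{P(A \neq a_i \mid A'_i)\, P(A \neq a_j \mid A''_j)\, [|A| - 2]}{[|A| - 1]^2}.
\end{aligned}
\]

Hence,
\[ P(A_i \mid A'_i, A''_j) = \frac{P(A_i \mid A'_i)\, P(A \neq a_j \mid A''_j)}{P(A_i \mid A'_i)\, P(A \neq a_j \mid A''_j) + P(A \neq a_i \mid A'_i)\, P(A_j \mid A''_j) + P(A \neq a_i \mid A'_i)\, P(A \neq a_j \mid A''_j)\, [|A| - 2] / [|A| - 1]}. \]

Case 2b: $k \neq i$, $k \neq j$, $i \neq j$; $P(A_k \mid A'_i, A''_j)$

\[ P(A_k \mid A'_i, A''_j) = \frac{P(A_k \mid A'_i)\, P(A_k \mid A''_j)}{\sum_m P(A_m \mid A'_i)\, P(A_m \mid A''_j)}. \]

The numerator is $P(A_k \mid A'_i)\, P(A_k \mid A''_j) = P(A \neq a_i \mid A'_i)\, P(A \neq a_j \mid A''_j) / [|A| - 1]^2$. The denominator is as shown in case 2a.

Hence,
\[ P(A_k \mid A'_i, A''_j) = \frac{P(A \neq a_i \mid A'_i)\, P(A \neq a_j \mid A''_j) / [|A| - 1]}{P(A_i \mid A'_i)\, P(A \neq a_j \mid A''_j) + P(A \neq a_i \mid A'_i)\, P(A_j \mid A''_j) + P(A \neq a_i \mid A'_i)\, P(A \neq a_j \mid A''_j)\, [|A| - 2] / [|A| - 1]}. \]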
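The four cases above share one general expression, so they can be checked numerically. The sketch below is an illustration rather than the authors' procedure: it assumes equal prior probabilities for all domain values and an equal split of each source's error mass over the other values (the second and third assumptions cited in the derivation), and it treats the two sources as conditionally independent. The function and parameter names (`posterior`, `r1`, `r2`, `n`) are hypothetical.

```python
from fractions import Fraction


def posterior(k, i, j, r1, r2, n):
    """P(true value is a_k | source 1 reports a_i, source 2 reports a_j).

    r1, r2: probability that each source's reported value is the true one.
    n: number of values in the attribute's domain; values are indexed 0..n-1.
    """
    def given_report(value, reported, r):
        # r if the value matches the report; otherwise the remaining mass
        # is shared equally by the other n - 1 values.
        return r if value == reported else (1 - r) / (n - 1)

    numerator = given_report(k, i, r1) * given_report(k, j, r2)
    denominator = sum(
        given_report(m, i, r1) * given_report(m, j, r2) for m in range(n)
    )
    return numerator / denominator
```

Passing `Fraction` values for `r1` and `r2` keeps the arithmetic exact, which makes it easy to confirm that the posterior probabilities over the whole domain sum to one in both the agreeing (i = j) and the conflicting (i ≠ j) situations.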


More information

A Case Study for Optimal Dynamic Simulation Allocation in Ordinal Optimization 1

A Case Study for Optimal Dynamic Simulation Allocation in Ordinal Optimization 1 A Case Study for Optmal Dynamc Smulaton Allocaton n Ordnal Optmzaton Chun-Hung Chen, Dongha He, and Mchael Fu 4 Abstract Ordnal Optmzaton has emerged as an effcent technque for smulaton and optmzaton.

More information

Production and Supply Chain Management Logistics. Paolo Detti Department of Information Engeneering and Mathematical Sciences University of Siena

Production and Supply Chain Management Logistics. Paolo Detti Department of Information Engeneering and Mathematical Sciences University of Siena Producton and Supply Chan Management Logstcs Paolo Dett Department of Informaton Engeneerng and Mathematcal Scences Unversty of Sena Convergence and complexty of the algorthm Convergence of the algorthm

More information

University of Toronto November 9, 2006 ECO 209Y MACROECONOMIC THEORY. Term Test #1 L0101 L0201 L0401 L5101 MW MW 1-2 MW 2-3 W 6-8

University of Toronto November 9, 2006 ECO 209Y MACROECONOMIC THEORY. Term Test #1 L0101 L0201 L0401 L5101 MW MW 1-2 MW 2-3 W 6-8 Department of Economcs Prof. Gustavo Indart Unversty of Toronto November 9, 2006 SOLUTION ECO 209Y MACROECONOMIC THEORY Term Test #1 A LAST NAME FIRST NAME STUDENT NUMBER Crcle your secton of the course:

More information

University of Toronto November 9, 2006 ECO 209Y MACROECONOMIC THEORY. Term Test #1 L0101 L0201 L0401 L5101 MW MW 1-2 MW 2-3 W 6-8

University of Toronto November 9, 2006 ECO 209Y MACROECONOMIC THEORY. Term Test #1 L0101 L0201 L0401 L5101 MW MW 1-2 MW 2-3 W 6-8 Department of Economcs Prof. Gustavo Indart Unversty of Toronto November 9, 2006 SOLUTION ECO 209Y MACROECONOMIC THEORY Term Test #1 C LAST NAME FIRST NAME STUDENT NUMBER Crcle your secton of the course:

More information

Examining the Validity of Credit Ratings Assigned to Credit Derivatives

Examining the Validity of Credit Ratings Assigned to Credit Derivatives Examnng the Valdty of redt atngs Assgned to redt Dervatves hh-we Lee Department of Fnance, Natonal Tape ollege of Busness No. 321, Sec. 1, h-nan d., Tape 100, Tawan heng-kun Kuo Department of Internatonal

More information

Notes on experimental uncertainties and their propagation

Notes on experimental uncertainties and their propagation Ed Eyler 003 otes on epermental uncertantes and ther propagaton These notes are not ntended as a complete set of lecture notes, but nstead as an enumeraton of some of the key statstcal deas needed to obtan

More information

Ch Rival Pure private goods (most retail goods) Non-Rival Impure public goods (internet service)

Ch Rival Pure private goods (most retail goods) Non-Rival Impure public goods (internet service) h 7 1 Publc Goods o Rval goods: a good s rval f ts consumpton by one person precludes ts consumpton by another o Excludable goods: a good s excludable f you can reasonably prevent a person from consumng

More information

Online Appendix for Merger Review for Markets with Buyer Power

Online Appendix for Merger Review for Markets with Buyer Power Onlne Appendx for Merger Revew for Markets wth Buyer Power Smon Loertscher Lesle M. Marx July 23, 2018 Introducton In ths appendx we extend the framework of Loertscher and Marx (forthcomng) to allow two

More information

USING A MULTICRITERIA INTERACTIVE APPROACH IN SCHEDULING NON-CRITICAL ACTIVITIES

USING A MULTICRITERIA INTERACTIVE APPROACH IN SCHEDULING NON-CRITICAL ACTIVITIES OPERATIONS RESEARCH AND DECISIONS No. 1 2018 DOI: 10.5277/ord180103 Mace NOWAK 1 Krzysztof S. TARGIEL 1 USING A MULTICRITERIA INTERACTIVE APPROACH IN SCHEDULING NON-CRITICAL ACTIVITIES A typcal proect

More information

Bayesian belief networks

Bayesian belief networks CS 2750 achne Learnng Lecture 12 ayesan belef networks los Hauskrecht mlos@cs.ptt.edu 5329 Sennott Square CS 2750 achne Learnng Densty estmaton Data: D { D1 D2.. Dn} D x a vector of attrbute values ttrbutes:

More information

ISyE 512 Chapter 9. CUSUM and EWMA Control Charts. Instructor: Prof. Kaibo Liu. Department of Industrial and Systems Engineering UW-Madison

ISyE 512 Chapter 9. CUSUM and EWMA Control Charts. Instructor: Prof. Kaibo Liu. Department of Industrial and Systems Engineering UW-Madison ISyE 512 hapter 9 USUM and EWMA ontrol harts Instructor: Prof. Kabo Lu Department of Industral and Systems Engneerng UW-Madson Emal: klu8@wsc.edu Offce: Room 317 (Mechancal Engneerng Buldng) ISyE 512 Instructor:

More information

A HEURISTIC SOLUTION OF MULTI-ITEM SINGLE LEVEL CAPACITATED DYNAMIC LOT-SIZING PROBLEM

A HEURISTIC SOLUTION OF MULTI-ITEM SINGLE LEVEL CAPACITATED DYNAMIC LOT-SIZING PROBLEM A eurstc Soluton of Mult-Item Sngle Level Capactated Dynamc Lot-Szng Problem A EUISTIC SOLUTIO OF MULTI-ITEM SIGLE LEVEL CAPACITATED DYAMIC LOT-SIZIG POBLEM Sultana Parveen Department of Industral and

More information

COS 511: Theoretical Machine Learning. Lecturer: Rob Schapire Lecture #21 Scribe: Lawrence Diao April 23, 2013

COS 511: Theoretical Machine Learning. Lecturer: Rob Schapire Lecture #21 Scribe: Lawrence Diao April 23, 2013 COS 511: Theoretcal Machne Learnng Lecturer: Rob Schapre Lecture #21 Scrbe: Lawrence Dao Aprl 23, 2013 1 On-Lne Log Loss To recap the end of the last lecture, we have the followng on-lne problem wth N

More information

Interregional Trade, Industrial Location and. Import Infrastructure*

Interregional Trade, Industrial Location and. Import Infrastructure* Interregonal Trade, Industral Locaton and Import Infrastructure* Toru Kkuch (Kobe Unversty) and Kazumch Iwasa (Kyoto Unversty)** Abstract The purpose of ths study s to llustrate, wth a smple two-regon,

More information

A Single-Product Inventory Model for Multiple Demand Classes 1

A Single-Product Inventory Model for Multiple Demand Classes 1 A Sngle-Product Inventory Model for Multple Demand Classes Hasan Arslan, 2 Stephen C. Graves, 3 and Thomas Roemer 4 March 5, 2005 Abstract We consder a sngle-product nventory system that serves multple

More information

A Comparison of Statistical Methods in Interrupted Time Series Analysis to Estimate an Intervention Effect

A Comparison of Statistical Methods in Interrupted Time Series Analysis to Estimate an Intervention Effect Transport and Road Safety (TARS) Research Joanna Wang A Comparson of Statstcal Methods n Interrupted Tme Seres Analyss to Estmate an Interventon Effect Research Fellow at Transport & Road Safety (TARS)

More information

International ejournals

International ejournals Avalable onlne at www.nternatonalejournals.com ISSN 0976 1411 Internatonal ejournals Internatonal ejournal of Mathematcs and Engneerng 7 (010) 86-95 MODELING AND PREDICTING URBAN MALE POPULATION OF BANGLADESH:

More information

Chapter 5 Student Lecture Notes 5-1

Chapter 5 Student Lecture Notes 5-1 Chapter 5 Student Lecture Notes 5-1 Basc Busness Statstcs (9 th Edton) Chapter 5 Some Important Dscrete Probablty Dstrbutons 004 Prentce-Hall, Inc. Chap 5-1 Chapter Topcs The Probablty Dstrbuton of a Dscrete

More information

ISE Cloud Computing Index Methodology

ISE Cloud Computing Index Methodology ISE Cloud Computng Index Methodology Index Descrpton The ISE Cloud Computng Index s desgned to track the performance of companes nvolved n the cloud computng ndustry. Index Calculaton The ISE Cloud Computng

More information

Real Exchange Rate Fluctuations, Wage Stickiness and Markup Adjustments

Real Exchange Rate Fluctuations, Wage Stickiness and Markup Adjustments Real Exchange Rate Fluctuatons, Wage Stckness and Markup Adjustments Yothn Jnjarak and Kanda Nakno Nanyang Technologcal Unversty and Purdue Unversty January 2009 Abstract Motvated by emprcal evdence on

More information

Dynamic Analysis of Knowledge Sharing of Agents with. Heterogeneous Knowledge

Dynamic Analysis of Knowledge Sharing of Agents with. Heterogeneous Knowledge Dynamc Analyss of Sharng of Agents wth Heterogeneous Kazuyo Sato Akra Namatame Dept. of Computer Scence Natonal Defense Academy Yokosuka 39-8686 JAPAN E-mal {g40045 nama} @nda.ac.jp Abstract In ths paper

More information

4. Greek Letters, Value-at-Risk

4. Greek Letters, Value-at-Risk 4 Greek Letters, Value-at-Rsk 4 Value-at-Rsk (Hull s, Chapter 8) Math443 W08, HM Zhu Outlne (Hull, Chap 8) What s Value at Rsk (VaR)? Hstorcal smulatons Monte Carlo smulatons Model based approach Varance-covarance

More information

Interval Estimation for a Linear Function of. Variances of Nonnormal Distributions. that Utilize the Kurtosis

Interval Estimation for a Linear Function of. Variances of Nonnormal Distributions. that Utilize the Kurtosis Appled Mathematcal Scences, Vol. 7, 013, no. 99, 4909-4918 HIKARI Ltd, www.m-hkar.com http://dx.do.org/10.1988/ams.013.37366 Interval Estmaton for a Lnear Functon of Varances of Nonnormal Dstrbutons that

More information

Likelihood Fits. Craig Blocker Brandeis August 23, 2004

Likelihood Fits. Craig Blocker Brandeis August 23, 2004 Lkelhood Fts Crag Blocker Brandes August 23, 2004 Outlne I. What s the queston? II. Lkelhood Bascs III. Mathematcal Propertes IV. Uncertantes on Parameters V. Mscellaneous VI. Goodness of Ft VII. Comparson

More information

PASS Sample Size Software. :log

PASS Sample Size Software. :log PASS Sample Sze Software Chapter 70 Probt Analyss Introducton Probt and lot analyss may be used for comparatve LD 50 studes for testn the effcacy of drus desned to prevent lethalty. Ths proram module presents

More information

Information Flow and Recovering the. Estimating the Moments of. Normality of Asset Returns

Information Flow and Recovering the. Estimating the Moments of. Normality of Asset Returns Estmatng the Moments of Informaton Flow and Recoverng the Normalty of Asset Returns Ané and Geman (Journal of Fnance, 2000) Revsted Anthony Murphy, Nuffeld College, Oxford Marwan Izzeldn, Unversty of Lecester

More information

Understanding Annuities. Some Algebraic Terminology.

Understanding Annuities. Some Algebraic Terminology. Understandng Annutes Ma 162 Sprng 2010 Ma 162 Sprng 2010 March 22, 2010 Some Algebrac Termnology We recall some terms and calculatons from elementary algebra A fnte sequence of numbers s a functon of natural

More information

3: Central Limit Theorem, Systematic Errors

3: Central Limit Theorem, Systematic Errors 3: Central Lmt Theorem, Systematc Errors 1 Errors 1.1 Central Lmt Theorem Ths theorem s of prme mportance when measurng physcal quanttes because usually the mperfectons n the measurements are due to several

More information

Xiaoli Lu VA Cooperative Studies Program, Perry Point, MD

Xiaoli Lu VA Cooperative Studies Program, Perry Point, MD A SAS Program to Construct Smultaneous Confdence Intervals for Relatve Rsk Xaol Lu VA Cooperatve Studes Program, Perry Pont, MD ABSTRACT Assessng adverse effects s crtcal n any clncal tral or nterventonal

More information

Benefit-Cost Analysis

Benefit-Cost Analysis Chapter 12 Beneft-Cost Analyss Utlty Possbltes and Potental Pareto Improvement Wthout explct nstructons about how to compare one person s benefts wth the losses of another, we can not expect beneft-cost

More information

Financial mathematics

Financial mathematics Fnancal mathematcs Jean-Luc Bouchot jean-luc.bouchot@drexel.edu February 19, 2013 Warnng Ths s a work n progress. I can not ensure t to be mstake free at the moment. It s also lackng some nformaton. But

More information

A Utilitarian Approach of the Rawls s Difference Principle

A Utilitarian Approach of the Rawls s Difference Principle 1 A Utltaran Approach of the Rawls s Dfference Prncple Hyeok Yong Kwon a,1, Hang Keun Ryu b,2 a Department of Poltcal Scence, Korea Unversty, Seoul, Korea, 136-701 b Department of Economcs, Chung Ang Unversty,

More information