SUPPLEMENTAL MATERIAL

Regret Minimization Algorithms for the Follower's Behaviour Identification in Leadership Games


A SUPPLEMENTAL MATERIAL

Theorem 2 (Expert pseudo-regret upper bound). Let us consider an instance of the I-SG problem and apply the FL algorithm, where each possible profile $A_k$ is an expert and receives, at round $n$, an expert reward equal to minus the loss she would have incurred, observing $i_{A,n}$, by playing the best response to the attacker $A_k$. Then, there always exists an attacker set $\mathcal{A}$ s.t. the defender $D$ incurs an expected pseudo-regret of:

$$R_N(U) \propto L N.$$

Proof. Let us analyse the I-SG problem in which the attacker profile set is $\mathcal{A} = \{\mathrm{Sta}, \mathrm{Sto}\}$, the true attacker is $A_{k^*} = \mathrm{Sta}$, and we use the Follow the Leader algorithm (Cesa-Bianchi and Lugosi, 2006). Assume that the best response $\sigma_D(\mathrm{Sto})$ to the stochastic attacker Sto corresponds to the pure strategy played by the Stackelberg attacker at the equilibrium, i.e., $\sigma_{\mathrm{Sta}}(\sigma_D(\mathrm{Sta})) = \sigma_D(\mathrm{Sto})$. Assume that the target chosen by the two strategies is $\hat{m}$, with value $v_{\hat{m}}$, that the target with maximum value is $\bar{m}$, with value $v_{\bar{m}}$, and that the stochastic attacker has a strategy $p$ s.t.:

$$p_m = \begin{cases} \alpha & \text{if } m = \hat{m}, \\ 1 - \alpha & \text{if } m = \bar{m}, \\ 0 & \text{otherwise,} \end{cases}$$

where $\alpha = \frac{v_{\bar{m}} - L(\mathrm{Sta})}{v_{\bar{m}}}$ and $\alpha v_{\bar{m}} > (1 - \alpha) v_{\bar{m}}$. In this case, the defender might commit to two different strategies:

(i) if the defender $D$ declares its best response to the Stackelberg attacker, $\sigma_D(\mathrm{Sta})$, for the turn, it provides zero loss as feedback to the stochastic attacker expert and a loss equal to $L(\mathrm{Sta})$ to the Stackelberg one;

(ii) if the defender $D$ selects the best response to the stochastic attacker, $\sigma_D(\mathrm{Sto})$, it provides a loss equal to $(1 - \alpha) v_{\bar{m}} = L(\mathrm{Sta})$ to the stochastic attacker expert and $L(\mathrm{Sta})$ to the Stackelberg one. Thus, in this case the two types receive the same feedback.

Summarizing, the Stackelberg attacker expert always incurs a loss greater than or equal to that of the stochastic one, even if the real attacker is Stackelberg. Thus, with a probability greater than 0.5 we incur a loss of $L$ for the entire horizon, with a total regret proportional to $L N$.
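The feedback structure used in the argument above can be replayed in a few lines. The following Python sketch is only illustrative and rests on assumptions that are not in the original text: ties between experts are broken uniformly at random, losses are counted in integer units of $L(\mathrm{Sta})$, and the function name `simulate_ftl` is hypothetical.

```python
import random


def simulate_ftl(n_rounds, loss_sta=1, seed=0):
    """Replay Follow the Leader on the two-expert instance {Sta, Sto}.

    Feedback rule taken from the proof:
      - defender plays the best response to Sta: Sto expert gets loss 0,
        Sta expert gets loss L(Sta);
      - defender plays the best response to Sto: both experts get L(Sta).
    Assumption (not in the source): ties are broken uniformly at random.
    """
    rng = random.Random(seed)
    cumulative = {"Sta": 0, "Sto": 0}  # cumulative expert losses, in units of L(Sta)
    picks = []
    for _ in range(n_rounds):
        best = min(cumulative.values())
        leaders = [name for name in ("Sta", "Sto") if cumulative[name] == best]
        choice = rng.choice(leaders)  # follow the current leader
        picks.append(choice)
        if choice == "Sta":
            cumulative["Sta"] += loss_sta  # Sto expert receives zero loss
        else:
            cumulative["Sta"] += loss_sta  # both experts receive the same loss
            cumulative["Sto"] += loss_sta
    return picks, cumulative


if __name__ == "__main__":
    picks, cumulative = simulate_ftl(1000, seed=1)
    print("rounds following Sta:", picks.count("Sta"))
    print("rounds following Sto:", picks.count("Sto"))
```

Under these assumptions the Sto expert never has a larger cumulative loss than the Sta expert, so once the Sta expert has been followed a single time the algorithm follows Sto for all remaining rounds, even though the true attacker is Sta.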
Even when resorting to randomization, i.e., adopting the FPL, we would have a probability of at least $0.5 - \varepsilon$ ($\varepsilon$ being the probability with which the FPL chooses a suboptimal option) of selecting the wrong option; thus, the FPL algorithm would also incur a regret that is linear in the time horizon.

Theorem 3 (Pseudo-regret upper bound). Given an instance of the I-SG problem s.t. $\Delta b_k > 0$ for each $A_k \in \mathcal{A}$, and applying the proposed algorithm, the defender incurs a pseudo-regret of:

$$R_N(U) \le \sum_{k=1}^{K} \frac{2 (\lambda_k^2 + \lambda_{k^*}^2) \, \Delta L_k}{(\Delta b_k)^2},$$

where

$$\lambda_k := \max_{m \in M} \max_{\sigma \in S} \ln(\sigma_{A_k}(\sigma)_m) - \min_{m \in M} \min_{\sigma \in S} \ln(\sigma_{A_k}(\sigma)_m) \, I\{\sigma_{A_k}(\sigma)_m \neq 0\}$$

is the range in which the logarithm of the belief realizations lies (excluding realizations equal to zero, which end the exploration of a profile), and $S := \{\sigma_D(A_k) : A_k \in \mathcal{A}\}$ is the set of the available best responses to the attacker profiles.

Proof. Let us analyze the regret of the algorithm. We incur some regret whenever the algorithm selects a strategy profile corresponding to a type different from the real one. Thus, the regret is upper bounded by:

$$R_N(U) = \mathbb{E}\left[\sum_{n=1}^{N} l_n\right] - L^* N = \mathbb{E}\left[\sum_{n=1}^{N} (l_n - L^*)\right] = \sum_{k=1}^{K} \Delta L_k \, \mathbb{E}[T_k(N)],$$

where we recall that:

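$T_k(N) = \sum_{n=1}^{N} I\{A_{j_n} = A_k\}$ is the number of times we played the best response $\sigma_D(A_k)$ to attacker $A_k$; $\Delta L_k = \sum_{m=1}^{M} \sigma_{A_{k^*}}(\sigma_D(A_k))_m \, v_m \, (1 - \sigma_D(A_k)_m) - L^*$ is the expected regret of playing the best response to attacker $A_k$ when the real attacker is $A_{k^*}$.

In each round in which the algorithm selects a profile whose best response differs from the one of $A_{k^*}$, we incur some regret. Let us define the variables $B_{k,n}$ and $B_{k^*,n}$, denoting the belief we have for the possible attacker $A_k$ and for the real attacker $A_{k^*}$, respectively, of the action played by the real attacker $A_{k^*}$ at turn $n$. Moreover, let $b_{k,j,t} := \mathbb{E}_{\sigma_D(A_j)}[B_{k,t}]$ be the expected value of the belief we get for attacker $A_k$ when we are best responding to $A_j$ and the true type is $A_{k^*} \in \mathcal{A}$ at round $t$. Note that $b_{k,j,t} < b_{k^*,j,t}$, $\forall j$, since $\Delta b_k$ is positive. For each profile $A_k \in \mathcal{A}$, we have:

$$
\begin{aligned}
\mathbb{E}[T_k(N)] &\le \sum_{n=1}^{N} \mathbb{E}\left[ I\left\{ \prod_{t=1}^{n} B_{k,t} \ge \prod_{t=1}^{n} B_{k^*,t} \right\} \right] && (7) \\
&= \sum_{n=1}^{N} \mathbb{E}\left[ I\left\{ \sum_{t=1}^{n} \ln(B_{k,t}) \ge \sum_{t=1}^{n} \ln(B_{k^*,t}) \right\} \right] && (8) \\
&= \sum_{n=1}^{N} \mathbb{P}\left( \sum_{t=1}^{n} \ln(B_{k,t}) \ge \sum_{t=1}^{n} \ln(B_{k^*,t}) \right) && (9) \\
&= \sum_{n=1}^{N} \mathbb{P}\left( \sum_{t=1}^{n} \left[\ln(B_{k,t}) - \ln(b_{k,j_t,t})\right] - \sum_{t=1}^{n} \left[\ln(B_{k^*,t}) - \ln(b_{k^*,j_t,t})\right] \ge \underbrace{\sum_{t=1}^{n} \left[\ln(b_{k^*,j_t,t}) - \ln(b_{k,j_t,t})\right]}_{\ge \, n \Delta b_k} \right) && (10) \\
&\le \sum_{n=1}^{N} \mathbb{P}\left( \sum_{t=1}^{n} \left[\ln(B_{k,t}) - \ln(b_{k,j_t,t})\right] - \sum_{t=1}^{n} \left[\ln(B_{k^*,t}) - \ln(b_{k^*,j_t,t})\right] \ge n \Delta b_k \right) && (11) \\
&\le \underbrace{\sum_{n=1}^{N} \mathbb{P}\left( \sum_{t=1}^{n} \left[\ln(B_{k,t}) - \ln(b_{k,j_t,t})\right] \ge \frac{n \Delta b_k}{2} \right)}_{R_1} + \underbrace{\sum_{n=1}^{N} \mathbb{P}\left( \sum_{t=1}^{n} \left[\ln(B_{k^*,t}) - \ln(b_{k^*,j_t,t})\right] \le -\frac{n \Delta b_k}{2} \right)}_{R_2}, && (12)
\end{aligned}
$$

where $j_t$ is the index of the attacker $A_{j_t}$ we selected at round $t$ and we defined $\Delta b_k := \min_{j \,|\, A_j \in \mathcal{A}} \left[\ln(b_{k^*,j,t}) - \ln(b_{k,j,t})\right]$, i.e., the minimum, w.r.t. the best responses for the available attackers, of the difference between the expected value of the loglikelihood of attacker $A_{k^*}$ and that of $A_k$ if the true profile is $A_{k^*}$. Equation (9) has been obtained from Equation (8) since $\mathbb{E}[I\{\cdot\}] = \mathbb{P}(\cdot)$, while Equation (10) has been computed from Equation (9) by adding $\sum_{t=1}^{n} \left[\ln(b_{k^*,j_t,t}) - \ln(b_{k,j_t,t})\right]$ to both the l.h.s. and the r.h.s. of the inequality. We would like to point out that $\Delta b_k$ does not depend on $t$ since the distribution of $B_{k,t}$ and $B_{k^*,t}$ is the same over rounds.

Let us focus on $R_1$. We use the McDiarmid inequality (McDiarmid, 1989) to bound the probability that the empirical estimate of the loglikelihood expected value is higher than a certain upper bound as follows:

$$R_1 = \sum_{n=1}^{N} \mathbb{P}\left( \sum_{t=1}^{n} \left[\ln(B_{k,t}) - \ln(b_{k,j_t,t})\right] \ge \frac{n \Delta b_k}{2} \right) \le \sum_{n=1}^{N} \exp\left\{ -\frac{(\Delta b_k)^2 \, n}{2 \lambda_k^2} \right\} \le \frac{2 \lambda_k^2}{(\Delta b_k)^2},$$

where we exploited $\sum_{x=1}^{\infty} e^{-\kappa x} \le \frac{1}{\kappa}$ and the fact that $\mathbb{E}[B_{k,t}] = b_k$, $\forall t$. Here $\lambda_k := \max_{m \in M} \max_{\sigma \in S} \ln(\sigma_{A_k}(\sigma)_m) - \min_{m \in M} \min_{\sigma \in S} \ln(\sigma_{A_k}(\sigma)_m) \, I\{\sigma_{A_k}(\sigma)_m \neq 0\}$ is the range in which the logarithm of the belief realizations lies (excluding realizations equal to zero, which end the exploration of a profile), and $S := \{\sigma_D(A_k) : A_k \in \mathcal{A}\}$ is the set of the available best responses to the attacker profiles.

A similar reasoning can be applied to $R_2$, getting an upper bound of the following form:

$$R_2 \le \frac{2 \lambda_{k^*}^2}{(\Delta b_k)^2}.$$

The regret becomes:

$$R_N(U) = \sum_{k=1}^{K} \Delta L_k \, \mathbb{E}[T_k(N)] \le \sum_{k=1}^{K} \Delta L_k \left( \frac{2 \lambda_k^2}{(\Delta b_k)^2} + \frac{2 \lambda_{k^*}^2}{(\Delta b_k)^2} \right) = \sum_{k=1}^{K} \frac{2 (\lambda_k^2 + \lambda_{k^*}^2) \, \Delta L_k}{(\Delta b_k)^2},$$

which concludes the proof.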
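The proof relies on the inequality $\sum_{x=1}^{\infty} e^{-\kappa x} \le 1/\kappa$ for $\kappa > 0$. The series is geometric with closed form $1/(e^{\kappa} - 1)$, and the bound follows from $e^{\kappa} \ge 1 + \kappa$. The short Python check below is a numerical illustration only; the function names and the values of $\kappa$ are hypothetical choices, not taken from the paper.

```python
import math


def truncated_exp_sum(kappa, terms=10_000):
    """Partial sum of exp(-kappa * x) for x = 1, ..., terms."""
    return sum(math.exp(-kappa * x) for x in range(1, terms + 1))


def closed_form(kappa):
    """Value of the infinite geometric series: 1 / (exp(kappa) - 1)."""
    return 1.0 / math.expm1(kappa)


if __name__ == "__main__":
    for kappa in (0.01, 0.5, 2.0):  # illustrative values only
        partial = truncated_exp_sum(kappa)
        print(kappa, partial, closed_form(kappa), 1.0 / kappa)
```

For each value of $\kappa$, the partial sum stays below the closed form, which in turn stays below $1/\kappa$; the gap between the last two shrinks as $\kappa$ approaches zero.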

B ADDITIONAL RESULTS

For the sake of completeness, we report in Figures 8 and 9 the regret graphs for all the running configurations $C_1, \ldots, C_7$ and for the two dimensions of the target space, namely $M \in \{5, 10\}$. These additional figures are in line with what has been presented in Section 6 of the main paper, where the two proposed techniques are able to outperform the methods from the literature. Even here, neither method provides statistical evidence that it is able to outperform the other. Moreover, we also provide in Figure 10 the results for configuration $C_6$ with a number of targets $M = 40$. In this configuration, we were able to run only one of the algorithms, due to computational time constraints. The results show that its performance is similar to that experienced with smaller target spaces; thus, it is able to scale without significant loss in terms of expected pseudo-regret $R_N(U)$.

[Figure 8 panels: (a) Configuration $C_1$, (b) Configuration $C_2$, (c) Configuration $C_3$, (d) Configuration $C_4$, (e) Configuration $C_5$, (f) Configuration $C_6$, (g) Configuration $C_7$. Vertical axis: R(U); horizontal axis: 0 to 1000.]

Figure 8: Expected pseudo-regret for the different configurations with M = 5 targets.

[Figure 9 panels: (a) Configuration $C_1$, (b) Configuration $C_2$, (c) Configuration $C_3$, (d) Configuration $C_4$, (e) Configuration $C_5$, (f) Configuration $C_6$, (g) Configuration $C_7$. Vertical axis: R(U); horizontal axis: 0 to 1000.]

Figure 9: Expected pseudo-regret for the different configurations with M = 10 targets.

[Figure 10: single panel. Vertical axis: R(U); horizontal axis: 0 to 1000.]

Figure 10: Expected pseudo-regret for the configuration $C_6$ with M = 40 targets.

We also report here Table 4, the full version of Table 3, with the time values up to the first decimal and also specifying the confidence interval.

Table 4: Computational time in seconds needed by the two algorithms to solve an instance over N = 1000 rounds and the corresponding 95% confidence intervals.

M = 40, M = 20, M = 10, M = 5

C_1             C_2             C_3             C_4             C_5              C_6              C_7
5.9 ± 1.7       11.1 ± .        11.7 ± .9       3.5 ± 1.0       3.7 ± .4         14.9 ± 4.3       14.7 ± 3.
77.0 ± .1       11.1 ± 3.       170.4 ± 4.1     146. ± 4.7      651.7 ± 36.6     109. ± 64.7      1113.7 ± 40.
10.3 ± .6       1.9 ± 13.       3.0 ± 17.9      7.1 ± .3        63.0 ± 7.4       47. ± 14.05      48.59 ± 13.48
356.1 ± 14.3    678.5 ± 15.9    887.0 ± 11.1    960.4 ± 13.0    440.5 ± 14.      756.5 ± 189.9    791.6 ± 3.7
33.5 ± 3.0      . ± 16.9        137.8 ± 77.6    33.7 ± 1.       484.5 ± 107.7    6.8 ± 45.3       9.5 ± 46.44
104.5 ± 7.1     061.5 ± 837.    141.0 ± 81.1    18.9 ± 16.5     347.9 ± 13.      1634. ± 487.6    1643.6 ± 468.8
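Entries of the form "mean ± half-width" can be produced from repeated timing runs. The text does not state how the 95% confidence intervals were computed, so the Python sketch below is only one common possibility: it assumes a normal approximation (critical value 1.96) and uses made-up timings that are not taken from the paper. The function name `mean_with_ci95` is hypothetical.

```python
import math
import statistics


def mean_with_ci95(samples):
    """Return (mean, half_width) of a 95% confidence interval.

    Assumption (not from the source): normal approximation with
    critical value 1.96 and the sample standard deviation.
    """
    n = len(samples)
    mean = statistics.mean(samples)
    half_width = 1.96 * statistics.stdev(samples) / math.sqrt(n)
    return mean, half_width


if __name__ == "__main__":
    hypothetical_timings = [10.0, 12.0, 14.0]  # seconds, illustrative only
    mean, half_width = mean_with_ci95(hypothetical_timings)
    print(f"{mean:.1f} ± {half_width:.1f}")
```

With few repetitions, a Student t critical value would give wider and more appropriate intervals than the normal approximation used here.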