Regret Minimization against Strategic Buyers

1 Regret Minimization against Strategic Buyers. Mehryar Mohri (Courant Institute & Google Research), Andrés Muñoz Medina (Google Research).

3 Motivation Online advertisement: revenue of modern search engines and popular online sites. billions of transactions every day. key role of revenue optimization algorithms.

4 Motivation Second-price auctions with reserve: widely adopted mechanism in Ad Exchanges. many transactions admit a single bidder, which reduces them to posted-price auctions. this talk: study of posted-price auctions with strategic buyers.

5 Related Work Revenue optimization in second-price auctions [Cui et al., 2011; He et al., 2013; Cesa-Bianchi et al., 2013; MM and Muñoz, 2014]. Revenue optimization in generalized second-price auctions (GSP) [MM and Muñoz, 2015; Varian, 2007; Lucier et al., 2014; Sun et al., 2014; Rudolph et al., 2016; Charles et al., 2016; Roughgarden and Wang, 2016]. Dynamic pricing [Kanoria and Nazerzadeh, 2014; Bikhchandani and McCardle, 2012; den Boer, 2015; Chen et al., 2015]. Pricing with strategic and patient buyers [Feldman et al., 2016]. Preference reconstruction [Blum et al., 2014]. Learning optimal auctions [Huang et al., 2015; Morgenstern and Roughgarden, 2015].

6 This Talk Scenarios: Fixed valuation. Random valuation.

7 Setup Repeated posted-price auctions: good repeatedly offered for sale by a seller to a single buyer over $T$ rounds. buyer holds private valuation $v \in [0, 1]$. at each round $t \in [T]$, seller offers price $p_t$ and buyer accepts ($a_t = 1$) or rejects ($a_t = 0$).

8 Setup Seller: pricing algorithm $\mathcal{A}$. total revenue: $\sum_{t=1}^{T} a_t p_t$. regret: $\mathrm{Reg}_T(\mathcal{A}) = v\,T - \sum_{t=1}^{T} a_t p_t$. Buyer: discounting factor $\gamma \in (0, 1]$. surplus: $\mathrm{Sur}_T(\mathcal{A}) = \sum_{t=1}^{T} \gamma^{t-1} a_t (v - p_t)$.
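Written out as a tiny Python helper (a sketch; the list-based interface and variable names are illustrative assumptions, not part of the talk):

```python
def regret_and_surplus(v, gamma, prices, accepts):
    """Compute the two objectives from this slide for a fixed valuation v:
    seller regret v*T - sum_t a_t*p_t and discounted buyer surplus
    sum_t gamma**(t-1) * a_t * (v - p_t).  prices/accepts are per-round lists."""
    T = len(prices)
    revenue = sum(a * p for p, a in zip(prices, accepts))
    regret = v * T - revenue
    # enumerate starts at 0, so gamma**t equals gamma^(round - 1) for round t + 1
    surplus = sum(gamma ** t * a * (v - p)
                  for t, (p, a) in enumerate(zip(prices, accepts)))
    return regret, surplus
```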

9 Strategic Setting [Amin et al., 2013] Seller announces his algorithm $\mathcal{A}$. Buyer acts strategically: he seeks to maximize his surplus $\mathrm{Sur}_T(\mathcal{A})$. Seller seeks to minimize his strategic regret, that is, the regret $\mathrm{Reg}_T(\mathcal{A})$ against a strategic buyer. Question: can we design algorithms minimizing strategic regret?

10 Truthful Setting Fast Search (FS) algorithm [Kleinberg and Leighton, 2003]: keeps track of a feasible interval $[a, b]$, starting at $[0, 1]$, and a parameter $\epsilon$, starting at $\epsilon = \frac{1}{2}$. in each phase, offers prices $a + \epsilon, a + 2\epsilon, \ldots$ until a price is rejected. if price $a + k\epsilon$ is rejected, new phase with interval $[a + (k-1)\epsilon,\, a + k\epsilon]$ and parameter $\epsilon^2$. until the size of the interval is less than $\frac{1}{T}$.

11 Truthful Setting Fast Search (FS) algorithm [Kleinberg and Leighton, 2003]: at most $\lceil \log_2 \log_2 T \rceil + 1$ phases. regret in $O(\log\log T)$. lower bound: $\Omega(\log\log T)$.
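For concreteness, here is a minimal Python sketch of the Fast Search routine described on the two slides above; `buyer_accepts(t, price)` is a hypothetical callback standing in for the buyer's response, and details such as the final exploitation price are simplifications.

```python
def fast_search(T, buyer_accepts):
    """Sketch of Fast Search (FS) [Kleinberg and Leighton, 2003].
    buyer_accepts(t, price) -> bool is a hypothetical buyer callback."""
    a, b = 0.0, 1.0          # feasible interval believed to contain v
    eps = 0.5                # phase parameter, squared after every phase
    revenue, t = 0.0, 1
    while t <= T:
        if b - a < 1.0 / T:
            # interval small enough: keep offering its lower endpoint
            if buyer_accepts(t, a):
                revenue += a
            t += 1
            continue
        # one phase: offer a + eps, a + 2*eps, ... (capped at b) until a rejection
        k, rejected = 1, False
        while t <= T and not rejected and a + (k - 1) * eps < b:
            price = min(a + k * eps, b)
            if buyer_accepts(t, price):
                revenue += price
            else:
                rejected = True
                a, b = a + (k - 1) * eps, price   # shrink the feasible interval
                eps = eps ** 2                    # square the phase parameter
            t += 1
            k += 1
        if not rejected:
            a = b   # every price up to b was accepted in this phase
    return revenue
```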

12 Example [Figure: a buyer with valuation v = 16 is offered $8, $4, $2, and $1 in turn and answers No, No, No, YES!]

13 Monotone algorithms Algorithm: offer price $p_t = \beta^t$ ($\beta < 1$) until it is accepted. offer the accepted price thereafter. idea: a slow enough decrease is inconvenient for the buyer. Strategic regret in $O(\sqrt{T})$, with an additional dependence on $\frac{1}{1-\gamma}$. [Amin et al., 2013]
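A sketch of the monotone strategy in Python (the decrease factor `beta` and the buyer callback are illustrative assumptions):

```python
def monotone_pricing(T, beta, buyer_accepts):
    """Sketch of the monotone algorithm: offer p_t = beta**t until the buyer
    accepts, then keep offering the accepted price for the remaining rounds.
    buyer_accepts(t, price) -> bool is a hypothetical buyer callback."""
    accepted_price, revenue = None, 0.0
    for t in range(1, T + 1):
        price = accepted_price if accepted_price is not None else beta ** t
        if buyer_accepts(t, price):
            accepted_price = price
            revenue += price
    return revenue
```

A decrease factor close to 1 makes the price fall slowly, so a discounting buyer pays a real cost for waiting, which is the idea stated above.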

14 Monotone algorithms Theorem: the strategic regret of any monotone decreasing convex algorithm is in $\Omega(\sqrt{T})$. [MM and Muñoz, 2014]

15 Proof idea Fix a monotone price function. Choose $v \in [\frac{1}{2}, 1]$ at random. Let $\kappa = \inf\{t \colon p_t < v\}$; then $E[\kappa]\,E[v - p_\kappa] \geq c$ for some constant $c > 0$. Tradeoff optimized for $p_t - p_{t+1} \approx \frac{1}{\sqrt{T}}$.

16 Lower Bound [Amin et al., 2013; Kleinberg and Leighton, 2003; MM and Muñoz, 2014] Theorem: for any pricing algorithm $\mathcal{A}$, the following lower bound holds: $\mathrm{Reg}_T(\mathcal{A}) \geq \max\Big(\frac{1}{12(1-\gamma)},\, C \log\log T\Big)$ for some universal constant $C$.

17 Idea A lying buyer rejects even though $v > p_t$ or accepts even though $v < p_t$. Can we dissuade the buyer from lying? buyer's weakness: time (discounted surplus). penalization: if the buyer rejects a price, reoffer it for another $(r - 1)$ rounds. choice of $r$ subject to a trade-off.

18 Pricing Strategies Any deterministic strategy can be represented by a tree. [Figure: binary pricing tree; the root offers 1/2, its children offer 1/4 and 3/4, and the leaves offer 1/8, 3/8, 5/8, 7/8 depending on the accept/reject path.]

19 Meta-Algorithm [Figure: the truthful pricing tree (root 1/2, children 1/4 and 3/4, leaves 1/8 through 7/8) is converted into a strategic one by repeating each offered price for r rounds.]
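The transformation sketched on this slide can be written as a small wrapper around any truthful pricing rule; the `next_price(history)` and `buyer_accepts(t, price)` interfaces below are assumptions made for illustration.

```python
def penalized_strategy(next_price, r, T, buyer_accepts):
    """Sketch of the penalization meta-algorithm: run a truthful pricing rule,
    but whenever a price is rejected, reoffer that same price for another
    r - 1 rounds before handing control back to the truthful rule."""
    history, revenue, t = [], 0.0, 1
    while t <= T:
        price = next_price(history)             # truthful rule picks the price
        accepted = buyer_accepts(t, price)
        revenue += price if accepted else 0.0
        t += 1
        if not accepted:
            # penalization: repeat the rejected price for r - 1 extra rounds
            for _ in range(r - 1):
                if t > T:
                    break
                if buyer_accepts(t, price):
                    revenue += price
                t += 1
        history.append((price, accepted))       # only the first answer is recorded
    return revenue
```

Instantiating `next_price` with a Fast-Search-style rule gives, roughly, the Penalized Fast Search (PFS) algorithm discussed on the next slide.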

20 PFS Guarantees [MM and Muñoz, 2014] Theorem: let $\gamma_0 \in (\frac{1}{2}, 1)$; the following strategic regret guarantees hold for Penalized Fast Search (PFS): $\mathrm{Reg}_T(\mathrm{PFS}) = O(\log\log T)$ for $\gamma \in (0, \frac{1}{2}]$; $\mathrm{Reg}_T(\mathrm{PFS}) = O(\log T \log\log T)$ for $\gamma \in (\frac{1}{2}, \gamma_0)$.

21 Proof Idea Surplus along the rejected path: at most $\frac{\gamma^{t+r-1}}{1-\gamma}$. Surplus along the accepted path: at least $\gamma^{t-1}(v - p_t)$. Hence lying can be beneficial only if $(v - p_t) \leq \frac{\gamma^r}{1-\gamma}$.
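Spelling the comparison out (a sketch that only uses the fact that each round's surplus is at most 1): rejecting price $p_t$ at round $t$ can pay off only if

```latex
\underbrace{\frac{\gamma^{t+r-1}}{1-\gamma}}_{\text{best case after rejecting}}
\;\ge\;
\underbrace{\gamma^{t-1}\,(v - p_t)}_{\text{surplus from accepting now}}
\qquad\Longleftrightarrow\qquad
v - p_t \;\le\; \frac{\gamma^{r}}{1-\gamma},
```

so a larger $r$ leaves only near-worthless lies profitable, at the price of wasting up to $r - 1$ rounds after every truthful rejection (the trade-off mentioned on the earlier slide).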

22 Horizon-Indep. Regret Extension of PFS via an exponentiating trick to a horizon-independent algorithm $\widetilde{\mathrm{PFS}}$: the length $T_i$ of the $i$th epoch verifies $\log_2 \log_2 T_i = 2^{i-1}$. $\mathrm{Reg}_T(\widetilde{\mathrm{PFS}}) = O(\log\log T)$ for $\gamma \in (0, \frac{1}{2}]$; $\mathrm{Reg}_T(\widetilde{\mathrm{PFS}}) = O(\log T \log\log T)$ for $\gamma \in (\frac{1}{2}, \gamma_0)$. [Drutsa, 2017]

23 Further Improvement PRRFES algorithm [Drutsa, 2017]: truthful FES algorithm: modified FS; after a rejection, reoffer the last accepted price $g$ times. same lie penalization as in PFS. continue to offer until rejection. strategic regret: $\mathrm{Reg}_T(\mathrm{PRRFES}) = O(\log\log T)$ for $\gamma \in (0, \gamma_0]$.

24 Random valuations

25 Setup [Amin et al., 2013] Repeated posted-price auctions: good repeatedly offered for sale by a seller to a single buyer over $T$ rounds. buyer receives valuation $v_t \in [0, 1]$ with $v_t \sim D$. at each round $t \in [T]$, seller offers price $p_t$ and buyer accepts ($a_t = 1$) or rejects ($a_t = 0$).

26 Setup Seller: pricing algorithm $\mathcal{A}$. total revenue: $E\big[\sum_{t=1}^{T} a_t p_t\big]$. regret: $\mathrm{Reg}_T(\mathcal{A}) = \max_{p \in \mathcal{P}} p\,P(v > p)\,T - E\big[\sum_{t=1}^{T} a_t p_t\big]$. Buyer: discounting factor $\gamma \in (0, 1]$. surplus: $\mathrm{Sur}_T(\mathcal{A}) = E\big[\sum_{t=1}^{T} \gamma^{t-1} a_t (v_t - p_t)\big]$.

27 Strategic buyers Simple tree structure for fixed valuation. Seller offers price $p_t$ drawn from a distribution $P_t$. Surplus of state $s_t = (P_t, h_{t-1}, v_t, p_t)$:
$S_t(s_t) = \max_{a_t \in \{0,1\}} \gamma^{t-1} a_t (v_t - p_t) + \mathbb{E}_{(v_{t+1},\, p_{t+1}) \sim D \otimes f_t(P_t, h_{t-1})}\big[S_{t+1}\big(f_t(P_t, h_{t-1}), H_t, v_{t+1}, p_{t+1}\big)\big]$.
Solution found in time $T^{|\mathcal{P}|}$.

28 $\epsilon$-strategic buyers Stop optimizing once all remaining (discounted) surplus is at most $\epsilon$, and behave truthfully from then on. Optimize for at most $\frac{\log\big[\frac{1}{\epsilon(1-\gamma)}\big]}{\log\frac{1}{\gamma}}$ rounds. Tractable MDP.
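The round count comes from bounding the remaining discounted surplus by a geometric series; a short sketch, writing $\epsilon$ for the threshold and using $v_s - p_s \leq 1$:

```latex
\sum_{s \ge t+1} \gamma^{s-1}\,(v_s - p_s)
\;\le\; \sum_{s \ge t+1} \gamma^{s-1}
\;=\; \frac{\gamma^{t}}{1-\gamma}
\;\le\; \epsilon
\qquad\Longleftrightarrow\qquad
t \;\ge\; \frac{\log\!\big[\tfrac{1}{\epsilon(1-\gamma)}\big]}{\log\tfrac{1}{\gamma}}.
```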

29 Bandit Formulation Only observe the reward of the price offered. Minimize the pseudo-regret $\mathrm{Reg}_T(\mathcal{A}) = \max_{p \in \mathcal{P}} p\,P(v > p)\,T - E\big[\sum_{t=1}^{T} a_t p_t\big]$. Problem: rewards are not i.i.d. (strategic buyer).

30 Regret bound [MM and Muñoz, 2015] Theorem: let $\mathcal{P}$ be a finite set of prices and let $L$ be the number of times the buyer lies. Let $p^* = \operatorname{argmax}_{p \in \mathcal{P}} p\,P(v > p)$ and $\Delta_p = p^*\,P(v > p^*) - p\,P(v > p)$. For any $\epsilon > 0$,
$\mathrm{Reg}_T \leq E[L] + \sum_{p\colon \Delta_p > \epsilon} E[T_p(T)]\,\Delta_p + \epsilon T$,
where $T_p(t)$ is the number of times price $p$ has been offered up to time $t$.

31 R-UCB Make UCB robust to lies. Use a modified upper confidence bound:
$\widehat{\mu}_p(t) = \frac{1}{T_p(t)} \sum_{i=1}^{t} a_i p_i\, 1_{p_i = p} + \sqrt{\frac{2\log t}{T_p(t)}} + \frac{L\,p}{T_p(t)}$.
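A minimal Python sketch of choosing a price with this index; the bookkeeping format `stats[p] = (n_offers, total_observed_revenue)` and the lie budget `L` are assumptions made for illustration.

```python
import math

def r_ucb_choose(prices, stats, t, L):
    """Sketch of R-UCB price selection: empirical revenue of each price plus
    the usual UCB confidence width plus a lie-robustness term L*p / T_p(t)."""
    best_price, best_index = None, float("-inf")
    for p in prices:
        n, total = stats.get(p, (0, 0.0))
        if n == 0:
            return p                                  # offer every price once first
        index = (total / n                            # empirical mean revenue
                 + math.sqrt(2.0 * math.log(t) / n)   # confidence width
                 + L * p / n)                         # extra slack for up to L lies
        if index > best_index:
            best_price, best_index = p, index
    return best_price
```

After each round the caller would update `stats[p_t]` with the observed revenue $a_t p_t$.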

32 Regret analysis Proposition: the regret of the R-UCB algorithm is bounded by
$\mathrm{Reg}_T \leq E[L] + \sum_{p\colon \Delta_p > \epsilon}\Big(4 L p + \frac{32\log T}{\Delta_p} + 2\Delta_p\Big) + \epsilon T + \sum_{t=1}^{T} P_t(p, L)$,
where $P_t(p, L) := P\Big(\frac{L_t(p)}{T_p(t)} + \frac{L_t(p^*)}{T_{p^*}(t)} \geq \frac{L\,p}{T_p(t)} + \frac{\Delta_p}{T_{p^*}(t)}\Big)$.

33 Bound on Lies An $\epsilon$-strategic buyer lies at most $\left\lceil\frac{\log\frac{1}{\epsilon(1-\gamma)}}{\log\frac{1}{\gamma}}\right\rceil$ times. Regret of R-UCB in $O\Big(\log T + |\mathcal{P}|\big\lceil\frac{\log\frac{1}{\epsilon(1-\gamma)}}{\log\frac{1}{\gamma}}\big\rceil\Big)$. Extension to a continuous set of prices by discretization. Regret in $O\Big(\sqrt{T} + \frac{T^{1/4}}{1-\gamma}\Big)$.

34 Conclusion Analysis of strategic regret. Fixed and random valuation scenarios. Simple algorithms extending the truthful scenario. Many questions: Can we extend the results to other types of buyers? What if the buyer learns too? Extension to general auctions?

35 Other Related Questions Can the buyer trust the algorithm announced? testing incentive-compatibility [Lahaie, Muñoz, Sivan, and Vassilvitskii, 2017] (Andrés's talk). Extend the analysis to the case where the algorithmic details are not known.
