Learning for Revenue Optimization. Andrés Muñoz Medina Renato Paes Leme
1 Learning for Revenue Optimization
Andrés Muñoz Medina, Renato Paes Leme
2 How to succeed in business with basic ML?
[Figure: example buyer valuations in dollars and a plot of Revenue against Price.]
3 Complications
- What if the seller only sees a sample of the population?
- What if the seller doesn't know every buyer's valuation?
- Can buyers lie and misreport their true valuation?
- What if valuations change as a function of features?
4 Outline
- Online revenue optimization
- Batch revenue optimization
5 Various flavors of this problem
- One buyer (pricing) vs. multiple buyers (auctions)
- Fixed valuations (realizable), random valuations (stochastic), and worst-case valuations (adversarial)
- Contextual vs. non-contextual
- Strategic vs. myopic buyers
6 Definitions
- Valuation (v): what a buyer is willing to pay for a good
- Bid: how much a buyer claims she is willing to pay
- Reserve price (p): minimum price acceptable to the seller
- Revenue (Rev): how much the seller gets from selling
- Interactions (T): number of times buyer and seller interact
7 Single buyer
- Valuation \(v_t\) = maximum willingness to pay
- Reserve price \(p_t\)
- Myopic (price-taking) buyer: buys whenever \(v_t \ge p_t\), i.e. doesn't reason about the consequences of the purchasing decision; the revenue function is \(\mathrm{Rev}(p_t, v_t) = p_t \mathbf{1}_{v_t \ge p_t}\)
- Strategic buyer: reasons about how purchasing decisions affect future prices
8 Single myopic buyer
- Realizable setting: the valuation is fixed but unknown, \(v_t = v \in [0, 1]\)
- Stochastic setting: valuations are sampled from an unknown distribution, \(v_t \sim D\)
- Adversarial setting: no assumption is made on valuations
- Seller's goal: minimize regret
9 Single myopic buyer
[Figure: the seller posts prices $5, $6, $4, $7 and the buyer answers Yes, Yes, Yes, No.]
10 Fixed valuation
- \(v_t = v \in [0, 1]\)
- Regret: \(R = Tv - \sum_{t=1}^{T} \mathrm{Rev}(p_t, v_t)\)
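As a concrete reading of the revenue function from slide 7 and the regret on slide 10, here is a minimal Python sketch. The function names and the example numbers are illustrative assumptions, not taken from the talk.

```python
def revenue(price, valuation):
    """Rev(p, v): the myopic buyer pays the price only if the valuation covers it."""
    return price if valuation >= price else 0.0


def fixed_valuation_regret(prices, valuation):
    """R = T * v - sum_t Rev(p_t, v) for a fixed valuation v."""
    horizon = len(prices)
    earned = sum(revenue(p, valuation) for p in prices)
    return horizon * valuation - earned


# Hypothetical run: valuation 0.6 and three posted prices (the middle one is rejected).
print(fixed_valuation_regret([0.5, 0.7, 0.6], 0.6))
```

Regret grows both when the price overshoots (no sale) and when it undershoots (money left on the table), which is what the search procedures on the next slides trade off.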
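11 Binary Search
- At round k: \(S_k = [a_k, a_k + \Delta_k]\), \(s = 0\) and \(\Delta_{k+1} = \Delta_k / 2\)
- While the price is accepted: \(p_t = a_k + s \Delta_{k+1}\); \(s = s + 1\)
- Rejection: start a new round; \(a_{k+1}\) is the last accepted price
- \(\Delta_k < 1/T\): stop, offer \(p_t = a_k\) for all subsequent \(t \le T\)
[Figure: posted prices \(p_t\) inside the interval from \(a_k\) to \(a_k + \Delta_k\), relative to the valuation \(v\).]
12 Fast Search
- Kleinberg and Leighton 2007
- At round k: \(S_k = [a_k, a_k + \Delta_k]\), \(s = 0\) and \(\Delta_{k+1} = \Delta_k^2\)
- While the price is accepted: \(p_t = a_k + s \Delta_{k+1}\); \(s = s + 1\)
- Rejection: start a new round; \(a_{k+1}\) is the last accepted price
- \(\Delta_k < 1/T\): stop, offer \(p_t = a_k\) for all subsequent \(t \le T\)
[Figure: several posted prices \(p_t\) spaced \(\Delta_{k+1}\) apart inside the interval from \(a_k\) to \(a_k + \Delta_k\), relative to the valuation \(v\).]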
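The two search procedures differ only in how the interval width shrinks between rounds, so one sketch can cover both. This is a hedged illustration rather than the authors' code: `posted_price_search`, `halve` and `square` are hypothetical names, the counter starts at s = 1 (the left endpoint is already known to be acceptable), and the first step of the squaring rule is forced to 1/2 because squaring a width of 1 would never shrink it.

```python
def posted_price_search(valuation, horizon, shrink):
    """Post `horizon` prices to a myopic buyer whose fixed valuation lies in [0, 1].

    `shrink` maps the current interval width to the step used in the next round.
    """
    prices = []
    low, width = 0.0, 1.0  # invariant: low <= valuation <= low + width
    while len(prices) < horizon and width >= 1.0 / horizon:
        step = shrink(width)
        last_accepted = low
        s = 1  # assumption: skip s = 0, since the price `low` is known to be accepted
        while len(prices) < horizon:
            price = low + s * step
            prices.append(price)
            if valuation >= price:  # sale: move up one step
                last_accepted = price
                s += 1
            else:  # no sale: the round ends
                break
        low, width = last_accepted, step
    prices.extend([low] * (horizon - len(prices)))  # exploit the last accepted price
    return prices


def halve(width):
    return width / 2.0


def square(width):
    # min(...) is an assumption of this sketch: it keeps the first step at 1/2
    return min(width * width, width / 2.0)


for rule in (halve, square):
    offered = posted_price_search(0.3, 1000, rule)
    rejections = sum(1 for p in offered if p > 0.3)
    regret = sum(0.3 - (p if p <= 0.3 else 0.0) for p in offered)
    print(rule.__name__, rejections, round(regret, 3))
```

Each rejection costs the full valuation for that round, which is why the squaring rule, with far fewer rounds, is attractive even though it posts more prices per round.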
13 Kleinberg and Leighton search
Analysis:
- In each round there is at most one no-sale.
- For each sale, the regret is at most \(\Delta_k\), and there are at most \(\Delta_k / \Delta_{k+1} = 1/\Delta_k\) sales, so the total regret per round is O(1).
- Because the width is squared in every round, there are \(O(\log \log T)\) rounds before \(\Delta_k < 1/T\), so the total regret is \(O(\log \log T)\).
14 Kleinberg and Leighton search
- Regret: \(R \in O(\log \log T)\)
- Lower bound: \(\Omega(\log \log T)\)
15 Multiple valuations
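16 Bandits
- Expected revenue curve: \(R(p) = \mathbb{E}_v[\mathrm{Rev}(p, v)]\)
- Discretize the prices in [0, 1], then apply bandit algorithms (UCB, EXP3)
[Figure: expected revenue curve R(p) plotted against the price p on [0, 1].]
17 Random valuation
- Valuation: \(v_t \sim D\)
- Regret: \(R = T \max_p \mathbb{E}_{v_t \sim D}[\mathrm{Rev}(p, v_t)] - \mathbb{E}\big[\sum_{t=1}^{T} \mathrm{Rev}(p_t, v_t)\big]\)
- General strategy: discretize prices and treat each price as a bandit arm
- Without any assumptions, \(\tilde{O}(T^{2/3})\): balance the discretization error and the error in UCB
- Can be improved for special families of distributions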
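The "discretize, then apply bandits" strategy can be sketched with the standard UCB1 index over an evenly spaced price grid. The grid, the Uniform[0, 1] valuations and the name `ucb_pricing` are assumptions made for this illustration; the talk does not specify them.

```python
import math
import random


def ucb_pricing(sample_valuation, horizon, num_prices):
    """Run UCB1 over a price grid; each reward is Rev(p, v), which lies in [0, 1]."""
    grid = [(k + 1) / num_prices for k in range(num_prices)]
    pulls = [0] * num_prices
    earned = [0.0] * num_prices

    for t in range(1, horizon + 1):
        if t <= num_prices:
            arm = t - 1  # post every price once before using confidence bounds
        else:
            arm = max(
                range(num_prices),
                key=lambda k: earned[k] / pulls[k]
                + math.sqrt(2.0 * math.log(t) / pulls[k]),
            )
        price = grid[arm]
        reward = price if sample_valuation() >= price else 0.0
        pulls[arm] += 1
        earned[arm] += reward

    return grid, pulls, sum(earned)


rng = random.Random(0)
horizon = 5000
num_prices = round(horizon ** (1 / 3))  # about T^(1/3) arms balances the two error terms
grid, pulls, total = ucb_pricing(rng.random, horizon, num_prices)
print("most posted price:", grid[pulls.index(max(pulls))])
print("average revenue per round:", total / horizon)
```

A finer grid lowers the discretization error but gives the bandit more arms to explore, which is the balance the slide refers to.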
18 Random valuation
- Expected revenue function \(\mathbb{E}_{v \sim D}[\mathrm{Rev}(p, v)]\) is unimodal
- Unimodal Lipschitz bandits [Combes, Proutiere 2014]: \(\tilde{O}(\sqrt{T})\)
- If the revenue curve is quadratic around the maximum, then Kleinberg and Leighton also give an \(\tilde{O}(\sqrt{T})\) regret algorithm, which is tight in this class.
19 Adversarial Valuations
- Compete against the best fixed-price policy: \(R = \mathbb{E}\big[\max_{p} \sum_{t=1}^{T} \mathrm{Rev}(p, v_t) - \sum_{t=1}^{T} \mathrm{Rev}(p_t, v_t)\big]\)
- General approach: discretize prices in K intervals and treat each as an arm. Use EXP3 [Kleinberg and Leighton 07]
- \(R = \tilde{O}(\sqrt{KT}) + O(T/K) = \tilde{O}(T^{2/3})\), where the first term is the EXP3 regret and the second is the discretization regret
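The choice of K behind the \(T^{2/3}\) rate can be checked with a short calculation. Constants and logarithmic factors are ignored here, which is an assumption of this sketch:

```latex
\sqrt{KT} = \frac{T}{K}
\;\Longleftrightarrow\; K^{3/2} = T^{1/2}
\;\Longleftrightarrow\; K = T^{1/3},
\qquad\text{so that}\qquad
\sqrt{KT} = \frac{T}{K} = T^{2/3}.
```

With fewer arms the discretization term dominates; with more arms the EXP3 term does.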
20 Contextual Pricing
- Each product is represented by a context \(x_t \in \mathbb{R}^d\), \(\|x_t\|_2 \le 1\)
- Buyer valuation is a dot product: \(v_t = \langle \theta, x_t \rangle\)
- The weight vector \(\theta\) is fixed but unknown, \(\|\theta\|_2 \le 1\)
- Regret is: \(R = \sum_{t=1}^{T} \big(v_t - \mathrm{Rev}(p_t, v_t)\big)\)
- Can we draw a connection with online learning?
21 Contextual Pricing
- Stochastic gradient gives \(\tilde{O}(\sqrt{T})\) regret [Amin et al. 2014]
- Cohen, Lobel, Paes Leme, Vladu, Schneider: \(R = O(d \log T)\)
- Algorithm based on the ellipsoid method
- Keep knowledge sets: \(S_0 = \{\theta \in \mathbb{R}^d : \|\theta\|_2 \le 1\}\)
- For each \(x_t\) we know \(v_t \in [a_t, b_t]\), where \(a_t = \min_{\theta \in S_t} \langle \theta, x_t \rangle\) and \(b_t = \max_{\theta \in S_t} \langle \theta, x_t \rangle\)
22 Contextual Pricing
- If \(b_t - a_t \le 1/T\) then we are done.
- If not, guess \(p_t \in [a_t, b_t]\)
- Update the knowledge set to either \(S_{t+1} = \{\theta \in S_t : \langle \theta, x_t \rangle \le p_t\}\) (price rejected) or \(S_{t+1} = \{\theta \in S_t : \langle \theta, x_t \rangle \ge p_t\}\) (price accepted)
[Figure: the knowledge set \(S_{t+1}\) obtained by cutting \(S_t\) along the direction \(x_t\).]
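One way to make the knowledge-set idea concrete is to keep the set as an ellipsoid, for which the interval of possible valuations has a closed form and a cut is the textbook central-cut ellipsoid update. The sketch below is only that: it omits the regularization variants named in the talk, it calls the unknown weight vector `theta`, and the function names, the threshold `epsilon` and the clamping of prices at 0 (valuations assumed nonnegative) are all assumptions.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def value_interval(center, shape, x):
    """Range of <theta, x> over {theta : (theta - c)^T A^{-1} (theta - c) <= 1}."""
    shape_x = [dot(row, x) for row in shape]  # A x (A is symmetric)
    half_width = math.sqrt(max(dot(x, shape_x), 0.0))
    mid = dot(center, x)
    return mid - half_width, mid + half_width, shape_x, half_width


def ellipsoid_pricing(theta, contexts, epsilon):
    """Price each context; cut the ellipsoid whenever the value interval exceeds epsilon."""
    d = len(theta)  # the update below divides by d*d - 1, so it needs d >= 2
    center = [0.0] * d
    shape = [[float(i == j) for j in range(d)] for i in range(d)]  # start from the unit ball
    regret, explore_rounds = 0.0, 0
    for x in contexts:
        value = dot(theta, x)
        low, high, shape_x, half_width = value_interval(center, shape, x)
        if high - low <= epsilon:
            price = max(low, 0.0)  # narrow interval: post the safe lower end, no update
            sold = value >= price
        else:
            explore_rounds += 1
            price = max(0.5 * (low + high), 0.0)  # guess the midpoint of the interval
            sold = value >= price
            b = [c / half_width for c in shape_x]
            sign = 1.0 if sold else -1.0  # a sale keeps the half with <theta, x> >= midpoint
            center = [c + sign * bi / (d + 1) for c, bi in zip(center, b)]
            scale = d * d / (d * d - 1.0)
            shape = [
                [scale * (shape[i][j] - 2.0 / (d + 1) * b[i] * b[j]) for j in range(d)]
                for i in range(d)
            ]
        regret += value - (price if sold else 0.0)
    return regret, explore_rounds


contexts = [[1.0, 0.0], [0.0, 1.0], [0.6, 0.8]] * 100
print(ellipsoid_pricing([0.6, 0.8], contexts, 0.01))
```

Once the interval along a context direction is narrow, the seller stops cutting and posts the lower end, so regret after that point is at most `epsilon` per round.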
23 Contextual Pricing
- Theorem: Setting \(p_t = \tfrac{1}{2}(a_t + b_t)\) has \(\Omega(2^d \log T)\) regret.
- Theorem: Ellipsoid regularization has \(O(d^2 \log T)\) regret.
- Theorem: Cylindrification regularizer has \(O(d \log T)\) regret.
- Theorem: Squaring trick has regret \(O(d^4 \log \log T)\).
24 Strategic Buyers
25 Strategic buyers What happens if buyers know the seller will adapt prices?
26 Setup
- Buyer's valuation \(v_t\)
- Seller offers price \(p_t\)
- Buyer accepts (\(a_t = 1\)) or rejects (\(a_t = 0\))
- Discount factor \(\gamma\)
- Buyer optimizes \(\mathbb{E}\big[\sum_{t=1}^{T} \gamma^t a_t (v_t - p_t)\big]\)
- Seller maximizes revenue \(\mathbb{E}\big[\sum_{t=1}^{T} a_t p_t\big]\)
27 Three scenarios
- Fixed value \(v_t = v\) [Amin et al. 2013, Mohri and Muñoz 2014, Drutsa 2017]
- Random valuation \(v_t \sim D\) [Amin et al. 2013, Mohri and Muñoz 2015]
- Contextual valuation \(v_t = \langle \theta, x_t \rangle\) with \(x_t \sim D\) [Amin et al. 2014]
28 Game setup
- Seller selects a pricing algorithm
- Seller announces the algorithm to the buyer
- Buyer can play strategically
29 Measuring regret
- Best fixed price in hindsight?
- Example: real value = 8, fake value = 1. The seller asks $4? $2? $1? and the buyer answers No, No, YES!
- \(p_t = 4, 2, 1, 1, 1, 1, \ldots\)
- \(a_t = 0, 0, 1, 1, 1, 1, \ldots\)
30 Strategic Regret
- Compare against the best possible outcome
- Fixed valuation: \(R = Tv - \sum_{t=1}^{T} a_t p_t\)
- Random valuation: \(R = T \max_p \mathbb{E}[\mathrm{Rev}(p, v_t)] - \mathbb{E}\big[\sum_{t=1}^{T} a_t p_t\big]\)
- Contextual valuation: \(R = \mathbb{E}\big[\sum_{t=1}^{T} (v_t - a_t p_t)\big]\)
31 The Buyer
- Knowledge of the future incentivizes the buyer to lie
- Lie: the buyer rejects even if his value is greater than the reserve price
32 How can we reduce the number of lies?
33 Warm up
- Monotone algorithms [Amin et al. 2013]
- Choose \(\beta < 1\)
- Offer prices \(p_t = \beta^t\)
- If accepted, offer that price for the remaining rounds
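To see why a strategic buyer may reject prices below their value under this scheme, the sketch below brute-forces the buyer's best first-acceptance round against geometrically decreasing prices. The decay rate is called `beta` and the discount factor `gamma` here; the function names and the numbers are illustrative assumptions.

```python
def best_acceptance_round(valuation, beta, gamma, horizon):
    """First round a discounting buyer accepts when prices are p_t = beta**t.

    An accepted price is offered for all remaining rounds, so the buyer's
    only real decision is the first acceptance round tau.
    """
    best_round, best_utility = None, 0.0  # None means never buying is optimal
    for tau in range(1, horizon + 1):
        price = beta ** tau
        discounted_rounds = sum(gamma ** t for t in range(tau, horizon + 1))
        utility = discounted_rounds * (valuation - price)
        if utility > best_utility:
            best_round, best_utility = tau, utility
    return best_round


def seller_regret(valuation, beta, gamma, horizon):
    """T * v minus the revenue collected from the buyer's best response."""
    tau = best_acceptance_round(valuation, beta, gamma, horizon)
    collected = 0.0 if tau is None else (horizon - tau + 1) * beta ** tau
    return horizon * valuation - collected


# Hypothetical comparison: an impatient buyer versus a patient one.
for gamma in (0.01, 0.99):
    print(gamma, best_acceptance_round(0.5, 0.9, gamma, 100), seller_regret(0.5, 0.9, gamma, 100))
```

The impatient buyer accepts the first price below their value, while the patient buyer keeps rejecting acceptable prices to lock in a lower one.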
34 Warm up
- Decrease slowly to make lies costly
- Not too slowly, or regret accumulates
- Regret in \(O\big(\frac{\sqrt{T}}{1 - \gamma}\big)\)
- Lower bound: \(\Omega\big(\log \log T + \frac{1}{1 - \gamma}\big)\)
35 Better guarantees
- Fast search with penalized rejections [Mohri and Muñoz 2014]: every time a price is rejected, offer it again for several rounds. Regret in \(O\big(\frac{\log T}{1 - \gamma}\big)\)
- Horizon-independent guarantees [Drutsa 2017]. Regret in \(O\big(\frac{\log \log T}{1 - \gamma}\big)\)
36 Random valuations
- Valuation \(v_t \sim D\)
- Regret: \(R = T \max_p \mathbb{E}[\mathrm{Rev}(p, v_t)] - \mathbb{E}\big[\sum_{t=1}^{T} a_t p_t\big]\)
- UCB-type algorithm with slowly decreasing confidence bounds [Mohri and Muñoz 2015]
- Regret in \(O\big(\sqrt{T} + \frac{1}{\log(1/\gamma)} T^{1/4}\big)\)
37 Contextual Valuation
- Explore-exploit algorithm with longer explore time [Amin et al.]
- Regret in \(O\big(\frac{T^{2/3}}{\sqrt{\log(1/\gamma)}}\big)\)
38 Related Work
- Revenue optimization in second price auctions [Cesa-Bianchi et al. 2013]
- Modeling buyers as regret minimizers [Nekipelov et al. 2015]
- Selling to no-regret buyers [Heidari et al. 2017, Braverman et al. 2017]
- Selling to patient buyers [Feldman et al. 2016]
39 Open problems
- Contextual valuations without realizability assumptions
- Strategic buyers with adversarial valuations
- Online learning algorithms in general auctions [Roughgarden 2016]
- Multiple strategic buyers
40 Revenue from Multiple Buyers (Pricing -> Auctions)
41 Multiple buyers
[Figure: three buyers with values $100, $1000, and $50.]
42 Multi-buyer Setup
- N buyers with valuations \(v_i \in [0, 1]\) drawn from distributions \(D_i\)
- An auction A is an allocation \(x_i : [0, 1]^N \to \{0, 1\}\) and a payment \(p_i : [0, 1]^N \to \mathbb{R}\)
- Revenue: \(\mathrm{Rev}(A) = \sum_{i=1}^{N} p_i\)
- Goal: maximize \(\mathbb{E}_{v_1, \ldots, v_N}[\mathrm{Rev}(A)]\)
- Notation: given a valuation vector \((v_1, \ldots, v_N)\), \((v, v_{-i}) = (v_1, \ldots, v_{i-1}, v, v_{i+1}, \ldots, v_N)\)
43 Conditions on auction
- The object can only be allocated once: \(\sum_{i=1}^{N} x_i \le 1\)
- Individual rationality (IR): \(u_i = v_i x_i - p_i \ge 0\)
- Incentive compatibility (IC): \(v_i x_i(v_i, v_{-i}) - p_i(v_i, v_{-i}) \ge v_i x_i(v, v_{-i}) - p_i(v, v_{-i})\)
44 Why IC?
- Buyers truthfully reveal how much they are willing to pay.
- Makes the auction stable
- Allows learning
45 Some IC auctions
- Second price auction: allocate to the buyer with the highest \(v_i\) and charge the second-highest value.
- \(x_i = 1 \iff v_i = \max_j v_j\)
- \(p_i = \max_{j \ne i} v_j\) if \(x_i = 1\); 0 otherwise
46 Second price auction
[Figure: bids of $100, $1000, and $50.]
47 IC auctions
- Second price with reserve price r: allocate to the highest bidder if \(v_i \ge r\). Charge \(p_i = \max(r, \max_{j \ne i} v_j)\)
- \(x_i = 1\) if \(v_i \ge \max(\max_j v_j, r)\)
- \(p_i = \max(\max_{j \ne i} v_j, r)\) if \(x_i = 1\)
48 Second Price Auction With Reserve
[Figure: bids of $100, $1000, and $50, with example reserves r = $2000 and r = $900.]
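The allocation and payment rules for the second-price auction, with and without a reserve, fit in a few lines. The sketch below uses the bids from slides 46 and 48; the function name and the tie-breaking rule (lowest index wins) are assumptions.

```python
def second_price_auction(bids, reserve=0.0):
    """Return (winner index or None, payment) for a second-price auction with a reserve."""
    winner = max(range(len(bids)), key=lambda i: bids[i])  # ties go to the lowest index
    if bids[winner] < reserve:
        return None, 0.0  # nobody clears the reserve, so there is no sale
    runner_up = max((b for i, b in enumerate(bids) if i != winner), default=0.0)
    return winner, max(reserve, runner_up)


bids = [100, 1000, 50]
print(second_price_auction(bids))                # no reserve
print(second_price_auction(bids, reserve=900))   # reserve between the top two bids
print(second_price_auction(bids, reserve=2000))  # reserve above every bid
```

A reserve between the two highest bids raises the payment, while a reserve above the highest bid loses the sale entirely.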
49 Myerson Auction
[Figure: bidder 1: $100, $600, $90; bidder 2: $1000, $500; bidder 3: $50, $300.]
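50 Some IC auctions
- Myerson's auction: pick a monotone bid deformation \(\phi_i(\cdot)\)
- \(x_i = 1 \iff \phi_i(v_i) = \max_j \phi_j(v_j)\) and \(\phi_i(v_i) > 0\)
- \(p_i = \phi_i^{-1}\big(\max(\max_{j \ne i} \phi_j(v_j), 0)\big)\) if \(x_i = 1\), 0 otherwise
- If \(\phi_i = \phi\) for all i: \(x_i = 1 \iff v_i = \max_j v_j\) and \(p_i = \phi^{-1}\big(\max(\max_{j \ne i} \phi(v_j), 0)\big) = \max(\max_{j \ne i} v_j, \phi^{-1}(0))\)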
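The rule above can be sketched for arbitrary monotone deformations supplied together with their inverses. As an assumed example, valuations are taken to be Uniform[0, 1], for which the virtual value \(v - (1 - F(v))/f(v)\) equals \(2v - 1\); with identical bidders the auction then behaves like a second-price auction with reserve 0.5. The function names are hypothetical.

```python
def myerson_auction(bids, phis, phi_inverses):
    """Return (winner index or None, payment) given deformations and their inverses."""
    virtual = [phi(b) for phi, b in zip(phis, bids)]
    winner = max(range(len(bids)), key=lambda i: virtual[i])
    if virtual[winner] <= 0:
        return None, 0.0  # no positive deformed bid, so the item stays unsold
    rival = max((virtual[j] for j in range(len(bids)) if j != winner), default=0.0)
    threshold = max(rival, 0.0)
    # The winner pays the smallest bid that would still have won.
    return winner, phi_inverses[winner](threshold)


def phi_uniform(v):
    """Virtual value for Uniform[0, 1] valuations (an assumed example distribution)."""
    return 2.0 * v - 1.0


def phi_uniform_inverse(y):
    return (y + 1.0) / 2.0


print(myerson_auction([0.9, 0.6, 0.2], [phi_uniform] * 3, [phi_uniform_inverse] * 3))
print(myerson_auction([0.7, 0.3], [phi_uniform] * 2, [phi_uniform_inverse] * 2))
print(myerson_auction([0.4, 0.3], [phi_uniform] * 2, [phi_uniform_inverse] * 2))
```

In the second call the runner-up has a negative deformed bid, so the winner pays \(\phi^{-1}(0)\), which plays the role of the reserve price.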
51 Myerson Auction
- Optimal auction if \(v_i \sim D_i\) independently
- If \(D_i\) is known, the functions \(\phi_i\) can be calculated exactly
- What about unknown distributions?
- Can we learn the optimal monotone functions?
- What is the sample complexity?
52 Sample Complexity of Auctions
- N bidders
- Valuations \(v_i \sim D_i\), independent
- Observe Nm samples \(v_{i,1}, \ldots, v_{i,m} \sim D_i\), \(i \in \{1, \ldots, N\}\)
- Find an auction A such that \(\mathbb{E}[\mathrm{Rev}(A)] \ge (1 - \epsilon) \max_{A'} \mathbb{E}[\mathrm{Rev}(A')]\)
- Can we use empirical revenue optimization? \(\max_A \frac{1}{m} \sum_{j=1}^{m} \sum_{i=1}^{N} p_i(v_{1j}, \ldots, v_{Nj})\)
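For a single buyer (N = 1) an auction is just a posted price, and empirical revenue optimization reduces to picking the price that maximizes average revenue on the sample. A minimal sketch, with hypothetical function names and made-up sample values:

```python
def empirical_revenue(price, samples):
    """Average of Rev(price, v) over the sampled valuations."""
    return sum(price for v in samples if v >= price) / len(samples)


def empirical_best_price(samples):
    """Empirical revenue maximizer; some sampled value is always an optimal price."""
    return max(samples, key=lambda p: empirical_revenue(p, samples))


samples = [1, 5, 8, 10]  # made-up valuations
best = empirical_best_price(samples)
print(best, empirical_revenue(best, samples))
```

Only sampled values need to be tried because raising a price up to the next sampled value never loses a sale within the sample.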
53 Lower bounds on sample complexity
- Proof for a single buyer [Huang et al. 2015]
- Problem reduces to finding the optimal price for a distribution
- Need at least \(1/\epsilon^2\) samples to get a \(1 - \epsilon\) approximation
54 Idea of the proof
- Two similar distributions \(D_1\), \(D_2\) with \(\mathrm{KL}(D_1 \| D_2) = \epsilon^2\)
- Need \(1/\epsilon^2\) samples to distinguish them w.h.p.
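55 Revenue curves
- The sets of approximately optimal reserve prices for the two distributions are disjoint
- If an algorithm optimizes revenue for both distributions, it must be able to distinguish them
[Figure: revenue curves \(\mathbb{E}_{v \sim D_1}[\mathrm{Rev}(r, v)]\) and \(\mathbb{E}_{v \sim D_2}[\mathrm{Rev}(r, v)]\) plotted against the reserve price r.]
56 Upper bounds on sample complexity
- Auctions are parametrized by increasing functions \(\phi_i\)
- The pseudo-dimension of increasing functions is infinite!
- Restrict the class and measure the approximation error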
57 t-level auctions
[Figure: example bids including $100 and $50.]
58 t-level auctions
- Morgenstern and Roughgarden 2016
- Rank candidates using t-step functions
- Pseudo-dimension bounded by \(O(Nt \log Nt)\)
- Best t-level auction is a \(1 - \frac{1}{t}\) approximation
59 t-level auctions
- Theorem: Let \(t = 1/\epsilon\). Using a sample of size \(m = \tilde{O}(N/\epsilon^3)\), the t-level auction \(\hat{A}\) maximizing empirical revenue is a \(1 - \epsilon\) approximation to the optimal auction.
60 Algorithm
- Cole and Roughgarden 2015, Huang et al.
- In summary, optimize auctions over all increasing functions
- Proof for finite support
- Extension by discretization
- \(O(1/\epsilon^3)\) samples
61 Is this enough?
62 Features in auctions
- In practice valuations are not i.i.d.
- They depend on features (context)
- Dependency is not realizable in general
- Algorithm of Huang et al. can be generalized to 1 feature
63 Display ads
- Millions of auctions
- Parametrized by publisher information, time of day, ...
- Dependency of valuations on features is not clear
64 Setup
- Single-buyer auction: find the optimal reserve price
- Observe a sample \((x_1, v_1), \ldots, (x_m, v_m)\) from a distribution D over \(X \times [0, 1]\)
- Hypotheses \(h : X \to \mathbb{R}\)
- Goal: find \(\max_{h \in H} \mathbb{E}_{(x, v) \sim D}[\mathrm{Rev}(h(x), v)]\)
65 Revenue function
- Non-concave
- Non-differentiable
- Discontinuous
- Is it possible to learn?
66 Learning Theory
- Theorem [Mohri and Muñoz 2013]: given a sample of size m, with high probability the following bound holds uniformly for all \(h \in H\):
  \(\mathbb{E}[\mathrm{Rev}(h(x), v)] - \frac{1}{m} \sum_{i=1}^{m} \mathrm{Rev}(h(x_i), v_i) \le O\Big(\sqrt{\frac{\mathrm{PDim}(H)}{m}}\Big)\)
- Space of linear functions?
67 Can we do empirical maximization?
68 The revenue function
69 Revenue function
- Non-concave
- Non-differentiable
- Discontinuous
- Is it possible to optimize?
70 Surrogates
- Loss similar to the 0-1 loss
- Can we optimize a concave surrogate reward?
71 Calibration
- We say a function \(R : \mathbb{R} \times \mathbb{R} \to \mathbb{R}\) is calibrated with respect to Rev if for any distribution D we have \(\operatorname{argmax}_r \mathbb{E}_v[R(r, v)] \subseteq \operatorname{argmax}_r \mathbb{E}_v[\mathrm{Rev}(r, v)]\)
72 Surrogates Theorem [Mohri and Muñoz 2013]: Any concave function that is calibrated is constant.
73 Continuous Surrogates
- Remove the discontinuity
- Difference of concave functions
- DC algorithm for linear hypothesis class [Mohri and Muñoz 2013]
74 Optimization Issues
- Sequential algorithm
- Not scalable
75 Other class of functions?
76 Clustering
- Muñoz and Vassilvitskii 2017
- Show that attainable revenue is related to the variance of the distribution
- Cluster features to have low variance of valuations
- Revenue is related to the quality of the clusters
77 Related problems
- Dynamic reserves for repeated auctions [Kanoria and Nazerzadeh 2017]
- New complexity measures [Syrgkanis 2017]
- Combinatorial auction sample complexity [Morgenstern and Roughgarden 2016, Balcan et al. 2016]
- Optimal auction design with neural networks [Dütting et al. 2017]
78 Conclusion
- Revenue optimization is a crucial practical problem
- Machine learning techniques have yielded new theory and algorithms in this field
- We need to better understand the relationship between buyers and sellers
- There are several open problems still out there
79 Thank you!
Rollout Allocation Strategies for Classification-based Policy Iteration V. Gabillon, A. Lazaric & M. Ghavamzadeh firstname.lastname@inria.fr Workshop on Reinforcement Learning and Search in Very Large
More informationParkes Auction Theory 1. Auction Theory. Jacomo Corbo. School of Engineering and Applied Science, Harvard University
Parkes Auction Theory 1 Auction Theory Jacomo Corbo School of Engineering and Applied Science, Harvard University CS 286r Spring 2007 Parkes Auction Theory 2 Auctions: A Special Case of Mech. Design Allocation
More informationTDT4171 Artificial Intelligence Methods
TDT47 Artificial Intelligence Methods Lecture 7 Making Complex Decisions Norwegian University of Science and Technology Helge Langseth IT-VEST 0 helgel@idi.ntnu.no TDT47 Artificial Intelligence Methods
More information978 J.-J. LAFFONT, H. OSSARD, AND Q. WONG
978 J.-J. LAFFONT, H. OSSARD, AND Q. WONG As a matter of fact, the proof of the later statement does not follow from standard argument because QL,,(6) is not continuous in I. However, because - QL,,(6)
More informationD I S C O N T I N U O U S DEMAND FUNCTIONS: ESTIMATION AND PRICING. Rotterdam May 24, 2018
D I S C O N T I N U O U S DEMAND FUNCTIONS: ESTIMATION AND PRICING Arnoud V. den Boer University of Amsterdam N. Bora Keskin Duke University Rotterdam May 24, 2018 Dynamic pricing and learning: Learning
More informationThe Duo-Item Bisection Auction
Comput Econ DOI 10.1007/s10614-013-9380-0 Albin Erlanson Accepted: 2 May 2013 Springer Science+Business Media New York 2013 Abstract This paper proposes an iterative sealed-bid auction for selling multiple
More informationRegret Minimization and Correlated Equilibria
Algorithmic Game heory Summer 2017, Week 4 EH Zürich Overview Regret Minimization and Correlated Equilibria Paolo Penna We have seen different type of equilibria and also considered the corresponding price
More informationAuctions. Michal Jakob Agent Technology Center, Dept. of Computer Science and Engineering, FEE, Czech Technical University
Auctions Michal Jakob Agent Technology Center, Dept. of Computer Science and Engineering, FEE, Czech Technical University AE4M36MAS Autumn 2015 - Lecture 12 Where are We? Agent architectures (inc. BDI
More informationA truthful Multi Item-Type Double-Auction Mechanism. Erel Segal-Halevi with Avinatan Hassidim Yonatan Aumann
A truthful Multi Item-Type Double-Auction Mechanism Erel Segal-Halevi with Avinatan Hassidim Yonatan Aumann Intro: one item-type, one unit Buyers: Value Sellers: Erel Segal-Halevi et al 3 Multi Item Double
More information16 MAKING SIMPLE DECISIONS
253 16 MAKING SIMPLE DECISIONS Let us associate each state S with a numeric utility U(S), which expresses the desirability of the state A nondeterministic action a will have possible outcome states Result(a)
More informationSequential Decision Making
Sequential Decision Making Dynamic programming Christos Dimitrakakis Intelligent Autonomous Systems, IvI, University of Amsterdam, The Netherlands March 18, 2008 Introduction Some examples Dynamic programming
More informationOnline Network Revenue Management using Thompson Sampling
Online Network Revenue Management using Thompson Sampling Kris Johnson Ferreira David Simchi-Levi He Wang Working Paper 16-031 Online Network Revenue Management using Thompson Sampling Kris Johnson Ferreira
More information6.231 DYNAMIC PROGRAMMING LECTURE 5 LECTURE OUTLINE
6.231 DYNAMIC PROGRAMMING LECTURE 5 LECTURE OUTLINE Stopping problems Scheduling problems Minimax Control 1 PURE STOPPING PROBLEMS Two possible controls: Stop (incur a one-time stopping cost, and move
More informationRevenue Management with Forward-Looking Buyers
Revenue Management with Forward-Looking Buyers Posted Prices and Fire-sales Simon Board Andy Skrzypacz UCLA Stanford June 4, 2013 The Problem Seller owns K units of a good Seller has T periods to sell
More informationCS599: Algorithm Design in Strategic Settings Fall 2012 Lecture 6: Prior-Free Single-Parameter Mechanism Design (Continued)
CS599: Algorithm Design in Strategic Settings Fall 2012 Lecture 6: Prior-Free Single-Parameter Mechanism Design (Continued) Instructor: Shaddin Dughmi Administrivia Homework 1 due today. Homework 2 out
More informationPricing a Low-regret Seller
Hoda Heidari Mohammad Mahdian Umar Syed Sergei Vassilvitskii Sadra Yazdanbod HODA@CIS.UPENN.EDU MAHDIAN@GOOGLE.COM USYED@GOOGLE.COM SERGEIV@GOOGLE.COM YAZDANBOD@GATECH.EDU Abstract As the number of ad
More informationCharacterization of the Optimum
ECO 317 Economics of Uncertainty Fall Term 2009 Notes for lectures 5. Portfolio Allocation with One Riskless, One Risky Asset Characterization of the Optimum Consider a risk-averse, expected-utility-maximizing
More informationYao s Minimax Principle
Complexity of algorithms The complexity of an algorithm is usually measured with respect to the size of the input, where size may for example refer to the length of a binary word describing the input,
More informationAlgorithmic Game Theory
Algorithmic Game Theory Lecture 10 06/15/10 1 A combinatorial auction is defined by a set of goods G, G = m, n bidders with valuation functions v i :2 G R + 0. $5 Got $6! More? Example: A single item for
More informationSF2972 GAME THEORY Infinite games
SF2972 GAME THEORY Infinite games Jörgen Weibull February 2017 1 Introduction Sofar,thecoursehasbeenfocusedonfinite games: Normal-form games with a finite number of players, where each player has a finite
More informationMulti-armed bandits in dynamic pricing
Multi-armed bandits in dynamic pricing Arnoud den Boer University of Twente, Centrum Wiskunde & Informatica Amsterdam Lancaster, January 11, 2016 Dynamic pricing A firm sells a product, with abundant inventory,
More informationOn Approximating Optimal Auctions
On Approximating Optimal Auctions (extended abstract) Amir Ronen Department of Computer Science Stanford University (amirr@robotics.stanford.edu) Abstract We study the following problem: A seller wishes
More informationCS 188: Artificial Intelligence
CS 188: Artificial Intelligence Markov Decision Processes Dan Klein, Pieter Abbeel University of California, Berkeley Non-Deterministic Search 1 Example: Grid World A maze-like problem The agent lives
More informationMean-Variance Analysis
Mean-Variance Analysis Mean-variance analysis 1/ 51 Introduction How does one optimally choose among multiple risky assets? Due to diversi cation, which depends on assets return covariances, the attractiveness
More informationConstrained Sequential Resource Allocation and Guessing Games
4946 IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 54, NO. 11, NOVEMBER 2008 Constrained Sequential Resource Allocation and Guessing Games Nicholas B. Chang and Mingyan Liu, Member, IEEE Abstract In this
More informationPrice Discrimination As Portfolio Diversification. Abstract
Price Discrimination As Portfolio Diversification Parikshit Ghosh Indian Statistical Institute Abstract A seller seeking to sell an indivisible object can post (possibly different) prices to each of n
More informationA simulation study of two combinatorial auctions
A simulation study of two combinatorial auctions David Nordström Department of Economics Lund University Supervisor: Tommy Andersson Co-supervisor: Albin Erlanson May 24, 2012 Abstract Combinatorial auctions
More information