Data-Driven Demand Learning and Dynamic Pricing Strategies in Competitive Markets
Pricing Strategies & Dynamic Programming
Rainer Schlosser, Martin Boissier, Matthias Uflacker
Hasso Plattner Institute (EPIC)
April 30, 2018
Outline
- Questions/Support: Market Simulation (Exercise)
- Today: Pricing Strategies & Duopoly Games
- Learn about Dynamic Programming
- 1st Exercise: Simulated Markets & Price Reactions
Recall: Rule-Based Price Reaction Strategies
Idea: (1) Observe competitor prices $p$, (2) adjust price $a$.
Examples:
- $a^{(1)}(p) := \max\{c,\ \min_{k=1,\dots,K} p_k - \varepsilon\}$
- $a^{(n)}(p) := \max_{a \in A} \{a : \operatorname{rank}(a, p) \le n\}$
- $a^{(\text{random})}(p) :=$ if $u < 0.5$ then $a^{(1)}(p)$ else $a^{(2)}(p)$, with $u \sim U(0,1)$
- $a^{(\text{2 bound})}(p) := a^{(1)}(p)$ if $p_{\min} < \min_{k=1,\dots,K} p_k$, else $p_{\max}$
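The rule-based strategies recalled here can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the default cost `cost=1.0`, the undercut step `eps=1.0`, and the tie-breaking in `rank` (only strictly cheaper competitors count) are hypothetical choices, not taken from the slides.

```python
import random

def undercut(prices, cost=1.0, eps=1.0):
    """a^(1): undercut the cheapest competitor by eps, never below cost (assumed defaults)."""
    return max(cost, min(prices) - eps)

def rank_n(prices, n, admissible):
    """a^(n): largest admissible price whose rank among all offers is at most n."""
    def rank(a):
        # Assumption: rank 1 = cheapest; only strictly cheaper competitors count.
        return 1 + sum(p < a for p in prices)
    return max(a for a in admissible if rank(a) <= n)

def two_bound(prices, p_min, p_max, cost=1.0, eps=1.0):
    """Two-bound rule: undercut while the market is above p_min, otherwise reset to p_max."""
    if min(prices) > p_min:
        return undercut(prices, cost, eps)
    return p_max

def randomized(prices, admissible, rng=random, cost=1.0, eps=1.0):
    """Random mix: with probability 0.5 play a^(1), otherwise the rank-2 strategy a^(2)."""
    if rng.random() < 0.5:
        return undercut(prices, cost, eps)
    return rank_n(prices, 2, admissible)
```

For example, `undercut([10, 12])` returns 9, while `two_bound([5, 12], p_min=6, p_max=50)` resets the price to 50 because the cheapest competitor has dropped below the lower bound.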
Duopoly Example
- Assume $K = 2$ sellers.
- Assume only one feature: price.
- Define different price reaction strategies $a(p)$, i.e., if the competitor's current price is $p$, we adjust our price to $a(p)$.
- Admissible prices are $a(p) \in \{1, 2, \dots, 100\}$.
- Let the competitor's response strategy be given by $p(a) := \max(a - 1, 1)$.
- We adjust our prices $a$ at times $t = 1, 2, 3, \dots$
- The competitor adjusts his prices $p$ at times $t = 0.5, 1.5, 2.5, \dots$
Sequence of Events (Duopoly Example)
- In every interval $(t, t + 0.5)$, $t = 0, 0.5, 1.0, \dots$, a sale occurs with probability $1 - \min(a_t, p_t)/100$.
- With probability $\min(a_t, p_t)/100$ no sale takes place.
Duopoly Example
- If a sale takes place, the customer chooses either our offer ($k = 1$) or the competitor's offer ($k = 2$) with probability $P(k, \vec{p})$ according to Approach I, where $\vec{p} = (p^{(1)}, p^{(2)}) = (a, p)$, i.e., $p^{(1)} = a$ (we) and $p^{(2)} = p$ (competitor).
- Simulate until time $T = 1000$. Start with $a_0 = p_0 = 20$ at time $t = 0$.
- Which strategy $a(p)$ performs best, i.e., maximizes expected revenues?
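A minimal simulation sketch of this duopoly could look as follows. The choice model "Approach I" is not defined in this part of the slides, so `choice_prob_ours` is a loudly labeled stand-in (the cheaper offer wins, ties are split evenly). All function names are hypothetical.

```python
import random

def choice_prob_ours(a, p):
    """ASSUMPTION (stand-in for 'Approach I'): cheaper offer wins, ties split 50/50."""
    if a < p:
        return 1.0
    if a > p:
        return 0.0
    return 0.5

def competitor_response(a):
    """Competitor's rule from the slides: p(a) = max(a - 1, 1)."""
    return max(a - 1, 1)

def half_period_revenue(a, p, rng):
    """Our revenue in one interval of length 0.5 with fixed prices (a, p)."""
    if rng.random() < min(a, p) / 100:
        return 0.0  # no sale takes place
    # A sale occurs; it is ours with the assumed choice probability.
    return float(a) if rng.random() < choice_prob_ours(a, p) else 0.0

def simulate(strategy, T=1000, a0=20, p0=20, seed=0):
    """Total revenue of reaction strategy a(p) over T periods."""
    rng = random.Random(seed)
    a, p = a0, p0
    revenue = 0.0
    for t in range(T):
        if t > 0:
            a = strategy(p)                    # we react at t = 1, 2, 3, ...
        revenue += half_period_revenue(a, p, rng)
        p = competitor_response(a)             # competitor reacts at t + 0.5
        revenue += half_period_revenue(a, p, rng)
    return revenue
```

Comparing, say, `simulate(lambda p: max(p - 1, 1))` with `simulate(lambda p: 50 if p < 10 else p - 1)` over several seeds gives a rough feeling for which reaction rules earn more under the assumed choice model.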
Goals for Today
- We want to optimally solve the duopoly example.
- We have a dynamic optimization problem.
- What are dynamic optimization problems?
- How to apply dynamic programming techniques?
What are Dynamic Optimization Problems?
- How to control a dynamic system over time?
- Instead of a single decision we have a sequence of decisions.
- The system evolves over time according to a certain dynamic.
- The decisions are supposed to be chosen such that a certain objective/quantity/criterion is optimized.
- Find the right balance between short- and long-term effects.
Examples Please!
Examples: Inventory Replenishment, Reservoir Dam, Drinking at a Party, Exam Preparation, Brand Advertising, Used Cars, Eating Cake
Task: Describe & Classify
- Goal/Objective
- State of the System
- Actions
- Dynamic of the System
- Revenues/Costs
- Finite/Infinite Horizon
- Stochastic Components
Classification
Example              Objective         State      Action         Dynamic         Payments
Inventory Mgmt.      min costs         #items     #order         entry-sales     order/holding
Reservoir Dam        provide power     #water     #production    rain-outflow    none
Drinking at Party*   max fun                      #beer          impact-rehab    fun/money
Exam Preparation*    max mark/effort   #learned   #learn         learn-forget    effort, mark
Advertising          max profits       image      #advertise     effect-forget   campaigns
Used Cars            min costs         age        replace(y/n)   aging/faults    buy/repair costs
Eating Cake*         max utility       %cake      #eat           outflow         utility
* Finite horizon
General Problem Description
- What do you want to minimize or maximize? (Objective)
- Define the state of your system. (State)
- Define the set of possible actions (state dependent). (Actions)
- Quantify event probabilities (state + action dependent). (Dynamics) (!!)
- Define payments (state + action + event dependent). (Payments)
- What happens at the end of the time horizon? (Final Payment)
Dynamic Pricing Scenario (Duopoly Example)
- We want to sell items in a duopoly setting with finite horizon.
- We can observe the competitor's prices and adjust our prices (for free).
- We can anticipate the competitor's price reaction.
- We know sales probabilities for various situations.
- We want to maximize total expected profits.
Problem Description (Duopoly Example)
Framework:      $t = 0, 1, 2, \dots, T$          Discrete time periods
State:          $s = p$                            Competitor's price
Actions:        $a \in A = \{1, \dots, 100\}$     Offer prices (for one period of time)
Dynamic:        $P(i, a, s)$                       Probability to sell $i$ items at price $a$
Payments:       $R(i, a, s) = i \cdot a$           Realized profit
New State:      $p \to F(a)$                       State transition / price reaction $F$
Initial State:  $s_0 = p_0 = 20$                   Competitor's price in $t = 0$
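Problem Formulation
Find a dynamic pricing strategy $(a_t)_t$ that maximizes total expected (discounted) profits:
$$\max \; E\left[ \sum_{t=0}^{T} \delta^t \cdot \sum_{i \ge 0} P(i, a_t, S_t) \cdot i \cdot a_t \;\middle|\; S_0 = s_0 \right], \quad 0 < \delta \le 1$$
Here $\delta$ is the discount factor, $P(i, a_t, S_t)$ is the probability to sell $i$ items at price $a_t$ in situation $S_t$, $a_t$ is the sales price, and $s_0$ is the initial state.
- What are admissible policies? Answer: Feedback strategies.
- How to solve such problems? Answer: Dynamic programming.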
Solution Approach (Dynamic Programming)
- What is the best expected value of having the chance to sell ... items from time $t$ on, starting in market situation $s$?
- Answer: That's easy: $V_t(s)$!
- ?????
- We have renamed the problem. Awesome. But that is a solution approach!
- We don't know the value function $V$, but $V$ has to satisfy the relation
  Value (state today) = Best expected (profit today + Value (state tomorrow))
Solution Approach (Dynamic Programming)
Value (state today) = Best expected (profit today + Value (state tomorrow))
Idea: Consider the transition dynamics within one period. What can happen during one time interval?
state today   #sales   profit   state tomorrow   probability
$s = p$       0        0        $F(a)$           $P(0, a, s)$
              1        $a$      $F(a)$           $P(1, a, s)$
              2        $2a$     $F(a)$           $P(2, a, s)$
The probabilities cover both price constellations within the period: $(a, p)$ vs. $(a, F(a))$.
What does that mean for the value of state $s = p$, i.e., $V_t(s) = V_t(p)$?
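One way to obtain the one-period probabilities $P(i, a, p)$ for $i = 0, 1, 2$ is to combine the two half-periods, first with prices $(a, p)$ and then with $(a, F(a))$. The sketch below does this under two assumptions that are not confirmed by the slides: the two half-periods are independent, and the customer choice model is a simple "cheaper offer wins" rule standing in for Approach I. All names are hypothetical.

```python
def choice_prob_ours(a, p):
    """ASSUMPTION (stand-in for 'Approach I'): cheaper offer wins, ties split 50/50."""
    if a < p:
        return 1.0
    if a > p:
        return 0.0
    return 0.5

def F(a):
    """Competitor's price reaction from the slides: max(a - 1, 1)."""
    return max(a - 1, 1)

def sell_prob(a, p):
    """Probability that WE sell one item in a half-period with prices (a, p)."""
    return (1 - min(a, p) / 100) * choice_prob_ours(a, p)

def sales_distribution(a, p):
    """[P(0,a,p), P(1,a,p), P(2,a,p)] for one full period (assumed independence)."""
    q1 = sell_prob(a, p)        # first half: competitor still at p
    q2 = sell_prob(a, F(a))     # second half: competitor has reacted to a
    return [
        (1 - q1) * (1 - q2),            # no sale in either half
        q1 * (1 - q2) + (1 - q1) * q2,  # exactly one sale
        q1 * q2,                        # one sale in each half
    ]
```

With $a = 10$ and $p = 20$ this gives roughly $(0.1, 0.9, 0)$: we win the first half-period whenever a sale occurs and lose the second one, because the competitor undercuts to 9.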
Bellman Equation
$$V_t(p) = \max_{a \ge 0} \Big\{ P(0, a, p) \cdot \big( 0 \cdot a + \delta \cdot V_{t+1}(F(a)) \big) + P(1, a, p) \cdot \big( 1 \cdot a + \delta \cdot V_{t+1}(F(a)) \big) + \dots \Big\}$$
- $P(0, a, p)$: probability not to sell; $P(1, a, p)$: probability to sell one item
- $i \cdot a$: today's profit
- $\delta \cdot V_{t+1}(F(a))$: best discounted expected future profits
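Optimal Solution
We finally obtain the Bellman equation:
$$V_t(p) = \max_{a \ge 0} \Big\{ \sum_{i \ge 0} P(i, a, p) \cdot \big( i \cdot a + \delta \cdot V_{t+1}(F(a)) \big) \Big\}$$
Here $P(i, a, p)$ is the probability, $i \cdot a$ is today's profit, and $\delta \cdot V_{t+1}(F(a))$ is the best discounted expected future profit of the new state.
- Ok, but why is that interesting? Answer: Because $a_t^*(p) = \arg\max_{a \in A} \{\dots\}$ is the optimal policy.
- Ok! Now, we just need to compute the value function!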
How to solve the Bellman Equation?
- Using the terminal condition $V_T(p) := 0$ for time horizon $T$ (e.g., 1000), we can compute the value function recursively for all $t$, $p$:
$$V_t(p) = \max_{a \ge 0} \Big\{ \sum_{i \ge 0} P(i, a, p) \cdot \big( i \cdot a + \delta \cdot V_{t+1}(F(a)) \big) \Big\}$$
- The optimal strategy $a_t^*(p)$, $t = 0, \dots, T-1$, $p = 1, \dots, 100$, is determined by the arg max of the value function.
How to solve the Bellman Equation?
In pseudo-code, with terminal value $V_T(p) = 0$:
    param V{t in 0..T, p in A} :=
        if t < T then max {a in A} sum {i in I} P[i,a,p] * ( i*a + delta * V[t+1,F[a]] )
        else 0;
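The same backward recursion can be sketched in Python. This is a hedged illustration rather than the course's reference solution: the sales model (independent half-periods, cheaper offer wins as a stand-in for Approach I) and all function names are assumptions.

```python
def choice_prob_ours(a, p):
    """ASSUMPTION (stand-in for 'Approach I'): cheaper offer wins, ties split 50/50."""
    return 1.0 if a < p else (0.0 if a > p else 0.5)

def F(a):
    """Competitor's price reaction from the slides: max(a - 1, 1)."""
    return max(a - 1, 1)

def sales_distribution(a, p):
    """[P(0,a,p), P(1,a,p), P(2,a,p)], assuming independent half-periods."""
    q1 = (1 - min(a, p) / 100) * choice_prob_ours(a, p)
    q2 = (1 - min(a, F(a)) / 100) * choice_prob_ours(a, F(a))
    return [(1 - q1) * (1 - q2), q1 * (1 - q2) + (1 - q1) * q2, q1 * q2]

def solve_finite_horizon(T=1000, delta=1.0, prices=range(1, 101)):
    """Backward induction; returns V_0 and the list of policies a*_t, t = 0..T-1."""
    A = list(prices)
    # Expected profit of one period: r[a, p] = sum_i P(i,a,p) * i * a.
    r = {(a, p): sum(i * a * P for i, P in enumerate(sales_distribution(a, p)))
         for a in A for p in A}
    V_next = {p: 0.0 for p in A}              # terminal condition V_T(p) = 0
    policy = []
    for t in range(T - 1, -1, -1):
        V_t, a_t = {}, {}
        for p in A:
            # The probabilities sum to 1, so the future term is delta * V_{t+1}(F(a)).
            best = max(A, key=lambda a: r[a, p] + delta * V_next[F(a)])
            a_t[p] = best
            V_t[p] = r[best, p] + delta * V_next[F(best)]
        policy.append(a_t)
        V_next = V_t
    policy.reverse()                          # policy[t][p] = a*_t(p)
    return V_next, policy
```

Calling `V0, policy = solve_finite_horizon(T=1000)` then gives the best expected profit `V0[20]` for the initial competitor price 20 and the first-period reaction `policy[0][20]`, both under the assumed sales model.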
Optimal Pricing Strategy of the Duopoly Example
[Figure panels: "Best expected profit" and "Optimal pricing strategy"]
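Infinite Horizon Problem
$$\max \; E\left[ \sum_{t=0}^{\infty} \delta^t \cdot \sum_{i \ge 0} P(i, a_t, S_t) \cdot i \cdot a_t \;\middle|\; S_0 = s_0 \right], \quad 0 < \delta < 1$$
Here $\delta$ is the discount factor, $P(i, a_t, S_t)$ is the probability to sell $i$ items at price $a_t$ in situation $S_t$, $a_t$ is the sales price, and $s_0$ is the initial state.
- Will the value function be time-dependent?
- Will the optimal price reaction strategy be time-dependent?
- What does the Bellman equation look like?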
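Solution of the Infinite Horizon Problem
$$V^*(p) = \max_{a \ge 0} \Big\{ \sum_{i \ge 0} P(i, a, p) \cdot \big( i \cdot a + \delta \cdot V^*(F(a)) \big) \Big\}$$
Here $P(i, a, p)$ is the probability, $i \cdot a$ is today's profit, and $\delta \cdot V^*(F(a))$ is the discounted expected future profit of the new state.
- Approximate solution: finite horizon approach (value iteration).
- For large $T$ the values $V_0(p)$ converge to the exact values $V^*(p)$.
- The optimal policy $a^*(p)$, $p = 1, \dots, 100$, is determined by the arg max of the last iteration step, i.e., $a_0(p)$.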
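Value iteration for the infinite horizon can be sketched as a generic routine plus a duopoly instance. The routine works on any model with a deterministic state transition that depends only on the action, as in the slides. The duopoly reward table again rests on the assumed sales model (independent half-periods, cheaper offer wins), and the names `value_iteration` and `duopoly_model` are hypothetical.

```python
def value_iteration(states, actions, reward, transition, delta=0.95,
                    tol=1e-8, max_iter=100_000):
    """Iterate V <- max_a { reward[a, s] + delta * V[transition(a)] } until stable."""
    V = {s: 0.0 for s in states}
    for _ in range(max_iter):
        V_new = {s: max(reward[a, s] + delta * V[transition(a)] for a in actions)
                 for s in states}
        gap = max(abs(V_new[s] - V[s]) for s in states)
        V = V_new
        if gap < tol:
            break
    # The policy is the arg max of the last iteration step.
    policy = {s: max(actions, key=lambda a: reward[a, s] + delta * V[transition(a)])
              for s in states}
    return V, policy

def duopoly_model(prices=range(1, 101)):
    """Duopoly instance under the ASSUMED choice model (cheaper offer wins)."""
    A = list(prices)

    def F(a):
        return max(a - 1, 1)      # competitor's reaction from the slides

    def sell_prob(a, p):
        choice = 1.0 if a < p else (0.0 if a > p else 0.5)
        return (1 - min(a, p) / 100) * choice

    # Expected one-period profit: a times the expected number of our sales.
    reward = {(a, p): a * (sell_prob(a, p) + sell_prob(a, F(a)))
              for a in A for p in A}
    return A, A, reward, F
```

A call such as `V, policy = value_iteration(*duopoly_model(), delta=0.95)` then yields a stationary reaction strategy `policy[p]`. The stopping tolerance `tol` controls how close the result is to $V^*$.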
Exact Solution of the Infinite Horizon Problem
$$V^*(p) = \max_{a \ge 0} \Big\{ \sum_{i \ge 0} P(i, a, p) \cdot \big( i \cdot a + \delta \cdot V^*(F(a)) \big) \Big\}, \quad \forall p \in A$$
- We have to solve a system of nonlinear equations.
- Solvers can be applied, e.g., MINOS (see NEOS Server).
In pseudo-code:
    subject to NB {p in A}:
        V[p] = max {a in A} sum {i in I} P[i,a,p] * ( i*a + delta * V[F[a]] );
    solve;
Questions?
- State, Action, Events
- Dynamics & State Transitions
- Recursive Solution Principle
- General concept applicable to other problems
1st Exercise: Simulated Markets & Price Reactions
- Study our market simulation notebook.
- Modify the arrival stream of interested customers.
- Modify the customer behaviour (scoring weights).
- Modify the competitors' reaction strategies.
- Bonus (Dynamic Programming): Solve the duopoly example with finite horizon.
Overview
 2   April 24   Customer Behavior
 3   April 30   Pricing Strategies, 1st Homework (market simulation)
 4   May 8      Demand Estimation, 2nd Homework (demand learning)
 5   May 15     Introduction Price Wars Platform
 6   May 22     Warm-up Platform Exercise (in Groups)
 7   May 29     Dynamic Pricing Challenge / Projects
 8   June 5     no Meeting
 9   June 12    Workshop / Group Meetings
10   June 19    Presentations (First Results)
11   June 26    Workshop / Group Meetings
12   July 3     Workshop / Group Meetings
13   July 10    no Meeting
14   July 17    Presentations (Final Results), Feedback, Documentation (Aug/Sep)