Utilities and Decision Theory. Lirong Xia



Checking conditional independence from BN graph
- Given random variables $Z_1, \dots, Z_p$, we are asked whether $X \perp Y \mid Z_1, \dots, Z_p$
  - dependent if there exists a path where all triples are active
  - independent if for each path, there exists an inactive triple

General method for variable elimination
- Compute a marginal probability $p(x_1, \dots, x_p)$ in a Bayesian network
  - Let $Y_1, \dots, Y_k$ denote the remaining variables
  - Step 1: fix an order over the $Y$'s (wlog $Y_1 > \dots > Y_k$)
  - Step 2: rewrite the summation as
    $f_0 \sum_{y_1} f_1 \sum_{y_2} f_2 \cdots \sum_{y_{k-1}} f_{k-1} \sum_{y_k} f_k$
    where $f_0$ involves only the $X$'s, $f_1$ involves only $Y_1$ and the $X$'s, $f_2$ involves only $Y_1, Y_2$ and the $X$'s, ..., $f_{k-1}$ involves only $Y_1, Y_2, \dots, Y_{k-1}$ and the $X$'s, and $f_k$ can involve anything
  - Step 3: variable elimination from right to left

Today
- Utility theory
  - expected utility: preferences over lotteries
  - maximum expected utility (MEU) principle

Expectimax Search Trees
- Expectimax search
  - Max nodes (our moves), as in minimax search
  - Chance nodes
    - Need to compute chance node values as expected utilities
- Next class we will formalize the underlying problem as a Markov decision process

Expectimax Pseudocode

    def value(s):
        if s is a max node: return maxvalue(s)
        if s is a chance node: return expvalue(s)
        if s is a terminal node: return evaluation(s)

    def maxvalue(s):
        values = [value(s') for s' in successors(s)]
        return max(values)

    def expvalue(s):
        values = [value(s') for s' in successors(s)]
        weights = [probability(s, s') for s' in successors(s)]
        return expectation(values, weights)
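
The pseudocode above can be turned into a small runnable sketch. The tree encoding used here (a bare number for a terminal node, and tagged tuples for max and chance nodes) is an assumption made for illustration; the slides do not specify a data structure.

```python
# Hypothetical tree encoding (not from the slides):
#   terminal node: a number (its utility)
#   max node:      ("max", [child, ...])
#   chance node:   ("chance", [(probability, child), ...])

def value(node):
    """Expectimax value of a node under the encoding described above."""
    if isinstance(node, (int, float)):
        return float(node)                      # terminal: its own utility
    kind, children = node
    if kind == "max":
        return max(value(child) for child in children)
    if kind == "chance":
        # Expected utility: probability-weighted average of the child values.
        return sum(p * value(child) for p, child in children)
    raise ValueError(f"unknown node type: {kind!r}")

tree = ("max", [
    ("chance", [(0.5, 8), (0.5, 24)]),    # expected value 16
    ("chance", [(0.25, 40), (0.75, 0)]),  # expected value 10
])
print(value(tree))  # 16.0
```

The max node picks the first chance node because its expected value (16) exceeds that of the second (10), even though the second contains the single best outcome (40).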

Expectimax Quantities
[Figure not recovered]

Maximum expected utility
- Principle of maximum expected utility:
  - A rational agent should choose the action that maximizes its expected utility, given its (probabilistic) knowledge
- Questions:
  - Where do utilities come from?
  - How do we know such utilities even exist?
  - What if our behavior can't be described by utilities?

Inference with Bayes' Rule
- Example: diagnostic probability from causal probability:
  $p(\text{Cause} \mid \text{Effect}) = \frac{p(\text{Effect} \mid \text{Cause})\, p(\text{Cause})}{p(\text{Effect})}$
- Example:
  - F is fire, $\{f, \neg f\}$
  - A is the alarm, $\{a, \neg a\}$
  - $p(f) = 0.001$, $p(a) = 0.1$, $p(a \mid f) = 0.9$
  - $p(f \mid a) = \frac{p(a \mid f)\, p(f)}{p(a)} = \frac{0.9 \times 0.001}{0.1} = 0.009$
  - Note: posterior probability of fire still very small
  - Note: you should still run when hearing an alarm! Why?
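
As a quick check of the arithmetic, the fire/alarm posterior can be computed directly. This is a minimal sketch; the function name `posterior` is a hypothetical helper, and the numbers are the ones given on the slide.

```python
def posterior(likelihood, prior, evidence):
    """Bayes' rule: p(cause | effect) = p(effect | cause) * p(cause) / p(effect)."""
    if evidence <= 0:
        raise ValueError("p(effect) must be positive")
    return likelihood * prior / evidence

# p(a | f) = 0.9, p(f) = 0.001, p(a) = 0.1 (values from the slide)
p_fire_given_alarm = posterior(likelihood=0.9, prior=0.001, evidence=0.1)
print(round(p_fire_given_alarm, 6))  # 0.009
```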

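[Figure: decision tree. Action "stay": fire with probability 0.009, no fire with probability 0.991. Action "run out": a single outcome with probability 100%.]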

Utilities
- Utilities are functions from outcomes (states of the world, sample space) to real numbers that represent an agent's preferences
- Where do utilities come from?
  - In a game, may be simple (+1/-1)
  - Utilities summarize the agent's goals
[Figure: outcomes labeled with utilities -10100, 100, and -100]

Preferences over lotteries
- An agent chooses among:
  - Prizes: A, B, etc.
  - Lotteries: situations with uncertain prizes
- Notation:
  - $L = [p, A;\ (1-p), B]$: lottery L yields A with probability p and B with probability 1-p
  - $A \succ B$: A is strictly preferred to B
  - $A \sim B$: indifference between A and B
  - $A \succeq B$: A is strictly preferred to or indifferent with B

Utility theory in Economics
- State of the world: money you earn
  - Money does not behave as a utility function, but we can talk about the utility of having money (or being in debt)
- Which would you prefer?
  - A lottery ticket that pays out $10 with probability 0.5 and $0 otherwise, or
  - A lottery ticket that pays out $1 with probability 1
- How about:
  - A lottery ticket that pays out $100,000,000 with probability 0.5 and $0 otherwise, or
  - A lottery ticket that pays out $10,000,000 with probability 1
- Usually, people do not simply go by expected value

- Which one would you prefer?
  - Lottery A: $1M @ 100%
  - Lottery B: $1M @ 89% + $5M @ 10% + $0 @ 1%
- How about
  - Lottery A: $1M @ 11% + $0 @ 89%
  - Lottery B: $0 @ 90% + $5M @ 10%
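
One way to see why this pair of questions is interesting is to write both comparisons in terms of expected utility. The shorthand $u_0$, $u_1$, $u_5$ below is an assumption introduced here for the utilities of $0, $1M, and $5M; it does not appear in the slides.

```latex
% Assumed shorthand: u_0 = U(\$0), u_1 = U(\$1\text{M}), u_5 = U(\$5\text{M})
\begin{align*}
\text{First pair, } A \succ B:\quad
  & u_1 > 0.89\,u_1 + 0.10\,u_5 + 0.01\,u_0 \\
  \iff\quad & 0.11\,u_1 > 0.10\,u_5 + 0.01\,u_0 \\[4pt]
\text{Second pair, } A \succ B:\quad
  & 0.11\,u_1 + 0.89\,u_0 > 0.90\,u_0 + 0.10\,u_5 \\
  \iff\quad & 0.11\,u_1 > 0.10\,u_5 + 0.01\,u_0
\end{align*}
```

Both comparisons reduce to the same inequality, so an agent that maximizes expected utility and picks Lottery A in the first pair would also have to pick Lottery A in the second, whatever values the utilities take.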

Encoding preferences over lotteries
- How many lotteries? Infinitely many!
- Need to find a compact representation
  - Maximum expected utility (MEU) principle
  - Which types of preferences (rankings over lotteries) can be represented by MEU?

Rational Preferences
- We want some constraints on preferences before we call them rational
- For example: an agent with intransitive preferences can be induced to give away all of its money
  - If $B \succ C$, then an agent with C would pay (say) 1 cent to get B
  - If $A \succ B$, then an agent with B would pay (say) 1 cent to get A
  - If $C \succ A$, then an agent with A would pay (say) 1 cent to get C
- Transitivity rules this out: $(A \succ B) \wedge (B \succ C) \Rightarrow (A \succ C)$
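
The money-pump argument can be simulated in a few lines. This is only a sketch: the function `money_pump` and the starting budget are hypothetical, while the cyclic preferences and the 1-cent fee come from the slide.

```python
# Cyclic strict preferences from the slide: A over B, B over C, C over A.
# Each pair (x, y) means "x is strictly preferred to y".
PREFERS = {("A", "B"), ("B", "C"), ("C", "A")}

def money_pump(item, cents, fee=1):
    """Trade up repeatedly, paying `fee` cents per trade, until the money runs out."""
    trades = 0
    while cents >= fee:
        # The item this agent strictly prefers to the one it currently holds.
        item = next(x for x, y in PREFERS if y == item)
        cents -= fee
        trades += 1
    return item, cents, trades

print(money_pump("C", cents=9))  # ('C', 0, 9)
```

After nine trades the agent holds the item it started with and has no money left, which is exactly the exploitation the slide describes.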

Rational Preferences
- Preferences of a rational agent must obey constraints
  - The axioms of rationality: for all lotteries A, B, C
    - Orderability: $(A \succ B) \vee (B \succ A) \vee (A \sim B)$
    - Transitivity: $(A \succ B) \wedge (B \succ C) \Rightarrow (A \succ C)$
    - Continuity: $A \succ B \succ C \Rightarrow \exists p\ [p, A;\ 1-p, C] \sim B$
    - Substitutability: $A \sim B \Rightarrow [p, A;\ 1-p, C] \sim [p, B;\ 1-p, C]$
    - Monotonicity: $A \succ B \Rightarrow (p \ge q \Leftrightarrow [p, A;\ 1-p, B] \succeq [q, A;\ 1-q, B])$
- Theorem: rational preferences imply behavior describable as maximization of expected utility

MEU Principle
- Theorem [Ramsey, 1931; von Neumann & Morgenstern, 1944]:
  - Given any preference satisfying these axioms, there exists a real-valued function U such that:
    $U(A) > U(B) \iff A \succ B$
    $U([p_1, S_1; \dots; p_n, S_n]) = \sum_i p_i U(S_i)$
- Maximum expected utility (MEU) principle:
  - Choose the action that maximizes expected utility
- Utilities are just a representation!
  - An agent can be entirely rational (consistent with MEU) without ever representing or manipulating utilities and probabilities
- Utilities are NOT money
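
The MEU principle can be illustrated with the two lottery-ticket questions from the "Utility theory in Economics" slide. The utility function below (logarithmic in money) is purely an assumption chosen for illustration, and `expected_utility` and `meu_choice` are hypothetical helper names.

```python
import math

def expected_utility(lottery, U):
    """Expected utility of a lottery given as [(probability, prize), ...]."""
    return sum(p * U(prize) for p, prize in lottery)

def meu_choice(lotteries, U):
    """Name of the lottery with maximum expected utility."""
    return max(lotteries, key=lambda name: expected_utility(lotteries[name], U))

# Assumed utility of money (not from the slides): log(1 + dollars), which is concave.
def U(dollars):
    return math.log1p(dollars)

small = {"risky": [(0.5, 10), (0.5, 0)], "sure": [(1.0, 1)]}
large = {"risky": [(0.5, 100_000_000), (0.5, 0)], "sure": [(1.0, 10_000_000)]}

print(meu_choice(small, U))  # risky
print(meu_choice(large, U))  # sure
```

Under this assumed utility the same agent takes the gamble when the stakes are small and the sure payout when they are large, although in both cases the risky ticket has five times the expected monetary value.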

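What would you do?
[Figure: decision tree. Action "stay": fire with probability 0.009, no fire with probability 0.991. Action "run out": a single outcome with probability 100%. Leaf utilities shown: -10100, 100, -100.]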

Common types of utilities

Risk attitudes
- An agent is risk-neutral if she only cares about the expected value of the lottery ticket
- An agent is risk-averse if she always prefers the expected value of the lottery ticket to the lottery ticket
  - Most people are like this
- An agent is risk-seeking if she always prefers the lottery ticket to the expected value of the lottery ticket

Decreasing marginal utility
- Typically, at some point, having an extra dollar does not make people much happier (decreasing marginal utility)
[Figure: utility as a function of money. $200: buy a bike (utility = 1); $1500: buy a car (utility = 2); $5000: buy a nicer car (utility = 3)]

Maximizing expected utility
[Figure: utility as a function of money. $200: buy a bike (utility = 1); $1500: buy a car (utility = 2); $5000: buy a nicer car (utility = 3)]
- Lottery 1: get $1500 with probability 1
  - gives expected utility 2
- Lottery 2: get $5000 with probability 0.4, $200 otherwise
  - gives expected utility 0.4*3 + 0.6*1 = 1.8
  - (expected amount of money = 0.4*$5000 + 0.6*$200 = $2120 > $1500)
- So: maximizing expected utility is consistent with risk aversion

Different possible risk attitudes under expected utility maximization
[Figure: four utility curves (green, blue, red, grey) plotted as utility against money]
- Green has decreasing marginal utility → risk-averse
- Blue has constant marginal utility → risk-neutral
- Red has increasing marginal utility → risk-seeking
- Grey's marginal utility is sometimes increasing, sometimes decreasing → neither risk-averse (everywhere) nor risk-seeking (everywhere)
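
The link between the shape of the utility curve and the risk attitude can be checked numerically on a single lottery. This is a sketch under stated assumptions: the three utility functions (square root, identity, square) and the 50/50 lottery over $0 and $100 are hypothetical examples, and checking one lottery only illustrates the definitions, which quantify over all lotteries.

```python
import math

def attitude_on(lottery, U, tol=1e-9):
    """Compare U(expected money) with expected utility on one lottery."""
    expected_money = sum(p * x for p, x in lottery)
    expected_utility = sum(p * U(x) for p, x in lottery)
    utility_of_expectation = U(expected_money)
    if abs(utility_of_expectation - expected_utility) <= tol:
        return "risk-neutral"
    if utility_of_expectation > expected_utility:
        return "risk-averse"    # prefers the sure expected value
    return "risk-seeking"       # prefers the lottery

lottery = [(0.5, 0.0), (0.5, 100.0)]       # hypothetical 50/50 lottery
print(attitude_on(lottery, math.sqrt))        # risk-averse  (decreasing marginal utility)
print(attitude_on(lottery, lambda x: x))      # risk-neutral (constant marginal utility)
print(attitude_on(lottery, lambda x: x * x))  # risk-seeking (increasing marginal utility)
```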

Example: Insurance
- Because people ascribe different utilities to different amounts of money, insurance agreements can increase both parties' expected utility
- You own a car. Your lottery: L_Y = [0.8, $0; 0.2, -$200], i.e., 20% chance of crashing
- You do not want -$200! Insurance is $50
  - U_Y(L_Y) = 0.2 * U_Y(-$200) = -200
  - U_Y(-$50) = -150

Amount    Your utility U_Y
$0        0
-$50      -150
-$200     -1000

Example: Insurance (continued)
- Your lottery, as before: L_Y = [0.8, $0; 0.2, -$200], with U_Y(L_Y) = 0.2 * U_Y(-$200) = -200 and U_Y(-$50) = -150
- Insurance company buys risk: L_I = [0.8, $50; 0.2, -$150], i.e., $50 revenue + your L_Y
- Insurer is risk-neutral: U(L) = U(EMV(L)), where EMV is the expected monetary value
  - U_I(L_I) = U(0.8*50 + 0.2*(-150)) = U($10) > U($0)
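
The numbers in the insurance example can be verified with a short script. The helper `expected` is a hypothetical name; the utility table and the two lotteries are taken from the slides.

```python
# Utility table for "you", copied from the slide (amounts in dollars).
U_YOU = {0: 0, -50: -150, -200: -1000}

def expected(lottery, f=lambda x: x):
    """Expectation of f(outcome) over a lottery [(probability, outcome), ...]."""
    return sum(p * f(x) for p, x in lottery)

L_Y = [(0.8, 0), (0.2, -200)]    # your lottery without insurance
L_I = [(0.8, 50), (0.2, -150)]   # insurer: $50 premium plus your lottery

eu_uninsured = expected(L_Y, U_YOU.get)  # 0.8*0 + 0.2*(-1000)
u_insured = U_YOU[-50]                   # pay the $50 premium for sure
emv_insurer = expected(L_I)              # a risk-neutral insurer looks at money only

print(round(eu_uninsured, 6), u_insured, round(emv_insurer, 6))  # -200.0 -150 10.0
```

You prefer paying the premium (utility -150 rather than -200), and the insurer expects to earn $10, so both sides are better off than without the contract.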

Acting optimally over time
- Finite number of periods:
  - Overall utility = sum of rewards in individual periods
- Infinite number of periods:
  - Are we just going to add up the rewards over infinitely many periods? The sum can be infinite!
- (Limit of) average payoff: $\lim_{n \to \infty} \frac{1}{n} \sum_{1 \le t \le n} r(t)$
  - Limit may not exist
- Discounted payoff: $\sum_t \gamma^t r(t)$ for some $\gamma < 1$
  - Interpretations of discounting:
    - Interest rate r: $\gamma = 1/(1+r)$
    - World ends with some probability $1 - \gamma$
- Discounting is mathematically convenient
  - We will see more in the next class
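
To see why discounting keeps the total finite, the discounted payoff can be evaluated for longer and longer horizons. This is a minimal sketch: it assumes time starts at t = 0 and uses a constant reward of 1 per period, neither of which is specified on the slide, and the function names are hypothetical.

```python
def discounted_payoff(rewards, gamma):
    """Sum of gamma**t * r(t), with t starting at 0 (an assumed convention)."""
    return sum(gamma ** t * r for t, r in enumerate(rewards))

def gamma_from_interest(rate):
    """Discount factor implied by an interest rate r: gamma = 1 / (1 + r)."""
    return 1.0 / (1.0 + rate)

gamma = 0.9
for horizon in (10, 100, 1000):
    # Constant reward of 1 per period (assumed); the undiscounted sum would be `horizon`.
    print(horizon, discounted_payoff([1.0] * horizon, gamma))
```

With a constant reward of 1 the undiscounted sum grows without bound, while the discounted sum approaches 1 / (1 - gamma) = 10 as the horizon grows.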