Decision making in the presence of uncertainty


CS 271 Foundations of AI
Lecture 21: Decision making in the presence of uncertainty
Milos Hauskrecht, milos@cs.pitt.edu, 5329 Sennott Square

Decision-making in the presence of uncertainty
Many real-world problems require choosing future actions in the presence of uncertainty.
Examples: patient management, investment decisions.
Main issues:
- How to model the decision process in the computer?
- How to make decisions about actions in the presence of uncertainty?

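(Stochastic) decision tree
Node types: decision node, chance node, outcome (value) node.

Option     Outcomes (probability: value)    Expected value
Stock 1    0.6: 110, 0.4: 90                102
Stock 2    0.4: 140, 0.6: 80                104
Bank       1.0: 101                         101
Home       1.0: 100                         100

Decision tree: solution (expectimax)
Take the expectation at each chance node, then the maximum at the decision node. The best option is Stock 2, with expected value 104.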
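The one-step investment tree can be evaluated in a few lines. This is a minimal sketch, assuming leaf values 110/90, 140/80, 101 and 100 with branch probabilities 0.6/0.4 and 0.4/0.6 (a reading of the figure that reproduces the expected values 102, 104, 101 and 100); the names `tree`, `expected_value` and `best_action` are illustrative.

```python
# One-step decision tree: each action leads to a chance node,
# given as a list of (probability, value) pairs.
# The numbers are an assumed reading of the investment figure.
tree = {
    "Stock 1": [(0.6, 110), (0.4, 90)],
    "Stock 2": [(0.4, 140), (0.6, 80)],
    "Bank": [(1.0, 101)],
    "Home": [(1.0, 100)],
}


def expected_value(outcomes):
    """Expectation at a chance node."""
    return sum(p * v for p, v in outcomes)


# Expectation at every chance node, then maximum at the decision node.
expected = {action: expected_value(outcomes) for action, outcomes in tree.items()}
best_action = max(expected, key=expected.get)

print(expected, best_action)
```

Keeping the chance nodes as plain lists of pairs makes it easy to swap in a utility function later without changing the tree.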

Sequential (multi-step) problems
The decision tree can be built to capture multi-step decision problems:
- Choose an action
- Observe a stochastic outcome
- And repeat
How to make decisions for multi-step problems?
- Start from the leaves of the decision tree (outcome nodes)
- Compute expectations at chance nodes
- Maximize at the decision nodes
The algorithm is sometimes called expectimax.

Multi-step problem example
Assume:
- Two investment periods
- Two actions: stock and bank
[Figure: two-period decision tree over the stock and bank actions; each stock outcome (up or down) has probability 0.5.]
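The bottom-up procedure (expectations at chance nodes, maxima at decision nodes) can be sketched as a short recursion. The tree encoding and the two-period numbers below are hypothetical, chosen only to illustrate the recursion; they are not the values from the lecture figure.

```python
def expectimax(node):
    """Evaluate a decision tree bottom-up.

    A node is one of:
      a number                        -> outcome (leaf) value
      ("chance", [(p, child), ...])   -> expectation over children
      ("decision", {action: child})   -> maximum over actions
    """
    if isinstance(node, (int, float)):
        return node
    kind, children = node
    if kind == "chance":
        return sum(p * expectimax(child) for p, child in children)
    if kind == "decision":
        return max(expectimax(child) for child in children.values())
    raise ValueError(f"unknown node type: {kind}")


def best_action(decision_node):
    """Action with the highest expectimax value at a decision node."""
    _, children = decision_node
    return max(children, key=lambda a: expectimax(children[a]))


def second_period(wealth):
    """Hypothetical second-period choice: stock (+40% / -20%) or bank (+5%)."""
    return ("decision", {
        "stock": ("chance", [(0.5, wealth * 1.4), (0.5, wealth * 0.8)]),
        "bank": ("chance", [(1.0, wealth * 1.05)]),
    })


# Hypothetical first period starting from 100.
root = ("decision", {
    "stock": ("chance", [(0.5, second_period(140)), (0.5, second_period(80))]),
    "bank": ("chance", [(1.0, second_period(105))]),
})

root_value = expectimax(root)
root_action = best_action(root)
print(root_value, root_action)
```

The recursion visits every node once, so the cost is linear in the size of the tree; the tree itself grows exponentially with the number of periods.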

Conditioning in the decision tree
In the multi-step example every stock branch has probability 0.5, so the second-period probabilities do not depend on the first-period outcome. But this may not hold in general.
In decision trees, later outcomes can be conditioned on the earlier stochastic outcomes and actions.
Example: stock movement probabilities. Assume:
P(1st = up) = …, P(2nd = up | 1st = up) = …, P(2nd = up | 1st = down) = 0.5
[Figure: two-period stock tree in which the second-period branch probabilities depend on whether the first move was up or down.]

Trajectory payoffs
Outcome values at leaf nodes (e.g. monetary values) can include the rewards and costs accumulated along the path (trajectory).
Example: stock fees and gains. Assume:
- Fee per period: $5, paid at the beginning of the period
- Gain for up: 15%; loss for down: 10%
Starting from 1000:
- after the first fee: 1000 - 5
- after 1st up: (1000 - 5) * 1.15
- after the second fee: (1000 - 5) * 1.15 - 5
- after 2nd up: [(1000 - 5) * 1.15 - 5] * 1.15 = 1310.14
- after 2nd down: [(1000 - 5) * 1.15 - 5] * 0.9 = 1025.33
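The trajectory payoffs can be reproduced by applying the fee and the gain or loss period by period. This is a sketch assuming a starting amount of 1000 and a 10% loss on a down move, which is the reading that yields the leaf values 1310.14 and 1025.33 after rounding; the function name `invest` is illustrative.

```python
FEE = 5        # fee paid at the beginning of each period
UP = 1.15      # multiplier for an up move (+15%)
DOWN = 0.90    # multiplier for a down move (-10%, assumed)
START = 1000   # assumed initial amount


def invest(amount, moves):
    """Apply the fee, then the gain or loss, for each period in order."""
    for move in moves:
        amount = (amount - FEE) * (UP if move == "up" else DOWN)
    return amount


# Payoff at each leaf of the two-period stock tree.
payoffs = {
    moves: invest(START, moves)
    for moves in [("up", "up"), ("up", "down"), ("down", "up"), ("down", "down")]
}

for moves, value in payoffs.items():
    print(moves, value)
```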

Information-gathering actions
Many actions and their outcomes irreversibly change the world.
Information-gathering (exploratory) actions:
- make an inquiry about the world
- key benefit: reduction in the uncertainty
Example: medicine
- Assume a patient is admitted to the hospital with some set of initial complaints.
- We are uncertain about the underlying problem and consider surgery or medication to treat it.
- But there are often lab tests or observations that can help us determine more closely the disease the patient suffers from.
- Goal of lab tests: reduce the uncertainty of the outcomes of treatments so that a better treatment option can be chosen.

Decision-making with exploratory actions
In decision trees, exploratory actions can be represented and reasoned about the same way as other actions.
How do we capture the effect of exploratory actions in the decision tree model?
- Information obtained through exploratory actions may affect the probabilities of later outcomes.
- Recall that the probabilities of later outcomes can be conditioned on past observed outcomes and past actions.
- The sequence of past actions and outcomes is remembered within the decision tree branch.

Oil wildcatter problem
An oil wildcatter has to decide whether or not to drill on a specific site.
Chance of hitting an oil deposit: P(Oil = T) = 0.4, P(Oil = F) = 0.6
Cost of drilling: 70K
Payoffs: Oil: 220K, No oil: 0K
Outcomes when drilling: 220 - 70 = 150 (oil), -70 (no oil). Not drilling: 0.
Expected value of drilling: 0.4 * 150 + 0.6 * (-70) = 18, so drilling is better than not drilling.
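The drill versus no-drill comparison is a single expectation. A minimal sketch, assuming the figures are in thousands with a 0.4 chance of oil, a drilling cost of 70 and an oil payoff of 220; the variable names are illustrative.

```python
P_OIL = 0.4        # prior probability of hitting oil
DRILL_COST = 70    # cost of drilling (thousands)
PAYOFF_OIL = 220   # payoff if oil is found (thousands)
PAYOFF_DRY = 0     # payoff if the hole is dry

# Expectation at the chance node that follows the "drill" action.
ev_drill = (P_OIL * (PAYOFF_OIL - DRILL_COST)
            + (1 - P_OIL) * (PAYOFF_DRY - DRILL_COST))
ev_no_drill = 0.0

# Maximum at the decision node.
decision = "drill" if ev_drill > ev_no_drill else "do not drill"
print(ev_drill, decision)
```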

Oil wildcatter problem
Assume that in addition to the drill/no-drill choices we have an option to run the seismic resonance test.
Seismic resonance test results:
- Closed pattern (more likely when the hole holds oil)
- Diffuse pattern (more likely when it is empty)
Test cost: 10K

P(Seismic resonance test pattern | Oil):
Pattern    Oil = True    Oil = False
closed     0.8           0.3
diffuse    0.2           0.7

[Figure: decision tree extended with the test action and its closed and diffuse results.]

Compute outcomes
Payoff components: Oil: +220, Drill: -70, Test: -10

Action sequence and outcome     Value
No test, drill, oil             220 - 70 = 150
No test, drill, no oil          -70
No test, no drill               0
Test, drill, oil                220 - 70 - 10 = 140
Test, drill, no oil             -70 - 10 = -80
Test, no drill                  -10

The values on the test branches are the same for the closed and the diffuse pattern.

Compute probabilities
The tree with the test needs the probability of each test result and the probability of oil given each result. Both follow from the prior P(Oil) and the test table P(pattern | Oil).

Decision tree probabilities
Probability of each test result:
$P(\text{closed}) = P(\text{closed} \mid \text{Oil}=F)\,P(\text{Oil}=F) + P(\text{closed} \mid \text{Oil}=T)\,P(\text{Oil}=T) = 0.3 \times 0.6 + 0.8 \times 0.4 = 0.5$
$P(\text{diffuse}) = P(\text{diffuse} \mid \text{Oil}=F)\,P(\text{Oil}=F) + P(\text{diffuse} \mid \text{Oil}=T)\,P(\text{Oil}=T) = 0.7 \times 0.6 + 0.2 \times 0.4 = 0.5$

Probability of oil given the test result (Bayes rule):
$P(\text{Oil}=T \mid \text{closed}) = \frac{P(\text{closed} \mid \text{Oil}=T)\,P(\text{Oil}=T)}{P(\text{closed})} = \frac{0.8 \times 0.4}{0.5} = 0.64$
$P(\text{Oil}=F \mid \text{closed}) = \frac{P(\text{closed} \mid \text{Oil}=F)\,P(\text{Oil}=F)}{P(\text{closed})} = \frac{0.3 \times 0.6}{0.5} = 0.36$
$P(\text{Oil}=T \mid \text{diffuse}) = \frac{0.2 \times 0.4}{0.5} = 0.16$
$P(\text{Oil}=F \mid \text{diffuse}) = \frac{0.7 \times 0.6}{0.5} = 0.84$
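The probabilities on the test branches come from Bayes rule applied to the prior and the test table. The sketch below assumes P(Oil = T) = 0.4 and the pattern likelihoods 0.8/0.3 (closed) and 0.2/0.7 (diffuse); the dictionary layout and function names are illustrative.

```python
prior = {True: 0.4, False: 0.6}   # P(Oil)
likelihood = {                    # P(pattern | Oil)
    "closed": {True: 0.8, False: 0.3},
    "diffuse": {True: 0.2, False: 0.7},
}


def evidence(pattern):
    """P(pattern), summing over both oil states."""
    return sum(likelihood[pattern][oil] * prior[oil] for oil in prior)


def posterior(oil, pattern):
    """P(Oil = oil | pattern) by Bayes rule."""
    return likelihood[pattern][oil] * prior[oil] / evidence(pattern)


for pattern in likelihood:
    print(pattern, evidence(pattern),
          posterior(True, pattern), posterior(False, pattern))
```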

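Alternative model
[Figure: alternative drawing of the decision tree with the same probabilities (0.5, 0.5, 0.64, 0.36, 0.16, 0.84) and outcome values.]

Decision tree: expected values
No test:
- Drill: 0.4 * 150 + 0.6 * (-70) = 18
- Do not drill: 0
Test (cost 10):
- closed (probability 0.5): drill 0.64 * 140 + 0.36 * (-80) = 60.8; do not drill -10
- diffuse (probability 0.5): drill 0.16 * 140 + 0.84 * (-80) = -44.8; do not drill -10
- Expected value of testing: 0.5 * 60.8 + 0.5 * (-10) = 25.4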

Decision tree: solution
The best first action is to run the test: its expected value is 25.4, against 18 for drilling without the test.
The presence of the test and its result affected our decision:
- if test = closed then drill
- if test = diffuse then do not drill

Value of information
When does the test make sense? Only when its result can make the decision maker change their mind; here, a diffuse pattern leads to the decision not to drill.
Value of information:
- A measure of the goodness of the information from the test
- The difference between the expected value with and without the test information
Oil wildcatter example:
- Expected value without the test = 18
- Expected value with the test = 25.4
- Value of information for the seismic test = 25.4 - 18 = 7.4
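The whole wildcatter tree, including the test, fits in a short script. This is a sketch under the same assumed numbers (prior 0.4, drilling cost 70, oil payoff 220, test cost 10, pattern likelihoods 0.8/0.3); helper and variable names are illustrative. As in the slides, the expected value with the test already includes the test cost.

```python
P_OIL = 0.4
P_CLOSED_GIVEN = {True: 0.8, False: 0.3}   # P(closed | Oil)
DRILL_COST, TEST_COST, OIL_PAYOFF = 70, 10, 220


def ev_drill(p_oil, sunk_cost=0):
    """Expected value of drilling given P(oil) and any cost already paid."""
    return (p_oil * (OIL_PAYOFF - DRILL_COST - sunk_cost)
            + (1 - p_oil) * (-DRILL_COST - sunk_cost))


# Without the test: drill or walk away with 0.
ev_without_test = max(ev_drill(P_OIL), 0.0)

# With the test: condition on each pattern, then pick the better action.
ev_with_test = 0.0
policy = {}
for pattern in ("closed", "diffuse"):
    if pattern == "closed":
        like_oil, like_dry = P_CLOSED_GIVEN[True], P_CLOSED_GIVEN[False]
    else:
        like_oil, like_dry = 1 - P_CLOSED_GIVEN[True], 1 - P_CLOSED_GIVEN[False]
    p_pattern = like_oil * P_OIL + like_dry * (1 - P_OIL)
    p_oil_given_pattern = like_oil * P_OIL / p_pattern
    drill = ev_drill(p_oil_given_pattern, TEST_COST)
    skip = -TEST_COST
    policy[pattern] = "drill" if drill > skip else "do not drill"
    ev_with_test += p_pattern * max(drill, skip)

value_of_information = ev_with_test - ev_without_test
print(policy, ev_with_test, value_of_information)
```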

Using utility to measure the outcomes

Selection based on expected values
Until now, the optimal action choice was the option that maximized the expected monetary value.
But is the expected monetary value always the quantity we want to optimize?
[Figure: the investment decision tree, with expected values Stock 1: 102, Stock 2: 104, Bank: 101, Home: 100.]

Selection based on expected values
Is the expected monetary value always the quantity we want to optimize?
Answer: Yes, but only if we are risk-neutral.
But what if we do not like the risk (we are risk-averse)? In that case we may want a premium for undertaking the risk (of losing the money).
Example: we may prefer to get $101 for sure over $102 in expectation with the risk of losing money.
Problem: How to model decisions and account for the risk?
Solution: use a utility function and utility theory.

Utility function (denoted U)
- Quantifies how we value outcomes, i.e., it reflects our preferences
- Can also be applied to outcomes other than money and gains (e.g. utility of a patient being healthy, or ill)
Decision making uses expected utilities (denoted EU):
$EU(X) = \sum_{x} P(X = x)\, U(X = x)$
where $U(X = x)$ is the utility of outcome x.
Important: under some conditions on preferences we can always design a utility function that fits our preferences.

Utility theory
Defines axioms on preferences that involve uncertainty and ways to manipulate them.
Uncertainty is modeled through lotteries.
Lottery: $[p : A;\ (1-p) : C]$
- Outcome A with probability p
- Outcome C with probability (1-p)
The following six constraints are known as the axioms of utility theory. The axioms are the most obvious semantic constraints on preferences with lotteries.
Notation:
- $A \succ B$: A is preferable to B
- $A \sim B$: indifferent (equally preferable)

Axioms of the utility theory
Orderability: Given any two states, a rational agent prefers one of them, or else rates the two as equally preferable.
$(A \succ B) \lor (B \succ A) \lor (A \sim B)$
Transitivity: Given any three states, if an agent prefers A to B and prefers B to C, the agent must prefer A to C.
$(A \succ B) \land (B \succ C) \Rightarrow (A \succ C)$
Continuity: If some state B is between A and C in preference, then there is a p for which the rational agent will be indifferent between state B and the lottery in which A comes with probability p and C with probability (1-p).
$(A \succ B \succ C) \Rightarrow \exists p\ [p : A;\ (1-p) : C] \sim B$

Axioms of the utility theory
Substitutability: If an agent is indifferent between two outcomes (or lotteries) A and B, then the agent is indifferent between two more complex lotteries that are identical except that B is substituted for A.
$(A \sim B) \Rightarrow [p : A;\ (1-p) : C] \sim [p : B;\ (1-p) : C]$
Monotonicity: If an agent prefers A to B, then the agent must prefer the lottery in which A occurs with a higher probability.
$(A \succ B) \Rightarrow (p > q \Leftrightarrow [p : A;\ (1-p) : B] \succ [q : A;\ (1-q) : B])$
Decomposability: Compound lotteries can be reduced to simpler lotteries using the laws of probability.
$[p : A;\ (1-p) : [q : B;\ (1-q) : C]] \sim [p : A;\ (1-p)q : B;\ (1-p)(1-q) : C]$

Utility theory
If the agent obeys the axioms of the utility theory, then:
1. There exists a real-valued function U such that:
$U(A) > U(B) \Leftrightarrow A \succ B$
$U(A) = U(B) \Leftrightarrow A \sim B$
2. The utility of a lottery is its expected utility, that is, the sum of the utilities of the outcomes weighted by their probabilities:
$U[p : A;\ (1-p) : B] = p\,U(A) + (1-p)\,U(B)$
3. A rational agent makes decisions in the presence of uncertainty by maximizing its expected utility.
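Decomposability and the expected-utility rule can be checked numerically on a small case. In this sketch a lottery is encoded as a list of (probability, item) pairs, where an item is an outcome label or a nested lottery; the encoding, the probabilities p = 0.3 and q = 0.5, and the utilities in `U` are all hypothetical.

```python
def flatten(lottery):
    """Reduce a (possibly compound) lottery to {outcome: probability}.

    A lottery is a list of (probability, item) pairs, where item is either
    an outcome label or another lottery (a list).
    """
    flat = {}
    for p, item in lottery:
        if isinstance(item, list):
            # Multiply through the nested lottery's probabilities.
            for outcome, q in flatten(item).items():
                flat[outcome] = flat.get(outcome, 0.0) + p * q
        else:
            flat[item] = flat.get(item, 0.0) + p
    return flat


def expected_utility(lottery, utility):
    """Sum of outcome utilities weighted by their flattened probabilities."""
    return sum(prob * utility[outcome]
               for outcome, prob in flatten(lottery).items())


p, q = 0.3, 0.5
compound = [(p, "A"), (1 - p, [(q, "B"), (1 - q, "C")])]
simple = [(p, "A"), ((1 - p) * q, "B"), ((1 - p) * (1 - q), "C")]
U = {"A": 1.0, "B": 0.6, "C": 0.0}   # hypothetical utilities

print(flatten(compound), expected_utility(compound, U))
print(flatten(simple), expected_utility(simple, U))
```

Both lotteries flatten to the same distribution, so any utility function assigns them the same expected utility.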

Utility functions
We can design a utility function that fits our preferences if they satisfy the axioms of utility theory.
But how do we design the utility function for monetary values so that it incorporates the risk?
What is the relation between the utility function and monetary values?
Assume we lose or gain $1000. Typically this difference is more significant for lower values (around $100 to $1000) than for higher values (around $1,000,000).

Utility functions
What is the relation between utilities and monetary value for a typical person?
A concave function that flattens at higher monetary values.
[Figure: utility plotted against monetary value, a concave curve.]

Utility functions
Expected utility of a sure outcome of 750: EU(sure 750) = U(750).
[Figure: utility curve U(x) against monetary value, with 500, 750 and 1000 marked and EU(sure 750) read off the curve.]

Utility functions
Assume a lottery L = [0.5 : 500; 0.5 : 1000].
Expected value of the lottery = 750.
The expected utility of the lottery EU(L) is different:
$EU(L) = 0.5\,U(500) + 0.5\,U(1000)$
[Figure: utility curve U(x) with the straight EU line for lotteries with outcomes 500 and 1000; EU(lottery L) is read off this line at 750.]
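The gap between the utility of a sure amount and the expected utility of a lottery with the same expected value can be computed for any concave utility. The sketch below assumes the lottery pays 500 or 1000 with equal probability and uses base-10 log as the concave utility; `certainty_equivalent` and `risk_premium` are standard quantities introduced here for illustration and are not named in the slides.

```python
import math


def U(x):
    """Assumed concave utility of money."""
    return math.log10(x)


lottery = [(0.5, 500), (0.5, 1000)]   # (probability, monetary outcome)

expected_value = sum(p * x for p, x in lottery)
eu_lottery = sum(p * U(x) for p, x in lottery)   # expected utility of L
eu_sure = U(expected_value)                      # utility of the sure amount

# Sure amount whose utility equals EU(L), and the bonus needed to take the risk.
certainty_equivalent = 10 ** eu_lottery
risk_premium = expected_value - certainty_equivalent

print(expected_value, eu_lottery, eu_sure, certainty_equivalent, risk_premium)
```

For a concave U the lottery's expected utility lies below the utility of its expected value, so the risk premium is positive.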

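Utility functions
Expected utility of the lottery: EU(lottery L) < EU(sure 750).
[Figure: utility curve U(x) for lottery L = [0.5 : 500; 0.5 : 1000]; at monetary value 750 the curve, EU(sure 750), lies above the EU line, EU(lottery L).]
Risk aversion: a bonus is required for undertaking the risk.

Decision making with utility function
Original problem with monetary outcomes:

Option     Outcomes (probability: value)    Expected value
Stock 1    0.6: 110, 0.4: 90                102
Stock 2    0.4: 140, 0.6: 80                104
Bank       1.0: 101                         101
Home       1.0: 100                         100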

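Decision making with the utility function
Utility function: log(x), base 10

Option     Utilities of outcomes (probability: U)    Expected utility
Stock 1    0.6: 2.0413, 0.4: 1.9542                  2.00653
Stock 2    0.4: 2.1461, 0.6: 1.9031                  2.0003
Bank       1.0: 2.0043                               2.004
Home       1.0: 2.0                                  2.0

With this utility function Stock 1 has the highest expected utility, whereas Stock 2 had the highest expected monetary value.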
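Re-running the investment decision with a log utility only changes the value attached to each leaf. This is a sketch assuming the same tree as in the monetary version (leaf values 110/90, 140/80, 101, 100 with probabilities 0.6/0.4 and 0.4/0.6) and base-10 log as the utility; names are illustrative.

```python
import math

# Assumed reading of the investment tree: (probability, monetary value) pairs.
tree = {
    "Stock 1": [(0.6, 110), (0.4, 90)],
    "Stock 2": [(0.4, 140), (0.6, 80)],
    "Bank": [(1.0, 101)],
    "Home": [(1.0, 100)],
}


def expected_utility(outcomes, utility=math.log10):
    """Expectation of the utility of each outcome at a chance node."""
    return sum(p * utility(v) for p, v in outcomes)


def expected_money(outcomes):
    """Expected monetary value: the identity utility."""
    return expected_utility(outcomes, utility=lambda v: v)


eu = {action: expected_utility(o) for action, o in tree.items()}
best_by_utility = max(eu, key=eu.get)
best_by_money = max(tree, key=lambda a: expected_money(tree[a]))

print(eu)
print(best_by_utility, best_by_money)
```

Passing the utility as a parameter keeps the risk-neutral and risk-averse analyses in one function, which makes the change of preferred action easy to see.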