CS188 Spring 2012 Section 4: Games


1 Minimax Search

In this problem, we will explore adversarial search. Consider the zero-sum game tree shown below. Trapezoids that point up, such as the one at the root, represent choices for the player seeking to maximize; trapezoids that point down represent choices for the minimizer. Outcome values for the maximizing player are listed at each leaf node. It is your move, and you seek to maximize the expected value of the game.

(a) Assuming both players act optimally, carry out the minimax search algorithm. Write the value of each node inside the corresponding trapezoid. What move should you make now? How much is the game worth to you?

The game is worth 5. We should make the move that takes us down the left branch to the node containing 5.

(b) Now reconsider the same game tree, but use α-β pruning (the tree is reprinted below). Expand successors from left to right. In the brackets [ , ], record the [α, β] pair that is passed down each edge (through a call to MIN-VALUE or MAX-VALUE). In the parentheses ( ), record the value v that is passed up each edge (the value returned by MIN-VALUE or MAX-VALUE). Circle all leaf nodes that are visited. Put an X through edges that are pruned off. How much is the game worth according to α-β pruning?

α-β pruning finds the same solution. The game is still worth 5 to the maximizer.

(b) [game tree for part (b), to be annotated with the [α, β] pairs, returned values, visited leaves, and pruned edges]
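Both parts follow the standard minimax recursion; α-β pruning simply skips subtrees that cannot affect the value passed up to the root. As a reference, here is a minimal Python sketch over a generic game tree. The tree encoding and the example leaf values are illustrative only, not the tree from this handout.

```python
# Minimal minimax with alpha-beta pruning over an explicit game tree.
# A tree is either a number (leaf utility for the maximizer) or a list of subtrees;
# levels alternate between max nodes and min nodes, with a max node at the root.
# NOTE: the example tree below is illustrative; it is not the tree from this handout.

import math

def alpha_beta(node, maximizing, alpha=-math.inf, beta=math.inf):
    """Return the minimax value of `node`, pruning branches that cannot
    affect the value returned to the root."""
    if not isinstance(node, list):           # leaf: its utility for the maximizer
        return node
    if maximizing:
        v = -math.inf
        for child in node:                   # expand successors left to right
            v = max(v, alpha_beta(child, False, alpha, beta))
            if v >= beta:                    # the min ancestor would never allow this
                return v                     # prune the remaining children
            alpha = max(alpha, v)
        return v
    else:
        v = math.inf
        for child in node:
            v = min(v, alpha_beta(child, True, alpha, beta))
            if v <= alpha:                   # the max ancestor would never allow this
                return v
            beta = min(beta, v)
        return v

if __name__ == "__main__":
    # Illustrative 2-ply tree: a max root over three min nodes.
    tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
    print(alpha_beta(tree, maximizing=True))  # -> 3
```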

2 Suicidal Pacman

Pacman is sometimes suicidal when doing a minimax search because of its worst-case analysis. Here we will build a small expectimax tree to see the difference in behavior. Consider the following rules:

- Ghosts cannot change direction unless they are facing a wall. Their possible actions are east, west, south, and north (not stop). Initially, they have no direction and can move to any adjacent square.
- We use random ghosts, which choose uniformly among all their legal moves.
- Assume that Pacman cannot stop.
- If Pacman runs into a square containing a ghost, it dies before having the chance to eat any food that was there.

The game is scored as follows (see the score-bookkeeping sketch after this setup):

- -1 for each action Pacman takes
- +10 for each food dot eaten
- -500 for losing (if Pacman is eaten)
- +500 for winning (all food dots eaten)

Given the following trapped maze, build the expectimax tree, with max and chance nodes clearly identified. Use the game score as the evaluation function at the leaves. If you don't want to make little drawings, all possible states of the game have been labeled for you on the next page: use them to identify the states of the game. Pacman moves first, followed by the lower-left ghost, then the top-right ghost.
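The leaf values used in the solutions follow directly from these scoring components. As a small illustration, the helper below (a hypothetical function, not part of the handout) makes the score bookkeeping explicit:

```python
def game_score(num_moves, dots_eaten, outcome):
    """Game score under the rules above: -1 per Pacman move, +10 per dot eaten,
    -500 for losing, +500 for winning. `outcome` is 'win', 'lose', or None."""
    score = -1 * num_moves + 10 * dots_eaten
    if outcome == "lose":
        score -= 500
    elif outcome == "win":
        score += 500
    return score

# Values consistent with the leaves appearing in the solutions below:
print(game_score(2, 0, "lose"))   # -502 (e.g., eaten after two moves, no dots eaten)
print(game_score(2, 1, "win"))    # 508  (e.g., wins after two moves with one dot eaten)
print(game_score(1, 0, "lose"))   # -501 (e.g., eaten after one move)
```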

(a) Build the expectimax tree. What is Pacman's optimal move?

Play W, for an expected payoff of 3.

(b) What would Pacman do if it were using minimax instead?

If we treat the ghost nodes as minimizing nodes and run minimax, we see that if Pacman plays W, the ghosts would play N and E respectively, and we would be stuck with a payoff of -502. Instead, we could earn a better payoff of -501 by immediately playing E: suicidal Pacman!
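The difference between the two answers comes entirely from how the ghost node is valued: an average for expectimax versus a minimum for minimax. A minimal sketch of this comparison, using a simplified two-action tree whose leaf values (-502, 508, -501) are the ones quoted in these solutions rather than the full maze tree:

```python
# Expectimax vs. minimax on a simplified version of the Pacman tree.
# A node is either a number (game score at a leaf) or a list of subtrees.
# NOTE: this two-action tree is a simplification for illustration; the leaf
# values -502, 508, and -501 are taken from the solution text.

def game_value(node, pacman_to_move, random_ghosts):
    if not isinstance(node, list):          # leaf: game score
        return node
    values = [game_value(c, not pacman_to_move, random_ghosts) for c in node]
    if pacman_to_move:                      # Pacman maximizes
        return max(values)
    if random_ghosts:                       # chance node: uniform average over legal moves
        return sum(values) / len(values)
    return min(values)                      # adversarial ghost: minimize

if __name__ == "__main__":
    tree = [[-502, 508],   # playing W: the ghost's two moves lead to a loss or a win
            [-501]]        # playing E: Pacman is eaten immediately
    print(game_value(tree, True, random_ghosts=True))   # expectimax ->  3.0  (prefer W)
    print(game_value(tree, True, random_ghosts=False))  # minimax    -> -501  (prefer E)
```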

(c) By changing the probabilities of action for the ghosts, can you get expectimax to make the same decision as minimax?

One possible choice is for the ghosts to play N 99.95% of the time when N is legal, and to choose uniformly among the remaining legal moves the rest of the time. Then at the first chance node for the blue ghost, N is played with probability 0.9995 and E with probability 0.0005, so the expected payoff of playing W is 0.9995 × (-502) + 0.0005 × 508 ≈ -501.5, which is worse than the payoff of -501 for playing E immediately.

(d) Now say you are using the following alternate game score components:

- -1 for each move Pacman makes
- -1.5 for losing
- 0 for eating food
- +0.3 for winning

Use this new game score as your evaluation function at the leaves. Note that this yields a monotonic transformation of the original utilities: a function which preserves the ordering of the states according to their utility. Could this change the decision of Pacman using expectimax?

The scores at the leaves (reading left to right) are -3.5, -3.5, -1.7, -3.5, -2.5. Propagating up the tree, Pacman gets expected value 0.5 × (-1.7) + 0.5 × (-3.5) = -2.6 from playing W and -2.5 from playing E, so its optimal decision has changed. Expectimax decisions are guaranteed to be preserved only under positive affine transformations of the utilities; an arbitrary monotonic transformation can change them, because the expectation weights the utility values themselves (not just their ordering) by the probabilities.
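For a quick check of the arithmetic in parts (c) and (d), the expectations can be recomputed directly from the numbers given above (the variable names below are mine; only values from the solutions are used):

```python
# (c) Biased ghost: plays N with probability 0.9995, E with probability 0.0005.
value_W_biased = 0.9995 * (-502) + 0.0005 * 508
print(value_W_biased)                      # ≈ -501.5, worse than -501 for playing E

# (d) Monotonically transformed leaf scores: W leads to -1.7 or -3.5 with equal
# probability, while E leads to -2.5.
value_W = 0.5 * (-1.7) + 0.5 * (-3.5)      # -2.6
value_E = -2.5
print(value_W > value_E)                   # False: the preferred action flips from W to E
```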