Programming for Engineers in Python

Size: px
Start display at page:

Download "Programming for Engineers in Python"

Transcription

1 Programming for Engineers in Python Lecture 12: Dynamic Programming Autumn

2 Lecture 11: Highlights GUI (Based on slides from the course Software1, CS, TAU) GUI in Python (Based on Chapter 19 from the book Think Python ) Swampy Widgets Callbacks Event driven programming Display an image in a GUI Sorting: Merge sort Bucket sort 2

3 Plan Fibonacci (overlapping subproblems) Evaluating the performance of stock market traders (optimal substructure) Dynamic programming basics Maximizing profits of a shipping company (Knapsack problem) A little on the exams and course s grade (if time allows) 3

4 Remember Fibonacci Series? Fibonacci series 0, 1, 1, 2, 3, 5, 8, 13, 21, 34 Definition: fib(0) = 0 fib(1) = 1 fib(n) = fib(n-1) + fib(n-2) en.wikipedia.org/wiki/fibonacci_number סלט פיבונאצ 'י 4

5 Recursive Fibonacci Series Every call with n > 1 invokes 2 function calls, and so on 5

6 Redundant Calls Fib(5) Fib(4) Fib(3) Fib(3) Fib(2) Fib(2) Fib(1) Fib(2) Fib(1) Fib(1) Fib(0) Fib(1) Fib(1) Fib(0) Fib(0) 6

7 Redundant Calls Iterative vs. recursive Fibonacci 7

8 Number of Calls to Fibonacci n value Number of calls

9 Demonstration: Iterative Versus Recursive Fibonacci 9

10 Demonstration: Iterative Versus Recursive Fibonacci (cont.) Output (shell): 10

11 Memoization Enhancing Efficiency of Recursive (Fibonacci) The problem: solving same subproblems many time The idea: avoid repeating the calculations of results for previously processed inputs, by solving each subproblem once, and using the solution the next time it is encountered How? Store subproblems solution in a list This technique is called memoization 11

12 Fibonacci with Memoization 12

13 Timeit Output (shell): 13

14 Fibonacci: Memoization Vs. Iterative Same time complexity O(N) Iterative x 5 times faster than Memoization the overhead of the recursive calls So why do we need Memoization? We shall discuss that later 14

15 Overlapping Subproblems A problem has overlapping subproblems if it can be broken down into subproblems which are reused multiple times If divide-and-conquer is applicable, then each problem solved will be brand new A recursive algorithm is called exponential number of times because it solves the same problems repeatedly 15

16 Evaluating Traders Performance How to evaluate a trader s performance on a?(טבע (e.g., given stock The trader earns $X on that stock, is it good? Mediocre? Bad? Define a measure of success: Maximal possible profit M$ Trader s performance X/M (%) Define M: Maximal profit in a given time range How can it be calculated? 16

17 Evaluating Traders Performance Consider the changes the stock undergoes in the given time range M is defined as a continuous time sub-range in which the profit is maximal Examples (all numbers are percentages): [1,2,-5,4,7,-2] [4,7] M = 11% If X = 6% traders performance is ~54% [1,5,-3,4,-2,1] [1,5,-3,4] M = 7% If X = 5% trader s performance is ~71% Let s make it a little more formal 17

18 Maximum Subarray Sum Input: array of numbers Output: what (contiguous) subarray has the largest sum? Naïve solution ( brute force ): How many subarrays exist for an array of size n? n + n-1 + n O(n 2 ) The plan: check each and report the maximal Time complexity of O(n 2 ) We will return both the sum and the corresponding subarray 18

19 Naïve Solution ( Brute Force ) 19

20 Naïve Solution (shorter code) 20

21 Efficient Solution The solution for a[i:i] for all i is 0 (Python notation) Let s assume we know the subarray of a[0:i] with the largest sum Can we use this information to find the subarray of a[0:i+1] with the largest sum? A problem is said to have optimal substructure if the globally optimal solution can be constructed from locally optimal solutions to subproblems 21

22 Optimal Substructure t = a[j:i] >= 0 (why?) j k i-1 i s = a[j:k+1] is the optimal subarray of a[0:i] What is the optimal solution s structure for a[0:i+1]? Can it start before j? No! Can it start in the range j + 1 k? No! Can it start in the range k + 1 i-1? No! Otherwise t would have been negative at a earlier stage 22

23 Optimal Substructure t = a[j:i] >= 0 (why?) j k i-1 i s = a[j:k+1] is the optimal subarray of a[0:i] What is the optimal solution s structure for a[0:i+1]? Set the new t = t + a[i] If t > s than s = t, the solution is (j,i+1) Otherwise the solution does not change If t < 0 than j is updated to i+1 so t = 0 (for next iteration) Otherwise (0 =< t <= s) change nothing 23

24 Example 24

25 The Code 25

26 Efficiency O(n) Constant time 26

27 Efficiency O(n) The "globally optimal" solution corresponds to a subarray with a globally maximal sum, but at each step we only make a decision relative to what we have already seen. At each step we know the best solution thus far, but might change our decision later based on our previous information and the current information. In this sense the problem has optimal substructure. Because we can make decisions locally we only need to traverse the list once. 27

28 O(n) Versus O(n 2 ) Output (shell): 28

29 Dynamic Programming (DP) Dynamic Programming is an algorithm design technique for optimization problems Similar to divide and conquer, DP solves problems by combining solutions to subproblems Not similar to divide and conquer, subproblems are not independent: Subproblems may share subsubproblems (overlapping subproblems) Solution to one subproblem may not affect the solutions to other subproblems of the same problem (optimal substructure)

30 Dynamic Programming (cont.) DP reduces computation by Solving subproblems in a bottom-up fashion Storing solution to a subproblem the first time it is solved Looking up the solution when subproblem is encountered again Key: determine structure of optimal solutions

31 Dynamic Programming Characteristics Overlapping subproblems Can be broken down into subproblems which are reused multiple times Examples: Factorial does not exhibit overlapping subproblems Fibonacci does Optimal substructure Globally optimal solution can be constructed from locally optimal solutions to subproblems Examples: Fibonacci, msum, Knapsack (coming next) 31

32 Optimizing Shipping Cargo (Knapsack) A shipping company is trying to sell a residual capacity of 1000 metric tones in a cargo ship to different shippers by an auction The company received 100 different offers from potential shippers each characterized by tonnage and offered reward The company wish to select a subset of the offers that fits into its residual capacity so as to maximize the total reward 32

33 Optimizing Shipping Cargo (Knapsack) The company wish to select a subset of the offers that fits into its residual capacity so as to maximize the total reward 33

34 Formalizing Shipping capacity W = 1000 Offers from potential shippers n = 100 Each offer i has a weight w i and an offered reward v i Maximize the reward given the W tonnage limit A(n,W) - the maximum value that can be attained from considering the first n items weighting at most W tons 34

35 First Try - Greedy Sort offers i by v i /w i ratio Select offers until the ship is full Counter example: W = 10, {(v i,w i )} = {(7,7),(4,5),(4,5)} 35

36 Solution A(i,j) - the maximum value that can be attained from considering the first i items with a j weight limit: A(0,j) = A(i,0) = 0 for all i n and j W If w i > j then A(i,j) = A(i-1,j) If w i < j we have two options: Formally: Do not include it so the value will be A(i-1,j) If we do include it the value will be v i + A(i-1,j-w i ) Which choice should we make? Whichever is larger! the maximum of the two 36

37 Optimal Substructure and Overlapping Subproblems Overlapping subproblems: at any stage (i,j) we might need to calculate A(k,l) for several k < i and l < j. Optimal substructure: at any point we only need information about the choices we have already made. 37

38 Solution (Recursive) 38

39 W Solution (Memoization) The Idea M(N,W) N 39

40 Solution (Memoization) - Code 40

41 W Solution (Iterative) The Idea In Class Bottom-Up : start with solving small problems and gradually grow M(N,W) N 41

42 DP VS. Memoization Same Big O computational complexity If all subproblems must be solved at least once, DP is better by a constant factor due to no recursive involvement If some subproblems may not need to be solved, Memoized algorithm may be more efficient, since it only solve these subproblems which are definitely required 42

43 Steps in Dynamic Programming 1. Characterize structure of an optimal solution 2. Define value of optimal solution recursively 3. Compute optimal solution values either topdown (memoization) or bottom-up (in a table) 4. Construct an optimal solution from computed values

44 Why Knapsack? בעיית הגנב 44

45 Extensions NP completeness Pseudo polynomial 45

46 References Intro to DP: Practice problems: 46

CSE 417 Dynamic Programming (pt 2) Look at the Last Element

CSE 417 Dynamic Programming (pt 2) Look at the Last Element CSE 417 Dynamic Programming (pt 2) Look at the Last Element Reminders > HW4 is due on Friday start early! if you run into problems loading data (date parsing), try running java with Duser.country=US Duser.language=en

More information

The Agent-Environment Interface Goals, Rewards, Returns The Markov Property The Markov Decision Process Value Functions Optimal Value Functions

The Agent-Environment Interface Goals, Rewards, Returns The Markov Property The Markov Decision Process Value Functions Optimal Value Functions The Agent-Environment Interface Goals, Rewards, Returns The Markov Property The Markov Decision Process Value Functions Optimal Value Functions Optimality and Approximation Finite MDP: {S, A, R, p, γ}

More information

Lecture 4: Divide and Conquer

Lecture 4: Divide and Conquer Lecture 4: Divide and Conquer Divide and Conquer Merge sort is an example of a divide-and-conquer algorithm Recall the three steps (at each level to solve a divideand-conquer problem recursively Divide

More information

June 11, Dynamic Programming( Weighted Interval Scheduling)

June 11, Dynamic Programming( Weighted Interval Scheduling) Dynamic Programming( Weighted Interval Scheduling) June 11, 2014 Problem Statement: 1 We have a resource and many people request to use the resource for periods of time (an interval of time) 2 Each interval

More information

Dynamic Programming cont. We repeat: The Dynamic Programming Template has three parts.

Dynamic Programming cont. We repeat: The Dynamic Programming Template has three parts. Page 1 Dynamic Programming cont. We repeat: The Dynamic Programming Template has three parts. Subproblems Sometimes this is enough if the algorithm and its complexity is obvious. Recursion Algorithm Must

More information

Reinforcement Learning. Slides based on those used in Berkeley's AI class taught by Dan Klein

Reinforcement Learning. Slides based on those used in Berkeley's AI class taught by Dan Klein Reinforcement Learning Slides based on those used in Berkeley's AI class taught by Dan Klein Reinforcement Learning Basic idea: Receive feedback in the form of rewards Agent s utility is defined by the

More information

CS 343: Artificial Intelligence

CS 343: Artificial Intelligence CS 343: Artificial Intelligence Markov Decision Processes II Prof. Scott Niekum The University of Texas at Austin [These slides based on those of Dan Klein and Pieter Abbeel for CS188 Intro to AI at UC

More information

0/1 knapsack problem knapsack problem

0/1 knapsack problem knapsack problem 1 (1) 0/1 knapsack problem. A thief robbing a safe finds it filled with N types of items of varying size and value, but has only a small knapsack of capacity M to use to carry the goods. More precisely,

More information

CMPSCI 311: Introduction to Algorithms Second Midterm Practice Exam SOLUTIONS

CMPSCI 311: Introduction to Algorithms Second Midterm Practice Exam SOLUTIONS CMPSCI 311: Introduction to Algorithms Second Midterm Practice Exam SOLUTIONS November 17, 2016. Name: ID: Instructions: Answer the questions directly on the exam pages. Show all your work for each question.

More information

Reinforcement Learning

Reinforcement Learning Reinforcement Learning Basic idea: Receive feedback in the form of rewards Agent s utility is defined by the reward function Must (learn to) act so as to maximize expected rewards Grid World The agent

More information

Maximum Contiguous Subsequences

Maximum Contiguous Subsequences Chapter 8 Maximum Contiguous Subsequences In this chapter, we consider a well-know problem and apply the algorithm-design techniques that we have learned thus far to this problem. While applying these

More information

Slides credited from Hsu-Chun Hsiao

Slides credited from Hsu-Chun Hsiao Slides credited from Hsu-Chun Hsiao Greedy Algorithms Greedy #1: Activity-Selection / Interval Scheduling Greedy #2: Coin Changing Greedy #3: Fractional Knapsack Problem Greedy #4: Breakpoint Selection

More information

91.420/543: Artificial Intelligence UMass Lowell CS Fall 2010

91.420/543: Artificial Intelligence UMass Lowell CS Fall 2010 91.420/543: Artificial Intelligence UMass Lowell CS Fall 2010 Lecture 17 & 18: Markov Decision Processes Oct 12 13, 2010 A subset of Lecture 9 slides from Dan Klein UC Berkeley Many slides over the course

More information

Markov Decision Processes

Markov Decision Processes Markov Decision Processes Robert Platt Northeastern University Some images and slides are used from: 1. CS188 UC Berkeley 2. RN, AIMA Stochastic domains Image: Berkeley CS188 course notes (downloaded Summer

More information

CS 188: Artificial Intelligence Spring Announcements

CS 188: Artificial Intelligence Spring Announcements CS 188: Artificial Intelligence Spring 2011 Lecture 9: MDPs 2/16/2011 Pieter Abbeel UC Berkeley Many slides over the course adapted from either Dan Klein, Stuart Russell or Andrew Moore 1 Announcements

More information

Homework Assignment #3. 1 Demonstrate how mergesort works when sorting the following list of numbers:

Homework Assignment #3. 1 Demonstrate how mergesort works when sorting the following list of numbers: CISC 5835 Algorithms for Big Data Fall, 2018 Homework Assignment #3 1 Demonstrate how mergesort works when sorting the following list of numbers: 6 1 4 2 3 8 7 5 2 Given the following array (list), follows

More information

Markov Decision Processes

Markov Decision Processes Markov Decision Processes Robert Platt Northeastern University Some images and slides are used from: 1. CS188 UC Berkeley 2. AIMA 3. Chris Amato Stochastic domains So far, we have studied search Can use

More information

Chapter wise Question bank

Chapter wise Question bank GOVERNMENT ENGINEERING COLLEGE - MODASA Chapter wise Question bank Subject Name Analysis and Design of Algorithm Semester Department 5 th Term ODD 2015 Information Technology / Computer Engineering Chapter

More information

Optimization Methods. Lecture 16: Dynamic Programming

Optimization Methods. Lecture 16: Dynamic Programming 15.093 Optimization Methods Lecture 16: Dynamic Programming 1 Outline 1. The knapsack problem Slide 1. The traveling salesman problem 3. The general DP framework 4. Bellman equation 5. Optimal inventory

More information

CS 106X Lecture 11: Sorting

CS 106X Lecture 11: Sorting CS 106X Lecture 11: Sorting Friday, February 3, 2017 Programming Abstractions (Accelerated) Winter 2017 Stanford University Computer Science Department Lecturer: Chris Gregg reading: Programming Abstractions

More information

Markov Decision Processes (MDPs) CS 486/686 Introduction to AI University of Waterloo

Markov Decision Processes (MDPs) CS 486/686 Introduction to AI University of Waterloo Markov Decision Processes (MDPs) CS 486/686 Introduction to AI University of Waterloo Outline Sequential Decision Processes Markov chains Highlight Markov property Discounted rewards Value iteration Markov

More information

CS 188: Artificial Intelligence

CS 188: Artificial Intelligence CS 188: Artificial Intelligence Markov Decision Processes Dan Klein, Pieter Abbeel University of California, Berkeley Non-Deterministic Search 1 Example: Grid World A maze-like problem The agent lives

More information

Markov Decision Process

Markov Decision Process Markov Decision Process Human-aware Robotics 2018/02/13 Chapter 17.3 in R&N 3rd Ø Announcement: q Slides for this lecture are here: http://www.public.asu.edu/~yzhan442/teaching/cse471/lectures/mdp-ii.pdf

More information

Iteration. The Cake Eating Problem. Discount Factors

Iteration. The Cake Eating Problem. Discount Factors 18 Value Function Iteration Lab Objective: Many questions have optimal answers that change over time. Sequential decision making problems are among this classification. In this lab you we learn how to

More information

COMP417 Introduction to Robotics and Intelligent Systems. Reinforcement Learning - 2

COMP417 Introduction to Robotics and Intelligent Systems. Reinforcement Learning - 2 COMP417 Introduction to Robotics and Intelligent Systems Reinforcement Learning - 2 Speaker: Sandeep Manjanna Acklowledgement: These slides use material from Pieter Abbeel s, Dan Klein s and John Schulman

More information

Algorithmic Game Theory

Algorithmic Game Theory Algorithmic Game Theory Lecture 10 06/15/10 1 A combinatorial auction is defined by a set of goods G, G = m, n bidders with valuation functions v i :2 G R + 0. $5 Got $6! More? Example: A single item for

More information

CS 188: Artificial Intelligence

CS 188: Artificial Intelligence CS 188: Artificial Intelligence Markov Decision Processes Dan Klein, Pieter Abbeel University of California, Berkeley Non Deterministic Search Example: Grid World A maze like problem The agent lives in

More information

CSEP 573: Artificial Intelligence

CSEP 573: Artificial Intelligence CSEP 573: Artificial Intelligence Markov Decision Processes (MDP)! Ali Farhadi Many slides over the course adapted from Luke Zettlemoyer, Dan Klein, Pieter Abbeel, Stuart Russell or Andrew Moore 1 Outline

More information

What is Greedy Approach? Control abstraction for Greedy Method. Three important activities

What is Greedy Approach? Control abstraction for Greedy Method. Three important activities 0-0-07 What is Greedy Approach? Suppose that a problem can be solved by a sequence of decisions. The greedy method has that each decision is locally optimal. These locally optimal solutions will finally

More information

useful than solving these yourself, writing up your solution and then either comparing your

useful than solving these yourself, writing up your solution and then either comparing your CSE 441T/541T: Advanced Algorithms Fall Semester, 2003 September 9, 2004 Practice Problems Solutions Here are the solutions for the practice problems. However, reading these is far less useful than solving

More information

Basic Framework. About this class. Rewards Over Time. [This lecture adapted from Sutton & Barto and Russell & Norvig]

Basic Framework. About this class. Rewards Over Time. [This lecture adapted from Sutton & Barto and Russell & Norvig] Basic Framework [This lecture adapted from Sutton & Barto and Russell & Norvig] About this class Markov Decision Processes The Bellman Equation Dynamic Programming for finding value functions and optimal

More information

Non-Deterministic Search

Non-Deterministic Search Non-Deterministic Search MDP s 1 Non-Deterministic Search How do you plan (search) when your actions might fail? In general case, how do you plan, when the actions have multiple possible outcomes? 2 Example:

More information

Problem Set 7. Problem 7-1.

Problem Set 7. Problem 7-1. Introduction to Algorithms: 6.006 Massachusetts Institute of Technology November 22, 2011 Professors Erik Demaine and Srini Devadas Problem Set 7 Problem Set 7 Both theory and programming questions are

More information

Logistics. CS 473: Artificial Intelligence. Markov Decision Processes. PS 2 due today Midterm in one week

Logistics. CS 473: Artificial Intelligence. Markov Decision Processes. PS 2 due today Midterm in one week CS 473: Artificial Intelligence Markov Decision Processes Dan Weld University of Washington [Slides originally created by Dan Klein & Pieter Abbeel for CS188 Intro to AI at UC Berkeley. All CS188 materials

More information

CS224W: Social and Information Network Analysis Jure Leskovec, Stanford University

CS224W: Social and Information Network Analysis Jure Leskovec, Stanford University CS224W: Social and Information Network Analysis Jure Leskovec, Stanford University http://cs224w.stanford.edu 10/27/16 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu

More information

Making Decisions. CS 3793 Artificial Intelligence Making Decisions 1

Making Decisions. CS 3793 Artificial Intelligence Making Decisions 1 Making Decisions CS 3793 Artificial Intelligence Making Decisions 1 Planning under uncertainty should address: The world is nondeterministic. Actions are not certain to succeed. Many events are outside

More information

IEOR E4004: Introduction to OR: Deterministic Models

IEOR E4004: Introduction to OR: Deterministic Models IEOR E4004: Introduction to OR: Deterministic Models 1 Dynamic Programming Following is a summary of the problems we discussed in class. (We do not include the discussion on the container problem or the

More information

CS 188: Artificial Intelligence Fall 2011

CS 188: Artificial Intelligence Fall 2011 CS 188: Artificial Intelligence Fall 2011 Lecture 9: MDPs 9/22/2011 Dan Klein UC Berkeley Many slides over the course adapted from either Stuart Russell or Andrew Moore 2 Grid World The agent lives in

More information

Empirical and Average Case Analysis

Empirical and Average Case Analysis Empirical and Average Case Analysis l We have discussed theoretical analysis of algorithms in a number of ways Worst case big O complexities Recurrence relations l What we often want to know is what will

More information

1) S = {s}; 2) for each u V {s} do 3) dist[u] = cost(s, u); 4) Insert u into a 2-3 tree Q with dist[u] as the key; 5) for i = 1 to n 1 do 6) Identify

1) S = {s}; 2) for each u V {s} do 3) dist[u] = cost(s, u); 4) Insert u into a 2-3 tree Q with dist[u] as the key; 5) for i = 1 to n 1 do 6) Identify CSE 3500 Algorithms and Complexity Fall 2016 Lecture 17: October 25, 2016 Dijkstra s Algorithm Dijkstra s algorithm for the SSSP problem generates the shortest paths in nondecreasing order of the shortest

More information

Enhanced Shell Sorting Algorithm

Enhanced Shell Sorting Algorithm Enhanced ing Algorithm Basit Shahzad, and Muhammad Tanvir Afzal Abstract Many algorithms are available for sorting the unordered elements. Most important of them are Bubble sort, Heap sort, Insertion sort

More information

CSE 473: Artificial Intelligence

CSE 473: Artificial Intelligence CSE 473: Artificial Intelligence Markov Decision Processes (MDPs) Luke Zettlemoyer Many slides over the course adapted from Dan Klein, Stuart Russell or Andrew Moore 1 Announcements PS2 online now Due

More information

CSE202: Algorithm Design and Analysis. Ragesh Jaiswal, CSE, UCSD

CSE202: Algorithm Design and Analysis. Ragesh Jaiswal, CSE, UCSD Fractional knapsack Problem Fractional knapsack: You are a thief and you have a sack of size W. There are n divisible items. Each item i has a volume W (i) and a total value V (i). Design an algorithm

More information

Chapter 21. Dynamic Programming CONTENTS 21.1 A SHORTEST-ROUTE PROBLEM 21.2 DYNAMIC PROGRAMMING NOTATION

Chapter 21. Dynamic Programming CONTENTS 21.1 A SHORTEST-ROUTE PROBLEM 21.2 DYNAMIC PROGRAMMING NOTATION Chapter 21 Dynamic Programming CONTENTS 21.1 A SHORTEST-ROUTE PROBLEM 21.2 DYNAMIC PROGRAMMING NOTATION 21.3 THE KNAPSACK PROBLEM 21.4 A PRODUCTION AND INVENTORY CONTROL PROBLEM 23_ch21_ptg01_Web.indd

More information

Sequential Decision Making

Sequential Decision Making Sequential Decision Making Dynamic programming Christos Dimitrakakis Intelligent Autonomous Systems, IvI, University of Amsterdam, The Netherlands March 18, 2008 Introduction Some examples Dynamic programming

More information

Complex Decisions. Sequential Decision Making

Complex Decisions. Sequential Decision Making Sequential Decision Making Outline Sequential decision problems Value iteration Policy iteration POMDPs (basic concepts) Slides partially based on the Book "Reinforcement Learning: an introduction" by

More information

Monte-Carlo Planning: Introduction and Bandit Basics. Alan Fern

Monte-Carlo Planning: Introduction and Bandit Basics. Alan Fern Monte-Carlo Planning: Introduction and Bandit Basics Alan Fern 1 Large Worlds We have considered basic model-based planning algorithms Model-based planning: assumes MDP model is available Methods we learned

More information

6.231 DYNAMIC PROGRAMMING LECTURE 3 LECTURE OUTLINE

6.231 DYNAMIC PROGRAMMING LECTURE 3 LECTURE OUTLINE 6.21 DYNAMIC PROGRAMMING LECTURE LECTURE OUTLINE Deterministic finite-state DP problems Backward shortest path algorithm Forward shortest path algorithm Shortest path examples Alternative shortest path

More information

Family Vacation. c 1 = c n = 0. w: maximum number of miles the family may drive each day.

Family Vacation. c 1 = c n = 0. w: maximum number of miles the family may drive each day. II-0 Family Vacation Set of cities denoted by P 1, P 2,..., P n. d i : Distance from P i 1 to P i (1 < i n). d 1 = 0 c i : Cost of dinner, lodging and breakfast when staying at city P i (1 < i < n). c

More information

OPPA European Social Fund Prague & EU: We invest in your future.

OPPA European Social Fund Prague & EU: We invest in your future. OPPA European Social Fund Prague & EU: We invest in your future. Cooperative Game Theory Michal Jakob and Michal Pěchouček Agent Technology Center, Dept. of Computer Science and Engineering, FEE, Czech

More information

Lecture outline W.B.Powell 1

Lecture outline W.B.Powell 1 Lecture outline What is a policy? Policy function approximations (PFAs) Cost function approximations (CFAs) alue function approximations (FAs) Lookahead policies Finding good policies Optimizing continuous

More information

Lecture Notes: Interprocedural Analysis

Lecture Notes: Interprocedural Analysis Lecture Notes: Interprocedural Analysis 15-819O: Program Analysis Jonathan Aldrich jonathan.aldrich@cs.cmu.edu Lecture 8 1 Interprocedural Analysis Interprocedural analysis concerns analyzing a program

More information

TOPICS ON ECONOMICS AND COMPUTATION

TOPICS ON ECONOMICS AND COMPUTATION TOPICS ON ECONOMICS AND COMPUTATION Spring 2016 Instructor: Sigal Oren 1. Algorithmic game theory 2. Digital Economics Algorithmic Game Theory AI Theory Experimental Survey C 10 Survey R 10 Social Computing

More information

Stat 476 Life Contingencies II. Profit Testing

Stat 476 Life Contingencies II. Profit Testing Stat 476 Life Contingencies II Profit Testing Profit Testing Profit testing is commonly done by actuaries in life insurance companies. It s useful for a number of reasons: Setting premium rates or testing

More information

Reinforcement Learning (1): Discrete MDP, Value Iteration, Policy Iteration

Reinforcement Learning (1): Discrete MDP, Value Iteration, Policy Iteration Reinforcement Learning (1): Discrete MDP, Value Iteration, Policy Iteration Piyush Rai CS5350/6350: Machine Learning November 29, 2011 Reinforcement Learning Supervised Learning: Uses explicit supervision

More information

Mechanism Design and Auctions

Mechanism Design and Auctions Mechanism Design and Auctions Game Theory Algorithmic Game Theory 1 TOC Mechanism Design Basics Myerson s Lemma Revenue-Maximizing Auctions Near-Optimal Auctions Multi-Parameter Mechanism Design and the

More information

Advanced Operations Research Prof. G. Srinivasan Dept of Management Studies Indian Institute of Technology, Madras

Advanced Operations Research Prof. G. Srinivasan Dept of Management Studies Indian Institute of Technology, Madras Advanced Operations Research Prof. G. Srinivasan Dept of Management Studies Indian Institute of Technology, Madras Lecture 23 Minimum Cost Flow Problem In this lecture, we will discuss the minimum cost

More information

Decision Theory: Value Iteration

Decision Theory: Value Iteration Decision Theory: Value Iteration CPSC 322 Decision Theory 4 Textbook 9.5 Decision Theory: Value Iteration CPSC 322 Decision Theory 4, Slide 1 Lecture Overview 1 Recap 2 Policies 3 Value Iteration Decision

More information

Lecture 17: More on Markov Decision Processes. Reinforcement learning

Lecture 17: More on Markov Decision Processes. Reinforcement learning Lecture 17: More on Markov Decision Processes. Reinforcement learning Learning a model: maximum likelihood Learning a value function directly Monte Carlo Temporal-difference (TD) learning COMP-424, Lecture

More information

Reinforcement Learning (1): Discrete MDP, Value Iteration, Policy Iteration

Reinforcement Learning (1): Discrete MDP, Value Iteration, Policy Iteration Reinforcement Learning (1): Discrete MDP, Value Iteration, Policy Iteration Piyush Rai CS5350/6350: Machine Learning November 29, 2011 Reinforcement Learning Supervised Learning: Uses explicit supervision

More information

Introduction to Dynamic Programming

Introduction to Dynamic Programming Introduction to Dynamic Programming http://bicmr.pku.edu.cn/~wenzw/bigdata2018.html Acknowledgement: this slides is based on Prof. Mengdi Wang s and Prof. Dimitri Bertsekas lecture notes Outline 2/65 1

More information

Advanced Operations Research Prof. G. Srinivasan Department of Management Studies Indian Institute of Technology, Madras

Advanced Operations Research Prof. G. Srinivasan Department of Management Studies Indian Institute of Technology, Madras Advanced Operations Research Prof. G. Srinivasan Department of Management Studies Indian Institute of Technology, Madras Lecture 21 Successive Shortest Path Problem In this lecture, we continue our discussion

More information

Practical SAT Solving

Practical SAT Solving Practical SAT Solving Lecture 1 Carsten Sinz, Tomáš Balyo April 18, 2016 NSTITUTE FOR THEORETICAL COMPUTER SCIENCE KIT University of the State of Baden-Wuerttemberg and National Laboratory of the Helmholtz

More information

Lecture 1 Introduction and Definition of TU games

Lecture 1 Introduction and Definition of TU games Lecture 1 Introduction and Definition of TU games 1.1 Introduction Game theory is composed by different fields. Probably the most well known is the field of strategic games that analyse interaction between

More information

Lecture 10: The knapsack problem

Lecture 10: The knapsack problem Optimization Methods in Finance (EPFL, Fall 2010) Lecture 10: The knapsack problem 24.11.2010 Lecturer: Prof. Friedrich Eisenbrand Scribe: Anu Harjula The knapsack problem The Knapsack problem is a problem

More information

Overview: Representation Techniques

Overview: Representation Techniques 1 Overview: Representation Techniques Week 6 Representations for classical planning problems deterministic environment; complete information Week 7 Logic programs for problem representations including

More information

Algorithmic Trading using Reinforcement Learning augmented with Hidden Markov Model

Algorithmic Trading using Reinforcement Learning augmented with Hidden Markov Model Algorithmic Trading using Reinforcement Learning augmented with Hidden Markov Model Simerjot Kaur (sk3391) Stanford University Abstract This work presents a novel algorithmic trading system based on reinforcement

More information

Example: Grid World. CS 188: Artificial Intelligence Markov Decision Processes II. Recap: MDPs. Optimal Quantities

Example: Grid World. CS 188: Artificial Intelligence Markov Decision Processes II. Recap: MDPs. Optimal Quantities CS 188: Artificial Intelligence Markov Deciion Procee II Intructor: Dan Klein and Pieter Abbeel --- Univerity of California, Berkeley [Thee lide were created by Dan Klein and Pieter Abbeel for CS188 Intro

More information

MDPs: Bellman Equations, Value Iteration

MDPs: Bellman Equations, Value Iteration MDPs: Bellman Equations, Value Iteration Sutton & Barto Ch 4 (Cf. AIMA Ch 17, Section 2-3) Adapted from slides kindly shared by Stuart Russell Sutton & Barto Ch 4 (Cf. AIMA Ch 17, Section 2-3) 1 Appreciations

More information

Design and Analysis of Algorithms 演算法設計與分析. Lecture 9 November 19, 2014 洪國寶

Design and Analysis of Algorithms 演算法設計與分析. Lecture 9 November 19, 2014 洪國寶 Design and Analysis of Algorithms 演算法設計與分析 Lecture 9 November 19, 2014 洪國寶 1 Outline Advanced data structures Binary heaps(review) Binomial heaps Fibonacci heaps Data structures for disjoint sets 2 Mergeable

More information

FE670 Algorithmic Trading Strategies. Stevens Institute of Technology

FE670 Algorithmic Trading Strategies. Stevens Institute of Technology FE670 Algorithmic Trading Strategies Lecture 4. Cross-Sectional Models and Trading Strategies Steve Yang Stevens Institute of Technology 09/26/2013 Outline 1 Cross-Sectional Methods for Evaluation of Factor

More information

MS-E2114 Investment Science Lecture 4: Applied interest rate analysis

MS-E2114 Investment Science Lecture 4: Applied interest rate analysis MS-E2114 Investment Science Lecture 4: Applied interest rate analysis A. Salo, T. Seeve Systems Analysis Laboratory Department of System Analysis and Mathematics Aalto University, School of Science Overview

More information

Intelligent Systems (AI-2)

Intelligent Systems (AI-2) Intelligent Systems (AI-2) Computer Science cpsc422, Lecture 9 Sep, 28, 2016 Slide 1 CPSC 422, Lecture 9 An MDP Approach to Multi-Category Patient Scheduling in a Diagnostic Facility Adapted from: Matthew

More information

Lecture 7: Bayesian approach to MAB - Gittins index

Lecture 7: Bayesian approach to MAB - Gittins index Advanced Topics in Machine Learning and Algorithmic Game Theory Lecture 7: Bayesian approach to MAB - Gittins index Lecturer: Yishay Mansour Scribe: Mariano Schain 7.1 Introduction In the Bayesian approach

More information

Markov Decision Processes. CS 486/686: Introduction to Artificial Intelligence

Markov Decision Processes. CS 486/686: Introduction to Artificial Intelligence Markov Decision Processes CS 486/686: Introduction to Artificial Intelligence 1 Outline Markov Chains Discounted Rewards Markov Decision Processes (MDP) - Value Iteration - Policy Iteration 2 Markov Chains

More information

Solution algorithm for Boz-Mendoza JME by Enrique G. Mendoza University of Pennsylvania, NBER & PIER

Solution algorithm for Boz-Mendoza JME by Enrique G. Mendoza University of Pennsylvania, NBER & PIER Solution algorithm for Boz-Mendoza JME 2014 by Enrique G. Mendoza University of Pennsylvania, NBER & PIER Two-stage solution method At each date t of a sequence of T periods of observed realizations of

More information

Financial Optimization ISE 347/447. Lecture 15. Dr. Ted Ralphs

Financial Optimization ISE 347/447. Lecture 15. Dr. Ted Ralphs Financial Optimization ISE 347/447 Lecture 15 Dr. Ted Ralphs ISE 347/447 Lecture 15 1 Reading for This Lecture C&T Chapter 12 ISE 347/447 Lecture 15 2 Stock Market Indices A stock market index is a statistic

More information

4 Reinforcement Learning Basic Algorithms

4 Reinforcement Learning Basic Algorithms Learning in Complex Systems Spring 2011 Lecture Notes Nahum Shimkin 4 Reinforcement Learning Basic Algorithms 4.1 Introduction RL methods essentially deal with the solution of (optimal) control problems

More information

Basic form of optimization of design Combines: Production function - Technical efficiency Input cost function, c(x) Economic efficiency

Basic form of optimization of design Combines: Production function - Technical efficiency Input cost function, c(x) Economic efficiency Marginal Analysis Outline 1. Definition 2. Assumptions 3. Optimality criteria Analysis Interpretation Application 4. Expansion path 5. Cost function 6. Economies of scale Massachusetts Institute of Technology

More information

CS 188: Artificial Intelligence. Outline

CS 188: Artificial Intelligence. Outline C 188: Artificial Intelligence Markov Decision Processes (MDPs) Pieter Abbeel UC Berkeley ome slides adapted from Dan Klein 1 Outline Markov Decision Processes (MDPs) Formalism Value iteration In essence

More information

CS599: Algorithm Design in Strategic Settings Fall 2012 Lecture 6: Prior-Free Single-Parameter Mechanism Design (Continued)

CS599: Algorithm Design in Strategic Settings Fall 2012 Lecture 6: Prior-Free Single-Parameter Mechanism Design (Continued) CS599: Algorithm Design in Strategic Settings Fall 2012 Lecture 6: Prior-Free Single-Parameter Mechanism Design (Continued) Instructor: Shaddin Dughmi Administrivia Homework 1 due today. Homework 2 out

More information

Revenue Management Under the Markov Chain Choice Model

Revenue Management Under the Markov Chain Choice Model Revenue Management Under the Markov Chain Choice Model Jacob B. Feldman School of Operations Research and Information Engineering, Cornell University, Ithaca, New York 14853, USA jbf232@cornell.edu Huseyin

More information

Handout 4: Deterministic Systems and the Shortest Path Problem

Handout 4: Deterministic Systems and the Shortest Path Problem SEEM 3470: Dynamic Optimization and Applications 2013 14 Second Term Handout 4: Deterministic Systems and the Shortest Path Problem Instructor: Shiqian Ma January 27, 2014 Suggested Reading: Bertsekas

More information

Design and Analysis of Algorithms. Lecture 9 November 20, 2013 洪國寶

Design and Analysis of Algorithms. Lecture 9 November 20, 2013 洪國寶 Design and Analysis of Algorithms 演算法設計與分析 Lecture 9 November 20, 2013 洪國寶 1 Outline Advanced data structures Binary heaps (review) Binomial heaps Fibonacci heaps Dt Data structures t for disjoint dijitsets

More information

Handout for Unit 4 for Applied Corporate Finance

Handout for Unit 4 for Applied Corporate Finance Handout for Unit 4 for Applied Corporate Finance Unit 4 Capital Structure Contents 1. Types of Financing 2. Financing Choices 3. How much debt is good? 4. Debt Benefits vs Costs 5. Approaches to arriving

More information

Information Theory and Coding Prof. S. N. Merchant Department of Electrical Engineering Indian Institute of Technology, Bombay

Information Theory and Coding Prof. S. N. Merchant Department of Electrical Engineering Indian Institute of Technology, Bombay Information Theory and Coding Prof. S. N. Merchant Department of Electrical Engineering Indian Institute of Technology, Bombay Lecture - 15 Adaptive Huffman Coding Part I Huffman code are optimal for a

More information

Reinforcement Learning 04 - Monte Carlo. Elena, Xi

Reinforcement Learning 04 - Monte Carlo. Elena, Xi Reinforcement Learning 04 - Monte Carlo Elena, Xi Previous lecture 2 Markov Decision Processes Markov decision processes formally describe an environment for reinforcement learning where the environment

More information

Homework solutions, Chapter 8

Homework solutions, Chapter 8 Homework solutions, Chapter 8 NOTE: We might think of 8.1 as being a section devoted to setting up the networks and 8.2 as solving them, but only 8.2 has a homework section. Section 8.2 2. Use Dijkstra

More information

Stock valuation. Chapter 10

Stock valuation. Chapter 10 Stock valuation Chapter 10 1 Principles Applied in This Chapter Principle 1: Money Has a Time Value. Principle 2: There is a Risk Reward Tradeoff. Principle 3: Cash Flows are the Source of Value. Principle

More information

3. The Dynamic Programming Algorithm (cont d)

3. The Dynamic Programming Algorithm (cont d) 3. The Dynamic Programming Algorithm (cont d) Last lecture e introduced the DPA. In this lecture, e first apply the DPA to the chess match example, and then sho ho to deal ith problems that do not match

More information

Fundamental Algorithms - Surprise Test

Fundamental Algorithms - Surprise Test Technische Universität München Fakultät für Informatik Lehrstuhl für Effiziente Algorithmen Dmytro Chibisov Sandeep Sadanandan Winter Semester 007/08 Sheet Model Test January 16, 008 Fundamental Algorithms

More information

Discrete Mathematics for CS Spring 2008 David Wagner Final Exam

Discrete Mathematics for CS Spring 2008 David Wagner Final Exam CS 70 Discrete Mathematics for CS Spring 2008 David Wagner Final Exam PRINT your name:, (last) SIGN your name: (first) PRINT your Unix account login: Your section time (e.g., Tue 3pm): Name of the person

More information

COS402- Artificial Intelligence Fall Lecture 17: MDP: Value Iteration and Policy Iteration

COS402- Artificial Intelligence Fall Lecture 17: MDP: Value Iteration and Policy Iteration COS402- Artificial Intelligence Fall 2015 Lecture 17: MDP: Value Iteration and Policy Iteration Outline The Bellman equation and Bellman update Contraction Value iteration Policy iteration The Bellman

More information

Monte-Carlo Planning: Introduction and Bandit Basics. Alan Fern

Monte-Carlo Planning: Introduction and Bandit Basics. Alan Fern Monte-Carlo Planning: Introduction and Bandit Basics Alan Fern 1 Large Worlds We have considered basic model-based planning algorithms Model-based planning: assumes MDP model is available Methods we learned

More information

Node betweenness centrality: the definition.

Node betweenness centrality: the definition. Brandes algorithm These notes supplement the notes and slides for Task 11. They do not add any new material, but may be helpful in understanding the Brandes algorithm for calculating node betweenness centrality.

More information

Column generation to solve planning problems

Column generation to solve planning problems Column generation to solve planning problems ALGORITMe Han Hoogeveen 1 Continuous Knapsack problem We are given n items with integral weight a j ; integral value c j. B is a given integer. Goal: Find a

More information

CPS 270: Artificial Intelligence Markov decision processes, POMDPs

CPS 270: Artificial Intelligence  Markov decision processes, POMDPs CPS 270: Artificial Intelligence http://www.cs.duke.edu/courses/fall08/cps270/ Markov decision processes, POMDPs Instructor: Vincent Conitzer Warmup: a Markov process with rewards We derive some reward

More information

1.6 Heap ordered trees

1.6 Heap ordered trees 1.6 Heap ordered trees A heap ordered tree is a tree satisfying the following condition. The key of a node is not greater than that of each child if any In a heap ordered tree, we can not implement find

More information

CS599: Algorithm Design in Strategic Settings Fall 2012 Lecture 4: Prior-Free Single-Parameter Mechanism Design. Instructor: Shaddin Dughmi

CS599: Algorithm Design in Strategic Settings Fall 2012 Lecture 4: Prior-Free Single-Parameter Mechanism Design. Instructor: Shaddin Dughmi CS599: Algorithm Design in Strategic Settings Fall 2012 Lecture 4: Prior-Free Single-Parameter Mechanism Design Instructor: Shaddin Dughmi Administrivia HW out, due Friday 10/5 Very hard (I think) Discuss

More information

EC316a: Advanced Scientific Computation, Fall Discrete time, continuous state dynamic models: solution methods

EC316a: Advanced Scientific Computation, Fall Discrete time, continuous state dynamic models: solution methods EC316a: Advanced Scientific Computation, Fall 2003 Notes Section 4 Discrete time, continuous state dynamic models: solution methods We consider now solution methods for discrete time models in which decisions

More information