Phil 321, Week 2: Decisions under Ignorance
Decisions under Ignorance
1) Decision under risk: the agent can assign probabilities (conditional or unconditional) to each state.
2) Decision under ignorance: the agent can make no probability assignments.
Example: Buying a home
Choice: Vancouver, Richmond, or Burnaby. Costs are highest in Vancouver, lowest in Richmond. Use an ordinal scale.
E = a major earthquake occurs in the next 10 years.

             E    ~E
Vancouver    3     5
Richmond     1     6
Burnaby      2     4

Problem: given ignorance of the probabilities, how do we make a decision?
3.1 Dominance
Weak Dominance. A ≽ B iff val(A, s) ≥ val(B, s) for all states s. (Here, A and B are acts.)
Strong Dominance. A ≻ B iff val(A, s) ≥ val(B, s) for all states s, and val(A, s) > val(B, s) for at least one state s.
Note: we are using the same symbols as for preference rankings among outcomes.
Dominance Principle.
1. If B is a strongly dominated act, rule it out.
2. If A is a strongly dominant act, perform it.
In the house purchase: dominance rules out Burnaby, but does not help us decide between Richmond and Vancouver.
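The dominance checks above can be sketched in a few lines of Python. This is a minimal illustration, not from the text; the function names and dict layout are hypothetical, and the house-purchase values are the ordinal entries from the table above.

```python
def weakly_dominates(a, b):
    """A weakly dominates B iff val(A, s) >= val(B, s) in every state s."""
    return all(x >= y for x, y in zip(a, b))

def strongly_dominates(a, b):
    """Weak dominance, plus strict inequality in at least one state."""
    return weakly_dominates(a, b) and any(x > y for x, y in zip(a, b))

# House-purchase matrix (columns: E, ~E), ordinal values from the notes.
acts = {"Vancouver": [3, 5], "Richmond": [1, 6], "Burnaby": [2, 4]}

# Dominance Principle, step 1: rule out strongly dominated acts.
survivors = {name: row for name, row in acts.items()
             if not any(strongly_dominates(other, row)
                        for o, other in acts.items() if o != name)}
```

Running this removes only Burnaby (dominated by Vancouver), leaving Vancouver and Richmond undecided between, as the notes say.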
3.2 Maximin and Leximin
Maximin. Find the minimum possible value for each act. Choose the act whose minimum value is maximal. That is: choose the act that maximizes the minimal possible value. A ≽ B iff MIN(A) ≥ MIN(B).
House purchase: Vancouver.
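A minimal maximin sketch in Python, using the house-purchase matrix (the dict layout and function name are illustrative, not from the text):

```python
# House-purchase matrix (columns: E, ~E), ordinal values from the notes.
house = {"Vancouver": [3, 5], "Richmond": [1, 6], "Burnaby": [2, 4]}

def maximin(acts):
    """Return the acts whose worst-case value is maximal (ties possible)."""
    best = max(min(row) for row in acts.values())
    return [name for name, row in acts.items() if min(row) == best]
```

Here the worst cases are 3, 1, and 2, so the rule returns Vancouver, matching the notes.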
What about ties?

Example 1:
R1: 1  3  3  5
R2: 0  7 10  8
R3: 1  2  4  4
R4: 1  3  4  5
Maximin: not R2. (What else?)

Example 2:
R1: 1  1  1 10
R2: 1  1  0  1
R3: 1  2  3  4
R4: 1  2 10  5
Maximin: not R2. (What else?)
Leximin. In case of a tie (for worst outcome), maximize the second-worst value (etc.).
Text: A ≻ B iff for some n, MIN_n(A) > MIN_n(B), and MIN_m(A) = MIN_m(B) for all m < n (where MIN_n(A) is the n-th worst value of A).
Idea: eliminate all rows except those that are tied under maximin. Cross out all minimum entries in these rows and compare the next-lowest entry, etc.
Problem: in Example 1 and Example 2, Leximin as stated would pick R1! (Crossing out all copies of the current minimum lets R1's repeated low values vanish in a single step.) Is it best to interpret the lexical rule so as to pick R4?
Modified Leximin
Leximin*: first eliminate strongly dominated acts, then apply Leximin (gives R4 in Ex. 1).
Leximin**: apply Leximin by crossing out only one entry at a time (gives R4 in Ex. 1 and Ex. 2).
Note: even the amended versions can yield bad results. (We don't know the probabilities.)
Evaluation of Maximin and Leximin
Positive rationale for the Maximin and Leximin rules. Conservatism: avoid the worst outcome.
Objections:
1) Lost opportunities (R2 in Example 1).
2) Probabilistic presuppositions: a Murphy's-law attitude (the worst outcome is treated as most likely). Lottery counterexample:

      s1   s2  ...  s100
R1   100    1  ...    1
R2     0   99  ...   99      [100 columns]

3) Intuitive counterexamples. Religious belief: because you think there is a tiny chance of hell, you should become a believer? Day-to-day life: because I might get in a bad accident, I should never drive my car? Science/business: I should never invest in a new idea?
3.3 Best Average, Maximax, Optimism-Pessimism Rule
1. Maximax: maximize the MAXIMAL value obtainable. Formally: find MAX(A) for each A, and then A ≽ B iff MAX(A) ≥ MAX(B). Example 1: R2. Example 2: R1 and R4 tied; the lexical version gives R4. House purchase?
2. Best Average: for each option, compute AVG = (MAX + MIN) / 2. Then choose the option that maximizes AVG. Formally: find AVG(A) for each A, and then A ≽ B iff AVG(A) ≥ AVG(B). Example 1: R2. Example 2: R1 and R4 tied; the lexical version gives R4. House purchase?
3. Optimism-pessimism rule: for each A, compute VAL_α(A) = α·MAX(A) + (1 − α)·MIN(A). Then choose the option that maximizes this quantity. (α = 1 gives maximax, α = 0 gives maximin, and α = 1/2 gives Best Average.)
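Since maximax and Best Average are the α = 1 and α = 1/2 special cases, one function covers all three rules. A sketch in Python over Example 1 (function names are mine):

```python
example1 = {"R1": [1, 3, 3, 5], "R2": [0, 7, 10, 8],
            "R3": [1, 2, 4, 4], "R4": [1, 3, 4, 5]}

def val_alpha(row, alpha):
    """VAL_alpha(A) = alpha * MAX(A) + (1 - alpha) * MIN(A)."""
    return alpha * max(row) + (1 - alpha) * min(row)

def optimism_pessimism(acts, alpha):
    """Return the acts maximizing VAL_alpha (ties possible)."""
    best = max(val_alpha(row, alpha) for row in acts.values())
    return [n for n, row in acts.items() if val_alpha(row, alpha) == best]
```

With α = 1/2 (Best Average) and α = 1 (maximax), Example 1 gives R2 both times; with α = 0 it reduces to maximin's three-way tie.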
Evaluation
Rationale: avoids choices that might be catastrophic and choices that miss great opportunities (except when α = 0 or α = 1).
Objections:
1) If 0 < α < 1: we need an interval scale. Results may vary under ordinal transformations.

Example 1* (same ordinal ranking as Example 1):
R1: 2  3     3  5
R2: 0  5.25  6  5.5
R3: 2  2.5   4  4
R4: 2  3     4  5
Best Average now recommends R1 or R4.

2) Counterexamples where there is some knowledge of the probabilities (the lottery counterexample for maximin).
3) Ignores all values other than the best and worst (Table 3.11, p. 48).
3.4 Minimax Regret Rule
Regret for act-state pair (A, s) = (MAX value in column s) − (value of A in s).
Max regret for action A = the largest regret value on its row.
Minimax Regret Rule: choose the action that minimizes maximum regret.

Example 1: [Step 1: highlight the maximum in each column]
R1: 1  3  3  5
R2: 0  7 10  8
R3: 1  2  4  4
R4: 1  3  4  5

Regret table: [Step 2: construct the regret table; highlight the worst entry on each row]
R1: 0  4  7  3    max regret 7
R2: 1  0  0  0    max regret 1
R3: 0  5  6  4    max regret 6
R4: 0  4  6  3    max regret 6

R2 minimizes maximum regret. [Step 3: select the action that minimizes max regret]
Use the lexical version for ties.
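The three steps translate directly into Python. A sketch over Example 1, assuming regret(A, s) = column maximum minus val(A, s), matching the positive entries in the regret table:

```python
example1 = {"R1": [1, 3, 3, 5], "R2": [0, 7, 10, 8],
            "R3": [1, 2, 4, 4], "R4": [1, 3, 4, 5]}

# Step 1: the maximum value in each column (state).
cols = list(zip(*example1.values()))
col_max = [max(c) for c in cols]                       # [1, 7, 10, 8]

# Step 2: the regret table -- column max minus the act's value in that state.
regret = {name: [m - v for m, v in zip(col_max, row)]
          for name, row in example1.items()}

# Step 3: pick the act whose worst (largest) regret is smallest.
best = min(regret, key=lambda name: max(regret[name]))
```

This reproduces the table above (e.g. R1's regret row is 0, 4, 7, 3) and selects R2, whose maximum regret is 1.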
Evaluation
Positive: makes the decision that minimizes lost opportunity.
Objections:
1. Needs an interval scale.
2. Not invariant under expansion by irrelevant alternatives.
3. Probabilistic counterexamples (the lottery counterexample).
3.5 Principle of Insufficient Reason
PIR (for probability): if there are n possible states and we have no more reason to believe any one of them to be true than any other, then we should assign each state an equal probability (namely, 1/n).
PIR (for decisions under ignorance): maximize expected utility on the basis of equal probability for each state. (Shortcut: just sum each row and then pick the row with the largest sum.)
Ex: in Example 1, recommends R2.
Ex: in Example 2, recommends R4.
Ex: in the lottery counterexample, recommends R2.
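The shortcut is one line of Python: with equal probabilities 1/n, expected utility is proportional to the row sum. A sketch using Example 1 and the 100-column lottery matrix from the notes (function name is mine):

```python
example1 = {"R1": [1, 3, 3, 5], "R2": [0, 7, 10, 8],
            "R3": [1, 2, 4, 4], "R4": [1, 3, 4, 5]}
# Lottery counterexample: R1 pays 100 on ticket 1, else 1; R2 pays 0 on
# ticket 1, else 99.
lottery = {"R1": [100] + [1] * 99, "R2": [0] + [99] * 99}

def pir(acts):
    """Equal 1/n probabilities make expected utility proportional to row sum."""
    return max(acts, key=lambda name: sum(acts[name]))
```

Row sums in Example 1 are 12, 25, 11, 13, so PIR recommends R2; in the lottery matrix, 199 vs. 9801 also gives R2.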
Evaluation
Rationale: with no good reason to assign any probabilities, you should treat all states equally (and assign them equal probability).
Objections:
1) Presupposes an interval scale.
2) Arbitrariness of the assumption of equiprobability. (E.g., religious belief; the marriage example in the text.)
3) Possibility of catastrophe (given any bet at 2:1 or better, we should bet all our savings).
*4) Inconsistency of the equiprobability assumption: sensitivity to how states are individuated. Example: the lottery counterexample. Re-describe the states as "ticket 1 wins" and "one of tickets 2-100 wins." The new table is 2x2, and R1 is recommended (expected values 50.5 vs. 49.5).
3.6 Randomised Acts
Mixed strategy: instead of choosing a pure act (one of the rows), use a randomizing device to select the act.
Example: with two options (R1 and R2), flip a fair coin to decide. If the original decision matrix is

R1: 1  0
R2: 0  1

we now add a row for the mixed strategy, so the matrix becomes

R1: 1  0
R2: 0  1
R3: ½  ½

R3 is recommended by Maximin and the Regret Rule, and is tied under Best Average.
Objections:
- the ½ entries are not a certainty, only an expected value;
- it is not clear what is gained by randomising.
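The mixed row is just the state-by-state expected value (1/2)·R1 + (1/2)·R2. A sketch in Python, using exact fractions so the ½ entries stay exact (the variable names are mine):

```python
from fractions import Fraction

# Pure acts from the 2x2 matrix in the notes.
r1 = [1, 0]
r2 = [0, 1]

# The coin-flip mixture: each entry is (1/2)*R1 + (1/2)*R2 in that state.
half = Fraction(1, 2)
r3 = [half * a + half * b for a, b in zip(r1, r2)]     # [1/2, 1/2]

# Maximin over the expanded matrix now favors the mixture.
acts = {"R1": r1, "R2": r2, "R3": r3}
best = max(acts, key=lambda name: min(acts[name]))
```

R1 and R2 each have a worst case of 0, while R3's worst case is ½, so maximin selects the mixed strategy, as the notes claim.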
Dealing with Disagreement
How should we deal with disagreement among the rules about the rational action (or actions)?
1. Majority rule.
2. Weighting.
3. Axioms/rationality constraints: try to use these to knock out one or more decision principles.
A) Mixture condition. If a rational agent is indifferent between A1 and A2, then the agent is also indifferent between A1, A2, and the 50-50 mixture [½A1, ½A2]. Eliminates every rule except Best Avg. and PIR. The condition may be too strong: it presupposes something about probabilities.
B) Irrelevant Expansion condition. Wipes out Best Avg. (by taking ½p, ½q) and the Minimax Regret Rule.
Upshot: only PIR survives all of these. But these axioms are only desiderata, not absolutely essential.
4. Case by case (see graph).
Summary of criteria for evaluating decision rules:
1. Consistent with the Dominance Principle.
2. Invariant under ordinal transformations.
3. Invariant under positive linear transformations (i.e., requires only an interval scale).
4. Invariant under expansion by irrelevant alternatives.
5. Takes advantage of opportunities.
6. Handles intuitive counterexamples.
7. Handles probabilistic objections.
8. Does not appear to yield arbitrary decisions.
Application to Distributive Justice (the Rawls-Harsanyi Debate)
Example 1: 10 people are trying to divide $1,000.
Step 1: the money is divided into 10 piles (some of which may be $0).
Step 2: each person is assigned one pile by an *unknown* mechanism.
Consider four options:
O1: equal piles ($100 each)
O2: all to one person ($1,000)
O3: five equal piles of $200 (the other five piles are $0)
O4: graduated ($10, $30, $50, ..., $190)
And consider where you might end up.
Maximin: recommends O1. Minimax regret: O2. Best average: O2. Indifference: all equally good.
Example 2: Choosing a just society.
Rawls's Difference Principle: society S1 is better than S2 if the worst-off members of S1 fare better than the worst-off members of S2.
Harsanyi: S1 is better than S2 if the average utility is higher in S1 than in S2.
Veil of ignorance: an imaginary situation in which agents know their preferences and have a complete description of rival societies, but do not know *which* role they will assume in society (nor what the probabilities are).
Useful test for principles of justice: would agents choose them from behind the veil of ignorance?
Harsanyi: agents will assume the principle of indifference (an equal chance of being anybody) and will therefore seek to maximize expected average utility. Hence, utilitarianism determines which society is better.
Rawls: agents will use the maximin principle. Hence, agents will evaluate societies by looking at the position of the worst off. Hence, the difference principle determines which society is better.