
9 January 2004
revised 18 January 2004

INTRODUCTION TO MATHEMATICAL MODELLING
LECTURES 3-4: BASIC PROBABILITY THEORY

Project in Geometry and Physics, Department of Mathematics
University of California/San Diego, La Jolla, CA 92093-0112
http://math.ucsd.edu/~dmeyer/; dmeyer@math.ucsd.edu

Example

Suppose we observe a gambler enter a casino with $100 in his/her pocket, and then leave a few hours later with $87. This is a situation we might want to model, particularly if we are thinking about entering the casino ourselves. Without any additional information, i.e., any additional data, there is little more that we can predict than that if someone else goes into the casino with $100 and leaves after the same amount of time, s/he will also have only $87. This is certainly a case in which we must adjust our data collection.

So suppose we enter the casino and find that the gambler is repeatedly playing a very simple game (this is not intended to be realistic): s/he flips a coin, winning a dollar if it comes up heads, and losing a dollar if it comes up tails. After playing 100 times, the gambler leaves the casino. This gives us much more information with which to build our model, although perhaps not as much as we might like.

Probabilistic models

In principle (to the extent that physics is classical), if we could measure exactly how the coin is being flipped, exactly how the coin is shaped and weighted, exactly the gravitational acceleration, exactly how the coin bounces when it hits the table, exactly how the air currents are blowing on the coin while it is in the air, etc., we could do a complicated physics calculation and determine whether the coin will land head up or tail up. In practice, of course, we cannot know most of these details, so we summarize our ignorance by saying that the coin comes up heads with probability p, i.e., a fraction p of the time, where 0 ≤ p ≤ 1.

That is, probability is an accounting for the effects of parts of the world that we are not trying to model (exogenous variables) and about which we have only limited knowledge. There are always such effects in any real situation, which is why we are starting this course with a discussion of them, rather than pretending that we can

usually make complete mathematical models and then only discussing probabilistic effects at the end of the course. Most of the models that we discuss will not make deterministic predictions that something will certainly happen, but rather make probabilistic predictions that different outcomes will occur different fractions of the time.

The binomial distribution

If the gambler only plays once, it is easy to make a prediction:

    outcome   payoff   probability
    H          +$1     p
    T          -$1     1-p

A fraction p of the time s/he will leave the casino with $101, and a fraction 1-p of the time s/he will leave the casino with $99. Playing twice is not much harder to understand:

    outcome   payoff   probability
    HH         +$2     p^2
    HT          $0     p(1-p)
    TH          $0     (1-p)p
    TT         -$2     (1-p)^2

So a fraction p^2 of the time the gambler will leave with $102, a fraction 2p(1-p) of the time with $100, and a fraction (1-p)^2 of the time with $98. Finally, if the gambler plays three times we compute:

    outcome   payoff   probability
    HHH        +$3     p^3
    HHT        +$1     p^2(1-p)
    HTH        +$1     p^2(1-p)
    THH        +$1     p^2(1-p)
    TTH        -$1     p(1-p)^2
    THT        -$1     p(1-p)^2
    HTT        -$1     p(1-p)^2
    TTT        -$3     (1-p)^3

Now there are four results: $103, $101, $99 and $97, which we predict to occur with probabilities p^3, 3p^2(1-p), 3p(1-p)^2 and (1-p)^3, respectively.
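These enumerations are easy to reproduce programmatically. The following is a minimal Python sketch, not part of the original notes, that lists every outcome for n = 3 flips and groups the sequence probabilities by payoff; the names p, n and payoff_prob are our own, and p = 0.6 is an arbitrary illustrative value.

    from itertools import product
    from collections import defaultdict

    p = 0.6   # illustrative probability of heads (an assumption, not from the notes)
    n = 3     # number of flips

    payoff_prob = defaultdict(float)
    for outcome in product("HT", repeat=n):
        heads = outcome.count("H")
        # probability of this specific sequence: p^heads * (1-p)^(n-heads)
        prob = p**heads * (1 - p)**(n - heads)
        # +$1 for each head, -$1 for each tail
        payoff_prob[heads - (n - heads)] += prob

    for payoff in sorted(payoff_prob, reverse=True):
        print(f"${100 + payoff}: probability {payoff_prob[payoff]:.4f}")

With p = 0.6 this prints probabilities 0.216, 0.432, 0.288 and 0.064 for $103, $101, $99 and $97, matching p^3, 3p^2(1-p), 3p(1-p)^2 and (1-p)^3 above.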

It would be extremely tedious to analyze the case of 100 coin flips like this. Fortunately, we can be cleverer. Notice that the payoffs depend only on how many heads there are, and that for a given total number of flips, n, the probability of a specific outcome with w heads is the same, p^w (1-p)^{n-w}, no matter when the heads appear. (We have made an assumption here, that the outcomes of different coin flips are independent, which we will discuss later.)

To figure out how many outcomes there are with w heads, imagine that the coins are labelled from 1 to n. We can arrange them in n(n-1)(n-2) \cdots 3 \cdot 2 \cdot 1 = n! different orders, by picking any of n for the first, any of the remaining n-1 for the second, etc. Not all of these correspond to different outcomes, however, since any ordering with the heads in the same positions is the same outcome, no matter in what order the labels on the heads are arranged. But these w labels can be arranged in w! orders by the same argument. Similarly, the n-w labels on the tail up coins can be arranged in (n-w)! ways. So the total number of different outcomes with w heads is

    (no. orders of n coins) / [(no. orders of w heads)(no. orders of n-w tails)] = n! / (w!(n-w)!) = \binom{n}{w},

where the last symbol is pronounced "n choose w", and is called a binomial coefficient. Multiplying the number of different outcomes with w heads by the probability of a specific outcome with w heads gives the probability that the gambler will win w times out of n:

    \binom{n}{w} p^w (1-p)^{n-w}.

Random variables

This is an example of a probability function: We say that the number of heads, W, is a random variable, and the probability that W = w is

    prob(W = w) = \binom{n}{w} p^w (1-p)^{n-w}.

For any probability function, if we add the probabilities of every possible outcome we must get 1; in this case:

    1 = \sum_{w=0}^{n} prob(W = w) = \sum_{w=0}^{n} \binom{n}{w} p^w (1-p)^{n-w}.    (3.1)

Homework: Read Larsen & Marx [1], pp. 135-136. Show algebraically that the sum on the right of eq. 3.1 equals 1.
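As a quick numerical companion to that homework (again our own illustration, not from the notes), Python's math.comb computes \binom{n}{w} directly, and the binomial probabilities for n = 100 flips do sum to 1:

    from math import comb

    n, p = 100, 0.5   # illustrative values
    pmf = [comb(n, w) * p**w * (1 - p)**(n - w) for w in range(n + 1)]
    print(sum(pmf))    # 1.0 (up to floating-point rounding), as eq. 3.1 requires
    print(comb(3, 2))  # 3 outcomes with two heads in three flips, as in the table above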

The expectation value of a random variable, E[W], is the probability-weighted average of its possible values; in this case:

    E[W] = \sum_{w=0}^{n} w prob(W = w)
         = \sum_{w=0}^{n} w \binom{n}{w} p^w (1-p)^{n-w}
         = \sum_{w=0}^{n} w \frac{n!}{w!(n-w)!} p^w (1-p)^{n-w}
         = \sum_{w=1}^{n} w \frac{n!}{w!(n-w)!} p^w (1-p)^{n-w}    (since the w = 0 term in the sum is 0)
         = \sum_{w=1}^{n} \frac{n!}{(w-1)!(n-w)!} p^w (1-p)^{n-w}
         = np \sum_{w=1}^{n} \frac{(n-1)!}{(w-1)! ((n-1)-(w-1))!} p^{w-1} (1-p)^{(n-1)-(w-1)}    (since (n-1)-(w-1) = n-w)
         = np \sum_{v=0}^{n-1} \frac{(n-1)!}{v! ((n-1)-v)!} p^v (1-p)^{(n-1)-v}    (letting v = w-1)
         = np,    (using eq. 3.1 with n replaced by n-1)

which is what you most likely expected. That is, for a fair coin (p = 1/2), the expected number of times the gambler who plays 100 times will win is 50, so his/her expected payoff is $50 - $50 = $0.

Of course, this does not happen every time; there is some variation in the outcomes. The variance of a random variable, Var[W], is the probability-weighted average of the squared differences of its possible values from E[W]:

    Var[W] = \sum_{w} (w - E[W])^2 prob(W = w).

Evaluating the variance of the binomial distribution is substantially more complicated than evaluating the expectation value. But we can avoid doing the algebra by learning a little more about basic probability theory, which will also show us how to compute the expectation value much more easily than the calculation above.
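The same probability function also lets us evaluate E[W] and Var[W] numerically, straight from their definitions and with no algebra at all; a sketch of our own, which anticipates the value np(1-p) derived for the variance below:

    from math import comb

    n, p = 100, 0.5   # illustrative values
    pmf = {w: comb(n, w) * p**w * (1 - p)**(n - w) for w in range(n + 1)}

    E = sum(w * pr for w, pr in pmf.items())              # probability-weighted average
    Var = sum((w - E)**2 * pr for w, pr in pmf.items())   # weighted squared deviations
    print(E, n * p)                # both 50.0: E[W] = np
    print(Var, n * p * (1 - p))   # both 25.0: Var[W] = np(1-p), derived below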

Sums of random variables

Notice that W = X_1 + \cdots + X_n, where X_i is a random variable that can take two values:

    X_i = { 1 if the i-th coin flipped lands head up;
          { 0 if the i-th coin flipped lands tail up.

That is, the number of wins can be computed by adding 1 for each coin that lands head up, and 0 for each coin that lands tail up. It is easy to calculate the expectation value of X_i. From the definition as the probability-weighted sum of the possible outcomes we have:

    E[X_i] = p \cdot 1 + (1-p) \cdot 0 = p.

Now consider the case n = 2, i.e., W = X_1 + X_2:

    E[X_1 + X_2] = \sum_{x_1,x_2} (x_1 + x_2) prob(X_1 = x_1 ∧ X_2 = x_2)
                 = \sum_{x_1,x_2} x_1 prob(X_1 = x_1 ∧ X_2 = x_2) + \sum_{x_1,x_2} x_2 prob(X_1 = x_1 ∧ X_2 = x_2)
                 = \sum_{x_1} x_1 \sum_{x_2} prob(X_1 = x_1 ∧ X_2 = x_2) + \sum_{x_2} x_2 \sum_{x_1} prob(X_1 = x_1 ∧ X_2 = x_2)
                 = \sum_{x_1} x_1 prob(X_1 = x_1) + \sum_{x_2} x_2 prob(X_2 = x_2)
                 = E[X_1] + E[X_2],

where x_i ∈ {0, 1} and ∧ means "and". The penultimate equality follows from the fact that for any random variables X and Y,

    prob(X = x) = \sum_{y} prob(X = x ∧ Y = y).

Thus, in general, the expectation value of a sum of random variables is the sum of their expectation values. In particular,

    E[W] = E[X_1 + \cdots + X_n] = E[X_1] + \cdots + E[X_n] = np,

which is what we computed previously, with considerably more effort.

Homework: Use mathematical induction to prove the middle equality in this equation, for all n ∈ N.

So we have shown that expectation values of random variables add. We might ask, "Do they multiply?" The answer is, "Sometimes." They do if an important property holds: Two random variables X and Y are independent if and only if

    prob(X = x ∧ Y = y) = prob(X = x) prob(Y = y).

In this case we can compute the expectation value of their product:

    E[XY] = \sum_{x,y} xy prob(X = x ∧ Y = y)
          = \sum_{x,y} xy prob(X = x) prob(Y = y)
          = \sum_{x} x prob(X = x) \sum_{y} y prob(Y = y)
          = E[X] E[Y].
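Both facts, additivity of expectation values in general and multiplicativity under independence, can be checked by simulation. Here is a small sketch of our own, with all names and parameter values hypothetical:

    import random

    random.seed(0)
    p, trials = 0.5, 200_000

    def flip():
        # X_i: 1 for heads (probability p), 0 for tails
        return 1 if random.random() < p else 0

    xs = [flip() for _ in range(trials)]
    ys = [flip() for _ in range(trials)]   # generated independently of xs

    def mean(v):
        return sum(v) / len(v)

    # E[X+Y] vs E[X] + E[Y]: agree (no independence needed)
    print(mean([x + y for x, y in zip(xs, ys)]), mean(xs) + mean(ys))
    # E[XY] vs E[X]E[Y]: agree here because xs and ys are independent
    print(mean([x * y for x, y in zip(xs, ys)]), mean(xs) * mean(ys))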

We use this fact to study the variance of a sum of independent random variables:

    Var[X + Y] = \sum_{x,y} (x + y - E[X] - E[Y])^2 prob(X = x ∧ Y = y)
               = \sum_{x,y} ((x - E[X]) + (y - E[Y]))^2 prob(X = x ∧ Y = y)
               = \sum_{x,y} ((x - E[X])^2 + 2(x - E[X])(y - E[Y]) + (y - E[Y])^2) prob(X = x ∧ Y = y)
               = \sum_{x,y} (x - E[X])^2 prob(X = x ∧ Y = y)
                 + 2 \sum_{x,y} (x - E[X])(y - E[Y]) prob(X = x ∧ Y = y)
                 + \sum_{x,y} (y - E[Y])^2 prob(X = x ∧ Y = y)
               = \sum_{x} (x - E[X])^2 prob(X = x)
                 + 2 \sum_{x,y} (x - E[X])(y - E[Y]) prob(X = x) prob(Y = y)
                 + \sum_{y} (y - E[Y])^2 prob(Y = y)
               = Var[X] + Var[Y] + 2 \sum_{x} (x - E[X]) prob(X = x) \sum_{y} (y - E[Y]) prob(Y = y)
               = Var[X] + Var[Y],

where the last equality follows from

    \sum_{x} (x - E[X]) prob(X = x) = \sum_{x} x prob(X = x) - E[X] \sum_{x} prob(X = x) = E[X] - E[X] \cdot 1 = 0.

For the coin flipping game, the outcomes of different coin flips are assumed to be independent, and

    Var[X_i] = p (1 - p)^2 + (1 - p)(0 - p)^2 = p(1-p),

so

    Var[W] = Var[X_1] + \cdots + Var[X_n] = np(1-p).
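To see Var[W] = np(1-p) emerge empirically, a simulation (our own sketch, with arbitrary seed and sample sizes) of many visits to the casino estimates the mean and variance of the number of wins in n = 100 flips:

    import random

    random.seed(1)
    n, p, visits = 100, 0.5, 50_000

    # number of wins W for each simulated 100-flip visit
    wins = [sum(1 for _ in range(n) if random.random() < p) for _ in range(visits)]

    m = sum(wins) / visits
    var = sum((w - m)**2 for w in wins) / visits
    print(m, var)   # close to E[W] = np = 50 and Var[W] = np(1-p) = 25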

Now that we have computed the expectation value and the variance of the binomial distribution, we can investigate how close it is to a normal distribution with the same mean and variance, just as we did with the height data in Lecture 2 [2]. Figure 3.1 shows the results for two binomial distributions with n = 20. The central one (in red) has p = 1/2 and is very well approximated by the normal distribution with the same mean and variance (in blue). The left one (in green) has p = 1/4 and is slightly less well approximated by the corresponding normal distribution (in blue).

[Figure 3.1. Binomial distributions for n = 20 and p = 1/2 (central distribution, red), p = 1/4 (left distribution, green), and normal distributions with the same mean and variance, respectively.]

But the observation that each is well approximated by a normal distribution is correct, and is a consequence of the following theorem.

DE MOIVRE-LAPLACE LIMIT THEOREM. Let W be a binomial random variable describing the number of successes in n trials, each of which succeeds with probability p. For all a ≤ b ∈ R,

    \lim_{n \to \infty} prob( a ≤ (W - np) / \sqrt{np(1-p)} ≤ b ) = \frac{1}{\sqrt{2π}} \int_a^b e^{-w^2/2} dw.

We could equally well write this as: for large n,

    prob(a ≤ W ≤ b) ≈ \frac{1}{\sqrt{2π np(1-p)}} \int_a^b e^{-(w-np)^2 / (2np(1-p))} dw.

Both statements say that the area under either of the blue curves in Figure 3.1, between w = a and w = b, is approximately equal to the sum of the probabilities at the points w with a ≤ w ≤ b for the corresponding binomial probability function. The approximation is better for larger n, and for p further from 0 or 1.

Remember that we used the formula W = X_1 + \cdots + X_n to calculate the expectation value and variance of W. We can imagine a random variable that is the sum Y_1 + \cdots + Y_n for a sequence of random variables Y_i that are not simply binary valued like the X_i. Even for this more general situation, the same result holds:

CENTRAL LIMIT THEOREM. Let Y_1, Y_2, ... be an infinite sequence of independent random variables, each having the same distribution, with E[Y_i] = µ < ∞ and Var[Y_i] = σ^2 < ∞. For all a ≤ b ∈ R,

    \lim_{n \to \infty} prob( a ≤ (Y_1 + \cdots + Y_n - nµ) / (\sqrt{n} σ) ≤ b ) = \frac{1}{\sqrt{2π}} \int_a^b e^{-y^2/2} dy.
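The Central Limit Theorem is also easy to watch in action with summands that are not binary. In the sketch below (ours, not from the notes), the Y_i are uniform on [0, 1], so µ = 1/2 and σ^2 = 1/12 are values we supply for the illustration, and the empirical probability that the standardized sum lands in [a, b] is compared with the normal integral:

    import random
    from math import erf, sqrt

    random.seed(2)
    n, trials = 100, 20_000
    mu, sigma = 0.5, sqrt(1 / 12)   # mean and std dev of a uniform [0,1] variable
    a, b = -1.0, 1.0

    def standardized_sum():
        s = sum(random.random() for _ in range(n))
        return (s - n * mu) / (sqrt(n) * sigma)

    hits = sum(1 for _ in range(trials) if a <= standardized_sum() <= b)

    def Phi(z):
        # standard normal cumulative distribution function
        return 0.5 * (1 + erf(z / sqrt(2)))

    print(hits / trials, Phi(b) - Phi(a))   # both close to 0.6827

The binomial case of the De Moivre-Laplace theorem is recovered by replacing random.random() with a 0/1 coin flip and using µ = p, σ^2 = p(1-p).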

For proofs of these theorems, see any standard probability/statistics text, e.g., Larsen & Marx [1]. The Central Limit Theorem is most interesting for us because it suggests a characteristic of mathematical models that would produce a normal distribution for some variable: the model likely includes several or many independent variables with similar distributions that are summed to give the variable with the normal distribution.

In the case of human height, the data we have already seen indicate that male/female is an important factor, and we would expect that nutrition is also important. The observation that the distribution for each sex is separately approximately normal suggests further that there might be a genetic model that involves the action of multiple genes, each of which can contribute to larger or smaller height. This is in contrast to the famous pea plants originally studied by Mendel [3], which are only tall or short, i.e., have a binary distribution, not a normal one. From what we have learned in this lecture, we would expect pea plant height to be controlled by only one gene; this is true, and the action of that gene (Le) is now understood [4].

References

[1] R. J. Larsen and M. L. Marx, An Introduction to Mathematical Statistics and Its Applications (Upper Saddle River, NJ: Prentice Hall, 2001).

[2] D. A. Meyer, Introduction to Mathematical Modelling: Data collection, http://math.ucsd.edu/~dmeyer/teaching/111winter04/imm040107.pdf.

[3] G. Mendel, Versuche über Pflanzen-Hybriden, Verh. Natforsch. Ver. Brünn 4 (1866) 3-47; http://www.mendelweb.org/MWGerText.html; English transl. by C. T. Druery and W. Bateson, Experiments in plant hybridization, http://www.mendelweb.org/Mendel.html.

[4] D. R. Lester, J. J. Ross, P. J. Davies and J. B. Reid, Mendel's stem length gene (Le) encodes a gibberellin 3β-hydroxylase, Plant Cell 9 (1997) 1435-1443.