Basic Principles of Probability and Statistics. Lecture notes for PET 472, Spring 2012. Prepared by: Thomas W. Engler, Ph.D., P.E.



Definitions

Risk analysis: assessing the probabilities of occurrence for each possible outcome.

Elements of a risk analysis:
- Probabilities and probability distributions: representing judgments about chance events
- Modeling: geologic, reservoir, drilling, operations, economics
- Decision criteria: EV, profit, IRR
- Present to management for decision

Definitions

- Sample space: the complete set of outcomes (52 cards)
- Outcome: a subset of the sample space (drawing a 5 of any suit)
- Probability: the likelihood of drawing a 5, P(A) = 4/52

Definitions

- Equally likely outcomes: outcomes that have the same probability of occurring
- Mutually exclusive outcomes: the occurrence of any given outcome excludes the occurrence of the other outcomes
- Independent events: the occurrence of one outcome does not influence the occurrence of another
- Conditional probability: the probability of an outcome depends on one or more events that have previously occurred

Rules of Operation

Symbol    Definition                                     Expression
P(A)      Probability of outcome A occurring
P(A+B)    Probability of outcome A and/or B occurring    P(A+B) = P(A) + P(B) - P(AB)
P(AB)     Probability of A and B occurring               P(AB) = P(A) P(B|A)
P(A|B)    Probability of A given B has occurred

Rules of Operation: Addition Theorem

P(A+B) = P(A) + P(B) - P(AB)

Example (mutually exclusive events)
- Outcome A: drawing a 4, 5, or 6 of any suit
- Outcome B: drawing a J or Q of any suit

P(A) = 12/52, P(B) = 8/52, P(AB) = 0
P(A+B) = 12/52 + 8/52 - 0 = 20/52

[Venn diagram: A and B do not overlap]

Rules of Operation: Addition Theorem

P(A+B) = P(A) + P(B) - P(AB)

Example (overlapping events)
- Outcome A: drawing a 4, 5, or 6 of any suit
- Outcome B: drawing a diamond

P(A) = 12/52, P(B) = 13/52, P(AB) = 3/52
P(A+B) = 12/52 + 13/52 - 3/52 = 22/52

[Venn diagram: A and B overlap]
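The two addition-theorem examples can be checked by enumerating a deck. The following is a minimal sketch, not part of the original notes; the names `DECK`, `prob`, and `addition_theorem` are illustrative assumptions.

```python
from fractions import Fraction
from itertools import product

# Build a standard 52-card deck as (rank, suit) pairs.
RANKS = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
SUITS = ["clubs", "diamonds", "hearts", "spades"]
DECK = list(product(RANKS, SUITS))


def prob(event):
    """Probability of an event (a set of cards) for one draw from the full deck."""
    return Fraction(len(event), len(DECK))


def addition_theorem(a, b):
    """P(A+B) = P(A) + P(B) - P(AB), with P(AB) taken from the set intersection."""
    return prob(a) + prob(b) - prob(a & b)


# Outcome A: a 4, 5, or 6 of any suit.
A = {card for card in DECK if card[0] in {"4", "5", "6"}}
# Outcome B, first example: a J or Q of any suit (mutually exclusive with A).
B_FACE = {card for card in DECK if card[0] in {"J", "Q"}}
# Outcome B, second example: any diamond (overlaps A in three cards).
B_DIAMOND = {card for card in DECK if card[1] == "diamonds"}

p_exclusive = addition_theorem(A, B_FACE)
p_overlap = addition_theorem(A, B_DIAMOND)
print(p_exclusive, p_overlap)  # fractions are printed in lowest terms
```

Using `Fraction` keeps the arithmetic exact, so the results can be compared directly with the 20/52 and 22/52 given in the examples.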

Rules of Operation: Multiplication Theorem

P(AB) = P(A) P(B|A)

Example (sampling without replacement)
- Outcome A: drawing any jack
- Outcome B: drawing the four of hearts on the second draw

P(A) = 4/52, P(B|A) = 1/51 (conditional)
P(AB) = (4/52)(1/51) = 1/663

Sampling without replacement:
- the observed outcome is not returned to the sample space
- the draws form a series of dependent events

Rules of Operation: Multiplication Theorem

P(AB) = P(A) P(B)

Example (sampling with replacement)
- Outcome A: drawing any jack, then returning it to the deck
- Outcome B: drawing the four of hearts on the second draw

P(A) = 4/52, P(B) = 1/52
P(AB) = (4/52)(1/52) = 1/676

Sampling with replacement:
- the observed outcome is returned to the sample space
- the draws form a series of independent events
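As a quick arithmetic check of the two multiplication-theorem examples, the sketch below compares the dependent case (without replacement) with the independent case (with replacement). The variable names are illustrative assumptions.

```python
from fractions import Fraction

# Outcome A: any jack on the first draw.
p_jack = Fraction(4, 52)

# Without replacement: 51 cards remain, one of which is the four of hearts.
p_four_given_jack = Fraction(1, 51)
p_without_replacement = p_jack * p_four_given_jack

# With replacement: the jack goes back, so the second draw sees all 52 cards.
p_four = Fraction(1, 52)
p_with_replacement = p_jack * p_four

print(p_without_replacement, p_with_replacement)
```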

Example: exploration example involving conditional probabilities

Decision: drill the prospect, or farm out and retain an override.

Tabulated gross per-well reserves for existing wells:

EUR, Bcf   Number of wells   Percent of wells having these reserves, P(B|A)
2           7                35%
3           7                35%
4           4                20%
5           2                10%
Total      20

These are conditional probabilities. That is, given a well is productive, there is a 35% chance of producing 2 Bcf.

NPV for the alternatives (each EMV entry is P(B|A) x NPV, that is, conditional on a productive well):

            Drill option           Farmout option
EUR, Bcf    NPV, $     EMV, $      NPV, $     EMV, $
2            40000     14000        9000       3150
3            90000     31500       12500       4375
4           130000     26000       15000       3000
5           200000     20000       18000       1800
Total                  91500                  12325

Example (continued)

Dry hole cost = $70,000
Probability of finding gas, P(A) = 0.25

Apply the multiplication theorem: what is the probability of finding gas and that reserves are 2 Bcf?

P(AB) = P(A) P(B|A)

Possible outcome   P(A)    P(B|A)   P(AB)
dry hole           0.25             0.7500
2 Bcf              0.25    0.35     0.0875
3 Bcf              0.25    0.35     0.0875
4 Bcf              0.25    0.20     0.0500
5 Bcf              0.25    0.10     0.0250
Total                               1.0000

The dry-hole probability is 1 - P(A) = 0.75.

EMV calculations:

                              Drill well           Farmout
Possible outcome   P(AB)      NPV, $    EMV, $     NPV, $    EMV, $
dry hole           0.7500     -70000    -52500         0         0
2 Bcf              0.0875      40000      3500      9000       788
3 Bcf              0.0875      90000      7875     12500      1094
4 Bcf              0.0500     130000      6500     15000       750
5 Bcf              0.0250     200000      5000     18000       450
Total              1.0000               -29625                3081
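The EMV table can be reproduced in a few lines. This is a sketch under the example's stated inputs (P(A) = 0.25, dry hole cost of $70,000, and the tabulated conditional probabilities and NPVs); the function name `emv` and the dictionary layout are assumptions made for illustration.

```python
P_GAS = 0.25            # P(A), probability of finding gas
DRY_HOLE_COST = 70_000  # $, lost if the well is dry

# EUR (Bcf) -> (P(B|A), drill NPV in $, farmout NPV in $), from the example tables.
RESERVES = {
    2: (0.35, 40_000, 9_000),
    3: (0.35, 90_000, 12_500),
    4: (0.20, 130_000, 15_000),
    5: (0.10, 200_000, 18_000),
}


def emv(p_success):
    """Return (EMV of drilling, EMV of farming out) for a given success probability."""
    drill = (1.0 - p_success) * (-DRY_HOLE_COST)  # dry-hole branch
    farmout = 0.0                                 # a dry hole costs nothing under farmout
    for p_conditional, npv_drill, npv_farmout in RESERVES.values():
        p_joint = p_success * p_conditional       # multiplication theorem, P(AB)
        drill += p_joint * npv_drill
        farmout += p_joint * npv_farmout
    return drill, farmout


emv_drill, emv_farmout = emv(P_GAS)
print(round(emv_drill), round(emv_farmout))
```

With a 25% chance of success the drill option has a negative EMV while the farmout stays positive, which matches the totals in the table.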

Example (continued)

Find the minimum probability of success required to justify drilling, (ps)min.

ps      EMV drill, $   EMV farmout, $
0       -70000             0
0.25    -29625          3081

The two EMV lines cross at (ps)min = 0.47.

[Figure: EMV, $ (-90000 to 50000) versus ps (0 to 0.6) for the drill and farmout options]
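Because both EMVs are linear in ps, the break-even point can also be solved directly rather than read from the plot. The sketch below assumes the success-case expected values from the earlier table ($91,500 for drilling and $12,325 for the farmout); the constant and function names are illustrative.

```python
DRY_HOLE_COST = 70_000             # $
EV_DRILL_GIVEN_SUCCESS = 91_500    # $, sum of P(B|A) x NPV for the drill option
EV_FARMOUT_GIVEN_SUCCESS = 12_325  # $, sum of P(B|A) x NPV for the farmout option


def emv_drill(ps):
    """EMV of drilling: success branch minus the dry-hole branch."""
    return ps * EV_DRILL_GIVEN_SUCCESS - (1.0 - ps) * DRY_HOLE_COST


def emv_farmout(ps):
    """EMV of farming out: a dry hole costs nothing."""
    return ps * EV_FARMOUT_GIVEN_SUCCESS


# Set emv_drill(ps) = emv_farmout(ps) and solve the linear equation for ps.
ps_min = DRY_HOLE_COST / (
    DRY_HOLE_COST + EV_DRILL_GIVEN_SUCCESS - EV_FARMOUT_GIVEN_SUCCESS
)
print(round(ps_min, 2))
```

Drilling is preferred only when the estimated chance of success exceeds this value.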

Probability Distributions

A probability distribution is a graphical representation of the range and likelihoods of possible values of a random variable.

Random variable: a variable that can have more than one possible value; also known as a stochastic (non-deterministic) variable.

A probability distribution is:
- a useful method to describe a range of possible values
- the basis for Monte Carlo simulation

[Figure: probability density function, f(x) (frequency), versus the random variable x]

Probability Distributions: frequency distributions

Data:

Well No.   Net pay, ft
1          111
2           81
3          142
4           59
5          109
6           96
7          124
8          139
9           89
10         129
11         104
12         186
13          65
14          95
15          54
16          72
17         167
18         135
19          84
20         154

Divide the data into intervals, or bins:

Range, ft   Frequency   Percent
50-80        4           20%
81-110       7           35%
111-140      5           25%
141-170      3           15%
171-200      1            5%
Total       20          100%

[Figure: histogram representation of statistical data; frequency and percent versus net pay, feet]

Probability Distributions: cumulative frequency distributions

Range, ft   Frequency   Percent
50-80        4           20%
81-110       7           35%
111-140      5           25%
141-170      3           15%
171-200      1            5%
Total       20          100%

Net pay, ft      Cumulative percent
50 (minimum)       0%
80                20%
110               55%
140               80%
170               95%
200 (maximum)    100%

Benefits:
1. Can easily read probabilities
2. Necessary for Monte Carlo simulation

[Figure: cumulative percent (0% to 100%) versus net pay, feet (0 to 200)]
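The binning and accumulation steps can be sketched as follows, using the 20 net-pay values and the bin edges from the notes. The list and variable names are illustrative assumptions.

```python
# Net pay (ft) for wells 1 to 20, as tabulated in the notes.
NET_PAY_FT = [111, 81, 142, 59, 109, 96, 124, 139, 89, 129,
              104, 186, 65, 95, 54, 72, 167, 135, 84, 154]

# Inclusive bin limits (ft), as used in the notes.
BINS = [(50, 80), (81, 110), (111, 140), (141, 170), (171, 200)]

# Frequency: number of wells whose net pay falls in each bin.
counts = [sum(low <= x <= high for x in NET_PAY_FT) for low, high in BINS]

# Percent of wells per bin.
percents = [100 * c / len(NET_PAY_FT) for c in counts]

# Cumulative percent at the upper limit of each bin.
cumulative = []
running_total = 0.0
for pct in percents:
    running_total += pct
    cumulative.append(running_total)

for (low, high), c, pct, cum in zip(BINS, counts, percents, cumulative):
    print(f"{low}-{high} ft: n={c}, {pct:.0f}%, cumulative {cum:.0f}%")
```

Reading the cumulative column gives probabilities directly: for example, 55% of the wells have a net pay of 110 ft or less.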

Parameters of distributions

Parameters that describe the central tendency, or average, of the distribution:
- Mean, μ: weighted average value of the random variable
- Median: value of the random variable with equal likelihood above or below
- Mode: value most likely to occur

Parameters that describe the variability of the distribution:
- Variance, σ²: mean of the squared deviations about the mean
- Standard deviation, σ: square root of the variance; degree of dispersion of the distribution about the mean

[Figure: two distributions, A and B, with μa = μb and σa < σb]

Parameters of distributions: computing mean and standard deviation

1. Arithmetic average of a discrete sample data set

\[ \mu = \frac{\sum_{i=1}^{N} x_i}{N} \qquad \sigma = \sqrt{\frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}} \]

N = number of equally probable values

Core porosity and permeability:

Depth     k, md    φ, %
4807.5      2.5    17.0
4808.5     59      20.7
4809.5    221      19.1
4810.5    211      20.4
4811.5    275      23.3
4812.5    384      24.0
4813.5    108      23.3
4814.5    147      16.1
4815.5    290      17.2
4816.5    170      15.3
4817.5    278      15.9
4818.5    238      18.6
4819.5    167      16.2
4820.5    304      20.0
4821.5     98      16.9
4822.5    191      18.1
4823.5    266      20.3
4824.5     40      15.3
4825.5    260      15.1
4826.5    179      14.0
4827.5    312      15.6
4828.5    272      15.5
4829.5    395      19.4
4830.5    405      17.5
4831.5    275      16.4
4832.5    852      17.2
4833.5    610      15.5
4834.5    406      20.2
4835.5    535      18.3
4836.5    663      19.6
4837.5    597      17.7
4838.5    434      20.0
4839.5    339      16.8
4840.5    216      13.3
4841.5    332      18.0
4842.5    295      16.1
4843.5    882      15.1
4844.5    600      18.0
4845.5    407      15.7
4847.5    479      17.8
4847.5    139      20.5
4847.5    135       8.4

Porosity: μ = 17.6, σ = 2.87
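The porosity statistics can be recomputed from the tabulated column. The sketch below uses Python's standard `statistics` module and computes both the divide-by-N (population) and divide-by-(N - 1) (sample) standard deviations; which convention the notes used for the reported 2.87 is not stated, so both are shown. In this sketch the N - 1 form appears to reproduce 2.87, while dividing by N gives a slightly smaller value.

```python
import statistics

# Core porosity (%), in depth order, copied from the table in the notes.
POROSITY_PCT = [
    17.0, 20.7, 19.1, 20.4, 23.3, 24.0, 23.3, 16.1, 17.2, 15.3,
    15.9, 18.6, 16.2, 20.0, 16.9, 18.1, 20.3, 15.3, 15.1, 14.0,
    15.6, 15.5, 19.4, 17.5, 16.4, 17.2, 15.5, 20.2, 18.3, 19.6,
    17.7, 20.0, 16.8, 13.3, 18.0, 16.1, 15.1, 18.0, 15.7, 17.8,
    20.5, 8.4,
]

mean = statistics.fmean(POROSITY_PCT)             # arithmetic average
std_population = statistics.pstdev(POROSITY_PCT)  # divides by N
std_sample = statistics.stdev(POROSITY_PCT)       # divides by N - 1

print(f"N = {len(POROSITY_PCT)}")
print(f"mean = {mean:.2f}")
print(f"std (N) = {std_population:.2f}, std (N - 1) = {std_sample:.2f}")
```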

Parameters of distributions: computing mean and standard deviation

2. Values listed as frequencies in groups

\[ \mu = \frac{\sum_i n_i x_i}{\sum_i n_i} \qquad \sigma = \sqrt{\frac{\sum_i n_i (x_i - \mu)^2}{\sum_i n_i}} \]

i = index to denote the number of intervals
n_i = frequency of data points in each interval
x_i = midpoint value of each interval

i   Porosity interval   Frequency, n_i   Prob., p_i   Midpoint, x_i   Mean, p_i x_i   Deviation, (x_i - μ)²   Variance, p_i (x_i - μ)²
1    7 to < 10           1               0.024         8.5            0.202           85.342                  2.032
2   10 to < 12           0               0.000        11.0            0.000           45.402                  0.000
3   12 to < 14           1               0.024        13.0            0.310           22.450                  0.535
4   14 to < 16          10               0.238        15.0            3.571            7.497                  1.785
5   16 to < 18          12               0.286        17.0            4.857            0.545                  0.156
6   18 to < 20           8               0.190        19.0            3.619            1.592                  0.303
7   20 to < 22           7               0.167        21.0            3.500           10.640                  1.773
8   22 to < 25           3               0.071        23.5            1.679           33.200                  2.371
Total                   42               1.00                         μ = 17.74                               σ² = 8.96

σ = 2.993

Notes:
- Applicable for large data sets
- Results are approximate
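The grouped-data calculation can be sketched with the interval midpoints and frequencies from the porosity table. The name `GROUPS` and the tuple layout are illustrative assumptions.

```python
# (midpoint x_i in % porosity, frequency n_i) for each interval in the table.
GROUPS = [
    (8.5, 1), (11.0, 0), (13.0, 1), (15.0, 10),
    (17.0, 12), (19.0, 8), (21.0, 7), (23.5, 3),
]

n_total = sum(n for _, n in GROUPS)

# Frequency-weighted mean of the midpoints.
mean = sum(x * n for x, n in GROUPS) / n_total

# Frequency-weighted mean of squared deviations about that mean.
variance = sum(n * (x - mean) ** 2 for x, n in GROUPS) / n_total
std = variance ** 0.5

print(f"n = {n_total}, mean = {mean:.2f}, variance = {variance:.2f}, std = {std:.3f}")
```

The grouped mean differs slightly from the ungrouped average because every value in an interval is represented by the interval midpoint, which is why the notes call the results approximate.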

Parameters of distributions: computing mean and standard deviation

3. Discrete probability distributions

\[ \mu = \sum_i p_i x_i \qquad \sigma = \sqrt{\sum_i p_i (x_i - \mu)^2} \]

p_i is the probability of occurrence of the i-th value of the random variable.

Drilling costs, $M   Probability of range   Midpoint, $M   EV, $M   x_i p_i, $M   (x_i - μ)², ($M)²   p(x_i)(x_i - μ)², ($M)²
100.0                0
105.2                0.007                  102.6           0.7      0.7          1641.3               10.7
111.5                0.040                  108.4           4.3      4.5          1208.5               48.3
130.6                0.229                  121.1          27.7     29.9           486.8              111.5
136.3                0.093                  133.5          12.4     12.7            93.4                8.7
148.2                0.225                  142.3          32.0     33.3             0.7                0.2
165.2                0.278                  156.7          43.6     45.9           184.6               51.3
168.7                0.035                  167.0           5.8      5.9           568.2               19.9
178.5                0.066                  173.6          11.5     11.8           929.5               61.3
183.7                0.021                  181.1           3.8      3.9          1443.0               30.3
190.0                0.007                  186.9           1.3      1.3          1912.9               13.4
Total                                                     143.1    149.9                              355.6

μ = 143.1
Reported standard deviations: σ = 15.8 and σ = 18.9 (the square root of 355.6 is 18.9)

Parameters of distributions: computing mean and standard deviation

4. Cumulative frequency distribution

This slide repeats the drilling-cost table from method 3, with the same totals (mean 143.1; standard deviations 15.8 and 18.9).

[Figure: cumulative probability (0.0 to 1.0) versus drilling costs, $M (100.0 to 200.0)]

Types of distributions

- Normal
- Lognormal
- Uniform
- Triangle
- Binomial
- Multinomial
- Hypergeometric

Types of distributions: Normal

Characteristics:
- Defined by μ and σ
- Mode = mean = median
- Curve is symmetric
- Cumulative frequency graph is S-shaped
- Can normalize and obtain the area (probability) under the curve:

\[ t = \frac{x - \mu}{\sigma} \]

[Figure: f(x) versus x, centered on μ with σ marked on either side; cumulative frequency curve]
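Normalizing and reading the area under the curve can be sketched with `statistics.NormalDist` from the Python standard library. The mean and standard deviation are the core porosity values reported earlier in the notes; the 20% cutoff is a hypothetical value chosen only for illustration, and treating porosity as normally distributed is an assumption of this sketch.

```python
from statistics import NormalDist

MU = 17.6      # mean core porosity from the notes, %
SIGMA = 2.87   # standard deviation from the notes, %
CUTOFF = 20.0  # hypothetical porosity value of interest, %

# Normalize: number of standard deviations between the cutoff and the mean.
t = (CUTOFF - MU) / SIGMA

# Area under the standard normal curve to the left of t.
p_standard = NormalDist().cdf(t)

# The same probability taken directly from the unnormalized distribution.
p_direct = NormalDist(MU, SIGMA).cdf(CUTOFF)

print(f"t = {t:.3f}, P(porosity <= {CUTOFF}) = {p_standard:.3f}")
```

Both routes give the same probability, which is the point of normalizing: a single standard normal table (or function) serves any μ and σ.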

Types of distributions: Normal

Given a set of data, how do you know whether it is normally distributed?
- Shape of the curves
- Median = mean

Examples: porosity, fractional flow

[Figure: f(x) versus x, centered on μ with σ marked on either side; cumulative frequency curve]

Types of distributions: Lognormal

Characteristics:
- Defined by μ and σ
- Mode ≠ mean ≠ median
- Curve is asymmetric
- Cumulative frequency graph exhibits a rapid rise
- Can transform to a normal variable by y = ln(x)

[Figure: f(x) versus x, with the mode, median, and μ marked; cumulative frequency curve]

Types of distributions: Lognormal

Examples:
- permeability
- thickness
- oil recovery (bbls/acre-foot)
- field sizes in a play

[Figure: f(x) versus x, with the mode, median, and μ marked]

Types of distributions: Uniform

Characteristics:
- all values are equiprobable
- specify min and max
- allows for uncertainty
- used in Monte Carlo simulation

[Figure: f(x) constant between min and max; cumulative frequency rising linearly from 0 at min to 100% at max]

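Types of distributions: Triangle

Characteristics:
- values near the most likely value are more probable than values near the limits
- specify min (L, low), max (H, high), and the most likely value (M)
- allows for uncertainty
- used in Monte Carlo simulation

[Figure: triangular f(x) rising from L (min) to a peak at M and falling to H (max); cumulative frequency reaching 100% at max]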

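Types of distributions: Triangle

Convert to a cumulative frequency plot.

Normalize to a 0 to 1 scale and define m as:

\[ x' = \frac{x - L}{H - L} \qquad m = \frac{M - L}{H - L} \]

For x' ≤ m, the cumulative probability is given by:

\[ P(x \le x') = \frac{(x')^2}{m} \]

For x' > m:

\[ P(x \le x') = 1 - \frac{(1 - x')^2}{1 - m} \]

[Figure: triangular f(x) with L (low), M (most likely), and H (high)]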

Types of distributions: Triangle

Example

Estimated costs to drill a well vary from a minimum of $100,000 to a maximum of $200,000, with the most probable value at $130,000. Convert the probability distribution to a cumulative frequency distribution.

L = 100, M = 130, H = 200

x, random variable        x',            Cumulative
(drilling costs, $M)      normalized     probability
100                       0.0            0.000
110                       0.1            0.033
120                       0.2            0.133
130                       0.3            0.300
140                       0.4            0.486
150                       0.5            0.643
160                       0.6            0.771
170                       0.7            0.871
180                       0.8            0.943
190                       0.9            0.986
200                       1.0            1.000

[Figure: cumulative probability (0.0 to 1.0) versus drilling costs, $M (100 to 200)]
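The conversion from a triangular distribution to cumulative probabilities can be sketched as a small function and checked against the drilling-cost table. The function name `triangular_cdf` and its argument names are illustrative assumptions.

```python
def triangular_cdf(x, low, mode, high):
    """Cumulative probability P(X <= x) for a triangular distribution."""
    if x <= low:
        return 0.0
    if x >= high:
        return 1.0
    x_norm = (x - low) / (high - low)  # x', normalized to a 0 to 1 scale
    m = (mode - low) / (high - low)    # normalized most likely value
    if x_norm <= m:
        return x_norm ** 2 / m
    return 1.0 - (1.0 - x_norm) ** 2 / (1.0 - m)


# Drilling-cost example: L = 100, M = 130, H = 200 (costs in $M).
LOW, MODE, HIGH = 100, 130, 200
table = {cost: triangular_cdf(cost, LOW, MODE, HIGH) for cost in range(100, 201, 10)}

for cost, p in table.items():
    print(f"{cost}: {p:.3f}")
```

In a Monte Carlo simulation this cumulative curve is read in reverse: a uniform random number between 0 and 1 is entered on the probability axis and the corresponding cost is read off.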

Types of distributions: Binomial

Describes a stochastic process characterized by:
1. Only two outcomes can occur
2. Each trial is an independent event
3. The probability of each outcome remains constant over repeated trials
4. The binomial probability equation is given by:

\[ P(x) = C^n_x \, p^x (1 - p)^{n - x} \]

where
x = number of successes (0 ≤ x ≤ n)
n = total number of trials
p = probability of success on any given trial

and the combination of n things taken x at a time is

\[ C^n_x = \frac{n!}{x!\,(n - x)!} \]

Types of distributions: Binomial

Example

Your company proposes to drill 5 wells in a new basin where the chance of success is 0.15 per well.
- What is the probability of only one discovery in the five wells drilled?
- What is the probability of at least one discovery in the 5-well drilling program?

Number of discoveries, x   P(x)      Cumulative P(x)
0                          0.4437    0.4437
1                          0.3915    0.8352
2                          0.1382    0.9734
3                          0.0244    0.9978
4                          0.0022    0.9999
5                          0.0001    1.0000

[Figure: P(x) and cumulative P(x) (0.0 to 1.0) versus number of discoveries (0 to 5)]
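The two questions in the binomial example can be answered with a short sketch based on the binomial probability equation. The function name `binomial_pmf` is an illustrative assumption; `math.comb` requires Python 3.8 or later.

```python
from math import comb


def binomial_pmf(x, n, p):
    """Probability of exactly x successes in n independent trials."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)


N_WELLS = 5       # wells in the drilling program
P_SUCCESS = 0.15  # chance of success per well

pmf = [binomial_pmf(x, N_WELLS, P_SUCCESS) for x in range(N_WELLS + 1)]

p_exactly_one = pmf[1]         # only one discovery
p_at_least_one = 1.0 - pmf[0]  # complement of five dry holes

print(f"P(exactly one) = {p_exactly_one:.4f}")
print(f"P(at least one) = {p_at_least_one:.4f}")
```

Taking the complement of the zero-discovery case is the shortest route to "at least one", since it avoids summing the other five terms.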

Types of distributions: Multinomial

Describes a stochastic process characterized by:
1. Any number of discrete outcomes
2. Each trial is an independent event
3. The probability of each outcome remains constant over repeated trials
4. The multinomial probability equation is given by:

\[ P(x_1, x_2, \ldots, x_r) = \frac{n!}{x_1!\, x_2! \cdots x_r!} \, p_1^{x_1} p_2^{x_2} \cdots p_r^{x_r} \]

where
r = number of possible outcomes
x_1 = number of times outcome 1 occurs in n trials
x_2 = number of times outcome 2 occurs in n trials
x_r = number of times outcome r occurs in n trials
n = total number of trials
p_r = probability of outcome r on any given trial

Types of distributions: Multinomial

Example

Your company proposes to drill 10 wells in a new basin where the chance of success is 15% per well. What is the probability of obtaining 7 dry holes, 2 fields in the 1-2 mmbbl range, and 1 field in the 8-12 mmbbl range?

Outcome range, mmbbl   Probability of outcome
1-2                    0.08
2-4                    0.04
4-8                    0.02
8-12                   0.01
Total (success)        0.150
Dry hole               0.850

Number of trials (wells) in program: n = 10

Outcome         Number of occurrences
dry holes       x1 = 7
1-2 mmbbl       x2 = 2
2-4 mmbbl       x3 = 0
4-8 mmbbl       x4 = 0
8-12 mmbbl      x5 = 1

Result: P = 0.7%
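The multinomial example can be evaluated with a short sketch of the probability equation. The function name `multinomial_pmf` is an illustrative assumption; `math.prod` requires Python 3.8 or later.

```python
from math import factorial, prod


def multinomial_pmf(counts, probs):
    """Probability of observing each outcome counts[i] times in sum(counts) trials."""
    n = sum(counts)
    coefficient = factorial(n)
    for x in counts:
        coefficient //= factorial(x)  # exact: every partial quotient is an integer
    return coefficient * prod(p ** x for p, x in zip(probs, counts))


# Outcomes: dry hole, 1-2, 2-4, 4-8, and 8-12 mmbbl fields.
PROBS = [0.85, 0.08, 0.04, 0.02, 0.01]
COUNTS = [7, 2, 0, 0, 1]  # 10 wells in total

p_program = multinomial_pmf(COUNTS, PROBS)
print(f"P = {100 * p_program:.1f}%")
```

Outcomes that do not occur (a count of zero) contribute a factor of 1 to both the coefficient and the product, so they can stay in the lists without special handling.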

Types of distributions: Hypergeometric

Describes a stochastic process characterized by:
1. Any number of discrete outcomes
2. Each trial is dependent on the previous event (sampling without replacement)
3. The probability of each outcome changes from trial to trial, because sampled elements are not returned to the sample space
4. The hypergeometric probability equation for two possible outcomes is:

\[ P(x_1) = \frac{C^{d_1}_{x_1} \, C^{N - d_1}_{n - x_1}}{C^{N}_{n}} \]

where
n = number of trials
d_i = number of successes in the sample space before the n trials
x_i = number of successes in n trials
N = total number of elements in the sample space before the n trials
C^a_b = the number of combinations of a things taken b at a time

Types of distributions: Hypergeometric

Example

Our company has identified ten seismic anomalies of about equal size in a new offshore area. In an adjacent area, 30% of the drilled structures were oil productive. If we drill 5 wells (test 5 anomalies), what is the probability of two discoveries?

n = 5 (number of wells in the sample)
N = 10 (number of anomalies in the population)
d1 = 3 (successes in the population, 30% of 10)
x1 = 2 (successes in the sample)

Result: P = 42%
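The hypergeometric example can be checked with a short sketch of the two-outcome equation. The function and argument names are illustrative assumptions; `math.comb` requires Python 3.8 or later.

```python
from math import comb


def hypergeometric_pmf(x, n, d, total):
    """Probability of x successes in n draws without replacement.

    d is the number of successes among the `total` elements before drawing.
    """
    return comb(d, x) * comb(total - d, n - x) / comb(total, n)


N_ANOMALIES = 10  # N, seismic anomalies in the area
N_WELLS = 5       # n, anomalies to be drilled
N_PRODUCTIVE = 3  # d1, assumed productive anomalies (30% of 10)

p_two = hypergeometric_pmf(2, N_WELLS, N_PRODUCTIVE, N_ANOMALIES)
print(f"P(two discoveries) = {100 * p_two:.0f}%")
```

A binomial calculation with p = 0.3 would treat every well as an independent trial; the hypergeometric form instead accounts for the fixed pool of ten anomalies, of which only three are assumed productive.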