The main question is: What is the possibility of occurrence of a larger storm than our culvert or bridge is barely capable of handling? (What is the likelihood of failure?)


Module 8: Probability and Statistical Methods in Water Resources Engineering

Bob Pitt
University of Alabama
Tuscaloosa, AL

Flow data are available from numerous USGS-operated flow recording stations. Data are usually available on a real-time basis. Besides the current flow conditions, summaries of extreme flows are also tabulated and available. These data can be used to predict the frequency of the extreme flows of most interest for design.

Probability and Statistics

Design storms are frequently used to size hydrologic structures. The main question is: What is the possibility of occurrence of a larger storm than our culvert or bridge is barely capable of handling? (What is the likelihood of failure?) We must be able to evaluate the probability of hydrologic events. A frequent statistical goal is therefore to fit a standard probability distribution to observed precipitation and runoff data.

Probability and Return Periods

Concepts of probability and statistics are closely associated, particularly when dealing with a set of data. The probability of the occurrence of a particular event is simply the chance that the event will occur. If there is a finite number of events, a not-necessarily-equal probability may be assigned to each. If the possible outcomes cover a continuous range of values, the probability can only be expressed by a mathematical function.

What is the probability of rolling a 1 with a single roll of a six-faced die? Since there are six equally likely outcomes, the probability of obtaining a 1 is 1/6. What is the probability of rolling either a 1 or a 4? There are 2 possible outcomes, so the probability of obtaining either a 1 or a 4 is 2/6.

The probability of equaling or exceeding a particular value is a cumulative probability. Our interest centers on the probability that an event will be equaled or exceeded within a given time frame. $p$ is the probability that an event of a specific magnitude is equaled or exceeded in a given year. Therefore, $1 - p$ is the probability that the event is not equaled or exceeded in that year.

The return period expresses the average interval between events, but it does not give specific information concerning the likelihood of occurrence during the design life of the project. What is the probability that the discharge of 5,000 cfs with a return period of 50 years will occur during the life of a dam? Will it occur at all? Can it occur more than once?

The probability that the discharge is not equaled or exceeded in two years is $(1-p)(1-p)$, or in general $(1-p)^N$, where $N$ is the period of interest in years. The probability $J$ that the event will occur at least once during $N$ years is:

$$J = 1 - (1 - p)^N$$

If a discharge of 5,000 cfs has a probability $p = 0.02$, then the probability is 2 percent that this discharge will be equaled or exceeded in any one year (and 98% that it will not). The reciprocal of $p$ is the return period, or recurrence interval, $t_p$. This is the average time interval between discharges that are equal to, or greater than, a specified discharge:

$$t_p = \frac{1}{p}$$

(Prasuhn 1987)
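The risk relation $J = 1 - (1 - p)^N$ and the return period $t_p = 1/p$ are easy to check numerically. The following is a minimal Python sketch; the function names are illustrative and are not part of the original notes.

```python
def return_period(p: float) -> float:
    """Recurrence interval t_p (years) for annual exceedance probability p."""
    return 1.0 / p


def risk_of_exceedance(p: float, n_years: int) -> float:
    """Probability J that an event with annual exceedance probability p
    is equaled or exceeded at least once during n_years."""
    return 1.0 - (1.0 - p) ** n_years


if __name__ == "__main__":
    p = 0.02  # annual exceedance probability of the 50-year event
    print(f"t_p = {return_period(p):.0f} years")
    for n in (50, 5):
        # Risk over a 50-year design life and a 5-year construction period
        print(f"N = {n:2d} years: J = {100 * risk_of_exceedance(p, n):.1f}%")
```

The same function answers the 100-year storm questions by setting `p = 0.01` and varying `n_years`.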

If a dam has a design life of 50 years ($N$ = 50 years), the 5,000 cfs flow (having a recurrence interval, $t_p$, of 50 years) will occur, or be exceeded, with a probability of 63.6 percent (not 100%!). What is the probability that this flow will occur (or be exceeded) during the 5-year construction period ($N$ = 5 years)? (Answer: 9.6%.) What is the probability that a 100-year storm will occur at least once during a 100-year period? During a 50-year period? During 1 year?

Uniform distributions are of little interest in hydrology, but are a simple place to start, and are similar to the die problem.

[Figure: probability distribution function and cumulative distribution function of a uniform distribution (Prasuhn 1987)]

Any value of $x$ between $a$ and $b$ has an equal likelihood of occurring; $a$ and $b$ are the lower and upper limits of $x$. The probability that an outcome will be equal to, or greater than, a particular value of $x$ is:

$$p = 1 - F(x) = \int_{x}^{\infty} f(u)\,du$$

where $f$ is the probability distribution function and $F$ is the cumulative distribution function.

Probability Distributions

A probability density function (PDF) is a continuous mathematical expression that determines the probability of a specific event. The distribution that best fits the set of data is expected to give the best estimate of the probability of an event that has not been observed. Actual discharges or precipitation values over a period of years form a continuous function, because any value is possible within a broad range. We will only examine a few possible probability distributions.

Normal distributions are familiar bell-shaped curves:

$$f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left[-\frac{(x - \bar{x})^2}{2\sigma^2}\right]$$

(Prasuhn 1987)

The normal distribution PDF is defined by two distribution parameters, the mean and the standard deviation. The mean is the average of all of the observations:

$$\bar{x} = \frac{1}{N}\sum_{i=1}^{N} x_i$$

and is at the center of the distribution. The standard deviation describes the width of the distribution (the spread of the data):

$$\sigma = \left[\frac{\sum_{i=1}^{N} (x_i - \bar{x})^2}{N - 1}\right]^{1/2}$$

About 68% of the data is within ±1 standard deviation of the mean, while about 99.7% is within ±3 standard deviations. (Prasuhn 1987)

The normal distribution does not provide a satisfactory fit to flood discharges and other hydrologic data. The distribution extends from negative to positive infinity and therefore assigns a probability to negative flows.

A specific event $x$ can be related to the probability of exceedance $p$ by:

$$x = \bar{x} + K\sigma$$

where $K$ is the frequency factor (given in the following table for specific values of $p$). Using actual data, $K$ can be calculated:

$$K = \frac{x - \bar{x}}{\sigma}$$

This is the number of standard deviations the data point is located from the mean.

If actual hydrologic data are to be expressed with respect to $p$, the probability of exceedance in a year, the data set will often be based on the single peak value observed for each year (the annual series).

Example 5-1 (Prasuhn 1987)

Determine the probability that a discharge of 20,000 cfs will be equaled or exceeded in any one year, if the mean of the annual series of river discharges is 10,000 cfs and the standard deviation is (1) 3,000 cfs, or (2) 6,000 cfs. What is the return period in each case?

Solution:

(1) $K = \dfrac{20{,}000\ \text{cfs} - 10{,}000\ \text{cfs}}{3{,}000\ \text{cfs}} = 3.33$

Looking up this value of $K$ on the previous table yields a $p$ of about 0.0005, and the corresponding recurrence interval:

$$t_p = \frac{1}{p} = \frac{1}{0.0005} = 2{,}000\ \text{years}$$

(2) $K = \dfrac{20{,}000\ \text{cfs} - 10{,}000\ \text{cfs}}{6{,}000\ \text{cfs}} = 1.67$

The corresponding $p$ value on the table is about 0.05:

$$t_p = \frac{1}{p} = \frac{1}{0.05} = 20\ \text{years}$$

Therefore, the same value can have vastly different recurrence intervals, even with the same average flow rates, as the standard deviation changes.

Log-Normal Distributions

The log-normal distribution is closely related to the normal distribution. The values are transformed by taking the $\log_{10}$ of the data. This distribution is much more useful than the basic normal distribution, as no negative numbers are allowed while large positive values are acceptable. The following plot shows this distorted distribution in real (nontransformed) space:

[Figure: log-normal distribution plotted in real (nontransformed) space]

This distribution assumes that the logarithms of the discharges are normally distributed. The prior equations describing the mean and standard deviation can be used if the following transformation is applied:

$$y = \log x$$

The mean of the logarithms themselves can be expressed directly:

$$\overline{\log x} = \frac{1}{N}\sum_{i=1}^{N} \log x_i$$

As an alternative, the mean can be found by taking the logarithm of the geometric mean of the set of values:

$$\overline{\log x} = \log\left[(x_1 x_2 x_3 \cdots x_N)^{1/N}\right]$$

The standard deviation can also be directly calculated, where the values are both based on the logarithms of the actual data:

$$\sigma_{\log x} = \left[\frac{\sum_{i=1}^{N} (\log x_i - \overline{\log x})^2}{N - 1}\right]^{1/2}$$

The probability of exceedance can be related to the occurrence using:

$$\log x = \overline{\log x} + K\sigma_{\log x}$$

The frequency factors may be determined from the prior table for normal distributions.
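The frequency factor table for the normal distribution can also be approximated numerically. The sketch below is an illustration, not part of the original notes: it computes $K = (x - \bar{x})/\sigma$ and then the exceedance probability of a standard normal variate using the complementary error function from Python's standard library. The inputs are those of Example 5-1 (20,000 cfs, mean 10,000 cfs, standard deviations of 3,000 and 6,000 cfs); the function names are hypothetical.

```python
import math


def frequency_factor(x: float, mean: float, std: float) -> float:
    """Number of standard deviations that x lies from the mean."""
    return (x - mean) / std


def normal_exceedance(k: float) -> float:
    """Probability that a standard normal variate equals or exceeds k."""
    return 0.5 * math.erfc(k / math.sqrt(2.0))


if __name__ == "__main__":
    for std in (3000.0, 6000.0):
        k = frequency_factor(20000.0, 10000.0, std)
        p = normal_exceedance(k)
        print(f"sigma = {std:.0f} cfs: K = {k:.2f}, p = {p:.5f}, "
              f"t_p = {1.0 / p:.0f} years")
```

The computed probabilities differ slightly from the rounded table values quoted in the example ("about 0.0005" and "about 0.05"), so the return periods will not match the rounded 2,000 and 20 years exactly. For a log-normal analysis, the same two functions can be applied to the logarithms of the flows.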

Log Pearson Type III Distribution

The problem with most hydrologic data is that an equal data spread does not occur above and below the mean. The lower side is limited to the range from the mean to zero, while there is no limit to the upper range. This results in a skewed distribution that may not be completely corrected by the log-normal distribution. The coefficient of skew, $a$, is defined by:

$$a = \frac{N \sum_{i=1}^{N} (x_i - \bar{x})^3}{(N - 1)(N - 2)\,\sigma^3}$$

(Prasuhn 1987)

The use of log values in the log-normal distribution tends to reduce the distribution distortion. However, some skewness usually remains. To determine the skewness when using log values, the following can be used:

$$a = \frac{N \sum_{i=1}^{N} (\log x_i - \overline{\log x})^3}{(N - 1)(N - 2)\,\sigma_{\log x}^3}$$

The normal and log-normal distributions assume zero skew. If some skew exists in the data, these distributions result in errors. The log Pearson Type III distribution was developed in 1967 to improve the fit of hydrologic data. This method uses a third parameter, the skew coefficient, in addition to the mean and standard deviation. The following table gives the frequency factors for this distribution. The zero skew values correspond to the normal distribution.
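The three log Pearson Type III parameters (mean, standard deviation, and skew coefficient of the logarithms) can be computed directly from an annual series. The following Python sketch assumes the $N - 1$ and $(N - 1)(N - 2)$ forms of the standard deviation and skew; the function name and the sample flows are hypothetical and serve only as an illustration.

```python
import math


def log_statistics(flows):
    """Return (mean, standard deviation, skew coefficient) of log10(flows).

    Requires at least three positive values.
    """
    logs = [math.log10(q) for q in flows]
    n = len(logs)
    mean = sum(logs) / n
    # Sample standard deviation with N - 1 in the denominator
    std = math.sqrt(sum((y - mean) ** 2 for y in logs) / (n - 1))
    # Skew coefficient a = N * sum(dev^3) / ((N - 1)(N - 2) * std^3)
    skew = n * sum((y - mean) ** 3 for y in logs) / ((n - 1) * (n - 2) * std ** 3)
    return mean, std, skew


if __name__ == "__main__":
    # Hypothetical annual peak flows in cfs, for illustration only
    flows = [4200, 5600, 7100, 9800, 12500, 18300, 31000]
    mean, std, skew = log_statistics(flows)
    print(f"mean of logs = {mean:.3f}, std of logs = {std:.3f}, skew = {skew:.3f}")
    print(f"geometric mean = {10 ** mean:.0f} cfs")
```

Raising 10 to the mean of the logarithms recovers the geometric mean of the flows, which is the alternative calculation described for the log-normal distribution.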

Example 5-2 (Prasuhn 1987)

The mean of the logarithms of the annual series of river discharges is 2.700 (which corresponds to a geometric mean for the peak flows of 501 m³/sec). The standard deviation of the same log values is 0.65. Determine the discharge with a 100-year return period if the coefficient of skew is (1) -0.4, (2) 0, and (3) +0.4.

Solution: $\log Q = 2.7 + 0.65K$

(1) K = 2.029; log Q = 4.019; Q = 10,400 m³/sec
(2) K = 2.326; log Q = 4.212; Q = 16,300 m³/sec (log-normal)
(3) K = 2.615; log Q = 4.400; Q = 25,100 m³/sec

(Prasuhn 1987)

Statistical Analysis

An annual series uses the maximum value for each year. A drawback to the annual series is that some years may have several large peaks, while other years may have peaks that are much lower. However, only one value per year can be used, discarding some potentially valuable information.

A partial duration series uses all values above a selected value. For rare events, the results are similar and the annual series is recommended because it is easier to obtain. However, if more frequent flows are of interest (such as recurrence intervals of less than one year), the partial duration series should be used.

The following table shows all annual peak flows, plus all others greater than 9,000 cfs. This data can therefore be used for either a partial duration series or an annual series analysis.

The following table shows the last segment of an annual series analysis of this data. Included are the ranks for the smallest annual peak flows, the corresponding flows, and several columns of summary statistics for these flows. This analysis can be easily conducted on a spreadsheet program.

Each flow has its recurrence interval calculated for $N$ years of record (53 years), where $m$ is the rank:

$$t_p = \frac{N + 1}{m}$$

The probability of exceedance is the reciprocal of the return period, or:

$$p = \frac{m}{N + 1}$$

A rank of 1 in a period of 10 years leads to a probability of $p = 1/11 = 0.0909$. Using $N + 1$ results in the best estimate for limited data sets. In a partial duration series, $N$ still refers to the duration of the record and is different from the number of observations.
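The ranking step that the notes suggest doing in a spreadsheet can also be sketched in a few lines of Python. The function name, the optional `years_of_record` argument, and the sample flows below are assumptions made for illustration; the formulas are $t_p = (N + 1)/m$ and $p = m/(N + 1)$.

```python
def plotting_positions(flows, years_of_record=None):
    """Rank flows from largest to smallest and return a list of
    (rank m, flow, recurrence interval t_p, exceedance probability p).

    years_of_record is N; it defaults to the number of flows, which is
    appropriate for an annual series. For a partial duration series,
    pass the length of the record in years instead.
    """
    n = years_of_record if years_of_record is not None else len(flows)
    ranked = sorted(flows, reverse=True)
    return [(m, q, (n + 1) / m, m / (n + 1))
            for m, q in enumerate(ranked, start=1)]


if __name__ == "__main__":
    # Hypothetical annual peak flows in cfs, for illustration only
    flows = [8200, 15400, 3100, 22800, 9700, 5600, 12100, 4300, 30500, 7400]
    for m, q, t_p, p in plotting_positions(flows):
        print(f"m = {m:2d}  Q = {q:6d} cfs  t_p = {t_p:5.2f} yr  p = {p:.4f}")
```

With 10 years of record, the largest flow (rank 1) is assigned $p = 1/11$, matching the value quoted in the text.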

[Tables: annual and partial duration peak flows, and the annual series analysis (Prasuhn 1987)]

It is now possible to plot the data and the fitted equations for the different distributions on normal and log-normal probability paper to determine the best fitting distribution, and to use the plots to determine flows for different recurrence intervals. Probability paper can be easily downloaded at several web sites, including: http://www.weibull.com/gpaper/

The use of probability paper allows visual clues as to the best distribution (usually the one with the best fit for the data has a straight line, at least for normal and log-normal plots, or fits the curved plotted line for log-Pearson type III plots).

The first step is to create the annual series (or partial series) data and rank the observations, usually from the largest to the smallest. Then calculate the probability for each observation, using $p = m/(N + 1)$. Finally, just plot the flow values against the calculated $p$ values. The following plots are examples using the Sioux River data.

Do an in-class example to plot the following 9 observations: 5 5 78 56 3 7 3 88

The following plot uses the Big Sioux River data on a normal plot. For this plot, the flow values are plotted on an arithmetic scale and the probabilities are plotted on a scale that is distorted so that a normal distribution would plot as a straight line. Note that this is not a log scale. Besides the data points (which are not along a straight line, an indication that this is not a suitable distribution), a straight line which corresponds to the best fit for this data is also plotted. The equation for this line is based on the data characteristics:

$$x = \bar{x} + K\sigma = 13{,}844 + 14{,}505K$$

To plot the straight line, values of $K$ are obtained from the prior table of $K$ values for normal probability for selected $p$ values. The flow values are then calculated corresponding to these $p$ values, and then plotted to form the line.

[Figure: Big Sioux River annual peak flows on normal probability paper (Prasuhn 1987)]

This is not a very good fit, so the following plot for the log-normal distribution is attempted, using the same data but log-normal probability paper. The flow values are plotted on the log scale, and the same probability vs. flow values are used to plot the straight line. The actual equation for the straight line is:

$$\log x = \overline{\log x} + K\sigma_{\log x} = 3.949 + 0.4380K$$

This plot also shows the log Pearson type III plot, using the calculated skew.

[Figure: Big Sioux River annual peak flows on log-normal probability paper, with log-normal and log Pearson type III fits (Prasuhn 1987)]

The curved line for the log Pearson type III distribution is obtained the same way, except the calculated skew value is used to obtain the $K$ parameter for the equation. In this example, the skew coefficient $a$ is -0.368. Neither of these fitted lines is a perfect fit of the observed data, and they lead to very different results when used to extrapolate to large recurrence interval flows. The log Pearson curve fits the overall data range better, but the log-normal curve fits the 3 largest values (usually of most interest) better. The best choice is therefore sometimes difficult to determine.

What is the expected discharge having a 200-year recurrence interval ($t_p$ = 200 years, $p$ = 0.005)? The calculated lines could be extended to this value, or the equations can be directly used.
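For the zero-skew (normal and log-normal) lines, the frequency factor $K$ does not have to be read from a table: it is the standard normal quantile at probability $1 - p$. The Python sketch below uses `statistics.NormalDist` from the standard library together with the Big Sioux River parameters quoted in these notes (mean 13,844 cfs and standard deviation 14,505 cfs for the flows; mean 3.949 and standard deviation 0.4380 for the logarithms). The function and constant names are hypothetical. The log Pearson type III curve is not covered here, because its $K$ values depend on the skew and still come from the frequency factor table.

```python
from statistics import NormalDist

# Big Sioux River parameters as quoted in the notes
MEAN_Q, STD_Q = 13844.0, 14505.0   # flows, cfs
MEAN_LOG, STD_LOG = 3.949, 0.4380  # log10 of flows


def normal_frequency_factor(p: float) -> float:
    """Frequency factor K for exceedance probability p with zero skew."""
    return NormalDist().inv_cdf(1.0 - p)


def normal_flow(p: float) -> float:
    """Flow on the fitted normal line: x = mean + K * std."""
    return MEAN_Q + normal_frequency_factor(p) * STD_Q


def lognormal_flow(p: float) -> float:
    """Flow on the fitted log-normal line: log x = mean_log + K * std_log."""
    return 10.0 ** (MEAN_LOG + normal_frequency_factor(p) * STD_LOG)


if __name__ == "__main__":
    for p in (0.5, 0.1, 0.02, 0.01, 0.005):
        print(f"p = {p:<5}  t_p = {1.0 / p:5.0f} yr  "
              f"K = {normal_frequency_factor(p):6.3f}  "
              f"normal = {normal_flow(p):8.0f} cfs  "
              f"log-normal = {lognormal_flow(p):8.0f} cfs")
```

Plotting these flow values against $p$ on the matching probability paper gives the fitted straight lines; small differences from the hand calculations come from rounding $K$ to three decimals in the table.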

Log-normal: $p = 0.005$ and $a = 0$, so $K = 2.576$ from the table. Therefore:

$$\log x = 3.949 + (0.4380)(2.576) = 5.0773 \quad\text{and}\quad x = 10^{5.0773} = 119{,}500\ \text{cfs}$$

Log Pearson Type III: $p = 0.005$ and $a = -0.368$, so $K = 2.231$ from the table. Therefore:

$$\log x = 3.949 + (0.4380)(2.231) = 4.926 \quad\text{and}\quad x = 10^{4.926} = 84{,}400\ \text{cfs}$$

[Figure: Extreme flow events are usually well outside of the normal channel and measurement accuracy suffers (Chin 2000)]

There is considerable discrepancy between these two predicted values. The larger value is more conservative and is more consistent with the larger observed discharges. However, the lower value is probably the better estimate, as it fits the complete data set better. Measuring the large actual discharges is subject to considerable error, as they likely occurred during flood stage conditions where the flow measurement station may have been submerged, or beyond calibration depths, requiring crude estimates of actual flows based on physical evidence. Also, the largest flow was not likely associated with an exact 54-year event. There is no way of knowing what size event it was; it could have been associated with a much rarer event, such as the 100-year event, that just happened to occur during the shorter period of record.

Normally, the log Pearson distribution is recommended as it considers the skew parameter, but caution is needed as unrealistic and excessive values of skew may occur for a particular river. Regional skew values should also be examined.

Example 5-4 (Prasuhn 1987)

Use the three distribution methods to predict the 50-year flood on the Big Sioux River at Akron, Iowa.

Solution:

(1) Normal distribution: $t_p$ = 50 yrs, $p$ = 0.02. Therefore $K = 2.054$.

$$x = 13{,}844 + (2.054)(14{,}505) = 43{,}600\ \text{cfs}$$

(2) Log-normal distribution: $a = 0$, $K = 2.054$.

$$\log x = 3.949 + (2.054)(0.4380) = 4.849, \qquad x = 10^{4.849} = 70{,}600\ \text{cfs}$$

(3) Log Pearson type III: $a = -0.368$, $K = 1.852$.

$$\log x = 3.949 + (1.852)(0.4380) = 4.760, \qquad x = 10^{4.760} = 57{,}600\ \text{cfs}$$

Obviously, the application of statistical methods is not an exact science. The methods are extremely helpful in the interpretation of hydrologic data and the prediction of design tools, but the engineer must be aware of the limitations involved.

Not only are flood conditions important, but drought conditions can also be evaluated using the same methods. When dealing with rainfall, the intensity of the precipitation as well as the overall quantity is important.

References

Chin, David A. Water-Resources Engineering. Prentice Hall. 2000.
Prasuhn, Alan L. Fundamentals of Hydraulic Engineering. Holt, Rinehart and Winston. 1987.

Homework Problem:

Repeat the Big Sioux River analysis, but only use data from the last 10 years of record (1972 to 1981). Predict the 50- and 100-year flows using the log Pearson type III distribution. What are the limitations of using a short period of observations?