EVA Tutorial #1 BLOCK MAXIMA APPROACH IN HYDROLOGIC/CLIMATE APPLICATIONS. Rick Katz


Rick Katz
Institute for Mathematics Applied to Geosciences
National Center for Atmospheric Research
Boulder, CO USA
email: rwk@ucar.edu
Home page: www.isse.ucar.edu/staff/katz
Lecture: www.isse.ucar.edu/staff/katz/docs/pdf/bgceva1.pdf

Outline
(1) Traditional Methods/Rationale for Extreme Value Analysis
(2) Max Stability/Extremal Types Theorem
(3) Block Maxima Approach under Stationarity
(4) Return Levels
(5) Block Maxima Approach under Nonstationarity
(6) Trends in Extremes
(7) Other Forms of Covariates

(1) Traditional Methods/Rationale for Extreme Value Analysis

Fit models/distributions to all data
-- Even if primary focus is on extremes
Statistical theory for averages
-- Ubiquitous role of normal distribution
-- Central Limit Theorem for sums or averages

Central Limit Theorem
-- Given time series $X_1, X_2, \ldots, X_n$
   Assume independent and identically distributed (iid)
   Assume common cumulative distribution function (cdf) $F$
   Assume finite mean $\mu$ and variance $\sigma^2$
-- Denote sum by $S_n = X_1 + X_2 + \cdots + X_n$
-- Then, no matter what the shape of cdf $F$,
   $\Pr\{(S_n - n\mu) / (n^{1/2}\sigma) \le x\} \to \Phi(x)$ as $n \to \infty$
   where $\Phi$ denotes the standard normal $N(0, 1)$ cdf

Robustness
-- Avoid sensitivity to extremes (outliers / contamination)
Nonparametric Alternatives
-- Kernel density estimation
   OK for center of distribution (but not for lower & upper tails)
-- Resampling
   Fails for maxima
   Cannot extrapolate

Conduct sampling experiment
-- Exponential distribution with cdf
   $F(x) = 1 - \exp[-(x/\sigma)]$, $x > 0$, $\sigma > 0$
   Here $\sigma$ is the scale parameter (also the mean)

-- Draw random samples of size n = 10 from exponential distribution (with σ = 1) and calculate mean for each sample
   (i) First pseudo random sample
       1.678, 0.607, 0.732, 1.806, 1.388, 0.630, 0.382, 0.396, 1.324, 1.148
       (Sample mean ≈ 1.009)
   (ii) Second pseudo random sample
       (Sample mean ≈ 0.571)
   (iii) Third pseudo random sample
       (Sample mean ≈ 0.859)
   Repeat many more times
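The sampling experiment for means can be reproduced with a short simulation. This is only a sketch: the number of repetitions, the seed, and the function name `sample_means` are assumptions for illustration, not part of the original slides. By the Central Limit Theorem the sample means should cluster around σ = 1 with standard deviation close to σ / n^{1/2} ≈ 0.316.

```python
import random
import statistics

def sample_means(n=10, reps=10_000, sigma=1.0, seed=0):
    """Means of `reps` pseudo random exponential samples, each of size n."""
    rng = random.Random(seed)  # seed is arbitrary, chosen for repeatability
    means = []
    for _ in range(reps):
        # random.expovariate takes the rate (1 / scale), not the scale itself
        sample = [rng.expovariate(1.0 / sigma) for _ in range(n)]
        means.append(statistics.fmean(sample))
    return means

means = sample_means()
print("average of sample means:  ", round(statistics.fmean(means), 3))
print("std. dev. of sample means:", round(statistics.stdev(means), 3))
```

A histogram of `means` should already look roughly bell-shaped, even though the parent exponential distribution is strongly skewed.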


Limited information about extremes
-- Exploit what theory is available
More robust/flexible approach
-- Tail behavior of standard distributions is too restrictive
   Statistical theory indicates possibility of heavy tails
   Data suggest evidence of heavy tails
   Conventional distributions have light tails

-- Example
   Let $X$ have standard normal distribution [i.e., $N(0, 1)$] with probability density function (pdf)
   $\phi(x) = (2\pi)^{-1/2} \exp(-x^2/2)$
   Then $\Pr\{X > x\} = 1 - \Phi(x) \approx \phi(x)/x$, for large $x$
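The quality of the tail approximation φ(x)/x can be checked numerically. The sketch below compares it with the exact upper tail probability, computed from the complementary error function in the Python standard library; the function names and the grid of x values are illustrative assumptions.

```python
import math

def normal_upper_tail(x):
    """Exact Pr{X > x} for X ~ N(0, 1), via the complementary error function."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def tail_approx(x):
    """Approximation phi(x) / x, intended for large x only."""
    phi = math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)
    return phi / x

for x in (1.0, 2.0, 4.0, 6.0):
    exact = normal_upper_tail(x)
    approx = tail_approx(x)
    print(f"x = {x}: exact = {exact:.3e}, approx = {approx:.3e}, "
          f"ratio = {approx / exact:.3f}")
```

The ratio approaches 1 as x grows, which illustrates how quickly the normal tail decays (a light tail in the sense used here).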

Statistical behavior of extremes
-- Effectively no role for normal distribution
-- What form of distribution(s) instead?
Conduct another sampling experiment
-- Calculate largest value of random sample (instead of mean)
   (i) Standard normal distribution N(0, 1)
   (ii) Exponential distribution (σ = 1)
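A sketch of this second sampling experiment follows. The slides do not state the sample size used for the maxima, so n = 100, the number of repetitions, the seed, and the helper name `sample_maxima` are all assumptions made here for illustration.

```python
import random
import statistics

def sample_maxima(draw, n=100, reps=5_000, seed=1):
    """Largest value of each of `reps` pseudo random samples of size n."""
    rng = random.Random(seed)  # arbitrary seed for repeatability
    return [max(draw(rng) for _ in range(n)) for _ in range(reps)]

normal_max = sample_maxima(lambda rng: rng.gauss(0.0, 1.0))
expon_max = sample_maxima(lambda rng: rng.expovariate(1.0))

for name, maxima in (("normal", normal_max), ("exponential", expon_max)):
    print(f"{name}: mean of maxima = {statistics.fmean(maxima):.3f}, "
          f"median of maxima = {statistics.median(maxima):.3f}")
```

In both cases the mean of the maxima exceeds the median, reflecting a positively skewed distribution rather than a normal one.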


(2) Max Stability/Extremal Types Theorem

Sum stability
-- Property of normal distribution
   $X_1, X_2, \ldots, X_n$ iid with common cdf $N(\mu, \sigma^2)$
   Then sum $S_n = X_1 + X_2 + \cdots + X_n$ is exactly normally distributed
   In particular, $(S_n - n\mu) / (n^{1/2}\sigma)$ has an exact $N(0, 1)$ distribution

Max stability
-- Want to find distribution(s) for which maximum has same form as original sample
   Note that
   $\max\{X_1, X_2, \ldots, X_{2n}\} = \max\{\max\{X_1, X_2, \ldots, X_n\}, \max\{X_{n+1}, X_{n+2}, \ldots, X_{2n}\}\}$
-- So cdf $G$, say, must satisfy
   $G^2(x) = G(ax + b)$
   Here $a > 0$ and $b$ are constants
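As a concrete check of the max stability relation, the standard Gumbel cdf G(x) = exp[−exp(−x)] satisfies G²(x) = G(ax + b) with a = 1 and b = −ln 2. The sketch below verifies this numerically at a few arbitrary points; the choice of the Gumbel case and of the test points is an illustration added here.

```python
import math

def gumbel_cdf(x):
    """Standard Gumbel cdf G(x) = exp(-exp(-x))."""
    return math.exp(-math.exp(-x))

a, b = 1.0, -math.log(2.0)  # constants in G^2(x) = G(a x + b) for the Gumbel case

for x in (-1.0, 0.0, 2.5):
    lhs = gumbel_cdf(x) ** 2        # cdf of the maximum of two iid Gumbel variables
    rhs = gumbel_cdf(a * x + b)     # same cdf family, shifted by ln 2
    print(f"x = {x}: G^2(x) = {lhs:.6f}, G(ax + b) = {rhs:.6f}")
```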


Extremal Types Theorem
Time series $X_1, X_2, \ldots, X_n$ assumed iid (for now)
Set $M_n = \max\{X_1, X_2, \ldots, X_n\}$
Suppose that there exist constants $a_n > 0$ and $b_n$ such that
   $\Pr\{(M_n - b_n) / a_n \le x\} \to G(x)$ as $n \to \infty$
where $G$ is a non-degenerate cdf
Then $G$ must be a generalized extreme value (GEV) cdf; that is,
   $G(x; \mu, \sigma, \xi) = \exp\{-[1 + \xi (x - \mu)/\sigma]^{-1/\xi}\}$, $1 + \xi (x - \mu)/\sigma > 0$
$\mu$ location parameter, $\sigma > 0$ scale parameter, $\xi$ shape parameter
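The GEV cdf can be coded directly from this formula. The following is a minimal sketch with a hypothetical function name `gev_cdf`; it treats ξ = 0 as the Gumbel limit and returns 0 or 1 outside the support where 1 + ξ(x − μ)/σ ≤ 0. Note that some software uses the opposite sign convention for the shape parameter.

```python
import math

def gev_cdf(x, mu, sigma, xi):
    """GEV cdf G(x; mu, sigma, xi), with xi = 0 handled as the Gumbel limit."""
    if sigma <= 0.0:
        raise ValueError("scale parameter sigma must be positive")
    z = (x - mu) / sigma
    if abs(xi) < 1e-12:
        return math.exp(-math.exp(-z))
    t = 1.0 + xi * z
    if t <= 0.0:
        # Outside the support: below the lower bound when xi > 0,
        # above the upper bound when xi < 0
        return 0.0 if xi > 0.0 else 1.0
    return math.exp(-t ** (-1.0 / xi))

print(gev_cdf(0.0, 0.0, 1.0, 0.0))    # Gumbel case evaluated at the location
print(gev_cdf(3.0, 0.0, 1.0, 0.2))    # heavy upper tail
print(gev_cdf(3.0, 0.0, 1.0, -0.5))   # beyond the upper bound mu + sigma / (-xi) = 2
```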

(i) ξ = 0 (Gumbel type, limit as ξ → 0)
    Light upper tail
    Domain of attraction for many common distributions (e.g., normal, exponential, gamma)

(ii) ξ > 0 (Fréchet type)
    Heavy upper tail with infinite rth-order moment if r ≥ 1/ξ (e.g., infinite variance if ξ ≥ 1/2)
    Fits precipitation, streamflow, economic damage

(iii) ξ < 0 (Weibull type)
    Bounded upper tail [ x < μ + σ / (−ξ) ]
    Fits temperature, wind speed, sea level

Location parameter of GEV is not equivalent to mean
Scale parameter of GEV is not equivalent to standard deviation
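To make the distinction concrete, the mean and standard deviation of a GEV distribution can be computed from its parameters using the standard gamma-function expressions (valid for nonzero ξ < 1/2). The sketch below uses illustrative parameter values (μ = 0, σ = 1, ξ = 0.2) that are assumptions, not values from the slides.

```python
import math

def gev_mean_sd(mu, sigma, xi):
    """Mean and standard deviation of GEV(mu, sigma, xi) for nonzero xi < 1/2."""
    if xi == 0.0 or xi >= 0.5:
        raise ValueError("this sketch requires nonzero xi < 1/2")
    g1 = math.gamma(1.0 - xi)
    g2 = math.gamma(1.0 - 2.0 * xi)
    mean = mu + sigma * (g1 - 1.0) / xi
    sd = sigma * math.sqrt(g2 - g1 * g1) / abs(xi)
    return mean, sd

mean, sd = gev_mean_sd(mu=0.0, sigma=1.0, xi=0.2)
print(f"location = 0.0, mean = {mean:.3f}")
print(f"scale    = 1.0, standard deviation = {sd:.3f}")
```

For these values both the mean and the standard deviation come out noticeably larger than the location and scale parameters.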

Alternative forms of distribution for maxima
-- Lognormal distribution
   Log-transformed variable has normal distribution
   Positively skewed
   Light-tailed in sense of extreme value theory (Gumbel domain of attraction)
-- Log Pearson Type III distribution
   Log-transformed variable has gamma distribution
   Heavy-tailed distribution (Fréchet domain of attraction)
   Not as flexible as GEV distribution

(3) Block Maxima Approach under Stationarity

GEV distribution
-- Fit directly to maxima (say with block size n)
   e.g., annual maximum of daily precipitation amount, or highest temperature over given year, or annual peak stream flow
-- Advantages
   Do not necessarily need to explicitly model annual and diurnal cycles
   Do not necessarily need to explicitly model temporal dependence
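Extracting block maxima from a daily series amounts to grouping by block and taking the largest value in each group. The sketch below uses calendar years as blocks; the function name `annual_maxima` and the tiny made-up series are purely illustrative.

```python
from collections import defaultdict
from datetime import date, timedelta

def annual_maxima(dates, values):
    """Return {year: largest daily value in that year}."""
    by_year = defaultdict(list)
    for d, v in zip(dates, values):
        by_year[d.year].append(v)
    return {year: max(vals) for year, vals in sorted(by_year.items())}

# Made-up daily amounts spanning a year boundary (not real data)
start = date(2001, 12, 30)
dates = [start + timedelta(days=i) for i in range(5)]
values = [0.1, 0.0, 0.7, 0.3, 0.2]

maxima = annual_maxima(dates, values)
print(maxima)
```

With real data, blocks containing many missing days would need separate handling before the maxima are passed to a GEV fit.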

Parameter estimation techniques
-- Method of moments
   Easy to calculate
   Relatively inefficient
-- Probability-weighted moments (L-moments)
   Easy to calculate
   Efficient for small samples
-- Maximum likelihood
   Requires iterative numerical techniques
   Quantification of uncertainty
   Incorporation of covariates/nonstationarity
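As an illustration of the L-moments technique, the sketch below estimates GEV parameters from sample probability-weighted moments using Hosking's widely used approximation for the shape parameter. Hosking's shape k has the opposite sign to ξ as used here (k = −ξ). The function names, the simulated sample, and the "true" parameter values are assumptions for illustration only.

```python
import math
import random

def gev_lmoments_fit(data):
    """Estimate (mu, sigma, xi) from sample L-moments (Hosking's approximation)."""
    x = sorted(data)
    n = len(x)
    # Unbiased probability-weighted moments b0, b1, b2 (j is the 0-based rank)
    b0 = sum(x) / n
    b1 = sum(j * x[j] for j in range(n)) / (n * (n - 1))
    b2 = sum(j * (j - 1) * x[j] for j in range(n)) / (n * (n - 1) * (n - 2))
    l1, l2, l3 = b0, 2.0 * b1 - b0, 6.0 * b2 - 6.0 * b1 + b0
    t3 = l3 / l2                            # L-skewness
    c = 2.0 / (3.0 + t3) - math.log(2.0) / math.log(3.0)
    k = 7.8590 * c + 2.9554 * c * c         # Hosking's shape, k = -xi (assumed nonzero)
    gk = math.gamma(1.0 + k)
    sigma = l2 * k / ((1.0 - 2.0 ** (-k)) * gk)
    mu = l1 - sigma * (1.0 - gk) / k
    return mu, sigma, -k

def simulate_gev(n, mu, sigma, xi, seed=2):
    """Inverse-cdf simulation of GEV variates (xi nonzero)."""
    rng = random.Random(seed)
    out = []
    for _ in range(n):
        u = rng.uniform(1e-12, 1.0 - 1e-12)  # keep u strictly inside (0, 1)
        out.append(mu + (sigma / xi) * ((-math.log(u)) ** (-xi) - 1.0))
    return out

true_params = (1.0, 0.5, 0.1)                # hypothetical values
sample = simulate_gev(20_000, *true_params)
estimates = gev_lmoments_fit(sample)
print("true:     ", true_params)
print("estimated:", tuple(round(e, 3) for e in estimates))
```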

Maximum likelihood estimation (mle)
-- Given observed block maxima $X_1 = x_1, X_2 = x_2, \ldots, X_T = x_T$
-- Assume exact GEV dist. with pdf
   $g(x; \mu, \sigma, \xi) = G'(x; \mu, \sigma, \xi)$
-- Likelihood function
   $L(x_1, x_2, \ldots, x_T; \mu, \sigma, \xi) = g(x_1; \mu, \sigma, \xi) \, g(x_2; \mu, \sigma, \xi) \cdots g(x_T; \mu, \sigma, \xi)$
   Minimize $-\ln L(x_1, x_2, \ldots, x_T; \mu, \sigma, \xi)$ with respect to $\mu, \sigma, \xi$
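The objective −ln L can be written down directly by differentiating the GEV cdf, which gives −ln g(x) = ln σ + (1 + 1/ξ) ln t + t^{−1/ξ} with t = 1 + ξ(x − μ)/σ. The sketch below only evaluates this objective; the function name is hypothetical, and in practice the function would be handed to a general-purpose numerical optimizer, which is not shown here.

```python
import math

def gev_neg_log_lik(params, maxima):
    """Negative log-likelihood -ln L for iid GEV block maxima."""
    mu, sigma, xi = params
    if sigma <= 0.0:
        return math.inf                  # invalid scale
    total = 0.0
    for x in maxima:
        z = (x - mu) / sigma
        if abs(xi) < 1e-8:
            # Gumbel limit as xi -> 0
            total += math.log(sigma) + z + math.exp(-z)
        else:
            t = 1.0 + xi * z
            if t <= 0.0:
                return math.inf          # observation outside the support
            total += (math.log(sigma)
                      + (1.0 + 1.0 / xi) * math.log(t)
                      + t ** (-1.0 / xi))
    return total

toy_maxima = [1.2, 0.8, 2.5, 1.6, 1.1]   # made-up values, not real data
print(gev_neg_log_lik((1.2, 0.5, 0.1), toy_maxima))
```

Returning infinity for inadmissible parameter values is one simple way to keep an optimizer inside the region where 1 + ξ(x − μ)/σ > 0 for every observation.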

Likelihood ratio test (LRT)
For example, to test whether ξ = 0, fit two models:
   (i) $-\ln L(x_1, x_2, \ldots, x_T; \mu, \sigma, \xi)$ minimized with respect to $\mu, \sigma, \xi$
   (ii) $-\ln L(x_1, x_2, \ldots, x_T; \mu, \sigma, \xi = 0)$ minimized with respect to $\mu, \sigma$
If ξ = 0, then 2 [(ii) − (i)] has approximate chi square distribution with 1 degree of freedom (df) for large T
-- Confidence interval (e.g., for ξ) based on profile likelihood
   Minimize $-\ln L(x_1, x_2, \ldots, x_T; \mu, \sigma, \xi)$ with respect to $\mu, \sigma$ as function of ξ
   Use chi square dist. with 1 df
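Given the two minimized values (i) and (ii), the P-value of the test follows from the chi square distribution with 1 df, whose upper tail probability equals erfc[(d/2)^{1/2}] for a statistic d. The sketch below uses made-up minimized values purely to show the arithmetic; the function name is hypothetical.

```python
import math

def lrt_p_value_1df(nll_restricted, nll_full):
    """P-value of the LRT statistic 2 [(ii) - (i)] under chi square with 1 df."""
    statistic = 2.0 * (nll_restricted - nll_full)
    statistic = max(statistic, 0.0)      # guard against small negative round-off
    # For 1 df, Pr{chi square > d} = erfc(sqrt(d / 2))
    return math.erfc(math.sqrt(statistic / 2.0))

# Made-up minimized negative log-likelihoods (not from any real fit)
nll_full, nll_restricted = 100.0, 101.9207
p_value = lrt_p_value_1df(nll_restricted, nll_full)
print(f"LRT statistic = {2.0 * (nll_restricted - nll_full):.3f}, P-value = {p_value:.3f}")
```

The same chi square quantile (about 3.84 at the 5% level) defines the cutoff for a 95% profile likelihood confidence interval.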

Fort Collins daily precipitation amount
-- Fort Collins, CO, USA
   Time series of daily precipitation amount (in), 1900-1999
   Semi-arid region
   Marked annual cycle in precipitation (peak in late spring/early summer, driest in winter)
   Consider annual maxima (block size n ≈ 365)
   No obvious long-term trend in annual maxima (T = 100)
   Flood on 28 July 1997 (damaged campus of Colorado State Univ.)


Parameter estimates and standard errors

Parameter     Estimate   (Std. Error)
Location μ    1.347      (0.062)
Scale σ       0.533      (0.049)
Shape ξ       0.174      (0.092)

-- LRT for ξ = 0 (P-value ≈ 0.038)
-- 95% confidence interval for shape parameter ξ (based on profile likelihood): 0.009 < ξ < 0.369

(4) Return Levels

Assume stationarity
-- i.e., unchanging climate
Return period / Return level
-- Return level with (1/p)-yr return period
   $x(p) = G^{-1}(1 - p; \mu, \sigma, \xi)$, $0 < p < 1$
   Quantile of GEV cdf $G$ (e.g., p = 0.01 corresponds to 100-yr return period)


GEV distribution
   $x(p) = \mu - (\sigma/\xi) \{1 - [-\ln(1 - p)]^{-\xi}\}$
Confidence interval: Re-parameterize, replacing location parameter μ with x(p), & use profile likelihood method
-- Fort Collins precipitation example (annual maxima)
   Estimated 100-yr return level: 5.10 in
   95% confidence interval (based on profile likelihood): 3.93 in < x(0.01) < 8.00 in
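The return level formula is easy to evaluate directly. The sketch below plugs in the Fort Collins parameter estimates reported earlier (μ = 1.347, σ = 0.533, ξ = 0.174); the function name and the Gumbel branch for ξ = 0 are additions made here for illustration.

```python
import math

def gev_return_level(p, mu, sigma, xi):
    """Return level x(p): the GEV quantile exceeded with probability p per block."""
    y = -math.log(1.0 - p)
    if abs(xi) < 1e-12:
        return mu - sigma * math.log(y)          # Gumbel limit as xi -> 0
    return mu - (sigma / xi) * (1.0 - y ** (-xi))

x100 = gev_return_level(p=0.01, mu=1.347, sigma=0.533, xi=0.174)
print(f"Estimated 100-yr return level: {x100:.2f} in")
```

This reproduces the point estimate only; the confidence interval requires the profile likelihood computation described above.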

Interpretation of return level
(i) Mean waiting time until next event = 1/p
    On average, wait 100 yr for next 100-yr event
(ii) Average number of events over time period (of length 1/p) = 1
    On average, one 100-yr event occurs within 100-yr time period
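Both interpretations are statements about averages. The short calculation below, which additionally assumes independence between years (an assumption made here, on top of stationarity), shows that the chance of seeing at least one 100-yr event within a given 100-yr period is well below 1.

```python
p = 0.01       # exceedance probability per year (100-yr event)
years = 100    # length of the time period considered

mean_wait = 1.0 / p                               # mean waiting time (yr)
expected_events = years * p                       # average number of events
prob_at_least_one = 1.0 - (1.0 - p) ** years      # assumes independent years

print(f"mean waiting time: {mean_wait:.0f} yr")
print(f"expected number of events in {years} yr: {expected_events:.2f}")
print(f"Pr(at least one event in {years} yr): {prob_at_least_one:.3f}")
```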

(5) Block Maxima Approach under Nonstationarity

Sources
-- Trends
   Global climate change
   Local land use changes
-- Physically-based
   Large-scale atmospheric/oceanic circulation patterns (e.g., El Niño Southern Oscillation phenomenon)
   Used in statistical downscaling

Theory
-- No general extreme value theory under nonstationarity
   Only limited results under restrictive conditions
Methods
-- Introduction of covariates resembles generalized linear models
-- Straightforward to extend maximum likelihood estimation
Issues
-- Nature of relationship between extremes & covariates
   Resembles that for overall / center of data?

(6) Trends in Extremes

Trends
-- Example (Urban heat island)
   Trend in summer minimum temperature at Phoenix, AZ (i.e., block minima)
   $\min\{X_1, X_2, \ldots, X_n\} = -\max\{-X_1, -X_2, \ldots, -X_n\}$
   Assume negated summer minimum temperature in year t has GEV distribution with location and scale parameters:
   $\mu(t) = \mu_0 + \mu_1 t$, $\ln \sigma(t) = \sigma_0 + \sigma_1 t$, $\xi(t) = \xi$, $t = 1, 2, \ldots$
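Extending the likelihood to this trend model only requires replacing the constant location and scale by μ(t) and σ(t) inside the GEV log density. The sketch below shows one way to write the objective; the function name and the made-up minima are assumptions, and the numerical minimization itself is not shown.

```python
import math

def gev_trend_neg_log_lik(params, maxima):
    """-ln L with mu(t) = mu0 + mu1 t, ln sigma(t) = s0 + s1 t, constant xi."""
    mu0, mu1, s0, s1, xi = params
    total = 0.0
    for t, x in enumerate(maxima, start=1):       # t = 1, 2, ...
        mu_t = mu0 + mu1 * t
        log_sigma_t = s0 + s1 * t                 # log link keeps sigma(t) > 0
        z = (x - mu_t) / math.exp(log_sigma_t)
        if abs(xi) < 1e-8:
            total += log_sigma_t + z + math.exp(-z)   # Gumbel limit
        else:
            u = 1.0 + xi * z
            if u <= 0.0:
                return math.inf                   # observation outside the support
            total += log_sigma_t + (1.0 + 1.0 / xi) * math.log(u) + u ** (-1.0 / xi)
    return total

# Block minima are handled by negating them and fitting maxima
summer_minima = [70.1, 69.4, 71.0, 72.3]          # made-up values, not real data
negated = [-x for x in summer_minima]
print(gev_trend_neg_log_lik((-70.0, -0.5, 0.0, 0.0, -0.2), negated))
```

Setting μ₁ = 0 or σ₁ = 0 gives the restricted models needed for likelihood ratio tests of a trend.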

Parameter estimates and standard errors

Parameter        Estimate   (Std. Error)
Location: μ_0    66.17*
          μ_1    0.196*     (0.041)
Scale:    σ_0    1.338
          σ_1    0.009      (0.010)
Shape:    ξ      0.211

*Sign of location parameters reversed to convert back to minima

-- LRT for μ_1 = 0 (P-value < 10^{-5})
-- LRT for σ_1 = 0 (P-value ≈ 0.366)


Q-Q plots under non-stationarity
-- Transform to common distribution
   Non-stationary GEV [μ(t), σ(t), ξ(t)]
   Not invariant to choice of transformation
   (i) Non-stationary GEV to standard exponential
       $\varepsilon_t = \{1 + \xi(t) [X_t - \mu(t)] / \sigma(t)\}^{-1/\xi(t)}$
   (ii) Non-stationary GEV to standard Gumbel (used by extRemes)
       $\varepsilon_t = [1/\xi(t)] \log \{1 + \xi(t) [X_t - \mu(t)] / \sigma(t)\}$
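Both transformations are one-line functions of the fitted parameters for each year. The sketch below implements them for nonzero ξ(t) and checks that the Gumbel-scale residual is the negative logarithm of the exponential-scale residual; the function names and numerical values are illustrative assumptions.

```python
import math

def to_standard_exponential(x, mu, sigma, xi):
    """Transformation (i): GEV(mu, sigma, xi) value to standard exponential scale."""
    return (1.0 + xi * (x - mu) / sigma) ** (-1.0 / xi)

def to_standard_gumbel(x, mu, sigma, xi):
    """Transformation (ii): GEV(mu, sigma, xi) value to standard Gumbel scale."""
    return math.log(1.0 + xi * (x - mu) / sigma) / xi

# Made-up observation and fitted parameters for a single year t
x_t, mu_t, sigma_t, xi_t = 2.0, 0.0, 1.0, 0.2
e_exp = to_standard_exponential(x_t, mu_t, sigma_t, xi_t)
e_gum = to_standard_gumbel(x_t, mu_t, sigma_t, xi_t)
print(f"exponential scale: {e_exp:.4f}, Gumbel scale: {e_gum:.4f}")
print(f"-log(exponential residual) = {-math.log(e_exp):.4f}")
```

Sorting the transformed residuals and plotting them against the matching theoretical quantiles then gives the Q-Q plot; the two versions stress different parts of the distribution, which is why the plot is not invariant to the choice.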


(7) Other Forms of Covariates

Physically-based covariates
-- Example [Arctic Oscillation (AO)]
   Winter maximum temperature at Port Jervis, NY, USA (i.e., block maxima)
   Z denotes winter index of AO
   Given Z = z, assume conditional distribution of winter maximum temperature is GEV distribution with parameters:
   $\mu(z) = \mu_0 + \mu_1 z$, $\ln \sigma(z) = \sigma_0 + \sigma_1 z$, $\xi(z) = \xi$

Parameter estimates and standard errors

Parameter        Estimate   (Std. Error)
Location: μ_0    15.26
          μ_1    1.175      (0.319)
Scale:    σ_0    0.984
          σ_1    0.044      (0.092)
Shape:    ξ      0.186

-- LRT for μ_1 = 0 (P-value < 0.001)
-- LRT for σ_1 = 0 (P-value ≈ 0.635)
