GPD-POT and GEV block maxima

Chapter 3 GPD-POT and GEV block maxima

This chapter is devoted to the relation between POT models and Block Maxima (BM). We only consider the classical frameworks where POT excesses are assumed to be GPD, and where the BM follow a GEV distribution. We also study the relation between POT and the classical $r$ largest statistics context.

Although the BM and $r$ largest models are usually derived from asymptotic considerations, it is well known that the same models result from a temporal aggregation of the marked process $[T_k, X_k]$ used in POT: the distribution of the maximum of a Poisson number of i.i.d. excesses over a high threshold is GEV (Embrechts et al. 1996, chap. 3). However, a number of details are usually omitted. These relate to the possibility that a block has no observations, in which case the maximum does not exist.

3.1 GPD-POT to GEV-max

3.1.1 Temporal aggregation of the marked process

In this section we consider the case of a partial observation (or temporal aggregation) of the marked process $[T_k, X_k]$, where $\lambda$ denotes the rate of the events $T_k$. We assume that we have $B$ disjoint periods or blocks with known durations $w_b$ for $b = 1, 2, \dots, B$. We consider the following r.vs representing the number of events and the maximum of the marks for block $b$:

$$N_b := \#\{k;\ T_k \in \text{block } b\}, \qquad M_b := \max_{T_k \in \text{block } b} X_k.$$

Observe that $M_b$ is defined only when block $b$ contains at least one observation, a condition which is fulfilled with probability $\Pr\{N_b \neq 0\} = 1 - \exp\{-\lambda w_b\}$. When the block duration is not large relative to $1/\lambda$, this probability is not that close to one; e.g. for $\lambda w_b = 1$ we find $\Pr\{N_b \neq 0\} \approx 0.63$. Moreover, for $B = 10$ blocks with $\lambda w_b = 1$, there is only about one chance in a hundred that all blocks have an observation. When $B$ is large enough, systematically missing observations $M_b$ due to empty blocks will inevitably occur, unless $\lambda$ is large enough. This will be discussed later.
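The two numerical claims above (a probability of about 0.63 for one block, and about one chance in a hundred for ten blocks) can be checked with a minimal sketch. The function names `prob_nonempty` and `prob_all_nonempty` are illustrative and do not come from the source; the only assumption is that the block counts $N_b$ are independent Poisson variables with means $\lambda w_b$.

```python
import math

def prob_nonempty(lam, w):
    """Pr{N_b != 0} when N_b is Poisson with mean lam * w."""
    return 1.0 - math.exp(-lam * w)

def prob_all_nonempty(lam, durations):
    """Probability that every block holds at least one event.

    Assumes the blocks are disjoint, so that the counts are independent.
    """
    return math.prod(prob_nonempty(lam, w) for w in durations)

# One block with lam * w = 1, then B = 10 such blocks.
p_one = prob_nonempty(1.0, 1.0)
p_all = prob_all_nonempty(1.0, [1.0] * 10)
print(round(p_one, 3), round(p_all, 4))
```

With $\lambda w_b = 1$ the second probability is $(1 - e^{-1})^{10}$, which is close to $0.01$.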

[Figure 3.1: Temporal aggregation of the marked process with constant block durations $w_b \equiv w$ (horizontal axis: time $t$ split into blocks 1 to 4; vertical axis: marks $x$). The distribution of the marks $X_k$ is $\text{GPD}(\mu, \sigma, \xi)$ while the block maxima $M_b$ have distribution $\text{GEV}(\mu^*, \sigma^*, \xi^*)$.]

Since $N_b$ and the observations $X_k$ are independent and since $M_b$ is the maximum of $N_b$ independent r.vs, we know that

$$\Pr\left[M_b \leq x \mid N_b = k\right] = F_X(x)^k \quad \text{for } k \geq 1, \tag{3.1}$$

which allows the determination of the distribution of $M_b$ in the following theorem. Since the maxima $M_b$ corresponding to disjoint blocks are independent, the joint distribution of the maxima follows. When $N_b = 0$ we can set $M_b := -\infty$, and the joint distribution is then of mixed type, with a positive probability mass on vectors with some $M_b$ equal to $-\infty$.

It will be convenient to denote by $I_{\text{GPD}}(\mu, \sigma, \xi)$ the support of the GPD with parameters $\mu$, $\sigma$ and $\xi$. Similar notations will be used for the GEV distribution. The support of the block maxima $M_b$ is the same as that of the marks $X_k$.

Theorem 3.1. The r.vs $M_b$ corresponding to disjoint blocks are independent. The marginal distribution of $M_b$ is given by

$$\Pr\{N_b \neq 0,\ M_b \leq x\} = \exp\{-\lambda w_b\, S_X(x)\} - \exp\{-\lambda w_b\}. \tag{3.2}$$

If the marks are GPD with $X_k \sim \text{GPD}(\mu, \sigma, \xi)$, then for all $x \in I_{\text{GPD}}(\mu, \sigma, \xi)$ we have

$$\Pr\{N_b \neq 0,\ x \leq M_b \leq x + \mathrm{d}x\} = f_{\text{GEV}}(x;\ \mu^*_b, \sigma^*_b, \xi^*_b)\, \mathrm{d}x \tag{3.3}$$

where the GEV parameters are given by

$$\mu^*_b = \mu + \frac{(\lambda w_b)^{\xi} - 1}{\xi}\, \sigma, \qquad \sigma^*_b = (\lambda w_b)^{\xi}\, \sigma, \qquad \xi^*_b = \xi, \qquad \text{for } \xi \neq 0, \tag{3.4}$$

or by

$$\mu^*_b = \mu + \log(\lambda w_b)\, \sigma, \qquad \sigma^*_b = \sigma, \qquad \xi^*_b = \xi = 0, \qquad \text{for } \xi = 0, \tag{3.5}$$

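depending on the value of $\xi$. In both cases, the likelihood of an observation $M_b$ is computed as if $M_b$ came from a sample of $\text{GEV}(\mu^*_b, \sigma^*_b, \xi^*_b)$. When the blocks have the same duration $w_b \equiv w$, the maxima $M_b$ form an i.i.d. sample of the GEV distribution $\text{GEV}(\mu^*, \sigma^*, \xi^*)$.

Proof. In the proof we omit the index $b$. Consider $x$ in the support of the distribution of the marks $X_k$. We have

$$\Pr\left[M \leq x \mid N \neq 0\right] = \sum_{k \geq 1} \Pr\left[M \leq x \mid N = k\right] \Pr\left[N = k \mid N \neq 0\right] = \sum_{k \geq 1} F_X(x)^k\, \Pr\left[N = k \mid N \neq 0\right]$$

where (3.1) was used in the second equality. Now, multiplying by $\Pr\{N \neq 0\}$,

$$\begin{aligned}
\Pr\{N \neq 0,\ M \leq x\} &= \sum_{k \geq 1} F_X(x)^k\, \Pr\{N = k\} \\
&= \sum_{k \geq 1} F_X(x)^k\, e^{-\lambda w}\, [\lambda w]^k / k! \\
&= e^{-\lambda w} \sum_{k \geq 1} [\lambda w\, F_X(x)]^k / k! \\
&= e^{-\lambda w} \left[ \exp\{\lambda w\, F_X(x)\} - 1 \right] \\
&= \exp\{-\lambda w\, S_X(x)\} - \exp\{-\lambda w\}
\end{aligned}$$

which is (3.2). By differentiating with respect to $x$, we get

$$\Pr\{N \neq 0,\ x \leq M \leq x + \mathrm{d}x\} = \frac{\mathrm{d}}{\mathrm{d}x} \left[ e^{-\lambda w\, S_X(x)} \right] \mathrm{d}x. \tag{3.6}$$

From now on let us assume that the marks are $\text{GPD}(\mu, \sigma, \xi)$ and that $\mu^*$, $\sigma^*$ and $\xi^*$ are given by (3.4) for $\xi \neq 0$ or by (3.5) for $\xi = 0$. First, since $\mu - \sigma/\xi = \mu^* - \sigma^*/\xi^*$ for $\xi \neq 0$, it is easy to see that for any vector $[\mu, \sigma, \xi]$ we have

$$I_{\text{GPD}}(\mu, \sigma, \xi) \subset I_{\text{GEV}}(\mu^*, \sigma^*, \xi^*),$$

see figure 3.2. Moreover, we have for any $x \in I_{\text{GPD}}(\mu, \sigma, \xi)$

$$F_{\text{GEV}}(x;\ \mu^*, \sigma^*, \xi^*) = \exp\{-S_{\text{GPD}}(x;\ \mu^*, \sigma^*, \xi^*)\}$$

which can be checked from the closed-form expressions; here $S_{\text{GPD}}(x;\ \mu^*, \sigma^*, \xi^*)$ stands for the closed-form expression of the GPD survival function, which may exceed one when $x < \mu^*$. It will thus be enough to prove that for any $x \in I_{\text{GPD}}(\mu, \sigma, \xi)$ we have

$$\lambda w\, S_{\text{GPD}}(x;\ \mu, \sigma, \xi) = S_{\text{GPD}}(x;\ \mu^*, \sigma^*, \xi^*); \tag{3.7}$$

indeed, the derivative on the right-hand side of (3.6) will then be the GEV density $f_{\text{GEV}}(x;\ \mu^*, \sigma^*, \xi^*)$. Separating the two cases $\xi \neq 0$ and $\xi = 0$, the verification of (3.7) is simple algebra.

Remark. Note that (3.3) only holds when $x$ is in the support $I_{\text{GPD}}(\mu, \sigma, \xi)$; it does not hold over the full superset $I_{\text{GEV}}(\mu^*, \sigma^*, \xi^*)$. It is easy to check, by integrating (3.3) with respect to $x \in I_{\text{GPD}}(\mu, \sigma, \xi)$, that we get $\Pr\{N_b \neq 0\} = 1 - \exp\{-\lambda w_b\}$.

Note that while $\lambda$ is related to a time scale (it is expressed as an inverse time), no time unit appears in the GEV parameters. The reason is that the time unit is hidden in the block duration $w$, which is needed to compute the return level curve in time units (typically years).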
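The POT to GEV relations (3.4) and (3.5) can be checked numerically with the following minimal sketch. It compares $\exp\{-\lambda w\, S_{\text{GPD}}(x)\}$ with the GEV distribution function at the transformed parameters for a few points $x$ of the GPD support. The function names (`gpd_survival`, `gev_cdf`, `pot_to_gev`, `max_gap`) and the parameter values are illustrative assumptions and are not taken from the source.

```python
import math

def gpd_survival(x, mu, sigma, xi):
    """S_GPD(x; mu, sigma, xi), intended for x in the GPD support (x >= mu)."""
    z = (x - mu) / sigma
    if xi == 0.0:
        return math.exp(-z)
    return max(1.0 + xi * z, 0.0) ** (-1.0 / xi)

def gev_cdf(x, mu_s, sigma_s, xi_s):
    """F_GEV(x; mu*, sigma*, xi*) with the usual sign convention for xi*."""
    z = (x - mu_s) / sigma_s
    if xi_s == 0.0:
        return math.exp(-math.exp(-z))
    t = 1.0 + xi_s * z
    if t <= 0.0:
        # Outside the support: below it when xi* > 0, above it when xi* < 0.
        return 0.0 if xi_s > 0.0 else 1.0
    return math.exp(-t ** (-1.0 / xi_s))

def pot_to_gev(lam, w, mu, sigma, xi):
    """GEV parameters from POT parameters, relations (3.4) and (3.5)."""
    lw = lam * w
    if xi == 0.0:
        return mu + math.log(lw) * sigma, sigma, 0.0
    return mu + (lw ** xi - 1.0) / xi * sigma, lw ** xi * sigma, xi

def max_gap(lam, w, mu, sigma, xi, xs):
    """Largest |exp(-lam w S_GPD(x)) - F_GEV(x)| over the points xs."""
    theta_star = pot_to_gev(lam, w, mu, sigma, xi)
    return max(
        abs(math.exp(-lam * w * gpd_survival(x, mu, sigma, xi))
            - gev_cdf(x, *theta_star))
        for x in xs
    )

# Illustrative values only; each x lies in the support of the GPD.
print(max_gap(3.0, 1.5, 10.0, 2.0, 0.2, [10.0, 12.0, 20.0]))
print(max_gap(3.0, 1.5, 10.0, 2.0, 0.0, [10.0, 12.0, 20.0]))
print(max_gap(3.0, 1.5, 10.0, 2.0, -0.3, [10.0, 12.0, 15.0]))
```

The gaps should be at the level of floating-point rounding. Subtracting $\exp\{-\lambda w\}$ from either side gives the joint probability (3.2).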

[Figure 3.2: Supports $I_{\text{GPD}}(\mu, \sigma, \xi)$ and $I_{\text{GEV}}(\mu^*, \sigma^*, \xi^*)$ for the three cases $\xi > 0$, $\xi = 0$ and $\xi < 0$. For $\xi \neq 0$ the finite end-points coincide: $\mu - \sigma/\xi = \mu^* - \sigma^*/\xi^*$.]

3.1.2 Links with Extreme Value regression

In the general case where the block duration $w_b$ is not constant, the distribution of $M_b$ depends on $w_b$. Ignoring the previous derivation, one could have used $w_b$ as a covariate in an extreme value regression. It is very unlikely that proceeding in this way would recover the exact dependence on the covariate given in (3.4). In the exponential case $\xi = 0$, the exact form of dependence is quite usual:

$$\mu^*_b = \beta_0 + \beta_1 \log w_b, \qquad \sigma^*_b = \sigma^* \ \text{(constant)},$$

with parameters $\beta_0$, $\beta_1$, $\sigma^*$. Thus the location parameter is related to the log duration of the blocks, which may seem natural. However, when $\xi \neq 0$ the true relationship would require the links

$$\mu^*_b = \beta_0 + \beta_1 w_b^{\xi}, \qquad \sigma^*_b = \gamma_1 w_b^{\xi}, \qquad \xi^*_b = \xi \ \text{(constant)},$$

with the parameters $\beta_0$, $\beta_1$, $\gamma_1$ and $\xi$; from (3.4), $\beta_0 = \mu - \sigma/\xi$, $\beta_1 = \sigma \lambda^{\xi}/\xi$ and $\gamma_1 = \sigma \lambda^{\xi}$. These equations do not fit in the standard framework where each of the three parameters is connected to the covariates through its own link function (Coles 2001, chap. 5). Thus, the very simple situation of a temporal aggregation does not lead to a simple extreme value regression. Note also that, as a function of $w$, the variations of the term $w^{\xi}$ are large for small values of $w$, since in practice $\xi < 1$.

3.2 GEV-max to GPD-POT

3.2.1 Disaggregation for constant block duration

Problem

We now assume that we are given a sequence $M_b$ corresponding to disjoint blocks $b = 1, 2, \dots, B$ with the same duration $w_b \equiv w$, so that the $M_b$ form a sample of a GEV distribution. In other words, we have partial observations of the marked process. We may then estimate the parameters of the underlying marked process and make inference on them. However, the marked process embeds four parameters $\lambda$, $\mu$, $\sigma$ and $\xi$, while the GEV distribution only involves three parameters $\mu^*$, $\sigma^*$ and $\xi^*$. Given the vector $\theta^* := [\mu^*, \sigma^*, \xi^*]$, there is an infinity of vectors $\theta = [\lambda, \mu, \sigma, \xi]$ satisfying the relations (3.4). The marked process model that generated the observations $M_b$ can correspond to any such vector provided that all the $M_b$ lie in the interior of the support $I_{\text{GPD}}(\mu, \sigma, \xi)$. There is thus an infinity of vectors $\theta = [\lambda, \mu, \sigma, \xi]$ satisfying these conditions that have the same log-likelihood. The corresponding marked process models can be said to be observationally equivalent with respect to the given sequence of block maxima $M_b$.

Remark. The observational equivalence is tightly related to the POT-stability property of the GPD. By increasing the threshold $u$ and lowering the rate $\lambda$ it is possible to maintain the same return level curve.

A natural idea to overcome the problem of identifiability is to fix one of the four POT parameters. The following two strategies can be considered:

- Choose the rate $\lambda > 0$, and then compute or estimate the GPD parameters $\mu$, $\sigma$ and $\xi$.
- Choose the GPD location $\mu$, and then compute or estimate the rate $\lambda$ as well as $\sigma$ and $\xi$.

In the first case, any positive rate $\lambda > 0$ can be chosen and we simply have a re-parameterisation of the GEV distribution. Taking $\lambda = 1/w$, the three GPD parameters of the renewal process become identical to their GEV counterparts, that is: $\mu = \mu^*$, $\sigma = \sigma^*$ and $\xi = \xi^*$. The second approach is very attractive when the model must be fitted using observations $M_b$, since it boils down to a POT estimation from aggregated data, as discussed now.

Fixing $\mu$: a GEV to POT function

The relations (3.4) or (3.5) give the BM parameter vector $\theta^*$ as a function of the POT parameter $\theta$; we aim to clarify here a possible inverse relation, i.e. the determination of $\theta$ from $\theta^*$ for a fixed value of $\mu$. For the GEV context we will denote by $\Theta^* = \{[\mu^*, \sigma^*, \xi^*];\ \sigma^* > 0\}$ the domain of admissible parameters. The notations $\theta^{(-\lambda)}$ and $\theta^{(-\mu)}$ are for the vectors obtained by omitting $\lambda$ or $\mu$ in the vector $\theta = [\lambda, \mu, \sigma, \xi]$. The relations (3.4) giving $\theta^*$ as a function of $\theta^{(-\mu)}$ and $\mu$ can be written as

$$\theta^* = \psi^*(\theta^{(-\mu)};\ \mu) \tag{3.8}$$

which can be called a POT to GEV transformation. The Jacobian of this transformation is easily computed, see A.3 page 77. The same notations can be used for the Gumbel context and the relations (3.5), provided it is understood then that $\theta^* = [\mu^*, \sigma^*]$ and $\theta = [\lambda, \mu, \sigma]$.

Theorem 3.2. Let $\theta^* = [\mu^*, \sigma^*, \xi^*]$ be a vector of GEV parameters with $\sigma^* > 0$. A solution $\theta^{(-\mu)} = [\lambda, \sigma, \xi]$ of (3.4) exists if and only if $\mu$ is an interior point of the support $I_{\text{GEV}}(\theta^*)$. Then the solution $\theta^{(-\mu)}$ is unique and we may write it as a function of $\theta^*$ and $\mu$, i.e. as $\theta^{(-\mu)} = \psi(\theta^*;\ \mu)$. For a vector of Gumbel parameters $\theta^* = [\mu^*, \sigma^*]$ with $\sigma^* > 0$, a unique solution $\theta^{(-\mu)}$ of (3.5) exists.

Proof. Consider the GEV case. From (3.4) we have by simple algebra

$$[\lambda w]^{-\xi} = 1 + \xi^* (\mu - \mu^*)/\sigma^*. \tag{3.9}$$

We have $\lambda > 0$ if and only if the right-hand side is positive, i.e. if $\mu$ is located in the interior of the support $I_{\text{GEV}}(\theta^*)$. We may then raise each side to the power $-1/\xi$, leading to $\lambda w = -\log F_{\text{GEV}}(\mu;\ \theta^*)$. We then easily find $\sigma$ and $\xi$. To summarise,

$$\lambda = -\frac{1}{w} \log F_{\text{GEV}}(\mu;\ \theta^*), \qquad \sigma = (\lambda w)^{-\xi^*}\, \sigma^*, \qquad \xi = \xi^*. \tag{3.10}$$

The proof is straightforward for the Gumbel case.

[Figure 3.3: The fixed parameter $\mu$ must lie in the interior of the support $I_{\text{GEV}}(\hat{\theta}^*)$ and below the smallest observation $\min_b M_b$. When $\hat{\xi}^* > 0$, we must have $\mu > \hat{\mu}^* - \hat{\sigma}^*/\hat{\xi}^*$.]

Fitting BM from POT

Given $B$ block maxima $M_b$, we now consider the estimation of a GEV distribution by using a POT model with fixed $\mu$. Using the notations of the previous section, we can estimate $\theta^{(-\mu)}$ rather than the vector $\theta^*$ of GEV parameters, and obtain the latter using the POT to BM transformation described in the previous section. More precisely, we can maximise with respect to $\theta^{(-\mu)}$ the POT likelihood $L_{\text{POT}}(\theta^{(-\mu)};\ \mu)$, where the second argument is meant to recall that $\mu$ is used as the threshold required in POT.

Not all values of the fixed parameter $\mu$ can be chosen. The fixed value of $\mu$ must obviously be such that $M_b > \mu$ for every block $b$, and it must also lie in the interior of the support $I_{\text{GEV}}(\hat{\theta}^*)$, see figure 3.3.

Assume that we are given a subset $\Theta^*_0$ of the parameter space $\Theta^*$ containing all the parameters $\theta^*$ that could have generated the observations, and that $\mu$ is an interior point of $I_{\text{GEV}}(\theta^*)$ for all $\theta^* \in \Theta^*_0$. Then

$$L_{\text{GEV}}(\theta^*) = L_{\text{POT}}(\theta^{(-\mu)};\ \mu), \qquad \text{with } \theta^{(-\mu)} = \psi(\theta^*;\ \mu),$$

holds for all $\theta^* \in \Theta^*_0$. It is thus clear that maximising the POT likelihood with respect to $\theta^{(-\mu)}$ with $\mu$ fixed, and transforming the result with the function $\psi^*$, will lead to the same solution $\hat{\theta}^*$ as fitting a GEV by ML. In other words

$$\hat{\theta}^{(-\mu)} = \psi(\hat{\theta}^*,\ \mu) \tag{3.11}$$

and the estimated joint distribution for the observations $M_b$ will be the same in the two cases.

An advantage of the POT approach lies in the possibility of likelihood concentration seen in chapter 1 (section 1.1.5 page 9). We can fit the model using a two-parameter optimisation involving $\sigma$ and $\xi$, while $\lambda$ is concentrated out through

$$\hat{\lambda} = \frac{B}{\sum_b w_b\, S_{\text{GPD}}(M_b;\ \mu, \sigma, \xi)}. \tag{3.12}$$
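The GEV to POT function of Theorem 3.2 and the concentrated rate can be sketched as follows. This is a minimal illustration under stated assumptions: the names `gev_to_pot` and `concentrated_rate` are hypothetical, the formulas are the relations (3.10) and (3.12) read with $\lambda w = -\log F_{\text{GEV}}(\mu;\ \theta^*)$ and $\hat{\lambda} = B / \sum_b w_b\, S_{\text{GPD}}(M_b;\ \mu, \sigma, \xi)$, and the numerical values are arbitrary. The round trip POT to GEV to POT should return the starting parameters.

```python
import math

def gpd_survival(x, mu, sigma, xi):
    """S_GPD(x; mu, sigma, xi), intended for x in the GPD support (x >= mu)."""
    z = (x - mu) / sigma
    if xi == 0.0:
        return math.exp(-z)
    return max(1.0 + xi * z, 0.0) ** (-1.0 / xi)

def pot_to_gev(lam, w, mu, sigma, xi):
    """GEV parameters from POT parameters, relations (3.4) and (3.5)."""
    lw = lam * w
    if xi == 0.0:
        return mu + math.log(lw) * sigma, sigma, 0.0
    return mu + (lw ** xi - 1.0) / xi * sigma, lw ** xi * sigma, xi

def gev_to_pot(mu_s, sigma_s, xi_s, mu, w):
    """Return (lam, sigma, xi) for a fixed GPD location mu, as in (3.10)."""
    if xi_s == 0.0:
        # Gumbel case: (3.5) gives log(lam * w) = (mu* - mu) / sigma*.
        return math.exp(-(mu - mu_s) / sigma_s) / w, sigma_s, 0.0
    t = 1.0 + xi_s * (mu - mu_s) / sigma_s  # right-hand side of (3.9)
    if t <= 0.0:
        raise ValueError("mu must be an interior point of the GEV support")
    lam_w = t ** (-1.0 / xi_s)              # equals -log F_GEV(mu; theta*)
    return lam_w / w, t * sigma_s, xi_s     # sigma = (lam w)^(-xi) sigma*

def concentrated_rate(maxima, durations, mu, sigma, xi):
    """Rate maximising the POT likelihood for given mu, sigma, xi, as in (3.12)."""
    total = sum(w * gpd_survival(m, mu, sigma, xi)
                for m, w in zip(maxima, durations))
    return len(maxima) / total

# Round trip with illustrative values: POT -> GEV -> POT.
lam, w, mu, sigma, xi = 3.0, 1.5, 10.0, 2.0, 0.2
theta_star = pot_to_gev(lam, w, mu, sigma, xi)
print(gev_to_pot(*theta_star, mu=mu, w=w))
```

In a fit, `concentrated_rate` would be called inside the objective function of a two-parameter optimiser over $\sigma$ and $\xi$; the optimiser itself is left out of this sketch.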