1 Cumulants

1.1 Definition

The rth moment of a real-valued random variable X with density f(x) is
$$\mu_r = E(X^r) = \int x^r f(x)\,dx$$
for integer $r = 0, 1, \ldots$. The value is assumed to be finite. Provided that it has a Taylor expansion about the origin, the moment generating function
$$M(\xi) = E(e^{\xi X}) = E(1 + \xi X + \cdots + \xi^r X^r/r! + \cdots) = \sum_{r=0}^{\infty} \mu_r \xi^r/r!$$
is an easy way to combine all of the moments into a single expression. The rth moment is the rth derivative of M at the origin. The cumulants $\kappa_r$ are the coefficients in the Taylor expansion of the cumulant generating function about the origin
$$K(\xi) = \log M(\xi) = \sum_r \kappa_r \xi^r/r!.$$
Evidently $\mu_0 = 1$ implies $\kappa_0 = 0$. The relationship between the first few moments and cumulants, obtained by extracting coefficients from the expansion, is as follows:
$$\kappa_1 = \mu_1$$
$$\kappa_2 = \mu_2 - \mu_1^2$$
$$\kappa_3 = \mu_3 - 3\mu_2\mu_1 + 2\mu_1^3$$
$$\kappa_4 = \mu_4 - 4\mu_3\mu_1 - 3\mu_2^2 + 12\mu_2\mu_1^2 - 6\mu_1^4.$$
In the reverse direction
$$\mu_2 = \kappa_2 + \kappa_1^2$$
$$\mu_3 = \kappa_3 + 3\kappa_2\kappa_1 + \kappa_1^3$$
$$\mu_4 = \kappa_4 + 4\kappa_3\kappa_1 + 3\kappa_2^2 + 6\kappa_2\kappa_1^2 + \kappa_1^4.$$
In particular, $\kappa_1 = \mu_1$ is the mean of X, $\kappa_2$ is the variance, and $\kappa_3 = E((X - \mu_1)^3)$. Higher-order cumulants are not the same as moments about the mean.
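These relations are easy to confirm numerically. The sketch below (plain Python, not part of the original notes) uses the unit exponential distribution, for which $\mu_r = r!$ and $\kappa_r = (r-1)!$:

```python
# Check the moment-to-cumulant formulas above for X ~ Exp(1),
# whose moments are mu_r = r! and whose cumulants are kappa_r = (r-1)!.
from math import factorial

mu = [factorial(r) for r in range(5)]      # mu_0..mu_4 = 1, 1, 2, 6, 24

k1 = mu[1]
k2 = mu[2] - mu[1]**2
k3 = mu[3] - 3*mu[2]*mu[1] + 2*mu[1]**3
k4 = mu[4] - 4*mu[3]*mu[1] - 3*mu[2]**2 + 12*mu[2]*mu[1]**2 - 6*mu[1]**4

print(k1, k2, k3, k4)                      # 1 1 2 6, i.e. kappa_r = (r-1)!
```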

This definition of cumulants is nothing more than the formal relation between the coefficients in the Taylor expansion of one function $M(\xi)$ with $M(0) = 1$, and the coefficients in the Taylor expansion of $\log M(\xi)$. For example, Student's t on five degrees of freedom has finite moments up to order four, with infinite moments of order five and higher. The moment generating function does not exist for real $\xi \neq 0$, but the characteristic function $M(i\xi)$ is $e^{-|\xi|}(1 + |\xi| + \xi^2/3)$. Both $M(i\xi)$ and $K(i\xi) = -|\xi| + \log(1 + |\xi| + \xi^2/3)$ have Taylor expansions about $\xi = 0$ up to order four only.

The normal distribution $N(\mu, \sigma^2)$ has cumulant generating function $\xi\mu + \xi^2\sigma^2/2$, a quadratic polynomial implying that all cumulants of order three and higher are zero. Marcinkiewicz (1935) showed that the normal distribution is the only distribution whose cumulant generating function is a polynomial, i.e. the only distribution having a finite number of non-zero cumulants. The Poisson distribution with mean $\mu$ has moment generating function $\exp(\mu(e^\xi - 1))$ and cumulant generating function $\mu(e^\xi - 1)$. Consequently all the cumulants are equal to the mean.

Two distinct distributions may have the same moments, and hence the same cumulants. This statement is fairly obvious for distributions whose moments are all infinite, or even for distributions having infinite higher-order moments. But it is much less obvious for distributions having finite moments of all orders. Heyde (1963) gave one such pair of distributions with densities
$$f_1(x) = \exp(-(\log x)^2/2)/(x\sqrt{2\pi})$$
$$f_2(x) = f_1(x)\bigl[1 + \sin(2\pi \log x)/2\bigr]$$
for $x > 0$. The first of these is called the log normal distribution. To show that these distributions have the same moments it suffices to show that
$$\int_0^\infty x^k f_1(x) \sin(2\pi \log x)\,dx = 0$$
for integer $k \geq 1$, which can be shown by making the substitution $\log x = y + k$.

Cumulants of order $r \geq 2$ are called semi-invariant on account of their behaviour under affine transformation of variables (Thiele 189?, Dressel 1942). If $\kappa_r$ is the rth cumulant of X, the rth cumulant of the affine transformation $a + bX$ is $b^r \kappa_r$, independent of $a$. This behaviour is considerably simpler than that of moments. However, moments about the mean are also semi-invariant, so this property alone does not explain why cumulants are useful for statistical purposes.
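The moment-matching integral can also be verified by quadrature: after the substitution $y = \log x$, the kth integral becomes $\int e^{ky}\phi(y)\sin(2\pi y)\,dy$ with $\phi$ the standard normal density. A minimal numerical sketch, assuming NumPy and SciPy are available:

```python
# Quadrature check that Heyde's perturbation leaves every moment unchanged:
# after y = log x, the k-th moment difference is proportional to the
# integral of exp(k*y) * phi(y) * sin(2*pi*y) over the real line.
import numpy as np
from scipy.integrate import quad

phi = lambda y: np.exp(-0.5 * y * y) / np.sqrt(2 * np.pi)

for k in range(5):
    val, _ = quad(lambda y: np.exp(k * y) * phi(y) * np.sin(2 * np.pi * y),
                  -np.inf, np.inf)
    print(k, val)   # all zero up to quadrature error
```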

The term cumulant was coined by Fisher (1929) on account of their behaviour under addition of random variables. Let $S = X + Y$ be the sum of two independent random variables. The moment generating function of the sum is the product $M_S(\xi) = M_X(\xi)M_Y(\xi)$, and the cumulant generating function is the sum $K_S(\xi) = K_X(\xi) + K_Y(\xi)$. Consequently, the rth cumulant of the sum is the sum of the rth cumulants. By extension, if $X_1, \ldots, X_n$ are independent and identically distributed, the rth cumulant of the sum is $n\kappa_r$, and the rth cumulant of the standardized sum $n^{-1/2}(X_1 + \cdots + X_n)$ is $n^{1-r/2}\kappa_r$. Provided that the cumulants are finite, all cumulants of order $r \geq 3$ of the standardized sum tend to zero, which is a simple demonstration of the central limit theorem.

Good (195?) obtained an expression for the rth cumulant of X as the rth moment of the discrete Fourier transform of an independent and identically distributed sequence, as follows. Let $X_1, X_2, \ldots$ be independent copies of X with rth cumulant $\kappa_r$, and let $\omega = e^{2\pi i/n}$ be a primitive nth root of unity. The discrete Fourier combination
$$Z = X_1 + \omega X_2 + \cdots + \omega^{n-1} X_n$$
is a complex-valued random variable whose distribution is invariant under the rotation $Z \mapsto \omega Z$ through multiples of $2\pi/n$. The rth cumulant of the sum is $\kappa_r \sum_{j=1}^{n} \omega^{rj}$, which is equal to $n\kappa_r$ if r is a multiple of n, and zero otherwise. Consequently $E(Z^r) = 0$ for integer $r < n$ and $E(Z^n) = n\kappa_n$.
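A quick Monte Carlo sketch of Good's identity (illustrative parameters, assuming NumPy): with $n = 3$ and $X_j \sim \mathrm{Exp}(1)$, so that $\kappa_3 = 2$, the first two moments of Z should vanish and $E(Z^3)$ should equal $n\kappa_3 = 6$:

```python
# Monte Carlo check of Good's discrete-Fourier identity with n = 3 and
# i.i.d. Exp(1) variables, for which kappa_n = (n-1)!.
import numpy as np

rng = np.random.default_rng(1)
n, reps = 3, 10**6
omega = np.exp(2j * np.pi / n)                 # primitive n-th root of unity
X = rng.exponential(size=(reps, n))
Z = (X * omega ** np.arange(n)).sum(axis=1)    # Z = X_1 + w X_2 + w^2 X_3

for r in range(1, n + 1):
    print(r, (Z**r).mean())   # ~0 for r = 1, 2; ~ n*kappa_3 = 6 for r = 3
```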

1.2 Multivariate cumulants

Somewhat surprisingly, the relation between moments and cumulants is simpler and more transparent in the multivariate case than in the univariate case. Let $X = (X^1, \ldots, X^k)$ be the components of a random vector. In a departure from the univariate notation, we write $\kappa^r = E(X^r)$ for the components of the mean vector, $\kappa^{rs} = E(X^r X^s)$ for the components of the second moment matrix, $\kappa^{rst} = E(X^r X^s X^t)$ for the third moments, and so on. It is convenient notationally to adopt Einstein's summation convention, so $\xi_r X^r$ denotes the linear combination $\xi_1 X^1 + \cdots + \xi_k X^k$, the square of the linear combination is $(\xi_r X^r)^2 = \xi_r \xi_s X^r X^s$, a sum of $k^2$ terms, and so on for higher powers. The Taylor expansion of the moment generating function $M(\xi) = E(\exp(\xi_r X^r))$ is
$$M(\xi) = 1 + \xi_r \kappa^r + \tfrac{1}{2!}\xi_r\xi_s\kappa^{rs} + \tfrac{1}{3!}\xi_r\xi_s\xi_t\kappa^{rst} + \cdots.$$
The cumulants are defined as the coefficients $\kappa^{r,s}, \kappa^{r,s,t}, \ldots$ in the Taylor expansion
$$\log M(\xi) = \xi_r\kappa^r + \tfrac{1}{2!}\xi_r\xi_s\kappa^{r,s} + \tfrac{1}{3!}\xi_r\xi_s\xi_t\kappa^{r,s,t} + \cdots.$$
This notation does not distinguish first-order moments from first-order cumulants, but commas separating the superscripts serve to distinguish higher-order cumulants from moments. Comparison of coefficients reveals that each moment $\kappa^{rs}, \kappa^{rst}, \ldots$ is a sum over partitions of the superscripts, each term in the sum being a product of cumulants:
$$\kappa^{rs} = \kappa^{r,s} + \kappa^r\kappa^s$$
$$\kappa^{rst} = \kappa^{r,s,t} + \kappa^{r,s}\kappa^t + \kappa^{r,t}\kappa^s + \kappa^{s,t}\kappa^r + \kappa^r\kappa^s\kappa^t = \kappa^{r,s,t} + \kappa^{r,s}\kappa^t[3] + \kappa^r\kappa^s\kappa^t$$
$$\kappa^{rstu} = \kappa^{r,s,t,u} + \kappa^{r,s,t}\kappa^u[4] + \kappa^{r,s}\kappa^{t,u}[3] + \kappa^{r,s}\kappa^t\kappa^u[6] + \kappa^r\kappa^s\kappa^t\kappa^u.$$
Each bracketed number indicates a sum over distinct partitions having the same block sizes, so the fourth-order moment is a sum of 15 distinct cumulant products. In the reverse direction, each cumulant is also a sum over partitions of the indices. Each term in the sum is a product of moments, but with coefficient $(-1)^{\nu-1}(\nu-1)!$ where $\nu$ is the number of blocks:
$$\kappa^{r,s} = \kappa^{rs} - \kappa^r\kappa^s$$
$$\kappa^{r,s,t} = \kappa^{rst} - \kappa^{rs}\kappa^t[3] + 2\kappa^r\kappa^s\kappa^t$$
$$\kappa^{r,s,t,u} = \kappa^{rstu} - \kappa^{rst}\kappa^u[4] - \kappa^{rs}\kappa^{tu}[3] + 2\kappa^{rs}\kappa^t\kappa^u[6] - 6\kappa^r\kappa^s\kappa^t\kappa^u.$$
Partition notation serves one additional purpose. It establishes moments and cumulants as special cases of generalized cumulants, a class that includes objects of the type $\kappa^{r,st} = \operatorname{cov}(X^r, X^s X^t)$, $\kappa^{rs,tu} = \operatorname{cov}(X^r X^s, X^t X^u)$, and $\kappa^{rs,t,u}$ with incompletely partitioned indices. These objects arise very naturally in statistical work involving asymptotic approximation of distributions. They are intermediate between moments and cumulants, and have characteristics of both.
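The partition rule above is mechanical, and can be made executable in a few lines. A sketch assuming sympy is available (helper names are illustrative), which reproduces the moment expansions by summing cumulant products over all set partitions of the indices:

```python
# The moment-from-cumulants rule, made executable: the joint moment with
# index set I is the sum over set partitions of I of products of cumulants.
import sympy as sp

def partitions(s):
    """Yield every set partition of the list s."""
    if len(s) <= 1:
        yield [s]
        return
    first, rest = s[0], s[1:]
    for p in partitions(rest):
        for i in range(len(p)):              # put `first` into an existing block
            yield p[:i] + [[first] + p[i]] + p[i + 1:]
        yield [[first]] + p                  # or give `first` its own block

def moment(indices):
    kappa = lambda block: sp.Symbol('k_' + ','.join(sorted(block)))
    return sum(sp.Mul(*[kappa(b) for b in p]) for p in partitions(list(indices)))

print(moment('rs'))     # k_r,s + k_r*k_s
print(moment('rst'))    # the five-term expansion of kappa^{rst} above
print(len(list(partitions(list('rstu')))))   # 15 partitions of four indices
```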

Every generalized cumulant can be expressed as a sum of certain products of ordinary cumulants. Some examples are as follows:
$$\kappa^{rs,t} = \kappa^{r,s,t} + \kappa^r\kappa^{s,t} + \kappa^s\kappa^{r,t} = \kappa^{r,s,t} + \kappa^r\kappa^{s,t}[2]$$
$$\kappa^{rs,tu} = \kappa^{r,s,t,u} + \kappa^{r,s,t}\kappa^u[4] + \kappa^{r,t}\kappa^{s,u}[2] + \kappa^{r,t}\kappa^s\kappa^u[4]$$
$$\kappa^{rs,t,u} = \kappa^{r,s,t,u} + \kappa^{r,t,u}\kappa^s[2] + \kappa^{r,t}\kappa^{s,u}[2]$$
Each generalized cumulant is associated with a partition $\tau$ of the given set of indices. For example, $\kappa^{rs,t,u}$ is associated with the partition $\tau = rs|t|u$ of four indices into three blocks. Each term on the right is a cumulant product associated with a partition $\sigma$ of the same indices. The coefficient is one if the least upper bound $\sigma \vee \tau$ has a single block, otherwise zero. Thus, with $\tau = rs|t|u$, the product $\kappa^{r,s}\kappa^{t,u}$ does not appear on the right because $\sigma \vee \tau = rs|tu$ has two blocks.

As an example of the way these formulae may be used, let X be a scalar random variable with cumulants $\kappa_1, \kappa_2, \kappa_3, \ldots$. By translating the second formula in the preceding list, we find that the variance of the squared variable is
$$\operatorname{var}(X^2) = \kappa_4 + 4\kappa_3\kappa_1 + 2\kappa_2^2 + 4\kappa_2\kappa_1^2,$$
reducing to $\kappa_4 + 2\kappa_2^2$ if the mean is zero.
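This variance formula is easily checked by simulation; a minimal sketch assuming NumPy, with $X \sim \mathrm{Exp}(1)$ so that $\kappa_r = (r-1)!$ and the formula gives $6 + 8 + 2 + 4 = 20$:

```python
# Monte Carlo check of var(X^2) = kappa_4 + 4*kappa_3*kappa_1
# + 2*kappa_2^2 + 4*kappa_2*kappa_1^2 for X ~ Exp(1); directly,
# var(X^2) = E(X^4) - E(X^2)^2 = 24 - 4 = 20.
import numpy as np

x = np.random.default_rng(0).exponential(size=10**6)
print((x**2).var())   # ~20
```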

1.3 Exponential families

2 Approximation of distributions

2.1 Edgeworth approximation

2.2 Saddlepoint approximation

3 Samples and sub-samples

A function $f: \mathbb{R}^n \to \mathbb{R}$ is symmetric if $f(x_1, \ldots, x_n) = f(x_{\pi(1)}, \ldots, x_{\pi(n)})$ for each permutation $\pi$ of the arguments. For example, the total $T_n = x_1 + \cdots + x_n$, the average $T_n/n$, the min, max and median are symmetric functions, as are the sum of squares $S_n = \sum x_i^2$, the sample variance $s_n^2 = (S_n - T_n^2/n)/(n-1)$ and the mean absolute deviation $\sum_{i \neq j} |x_i - x_j|/(n(n-1))$.

A vector x in $\mathbb{R}^n$ is an ordered list of n real numbers $(x_1, \ldots, x_n)$, or a function $x: [n] \to \mathbb{R}$ where $[n] = \{1, \ldots, n\}$. For $m \leq n$, a 1-1 function $\varphi: [m] \to [n]$ is a sample of size m, the sampled values being $x\varphi = (x_{\varphi(1)}, \ldots, x_{\varphi(m)})$. All told, there are $n(n-1)\cdots(n-m+1)$ distinct samples of size m that can be taken from a list of length n. A sequence of functions $f_n: \mathbb{R}^n \to \mathbb{R}$ is consistent under subsampling if, for each $f_m, f_n$,
$$f_n(x) = \operatorname{ave}_\varphi f_m(x\varphi),$$
where $\operatorname{ave}_\varphi$ denotes the average over samples of size m. For $m = n$, this condition implies only that $f_n$ is a symmetric function. Although the total and the median are both symmetric functions, neither is consistent under subsampling. For example, the median of the numbers (0, 1, 3) is one, but the average of the medians of samples of size two is 4/3. However, the average $\bar{x}_n = T_n/n$ is sampling consistent. Likewise the sample variance $s_n^2 = \sum (x_i - \bar{x}_n)^2/(n-1)$ with divisor $n-1$ is sampling consistent, but the mean squared deviation $\sum (x_i - \bar{x}_n)^2/n$ with divisor n is not. Other sampling-consistent functions include Fisher's k-statistics, the first few of which are $k_{1,n} = \bar{x}_n$, $k_{2,n} = s_n^2$ for $n \geq 2$, and
$$k_{3,n} = \frac{n \sum (x_i - \bar{x}_n)^3}{(n-1)(n-2)}, \qquad k_{4,n} = \frac{n(n+1)\sum (x_i - \bar{x}_n)^4 - 3(n-1)\bigl(\sum (x_i - \bar{x}_n)^2\bigr)^2}{(n-1)(n-2)(n-3)},$$
defined for $n \geq 3$ and $n \geq 4$ respectively. For a sequence of independent and identically distributed random variables, the k-statistic of order $r \leq n$ is the unique symmetric function such that $E(k_{r,n}) = \kappa_r$. Fisher (1929) derived the variances and covariances. The connection with finite-population sub-sampling was developed by Tukey (1954).
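Unbiasedness is easy to see in simulation. A sketch (illustrative sample sizes, assuming NumPy) drawing many samples of size $n = 10$ from $\mathrm{Exp}(1)$, for which $\kappa_2 = 1$, $\kappa_3 = 2$, $\kappa_4 = 6$:

```python
# Simulation check that the k-statistics are unbiased for the cumulants:
# average k_2, k_3, k_4 over many Exp(1) samples of size n = 10.
import numpy as np

rng = np.random.default_rng(7)
n, reps = 10, 200_000
x = rng.exponential(size=(reps, n))
d = x - x.mean(axis=1, keepdims=True)          # deviations from each sample mean
S2, S3, S4 = (d**2).sum(1), (d**3).sum(1), (d**4).sum(1)

k2 = S2 / (n - 1)
k3 = n * S3 / ((n - 1) * (n - 2))
k4 = (n * (n + 1) * S4 - 3 * (n - 1) * S2**2) / ((n - 1) * (n - 2) * (n - 3))
print(k2.mean(), k3.mean(), k4.mean())          # ~1, ~2, ~6
```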