A useful modeling trick.


.7 Joint models for more than two outcomes

We saw that we could write joint models for a pair of variables by specifying the joint probabilities over all pairs of outcomes. In principle, we could do this for three (or more) variables by specifying the probabilities of all combinations of outcomes. For example, for the outcome of three coin tosses, the probability of each of the 8 possible triplets would be 1/8.

A useful modeling trick. We have:

$$p_{Y|X}(y \mid x) = \frac{p_{XY}(x, y)}{p_X(x)}$$

There are three probabilities in the above relationship: $p_{Y|X}(y \mid x)$, $p_X(x)$, and $p_{XY}(x, y)$. Given any two we can recover the third. For example, if we have a model for the marginal and a model for the conditional, we can recover the joint as

$$p_{XY}(x, y) = p_{Y|X}(y \mid x)\, p_X(x)$$
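As a rough illustration of recovering the joint from a marginal and a conditional, here is a minimal Python sketch. The probability values and the names `p_x`, `p_y_given_x`, and `p_xy` are made up for the example and do not come from the notes.

```python
# Hypothetical numbers, chosen only for illustration.
p_x = {0: 0.5, 1: 0.5}                       # marginal p_X(x)
p_y_given_x = {0: {0: 0.8, 1: 0.2},          # conditional p_{Y|X}(y | x), keyed by x then y
               1: {0: 0.3, 1: 0.7}}

# Joint: p_XY(x, y) = p_{Y|X}(y | x) * p_X(x)
p_xy = {(x, y): p_y_given_x[x][y] * p_x[x]
        for x in p_x for y in p_y_given_x[x]}

# Going the other way: divide the joint by the marginal to recover the conditional.
recovered = {x: {y: p_xy[(x, y)] / p_x[x] for y in (0, 1)} for x in p_x}

total = sum(p_xy.values())                   # a valid joint sums to one
```

Any two of the three tables determine the third, which is the point of the trick.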

A similar result holds for more than two variables:

$$p(X_1, X_2, X_3) = p(X_1)\, p(X_2 \mid X_1)\, p(X_3 \mid X_1, X_2)$$

This is a nice way to model time series data:

$$P(X_1, X_2, \ldots, X_T) = P(X_1)\, P(X_2 \mid X_1)\, P(X_3 \mid X_2, X_1) \cdots P(X_T \mid X_{T-1}, \ldots, X_1)$$

$P(X_t \mid X_{t-1}, X_{t-2}, \ldots, X_1)$ is the model for $X_t$ given the past. If we know this, knowledge of past values of X tells us what the distribution of the next value is. So this says that we should be able to figure out the probability of the sequence $X_1, X_2, \ldots, X_T$ if we just knew all the conditionals on the right-hand side. This doesn't seem easier, but with some additional assumptions it can be! Let's consider a couple of special cases of the dependence structure: the iid model and the Markovian model.

.8 A special case: The IID Model

Remember our basic coin tossing example. We think of each toss as an outcome from the model:

$P(X = 1) = .5$, $P(X = 0) = .5$ (or $X \sim \text{Bernoulli}(.5)$)

[Figure: coin toss outcomes plotted against index.]

So each outcome has the same probability model (always a .5 chance of a head and a .5 chance of a tail). Each outcome is independent of the others (these are coin tosses!).

Consider the case of two coin tosses. To model how we think about the coins we have:

$X_1 \sim \text{Bernoulli}(.5)$, $X_2 \sim \text{Bernoulli}(.5)$, and $X_2$ is independent of $X_1$.

We say the two X's are independent and identically distributed: $X_i \sim \text{Bernoulli}(.5)$ iid. They are iid: independent (the first i) and identically distributed (the id).

To model the tossing of n coins: let $X_i$ denote the outcome of the $i$th coin, $i = 1, 2, \ldots, n$. We say the $X_i$ are iid. The outcome for each coin is independent of the outcome for all the others. For each coin, $X_i \sim \text{Bernoulli}(.5)$.

iid Random Variables

In general, if we say $X_1, X_2, \ldots, X_n$ are iid, we mean that each is independent of all the others, and they each have the same probability distribution. Notice that the $X_i$ can follow any model; there is nothing special about the Bernoulli!

Remember, the rv's refer to the possible outcomes before they happen. The rv's describe what can happen, and the probability tells us how likely each outcome is. Alternatively, we can observe data: the outcome or realization of an r.v. Sometimes we refer to iid data as draws from an iid r.v.

Example

Suppose we consider whether or not mortgages default:

$X_i = 1$ if the $i$th mortgage defaults
$X_i = 0$ if the $i$th mortgage doesn't default

To say the X's are iid Bernoulli(p) says a lot.

It can be a way to summarize what you have seen: the numbers we have already seen look like draws from the common distribution.

[Figure: mortgage default indicator plotted against index.]

And it can tell us what we expect to see in the future (if things don't change, of course). We are using the idea of iid rv's as a model for something in the real world.

Example

How do you think about tosses of a die? Let $Y_i$ denote the outcome for the $i$th toss. The Y's are iid with

y:     1    2    3    4    5    6
p(y):  1/6  1/6  1/6  1/6  1/6  1/6

for each toss of the die. Draws could look like this:

[Figure: die toss outcomes plotted against index.]

Example

Does this data look iid?

[Figure: a series plotted against index.]

Example: stock price moves. The series equals 1 if up and 0 if down. Does this data look iid?

[Figure: up/down indicator series plotted against index.]

Intuitively, the iid model is meant to describe numbers that (the first i) have no pattern and are random, but (the id) over the long haul, the probabilities of the distribution tell you how often certain values (or sets of values) occur. There is no pattern to coin tosses, but over the long haul you get about half heads.

What is the probability of a given sequence of outcomes for the iid model? Let's go back to our modeling approach of the last section and ask what happens to the following expression if the X's are iid:

$$P(X_1, X_2, \ldots, X_N) = P(X_1)\, P(X_2 \mid X_1)\, P(X_3 \mid X_2, X_1) \cdots P(X_N \mid X_{N-1}, \ldots, X_1)$$

Since the X's are independent, the conditional probabilities are just the unconditional ones:

$$P(X_2 \mid X_1) = P(X_2), \quad P(X_3 \mid X_2, X_1) = P(X_3), \quad \text{etc.}$$

So

$$P(X_1, X_2, \ldots, X_N) = P(X_1)\, P(X_2) \cdots P(X_N)$$

OK, so the X's are iid, and the first i means they are independent. In this case:

$$P(X_1, X_2, \ldots, X_N) = P(X_1)\, P(X_2) \cdots P(X_N)$$

Since they are iid, the id part means we use the same probability table for each $X_i$. That is, the model for $X_1$ is the same as the model for $X_2$ and so on. They are all Bernoulli(.5).

What is the chance of getting a head followed by two tails on three tosses of a coin? With $X_i \sim \text{Bernoulli}(.5)$:

$$P(X_1 = 1, X_2 = 0, X_3 = 0) = P(X_1 = 1)\, P(X_2 = 0)\, P(X_3 = 0) = .5 \times .5 \times .5$$

What is the chance of getting heads followed by tails?
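The product rule for iid outcomes is easy to check numerically. The following is a minimal Python sketch; the function name `iid_bernoulli_sequence_prob` is an assumption made for this illustration, with outcomes coded 1 for a head and 0 for a tail.

```python
import math

def iid_bernoulli_sequence_prob(outcomes, p):
    """P(X_1 = x_1, ..., X_N = x_N) for iid Bernoulli(p) draws.

    Independence lets us multiply the marginal probabilities; identical
    distribution means the same p is used for every position.
    """
    return math.prod(p if x == 1 else 1 - p for x in outcomes)

# A head followed by two tails on three tosses of a fair coin.
head_then_two_tails = iid_bernoulli_sequence_prob([1, 0, 0], 0.5)
```

For a fair coin every sequence of length three gets the same probability, .5 × .5 × .5, so the order of heads and tails does not matter here; with p different from .5 only the counts of heads and tails matter.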

If mortgage defaults are iid Bernoulli(.12), what is the chance of the first 5 not defaulting and the 6th mortgage defaulting?

$$P(X_1 = 0, X_2 = 0, X_3 = 0, X_4 = 0, X_5 = 0, X_6 = 1) = .88^5 \times .12$$

A special case (non iid): the Markovian Model

A simple time series model might say that $Y_t$ depends only on the most recent value $Y_{t-1}$, but not on any others. In this case

$$P(Y_1, Y_2, \ldots, Y_T) = P(Y_1)\, P(Y_2 \mid Y_1)\, P(Y_3 \mid Y_2, Y_1) \cdots P(Y_T \mid Y_{T-1}, \ldots, Y_1)$$

becomes

$$P(Y_1, Y_2, \ldots, Y_T) = P(Y_1)\, P(Y_2 \mid Y_1)\, P(Y_3 \mid Y_2) \cdots P(Y_T \mid Y_{T-1})$$

If $P(Y_t \mid Y_{t-1})$ is the same for all t, then we only need to specify a model for $P(Y_t \mid Y_{t-1})$ in order to get $P(Y_1, Y_2, \ldots, Y_T)$.

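Example: Let $Y_t$ be an indicator for whether the $t$th trade is buyer or seller initiated. $Y_t = 1$ denotes buyer initiated and $Y_t = 0$ denotes seller initiated. We might be able to model it as Markovian. Will there be persistence in the process, i.e. will a buy tell you something different about the next trade than a sell?

[Tables: the marginal distribution P(y), with probability .5 on each outcome, and the conditional distributions $P(y_t \mid y_{t-1})$ for each value of $y_{t-1}$.]

Then we could figure out the probability of any sequence of outcomes! For example, for four trades,

$$P(Y_1, Y_2, Y_3, Y_4) = P(Y_1)\, P(Y_2 \mid Y_1)\, P(Y_3 \mid Y_2)\, P(Y_4 \mid Y_3),$$

which is .5 times three conditional probabilities read from the tables.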
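To make the Markovian calculation concrete, here is a small Python sketch. The transition probabilities below are hypothetical (the values in the example's tables are not reproduced here); only the .5 starting probabilities follow the notes. The helper name `markov_sequence_prob` is also an assumption.

```python
from itertools import product

initial = {0: 0.5, 1: 0.5}             # P(Y_1 = y)
transition = {0: {0: 0.7, 1: 0.3},     # P(Y_t = y | Y_{t-1} = 0), hypothetical values
              1: {0: 0.3, 1: 0.7}}     # P(Y_t = y | Y_{t-1} = 1), hypothetical values

def markov_sequence_prob(seq, initial, transition):
    """P(Y_1, ..., Y_T) = P(Y_1) * product over t of P(Y_t | Y_{t-1})."""
    prob = initial[seq[0]]
    for prev, cur in zip(seq, seq[1:]):
        prob *= transition[prev][cur]
    return prob

# Four buyer-initiated trades in a row: .5 * .7 * .7 * .7 under these made-up numbers.
p_run = markov_sequence_prob([1, 1, 1, 1], initial, transition)

# Sanity check: the probabilities of all 16 length-4 sequences sum to one.
total = sum(markov_sequence_prob(s, initial, transition)
            for s in product((0, 1), repeat=4))
```

With persistence (a large chance of repeating the previous outcome), runs such as four buys in a row are more likely than they would be under an iid model with the same marginal.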

.9 Models and Formulas

We often use mathematical formulas to describe how numerical quantities are related. We can do this with our models as well.

Example

Suppose you are playing a game where you toss a coin and win a fixed dollar amount if it comes up heads and lose a fixed dollar amount if it comes up tails. Let W denote your winnings. What is the distribution of W? W takes each of its two possible values with probability .5.

We can represent this in another way using the Bernoulli distribution. Let $X \sim \text{Bernoulli}(.5)$. Then W can be written as a linear formula in X. Whatever X turns out to be, the formula gives the W.

Example

Let R be your uncertain return. Suppose you invest a fixed dollar amount. How is your end-of-period wealth related to R?

Example

Suppose you toss two coins: $X_1$ and $X_2 \sim \text{Bernoulli}(.5)$ iid. Let $Y = X_1 + X_2$. What does Y mean?

The Binomial Distribution

We have seen how these probability models can be used to think about coin tosses, die tosses, and defects. We use probability to model a wide variety of phenomena in the real world. There are many types of distributions that are useful for various situations. Our most basic type of distribution is the Bernoulli. In this section we learn about the Binomial. In the next section, we will consider additional models.

Let $X_1, X_2, \ldots, X_n$ denote n iid Bernoulli(p) random variables. Let

$$Y = X_1 + X_2 + \cdots + X_n = \sum_{i=1}^{n} X_i$$

What does Y mean?

i) You try something n times ($X_i$ denotes the $i$th outcome).
ii) Each time you have the same chance p of success ($X_i$ is one for a success and zero for a failure).
iii) Each time you try, your probability of a success does not depend on any of the other outcomes.
iv) Y simply counts the number of successes in n tries.

Binomial Distribution

The binomial distribution is the probability distribution for the total number of successes:

$$Y = X_1 + X_2 + \cdots + X_n = \sum_{i=1}^{n} X_i$$

Example: What is the probability of succeeding 4 times in 5 (independent) tries when the probability of a success on any given try is p? The Binomial distribution answers this question.

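More examples

How many heads do I get when I toss a coin a given number of times? If Kobe Bryant makes a fixed percentage of his free throw shots, how many does he make in a given number of attempts (assuming the outcomes are independent!)?

Suppose n = 2 and p = .5:

(x1, x2)   p(x1, x2)   y
(0, 0)     .25         0
(0, 1)     .25         1
(1, 0)     .25         1
(1, 1)     .25         2

[Figure: tree diagram for $X_1$ and $X_2$, with probability .5 on each branch.]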

Suppose $X_1, X_2, \ldots, X_n$ are iid Bernoulli(p). Then

$$Y = X_1 + X_2 + \cdots + X_n$$

has the Binomial distribution with parameters n and p. We write $Y \sim B(n, p)$. $X_i$ tells you whether it happened on the $i$th trial. Y is the total number of times it happened out of n trials.

There is a formula giving the binomial probabilities:

$$p_Y(y) = \frac{n!}{(n-y)!\, y!}\, p^y (1-p)^{n-y}, \qquad y = 0, 1, 2, \ldots, n$$

Here $\frac{n!}{(n-y)!\, y!}$ is the number of ways to get y successes on n tries, and $p^y (1-p)^{n-y}$ is the probability of getting y successes on n tries (one way only), where $n! = n(n-1)(n-2) \cdots (3)(2)(1)$.
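The binomial formula translates directly into code. This is a minimal Python sketch using the standard library's `math.comb` for the number of ways; the function name `binomial_pmf` is chosen for this illustration.

```python
from math import comb

def binomial_pmf(y, n, p):
    """P(Y = y) for Y ~ B(n, p).

    comb(n, y) counts the ways to place y successes among n tries, and
    p**y * (1 - p)**(n - y) is the probability of any one such arrangement.
    """
    return comb(n, y) * p**y * (1 - p)**(n - y)

# Two fair coin tosses: Y is the number of heads.
pmf_n2 = [binomial_pmf(y, 2, 0.5) for y in range(3)]

# The probabilities over y = 0, ..., n always sum to one.
total_n5 = sum(binomial_pmf(y, 5, 0.3) for y in range(6))
```

For n = 2 and p = .5 this gives probabilities .25, .5, .25 for 0, 1, and 2 heads: the two middle outcomes (0,1) and (1,0) both give y = 1.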

Example

[Figure: p(y) plotted against y for three binomial distributions with different values of p, including p = .5 and p = .8.]

Example

A firm was being sued for sexual discrimination. As a (small) part of the evidence the following data was used. Each point corresponds to a firm in the same industry. The x axis gives the number of partners. The y axis gives the number of female partners.

[Figure: scatter plot of the number of female partners against the number of partners, one point per firm, with the point for the firm in question marked.]

Clearly the point corresponding to the firm looks unusual. How can we quantify this? If whether a partner is male or female is iid Bernoulli(p), then the total number of female partners at the $i$th firm should be $B(n_i, p)$, where $n_i$ is the number of partners at the $i$th firm. What should we use for p? Not counting the firm in question, 7% of partners are female. Let's estimate p = .07. (But we could be wrong!!)
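One way to quantify "unusual" is to tabulate the binomial probabilities for a firm of a given size. The sketch below assumes n = 85 partners (the size the notes give for the firm in question) and p = 0.07 (the 7% estimate); it is an illustration, not the original calculation.

```python
from math import comb

n, p = 85, 0.07          # firm size and estimated chance that a partner is female

# p(y) for every possible number of female partners y = 0, ..., n.
pmf = [comb(n, y) * p**y * (1 - p)**(n - y) for y in range(n + 1)]

# Running totals give P(Y <= y), the chance of seeing y or fewer.
cdf = []
running = 0.0
for prob in pmf:
    running += prob
    cdf.append(running)

for y in range(15):
    print(f"{y:2d}  p(y) = {pmf[y]:.6f}  P(Y <= y) = {cdf[y]:.6f}")
```

Reading off the row for the observed count shows how far into the tail of B(85, .07) the firm sits. The cumulative column is often the more useful one, since it answers "how likely is a count this small or smaller?".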

[Table and figure: p(y) for Y ~ B(85, .07), listed for y = 0, 1, ..., 14 and plotted against y.]

Under our assumptions, the probability of any given number of female partners at the firm with 85 partners can be read from this distribution.

A non iid Model, the Random Walk

[Figure: the price of a stock plotted against index.]

This is a plot of the price of a stock. The price is recorded every time it changes. Each price change is one tick. Does this data look iid?

The trick here is to look at the price changes:

$$D_t = P_t - P_{t-1}, \qquad t = 2, 3, 4, \ldots$$

[Figure: the price changes $D_t$ plotted against index.]

The $D_t$ look iid with Pr($D_t$ = one tick up) = .55 and Pr($D_t$ = one tick down) = .45.

What is $p(p_t \mid p_{t-1}, p_{t-2}, \ldots, p_1)$? Our model is

$$P_{t+1} = P_t + D_{t+1}$$

where $D_{t+1}$ has the distribution

d:     one tick down   one tick up
p(d):  .45             .55

and the D's are iid. What is the conditional probability distribution of $P_{t+1}$ given $P_t = p_t$?

p(t+1):           p(t) minus one tick   p(t) plus one tick
p(p(t+1) | p(t)): .45                   .55

Given our model, how would you predict the next price from the last price in the series?

Data that kind of wanders can often be modeled as a random walk:

$$P_t = P_{t-1} + D_t$$

where the D's look iid from some distribution. The next value is the current value plus a random increment.
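A random walk like this is simple to simulate, which helps build intuition for what "wandering" data looks like. In the Python sketch below the tick size of 0.01 and the starting price of 1.0 are hypothetical; the .55/.45 up/down probabilities follow the notes.

```python
import random

def simulate_random_walk(p0, n_steps, tick=0.01, p_up=0.55, seed=0):
    """Simulate P_t = P_{t-1} + D_t with iid increments D_t = +tick or -tick."""
    rng = random.Random(seed)          # fixed seed so the run is repeatable
    prices = [p0]
    for _ in range(n_steps):
        d = tick if rng.random() < p_up else -tick
        prices.append(prices[-1] + d)
    return prices

prices = simulate_random_walk(p0=1.0, n_steps=100)

# Differencing the prices recovers the iid increments.
changes = [b - a for a, b in zip(prices, prices[1:])]

# Prediction: last price plus the expected increment, .55 * tick - .45 * tick.
expected_next = prices[-1] + (0.55 * 0.01 + 0.45 * -0.01)
```

The prices themselves are not iid (each one depends on the previous one), but their differences are, which is why the modeling is done on the changes.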

Models for continuous outcomes

.1 The pdf
.2 The Normal Family of Distributions
.3 The cdf
.4 IID Draws from the Normal Distribution
.5 The Histogram and IID Draws
.6 The Normal Distribution and Data
.7 The Inverse CDF and VaR
.8 Standardization

.1 Continuous Random Variables, the pdf

Remember this returns example?

[Table: a discrete distribution for the return R, giving p(r) for each of three possible values r.]

This is unrealistic, as it is unlikely that you know the return will be one of three possible values. It may be inconvenient to pick a reasonably small list of values that seem to cover all possibilities.

Consider a spinner that can stop at any point between zero and one.

[Figure: a spinner dial with marks at .25, .5, and .75.]

What is the probability that the spinner stops at exactly .5? What is the probability that the spinner stops at any point between .25 and .5?

Clearly, there are modeling situations where we need our models to be able to take on any value, or any (continuous) value in an interval. In this case we cannot simply list all the possible values and give each one a probability. We need a new way to specify probabilities.

NEW TRICK: Instead of specifying the probabilities for specific values, we specify probabilities for intervals of outcomes.

Old (discrete): Pr(X = 4) = .7
New (continuous): Pr(X is in (4, 8)) = .7

In general we specify Pr(X in (a, b)) for any values a and b with a < b. An easy way to do this is with the probability density function (pdf).

Probability Density Function (PDF)

Again let x denote a possible value of the random variable X. The pdf (probability density function), denoted by f(x), is a function of x such that the probability of any interval [a, b] is given by the area under the graph of the function between a and b.

In our spinner example, all values of X between zero and one were equally likely. Here is what the density looks like:

[Figure: a flat density f(x) of height 1 over x between 0 and 1.]

Why is the height 1? What is the probability that the outcome for X is between .25 and .5? What is the probability that the outcome for X is between .5 and .75?

A density function where not all values are equally likely will not be flat:

[Figure: a bell-shaped density f(x) plotted against x, with the area over the interval [0, 2] shaded; the area is .477.]

For the rv X, the probability that it is in the interval [0, 2] is .477. 47.7 percent of the time X will fall in this interval.

[Figure: the same density with a different interval shaded.]

Here is a probability density function that is not symmetric and only takes positive values:

[Figure: a right-skewed density over positive values.]

Most of the probability is concentrated at the low end of the range, but you could get a value that is much bigger. This kind of distribution is called skewed to the right.

For a continuous random variable X, the probability of the interval (a, b) is the area under the probability density function from a to b. For technical reasons the probability of any one value is 0. Any non-negative valued function with total area under the curve equal to one is a density function.

.2 The Normal Family of Distributions

The rv having this pdf is very special:

[Figure: the standard normal density f(x) plotted against x.]

This distribution is called the standard normal distribution. If Z has this distribution then:

Pr(−1 < Z < 1) = .68
Pr(−1.96 < Z < 1.96) = .95

Note:

P(0 < Z < 1) = .34
P(−1 < Z < 1) = .68
P(−2 < Z < 2) = .954
P(−1.96 < Z < 1.96) = .95
P(−3 < Z < 3) = .9974

NB. In these notes I will usually act as if 1.96 = 2.

The Normal Family of Distributions

We are going to use the normal distribution to describe our uncertainty about things in the real world. The standard normal distribution is not too exciting, as it is centered around 0, with probability .95 of being within ±2. We can create a family of interesting distributions from the standard normal by moving it around, spreading it out, and tightening it up.
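These standard normal probabilities can be checked with Python's standard library. The sketch below uses `statistics.NormalDist`, whose `cdf` method returns P(Z ≤ z); the helper name `prob_within` is an assumption for the illustration.

```python
from statistics import NormalDist

Z = NormalDist()                     # standard normal: mean 0, standard deviation 1

def prob_within(k):
    """P(-k < Z < k): area under the standard normal pdf between -k and k."""
    return Z.cdf(k) - Z.cdf(-k)

within = {k: prob_within(k) for k in (1, 1.96, 2, 3)}
upper_half_of_one = Z.cdf(1) - Z.cdf(0)   # P(0 < Z < 1), half of P(-1 < Z < 1)
```

The values for 1.96 and 2 are close (about .95 and .954), which is why treating 1.96 as 2 is a harmless shortcut for most purposes.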

We can do both. Let

$$X = \mu + \sigma Z$$

[Figure: the density f(x) of X plotted against x.]

95% chance of being in $(\mu - 2\sigma, \mu + 2\sigma)$
68% chance of being in $(\mu - \sigma, \mu + \sigma)$

The Normal Distribution: We write $X \sim N(\mu, \sigma^2)$ for $X = \mu + \sigma Z$ with Z standard normal.

You can see where the empirical rule comes from!

We have a family of distributions. For each pair $(\mu, \sigma)$ we get a normal distribution. $\mu$ determines the center of the distribution. $\sigma$ determines how spread out the probability is around the center. Note: $\sigma \ge 0$.

Note: in the next section of the notes we will see that $\mu$ is the mean, $\sigma$ is the standard deviation of the distribution, and $\sigma^2$ is the variance. I'll use these names right away, but explain what they mean later (next section of notes).

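We won't have to use this directly, but this is what the Normal density function looks like:

$$f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left( -\frac{(x - \mu)^2}{2\sigma^2} \right)$$

Each of these normal distributions has one of three values of $\mu$ and one of three values of $\sigma$. Which is which?

[Figure: several normal densities with different centers and spreads plotted against x.]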

Be careful!!!! If we say X ~ N(5, 4), then $\mu = 5$ and $\sigma = 2$. That is, we use the mean and variance to specify a normal distribution. I wish it had been the mean and standard deviation.

.3 The cdf

Computing probabilities from a pdf requires computing areas under the pdf curve. The Cumulative Distribution Function (CDF) is a tool that computes specific areas for us. For a random variable X, the cdf, which we denote by F (we used f for the pdf), is defined by

$$F_X(x) = P(X \le x)$$

THE CDF OF X GIVES US A PROBABILITY!!! Just a number.

Here is the cdf for the density given earlier:

[Figure: the cdf F(x) = P(X < x) plotted against x, shown with the density f(x).]

What is F(0)? F(−1)? F(1)? E.g., for the standard normal, F(0) = .5, F(−1) = .16, and F(1) = .84.

The cdf is handy for computing the probability of intervals:

$$P(a < X \le b) = P(X \le b) - P(X \le a) = F_X(b) - F_X(a)$$

[Figure: three density plots showing the area up to b, minus the area up to a, equals the area between a and b.]

[Figure: the cdf F(x) plotted against x, with F(a) and F(b) marked.]

The probability of an interval is the jump in the cdf over that interval. Note: for x big enough, F(x) must get close to 1; for x small enough, F(x) must get close to 0.

Example

Let R denote the return on our portfolio next month. We don't know what R will be. Let's assume we can describe what we think it will be by $R \sim N(\mu, \sigma^2)$, where $\mu$ is the mean return and $\sigma = .04$. Use the =NORMDIST() function in Excel.

[Figure: the density p(r) plotted against r.]

What is the probability of a negative return? What is the probability of a return between 0 and .05?
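The same interval calculation can be done outside Excel. The Python sketch below is a rough equivalent using `statistics.NormalDist`; the mean of 0.01 is a hypothetical value chosen for illustration, and the standard deviation of 0.04 is assumed as the example's spread. Note that `NormalDist` takes the standard deviation, not the variance.

```python
from statistics import NormalDist

mu = 0.01        # hypothetical mean monthly return
sigma = 0.04     # assumed standard deviation of the monthly return

R = NormalDist(mu, sigma)            # second argument is sigma, not sigma squared

# P(R < 0) is just the cdf evaluated at 0.
p_negative = R.cdf(0.0)

# P(0 < R <= .05) = F(.05) - F(0): the jump in the cdf over the interval.
p_between = R.cdf(0.05) - R.cdf(0.0)
```

Under these assumed parameters a negative return is quite likely (a little over 40%), because zero is only a quarter of a standard deviation below the mean.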

.4 What would data generated by an iid Normal look like?

Remember how we used the idea of iid draws from the Bernoulli(.5) distribution to model tossing a coin? We want to use the normal distribution to model data in the real world. Surprisingly often, data looks like iid draws from a normal distribution.

We can have iid draws from any distribution. By

$$X_1, X_2, \ldots, X_n \sim N(\mu, \sigma^2) \text{ iid}$$

we mean each X will be an independent draw from the same normal distribution. We haven't formally defined independence for continuous distributions, but our intuition is the same!! What do iid normal draws look like?

The computer can generate iid draws from the normal distribution.

[Figure: iid standard normal draws plotted against index.]

There is no pattern; they look random.

[Figure: the same draws with lines drawn in at 0 and ±2.]

In the long run, 95% will be between the lines at ±2.

Here are draws from a normal other than the standard one.

[Figure: iid draws from a non-standard normal distribution plotted against index.]

How would you predict the next one?

.5 What is the relationship between a histogram from iid draws of a Normal and the Normal pdf?

Here is the histogram of draws from the standard normal. The height of each bar tells us the percentage of observations in the interval. The width of each interval is .5.

[Figure: histogram (percent) of the standard normal draws plotted against z.]

We can see about 68% are between −1 and 1.

If we use the density option, the area of each bar is the fraction in the interval.

[Figure: histogram (density scale) of the standard normal draws plotted against z.]

It looks the same, but the vertical scale is different. For a large number of draws, the observed percent in a given interval should get close to the probability:

For the density, the area is the probability of the interval.
For the histogram, the area is the observed percent in the interval.

In large samples these are close.

[Figure: the density-scale histogram with the standard normal pdf overlaid.]
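A quick simulation shows the histogram area and the pdf area lining up. This Python sketch draws from the standard normal with the standard library; the sample size of 100,000, the seed, and the choice of the bin from −1.0 to −0.5 are all arbitrary choices for the illustration.

```python
import random
from statistics import NormalDist

rng = random.Random(42)                       # fixed seed so the run is repeatable
draws = [rng.gauss(0.0, 1.0) for _ in range(100_000)]

width = 0.5                                   # bin width
left = -1.0                                   # look at the single bin [-1.0, -0.5)

# Observed fraction of draws in the bin, and the bar height on the density scale.
fraction = sum(left <= z < left + width for z in draws) / len(draws)
density_height = fraction / width             # height * width = fraction in the bin

# Exact probability of the same bin: area under the pdf, computed from the cdf.
Z = NormalDist()
exact_prob = Z.cdf(left + width) - Z.cdf(left)

# The observed share within one standard deviation should be near 68%.
frac_within_1 = sum(-1.0 <= z <= 1.0 for z in draws) / len(draws)
```

With this many draws the observed fraction and the exact probability typically agree to two decimal places; with only a few hundred draws the match is noticeably rougher.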

Example

[Figures: histograms (frequency) of iid normal draws for two different sample sizes.]

The histogram of a large number of iid draws from any distribution should look like the pdf.

Example

[Figures: histogram of draws that are uniform on (0, 1), and histogram of draws from a normal distribution.]

.6 Conversely, if we see data in the real world, we might ask if it could have come from an iid normal model.

The returns data for Canada:

[Figure: the Canadian returns plotted against index (real data), beside simulated iid normal data.]

[Figure: histogram (frequency) of the Canadian returns.]

The histogram of the real data looks normal!

If we think $\sigma$ is about .04, then our best guess at the next return is $\mu$, and an interval which has a 95% chance of containing the next return would be $\mu \pm .08$.

[Figure: the Canadian returns plotted against index, with the interval drawn in.]

We used iid Bernoulli draws to model coin tosses and defects. Now we are using the idea of iid normal draws to model returns! To say the returns look iid normal

(1) summarizes the past
(2) tells us how to predict the future

It is a powerful statement about the real world.

Unanswered questions: How do we know the normal is right (good)? How do we best choose $\mu$ and $\sigma$?

Example

Of course, not all data looks normal: daily volume of trades in the cattle pit.

[Figure: histogram (frequency) of daily volume; skewed right.]

Sometimes we can succinctly describe real world data by saying they look normal. In this case we would really like to know $(\mu, \sigma)$. Similarly, if data are iid Bernoulli, we would like to know p. Given data, we will estimate the parameters.

Example

Do these look like iid draws?

[Figures: Dow Jones index (dji), lake level, and beer production, each plotted against index.]

.7 The Inverse CDF and Value at Risk (VaR)

One intuitive measure of the risk associated with a financial position (portfolio) is to answer the question "If my portfolio crashes, how much will I lose?" Value at Risk (VaR) answers this question and is a common tool used to assess the risk of holding a portfolio.

We must define what we mean by a crash. Let's say it is a (bad) event that occurs only a small fraction $\alpha$ of the time. If one of these days occurs, how much would you lose?

Formal Definition: VaR

More formally, let X denote a model for the uncertain return on a portfolio with cdf given by F(x). We define the $\alpha$ value at risk as

$$\text{VaR}_\alpha = x \quad \text{where} \quad F(x) = \alpha$$

[Figure: the density p(r) plotted against r, with the lower tail of area $\alpha$ marked.]

So the $\alpha$ VaR is the return such that there is probability $\alpha$ that your return is worse than that value and probability $1 - \alpha$ that you do better than that return.

The inverse CDF function

Instead of asking what Pr(X < x) is, we want to find the x such that Pr(X < x) = $\alpha$. If $\alpha$ is .01, this would mean we want to find the first percentile of the distribution. We have the computer find this value of x using the inverse cdf function.

Suppose that daily S&P 500 returns are normally distributed with mean $\mu$ and standard deviation $\sigma$. In Excel: =NORMINV($\alpha$, $\mu$, $\sigma$)

[Figure: the cdf P(X <= x) plotted against x.]

Example

Suppose an investment portfolio is invested in the S&P 500. The VaR is a return, which translates into a dollar value when multiplied by the amount invested. There is a 1% chance that the portfolio loses that amount or more over the period. We are 99% sure that we get a return larger than the VaR.
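A rough Python equivalent of the Excel inverse-cdf call uses `statistics.NormalDist.inv_cdf`. In the sketch below the mean, standard deviation, and portfolio value are hypothetical numbers chosen for illustration, not the values from the example.

```python
from statistics import NormalDist

mu = 0.0005                 # hypothetical mean return per period
sigma = 0.01                # hypothetical standard deviation per period
alpha = 0.01                # crash probability: the first percentile
portfolio_value = 1_000_000 # hypothetical amount invested, in dollars

returns = NormalDist(mu, sigma)

# The alpha VaR is the return x with F(x) = alpha, found by inverting the cdf.
var_return = returns.inv_cdf(alpha)

# Translate the return into dollars by multiplying by the amount invested.
var_dollars = var_return * portfolio_value

# Check: plugging the VaR back into the cdf recovers alpha.
check = returns.cdf(var_return)
```

Under these made-up parameters the 1% VaR is a loss of a bit over 2% per period. The normal assumption matters: if real returns have fatter tails than the normal, this calculation understates the loss.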

.8 Standardization

How unusual is it? Sometimes something weird or unusual happens and we want to quantify just how weird it is. Suppose a market crashes.

[Figure: monthly returns on a market index through Oct 87, plotted against index.]

How crazy is the crash? We can use a normal curve to describe all the values except the last.

[Figure: histogram of the returns before the crash, with a normal curve overlaid.]

The curve has $\mu = .0127$ and $\sigma = .0437$. The crash month return was −.2176.

[Figure: the $N(.0127, .0437^2)$ density p(r) plotted against r, with the crash return marked.]

Wow! The crash return was way out in left field!!!

We can do essentially the same thing by standardizing the value. We ask: if the value were a standard normal, what would it be? We can think of our return values as

$$r = .0127 + .0437 z \qquad (\mu + \sigma z)$$

So the z value corresponding to an r value is

$$z = \frac{r - \mu}{\sigma} = \frac{r - .0127}{.0437}$$

The z values should look standard normal.

How unusual is the crash return?

$$z = \frac{-.2176 - .0127}{.0437} = -5.27$$

Its z value is −5.27!!!! It is like getting a value of −5.27 from the standard normal. Never!!

Here are the z values for the previous months:

[Figure: histogram (frequency) of the z values for the months before the crash.]

Another way to say it is that the crash return was 5.3 standard deviations away from the mean. For values $X \sim N(\mu, \sigma^2)$, the z value corresponding to an x value is

$$z = \frac{x - \mu}{\sigma}$$

It can be interpreted as the number of standard deviations the x value is from the mean.
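Standardization is a one-line calculation. In the Python sketch below, the mean of 0.0127, standard deviation of 0.0437, and crash return of −0.2176 are assumed readings of the example's values, so treat the specific numbers as illustrative; the function name `z_score` is also an assumption.

```python
from statistics import NormalDist

def z_score(x, mu, sigma):
    """Number of standard deviations that x lies from the mean."""
    return (x - mu) / sigma

mu, sigma = 0.0127, 0.0437     # assumed mean and sd of the pre-crash monthly returns
crash_return = -0.2176         # assumed crash-month return

crash_z = z_score(crash_return, mu, sigma)

# If returns really were normal, the chance of a month at least this bad:
tail_prob = NormalDist().cdf(crash_z)
```

A z value beyond 5 in magnitude corresponds to a tail probability far below one in a million under the normal model, which is what "Never!!" is getting at. It can also be read as evidence that the normal model understates the chance of extreme months.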

Question

What is the interpretation of the standardized value if the distribution is not normal?