STAT 825 Notes: Random Number Generation


What if R/Splus/SAS doesn't have a function to randomly generate data from a particular distribution? Although R, Splus, SAS, and other packages can generate data from many different distributions, certainly most of the common ones, you may need to generate data from a distribution for which there is no existing function. The following tools will aid you in generating data from some of these other distributions. The methods described here can be thought of as direct methods. We will also discuss some indirect methods a bit later.

Additional comment: If you find yourself writing a Fortran or C++ program one day, you may only be able to generate uniform random variates. Some of the techniques below will be necessary for generating variates from even the most common distributions. (Note that routines exist for many of these distributions. You just need to know where to look.)

Transformations: As you may recall from Theory I and II, many common distributions are related to one another through transformations. For example,

- If X ~ N(0, 1) and Y = X^2, then Y ~ χ²(1).
- Let X ~ N(0, 1) and Y ~ χ²(r), where X and Y are independent, and let V = X / sqrt(Y/r). Then V ~ t(r).
- Let X_1 and X_2 be independent Exp(θ), where θ is the mean, and let Y = X_1 - X_2. Then Y ~ Double Exponential(0, θ), where 0 is the value of the location parameter and θ is the scale parameter.
- If X ~ Gamma(α, β) and Y = 1/X, then Y ~ Inverted Gamma(α, β).
- If X ~ Exp(θ) and Y = X^(1/γ), then Y ~ Weibull(γ, θ).

An important transformation is the probability integral transformation: Let X be a random variable with a continuous cdf F_X(x) and define the random variable Y as Y = F_X(X). Then Y is uniformly distributed on (0, 1), i.e., Y ~ Uniform(0, 1).

The probability integral transformation implies the following: If Y is a continuous random variable with cdf F_Y, then the random variable F_Y^{-1}(U), where U ~ Uniform(0, 1), has distribution F_Y.
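As a quick sanity check (a sketch, not part of the original notes; the seed, n, and θ are arbitrary choices), the exponential-difference transformation listed above can be verified by simulation: the difference of two independent Exp(θ) variates should have mean 0 and variance 2θ².

```r
# Sketch: Double Exponential(0, theta) as the difference of two
# independent Exp(theta) variates (theta = mean parameterization).
set.seed(825)
n     <- 1e5
theta <- 2
x1 <- rexp(n, rate = 1/theta)   # rexp() is parameterized by rate = 1/mean
x2 <- rexp(n, rate = 1/theta)
y  <- x1 - x2                   # Double Exponential(0, theta)

mean(y)   # should be near 0
var(y)    # should be near 2 * theta^2 = 8
```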
Example: Recall that if Y ~ Exp(θ), then the cdf of Y is F_Y(y) = 1 - exp(-y/θ). The inverse of F_Y is F_Y^{-1}(t) = -θ log(1 - t).
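As a hedged check of the algebra, the inverse just derived can be compared against R's built-in exponential quantile function (note that qexp() is parameterized by the rate, 1/θ):

```r
# Compare the closed-form inverse cdf with R's built-in qexp().
theta   <- 3
t       <- c(0.1, 0.5, 0.9)
manual  <- -theta * log(1 - t)       # F_Y^{-1}(t) derived above
builtin <- qexp(t, rate = 1/theta)   # built-in exponential quantiles
manual - builtin                     # essentially zero
```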

Therefore, if U ~ Uniform(0, 1), then F_Y^{-1}(U) = -θ log(1 - U) ~ Exp(θ).

What are the implications of the probability integral transformation? You can generate variates from any distribution provided that you can generate variates from a uniform distribution on (0, 1), the cdf of the target distribution can be written in closed form, and the cdf of the target distribution is invertible.

Note: The probability integral transformation technique for generating random variates based on uniform random variates is called the inverse cdf method by some.

Example continued: An algorithm for generating n variates from an Exp(θ) distribution is:

1. Generate u_1, ..., u_n from a Uniform(0, 1) distribution.
2. Let y_i = -θ log(1 - u_i).

In Splus/R, this can be carried out as follows. Suppose n = 1000 and θ = 3.

> n<-1000
> theta<-3
> u<-runif(n,0,1)
> y<-(-theta)*log(1-u)

A histogram of y should show the characteristic shape of the exponential pdf. Also, note that if U ~ Uniform(0, 1), then 1 - U ~ Uniform(0, 1). Therefore, Y = -θ log(U) ~ Exp(θ) also, so the last line above could be replaced by

> y<-(-theta)*log(u)

still randomly generating exponential variates.

The relationship between the exponential distribution and other distributions allows the quick generation of many random variables. For example, if the U_j are iid Uniform(0, 1) random variables, what are the distributions of the following?

X = -2 * sum_{j=1}^{r} log(U_j)
Y = -b * sum_{j=1}^{a} log(U_j)
Z = sum_{j=1}^{a} log(U_j) / sum_{j=1}^{a+b} log(U_j)

Discrete Distributions:
Many discrete distributions can also be generated using cdfs. For example, let Y be a

discrete random variable taking on values a_1 < a_2 < ... < a_k with cdf F_Y, and let U be a Uniform(0, 1) random variable. Then we can write

P[F_Y(a_j) < U <= F_Y(a_{j+1})] = F_Y(a_{j+1}) - F_Y(a_j) = P(Y = a_{j+1}).

Implementation of the above to generate discrete random variates is quite straightforward and can be summarized as follows. To generate y_1, ..., y_n from a discrete distribution with cdf F_Y(y):

1. Generate u_1, ..., u_n from Uniform(0, 1).
2. If F_Y(a_j) < u_i <= F_Y(a_{j+1}), set y_i = a_{j+1}.

We define a_0 = -∞ and F_Y(a_0) = 0.

Example: Suppose we want to generate a random sample of size 1000 from the following discrete distribution:

a        1     2     3     4     5
p_Y(a)   1/15  2/15  3/15  4/15  5/15

Note that F_Y is given by

a        1     2     3     4     5
F_Y(a)   1/15  3/15  6/15  10/15 1

The following Splus/R code will do the job (discrete.prog).

n<-1000
a<-1:5
p<-a/15
cdf<-c(0,cumsum(p))
u<-runif(n,0,1)
ind<-matrix(0,nrow=5,ncol=n)
for(i in 1:length(a)) ind[i,]<-ifelse((u>cdf[i])&(u<=cdf[i+1]),1,0)
y<-as.vector(a%*%ind)

Using table(y) should confirm that we have generated data from the above distribution.

The above algorithm can be used to generate variates from a binomial distribution also. (How would you do this?) In addition, the algorithm will also work if the support of the discrete random variable is infinite, as for the Poisson and negative binomial. Although, theoretically, this could require a large number of evaluations, in practice this does not happen because there are simple and clever ways of speeding up the algorithm. For example, instead of checking each y_i in the order 1, 2, ..., it can be much faster to start checking y_i's near the mean. (See Stochastic Simulation by B.D. Ripley, 1987, Section 3.3 and Exercise 5.55 for more information.) Other algorithms also exist for generating discrete random variates from

uniform random variates. (See Random Number Generation and Monte Carlo Methods by J.E. Gentle, 2003, Section 4.1.)

Multivariate Distributions:
The probability integral transformation (or inverse cdf) method does not apply directly to a multivariate distribution. However, the probability integral transformation can be applied to the marginal and conditional univariate distributions to generate multivariate random variates. Suppose that the cdf of the multivariate random variable (random vector) (X_1, X_2, ..., X_d) is decomposed as

F_{X_1,...,X_d}(x_1, ..., x_d) = F_{X_1}(x_1) F_{X_2|X_1}(x_2|x_1) ... F_{X_d|X_1,...,X_{d-1}}(x_d|x_1, ..., x_{d-1}).

Also, suppose that all of the marginal and conditional cdfs above are invertible. Then the probability integral transformation can be applied sequentially by using independent Uniform(0, 1) random variates u_1, ..., u_d:

x_1 = F_{X_1}^{-1}(u_1)
x_2 = F_{X_2|X_1}^{-1}(u_2)
...
x_d = F_{X_d|X_1,...,X_{d-1}}^{-1}(u_d)

The modifications of the probability integral transformation approach for discrete random variables above can be applied here also if necessary.

Example: Recall that R has a built-in function for generating random variates from a multinomial distribution but Splus does not. We can combine some of the techniques above to generate multinomial variates from uniform variates. To give this example some context, suppose that 15% of all college students are engineering majors and 20% are business majors. (The remaining 65% of college students have majors in other areas.) Suppose that we want to sample n = 100 college students and count the number, X_1, who major in engineering and the number, X_2, who major in business. Then (X_1, X_2) ~ Multinomial(n = 100, p_1 = 0.15, p_2 = 0.20) (more specifically, a trinomial distribution).

In general, if (X_1, X_2) ~ Multinomial(n, p_1, p_2) with joint pmf p_{X_1,X_2}(x_1, x_2), then

p_{X_1,X_2}(x_1, x_2) = p_{X_1}(x_1) p_{X_2|X_1}(x_2|x_1)

and

X_1 ~ Binomial(n, p_1),
X_2 | X_1 = x_1 ~ Binomial(n - x_1, p_2 / (1 - p_1)).

Therefore, (X_1, X_2) can be produced by sampling from binomial distributions.
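To make the sequential inversion concrete before turning to the trinomial example, here is a small sketch with a made-up bivariate model (not from the notes): X_1 ~ Exp(1), and given X_1 = x_1, X_2 is exponential with mean x_1. Both the marginal and conditional cdfs invert in closed form.

```r
# Sequential inverse-cdf sampling for a hypothetical bivariate model:
# X1 ~ Exp(1);  X2 | X1 = x1 ~ Exp with mean x1.
set.seed(825)
m  <- 1e4
u1 <- runif(m)
u2 <- runif(m)
x1 <- -log(1 - u1)         # F_{X1}^{-1}(u1) for Exp(1)
x2 <- -x1 * log(1 - u2)    # F_{X2|X1}^{-1}(u2): Exp with mean x1

# By iterated expectation, E(X2) = E(E(X2|X1)) = E(X1) = 1
mean(x1)   # near 1
mean(x2)   # near 1
```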

(Note: Since many packages have functions/procedures for generating binomial variates, we could use the above information to generate our multinomial random variate. However, to illustrate how the probability integral transformation can be used, we will continue one step further.)

Now we are faced with the issue of generating binomial random variates from uniform random variates. The earlier approach (under Discrete Distributions) could be used, but I will illustrate another approach here. Consider U ~ Uniform(0, 1) and 0 < p < 1. Let

Y = 1 if U <= p, and Y = 0 otherwise.

Then Y ~ Binomial(1, p) (or Y ~ Bernoulli(p)). Also, note that if Y_1, ..., Y_n are iid Bernoulli(p), then sum_{i=1}^{n} Y_i ~ Binomial(n, p). Putting this all together, we can generate a binomial random variate from uniform random variates.

Algorithm for generating (x_1, x_2) from a Trinomial(n, p_1, p_2) distribution:

1. Generate u_1, ..., u_n from a Uniform(0, 1) distribution.
2. x_1 = sum_{i=1}^{n} I(u_i <= p_1) generates x_1 from a Binomial(n, p_1) distribution.
3. Generate u_1, ..., u_{n - x_1} from a Uniform(0, 1) distribution.
4. x_2 = sum_{i=1}^{n - x_1} I(u_i <= p_2/(1 - p_1)) gives x_2 | x_1 ~ Binomial(n - x_1, p_2/(1 - p_1)).
5. (x_1, x_2) is a trinomial variate.
6. Repeat the previous steps m times for a random sample of size m.

The following R function (stored in the file multinom.func) will generate a single variate from a trinomial distribution by implementing the above algorithm. Notice that the function requires you to specify n, p_1, and p_2.

rtrinom.ex<-function(n,p1,p2){
# This function generates one random trinomial variate.
# n = number of individuals sampled
# p1 = probability of being in the first group
# p2 = probability of being in the second group
u1<-runif(n,0,1)
x1<-sum(ifelse(u1<=p1,1,0))
u2<-runif(n-x1,0,1)
x2<-sum(ifelse(u2<=p2/(1-p1),1,0))
c(x1,x2)
}

(We could also choose to return (x_1, x_2, n - x_1 - x_2).) Executing this function for the above example yields

> source("multinom.func")
> rtrinom.ex(100,0.15,0.20)
[1] 18 21

Therefore, in this sample of 100 college students, 18 were engineering majors and 21 were business majors.

Other Methods:
In many cases, the cdf of a random variable cannot be written in closed form, or the cdf is not invertible. In these cases, other options must be explored. These include other types of generation methods (other algorithms) and indirect methods. An example of the former is the Box-Muller algorithm.

Box-Muller Algorithm: Generate U_1 and U_2, two independent Uniform(0, 1) random variables. Then

X_1 = sqrt(-2 log(U_1)) cos(2π U_2)
X_2 = sqrt(-2 log(U_1)) sin(2π U_2)

are iid Normal(0, 1) random variables.

Unfortunately, solutions such as the Box-Muller algorithm are not plentiful. Moreover, they take advantage of the specific structure of certain distributions and are, thus, less applicable as general strategies. For the most part, the generation of other continuous distributions is best accomplished through indirect methods. Indirect methods include, but are not limited to, the Accept/Reject Algorithm, the Ratio-of-Uniforms Method, the Metropolis-Hastings Algorithm, and Gibbs Sampling. Time permitting, we will discuss some of these at the end of the semester.

Finite Mixture Distributions:
Consider a random variable X with pdf (or pmf) of the following form:

f(x) = w_1 f_1(x) + w_2 f_2(x) + ... + w_k f_k(x)

where f_1(x), ..., f_k(x) are pdfs (or pmfs); w_j >= 0, j = 1, ..., k; and sum_{j=1}^{k} w_j = 1. The random variable X is said to have a (finite) mixture distribution. Examples of finite mixture distributions include the following.

X is said to have a contaminated normal distribution if f_1(x) is the standard normal pdf and f_2(x) is a normal pdf with variance greater than one.
Typically, w_1 = 1 - ε is close to one while w_2 = ε is close to zero; common choices are ε = 0.01 and 0.05. Contaminated normal distributions are useful if you need to generate data from a distribution that is pretty much normal but contains some outliers or has slightly thicker tails.

Finite mixtures are often used when subpopulations are known to exist. As a simple example, suppose X represents heights of adults. The population of adults consists of both men and women. Heights of men might be modelled with one normal distribution while heights of women might be modelled with another normal distribution.
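Returning briefly to the Box-Muller algorithm from the previous section, it translates directly into a few lines of R. This is a sketch for illustration (in practice rnorm() already does this job); the seed and m are arbitrary.

```r
# Box-Muller: two iid N(0, 1) variates from two iid Uniform(0, 1) variates.
set.seed(825)
m  <- 1e4
u1 <- runif(m)
u2 <- runif(m)
r  <- sqrt(-2 * log(u1))     # radius
x1 <- r * cos(2 * pi * u2)   # first N(0, 1) variate
x2 <- r * sin(2 * pi * u2)   # second N(0, 1) variate, independent of x1

c(mean(x1), sd(x1), cor(x1, x2))   # approximately 0, 1, 0
```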

A zero-inflated Poisson distribution is another example. In this case, the bulk of the data follows a Poisson distribution but there are many extra zeroes.

Note that the component distributions need not be normal. In addition, they need not be from the same family of distributions, but they often are.

How would you generate n variates from a finite mixture distribution? Suppose that we can already generate variates from each of the component distributions. Also, suppose for simplicity that k = 2, and let w = w_1 and 1 - w = w_2. Thus, we want to generate n random variates from

f(x) = w f_1(x) + (1 - w) f_2(x)

where 0 < w < 1. One way of forming the above mixture distribution is to consider a conditional pdf/pmf of a similar form:

f(x|y) = y f_1(x) + (1 - y) f_2(x)

where y is a realization of a Bernoulli random variable with probability of success w, i.e., Y ~ Bernoulli(w). Then the marginal pdf/pmf of X is

f_X(x) = sum_{y=0}^{1} f_{X,Y}(x, y)
       = sum_{y=0}^{1} f(x|y) P(Y = y)
       = f(x|0) P(Y = 0) + f(x|1) P(Y = 1)
       = w f_1(x) + (1 - w) f_2(x),

which is the desired mixture distribution.

Algorithm for generating data from a mixture distribution:

1. Generate n variates from f_1(x): x_1^{(1)}, ..., x_n^{(1)}.
2. Generate n variates from f_2(x): x_1^{(2)}, ..., x_n^{(2)}.
3. Generate n variates from Bernoulli(w): y_1, ..., y_n.
4. Let z_i = y_i * x_i^{(1)} + (1 - y_i) * x_i^{(2)}, i = 1, ..., n.

Then z_1, ..., z_n is a sample from f(x) = w f_1(x) + (1 - w) f_2(x).

Example: Suppose we want to generate n random variates from a contaminated normal distribution with f_1(x) = φ(x) and f_2(x) = (1/σ)φ(x/σ), where φ represents the standard normal pdf. (Therefore, the first component is Normal(0, 1) and the second component is Normal(0, σ²).) The following R function will do the trick:

contam.norm<-function(n,eps,sigma){
# This function generates n random variates from a contaminated
# normal distribution with contamination proportion eps and
# contamination variance sigma^2.
x1<-rnorm(n,0,1)
x2<-rnorm(n,0,sigma)
y<-rbinom(n,1,1-eps)
x<-y*x1+(1-y)*x2
x
}

Execute the function for n = 1000, eps = 0.1, and sigma = 3:

> x<-contam.norm(1000,0.1,3)

A histogram of x will be bell-shaped with longer tails than a standard normal distribution. Alternatively, the following R function could be used:

contam.norm2<-function(n,eps,sigma){
# This function generates n random variates from a contaminated
# normal distribution with contamination proportion eps and
# contamination variance sigma^2.
x<-rnorm(n,0,1)
y<-rbinom(n,1,1-eps)
x<-y*x+(1-y)*sigma*x
x
}

How does this differ from the first?

Note: When we run simulations, we will typically need to take several samples of the same size from the same distribution. We could do this in a for loop, but there are more efficient ways. Can you think of any? Suppose you will need 10 samples of size 5 from a standard normal distribution. In simulation studies, these 10 samples should be independent. Therefore, one can simply generate 10 * 5 = 50 standard normal variates and fill in a matrix so that each sample is one column of the matrix.

> set.seed(10)
> y<-matrix(rnorm(50),nrow=5,ncol=10)
> y

(a 5 x 10 matrix of simulated values; printed output omitted)

The replicate() function is useful for repeating a procedure many times.

> set.seed(10)
> z<-replicate(10,rnorm(5))
> z

(a 5 x 10 matrix of simulated values; printed output omitted)
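The same idea handles step 6 of the earlier trinomial algorithm: repeating the generator m times is one replicate() call. A sketch, restating rtrinom.ex() so the block runs on its own; m = 500 and the seed are arbitrary choices.

```r
rtrinom.ex <- function(n, p1, p2) {
  # One Trinomial(n, p1, p2) variate from uniforms (as in the notes)
  u1 <- runif(n)
  x1 <- sum(u1 <= p1)             # X1 ~ Binomial(n, p1)
  u2 <- runif(n - x1)
  x2 <- sum(u2 <= p2/(1 - p1))    # X2 | X1 = x1 ~ Binomial(n - x1, p2/(1 - p1))
  c(x1, x2)
}

set.seed(10)
# 500 trinomial variates, one per row of a 500 x 2 matrix
samp <- t(replicate(500, rtrinom.ex(100, 0.15, 0.20)))
colMeans(samp)   # close to (100*0.15, 100*0.20) = (15, 20)
```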


More information

Homework: Due Wed, Feb 20 th. Chapter 8, # 60a + 62a (count together as 1), 74, 82

Homework: Due Wed, Feb 20 th. Chapter 8, # 60a + 62a (count together as 1), 74, 82 Announcements: Week 5 quiz begins at 4pm today and ends at 3pm on Wed If you take more than 20 minutes to complete your quiz, you will only receive partial credit. (It doesn t cut you off.) Today: Sections

More information

A Probabilistic Approach to Determining the Number of Widgets to Build in a Yield-Constrained Process

A Probabilistic Approach to Determining the Number of Widgets to Build in a Yield-Constrained Process A Probabilistic Approach to Determining the Number of Widgets to Build in a Yield-Constrained Process Introduction Timothy P. Anderson The Aerospace Corporation Many cost estimating problems involve determining

More information

FREDRIK BAJERS VEJ 7 G 9220 AALBORG ØST Tlf.: URL: Fax: Monte Carlo methods

FREDRIK BAJERS VEJ 7 G 9220 AALBORG ØST Tlf.: URL:   Fax: Monte Carlo methods INSTITUT FOR MATEMATISKE FAG AALBORG UNIVERSITET FREDRIK BAJERS VEJ 7 G 9220 AALBORG ØST Tlf.: 96 35 88 63 URL: www.math.auc.dk Fax: 98 15 81 29 E-mail: jm@math.aau.dk Monte Carlo methods Monte Carlo methods

More information

Bernoulli and Binomial Distributions

Bernoulli and Binomial Distributions Bernoulli and Binomial Distributions Bernoulli Distribution a flipped coin turns up either heads or tails an item on an assembly line is either defective or not defective a piece of fruit is either damaged

More information

Monte Carlo Simulation and Resampling

Monte Carlo Simulation and Resampling Monte Carlo Simulation and Resampling Tom Carsey (Instructor) Jeff Harden (TA) ICPSR Summer Course Summer, 2011 Monte Carlo Simulation and Resampling 1/114 Introductions and Overview What do I plan for

More information

Value at Risk Ch.12. PAK Study Manual

Value at Risk Ch.12. PAK Study Manual Value at Risk Ch.12 Related Learning Objectives 3a) Apply and construct risk metrics to quantify major types of risk exposure such as market risk, credit risk, liquidity risk, regulatory risk etc., and

More information

2.1 Mathematical Basis: Risk-Neutral Pricing

2.1 Mathematical Basis: Risk-Neutral Pricing Chapter Monte-Carlo Simulation.1 Mathematical Basis: Risk-Neutral Pricing Suppose that F T is the payoff at T for a European-type derivative f. Then the price at times t before T is given by f t = e r(t

More information

Chapter 8 Estimation

Chapter 8 Estimation Chapter 8 Estimation There are two important forms of statistical inference: estimation (Confidence Intervals) Hypothesis Testing Statistical Inference drawing conclusions about populations based on samples

More information

Practical example of an Economic Scenario Generator

Practical example of an Economic Scenario Generator Practical example of an Economic Scenario Generator Martin Schenk Actuarial & Insurance Solutions SAV 7 March 2014 Agenda Introduction Deterministic vs. stochastic approach Mathematical model Application

More information

Financial Risk Forecasting Chapter 7 Simulation methods for VaR for options and bonds

Financial Risk Forecasting Chapter 7 Simulation methods for VaR for options and bonds Financial Risk Forecasting Chapter 7 Simulation methods for VaR for options and bonds Jon Danielsson 2017 London School of Economics To accompany Financial Risk Forecasting www.financialriskforecasting.com

More information

Point Estimation. Stat 4570/5570 Material from Devore s book (Ed 8), and Cengage

Point Estimation. Stat 4570/5570 Material from Devore s book (Ed 8), and Cengage 6 Point Estimation Stat 4570/5570 Material from Devore s book (Ed 8), and Cengage Point Estimation Statistical inference: directed toward conclusions about one or more parameters. We will use the generic

More information

Data Simulator. Chapter 920. Introduction

Data Simulator. Chapter 920. Introduction Chapter 920 Introduction Because of mathematical intractability, it is often necessary to investigate the properties of a statistical procedure using simulation (or Monte Carlo) techniques. In power analysis,

More information

LECTURE CHAPTER 3 DESCRETE RANDOM VARIABLE

LECTURE CHAPTER 3 DESCRETE RANDOM VARIABLE LECTURE CHAPTER 3 DESCRETE RANDOM VARIABLE MSc Đào Việt Hùng Email: hungdv@tlu.edu.vn Random Variable A random variable is a function that assigns a real number to each outcome in the sample space of a

More information

Chapter 4 Continuous Random Variables and Probability Distributions

Chapter 4 Continuous Random Variables and Probability Distributions Chapter 4 Continuous Random Variables and Probability Distributions Part 2: More on Continuous Random Variables Section 4.5 Continuous Uniform Distribution Section 4.6 Normal Distribution 1 / 28 One more

More information

Expectations. Definition Let X be a discrete rv with set of possible values D and pmf p(x). The expected value or mean value of X, denoted by E(X ) or

Expectations. Definition Let X be a discrete rv with set of possible values D and pmf p(x). The expected value or mean value of X, denoted by E(X ) or Definition Let X be a discrete rv with set of possible values D and pmf p(x). The expected value or mean value of X, denoted by E(X ) or µ X, is E(X ) = µ X = x D x p(x) Definition Let X be a discrete

More information

BIOINFORMATICS MSc PROBABILITY AND STATISTICS SPLUS SHEET 1

BIOINFORMATICS MSc PROBABILITY AND STATISTICS SPLUS SHEET 1 BIOINFORMATICS MSc PROBABILITY AND STATISTICS SPLUS SHEET 1 A data set containing a segment of human chromosome 13 containing the BRCA2 breast cancer gene; it was obtained from the National Center for

More information

Probability Theory and Simulation Methods. April 9th, Lecture 20: Special distributions

Probability Theory and Simulation Methods. April 9th, Lecture 20: Special distributions April 9th, 2018 Lecture 20: Special distributions Week 1 Chapter 1: Axioms of probability Week 2 Chapter 3: Conditional probability and independence Week 4 Chapters 4, 6: Random variables Week 9 Chapter

More information

STAT Chapter 5: Continuous Distributions. Probability distributions are used a bit differently for continuous r.v. s than for discrete r.v. s.

STAT Chapter 5: Continuous Distributions. Probability distributions are used a bit differently for continuous r.v. s than for discrete r.v. s. STAT 515 -- Chapter 5: Continuous Distributions Probability distributions are used a bit differently for continuous r.v. s than for discrete r.v. s. Continuous distributions typically are represented by

More information

Homework: Due Wed, Nov 3 rd Chapter 8, # 48a, 55c and 56 (count as 1), 67a

Homework: Due Wed, Nov 3 rd Chapter 8, # 48a, 55c and 56 (count as 1), 67a Homework: Due Wed, Nov 3 rd Chapter 8, # 48a, 55c and 56 (count as 1), 67a Announcements: There are some office hour changes for Nov 5, 8, 9 on website Week 5 quiz begins after class today and ends at

More information

IEOR E4602: Quantitative Risk Management

IEOR E4602: Quantitative Risk Management IEOR E4602: Quantitative Risk Management Basic Concepts and Techniques of Risk Management Martin Haugh Department of Industrial Engineering and Operations Research Columbia University Email: martin.b.haugh@gmail.com

More information

Chapter 7. Sampling Distributions and the Central Limit Theorem

Chapter 7. Sampling Distributions and the Central Limit Theorem Chapter 7. Sampling Distributions and the Central Limit Theorem 1 Introduction 2 Sampling Distributions related to the normal distribution 3 The central limit theorem 4 The normal approximation to binomial

More information

EVA Tutorial #1 BLOCK MAXIMA APPROACH IN HYDROLOGIC/CLIMATE APPLICATIONS. Rick Katz

EVA Tutorial #1 BLOCK MAXIMA APPROACH IN HYDROLOGIC/CLIMATE APPLICATIONS. Rick Katz 1 EVA Tutorial #1 BLOCK MAXIMA APPROACH IN HYDROLOGIC/CLIMATE APPLICATIONS Rick Katz Institute for Mathematics Applied to Geosciences National Center for Atmospheric Research Boulder, CO USA email: rwk@ucar.edu

More information

Chapter 8: Sampling distributions of estimators Sections

Chapter 8: Sampling distributions of estimators Sections Chapter 8 continued Chapter 8: Sampling distributions of estimators Sections 8.1 Sampling distribution of a statistic 8.2 The Chi-square distributions 8.3 Joint Distribution of the sample mean and sample

More information

6 Central Limit Theorem. (Chs 6.4, 6.5)

6 Central Limit Theorem. (Chs 6.4, 6.5) 6 Central Limit Theorem (Chs 6.4, 6.5) Motivating Example In the next few weeks, we will be focusing on making statistical inference about the true mean of a population by using sample datasets. Examples?

More information

4-1. Chapter 4. Commonly Used Distributions by The McGraw-Hill Companies, Inc. All rights reserved.

4-1. Chapter 4. Commonly Used Distributions by The McGraw-Hill Companies, Inc. All rights reserved. 4-1 Chapter 4 Commonly Used Distributions 2014 by The Companies, Inc. All rights reserved. Section 4.1: The Bernoulli Distribution 4-2 We use the Bernoulli distribution when we have an experiment which

More information

The Bernoulli distribution

The Bernoulli distribution This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike License. Your use of this material constitutes acceptance of that license and the conditions of use of materials on this

More information

ME3620. Theory of Engineering Experimentation. Spring Chapter III. Random Variables and Probability Distributions.

ME3620. Theory of Engineering Experimentation. Spring Chapter III. Random Variables and Probability Distributions. ME3620 Theory of Engineering Experimentation Chapter III. Random Variables and Probability Distributions Chapter III 1 3.2 Random Variables In an experiment, a measurement is usually denoted by a variable

More information

Probability Distributions: Discrete

Probability Distributions: Discrete Probability Distributions: Discrete Introduction to Data Science Algorithms Jordan Boyd-Graber and Michael Paul SEPTEMBER 27, 2016 Introduction to Data Science Algorithms Boyd-Graber and Paul Probability

More information

CHAPTERS 5 & 6: CONTINUOUS RANDOM VARIABLES

CHAPTERS 5 & 6: CONTINUOUS RANDOM VARIABLES CHAPTERS 5 & 6: CONTINUOUS RANDOM VARIABLES DISCRETE RANDOM VARIABLE: Variable can take on only certain specified values. There are gaps between possible data values. Values may be counting numbers or

More information

MTH6154 Financial Mathematics I Stochastic Interest Rates

MTH6154 Financial Mathematics I Stochastic Interest Rates MTH6154 Financial Mathematics I Stochastic Interest Rates Contents 4 Stochastic Interest Rates 45 4.1 Fixed Interest Rate Model............................ 45 4.2 Varying Interest Rate Model...........................

More information

Chapter 7. Sampling Distributions and the Central Limit Theorem

Chapter 7. Sampling Distributions and the Central Limit Theorem Chapter 7. Sampling Distributions and the Central Limit Theorem 1 Introduction 2 Sampling Distributions related to the normal distribution 3 The central limit theorem 4 The normal approximation to binomial

More information

Chapter 7 Sampling Distributions and Point Estimation of Parameters

Chapter 7 Sampling Distributions and Point Estimation of Parameters Chapter 7 Sampling Distributions and Point Estimation of Parameters Part 1: Sampling Distributions, the Central Limit Theorem, Point Estimation & Estimators Sections 7-1 to 7-2 1 / 25 Statistical Inferences

More information

4.3 Normal distribution

4.3 Normal distribution 43 Normal distribution Prof Tesler Math 186 Winter 216 Prof Tesler 43 Normal distribution Math 186 / Winter 216 1 / 4 Normal distribution aka Bell curve and Gaussian distribution The normal distribution

More information

Point Estimation. Some General Concepts of Point Estimation. Example. Estimator quality

Point Estimation. Some General Concepts of Point Estimation. Example. Estimator quality Point Estimation Some General Concepts of Point Estimation Statistical inference = conclusions about parameters Parameters == population characteristics A point estimate of a parameter is a value (based

More information

Two hours UNIVERSITY OF MANCHESTER. 23 May :00 16:00. Answer ALL SIX questions The total number of marks in the paper is 90.

Two hours UNIVERSITY OF MANCHESTER. 23 May :00 16:00. Answer ALL SIX questions The total number of marks in the paper is 90. Two hours MATH39542 UNIVERSITY OF MANCHESTER RISK THEORY 23 May 2016 14:00 16:00 Answer ALL SIX questions The total number of marks in the paper is 90. University approved calculators may be used 1 of

More information

Objective Bayesian Analysis for Heteroscedastic Regression

Objective Bayesian Analysis for Heteroscedastic Regression Analysis for Heteroscedastic Regression & Esther Salazar Universidade Federal do Rio de Janeiro Colóquio Inter-institucional: Modelos Estocásticos e Aplicações 2009 Collaborators: Marco Ferreira and Thais

More information

CS 237: Probability in Computing

CS 237: Probability in Computing CS 237: Probability in Computing Wayne Snyder Computer Science Department Boston University Lecture 12: Continuous Distributions Uniform Distribution Normal Distribution (motivation) Discrete vs Continuous

More information

STA258H5. Al Nosedal and Alison Weir. Winter Al Nosedal and Alison Weir STA258H5 Winter / 41

STA258H5. Al Nosedal and Alison Weir. Winter Al Nosedal and Alison Weir STA258H5 Winter / 41 STA258H5 Al Nosedal and Alison Weir Winter 2017 Al Nosedal and Alison Weir STA258H5 Winter 2017 1 / 41 NORMAL APPROXIMATION TO THE BINOMIAL DISTRIBUTION. Al Nosedal and Alison Weir STA258H5 Winter 2017

More information

A First Course in Probability

A First Course in Probability A First Course in Probability Seventh Edition Sheldon Ross University of Southern California PEARSON Prentice Hall Upper Saddle River, New Jersey 07458 Preface 1 Combinatorial Analysis 1 1.1 Introduction

More information

,,, be any other strategy for selling items. It yields no more revenue than, based on the

,,, be any other strategy for selling items. It yields no more revenue than, based on the ONLINE SUPPLEMENT Appendix 1: Proofs for all Propositions and Corollaries Proof of Proposition 1 Proposition 1: For all 1,2,,, if, is a non-increasing function with respect to (henceforth referred to as

More information

Regression and Simulation

Regression and Simulation Regression and Simulation This is an introductory R session, so it may go slowly if you have never used R before. Do not be discouraged. A great way to learn a new language like this is to plunge right

More information

Statistics for Managers Using Microsoft Excel 7 th Edition

Statistics for Managers Using Microsoft Excel 7 th Edition Statistics for Managers Using Microsoft Excel 7 th Edition Chapter 5 Discrete Probability Distributions Statistics for Managers Using Microsoft Excel 7e Copyright 014 Pearson Education, Inc. Chap 5-1 Learning

More information

Statistical Computing (36-350)

Statistical Computing (36-350) Statistical Computing (36-350) Lecture 14: Simulation I: Generating Random Variables Cosma Shalizi 14 October 2013 Agenda Base R commands The basic random-variable commands Transforming uniform random

More information

STAT 111 Recitation 4

STAT 111 Recitation 4 STAT 111 Recitation 4 Linjun Zhang http://stat.wharton.upenn.edu/~linjunz/ September 29, 2017 Misc. Mid-term exam time: 6-8 pm, Wednesday, Oct. 11 The mid-term break is Oct. 5-8 The next recitation class

More information