The Normal Distribution


Will Monroe
CS 109
Lecture Notes, July 2017
Based on a chapter by Chris Piech

The single most important random variable type is the normal (a.k.a. Gaussian) random variable, parametrized by a mean ($\mu$) and variance ($\sigma^2$). If $X$ is a normal variable, we write $X \sim N(\mu, \sigma^2)$. The normal is important for many reasons: it is generated from the summation of independent random variables and as a result it occurs often in nature. Many things in the world are not quite distributed normally, but data scientists and computer scientists model them as normal distributions anyway. Why? Essentially, the normal is what we use if we know the mean and variance, but nothing else. In fact, it is the most conservative modeling decision that we can make for a random variable while still matching a particular expectation (average value) and variance (spread). (Formally, it has the highest entropy $H(X) = -\int f(x) \log f(x)\,dx$ of any distribution given the mean and variance.)

The probability density function (PDF) for a normal $X \sim N(\mu, \sigma^2)$ is:

$$f_X(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2}$$

Notice the $x$ in the exponent of the PDF. When $x$ is equal to the mean ($\mu$), $e$ is raised to the power of 0 and the PDF is maximized. By design, a normal has $E[X] = \mu$ and $\mathrm{Var}(X) = \sigma^2$.

Linear Transform

If $X$ is a normal RV such that $X \sim N(\mu, \sigma^2)$ and $Y = aX + b$ ($Y$ is a linear transform of $X$), then $Y$ is also a normal RV where:

$$Y \sim N(a\mu + b,\ a^2\sigma^2)$$

Projection to Standard Normal

For any normal RV $X$ we can find a linear transform from $X$ to the standard normal $N(0, 1)$. That is, if you subtract the mean ($\mu$) of the normal and divide by the standard deviation ($\sigma$), the result is distributed according to the standard normal. We can prove this mathematically. Let $W = \frac{X - \mu}{\sigma}$:

$$\begin{aligned}
W &= \frac{X-\mu}{\sigma} && \text{transform } X\text{: subtract } \mu \text{ and divide by } \sigma \\
  &= \frac{1}{\sigma}X - \frac{\mu}{\sigma} && \text{use algebra to rewrite the equation} \\
  &= aX + b && \text{where } a = \frac{1}{\sigma},\ b = -\frac{\mu}{\sigma} \\
  &\sim N(a\mu + b,\ a^2\sigma^2) && \text{the linear transform of a normal is another normal} \\
  &\sim N\!\left(\frac{\mu}{\sigma} - \frac{\mu}{\sigma},\ \frac{\sigma^2}{\sigma^2}\right) && \text{substituting values in for } a \text{ and } b \\
  &\sim N(0, 1) && \text{the standard normal}
\end{aligned}$$
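The linear-transform rule and the projection to the standard normal can be checked empirically. The following is a minimal Python sketch, not part of the original notes: the helper names `linear_transform_params` and `sample_mean_var`, the choice of $X \sim N(3, 16)$, and the coefficients $a = 2$, $b = 1$ are all illustrative assumptions. It only compares sample means and variances against the formulas, so it does not by itself show that the transformed variable is normal.

```python
import random


def linear_transform_params(mu, var, a, b):
    """Mean and variance of Y = a*X + b when X ~ N(mu, var)."""
    return a * mu + b, a * a * var


def sample_mean_var(samples):
    """Sample mean and (population-style) sample variance of a list of floats."""
    n = len(samples)
    mean = sum(samples) / n
    var = sum((s - mean) ** 2 for s in samples) / n
    return mean, var


# Illustrative parameters (assumptions): X ~ N(3, 16), so sigma = 4.
mu, sigma = 3.0, 4.0
rng = random.Random(0)  # fixed seed so the run is repeatable
xs = [rng.gauss(mu, sigma) for _ in range(100_000)]

# Linear transform Y = 2X + 1: the rule predicts mean 7 and variance 64.
a, b = 2.0, 1.0
ys = [a * x + b for x in xs]
print("predicted:", linear_transform_params(mu, sigma ** 2, a, b))
print("simulated:", sample_mean_var(ys))

# Standardization W = (X - mu) / sigma is the special case
# a = 1/sigma, b = -mu/sigma: the rule predicts mean 0 and variance 1.
ws = [(x - mu) / sigma for x in xs]
print("predicted:", linear_transform_params(mu, sigma ** 2, 1 / sigma, -mu / sigma))
print("simulated:", sample_mean_var(ws))
```

The simulated values should land close to the predicted ones, with the gap shrinking as the number of samples grows.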

An extremely common use of this transform is to express $F_X(x)$, the CDF of $X$, in terms of $F_Z(x)$, the CDF of the standard normal $Z \sim N(0, 1)$. Since the CDF of $Z$ is so common, it gets its own Greek symbol: $\Phi(x)$.

$$F_X(x) = P(X \le x) = P\left(\frac{X-\mu}{\sigma} \le \frac{x-\mu}{\sigma}\right) = P\left(Z \le \frac{x-\mu}{\sigma}\right) = \Phi\left(\frac{x-\mu}{\sigma}\right)$$

Why is this useful? Well, in the days when we couldn't call scipy.stats.norm.cdf (or on exams, when one doesn't have a calculator), people would look up values of the CDF in a table (see the last page of these notes). Using the standard normal means you only need to build a table of one distribution, rather than an indefinite number of tables for all the different values of $\mu$ and $\sigma$! We also have an online calculator on the CS 109 website. You should learn how to use the normal table for the exams, however!

How to Remember that Crazy PDF

What is the PDF of the standard normal $Z$? Let's plug it in:

$$f_Z(z) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{z-\mu}{\sigma}\right)^2} = \frac{1}{1 \cdot \sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{z-0}{1}\right)^2} = \frac{1}{\sqrt{2\pi}}\, e^{-\frac{1}{2}z^2}$$

This gets even better if we realize that $\frac{1}{\sqrt{2\pi}}$ is just a constant to make the whole thing integrate to 1. Call that constant $C$:

$$f_Z(z) = C e^{-\frac{1}{2}z^2}$$

Not so scary anymore, is it? In fact, this equation can be a rather helpful mnemonic: the normal distribution PDF is just the exponential of a parabola. What does that look like?

[Figure: two plots, the parabola $f(z) = -\frac{1}{2}z^2$ and its exponential $f(z) = e^{-\frac{1}{2}z^2}$]

As it turns out, the exponential of a (downward) parabola is a familiar shape: the bell curve. Now bring back the fact that $Z = \frac{X-\mu}{\sigma}$, and you can see that $\mu$ determines where the peak of the bell curve will be, while $\sigma$ tells you how wide it is. (Don't forget that $C$ changes too!)
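As a rough illustration of the two ideas above, the sketch below computes $\Phi$ with the standard library's `math.erf`, using the identity $\Phi(z) = \frac{1}{2}\left(1 + \mathrm{erf}(z/\sqrt{2})\right)$, and evaluates $F_X$ by standardizing first. The function names are assumptions made for this example, not names from the notes; `scipy.stats.norm.cdf`, mentioned above, would serve the same purpose.

```python
import math


def phi(z):
    """Standard normal CDF, Phi(z), written in terms of the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def normal_cdf(x, mu, sigma):
    """F_X(x) for X ~ N(mu, sigma^2), computed as Phi((x - mu) / sigma)."""
    return phi((x - mu) / sigma)


def standard_normal_pdf(z):
    """Exponential of a downward parabola, scaled by C = 1 / sqrt(2*pi)."""
    c = 1.0 / math.sqrt(2.0 * math.pi)
    return c * math.exp(-0.5 * z * z)


# A few rows in the style of a printed normal table.
for z in (0.0, 0.5, 1.0, 1.5, 2.0):
    print(f"z = {z:.1f}   Phi(z) = {phi(z):.4f}   f_Z(z) = {standard_normal_pdf(z):.4f}")

# Any normal CDF goes through the same Phi. The parameters here are
# arbitrary illustrative values: X ~ N(10, 2^2), evaluated at x = 13.
print(normal_cdf(13.0, 10.0, 2.0) == phi(1.5))
```

Only one function, `phi`, ever needs to be tabulated or implemented; every other normal CDF is a call to it with a shifted and scaled argument, which is the point of the single-table argument above.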

Example 1

Let $X \sim N(3, 16)$. What is $P(X > 0)$?

$$P(X > 0) = P\left(\frac{X-3}{4} > \frac{0-3}{4}\right) = P\left(Z > -\frac{3}{4}\right) = 1 - P\left(Z \le -\frac{3}{4}\right) = 1 - \Phi\left(-\frac{3}{4}\right) = 1 - \left(1 - \Phi\left(\frac{3}{4}\right)\right) = \Phi\left(\frac{3}{4}\right) = 0.7734$$

What is $P(2 < X < 5)$?

$$P(2 < X < 5) = P\left(\frac{2-3}{4} < \frac{X-3}{4} < \frac{5-3}{4}\right) = P\left(-\frac{1}{4} < Z < \frac{2}{4}\right) = \Phi\left(\frac{2}{4}\right) - \Phi\left(-\frac{1}{4}\right) = \Phi\left(\frac{1}{2}\right) - \left(1 - \Phi\left(\frac{1}{4}\right)\right) = 0.2902$$

Example 2

You send a voltage of 2 or -2 on a wire to denote 1 or 0. Let $X$ = voltage sent and let $R$ = voltage received. $R = X + Y$, where $Y \sim N(0, 1)$ is noise. When decoding, if $R \ge 0.5$ we interpret the voltage as 1, else 0. What is $P(\text{error after decoding} \mid \text{original bit} = 1)$?

$$P(X + Y < 0.5) = P(2 + Y < 0.5) = P(Y < -1.5) = \Phi(-1.5) = 1 - \Phi(1.5) \approx 0.0668$$

Binomial Approximation

Imagine this terrible scenario. You need to solve a probability question on a binomial random variable (see the chapter on discrete distributions) with a large value for $n$ (the number of experiments). You quickly realize that it is way too hard to compute by hand. Recall that the binomial probability mass function has an $n!$ term. You decide to turn to your computer, but after a few iterations you realize that this is too hard even for your GPU-boosted mega computer (or your laptop).

As a concrete example, imagine an election in a country with 10 million people, all of whom vote. Each person in the country votes for candidate A with independent probability 0.53. You want to know the probability that candidate A gets more than 5 million votes. Yikes!

Don't panic (unless you are candidate B, then sorry, this election is not for you). Did you notice how similar a normal distribution's PDF and a binomial distribution's PMF look? Let's take a side-by-side view:

[Figure: a normal PDF and a binomial PMF shown side by side]
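The visual similarity can also be sketched numerically. The snippet below is an illustration rather than anything from the original notes: it tabulates a binomial PMF next to the PDF of the normal with the same mean and variance. The parameters $n = 100$ and $p = 0.5$ are an assumption chosen to match the binomial used later in these notes, and the function names are hypothetical.

```python
import math


def binomial_pmf(k, n, p):
    """P(X = k) for X ~ Bin(n, p), computed directly (fine for small n)."""
    return math.comb(n, k) * p ** k * (1 - p) ** (n - k)


def normal_pdf(x, mu, var):
    """Density of N(mu, var) evaluated at x."""
    return math.exp(-((x - mu) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)


n, p = 100, 0.5
mu = n * p              # binomial expectation np
var = n * p * (1 - p)   # binomial variance np(1 - p)

print("  k   binomial PMF   normal PDF")
for k in range(35, 66, 5):
    print(f"{k:3d}   {binomial_pmf(k, n, p):.4f}         {normal_pdf(k, mu, var):.4f}")
```

For these parameters the two columns should agree to roughly three decimal places, which is the similarity the side-by-side view is meant to convey.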

Let's say our binomial is a random variable $X \sim \text{Bin}(100, 0.5)$ and we want to calculate $P(X \ge 55)$. We could cheat by using the closest-fit normal (in this case $Y \sim N(50, 25)$). How did we choose that particular normal? I simply selected one with a mean and variance that match the binomial expectation and variance. The binomial expectation is $np = 100 \cdot 0.5 = 50$. The binomial variance is $np(1-p) = 100 \cdot 0.5 \cdot 0.5 = 25$. Since $Y \approx X$, $P(X \ge 55)$ seems like it should be approximately $P(Y \ge 55)$. That is almost true. It turns out that there is a formal mathematical reason why the normal is a good approximation of the binomial as long as the binomial parameter $p$ is reasonable (e.g., in the range [0.3, 0.7]) and $n$ is large enough.

However! There was an oversight in our logic. Let's look a bit closer at the binomial we are approximating. Since we want to approximate $P(X \ge 55)$, our goal is to calculate the sum of all of the columns in the binomial PMF from 55 and up (all those dark columns). If we calculate the probability that the approximating normal random variable takes on a value greater than 55, $P(Y \ge 55)$, we will get the integral starting at the vertical dashed line. Hey! That's not where the columns start. Really we want the area under the curve starting halfway between 54 and 55. The correct approximation would be to calculate $P(Y \ge 54.5)$.

Yep, that adds an annoying layer of complexity. The simple idea is that when you approximate a discrete distribution with a continuous one, if you are not careful your approximating integral will only include half of one of your boundary values. (In this case we were only adding half of the column for $P(X = 55)$.) The correction is called the continuity correction.

You can use a normal distribution to approximate a binomial $X \sim \text{Bin}(n, p)$. To do so, define a normal $Y \sim N(E[X], \mathrm{Var}(X))$. Using the binomial formulas for expectation and variance, $Y \sim N(np, np(1-p))$. This approximation holds for large $n$ and moderate $p$. Since a normal is continuous and a binomial is discrete, we have to use a continuity correction to discretize the normal:

$$P(X = k) \approx P\left(k - \frac{1}{2} < Y < k + \frac{1}{2}\right) = \Phi\left(\frac{k - np + 0.5}{\sqrt{np(1-p)}}\right) - \Phi\left(\frac{k - np - 0.5}{\sqrt{np(1-p)}}\right)$$

You should get comfortable deciding what continuity correction to use. Here are a few examples of discrete probability questions and the continuity correction:

Discrete (binomial) probability question    Equivalent continuous probability question
$P(X = 6)$                                  $P(5.5 < Y < 6.5)$
$P(X \ge 6)$                                $P(Y > 5.5)$
$P(X > 6)$                                  $P(Y > 6.5)$
$P(X < 6)$                                  $P(Y < 5.5)$
$P(X \le 6)$                                $P(Y < 6.5)$

Example 3

100 visitors to your website are given a new design. Let $X$ = # of people who were given the new design and spend more time on your website. Your CEO will endorse the new design if $X \ge 65$. What is $P(\text{CEO endorses change} \mid \text{it has no effect})$?

If the design has no effect, each visitor is equally likely to spend more or less time, so $X \sim \text{Bin}(100, 0.5)$. $E[X] = np = 50$. $\mathrm{Var}(X) = np(1-p) = 25$. $\sigma = \sqrt{\mathrm{Var}(X)} = 5$. We can thus use a normal approximation: $Y \sim N(50, 25)$.

$$P(X \ge 65) \approx P(Y > 64.5) = P\left(\frac{Y - 50}{5} > \frac{64.5 - 50}{5}\right) = 1 - \Phi(2.9) = 0.0019$$

Example 4

Stanford accepts 2480 students and each student has a 68% chance of attending. Let $X$ = # students who will attend. $X \sim \text{Bin}(2480, 0.68)$. What is $P(X > 1745)$?

$E[X] = np = 1686.4$. $\mathrm{Var}(X) = np(1-p) = 539.7$. $\sigma = \sqrt{\mathrm{Var}(X)} = 23.23$. We can thus use a normal approximation: $Y \sim N(1686.4, 539.7)$.

$$P(X > 1745) \approx P(Y > 1745.5) = P\left(\frac{Y - 1686.4}{23.23} > \frac{1745.5 - 1686.4}{23.23}\right) = 1 - \Phi(2.54) = 0.0055$$
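The effect of the continuity correction can be seen by comparing the normal approximation with an exact binomial tail. The following is a minimal sketch under stated assumptions: the function names are hypothetical, $\Phi$ is computed from `math.erf`, and the exact tail sums the PMF in log space via `math.lgamma` so that large $n$ does not overflow. It is one possible way to check the examples above, not the method used in the notes.

```python
import math


def phi(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def binomial_pmf(k, n, p):
    """P(X = k) for X ~ Bin(n, p) with 0 < p < 1, computed in log space."""
    log_pmf = (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
               + k * math.log(p) + (n - k) * math.log(1 - p))
    return math.exp(log_pmf)


def binomial_tail_exact(k, n, p):
    """Exact P(X >= k), summing the PMF from k up to n."""
    return sum(binomial_pmf(i, n, p) for i in range(k, n + 1))


def normal_tail_approx(k, n, p, continuity=True):
    """Approximate P(X >= k) with Y ~ N(np, np(1-p)).

    With the continuity correction the cutoff is k - 0.5, i.e. P(Y > k - 0.5);
    without it the cutoff is k itself.
    """
    mu = n * p
    sigma = math.sqrt(n * p * (1 - p))
    cutoff = k - 0.5 if continuity else k
    return 1.0 - phi((cutoff - mu) / sigma)


# X ~ Bin(100, 0.5), P(X >= 55): the uncorrected normal tail is noticeably off.
print("exact:        ", round(binomial_tail_exact(55, 100, 0.5), 4))
print("corrected:    ", round(normal_tail_approx(55, 100, 0.5), 4))
print("uncorrected:  ", round(normal_tail_approx(55, 100, 0.5, continuity=False), 4))

# Example 3: P(X >= 65) for Bin(100, 0.5), cutoff 64.5.
print("Example 3:    ", round(normal_tail_approx(65, 100, 0.5), 4))

# Example 4: P(X > 1745) = P(X >= 1746) for Bin(2480, 0.68), cutoff 1745.5.
print("Example 4:    ", round(normal_tail_approx(1746, 2480, 0.68), 4))
```

Note that a strict inequality such as $P(X > 1745)$ is first rewritten as $P(X \ge 1746)$ before the half-unit shift is applied, which reproduces the cutoff of 1745.5 used in Example 4.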