Section 0: Introduction and Review of Basic Concepts

Section 0: Introduction and Review of Basic Concepts
Carlos M. Carvalho
The University of Texas, McCombs School of Business
mccombs.utexas.edu/faculty/carlos.carvalho/teaching

Getting Started
Syllabus: www.mccombs.utexas.edu/faculty/carlos.carvalho/teaching/
General Expectations:
1. Read the notes / Practice
2. Be on schedule

Course Overview
Section 0: Basic Concepts: Probability and Estimation
Section 1: Simple Regression Model
Section 2: Multiple Regression
Section 3: Dummy Variables and Interactions
Section 4: Regression Diagnostics and Transformations
Section 5: Time Series
Section 6: Model Selection, Logistic Regression and more...

Review of Basic Concepts
Probability and statistics let us talk efficiently about things we are unsure about:
- If I only ask 1,000 voters out of 10 million, how sure can I be about how they all will vote? What is the true proportion of "yes" voters?
- If I am trying to predict sales next quarter, how sure am I?
- If I am trying to choose my portfolio, how sure am I about returns on the assets next period?
- If I want to do target marketing, which customers are more likely to respond to a promotion?
All of these involve inferring or predicting unknown quantities!!

Random Variables
Random variables are numbers that we are NOT sure about, but we might have some idea of how to describe their potential outcomes. We usually use a capital letter to denote a random variable.
Example: Suppose we are about to toss two coins. Let X denote the number of heads. We say that X is the random variable that stands for the number we are not sure about. Note that we always assign numbers to random variables!

Probability Distribution
We describe the behavior of random variables with a Probability Distribution.
Example: If X is the random variable denoting the number of heads in two independent coin tosses, we can describe its behavior through the following probability distribution:
X = 0 with prob. 0.25
    1 with prob. 0.50
    2 with prob. 0.25
X is called a Discrete Random Variable, as we are able to list all the possible outcomes.
Question: What is Pr(X = 0)? How about Pr(X <= 1)?
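
The two-coin distribution can be checked by brute-force enumeration. A minimal sketch in Python (the variable names are illustrative, not from the slides):

```python
from itertools import product

# Enumerate the 4 equally likely outcomes of two fair coin tosses
# and tally X = number of heads.
outcomes = list(product(["H", "T"], repeat=2))
dist = {}
for o in outcomes:
    x = o.count("H")
    dist[x] = dist.get(x, 0) + 1 / len(outcomes)

for x in sorted(dist):
    print(x, dist[x])          # 0 -> 0.25, 1 -> 0.5, 2 -> 0.25

print(dist[0])                 # Pr(X = 0) = 0.25
print(dist[0] + dist[1])       # Pr(X <= 1) = 0.75
```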

Probability Distribution
Probability is always a number between 0 and 1, and the total probability across all possible values of a random variable equals 1.

The Bernoulli Distribution
Suppose the random variable X can only take the values 0 or 1. This is a dummy variable representing success or failure of an experiment... e.g.:
X = 1 if I win a hand in blackjack... X = 0 if I lose
X = 1 if David Ortiz gets a hit in an at bat... X = 0 otherwise
X = 1 if the S&P500 return is positive... X = 0 if negative
The Bernoulli is a distribution defined by the probability parameter p. We denote X ~ Bernoulli(p):
X = 1 with prob. p
    0 with prob. 1 - p

Mean and Variance of a Random Variable
Suppose someone asks you for a prediction of X. What would you say?
Suppose someone asks you how sure you are. What would you say?

Mean and Variance of a Random Variable
The Mean or Expected Value is defined as (for a discrete X):
E(X) = Σ_{i=1}^n Pr(x_i) × x_i
We weight each possible value by how likely it is... this provides us with a measure of centrality of the distribution... a good prediction for X!

Mean and Variance of a Random Variable
Suppose X ~ Bernoulli(p):
E(X) = Σ_{i=1}^n Pr(x_i) × x_i = 0 × (1 - p) + 1 × p
E(X) = p

Mean and Variance of a Random Variable
The Variance is defined as (for a discrete X):
Var(X) = Σ_{i=1}^n Pr(x_i) × [x_i - E(X)]²
Weighted average of squared prediction errors... This is a measure of spread of a distribution. More risky distributions have larger variance.

Mean and Variance of a Random Variable
Suppose X ~ Bernoulli(p):
Var(X) = Σ_{i=1}^n Pr(x_i) × [x_i - E(X)]²
       = (0 - p)² × (1 - p) + (1 - p)² × p
       = p(1 - p) × [(1 - p) + p]
Var(X) = p(1 - p)
Question: For which value of p is the variance the largest?
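
The discrete mean and variance formulas can be applied directly to the Bernoulli support {0, 1}. A small sketch (the helper function is ours, not from the slides):

```python
def bernoulli_mean_var(p):
    # E(X) = sum Pr(x_i) * x_i ;  Var(X) = sum Pr(x_i) * (x_i - E(X))**2
    support = [(0, 1 - p), (1, p)]
    mean = sum(pr * x for x, pr in support)
    var = sum(pr * (x - mean) ** 2 for x, pr in support)
    return mean, var

print(bernoulli_mean_var(0.3))   # mean p = 0.3, variance p(1-p) = 0.21
print(bernoulli_mean_var(0.5))   # the variance is largest at p = 0.5
```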

The Standard Deviation
What are the units of E(X)? What are the units of Var(X)?
A more intuitive way to understand the spread of a distribution is to look at the standard deviation:
sd(X) = √Var(X)
What are the units of sd(X)?

Continuous Random Variables
Suppose we are trying to predict tomorrow's return on the S&P500...
Question: What is the random variable of interest? Question: How can we describe our uncertainty about tomorrow's outcome?
Listing all possible values seems like a crazy task... we'll work with intervals instead. These are called continuous random variables. The probability of an interval is defined by the area under the probability density function.

The Normal Distribution
A random variable is a number we are NOT sure about, but we might have some idea of how to describe its potential outcomes. The Normal distribution is the most used probability distribution to describe a random variable. The probability the number ends up in an interval is given by the area under the curve (pdf).
[Figure: standard normal pdf]

The Normal Distribution
The standard Normal distribution has mean 0 and variance 1. Notation: Z ~ N(0, 1) (Z is the random variable).
Pr(-1 < Z < 1) ≈ 0.68
Pr(-1.96 < Z < 1.96) ≈ 0.95
[Figures: standard normal pdf with the central 68% and 95% regions shaded]

The Normal Distribution
Note: For simplicity we will often use P(-2 < Z < 2) ≈ 0.95.
Questions: What is Pr(Z < 2)? How about Pr(Z ≤ 2)? What is Pr(Z < 0)?
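
These standard normal probabilities can be computed with Python's built-in `statistics.NormalDist` (standard library, Python 3.8+), a sketch for checking the numbers above:

```python
from statistics import NormalDist

Z = NormalDist(0, 1)                     # the standard normal Z ~ N(0, 1)

print(round(Z.cdf(1) - Z.cdf(-1), 2))    # Pr(-1 < Z < 1) ≈ 0.68
print(round(Z.cdf(1.96) - Z.cdf(-1.96), 2))  # Pr(-1.96 < Z < 1.96) ≈ 0.95
print(round(Z.cdf(2), 3))                # Pr(Z < 2) ≈ 0.977
print(Z.cdf(0))                          # Pr(Z < 0) = 0.5
```

For a continuous random variable Pr(Z < 2) and Pr(Z ≤ 2) are the same, since any single point has probability zero.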

The Normal Distribution
The standard normal is not that useful by itself. When we say "the normal distribution", we really mean a family of distributions. We obtain pdfs in the normal family by shifting the bell curve around and spreading it out (or tightening it up).

The Normal Distribution
We write X ~ N(µ, σ²): the Normal distribution with mean µ and variance σ².
The parameter µ determines where the curve is; the center of the curve is µ.
The parameter σ determines how spread out the curve is. The area under the curve in the interval (µ - 2σ, µ + 2σ) is 95%:
Pr(µ - 2σ < X < µ + 2σ) ≈ 0.95
[Figure: normal pdf with ticks at µ - 2σ, µ - σ, µ, µ + σ, µ + 2σ]

The Normal Distribution
Example: Below are the pdfs of X₁ ~ N(0, 1), X₂ ~ N(3, 1), and X₃ ~ N(0, 16). Which pdf goes with which X?
[Figure: three normal pdfs plotted on the interval (-8, 8)]

The Normal Distribution Example
Assume the annual returns on the SP500 are normally distributed with mean 6% and standard deviation 15%, i.e., SP500 ~ N(6, 225). (Notice: 15² = 225.)
Two questions: (i) What is the chance of losing money in a given year? (ii) What is the value such that there's only a 2% chance of losing that much or more?
Lloyd Blankfein: "I spend 98% of my time thinking about 2% probability events!"
(i) Pr(SP500 < 0) and (ii) Pr(SP500 < ?) = 0.02

The Normal Distribution Example
[Figures: pdf of SP500 ~ N(6, 225) with the area below 0 and the lower 2% tail shaded]
(i) Pr(SP500 < 0) = 0.35 and (ii) Pr(SP500 < -25) = 0.02
In Excel: NORMDIST and NORMINV (homework!)
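
The slides point to Excel's NORMDIST and NORMINV; as a stand-in, the same two numbers can be obtained with Python's standard-library `statistics.NormalDist` (a sketch, not part of the course materials):

```python
from statistics import NormalDist

sp500 = NormalDist(mu=6, sigma=15)    # annual return, in percent

# (i) chance of losing money: Pr(SP500 < 0)
print(round(sp500.cdf(0), 3))         # ≈ 0.345, i.e. about 0.35

# (ii) the cutoff with only a 2% chance of losing that much or more
print(round(sp500.inv_cdf(0.02), 1))  # ≈ -24.8, i.e. about -25
```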

The Normal Distribution
1. Note: In X ~ N(µ, σ²), µ is the mean and σ² is the variance.
2. Standardization: if X ~ N(µ, σ²) then Z = (X - µ)/σ ~ N(0, 1)
3. Summary: X ~ N(µ, σ²):
µ: where the curve is
σ: how spread out the curve is
95% chance X ∈ µ ± 2σ

The Normal Distribution Another Example
Prior to the 1987 crash, monthly S&P500 returns (r) followed (approximately) a normal with mean 0.012 and standard deviation equal to 0.043. How extreme was the crash of -0.2176? The standardization helps us interpret these numbers...
r ~ N(0.012, 0.043²)
For the crash,
z = (r - 0.012)/0.043 ~ N(0, 1)
z = (-0.2176 - 0.012)/0.043 = -5.27
How extreme is this z-value? More than 5 standard deviations away!!
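
The standardization is one line of arithmetic; a quick sketch (note that with the rounded inputs above the z comes out near -5.3, close to the -5.27 the slides report):

```python
# Standardize the October 1987 crash return under r ~ N(0.012, 0.043**2)
mu, sigma = 0.012, 0.043
r_crash = -0.2176

z = (r_crash - mu) / sigma
print(round(z, 2))   # more than 5 standard deviations below the mean
```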

Mean and Variance of a Random Variable
Suppose X ~ N(µ, σ²). Suppose someone asks you for a prediction of X. What would you say? Suppose someone asks you how sure you are. What would you say?
[Figure: normal pdf with ticks at µ - 2σ, µ - σ, µ, µ + σ, µ + 2σ]

Mean and Variance of a Random Variable
For the normal family of distributions we can see that the parameter µ talks about where the distribution is located or centered. We often use µ as our best guess for a prediction.
The parameter σ talks about how spread out the distribution is. This gives us an indication of how uncertain or how risky our prediction is.
If X is any random variable, the mean will be a measure of the location of the distribution and the variance will be a measure of how spread out it is.

The Mean and Variance of a Normal
For continuous distributions, the above formulas for E(X) and Var(X) get a bit more complicated, as we are adding an infinite number of possible outcomes... not to worry, the interpretation is still the same.
If X ~ N(µ, σ²) then E(X) = µ, Var(X) = σ², sd(X) = σ
[Figure: normal pdf with ticks at µ - 2σ, µ - σ, µ, µ + σ, µ + 2σ]

Two More (very important!) Formulas
Let X and Y be two random variables:
E(aX + bY) = aE(X) + bE(Y)
Var(aX + bY) = a²Var(X) + b²Var(Y) + 2ab·Cov(X, Y)
We will get back to this later...

Conditional, Joint and Marginal Distributions
In general we want to use probability to address problems involving more than one variable at a time. We need to be able to describe what we think will happen to one variable relative to another... we want to answer questions like: How are my sales impacted by the overall economy?

Conditional, Joint and Marginal Distributions
Let E denote the performance of the economy next quarter... for simplicity, say E = 1 if the economy is expanding and E = 0 if the economy is contracting (what kind of random variable is this?). Let's assume E ~ Bernoulli(0.7).
Let S denote my sales next quarter... and let's suppose the following probability statements:

S   pr(S | E = 1)        S   pr(S | E = 0)
1   0.05                 1   0.20
2   0.20                 2   0.30
3   0.50                 3   0.30
4   0.25                 4   0.20

These are called Conditional Distributions.

Conditional, Joint and Marginal Distributions

S   pr(S | E = 1)        S   pr(S | E = 0)
1   0.05                 1   0.20
2   0.20                 2   0.30
3   0.50                 3   0.30
4   0.25                 4   0.20

In blue is the conditional distribution of S given E = 1; in red is the conditional distribution of S given E = 0. We read: the probability of Sales of 4 (S = 4) given (or conditional on) the economy growing (E = 1) is 0.25.

Conditional, Joint and Marginal Distributions
The conditional distributions tell us about what can happen to S for a given value of E... but what about S and E jointly?
pr(S = 4 and E = 1) = pr(E = 1) × pr(S = 4 | E = 1) = 0.70 × 0.25 = 0.175
In English: 70% of the time the economy grows, and 1/4 of those times sales equals 4... 25% of 70% is 17.5%.
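
Multiplying each conditional probability by the marginal of E builds the whole joint distribution at once. A sketch of that bookkeeping in Python (the dictionaries just encode the tables above):

```python
# pr(E) and the two conditional distributions pr(S | E) from the slides
pr_E = {1: 0.70, 0: 0.30}
pr_S_given_E = {
    1: {1: 0.05, 2: 0.20, 3: 0.50, 4: 0.25},
    0: {1: 0.20, 2: 0.30, 3: 0.30, 4: 0.20},
}

# joint: pr(S = s, E = e) = pr(E = e) * pr(S = s | E = e)
joint = {(s, e): pr_E[e] * pr_S_given_E[e][s]
         for e in pr_E for s in pr_S_given_E[e]}

print(round(joint[(4, 1)], 3))   # pr(S = 4, E = 1) = 0.7 * 0.25 = 0.175

# marginal of S: sum the joint over the values of E
pr_S4 = joint[(4, 1)] + joint[(4, 0)]
print(round(pr_S4, 3))           # pr(S = 4) = 0.175 + 0.06 = 0.235
```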

Conditional, Joint and Marginal Distributions

            E = 0    E = 1
S = 1       0.060    0.035
S = 2       0.090    0.140
S = 3       0.090    0.350
S = 4       0.060    0.175

(Each entry is pr(S = s, E = e) = pr(E = e) × pr(S = s | E = e).)

Conditional, Joint and Marginal Distributions
We call the probabilities of E and S together the joint distribution of E and S. In general the notation is...
pr(Y = y, X = x) is the joint probability that the random variable Y equals y AND the random variable X equals x.
pr(Y = y | X = x) is the conditional probability that the random variable Y takes the value y GIVEN that X equals x.
pr(Y = y) and pr(X = x) are the marginal probabilities of Y = y and X = x.

Important relationships
Relationship between the joint and the conditional...
pr(y, x) = pr(x) × pr(y | x) = pr(y) × pr(x | y)
Relationship between the joint and the marginal...
pr(x) = Σ_y pr(x, y)
pr(y) = Σ_x pr(x, y)

Conditional, Joint and Marginal Distributions
Why we call marginals "marginals"... the table represents the joint distribution and, at the margins, we get the marginals.

Conditional, Joint and Marginal Distributions
Example... Given E = 1, what is the probability of S = 4?
pr(S = 4 | E = 1) = pr(S = 4, E = 1) / pr(E = 1) = 0.175 / 0.7 = 0.25

Conditional, Joint and Marginal Distributions
Example... Given S = 4, what is the probability of E = 1?
pr(E = 1 | S = 4) = pr(S = 4, E = 1) / pr(S = 4) = 0.175 / 0.235 = 0.745

Bayes Theorem
Disease testing example... Let D = 1 indicate you have a disease; let T = 1 indicate that you test positive for it. If you take the test and the result is positive, you are really interested in the question: Given that you tested positive, what is the chance you have the disease?

Bayes Theorem
pr(D = 1 | T = 1) = 0.019 / (0.019 + 0.0098) = 0.66

Bayes Theorem
The computation of pr(x | y) from pr(x) and pr(y | x) is called Bayes theorem...
pr(x | y) = pr(y, x) / pr(y) = pr(y, x) / Σ_x pr(y, x) = pr(x)pr(y | x) / Σ_x pr(x)pr(y | x)
In the disease testing example:
p(D = 1 | T = 1) = p(T = 1 | D = 1)p(D = 1) / [p(T = 1 | D = 1)p(D = 1) + p(T = 1 | D = 0)p(D = 0)]
pr(D = 1 | T = 1) = 0.019 / (0.019 + 0.0098) = 0.66
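
The slides give only the products 0.019 and 0.0098; one set of hypothetical inputs consistent with those numbers (assumed here for illustration, not stated in the source) is a 2% prior, 95% sensitivity and a 1% false-positive rate:

```python
# Hypothetical inputs consistent with the slide's 0.019 and 0.0098:
p_D = 0.02            # assumed prior: pr(D = 1)
p_T_given_D1 = 0.95   # assumed sensitivity: pr(T = 1 | D = 1)
p_T_given_D0 = 0.01   # assumed false-positive rate: pr(T = 1 | D = 0)

num = p_D * p_T_given_D1                 # pr(T = 1, D = 1) = 0.019
den = num + (1 - p_D) * p_T_given_D0     # pr(T = 1) = 0.019 + 0.0098

print(round(num / den, 2))               # pr(D = 1 | T = 1) ≈ 0.66
```

Note how a positive test leaves only a 66% chance of disease, because the disease is rare to begin with.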

Independence
Two random variables X and Y are independent if
pr(Y = y | X = x) = pr(Y = y)
for all possible x and y values. In other words, knowing X tells you nothing about Y! e.g., tossing a coin 2 times... what is the probability of getting H in the second toss given we saw a T in the first one?

Independence
We can extend the notion of independence to any number of variables. For example, Y₁ is independent of Y₂ and Y₃ if
pr(Y₁ = y₁ | Y₂ = y₂, Y₃ = y₃) = pr(Y₁ = y₁)

IID
Suppose you are about to toss a coin n times. Let Yᵢ = 1 if the i-th toss is a head and 0 otherwise...
What is pr(Y₂₉ = 1)? What is pr(Y₂₉ = 1 | Y₂₇ = 0, Y₂₈ = 1)? What is pr(Y₅₇ = 1)?

IID
Each Yᵢ ~ Bernoulli(0.5). Each Yᵢ is independent of all others... hence IID: independent and identically distributed.

A First Modeling Exercise
I have US$1,000 invested in the Brazilian stock index, the IBOVESPA. I need to predict tomorrow's value of my portfolio. I also want to know how risky my portfolio is; in particular, I want to know how likely I am to lose more than 3% of my money by the end of tomorrow's trading session. What should I do?

IBOVESPA - Data
[Figures: histogram (density) of IBOVESPA daily returns, and the time series of daily returns over the sample period]

As a first modeling decision, let's call the random variable associated with daily returns on the IBOVESPA X and assume that returns are independent and identically distributed as X ~ N(µ, σ²).
Question: What are the values of µ and σ²? We need to estimate these values from the sample at hand (n = 113 observations)...

Let's assume that each observation in the random sample {x₁, x₂, x₃,..., x_n} is independent and distributed according to the model above, i.e., xᵢ ~ N(µ, σ²). A usual strategy is to estimate µ and σ², the mean and the variance of the distribution, via the sample mean (X̄) and the sample variance (s²)... (their sample counterparts)
X̄ = (1/n) Σ_{i=1}^n xᵢ
s² = 1/(n - 1) Σ_{i=1}^n (xᵢ - X̄)²

For the IBOVESPA data at hand,
X̄ = 0.04 and s² = 2.19
[Figure: histogram of daily returns with the fitted normal density (red line) overlaid]
The red line represents our model, i.e., the normal distribution with mean and variance given by the estimated quantities X̄ and s². What is Pr(X < -3)?
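
With the fitted model in hand, the risk question becomes one normal tail probability. A sketch that plugs in the slide's estimates (with the full return series you would compute them via `statistics.mean` and `statistics.variance`, which uses the n - 1 denominator):

```python
from math import sqrt
from statistics import NormalDist

# Fitted model for daily returns: X ~ N(0.04, 2.19), estimates from the slide
xbar, s2 = 0.04, 2.19
model = NormalDist(mu=xbar, sigma=sqrt(s2))

# Chance of losing more than 3% tomorrow
print(round(model.cdf(-3), 3))   # Pr(X < -3) ≈ 0.02
```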

Models, Parameters, Estimates...
In general we talk about unknown quantities using the language of probability... and the following steps:
- Define the random variables of interest
- Define a model (or probability distribution) that describes the behavior of the RV of interest
- Based on the data available, estimate the parameters defining the model
- We are now ready to describe possible scenarios, generate predictions, make decisions, evaluate risk, etc...

Oracle vs. SAP Example (understanding variation)

Oracle vs. SAP
Do we buy the claim from this ad? We have a dataset of 81 firms that use SAP... The industry ROE is 15% (also an estimate, but let's assume it is true). We assume that the random variable X represents the ROE of SAP firms and can be described by X ~ N(µ, σ²).

            X̄        s²
SAP firms   0.1263   0.065

Well, 0.12/0.15 ≈ 0.8! I guess the ad is correct, right? Not so fast...

Oracle vs. SAP
Let's assume that the ROE of firms using SAP is, on average, the same as the industry. Assume further that s² is a good estimate of the variance...
ROE ~ N(0.15, 0.065)
In a sample of 81 firms, how often can we expect the sample mean to be below 0.15? What does this mean if I am trying to compare the profitability of firms using SAP versus the industry?

Oracle vs. SAP
Let's do a little simulation... Generate 1000 different samples of size 81 from N(0.15, 0.065) and plot the histogram of X̄... Now, what do you think about the ad?
[Figure: histogram of the 1000 simulated sample means]
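
The simulation above takes a few lines with the standard-library `random` module. As an added sketch, we also tally how often the simulated sample mean falls at or below the observed 0.1263 (the seed and tally are our additions, not from the slides):

```python
import random
from math import sqrt

random.seed(42)
mu, sigma, n = 0.15, sqrt(0.065), 81   # N(0.15, 0.065), samples of size 81

# 1000 sample means, each from a fresh sample of 81 simulated ROEs
xbars = [sum(random.gauss(mu, sigma) for _ in range(n)) / n
         for _ in range(1000)]

# How often does the sample mean fall at or below the observed 0.1263?
frac = sum(xb <= 0.1263 for xb in xbars) / 1000
print(frac)   # roughly 0.2: not at all unusual under the industry mean
```

So a sample mean as low as 0.1263 happens about one time in five even when SAP firms are no different from the industry.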

Sampling Distribution of the Sample Mean
Consider the mean for an iid sample of n observations of a random variable {X₁,..., X_n}. Suppose that E(Xᵢ) = µ and var(Xᵢ) = σ².
E(X̄) = (1/n) Σ E(Xᵢ) = µ
var(X̄) = var((1/n) Σ Xᵢ) = (1/n²) Σ var(Xᵢ) = σ²/n
If X is normal, then X̄ ~ N(µ, σ²/n).
This is called the sampling distribution of the mean...

Sampling Distribution of the Sample Mean
The sampling distribution of X̄ describes how our estimate would vary over different datasets of the same size n. It provides us with a vehicle to evaluate the uncertainty associated with our estimate of the mean...
It turns out that s² is a good proxy for σ², so that we can approximate the sampling distribution by
X̄ ~ N(µ, s²/n)
We call √(s²/n) the standard error of X̄... it is a measure of its variability... I like the notation
s_X̄ = √(s²/n)

Sampling Distribution of the Sample Mean
X̄ ~ N(µ, s²_X̄)
X̄ is unbiased... E(X̄) = µ. On average, X̄ is right!
X̄ is consistent... as n grows, s²_X̄ → 0, i.e., with more information, X̄ eventually correctly estimates µ!

Back to the Oracle vs. SAP example
Our simulation was done assuming that µ = 0.15... in that case
X̄ ~ N(0.15, 0.065/81)
[Figure: histogram of the simulated sample means with the sampling distribution overlaid]

Confidence Intervals
X̄ ~ N(µ, s²_X̄), so... (X̄ - µ) ~ N(0, s²_X̄), right?
What is a good prediction for µ? What is our best guess?? X̄
How do we make mistakes? How far from µ can we be?? 95% of the time, ±2 s_X̄
[X̄ ± 2 s_X̄] gives a 95% range of plausible values for µ... this is called the 95% Confidence Interval for µ.

Oracle vs. SAP example... one more time
In this example, X̄ = 0.1263, s² = 0.065 and n = 81... therefore s²_X̄ = 0.065/81, so the 95% confidence interval for the ROE of SAP firms is
[X̄ - 2 s_X̄; X̄ + 2 s_X̄] = [0.1263 - 2√(0.065/81); 0.1263 + 2√(0.065/81)] = [0.069; 0.183]
Is 0.15 a plausible value? What does that mean?
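
The interval arithmetic above can be checked directly; a short sketch:

```python
from math import sqrt

xbar, s2, n = 0.1263, 0.065, 81

se = sqrt(s2 / n)                   # standard error of the sample mean
lo, hi = xbar - 2 * se, xbar + 2 * se

print(round(lo, 3), round(hi, 3))   # ≈ [0.069, 0.183]; 0.15 is inside
```

Since 0.15 sits inside the interval, the data are consistent with SAP firms earning the industry ROE.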

Estimating Proportions... another modeling example
Your job is to manufacture a part. Each time you make a part, it is defective or not. Below we have the results from 100 parts you just made; Yᵢ = 1 means a defect, 0 a good one. How would you predict the next one?
[Figure: plot of the 0/1 outcomes for the 100 parts]
Would you model these Yᵢ as iid Bernoulli draws for some p? There are 18 ones and 82 zeros.

In this case, it might be reasonable to model the defects as iid Bernoulli(0.18). We can't be sure this is right, but the data looks like the kind of thing we would get if we had iid draws with that p!!!
If we believe our model, what is the chance that the next 10 are good? 0.82^10 = 0.137.

We used the proportion of defects in our sample to estimate p, the true, long-run proportion of defects. Could this estimate be wrong?!! Let p̂ denote the sample proportion. The standard error associated with the sample proportion as an estimate of the true proportion is:
s_p̂ = √(p̂(1 - p̂)/n)

Suppose we have iid Bernoulli data and estimate the true p by the observed sample proportion of 1's, p̂. The (approximate) 95% confidence interval for the true proportion is:
p̂ ± 2 s_p̂

Defects: In our defect example we had p̂ = 0.18 and n = 100. This gives
s_p̂ = √((0.18 × 0.82)/100) = 0.04
The confidence interval is 0.18 ± 0.08 = (0.10, 0.26), big!!!!
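
The same proportion interval works for any p̂ and n; a small sketch (the helper function is ours) covering both the defect example and the 1,000-respondent poll discussed next:

```python
from math import sqrt

def prop_ci(p_hat, n):
    # approximate 95% CI for a proportion: p_hat +/- 2 * sqrt(p_hat(1-p_hat)/n)
    se = sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - 2 * se, p_hat + 2 * se

print(prop_ci(0.18, 100))    # defects: roughly (0.10, 0.26), a wide interval
print(prop_ci(0.50, 1000))   # poll: roughly 0.5 +/- 0.03
```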

Polls: yet another example...
If we take a relatively small random sample from a large population and ask each respondent yes or no, with yes Yᵢ = 1 and no Yᵢ = 0, then, approximately, Yᵢ ~ Bernoulli(p), where p is the true population proportion of yes.
Suppose, as is common, n = 1000 and p̂ ≈ 0.5. Then
s_p̂ = √((0.5 × 0.5)/1000) = 0.0158
The standard error is 0.0158, so the ± is 0.0316, or about ±3%.