Module 4: Probability

Probability concepts in statistical inference

Probability is a way of quantifying the uncertainty associated with random events and is the basis for statistical inference. Inference is the generalization of findings from a sample or an experiment to a population.

- In a sample of 1028 adults, 11% were found to approve of the way Congress was handling its job. How much certainty (or confidence) do we have in saying that the true proportion is close to 11%?
- Suppose an experiment is carried out to determine whether taking an antidepressant helps an individual quit smoking. The study finds that at the end of 1 year, 55% of subjects receiving an antidepressant were not smoking, compared with 42.3% in the placebo (control) group. Can this difference (55% vs. 42.3%) be explained by chance?

Probability as a measure of long-run behavior

Suppose we rolled a die 100 times. What proportion of times would you expect to roll a 6? What if we rolled the die 1000 times? 10,000 times? Let's try this experiment in R!
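
The slides run this experiment in R. As a rough stand-in, the sketch below does the same thing with Python's standard library. The seed, the function name `proportion_of_sixes`, and the extra run of 100,000 rolls are illustrative choices, not part of the original slides.

```python
import random

def proportion_of_sixes(num_rolls, rng):
    """Roll a fair six-sided die num_rolls times; return the fraction of 6s."""
    sixes = sum(1 for _ in range(num_rolls) if rng.randint(1, 6) == 6)
    return sixes / num_rolls

rng = random.Random(2024)  # arbitrary fixed seed so the run is repeatable
results = {n: proportion_of_sixes(n, rng) for n in (100, 1000, 10000, 100000)}

for n, prop in results.items():
    print(f"{n:>6} rolls: proportion of 6s = {prop:.4f}")
```

With few rolls the proportion can wander noticeably; with many rolls it should settle near 1/6.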

The following should be clear from the die example: with random events, the proportion of times something happens is random and variable in the short term but predictable in the long run. The probability of rolling a die and getting a 6 is 1/6 ≈ 0.167.

Probability

With a randomized experiment, a random sample, or another random phenomenon, the probability of a particular outcome is the proportion of times that the outcome would occur in a long run of observations.

The way the observed proportion settles toward this value is an example of the law of large numbers. This definition of probability is sometimes referred to as the empirical probability.

For the definition above to hold, the trials (e.g., rolls of the die) must be independent of one another.

Independent trials

Different trials of a random phenomenon are independent if the outcome of any one trial is not affected by or correlated with the outcome of any other trial.

Many events are independent, but it can be human nature to think that they are not. The following are examples of independent events:

- rolling a die
- flipping a coin
- having children

For example, if you roll a die and the number 5 has not come up in 100 rolls, the probability that you roll a 5 on the next roll is still 1/6.
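
The claim about the die can be checked by simulation. A streak of 100 rolls without a 5 is far too rare to observe in practice, so this sketch assumes a shorter streak of 10 rolls (the `drought` variable is hypothetical) and estimates how often a 5 appears on the very next roll.

```python
import random

rng = random.Random(7)  # arbitrary seed for repeatability
rolls = [rng.randint(1, 6) for _ in range(300_000)]

drought = 10  # assumed streak length: ten rolls in a row with no 5

# Collect the roll that immediately follows each streak of `drought` rolls without a 5.
next_rolls = [
    rolls[i]
    for i in range(drought, len(rolls))
    if 5 not in rolls[i - drought:i]
]

freq_after_drought = next_rolls.count(5) / len(next_rolls)
print(f"{len(next_rolls)} streaks found; P(5 on next roll) approx {freq_after_drought:.3f}")
```

If the rolls are independent, the estimate should stay close to 1/6 no matter how long the preceding streak was.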

Classical Probabilities

We saw that the probability of rolling a 6 on a fair die is 1/6, based on its long-run proportion. However, we can also determine this probability by noting that, out of the 6 equally likely outcomes (rolling a 1 through 6), there is only a single way to roll a 6. We will talk about this approach in more detail.

Sample Space

For a random phenomenon, the sample space is the set of all possible outcomes.

We will look at some examples. To help determine the sample space, it is useful to note that if there are x possible outcomes for each trial and there are n trials, the sample space consists of $x^n$ outcomes.

Example

State the sample space for the following probability experiments:

- Flipping a coin once
- Flipping a coin twice
- Flipping a coin three times
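
One way to work through this example is to enumerate the outcomes mechanically. The sketch below is only an illustration: the helper name `sample_space` and the "H"/"T" encoding are assumptions. It lists the sample space for one, two, and three flips and shows that each has 2^n outcomes.

```python
from itertools import product

def sample_space(num_flips):
    """All sequences of H/T for num_flips coin flips, e.g. 'HT'."""
    return ["".join(seq) for seq in product("HT", repeat=num_flips)]

for n in (1, 2, 3):
    space = sample_space(n)
    # Each flip has x = 2 outcomes, so n flips give 2**n outcomes.
    print(f"{n} flip(s): {len(space)} outcomes -> {space}")
```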

Event

An event is a subset of the sample space and corresponds to a particular outcome or a group of possible outcomes. Events are often denoted with capital letters or by a string of letters that describes the event. For example, consider flipping a coin three times:

A = student gets exactly 2 heads
B = student gets at least 2 heads
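
To make "an event is a subset of the sample space" concrete, the following sketch builds A and B as Python sets of outcomes. The string encoding of outcomes is an assumption made for illustration.

```python
from itertools import product

# Sample space for three coin flips, with outcomes written like 'HTH'.
space = ["".join(seq) for seq in product("HT", repeat=3)]

A = {o for o in space if o.count("H") == 2}   # exactly 2 heads
B = {o for o in space if o.count("H") >= 2}   # at least 2 heads

print("A =", sorted(A))
print("B =", sorted(B))
print("A is a subset of B:", A <= B)
```

Every outcome with exactly 2 heads also has at least 2 heads, so A is contained in B.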

Probability of an Event (classical definition)

The probability of an event A, denoted by P(A), is obtained by adding the probabilities of the individual outcomes in the event. When all possible outcomes are equally likely,

$P(A) = \frac{\text{number of outcomes in the event } A}{\text{number of outcomes in the sample space}}$

Probability characteristics for any event:

- a probability must be between 0 and 1.
- if S is the sample space, then P(S) = 1.
- a probability of 0 means the event is impossible.
- a probability of 1 means the event is a certainty.

Example

Find the probability of flipping a coin 3 times and

- getting all heads
- getting at least 1 head
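
A minimal sketch of the classical calculation for this example, assuming a fair coin so that all 8 outcomes are equally likely. The helper `prob` is a hypothetical name. It counts the outcomes in each event and divides by the size of the sample space.

```python
from fractions import Fraction
from itertools import product

space = list(product("HT", repeat=3))  # 8 equally likely outcomes

def prob(event):
    """Classical probability: outcomes in the event / outcomes in the sample space."""
    favourable = [o for o in space if event(o)]
    return Fraction(len(favourable), len(space))

p_all_heads = prob(lambda o: o.count("H") == 3)
p_at_least_one = prob(lambda o: o.count("H") >= 1)

print("P(all heads)       =", p_all_heads)
print("P(at least 1 head) =", p_at_least_one)
```

Using `Fraction` keeps the answers exact rather than as rounded decimals.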

The Complement of an Event

The complement of an event A consists of all outcomes in the sample space that are not in A. It is denoted by $A^C$. The probabilities of A and $A^C$ add to 1, so

$P(A^C) = 1 - P(A)$

Example

We previously found that if you flip a coin 3 times, the probability of the event A = getting at least one head is 7/8. Therefore, the probability of getting no heads (all tails) is

$P(A^C) = 1 - P(A) = 1 - 7/8 = 1/8$

Probabilities for Bell-Shaped (Normal) Distributions

Normal distribution

The normal distribution is symmetric, bell-shaped, and characterized by its mean µ and standard deviation σ.

Finding probabilities for a normal random variable

Suppose that X is a random variable that is normally distributed with mean µ and standard deviation σ. Then $X \sim N(\mu, \sigma)$, and:

- The total area under the curve of the normal distribution is 1.0.
- The area between two values, a and b, is the probability that X is between a and b.
- The area to the left of a is the probability that X is less than a.
- The area to the right of b is the probability that X is greater than b.

We will use R to calculate probabilities of normally distributed random variables.
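
The slides do these calculations in R. As a rough equivalent, the sketch below uses `statistics.NormalDist` from Python's standard library. The mean, standard deviation, and cut-offs a and b are made-up values chosen only to illustrate the three kinds of area.

```python
from statistics import NormalDist

# Hypothetical example values, not taken from the slides.
X = NormalDist(mu=100, sigma=15)
a, b = 90, 120

p_left = X.cdf(a)                # area to the left of a:  P(X < a)
p_between = X.cdf(b) - X.cdf(a)  # area between a and b:   P(a < X < b)
p_right = 1 - X.cdf(b)           # area to the right of b: P(X > b)

print(f"P(X < {a})        = {p_left:.4f}")
print(f"P({a} < X < {b}) = {p_between:.4f}")
print(f"P(X > {b})       = {p_right:.4f}")
print(f"total area       = {p_left + p_between + p_right:.4f}")
```

The three areas partition the whole curve, so they add to 1.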

Standard normal distribution

A random variable Z follows the standard normal distribution if it is normally distributed with mean µ = 0 and standard deviation σ = 1.

If $X \sim N(\mu, \sigma)$, then

$Z = \frac{X - \mu}{\sigma} \sim N(0, 1)$,

and an observation with Z = z lies z standard deviations above the mean (below it when z is negative).

The standard normal distribution can therefore be used to calculate probabilities of observations in terms of standard deviations from the mean.
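
As a small check of the standardization step, this sketch converts an observation to a z-score and confirms that the probability computed on the original scale matches the one computed from the standard normal. The values of `mu`, `sigma`, and `x` are hypothetical.

```python
from statistics import NormalDist

mu, sigma = 70, 4   # assumed population mean and standard deviation
x = 76              # assumed observation

z = (x - mu) / sigma                  # number of SDs above the mean
p_x = NormalDist(mu, sigma).cdf(x)    # P(X < x) on the original scale
p_z = NormalDist(0, 1).cdf(z)         # P(Z < z) on the standard scale

print(f"z = {z}")
print(f"P(X < {x}) = {p_x:.4f}")
print(f"P(Z < {z}) = {p_z:.4f}")
```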

Sampling distributions

How Sample Means Vary Around the Population Mean

Consider a very small population of students in a class, whose ages and genders are given in the table below.

Person  Gender  Age
A       F       21
B       M       19
C       F       21
D       F       21
E       M       18

What is the population mean µ?

Let's take a simple random sample of n = 3 students from this class and calculate $\bar{X}$ = the sample mean. Note that $\bar{X}$ is a random variable and therefore has a probability distribution. Let's find the probability distribution of $\bar{X}$ as well as its mean, or expected value, $E[\bar{X}]$.

Find the probability distribution and expected value of $\bar{X}$ when n = 3. How does the expected value of the sample mean $\bar{X}$ compare to the population mean µ?
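
One way to answer this is brute-force enumeration. The sketch below assumes a simple random sample drawn without replacement, so each of the 10 possible groups of 3 students is equally likely. It tabulates the distribution of the sample mean and its expected value using the ages of the five students in the class.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations

ages = {"A": 21, "B": 19, "C": 21, "D": 21, "E": 18}
mu = Fraction(sum(ages.values()), len(ages))  # population mean

# Every unordered sample of 3 students (assumed equally likely).
samples = list(combinations(ages, 3))
means = [Fraction(sum(ages[p] for p in s), 3) for s in samples]

# Probability distribution of the sample mean.
counts = Counter(means)
distribution = {m: Fraction(c, len(samples)) for m, c in sorted(counts.items())}
expected = sum(m * p for m, p in distribution.items())

for m, p in distribution.items():
    print(f"sample mean = {float(m):.3f}  probability = {p}")
print("population mean =", mu)
print("E[sample mean]  =", expected)
```

The expected value of the sample mean comes out equal to the population mean, even though no single sample is guaranteed to hit it.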

Mean and Standard Deviation of the Sampling Distribution of $\bar{X}$

For a random sample of size n from a population having mean µ and standard deviation σ, the sampling distribution of the sample mean $\bar{X}$ has expected value equal to the population mean µ and a standard deviation of $\sigma/\sqrt{n}$. Therefore, as n increases, the standard deviation of $\bar{X}$ shrinks and the sample mean tends to fall closer and closer to the population mean µ.
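
The $\sigma/\sqrt{n}$ formula can be illustrated by simulation. This sketch assumes independent rolls of a fair die as the population (so σ = sqrt(35/12)). The sample sizes, the number of repetitions, and the helper name `sd_of_sample_means` are arbitrary illustrative choices.

```python
import math
import random
import statistics

rng = random.Random(11)        # arbitrary seed for repeatability
sigma = math.sqrt(35 / 12)     # SD of a single fair die roll

def sd_of_sample_means(n, reps=5000):
    """Simulate reps samples of n die rolls; return the SD of their means."""
    means = [sum(rng.randint(1, 6) for _ in range(n)) / n for _ in range(reps)]
    return statistics.pstdev(means)

sd_results = {n: sd_of_sample_means(n) for n in (4, 16, 64)}

for n, sd in sd_results.items():
    print(f"n = {n:>2}: simulated SD = {sd:.3f}, sigma/sqrt(n) = {sigma / math.sqrt(n):.3f}")
```

Quadrupling the sample size should roughly halve the standard deviation of the sample mean.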

What about the shape of the distribution? If the population is normally distributed, then $\bar{X}_n$ is always normally distributed. Amazingly, when n is large this is approximately the case regardless of the distribution of the population.

The Central Limit Theorem (CLT)

For a random sample of size n from a population having mean µ and standard deviation σ, and any distribution (shape), as the sample size n increases, the sampling distribution of the sample mean $\bar{X}_n$ approaches a normal distribution. In other words, for large n it is approximately true that

$\bar{X}_n \sim N\left(\mu, \frac{\sigma}{\sqrt{n}}\right)$

In practice, the normal approximation from the CLT is usually considered adequate when either the population is normally distributed or n > 30.

Example

Additional examples of the Central Limit Theorem: http://www.chem.uoa.gr/applets/appletcentrallimit/appl_centrallimit2.html
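
To see the CLT at work on a clearly non-normal population, the sketch below draws samples of size n = 40 from an exponential distribution, which is strongly right-skewed. It then checks how often the sample mean lands within one and two standard deviations ($\sigma/\sqrt{n}$) of µ. The choice of population, sample size, and seed are assumptions for illustration only.

```python
import math
import random

rng = random.Random(3)     # arbitrary seed for repeatability
mu, sigma = 1.0, 1.0       # an exponential with rate 1 has mean 1 and SD 1
n, reps = 40, 10000
se = sigma / math.sqrt(n)  # SD of the sampling distribution of the mean

# Simulate many sample means from the skewed population.
means = [sum(rng.expovariate(1.0) for _ in range(n)) / n for _ in range(reps)]

within_1 = sum(abs(m - mu) <= se for m in means) / reps
within_2 = sum(abs(m - mu) <= 2 * se for m in means) / reps

print(f"within 1 SD of mu: {within_1:.3f}")
print(f"within 2 SD of mu: {within_2:.3f}")
```

If the normal approximation is working, these fractions should be close to 0.68 and 0.95.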

The Central Limit Theorem Helps Us Make Inferences

Remember the Empirical Rule? If a distribution is approximately bell-shaped (normal), then the percentages of observations falling within one, two, and three standard deviations of the mean are approximately 68%, 95%, and 99.7%, respectively.

Therefore, if the sampling distribution of $\bar{X}_n$ is (approximately) normal, then the sample mean $\bar{x}$ falls within 1 standard deviation of the population mean µ about 68% of the time, within 2 standard deviations about 95% of the time, and within 3 standard deviations almost all of the time. Here the standard deviation is that of the sampling distribution, $\sigma/\sqrt{n}$.
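
The 68%, 95%, and 99.7% figures are rounded areas under the standard normal curve. As a quick check, this sketch computes them directly with Python's `statistics.NormalDist`. The variable names are illustrative.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mean 0, SD 1

# Area within k standard deviations of the mean, for k = 1, 2, 3.
coverage = {k: Z.cdf(k) - Z.cdf(-k) for k in (1, 2, 3)}

for k, p in coverage.items():
    print(f"within {k} SD: {100 * p:.2f}%")
```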