Chapter 4 Probability and Probability Distributions. Sections


Chapter 4 Probability and Probability Distributions Sections 4.6-4.10

Sec 4.6 - Variables
Variable: takes on different values (or attributes)
Random variable: a variable whose value cannot be predicted with certainty
Random Variables
- Qualitative, e.g. political affiliation, color preference, gender
- Quantitative: measurable, numeric outcomes
  - Discrete, e.g. # heads tossed, enrollment
  - Continuous, e.g. age of marriage, income tax return amounts, height
Recall: We want to know the probability of observing a particular sample

4.7 Probability Distributions for Discrete RVs
Discrete random variable: a quantitative random variable that can only assume a countable number of values
What is the probability associated with each value of the variable, y?
Probability distribution of y: theoretical relative frequencies obtained from the probabilities for each value of y
The probability distribution for a discrete r.v., y, displays the probability P(y) associated with each value of y.

Probability Distributions - Discrete RVs
Example. Consider the tossing of 2 coins, and define the variable, y, to be the number of heads observed. Possible values of y: 0, 1, 2.
Suppose that empirical sampling yields the following:

y    freq
0    129
1    242
2    129

Empirical probability distribution of y:

y    freq    rel. freq
0    129     0.258
1    242     0.484
2    129     0.258

Theoretical probability distribution of y:

y    P(y)
0    0.25
1    0.50
2    0.25

**Theoretical and empirical probability distributions
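The comparison between the theoretical and empirical distributions can be sketched in a few lines of Python. This is an illustration only: the seed and the 500 simulated tosses are assumptions chosen for the sketch, so the empirical column will not reproduce the frequencies on the slide.

```python
import random
from collections import Counter
from itertools import product

# Theoretical distribution: enumerate the 4 equally likely outcomes of 2 fair coins.
outcomes = list(product("HT", repeat=2))
head_counts = Counter(o.count("H") for o in outcomes)
theoretical = {y: head_counts[y] / len(outcomes) for y in (0, 1, 2)}

# Empirical distribution: simulate tossing 2 coins many times (seed is arbitrary).
rng = random.Random(0)
n_trials = 500
counts = Counter(sum(rng.random() < 0.5 for _ in range(2)) for _ in range(n_trials))
empirical = {y: counts[y] / n_trials for y in (0, 1, 2)}

print("y  theoretical  empirical")
for y in (0, 1, 2):
    print(f"{y}  {theoretical[y]:.3f}        {empirical[y]:.3f}")
```

With more simulated tosses the empirical column should move closer to the theoretical one.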

4.9 Probability Distributions for Continuous RVs
Continuous random variable: quantitative; the variable assumes values on an interval, with uncountably many possible values
Example. Consider the random variable, y, that is the height of 18 year old males in the US. The following is sample data collected from 400 individuals:
5.4959 5.507 5.559 5.5698 5.5446 5.4464 5.884 5.837 5.4901 5.4569 5.18 5.6931 4.93 5.9798 5.0576 6.478 6.1558 6.6181 6.0048 6.1135 5.1775 6.184 6.378 6.0983 6.0165 6.1591 5.4195 5.5411 5.7411 5.6197 5.341 5.8045 5.665 6.033 5.8679 5.9166 6.0485 5.1919 5.8154 5.0156 5.55 5.781 5.355 5.6197 5.341 6.1074 5.6618 5.8685 5.848 5.4685 5.758 5.683 5.7863 5.4616 5.718 5.854 5.8888 5.6631 6.4617 5.8419 5.5149 5.76 5.4401 6.809 5.834 6.0809 4.9667 5.941 6.718 5.5195 5.5634 5.1731 6.311 5.7405 5.7851 5.514 6.07 5.0959 5.5863 5.55 5.8677 5.3949 5.8159 5.3006 5.7134 5.6737 6.084 5.656 6.316 6.0855 6.1686 5.436 5.4665 6.5448 5.9669 5.7581 5.806 6.0079 5.3411 5.9654 6.0338 6.063 5.0646 6.3141 6.059 5.6471 5.764 6.345 5.3717 5.19 5.9169 5.944 5.4851 5.47 5.6306 5.716 5.7367 5.748 6.66 5.1307 5.7611 5.196 5.847 5.718 5.9569 5.4853 5.0979 5.8701 5.687 5.6347 5.158 5.8158 5.1913 5.8076 4.9118 5.847 5.6585 5.4951 5.814 5.6896 6.0666 5.5501 5.5753 6.0568 5.084 5.9461 6.066 5.177 4.9793 5.618 5.4857 6.163 5.6608 6.1057 5.619 5.551 5.7406 5.758 5.4758 5.438 5.445 6.0701 5.469 5.855 5.5485 6.0436 5.806 6.656 6.0661 5.743 5.8049 6.104 5.651 5.635 5.7107 5.130 5.95 6.1118 5.903 5.3639 6.0563 5.581 5.443 6.666 5.661 5.6967 5.847 5.4449 5.5194 5.6584 6.1407 5.941 6.1833 4.8951 5.785 5.5433 5.857 5.9 6.0596 5.954 6.0389 5.849 5.531 6.1674 5.8486 5.88 5.6159 5.665 6.085 5.445 5.764 4.9846 5.148 6.4544 5.8351 6.3308 6.109 5.6398 5.6678 5.5356 5.8694 5.6393 5.5884 6.0101 6.01 6.048 5.7914 5.877 6.1343 5.7689 5.7496 5.9386 5.5588 5.88 6.054 6.193 5.4785 5.8039 5.7008 6.4147 5.8676 6.0046 5.740 5.7745 5.8013 6.1333 4.8571 4.9746 5.9478 5.7179 5.79 6.17 5.8119 5.799 5.7891 5.6666
6.1177 5.9385 5.5016 5.9354 5.657 6.1379 6.3875 5.785 6.071 5.8701 5.7518 5.597 5.975 5.8168 6.018 5.7141 5.7858 5.734 5.1043 5.7719 6.1106 5.4786 5.7649 5.8087 5.5939 4.88 6.117 5.1014 5.087 5.496 5.986 6.0805 5.816 5.95 5.5037 6.0471 5.3983 5.817 5.8639 5.4055 5.7776 6.4469 5.5847 5.936 6.0166 5.3819 5.5075 5.6116 6.183 5.5771 6.01 5.9787 5.9914 5.7378 6.136 6.947 5.593 6.155 5.4893 5.0933 5.576 5.1963 5.989 6.3131 5.5738 6.0115 6.1356 5.8364 6.63 6.1083 6.147 5.613 5.9585 5.561 5.931 6.116 6.0367 5.0873 6.0336 5.97 6.0865 5.113 5.6348 5.9155 5.8398 5.831 5.765 5.9536 5.8978 5.9475 6.014 5.8874 6.0786 5.7364 5.7579 5.813 6.0458 5.8416 5.8506 5.436 5.6194 6.434 5.794 4.8988 5.6871 5.87 5.968 6.3543 6.086 5.4783 6.0511 5.0799 5.888 5.4756 5.764 5.457 6.1518 5.734 5.8335 5.863 5.691 5.3864 5.5351 6.3403

Probability Distribution for Continuous RV
Example (ctd). The variable values have to be binned, which gives a relative frequency histogram. The interval lengths and numbers of bins can be refined.
[Figures: relative frequency histograms of the sample, one with 18 bins and one with 40 bins]
With more data, and finer binning, the histogram outline will approach a smooth curve.

1000 data points. A smooth curve outline appears to be emerging. The smooth curve is the probability distribution associated with the variable y, the height of an 18 yr old male in the US.
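The binning step can be sketched as follows. The heights here are simulated stand-ins, not the sample listed on the slides: drawing them from a normal distribution with mean 5.75 ft and standard deviation 0.4 ft is an assumption borrowed from the percentile example later in these notes, and the helper name relative_frequencies is hypothetical.

```python
import random

def relative_frequencies(values, n_bins):
    """Bin values into n_bins equal-width intervals.

    Returns (left edges of the bins, relative frequency of each bin).
    """
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for v in values:
        # The maximum value sits exactly on the right edge; keep it in the last bin.
        i = min(int((v - lo) / width), n_bins - 1)
        counts[i] += 1
    edges = [lo + i * width for i in range(n_bins)]
    return edges, [c / len(values) for c in counts]

rng = random.Random(1)
heights = [rng.gauss(5.75, 0.4) for _ in range(1000)]  # simulated, hypothetical data

for n_bins in (18, 40):
    edges, rel = relative_frequencies(heights, n_bins)
    print(n_bins, "bins: relative frequencies sum to", round(sum(rel), 6),
          "; tallest bar =", round(max(rel), 3))
```

Whatever the number of bins, the relative frequencies sum to 1, which is why areas of the histogram can be read as probabilities.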

Discrete and Continuous Probability Distributions
Probability distributions provide a means of quantifying the probability of obtaining a certain sample outcome.
Note: Probabilities are equal to the fraction of the total histogram area corresponding to the values of interest
Discrete case:
1. Probability of observing two heads when a coin is tossed two times is 0.25.
2. Probability of observing at least one head is 0.5 + 0.25 = 0.75
3. Probability of observing either no heads or two heads is 0.25 + 0.25 = 0.5

Discrete and Continuous Probability Distributions
Continuous case:
1. Does it make sense to ask what is the probability that an 18 y.o. male is 5'10"? NO. For a continuous variable, the probability of any single exact value is zero.
2. Note: The distribution plot was created using relative frequencies, so the total area under the plot is 1.
3. We compute the probability of a value falling in a certain range of values by computing the area that lies under the distribution plot, over that range.
The probability that an 18 y.o. male has a height that lies between 5.7 and 5.8 feet is approx 0.1.
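A small sketch of why only ranges carry probability in the continuous case. It assumes the height curve is a normal distribution with mean 5.75 ft and standard deviation 0.4 ft; those parameters are taken from the percentile example later in these notes and are not stated on this slide.

```python
from statistics import NormalDist

# Assumed model (not stated on this slide): heights ~ Normal(mean 5.75 ft, sd 0.4 ft).
heights = NormalDist(mu=5.75, sigma=0.4)

def prob_between(dist, a, b):
    """Area under the density of dist between a and b."""
    return dist.cdf(b) - dist.cdf(a)

target = 5 + 10 / 12  # 5 ft 10 in, expressed in feet

# Shrinking the interval around 5'10" drives the probability toward zero.
for half_width in (0.05, 0.005, 0.0005):
    p = prob_between(heights, target - half_width, target + half_width)
    print(f"P({target - half_width:.4f} < y < {target + half_width:.4f}) = {p:.6f}")

print(f"P(5.7 < y < 5.8) = {prob_between(heights, 5.7, 5.8):.3f}")
```

Under this assumed model the last line comes out close to the 0.1 quoted above.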

Half-way Summary
So far:
1. How to create probability distributions from empirical/theoretical discrete and continuous random variables.
2. How to determine probabilities of a variable attaining a certain value (discrete) or attaining a value that lies within a certain range (continuous).
3. Why is this useful? (Q: what is the probability of obtaining a particular sample?)
4. Some common known distributions: binomial (discrete), normal (continuous), t-distribution (continuous), chi-squared (continuous)
5. Can make assumptions about the type of distribution associated with particular populations of interest: one of the known distributions
6. Can determine features of the underlying distributions by simulation, other empirical observations

The Binomial Distribution - Discrete
Binomial Distribution properties:
1. experiment has n identical trials
2. each trial is either a success or failure (2 possible outcomes)
3. P(success) = π for every trial, fixed
4. trials are independent: the outcome of one trial does not affect the outcome of any other(s)
5. variable, y = # of successes in the n trials
Examples.
1. y = # heads when a coin is tossed n times (success = heads)
2. y = # light bulbs that fail inspection when n selected from a batch are tested (success = failed inspection)
3. y = # of people who test positive for a bacterial infection out of n who have been exposed to the bacteria (success = positive test result)

The Binomial Distribution (ctd)
P(y) = probability of obtaining y successes in n trials of a binomial experiment
Example (Computing P(y)). Suppose there is a 25% chance that a pregnancy test fails. What is the probability that out of a sample of 5 tests, all 5 fail? i.e. What is P(5)?
P(5) = P(the 1st test fails and the 2nd test fails and the 3rd test fails and ... and the 5th test fails)
     = (0.25) * (0.25) * ... * (0.25) = $(0.25)^5$ = 0.000977
Now, what is P(2)?

The Binomial Distribution (ctd)
What is P(2)?
P(2) = P(1st fails and 2nd fails and rest don't OR 1st fails and 3rd fails and rest don't OR ...)
P(2) = (0.25)(0.25)(0.75)(0.75)(0.75) + (0.25)(0.75)(0.25)(0.75)(0.75) + ... + (0.75)(0.75)(0.75)(0.25)(0.25)
$$P(2) = \binom{5}{2}(0.25)^2(0.75)^3 = \frac{5!}{3!\,2!}\,(0.25)^2(0.75)^3 = 0.2637$$
P(2) = (# ways to select 2 failing tests out of 5) * (probability of 2 tests failing) * (probability of 3 tests not failing) = 5C2 * $0.25^2$ * $0.75^3$
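The "OR" expansion above can be checked by brute force. The sketch below lists every pass/fail sequence of 5 tests, keeps those with exactly 2 failures, and adds up their probabilities; variable names are illustrative.

```python
from itertools import product
from math import comb

p_fail = 0.25
n = 5

# Sum the probability of every length-5 sequence with exactly 2 failures.
total = 0.0
n_sequences = 0
for seq in product((True, False), repeat=n):  # True = test fails
    if sum(seq) == 2:
        n_sequences += 1
        prob = 1.0
        for failed in seq:
            prob *= p_fail if failed else 1 - p_fail
        total += prob

# Shortcut: (number of sequences) * (probability of any one such sequence).
shortcut = comb(n, 2) * p_fail**2 * (1 - p_fail) ** 3

print(n_sequences, round(total, 4), round(shortcut, 4))
```

Every qualifying sequence has the same probability, which is why counting the sequences and multiplying gives the same answer as the long sum.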

The Binomial Distribution (ctd)
Probability of y successes in n trials of a binomial experiment:
$$P(y) = \frac{n!}{y!\,(n-y)!}\,\pi^{y}(1-\pi)^{n-y}$$
y = # successes in n trials
n = # trials
π = probability of success on a single trial
Mean and Standard Deviation of the Binomial Distribution:
Mean: $\mu = n\pi$
Standard Deviation: $\sigma = \sqrt{n\pi(1-\pi)}$
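A minimal sketch of the formula and the two summary measures in Python. The function names are hypothetical; the check at the end compares the closed-form mean and standard deviation with their definitions as weighted sums over the distribution.

```python
from math import comb, sqrt

def binomial_pmf(y, n, pi):
    """P(y): probability of exactly y successes in n independent trials."""
    return comb(n, y) * pi**y * (1 - pi) ** (n - y)

def binomial_mean_sd(n, pi):
    """Mean n*pi and standard deviation sqrt(n*pi*(1-pi))."""
    return n * pi, sqrt(n * pi * (1 - pi))

n, pi = 5, 0.25  # the 5-test example with a 25% failure chance
pmf = [binomial_pmf(y, n, pi) for y in range(n + 1)]
mean, sd = binomial_mean_sd(n, pi)

# Cross-check the closed forms against the definitions of mean and variance.
mean_from_pmf = sum(y * p for y, p in enumerate(pmf))
var_from_pmf = sum((y - mean) ** 2 * p for y, p in enumerate(pmf))

print("sum of P(y):", round(sum(pmf), 10))
print("mean:", mean, "vs", round(mean_from_pmf, 10))
print("sd:  ", round(sd, 4), "vs", round(sqrt(var_from_pmf), 4))
```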

The Binomial Distribution (ctd)
Example. What is the probability that 6 out of 20 tests fail, if the probability that any one test fails is 25%?
Success = test fails. So, π = 0.25, n = 20, y = 6
$$P(6) = \frac{20!}{6!\,14!}\,(0.25)^6(0.75)^{14} = \frac{20 \cdot 19 \cdot 18 \cdot 17 \cdot 16 \cdot 15}{6 \cdot 5 \cdot 4 \cdot 3 \cdot 2 \cdot 1}\,(0.25)^6(0.75)^{14} = 0.1686$$
What are the mean and standard deviation of this distribution?
$\mu = 20 \cdot 0.25 = 5$
$\sigma = \sqrt{20 \cdot 0.25\,(0.75)} = 1.94$
Note: P(y ≥ 7) = P(7) + P(8) + P(9) + ... + P(20) = 1 - P(y ≤ 6)
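The complement rule in the note saves work: 7 terms are summed instead of 14. A short sketch, with variable names chosen for illustration, that computes the tail both ways:

```python
from math import comb, sqrt

def binomial_pmf(y, n, pi):
    """Probability of exactly y successes in n independent trials."""
    return comb(n, y) * pi**y * (1 - pi) ** (n - y)

n, pi = 20, 0.25
p6 = binomial_pmf(6, n, pi)
mean = n * pi
sd = sqrt(n * pi * (1 - pi))

# Complement rule: P(y >= 7) = 1 - P(y <= 6).
p_at_most_6 = sum(binomial_pmf(y, n, pi) for y in range(0, 7))
p_at_least_7 = 1 - p_at_most_6

# Direct sum P(7) + P(8) + ... + P(20), for comparison.
p_at_least_7_direct = sum(binomial_pmf(y, n, pi) for y in range(7, n + 1))

print("P(6)      =", round(p6, 4))
print("mean, sd  =", mean, round(sd, 2))
print("P(y >= 7) =", round(p_at_least_7, 4), "(complement),",
      round(p_at_least_7_direct, 4), "(direct)")
```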

The Normal Distribution - Continuous
Bell-shaped curve, symmetric about the mean
Numerous continuous random variables have a normal distribution, e.g. test scores, weight, 100m sprint times
Normal curve is defined by μ and σ
Empirical rule holds: approx 68% of the population lies within ± 1σ of μ
P(y1 < y < y2) = area under the normal curve between y = y1 and y = y2
Normal curve, f(y):
$$f(y) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(y-\mu)^2}{2\sigma^2}}$$
(in this formula π is the constant 3.14159..., not the binomial success probability)
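The empirical rule can be checked numerically with Python's standard library. The helper name share_within is hypothetical, and the 2σ and 3σ rows go slightly beyond what the slide states; they are the usual companions of the 68% figure.

```python
from statistics import NormalDist

def share_within(k, mu=0.0, sigma=1.0):
    """Fraction of a normal population lying within k standard deviations of the mean."""
    dist = NormalDist(mu, sigma)
    return dist.cdf(mu + k * sigma) - dist.cdf(mu - k * sigma)

for k in (1, 2, 3):
    print(f"within {k} sd of the mean: {share_within(k):.4f}")
```

The result does not depend on μ or σ, which is why one rule covers every normal curve.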

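The Normal Distribution
Computing probabilities for normally distributed populations:
$$f(y) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(y-\mu)^2}{2\sigma^2}}$$
$$P(y_1 \le y < y_2) = \int_{y_1}^{y_2} f(y)\,dy = \int_{y_1}^{y_2} \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(y-\mu)^2}{2\sigma^2}}\,dy$$
P(5.5 ≤ y < 5.7) = 0.1844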
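The integral has no closed form, but it is easy to approximate numerically. The sketch below uses the midpoint rule. The parameters μ = 5.75 and σ = 0.4 are an assumption borrowed from the percentile example later in these notes, since this slide does not state them; with those values the result lands close to the 0.1844 quoted above.

```python
from math import exp, pi, sqrt

def normal_pdf(y, mu, sigma):
    """Normal density f(y) with mean mu and standard deviation sigma."""
    return exp(-((y - mu) ** 2) / (2 * sigma**2)) / (sigma * sqrt(2 * pi))

def integrate(f, a, b, n=10_000):
    """Midpoint-rule approximation of the integral of f from a to b."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

mu, sigma = 5.75, 0.4  # assumed parameters, not given on this slide
p = integrate(lambda y: normal_pdf(y, mu, sigma), 5.5, 5.7)
print("P(5.5 <= y < 5.7) is approximately", round(p, 4))

# Sanity check: the area over a very wide range should be essentially 1.
print("total area:", round(integrate(lambda y: normal_pdf(y, mu, sigma), mu - 8, mu + 8), 6))
```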

The Normal Distribution - Standard Normal
Computing probabilities (ctd):
- Normal curves vary by variable values (x-axis) and depend on μ and σ, but are identical in shape
- Standard normal distribution: μ = 0 and σ = 1
- Tables exist for areas under this graph (Table 1, Appendix of text)
- In a standard normal distribution, the x-axis values are known as z-values
z-values between z = 0.5 and z = 1.1 are measurements that lie between 0.5 and 1.1 standard deviations away from the mean of 0.

The Normal Distribution - Reading from the table
Table 1 contains areas under the standard normal curve that lie to the left of a particular z-value, i.e. reading the entry corresponding to z1 we obtain P(z < z1).
[Figures: shaded areas P(z < 0.5), P(z < 1.1) and P(0.5 ≤ z < 1.1) under the standard normal curve, with z-values on the horizontal axis]
So P(0.5 ≤ z < 1.1) = P(z < 1.1) - P(z < 0.5) = 0.8643 - 0.6915 = 0.1728
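A table entry can be reproduced from the error function in Python's math module. The helper left_area is a hypothetical name, and rounding to 4 decimals is an assumption meant to mimic a typical printed table.

```python
from math import erf, sqrt

def left_area(z):
    """Area under the standard normal curve to the left of z, rounded to 4 decimals."""
    return round(0.5 * (1 + erf(z / sqrt(2))), 4)

p_low = left_area(0.5)   # table entry for z = 0.5
p_high = left_area(1.1)  # table entry for z = 1.1
print(p_high, "-", p_low, "=", round(p_high - p_low, 4))
```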

The Normal Distribution - Z-scores
We can use Table 1 for arbitrary normal distributions, as long as μ and σ are known. This is done by standardizing the measurement values, y, to standard normal values known as z-scores:
$$z = \frac{y - \mu}{\sigma}$$
Example. Consider a normal distribution with μ = 25 and σ = 3.5. Compute the probability that the value of a measurement lies between 27 and 30.
$$P(27 \le y \le 30) = P\left(\frac{27-25}{3.5} \le z \le \frac{30-25}{3.5}\right) = P(z \le 1.4286) - P(z \le 0.5714) = 0.9236 - 0.7157 = 0.2079$$
There is a 20.79% probability that y takes a value between 27 and 30.
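The standardizing step can be sketched as below. Two answers are computed: a table-style one, which rounds each z-score to 2 decimals and each area to 4 decimals as a printed table forces you to do (an assumption about how Table 1 is laid out), and an unrounded one. The helper name z_score is hypothetical.

```python
from statistics import NormalDist

def z_score(y, mu, sigma):
    """Number of standard deviations that y lies from the mean."""
    return (y - mu) / sigma

mu, sigma = 25, 3.5
z_lo = z_score(27, mu, sigma)
z_hi = z_score(30, mu, sigma)

std_normal = NormalDist()  # mean 0, standard deviation 1

# Table-style lookup: z rounded to 2 decimals, areas rounded to 4 decimals.
table_style = (round(std_normal.cdf(round(z_hi, 2)), 4)
               - round(std_normal.cdf(round(z_lo, 2)), 4))

# Same probability without any rounding of z.
exact = NormalDist(mu, sigma).cdf(30) - NormalDist(mu, sigma).cdf(27)

print("z-scores:", round(z_lo, 4), round(z_hi, 4))
print("table-style:", round(table_style, 4), " unrounded:", round(exact, 4))
```

The small gap between the two results comes only from rounding the z-scores before the lookup.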

The Normal Distribution - Percentiles
Def: The 100pth percentile of a distribution is the value y_p such that 100p% of the population values lie below y_p and 100(1-p)% lie above y_p.
To find percentiles of the standard normal distribution, do a reverse lookup of Table 1.
Example. Find the 33rd percentile of the standard normal distribution.
Need to find z_p such that 100p% of values lie below z_p, i.e. find z_p such that P(z ≤ z_p) = 33%
From Table 1: z_p = -0.44
So, the 33rd percentile is -0.44

The Normal Distribution - Percentiles
To apply this idea to general normal distributions, we do a reverse standardizing:
The 100pth percentile is y_p such that 100p% of measurements lie below y_p, i.e. P(y ≤ y_p) = 100p%
We can find the z-score associated with 100p%, and convert it back to y-values using:
$$y_p = \mu + z_p\,\sigma$$
Example. For the normal distribution with μ = 5.75 and σ = 0.4, find the 40th percentile.
From Table 1, z_p = -0.25
y_p = 5.75 + (-0.25)*0.4 = 5.65
The 40th percentile of this distribution is 5.65.
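The reverse lookup corresponds to the inverse of the cumulative distribution function. A minimal sketch using the standard library, where the helper name percentile is hypothetical:

```python
from statistics import NormalDist

def percentile(p, mu=0.0, sigma=1.0):
    """100p-th percentile of a normal distribution via reverse standardizing."""
    z_p = NormalDist().inv_cdf(p)  # reverse lookup on the standard normal curve
    return mu + z_p * sigma        # convert the z-score back to a y-value

z_33 = percentile(0.33)                      # 33rd percentile, standard normal
y_40 = percentile(0.40, mu=5.75, sigma=0.4)  # 40th percentile of the example

print("33rd percentile of the standard normal:", round(z_33, 2))
print("40th percentile for mu = 5.75, sigma = 0.4:", round(y_40, 2))
```

The unrounded z-scores differ slightly from the table values, but both results agree with the worked examples once rounded to 2 decimals.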