ME3620. Theory of Engineering Experimentation. Spring Chapter III. Random Variables and Probability Distributions.



3.2 Random Variables
In an experiment, a measurement is usually denoted by a variable such as X. In a random experiment, a variable whose measured value can change from one replicate of the experiment to another is referred to as a random variable.


3.3 Probability
Probability is used to quantify likelihood or chance, and to represent risk or uncertainty in engineering applications. Probability statements describe the likelihood that particular values occur. The likelihood is quantified by assigning a number from the interval [0, 1] (or a percentage from 0 to 100%) to the set of values. Higher numbers indicate that the set of values is more likely. A probability is usually expressed in terms of a random variable.

Complement of an Event: Given a set E, the complement of E is the set of elements that are not in E. The complement is denoted as E′.
Mutually Exclusive Events: The sets E_1, E_2, ..., E_k are mutually exclusive if the intersection of any pair is empty. That is, no element belongs to more than one of the sets E_1, E_2, ..., E_k.
Probability Properties: In the properties that follow, X represents the measured value of a variable.

Probability Properties
1: P(X ∈ R) = 1, where R is the set of real numbers. This property states that the maximum value of a probability is 1.
2: 0 ≤ P(X ∈ E) ≤ 1 for any set E. The probability of any event cannot be negative.
3: If E_1, E_2, ..., E_k are mutually exclusive sets, then P(X ∈ E_1 ∪ E_2 ∪ ... ∪ E_k) = P(X ∈ E_1) + P(X ∈ E_2) + ... + P(X ∈ E_k). This property states that the proportion of measurements that fall in E_1 ∪ E_2 ∪ ... ∪ E_k is the sum of the proportions that fall in E_1, E_2, ..., and E_k whenever the sets are mutually exclusive.
For example, P(X ≤ 10) = P(X ≤ 0) + P(0 < X ≤ 5) + P(5 < X ≤ 10).
Also, P(X ∈ E′) = 1 − P(X ∈ E). For example, P(X ≤ 2) = 1 − P(X > 2). In general, for any fixed number x: P(X ≤ x) = 1 − P(X > x).

Example: The homework scores (%) of a given assignment are listed in the second and fourth columns of the following data set.

No.  Score       No.  Score
 1   33.5625     18   82.375
 2   54.21875    19   82.875
 3   61.8125     20   83.375
 4   63.6875     21   84.3125
 5   65.1875     22   84.8125
 6   66.8125     23   85.4375
 7   67.6875     24   85.6875
 8   71.84375    25   86.5
 9   73.65625    26   87.4375
10   74.75       27   87.75
11   76.75       28   88.70625
12   78.6875     29   89.375
13   78.75       30   90.4375
14   80.875      31   92.25
15   81.3125     32   92.4375
16   81.9375     33   96.0625
17   82.25       34   96.4375

a) What is the probability of a student getting 80% or below?
b) What is the probability of a student getting 86% or below?
c) What is the probability of a student getting between 80% and 86%?
d) What is the probability of a student getting a score larger than 86%?
Example. Problem 3-13.
Example. Problem 3-17.
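One way to answer the four questions is to treat each probability as the relative frequency of scores in the stated range. The Python sketch below works under that assumption; the helper name `prob` is illustrative and does not come from the slides.

```python
# Homework scores from the data set (34 students).
scores = [
    33.5625, 54.21875, 61.8125, 63.6875, 65.1875, 66.8125, 67.6875,
    71.84375, 73.65625, 74.75, 76.75, 78.6875, 78.75, 80.875, 81.3125,
    81.9375, 82.25, 82.375, 82.875, 83.375, 84.3125, 84.8125, 85.4375,
    85.6875, 86.5, 87.4375, 87.75, 88.70625, 89.375, 90.4375, 92.25,
    92.4375, 96.0625, 96.4375,
]
n = len(scores)

def prob(condition):
    """Relative frequency of the scores that satisfy `condition`."""
    return sum(1 for s in scores if condition(s)) / n

p_le_80 = prob(lambda s: s <= 80)   # a) P(X <= 80)
p_le_86 = prob(lambda s: s <= 86)   # b) P(X <= 86)
p_between = p_le_86 - p_le_80       # c) P(80 < X <= 86)
p_gt_86 = 1 - p_le_86               # d) P(X > 86) = 1 - P(X <= 86)

print(f"a) {p_le_80:.4f}  b) {p_le_86:.4f}  c) {p_between:.4f}  d) {p_gt_86:.4f}")
```

Parts c) and d) reuse the additivity and complement properties of probability rather than recounting the data.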

3-4 Continuous Random Variables
3-4.1 Probability Density Function (pdf)
The probability distribution, or simply distribution, of a random variable X is a description of the set of probabilities associated with the possible values of X. The probability density function (pdf), f(x), is used to describe the probability distribution of a continuous random variable X. The probability that X is between a and b is determined as the integral of f(x) from a to b:
$$P(a < X < b) = \int_a^b f(x)\,dx$$

3-4.2 Cumulative Distribution Function
Another way to describe the probability distribution of a random variable is to define a function that gives the probability that X is less than or equal to x:
$$F(x) = P(X \le x) = \int_{-\infty}^{x} f(u)\,du$$

Example 3-3

Example 3-4
a) Determine the cumulative distribution function (cdf).
b) Determine the probability that the distance to the first surface flaw is less than 1000 μm.
c) Determine the probability that the distance to the first surface flaw exceeds 2000 μm.
d) Determine the probability that the distance to the first surface flaw is between 1000 μm and 2000 μm.

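3-4.3 Mean and Variance
For sample data x_1, x_2, ..., x_n, the sample mean (x̄) is determined by:
$$\bar{x} = \frac{x_1 + x_2 + \cdots + x_n}{n} = \frac{\sum_{i=1}^{n} x_i}{n}$$
For sample data x_1, x_2, ..., x_n, the sample variance (s²) is a measure of the dispersion or scatter in the data:
$$s^2 = \frac{(x_1-\bar{x})^2 + (x_2-\bar{x})^2 + \cdots + (x_n-\bar{x})^2}{n-1} = \frac{\sum_{i=1}^{n}(x_i-\bar{x})^2}{n-1}$$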

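For sample data x_1, x_2, ..., x_n, the sample variance (s²) can equivalently be computed as:
$$s^2 = \frac{\sum_{i=1}^{n}(x_i-\bar{x})^2}{n-1} = \frac{\sum_{i=1}^{n} x_i^2 - \dfrac{\left(\sum_{i=1}^{n} x_i\right)^2}{n}}{n-1}$$
The population variance [σ², or V(X)] of a continuous random variable X with pdf f(x) and mean μ is:
$$\sigma^2 = V(X) = \int_{-\infty}^{\infty} (x-\mu)^2 f(x)\,dx$$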

Example: The homework scores (%) of a given assignment are listed in the second and fourth columns of the following data set.

No.  Score       No.  Score
 1   33.5625     18   82.375
 2   54.21875    19   82.875
 3   61.8125     20   83.375
 4   63.6875     21   84.3125
 5   65.1875     22   84.8125
 6   66.8125     23   85.4375
 7   67.6875     24   85.6875
 8   71.84375    25   86.5
 9   73.65625    26   87.4375
10   74.75       27   87.75
11   76.75       28   88.70625
12   78.6875     29   89.375
13   78.75       30   90.4375
14   80.875      31   92.25
15   81.3125     32   92.4375
16   81.9375     33   96.0625
17   82.25       34   96.4375

a) What is the mean?
b) What is the variance?
c) What is the standard deviation?
Example. Problem 3-24.
Example. Problem 3-27.
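A short Python sketch of this example follows. It assumes the 34 scores are treated as a sample, so the variance uses the n − 1 divisor; the standard-library `statistics` functions are cross-checked against the sums written out by hand.

```python
import math
import statistics

scores = [
    33.5625, 54.21875, 61.8125, 63.6875, 65.1875, 66.8125, 67.6875,
    71.84375, 73.65625, 74.75, 76.75, 78.6875, 78.75, 80.875, 81.3125,
    81.9375, 82.25, 82.375, 82.875, 83.375, 84.3125, 84.8125, 85.4375,
    85.6875, 86.5, 87.4375, 87.75, 88.70625, 89.375, 90.4375, 92.25,
    92.4375, 96.0625, 96.4375,
]
n = len(scores)

mean = statistics.mean(scores)          # a) sample mean
variance = statistics.variance(scores)  # b) sample variance (n - 1 divisor)
std_dev = statistics.stdev(scores)      # c) sample standard deviation

# Cross-check against the defining sums.
mean_by_hand = sum(scores) / n
variance_by_hand = sum((x - mean_by_hand) ** 2 for x in scores) / (n - 1)

print(f"mean = {mean:.4f}, variance = {variance:.4f}, std dev = {std_dev:.4f}")
```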

3-5 Important Continuous Distribution Functions
3-5.1 Normal Distribution
The normal distribution is also known as the Gaussian distribution, the bell curve, or the natural distribution. It is the most widely used model for the distribution of a random variable. Whenever a random experiment is replicated, the random variable that equals the average (or total) result over the replicates tends to have a normal distribution as the number of replicates becomes large.

Random variables with different means (μ) and variances (σ²) can be modeled by normal probability density functions with appropriate choices of the center and width of the curve: μ determines the center and σ² determines the width. Thus, a random variable X with probability density function
$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}, \qquad -\infty < x < \infty$$
has a normal distribution (and is called a normal random variable) with parameters μ and σ, where −∞ < μ < ∞ and σ > 0. The notation N(μ, σ²) is often used to denote a normal distribution with mean μ and variance σ².
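As an illustration, the normal pdf can be coded directly from its formula and compared with Python's standard-library `statistics.NormalDist`. The values μ = 5 and σ = 2 and the function name `normal_pdf` are assumptions chosen only for this sketch.

```python
import math
from statistics import NormalDist

def normal_pdf(x, mu, sigma):
    """Normal probability density with mean mu and standard deviation sigma."""
    coeff = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    return coeff * math.exp(-((x - mu) ** 2) / (2.0 * sigma ** 2))

mu, sigma = 5.0, 2.0  # illustrative parameters, not from the slides

# The density peaks at x = mu; compare with the library implementation.
peak = normal_pdf(mu, mu, sigma)
reference = NormalDist(mu, sigma).pdf(mu)

# Trapezoidal integration over mu +/- 8 sigma: the total area should be close to 1.
a, b, steps = mu - 8 * sigma, mu + 8 * sigma, 4000
h = (b - a) / steps
area = h * (
    0.5 * normal_pdf(a, mu, sigma)
    + sum(normal_pdf(a + i * h, mu, sigma) for i in range(1, steps))
    + 0.5 * normal_pdf(b, mu, sigma)
)

print(f"peak density = {peak:.5f}, area under curve = {area:.6f}")
```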

For example, the figure shows the plot of three random variables that follow a normal distribution. Two of them have the same mean (μ = 5) but different variances (σ²); the third has a mean of μ = 15. As mentioned previously, the variance is a measure of the scatter of the data; therefore, variables with a larger variance have a flatter, wider Gaussian curve.

The figure shows the plot (created in Excel) of two random variables that follow a normal distribution. The variables have the same mean, μ = 1, but different standard deviations, σ_1 = 4 and σ_2 = 7.
[Figure: two normal pdf curves labeled STD = 4 and STD = 7; horizontal axis from −10 to 10, vertical axis from 0 to 0.1.]

The following figure summarizes some important characteristics of a normal distribution. For a variable X that follows a normal distribution:
P(μ − σ < X < μ + σ) = 0.6827
P(μ − 2σ < X < μ + 2σ) = 0.9545
P(μ − 3σ < X < μ + 3σ) = 0.9973
μ ± 2σ ≈ 0.95: 95% confidence interval (C.I.)
μ ± 3σ ≈ 0.99: 99% C.I.
Since 0.9973 of the probability of a normal distribution lies within the interval μ − 3σ < X < μ + 3σ, 6σ is called the width of a normal distribution. The area under the curve of a normal pdf over −∞ < x < ∞ is 1.
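These interval probabilities can be checked numerically. The sketch below uses the standard normal cdf from Python's `statistics.NormalDist`; because the result does not depend on μ or σ, the standard normal (μ = 0, σ = 1) is used, and the helper name `within` is illustrative.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mu = 0, sigma = 1

def within(k):
    """P(mu - k*sigma < X < mu + k*sigma) for any normal random variable X."""
    return Z.cdf(k) - Z.cdf(-k)

p1, p2, p3 = within(1), within(2), within(3)
print(f"1 sigma: {p1:.4f}, 2 sigma: {p2:.4f}, 3 sigma: {p3:.4f}")
```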

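Standard Normal Random Variable
A normal random variable with μ = 0 and σ² = 1 is called a standard normal random variable and is denoted Z. Its cumulative distribution function is Φ(z) = P(Z ≤ z). Table 1 provides cumulative probabilities for a standard normal random variable.
Table 1. Standard Normal Cumulative Distribution Function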

Example: Determine the following probabilities using the standard normal cumulative distribution function:
a) P(Z ≤ 1.12)
b) P(Z > 1.12)
c) P(Z ≤ 0.43)
d) P(Z > 0.43)
e) P(0.06 ≤ Z ≤ 1.18)
f) Find the value of z such that P(Z ≤ z) = 0.33
g) Find the value of z such that P(Z > z) = 0.22
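In place of a printed table, the same lookups can be sketched with Python's `statistics.NormalDist`. The z values are taken exactly as they appear in the example above (all positive), which is an assumption about the original problem statement.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal random variable

p_a = Z.cdf(1.12)                 # a) P(Z <= 1.12)
p_b = 1 - Z.cdf(1.12)             # b) P(Z > 1.12), complement of a)
p_c = Z.cdf(0.43)                 # c) P(Z <= 0.43)
p_d = 1 - Z.cdf(0.43)             # d) P(Z > 0.43)
p_e = Z.cdf(1.18) - Z.cdf(0.06)   # e) P(0.06 <= Z <= 1.18)
z_f = Z.inv_cdf(0.33)             # f) z with P(Z <= z) = 0.33
z_g = Z.inv_cdf(1 - 0.22)         # g) z with P(Z > z) = 0.22, i.e. P(Z <= z) = 0.78

for label, value in zip("abcdefg", (p_a, p_b, p_c, p_d, p_e, z_f, z_g)):
    print(f"{label}) {value:.4f}")
```

Parts f) and g) are inverse lookups: the table is read from the probability back to z, which is what `inv_cdf` does.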

The variable Z, defined as
$$Z = \frac{X - \mu}{\sigma}$$
represents the distance of X from its mean in terms of standard deviations. It is important to notice that Z is a dimensionless quantity.
Example. Problem 3-55.

3-6 Probability Plots
3-6.1 Normal Probability Plots
How do we know if a normal distribution is a reasonable model for data? Probability plotting is a graphical method for determining whether sample data conform to a hypothesized distribution, based on a subjective visual examination of the data. Probability plotting typically uses special graph paper, known as probability paper, that has been designed for the hypothesized distribution. Probability paper is widely available for the normal, lognormal, Weibull, and various chi-square and gamma distributions.

To construct a probability plot:
a) Rank the data in ascending order, that is, from smallest to largest: x_1, x_2, ..., x_n, where x_1 is the smallest and x_n the largest.
b) Using the probability paper of the hypothesized distribution, plot the ordered observations x_j on the abscissa (horizontal axis) and the observed cumulative frequency (j − 0.5)/n on the ordinate (vertical axis).
c) Add a trend line.
If the hypothesized distribution adequately describes the data, the plotted points will fall approximately along a straight line. If the plotted points deviate significantly and systematically from the straight line, the hypothesized model is not appropriate.

To construct a normal probability plot using ordinary graph paper:
a) Determine a set of standardized normal scores z_j using the cumulative frequency as
$$\frac{j - 0.5}{n} = P(Z \le z_j) = \Phi(z_j)$$
For example, if (j − 0.5)/n = 0.026, then Φ(z_j) = 0.026 implies that z_j = −1.94313. Excel has a function to determine z_j: NORM.S.INV(argument).
b) Using a scatter plot (in Excel), plot the ordered observations x_j on the abscissa (horizontal axis) and the standardized normal scores z_j on the ordinate (vertical axis).
c) Add a trend line.
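The standardized normal scores in step a) can also be computed outside Excel. The sketch below uses `statistics.NormalDist().inv_cdf` as a stand-in for NORM.S.INV; the ten observations are made-up values used only to show the procedure.

```python
from statistics import NormalDist

# Hypothetical sample, for illustration only (not from the slides).
sample = [176, 183, 185, 190, 191, 192, 201, 205, 214, 220]

ordered = sorted(sample)  # rank the data from smallest to largest
n = len(ordered)
std_normal = NormalDist()

# z_j solves Phi(z_j) = (j - 0.5) / n for j = 1, ..., n.
z_scores = [std_normal.inv_cdf((j - 0.5) / n) for j in range(1, n + 1)]

# Pairs (x_j, z_j) are the points to place on the scatter plot.
for x_j, z_j in zip(ordered, z_scores):
    print(f"{x_j:6.1f}  {z_j:8.4f}")

# The worked value from the text: Phi(z) = 0.026.
z_example = std_normal.inv_cdf(0.026)
```

If the points (x_j, z_j) fall close to a straight line, a normal model is plausible for the data.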

If the hypothesized distribution adequately describes the data, the plotted points will fall approximately along a straight line. If the plotted points deviate significantly and systematically from the straight line, the hypothesized model is not appropriate.
Example 3-18.
Example. Problem 3-83.

3-8 Binomial Distribution
A trial with only two possible outcomes is frequently used as the building block of a random experiment. Such a trial is called a Bernoulli trial; examples are a coin toss, or the roll of a die in which a 4 counts as a success. The trials that constitute the random experiment are assumed to be independent, meaning that the outcome of one trial has no effect on the outcome of any other trial. Additionally, the probability of a success on each trial is constant and known.

Thus, in a binomial experiment:
1: The number of observations n is fixed.
2: Each observation is independent.
3: Each observation represents one of two outcomes ("success" or "failure").
4: The probability of "success" p is the same for each trial.
The probability distribution that describes these types of experiments is the binomial distribution.

The probability of x successes in n trials is
$$P(X = x) = \binom{n}{x} p^x (1-p)^{n-x}, \qquad x = 0, 1, \ldots, n$$
where
n is the total number of trials (observations),
x is the number of successful events,
p is the probability of a successful event in a single trial,
(1 − p) is the probability of failure of the event in a single trial, and
$$\binom{n}{x} = {}_nC_x = \frac{n!}{x!\,(n-x)!}$$
The mean and variance of a binomial distribution are μ = E(X) = np and σ² = V(X) = np(1 − p).
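A minimal Python sketch of the binomial probabilities is shown below. The values n = 10 and p = 0.3 and the function name `binomial_pmf` are assumptions for illustration; `math.comb` supplies the binomial coefficient n!/(x!(n − x)!).

```python
import math

def binomial_pmf(x, n, p):
    """P(X = x): probability of exactly x successes in n independent trials."""
    return math.comb(n, x) * p ** x * (1 - p) ** (n - x)

n, p = 10, 0.3  # illustrative values, not from the slides

pmf = [binomial_pmf(x, n, p) for x in range(n + 1)]

total = sum(pmf)                                           # should be 1
mean = sum(x * pmf[x] for x in range(n + 1))               # should equal n*p
var = sum((x - mean) ** 2 * pmf[x] for x in range(n + 1))  # should equal n*p*(1-p)

print(f"total = {total:.6f}, mean = {mean:.4f}, variance = {var:.4f}")
```

Computing the mean and variance from the pmf and comparing them with np and np(1 − p) is a quick consistency check on the formula.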

Example. Problem 3-107.
Example. Problem 3-108.

3-8 Poisson Process
The Poisson process is one of the most widely used counting processes. It is typically used to count the occurrences of events that happen at a certain average rate but completely at random (without any particular structure). For example, suppose it is known from historical data that a certain region has 19 days of rain during the three summer months. Other than this information, the timing of the rainy days appears to be totally random. This process falls within the category of a Poisson process.

3-8.1 Poisson Distribution
The probability distribution that models a Poisson process is called the Poisson distribution:
$$P(X = x) = \frac{e^{-\lambda}\lambda^{x}}{x!}, \qquad x = 0, 1, 2, \ldots$$
where
Mean = λ = np, the average number of expected occurrences,
Variance = σ² = λ = np,
p is the probability of occurrence in a single trial, and
n is the total number of events.
Example. A taxi cab company owns 100 taxis; each car has a probability p = 0.05 of breaking down on a given day.
a) Find the probability that three of the cars will break down today.
b) Find the probability that at most three of the cars will break down today.
c) Find the probability that at least three of the cars will break down today.
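The taxi example can be sketched in Python by modeling the number of breakdowns as a Poisson random variable with λ = np = 100 × 0.05 = 5. Treating "three of the cars" as exactly three is an interpretation, and the function name `poisson_pmf` is illustrative.

```python
import math

def poisson_pmf(x, lam):
    """P(X = x) for a Poisson random variable with mean lam."""
    return math.exp(-lam) * lam ** x / math.factorial(x)

n, p = 100, 0.05
lam = n * p  # expected number of breakdowns per day

p_exactly_3 = poisson_pmf(3, lam)                           # a) P(X = 3)
p_at_most_3 = sum(poisson_pmf(x, lam) for x in range(4))    # b) P(X <= 3)
p_at_least_3 = 1 - sum(poisson_pmf(x, lam) for x in range(3))  # c) 1 - P(X <= 2)

print(f"a) {p_exactly_3:.4f}  b) {p_at_most_3:.4f}  c) {p_at_least_3:.4f}")
```

Part c) uses the complement, since summing P(X = x) for x = 3, 4, 5, ... directly would require an infinite series.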

3-10 Normal Approximation to the Binomial and Poisson Distributions
Binomial Distribution Approximation
A binomial random variable is the total count of successes from repeated independent trials. When the number of trials, n, is large, the binomial random variable can be approximated by a normal random variable. Consequently, the normal distribution can be used to approximate binomial probabilities when n is large. For a normal distribution the variable Z is defined as
$$Z = \frac{X - \mu}{\sigma}$$
and, when modeling a binomial variable using a normal distribution approximation, μ = E(X) = np and σ² = V(X) = np(1 − p).

Thus,
$$Z = \frac{X - np}{\sqrt{np(1-p)}}$$
is approximately a standard normal random variable. Since expressing a binomial variable in terms of a normal distribution is an approximation, an additional correction factor (known as the continuity correction) can be introduced to improve it. In general, ±0.5 is added to the binomial value, with the sign chosen so that the correction increases the binomial probability being approximated:
$$P(X \le x) \approx P\!\left(Z \le \frac{x + 0.5 - np}{\sqrt{np(1-p)}}\right), \qquad P(X \ge x) \approx P\!\left(Z \ge \frac{x - 0.5 - np}{\sqrt{np(1-p)}}\right)$$
Example. Problem 3-148.
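To see the effect of the continuity correction, the sketch below compares an exact binomial probability with its normal approximation, with and without the ±0.5 adjustment. The values n = 50, p = 0.4, and x = 20 are assumptions chosen for illustration.

```python
import math
from statistics import NormalDist

def binomial_cdf(x, n, p):
    """Exact P(X <= x) for a binomial random variable."""
    return sum(math.comb(n, k) * p ** k * (1 - p) ** (n - k) for k in range(x + 1))

n, p, x = 50, 0.4, 20  # illustrative values, not from the slides

mu = n * p
sigma = math.sqrt(n * p * (1 - p))
Z = NormalDist()

exact = binomial_cdf(x, n, p)
approx_plain = Z.cdf((x - mu) / sigma)            # no continuity correction
approx_corrected = Z.cdf((x + 0.5 - mu) / sigma)  # +0.5 enlarges the region X <= x

print(f"exact = {exact:.4f}")
print(f"normal approx, no correction   = {approx_plain:.4f}")
print(f"normal approx, with correction = {approx_corrected:.4f}")
```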

Poisson Distribution Approximation
Just as a binomial random variable can be modeled using a normal distribution, a Poisson distribution can be approximated by a normal distribution. For the approximation to be adequate, λ = np > 5 is required. Thus, if X is a Poisson random variable with E(X) = λ and V(X) = λ, then
$$Z = \frac{X - \lambda}{\sqrt{\lambda}}$$
is approximately a standard normal random variable.
Example. Problem 3-154.

Weibull Distribution Approximation
The Weibull distribution is used to model the time until failure of many different physical systems. The parameters of the distribution allow it to be adapted to systems in which the number of failures increases with time (bearing wear), decreases with time (some semiconductors), or remains constant (failures produced by factors external to the system).

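The Weibull probability density function is
$$f(x) = \frac{\beta}{\delta}\left(\frac{x}{\delta}\right)^{\beta-1} e^{-(x/\delta)^{\beta}}, \qquad x > 0$$
where
β is the shape parameter; it is equal to the slope of the regression line on Weibull plot paper,
δ is the scale parameter or characteristic life (time), the life by which 63.2% of the units have failed, and
x is the life of the product.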

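The cumulative Weibull distribution function is:
$$F(x) = 1 - e^{-(x/\delta)^{\beta}}$$
The mean (life) and variance of the Weibull distribution are:
$$\mu = \delta\,\Gamma\!\left(1 + \frac{1}{\beta}\right), \qquad \sigma^2 = \delta^2\,\Gamma\!\left(1 + \frac{2}{\beta}\right) - \delta^2\left[\Gamma\!\left(1 + \frac{1}{\beta}\right)\right]^2$$
where the gamma function, Γ, is defined as:
$$\Gamma(r) = \int_0^{\infty} x^{r-1} e^{-x}\,dx, \qquad r > 0$$
Γ is tabulated, or can be readily determined in software such as Excel using the function GAMMA(argument).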

Example: Bearings are tested to failure. According to the data gathered and plotted on probability paper, the slope of the regression line is β = 1.5, and 63.2% of the bearings will have failed after δ = 5.7 × 10^5 cycles.
a) Find the probability that a bearing will fail before 4 × 10^5 cycles.
b) Find the mean life (in cycles) of this bearing.
c) Find the number of cycles at which 10% of the bearings will have failed.
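The bearing example can be sketched in Python, assuming the standard two-parameter Weibull form with cumulative distribution F(x) = 1 − exp[−(x/δ)^β] and mean δ·Γ(1 + 1/β). Under this form F(δ) = 1 − e^(−1) ≈ 0.632, which matches the 63.2% characteristic life. The function name `weibull_cdf` is illustrative.

```python
import math

beta = 1.5      # shape parameter (slope on Weibull paper)
delta = 5.7e5   # scale parameter: cycles by which 63.2% have failed

def weibull_cdf(x, beta, delta):
    """P(X <= x) for the two-parameter Weibull distribution."""
    return 1.0 - math.exp(-((x / delta) ** beta))

# a) Probability of failure before 4e5 cycles.
p_fail = weibull_cdf(4e5, beta, delta)

# b) Mean life: delta * Gamma(1 + 1/beta); math.gamma plays the role of Excel's GAMMA.
mean_life = delta * math.gamma(1.0 + 1.0 / beta)

# c) Cycles at which 10% have failed: solve F(x) = 0.10 for x.
x_10 = delta * (-math.log(1.0 - 0.10)) ** (1.0 / beta)

print(f"a) {p_fail:.4f}  b) {mean_life:.4e} cycles  c) {x_10:.4e} cycles")
```

Part c) inverts the cdf analytically: x = δ·[−ln(1 − F)]^(1/β).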

3-13 Random Samples, Statistics, and the Central Limit Theorem
In statistics, data are defined as the observed values of random variables obtained from replicates of a random experiment. If the random variables that represent the observations of n replicates are X_1, X_2, ..., X_n, and the replicates are identical, then each random variable follows the same distribution. Additionally, the random variables are independent of each other.
Consider now a large population of objects from which a subset of n items is randomly selected. If the total population has a given distribution, it follows that the randomly sampled items will also have the same distribution.

Thus, statistical analysis can be performed using information
a) measured directly from the entire population (true μ and σ), or
b) measured indirectly from sampling (X̄ and σ_X̄).
σ_X̄ is known as the standard error of the mean (S.E.M.) and is defined as
$$\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}$$
with σ being the standard deviation of the entire population and n the number of items sampled. Thus, for a sample of a population with normal distribution,
$$Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}}$$

Example: Assume that the weight of medium-size propane tanks follows a normal distribution. It is known that the mean weight of the entire population is μ = 35 lb, with a standard deviation of σ = 3.6 lb. Determine, for a random sample of n = 34 tanks:
a) What is the probability that the mean, X̄, of the sample is less than 33 lb?
b) What is the probability that the mean, X̄, of the sample is between 34 lb and 36.5 lb?
Example. Problem 207. For part b), use t = 1970 min instead of 2200 min.
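A possible numerical treatment of the propane tank example is sketched below. It assumes the sample mean is normally distributed with mean μ and standard deviation equal to the standard error of the mean, σ/√n, and uses Python's `statistics.NormalDist` in place of a z-table.

```python
import math
from statistics import NormalDist

mu = 35.0     # population mean weight, lb
sigma = 3.6   # population standard deviation, lb
n = 34        # sample size

sem = sigma / math.sqrt(n)             # standard error of the mean
sample_mean_dist = NormalDist(mu, sem)  # assumed distribution of the sample mean

# a) P(sample mean < 33 lb)
p_a = sample_mean_dist.cdf(33.0)

# b) P(34 lb < sample mean < 36.5 lb)
p_b = sample_mean_dist.cdf(36.5) - sample_mean_dist.cdf(34.0)

# Equivalent z value for part a): z = (33 - mu) / (sigma / sqrt(n))
z_a = (33.0 - mu) / sem

print(f"SEM = {sem:.4f} lb, z_a = {z_a:.3f}")
print(f"a) {p_a:.5f}  b) {p_b:.4f}")
```

Because the standard error shrinks with √n, a sample mean 2 lb below μ is far less likely than a single tank weighing 2 lb below μ.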