Probability & Statistics BITS Pilani K K Birla Goa Campus Dr. Jajati Keshari Sahoo Department of Mathematics
Statistics
Descriptive statistics
Inferential statistics

Inferential Statistics
1. Involves: estimation and hypothesis testing.
2. Purpose: make inferences about population characteristics.

Inference Process
Population → Sample → Sample statistic (e.g., $\bar{X}$) → Estimates & hypothesis tests

Key terms
Population: all items of interest.
Sample (or random sample): a portion of the population.
Parameter: a summary measure about the population.
Statistic: a summary measure about the sample.
Population and Sample
Population: we refer to a population in terms of its probability distribution or frequency distribution. "Population $X$" or "population $f(x)$" means a population described by the probability distribution $f(x)$.
A population may be infinite, in which case it is impossible to observe all its values; even when it is finite, observing all of it may be impractical or uneconomical.
Sample
Sample: a part of the population.
Random samples (why do we need them?): results from a sample are useful only if the sample is in some way representative of the population.
Negative examples: judging the performance of a tire that is tested only on smooth roads; estimating family incomes from data on home owners only.
Sampling
Representative sample: has the same characteristics as the population.
Random sample: every subset of the population of a given size has an equal chance of being selected.
Random sample
Definition (Random sample): A set of observations $X_1, X_2, \ldots, X_n$ constitutes a random sample of size $n$ from a population or distribution $X$ if
(a) the $X_i$'s are independent, and
(b) the $X_i$'s have the same distribution as $X$.
Statistic
Definition (Statistic): A statistic is a random variable whose numerical value can be determined from a random sample, i.e., a function of the random sample.
Examples: (a) $\max_i \{X_i\}$ (b) $\min_i \{X_i\}$ (c) $\sum_{i=1}^{n} X_i$ (d) $\sum_{i=1}^{n} X_i / n = \bar{X}$.
Sample mean
Definition: Let $X_1, X_2, \ldots, X_n$ be a random sample from the distribution of $X$. The statistic $\frac{1}{n}\sum_{i=1}^{n} X_i$ is called the sample mean and is denoted by $\bar{X}$.
Sample variance and sample SD
Definition: Let $X_1, X_2, \ldots, X_n$ be a random sample of size $n$ from the distribution of $X$. The statistic
$S^2 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n-1}$
is called the sample variance, and the statistic $S = \sqrt{S^2}$ is called the sample standard deviation.
Sample variance
Computation formula for sample variance:
$S^2 = \frac{\sum_{i=1}^{n} X_i^2 - n\bar{X}^2}{n-1}$ or $S^2 = \frac{n\sum_{i=1}^{n} X_i^2 - \left(\sum_{i=1}^{n} X_i\right)^2}{n(n-1)}$.
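As a quick check that the defining formula and the computation formula for the sample variance agree, here is a minimal Python sketch. The data values and the function names are hypothetical, chosen only for illustration.

```python
import math
import statistics

def sample_variance_definition(xs):
    """S^2 = sum (x_i - xbar)^2 / (n - 1), straight from the definition."""
    n = len(xs)
    xbar = sum(xs) / n
    return sum((x - xbar) ** 2 for x in xs) / (n - 1)

def sample_variance_shortcut(xs):
    """S^2 = (n * sum x_i^2 - (sum x_i)^2) / (n (n - 1)), the computation formula."""
    n = len(xs)
    return (n * sum(x * x for x in xs) - sum(xs) ** 2) / (n * (n - 1))

data = [4.0, 7.0, 5.0, 9.0, 5.0]  # hypothetical sample, not from the slides

s2_def = sample_variance_definition(data)
s2_short = sample_variance_shortcut(data)
s = math.sqrt(s2_def)  # sample standard deviation

print(s2_def, s2_short, s)
```

Both functions should agree with `statistics.variance`, which also divides by n - 1.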
Sampling Distribution
Mean and variance of sample mean: If a random sample of size $n$ is taken from a population having the mean $\mu$ and the variance $\sigma^2$, then $\bar{X}$ is a random variable whose distribution has the mean $\mu$ and variance $\sigma^2/n$.
The standard deviation of $\bar{X}$ is $\sigma/\sqrt{n}$ and is called the standard error of the mean.
Sampling Distribution
Mean of sample variance:
Theorem: If a random sample of size $n$ is taken from a population having the mean $\mu$ and the variance $\sigma^2$, then $E(S^2) = \sigma^2$.
Central limit theorem
Theorem: Let $\bar{X}$ be the mean of a sample of size $n$ taken from a population having the mean $\mu$ and variance $\sigma^2$. Then for large $n$, $\bar{X}$ is approximately normal with mean $\mu$ and variance $\sigma^2/n$.
Hence $Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}}$ is approximately standard normal.
Central Limit Theorem
As the sample size gets large enough ($n \ge 25$), the sampling distribution of $\bar{X}$ becomes almost normal, with $\mu_{\bar{X}} = \mu$ and $\sigma_{\bar{X}} = \sigma/\sqrt{n}$.
Central Limit Theorem
Example: The mean length of life of a certain cutting tool is 41.5 hours, with a standard deviation of 2.5 hours. What is the probability that a sample of size 50 drawn from this population will have a mean between 40.5 hours and 42 hours?
Ans: 0.9184
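The cutting-tool example can be checked numerically with the normal approximation. This is a minimal sketch that reads the standard deviation as 2.5 hours and the upper limit as 42 hours (an assumption: these are the values that reproduce the stated answer). It uses the exact normal CDF from Python's standard library, so the result is about 0.919 rather than the table-based 0.9184, which comes from rounding the z-values to two decimals.

```python
import math
from statistics import NormalDist

mu, sigma, n = 41.5, 2.5, 50          # sigma = 2.5 and upper limit 42 are assumed readings
standard_error = sigma / math.sqrt(n)  # sigma / sqrt(n)

# By the central limit theorem, the sample mean is approximately N(mu, sigma^2 / n).
sampling_dist = NormalDist(mu, standard_error)
p_between = sampling_dist.cdf(42.0) - sampling_dist.cdf(40.5)

print(round(p_between, 4))
```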
Central Limit Theorem
Example: One has 100 light bulbs whose lifetimes are independent exponential with mean 5 hours. If the bulbs are used one at a time, with a failed bulb being replaced immediately by a new one, what is the probability that there is still a working bulb after 525 hours?
Ans: 0.3085
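A minimal sketch of the light-bulb calculation, assuming the time horizon is 525 hours (the value consistent with the stated answer 0.3085). The total lifetime of the 100 bulbs is a sum of independent exponentials, which the central limit theorem approximates by a normal distribution; for an exponential distribution the standard deviation equals the mean.

```python
import math
from statistics import NormalDist

n_bulbs = 100
mean_life = 5.0
sd_life = mean_life  # exponential distribution: standard deviation equals the mean

# Total lifetime S = X_1 + ... + X_100 is approximately N(n * mean, n * sd^2).
total_life = NormalDist(n_bulbs * mean_life, sd_life * math.sqrt(n_bulbs))

horizon = 525.0  # assumed reading of the garbled value in the example
p_still_working = 1 - total_life.cdf(horizon)  # P(S > 525)

print(round(p_still_working, 4))
```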
Central Limit Theorem
Example: If $X$ is a Poisson random variable with mean µ, then for large µ, $X$ is approximately normal with mean µ and variance µ.
Central Limit Theorem
Example: If the number of people entering a store per hour has a Poisson distribution with parameter 100, how long should the shopkeeper wait in order to have a probability of 0.9 that more than 200 people have entered the store?
Central Limit Theorem
Example: A certain component is critical to the operation of an electrical system and must be replaced immediately upon failure. If the mean lifetime of this type of component is 100 hours and its standard deviation is 30 hours, how many of these components must be in stock so that the probability that the system is in continual operation for the next 2000 hours is at least 0.95?
Ans: n = 23.
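The stock-size example can be solved by a direct search. The sketch below assumes a horizon of 2000 hours (the reading consistent with the answer n = 23) and uses the normal approximation for the total lifetime of n components; the function name `prob_system_lasts` is hypothetical.

```python
import math
from statistics import NormalDist

def prob_system_lasts(n, horizon=2000.0, mean=100.0, sd=30.0):
    """Approximate P(X_1 + ... + X_n >= horizon) via the central limit theorem."""
    total_life = NormalDist(n * mean, sd * math.sqrt(n))
    return 1 - total_life.cdf(horizon)

# Increase the stock until the required probability is reached.
n = 1
while prob_system_lasts(n) < 0.95:
    n += 1

print(n, round(prob_system_lasts(n), 4))
```

With 22 components the approximate probability is still below 0.95, which is why the search stops at 23.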
Sampling Distribution
Distribution of sample variance:
Theorem: If a random sample of size $n$ is taken from a normal population having the mean $\mu$ and the variance $\sigma^2$, then the random variable $\frac{(n-1)S^2}{\sigma^2}$ has a chi-squared distribution with $(n-1)$ degrees of freedom.
Sampling Distribution
Example: The claim that the variance of a normal population is $\sigma^2 = 21.3$ is rejected if the variance of a random sample of size 15 exceeds 39.74. What is the probability that the claim will be rejected even though $\sigma^2 = 21.3$?
Ans: 0.025
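The rejection probability in this example can be checked without tables. The sketch below assumes the claimed variance is 21.3 (the reading under which the cutoff 39.74 corresponds to a tabulated chi-squared value). It relies on a standard closed form for the chi-squared tail probability that is valid only when the degrees of freedom are even, which holds here since n - 1 = 14; the function name is hypothetical.

```python
import math

def chi2_upper_tail_even_df(x, df):
    """P(chi-squared with an even number df of degrees of freedom > x).

    For df = 2k the tail equals exp(-x/2) * sum_{j<k} (x/2)^j / j!.
    """
    if df % 2 != 0:
        raise ValueError("closed form used here requires even degrees of freedom")
    k = df // 2
    half = x / 2
    return math.exp(-half) * sum(half ** j / math.factorial(j) for j in range(k))

n = 15
sigma2_claimed = 21.3  # assumed reading of the garbled value
cutoff = 39.74         # the claim is rejected if S^2 exceeds this

# (n - 1) S^2 / sigma^2 is chi-squared with n - 1 degrees of freedom.
statistic = (n - 1) * cutoff / sigma2_claimed
p_reject = chi2_upper_tail_even_df(statistic, n - 1)

print(round(statistic, 2), round(p_reject, 3))
```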
T-Distribution
Definition: Let $Z$ be a standard normal variable and let $\chi^2_\nu$ be an independent chi-squared random variable with $\nu$ degrees of freedom. The random variable
$T = \frac{Z}{\sqrt{\chi^2_\nu / \nu}}$
is said to follow a T distribution with $\nu$ degrees of freedom.
T-Distribution
Theorem: If a random sample of size $n$ is taken from a normal population having the mean $\mu$ and the variance $\sigma^2$, then the random variable $\frac{\bar{X} - \mu}{S/\sqrt{n}}$ follows a T distribution with $(n-1)$ degrees of freedom.
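T-distribution
The critical value $t_{\alpha,\nu}$ is the point for which $P(T > t_{\alpha,\nu}) = \alpha$, where $T$ has $\nu$ degrees of freedom.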
The sampling distribution of the mean (unknown variance)
If $n$ is large, it doesn't matter whether $\sigma$ is known or not, as it is reasonable in that case to substitute for it the sample standard deviation $s$.
Question: what if $n$ is small? Then we need to assume that the sample comes from a normal population, and use the T distribution if $\sigma$ is unknown; if $\sigma$ is known, use the normal distribution.
ESTIMATION OF PARAMETERS
ESTIMATION: The procedure of estimating a population parameter by using sample information is referred to as estimation.
(i) POINT ESTIMATION
(ii) INTERVAL ESTIMATION
POINT ESTIMATION
POINT ESTIMATE: An estimate of a population parameter given by a single number is called a point estimate.
POINT ESTIMATOR: A point estimator is a statistic for estimating the population parameter $\theta$ and will be denoted by $\hat{\theta}$.
Point Estimator
Problem of point estimation of the population mean µ: the statistic chosen is called a point estimator for µ. A logical estimator for µ is the sample mean $\bar{X}$. Hence $\hat{\mu} = \bar{X}$.
Example
Market researchers use the number of sentences per advertisement as a measure of readability for magazine advertisements. The following represents a random sample of 54 advertisements. Find a point estimate of the population mean µ.
9,0,18,16,9,16,16,9,11,13,,16,5,18,6,6,5,1,5,17,,3,7,10,9, 10,10,5,11,18,18,9,9,17,13,11,7,14,6,11,1,11,15,6,1,14,11,4,9, 18,1,1,17,11,0.
Solution: The sample mean of the data is 671/54 ≈ 12.4. So the point estimate for the mean length of all magazine advertisements is 12.4 sentences.
UNBIASED ESTIMATOR
Unbiased Estimator: If the mean of the sampling distribution of a statistic equals the corresponding population parameter, the statistic is called an unbiased estimator of the parameter, i.e., if $E(\hat{\theta}) = \theta$.
Biased Estimator: if $E(\hat{\theta}) \ne \theta$.
Unbiased Estimator
Example: The sample mean $\bar{X}$ is an unbiased estimator for the population mean $\mu$.
Example: $S^2$ is an unbiased estimator of the population variance $\sigma^2$.
Example: $S$ is not an unbiased estimator of the population standard deviation $\sigma$.
Example: $\frac{1}{n}\sum_{i=1}^{n} (X_i - \bar{X})^2$ is a biased estimator of the population variance.
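A small simulation can illustrate the difference between dividing by n - 1 and dividing by n. The sketch below is only an illustration under assumed settings: a normal population with mean 10 and variance 4, samples of size 5, and a fixed random seed, none of which come from the slides. Averaged over many samples, the n - 1 version should land near 4, while the n version should land near (n - 1)/n times 4, that is 3.2.

```python
import random

random.seed(0)  # fixed seed so the run is repeatable

true_variance = 4.0  # assumed population variance (sigma = 2)
n, repetitions = 5, 20000

sum_unbiased = 0.0
sum_biased = 0.0
for _ in range(repetitions):
    sample = [random.gauss(10.0, 2.0) for _ in range(n)]
    xbar = sum(sample) / n
    squared_deviations = sum((x - xbar) ** 2 for x in sample)
    sum_unbiased += squared_deviations / (n - 1)  # S^2
    sum_biased += squared_deviations / n          # divides by n instead

mean_unbiased = sum_unbiased / repetitions
mean_biased = sum_biased / repetitions

print(round(mean_unbiased, 2), round(mean_biased, 2))
```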
Unbiased Estimator
Example: If $X$ is $b(n, p)$, then
(a) $\frac{X}{n}$ is unbiased for $p$;
(b) $\frac{X + \sqrt{n}/2}{n + \sqrt{n}}$ is unbiased for $p$ iff $p = 0.5$;
(c) $\frac{X + p\sqrt{n}}{n + \sqrt{n}}$ is unbiased for $p$.
Unbiased Estimator
Example: If $X_i$ for $1 \le i \le n$ constitute a random sample from the population given by
$f(x) = e^{-(x-\theta)}$ for $x > \theta$, and $f(x) = 0$ otherwise,
find an unbiased estimator for $\theta$.
Methods For Finding Estimators
Method of Moments: The $k$-th population moment $E(X^k)$ is estimated by the $k$-th sample moment $M_k = \frac{1}{n}\sum_{i=1}^{n} X_i^k$; equating the two, we estimate the parameters in terms of the moments.
Method of Maximum Likelihood: We maximize the likelihood function with respect to the parameter θ; the statistic at which the likelihood function attains its maximum is called the maximum likelihood estimator for θ.
Method of Moments
Example: If $X$ is $\Gamma(\alpha, \beta)$, use the method of moments to find estimators for $\alpha$ and $\beta$.
Example: If $X$ is $N(\mu, \sigma^2)$, use the method of moments to find estimators for $\mu$ and $\sigma^2$.
Example: If $X$ is $U(-a, a)$, use the method of moments to find an estimator for $a$.
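For a normal population, matching the first two moments gives the estimators M_1 for the mean and M_2 - M_1^2 for the variance. The following is a minimal sketch of that calculation; the data and the helper name `raw_moment` are hypothetical.

```python
def raw_moment(xs, k):
    """k-th sample moment M_k = (1/n) * sum of x_i^k."""
    return sum(x ** k for x in xs) / len(xs)

data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]  # hypothetical sample

m1 = raw_moment(data, 1)
m2 = raw_moment(data, 2)

# For N(mu, sigma^2): E(X) = mu and E(X^2) = sigma^2 + mu^2.
mu_hat = m1
sigma2_hat = m2 - m1 ** 2

print(mu_hat, sigma2_hat)
```

Note that this variance estimator divides by n, so it differs from the sample variance S^2 by the factor (n - 1)/n.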
Likelihood Function
Likelihood Function: Let $x_1, x_2, \ldots, x_n$ be a random sample of size $n$ from a population with density function $f(x)$ and parameter $\theta$. Then the likelihood function of the sample values $x_1, x_2, \ldots, x_n$ is denoted by $L(\theta)$ and defined as
$L(\theta) = \prod_{i=1}^{n} f(x_i)$.
Method of maximum likelihood
Example: If $X$ is $B(n, p)$, use the method of maximum likelihood (ML) to find an estimator for $p$.
Example: If $X$ is $\mathrm{Exp}(\lambda)$, use the method of ML to find an estimator for $\lambda$.
Example: If $X$ is $U[0, \theta]$, use the method of ML to find an estimator for $\theta$.
Method of maximum likelihood
Example: If $X$ is $\mathrm{Poisson}(\lambda)$, use the method of ML to find an estimator for $\lambda$.
Example: If $X$ is $N(\mu, \sigma^2)$, use the method of ML to find estimators for $\mu$ and $\sigma^2$.
Example: Find the ML estimator for $a$ if the population is given by $f(x) = (a+1)x^a$, $0 < x < 1$.
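For the Poisson case, setting the derivative of the log-likelihood to zero gives the sample mean as the maximum likelihood estimator. The sketch below checks this numerically with a coarse grid search; the counts, the grid, and the function name are hypothetical choices for illustration, not a general-purpose optimizer.

```python
import math

def poisson_log_likelihood(lam, xs):
    """log L(lam) = sum over i of [x_i * log(lam) - lam - log(x_i!)]."""
    return sum(x * math.log(lam) - lam - math.lgamma(x + 1) for x in xs)

counts = [3, 5, 2, 4, 6, 4]  # hypothetical Poisson observations

# Grid search over candidate values 0.001, 0.002, ..., 10.000.
grid = [i / 1000 for i in range(1, 10001)]
lam_grid = max(grid, key=lambda lam: poisson_log_likelihood(lam, counts))

# Closed-form maximum likelihood estimate: the sample mean.
lam_closed_form = sum(counts) / len(counts)

print(lam_grid, lam_closed_form)
```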
INTERVAL ESTIMATION
By using point estimation, we may not get the desired degree of accuracy in estimating a parameter. Therefore, it is better to replace point estimation by interval estimation.
INTERVAL ESTIMATION
Interval estimate: An interval estimate of an unknown parameter θ is an interval of the form $L_1 \le \theta \le L_2$, where the end points $L_1$ and $L_2$ depend on the numerical value of the statistic and/or parameters of the population distribution.
100(1-α)% Confidence Interval: A 100(1-α)% confidence interval for a parameter θ is a random interval of the form $[L_1, L_2]$ such that $P(L_1 \le \theta \le L_2) = 1 - \alpha$.
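INTERVAL ESTIMATION
Theorem [100(1-α)% confidence interval for $\mu$ when $\sigma$ is known]: Let $X_1, X_2, \ldots, X_n$ be a random sample of size $n$ (large) from a population with mean $\mu$ and variance $\sigma^2$. A 100(1-α)% confidence interval for $\mu$ is
$\left( \bar{X} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}},\ \bar{X} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}} \right)$.
Note: If $n$ is small, assume the sample is from a normal population.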
INTERVAL ESTIMATION
Example: A random sample of size 100 is taken from a population with mean $\mu$ and $\sigma = 0.5$. If the sample mean is 0.75, find a 95% confidence interval for $\mu$.
Ans: (0.652, 0.848).
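The interval in this example follows directly from the known-σ formula. Here is a minimal sketch, taking the population standard deviation as 0.5; the helper name `z_interval` is hypothetical, and the critical value comes from the standard library's `NormalDist.inv_cdf` rather than a table.

```python
import math
from statistics import NormalDist

def z_interval(xbar, sigma, n, confidence):
    """Confidence interval for the mean when sigma is known."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)  # z_{alpha/2}
    half_width = z * sigma / math.sqrt(n)
    return xbar - half_width, xbar + half_width

lower, upper = z_interval(xbar=0.75, sigma=0.5, n=100, confidence=0.95)

print(round(lower, 3), round(upper, 3))
```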
INTERVAL ESTIMATION
Example: A random sample of size n is taken from a population with mean $\mu$ and =0. How large should n be chosen so that, with 90% confidence, the random interval X, X includes $\mu$? Ans: n 9.
INTERVAL ESTIMATION
Theorem [100(1-α)% confidence interval for $\mu$ when $\sigma$ is unknown]: Let $X_1, X_2, \ldots, X_n$ be a random sample of size $n$ from a normal population with mean $\mu$ and variance $\sigma^2$. A 100(1-α)% confidence interval for $\mu$ is
$\left( \bar{X} - t_{\alpha/2}\frac{S}{\sqrt{n}},\ \bar{X} + t_{\alpha/2}\frac{S}{\sqrt{n}} \right)$.
Note: $t_{\alpha/2}$ has $(n-1)$ degrees of freedom.
Note: If $n$ is large ($n \ge 25$), we can drop the normality assumption.
INTERVAL ESTIMATION
Example: A signal is transmitted from location A to location B. The value received at location B is normally distributed with mean $\mu$ and variance $\sigma^2$. A particular value is transmitted 9 times. Find the 95% confidence interval for $\mu$ when the values received are 5, 8.5, 12, 15, 7, 9, 7.5, 6.5, and 10.5.
Ans: (6.631, 11.369).
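The signal example can be reproduced with the t-based interval. The sketch below takes the third received value as 12 (an assumption: it is the reading under which the data give the stated interval) and hard-codes the table value 2.306 for the t distribution with 8 degrees of freedom and upper-tail area 0.025, since the Python standard library has no t quantile function.

```python
import math
import statistics

received = [5, 8.5, 12, 15, 7, 9, 7.5, 6.5, 10.5]  # third value 12 is an assumed reading
n = len(received)

xbar = statistics.mean(received)
s = statistics.stdev(received)  # sample standard deviation (divides by n - 1)

t_critical = 2.306  # table value: t_{0.025} with n - 1 = 8 degrees of freedom
half_width = t_critical * s / math.sqrt(n)

lower, upper = xbar - half_width, xbar + half_width
print(round(lower, 3), round(upper, 3))
```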
INTERVAL ESTIMATION
Example: A random sample of size 80 is taken from a population with mean $\mu$ and S = 30.85. If the sample mean is 18.85, construct a 99% confidence interval for $\mu$.
Ans: (17.014, 20.6786).
INTERVAL ESTIMATION
Theorem [100(1-α)% confidence interval for $\sigma^2$]: Let $X_1, X_2, \ldots, X_n$ be a random sample of size $n$ from a normal population with mean $\mu$ and variance $\sigma^2$. A 100(1-α)% confidence interval for $\sigma^2$ is
$\left( \frac{(n-1)S^2}{\chi^2_{\alpha/2}},\ \frac{(n-1)S^2}{\chi^2_{1-\alpha/2}} \right)$.
Note: $\chi^2$ has $(n-1)$ degrees of freedom.
INTERVAL ESTIMATION
Example: An optical firm purchases glass for making lenses. Assume that the refractive indices of 20 pieces of glass have a variance of 0.00012. Find a 95% confidence interval for $\sigma^2$.
Problem
To estimate the average time it takes to assemble a certain computer component, the industrial engineer at an electronics firm timed 40 technicians in the performance of this task, getting a mean of 12.73 minutes and a standard deviation of 2.06 minutes.
(a) What can we say with 99% confidence about the maximum error if $\bar{x} = 12.73$ is used as a point estimate of the actual average time required to do this job?
(b) Use the given data to construct a 98% confidence interval for the true average time it takes to assemble the computer component.
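Interval Estimation (cont'd)
Solution: Given $\bar{x} = 12.73$, $s = 2.06$ and $n = 40$.
(a) $(1 - \alpha) = 0.99$, so $\alpha = 0.01$. Since the sample is large ($n = 40$), the maximum error of estimation with 99% confidence is
$E = z_{\alpha/2}\frac{s}{\sqrt{n}} = z_{0.005}\frac{s}{\sqrt{n}} = 2.575 \cdot \frac{2.06}{\sqrt{40}} = 0.839$.
(b) The 98% confidence interval (i.e. $\alpha = 0.02$) is given by
$\bar{x} - z_{0.01}\frac{s}{\sqrt{n}} < \mu < \bar{x} + z_{0.01}\frac{s}{\sqrt{n}}$
$12.73 - 2.33 \cdot \frac{2.06}{\sqrt{40}} < \mu < 12.73 + 2.33 \cdot \frac{2.06}{\sqrt{40}}$
$11.971 < \mu < 13.489$.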
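Both parts of this solution can be checked numerically. The sketch below assumes a sample mean of 12.73 and a sample standard deviation of 2.06 (the readings consistent with the interval 11.971 to 13.489). It uses exact normal quantiles from the standard library instead of the table values 2.575 and 2.33, so the results agree with the solution only to about two decimal places.

```python
import math
from statistics import NormalDist

xbar, s, n = 12.73, 2.06, 40  # assumed readings of the garbled values
standard_error = s / math.sqrt(n)

# (a) Maximum error with 99% confidence: E = z_{0.005} * s / sqrt(n).
z_99 = NormalDist().inv_cdf(1 - 0.01 / 2)
max_error = z_99 * standard_error

# (b) 98% confidence interval: xbar -/+ z_{0.01} * s / sqrt(n).
z_98 = NormalDist().inv_cdf(1 - 0.02 / 2)
ci_98 = (xbar - z_98 * standard_error, xbar + z_98 * standard_error)

print(round(max_error, 3), tuple(round(v, 3) for v in ci_98))
```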
Problem
With reference to the previous problem, with what confidence can we assert that the sample mean does not differ from the true mean by more than 30 seconds?
Solution: Given $E = 30$ seconds $= 0.5$ minute, $s = 2.06$, $n = 40$, we have to find the value of $(1 - \alpha)$.
$E = z_{\alpha/2}\frac{s}{\sqrt{n}} \Rightarrow z_{\alpha/2} = \frac{E\sqrt{n}}{s} = \frac{\sqrt{40}\,(0.5)}{2.06} = 1.54$
$F(z_{\alpha/2}) = F(1.54) = 0.9382$ (from Table 3)
$1 - \alpha/2 = 0.9382 \Rightarrow \alpha/2 = 0.0618 \Rightarrow \alpha = 0.1236 \Rightarrow 1 - \alpha = 0.8764$
Thus, we have 87.64% confidence that the sample mean does not differ from the true mean by more than 30 seconds.
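The confidence level for a given maximum error can be recovered by inverting the error formula. This is a minimal sketch assuming s = 2.06 and n = 40 as in the previous problem; because it uses the unrounded z-value (about 1.535) rather than the table lookup at 1.54, it gives roughly 0.875 instead of 0.8764.

```python
import math
from statistics import NormalDist

E = 0.5            # maximum error in minutes (30 seconds)
s, n = 2.06, 40    # assumed readings, as in the previous problem

# Solve E = z * s / sqrt(n) for z, then convert z to a two-sided confidence level.
z = E * math.sqrt(n) / s
confidence = 2 * NormalDist().cdf(z) - 1  # equals 1 - alpha

print(round(z, 3), round(confidence, 4))
```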