Topic-7. Large Sample Estimation


TYPES OF INFERENCE
- Estimation:
  - Estimating or predicting the value of the parameter.
  - "What is (are) the most likely value(s) of \(\mu\) or \(p\)?"
- Hypothesis testing:
  - Deciding about the value of a parameter based on some preconceived idea.
  - "Did the sample come from a population with \(\mu = 5\) or \(p = 0.2\)?"

TYPES OF INFERENCE
- Examples:
  - A consumer wants to estimate the average price of similar homes in her city before putting her home on the market.
    Estimation: Estimate \(\mu\), the average home price.
  - A manufacturer wants to know if a new type of steel is more resistant to high temperatures than an old type was.
    Hypothesis test: Is the new average resistance, \(\mu_N\), equal to the old average resistance, \(\mu_O\)?

WHAT DO WE FREQUENTLY NEED TO ESTIMATE?
- An unknown population proportion \(p\)
- An unknown population mean \(\mu\)

DEFINITIONS
- An estimator is a rule, usually a formula, that tells you how to calculate the estimate based on the sample.
  - Point estimation: A single number is calculated to estimate the parameter.
  - Interval estimation: Two numbers are calculated to create an interval within which the parameter is expected to lie.

A. POINT ESTIMATORS
Properties:
- An estimator is unbiased if the mean of its sampling distribution equals the parameter of interest.
- Of all the unbiased estimators, we prefer the estimator whose sampling distribution has the smallest spread or variability.
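The two properties above can be illustrated with a small simulation. This is only a sketch under stated assumptions: the population is taken to be normal, and the values `MU`, `SIGMA`, `N`, and `REPS` are hypothetical choices that do not come from the slides. It compares the sample mean and the sample median, which are both unbiased for the center of a normal population, and shows that the mean has the smaller spread.

```python
import random
import statistics

random.seed(7)  # fixed seed so the sketch is repeatable

# Hypothetical population and sampling plan (not from the slides).
MU, SIGMA = 50.0, 10.0
N, REPS = 25, 2000

means, medians = [], []
for _ in range(REPS):
    sample = [random.gauss(MU, SIGMA) for _ in range(N)]
    means.append(statistics.fmean(sample))
    medians.append(statistics.median(sample))

# Unbiasedness: both averages should sit close to MU.
print(f"average of sample means:   {statistics.fmean(means):.2f}")
print(f"average of sample medians: {statistics.fmean(medians):.2f}")

# Variability: the estimator with the smaller spread is preferred.
print(f"spread (SD) of sample means:   {statistics.stdev(means):.2f}")
print(f"spread (SD) of sample medians: {statistics.stdev(medians):.2f}")
```

For a normal population, the sample mean would be the preferred estimator here because its sampling distribution is the narrower of the two.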

MEASURING THE GOODNESS OF AN ESTIMATOR
- The distance between an estimate and the true value of the parameter is the error of estimation. Think of it as the distance between the bullet and the bull's-eye.
- When the sample sizes are large, our unbiased estimators will have normal distributions, because of the Central Limit Theorem.

Part-I Large Sample Estimation: One Population Mean and One Population Proportion

MARGIN OF ERROR
The margin of error, denoted by \(E\), is the maximum likely difference observed between the sample mean \(\bar{x}\) and the true population mean \(\mu\).
\[ E = z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}} \]
\[ \bar{x} - E < \mu < \bar{x} + E \]

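Estimating Means and Proportions
For a quantitative population:
- Point estimator of the population mean \(\mu\): \(\bar{x}\)
- Margin of error (\(n > 30\)): \(\pm z_{\alpha/2}\,\text{SE} = \pm z_{\alpha/2}\,\dfrac{s}{\sqrt{n}}\)
For a binomial population:
- Point estimator of the population proportion \(p\): \(\hat{p} = x/n\)
- Margin of error (\(n > 30\)): \(\pm z_{\alpha/2}\,\text{SE} = \pm z_{\alpha/2}\sqrt{\dfrac{\hat{p}\hat{q}}{n}}\)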

Tabulated Z Values
For different levels of \(\alpha\), the value of \(z_{\alpha/2}\) can be read from the standard normal distribution table.

alpha    alpha/2    z_{alpha/2}
.10      .05        1.645
.05      .025       1.96
.02      .01        2.33
.01      .005       2.575
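The tabulated values can also be computed rather than looked up. The sketch below uses `NormalDist` from the Python standard library; the helper name `z_critical` is a hypothetical choice for illustration. The computed values agree with the table up to its rounding.

```python
from statistics import NormalDist

def z_critical(alpha: float, two_sided: bool = True) -> float:
    """Return the z value that leaves alpha/2 (or alpha) in the upper tail."""
    tail = alpha / 2 if two_sided else alpha
    return NormalDist().inv_cdf(1 - tail)

for alpha in (0.10, 0.05, 0.02, 0.01):
    print(f"alpha={alpha:.2f}  alpha/2={alpha / 2:.3f}  z={z_critical(alpha):.3f}")
```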

Example (Population Mean)
A homeowner randomly samples 64 homes similar to her own and finds that the average selling price is $252,000 with a standard deviation of $15,000. Estimate the average selling price for all similar homes in the city.
- Point estimator of \(\mu\): \(\bar{x} = 252{,}000\)
- Margin of error: \(\pm 1.96\,\dfrac{s}{\sqrt{n}} = \pm 1.96\,\dfrac{15{,}000}{\sqrt{64}} = \pm 3675\)
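The margin-of-error arithmetic in this example can be checked with a few lines of Python. This is a minimal sketch; the function name `margin_of_error_mean` is hypothetical, and the inputs \(s = 15{,}000\) and \(n = 64\) are the values that reproduce the margin of ±3675.

```python
import math

def margin_of_error_mean(s: float, n: int, z: float = 1.96) -> float:
    """Large-sample margin of error for a mean: z * s / sqrt(n)."""
    return z * s / math.sqrt(n)

moe = margin_of_error_mean(s=15_000, n=64)
print(f"margin of error: +/- {moe:.0f}")
```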

Example (Population Proportion)
A quality control technician wants to estimate the proportion of soda cans that are underfilled. He randomly samples 200 cans of soda and finds 10 underfilled cans.
- \(n = 200\), \(p\) = proportion of underfilled cans
- Point estimator of \(p\): \(\hat{p} = x/n = 10/200 = .05\)
- Margin of error: \(\pm 1.96\sqrt{\dfrac{\hat{p}\hat{q}}{n}} = \pm 1.96\sqrt{\dfrac{(.05)(.95)}{200}} = \pm .03\)
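The same check works for the binomial case. In this sketch the function name `margin_of_error_proportion` is hypothetical, and the counts \(x = 10\) and \(n = 200\) are the ones consistent with \(\hat{p} = .05\) and the margin of ±.03.

```python
import math

def margin_of_error_proportion(x: int, n: int, z: float = 1.96) -> float:
    """Large-sample margin of error for a proportion: z * sqrt(p_hat * q_hat / n)."""
    p_hat = x / n
    q_hat = 1 - p_hat
    return z * math.sqrt(p_hat * q_hat / n)

moe_p = margin_of_error_proportion(x=10, n=200)
print(f"p_hat = {10 / 200:.2f}, margin of error: +/- {moe_p:.2f}")
```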

B. INTERVAL ESTIMATION
- Create an interval \((a, b)\) so that you are fairly sure that the parameter lies between these two values.
- "Fairly sure" means "with high probability", measured using the confidence coefficient, \(1 - \alpha\). Usually, \(1 - \alpha = .90, .95, .98, .99\).
- Suppose \(1 - \alpha = .95\) and that the estimator has a normal distribution. Then 95% of the estimator's sampling distribution lies within Parameter \(\pm\, 1.96\,\text{SE}\).

INTERPRETATION OF CONFIDENCE INTERVAL
\[ 98.08^\circ < \mu < 98.32^\circ \]
- Correct: We are 95% confident that the true value of \(\mu\) is in the interval from 98.08 to 98.32. This means that if we were to select many different samples of size 106 and construct the confidence intervals, 95% of them would actually contain the value of the population mean \(\mu\).
- Wrong: There is a 95% chance that the true value of \(\mu\) will fall between 98.08 and 98.32.

INTERVAL ESTIMATION/CONFIDENCE INTERVALS

INTERVAL ESTIMATION (CONT'D)
- Since we don't know the value of the parameter, consider the interval Estimator \(\pm\, 1.96\,\text{SE}\), which has a variable center.
- [Figure: four sample intervals drawn around the parameter; three worked, one failed.]
- Only if the estimator falls in the tail areas will the interval fail to enclose the parameter. This happens only 5% of the time.
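The "worked / failed" idea can be simulated directly. The sketch below is an illustration under assumptions that are not in the slides: a normal population with hypothetical values `MU` and `SIGMA`, samples of size `N`, and `REPS` repeated samples. For each sample it builds the interval \(\bar{x} \pm 1.96\,s/\sqrt{n}\) and records whether the interval enclosed the true mean.

```python
import math
import random
import statistics

random.seed(1)  # fixed seed so the sketch is repeatable

# Hypothetical population and sampling plan (not from the slides).
MU, SIGMA = 100.0, 15.0
N, REPS = 50, 2000

def interval_covers(mu: float, sigma: float, n: int) -> bool:
    """Draw one sample and report whether xbar +/- 1.96*SE encloses mu."""
    sample = [random.gauss(mu, sigma) for _ in range(n)]
    xbar = statistics.fmean(sample)
    se = statistics.stdev(sample) / math.sqrt(n)
    return xbar - 1.96 * se < mu < xbar + 1.96 * se

coverage = sum(interval_covers(MU, SIGMA, N) for _ in range(REPS)) / REPS
print(f"fraction of intervals that enclosed mu: {coverage:.3f}")
```

The printed fraction should land near .95. It will not be exactly .95, both because the number of repetitions is finite and because \(s\) stands in for \(\sigma\).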

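\(100(1-\alpha)\%\) Confidence Interval
For a population mean: the confidence interval for a population mean \(\mu\) is
\[ \bar{x} \pm z_{\alpha/2}\,\frac{s}{\sqrt{n}} \]
For a population proportion: the confidence interval for a population proportion \(p\) is
\[ \hat{p} \pm z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}} \]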

EXAMPLE (POPULATION MEAN)
A random sample of \(n = 50\) males showed a mean average daily intake of dairy products equal to 756 grams with a standard deviation of 35 grams. Find a 95% confidence interval for the population average \(\mu\).
\[ \bar{x} \pm z_{.025}\,\frac{s}{\sqrt{n}} \;\Rightarrow\; 756 \pm 1.96\,\frac{35}{\sqrt{50}} \;\Rightarrow\; 756 \pm 9.70 \]
or \(746.30 < \mu < 765.70\) grams.

EXAMPLE
Find a 99% confidence interval for \(\mu\), the population average daily intake of dairy products for men.
\[ \bar{x} \pm z_{.005}\,\frac{s}{\sqrt{n}} \;\Rightarrow\; 756 \pm 2.575\,\frac{35}{\sqrt{50}} \;\Rightarrow\; 756 \pm 12.75 \]
or \(743.25 < \mu < 768.75\) grams.
The interval must be wider to provide for the increased confidence that it does indeed enclose the true value of \(\mu\).
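Both dairy-intake intervals can be reproduced with one helper. This is a sketch rather than a definitive implementation; the function name `mean_confidence_interval` is hypothetical, and the critical value is computed with `NormalDist` instead of being read from the table, so results may differ from the slides in the last decimal place.

```python
import math
from statistics import NormalDist

def mean_confidence_interval(xbar: float, s: float, n: int, confidence: float = 0.95):
    """Large-sample interval xbar +/- z_{alpha/2} * s / sqrt(n)."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)
    margin = z * s / math.sqrt(n)
    return xbar - margin, xbar + margin

for level in (0.95, 0.99):
    lo, hi = mean_confidence_interval(xbar=756, s=35, n=50, confidence=level)
    print(f"{level:.0%} CI: {lo:.2f} < mu < {hi:.2f}  (width {hi - lo:.2f})")
```

Printing the width makes the trade-off visible: raising the confidence level from 95% to 99% widens the interval.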

EXAMPLE (POPULATION PROPORTION)
Of a random sample of \(n = 150\) college students, 104 of the students said that they had played on a soccer team during their K-12 years. Estimate the proportion of college students who played soccer in their youth with a 98% confidence interval.
\[ \hat{p} \pm z_{.01}\sqrt{\frac{\hat{p}\hat{q}}{n}} \;\Rightarrow\; \frac{104}{150} \pm 2.33\sqrt{\frac{.69(.31)}{150}} \;\Rightarrow\; .69 \pm .09 \]
or \(.60 < p < .78\).
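The soccer example can be checked the same way. In this sketch the function name `proportion_confidence_interval` is hypothetical. It keeps \(\hat{p} = 104/150\) unrounded, so the bounds agree with the slide only after rounding to two decimals.

```python
import math
from statistics import NormalDist

def proportion_confidence_interval(x: int, n: int, confidence: float = 0.95):
    """Large-sample interval p_hat +/- z_{alpha/2} * sqrt(p_hat * q_hat / n)."""
    p_hat = x / n
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin

lo, hi = proportion_confidence_interval(x=104, n=150, confidence=0.98)
print(f"98% CI: {lo:.3f} < p < {hi:.3f}")
```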

ONE-SIDED CONFIDENCE BOUNDS
- Confidence intervals are by their nature two-sided since they produce upper and lower bounds for the parameter.
- One-sided bounds can be constructed simply by using a value of \(z\) that puts \(\alpha\) rather than \(\alpha/2\) in the tail of the \(z\) distribution.
  LCB: Estimator \(-\, z_{\alpha} \times\) (Std Error of Estimator)
  UCB: Estimator \(+\, z_{\alpha} \times\) (Std Error of Estimator)
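A one-sided bound only changes the critical value. The sketch below reuses the dairy-intake numbers (\(\bar{x} = 756\), \(s = 35\), \(n = 50\)) purely as an illustration; the slides do not work a one-sided example, and the function name `one_sided_bounds` is hypothetical.

```python
import math
from statistics import NormalDist

def one_sided_bounds(estimate: float, se: float, confidence: float = 0.95):
    """Return (LCB, UCB), each computed with z_alpha instead of z_{alpha/2}."""
    z_alpha = NormalDist().inv_cdf(confidence)  # puts alpha = 1 - confidence in one tail
    return estimate - z_alpha * se, estimate + z_alpha * se

se = 35 / math.sqrt(50)
lcb, ucb = one_sided_bounds(estimate=756, se=se, confidence=0.95)
print(f"95% lower confidence bound: mu > {lcb:.2f}")
print(f"95% upper confidence bound: mu < {ucb:.2f}")
```

Only one of the two bounds is reported in a given problem. Quoting both together would amount to a 90% two-sided interval, not a 95% one.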

CHOOSING THE SAMPLE SIZE
- The total amount of relevant information in a sample is controlled by two factors:
  - The sampling plan or experimental design: the procedure for collecting the information.
  - The sample size \(n\): the amount of information you collect.
- In a statistical estimation problem, the accuracy of the estimation is measured by the margin of error or the width of the confidence interval.

CHOOSING THE SAMPLE SIZE (CONT'D)
1. Determine the size of the margin of error, \(E\), that you are willing to tolerate.
2. Choose the sample size by solving for \(n\) or \(n = n_1 = n_2\) in the inequality \(1.96\,\text{SE} \le E\), where SE is a function of the sample size \(n\).
3. For quantitative populations, estimate the population standard deviation using a previously calculated value of \(s\) or the range approximation \(s \approx \text{Range}/4\).
4. For binomial populations, use the conservative approach and approximate \(p\) using the value \(p = .5\).

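EXAMPLE
A producer of PVC pipe wants to survey wholesalers who buy his product in order to estimate the proportion who plan to increase their purchases next year. What sample size is required if he wants his estimate to be within 0.04 of the actual proportion with probability equal to 0.95?
\[ 1.96\sqrt{\frac{pq}{n}} \le .04 \;\Rightarrow\; 1.96\sqrt{\frac{0.5(0.5)}{n}} \le 0.04 \]
\[ \Rightarrow\; \sqrt{n} \ge \frac{1.96\sqrt{0.5(0.5)}}{0.04} = 24.5 \;\Rightarrow\; n \ge 24.5^2 = 600.25 \]
Because \(n\) must be a whole number no smaller than 600.25, he should survey at least 601 wholesalers.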
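The sample-size calculation can be sketched as follows. The function name `sample_size_proportion` is hypothetical, and the default \(p = .5\) is the conservative choice described in the steps. Solving \(1.96\sqrt{pq/n} \le E\) gives \(n \ge (1.96/E)^2\,pq\), and the result is rounded up because \(n\) must be a whole number.

```python
import math

def sample_size_proportion(margin: float, p: float = 0.5, z: float = 1.96) -> int:
    """Smallest n satisfying z * sqrt(p * q / n) <= margin."""
    q = 1 - p
    n_exact = (z / margin) ** 2 * p * q
    return math.ceil(n_exact)

print((1.96 / 0.04) ** 2 * 0.25)           # exact solution, about 600.25
print(sample_size_proportion(margin=0.04))  # rounded up to a whole sample size
```

Rounding down to 600 would leave the margin of error slightly above 0.04, which is why the ceiling is used.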

Part-II Large Sample Estimation: The Difference between Two Population Means and Two Population Proportions

ESTIMATING THE DIFFERENCE BETWEEN TWO MEANS
- Sometimes we are interested in comparing the means of two populations.
  - The average growth of plants fed using two different nutrients.
  - The average scores for students taught with two different teaching methods.
- To make this comparison, we use:
  - A random sample of size \(n_1\) drawn from population 1 with mean \(\mu_1\) and variance \(\sigma_1^2\).
  - A random sample of size \(n_2\) drawn from population 2 with mean \(\mu_2\) and variance \(\sigma_2^2\).

ESTIMATING THE DIFFERENCE BETWEEN TWO MEANS (CONT'D)
- We compare the two averages by making inferences about \(\mu_1 - \mu_2\), the difference in the two population averages.
- If the two population averages are the same, then \(\mu_1 - \mu_2 = 0\).
- The best estimate of \(\mu_1 - \mu_2\) is the difference in the two sample means, \(\bar{x}_1 - \bar{x}_2\).

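THE SAMPLING DISTRIBUTION OF \(\bar{x}_1 - \bar{x}_2\)
Properties of the sampling distribution of \(\bar{x}_1 - \bar{x}_2\):
- Expected value:
  \[ E(\bar{x}_1 - \bar{x}_2) = \mu_1 - \mu_2 \]
- Standard deviation (standard error):
  \[ \sigma_{\bar{x}_1 - \bar{x}_2} = \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}} \]
  where:
  \(\sigma_1\) = standard deviation of population 1
  \(\sigma_2\) = standard deviation of population 2
  \(n_1\) = sample size from population 1
  \(n_2\) = sample size from population 2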

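INTERVAL ESTIMATE OF \(\mu_1 - \mu_2\): LARGE-SAMPLE CASE (\(n_1 > 30\) AND \(n_2 > 30\))
- Interval estimate with \(\sigma_1\) and \(\sigma_2\) known:
  \[ \bar{x}_1 - \bar{x}_2 \pm z_{\alpha/2}\,\sigma_{\bar{x}_1 - \bar{x}_2}, \quad \text{where } \sigma_{\bar{x}_1 - \bar{x}_2} = \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}} \]
- Interval estimate with \(\sigma_1\) and \(\sigma_2\) unknown:
  \[ \bar{x}_1 - \bar{x}_2 \pm z_{\alpha/2}\,s_{\bar{x}_1 - \bar{x}_2}, \quad \text{where } s_{\bar{x}_1 - \bar{x}_2} = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} \]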

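EXAMPLE

Avg Daily Intakes    Men    Women
Sample size           50       50
Sample mean          756      762
Sample Std Dev        35       30

Compare the average daily intake of dairy products of men and women using a 95% confidence interval.
\[ (\bar{x}_1 - \bar{x}_2) \pm z_{.025}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} \;\Rightarrow\; (756 - 762) \pm 1.96\sqrt{\frac{35^2}{50} + \frac{30^2}{50}} \;\Rightarrow\; -6 \pm 12.78 \]
or \(-18.78 < \mu_1 - \mu_2 < 6.78\).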

EXAMPLE (CONT'D)
\[ -18.78 < \mu_1 - \mu_2 < 6.78 \]
- Could you conclude, based on this confidence interval, that there is a difference in the average daily intake of dairy products for men and women?
- The confidence interval contains the value \(\mu_1 - \mu_2 = 0\). Therefore, it is possible that \(\mu_1 = \mu_2\). You would not want to conclude that there is a difference in average daily intake of dairy products for men and women.
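The two-sample interval and the "does it contain 0?" check can be sketched together. The function name `two_mean_confidence_interval` is hypothetical, and the inputs are the men's and women's summary statistics from the example (means 756 and 762, standard deviations 35 and 30, 50 observations each).

```python
import math
from statistics import NormalDist

def two_mean_confidence_interval(xbar1, s1, n1, xbar2, s2, n2, confidence=0.95):
    """Large-sample interval for mu1 - mu2 using the sample standard deviations."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    se = math.sqrt(s1**2 / n1 + s2**2 / n2)
    diff = xbar1 - xbar2
    return diff - z * se, diff + z * se

lo, hi = two_mean_confidence_interval(756, 35, 50, 762, 30, 50, confidence=0.95)
print(f"95% CI: {lo:.2f} < mu1 - mu2 < {hi:.2f}")
print("0 is inside the interval:", lo < 0 < hi)
```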

ESTIMATING THE DIFFERENCE BETWEEN TWO PROPORTIONS
- Sometimes we are interested in comparing the proportion of "successes" in two binomial populations.
  - The germination rates of untreated seeds and seeds treated with a fungicide.
  - The proportion of male and female voters who favor a particular candidate for governor.
- To make this comparison, we use:
  - A random sample of size \(n_1\) drawn from binomial population 1 with parameter \(p_1\).
  - A random sample of size \(n_2\) drawn from binomial population 2 with parameter \(p_2\).

Estimating the Difference between Two Proportions (cont'd)
- We compare the two proportions by making inferences about \(p_1 - p_2\), the difference in the two population proportions.
- If the two population proportions are the same, then \(p_1 - p_2 = 0\).
- The best estimate of \(p_1 - p_2\) is the difference in the two sample proportions,
  \[ \hat{p}_1 - \hat{p}_2 = \frac{x_1}{n_1} - \frac{x_2}{n_2} \]

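The Sampling Distribution of \(\hat{p}_1 - \hat{p}_2\)
- Expected value:
  \[ E(\hat{p}_1 - \hat{p}_2) = p_1 - p_2 \]
- Standard deviation (standard error):
  \[ \sigma_{\hat{p}_1 - \hat{p}_2} = \sqrt{\frac{p_1 q_1}{n_1} + \frac{p_2 q_2}{n_2}} \]
- Distribution form: If the sample sizes are large (\(n_1 p_1\), \(n_1 q_1\), \(n_2 p_2\), and \(n_2 q_2\) are all greater than or equal to 5), the sampling distribution of \(\hat{p}_1 - \hat{p}_2\) can be approximated by a normal probability distribution.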

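Interval Estimate of \(p_1 - p_2\): Large-Sample Case
- Interval estimate with the standard error \(\sigma_{\hat{p}_1 - \hat{p}_2}\) known:
  \[ \hat{p}_1 - \hat{p}_2 \pm z_{\alpha/2}\,\sigma_{\hat{p}_1 - \hat{p}_2}, \quad \text{where } \sigma_{\hat{p}_1 - \hat{p}_2} = \sqrt{\frac{p_1 q_1}{n_1} + \frac{p_2 q_2}{n_2}} \]
- Interval estimate with the standard error unknown (estimated from the samples):
  \[ \hat{p}_1 - \hat{p}_2 \pm z_{\alpha/2}\,s_{\hat{p}_1 - \hat{p}_2}, \quad \text{where } s_{\hat{p}_1 - \hat{p}_2} = \sqrt{\frac{\hat{p}_1 \hat{q}_1}{n_1} + \frac{\hat{p}_2 \hat{q}_2}{n_2}} \]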

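Example

Youth Soccer     Male    Female
Sample size        80        70
Played soccer      65        39

Compare the proportion of male and female college students who said that they had played on a soccer team during their K-12 years using a 99% confidence interval.
\[ (\hat{p}_1 - \hat{p}_2) \pm z_{.005}\sqrt{\frac{\hat{p}_1 \hat{q}_1}{n_1} + \frac{\hat{p}_2 \hat{q}_2}{n_2}} \;\Rightarrow\; \left(\frac{65}{80} - \frac{39}{70}\right) \pm 2.575\sqrt{\frac{.81(.19)}{80} + \frac{.56(.44)}{70}} \;\Rightarrow\; .26 \pm .19 \]
or \(.07 < p_1 - p_2 < .45\).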

Example (cont'd)
\[ .07 < p_1 - p_2 < .45 \]
- Could you conclude, based on this confidence interval, that there is a difference in the proportion of male and female college students who said that they had played on a soccer team during their K-12 years?
- The confidence interval does not contain the value \(p_1 - p_2 = 0\). Therefore, it is not likely that \(p_1 = p_2\). You would conclude that there is a difference in the proportions for males and females.
- A higher proportion of males than females played soccer in their youth.
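The two-proportion interval can be reproduced from the raw counts (65 of 80 males, 39 of 70 females). In this sketch the function name `two_proportion_confidence_interval` is hypothetical. The sample proportions are left unrounded, so the bounds match the slide's .07 and .45 only after rounding to two decimals.

```python
import math
from statistics import NormalDist

def two_proportion_confidence_interval(x1, n1, x2, n2, confidence=0.95):
    """Large-sample interval for p1 - p2 using the estimated standard error."""
    p1, p2 = x1 / n1, x2 / n2
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    diff = p1 - p2
    return diff - z * se, diff + z * se

lo, hi = two_proportion_confidence_interval(65, 80, 39, 70, confidence=0.99)
print(f"99% CI: {lo:.3f} < p1 - p2 < {hi:.3f}")
print("0 is inside the interval:", lo < 0 < hi)
```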

KEY CONCEPTS
I. Types of Estimators
  1. Point estimator: a single number is calculated to estimate the population parameter.
  2. Interval estimator: two numbers are calculated to form an interval that contains the parameter.
II. Properties of Good Estimators
  1. Unbiased: the average value of the estimator equals the parameter to be estimated.
  2. Minimum variance: of all the unbiased estimators, the best estimator has a sampling distribution with the smallest standard error.
  3. The margin of error measures the maximum distance between the estimator and the true value of the parameter.

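KEY CONCEPTS
III. Large-Sample Point Estimators
To estimate one of four population parameters when the sample sizes are large, use the following point estimators with the appropriate margins of error.
- \(\mu\): point estimator \(\bar{x}\), margin of error \(\pm z_{\alpha/2}\,\dfrac{s}{\sqrt{n}}\)
- \(p\): point estimator \(\hat{p} = x/n\), margin of error \(\pm z_{\alpha/2}\sqrt{\dfrac{\hat{p}\hat{q}}{n}}\)
- \(\mu_1 - \mu_2\): point estimator \(\bar{x}_1 - \bar{x}_2\), margin of error \(\pm z_{\alpha/2}\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}\)
- \(p_1 - p_2\): point estimator \(\hat{p}_1 - \hat{p}_2\), margin of error \(\pm z_{\alpha/2}\sqrt{\dfrac{\hat{p}_1\hat{q}_1}{n_1} + \dfrac{\hat{p}_2\hat{q}_2}{n_2}}\)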

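KEY CONCEPTS
IV. Large-Sample Interval Estimators
To estimate one of four population parameters when the sample sizes are large, use the following interval estimators.
- \(\mu\): \(\bar{x} \pm z_{\alpha/2}\,\dfrac{s}{\sqrt{n}}\)
- \(p\): \(\hat{p} \pm z_{\alpha/2}\sqrt{\dfrac{\hat{p}\hat{q}}{n}}\)
- \(\mu_1 - \mu_2\): \((\bar{x}_1 - \bar{x}_2) \pm z_{\alpha/2}\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}\)
- \(p_1 - p_2\): \((\hat{p}_1 - \hat{p}_2) \pm z_{\alpha/2}\sqrt{\dfrac{\hat{p}_1\hat{q}_1}{n_1} + \dfrac{\hat{p}_2\hat{q}_2}{n_2}}\)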

KEY CONCEPTS
  1. All values in the interval are possible values for the unknown population parameter.
  2. Any values outside the interval are unlikely to be the value of the unknown parameter.
  3. To compare two population means or proportions, look for the value 0 in the confidence interval. If 0 is in the interval, it is possible that the two population means or proportions are equal, and you should not declare a difference. If 0 is not in the interval, it is unlikely that the two means or proportions are equal, and you can confidently declare a difference.
V. One-Sided Confidence Bounds
Use either the upper (+) or lower (-) two-sided bound, with the critical value of \(z\) changed from \(z_{\alpha/2}\) to \(z_{\alpha}\).