One sample z-test and t-test January 30, 2017 psych10.stanford.edu
Announcements / Action Items: Install the ISI package (instructions in Getting Started with R). Assessment: Problem Set #3 due Tu 1/31 at 7 PM. R Lab in class on W 2/1. Reading: Navarro Ch. 8. Survey. Quiz.
Congratulations! We've now introduced most of the complicated ideas (and most of the terminology), and we will now revisit these ideas by exploring different types of data. (figure from Tintle et al.)
Last time: We can calculate the mean, variance, and standard deviation for a process. We can predict descriptive characteristics of distributions of hypothetical variables using: the rules of linear transformation, the rules for summing variables, and the central limit theorem.
This time: Recap: how do we describe our newest sampling distribution, a distribution of sample means? How can we use distributions of sample means to make inferences? What's the difference between a z-test and a t-test? How do we measure effect size for single means?
Inferences about means: Do herbal supplements change performance on a standardized test of memory? Scores on the test are known to be normally distributed with μ = 50 and σ = 12. A sample of n = 16 participants take herbal supplements for 90 days and then take the test. They get a mean score of x̄ = 54. We want to know if the mean score in the population that takes herbal supplements, μ_herbal, is different from 50. Is x̄ = 54 a plausible sample mean if herbal supplements actually have no effect on mean memory performance? State two hypotheses: H0: herbal supplements do not change mean performance, μ_herbal = 50. HA: herbal supplements do change mean performance, μ_herbal ≠ 50. Think about the distribution of sample means we could get if μ = 50 (and σ = 12). Determine the probability of getting a sample mean as or more extreme than x̄ = 54 if μ = 50 (i.e., a p-value).
Reminder: proportions. Population / probability distribution: describes the population or process (the distribution that we want to know about); summarized by π, our population proportion. Sample distribution: describes our sample data; summarized by p, our sample proportion. Sampling distribution (distribution of sample proportions): describes all of the possible samples of size n that we could have gotten from a hypothesized population. [figure: bar charts of the population and sample Y/N proportions, and a histogram of sample proportions]
Distribution of sample means: One population, described by μ, σ, and its shape. Many potential samples of size n, each described by x̄, s, and its shape (x̄1, x̄2, x̄3, …). Distribution of sample means: the distribution of all of these possible sample means, described by μ_x̄, σ_x̄, and its shape.
Distribution of sample means: the collection of sample means for all of the possible random samples of a particular size (n) that can be drawn from a particular population. [figure: a population distribution on the scores 1–9, one sample distribution (n = 2), and the distribution of sample means describing all of the possible samples of size n we could have gotten from that population]
Distribution of sample means: How can we describe the central tendency, variability, and shape of a distribution of sample means? Simulation: http://mfviz.com/central-limit/ http://onlinestatbook.com/stat_sim/sampling_dist/ https://supsych.shinyapps.io/sampling_and_stderr/ Theoretical approach: the central limit theorem.
Central limit theorem (CLT): For any population with a mean μ and standard deviation σ, the distribution of sample means from that population for sample size n will: 1. have a mean of μ (written E[M], E[X̄], μ_M, μ_X̄); 2. have a standard deviation of σ / √n (written SD[M], SD[X̄], σ_M, σ_X̄; the standard error of the mean, SEM), which increases as σ gets larger and decreases as n gets larger; 3. approach a normal distribution as n approaches infinity — in fact, the distribution is almost perfectly normal if either: a. the population follows a normal distribution, or b. n ≥ 30.
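A minimal simulation sketch of these three claims, in the course's R. The population here (an exponential with mean 4 and standard deviation 4) is an assumed example, not one from the slides:

```r
# Draw many samples of size n from a skewed population, then compare the
# mean and SD of the sample means to what the CLT predicts.
set.seed(1)
n <- 25
sample_means <- replicate(10000, mean(rexp(n, rate = 1/4)))  # pop: mu = 4, sigma = 4
mean(sample_means)   # close to mu = 4
sd(sample_means)     # close to sigma / sqrt(n) = 4 / 5 = 0.8
hist(sample_means)   # roughly bell-shaped even though the population is skewed
```

The histogram looks approximately normal even though the exponential population is strongly skewed, which is claim 3 in action.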
Distribution of sample means normal bimodal skewed prop 1 2 5 25 100 population distribution distributions of sample means as n increases
Careful! Standard deviation vs. standard error of the mean. Standard deviation: σ = √σ² = √(SS / N); s = √s² = √(SS / (n − 1)); the typical distance between a single score and the population mean; used to convert single scores to z-scores. Standard error of the mean: σ_x̄ = σ / √n = √(σ² / n); s_x̄ = s / √n = √(s² / n); the typical distance between a sample mean and the population mean; used to convert sample means to z-scores, often called z-statistics.
Applying the central limit theorem. A population distribution is normally distributed and has a mean of μ = 10 and a standard deviation of σ = 20. What are the mean, standard deviation, and shape of a distribution of sample means for n = 16? μ_x̄ = μ = 10; σ_x̄ = σ / √n = 20 / √16 = 5; the shape is normal because the population distribution is normal. A population distribution has a mean of μ = 400 and a standard deviation of σ = 100. What are the mean, standard deviation, and shape of a distribution of sample means for n = 64? μ_x̄ = μ = 400; σ_x̄ = σ / √n = 100 / √64 = 12.5; the shape is normal because n ≥ 30. A population distribution has a mean of μ = 50 and a standard deviation of σ = 10. What are the mean, standard deviation, and shape of a distribution of sample means for n = 4? μ_x̄ = μ = 50; σ_x̄ = σ / √n = 10 / √4 = 5; careful, we're not sure about the shape!
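A quick sketch of the three standard-error calculations above, in R:

```r
# Standard error of the mean: sigma / sqrt(n)
sem <- function(sigma, n) sigma / sqrt(n)
sem(20, 16)    # 5
sem(100, 64)   # 12.5
sem(10, 4)     # 5
```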
Inferences about means: Do herbal supplements change performance on a standardized test of memory? Scores on the test are known to be normally distributed with μ = 50 and σ = 12. A sample of n = 16 participants take herbal supplements for 90 days and then take the test. They get a mean score of x̄ = 54. We want to know if the mean score in the population that takes herbal supplements, μ_herbal, is different from 50. Is x̄ = 54 a plausible sample mean if herbal supplements actually have no effect on mean memory performance? State two opposing hypotheses: H0: herbal supplements do not change mean performance, μ_herbal = 50. HA: herbal supplements do change mean performance, μ_herbal ≠ 50. Think about the distribution of sample means we could get if μ = 50 (and σ = 12). Determine the probability of getting a sample mean as or more extreme than x̄ = 54 if μ = 50 (i.e., a p-value).
Distribution of sample means: The central limit theorem tells us that the distribution of sample means will: have a mean of μ_x̄ = μ = 50; have a standard deviation of σ_x̄ = σ / √n = 12 / √16 = 3; be approximately normally distributed (why? the population is normal). What types of sample means would we expect to see if the null hypothesis were true (μ = 50, σ = 12)? Is our sample mean, x̄ = 54, likely or unlikely if the null hypothesis is true?
Normal distribution: We know a one-to-one mapping between every z-score and its corresponding quantile: z = −1.96 is the .025 quantile; z = +1.96 is the .975 quantile. ~68% of observations fall within 1 standard deviation of the mean; ~95% of observations fall within 2 standard deviations of the mean. [figure: density plot of the standard normal distribution over z = −4 to +4]
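These landmark values are easy to check with R's normal-distribution functions:

```r
# ~68% of observations fall within 1 SD, ~95% within 2 SDs
pnorm(1) - pnorm(-1)    # ~0.683
pnorm(2) - pnorm(-2)    # ~0.954
# the z-scores marking the .025 and .975 quantiles
qnorm(c(.025, .975))    # -1.96, +1.96
```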
Distribution of sample means: The central limit theorem tells us that the distribution of sample means will: have a mean of μ_x̄ = μ = 50; have a standard deviation of σ_x̄ = σ / √n = 12 / √16 = 3; be approximately normally distributed (why? the population is normal). What types of sample means would we expect to see if the null hypothesis were true (μ = 50, σ = 12)? Convert x̄ to a z-score: z = (x̄ − μ_x̄) / σ_x̄ = (x̄ − μ) / (σ / √n) = (54 − 50) / 3 = 4/3 = 1.33.
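The same z-statistic, computed directly in R from the summary values above:

```r
# z-statistic for the herbal-supplement example
x_bar <- 54; mu <- 50; sigma <- 12; n <- 16
z <- (x_bar - mu) / (sigma / sqrt(n))
z    # 1.33
```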
From z to p-value: What is the probability of observing a sample statistic (x̄ → z) as or more extreme than our sample statistic (x̄ = 54 → z = 1.33) if the null hypothesis were true? As or more extreme: z < −1.33 or z > 1.33, so p(z < −1.33) + p(z > 1.33) [mutually exclusive]. [figure: normal curve centered at x̄ = 50 (z = 0) with both tails beyond z = ±1.33 shaded]
R: the norm() family of functions. pnorm(q, mean = 0, sd = 1, lower.tail = TRUE). By default, the pnorm() function returns the cumulative distribution function at quantile q, i.e., the cumulative relative frequency* at q, i.e., the probability of getting a value less than or equal to q. By default, q corresponds to a z-score in a normal distribution with μ = 0, σ = 1.
> pnorm(-1.96)
[1] 0.0249979
> pnorm(1.96)
[1] 0.9750021
> pnorm(1.96, lower.tail = FALSE)   # equivalent to 1 - pnorm(1.96)
[1] 0.0249979
* note, I have been a bit sloppy in referring to this as cumulative frequency
From z to p-value: What is the probability of observing a sample statistic (x̄ → z) as or more extreme than our sample statistic (x̄ = 54 → z = 1.33) if the null hypothesis were true? As or more extreme: z < −1.33 or z > 1.33, so p(z < −1.33) + p(z > 1.33) [mutually exclusive].
> pnorm(-1.33)
[1] 0.09175914
> pnorm(1.33, lower.tail = FALSE)
[1] 0.09175914
> pnorm(-1.33) * 2
[1] 0.1835183
p = .18
Distribution of sample means: The central limit theorem tells us that the distribution of sample means will: have a mean of μ_x̄ = μ = 50; have a standard deviation of σ_x̄ = σ / √n = 12 / √16 = 3; be approximately normally distributed (why? the population is normal). Is our sample mean, x̄ = 54, likely or unlikely if the null hypothesis is true? p > .05, so plausible.
Inferences about means: Do herbal supplements change performance on a standardized test of memory? Scores on the test are known to be normally distributed with μ = 50 and σ = 12. A sample of n = 16 participants take herbal supplements for 90 days and then take the test. They get a mean score of x̄ = 54. We want to know if the mean score in the population that takes herbal supplements, μ_herbal, is different from 50. Is x̄ = 54 a plausible sample mean if herbal supplements actually have no effect on mean memory performance? State two opposing hypotheses: H0: herbal supplements do not change mean performance, μ_herbal = 50. HA: herbal supplements do change mean performance, μ_herbal ≠ 50. Think about the distribution of sample means we could get if μ = 50 (and σ = 12). The probability of getting a sample mean as or more extreme than x̄ = 54 if μ = 50 is p = .18, so we fail to reject H0 and conclude we do not have evidence that herbal supplements change mean performance on the test of memory.
Case study: quality assurance. Company standards at a fast-food chain require that each franchise has a mean wait time of less than μ = 10 minutes. It is already known that wait times have a standard deviation of σ = 2. To test whether a franchise is in compliance, a manager looks at data from a recent customer-service survey with a sample of n = 100 customers and wants to see if there is evidence that wait times are greater than 10 minutes for the population of all customers, using α = .05. She finds that the franchise has a mean wait time of x̄ = 10.5 minutes. State two opposing hypotheses: H0: μ = 10 (μ ≤ 10); HA: μ > 10. Is our sample mean likely if H0 is true?
Case study: quality assurance. What types of sample means would we get if μ = 10, σ = 2, n = 100? The central limit theorem tells us that the distribution of sample means will: have a mean of μ_x̄ = μ = 10; have a standard deviation of σ_x̄ = σ / √n = 2 / √100 = 0.20; be approximately normally distributed (why? n ≥ 30). What is the probability of getting a sample mean as extreme as our sample mean of x̄ = 10.5, if H0 is true? (1) Convert x̄ to a z-score: z = (x̄ − μ) / (σ / √n) = (10.5 − 10) / (2 / √100) = .5 / .2 = 2.5. (2) Use pnorm() to find the (one-tailed) p-value:
> pnorm(2.5, lower.tail = FALSE)
[1] 0.006209665
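The whole one-tailed z-test above, as a compact R sketch:

```r
# One-tailed z-test for the wait-time example
x_bar <- 10.5; mu <- 10; sigma <- 2; n <- 100
z <- (x_bar - mu) / (sigma / sqrt(n))   # 2.5
pnorm(z, lower.tail = FALSE)            # ~0.0062, less than alpha = .05
```

The test is one-tailed because HA is directional (μ > 10), so only the upper tail counts as "as or more extreme."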
Case study: quality assurance. Company standards at a fast-food chain require that each franchise has a mean wait time of less than μ = 10 minutes. It is already known that wait times have a standard deviation of σ = 2. To test whether a franchise is in compliance, a manager looks at data from a recent customer-service survey with a sample of n = 100 customers and wants to see if there is evidence that wait times are greater than 10 minutes for the population of all customers, using α = .05. She finds that the franchise has a mean wait time of x̄ = 10.5 minutes. State two opposing hypotheses: H0: μ = 10 (μ ≤ 10); HA: μ > 10. Is our sample mean likely if H0 is true? No, p < α (.0062 < .05).
Case study: quality assurance. Company standards at a fast-food chain require that each franchise has a mean wait time of less than μ = 10 minutes. It is already known that wait times have a standard deviation of σ = 2. To test whether a franchise is in compliance, a manager looks at data from a recent customer-service survey with a sample of n = 100 customers and wants to see if there is evidence that wait times are greater than 10 minutes for the population of all customers, using α = .05. She finds that the franchise has a mean wait time of x̄ = 10.5 minutes. Some considerations about sampling: How did they recruit a representative group of customers (e.g., who arrived at the franchise at representative times of day)? Were some customers more likely to respond than others?
We can also use pnorm() to get exact p-values when using the normal approximation for a distribution of sample proportions (Which, as a reminder, follows directly from combining expected value and variance of a process with the central limit theorem!)
Case study: focus group. A user-experience researcher wants to know if all users prefer one website layout (Layout A) over another (Layout B). He samples n = 64 customers and asks them which layout they prefer. He finds a sample proportion of p = .68 prefer Layout A, and wants to know if this is evidence that there is a preference in the population of all users, using α = .01. State two opposing hypotheses: H0: π = .50; HA: π ≠ .50. Is our sample proportion likely if H0 is true?
Case study: focus group. What types of sample proportions would we get if π = .50, n = 64? The central limit theorem tells us that the distribution of sample proportions will: have a mean of μ_p = π = .50; have a standard deviation of σ_p = √(π(1 − π) / n) = √(.5 × .5 / 64) = .0625; be approximately normally distributed (why?). What is the probability of getting a sample proportion as extreme as our sample proportion of p = .68, if H0 is true? (1) Convert p to a z-score: z = (p − μ_p) / σ_p = (.68 − .50) / .0625 = .18 / .0625 = 2.88. (2) Use pnorm() to find the (two-tailed) p-value:
> pnorm(-2.88) * 2
[1] 0.003976752
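The same proportion z-test, written out as an R sketch:

```r
# Two-tailed z-test for a proportion (layout-preference example)
p_hat <- .68; pi0 <- .50; n <- 64
se <- sqrt(pi0 * (1 - pi0) / n)   # 0.0625: SD under H0, determined entirely by pi0
z <- (p_hat - pi0) / se           # 2.88
2 * pnorm(-abs(z))                # ~0.004, less than alpha = .01
```

Note that the standard error uses π under H0, which is why (as discussed later) a proportion test never needs an estimated standard deviation.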
Case study: focus group. A user-experience researcher wants to know if all users prefer one website layout (Layout A) over another (Layout B). He samples n = 64 customers and asks them which layout they prefer. He finds a sample proportion of p = .68 prefer Layout A, and wants to know if this is evidence that there is a preference in the population of all users, using α = .01. State two opposing hypotheses: H0: π = .50; HA: π ≠ .50. Is our sample proportion likely if H0 is true? No, p < α (.004 < .01).
Case study: vertical-horizontal illusion. The vertical line is 2 inches. How long is the horizontal line? They are the same length! After learning that the vertical line is 2 in., a sample of n = 25 participants estimate the length of the horizontal line, and give a mean estimate of x̄ = 1.7. Do we have evidence that, in the entire population, people misjudge the relative lengths of the two lines? State two hypotheses: H0: people judge the lines to be of equal length, μ = 2. HA: people do not judge the lines to be of equal length, μ ≠ 2.
t-statistic (vs. z-statistic):
z = (x̄ − μ_x̄) / σ_x̄        t = (x̄ − μ_x̄) / s_x̄
z = (x̄ − μ) / (σ / √n)      t = (x̄ − μ) / (s / √n)
z = (x̄ − μ) / √(σ² / n)     t = (x̄ − μ) / √(s² / n)
Problem: the z-statistic requires the population standard deviation under H0. Solution: use our best estimate of the population standard deviation, s.
What determines whether we calculate a z-statistic or a t-statistic? Do we know the population standard deviation under H0? Yes → z-statistic. No → t-statistic. Note: in the case of a proportion, both the mean and variance are determined by π, so we always know the population standard deviation under H0 and always use a z-statistic.
Case study: vertical-horizontal illusion. The vertical line is 2 inches. How long is the horizontal line? They are the same length! After learning that the vertical line is 2 in., a sample of n = 25 participants estimate the length of the horizontal line, and give a mean estimate of x̄ = 1.7 with a standard deviation of s = .5. Do we have evidence that, in the entire population, people misjudge the relative lengths of the two lines, using α = .05? State two hypotheses: H0: people judge the lines to be of equal length, μ = 2. HA: people do not judge the lines to be of equal length, μ ≠ 2.
Case study: vertical-horizontal illusion. What is the probability of getting a sample mean as extreme as our sample mean of x̄ = 1.7, if H0 is true (if μ = 2)? (1) Convert x̄ to a t-statistic: t = (x̄ − μ) / (s / √n) = (1.7 − 2) / (.5 / √25) = −.3 / .1 = −3.0. (2) Use pt() to find the (two-tailed) p-value:
> pt(-3.0, df = 24) * 2
[1] 0.006205737
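The full one-sample t-test above, sketched in R from the summary statistics:

```r
# One-sample t-test for the illusion example, from summary statistics
x_bar <- 1.7; mu0 <- 2; s <- .5; n <- 25
t_stat <- (x_bar - mu0) / (s / sqrt(n))   # -3
2 * pt(-abs(t_stat), df = n - 1)          # ~0.0062
```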
Distributions of t-statistics. z = (x̄ − μ) / (σ / √n); t = (x̄ − μ) / (s / √n). Consider z- and t-statistics across different samples. Numerator (same): x̄ will vary across samples (sampling error). Denominator (different): for z, σ does not vary across samples (it is defined by the population); for t, s does vary across samples (sampling error). This added source of variability in the t-statistic means that t-statistics are more variable than z-statistics, so t-statistics are not quite normally distributed — they have heavier tails.
Distributions of t-statistics: Student's t-distribution — a family of distributions.
Distributions of t-statistics: What will determine the variability of s? With a larger sample size, s becomes less variable (the variability of the sampling distribution of s decreases — the law of large numbers).
Distributions of t-statistics. The t-distribution(s): mean of 0; symmetrical; bell-shaped; more probability in the tails ("heavier tails"); with larger sample size, approaches a normal distribution. [figure: density plot comparing the z-distribution (normal) with t-distributions for small, medium, and large sample sizes]
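The heavier tails are easy to see numerically by comparing tail probabilities beyond the same cutoff:

```r
# Probability beyond +/-2 under the normal vs. t-distributions
2 * pnorm(-2)        # ~0.046
2 * pt(-2, df = 5)   # ~0.10 -- much heavier tails at small df
2 * pt(-2, df = 50)  # ~0.051 -- close to the normal value at large df
```

The same cutoff is more "surprising" under the normal distribution than under a small-df t-distribution, which is why t-based p-values are larger for small samples.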
Degrees of freedom: Rather than defining t-distributions by sample size, we define them by degrees of freedom (df) — the number of values in the sample that are free to vary. Suppose we have a sample of n = 3 with x̄ = 5. Once any two scores are chosen, the third is forced: sample 1: x1 = 2, x2 = 6, so x3 must be 7; sample 2: x1 = 1, x2 = 3, so x3 must be 11; sample 3: x1 = 1000, x2 = −1000, so x3 must be 15. The first two scores are free to vary; the last is not. For a single sample of n scores, df = n − 1. We can therefore think of s² = SS / (n − 1) as s² = SS / df.
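A quick R sketch showing that SS / df is exactly what var() computes, using the first sample above:

```r
# Sample variance divides SS by df = n - 1
x <- c(2, 6, 7)                # sample with mean 5
ss <- sum((x - mean(x))^2)     # SS = 14
ss / (length(x) - 1)           # 7
var(x)                         # 7 -- var() uses n - 1 by default
```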
Distributions of t-statistics. The t-distribution(s): mean of 0; symmetrical; bell-shaped; more probability in the tails ("heavier tails"); with larger df, approaches a normal distribution. [figure: density plot comparing the z-distribution (normal) with t-distributions for df = 2, df = 5, and df = 50]
R: pt(). pt(q, df, ncp, lower.tail = TRUE, log.p = FALSE). By default, the pt() function returns the cumulative distribution function at quantile q, i.e., the cumulative relative frequency* at q, i.e., the probability of getting a value less than or equal to q. q corresponds to a t-statistic in a t-distribution defined by df.
> pt(-1.96, df = 10)
[1] 0.03921812
> pt(1.96, df = 10)
[1] 0.9607819
> pt(1.96, df = 10, lower.tail = FALSE)   # equivalent to 1 - pt(1.96, df = 10)
[1] 0.03921812
* note, I have been a bit sloppy in referring to this as cumulative frequency
Case study: vertical-horizontal illusion. What is the probability of getting a sample mean as extreme as our sample mean of x̄ = 1.7, if H0 is true (if μ = 2)? (1) Convert x̄ to a t-statistic: t = (x̄ − μ) / (s / √n) = (1.7 − 2) / (.5 / √25) = −.3 / .1 = −3.0. (2) Use pt() to find the (two-tailed) p-value:
> pt(-3.0, df = 24) * 2
[1] 0.006205737
The p-value is less than alpha → reject H0. H0: people judge the lines to be of equal length, μ = 2. HA: people do not judge the lines to be of equal length, μ ≠ 2.
R: t.test(). t.test(x, alternative = c("two.sided", "less", "greater"), mu = 0). t.test() performs many types of t-tests; for now we'll focus on only a few arguments: x is a vector of values; mu is the population mean (μ) if H0 is true; alternative specifies whether you are performing a two-tailed or one-tailed test.
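A usage sketch of t.test() for a one-sample test against μ = 2. The raw data here are made up for illustration (the slides give only summary statistics, not the participants' estimates):

```r
# Hypothetical length estimates (illustrative, not the class data)
estimates <- c(1.4, 1.9, 1.6, 1.8, 1.5, 2.0, 1.7, 1.6)
# Two-tailed one-sample t-test of H0: mu = 2
t.test(estimates, mu = 2, alternative = "two.sided")
```

The output reports the t-statistic, df = n − 1, the two-tailed p-value, and a confidence interval for the mean.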
We rejected H0, now what? We rejected H0 and concluded that μ ≠ 2. Statistical significance: if our null hypothesis were true, would it be plausible to observe x̄ in our sample? Effect size: is the difference between x̄ and the μ from our null hypothesis (μ0) meaningful in the real world? Why not judge effect size from the p-value, z-statistic, or t-statistic? Because these are influenced both by the difference between x̄ and μ0 and by the sample size (n).
Proportion of variance explained (r²). [figure: total variance partitioned into the variance we can explain (the shift from μ0 to x̄) and the variance we cannot explain] r² = variance we can explain / total variance = t² / (t² + df). r² is a proportion, so it ranges from 0 to 1. In our illusion example, r² = (−3)² / ((−3)² + 24) = 9/33 ≈ .27.
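The r² calculation for the illusion example, as a one-line R sketch:

```r
# r^2 = t^2 / (t^2 + df) for the illusion example
t_stat <- -3; df <- 24
t_stat^2 / (t_stat^2 + df)   # 9 / 33 = ~0.27
```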
Recap: The central limit theorem can tell us the mean, standard deviation, and shape of a distribution of sample means. We can use a distribution of sample means to tell us whether a sample mean is plausible given a hypothesized population mean. We use a z-test when we know the population standard deviation and a t-test when we estimate the population standard deviation. We've introduced r² as a new measure of effect size.
Quiz 1: Nice work! median = .91, IQR = .15; mean = .86, sd = .13. Will post solutions / common mistakes soon. If you are surprised by, or not happy with, your score, please see us ASAP to start planning for the next quiz. [histogram of scores from 0.6 to 1.0; note: the left-most bin represents ≤ .60 (this is an open-ended distribution)]
Questions?