STATISTICAL DISTRIBUTIONS AND THE CALCULATOR

1. Basic data sets

a. Measures of center
- Mean ($\bar{x}$): The average of all values. Characteristic: Non-resistant; it is affected by skew and outliers.
- Median: Either the center-most data value or the average of the two center-most data values. Characteristic: Resistant; it is only affected by which side (left or right) new data falls on, not by outliers or skew.
- Mode: The most common value. Characteristic: The peak of the data set.
Important fact to remember: When a distribution is skewed or has outliers, the median is typically a better measure of center than the mean.

b. Spread
- Range: Maximum - minimum. A weak measure of spread, but a simple one.
- First quartile (Q1): The median of the lower half of the data set; the 25th percentile. Characteristic: Approximately 25% of the data is below Q1 and approximately 75% of the data is above Q1.
- Third quartile (Q3): The median of the upper half of the data set; the 75th percentile. Characteristic: Approximately 75% of the data is below Q3 and approximately 25% of the data is above Q3.
Important fact to remember: The amount of data (about 25%) contained between the minimum and Q1, Q1 and the median, the median and Q3, and Q3 and the maximum is the same no matter how clustered the data may be in any given quartile.
- Interquartile range (IQR): Q3 - Q1. About 50% of the data is contained in the IQR.
- Outliers: Any point below Q1 - 1.5(IQR) or above Q3 + 1.5(IQR) is an outlier.
- Standard deviation: Measures typical deviation from the mean. Formula: $s_x = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2}$. Characteristics: Standard deviation is never negative (it is zero only when every value is the same). It is the square root of the variance.
- Variance: Measures the overall variation in the data. More comprehensive than range. Characteristics: Variance is never negative, and it is the square of the standard deviation.
Important fact to remember: Standard deviation measures the typical distance between a data point and the mean.
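The definitions above can be cross-checked away from the calculator. The following is a minimal Python sketch, not part of the original handout: the data set and the helper names `five_number_summary` and `find_outliers` are made up for illustration, and the quartiles follow the "median of the lower/upper half" rule stated above (other software may use a different quartile convention).

```python
import statistics

def five_number_summary(data):
    """Return (min, Q1, median, Q3, max), with Q1 and Q3 as medians of the lower and upper halves."""
    xs = sorted(data)
    n = len(xs)
    half = n // 2
    lower = xs[:half]              # lower half (middle value excluded when n is odd)
    upper = xs[half + n % 2:]      # upper half (middle value excluded when n is odd)
    return (xs[0], statistics.median(lower), statistics.median(xs),
            statistics.median(upper), xs[-1])

def find_outliers(data):
    """Return the values that fall beyond the 1.5 * IQR fences."""
    _, q1, _, q3, _ = five_number_summary(data)
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    return [x for x in data if x < low_fence or x > high_fence]

# Hypothetical data set with one unusually large value.
data = [2, 4, 4, 5, 7, 8, 9, 11, 30]

summary = five_number_summary(data)
mean = statistics.fmean(data)
s = statistics.stdev(data)            # sample standard deviation (divides by n - 1)
variance = statistics.variance(data)  # sample variance, the square of s

print("five-number summary:", summary)
print("mean:", round(mean, 2), "median:", summary[2])
print("outliers:", find_outliers(data))
print("s:", round(s, 2), "variance:", round(variance, 2))
```

In this made-up data set the single large value (30) pulls the mean to about 8.9 while the median stays at 7, which illustrates why the mean is called non-resistant.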
Calculator Commands
Entering data into a list: STAT -> EDIT
Calculating summary statistics: STAT -> CALC -> 1: 1-Var Stats 2ND # (whatever number your list is)

2. Comparing two variables

a. Linear regression
- Least squares regression line (LSRL): This is the line that minimizes the sum of the squares of the residuals.
- Residual: How far off a prediction made by an LSRL is from the actual data. Calculated by: actual - predicted.
- Correlation coefficient $r$: A measure of how strong the linear relationship between two variables is. Characteristics: Always between -1 and 1. A correlation of 0 implies no linear relationship between the two variables. Must have the same sign as the slope of the regression line.
Important fact to remember: A high value of $r$ does NOT imply that two variables have a linear relationship; it only shows the strength of such a relationship if it exists. Two variables with an exponential relationship may just happen to have a high $r$ given an LSRL; this doesn't change the fact that the actual relationship is nonlinear. To determine if a relationship is linear, you must look at a scatterplot or a residual plot.
- Coefficient of determination ($r^2$): Describes what percentage of the overall variation in a $y$-variable can be explained by plotting the $y$-variable against an $x$-variable with an LSRL.

Characteristic: Put otherwise, $r^2$ measures by what percentage the sum of the squared residuals is reduced when using an LSRL instead of the horizontal line $y = \bar{y}$.

Calculator Commands
Make sure you have DiagnosticOn. This can be achieved by: 2ND -> 0 (CATALOG) -> DiagnosticOn -> Enter -> Enter.
To calculate an LSRL given two data sets in L1 and L2: STAT -> CALC -> 8: LinReg(a+bx) L1, L2
a is the $y$-intercept of the LSRL. It is the predicted value of the $y$-variable when $x = 0$.
b is the slope of the LSRL. It is the predicted change in units of the $y$-variable for every increase of 1 unit in the $x$-variable.

3. Basic graphical displays

Boxplots
- Requirements for an E (essentially correct): Mark outliers; scale
- Advantages: Easy to graph
- Disadvantages: Cannot determine gaps or # of peaks
- Done in calculator? Yes

Stemplots
- Requirements for an E: Key/legend; no skipping
- Advantages: All observations shown (can see gaps, # of peaks)
- Disadvantages: Time consuming for large data sets
- Done in calculator? Yes (sorta)

Bar charts
- # and type of variables: 1 or more categorical
- Requirements for an E: Key/legend; must use rel. freq. if sample sizes are different
- Advantages: Simple, easy to graph
- Disadvantages: Misleading if without relative frequency
- Done in calculator? No

Segmented bar charts
- # and type of variables: 1 or more categorical
- Requirements for an E: Key/legend; frequency v. relative frequency
- Advantages: Can be hard to graph
- Disadvantages: Misleading if without relative frequency
- Done in calculator? No

Histograms
- Requirements for an E: Bars must touch if continuous; bars must be inclusive to the right
- Advantages: Good for continuous variables; reveal shape; median can be estimated
- Disadvantages: Median cannot be found exactly; mean can only be estimated
- Done in calculator? Yes

Scatterplots
- # and type of variables: 2 quantitative
- Advantages: Easily readable (DOFS: direction, outliers, form, strength)
- Disadvantages: N/A
- Done in calculator? Yes

Cumulative frequency plots
- Requirements for an E: Add relative frequencies as $x$ increases
- Advantages: Can determine percentiles
- Disadvantages: N/A
- Done in calculator? N/A

Among the graphs featured on the AP exam, the TI-84 can graph: scatterplots, histograms (frequency or relative frequency), and boxplots (with or without outliers).
For any graph, make sure that all functions are cleared from the Y= menu. Enter your data using STAT -> EDIT. Hit 2ND Y= (STAT PLOT). Turn the plot On. Select the graph you want and hit ZOOM 9:ZoomStat.
To trace anything along the graph (quartiles for boxplots, points on a scatterplot, etc.), hit TRACE.
To put data in ascending order for a stemplot, enter your data into a list. Then hit STAT 2:SortA( followed by the list name (for example 2ND 1 for L1), close the parenthesis, and hit ENTER.
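Once the data are sorted, building the stemplot by hand is mechanical. The sketch below shows one way to do the same thing in Python; it is an illustration only, the `stemplot` helper and the data are hypothetical, and it assumes non-negative whole numbers split as tens (stem) and ones (leaf).

```python
from collections import defaultdict

def stemplot(values):
    """Build stemplot rows for non-negative whole numbers (stem = tens, leaf = ones digit)."""
    leaves = defaultdict(list)
    for v in sorted(values):                 # ascending order, the same job SortA( does
        leaves[v // 10].append(v % 10)
    first, last = min(leaves), max(leaves)
    # Include every stem between the smallest and largest so that gaps stay visible.
    return [f"{stem} | {''.join(str(d) for d in leaves[stem])}"
            for stem in range(first, last + 1)]

# Hypothetical data; note that no value falls in the 30s.
rows = stemplot([21, 47, 12, 40, 15, 21])
print("Key: 1 | 2 means 12")
for row in rows:
    print(row)
```

The empty `3 | ` row is kept on purpose: skipping a stem would hide the gap in the data.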

3. Normal distributions

a. All Normal distributions follow what is called the 68-95-99.7 rule (known as the Empirical Rule). This states that, for any Normal distribution, about 68% of the data is contained within one standard deviation of the mean, about 95% within 2 standard deviations of the mean, and about 99.7% within 3 standard deviations of the mean. This is useful for estimating probabilities and percentiles.

b. Percentile: A data point is at the $p$th percentile if $p$ percent of the data is equal to or below the data point.

EXAMPLE: Suppose the distribution of heights of ten-year-old boys is Normal with a mean of 58 inches and a standard deviation of 2.5 inches. Answer the following.
- Shaun is a 10-year-old boy and is 60.5 inches tall. Shaun's height is at what percentile?
Answer: 60.5 is one standard deviation above the mean, so 50 + 34 = 84 percent of 10-year-old boys' heights are below 60.5 inches. Shaun is at the 84th percentile.
- Andre is a 10-year-old boy who is 53 inches tall. Andre's height is at what percentile of the height distribution?
Answer: 53 is 58 - 2(2.5), so Andre's height is two standard deviations below the mean. Since 95% of the data is within two standard deviations of the mean, Andre's height must be at the edge of the outer 5%; because his percentile rank only includes heights below his, he is at the (5/2) = 2.5th percentile.
- Tommy is 59 inches tall. What percentile is Tommy's height at?
Answer: 59 is less than one standard deviation above the mean; this means his height is in the region just above the mean that contains 34% of the data. His height must be somewhere between the 50th and 84th percentile. To answer this more specifically, we need a calculator command.

To calculate the probability of a data point falling in a certain range, or a data point's percentile, use normalcdf. Hit 2ND VARS 2:normalcdf(. Enter the following four inputs: lower bound, upper bound, $\mu$, $\sigma$).
The lower bound needs to be extreme enough to have only a negligible amount of data below it. To answer the question about Tommy, we would enter: normalcdf(-100, 59, 58, 2.5) = .6554216971. Tommy's height is at approximately the 65.5th percentile of heights of 10-year-old boys.

c. $z$-scores: Because all Normal distributions have the same shape, it is useful to have what is called the standard Normal distribution, which is the Normal distribution with mean $\mu = 0$ and standard deviation $\sigma = 1$. Any data point in a Normal distribution can be converted into a data point on the standard Normal curve; the corresponding value is called that data point's $z$-score, and is calculated as follows: for a data point $x$ and a Normal distribution with mean $\mu$ and standard deviation $\sigma$, the data point's $z$-score is $z = \frac{x - \mu}{\sigma}$.
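The percentile calculations for Shaun, Andre, and Tommy can be reproduced outside the calculator. The sketch below is an illustration rather than part of the original handout; it uses `statistics.NormalDist` from the Python standard library (Python 3.8+), and the helper names `percentile` and `z_score` are made up. It also checks that standardizing first and then using the standard Normal curve gives the same percentile.

```python
from statistics import NormalDist

# Heights of 10-year-old boys from the example: mean 58 in, standard deviation 2.5 in.
heights = NormalDist(mu=58, sigma=2.5)

def percentile(dist, x):
    """Proportion of the distribution at or below x (what normalcdf gives with a very low lower bound)."""
    return dist.cdf(x)

def z_score(x, mu, sigma):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mu) / sigma

shaun = percentile(heights, 60.5)   # one standard deviation above the mean
andre = percentile(heights, 53)     # two standard deviations below the mean
tommy = percentile(heights, 59)

# Same answer for Tommy by standardizing first and using the standard Normal curve.
tommy_z = z_score(59, 58, 2.5)
tommy_via_z = NormalDist().cdf(tommy_z)   # NormalDist() defaults to mean 0, sd 1

print(f"Shaun: {shaun:.4f}  Andre: {andre:.4f}  Tommy: {tommy:.4f}")
print(f"Tommy's z-score: {tommy_z:.2f}, percentile via z: {tommy_via_z:.4f}")
```

The exact values for Shaun and Andre (about 0.8413 and 0.0228) differ slightly from the Empirical Rule estimates of 84% and 2.5% because the rule's percentages are rounded.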

Standardizing is useful for a few reasons. Firstly, $z$-scores are measured in standard deviations above or below the mean. This means that given two Normal distributions with completely different units, we can compare data points on the distributions by standardizing them and comparing their $z$-scores. In addition, the standard Normal distribution has been studied enough that we have what is known on the AP Statistics formula sheet as Table A: given a $z$-score, you can find that $z$-score's percentile (i.e. the proportion of data to the left of that $z$-score).

EXAMPLE: Benjamin and Will both play football at their local high school. Benjamin is known for his speed, while Will is known for his strength. The distribution of 40-yard dash times of football players at this school is approximately Normal with mean 4.9 seconds and standard deviation 0.2 seconds; the distribution of bench-press lifts is approximately Normal with mean 250 pounds and standard deviation 15 pounds. Benjamin can run a 40-yard dash in 4.52 seconds, and Will can bench-press 285 pounds. Who is better with regard to their respective strengths?
Answer: The two players' strengths are measured in different units, so we will standardize their respective values: $z_{Benjamin} = \frac{4.52 - 4.9}{0.2} = -1.9$ and $z_{Will} = \frac{285 - 250}{15} \approx 2.33$. While both players are better than the average player at their school, Benjamin is only 1.9 standard deviations faster than the average player, while Will is 2.33 standard deviations above the mean. Will is better.

To calculate the percentile of a $z$-score, use normalcdf(. There are two options: enter the lower and upper bound together with $\mu = 0$ and $\sigma = 1$, or simply do not input the mean or standard deviation. The TI-84 assumes that, if you input no mean or standard deviation, it is calculating based on the standard Normal distribution. So, if you wanted to calculate Will's percentile, there are four options:
1. normalcdf(-100, 2.33, 0, 1) = .990096947
2. normalcdf(-100, 2.33) = .990096947
3. normalcdf(-100, 285, 250, 15) = .9901846932
4. Use Table A and find 2.3 in the leftmost column and 0.03 in the uppermost row: the corresponding value in the table is .9901.
The values are slightly different because the $z$-score of 2.33 was rounded and because Table A rounds the probability to 4 decimal places.

Sometimes, you are given a percentile and are asked to calculate the data value that has that percentile. For this, we use another calculator command. When given a percentile and the mean and standard deviation of a Normal distribution, you can calculate the data value with that percentile using invNorm. Hit 2ND VARS 3:invNorm( and enter the following inputs: area to the left, mean, standard deviation).

EXAMPLE: Refer back to the data on the heights of 10-year-old boys. Eric, a ten-year-old boy, brags to his friends that he is at the 98th percentile for height. How tall must Eric be?
Answer: invNorm(0.98, 58, 2.5) = 63.13437228. Eric must be about 63.13 inches tall.

As shown next, invNorm( is integral in calculating critical values for confidence intervals. To calculate the critical value for a confidence interval about a sample proportion (remember: PROPZ: Z is for Proportions), you can use invNorm(. To calculate the critical value for a confidence interval with confidence level $C$, you must keep in mind that the confidence level is the proportion of data contained around the mean; it is NOT equivalent to a percentile. To find the critical value, you must add to the given confidence level half of the remaining area; this corresponds to the area in the left tail (what the confidence interval does not contain). To calculate the critical value for a confidence interval about a proportion with confidence level $C$, enter the following: invNorm($C + \frac{1 - C}{2}$, $\mu$, $\sigma$)

given $\mu = 0$ and $\sigma = 1$. Practically, you will not type this explicitly; you just need to add in the tail mentally. For example, the critical value for a 98% confidence interval is: invNorm(0.99, 0, 1), which is approximately 2.326. This adds in the 0.01 to the left of the desired confidence interval.

4. $t$ distributions

When working with means, we often do not know the population standard deviation $\sigma$. Because of this, we must estimate $\sigma$ with the sample standard deviation $s$. When constructing a distribution using the sample standard deviation, the critical values must be slightly larger; this compensates for the extra uncertainty introduced by estimating $\sigma$ with $s$, which varies from sample to sample.
Characteristics: All $t$ distributions have shorter peaks than the standard Normal distribution and more area in the tails. The particular shape of a $t$ distribution is determined by its degrees of freedom, which is simply $n - 1$, where $n$ is the sample size. The larger the sample size, the more the $t$ distribution will resemble a Normal distribution. Similarly to Normal distributions, there is a standard $t$ distribution; this is critical, because the TI-84 can only calculate probabilities and percentiles for the standard $t$ distribution.

a. $t$-scores: $t$-scores are calculated almost identically to $z$-scores, only using $s$ instead of $\sigma$. Given a data point $x$, mean $\mu$, and sample standard deviation $s$, the corresponding $t$-value for that data point is $t = \frac{x - \mu}{s}$.
You will not likely be asked to calculate probabilities or percentiles, but you will need to know how to find critical values for confidence intervals. To calculate the critical value for a confidence interval about a mean with confidence level $C$, hit 2ND VARS 4:invT( and input the following: $C + \frac{1 - C}{2}$, $n - 1$). Practically, you will not type this explicitly; you just need to add in the tail mentally. For example, if the sample size is 10, then the critical value for a 99% confidence interval about that sample mean is: invT(0.995, 9), which is approximately 3.250. This adds in the 0.005 to the left of the desired confidence interval and includes the 9 degrees of freedom.

5. Binomial distributions

a. Binomial distribution: The distribution of the number of times a certain binary event occurs.
Characteristics: Defined by two parameters: $n$, the number of trials, and $p$, the probability of success on each trial.
Conditions for a variable to be considered binomial: BINS
- Binary: The possible outcomes of each trial can be classified as success or failure.
- Independent: Trials must be independent; the result of any one trial will not affect the result of any other trial.
- Number: The number of trials must be fixed in advance.
- Success: There is the same probability of success on each trial; when sampling without replacement, this is treated as met as long as the 10% condition is met.
Characteristic: The binomial distribution defined by $(n, p)$ has mean $\mu = np$ and standard deviation $\sigma = \sqrt{np(1 - p)}$.

Oftentimes in a binomial setting, we want to calculate the probability of an outcome occurring a certain number of times (or more or less). This can be done with the calculator. In a binomial setting with number of trials $n$, probability of success $p$, and desired number of successes $x$, we can calculate the probability of $x$ or fewer successes using binomcdf. Hit 2ND VARS B:binomcdf( and input n, p, x). It is important to know that this is inclusive; the result of this command is the probability of anything up to, and including, $x$ successes.

EXAMPLE: If you flip a fair two-sided coin 10 times, what is the probability of getting fewer than 4 tails?

Answer: This is a binomial setting because there is a fixed number of independent trials, each of which has two possible outcomes: tails, or not tails. Because it says fewer than 4 tails, we must calculate the probability of three or fewer tails by entering the following: binomcdf(10, 0.5, 3) = 0.171875. The probability of getting fewer than 4 tails in 10 flips of a coin is approximately 0.172.

EXAMPLE: If you flip a fair two-sided coin 10 times, what is the probability of getting more than 8 heads?
Answer: Getting more than 8 heads means getting 9 or 10 heads, so the probability is equal to 1 minus the probability of getting 8 or fewer heads: 1 - binomcdf(10, 0.5, 8) = 0.0107421875. The probability of getting more than 8 heads in 10 flips of a fair two-sided coin is approximately 0.011.

b. Normal approximations to a binomial distribution: Binomial distributions are usually skewed if they do not have a sufficiently large sample size. If $p < 0.5$, a binomial distribution without a sufficiently large sample size will be skewed right, and a binomial distribution with $p > 0.5$ without a sufficiently large sample size will be skewed left. However, there is a very simple, albeit subjective, rule for determining if a Normal distribution can approximate a binomial distribution.
Important fact to remember: If $np \geq 10$ and $n(1 - p) \geq 10$, then you can use a Normal approximation to a binomial distribution. Some resources say that you only need 5 successes and 5 failures; this looser rule is usually reserved for cases where $p$ is already close to 0.5. The approximating Normal distribution will have mean $\mu = np$ and standard deviation $\sigma = \sqrt{np(1 - p)}$.

EXAMPLE: Suppose that 12% of all potatoes in a large truckload have blemishes. If 100 potatoes are randomly selected, what is the probability that more than 15 potatoes have blemishes?
Answer: This is a binomial setting because there is a fixed number of trials that have only two possible outcomes: blemish or no blemish.
Because this involves sampling without replacement, it is important to note that the sample size of 100 is probably less than 10% of the population (# of potatoes in the large truckload); this means that any two trials are independent within reason. Now, in a sample of size 100, we would expect $np = 100(0.12) = 12$ successes (potatoes with blemishes) and $n(1 - p) = 100(0.88) = 88$ failures (potatoes without blemishes). Because both of these numbers are at least 10, we can approximate the binomial distribution with a Normal distribution with mean $\mu = 12$ and standard deviation $\sigma = \sqrt{100(0.12)(0.88)} \approx 3.25$. We now enter: normalcdf(15, 1000, 12, 3.25) = 0.1779835349. The probability of getting more than 15 potatoes with blemishes is approximately 0.178.

6. Geometric distributions

a. Geometric distribution: This is the distribution of how many trials are needed in a binomial-type setting (binary, independent trials with the same probability of success) to achieve the first success.
Characteristic: If the probability of success in a chance process is $p$, then the expected number of trials until a first success is $\frac{1}{p}$.

EXAMPLE: Suppose that 22% of all households in a certain town have laptop computers. If you were to randomly select houses, what is the probability that it would take until the 7th house to find one that has a laptop?
Answer: This would require finding 6 houses without a laptop and then finding the last house with a laptop; assuming that 7 is less than 10% of the number of households in the town, we know that the probability of success on each house is approximately 0.22. So, our solution is: $(0.78)^6(0.22) \approx 0.0495$. There is approximately a 5% chance that it would take until the 7th house to find the first one with a laptop.

7. Linear transformations and combinations of random variables

a. Multiplying a random variable by a constant: If a random variable $X$ has mean $\mu$ and standard deviation $\sigma$, then the random variable $aX$, where $a$ is a constant, has mean $a\mu$ and standard deviation $|a|\sigma$. The variance of the new variable is $a^2\sigma^2$; this is larger than the variance of $X$ by a factor of $a$ squared.
Important fact to remember: Multiplication affects both center (mean, median, etc.) and spread (standard deviation, range).
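The effect of multiplying by a constant can be checked numerically. The sketch below is only an illustration: the values and the constant are made up, a small list of equally likely values stands in for the random variable, and the helper name `summarize` is hypothetical. A negative constant is used on purpose to show that the mean changes sign while the standard deviation does not.

```python
import math
import statistics

def summarize(values):
    """Return (mean, standard deviation, variance), treating the values as a whole population."""
    return (statistics.fmean(values),
            statistics.pstdev(values),
            statistics.pvariance(values))

# Hypothetical equally likely values of a random variable X, and a hypothetical constant a.
x_values = [2, 4, 4, 5, 7, 8]
a = -3

mean_x, sd_x, var_x = summarize(x_values)
mean_ax, sd_ax, var_ax = summarize([a * v for v in x_values])

print("mean:", mean_x, "->", mean_ax)                      # multiplied by a
print("sd:", round(sd_x, 3), "->", round(sd_ax, 3))        # multiplied by |a|
print("variance:", round(var_x, 3), "->", round(var_ax, 3))  # multiplied by a squared
```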

b. Adding or subtracting random variables: Means of sums and differences always add or subtract, but variances can be added only IF the two random variables are independent. Let $X$ and $Y$ be independent random variables with respective means $\mu_X$ and $\mu_Y$ and respective standard deviations $\sigma_X$ and $\sigma_Y$. Then, the mean of $X \pm Y$ will be $\mu_X \pm \mu_Y$ and the standard deviation will be $\sqrt{\sigma_X^2 + \sigma_Y^2}$.
Important fact to remember: You add the variances when calculating the new standard deviation, regardless of whether the variable is $X + Y$ or $X - Y$.

EXAMPLE (Common one!): Suppose that the distribution of the weights of pears in a certain market is approximately Normal with mean 7 ounces and standard deviation 1.3 ounces. Suppose that the distribution of weights of apples at this same market is approximately Normal with mean 4 ounces and standard deviation 0.7 ounces. Let $T$ be the total weight of 3 pears and 5 apples. What is the standard deviation of $T$?
Answer: It may be tempting to think that $T = 3P + 5A$ (where $P$ is the distribution of weights of pears, etc.). However, in statistics, this implies that $T$ is the combination of just two random variables (one pear's weight tripled and one apple's weight multiplied by five), when in fact it is the combination of eight variables: $T = P_1 + P_2 + P_3 + A_1 + A_2 + A_3 + A_4 + A_5$. Treating the eight weights as independent, the correct solution is: $\sigma_T = \sqrt{3(1.3)^2 + 5(0.7)^2} = \sqrt{5.07 + 2.45} = \sqrt{7.52} \approx 2.74$ ounces.
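The pear-and-apple result can be sanity-checked with a short simulation. This is a sketch under stated assumptions rather than part of the original handout: it assumes the eight weights are independent and exactly Normal, the seed and number of simulated baskets are arbitrary, and the names `one_basket`, `sd_tempting`, and `sd_simulated` are made up for illustration.

```python
import math
import random
import statistics

PEAR_MEAN, PEAR_SD = 7.0, 1.3      # ounces, from the example
APPLE_MEAN, APPLE_SD = 4.0, 0.7    # ounces, from the example
N_PEARS, N_APPLES = 3, 5

# Correct approach: add one variance per fruit, then take the square root.
mean_total = N_PEARS * PEAR_MEAN + N_APPLES * APPLE_MEAN
var_total = N_PEARS * PEAR_SD ** 2 + N_APPLES * APPLE_SD ** 2
sd_total = math.sqrt(var_total)

# Tempting but wrong approach: treats the total as 3 * (one pear) + 5 * (one apple).
sd_tempting = math.sqrt((N_PEARS * PEAR_SD) ** 2 + (N_APPLES * APPLE_SD) ** 2)

rng = random.Random(0)  # fixed seed so the run is repeatable

def one_basket():
    """Total weight of 3 independently drawn pears and 5 independently drawn apples."""
    pears = sum(rng.gauss(PEAR_MEAN, PEAR_SD) for _ in range(N_PEARS))
    apples = sum(rng.gauss(APPLE_MEAN, APPLE_SD) for _ in range(N_APPLES))
    return pears + apples

totals = [one_basket() for _ in range(50_000)]
sd_simulated = statistics.stdev(totals)

print(f"mean of T: {mean_total:.1f} oz")
print(f"formula sd: {sd_total:.3f} oz, simulated sd: {sd_simulated:.3f} oz")
print(f"tempting (incorrect) sd: {sd_tempting:.3f} oz")
```

The simulated standard deviation should land close to the formula value of about 2.74 ounces, while the tempting calculation gives a noticeably larger number because it scales single fruits instead of adding independent ones.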