Chapter 5. Sampling Distributions
Lecture notes, Lang Wu, UBC

5.1. Introduction

In statistical inference, we attempt to estimate an unknown population characteristic, such as the population mean, µ, using data in a sample, such as the sample mean, x̄. That is, we might use the sample mean, x̄, to estimate the population mean, µ. The accuracy of this estimation depends on the sample size, n, and the variability of the data. We can better understand the uncertainty in the estimation, as well as the basic idea behind statistical inference, by introducing an important concept called the sampling distribution.

A sample statistic (e.g., x̄) can be conceptually viewed as a random variable, because before we collect the data, we do not know what value the statistic will take. The statistic might take on any number in a range of values. Thus, it has a probability distribution, with some values more probable than others. The mean and variance of this distribution can be used to assess the accuracy of using this statistic to estimate a population parameter.

The probability distribution of a statistic is called the sampling distribution of this statistic. In other words, the sampling distribution of a statistic may be viewed as the distribution of all possible values of this statistic. For example, the sampling distribution of the sample mean, x̄, is the distribution of all possible values of x̄. So if we take many samples from the same population and calculate x̄ for each of them, the values we get will all fall somewhere along this distribution. By examining the sampling distribution of x̄, we can get an idea of the variability and range of x̄, which are used to determine the accuracy of using x̄ to estimate µ.

For example, suppose we wish to estimate the average sleep time of all students in a university. Here, the population is all students in the university, and the population parameter of interest is the average sleep time, denoted by µ.
We can randomly select 10 students from this university and record the average sleep time of these 10 students, which is the sample mean x̄ with sample size n = 10. Suppose that x̄ = 6.5 (hours) for these 10 students. If we randomly select another sample of 10 students, we may obtain a different value of x̄, say, x̄ = 7 (hours). Repeating this procedure many times, we obtain many values of x̄, such as 6.5, 7, and so on. The probability distribution of all possible values of x̄ is called the sampling distribution of x̄. This procedure is used as an illustration
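The thought experiment above — drawing many samples of 10 students and recording each sample's mean — is easy to simulate. The sketch below is purely illustrative: it assumes sleep times are normally distributed with mean 7 hours and standard deviation 1 hour; these parameter values (and the code itself) are our own choices, not part of the notes.

```python
import random
import statistics

random.seed(42)

MU, SIGMA = 7.0, 1.0   # assumed population mean/sd of sleep time (hours)
N = 10                 # sample size, as in the sleep-time illustration
REPS = 10_000          # number of repeated samples

# Draw many samples of size N and record each sample mean x-bar.
sample_means = []
for _ in range(REPS):
    sample = [random.gauss(MU, SIGMA) for _ in range(N)]
    sample_means.append(statistics.mean(sample))

# The collection of sample means approximates the sampling distribution of x-bar:
# it is centered near MU, with spread near SIGMA / sqrt(N) = 0.316.
print("mean of x-bar values:", round(statistics.mean(sample_means), 3))
print("sd of x-bar values:  ", round(statistics.stdev(sample_means), 3))
```

A histogram of `sample_means` would trace out the sampling distribution described in the text.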
since it may not be feasible in practice. For some populations, such as a normally distributed population, we can obtain the sampling distribution of the sample mean, x̄, via theoretical derivations. We examine this in more detail later.

Note that the population distribution and the sampling distribution are two different concepts. The population distribution refers to the distribution of a characteristic in the population, while the sampling distribution refers to the distribution of a particular statistic for repeated samples taken from the same population. Note also that if we randomly choose an individual from the population, the value of their characteristic can be seen as a random variable, X, whose probability distribution follows the population distribution.

There are many different sample statistics, so there are many different sampling distributions. Here, we focus on the sampling distributions of the two most important statistics:

- the sampling distribution of the sample proportion
- the sampling distribution of the sample mean

We focus on these two sampling distributions because they are crucial to two respective population distributions: the binomial distribution (for discrete data) and the normal distribution (for continuous data). The sample proportion is the most important statistic for a population with a binomial distribution, and the sample mean is the most important statistic for a population with a normal distribution. Moreover, the sampling distributions of these two statistics can be derived theoretically. For sampling distributions of other statistics, such as the sample variance, readers are referred to more advanced textbooks.

5.2. Sampling Distribution of the Sample Proportion

The Binomial Distribution

Before we discuss the sampling distribution of a sample proportion, we first introduce an important distribution for a discrete binary population: the Bernoulli distribution.
In practice, many random variables take on only two possible values, often denoted by the
binary numbers 0 and 1 (or thought of as "success" and "failure"). Random variables of this nature are said to follow a Bernoulli distribution. For example, a student taking a course can either pass (1) or fail (0). If you toss a coin, you will get either heads (1) or tails (0). In an election, a randomly selected person can either vote for candidate A (1) or vote against candidate A (0). We can view these examples as experiments with only two possible outcomes, often called Bernoulli trials.

Going back to the example regarding taking a course, let's say we randomly select 10 students from a large class. Each student can pass or fail the course. We can view this as an experiment consisting of 10 trials, with each trial having two possible outcomes (pass or fail), and our interest lying in the number of students who pass the course. Moreover, the 10 students may be viewed as independent and identically distributed: independent because they are randomly selected; identically distributed because we do not know who will be selected, so the probability of passing the course is the same for all students in the class (e.g., each student has a passing probability of 0.8 and a failing probability of 0.2). The other examples above may be viewed in a similar way.

Example 1. In an election, a recent poll shows that 40% of people will vote for candidate A. Suppose that three people are randomly selected. (1) What is the probability that exactly two people vote for candidate A? (2) What is the probability that at least one person votes for candidate A?

Solution: Here, each person has two options: vote for candidate A or vote for someone else, so we can view each person as a random variable that follows a Bernoulli distribution. We can assume the three people are independent. Let X_i = 1 if person i votes for candidate A and X_i = 0 otherwise, i = 1, 2, 3. Let X be the total number of people, among the three who were selected, who vote for candidate A.
Then, X = X₁ + X₂ + X₃, with X = 3 meaning all three people vote for candidate A.

(1) The probability that exactly two people vote for candidate A is given by

P(X = 2) = (3 choose 2) (0.4)² (1 − 0.4) = 0.288,

where the term (3 choose 2) = 3 is the number of possible ways to have 2 out of 3 people vote for candidate A, and the term (0.4)²(1 − 0.4) is the probability that a particular 2 of the people vote for candidate A and the other one does not, assuming the 3 people are independent (so we can use the multiplication rule and multiply the probabilities).
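As a quick check, the probability above can be computed with Python's standard library (a sketch; the variable names are our own):

```python
from math import comb

p = 0.4   # probability a given person votes for candidate A
n = 3     # number of people selected

# P(X = 2): choose which 2 of the 3 vote for A, then multiply the probabilities.
p_exactly_two = comb(n, 2) * p**2 * (1 - p)
print(round(p_exactly_two, 3))  # 0.288
```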
(2) The probability that at least one person votes for candidate A is

P(X ≥ 1) = 1 − P(X = 0) = 1 − (1 − 0.4)³ = 1 − 0.216 = 0.784.

Alternatively, we can use P(X ≥ 1) = P(X = 1) + P(X = 2) + P(X = 3) to get the same answer, but the computation is more tedious.

In general, we can consider n independent and identically distributed Bernoulli trials at once, where each trial has only two possible outcomes ("success" or "failure"). We are often interested in the probability of a certain number of successes. Let p be the probability of success for each trial, and let X be the total number of successes. Then, the probability distribution of X is given by

P(X = k) = (n choose k) p^k (1 − p)^(n−k),  k = 0, 1, 2, ..., n.

The above distribution is called the binomial distribution, denoted by X ~ B(n, p) or X ~ Bin(n, p). Thus, a binomial distribution is determined by two numbers: the number of trials, n, and the probability of success, p, with p being the only unknown parameter (since n is usually known). This is different from the normal distribution N(µ, σ), which is determined by two unknown parameters: the mean, µ, and the standard deviation, σ.

Remarks:

1) In practice, a binomial random variable X arises in the following settings: i) there are n i.i.d. Bernoulli trials, with n known and fixed; ii) the probability of success, p, is the same for each trial; iii) X is the number of successes out of the n trials.

2) The above n trials may be viewed as a sample of size n. The number of successes, X, is the sample count. The proportion X/n, denoted by p̂, is the sample proportion, and it indicates the proportion of the sample trials that were successful (i.e., the number of successes divided by the total number of trials). The probability of success, p, is the population proportion, and it represents the (usually unknown) true proportion of success in the population. Since X is a count from a sample, the distribution of X may be viewed as the sampling distribution of a count.
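The general formula P(X = k) = (n choose k) p^k (1 − p)^(n−k) translates directly into code. This sketch (the helper name `binom_pmf` is our own) also verifies that the probabilities sum to 1 and recomputes part (2) of Example 1 both ways:

```python
from math import comb

def binom_pmf(k: int, n: int, p: float) -> float:
    """P(X = k) for X ~ B(n, p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

# The pmf over k = 0..n must sum to 1.
total = sum(binom_pmf(k, 3, 0.4) for k in range(4))

# Example 1(2): P(X >= 1) via the complement, and via summing the upper tail.
p_at_least_one = 1 - binom_pmf(0, 3, 0.4)
p_tail_sum = sum(binom_pmf(k, 3, 0.4) for k in (1, 2, 3))
print(round(p_at_least_one, 3), round(p_tail_sum, 3))  # both 0.784
```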
Remember that X follows a binomial distribution.

Theorem 1. If X ~ B(n, p), then

E(X) = np,
Var(X) = np(1 − p).

Thus, for a binomial random variable, X, we can immediately obtain its mean and variance using the above formulas.

Example 2. Suppose the probability of getting a certain disease is 0.001, and suppose 50 people are randomly selected. (1) What is the probability of exactly one person having the disease? (2) What is the probability of at least one person having the disease? (3) How many people should be selected so there is a 90% chance of at least one of them having the disease? (4) Find the mean and standard deviation of the number of people who have the disease among the 50 people.

Solution: Each randomly selected person either has the disease or does not have the disease. Let X be the number of people who have the disease among n randomly selected people. We are working with a binomial distribution where n = 50 and p = 0.001.

(1) The probability that exactly one person has the disease is given by

P(X = 1) = (50 choose 1)(0.001)(0.999)⁴⁹ ≈ 0.0476.

(2) The probability that at least one person has the disease is given by

P(X ≥ 1) = 1 − P(X = 0) = 1 − (0.999)⁵⁰ ≈ 0.0488.

(3) In this case, n is unknown and needs to be determined. We need to find the value of n so that P(X ≥ 1) = 0.9, i.e., 1 − P(X = 0) = 0.9, or P(X = 0) = (0.999)ⁿ = 0.1. Taking logarithms, n log(0.999) = log(0.1). Solving this equation, we have

n = log(0.1)/log(0.999) ≈ 2301.4.

That is, we must select 2302 people to ensure there is a 90% chance of at least one of them having the disease.

(4) When n = 50 and p = 0.001, the mean and standard deviation of X are given by

E(X) = np = 50 × 0.001 = 0.05,
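The computations in Example 2 can be verified with a short standard-library script (a sketch; the variable names are our own):

```python
from math import ceil, log, sqrt

n, p = 50, 0.001

# (1) P(X = 1) = C(50, 1) * p * (1 - p)^49
p_one = 50 * p * (1 - p) ** 49

# (2) P(X >= 1) = 1 - P(X = 0)
p_at_least_one = 1 - (1 - p) ** 50

# (3) smallest n with (0.999)^n <= 0.1
n_needed = ceil(log(0.1) / log(1 - p))

# (4) mean and standard deviation of X ~ B(50, 0.001)
mean_x = n * p
sd_x = sqrt(n * p * (1 - p))

print(round(p_one, 4), round(p_at_least_one, 4))  # 0.0476 0.0488
print(n_needed)                                   # 2302
print(round(mean_x, 2), round(sd_x, 2))           # 0.05 0.22
```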
σ_X = √Var(X) = √(np(1 − p)) = √(50 × 0.001 × 0.999) ≈ 0.22.

For n = 50, we have a mean of 0.05 people having the disease and a standard deviation of 0.22 people.

Example 3. The probability of a battery's life exceeding 4 hours is p. There are three batteries in use. (1) Find the probability that at most 2 batteries last for 4 or more hours. (2) Find the mean and standard deviation of the number of batteries lasting 4 or more hours.

Solution: A battery's life will either exceed 4 hours or not exceed 4 hours. Let X be the number of batteries lasting 4 or more hours. Here we have n = 3 and the given success probability p. Thus,

(1) P(X ≤ 2) = 1 − P(X = 3) = 1 − p³.

(2) E(X) = np = 3p, and σ_X = √(np(1 − p)) = √(3p(1 − p)), which equals 0.59 for the value of p in this example.

For n = 3, the mean is 3p batteries exceeding 4 hours and the standard deviation is 0.59 batteries.

Sampling Distribution of the Sample Proportion

A major goal in statistics is to make inferences about unknown population parameters. We do this by using sample statistics to estimate corresponding population parameters. For example, we might use sample proportions to estimate population proportions, or use sample means to estimate population means. There is uncertainty in these estimations because the value of a statistic will vary from one sample to the next. To measure the uncertainty of each estimation, we look at the variability of the statistic (i.e., how much its value might vary from one sample to the next). To do this, we need to find the distribution of the sample statistic that is used to estimate the population parameter. This distribution is called the sampling distribution of the corresponding statistic. In this section, we consider a discrete population that follows a Bernoulli distribution (i.e., a population that is split into two groups, or a binary population), as described in the previous section.
For a population that follows a Bernoulli distribution, the parameter of interest is the population proportion, p, which is the proportion of success in the population (or the proportion of the population with the attribute of
interest). Recall the difference between a proportion and a percentage: a percentage is a proportion multiplied by 100. A proportion is a number between 0 and 1, while a percentage is a number between 0 and 100. Examples of population proportions include the proportion of people who are literate, the proportion of people who smoke, the proportion of people with cancer, etc.

Recall also the difference between a parameter and a statistic: a parameter is a population characteristic, while a statistic is a function or measure of data in a sample. The difference between the population proportion, p, and the sample proportion, p̂, is the difference between a parameter and a statistic.

Let p be an (unknown) population proportion of success. We select a sample of size n and think of it as n independent Bernoulli trials, with x being the number of successes. Using the information in the sample, we can calculate the sample proportion (denoted by p̂):

p̂ = (number of successes in the sample)/(sample size) = x/n.

We can then use p̂ as an estimate of p. For example, if the unknown parameter p is the proportion of people who smoke in Canada, then perhaps p̂ is the proportion of people who smoke in a randomly selected sample of n individuals in Canada. Here, p̂ is a number we can calculate, and it gives us an estimate of p.

From the previous section, we know that before we collect the data, the number of successes, X, is a random variable that follows a binomial distribution. Once we have the data, we are interested in the distribution of the sample proportion p̂ (i.e., the sampling distribution of p̂), which is unknown. Remember that the sampling distribution of p̂ is the distribution of all possible values of p̂ if p̂ is calculated for an infinite number of samples of equal size taken from the same population. This distribution will allow us to be fairly confident that the actual value of p lies within a certain interval.
The distribution of p̂ is difficult to find, so we often approximate it with the normal distribution, as described below in Theorem 3. In addition, the mean and standard deviation of the distribution of p̂ can be easily found, as shown in Theorem 2 below. Note that the normal distribution is completely determined by its mean and standard deviation, but this property does not hold for all distributions.

Theorem 2. The mean and variance of the sampling distribution of the sample proportion p̂ are respectively given by

E(p̂) = p,  Var(p̂) = p(1 − p)/n,
where p is the population proportion.

Note that Theorem 2 only gives the mean and variance (or standard deviation) of the distribution of p̂. We still do not know what the exact distribution of p̂ is. However, Theorem 3 below shows that the distribution of p̂ can be approximated by a normal distribution.

Theorem 3. If the sample size n is sufficiently large such that

np ≥ 10 and n(1 − p) ≥ 10,

then

(i) the sampling distribution of the sample proportion, p̂, can be approximated by the normal distribution N(p, √(p(1 − p)/n));

(ii) the distribution of the number of successes, X, can be approximated by the normal distribution N(np, √(np(1 − p))).

Theorem 3 shows that both p̂ and X may be approximated by normal distributions when the sample size, n, is large. Here, "large" means np ≥ 10 and n(1 − p) ≥ 10. Some books use the condition np ≥ 5 and n(1 − p) ≥ 5. Readers should not worry about the specific numbers 5 or 10. The key thing is, in order for the normal approximations to be accurate, n should be large and p should not be too close to 0 or 1. The larger the sample size, n, the more accurate the normal approximations.

Theorem 3 (ii) can be used to quickly calculate binomial probabilities. We know that X follows B(n, p). However, computing probabilities such as P(X < k) can be quite tedious if k is not small. For example, P(X < 10) requires computing 10 binomial probabilities and adding them together. If we instead use the normal approximation in Theorem 3 (ii), the normal distribution will quickly give us an approximate answer to P(X < 10), as shown in the examples below.

We will explore the idea of inferring population parameters from sample statistics in more detail in the next chapter. For now, we focus on familiarizing ourselves with the relationships between parameters and sampling distributions (Theorem 2), as well as how information can be gathered from an approximated sampling distribution (Theorem 3).
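Theorem 2 can be checked empirically: simulate many samples from a Bernoulli population and compare the observed mean and variance of p̂ with p and p(1 − p)/n. A minimal sketch (the parameter values p = 0.3, n = 50 are arbitrary choices of ours):

```python
import random
import statistics

random.seed(1)

p, n, reps = 0.3, 50, 20_000

# For each replicate, draw n Bernoulli(p) trials and record the sample proportion.
phats = []
for _ in range(reps):
    successes = sum(1 for _ in range(n) if random.random() < p)
    phats.append(successes / n)

# Theorem 2 predicts E(p-hat) = 0.3 and Var(p-hat) = 0.3 * 0.7 / 50 = 0.0042.
print("mean of p-hat:", round(statistics.mean(phats), 3))
print("var of p-hat: ", round(statistics.variance(phats), 5))
```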
In the following examples, the population proportion, p, is already known, so we explore how knowing this proportion can allow us to approximate the sampling distribution of ˆp and then gather information from it.
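The quality of Theorem 3's approximation can also be checked directly by comparing the exact binomial probability with the normal approximation. The sketch below uses only the standard library (Φ is built from math.erf), with n = 100, p = 0.2, and the cutoff k = 25 as our own illustrative choices:

```python
from math import comb, erf, sqrt

def phi(z: float) -> float:
    """Standard normal CDF, via the error function."""
    return 0.5 * (1 + erf(z / sqrt(2)))

n, p = 100, 0.2
k = 25

# Exact: P(X <= k) by summing binomial probabilities.
exact = sum(comb(n, j) * p**j * (1 - p)**(n - j) for j in range(k + 1))

# Approximate: X is roughly N(np, sqrt(np(1-p))) since np >= 10 and n(1-p) >= 10.
mu, sigma = n * p, sqrt(n * p * (1 - p))
approx = phi((k - mu) / sigma)

print(round(exact, 4), round(approx, 4))  # the two values are close
```

A continuity correction (using k + 0.5 instead of k) would bring the approximation even closer, though the notes do not cover it.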
Example 4. Suppose 20% of people in a certain city smoke. A sample of 100 people is randomly selected from this city. Find the probability that more than 30% of the people in this sample smoke.

Solution: Here the population proportion is known to be p = 0.2, and the sample size is n = 100. The sample proportion p̂ can be viewed as a random variable before we observe the data in the sample. Since np = 20 > 10 and n(1 − p) = 80 > 10, we can use a normal approximation to find the probability P(p̂ > 0.3). By Theorem 3, we have, approximately,

p̂ ~ N(0.2, √(0.2 × 0.8/100)) = N(0.2, 0.04).

Thus, an approximation of P(p̂ > 0.3) is given by

P(p̂ > 0.3) = P((p̂ − 0.2)/0.04 > (0.3 − 0.2)/0.04)  (standardization)
≈ P(Z > 2.5) = 1 − P(Z ≤ 2.5) = 0.0062,

where we first use standardization (i.e., subtract the mean and divide by the standard deviation) to get to the standard normal distribution, and then look up the probability for the specific z value. Note that for this problem, we can also do an exact computation using the binomial distribution (where n = 100 and we are finding the probability that more than 30 people smoke):

P(p̂ > 0.3) = P(X > 100 × 0.3) = P(X > 30) = P(X = 31) + P(X = 32) + ... + P(X = 100),

which is very tedious to compute!

Example 5. A fair coin is tossed 60 times. Find the probability that less than 1/3 of the results are heads.

Solution: Let X be the number of heads. Here we have n = 60, p = 0.5, and np = n(1 − p) = 30 > 10, so we can use a normal approximation:

p̂ ~ N(0.5, √(0.5 × 0.5/60)) = N(0.5, 0.0645).
Thus

P(p̂ < 1/3) = P((p̂ − 0.5)/0.0645 < (1/3 − 0.5)/0.0645) ≈ P(Z < −2.58) = 0.0049.

This problem can also be solved exactly using binomial distributions, but the computation is again very tedious. The general method for these types of problems is to approximate the binomial distribution with the normal distribution (after checking all requirements are met), convert this distribution to a standard normal distribution using standardization, and then look up the probability for the resulting z value using a standard normal table or statistical software.

5.3. The Sampling Distribution of a Sample Mean

In the previous section, we considered (discrete) binary populations that follow Bernoulli distributions, as well as the sampling distribution of the sample proportion. In this section, we consider a population distribution that is continuous and has mean µ and standard deviation σ. The population is not necessarily normally distributed. (Remember that a normal distribution is completely determined by µ and σ, but a general continuous distribution may not be.) The parameters µ and σ are unknown.

We will use the sample mean, x̄, as an estimate of the population mean, µ. To measure the accuracy of this estimation, we need to find the sampling distribution of the sample mean x̄, i.e., the distribution of all possible values of the sample mean, x̄, if infinitely many samples of equal size are taken from the same population and the mean is calculated for each of them. (Note: when we talk about the sampling distribution of x̄, we are viewing x̄ as a random variable because we are considering all possible samples. If we instead focus on a specific sample with observed data, then x̄ is a number.) When the population distribution is unknown (except that it is continuous), the exact sampling distribution of the sample mean, x̄, cannot be known either.
However, if we know the population parameters, we can still obtain the mean and standard deviation of the sampling distribution of the sample mean, x̄, as shown in the theorem below. Moreover, when the sample size is large, we can use a normal distribution to approximate the sampling distribution of the sample mean, x̄.
Theorem 4. Consider a continuous population with mean µ and standard deviation σ. Even when the population distribution is unknown, we have:

(i) the mean of all possible values of x̄ (i.e., the mean of the sampling distribution of x̄, or the mean of the sample mean) is equal to the population mean:

E(x̄) = µ;

(ii) the standard deviation of all possible values of x̄ (i.e., the standard deviation of the sampling distribution of x̄, or the standard deviation of the sample mean) is √n times smaller than the population standard deviation:

σ_x̄ = σ/√n,  or  Var(x̄) = σ²/n.

As you can see, the formulas for the mean and standard deviation of the sample mean's distribution depend on the population parameters µ and σ. This shows the relationship between the parameters and the sampling distribution of the sample mean. In practice, however, the parameters µ and σ are usually unknown, so we must estimate them using the statistics we have from a sample. We use the sample mean, x̄, to estimate the population mean, µ. Plugging x̄ instead of µ into Theorem 4(i), we get an estimate of the mean of the sampling distribution of the sample mean. Similarly, we use the sample standard deviation, σ̂ = s, to estimate the population standard deviation, σ. Plugging σ̂ instead of σ into Theorem 4(ii), we get an estimate of the standard deviation of the sampling distribution of the sample mean. We call this estimate the standard error of the sample mean x̄, given by

σ̂_x̄ = σ̂/√n.

In other words, the standard error is an estimate of the standard deviation of the sample mean's distribution. Since σ_x̄ = σ/√n, the larger the sample size, n, the smaller the standard error of the distribution of x̄ (i.e., less variability in x̄), and so the more accurate x̄ is as an estimate of µ. As an example, suppose that you wish to get an accurate measure of your blood pressure.
One way to increase your accuracy is to measure your blood pressure as many times as possible and then take an average of the measurements.

Theorem 4 gives the mean and standard deviation of the distribution of the sample mean, x̄. We still do not know the exact distribution of x̄, since the population distribution is unknown, and a mean and standard deviation cannot completely determine a continuous distribution (unless it is a normal distribution). However, if the population
distribution is known to be normal or if the sample size, n, is large, the distribution of x̄ is either exactly or approximately normal, as shown in the theorem below.

Theorem 5.

(i) If the population follows a normal distribution, N(µ, σ), then the sample mean x̄ also follows a normal distribution exactly:

x̄ ~ N(µ, σ/√n).

(ii) If the population distribution is unknown but the sample size, n, is large (say, n ≥ 25), then the sample mean, x̄, approximately follows the normal distribution

x̄ ~ N(µ, σ/√n),

which is the same distribution as the one in (i).

Based on Theorem 5, when the sample size, n, is reasonably large, the distribution of the sample mean, x̄, will approximately follow the distribution N(µ, σ/√n). Some books use n ≥ 25 as a condition and some books use n ≥ 30 or another number. Readers should not worry too much about the specific number, since it just sets a benchmark of accuracy for the normal approximation. The larger the value of n, the more accurately the normal distribution will approximate the distribution of x̄. Generally, if n < 10, the normal approximation may be poor.

Example 6. Suppose the weights of all adults in a large city form a distribution with mean µ = 140 (pounds) and standard deviation σ = 20 (pounds). A sample of 25 adults in the city is randomly selected. Find the probability that the mean weight of the adults in the sample is at least 144 pounds.

Solution: Here, we know the values of the parameters µ and σ, so we can calculate the mean and standard deviation of the distribution of the sample mean, x̄:

E(x̄) = µ = 140 and σ_x̄ = σ/√n = 20/√25 = 4.

Since n = 25, we can approximate the sample mean's distribution by a normal distribution: x̄ ~ N(140, 4). Now that we have approximated the sample mean's distribution, we can calculate probabilities of certain values. We have

P(x̄ ≥ 144) = P(Z ≥ (144 − 140)/4) = P(Z ≥ 1) = 0.1587.
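The table lookup in Example 6 can be reproduced in code. This sketch builds the standard normal CDF from math.erf (the helper name `phi` is our own):

```python
from math import erf, sqrt

def phi(z: float) -> float:
    """Standard normal CDF."""
    return 0.5 * (1 + erf(z / sqrt(2)))

mu, sigma, n = 140, 20, 25
se = sigma / sqrt(n)            # 20 / 5 = 4
z = (144 - mu) / se             # (144 - 140) / 4 = 1
p_at_least_144 = 1 - phi(z)

print(round(p_at_least_144, 4))  # 0.1587
```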
Example 7. The weights of large eggs follow a normal distribution with a mean of 1 oz and a standard deviation of 0.1 oz. What is the probability that a dozen (12) eggs weigh more than 13 oz?

Solution: We are given the population mean, standard deviation, and distribution, so we can directly use the above theorems. Since the population follows N(1, 0.1), the sample mean x̄ follows N(1, 0.1/√12), or N(1, 0.029). Let X_i be the weight of egg i, i = 1, 2, ..., 12. Then, the total weight of the 12 eggs is X₁ + X₂ + ... + X₁₂, and the mean weight is x̄ = (X₁ + X₂ + ... + X₁₂)/12. Thus,

P(X₁ + ... + X₁₂ > 13) = P(x̄ > 13/12) = P(x̄ > 1.083)
= P(Z > (1.083 − 1)/0.029) = P(Z > 2.89) = 0.0019.

In this example, the sample size n = 12 is not large, but we know the population distribution, so we have the exact sampling distribution for the sample mean, x̄.

5.4. The Central Limit Theorem

In the previous sections, we have seen that regardless of whether the population is discrete or continuous, the distributions of the sample proportion and sample mean can be approximated by normal distributions when the sample sizes are large. There is a reason for this: the normal approximations are justified by the so-called central limit theorem (CLT). The central limit theorem is one of the most important theorems in statistics. Basically, the CLT says that, no matter what the population distribution may be, when the sample size is sufficiently large, the mean of i.i.d. random variables will be approximately normally distributed.

Note that both the sample proportion, p̂, and the sample mean, x̄, can be written as means of independent and identically distributed (i.i.d.) random variables. This is obvious for the sample mean, x̄. The sample proportion p̂ can also be written as a mean:

p̂ = (x₁ + x₂ + ... + x_n)/n,

where each x_i only takes on a value of 0 or 1. Note also that a simple random sample (SRS) {x₁, x₂, ..., x_n} can be viewed as consisting of i.i.d. random variables,
as noted earlier.

The Central Limit Theorem (CLT). The CLT can be stated as follows:

(i) If a continuous population has mean µ and standard deviation σ, then when the sample size n in an SRS is large, the sample mean approximately follows the normal distribution

x̄ ~ N(µ, σ/√n).

(ii) If a binary (or Bernoulli) population has proportion of success p, then when the sample size n in an SRS is large, the sample proportion approximately follows the normal distribution

p̂ ~ N(p, √(p(1 − p)/n)).

Remark: In the CLT above, the sample size, n, needs to be large in order for the normal approximations to be accurate. For a continuous population, we usually need n ≥ 25, while for a binary population, we need np ≥ 10 and n(1 − p) ≥ 10. These are rough guidelines. The larger n is, the more accurate the normal distributions are as approximations. An SRS ensures i.i.d. random variables because each individual is randomly selected. Note that, for a continuous population, we do not need to know the population distribution when applying the CLT.

The CLT not only holds for binary and continuous populations, but also for other populations, such as counts. The key here is that the data in the sample must be i.i.d. (e.g., in an SRS), and the statistic must be a sum or a mean. The CLT can be used to provide an approximate distribution for a statistic if the statistic can be written as a mean (or a sum) of i.i.d. random variables. Since many statistics may be expressed (or approximated) as sums or means of i.i.d. random variables, many statistics may be assumed to approximately follow normal distributions. This explains why the normal distribution is the most common distribution in statistics. However, some statistics, such as the median or the sample standard deviation, cannot be written as a sum or mean of i.i.d. random variables.
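The CLT's central claim — that sample means from a decidedly non-normal population still behave normally — can be illustrated by simulation. The sketch below draws from a uniform population (the endpoints, sample size, and replication count are our own illustrative choices):

```python
import random
import statistics
from math import sqrt

random.seed(7)

N, REPS = 31, 5000
LO, HI = 0.0, 15.0                  # uniform population, far from normal

mu = (LO + HI) / 2                  # population mean = 7.5
sigma = (HI - LO) / sqrt(12)        # population sd of Uniform(0, 15), about 4.33

# Draw many samples from the uniform population and keep each sample mean.
means = [statistics.mean(random.uniform(LO, HI) for _ in range(N))
         for _ in range(REPS)]

# The CLT predicts the means cluster around mu with spread sigma / sqrt(N).
print(round(statistics.mean(means), 2))   # near 7.5
print(round(statistics.stdev(means), 2))  # near 4.33 / sqrt(31), about 0.78

# A normal distribution puts about 95% of mass within 1.96 standard errors.
se = sigma / sqrt(N)
inside = sum(1 for m in means if abs(m - mu) < 1.96 * se) / REPS
print(round(inside, 2))                   # near 0.95
```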
When a statistic cannot be written as a sum or mean of i.i.d. random variables, the CLT cannot be applied directly, so we cannot conclude from the CLT that such statistics approximately follow normal distributions, even when the sample size is large. The sampling distributions of the sample proportion and the sample mean are examples of applications of the CLT. We give one more example below.

Example 8. Suppose the scores on a standardized test have an average of 500 and
a standard deviation of 60. A group of 49 students take the test. (1) What is the probability that the average score of the group will fall between 480 and 520? (2) Find a range of scores such that the group average will fall within this range with a probability of 0.95.

Solution: In this example, we do not know the exact population distribution, but we know it is continuous and has mean µ = 500 and standard deviation σ = 60. We can assume the 49 students are an SRS. Since the sample size is 49, which is large, we may apply the central limit theorem and approximate the distribution of the sample mean by a normal distribution. Let x̄ be the group mean. Then, the distribution of x̄ can be approximated by x̄ ~ N(500, 60/√49) (i.e., N(500, 60/7)).

(1) We have

P(480 < x̄ < 520) = P((480 − 500)/(60/7) < Z < (520 − 500)/(60/7))
= P(−2.33 < Z < 2.33) = 2P(0 < Z < 2.33) = 2(P(Z < 2.33) − 0.5) = 0.98.

(2) From (1), x̄ ~ N(500, 60/7) approximately. By the 68–95–99.7 rule for a normal distribution, we have

P(µ − 2σ_x̄ < x̄ < µ + 2σ_x̄) ≈ 0.95.

So 2σ_x̄ = 2 × 60/7 = 17.14, and

500 − 17.14 = 482.86,  500 + 17.14 = 517.14.

Thus, with probability 0.95, x̄ will fall between 482.86 and 517.14. In this example, we do not have to use the 68–95–99.7 rule. If we use a standard normal table, then we should replace 2 with 1.96 in the above calculations.

Note that a continuous population can be converted into a binary population. For example, in the above example, if we are only interested in the proportion of students who scored over 600, then we have a binary population. The corresponding sample can
also be converted into binary data: each student's score is either above 600 or below 600. When we convert continuous data into binary data, we will lose some information. However, sometimes we are only interested in certain pieces of information, such as whether a student's score is above 600 or not. In this sense, we do not actually lose any crucial information.

5.5. Chapter Summary

In this chapter, we examined the sampling distributions of the sample proportion, p̂, and the sample mean, x̄. These sampling distributions are important when making statistical inferences about the unknown population proportion, p, or population mean, µ, as will be shown in the next few chapters. When the sample size is large, the sampling distributions of p̂ and x̄ can be approximated by normal distributions, which can then be used in statistical inference. When the sample size is small, we must know the population distributions in order to know the sampling distributions. The CLT can be used to approximate the sampling distribution of a statistic if the statistic can be written as a sum or mean of i.i.d. random variables.

5.6. Review Questions

1. What is a sampling distribution? Why do we need to consider sampling distributions?

2. Can you think of a sample that does not consist of i.i.d. random variables?

3. Can we use the CLT to find the sampling distribution of a sample correlation r? Why?

4. I have a box containing a number of tickets numbered between -10 and +10. The mean of the numbers is 0 and the standard deviation is 5. I am going to make a number of draws, with replacement, from the box. If the mean of the numbers that I draw falls between -1 and +1, I win and you give me $10. Otherwise, you win and I give you $10. Which of the following numbers of draws will give you the best chance of winning?
A. 10
B. 20
C. 100
D. There is insufficient information to tell

5. Suppose the daily precipitation in a city in December is uniformly distributed between 0 mm and 15 mm. For the month of December (with 31 days), what is the probability that the daily precipitation is less than 10 mm on at least 20 days? Assume the daily precipitations for the different days are independent. Choose the most appropriate answer.
A. Less than 0.16
B. Between 0.16 and 0.5
C. Between 0.5 and 0.84
D. Between 0.84 and 0.975
E. Greater than 0.975

6. True or false: For a continuous population, the sampling distribution of the sample mean has the same mean as the population mean but has a smaller standard deviation, as long as the sample size is larger than 1.

7. True or false: The sample mean always under-estimates the population mean because of sampling variation.

8. True or false: If the population is uniformly distributed on an interval, the sample mean of a sample taken from this population will still be approximately normally distributed if the sample size is large (say, larger than 30).

9. The waiting time for a bus follows a uniform distribution with a mean of 5 hours and a standard deviation of 1 hour. A student takes the bus 100 times in a semester. There is a 95% chance that the average waiting time for this student during that semester is within approximately which of the following amounts of 5 hours?
(a) 0.1
(b) 0.2
(c) 2
(d) 1
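The normal-approximation calculations in the worked example above (population mean 500, standard deviation 60, groups of n = 49) can also be checked numerically. The sketch below is illustrative and not part of the original notes; it simulates many samples from a non-normal (uniform) population with the same mean and SD, and compares the simulated sampling distribution of x̄ with the CLT approximation. All function and variable names are my own choices.

```python
import math
import random

random.seed(0)

mu, sigma, n = 500, 60, 49
se = sigma / math.sqrt(n)  # standard error of the mean: 60/7 ≈ 8.571

def norm_cdf(z):
    # Standard normal CDF via the error function (no external libraries needed)
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

# CLT approximation for part (1): P(480 < xbar < 520)
p_clt = norm_cdf((520 - mu) / se) - norm_cdf((480 - mu) / se)

# Simulation: draw scores from a non-normal uniform population chosen to have
# the same mean and SD, and record the sample mean of each group of 49.
half_width = sigma * math.sqrt(3)  # uniform on [mu-hw, mu+hw] has SD = hw/sqrt(3)
means = [
    sum(random.uniform(mu - half_width, mu + half_width) for _ in range(n)) / n
    for _ in range(20000)
]
p_sim = sum(480 < m < 520 for m in means) / len(means)

print(f"SE of the mean: {se:.3f}")                         # 8.571
print(f"CLT approx  P(480 < xbar < 520) = {p_clt:.4f}")    # about 0.9804
print(f"Simulation  P(480 < xbar < 520) = {p_sim:.4f}")    # close to the CLT value

# Part (2): central interval with probability 0.95, using z = 1.96
lo, hi = mu - 1.96 * se, mu + 1.96 * se
print(f"95% range for xbar: ({lo:.2f}, {hi:.2f})")         # (483.20, 516.80)
```

The simulated probability agrees closely with the CLT value even though the individual scores are drawn from a uniform, not a normal, population, which is exactly the point of the central limit theorem. Note the hand calculation in the notes uses the rounded table value z = 2.33, giving 0.9802; the exact z is 20/(60/7) ≈ 2.333.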