LAB 2
Random Variables, Sampling Distributions of Counts, and Normal Distributions

ECA 225 has open lab hours if you need to finish LAB 2. The lab is open Monday-Thursday 6:30-10:00pm and Saturday-Sunday 2:00-6:00pm.

To download R onto your own personal computer, go to: http://cran.r-project.org/bin/windows/base/
Click on the link for R-2.6.1-win32.exe and save the file to your computer. Then click on the file to start the installation.

Your submission for LAB 2 should consist of your answers to the numbered questions as you work through the lab.

***AS YOU ARE WORKING THROUGH THE LAB, copy and paste each output into a blank Word file*** You can either print the completed Word file and turn it in, or e-mail the Word file to me for your LAB 2 grade. Everything MUST be done in R and included in your Word file.

***********************************************************************

Access R
On the desktop or through the Programs menu, find the R icon and click on it. You should be brought to a screen with a command prompt (>).

Random Variables
You can use R to simulate certain random variables. For example, if X is the outcome of tossing a coin, you could use 0 to represent a head and 1 to represent a tail. R can simulate tossing a coin ten times with the following command:

> sample(0:1, size = 10, replace = TRUE)
output:
[1] 0 0 1 1 1 1 1 0 1 0

To read the above results: in the ten tosses of the coin, we had 4 heads and 6 tails. The 0:1 code tells R to choose an integer from 0 to 1 inclusive, the size = 10 code tells R the number of repetitions, and the replace = TRUE code tells R that sampling is done with replacement.

For another example, the following simulates X = the outcome of tossing a six-sided die, 20 times.

> sample(1:6, size = 20, replace = TRUE)
output:
[1] 1 2 1 5 4 6 4 1 4 3 1 5 1 2 5 6 3 3 1 2

For another example, the following simulates X = the sum of tossing two six-sided dice, 13 times.

> sample(1:6, size = 13, replace = TRUE) + sample(1:6, size = 13, replace = TRUE)
output:
[1] 8 7 5 8 3 6 4 5 7 3 4 11 8

1. What would be the code to simulate Y = the sum of tossing a six-sided die and a four-sided die, for 15 tosses?
2. What is your output for (1)? Copy and paste your R code and output.
3. Simulate a lottery ticket that consists of choosing 5 numbers from the numbers 01 to 45. Include your R code and output. (This type of sampling is done without replacement.)

Binomial Variables and Sampling Distribution of a Count
We can use R to calculate probabilities and simulate samples from a binomial distribution. For example, suppose the probability that a person aged 20 will be alive at age 65 is 0.80. If we select 15 people aged 20 and let X = the number that are alive at age 65, then X has a binomial distribution with n = 15 and p = 0.8.

The probability that X = 7 can be found with R:

> dbinom(7, size = 15, prob = 0.8)
output:
[1] 0.003454764

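As a check on what dbinom is computing, the sketch below compares it with the binomial probability formula P(X = k) = choose(n, k) * p^k * (1 - p)^(n - k) for the same n = 15, p = 0.8, and k = 7. The variable names (n, p, k, by_formula, by_dbinom) are made up for illustration and are not part of the lab.

```r
# Illustration only: variable names are hypothetical.
n <- 15   # number of people sampled
p <- 0.8  # probability a person aged 20 is alive at age 65
k <- 7    # value of X we want the probability for

# Binomial formula written out by hand
by_formula <- choose(n, k) * p^k * (1 - p)^(n - k)

# The same probability from R's built-in function
by_dbinom <- dbinom(k, size = n, prob = p)

print(by_formula)
print(by_dbinom)
print(all.equal(by_formula, by_dbinom))  # TRUE when the two agree
```

Both values should match the 0.003454764 shown above.
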
The probability that X is at least 7 can be found with R:

> sum(dbinom(7:15, size = 15, prob = 0.8))
output:
[1] 0.999215

You can also draw a plot to represent the sampling distribution of this count:

> heights = dbinom(0:15, size = 15, prob = 0.8)
> plot(0:15, heights, type = "h", main = "Spike plot of X", xlab = "x", ylab = "prob")
output:
[figure: spike plot of X]

4. Pinworm infestation can be treated with a drug. According to a study, the drug is an effective treatment to cure pinworm in 90% of cases. Suppose that 8 children with pinworm are given the drug. What is the probability that exactly 6 children are cured? Show R code and output.
5. What is the probability that at least one child is not cured? Show R code and output.
6. Construct a graph to show the sampling distribution of X = the number of children in the sample of size 8 that are cured.
7. How many children should be sampled so that the distribution of X is approximately normal? Explain your reasoning and show a graph of the sampling distribution of X that is approximately normal.
8. A fair coin is tossed 100,000 times. The number of heads is recorded. What is the probability that there are between 49,800 and 50,200 heads, inclusive? Show R code and output.

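The "at least 7" probability in this section was found by adding up dbinom values. As an alternative sketch, R's cumulative function pbinom gives P(X <= q) directly, so the same answer can be found from the complement. The example reuses n = 15 and p = 0.8 from the text. The variable names are made up for illustration.

```r
# Illustration only: variable names are hypothetical.
n <- 15
p <- 0.8

# Method from the lab: add up P(X = 7), P(X = 8), ..., P(X = 15)
by_sum <- sum(dbinom(7:15, size = n, prob = p))

# pbinom(6, ...) is P(X <= 6), so one minus it is P(X >= 7)
by_complement <- 1 - pbinom(6, size = n, prob = p)

# lower.tail = FALSE asks for P(X > 6), which is the same event as X >= 7
by_upper_tail <- pbinom(6, size = n, prob = p, lower.tail = FALSE)

print(c(by_sum, by_complement, by_upper_tail))
```

All three numbers should agree with the 0.999215 shown earlier. Note that the cutoff passed to pbinom is 6, not 7, because X is a count and "at least 7" is the complement of "6 or fewer".
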
Normal Distributions
Just as with a binomial random variable, we can simulate normal random variables and calculate probabilities. For example, if you want to draw a normal distribution in R:

> curve(dnorm(x, mean = 4, sd = 0.5), 2, 6)
output:
[figure: normal density curve with mean 4 and standard deviation 0.5]

Note: the 2, 6 part of the code tells R the range to use for the horizontal axis. You can put any reasonable range.

If you want to calculate the probability that X is less than 3 in a normal distribution with mean 4 and standard deviation 0.5, use the following code:

> pnorm(3, mean = 4, sd = 0.5)
output:
[1] 0.02275013

The code pnorm asks R to find the area to the LEFT of the given value.

If you want to calculate the probability that X is greater than 3 in a normal distribution with mean 4 and standard deviation 0.5, use the following code:

> 1 - pnorm(3, mean = 4, sd = 0.5)
output:
[1] 0.9772499

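As an alternative to subtracting from one, pnorm accepts the argument lower.tail = FALSE, which asks for the area to the RIGHT instead. The sketch below compares the two approaches for the same mean 4 and standard deviation 0.5. The variable names are made up for illustration.

```r
# Illustration only: variable names are hypothetical.

# Method from the lab: one minus the area to the left of 3
by_complement <- 1 - pnorm(3, mean = 4, sd = 0.5)

# Ask pnorm for the area to the right of 3 directly
by_upper_tail <- pnorm(3, mean = 4, sd = 0.5, lower.tail = FALSE)

print(by_complement)
print(by_upper_tail)
```

Both lines should print the 0.9772499 shown above.
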
If you want to calculate the probability that X is between 3 and 5 in a normal distribution with mean 4 and standard deviation 0.5, you can use any of the following code options:

> 1 - 2*pnorm(3, mean = 4, sd = 0.5)

Notice that this finds the area to the left of 3, multiplies it by 2 because the curve is symmetric (the area to the right of 5 equals the area to the left of 3), and then subtracts those two tails from one. This shortcut works only because 3 and 5 are the same distance from the mean.

OR:

> pnorm(5, mean = 4, sd = 0.5) - pnorm(3, mean = 4, sd = 0.5)

OR:

> diff(pnorm(c(3, 5), mean = 4, sd = 0.5))

9. Tarantula carapace lengths are normally distributed with mean 18.14 mm and standard deviation 1.76 mm. Draw a normal distribution curve in R for the carapace lengths.
10. What is the probability that a randomly selected tarantula has a carapace length less than 15 mm? Include R code and output.
11. What is the probability that a randomly selected tarantula has a carapace length greater than 20 mm? Include R code and output.
12. What is the probability that a randomly selected tarantula has a carapace length within one mm of the population mean? Include R code and output.
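The three code options given in this section for the probability that X is between 3 and 5 (mean 4, standard deviation 0.5) should all return the same number. The sketch below runs them side by side as a check. The variable names are made up for illustration.

```r
# Illustration only: variable names are hypothetical.

# Option 1: one minus two equal tails
# (valid here because 3 and 5 are both one unit from the mean of 4)
by_symmetry <- 1 - 2 * pnorm(3, mean = 4, sd = 0.5)

# Option 2: area to the left of 5 minus area to the left of 3
by_subtraction <- pnorm(5, mean = 4, sd = 0.5) - pnorm(3, mean = 4, sd = 0.5)

# Option 3: diff() subtracts the first pnorm value from the second
by_diff <- diff(pnorm(c(3, 5), mean = 4, sd = 0.5))

print(c(by_symmetry, by_subtraction, by_diff))
```

Options 2 and 3 work for any two endpoints, so they are the safer choice when the interval is not centered on the mean.
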