Ordinal Predicted Variable
1 Ordinal Predicted Variable Tim Frasier Copyright Tim Frasier This work is licensed under the Creative Commons Attribution 4.0 International license.
2 Goals and General Idea
3 Goals When would we use this type of analysis? When the predicted variable is ordinal! Places in a race (1st, 2nd, 3rd, etc.) Surveys on a Likert scale (5 = strongly agree, 4 = agree, 3 = neutral, 2 = disagree, 1 = strongly disagree) Scaled responses (good, mediocre, bad) etc.
4 Characteristics Ordinal data are kind of a pain to deal with: we know the order, but the levels are not necessarily equally spaced. How much do you like fish (1 = hate to 5 = love)? It may be harder to go from 1 to 2 than from 4 to 5. As predictor variables increase, responses should sequentially step through the predicted values. How can we ensure this happens?
5 Characteristics Suppose ordinal data with 7 levels There will be cut-off points (thresholds) between levels, indicating where it switches from one to another (indicated here as θs) If there are k levels, there will be k-1 of these thresholds From Kruschke (2015) p. 673
6 Characteristics How do we get probabilities for each level? Cumulative distribution From Kruschke (2015) p. 673
7 Characteristics Now values range from 0 to 1 Probability for each level is the cumulative area up to the threshold just above that level minus the cumulative area up to the threshold just below that level Call each threshold point an α value cumulative proportion α1 α2 α3 α4 α5 α response
8 Characteristics For first category, probability is cumulative probability for that value, minus zero Considering the mean and sd of the underlying distribution Cumulative normal distribution in JAGS is pnorm cumulative proportion α1 α2 α3 α4 α5 α response
9 Characteristics For second category, probability is cumulative probability for that value, minus that for the first category cumulative proportion α1 α2 α3 α4 α5 α response
10 Characteristics For third category, probability is cumulative probability for that value, minus that for the second category cumulative proportion α1 α2 α3 α4 α5 α response
11 Characteristics For fourth category, probability is cumulative probability for that value, minus that for the third category cumulative proportion α1 α2 α3 α4 α5 α response
12 Characteristics For fifth category, probability is cumulative probability for that value, minus that for the fourth category cumulative proportion α1 α2 α3 α4 α5 α response
13 Characteristics For sixth category, probability is cumulative probability for that value, minus that for the fifth category cumulative proportion α1 α2 α3 α4 α5 α response
14 Characteristics For seventh category, probability is one, minus the cumulative probability for the 6th category cumulative proportion α1 α2 α3 α4 α5 α response
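The category-by-category scheme in the last several slides can be sketched directly in R. This is an illustration only: the threshold, mean, and sd values below are made up, not taken from the course data.

```r
# Illustrative thresholds for 7 ordinal levels (k levels need k - 1 thresholds)
alpha <- c(1.5, 2.5, 3.5, 4.5, 5.5, 6.5)
mu    <- 4    # mean of the underlying latent distribution
sigma <- 1.5  # sd of the underlying latent distribution

k <- length(alpha) + 1
p <- numeric(k)

# First category: cumulative area up to the first threshold
p[1] <- pnorm(alpha[1], mu, sigma)

# Middle categories: cumulative area up to the upper threshold,
# minus the cumulative area up to the lower threshold
for (j in 2:(k - 1)) {
  p[j] <- pnorm(alpha[j], mu, sigma) - pnorm(alpha[j - 1], mu, sigma)
}

# Last category: one minus the cumulative area up to the last threshold
p[k] <- 1 - pnorm(alpha[k - 1], mu, sigma)

sum(p)  # the seven probabilities sum to 1
```

Because the terms telescope, the probabilities always sum to one, whatever values of mu and sigma are used.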
15 Characteristics One problem: Our α values are only relative to one another, and have no absolute position as-is. We could add any constant to the raw (i.e. non-cumulative) values and recover the same estimates. Like sliding our distribution up and down the x-axis, our α estimates would remain the same. Real problems for the MCMC process (any value is reasonable!)
16 Characteristics One solution: Pin down distribution by specifying the two extreme α values Estimate the rest, relative to these, which is all that matters Will specify this in the data list
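The identifiability problem just described is easy to demonstrate numerically: shifting the mean and all thresholds by the same constant leaves every category probability unchanged. The values below are illustrative only.

```r
# Toy thresholds, mean, and sd for the latent distribution (illustrative)
alpha <- c(1.5, 2.5, 3.5)
mu    <- 2
sigma <- 1
shift <- 10  # any constant

# Category probabilities as successive differences of cumulative areas
catprobs <- function(alpha, mu, sigma) {
  diff(c(0, pnorm(alpha, mu, sigma), 1))
}

p.orig    <- catprobs(alpha, mu, sigma)
p.shifted <- catprobs(alpha + shift, mu + shift, sigma)

all.equal(p.orig, p.shifted)  # TRUE: the data cannot pin down an absolute location
```

This is why pinning down the two extreme α values (the solution on this slide) is needed before the sampler can work.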
17 Characteristics The mean (μ) of this distribution is the result of the additive effect of our predictor variables Our standard equation for the effects of the predictor variables goes into this μ
18 Characteristics Distribution does not have to look normal for the normal distribution to be appropriate All the following histograms were generated by a normal distribution From Kruschke (2015) p. 673
19 Characteristics What we are estimating: 1. The α values for all but the first and last thresholds 2. The mean (μ) of the underlying distribution (based on the additive effect of the predictor variables) 3. The standard deviation (σ) of the underlying distribution 4. Other appropriate distribution parameters if not using the normal distribution
20 The Data
21 Data Fake data generated from code in Kruschke (2011)
ord <- read.table("ordinaldata.csv", header = TRUE, sep = ",")
22 Data Ordinal predicted variable (y)
23 Data Two metric predictor variables (x1 and x2)
24 Data Can use the pairs function to plot the data, and get some idea of potential patterns (keeping in mind the issue of interactions) pairs(ord, pch = 16, col = rgb(0, 0, 1, 0.3))
25 Data Exploration (pairs plot of Y, X1, and X2)
26 Data Exploration Looks like a negative relationship between X1 & Y, and a positive relationship between X2 & Y (pairs plot of Y, X1, and X2)
27 Data Exploration Use the table function to get frequencies for each ordinal response
ytable <- table(ord$y)
ytable
28 Data Exploration Make as a data frame and format properly
ytable.df <- as.data.frame(ytable)
ytable.df[, 1] <- as.numeric(as.character(ytable.df[, 1]))
29 Data Exploration Plot the data
plot(ytable.df[, 1], ytable.df[, 2], type = "h", ylab = "Frequency", xlab = "response", lwd = 4, col = rgb(0, 0, 1, 0.5))
30 Data Exploration Can also transpose this to the cumulative distribution of your data, if you want to
# Get proportions
pr_y <- ytable / nrow(ord)
# Get cumulative proportions
cum_pr_y <- cumsum(pr_y)
31 Data Exploration cumsum is an R function that calculates the cumulative sums of a vector
32 Data Exploration
# Plot
plot(ytable.df[, 1], cum_pr_y, type = "b", lwd = 2, ylab = "cumulative proportion", xlab = "response", ylim = c(0, 1), col = "blue")
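As a quick sanity check of the cumsum step above, with a made-up frequency table (the counts below are hypothetical, not the course data):

```r
# Hypothetical counts for a 7-level ordinal response
ytable <- c(10, 25, 40, 60, 35, 20, 10)

pr_y     <- ytable / sum(ytable)  # proportions
cum_pr_y <- cumsum(pr_y)          # cumulative proportions

# Cumulative proportions are nondecreasing and end at 1
cum_pr_y[length(cum_pr_y)]  # 1
```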
33 Frequentist Approach
34 Frequentist Approach I don't know. The polr function from the MASS package seems to be an option, but I couldn't get it to work (with limited time)
35 Bayesian Approach
36 Load Libraries & Functions
library(runjags)
library(coda)
source("plotpost.r")
37 Organize the Data
y <- ord$y
N <- length(y)
nlevels <- length(unique(y))
x1 <- ord$x1
x2 <- ord$x2
38 Organize the Data Making a variable with the number of response levels will make your code more generic
39 Standardize the Metric Variables
# x1
x1mean <- mean(x1)
x1sd <- sd(x1)
zx1 <- (x1 - x1mean) / x1sd
# x2
x2mean <- mean(x2)
x2sd <- sd(x2)
zx2 <- (x2 - x2mean) / x2sd
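A quick check that standardization behaves as expected (the predictor values below are toy numbers, not the course data):

```r
# Toy predictor values (hypothetical)
x1 <- c(2.1, 3.5, 4.0, 5.2, 6.9)

x1mean <- mean(x1)
x1sd   <- sd(x1)
zx1    <- (x1 - x1mean) / x1sd

# Standardized values have mean 0 and sd 1
round(mean(zx1), 10)  # 0
sd(zx1)               # 1
```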
40 Create a List For Alpha Values
#--- Create a list for anchored alpha values ---#
# with beginning and ending values, but the     #
# rest will be filled in by the MCMC process    #
# (have them as NA for now).                    #
alpha <- rep(NA, nlevels - 1)
alpha[1] <- 1                  # Set first value
alpha[nlevels - 1] <- nlevels  # Set last value
41 Make Data List For JAGS
datalist = list(
  y = y,
  nLevels = nlevels,
  N = N,
  x1 = zx1,
  x2 = zx2,
  alpha = alpha
)
42 Define the Model Doesn't lend itself well to a diagram. We'll just walk through the code
43 Note that I have rearranged things from how I did it before, to try to add clarity. We'll walk through it.
modelstring = "
model {
  #--- The likelihood ---#
  for (i in 1:N) {

    # The standard part of our equation
    mu[i] <- b0 + (b1 * x1[i]) + (b2 * x2[i])

    # Probability of each value being in the first category
    # (in JAGS, pnorm's third argument is a precision, hence 1 / sigma^2)
    p[i, 1] <- pnorm(alpha[1], mu[i], 1 / sigma^2)

    # Probability of each value being in the categories
    # between the lowest and the highest
    for (j in 2:(nLevels - 1)) {
      p[i, j] <- max(0, pnorm(alpha[j], mu[i], 1 / sigma^2)
                      - pnorm(alpha[j - 1], mu[i], 1 / sigma^2))
    }

    # Probability of each value being in the highest category
    p[i, nLevels] <- 1 - pnorm(alpha[nLevels - 1], mu[i], 1 / sigma^2)

    # Now, fit the y data to a categorical distribution
    # with the characteristics we just calculated
    y[i] ~ dcat(p[i, 1:nLevels])
  }
...
44 The mu[i] line is the black box into which we can put any equations that we have dealt with before (or more)...
45 ...it describes the mean of the normal distribution describing the data
46 The probability of each data point being in the first category is the cumulative probability up to the first threshold, based on the mean and sd of the underlying distribution
47 The probability of each data point being in each of the middle categories is the cumulative probability up to the top threshold for the given category, minus the cumulative probability up to the bottom threshold for the given category, based on the mean and sd of the underlying distribution
48 The max(0, ...) is just a safety net. If the calculated difference is less than zero, zero will be used. Included because probabilities can't be less than zero
49 The probability of each data point being in the last category is one minus the cumulative probability up to the lower threshold for that category, based on the mean and sd of the underlying distribution
50 What we've created in the last few lines is a probability matrix for each data point being in each of our response categories...
51 ...these are used to describe the categorical distribution that is ultimately fit to the observed response variables
52 Define the Model
...
  #--- The Priors ---#

  # Intercept and effect coefficients
  b0 ~ dnorm((1 + nLevels) / 2, 1 / nLevels^2)
  b1 ~ dnorm(0, 1 / nLevels^2)
  b2 ~ dnorm(0, 1 / nLevels^2)

  # Sigma
  sigma ~ dunif(nLevels / 1000, nLevels * 10)

  # Intermediate alpha values (we set the min and max
  # values in our initial data list)
  for (j in 2:(nLevels - 2)) {
    alpha[j] ~ dnorm(j + 0.5, 1 / 2^2)
  }
}
"
writeLines(modelstring, con = "model.txt")
53 Our mean value (b0) should be about in the centre of our categories, and the sd is the number of categories (can't be beyond this)
54 Same logic for the sd here
55 Our prior for sigma comes from a uniform distribution with a minimum value of our number of levels divided by 1000, and a maximum value of the number of levels times 10
56 Only need alpha priors for the middle values because the outer ones were specified. These will come from a normal distribution with a mean of that level value plus 0.5, and an sd of 2. Note that what you choose here should be based on what values you used to specify the outer alphas
57 Specify Initial Values
initslist <- function() {
  list(
    b0 = rnorm(n = 1, mean = (1 + nlevels) / 2, sd = nlevels),
    b1 = rnorm(n = 1, mean = 0, sd = nlevels),
    b2 = rnorm(n = 1, mean = 0, sd = nlevels),
    sigma = runif(n = 1, min = nlevels / 1000, max = nlevels * 10)
  )
}
58 Specify MCMC Parameters and Run
runjagsout <- run.jags(
  method = "simple",
  model = "model.txt",
  monitor = c("b0", "b1", "b2", "sigma", "alpha"),
  data = datalist,
  inits = initslist,
  n.chains = 3,
  adapt = 500,
  burnin = 1000,
  sample = 20000,
  thin = 1,
  summarise = TRUE,
  plots = FALSE)
59 Specify MCMC Parameters and Run Note that there is a lot going on in this model. As a result, it takes substantially longer than our other ones. This one takes about 10 min on my computer, and my computer is fairly fast.
60 Next Steps (On Your Own) Retrieve the data and take a peek at the structure; test model performance; extract & parse results; convert back to the original scale
61 View Posteriors
62 Plotting Posterior Distributions β0
par(mfrow = c(1, 1))
histinfo = plotpost(b0, xlab = bquote(beta[0]))
(posterior histogram of β0, with its mean and 95% HDI)
63 Plotting Posterior Distributions β1 & β2
par(mfrow = c(1, 2))
histinfo = plotpost(b1, xlab = bquote(beta[1]), main = "x1")
histinfo = plotpost(b2, xlab = bquote(beta[2]), main = "x2")
(posterior histograms of β1 and β2, each with its mean and 95% HDI)
64 Plotting Posterior Distributions β1 & β2 x1 has a credible negative effect, and x2 has a credible positive effect. These are on a real scale. No need to transform them for interpretation.
65 Posterior Predictive Check
66 Posterior Predictive Check Code is clunky and slow, but should make sense (and work!) Takes about 10 minutes on my computer...so be patient! Code predicts for the entire data set (N = 200), but could use a subset For each step in the chain, count the number of individuals assigned to each level (given the predictor variables) Compare the mean and HDIs of these predictions relative to true values
67 Posterior Predictive Check
source("hdiofmcmc.r")

# Create a matrix to hold results
ypostpred <- matrix(0, nrow = chainLength, ncol = nlevels)

# For each step in the chain...
for (i in 1:chainLength) {

  # Initialize holders (counters for each level)
  counter1 <- 0
  counter2 <- 0
  counter3 <- 0
  counter4 <- 0
  counter5 <- 0
  counter6 <- 0
  counter7 <- 0
...
68 Posterior Predictive Check Starting a loop that will go through every step in the chain
69 Posterior Predictive Check For each response level, initialize a counter that will keep track of how many results were assigned to each level (re-zeroed for each step in the chain)
70 Posterior Predictive Check Then (for each step in the chain), for each individual (data point) calculate the mean, and then the probability of them being in each response level, using equations we have seen before.
...
# For each individual...
for (j in 1:N) {

  # Calculate mean
  mu = b0[i] + (b1[i] * x1[j]) + (b2[i] * x2[j])

  # Calculate their probability of being in each level
  levelprobs <- rep(0, times = nlevels)
  levelprobs[1] <- pnorm(alpha[i, 1], mu, sigma[i])
  levelprobs[2] <- pnorm(alpha[i, 2], mu, sigma[i]) - pnorm(alpha[i, 1], mu, sigma[i])
  levelprobs[3] <- pnorm(alpha[i, 3], mu, sigma[i]) - pnorm(alpha[i, 2], mu, sigma[i])
  levelprobs[4] <- pnorm(alpha[i, 4], mu, sigma[i]) - pnorm(alpha[i, 3], mu, sigma[i])
  levelprobs[5] <- pnorm(alpha[i, 5], mu, sigma[i]) - pnorm(alpha[i, 4], mu, sigma[i])
  levelprobs[6] <- pnorm(alpha[i, 6], mu, sigma[i]) - pnorm(alpha[i, 5], mu, sigma[i])
  levelprobs[7] <- 1 - pnorm(alpha[i, 6], mu, sigma[i])
...
71
# Find item number for highest value
levelid <- which.max(levelprobs)

# Increase counter for appropriate group
if (levelid == 1) {
  counter1 <- counter1 + 1
} else if (levelid == 2) {
  counter2 <- counter2 + 1
} else if (levelid == 3) {
  counter3 <- counter3 + 1
} else if (levelid == 4) {
  counter4 <- counter4 + 1
} else if (levelid == 5) {
  counter5 <- counter5 + 1
} else if (levelid == 6) {
  counter6 <- counter6 + 1
} else {
  counter7 <- counter7 + 1
}
72 Identify for which response level the individual has the highest probability
73 Increment the appropriate counter. For each step in the chain, these counters will be the number of individuals predicted to be in each response level
74 Posterior Predictive Check
...
# Place results in results matrix
ypostpred[i, 1] <- counter1
ypostpred[i, 2] <- counter2
ypostpred[i, 3] <- counter3
ypostpred[i, 4] <- counter4
ypostpred[i, 5] <- counter5
ypostpred[i, 6] <- counter6
ypostpred[i, 7] <- counter7
}

ypredmeans = apply(ypostpred, 2, median, na.rm = TRUE)
ypredhdi = apply(ypostpred, 2, HDIofMCMC)
75 Posterior Predictive Check Fill the appropriate row (step in chain) of the ypostpred matrix with the counts for each response level. ypostpred will have one row for each step in the chain, indicating how many individuals were assigned to each response level (columns)
76 Posterior Predictive Check Calculate the central tendency (here the median, stored in ypredmeans) and HDI for each response level, across all steps in the chain
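As an aside, the seven explicit counters in the loop above can be replaced with a single tabulate() call. A sketch using made-up probabilities (not the model output):

```r
set.seed(42)
N <- 200  # number of individuals
k <- 7    # number of response levels

# Hypothetical N x k probability matrix (each row normalized to sum to 1)
levelprobs <- matrix(runif(N * k), nrow = N)
levelprobs <- levelprobs / rowSums(levelprobs)

# Most probable level for each individual, then counts per level
levelid <- apply(levelprobs, 1, which.max)
counts  <- tabulate(levelid, nbins = k)  # replaces counter1..counter7

sum(counts)  # 200: every individual is assigned to exactly one level
```

One tabulate() call per step in the chain would fill a row of ypostpred directly.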
77 Posterior Predictive Check
# Plot original data
hist(y, breaks = c(0.5, (1:nlevels + 0.5)), main = "", col = "skyblue", border = "white")

# Add predicted means
points(x = 1:nlevels, y = ypredmeans, pch = 16)

# Add HDI bars
segments(x0 = 1:nlevels, y0 = ypredhdi[1, ], x1 = 1:nlevels, y1 = ypredhdi[2, ], lwd = 2)
78 Questions?
79 Creative Commons License Anyone is allowed to distribute, remix, tweak, and build upon this work, even commercially, as long as they credit me for the original creation. See the Creative Commons website for more information.
One sample z-test and t-test January 30, 2017 psych10.stanford.edu Announcements / Action Items Install ISI package (instructions in Getting Started with R) Assessment Problem Set #3 due Tu 1/31 at 7 PM
More informationChapter 7: Estimation Sections
Chapter 7: Estimation Sections 7.1 Statistical Inference Bayesian Methods: 7.2 Prior and Posterior Distributions 7.3 Conjugate Prior Distributions Frequentist Methods: 7.5 Maximum Likelihood Estimators
More informationRandom Variables and Probability Distributions
Chapter 3 Random Variables and Probability Distributions Chapter Three Random Variables and Probability Distributions 3. Introduction An event is defined as the possible outcome of an experiment. In engineering
More informationstarting on 5/1/1953 up until 2/1/2017.
An Actuary s Guide to Financial Applications: Examples with EViews By William Bourgeois An actuary is a business professional who uses statistics to determine and analyze risks for companies. In this guide,
More informationGARCH Models. Instructor: G. William Schwert
APS 425 Fall 2015 GARCH Models Instructor: G. William Schwert 585-275-2470 schwert@schwert.ssb.rochester.edu Autocorrelated Heteroskedasticity Suppose you have regression residuals Mean = 0, not autocorrelated
More informationAn Introduction to Bayesian Inference and MCMC Methods for Capture-Recapture
An Introduction to Bayesian Inference and MCMC Methods for Capture-Recapture Trinity River Restoration Program Workshop on Outmigration: Population Estimation October 6 8, 2009 An Introduction to Bayesian
More informationIOP 201-Q (Industrial Psychological Research) Tutorial 5
IOP 201-Q (Industrial Psychological Research) Tutorial 5 TRUE/FALSE [1 point each] Indicate whether the sentence or statement is true or false. 1. To establish a cause-and-effect relation between two variables,
More informationDescriptive Statistics Bios 662
Descriptive Statistics Bios 662 Michael G. Hudgens, Ph.D. mhudgens@bios.unc.edu http://www.bios.unc.edu/ mhudgens 2008-08-19 08:51 BIOS 662 1 Descriptive Statistics Descriptive Statistics Types of variables
More informationDescriptive Statistics (Devore Chapter One)
Descriptive Statistics (Devore Chapter One) 1016-345-01 Probability and Statistics for Engineers Winter 2010-2011 Contents 0 Perspective 1 1 Pictorial and Tabular Descriptions of Data 2 1.1 Stem-and-Leaf
More informationFirst Midterm Examination Econ 103, Statistics for Economists February 16th, 2016
First Midterm Examination Econ 103, Statistics for Economists February 16th, 2016 You will have 70 minutes to complete this exam. Graphing calculators, notes, and textbooks are not permitted. I pledge
More informationPutting Things Together Part 2
Frequency Putting Things Together Part These exercise blend ideas from various graphs (histograms and boxplots), differing shapes of distributions, and values summarizing the data. Data for, and are in
More informationDecision Trees: Booths
DECISION ANALYSIS Decision Trees: Booths Terri Donovan recorded: January, 2010 Hi. Tony has given you a challenge of setting up a spreadsheet, so you can really understand whether it s wiser to play in
More informationAssessment on Credit Risk of Real Estate Based on Logistic Regression Model
Assessment on Credit Risk of Real Estate Based on Logistic Regression Model Li Hongli 1, a, Song Liwei 2,b 1 Chongqing Engineering Polytechnic College, Chongqing400037, China 2 Division of Planning and
More informationA.REPRESENTATION OF DATA
A.REPRESENTATION OF DATA (a) GRAPHS : PART I Q: Why do we need a graph paper? Ans: You need graph paper to draw: (i) Histogram (ii) Cumulative Frequency Curve (iii) Frequency Polygon (iv) Box-and-Whisker
More informationConditional Power of One-Sample T-Tests
ASS Sample Size Software Chapter 4 Conditional ower of One-Sample T-Tests ntroduction n sequential designs, one or more intermediate analyses of the emerging data are conducted to evaluate whether the
More informationMonte Carlo Simulations
Is Uncle Norm's shot going to exhibit a Weiner Process? Knowing Uncle Norm, probably, with a random drift and huge volatility. Monte Carlo Simulations... of stock prices the primary model 2019 Gary R.
More informationInfluence of Personal Factors on Health Insurance Purchase Decision
Influence of Personal Factors on Health Insurance Purchase Decision INFLUENCE OF PERSONAL FACTORS ON HEALTH INSURANCE PURCHASE DECISION The decision in health insurance purchase include decisions about
More informationMetropolis-Hastings algorithm
Metropolis-Hastings algorithm Dr. Jarad Niemi STAT 544 - Iowa State University March 27, 2018 Jarad Niemi (STAT544@ISU) Metropolis-Hastings March 27, 2018 1 / 32 Outline Metropolis-Hastings algorithm Independence
More information1 Bayesian Bias Correction Model
1 Bayesian Bias Correction Model Assuming that n iid samples {X 1,...,X n }, were collected from a normal population with mean µ and variance σ 2. The model likelihood has the form, P( X µ, σ 2, T n >
More informationBoth the quizzes and exams are closed book. However, For quizzes: Formulas will be provided with quiz papers if there is any need.
Both the quizzes and exams are closed book. However, For quizzes: Formulas will be provided with quiz papers if there is any need. For exams (MD1, MD2, and Final): You may bring one 8.5 by 11 sheet of
More informationWe use probability distributions to represent the distribution of a discrete random variable.
Now we focus on discrete random variables. We will look at these in general, including calculating the mean and standard deviation. Then we will look more in depth at binomial random variables which are
More informationXLSTAT TIP SHEET FOR BUSINESS STATISTICS CENGAGE LEARNING
XLSTAT TIP SHEET FOR BUSINESS STATISTICS CENGAGE LEARNING INTRODUCTION XLSTAT makes accessible to anyone a powerful, complete and user-friendly data analysis and statistical solution. Accessibility to
More informationStudy 2: data analysis. Example analysis using R
Study 2: data analysis Example analysis using R Steps for data analysis Install software on your computer or locate computer with software (e.g., R, systat, SPSS) Prepare data for analysis Subjects (rows)
More informationHydrology 4410 Class 29. In Class Notes & Exercises Mar 27, 2013
Hydrology 4410 Class 29 In Class Notes & Exercises Mar 27, 2013 Log Normal Distribution We will not work an example in class. The procedure is exactly the same as in the normal distribution, but first
More information1. Empirical mean and standard deviation for each variable, plus standard error of the mean:
Solutions to Selected Computer Lab Problems and Exercises in Chapter 20 of Statistics and Data Analysis for Financial Engineering, 2nd ed. by David Ruppert and David S. Matteson c 2016 David Ruppert and
More informationDATA SUMMARIZATION AND VISUALIZATION
APPENDIX DATA SUMMARIZATION AND VISUALIZATION PART 1 SUMMARIZATION 1: BUILDING BLOCKS OF DATA ANALYSIS 294 PART 2 PART 3 PART 4 VISUALIZATION: GRAPHS AND TABLES FOR SUMMARIZING AND ORGANIZING DATA 296
More informationSTA Module 3B Discrete Random Variables
STA 2023 Module 3B Discrete Random Variables Learning Objectives Upon completing this module, you should be able to 1. Determine the probability distribution of a discrete random variable. 2. Construct
More informationToday s plan: Section 4.1.4: Dispersion: Five-Number summary and Standard Deviation.
1 Today s plan: Section 4.1.4: Dispersion: Five-Number summary and Standard Deviation. 2 Once we know the central location of a data set, we want to know how close things are to the center. 2 Once we know
More informationLecture 2. Probability Distributions Theophanis Tsandilas
Lecture 2 Probability Distributions Theophanis Tsandilas Comment on measures of dispersion Why do common measures of dispersion (variance and standard deviation) use sums of squares: nx (x i ˆµ) 2 i=1
More informationMaximum Likelihood Estimation Richard Williams, University of Notre Dame, https://www3.nd.edu/~rwilliam/ Last revised January 13, 2018
Maximum Likelihood Estimation Richard Williams, University of otre Dame, https://www3.nd.edu/~rwilliam/ Last revised January 3, 208 [This handout draws very heavily from Regression Models for Categorical
More informationChapter 7: Estimation Sections
1 / 40 Chapter 7: Estimation Sections 7.1 Statistical Inference Bayesian Methods: Chapter 7 7.2 Prior and Posterior Distributions 7.3 Conjugate Prior Distributions 7.4 Bayes Estimators Frequentist Methods:
More informationCS 361: Probability & Statistics
March 12, 2018 CS 361: Probability & Statistics Inference Binomial likelihood: Example Suppose we have a coin with an unknown probability of heads. We flip the coin 10 times and observe 2 heads. What can
More informationProbability and Stochastics for finance-ii Prof. Joydeep Dutta Department of Humanities and Social Sciences Indian Institute of Technology, Kanpur
Probability and Stochastics for finance-ii Prof. Joydeep Dutta Department of Humanities and Social Sciences Indian Institute of Technology, Kanpur Lecture - 07 Mean-Variance Portfolio Optimization (Part-II)
More informationOutline. Review Continuation of exercises from last time
Bayesian Models II Outline Review Continuation of exercises from last time 2 Review of terms from last time Probability density function aka pdf or density Likelihood function aka likelihood Conditional
More information9/17/2015. Basic Statistics for the Healthcare Professional. Relax.it won t be that bad! Purpose of Statistic. Objectives
Basic Statistics for the Healthcare Professional 1 F R A N K C O H E N, M B B, M P A D I R E C T O R O F A N A L Y T I C S D O C T O R S M A N A G E M E N T, LLC Purpose of Statistic 2 Provide a numerical
More informationChapter 8 Statistical Intervals for a Single Sample
Chapter 8 Statistical Intervals for a Single Sample Part 1: Confidence intervals (CI) for population mean µ Section 8-1: CI for µ when σ 2 known & drawing from normal distribution Section 8-1.2: Sample
More informationThe CreditMetrics Package
The Creditetrics Package October 19, 2006 Version 0.0-1 Date 2006-10-18 Title Functions for calculating the Creditetrics risk model Author Andreas Wittmann aintainer Andreas Wittmann
More informationLecture 2 Describing Data
Lecture 2 Describing Data Thais Paiva STA 111 - Summer 2013 Term II July 2, 2013 Lecture Plan 1 Types of data 2 Describing the data with plots 3 Summary statistics for central tendency and spread 4 Histograms
More informationExtend the ideas of Kan and Zhou paper on Optimal Portfolio Construction under parameter uncertainty
Extend the ideas of Kan and Zhou paper on Optimal Portfolio Construction under parameter uncertainty George Photiou Lincoln College University of Oxford A dissertation submitted in partial fulfilment for
More information# generate data num.obs <- 100 y <- rnorm(num.obs,mean = theta.true, sd = sqrt(sigma.sq.true))
Posterior Sampling from Normal Now we seek to create draws from the joint posterior distribution and the marginal posterior distributions and Note the marginal posterior distributions would be used to
More informationProblem Set 6. I did this with figure; bar3(reshape(mean(rx),5,5) );ylabel( size ); xlabel( value ); mean mo return %
Business 35905 John H. Cochrane Problem Set 6 We re going to replicate and extend Fama and French s basic results, using earlier and extended data. Get the 25 Fama French portfolios and factors from the
More informationSupplementary material for the paper Identifiability and bias reduction in the skew-probit model for a binary response
Supplementary material for the paper Identifiability and bias reduction in the skew-probit model for a binary response DongHyuk Lee and Samiran Sinha Department of Statistics, Texas A&M University, College
More informationF1 Results. News vs. no-news
F1 Results News vs. no-news With news visible, the median trading profits were about $130,000 (485 player-sessions) With the news screen turned off, median trading profits were about $165,000 (283 player-sessions)
More informationMaximum Likelihood Estimation Richard Williams, University of Notre Dame, https://www3.nd.edu/~rwilliam/ Last revised January 10, 2017
Maximum Likelihood Estimation Richard Williams, University of otre Dame, https://www3.nd.edu/~rwilliam/ Last revised January 0, 207 [This handout draws very heavily from Regression Models for Categorical
More informationSummarising Data. Summarising Data. Examples of Types of Data. Types of Data
Summarising Data Summarising Data Mark Lunt Arthritis Research UK Epidemiology Unit University of Manchester Today we will consider Different types of data Appropriate ways to summarise these data 17/10/2017
More informationHow I Trade Forex Using the Slope Direction Line
How I Trade Forex Using the Slope Direction Line by Jeff Glenellis Copyright 2009, Simple4xSystem.net By now, you should already have both the Slope Direction Line (S.D.L.) and the Fibonacci Pivot (FiboPiv)
More informationFinal Exam Suggested Solutions
University of Washington Fall 003 Department of Economics Eric Zivot Economics 483 Final Exam Suggested Solutions This is a closed book and closed note exam. However, you are allowed one page of handwritten
More informationAP STATISTICS FALL SEMESTSER FINAL EXAM STUDY GUIDE
AP STATISTICS Name: FALL SEMESTSER FINAL EXAM STUDY GUIDE Period: *Go over Vocabulary Notecards! *This is not a comprehensive review you still should look over your past notes, homework/practice, Quizzes,
More information2 Exploring Univariate Data
2 Exploring Univariate Data A good picture is worth more than a thousand words! Having the data collected we examine them to get a feel for they main messages and any surprising features, before attempting
More informationTHE UNIVERSITY OF TEXAS AT AUSTIN Department of Information, Risk, and Operations Management
THE UNIVERSITY OF TEXAS AT AUSTIN Department of Information, Risk, and Operations Management BA 386T Tom Shively PROBABILITY CONCEPTS AND NORMAL DISTRIBUTIONS The fundamental idea underlying any statistical
More informationPAIRS TRADING (just an introduction)
PAIRS TRADING (just an introduction) By Rob Booker Trading involves substantial risk of loss. Past performance is not necessarily indicative of future results. You can share this ebook with anyone you
More informationPredicting the Market
Predicting the Market April 28, 2012 Annual Conference on General Equilibrium and its Applications Steve Ross Franco Modigliani Professor of Financial Economics MIT The Importance of Forecasting Equity
More informationStatistics/BioSci 141, Fall 2006 Lab 2: Probability and Probability Distributions October 13, 2006
Statistics/BioSci 141, Fall 2006 Lab 2: Probability and Probability Distributions October 13, 2006 1 Using random samples to estimate a probability Suppose that you are stuck on the following problem:
More informationHypothesis Tests: One Sample Mean Cal State Northridge Ψ320 Andrew Ainsworth PhD
Hypothesis Tests: One Sample Mean Cal State Northridge Ψ320 Andrew Ainsworth PhD MAJOR POINTS Sampling distribution of the mean revisited Testing hypotheses: sigma known An example Testing hypotheses:
More information1. better to stick. 2. better to switch. 3. or does your second choice make no difference?
The Monty Hall game Game show host Monty Hall asks you to choose one of three doors. Behind one of the doors is a new Porsche. Behind the other two doors there are goats. Monty knows what is behind each
More informationPh.D. Preliminary Examination MICROECONOMIC THEORY Applied Economics Graduate Program June 2017
Ph.D. Preliminary Examination MICROECONOMIC THEORY Applied Economics Graduate Program June 2017 The time limit for this exam is four hours. The exam has four sections. Each section includes two questions.
More informationM249 Diagnostic Quiz
THE OPEN UNIVERSITY Faculty of Mathematics and Computing M249 Diagnostic Quiz Prepared by the Course Team [Press to begin] c 2005, 2006 The Open University Last Revision Date: May 19, 2006 Version 4.2
More informationOccupancy models with detection error Peter Solymos and Subhash Lele July 16, 2016 Madison, WI NACCB Congress
Occupancy models with detection error Peter Solymos and Subhash Lele July 16, 2016 Madison, WI NACCB Congress Let us continue with the simple occupancy model we used previously. Most applied ecologists
More informationSTA2601. Tutorial letter 105/2/2018. Applied Statistics II. Semester 2. Department of Statistics STA2601/105/2/2018 TRIAL EXAMINATION PAPER
STA2601/105/2/2018 Tutorial letter 105/2/2018 Applied Statistics II STA2601 Semester 2 Department of Statistics TRIAL EXAMINATION PAPER Define tomorrow. university of south africa Dear Student Congratulations
More informationProbability. An intro for calculus students P= Figure 1: A normal integral
Probability An intro for calculus students.8.6.4.2 P=.87 2 3 4 Figure : A normal integral Suppose we flip a coin 2 times; what is the probability that we get more than 2 heads? Suppose we roll a six-sided
More informationGamma Distribution Fitting
Chapter 552 Gamma Distribution Fitting Introduction This module fits the gamma probability distributions to a complete or censored set of individual or grouped data values. It outputs various statistics
More informationArius Deterministic Exhibit Statistics
Arius Deterministic Exhibit Statistics Milliman, Inc. 3424 Peachtree Road, NE Suite 1900 Atlanta, GA 30326 USA Tel +1 800 404 2276 Fax +1 404 237 6984 actuarialsoftware.com Information in this document
More informationDescription of Data I
Description of Data I (Summary and Variability measures) Objectives: Able to understand how to summarize the data Able to understand how to measure the variability of the data Able to use and interpret
More informationDiscrete Random Variables
Discrete Random Variables In this chapter, we introduce a new concept that of a random variable or RV. A random variable is a model to help us describe the state of the world around us. Roughly, a RV can
More informationQuantitative Analysis and Empirical Methods
3) Descriptive Statistics Sciences Po, Paris, CEE / LIEPP Introduction Data and statistics Introduction to distributions Measures of central tendency Measures of dispersion Skewness Data and Statistics
More informationCategorical. A general name for non-numerical data; the data is separated into categories of some kind.
Chapter 5 Categorical A general name for non-numerical data; the data is separated into categories of some kind. Nominal data Categorical data with no implied order. Eg. Eye colours, favourite TV show,
More informationMAS187/AEF258. University of Newcastle upon Tyne
MAS187/AEF258 University of Newcastle upon Tyne 2005-6 Contents 1 Collecting and Presenting Data 5 1.1 Introduction...................................... 5 1.1.1 Examples...................................
More informationProperties of Probability Models: Part Two. What they forgot to tell you about the Gammas
Quality Digest Daily, September 1, 2015 Manuscript 285 What they forgot to tell you about the Gammas Donald J. Wheeler Clear thinking and simplicity of analysis require concise, clear, and correct notions
More informationEE266 Homework 5 Solutions
EE, Spring 15-1 Professor S. Lall EE Homework 5 Solutions 1. A refined inventory model. In this problem we consider an inventory model that is more refined than the one you ve seen in the lectures. The
More informationTests for One Variance
Chapter 65 Introduction Occasionally, researchers are interested in the estimation of the variance (or standard deviation) rather than the mean. This module calculates the sample size and performs power
More information