Introduction to Population Modeling

In addition to estimating the size of a population, it is often beneficial to estimate how the population size changes over time. Ecologists often use models to create a mathematical description of a population. Instead of using N to represent population size, the term N(t) will be used to represent the population size at a particular time, t.

A linear model can be written as $N(t) = a + b t$. The linear model states that whenever the time t increases by one unit, the population size increases by a fixed amount, b. While the linear model is very simple, most animal populations do not grow linearly.

Exponential models are often used to model population growth. These models assume the population size changes by a fixed percentage each year and can be written as

$N(t) = N(0) R^t$

In this expression, N(0) is the starting population size at time 0 and R is called the growth rate of the population. If R is less than one, the population is decreasing, and if R is greater than one, the population is increasing.

Example: In the initial year of a study (time t = 0), the population of a certain species of fish starts at 1000 and is expected to increase by 10% each year. Thus the number in the population at time t is equal to the number at time (t-1) multiplied by the growth rate of the population. We can calculate the population for the following years as follows:

Population at year 1: $N(1) = N(0) + N(0) \times 10\% = N(0)(1 + 0.1) = 1000 \times 1.1 = 1100$
Population at year 2: $N(2) = N(1) + N(1) \times 10\% = N(1)(1.1) = N(0)(1.1)(1.1) = 1000 \times 1.1^2 = 1210$
Population at year 20: $N(20) = N(0) R^{20} = 1000 (1.1)^{20} = 6727.50$

In this example, N(0) = 1000 and R = 1.1.
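The worked example above can also be checked with a short Python sketch. The function names here are illustrative and are not part of the original exercise; the code only assumes the exponential model as written above.

```python
def exponential_population(n0, rate, t):
    """Closed form of the exponential model: N(t) = N(0) * R**t."""
    return n0 * rate ** t


def iterate_population(n0, rate, years):
    """Build N(0), ..., N(years) one year at a time using N(t) = N(t-1) * R."""
    sizes = [n0]
    for _ in range(years):
        sizes.append(sizes[-1] * rate)
    return sizes


# Values from the fish example: N(0) = 1000 and R = 1.1.
for year in (1, 2, 20):
    print(year, round(exponential_population(1000, 1.1, year), 2))

# The year-by-year recursion and the closed form should agree.
print(round(iterate_population(1000, 1.1, 20)[-1], 2))
```

The printed values should match the hand calculations for years 1, 2, and 20 up to rounding. The recursive version mirrors the statement that each year's population equals the previous year's population multiplied by R.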
1) Assume that an initial population started at 1000 individuals and the population declines by 5% each year. What are N(0), R, and N(20)?

2) Use software to create 5 columns of data. The first column should list the times t = 0, 1, 2, ..., 30. The next four columns should be four exponential models. All four models have an initial population size of 1000. The growth rates for the four models should be 0.8, 1, 1.1, and 1.4, respectively. Plot all four models on one graph. Rescale the graph so that the y-axis ranges from 0 to 5000. Describe any patterns that you see.

Using Regression to Estimate a Population Growth Rate

Most introductory statistics textbooks describe how to use least squares techniques to calculate the slope and y-intercept in simple linear regression models. These techniques do not work directly for modeling population over time because the terms in the model, $N(t) = N(0) R^t$, are not additive. However, a natural log transformation can be applied to this model to make it look like a simple linear regression model:

$\ln N(t) = \ln\{N(0) R^t\} = \ln N(0) + \ln R^t = \ln N(0) + t \ln R = a + t b$

3) Repeat the previous question, but instead of plotting N(t) versus t, plot ln N(t) versus t. Describe any patterns that you see.

4) For the first model with R = 0.8, use simple linear regression to calculate a and b. Give a clear explanation of how you can use b to estimate the growth rate in the exponential model.

5) Try a few other transformations on the first model with R = 0.8 (such as the square root or cube root of N(t)) and conduct a simple linear regression. For each model, create a residuals versus t plot. Explain any patterns you see in these residual plots.

6) Go to the Great Fish Pond game located at www.cs.grinnell.edu/~kuipers/statsgames/capturerecapture and select the modeling option. Play the game and collect 20 years of data. Copy and paste the data into a software package. Then plot the data for the two ponds. Note that the
same population model is used for all data you collect from your sample. However, if you restart the game, the population model will change.

7) Transform the data, then use simple linear regression techniques to estimate a and b for both the fish food and natural pond models. Finally, use b to estimate R for both models. Submit your two exponential models.

8) Use the models in the previous problem to make your estimates in the Great Fish Pond game.

Determining if Two Slopes are Significantly Different

Assumptions: Before conducting a hypothesis test for regression models, it is necessary to check the model assumptions. The simple linear regression model assumes that:

   1) the parameters are constant (the true slope and y-intercept do not change throughout the entire time period of the study),
   2) the terms in the model are additive,
   3) the error terms sum to zero,
   4) the error terms are independent and identically distributed,
   5) the variability of the error terms in the regression model does not depend on x, and
   6) the error terms follow a normal probability distribution.

Details of these assumptions are discussed in most introductory textbooks and will not be given here. However, we will focus on the fifth assumption.

9) Refer to the graph of your original Great Fish Pond data in question 6). Notice that the points appear to be much more spread out as t increases. Recall that the data from your game came from capture-recapture estimates. Assuming that the capture-recapture sample sizes are the same for all 20 years, explain why estimates tend to be more variable as the true population increases.

10) Conduct a simple linear regression on the original game data and create a residuals versus t plot. Explain the patterns you see in the residual plot.

11) Conduct a simple linear regression on the transformed game data, ln N(t), and create a residuals versus t plot. Explain the patterns you see in the residual plot.
Specifically, does the variability of the residuals still appear to depend on x?
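The workflow in the questions above (fit a line to the raw counts, fit a line to the log counts, compare residuals) can be sketched in plain Python. This is only an illustration under stated assumptions: the data are simulated from a hypothetical model N(t) = 1000 * 1.1^t with multiplicative noise, not taken from the Great Fish Pond game, and the helper names fit_line and spread are made up for this sketch.

```python
import math
import random


def fit_line(xs, ys):
    """Least-squares intercept a and slope b for y = a + b*x, plus residuals."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = y_bar - b * x_bar
    residuals = [y - (a + b * x) for x, y in zip(xs, ys)]
    return a, b, residuals


def spread(values):
    """Root mean square of a list of residuals."""
    return math.sqrt(sum(v * v for v in values) / len(values))


# ASSUMPTION: simulated data, true model N(t) = 1000 * 1.1**t with noise
# that multiplies the true size (so variability grows with N(t)).
rng = random.Random(1)
years = list(range(21))
counts = [1000 * 1.1 ** t * math.exp(rng.gauss(0, 0.1)) for t in years]

# Fit on the original scale and on the natural-log scale.
_, _, raw_res = fit_line(years, counts)
a, b, log_res = fit_line(years, [math.log(n) for n in counts])

# Since ln N(t) = a + t*b, back-transform with exp to recover N(0) and R.
print("estimated N(0):", round(math.exp(a), 1))
print("estimated R:", round(math.exp(b), 3))

half = len(years) // 2
print("raw residual spread, early vs late:",
      spread(raw_res[:half]), spread(raw_res[half:]))
print("log residual spread, early vs late:",
      spread(log_res[:half]), spread(log_res[half:]))
```

On the original scale the residuals reflect both the curvature of the exponential model and the growing variability, so comparing the early and late spreads on each scale gives a rough numeric companion to the residual plots.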
The natural log function is one type of variance stabilizing transformation. In essence, if the variability of N(t) increases as N(t) increases, a variance stabilizing transformation is often helpful in ensuring that assumption 5) is met.

Hypothesis Tests and Confidence Intervals: Once the model assumptions are met, most introductory statistics textbooks show that the estimated slope b follows a normal distribution. Using the subscript f to represent the fish food model and n to represent the natural model, it can be shown that:

12) Any linear combination of independent normal random variables is also normally distributed. Use the equations shown above to find the mean and variance of $b_f - b_n$.

13) Give the formula for the confidence interval for $b_f - b_n$ assuming that $\sigma_f$ and $\sigma_n$ are known.

14) Give the formula for the confidence interval for $b_f - b_n$ assuming that $\sigma_f$ and $\sigma_n$ are unknown. In other words, use $s_f$ and $s_n$ as estimates for $\sigma_f$ and $\sigma_n$, and use the t distribution with the appropriate degrees of freedom instead of the normal distribution.

Following similar logic as the previous questions, it can be shown that the test statistic for the difference of two slopes is given by

with degrees of freedom

and
15) Use the confidence interval you created in the previous question to determine if the growth rates for the two ponds are significantly different. Discuss the impact of random sampling and random allocation when you state your conclusions.
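As a rough companion to the comparison of slopes, the following Python sketch computes an interval for the difference of two fitted slopes. Several choices here are assumptions rather than the handout's own formulas: the two residual standard deviations are estimated separately (not pooled), the default critical value is a normal quantile used as a large-sample stand-in for the t quantile, and the function names and data values are made up for illustration. Because b = ln R on the log scale, a difference in slopes corresponds to a difference in growth rates.

```python
import math
from statistics import NormalDist


def slope_with_se(ts, ys):
    """Return the least-squares slope b and its estimated standard error."""
    n = len(ts)
    t_bar = sum(ts) / n
    y_bar = sum(ys) / n
    stt = sum((t - t_bar) ** 2 for t in ts)
    b = sum((t - t_bar) * (y - y_bar) for t, y in zip(ts, ys)) / stt
    a = y_bar - b * t_bar
    sse = sum((y - (a + b * t)) ** 2 for t, y in zip(ts, ys))
    s = math.sqrt(sse / (n - 2))  # residual standard deviation
    return b, s / math.sqrt(stt)


def slope_difference_interval(ts_f, ys_f, ts_n, ys_n, critical=None):
    """Interval for b_f - b_n using separate (unpooled) variance estimates."""
    if critical is None:
        # ASSUMPTION: normal quantile for a 95% interval; pass a t quantile
        # here once the appropriate degrees of freedom are known.
        critical = NormalDist().inv_cdf(0.975)
    b_f, se_f = slope_with_se(ts_f, ys_f)
    b_n, se_n = slope_with_se(ts_n, ys_n)
    diff = b_f - b_n
    se_diff = math.sqrt(se_f ** 2 + se_n ** 2)
    return diff, se_diff, (diff - critical * se_diff, diff + critical * se_diff)


# Hypothetical log population estimates (made-up values, not game data).
years = list(range(8))
log_food = [6.91, 7.02, 7.08, 7.21, 7.30, 7.37, 7.49, 7.58]
log_natural = [6.90, 6.97, 6.99, 7.07, 7.10, 7.17, 7.19, 7.26]

diff, se_diff, (low, high) = slope_difference_interval(
    years, log_food, years, log_natural)
print("difference in slopes:", diff)
print("standard error:", se_diff)
print("approximate 95% interval:", (low, high))
```

If the interval excludes zero, the data are consistent with different growth rates for the two ponds, subject to the model assumptions and the approximations noted above.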