Copyright 2011 Pearson Education, Inc. Publishing as Addison-Wesley.
Appendix: Statistics in Action

Part I

Financial Time Series

1. These data show the effects of stock splits. If you investigate further, you'll find that most of these splits (such as in May 1970) are 3-for-1 splits rather than the 2-for-1 splits seen in the text example. You can recognize that the May 1970 split is a 3-for-1 split by observing that the price is $97.50 at the end of April 1970 and falls to $ (about 1/3 of the prior price) at the end of May 1970.

2. a. The White Space Rule says that this figure doesn't show much of what is happening in the first ¾ of the time period, up through about 1985 or This portion of the data is hidden because of the large values produced by compounding in the later years.
b. No. These data have a strong pattern of dependence that would be concealed by a histogram.

3. The variation in the month-to-month changes lacks a clear pattern, and a histogram is a helpful summary of such variation.

4. The percentage changes show the gains produced by an investment one month at a time. Cumulative values hide the month-to-month variation in the performance of the stock behind the effects of the compounding value.

5. a. A return is 1/100 times the corresponding percentage change, so the SD of the returns is 1/100 times the SD of the percentage changes.
b. If we accept the Empirical Rule, then we can find the number of standard deviations that separate the mean from -2.5%. A drop of 2.5% amounts to ( )/ , or about ½ of an SD below the mean. That's not very unusual based on the Empirical Rule, which we should check by inspecting the histogram of percentage changes. A simpler approach might be better than relying on the Empirical Rule: find the percentage of months with a drop of 2.5% or more. This approach is not without assumptions. We still need to make sure that there are no patterns (dependence) in the time series of percentage changes. (Use the visual test for association.)
c. Let's find the VaR at 2.5%, excluding the worst 2.5% of outcomes.
Using the Empirical Rule, that implies that the stock could change by its mean minus two standard deviations, or by 1.55 - 2*7.55 = -13.55%. That's a loss of $

Executive Compensation

1. Make sure that the histogram is bell-shaped and based on data that do not show a pattern.

2. The skewness in incomes persists after removing outliers. Trimming outliers from a skewed distribution seldom improves its shape; it only sacrifices data for little or no change in the shape of the distribution.

3. You can convert to base-10 logs from natural (base-e) logs by the formula log_10 x = (log_e x)/(log_e 10). The constant log_e 10 ≈ 2.3026. As a result, all log transformations have the same effect on the shape of a histogram; they only differ in the scale.

4. The value is closer to the median, not the mean. The mean of compensation is $4.268 million, and the median is $2.533 million. The average of log_10 compensation is 6.411, and taking the antilog gives $2.576 million. Working on a log scale down-weights the effects of outliers, so 10 raised to the mean of the base-10 logs is closer to the median. There's an important lesson here: logs and averaging do not commute. The average of the log of a variable is less than the log of the average.

Part II

Dice Simulation

1. Luck! The mean for Red is astonishingly large, ,700. For most investors, however, Red loses value because of its large volatility. For some, though, there remains a chance of winning a lot with Red. All you have to do is keep rolling those 4s, 5s, and 6s.

2. The mix used to form Pink in the example is not the best ratio, but we have to define what we mean by best. Given the discussion of this example, we might for instance choose to find the mix that has the largest
long-run expected value. That is, pick the fraction x to keep in Red in order to maximize

V(x) = [x(0.71) + (1-x)(0.008)] - ½ [x^2 ( ) + (1-x)^2 (0.0016)]

where the first bracket is the mean and the second is the variance. That's a basic calculus problem (or you could use the Excel Solver). The derivative of V(x) is V'(x) = x (1-x) Setting V'(x) = 0 gives x ≈ 0.403. That is, keep about 40% of your wealth in Red with the rest in White. The long-run return grows from about 14% (0.141, Table 6) to about 15% (V(0.403) ≈ 0.149).

3. The investments perform independently of one another. What happens to Red, for instance, has no effect on what happens to either Green or White. Real investments are all affected by the health of the general economy.

4. Method (a) does not compound whereas method (b) does. That makes a big difference in the results. Here are the details.

We can do a very complete analysis of method (a) since it amounts to summing independent random variables. The expected value of the random variable A, which represents the outcome of a $1 bet using method (a), is E(A) = ( )/6 = On average you win 3.33% of the amount wagered. The variance of A is Var(A) = E(A^2) - (E A)^2 = ( )/ = The expected outcome of 100 rounds with $1 each time, because each round is independent, is then = The variance of 100 independent rounds is then (SD 3.266). The CLT implies, then, that you're likely to win, but not very much. For example, the chance of winning at this game is

P(A_1 + A_2 + ... + A_100 > 100) ≈ P(N(0,1) > ( )/ )

Method (b) compounds, as in the dice game. The amount won in each round affects the amount won in the next. The expected position after 100 rounds (starting from a $1 initial wager) is Compounding suggests a large gain, but compounding also means that volatility drag enters the picture. Using the expression from the text, the long-run return of this approach is (mean minus half of the variance)

long-run return = 0.0333 - ½ (0.1067) ≈ -0.02

That is, the volatility will gradually reduce the value of the investment.
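The comparison of the two betting methods can be checked with a few lines of arithmetic. The following Python sketch is only an illustration and is not part of the original solution; it assumes the per-round mean gain of 3.33% and the variance of 0.1067 quoted in this answer, and the variable names are made up for the example.

```python
import math

# Values quoted in the answer; the names are illustrative only.
mean_return = 0.0333   # average gain per $1 wagered (3.33%)
var_return = 0.1067    # variance of the outcome of one round
rounds = 100

# Method (a): 100 independent $1 bets, so means and variances add.
expected_total = rounds * (1 + mean_return)
sd_total = math.sqrt(rounds * var_return)

# Normal (CLT) approximation to P(total > 100), the chance of ending ahead.
z = (rounds - expected_total) / sd_total
p_ahead = 0.5 * (1 - math.erf(z / math.sqrt(2)))   # P(Z > z)

# Method (b): compounding, so volatility drag applies
# (long-run return = mean minus half of the variance).
long_run_return = mean_return - 0.5 * var_return

print(f"SD of the total over {rounds} rounds: {sd_total:.3f}")
print(f"Approximate chance of ending ahead:   {p_ahead:.3f}")
print(f"Long-run return per round, method b:  {long_run_return:.4f}")
```

Under these assumptions the normal approximation puts the chance of coming out ahead with method (a) well above one half, while the long-run return for method (b) is negative, which is the contrast the answer describes.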
Typically, strategy (b) loses about 2% each round.

M&Ms

Here's another way to think about the method for weighing by counts. We are going to compute the upper 99.5% point of the distribution of the weight of a package that has too few pieces. If the weight of a package is more than this percentile, then there is a very high probability that the package has enough pieces.

1. Follow the illustration for the bolts from the text. We need to find the weight x such that (with T_59 the total weight of 59 M&Ms and using normality)

P(T_59 < x) = P(Z < (x - 59*0.86)/(0.04*sqrt(59))) = 0.995

The expression on the right hand side of the inequality has to equal the 99.5% percentile of a standard normal distribution,

(x - 59*0.86)/(0.04*sqrt(59)) = 2.576

Solving for x gives x = 59*0.86 + 2.576*0.04*sqrt(59) ≈ 51.53 grams. That is, 99.5% of packages with 59 pieces weigh less than 51.53 grams. A package that weighs more than this signals that it has more than 59 pieces. A bag with 60 pieces weighs on average slightly more, 60*0.86 = 51.6 grams. Hence, many bags with 60 pieces will fall below the threshold and get an extra piece. (See the answer to #3 for the probability.)

2. We need to assume that the weights of individual candies are normally distributed, whereas for #1 we could appeal to the central limit theorem. As in the solution of #1, let T_9 denote the weight of 9 M&Ms. We need to find the weight threshold x so that

P(T_9 < x) = P(Z < (x - 9*0.86)/(0.04*sqrt(9))) = 0.995

The expression on the right has to equal 2.576, or

(x - 9*0.86)/(0.04*sqrt(9)) = 2.576

and x = 9*0.86 + 2.576*0.04*sqrt(9) ≈ 8.05 grams. That's the upper 99.5% point of the weights of bags with 9 pieces. A bag with 10 pieces on average weighs 10*0.86 = 8.6 grams. We won't have to add so many pieces as in #1 (see the answer to #3 next).

3. The probability that we have to put in more than enough to fill the bag with 60 pieces (i.e., the chance that a bag with 60 weighs less than the acceptance threshold found in #1) is
P(T_60 < 51.53) = P(Z < (51.53 - 60*0.86)/(sqrt(60)*0.04)) = P(Z < )

The chance that a bag with 10 weighs less than the threshold from #2 is

P(T_10 < 8.05) = P(Z < (8.05 - 10*0.86)/(sqrt(10)*0.04)) = P(Z < -4.35)

There's less chance to put in extra pieces in these small bags. On the other hand, we need to assume that the weights are normally distributed. (The result works out like this for M&Ms because the variation in the weight of a piece is so small relative to the mean; i.e., the coefficient of variation is small.)

Part III

Rare Events

1. Reverse the definition of success and failure. If all of the results were successful, then the approximate 95% confidence interval for the population proportion of success is 1 - 3/n to 1.

2. To raise the level of confidence requires solving a slightly different equation. The solution p* found in the text solves the equation (1-p*)^n = 0.05, which implies that p* = 1 - 0.05^(1/n). If we set the level of confidence to 99.75%, then the new value for the confidence limit q* is q* = 1 - 0.0025^(1/n). We can also find an approximation for q* using the same procedure described in Behind the Math that gives the Rule of Three its name. Following that argument, write q* = x/n. It follows that x ≈ -log_e 0.0025 ≈ 5.99. That means we get the higher level of confidence by using the Rule of Six. The 99.75% confidence interval is approximately [0, 6/n].

3. The Rule of Three is designed to work for large values of n. If n = 20, then the exact value for the upper limit of the 95% confidence interval p* is (as in Question 2) p* = 1 - 0.05^(1/n) = 1 - 0.05^(1/20) ≈ 0.139. The approximate value is larger, namely 3/20 = 0.15. The implication is that the 95% confidence interval [0, 3/n] is conservative, or what you might call cautious. The coverage of this simple interval is larger than the nominal level of 95%.

Testing Association

1. No. The fact that we have 200 samples from each location makes it easier to compare the proportions in each case, but that's only convenient, not necessary.
The main constraint on the number of cases in each row is that the count be large enough to satisfy the sample size condition that assures that we expect at least 10 in each cell. (This rule corresponds to the sample size condition used for proportions, n p ≥ 10 and n (1-p) ≥ 10.) The sample size within the rows does not determine the degrees of freedom for the chi-squared statistic; that's always (#rows-1)(#columns-1).

2. No. These are 84 p-values, not 84 population parameters. For products with p-values less than 0.025, we anticipate finding association in the population. That may not be the case. Recall that the p-value accepts a chance for a false-positive result. It could be the case that H_0: no association is true, but the p-value is less than 0.025. In fact, if H_0 holds, 2.5% of p-values will be less than 0.025. Since we have 650 p-values, we can expect about 0.025*650 ≈ 16 p-values to be less than 0.025 by chance alone. We just don't know which these are!

3. The analysis shows whether customer preferences for items (such as color preferences) depend on where we find the customers. If color choice and location are associated, then we may choose to stock different color mixes in the different locations. On the other hand, if the color choices are independent of the location, we can manage the stock in all of the locations similarly.

Part IV

Analyzing Experiments

1. The estimates in Table 4 imply that the change in sales in the Midwest if ads feature small labor costs is D(Midwest) + D(Sm Labor) + Interaction = = $57.5
The easy way to get this answer is to recognize that the fit of the anova regression is the mean of the cell in Table 5 that combines Midwest and Small Labor, $57.5 (third row, first column).

2. The standard errors reflect the sample sizes and the balanced layout of the table. We observe 10 cases for each of the 12 combinations of Region and Advertising, with 30 observations for each region and 40 for each ad type. Intuitively, it is easy to anticipate that the interactions are less well determined (higher SE) since they rely on combinations.

To see why the coefficients of the dummy variables for region and advertising type have the same standard errors even though they are associated with different numbers of cases, write out several equations of the fitted model. In particular, the intercept of the fitted model is the mean for the Total price in the West, ŷ = b_0. The fitted value for Total price in the Midwest is ŷ = b_0 + b_Midwest. Hence, the coefficient of D(Midwest) is the difference between the means of these two cells of Table 1 ( = matches the slope of D(Midwest) in Table 4 on page 733). Analogously, the fitted value for Small parts in the West is ŷ = b_0 + b_SmParts. Hence, the coefficient for D(SmParts) from Table 1 is the difference between the mean of Total price in the West with Small Parts in the West ( = 27.4). Thus, the estimated slopes of the dummy variables for region and ad type are differences between pairs of means in Table 1.

The standard error of the difference between two means each estimated from 10 cases is sqrt(σ^2 (1/10 + 1/10)) = sqrt(σ^2/5) (see Chapter 18). Using s_e to estimate σ, we obtain SE = /sqrt(5) ≈ 80.6 as shown in Table 4 for the slopes for advertising and region.

3. The interaction remains, but the plot now changes the roles of ad type and region. Rather than join averages associated with the same type of ad, join averages from the same region. Here's the plot.
The fact that the lines that join the means from each region are not parallel again shows the presence of the interaction.

4. Use the s_e from the shown model, s_e = as the estimate of σ. Then estimate the SE for the difference between any pair of means in Table 1 as sqrt(2) s_e/sqrt(10) Within a region, we have 3 means to compare, so there are 3 pairwise comparisons. If we want to keep an overall alpha level of 0.05, we can test each comparison at level 0.05/3 with z value Hence, in order to be different, the absolute value of the difference between means within a row of Table 1 must be at least In the Northeast, only the difference between Small Labor and Small Parts exceeds this threshold.
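The threshold in answer 4 follows from a Bonferroni adjustment, which can be sketched in Python as below. This is a hedged illustration rather than the book's own calculation: it assumes a two-sided normal test for each comparison and uses the standard error of 80.6 quoted from Table 4 in answer 2 for the difference between two cell means. The variable names are invented for the example.

```python
from statistics import NormalDist

# SE of the difference between two cell means, as quoted from Table 4.
se_diff = 80.6
comparisons = 3     # pairwise comparisons among the 3 means in a region
alpha = 0.05        # overall level to protect

# Bonferroni: split alpha across the comparisons. A two-sided test is
# ASSUMED here, so each tail gets half of the per-comparison level.
alpha_each = alpha / comparisons
z_crit = NormalDist().inv_cdf(1 - alpha_each / 2)

# Two means differ only if |difference| exceeds this threshold.
threshold = z_crit * se_diff

print(f"per-comparison level: {alpha_each:.4f}")
print(f"critical z value:     {z_crit:.3f}")
print(f"required difference:  {threshold:.1f}")
```

Changing `comparisons` shows how quickly the required difference grows as more pairs of means are compared at the same overall level.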
5. We deleted a random selection of 15 cases from the data table. Here's the corresponding table of estimates. In general, the estimates are similar, but the consistency of the standard errors is lost. (Most of the standard errors are slightly larger due to the smaller sample size.)

Intercept <.0001*
Region[Northeast]
Region[South]
Region[Midwest]
Price Partition[Small labor]
Price Partition[Small Parts]
Price Partition[Small labor]*Region[Northeast] *
Price Partition[Small labor]*Region[South] *
Price Partition[Small labor]*Region[Midwest]
Price Partition[Small Parts]*Region[Northeast]
Price Partition[Small Parts]*Region[South]
Price Partition[Small Parts]*Region[Midwest] *

Automated Modeling

1. The following table shows the coefficient estimates for the stepwise model including D(Rush). The estimate is negative, indicating that rush jobs (given the other characteristics in the model) tend to be less costly.

Intercept <.0001*
Labor Hours *
Breakdown/Unit <.0001*
Total Metal Cost <.0001*
Temp Deviation <.0001*
Plant[NEW] *
1/Units *
D[Rush] *

Table 2 explains why the fitted model makes it appear that, both marginally and within the regression, rushed jobs are cheaper to produce. The explanation is that rushed jobs tend to be simpler jobs, lacking detail.

2. If you add Room Temp to the shown regression, the resulting fit is

Intercept
Labor Hours <.0001*
Breakdown/Unit <.0001*
Total Metal Cost <.0001*
Temp Deviation <.0001*
Plant[NEW] *
Room Temp

The net effect for room temperature is Temp (Temp-75)^2 You can table and chart this function of temperature in a spreadsheet to find the minimum value (or use calculus). Adding the linear trend shifts the optimum low-cost temperature from 75 down to /(2*0.0356) ≈ 73.1
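The "use calculus" route in answer 2 can be sketched in Python as follows. The quadratic coefficient 0.0356 comes from the answer itself, but the linear coefficient did not survive in this transcription; the value 0.135 used here is a hypothetical stand-in, chosen only because it reproduces the quoted optimum of about 73.1, and should not be read as the fitted estimate.

```python
# Net effect of room temperature: linear trend plus a quadratic
# penalty centered at 75 degrees.
quad_coef = 0.0356     # quoted in the answer
linear_coef = 0.135    # HYPOTHETICAL placeholder, not the fitted slope


def net_effect(temp: float) -> float:
    """Contribution of room temperature to predicted cost."""
    return linear_coef * temp + quad_coef * (temp - 75) ** 2


# Setting the derivative linear_coef + 2*quad_coef*(temp - 75) to zero
# gives the low-cost temperature.
optimum = 75 - linear_coef / (2 * quad_coef)

print(f"optimum temperature: {optimum:.1f}")

# Tabulating nearby values, as the spreadsheet approach would.
for temp in (72.0, 73.0, optimum, 74.0, 75.0):
    print(f"{temp:6.2f}  {net_effect(temp):.4f}")
```

The closed form shows why the optimum moves below 75: any positive linear trend in temperature pulls the minimum of the quadratic toward cooler settings by slope/(2 × curvature).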
3. No. Backward elimination does not have to reach the same model as found by the forward search. The lack of agreement is usually caused by collinearity. For this example, we will start from the saturated model (which does not include Plant since it is redundant with Manager) and remove variables one at a time. For consistency with the forward stepwise analysis, we remove a variable at each step if its p-value is larger than the threshold we used in the forward selection, p-to-remove = p-to-enter = 0.05/26 ≈ 0.0019. The following tables summarize the model we found with backward stepwise search.

R^2
s_e

Analysis of Variance
Source  DF  Sum of Squares  Mean Square  F Ratio
Model
Error  Prob > F
C. Total  <.0001*

Parameter Estimates
Intercept <.0001*
Labor Hours <.0001*
Breakdown/Unit *
Total Metal Cost <.0001*
Temp Deviation *

The model is the same as that found from the forward search except for the plant effect. We could not include it in the saturated model along with the manager effect!

4. This is a hard question to answer in regression modeling in general. What does it mean if a variable is not among the explanatory variables in a regression? Most importantly, recognize that it does not mean that the omitted variable is unrelated to the response. The number of machine hours is statistically significantly correlated with costs, but not once we adjust for other variables such as labor hours that are correlated with machine hours. If we add machine hours to the stepwise regression, we get the following summary of estimates.

Intercept <.0001*
Labor Hours <.0001*
Breakdown/Unit <.0001*
Total Metal Cost <.0001*
Temp Deviation <.0001*
Plant[NEW] *
Machine Hours

The estimated effect is not statistically significant. The wide confidence interval, however, reminds us that machine hours could have quite an impact even after adjusting for the other variables. The CI is approximately -19 ± 2(45.13), or about -109 to 71 dollars per hour.
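The interval for machine hours uses the usual "estimate plus or minus two standard errors" rule. The short Python sketch below illustrates that arithmetic; the standard error 45.13 and the endpoints -109 and 71 are quoted in the answer, while the point estimate is backed out here as the midpoint of the quoted interval, which is an assumption of this example rather than a value read from the regression output.

```python
# Quoted in the answer.
std_error = 45.13
lower_quoted, upper_quoted = -109, 71

# ASSUMPTION: the slope estimate is taken as the interval's midpoint.
estimate = (lower_quoted + upper_quoted) / 2

# Approximate 95% confidence interval: estimate +/- 2 SE.
lower = estimate - 2 * std_error
upper = estimate + 2 * std_error

# The t-ratio shows why the effect is not statistically significant.
t_ratio = estimate / std_error

print(f"estimate: {estimate:.1f} dollars per hour")
print(f"approximate 95% CI: {lower:.1f} to {upper:.1f}")
print(f"t-ratio: {t_ratio:.2f}")
```

A t-ratio this close to zero is consistent with the answer's point: the interval comfortably includes zero, yet it is wide enough that a sizable effect in either direction cannot be ruled out.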
Chapter 219 Two-Sample T-Test for Superiority by a Margin Introduction This procedure provides reports for making inference about the superiority of a treatment mean compared to a control mean from data
More informationMarket Volatility and Risk Proxies
Market Volatility and Risk Proxies... an introduction to the concepts 019 Gary R. Evans. This slide set by Gary R. Evans is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International
More informationData Analysis and Statistical Methods Statistics 651
Data Analysis and Statistical Methods Statistics 651 http://www.stat.tamu.edu/~suhasini/teaching.html Lecture 14 (MWF) The t-distribution Suhasini Subba Rao Review of previous lecture Often the precision
More informationDiploma Part 2. Quantitative Methods. Examiner s Suggested Answers
Diploma Part 2 Quantitative Methods Examiner s Suggested Answers Question 1 (a) The binomial distribution may be used in an experiment in which there are only two defined outcomes in any particular trial
More information4.3 Normal distribution
43 Normal distribution Prof Tesler Math 186 Winter 216 Prof Tesler 43 Normal distribution Math 186 / Winter 216 1 / 4 Normal distribution aka Bell curve and Gaussian distribution The normal distribution
More informationTwo-Sample T-Test for Non-Inferiority
Chapter 198 Two-Sample T-Test for Non-Inferiority Introduction This procedure provides reports for making inference about the non-inferiority of a treatment mean compared to a control mean from data taken
More informationCPSC 540: Machine Learning
CPSC 540: Machine Learning Monte Carlo Methods Mark Schmidt University of British Columbia Winter 2019 Last Time: Markov Chains We can use Markov chains for density estimation, d p(x) = p(x 1 ) p(x }{{}
More informationSTA2601. Tutorial letter 105/2/2018. Applied Statistics II. Semester 2. Department of Statistics STA2601/105/2/2018 TRIAL EXAMINATION PAPER
STA2601/105/2/2018 Tutorial letter 105/2/2018 Applied Statistics II STA2601 Semester 2 Department of Statistics TRIAL EXAMINATION PAPER Define tomorrow. university of south africa Dear Student Congratulations
More informationNumerical Descriptions of Data
Numerical Descriptions of Data Measures of Center Mean x = x i n Excel: = average ( ) Weighted mean x = (x i w i ) w i x = data values x i = i th data value w i = weight of the i th data value Median =
More information* The Unlimited Plan costs $100 per month for as many minutes as you care to use.
Problem: You walk into the new Herizon Wireless store, which just opened in the mall. They offer two different plans for voice (the data and text plans are separate): * The Unlimited Plan costs $100 per
More informationStatistical Intervals (One sample) (Chs )
7 Statistical Intervals (One sample) (Chs 8.1-8.3) Confidence Intervals The CLT tells us that as the sample size n increases, the sample mean X is close to normally distributed with expected value µ and
More informationThe Binomial Distribution
The Binomial Distribution January 31, 2018 Contents The Binomial Distribution The Normal Approximation to the Binomial The Binomial Hypothesis Test Computing Binomial Probabilities in R 30 Problems The
More informationBusiness Statistics 41000: Probability 4
Business Statistics 41000: Probability 4 Drew D. Creal University of Chicago, Booth School of Business February 14 and 15, 2014 1 Class information Drew D. Creal Email: dcreal@chicagobooth.edu Office:
More informationCH 5 Normal Probability Distributions Properties of the Normal Distribution
Properties of the Normal Distribution Example A friend that is always late. Let X represent the amount of minutes that pass from the moment you are suppose to meet your friend until the moment your friend
More informationJohn Hull, Risk Management and Financial Institutions, 4th Edition
P1.T2. Quantitative Analysis John Hull, Risk Management and Financial Institutions, 4th Edition Bionic Turtle FRM Video Tutorials By David Harper, CFA FRM 1 Chapter 10: Volatility (Learning objectives)