Stat 139 Homework 2 Solutions, Fall 2016
- Mavis Sparks
Problem 1. The sum of squares of a sample of data is minimized when the sample mean, $\bar{X} = \sum_i X_i / n$, is used as the basis of the calculation. Define $g(c)$ as a function of $c$:

$$g(c) = \sum_{i=1}^{n} (X_i - c)^2.$$

Show that this function is minimized at the value $c = \bar{X}$.

Solution: To minimize a function, we take the first derivative (w.r.t. $c$) and set it to zero. Then we take the second derivative and make sure it is positive at $\bar{x}$ (concave up):

$$g'(c) = -2\sum_{i=1}^{n} (x_i - c) = 0 \;\Rightarrow\; \sum_{i=1}^{n} c = \sum_{i=1}^{n} x_i \;\Rightarrow\; nc = \sum_{i=1}^{n} x_i \;\Rightarrow\; c = \sum_{i=1}^{n} x_i / n = \bar{x}$$

$$g''(c) = 2\sum_{i=1}^{n} 1 = 2n > 0$$

Problem 2. Let $X_1, X_2, \ldots, X_n$ be a sample of independent random variables drawn from a population with mean $\mu$ and variance $\sigma^2$. Let $\bar{X}$ be the sample average. Recall that $\sigma^2$ can be estimated by $S^2$, the usual sample variance, defined as:

$$S^2 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n-1} = \frac{1}{n-1}\left(\sum_{i=1}^{n} X_i^2 - n\bar{X}^2\right).$$

(a) Show that $E(X_i^2) = \sigma^2 + \mu^2$, using the fact that $\sigma^2 = E\left((X_i - \mu)^2\right)$.

Solution:

$$E(X_i^2) = E(X_i^2 - 2\mu^2 + 2\mu^2) = E(X_i^2 - 2\mu X_i + \mu^2) + E(\mu^2) = E\left((X_i - \mu)^2\right) + E(\mu^2) = \sigma^2 + \mu^2$$

Note: $E(\mu^2) = \mu E(\mu) = \mu E(X_i) = E(\mu X_i)$, so $-2\mu^2$ can be replaced by $-2\mu X_i$ inside the expectation.

(b) Show that $E(S^2) = \sigma^2$, i.e., $S^2$ is an unbiased estimator of the population variance.

Solution:

$$E(S^2) = E\left[\frac{1}{n-1}\left(\sum_{i=1}^{n} X_i^2 - n\bar{X}^2\right)\right] = \frac{1}{n-1}\left(\sum_{i=1}^{n} E(X_i^2) - nE(\bar{X}^2)\right) = \frac{1}{n-1}\left(n(\sigma^2 + \mu^2) - n(\sigma^2/n + \mu^2)\right) = \frac{(n-1)\sigma^2}{n-1} = \sigma^2$$

Note: $E(\bar{X}^2) = \sigma^2_{\bar{X}} + \mu^2_{\bar{X}} = \sigma^2/n + \mu^2$, by part (a) applied to $\bar{X}$, which has mean $\mu$ and (by independence of the $X_i$) variance $\sigma^2/n$.

Problem 3. Let $X_1, X_2, \ldots, X_{25}$ be i.i.d. Normal r.v.s. with mean $\mu = 1$ and variance $\sigma^2 = 3^2 = 9$. Let $S^2$ be the usual variance estimate, $S^2 = \sum (X_i - \bar{X})^2/(n-1)$, and let $\hat{\sigma}^2$ be the estimate using $\mu$ in the calculation instead: $\hat{\sigma}^2 = \sum (X_i - \mu)^2/n$. Write a simulation in R, using a for-loop based
on at least 10,000 iterations, to determine the following (be sure to include the relevant R code and output):

(a) That both estimators ($S^2$ and $\hat{\sigma}^2$) are unbiased.

Solution: Based on 10,000 iterations, the observed means of both estimators were within 0.01 units of the true variance of 9. We could formally test whether either mean is significantly different from 9 (based on n = 10,000 realizations), but that is overkill. Here is the relevant R code:

> nsims=10000
> mu=1
> sigma=3
> n=25
> sigma2.hat=s2=rep(NA,nsims)
>
> for(i in 1:nsims){
+ sample=rnorm(n,mean=mu,sd=sigma)
+ xbar=mean(sample)
+ sigma2.hat[i]=sum((sample-mu)^2)/n
+ s2[i]=var(sample)
+ }
> mean(sigma2.hat)
[1]
> mean(s2)
[1]

(b) Provide a separate histogram for each of the two sampling distributions. Which has lower spread?

Solution: Based on the R output below, $\hat{\sigma}^2$ has slightly smaller spread than $S^2$ (about 3% lower standard deviation).

> sd(sigma2.hat)
[1]
> sd(s2)
[1]

(c) Which estimator is closer to the true value more often?

Solution: Based on the R output below, $\hat{\sigma}^2$ is as close or closer than $S^2$ about 52.4% of the time. (The code reports the proportion of iterations in which $\hat{\sigma}^2$ is strictly farther from $\sigma^2$ than $S^2$ is; the figure quoted here is its complement.)

> mean(abs(sigma2.hat-sigma^2)>abs(s2-sigma^2))
[1]

(d) Are you sure of your answers above? What could you do to be more certain?

Solution: No, the answers above are not certain, since they are based on random simulations. We could be more certain if we based this study on more iterations, or if we performed a formal test to see whether the results above are statistically significant.
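For part (d), one way to gauge how far the simulated means could plausibly sit from 9 is a Monte Carlo standard error. The sketch below is not the original solution code: it reruns the simulation with `replicate` instead of the for-loop the problem asks for, and the seed and the helper name `one_draw` are illustrative assumptions. It also draws the two histograms requested in part (b).

```r
set.seed(139)                 # arbitrary seed, only for reproducibility
nsims <- 10000
mu <- 1; sigma <- 3; n <- 25

# One simulated data set -> both variance estimates (hypothetical helper)
one_draw <- function() {
  x <- rnorm(n, mean = mu, sd = sigma)
  c(sigma2.hat = sum((x - mu)^2) / n,   # uses the known mean, divides by n
    s2 = var(x))                        # uses the sample mean, divides by n - 1
}
est <- t(replicate(nsims, one_draw()))  # nsims x 2 matrix of estimates

# Monte Carlo mean of each estimator and its standard error, sd / sqrt(nsims)
mc_mean <- colMeans(est)
mc_se   <- apply(est, 2, sd) / sqrt(nsims)
print(rbind(mean = mc_mean, se = mc_se))

# Part (b): one histogram per estimator
par(mfrow = c(1, 2))
hist(est[, "sigma2.hat"], main = "Histogram of sigma2.hat", xlab = "sigma2.hat")
hist(est[, "s2"], main = "Histogram of s2", xlab = "s2")
```

A simulated mean within about two standard errors of 9 is consistent with unbiasedness; increasing `nsims` shrinks the standard error at the rate $1/\sqrt{\text{nsims}}$.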
[Figure: Histogram of sigma2.hat; Histogram of s2]

Problem 4. The National Football League (NFL) instituted a new rule in 2016 that changed how kickoffs are returned (a touchback is placed at the 25 instead of the 20 yard line, in hopes of increasing touchbacks and reducing injuries). This problem will investigate what effect this may have on that type of play in the game based on just 1 week, n = 16 games, of data.

(a) In the entire year of 2015 (256 games), 1470 out of 2627 kickoffs were touchbacks. So far in 2016, 103 out of 165 kickoffs have been touchbacks. Perform a formal hypothesis test to determine whether the rate of touchbacks has changed from 2015 to 2016. What does this mean for whether the rule change had an effect on kickoffs?

Solution: If we treat these as two samples from two separate populations (or superpopulations), then we can perform a 2-sample z-test for proportions. Note, each observation (each individual kickoff) is a Bernoulli r.v. with parameter $p$, and thus has mean $\mu = p$ and variance $\sigma^2 = p(1-p)$, and the CLT applies to the 2 sample proportions, $\hat{p}$, since these are averages of individual observations:

$$H_0: p_{2016} = p_{2015} \quad \text{vs.} \quad H_A: p_{2016} \neq p_{2015}$$

$$Z = \frac{\hat{p}_{2016} - \hat{p}_{2015}}{\sqrt{\hat{p}_{pooled}(1 - \hat{p}_{pooled})\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}} = \frac{0.6242 - 0.5596}{\sqrt{0.5634(1 - 0.5634)\left(\frac{1}{165} + \frac{1}{2627}\right)}} = 1.624$$

$$p\text{-value} = 2\left(1 - P(Z < 1.624)\right) \approx 0.104$$

Note: $\hat{p}_{pooled} = (103 + 1470)/(165 + 2627)$ is the combined rate of touchbacks. Also, this test can instead be performed as a one-sample test where only $\hat{p}_{2016}$ is used as an estimate and $p_{2015} = 1470/2627 = 0.5596$ is the true parameter value; this difference in approach is addressed in the next part.

(b) The proportion from 2015 could be treated as a population parameter or as an estimate of some super-population (the theoretical construct that there is some mechanism producing these data). Practically speaking, give an argument why it does not matter which way you treated it in the previous part.

Solution: This does not matter since the sample size is so large (n = 2627).
When an estimate is based on so much data, the estimator's variance is so small that it has almost no bearing on the result and can be treated like a constant.
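The argument in part (b) can be checked numerically. The following is a minimal sketch that uses only the counts given in the problem; the variable names are illustrative assumptions. It computes the pooled two-sample z statistic from part (a) alongside the one-sample version that treats the 2015 proportion as a known constant.

```r
# Counts from the problem statement
x16 <- 103;  n16 <- 165     # 2016 touchbacks / kickoffs (week 1)
x15 <- 1470; n15 <- 2627    # 2015 touchbacks / kickoffs (full season)

p16 <- x16 / n16
p15 <- x15 / n15
p_pool <- (x16 + x15) / (n16 + n15)

# Two-sample z-test: both proportions treated as estimates
se2   <- sqrt(p_pool * (1 - p_pool) * (1 / n16 + 1 / n15))
z2    <- (p16 - p15) / se2
pval2 <- 2 * pnorm(-abs(z2))

# One-sample z-test: the 2015 proportion treated as the true parameter
se1   <- sqrt(p15 * (1 - p15) / n16)
z1    <- (p16 - p15) / se1
pval1 <- 2 * pnorm(-abs(z1))

print(round(c(z_two_sample = z2, p_two_sample = pval2,
              z_one_sample = z1, p_one_sample = pval1), 4))
```

The two-sample statistic reproduces the 1.624 reported in part (a). The one-sample statistic is slightly larger because its standard error is slightly smaller, but the 2015 sample contributes so little variance that the two approaches lead to the same conclusion.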
(c) Calculate a 95% confidence interval for the true proportion of kickoffs that end in touchbacks in 2016.

Solution:

$$\hat{p}_{2016} \pm z^* \sqrt{\frac{\hat{p}_{2016}(1 - \hat{p}_{2016})}{n}} = 0.6242 \pm 1.96\sqrt{\frac{0.6242(0.3758)}{165}} = (0.550, 0.698)$$

(d) Do the confidence interval and hypothesis test agree? How do you know?

Solution: Yes, they agree. Treating the proportion from 2015 as a population parameter (p = 0.5596), our confidence interval covers that value, and thus it should not be rejected as the true underlying proportion of kickoffs that are touchbacks in 2016.

(e) There are 32 teams in the NFL and each team essentially uses the same players on kickoffs throughout the year (the kicker is the most important player on kickoffs and he very rarely changes). How does this affect the assumptions of your inferences in parts (a) and (c)?

Solution: The observations are not independent, either within a season (due to a clustering effect by team/kicker) or between seasons (since the data are in some way paired from one season to the next).

(f) The entire 2015 season may not be the best comparison group for this study. Provide a different comparison group and/or analysis approach that may be more appropriate.

Solution: A better approach would be to compare week 1 of 2016 to week 1 of 2015. There may be changes as the season goes along (especially weather). Also, one could perform a paired test looking at the difference within teams from 2015 to 2016.

Problem 5. Use R to perform the following simulation based on 100,000 iterations to mimic the previous problem.

(a) Assume 2015 is a discrete population of kickoffs with exactly 1470 touchbacks and 1157 non-touchbacks. Sample, without replacement, 165 kickoffs and measure the proportion of kickoffs that are touchbacks within this sample. Repeat this sampling 100,000 times. Provide a histogram of the sampled proportions. What proportion of sample proportions is greater than what was actually observed in 2016?
Solution: The histogram is provided below along with the relevant R output (see R code file for simulation code). The proportion of sample proportions greater than what was observed is estimated to be about 3.5%.

(b) Now assume 2015 is an infinite population of kickoffs where p = 1470/2627 of the kickoffs are touchbacks. (Note: this is equivalent to sampling with replacement from the discrete population.) Sample 165 kickoffs from the theoretically infinite population (or sample with replacement from the finite population) and measure the proportion of kickoffs that are touchbacks within this sample. Repeat this sampling 100,000 times. Provide a histogram of the sampled proportions. What proportion of sample proportions is greater than what was actually observed in 2016?

Solution: The proportion of sample proportions greater than what was observed is estimated to be about 3.9% here, slightly larger than in part (a). Since this is higher than the respective calculation for part (a), it hints that the sampling distribution has fatter tails (higher variance), which will be discussed in the next part.
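The simulation code itself lives in a separate R file that is not reproduced here. The following is one possible sketch of parts (a) and (b), not the original code: it draws touchback counts directly from the hypergeometric and binomial distributions rather than sampling a 0/1 vector, the seed is arbitrary, and the object names are chosen to match the R output shown in this problem.

```r
set.seed(139)                       # arbitrary seed, only for reproducibility
nsims <- 100000
n_tb  <- 1470                       # touchbacks in the 2015 population
n_non <- 1157                       # non-touchbacks in the 2015 population
n     <- 165                        # kickoffs per simulated sample
phat_1 <- 103 / 165                 # proportion observed in 2016

# (a) Without replacement: the touchback count is hypergeometric
sample_props1 <- rhyper(nsims, m = n_tb, n = n_non, k = n) / n

# (b) With replacement (infinite population): the count is binomial
sample_props2 <- rbinom(nsims, size = n, prob = n_tb / (n_tb + n_non)) / n

par(mfrow = c(1, 2))
hist(sample_props1, main = "Histogram of sample_props1")
hist(sample_props2, main = "Histogram of sample_props2")

# Proportion of simulated proportions strictly greater than the observed one
print(mean(sample_props1 > phat_1))   # part a
print(mean(sample_props2 > phat_1))   # part b
```

Drawing counts with `rhyper` and `rbinom` is distributionally equivalent to calling `sample()` on a vector of 1470 ones and 1157 zeros with `replace = FALSE` or `replace = TRUE`, and avoids an explicit loop.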
> mean(sample_props1>phat_1) #part a
[1]
> mean(sample_props2>phat_1) #part b
[1]

[Figure: Histogram of sample_props1; Histogram of sample_props2]

(c) How do the histograms of the two sampling procedures compare? How would the histogram change if part (a) was based on a discrete population with 1/4 as many observations? Feel free to use empirical evidence and statistics/measures to support your claim. Note: this is a different issue than seen in 4(b).

Solution: They are very similar (both approximately normal). The histogram for (b), when sampling was done with replacement, shows slightly more variability than the one for (a), when sampling was done without replacement. This difference in variance will be exacerbated if the population size is even smaller (closer to the sample size).

> mean(sample_props1)
[1]
> var(sample_props1)
[1]
> mean(sample_props2)
[1]
> var(sample_props2)
[1]

(d) Turn your results in parts (a) and (b) into 2-sided p-values. How do these compare to the hypothesis test from the previous problem?

Solution: Technically, we should change the inequality from parts (a) and (b), as it should be "greater than or equal to" and not just "greater than". How to turn this into a 2-sided p-value then depends on whether the reference distribution (here, the histogram) is symmetric or not. If it is symmetric, we can just take the 1-tail probability and multiply by 2. If it is not symmetric, then we would have to be much more careful in how to calculate this (and determine extremity by distance from the null hypothesis mean). Luckily, our histogram is roughly normal, so we are OK to take the 1-tail probability and multiply by 2. Ignoring the equality, we can just double our results from parts (a) and (b) and get p-values of roughly 0.07 and 0.08 (twice the tail proportions of about 3.5% and 3.9%). Taking into account the equality, we see the p-values are more similar to the calculations done by hand:
> 2*mean(sample_props1>=phat_1)
[1]
> 2*mean(sample_props2>=phat_1)
[1]

Note: due to the discreteness of this r.v., it makes a difference if we include the equality.
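The claim in part (c), that the gap between the two sampling schemes widens as the population shrinks toward the sample size, can also be seen from the theoretical variances. The sketch below is an illustration under stated assumptions: it uses the standard finite population correction $(N - n)/(N - 1)$ for sampling without replacement, the function name `var_prop` is hypothetical, and the quarter-size population is assumed to keep the same touchback proportion.

```r
# Variance of a sample proportion based on n draws from a population of size N.
# N = Inf corresponds to sampling with replacement (no correction).
var_prop <- function(p, n, N = Inf) {
  fpc <- if (is.finite(N)) (N - n) / (N - 1) else 1   # finite population correction
  p * (1 - p) / n * fpc
}

p <- 1470 / 2627
n <- 165

v_inf     <- var_prop(p, n)                        # part (b): infinite population
v_full    <- var_prop(p, n, N = 2627)              # part (a): the 2015 population
v_quarter <- var_prop(p, n, N = round(2627 / 4))   # hypothetical population 1/4 the size

print(c(infinite = v_inf, full = v_full, quarter = v_quarter))
print(c(full = v_full, quarter = v_quarter) / v_inf)  # ratios to the infinite case
```

These theoretical values can be compared directly with `var(sample_props1)` and `var(sample_props2)` from the simulation output.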
Chapter 7 presents the beginning of inferential statistics. Concept: Inferential Statistics The two major activities of inferential statistics are 1 to use sample data to estimate values of population
More informationHonor Code: By signing my name below, I pledge my honor that I have not violated the Booth Honor Code during this examination.
Name: OUTLINE SOLUTIONS University of Chicago Graduate School of Business Business 41000: Business Statistics Special Notes: 1. This is a closed-book exam. You may use an 8 11 piece of paper for the formulas.
More informationChapter 9: Sampling Distributions
Chapter 9: Sampling Distributions 9. Introduction This chapter connects the material in Chapters 4 through 8 (numerical descriptive statistics, sampling, and probability distributions, in particular) with
More informationBusiness Statistics 41000: Probability 3
Business Statistics 41000: Probability 3 Drew D. Creal University of Chicago, Booth School of Business February 7 and 8, 2014 1 Class information Drew D. Creal Email: dcreal@chicagobooth.edu Office: 404
More informationSection 2: Estimation, Confidence Intervals and Testing Hypothesis
Section 2: Estimation, Confidence Intervals and Testing Hypothesis Carlos M. Carvalho The University of Texas at Austin McCombs School of Business http://faculty.mccombs.utexas.edu/carlos.carvalho/teaching/
More informationPart V - Chance Variability
Part V - Chance Variability Dr. Joseph Brennan Math 148, BU Dr. Joseph Brennan (Math 148, BU) Part V - Chance Variability 1 / 78 Law of Averages In Chapter 13 we discussed the Kerrich coin-tossing experiment.
More informationBusiness Statistics 41000: Probability 4
Business Statistics 41000: Probability 4 Drew D. Creal University of Chicago, Booth School of Business February 14 and 15, 2014 1 Class information Drew D. Creal Email: dcreal@chicagobooth.edu Office:
More informationχ 2 distributions and confidence intervals for population variance
χ 2 distributions and confidence intervals for population variance Let Z be a standard Normal random variable, i.e., Z N(0, 1). Define Y = Z 2. Y is a non-negative random variable. Its distribution is
More informationChapter 7. Sampling Distributions and the Central Limit Theorem
Chapter 7. Sampling Distributions and the Central Limit Theorem 1 Introduction 2 Sampling Distributions related to the normal distribution 3 The central limit theorem 4 The normal approximation to binomial
More informationData Analysis and Statistical Methods Statistics 651
Data Analysis and Statistical Methods Statistics 651 http://www.stat.tamu.edu/~suhasini/teaching.html Lecture 10 (MWF) Checking for normality of the data using the QQplot Suhasini Subba Rao Review of previous
More informationThe Two-Sample Independent Sample t Test
Department of Psychology and Human Development Vanderbilt University 1 Introduction 2 3 The General Formula The Equal-n Formula 4 5 6 Independence Normality Homogeneity of Variances 7 Non-Normality Unequal
More informationMeasure of Variation
Measure of Variation Variation is the spread of a data set. The simplest measure is the range. Range the difference between the maximum and minimum data entries in the set. To find the range, the data
More informationMean of a Discrete Random variable. Suppose that X is a discrete random variable whose distribution is : :
Dr. Kim s Note (December 17 th ) The values taken on by the random variable X are random, but the values follow the pattern given in the random variable table. What is a typical value of a random variable
More informationClass 13. Daniel B. Rowe, Ph.D. Department of Mathematics, Statistics, and Computer Science. Marquette University MATH 1700
Class 13 Daniel B. Rowe, Ph.D. Department of Mathematics, Statistics, and Computer Science Copyright 017 by D.B. Rowe 1 Agenda: Recap Chapter 6.3 6.5 Lecture Chapter 7.1 7. Review Chapter 5 for Eam 3.
More informationModule 4: Point Estimation Statistics (OA3102)
Module 4: Point Estimation Statistics (OA3102) Professor Ron Fricker Naval Postgraduate School Monterey, California Reading assignment: WM&S chapter 8.1-8.4 Revision: 1-12 1 Goals for this Module Define
More informationMATH 143: Introduction to Probability and Statistics Worksheet 9 for Thurs., Dec. 10: What procedure?
MATH 143: Introduction to Probability and Statistics Worksheet 9 for Thurs., Dec. 10: What procedure? For each numbered problem, identify (if possible) the following: (a) the variable(s) and variable type(s)
More informationSession 178 TS, Stats for Health Actuaries. Moderator: Ian G. Duncan, FSA, FCA, FCIA, FIA, MAAA. Presenter: Joan C. Barrett, FSA, MAAA
Session 178 TS, Stats for Health Actuaries Moderator: Ian G. Duncan, FSA, FCA, FCIA, FIA, MAAA Presenter: Joan C. Barrett, FSA, MAAA Session 178 Statistics for Health Actuaries October 14, 2015 Presented
More information