Stat 139 Homework 2 Solutions, Fall 2016

Size: px
Start display at page:

Download "Stat 139 Homework 2 Solutions, Fall 2016"

Transcription

1 Stat 139 Homework 2 Solutions, Fall 2016 Problem 1. The sum of squares of a sample of data is minimized when the sample mean, X = Xi /n, is used as the basis of the calculation. Define g(c) as a function w.r.t. c as: g(c) = (X i c) 2. Show that this function is minimized at the value c = X. Solution: In order to minimize a function, we have to take the first derivative (w.r.t. c) and set to zero. Then we can take the second derivative and make sure it is positive at x (concave up): g (c) = 2 (x i c) 0 = c = x i = n c = x i = c = x i /n = x g (c) = 2 1 = 2n > 0 Problem 2. Let X 1, X 2,..., X n be a sample of independent random variables drawn from a population with mean µ and variance σ 2. Let X be the sample average. Recall that σ 2 can be estimated by S 2, the usual sample variance, defined as: n S 2 = (X i X) ( 2 ) = 1 Xi 2 n n 1 n 1 X 2. (a) Show that E(X 2 i ) = σ2 + µ 2, using the fact that σ 2 = E ( (X i µ) 2). Solution: E(X 2 i ) = E(X 2 i 2µ 2 + 2µ 2 ) = E(X 2 i 2µX i + µ 2 ) + E(µ 2 ) = E ( (X i µ) 2) + E(µ 2 ) = σ 2 + µ 2 Note: E(µ 2 ) = µe(µ) = µe(x i ) = E(µX i ). (b) Show that E(S 2 ) = σ 2, i.e., S 2 is an unbiased estimator of the population variance. Solution: [ ( )] ( ) E(S 2 1 ) = E Xi 2 n n 1 X 2 = 1 E(Xi 2 ) ne( n 1 X 2 ) = 1 ( n(σ 2 + µ 2 ) n(σ 2 /n + µ 2 ) ) n 1 Note: E( X 2 ) = σ 2 X + µ 2 X = σ 2 /n + µ 2 based on the Law of Large Numbers. Problem 3. Let X 1, X 2,..., X 25 be i.i.d. Normal r.v.s. with mean µ = 1 and variance σ 2 = 3 2 = 9. Let S 2 be the usual variance estimate: S 2 = (X i X) 2 /(n 1), and let ˆσ 2 be the estimate using µ in the calculation instead: ˆσ 2 = (X i µ) 2 /n. Write a simulation in R, using a for-loop based 1

2 on at least 10,000 iterations, to determine the following (be sure to include the relevant R code and output): (a) That both estimators (S 2 and ˆσ 2 ) are unbiased. Solution: Based on 10,000 iterations, the observed means of both estimators were within 0.01 units of the true variance of 9. We could formally test if the is significantly different from 9 (based on n = 10, 000 realizations), but that is overkill. Here is the relevant R code: > nsims=10000 > mu=1 > sigma=3 > n=25 > sigma2.hat=s2=rep(na,nsims) > > for(i in 1:nsims){ + sample=rnorm(n,mean=mu,sd=sigma) + xbar=mean(sample) + sigma2.hat[i]=sum((sample-mu)^2)/n + s2[i]=var(sample) + } > mean(sigma2.hat) [1] > mean(s2) [1] (b) Provide a separate histogram for each of the two sampling distributions. Which has lower spread? Solution: Based on the R output below, ˆσ 2 has slightly smaller spread than S 2 (about 3% lower standard deviation). > sd(sigma2.hat) [1] > sd(s2) [1] (c) Which estimator is closer to the true value more often. Solution: Based on the R output below, ˆσ 2 is as close or closer than S 2 about 52.4% of the time. > mean(abs(sigma2.hat-sigma^2)>abs(s2-sigma^2)) [1] (d) Are you sure of your answers above? What could you do to be more certain? Solution: No, the answers are not certainly true above since these are based on random simulations. We could be more certain if we based this study on more iterations, or if we performed a formal test to see if the results above were statistically significant. 2

3 Histogram of sigma2.hat Histogram of s sigma2.hat s2 Problem 4. The National Football League (NFL) instituted a new rule in 2016 that changed how kickoffs are returned (a touchback is placed at the 25 instead of the 20 yard line in hopes to increase touchbacks to reduce injuries). This problem will investigate what effect this may have on that type of play in the game based on just 1 week, n = 16 games, of data. (a) In the entire year of 2015 (256 games), 1470 out of 2627 kickoffs were touchbacks. So far in 2016, 103 out of 165 kickoffs have been touchbacks. Perform a formal hypothesis test to determine whether the rate of touchbacks has changed from 2015 to What does this mean for whether the rule change had an effect on kickoffs? Solution: If we treat these as two samples from two separate populations (or superpopulations), then we can perform a 2-sample z-test for proportions. Note, each observation (each individual kickoff) is a Bernoulli r.v. with parameter p, and thus has mean µ = p and variance σ = p(1 p), and the CLT applies to the 2 sample proportions, ˆp, since these are averages individual observations: H 0 : p 2016 = p 2015 vs. H A : p 2016 p 2015 Z = ˆp 2016 ˆp 2015 ( ) = ˆp pooled (1 ˆp pooled ) n1 n2 p value = 2(1 P (Z > 1.624)) = Note: ˆp pooled = ( )/( ) is the combined rate of touchbacks. Also, this test can instead be performed as a one sample test where only ˆp 2015 is used as an estimate and p 2016 = 1470/2627 = is the true parameter value, and this difference in approach is addressed in the next problem. (b) The proportion from 2015 could be treated as a population parameter or as an estimate of some super-population (the theoretical construct that there is some mechanism producing these data). Practically speaking, give an argument why it does not matter which way you treated it in the previous part. Solution: This does not matter since the sample size is so large (n = 2627). When an estimate is based on so much data, the estimator s variance is so small that it has almost no bearing on the result and can be treated like a constant. 3

4 (c) Calculate a 95% confidence interval for the true proportion of kickoffs that end in touchbacks in Solution: ˆp 2016 ± z ˆp 2016 (1 ˆp 2016 ) (0.3758) = ± 1.96 = (0.550, 0.698) n (d) Do the confidence interval and hypothesis test agree? How do you know? Solution: Yes they agree. Treating the proportion from 2015 as a population parameter (p =.05596), our confidence interval covers that value, and thus it should not be rejected as the true underlying proportion of kickoffs that are touchbacks in (e) There are 32 teams in the NFL and each team essentially uses the same players on kickoffs throughout the year (the kicker is the most important player on kickoffs and he very rarely changes). How does this affect the assumptions of your inferences in part (a) and (c)? Solution: The observations are not independent, either within a season (due to a clustering effect by team/kicker) or between seasons (since the data in some way are paired from one season to the next). (f) The entire 2015 season may not be the best comparison group for this study. Provide a different comparison group and/or analysis approach that may be more appropriate. Solution: A better approach would be to compare week 1 of 2016 to week 1 of There may be changes as the season goes along (especially weather). Also, one could perform a paired test looking at the difference within teams from 2015 to Problem 5. Use R to perform the following simulation based on 100,000 iterations to mimic the previous problem. (a) Assume 2015 is a discrete population of kickoffs with exactly 1470 touchbacks and 1157 nontouchbacks. Sample, without replacement, 165 kickoffs and measure the proportion of kickoffs that are touchbacks within this sample. Repeat this sampling 100,000 times. Provide a histogram of the sampled proportions. What proportion of sample proportions is greater than what was actually observed in 2016? Solution: The histogram is provided below along with the relevant R output (see R code file for simulation code). The proportion of sample proportions greater than what was observed is estimated to be about 3.5%. (b) Now assume 2015 is an infinite population of kickoffs where p = 1470/2627 of the kickoffs are touchbacks. (Note: this is equivalent to sampling with replacement from the discrete population). Sample from the theoretically infinite population (or sample with replacement from the finite population) 165 kickoffs and measure the proportion of kickoffs that are touchbacks within this sample. Repeat this sampling 100,000 times. Provide a histogram of the sampled proportions. What proportion of sample proportions is greater than what was actually observed in 2016? Solution: The proportion of sample proportions greater than what was observed is estimated to be about 3.9% here, slightly larger than in part (a). Since this is higher than the respectively calculation for part (a), this hints that the sampling distribution has fatter tails (higher variance), which will be discussed in the next part. 4

5 > mean(sample_props1>phat_1) #part a [1] > mean(sample_props2>phat_1) #part b [1] Histogram of sample_props1 Histogram of sample_props sample_props sample_props2 (c) How do the histograms of the two sampling procedures compare? How would the histogram change if part (a) was based on a discrete population with 1/4 as many observations? Feel free to use empirical evidence and statistics/measures to support your claim. Note: this is a different issue than seen in 4(b). Solution: They are very similar (both approximately normal). The variance of the histogram for (b), when sampling was done with replacement, has slightly more variability than when performed without replacement. This difference in the variance will be exacerbated if the population size is even smaller (closer to the sample size). > mean(sample_props1) [1] > var(sample_props1) [1] > mean(sample_props2) [1] > var(sample_props2) [1] (d) Turn your results in parts (a) and (b) into 2-sided p-values. How do these compare to the hypothesis test from the previous problem? Solution: Technically, we should change the inequality from parts (a) and (b) as they should be greater than or equal to and not just greater than. How to turn this into a 2-sided p-value then would depend on if the reference distribution (here, the histogram) is symmetric or not. If it s symmetric, we can just take the 1-tail probability and multiply by 2. If it is not symmetric, then we would have to be much more careful in how to calculate this (and determine extremity by distance from the null hypothesis mean). Luckily, our histogram is roughly normal, so we are OK to take the 1-tail probability and multiply by 2. Ignoring the equality we can just take our results from parts (a) and (b) and get p-values of Taking into account the equality we see the p-values are more similar to the calculations done by hand: 5

6 > 2*mean(sample_props1>=phat_1) [1] > 2*mean(sample_props2>=phat_1) [1] Note: due to the discreteness of this r.v., it makes a difference if we include the equality. 6

Unit 5: Sampling Distributions of Statistics

Unit 5: Sampling Distributions of Statistics Unit 5: Sampling Distributions of Statistics Statistics 571: Statistical Methods Ramón V. León 6/12/2004 Unit 5 - Stat 571 - Ramon V. Leon 1 Definitions and Key Concepts A sample statistic used to estimate

More information

Unit 5: Sampling Distributions of Statistics

Unit 5: Sampling Distributions of Statistics Unit 5: Sampling Distributions of Statistics Statistics 571: Statistical Methods Ramón V. León 6/12/2004 Unit 5 - Stat 571 - Ramon V. Leon 1 Definitions and Key Concepts A sample statistic used to estimate

More information

Chapter 7: SAMPLING DISTRIBUTIONS & POINT ESTIMATION OF PARAMETERS

Chapter 7: SAMPLING DISTRIBUTIONS & POINT ESTIMATION OF PARAMETERS Chapter 7: SAMPLING DISTRIBUTIONS & POINT ESTIMATION OF PARAMETERS Part 1: Introduction Sampling Distributions & the Central Limit Theorem Point Estimation & Estimators Sections 7-1 to 7-2 Sample data

More information

Sampling and sampling distribution

Sampling and sampling distribution Sampling and sampling distribution September 12, 2017 STAT 101 Class 5 Slide 1 Outline of Topics 1 Sampling 2 Sampling distribution of a mean 3 Sampling distribution of a proportion STAT 101 Class 5 Slide

More information

Chapter 8. Introduction to Statistical Inference

Chapter 8. Introduction to Statistical Inference Chapter 8. Introduction to Statistical Inference Point Estimation Statistical inference is to draw some type of conclusion about one or more parameters(population characteristics). Now you know that a

More information

MATH 3200 Exam 3 Dr. Syring

MATH 3200 Exam 3 Dr. Syring . Suppose n eligible voters are polled (randomly sampled) from a population of size N. The poll asks voters whether they support or do not support increasing local taxes to fund public parks. Let M be

More information

Confidence Intervals. σ unknown, small samples The t-statistic /22

Confidence Intervals. σ unknown, small samples The t-statistic /22 Confidence Intervals σ unknown, small samples The t-statistic 1 /22 Homework Read Sec 7-3. Discussion Question pg 365 Do Ex 7-3 1-4, 6, 9, 12, 14, 15, 17 2/22 Objective find the confidence interval for

More information

Chapter 7: Point Estimation and Sampling Distributions

Chapter 7: Point Estimation and Sampling Distributions Chapter 7: Point Estimation and Sampling Distributions Seungchul Baek Department of Statistics, University of South Carolina STAT 509: Statistics for Engineers 1 / 20 Motivation In chapter 3, we learned

More information

Chapter 7 Sampling Distributions and Point Estimation of Parameters

Chapter 7 Sampling Distributions and Point Estimation of Parameters Chapter 7 Sampling Distributions and Point Estimation of Parameters Part 1: Sampling Distributions, the Central Limit Theorem, Point Estimation & Estimators Sections 7-1 to 7-2 1 / 25 Statistical Inferences

More information

Chapter 7: Random Variables

Chapter 7: Random Variables Chapter 7: Random Variables 7.1 Discrete and Continuous Random Variables 7.2 Means and Variances of Random Variables 1 Introduction A random variable is a function that associates a unique numerical value

More information

The Bernoulli distribution

The Bernoulli distribution This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike License. Your use of this material constitutes acceptance of that license and the conditions of use of materials on this

More information

STA215 Confidence Intervals for Proportions

STA215 Confidence Intervals for Proportions STA215 Confidence Intervals for Proportions Al Nosedal. University of Toronto. Summer 2017 June 14, 2017 Pepsi problem A market research consultant hired by the Pepsi-Cola Co. is interested in determining

More information

Chapter 11: Inference for Distributions Inference for Means of a Population 11.2 Comparing Two Means

Chapter 11: Inference for Distributions Inference for Means of a Population 11.2 Comparing Two Means Chapter 11: Inference for Distributions 11.1 Inference for Means of a Population 11.2 Comparing Two Means 1 Population Standard Deviation In the previous chapter, we computed confidence intervals and performed

More information

Section 0: Introduction and Review of Basic Concepts

Section 0: Introduction and Review of Basic Concepts Section 0: Introduction and Review of Basic Concepts Carlos M. Carvalho The University of Texas McCombs School of Business mccombs.utexas.edu/faculty/carlos.carvalho/teaching 1 Getting Started Syllabus

More information

5.3 Statistics and Their Distributions

5.3 Statistics and Their Distributions Chapter 5 Joint Probability Distributions and Random Samples Instructor: Lingsong Zhang 1 Statistics and Their Distributions 5.3 Statistics and Their Distributions Statistics and Their Distributions Consider

More information

Key Objectives. Module 2: The Logic of Statistical Inference. Z-scores. SGSB Workshop: Using Statistical Data to Make Decisions

Key Objectives. Module 2: The Logic of Statistical Inference. Z-scores. SGSB Workshop: Using Statistical Data to Make Decisions SGSB Workshop: Using Statistical Data to Make Decisions Module 2: The Logic of Statistical Inference Dr. Tom Ilvento January 2006 Dr. Mugdim Pašić Key Objectives Understand the logic of statistical inference

More information

Stat 213: Intro to Statistics 9 Central Limit Theorem

Stat 213: Intro to Statistics 9 Central Limit Theorem 1 Stat 213: Intro to Statistics 9 Central Limit Theorem H. Kim Fall 2007 2 unknown parameters Example: A pollster is sure that the responses to his agree/disagree questions will follow a binomial distribution,

More information

STA Module 3B Discrete Random Variables

STA Module 3B Discrete Random Variables STA 2023 Module 3B Discrete Random Variables Learning Objectives Upon completing this module, you should be able to 1. Determine the probability distribution of a discrete random variable. 2. Construct

More information

4.2 Probability Distributions

4.2 Probability Distributions 4.2 Probability Distributions Definition. A random variable is a variable whose value is a numerical outcome of a random phenomenon. The probability distribution of a random variable tells us what the

More information

Review: Population, sample, and sampling distributions

Review: Population, sample, and sampling distributions Review: Population, sample, and sampling distributions A population with mean µ and standard deviation σ For instance, µ = 0, σ = 1 0 1 Sample 1, N=30 Sample 2, N=30 Sample 100000000000 InterquartileRange

More information

Statistical analysis and bootstrapping

Statistical analysis and bootstrapping Statistical analysis and bootstrapping p. 1/15 Statistical analysis and bootstrapping Michel Bierlaire michel.bierlaire@epfl.ch Transport and Mobility Laboratory Statistical analysis and bootstrapping

More information

BIO5312 Biostatistics Lecture 5: Estimations

BIO5312 Biostatistics Lecture 5: Estimations BIO5312 Biostatistics Lecture 5: Estimations Yujin Chung September 27th, 2016 Fall 2016 Yujin Chung Lec5: Estimations Fall 2016 1/34 Recap Yujin Chung Lec5: Estimations Fall 2016 2/34 Today s lecture and

More information

Point Estimation. Stat 4570/5570 Material from Devore s book (Ed 8), and Cengage

Point Estimation. Stat 4570/5570 Material from Devore s book (Ed 8), and Cengage 6 Point Estimation Stat 4570/5570 Material from Devore s book (Ed 8), and Cengage Point Estimation Statistical inference: directed toward conclusions about one or more parameters. We will use the generic

More information

Hypothesis Tests: One Sample Mean Cal State Northridge Ψ320 Andrew Ainsworth PhD

Hypothesis Tests: One Sample Mean Cal State Northridge Ψ320 Andrew Ainsworth PhD Hypothesis Tests: One Sample Mean Cal State Northridge Ψ320 Andrew Ainsworth PhD MAJOR POINTS Sampling distribution of the mean revisited Testing hypotheses: sigma known An example Testing hypotheses:

More information

Chapter 8 Statistical Intervals for a Single Sample

Chapter 8 Statistical Intervals for a Single Sample Chapter 8 Statistical Intervals for a Single Sample Part 1: Confidence intervals (CI) for population mean µ Section 8-1: CI for µ when σ 2 known & drawing from normal distribution Section 8-1.2: Sample

More information

MLLunsford 1. Activity: Central Limit Theorem Theory and Computations

MLLunsford 1. Activity: Central Limit Theorem Theory and Computations MLLunsford 1 Activity: Central Limit Theorem Theory and Computations Concepts: The Central Limit Theorem; computations using the Central Limit Theorem. Prerequisites: The student should be familiar with

More information

CPSC 540: Machine Learning

CPSC 540: Machine Learning CPSC 540: Machine Learning Monte Carlo Methods Mark Schmidt University of British Columbia Winter 2019 Last Time: Markov Chains We can use Markov chains for density estimation, d p(x) = p(x 1 ) p(x }{{}

More information

Statistical Methods in Practice STAT/MATH 3379

Statistical Methods in Practice STAT/MATH 3379 Statistical Methods in Practice STAT/MATH 3379 Dr. A. B. W. Manage Associate Professor of Mathematics & Statistics Department of Mathematics & Statistics Sam Houston State University Overview 6.1 Discrete

More information

Much of what appears here comes from ideas presented in the book:

Much of what appears here comes from ideas presented in the book: Chapter 11 Robust statistical methods Much of what appears here comes from ideas presented in the book: Huber, Peter J. (1981), Robust statistics, John Wiley & Sons (New York; Chichester). There are many

More information

Chapter 5. Sampling Distributions

Chapter 5. Sampling Distributions Lecture notes, Lang Wu, UBC 1 Chapter 5. Sampling Distributions 5.1. Introduction In statistical inference, we attempt to estimate an unknown population characteristic, such as the population mean, µ,

More information

Lecture 9 - Sampling Distributions and the CLT

Lecture 9 - Sampling Distributions and the CLT Lecture 9 - Sampling Distributions and the CLT Sta102/BME102 Colin Rundel September 23, 2015 1 Variability of Estimates Activity Sampling distributions - via simulation Sampling distributions - via CLT

More information

Point Estimation. Some General Concepts of Point Estimation. Example. Estimator quality

Point Estimation. Some General Concepts of Point Estimation. Example. Estimator quality Point Estimation Some General Concepts of Point Estimation Statistical inference = conclusions about parameters Parameters == population characteristics A point estimate of a parameter is a value (based

More information

STAT Chapter 7: Central Limit Theorem

STAT Chapter 7: Central Limit Theorem STAT 251 - Chapter 7: Central Limit Theorem In this chapter we will introduce the most important theorem in statistics; the central limit theorem. What have we seen so far? First, we saw that for an i.i.d

More information

Have you ever wondered whether it would be worth it to buy a lottery ticket every week, or pondered on questions such as If I were offered a choice

Have you ever wondered whether it would be worth it to buy a lottery ticket every week, or pondered on questions such as If I were offered a choice Section 8.5: Expected Value and Variance Have you ever wondered whether it would be worth it to buy a lottery ticket every week, or pondered on questions such as If I were offered a choice between a million

More information

FEEG6017 lecture: The normal distribution, estimation, confidence intervals. Markus Brede,

FEEG6017 lecture: The normal distribution, estimation, confidence intervals. Markus Brede, FEEG6017 lecture: The normal distribution, estimation, confidence intervals. Markus Brede, mb8@ecs.soton.ac.uk The normal distribution The normal distribution is the classic "bell curve". We've seen that

More information

Chapter 14 : Statistical Inference 1. Note : Here the 4-th and 5-th editions of the text have different chapters, but the material is the same.

Chapter 14 : Statistical Inference 1. Note : Here the 4-th and 5-th editions of the text have different chapters, but the material is the same. Chapter 14 : Statistical Inference 1 Chapter 14 : Introduction to Statistical Inference Note : Here the 4-th and 5-th editions of the text have different chapters, but the material is the same. Data x

More information

The normal distribution is a theoretical model derived mathematically and not empirically.

The normal distribution is a theoretical model derived mathematically and not empirically. Sociology 541 The Normal Distribution Probability and An Introduction to Inferential Statistics Normal Approximation The normal distribution is a theoretical model derived mathematically and not empirically.

More information

Exam 2 Spring 2015 Statistics for Applications 4/9/2015

Exam 2 Spring 2015 Statistics for Applications 4/9/2015 18.443 Exam 2 Spring 2015 Statistics for Applications 4/9/2015 1. True or False (and state why). (a). The significance level of a statistical test is not equal to the probability that the null hypothesis

More information

THE UNIVERSITY OF TEXAS AT AUSTIN Department of Information, Risk, and Operations Management

THE UNIVERSITY OF TEXAS AT AUSTIN Department of Information, Risk, and Operations Management THE UNIVERSITY OF TEXAS AT AUSTIN Department of Information, Risk, and Operations Management BA 386T Tom Shively PROBABILITY CONCEPTS AND NORMAL DISTRIBUTIONS The fundamental idea underlying any statistical

More information

Week 2 Quantitative Analysis of Financial Markets Hypothesis Testing and Confidence Intervals

Week 2 Quantitative Analysis of Financial Markets Hypothesis Testing and Confidence Intervals Week 2 Quantitative Analysis of Financial Markets Hypothesis Testing and Confidence Intervals Christopher Ting http://www.mysmu.edu/faculty/christophert/ Christopher Ting : christopherting@smu.edu.sg :

More information

Estimation Y 3. Confidence intervals I, Feb 11,

Estimation Y 3. Confidence intervals I, Feb 11, Estimation Example: Cholesterol levels of heart-attack patients Data: Observational study at a Pennsylvania medical center blood cholesterol levels patients treated for heart attacks measurements 2, 4,

More information

STAT Chapter 6: Sampling Distributions

STAT Chapter 6: Sampling Distributions STAT 515 -- Chapter 6: Sampling Distributions Definition: Parameter = a number that characterizes a population (example: population mean ) it s typically unknown. Statistic = a number that characterizes

More information

8.1 Estimation of the Mean and Proportion

8.1 Estimation of the Mean and Proportion 8.1 Estimation of the Mean and Proportion Statistical inference enables us to make judgments about a population on the basis of sample information. The mean, standard deviation, and proportions of a population

More information

Statistical Intervals. Chapter 7 Stat 4570/5570 Material from Devore s book (Ed 8), and Cengage

Statistical Intervals. Chapter 7 Stat 4570/5570 Material from Devore s book (Ed 8), and Cengage 7 Statistical Intervals Chapter 7 Stat 4570/5570 Material from Devore s book (Ed 8), and Cengage Confidence Intervals The CLT tells us that as the sample size n increases, the sample mean X is close to

More information

One sample z-test and t-test

One sample z-test and t-test One sample z-test and t-test January 30, 2017 psych10.stanford.edu Announcements / Action Items Install ISI package (instructions in Getting Started with R) Assessment Problem Set #3 due Tu 1/31 at 7 PM

More information

STA Rev. F Learning Objectives. What is a Random Variable? Module 5 Discrete Random Variables

STA Rev. F Learning Objectives. What is a Random Variable? Module 5 Discrete Random Variables STA 2023 Module 5 Discrete Random Variables Learning Objectives Upon completing this module, you should be able to: 1. Determine the probability distribution of a discrete random variable. 2. Construct

More information

Statistics 251: Statistical Methods Sampling Distributions Module

Statistics 251: Statistical Methods Sampling Distributions Module Statistics 251: Statistical Methods Sampling Distributions Module 7 2018 Three Types of Distributions data distribution the distribution of a variable in a sample population distribution the probability

More information

σ 2 : ESTIMATES, CONFIDENCE INTERVALS, AND TESTS Business Statistics

σ 2 : ESTIMATES, CONFIDENCE INTERVALS, AND TESTS Business Statistics σ : ESTIMATES, CONFIDENCE INTERVALS, AND TESTS Business Statistics CONTENTS Estimating other parameters besides μ Estimating variance Confidence intervals for σ Hypothesis tests for σ Estimating standard

More information

Chapter 7. Inferences about Population Variances

Chapter 7. Inferences about Population Variances Chapter 7. Inferences about Population Variances Introduction () The variability of a population s values is as important as the population mean. Hypothetical distribution of E. coli concentrations from

More information

1. Variability in estimates and CLT

1. Variability in estimates and CLT Unit3: Foundationsforinference 1. Variability in estimates and CLT Sta 101 - Fall 2015 Duke University, Department of Statistical Science Dr. Çetinkaya-Rundel Slides posted at http://bit.ly/sta101_f15

More information

Lecture 16: Estimating Parameters (Confidence Interval Estimates of the Mean)

Lecture 16: Estimating Parameters (Confidence Interval Estimates of the Mean) Statistics 16_est_parameters.pdf Michael Hallstone, Ph.D. hallston@hawaii.edu Lecture 16: Estimating Parameters (Confidence Interval Estimates of the Mean) Some Common Sense Assumptions for Interval Estimates

More information

An approximate sampling distribution for the t-ratio. Caution: comparing population means when σ 1 σ 2.

An approximate sampling distribution for the t-ratio. Caution: comparing population means when σ 1 σ 2. Stat 529 (Winter 2011) Non-pooled t procedures (The Welch test) Reading: Section 4.3.2 The sampling distribution of Y 1 Y 2. An approximate sampling distribution for the t-ratio. The Sri Lankan analysis.

More information

6.1 Discrete & Continuous Random Variables. Nov 4 6:53 PM. Objectives

6.1 Discrete & Continuous Random Variables. Nov 4 6:53 PM. Objectives 6.1 Discrete & Continuous Random Variables examples vocab Objectives Today we will... - Compute probabilities using the probability distribution of a discrete random variable. - Calculate and interpret

More information

STA 103: Final Exam. Print clearly on this exam. Only correct solutions that can be read will be given credit.

STA 103: Final Exam. Print clearly on this exam. Only correct solutions that can be read will be given credit. STA 103: Final Exam June 26, 2008 Name: } {{ } by writing my name i swear by the honor code Read all of the following information before starting the exam: Print clearly on this exam. Only correct solutions

More information

STAT 111 Recitation 3

STAT 111 Recitation 3 STAT 111 Recitation 3 Linjun Zhang stat.wharton.upenn.edu/~linjunz/ September 23, 2017 Misc. The unpicked-up homeworks will be put in the STAT 111 box in the Stats Department lobby (It s on the 4th floor

More information

STAT 241/251 - Chapter 7: Central Limit Theorem

STAT 241/251 - Chapter 7: Central Limit Theorem STAT 241/251 - Chapter 7: Central Limit Theorem In this chapter we will introduce the most important theorem in statistics; the central limit theorem. What have we seen so far? First, we saw that for an

More information

Two Populations Hypothesis Testing

Two Populations Hypothesis Testing Two Populations Hypothesis Testing Two Proportions (Large Independent Samples) Two samples are said to be independent if the data from the first sample is not connected to the data from the second sample.

More information

Math 140 Introductory Statistics

Math 140 Introductory Statistics Math 140 Introductory Statistics Let s make our own sampling! If we use a random sample (a survey) or if we randomly assign treatments to subjects (an experiment) we can come up with proper, unbiased conclusions

More information

Counting Basics. Venn diagrams

Counting Basics. Venn diagrams Counting Basics Sets Ways of specifying sets Union and intersection Universal set and complements Empty set and disjoint sets Venn diagrams Counting Inclusion-exclusion Multiplication principle Addition

More information

Sampling Distribution of and Simulation Methods. Ontario Public Sector Salaries. Strange Sample? Lecture 11. Reading: Sections

Sampling Distribution of and Simulation Methods. Ontario Public Sector Salaries. Strange Sample? Lecture 11. Reading: Sections Sampling Distribution of and Simulation Methods Lecture 11 Reading: Sections 1.3 1.5 1 Ontario Public Sector Salaries Public Sector Salary Disclosure Act, 1996 Requires organizations that receive public

More information

Discrete Random Variables

Discrete Random Variables Discrete Random Variables In this chapter, we introduce a new concept that of a random variable or RV. A random variable is a model to help us describe the state of the world around us. Roughly, a RV can

More information

Homework Assignments

Homework Assignments Homework Assignments Week 1 (p. 57) #4.1, 4., 4.3 Week (pp 58 6) #4.5, 4.6, 4.8(a), 4.13, 4.0, 4.6(b), 4.8, 4.31, 4.34 Week 3 (pp 15 19) #1.9, 1.1, 1.13, 1.15, 1.18 (pp 9 31) #.,.6,.9 Week 4 (pp 36 37)

More information

Section Random Variables and Histograms

Section Random Variables and Histograms Section 3.1 - Random Variables and Histograms Definition: A random variable is a rule that assigns a number to each outcome of an experiment. Example 1: Suppose we toss a coin three times. Then we could

More information

Interval estimation. September 29, Outline Basic ideas Sampling variation and CLT Interval estimation using X More general problems

Interval estimation. September 29, Outline Basic ideas Sampling variation and CLT Interval estimation using X More general problems Interval estimation September 29, 2017 STAT 151 Class 7 Slide 1 Outline of Topics 1 Basic ideas 2 Sampling variation and CLT 3 Interval estimation using X 4 More general problems STAT 151 Class 7 Slide

More information

Probability is the tool used for anticipating what the distribution of data should look like under a given model.

Probability is the tool used for anticipating what the distribution of data should look like under a given model. AP Statistics NAME: Exam Review: Strand 3: Anticipating Patterns Date: Block: III. Anticipating Patterns: Exploring random phenomena using probability and simulation (20%-30%) Probability is the tool used

More information

μ: ESTIMATES, CONFIDENCE INTERVALS, AND TESTS Business Statistics

μ: ESTIMATES, CONFIDENCE INTERVALS, AND TESTS Business Statistics μ: ESTIMATES, CONFIDENCE INTERVALS, AND TESTS Business Statistics CONTENTS Estimating parameters The sampling distribution Confidence intervals for μ Hypothesis tests for μ The t-distribution Comparison

More information

Review of key points about estimators

Review of key points about estimators Review of key points about estimators Populations can be at least partially described by population parameters Population parameters include: mean, proportion, variance, etc. Because populations are often

More information

Introduction to Probability and Inference HSSP Summer 2017, Instructor: Alexandra Ding July 19, 2017

Introduction to Probability and Inference HSSP Summer 2017, Instructor: Alexandra Ding July 19, 2017 Introduction to Probability and Inference HSSP Summer 2017, Instructor: Alexandra Ding July 19, 2017 Please fill out the attendance sheet! Suggestions Box: Feedback and suggestions are important to the

More information

Point Estimation. Edwin Leuven

Point Estimation. Edwin Leuven Point Estimation Edwin Leuven Introduction Last time we reviewed statistical inference We saw that while in probability we ask: given a data generating process, what are the properties of the outcomes?

More information

Chapter 7. Random Variables

Chapter 7. Random Variables Chapter 7 Random Variables Making quantifiable meaning out of categorical data Toss three coins. What does the sample space consist of? HHH, HHT, HTH, HTT, TTT, TTH, THT, THH In statistics, we are most

More information

Chapter 8: Sampling distributions of estimators Sections

Chapter 8: Sampling distributions of estimators Sections Chapter 8 continued Chapter 8: Sampling distributions of estimators Sections 8.1 Sampling distribution of a statistic 8.2 The Chi-square distributions 8.3 Joint Distribution of the sample mean and sample

More information

IOP 201-Q (Industrial Psychological Research) Tutorial 5

IOP 201-Q (Industrial Psychological Research) Tutorial 5 IOP 201-Q (Industrial Psychological Research) Tutorial 5 TRUE/FALSE [1 point each] Indicate whether the sentence or statement is true or false. 1. To establish a cause-and-effect relation between two variables,

More information

Graduate School of Business, University of Chicago Business 41202, Spring Quarter 2007, Mr. Ruey S. Tsay. Solutions to Final Exam

Graduate School of Business, University of Chicago Business 41202, Spring Quarter 2007, Mr. Ruey S. Tsay. Solutions to Final Exam Graduate School of Business, University of Chicago Business 41202, Spring Quarter 2007, Mr. Ruey S. Tsay Solutions to Final Exam Problem A: (30 pts) Answer briefly the following questions. 1. Suppose that

More information

IEOR 3106: Introduction to OR: Stochastic Models. Fall 2013, Professor Whitt. Class Lecture Notes: Tuesday, September 10.

IEOR 3106: Introduction to OR: Stochastic Models. Fall 2013, Professor Whitt. Class Lecture Notes: Tuesday, September 10. IEOR 3106: Introduction to OR: Stochastic Models Fall 2013, Professor Whitt Class Lecture Notes: Tuesday, September 10. The Central Limit Theorem and Stock Prices 1. The Central Limit Theorem (CLT See

More information

Binomial Random Variables. Binomial Random Variables

Binomial Random Variables. Binomial Random Variables Bernoulli Trials Definition A Bernoulli trial is a random experiment in which there are only two possible outcomes - success and failure. 1 Tossing a coin and considering heads as success and tails as

More information

Basics. STAT:5400 Computing in Statistics Simulation studies in statistics Lecture 9 September 21, 2016

Basics. STAT:5400 Computing in Statistics Simulation studies in statistics Lecture 9 September 21, 2016 STAT:5400 Computing in Statistics Simulation studies in statistics Lecture 9 September 21, 2016 Based on a lecture by Marie Davidian for ST 810A - Spring 2005 Preparation for Statistical Research North

More information

The topics in this section are related and necessary topics for both course objectives.

The topics in this section are related and necessary topics for both course objectives. 2.5 Probability Distributions The topics in this section are related and necessary topics for both course objectives. A probability distribution indicates how the probabilities are distributed for outcomes

More information

Figure 1: 2πσ is said to have a normal distribution with mean µ and standard deviation σ. This is also denoted

Figure 1: 2πσ is said to have a normal distribution with mean µ and standard deviation σ. This is also denoted Figure 1: Math 223 Lecture Notes 4/1/04 Section 4.10 The normal distribution Recall that a continuous random variable X with probability distribution function f(x) = 1 µ)2 (x e 2σ 2πσ is said to have a

More information

Prob and Stats, Nov 7

Prob and Stats, Nov 7 Prob and Stats, Nov 7 The Standard Normal Distribution Book Sections: 7.1, 7.2 Essential Questions: What is the standard normal distribution, how is it related to all other normal distributions, and how

More information

Numerical Descriptive Measures. Measures of Center: Mean and Median

Numerical Descriptive Measures. Measures of Center: Mean and Median Steve Sawin Statistics Numerical Descriptive Measures Having seen the shape of a distribution by looking at the histogram, the two most obvious questions to ask about the specific distribution is where

More information

CPSC 540: Machine Learning

CPSC 540: Machine Learning CPSC 540: Machine Learning Monte Carlo Methods Mark Schmidt University of British Columbia Winter 2018 Last Time: Markov Chains We can use Markov chains for density estimation, p(x) = p(x 1 ) }{{} d p(x

More information

Financial Econometrics (FinMetrics04) Time-series Statistics Concepts Exploratory Data Analysis Testing for Normality Empirical VaR

Financial Econometrics (FinMetrics04) Time-series Statistics Concepts Exploratory Data Analysis Testing for Normality Empirical VaR Financial Econometrics (FinMetrics04) Time-series Statistics Concepts Exploratory Data Analysis Testing for Normality Empirical VaR Nelson Mark University of Notre Dame Fall 2017 September 11, 2017 Introduction

More information

AP Statistics Chapter 6 - Random Variables

AP Statistics Chapter 6 - Random Variables AP Statistics Chapter 6 - Random 6.1 Discrete and Continuous Random Objective: Recognize and define discrete random variables, and construct a probability distribution table and a probability histogram

More information

Chapter 7 presents the beginning of inferential statistics. The two major activities of inferential statistics are

Chapter 7 presents the beginning of inferential statistics. The two major activities of inferential statistics are Chapter 7 presents the beginning of inferential statistics. Concept: Inferential Statistics The two major activities of inferential statistics are 1 to use sample data to estimate values of population

More information

Honor Code: By signing my name below, I pledge my honor that I have not violated the Booth Honor Code during this examination.

Honor Code: By signing my name below, I pledge my honor that I have not violated the Booth Honor Code during this examination. Name: OUTLINE SOLUTIONS University of Chicago Graduate School of Business Business 41000: Business Statistics Special Notes: 1. This is a closed-book exam. You may use an 8 11 piece of paper for the formulas.

More information

Chapter 9: Sampling Distributions

Chapter 9: Sampling Distributions Chapter 9: Sampling Distributions 9. Introduction This chapter connects the material in Chapters 4 through 8 (numerical descriptive statistics, sampling, and probability distributions, in particular) with

More information

Business Statistics 41000: Probability 3

Business Statistics 41000: Probability 3 Business Statistics 41000: Probability 3 Drew D. Creal University of Chicago, Booth School of Business February 7 and 8, 2014 1 Class information Drew D. Creal Email: dcreal@chicagobooth.edu Office: 404

More information

Section 2: Estimation, Confidence Intervals and Testing Hypothesis

Section 2: Estimation, Confidence Intervals and Testing Hypothesis Section 2: Estimation, Confidence Intervals and Testing Hypothesis Carlos M. Carvalho The University of Texas at Austin McCombs School of Business http://faculty.mccombs.utexas.edu/carlos.carvalho/teaching/

More information

Part V - Chance Variability

Part V - Chance Variability Part V - Chance Variability Dr. Joseph Brennan Math 148, BU Dr. Joseph Brennan (Math 148, BU) Part V - Chance Variability 1 / 78 Law of Averages In Chapter 13 we discussed the Kerrich coin-tossing experiment.

More information

Business Statistics 41000: Probability 4

Business Statistics 41000: Probability 4 Business Statistics 41000: Probability 4 Drew D. Creal University of Chicago, Booth School of Business February 14 and 15, 2014 1 Class information Drew D. Creal Email: dcreal@chicagobooth.edu Office:

More information

χ 2 distributions and confidence intervals for population variance

χ 2 distributions and confidence intervals for population variance χ 2 distributions and confidence intervals for population variance Let Z be a standard Normal random variable, i.e., Z N(0, 1). Define Y = Z 2. Y is a non-negative random variable. Its distribution is

More information

Chapter 7. Sampling Distributions and the Central Limit Theorem

Chapter 7. Sampling Distributions and the Central Limit Theorem Chapter 7. Sampling Distributions and the Central Limit Theorem 1 Introduction 2 Sampling Distributions related to the normal distribution 3 The central limit theorem 4 The normal approximation to binomial

More information

Data Analysis and Statistical Methods Statistics 651

Data Analysis and Statistical Methods Statistics 651 Data Analysis and Statistical Methods Statistics 651 http://www.stat.tamu.edu/~suhasini/teaching.html Lecture 10 (MWF) Checking for normality of the data using the QQplot Suhasini Subba Rao Review of previous

More information

The Two-Sample Independent Sample t Test

The Two-Sample Independent Sample t Test Department of Psychology and Human Development Vanderbilt University 1 Introduction 2 3 The General Formula The Equal-n Formula 4 5 6 Independence Normality Homogeneity of Variances 7 Non-Normality Unequal

More information

Measure of Variation

Measure of Variation Measure of Variation Variation is the spread of a data set. The simplest measure is the range. Range the difference between the maximum and minimum data entries in the set. To find the range, the data

More information

Mean of a Discrete Random variable. Suppose that X is a discrete random variable whose distribution is : :

Mean of a Discrete Random variable. Suppose that X is a discrete random variable whose distribution is : : Dr. Kim s Note (December 17 th ) The values taken on by the random variable X are random, but the values follow the pattern given in the random variable table. What is a typical value of a random variable

More information

Class 13. Daniel B. Rowe, Ph.D. Department of Mathematics, Statistics, and Computer Science. Marquette University MATH 1700

Class 13. Daniel B. Rowe, Ph.D. Department of Mathematics, Statistics, and Computer Science. Marquette University MATH 1700 Class 13 Daniel B. Rowe, Ph.D. Department of Mathematics, Statistics, and Computer Science Copyright 017 by D.B. Rowe 1 Agenda: Recap Chapter 6.3 6.5 Lecture Chapter 7.1 7. Review Chapter 5 for Eam 3.

More information

Module 4: Point Estimation Statistics (OA3102)

Module 4: Point Estimation Statistics (OA3102) Module 4: Point Estimation Statistics (OA3102) Professor Ron Fricker Naval Postgraduate School Monterey, California Reading assignment: WM&S chapter 8.1-8.4 Revision: 1-12 1 Goals for this Module Define

More information

MATH 143: Introduction to Probability and Statistics Worksheet 9 for Thurs., Dec. 10: What procedure?

MATH 143: Introduction to Probability and Statistics Worksheet 9 for Thurs., Dec. 10: What procedure? MATH 143: Introduction to Probability and Statistics Worksheet 9 for Thurs., Dec. 10: What procedure? For each numbered problem, identify (if possible) the following: (a) the variable(s) and variable type(s)

More information

Session 178 TS, Stats for Health Actuaries. Moderator: Ian G. Duncan, FSA, FCA, FCIA, FIA, MAAA. Presenter: Joan C. Barrett, FSA, MAAA

Session 178 TS, Stats for Health Actuaries. Moderator: Ian G. Duncan, FSA, FCA, FCIA, FIA, MAAA. Presenter: Joan C. Barrett, FSA, MAAA Session 178 TS, Stats for Health Actuaries Moderator: Ian G. Duncan, FSA, FCA, FCIA, FIA, MAAA Presenter: Joan C. Barrett, FSA, MAAA Session 178 Statistics for Health Actuaries October 14, 2015 Presented

More information