Experimental Design and Statistics - AGA47A


Czech University of Life Sciences in Prague
Department of Genetics and Breeding
Fall/Winter 2014/2015
Matúš Maciak (@ A 211)
Office Hours: M 14:00-15:30, W 15:30-17:00 (or by appointment)

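Brief Overview
Some Useful Formulas

- for a random sample $X_1, \dots, X_n \sim N(\mu, \sigma^2)$ with $\sigma^2 > 0$ known:
  $$\sqrt{n}\,\frac{\overline{X}_n - \mu}{\sigma} \sim N(0, 1)$$
- for a random sample $X_1, \dots, X_n \sim N(\mu, \sigma^2)$ with $\sigma^2 > 0$ unknown:
  $$\sqrt{n}\,\frac{\overline{X}_n - \mu}{s_n} \sim t_{n-1}$$
- for a random sample $X_1, \dots, X_n \sim N(\mu, \sigma^2)$ it holds:
  $$\frac{(n-1)\,s_n^2}{\sigma^2} \sim \chi^2_{n-1}$$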

Brief Overview
Confidence intervals

- for the sample mean $\overline{X}_n$ with known variance $\sigma^2$ it holds:
  $$P\left[\mu \in \left(\overline{X}_n - u_{\alpha/2}\frac{\sigma}{\sqrt{n}},\; \overline{X}_n + u_{\alpha/2}\frac{\sigma}{\sqrt{n}}\right)\right] = 1 - \alpha;$$
- for the sample mean $\overline{X}_n$ with unknown variance $\sigma^2$ it holds:
  $$P\left[\mu \in \left(\overline{X}_n - t_{n-1}(\alpha/2)\frac{s_n}{\sqrt{n}},\; \overline{X}_n + t_{n-1}(\alpha/2)\frac{s_n}{\sqrt{n}}\right)\right] = 1 - \alpha;$$
- for the sample variance $s_n^2$ it holds:
  $$P\left[\sigma^2 \in \left(\frac{(n-1)\,s_n^2}{\chi^2_{n-1}(1-\alpha/2)},\; \frac{(n-1)\,s_n^2}{\chi^2_{n-1}(\alpha/2)}\right)\right] = 1 - \alpha;$$

In a very analogous way one can also construct one-sided confidence intervals for the parameters $\mu$ and $\sigma^2$.
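The three intervals above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the data are hypothetical, the function names are made up for this example, $u_{\alpha/2}$ and $t_{n-1}(\alpha/2)$ are read as upper critical values, and the $\chi^2_{n-1}$ values are read as quantiles. The $t$ and $\chi^2$ values are taken from standard statistical tables for $n = 10$ and $\alpha = 0.05$, because the Python standard library has no quantile function for these distributions.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

def mean_ci_known_sigma(xs, sigma, alpha=0.05):
    """Two-sided (1 - alpha) interval for mu when sigma is known."""
    n = len(xs)
    u = NormalDist().inv_cdf(1 - alpha / 2)  # upper alpha/2 critical value of N(0, 1)
    half_width = u * sigma / sqrt(n)
    return mean(xs) - half_width, mean(xs) + half_width

def mean_ci_unknown_sigma(xs, t_crit):
    """Two-sided interval for mu; t_crit is the tabulated t_{n-1} critical value."""
    n = len(xs)
    half_width = t_crit * stdev(xs) / sqrt(n)  # stdev uses the n - 1 divisor
    return mean(xs) - half_width, mean(xs) + half_width

def variance_ci(xs, chi2_low, chi2_high):
    """Two-sided interval for sigma^2 from the alpha/2 and 1 - alpha/2 quantiles."""
    n = len(xs)
    scaled = (n - 1) * stdev(xs) ** 2
    return scaled / chi2_high, scaled / chi2_low

# Hypothetical sample of n = 10 measurements (not from the lecture).
sample = [4.9, 5.1, 5.3, 4.7, 5.0, 5.2, 4.8, 5.4, 4.6, 5.0]

# Table values for 9 degrees of freedom and alpha = 0.05.
T_CRIT_9 = 2.262
CHI2_9_LOW, CHI2_9_HIGH = 2.700, 19.023

z_low, z_high = mean_ci_known_sigma(sample, sigma=0.25)
t_low, t_high = mean_ci_unknown_sigma(sample, T_CRIT_9)
v_low, v_high = variance_ci(sample, CHI2_9_LOW, CHI2_9_HIGH)

print(f"mu, sigma known:   ({z_low:.3f}, {z_high:.3f})")
print(f"mu, sigma unknown: ({t_low:.3f}, {t_high:.3f})")
print(f"sigma^2:           ({v_low:.4f}, {v_high:.4f})")
```

The interval for $\sigma^2$ is not symmetric around $s_n^2$, because the $\chi^2$ distribution is skewed.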

Brief Overview
William Sealy Gosset - The Student

One sample problems in statistics
Common one sample problems

For one random sample $X_1, \dots, X_n \sim N(\mu, \sigma^2)$, for $\sigma^2 > 0$, the most common statistical problems one usually encounters include:

- estimating unknown parameters (e.g. $\mu \in \mathbb{R}$ and $\sigma^2 > 0$);
  (various methods proposed, most common is the method of moments)
- constructing confidence intervals for $\mu \in \mathbb{R}$ and $\sigma^2 > 0$;
  (with a given confidence level $(1 - \alpha)$ for some $\alpha \in (0, \tfrac{1}{2})$)
- testing a pair of hypotheses about the true values of $\mu \in \mathbb{R}$ or $\sigma^2 > 0$;
  (for a given significance level $\alpha \in (0, \tfrac{1}{2})$)
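As a small illustration of the third item, the sketch below tests $H_0: \mu = \mu_0$ against a two-sided alternative using the statistic $\sqrt{n}\,(\overline{X}_n - \mu_0)/s_n$, which has a $t_{n-1}$ distribution under $H_0$. The data, the function names and the hypothesised value $\mu_0 = 4.7$ are assumptions made for this example. The critical value 2.262 is the tabulated two-sided 5% value for 9 degrees of freedom.

```python
from math import sqrt
from statistics import mean, stdev

def one_sample_t_statistic(xs, mu0):
    """sqrt(n) * (sample mean - mu0) / s_n, with s_n using the n - 1 divisor."""
    n = len(xs)
    return sqrt(n) * (mean(xs) - mu0) / stdev(xs)

def reject_two_sided(t_stat, t_crit):
    """Reject H0 when the statistic exceeds the critical value in absolute value."""
    return abs(t_stat) > t_crit

# Hypothetical sample of n = 10 measurements (not from the lecture).
sample = [4.9, 5.1, 5.3, 4.7, 5.0, 5.2, 4.8, 5.4, 4.6, 5.0]
T_CRIT_9 = 2.262  # table value: t distribution, 9 degrees of freedom, alpha = 0.05

t_stat = one_sample_t_statistic(sample, mu0=4.7)
decision = reject_two_sided(t_stat, T_CRIT_9)
print(f"t = {t_stat:.3f}, reject H0: {decision}")
```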

Two sample inference - motivation

- in statistics we also focus on problems which relate to more than just one sample (e.g. comparison of two samples);
- the simplest scenario is to consider two different random samples and to answer the statistical question of how these samples differ;
- there are many characteristics that can be used to judge the difference between two populations:
  - two random samples $X_1, \dots, X_{n_1} \sim F_1(x)$ and $Y_1, \dots, Y_{n_2} \sim F_2(x)$;
  - two distribution functions $F_1$ and $F_2$; how can they be different?
    - ... different distributions $F_1$ and $F_2$;
    - ... different functional shapes of $F_1$ and $F_2$;
    - ... different range of values for $F_1$ and $F_2$;
    - ... different locations: mean parameters $\mu_1$ and $\mu_2$;
    - ... different scales: variance parameters $\sigma_1^2$ and $\sigma_2^2$;
    - ...
- inference is again based on confidence intervals and hypothesis tests;

Two sample problem - motivation

[Figure: density plots comparing two samples; vertical axis: Density]

- we need some statistical approaches to reveal the difference;
- we need some decision criteria to judge the difference;

Two sample problem (Gaussian)

For two random samples $X_1, \dots, X_{n_1} \sim N(\mu_1, \sigma_1^2)$ and $Y_1, \dots, Y_{n_2} \sim N(\mu_2, \sigma_2^2)$, for $\sigma_1^2, \sigma_2^2 > 0$, the most common statistical problems (questions we are interested in) are:

- parameter estimates for $\mu_1, \mu_2 \in \mathbb{R}$ and $\sigma_1^2, \sigma_2^2 > 0$;
- parameter estimate for the difference $\mu_1 - \mu_2 \in \mathbb{R}$;
- confidence intervals for $\mu_1, \mu_2 \in \mathbb{R}$ and $\sigma_1^2, \sigma_2^2 > 0$;
  (with a given confidence level $(1 - \alpha)$ for some $\alpha \in (0, \tfrac{1}{2})$)
- confidence interval for the difference $\mu_1 - \mu_2 \in \mathbb{R}$;
  (with a given confidence level $(1 - \alpha)$ for some $\alpha \in (0, \tfrac{1}{2})$)
- hypothesis tests about the true values of $\mu \in \mathbb{R}$ and $\sigma^2 > 0$;
  (for a given significance level $\alpha \in (0, \tfrac{1}{2})$)
- hypothesis tests about the true value of $\mu_1 - \mu_2 \in \mathbb{R}$;
  (for a given significance level $\alpha \in (0, \tfrac{1}{2})$)
- comparing two variances: $\sigma_1^2$ vs. $\sigma_2^2$;

Paired vs. Independent Samples

Let us assume an experiment producing two random samples:

1. $X_1, \dots, X_{n_1} \sim N(\mu_1, \sigma_1^2)$;
2. $Y_1, \dots, Y_{n_2} \sim N(\mu_2, \sigma_2^2)$;

Various options are possible and one needs to distinguish among them:

- random samples are balanced ($n_1 = n_2$) or they are not ($n_1 \neq n_2$);
- random samples are only shifted ($\mu_1 \neq \mu_2$), however, with the same variance $\sigma_1^2 = \sigma_2^2$ - homoscedastic samples;
- random samples are shifted and scaled ($\mu_1 \neq \mu_2$ and $\sigma_1^2 \neq \sigma_2^2$) - heteroscedastic samples;
- for balanced samples we can have observations $X_i$ and $Y_i$ being always measured on the same subject for every $i = 1, \dots, n_1 = n_2$;
- observations $X_i$ and $Y_j$ are always measured independently on two different subjects, for $i = 1, \dots, n_1$ and $j = 1, \dots, n_2$;
- ...

Design of Experiments

How it all goes:
question of interest → design of experiment → collecting data → statistical evaluation → results interpretation → answering the question of interest

"To consult the statistician after an experiment is finished is often merely to ask him to conduct a post mortem examination. He can perhaps say what the experiment died of."
Ronald Fisher (1890-1962)

- the question of interest behind the whole experiment is important;
- given the question of interest, the statistician designs an experiment;
- there are many ways to design an experiment:
  - randomized experiment;
  - factorial design (fully crossed design);
  - block design experiment;
  - blind vs. double-blind experiment;
  - independent random samples;
  - paired samples;
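To make the first item of the list concrete, a completely randomized assignment of subjects to two groups could be sketched as follows. The subject labels, the function name and the fixed seed are hypothetical and serve only to make the example reproducible. Real experiments may need blocking or other constraints that this sketch ignores.

```python
import random

def randomize_two_groups(subjects, seed=None):
    """Shuffle the subjects and split them into two groups of (nearly) equal size."""
    rng = random.Random(seed)   # local generator, so global random state is untouched
    shuffled = list(subjects)   # copy, the caller's list is not modified
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]

# Hypothetical experimental units.
subjects = [f"plant_{i}" for i in range(1, 11)]
treatment, control = randomize_two_groups(subjects, seed=42)

print("treatment:", treatment)
print("control:  ", control)
```

The two groups produced this way correspond to the independent random samples in the list; a paired design would instead measure both conditions on the same subject.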

Estimating unknown parameters for two sample problem
Parameter estimates

- sample means:
  $$\overline{X}_{n_1} = \frac{1}{n_1}\sum_{i=1}^{n_1} X_i \quad\text{and}\quad \overline{Y}_{n_2} = \frac{1}{n_2}\sum_{i=1}^{n_2} Y_i;$$
- sample variances:
  $$s_{n_1}^2 = \frac{1}{n_1 - 1}\sum_{i=1}^{n_1}\left(X_i - \overline{X}_{n_1}\right)^2 \quad\text{and}\quad s_{n_2}^2 = \frac{1}{n_2 - 1}\sum_{i=1}^{n_2}\left(Y_i - \overline{Y}_{n_2}\right)^2;$$

How to estimate the difference?

- for paired samples: $\frac{1}{n}\sum_{i=1}^{n}(X_i - Y_i)$;
- for independent samples: $\overline{X}_{n_1} - \overline{Y}_{n_2}$;

What are the corresponding distributions?

- $\frac{1}{n}\sum_{i=1}^{n}(X_i - Y_i) \sim N(\mu_1 - \mu_2, \sigma^2)$
- $\overline{X}_{n_1} - \overline{Y}_{n_2} \sim N(\mu_1 - \mu_2, \sigma^2)$

What are the corresponding variance parameters $\sigma^2 > 0$: either $\sigma_1^2 = \sigma_2^2$ or $\sigma_1^2 \neq \sigma_2^2$.
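The point estimates above can be computed directly with Python's `statistics` module. The sketch below uses hypothetical balanced data and made-up function names. It also shows that, for balanced samples, the paired estimate (mean of the differences) and the independent-samples estimate (difference of the means) give the same number. The two designs differ in the variance of the estimator, not in its value.

```python
from statistics import mean, variance

def estimate_difference_independent(xs, ys):
    """Difference of the two sample means; sample sizes may differ."""
    return mean(xs) - mean(ys)

def estimate_difference_paired(xs, ys):
    """Mean of the within-pair differences X_i - Y_i; requires n1 == n2."""
    if len(xs) != len(ys):
        raise ValueError("paired samples must have the same size")
    return mean([x - y for x, y in zip(xs, ys)])

# Hypothetical measurements (not from the lecture), n1 = n2 = 5.
x = [12.1, 11.4, 13.0, 12.7, 11.9]
y = [11.0, 10.9, 12.2, 12.5, 11.1]

mean_x, mean_y = mean(x), mean(y)
var_x, var_y = variance(x), variance(y)   # both use the n - 1 divisor

diff_independent = estimate_difference_independent(x, y)
diff_paired = estimate_difference_paired(x, y)

print(f"means: {mean_x:.2f}, {mean_y:.2f}; variances: {var_x:.3f}, {var_y:.3f}")
print(f"difference (independent): {diff_independent:.2f}")
print(f"difference (paired):      {diff_paired:.2f}")
```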

Estimating unknown parameters for two sample problem
Variance parameters estimation

What are the corresponding estimates for $\sigma^2 > 0$ under the different scenarios?

- for equal or unequal sample sizes $n_1$ and $n_2$ and equal variances $\sigma_1^2 = \sigma_2^2$:
  $$\sigma^2 = \sigma_{XY}^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right), \quad\text{for}\quad \sigma_{XY}^2 = \frac{(n_1 - 1)\,s_{n_1}^2 + (n_2 - 1)\,s_{n_2}^2}{n_1 + n_2 - 2},$$
  where $\sigma_{XY}^2$ is called a pooled variance estimate;
- for equal (unequal) sample sizes $n_1$, $n_2$ and unequal variances $\sigma_1^2 \neq \sigma_2^2$:
  $$\sigma^2 = \frac{s_{n_1}^2}{n_1} + \frac{s_{n_2}^2}{n_2}$$
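Both variance estimates can be written as short functions. The following is a minimal sketch with hypothetical data and made-up function names, chosen so that the arithmetic is easy to check by hand: $s_{n_1}^2 = 2.5$ and $s_{n_2}^2 = 20/3$.

```python
from statistics import variance

def pooled_variance(xs, ys):
    """Pooled variance estimate, appropriate when sigma_1^2 = sigma_2^2 is assumed."""
    n1, n2 = len(xs), len(ys)
    return ((n1 - 1) * variance(xs) + (n2 - 1) * variance(ys)) / (n1 + n2 - 2)

def difference_variance_equal(xs, ys):
    """Estimated variance of the mean difference under equal variances."""
    return pooled_variance(xs, ys) * (1 / len(xs) + 1 / len(ys))

def difference_variance_unequal(xs, ys):
    """Estimated variance of the mean difference under unequal variances."""
    return variance(xs) / len(xs) + variance(ys) / len(ys)

# Hypothetical unbalanced samples: n1 = 5, n2 = 4.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 6.0, 8.0]

print(f"pooled variance:           {pooled_variance(x, y):.4f}")             # 30/7
print(f"equal-variance estimate:   {difference_variance_equal(x, y):.4f}")   # 13.5/7
print(f"unequal-variance estimate: {difference_variance_unequal(x, y):.4f}") # 0.5 + 5/3
```

For balanced samples the two estimates coincide; they differ only when $n_1 \neq n_2$.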

Estimating unknown parameters for two sample problem
Degrees of Freedom Calculations

- For paired samples ($n_1 = n_2 = n$):
  $$\text{degrees of freedom} = n - 1$$
- For independent samples ($n_1$, $n_2$ and $\sigma_1^2 = \sigma_2^2$):
  $$\text{degrees of freedom} = n_1 + n_2 - 2$$
- General procedure ($n_1 \neq n_2$ and $\sigma_1^2 \neq \sigma_2^2$):
  $$\text{degrees of freedom} = \frac{\left(\dfrac{\sigma_A^2}{n} + \dfrac{\sigma_B^2}{m}\right)^2}{\dfrac{\sigma_A^4}{n^2 (n - 1)} + \dfrac{\sigma_B^4}{m^2 (m - 1)}}$$
- Conservative approach (independent samples, $n \neq m$):
  $$\text{degrees of freedom} = \min\{n, m\} - 1$$
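The four rules can be compared side by side in code. In this sketch the function names are made up, $n$ and $m$ denote the sizes of samples A and B, and the sample variances are plugged in for $\sigma_A^2$ and $\sigma_B^2$ in the general formula. That substitution is an assumption of the example, since the slide states the formula in terms of the variance parameters. The numbers reuse the hypothetical values $s_A^2 = 2.5$, $n = 5$, $s_B^2 = 20/3$, $m = 4$.

```python
def df_paired(n):
    """Paired samples with n pairs."""
    return n - 1

def df_pooled(n, m):
    """Independent samples with equal variances."""
    return n + m - 2

def df_general(var_a, n, var_b, m):
    """General (unequal variances) approximation; usually not an integer."""
    numerator = (var_a / n + var_b / m) ** 2
    denominator = var_a ** 2 / (n ** 2 * (n - 1)) + var_b ** 2 / (m ** 2 * (m - 1))
    return numerator / denominator

def df_conservative(n, m):
    """Conservative choice for independent samples."""
    return min(n, m) - 1

# Hypothetical sample variances and sizes.
var_a, n = 2.5, 5
var_b, m = 20 / 3, 4

print("pooled:      ", df_pooled(n, m))
print("general:     ", round(df_general(var_a, n, var_b, m), 3))
print("conservative:", df_conservative(n, m))
```

The general value always lies between the conservative and the pooled degrees of freedom, which is why $\min\{n, m\} - 1$ is a safe choice when the formula is inconvenient to evaluate.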


Estimating unknown parameters for two sample problem
To be continued...

- comparing two sample variances;
- inference on population proportion;
- some other statistical tests;
- ...