Section 0: Introduction and Review of Basic Concepts
1 Section 0: Introduction and Review of Basic Concepts
Carlos M. Carvalho
The University of Texas McCombs School of Business
mccombs.utexas.edu/faculty/carlos.carvalho/teaching
2 Getting Started
Syllabus
General Expectations
1. Read the notes / Practice
2. Be on schedule
3 Course Overview
- Section 0: Basic Concepts: Probability and Estimation
- Section 1: Simple Regression Model
- Section 2: Multiple Regression
- Section 3: Dummy Variables and Interactions
- Section 4: Regression Diagnostics and Transformations
- Section 5: Time Series
- Section 6: Model Selection, Logistic Regression and more...
4 Review of Basic Concepts
Probability and statistics let us talk efficiently about things we are unsure about.
- If I only ask 1,000 voters out of 10 million, how sure can I be about how they will all vote? What is the true proportion of yes voters?
- If I am trying to predict sales next quarter, how sure am I?
- If I am trying to choose my portfolio, how sure am I about the returns on the assets next period?
- If I want to do target marketing, which customers are more likely to respond to a promotion?
All of these involve inferring or predicting unknown quantities!
5 Random Variables
Random variables are numbers that we are NOT sure about, but whose potential outcomes we might have some idea of how to describe. We usually use a capital letter to denote a random variable.
Example: Suppose we are about to toss two coins. Let $X$ denote the number of heads. We say that $X$ is the random variable that stands for the number we are not sure about.
Note that we always assign numbers to random variables!
6 Probability Distribution
We describe the behavior of a random variable with a probability distribution.
Example: If $X$ is the random variable denoting the number of heads in two independent tosses of a fair coin, we can describe its behavior through the following probability distribution:
$$X = \begin{cases} 0 & \text{with prob. } 0.25 \\ 1 & \text{with prob. } 0.5 \\ 2 & \text{with prob. } 0.25 \end{cases}$$
$X$ is called a discrete random variable, as we are able to list all of its possible outcomes.
Question: What is $\Pr(X = 0)$? How about $\Pr(X \geq 1)$?
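The two-coin distribution can be checked by listing every outcome. The sketch below assumes fair coins (probability of heads 1/2) and reads the second question as "at least one head"; the function name `heads_distribution` is only illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def heads_distribution(n_tosses=2, p_heads=Fraction(1, 2)):
    """Return {k: Pr(X = k)} where X is the number of heads in n independent tosses."""
    dist = Counter()
    for outcome in product("HT", repeat=n_tosses):
        prob = Fraction(1)
        for side in outcome:
            # Independence: multiply the probabilities of the individual tosses.
            prob *= p_heads if side == "H" else 1 - p_heads
        dist[outcome.count("H")] += prob
    return dict(dist)


dist = heads_distribution()
pr_zero = dist[0]
pr_at_least_one = sum(p for k, p in dist.items() if k >= 1)
print(dist, pr_zero, pr_at_least_one)
```

Because the probabilities are exact fractions, the total can be compared with 1 without rounding issues.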
7 Probability Distribution
- A probability is never negative: it takes values between 0 and 1.
- The total probability across all possible values of a random variable equals 1.
8 The Bernoulli Distribution
Suppose the random variable $X$ can only take the values 0 or 1. This is a dummy variable representing the success or failure of an experiment, e.g.:
- $X = 1$ if I win a hand in blackjack; $X = 0$ if I lose
- $X = 1$ if David Ortiz gets a hit in an at bat; $X = 0$ otherwise
- $X = 1$ if the S&P500 return is positive; $X = 0$ if negative
The Bernoulli is a distribution defined by the probability parameter $p$. We write $X \sim \text{Bernoulli}(p)$:
$$X = \begin{cases} 1 & \text{with prob. } p \\ 0 & \text{with prob. } 1 - p \end{cases}$$
9 Mean and Variance of a Random Variable
Suppose someone asks you for a prediction of $X$. What would you say?
Suppose someone asks you how sure you are. What would you say?
10 Mean and Variance of a Random Variable
The Mean or Expected Value is defined as (for a discrete $X$):
$$E(X) = \sum_{i=1}^{n} \Pr(x_i)\, x_i$$
We weight each possible value by how likely it is. This provides us with a measure of centrality of the distribution: a good prediction for $X$!
11 Mean and Variance of a Random Variable
Suppose $X \sim \text{Bernoulli}(p)$:
$$E(X) = \sum_{i=1}^{n} \Pr(x_i)\, x_i = 0 \times (1 - p) + 1 \times p = p$$
12 Mean and Variance of a Random Variable
The Variance is defined as (for a discrete $X$):
$$Var(X) = \sum_{i=1}^{n} \Pr(x_i)\, [x_i - E(X)]^2$$
It is a weighted average of squared prediction errors. This is a measure of the spread of a distribution. More risky distributions have larger variance.
13 Mean and Variance of a Random Variable
Suppose $X \sim \text{Bernoulli}(p)$:
$$Var(X) = \sum_{i=1}^{n} \Pr(x_i)\, [x_i - E(X)]^2 = (0 - p)^2 (1 - p) + (1 - p)^2 p = p(1 - p)\,[(1 - p) + p] = p(1 - p)$$
Question: For which value of $p$ is the variance the largest?
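The mean and variance formulas for a discrete random variable translate directly into code. The following is a minimal sketch, with a distribution stored as a `{value: probability}` dictionary (a representation chosen here for illustration); it also scans a grid of $p$ values to explore the closing question.

```python
def mean(dist):
    """E(X): each value weighted by its probability."""
    return sum(prob * x for x, prob in dist.items())


def variance(dist):
    """Var(X): probability-weighted average of squared deviations from E(X)."""
    mu = mean(dist)
    return sum(prob * (x - mu) ** 2 for x, prob in dist.items())


def bernoulli(p):
    """Distribution of X ~ Bernoulli(p) as {value: probability}."""
    return {0: 1 - p, 1: p}


# Check E(X) = p and Var(X) = p(1 - p) for one value of p.
p = 0.3
print(mean(bernoulli(p)), variance(bernoulli(p)))

# Scan p = 0.00, 0.01, ..., 1.00 for the largest variance.
grid = [i / 100 for i in range(101)]
p_max = max(grid, key=lambda q: variance(bernoulli(q)))
print(p_max)
```

On this grid the variance peaks at $p = 0.5$, the case where the outcome is hardest to predict.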
14 The Standard Deviation
What are the units of $E(X)$? What are the units of $Var(X)$?
A more intuitive way to understand the spread of a distribution is to look at the standard deviation:
$$sd(X) = \sqrt{Var(X)}$$
What are the units of $sd(X)$?
15 Continuous Random Variables
Suppose we are trying to predict tomorrow's return on the S&P500.
Question: What is the random variable of interest?
Question: How can we describe our uncertainty about tomorrow's outcome?
Listing all possible values seems like a crazy task, so we'll work with intervals instead. These are called continuous random variables.
The probability of an interval is defined by the area under the probability density function.
16 The Normal Distribution
A random variable is a number we are NOT sure about, but we might have some idea of how to describe its potential outcomes. The Normal distribution is the most widely used probability distribution for describing a random variable.
The probability that the number ends up in an interval is given by the area under the curve (the pdf).
[Figure: standard normal pdf]
17 The Normal Distribution
The standard Normal distribution has mean 0 and variance 1.
Notation: $Z \sim N(0, 1)$ ($Z$ is the random variable).
$$\Pr(-1 < Z < 1) = 0.68 \qquad \Pr(-1.96 < Z < 1.96) = 0.95$$
[Figures: two plots of the standard normal pdf against z]
18 The Normal Distribution
Note: For simplicity we will often use $\Pr(-2 < Z < 2) \approx 0.95$.
Questions: What is $\Pr(Z < 2)$? How about $\Pr(Z \leq 2)$? What is $\Pr(Z < 0)$?
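These interval probabilities are areas under the standard normal pdf, so they can be computed as differences of the cumulative distribution function. A small sketch using `statistics.NormalDist` from the Python standard library (the helper `prob_between` is an illustrative name, not part of the course material):

```python
from statistics import NormalDist

Z = NormalDist(mu=0.0, sigma=1.0)  # standard normal


def prob_between(dist, lo, hi):
    """Pr(lo < X < hi): area under the pdf between lo and hi."""
    return dist.cdf(hi) - dist.cdf(lo)


within_1 = prob_between(Z, -1, 1)
within_196 = prob_between(Z, -1.96, 1.96)
within_2 = prob_between(Z, -2, 2)
below_0 = Z.cdf(0)

print(round(within_1, 4), round(within_196, 4), round(within_2, 4), below_0)
```

The $\pm 2$ interval covers slightly more than 95%, which is why using 2 in place of 1.96 is a convenient approximation.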
19 The Normal Distribution
The standard normal is not that useful by itself. When we say "the normal distribution", we really mean a family of distributions.
We obtain pdfs in the normal family by shifting the bell curve around and spreading it out (or tightening it up).
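20 The Normal Distribution
We write $X \sim N(\mu, \sigma^2)$: a Normal distribution with mean $\mu$ and variance $\sigma^2$.
The parameter $\mu$ determines where the curve is: the center of the curve is $\mu$.
The parameter $\sigma$ determines how spread out the curve is. The area under the curve in the interval $(\mu - 2\sigma, \mu + 2\sigma)$ is approximately 95%:
$$\Pr(\mu - 2\sigma < X < \mu + 2\sigma) \approx 0.95$$
[Figure: normal pdf with $\mu - 2\sigma$, $\mu - \sigma$, $\mu$, $\mu + \sigma$, $\mu + 2\sigma$ marked on the horizontal axis]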
21 The Normal Distribution
Example: Below are the pdfs of $X_1 \sim N(0, 1)$, $X_2 \sim N(3, 1)$, and $X_3 \sim N(0, 16)$. Which pdf goes with which $X$?
[Figure: the three normal pdfs]
22 The Normal Distribution Example
Assume the annual returns on the SP500 are normally distributed with mean 6% and standard deviation 15%: $SP500 \sim N(6, 225)$. (Notice: $15^2 = 225$.)
Two questions: (i) What is the chance of losing money in a given year? (ii) What is the loss that has only a 2% chance of being matched or exceeded?
Lloyd Blankfein: "I spend 98% of my time thinking about 2% probability events!"
(i) $\Pr(SP500 < 0)$ and (ii) $\Pr(SP500 < \,?) = 0.02$
23 The Normal Distribution Example
[Figures: two normal pdfs over sp500, titled "prob less than 0" and "prob is 2%"]
(i) $\Pr(SP500 < 0) = 0.35$ and (ii) $\Pr(SP500 < -25) = 0.02$
In Excel: NORMDIST and NORMINV (homework!)
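The slide points to Excel's NORMDIST and NORMINV; as a rough Python counterpart, `statistics.NormalDist` offers `cdf` and `inv_cdf`. The sketch below uses the stated model (mean 6, standard deviation 15, in percent); the variable names are illustrative.

```python
from statistics import NormalDist

# Annual SP500 return in percent: mean 6, standard deviation 15 (variance 225).
sp500 = NormalDist(mu=6, sigma=15)

# (i) Chance of losing money in a given year: Pr(SP500 < 0).
p_loss = sp500.cdf(0)

# (ii) Return level with only a 2% chance of doing that badly or worse.
cutoff = sp500.inv_cdf(0.02)

print(round(p_loss, 3), round(cutoff, 1))
```

The results land close to the rounded figures quoted on the slide: roughly a one-in-three chance of a loss, and a 2% cutoff near a 25% loss.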
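24 The Normal Distribution
1. Note: In $X \sim N(\mu, \sigma^2)$, $\mu$ is the mean and $\sigma^2$ is the variance.
2. Standardization: if $X \sim N(\mu, \sigma^2)$ then
$$Z = \frac{X - \mu}{\sigma} \sim N(0, 1)$$
3. Summary: for $X \sim N(\mu, \sigma^2)$, $\mu$ says where the curve is, $\sigma$ says how spread out the curve is, and there is a 95% chance that $X \in \mu \pm 2\sigma$.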
25 The Normal Distribution: Another Example
Prior to the 1987 crash, monthly S&P500 returns ($r$) followed (approximately) a normal distribution with mean 0.012 and standard deviation equal to ... How extreme was the crash of ...? Standardization helps us interpret these numbers:
$$r \sim N(0.012, \ldots) \qquad z = \frac{r - 0.012}{\ldots} \sim N(0, 1)$$
For the crash, $z = \ldots = -5.27$.
How extreme is this z-value? 5 standard deviations away!!
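26 Mean and Variance of a Random Variable
Suppose $X \sim N(\mu, \sigma^2)$.
Suppose someone asks you for a prediction of $X$. What would you say?
Suppose someone asks you how sure you are. What would you say?
[Figure: normal pdf of $x$ with $\mu - 2\sigma$, $\mu - \sigma$, $\mu$, $\mu + \sigma$, $\mu + 2\sigma$ marked on the horizontal axis]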
27 Mean and Variance of a Random Variable
For the normal family of distributions we can see that the parameter $\mu$ tells us where the distribution is located or centered. We often use $\mu$ as our best guess for a prediction.
The parameter $\sigma$ tells us how spread out the distribution is. This gives us an indication of how uncertain or how risky our prediction is.
If $X$ is any random variable, the mean will be a measure of the location of the distribution and the variance will be a measure of how spread out it is.
28 The Mean and Variance of a Normal
For continuous distributions, the above formulas for $E(X)$ and $Var(X)$ get a bit more complicated, as we are adding up an infinite number of possible outcomes. Not to worry, the interpretation is still the same.
If $X \sim N(\mu, \sigma^2)$ then $E(X) = \mu$, $Var(X) = \sigma^2$, and $sd(X) = \sigma$.
[Figure: normal pdf with $\mu - 2\sigma$, $\mu - \sigma$, $\mu$, $\mu + \sigma$, $\mu + 2\sigma$ marked on the horizontal axis]
29 Two More (very important!) Formulas
Let $X$ and $Y$ be two random variables:
$$E(aX + bY) = aE(X) + bE(Y)$$
$$Var(aX + bY) = a^2 Var(X) + b^2 Var(Y) + 2ab\, Cov(X, Y)$$
We will get back to this later...
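Both formulas can be verified exactly on a small example. The sketch below uses a made-up joint distribution for $(X, Y)$ and made-up constants $a$ and $b$; none of these numbers come from the slides. It computes each side of the two identities from first principles.

```python
from fractions import Fraction as F

# Hypothetical joint pmf of (X, Y), chosen only for illustration.
joint = {(0, 0): F(3, 10), (0, 1): F(1, 10), (1, 0): F(2, 10), (1, 1): F(4, 10)}


def expect(f):
    """E[f(X, Y)] under the joint pmf."""
    return sum(p * f(x, y) for (x, y), p in joint.items())


a, b = 2, -3  # arbitrary constants

ex = expect(lambda x, y: x)
ey = expect(lambda x, y: y)
var_x = expect(lambda x, y: (x - ex) ** 2)
var_y = expect(lambda x, y: (y - ey) ** 2)
cov = expect(lambda x, y: (x - ex) * (y - ey))

# Left-hand sides: work directly with the combination aX + bY.
mean_lhs = expect(lambda x, y: a * x + b * y)
var_lhs = expect(lambda x, y: (a * x + b * y - mean_lhs) ** 2)

# Right-hand sides: the two formulas.
mean_rhs = a * ex + b * ey
var_rhs = a**2 * var_x + b**2 * var_y + 2 * a * b * cov

print(mean_lhs, mean_rhs, var_lhs, var_rhs, cov)
```

The covariance in this example is nonzero, so dropping the $2ab\,Cov(X, Y)$ term would give the wrong variance.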
30 Conditional, Joint and Marginal Distributions
In general we want to use probability to address problems involving more than one variable at a time.
We need to be able to describe what we think will happen to one variable relative to another. We want to answer questions like: How are my sales impacted by the overall economy?
31 Conditional, Joint and Marginal Distributions
Let $E$ denote the performance of the economy next quarter. For simplicity, say $E = 1$ if the economy is expanding and $E = 0$ if the economy is contracting (what kind of random variable is this?). Let's assume $E \sim \text{Bernoulli}(0.7)$.
Let $S$ denote my sales next quarter, and let's suppose the following probability statements:
[Tables: $\Pr(S \mid E = 1)$ and $\Pr(S \mid E = 0)$ for each value of $S$]
These are called conditional distributions.
32 Conditional, Joint and Marginal Distributions
[Tables: $\Pr(S \mid E = 1)$ and $\Pr(S \mid E = 0)$ for each value of $S$]
In blue is the conditional distribution of $S$ given $E = 1$.
In red is the conditional distribution of $S$ given $E = 0$.
We read: the probability of sales of 4 ($S = 4$) given (or conditional on) the economy growing ($E = 1$) is 0.25.
33 Conditional, Joint and Marginal Distributions
The conditional distributions tell us what can happen to $S$ for a given value of $E$. But what about $S$ and $E$ jointly?
$$\Pr(S = 4 \text{ and } E = 1) = \Pr(E = 1) \times \Pr(S = 4 \mid E = 1) = 0.7 \times 0.25 = 0.175$$
In English: 70% of the time the economy grows, and 1/4 of those times sales equal 4. 25% of 70% is 17.5%.
34 Conditional, Joint and Marginal Distributions
35 Conditional, Joint and Marginal Distributions
We call the probabilities of $E$ and $S$ together the joint distribution of $E$ and $S$. In general the notation is:
- $\Pr(Y = y, X = x)$ is the joint probability that the random variable $Y$ equals $y$ AND the random variable $X$ equals $x$.
- $\Pr(Y = y \mid X = x)$ is the conditional probability that the random variable $Y$ takes the value $y$ GIVEN that $X$ equals $x$.
- $\Pr(Y = y)$ and $\Pr(X = x)$ are the marginal probabilities of $Y = y$ and $X = x$.
36 Important relationships
Relationship between the joint and conditional:
$$\Pr(y, x) = \Pr(x)\Pr(y \mid x) = \Pr(y)\Pr(x \mid y)$$
Relationship between the joint and marginal:
$$\Pr(x) = \sum_{y} \Pr(x, y) \qquad \Pr(y) = \sum_{x} \Pr(x, y)$$
37 Conditional, Joint and Marginal Distributions
Why we call marginals marginals: the table represents the joint and, at the margins, we get the marginals.
[Table: joint distribution of $S$ and $E$, with the marginal distributions in the margins]
38 Conditional, Joint and Marginal Distributions
Example: Given $E = 1$, what is the probability of $S = 4$?
$$\Pr(S = 4 \mid E = 1) = \frac{\Pr(S = 4, E = 1)}{\Pr(E = 1)} = \frac{0.175}{0.7} = 0.25$$
39 Conditional, Joint and Marginal Distributions
Example: Given $S = 4$, what is the probability of $E = 1$?
$$\Pr(E = 1 \mid S = 4) = \frac{\Pr(S = 4, E = 1)}{\Pr(S = 4)} = \ldots$$
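The path from conditionals to the joint, the marginals, and back to a conditional can be sketched in a few lines. Only $\Pr(E = 1) = 0.7$ and $\Pr(S = 4 \mid E = 1) = 0.25$ are taken from the text; every other entry in the two conditional tables below is a placeholder assumed for illustration, so the final number is not the slide's answer.

```python
from fractions import Fraction as F

# Marginal distribution of the economy: E ~ Bernoulli(0.7).
p_e = {1: F(7, 10), 0: F(3, 10)}

# Conditional distributions Pr(S = s | E = e).
# ASSUMPTION: apart from Pr(S = 4 | E = 1) = 1/4, these entries are illustrative.
p_s_given_e = {
    1: {1: F(1, 20), 2: F(1, 5), 3: F(1, 2), 4: F(1, 4)},
    0: {1: F(1, 5), 2: F(3, 10), 3: F(3, 10), 4: F(1, 5)},
}

# Joint: Pr(S = s, E = e) = Pr(E = e) * Pr(S = s | E = e).
joint = {(s, e): p_e[e] * p for e, table in p_s_given_e.items() for s, p in table.items()}

# Marginal of S: sum the joint over the values of E.
p_s = {}
for (s, e), p in joint.items():
    p_s[s] = p_s.get(s, 0) + p

# Conditional the other way round: Pr(E = 1 | S = 4) = Pr(S = 4, E = 1) / Pr(S = 4).
p_e1_given_s4 = joint[(4, 1)] / p_s[4]

print(float(joint[(4, 1)]), float(p_s[4]), float(p_e1_given_s4))
```

Summing the joint over $S$ for a fixed $E$ recovers the marginal of $E$, which is a useful sanity check on any joint table.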
40 Bayes' Theorem
Disease testing example:
- Let $D = 1$ indicate you have a disease
- Let $T = 1$ indicate that you test positive for it
If you take the test and the result is positive, you are really interested in the question: Given that you tested positive, what is the chance you have the disease?
41 Bayes' Theorem
[Figure: disease-testing example; numerical values not recovered]
$$\Pr(D = 1 \mid T = 1) = \ldots$$
42 Bayes' Theorem
The computation of $\Pr(x \mid y)$ from $\Pr(x)$ and $\Pr(y \mid x)$ is called Bayes' theorem:
$$\Pr(x \mid y) = \frac{\Pr(y, x)}{\Pr(y)} = \frac{\Pr(y, x)}{\sum_{x} \Pr(y, x)} = \frac{\Pr(x)\Pr(y \mid x)}{\sum_{x} \Pr(x)\Pr(y \mid x)}$$
In the disease testing example:
$$\Pr(D = 1 \mid T = 1) = \frac{\Pr(T = 1 \mid D = 1)\Pr(D = 1)}{\Pr(T = 1 \mid D = 1)\Pr(D = 1) + \Pr(T = 1 \mid D = 0)\Pr(D = 0)} = \ldots$$
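The disease-testing formula is easy to wrap in a function. The numbers plugged in below (2% prevalence, 95% chance of a positive test when sick, 1% chance of a positive test when healthy) are assumptions made for this sketch, since the values used on the slide are not available here.

```python
def posterior_disease(prior, p_pos_given_disease, p_pos_given_healthy):
    """Pr(D = 1 | T = 1) via Bayes' theorem."""
    numerator = p_pos_given_disease * prior
    # Pr(T = 1): sum over both disease states (marginalizing the joint).
    evidence = numerator + p_pos_given_healthy * (1 - prior)
    return numerator / evidence


# ASSUMED inputs, for illustration only.
post = posterior_disease(prior=0.02, p_pos_given_disease=0.95, p_pos_given_healthy=0.01)
print(round(post, 3))
```

Even with an accurate test, the posterior under these assumed inputs is well below the test's 95% detection rate, because the disease is rare to begin with.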
43 Independence
Two random variables $X$ and $Y$ are independent if
$$\Pr(Y = y \mid X = x) = \Pr(Y = y)$$
for all possible $x$ and $y$. In other words, knowing $X$ tells you nothing about $Y$!
E.g., tossing a coin 2 times: what is the probability of getting H in the second toss given we saw a T in the first one?
44 Independence
We can extend the notion of independence to any number of variables. For example, $Y_1$ is independent of $Y_2$ and $Y_3$ if
$$\Pr(Y_1 = y_1 \mid Y_2 = y_2, Y_3 = y_3) = \Pr(Y_1 = y_1)$$
45 IID
Suppose you are about to toss a coin $n$ times. Let $Y_i = 1$ if the $i$-th toss is a head and 0 otherwise.
What is $\Pr(Y_{29} = 1)$? What is $\Pr(Y_{29} = 1 \mid Y_{27} = 0, Y_{28} = 1)$? What is $\Pr(Y_{57} = 1)$?
46 IID
Each $Y_i \sim \text{Bernoulli}(0.5)$.
Each $Y_i$ is independent of all the others.
Hence IID: independent and identically distributed.
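A quick simulation illustrates what IID means for coin tosses: the frequency of heads is about the same whether or not we condition on the previous toss. This is a sketch with an arbitrary seed and sample size, using Python's pseudo-random generator as a stand-in for a fair coin.

```python
import random

rng = random.Random(0)  # fixed seed so the run is reproducible
n = 100_000

# Y_i = 1 for heads, 0 for tails, each an independent Bernoulli(0.5) draw.
tosses = [1 if rng.random() < 0.5 else 0 for _ in range(n)]

# Unconditional frequency of heads.
p_head = sum(tosses) / n

# Frequency of heads among tosses that immediately follow a tail.
after_tail = [tosses[i] for i in range(1, n) if tosses[i - 1] == 0]
p_head_after_tail = sum(after_tail) / len(after_tail)

print(round(p_head, 3), round(p_head_after_tail, 3))
```

Both frequencies should sit near 0.5, matching the idea that earlier tosses carry no information about later ones.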
47 A First Modeling Exercise
I have US$ 1,000 invested in the Brazilian stock index, the IBOVESPA. I need to predict tomorrow's value of my portfolio.
I also want to know how risky my portfolio is. In particular, I want to know how likely I am to lose more than 3% of my money by the end of tomorrow's trading session.
What should I do?
48 IBOVESPA - Data
[Figures: density of BOVESPA daily returns; daily returns plotted by date]
49 As a first modeling decision, let's call the random variable associated with daily returns on the IBOVESPA $X$ and assume that returns are independent and identically distributed as
$$X \sim N(\mu, \sigma^2)$$
Question: What are the values of $\mu$ and $\sigma^2$?
We need to estimate these values from the sample at hand ($n = 113$ observations).
50 Let's assume that each observation in the random sample $\{x_1, x_2, x_3, \ldots, x_n\}$ is independent and distributed according to the model above, i.e., $x_i \sim N(\mu, \sigma^2)$.
A usual strategy is to estimate $\mu$ and $\sigma^2$, the mean and the variance of the distribution, via their sample counterparts, the sample mean ($\bar{X}$) and the sample variance ($s^2$):
$$\bar{X} = \frac{1}{n} \sum_{i=1}^{n} x_i \qquad s^2 = \frac{1}{n - 1} \sum_{i=1}^{n} \left(x_i - \bar{X}\right)^2$$
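The two estimators can be computed by hand and cross-checked against the standard library. The returns below are invented for the sketch (the IBOVESPA sample itself is not reproduced here); the last step plugs the estimates into a normal model to get a tail probability of the kind the exercise asks about.

```python
import statistics
from math import sqrt
from statistics import NormalDist

# HYPOTHETICAL daily returns in percent, for illustration only.
returns = [0.5, -1.2, 2.1, 0.3, -0.8, 1.4, -2.0, 0.9]
n = len(returns)

# Sample mean and sample variance (note the n - 1 divisor).
x_bar = sum(returns) / n
s2 = sum((x - x_bar) ** 2 for x in returns) / (n - 1)

# Fitted model X ~ N(x_bar, s2) and the chance of losing more than 3% in a day.
model = NormalDist(mu=x_bar, sigma=sqrt(s2))
p_big_loss = model.cdf(-3)

print(x_bar, s2, p_big_loss)
```

`statistics.mean` and `statistics.variance` implement the same two formulas, so they make a convenient check on the manual computation.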
51 For the IBOVESPA data at hand, $\bar{X} = 0.04$ and $s^2 = \ldots$
[Figure: density of BOVESPA daily returns, with the fitted normal curve in red]
The red line represents our model, i.e., the normal distribution with mean and variance given by the estimated quantities $\bar{X}$ and $s^2$.
What is $\Pr(X < -3)$?
52 Models, Parameters, Estimates...
In general we talk about unknown quantities using the language of probability, following these steps:
- Define the random variables of interest
- Define a model (or probability distribution) that describes the behavior of the RV of interest
- Based on the data available, estimate the parameters defining the model
- We are now ready to describe possible scenarios, generate predictions, make decisions, evaluate risk, etc.
53 Oracle vs. SAP Example (understanding variation)
54 Oracle vs. SAP
Do we buy the claim from this ad?
- We have a dataset of 81 firms that use SAP.
- The industry ROE is 15% (also an estimate, but let's assume it is true).
- We assume that the random variable $X$ represents the ROE of SAP firms and can be described by $X \sim N(\mu, \sigma^2)$.
[Table: $\bar{X}$ and $s^2$ for the SAP firms]
Well, ...! I guess the ad is correct, right? Not so fast...
55 Oracle vs. SAP
Let's assume that the ROE of firms using SAP is, on average, the same as the industry's. Assume further that $s^2$ is a good estimate of the variance:
$$ROE \sim N(0.15, 0.065)$$
In a sample of 81 firms, how often can we expect the sample mean to be below 0.15? What does this mean if I am trying to compare the profitability of firms using SAP versus the industry?
56 Oracle vs. SAP
Let's do a little simulation: generate 1000 different samples of size 81 from a $N(0.15, 0.065)$ and plot the histogram of $\bar{X}$.
Now, what do you think about the ad?
[Figure: histogram of the sample mean (density)]
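The simulation described here can be reproduced along the following lines. This is a sketch, not the original course code: the seed is arbitrary, 0.065 is treated as a variance (so the standard deviation is its square root), and the histogram is replaced by a few summary numbers.

```python
import random
import statistics

rng = random.Random(42)  # arbitrary seed for reproducibility

mu, var = 0.15, 0.065    # model for ROE: N(0.15, 0.065), second argument a variance
sigma = var ** 0.5
n, n_samples = 81, 1000

# Draw 1000 samples of 81 firms each and record every sample mean.
sample_means = []
for _ in range(n_samples):
    sample = [rng.gauss(mu, sigma) for _ in range(n)]
    sample_means.append(statistics.fmean(sample))

mean_of_means = statistics.fmean(sample_means)
sd_of_means = statistics.stdev(sample_means)
theoretical_se = (var / n) ** 0.5

# Share of sample means within two standard errors of mu.
share_within_2se = sum(abs(m - mu) <= 2 * theoretical_se for m in sample_means) / n_samples

print(round(mean_of_means, 4), round(sd_of_means, 4), round(theoretical_se, 4), share_within_2se)
```

The spread of the simulated means shows how far a sample mean of 81 firms can fall from 0.15 by chance alone, which is the point of the exercise.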
57 Sampling Distribution of Sample Mean
Consider the mean of an iid sample of $n$ observations of a random variable, $\{X_1, \ldots, X_n\}$.
Suppose that $E(X_i) = \mu$ and $var(X_i) = \sigma^2$. Then
$$E(\bar{X}) = \frac{1}{n} \sum E(X_i) = \mu$$
$$var(\bar{X}) = var\left(\frac{1}{n} \sum X_i\right) = \frac{1}{n^2} \sum var(X_i) = \frac{\sigma^2}{n}$$
If $X$ is normal, then $\bar{X} \sim N\left(\mu, \frac{\sigma^2}{n}\right)$.
This is called the sampling distribution of the mean.
58 Sampling Distribution of Sample Mean
The sampling distribution of $\bar{X}$ describes how our estimate would vary over different datasets of the same size $n$. It provides us with a vehicle to evaluate the uncertainty associated with our estimate of the mean.
It turns out that $s^2$ is a good proxy for $\sigma^2$, so that we can approximate the sampling distribution by
$$\bar{X} \sim N\left(\mu, \frac{s^2}{n}\right)$$
We call $\sqrt{s^2 / n}$ the standard error of $\bar{X}$; it is a measure of its variability. I like the notation
$$s_{\bar{X}} = \sqrt{\frac{s^2}{n}}$$
59 Sampling Distribution of Sample Mean
$$\bar{X} \sim N\left(\mu, s^2_{\bar{X}}\right)$$
$\bar{X}$ is unbiased: $E(\bar{X}) = \mu$. On average, $\bar{X}$ is right!
$\bar{X}$ is consistent: as $n$ grows, $s^2_{\bar{X}} \to 0$, i.e., with more information, eventually $\bar{X}$ correctly estimates $\mu$!
60 Back to the Oracle vs. SAP example
Our simulation was done assuming that $\mu = 0.15$; in that case
$$\bar{X} \sim N\left(0.15, \frac{0.065}{81}\right)$$
[Figure: histogram of the sample mean with the sampling distribution overlaid]
61 Confidence Intervals
$\bar{X} \sim N\left(\mu, s^2_{\bar{X}}\right)$, so $(\bar{X} - \mu) \sim N\left(0, s^2_{\bar{X}}\right)$, right?
What is a good prediction for $\mu$? What is our best guess? $\bar{X}$.
How do we make mistakes? How far from $\mu$ can we be? 95% of the time, within $\pm 2 s_{\bar{X}}$.
$[\bar{X} \pm 2 s_{\bar{X}}]$ gives a 95% range of plausible values for $\mu$. This is called the 95% Confidence Interval for $\mu$.
62 Oracle vs. SAP example... one more time
In this example, $\bar{X} \approx 0.126$, $s^2 = 0.065$ and $n = 81$; therefore $s^2_{\bar{X}} = 0.065 / 81 \approx 0.0008$, so the 95% confidence interval for the ROE of SAP firms is
$$[\bar{X} - 2 s_{\bar{X}};\ \bar{X} + 2 s_{\bar{X}}] = [0.069;\ 0.183]$$
Is 0.15 a plausible value? What does that mean?
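The interval $[0.069; 0.183]$ can be reproduced from the $\bar{X} \pm 2 s_{\bar{X}}$ recipe. In the sketch below, $s^2 = 0.065$ and $n = 81$ come from the example, while the sample mean of 0.126 is an assumption inferred from the midpoint of the reported interval; the function name is illustrative.

```python
from math import sqrt


def confidence_interval_95(x_bar, s2, n):
    """Approximate 95% CI for the mean: x_bar plus or minus 2 standard errors."""
    se = sqrt(s2 / n)  # standard error of the sample mean
    return x_bar - 2 * se, x_bar + 2 * se


# x_bar = 0.126 is ASSUMED (midpoint of the reported interval).
lo, hi = confidence_interval_95(x_bar=0.126, s2=0.065, n=81)
industry_roe = 0.15

print(round(lo, 3), round(hi, 3), lo <= industry_roe <= hi)
```

Since 0.15 falls inside the interval, the data do not rule out SAP firms having the same average ROE as the industry.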
63 Estimating Proportions... another modeling example
Your job is to manufacture a part. Each time you make a part, it is defective or not. Below we have the results from 100 parts you just made: $Y_i = 1$ means a defect, 0 a good one. How would you predict the next one?
[Figure: $y$ (0 or 1) plotted against index for the 100 parts]
Would you model these $Y$ as iid Bernoulli draws for some $p$? There are 18 ones and 82 zeros.
64 In this case, it might be reasonable to model the defects as iid Bernoulli(0.18). We can't be sure this is right, but the data look like the kind of thing we would get if we had iid draws with that $p$!
If we believe our model, what is the chance that the next 10 are good? $0.82^{10} \approx 0.137$.
65 We used the proportion of defects in our sample to estimate $p$, the true, long-run proportion of defects. Could this estimate be wrong?!
Let $\hat{p}$ denote the sample proportion. The standard error associated with the sample proportion as an estimate of the true proportion is:
$$s_{\hat{p}} = \sqrt{\frac{\hat{p}\,(1 - \hat{p})}{n}}$$
66 Suppose we have iid Bernoulli data and estimate the true $p$ by the observed sample proportion of 1's, $\hat{p}$. The (approximate) 95% confidence interval for the true proportion is:
$$\hat{p} \pm 2 s_{\hat{p}}$$
67 Defects: In our defect example we had $\hat{p} = .18$ and $n = 100$. This gives
$$s_{\hat{p}} = \sqrt{\frac{(.18)(.82)}{100}} = .04$$
The confidence interval is $.18 \pm .08 = (.1, .26)$, big!
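The defect numbers can be reproduced with a few lines of arithmetic. This sketch keeps more decimals than the slide, which rounds the standard error to .04; it also evaluates the chance that the next 10 parts are all good under the Bernoulli(0.18) model. The function name is illustrative.

```python
from math import sqrt


def proportion_ci(p_hat, n):
    """Standard error and approximate 95% CI (plus or minus 2 SE) for a sample proportion."""
    se = sqrt(p_hat * (1 - p_hat) / n)
    return se, (p_hat - 2 * se, p_hat + 2 * se)


p_hat, n = 18 / 100, 100  # 18 defects among 100 parts
se, (lo, hi) = proportion_ci(p_hat, n)

# Under iid Bernoulli(0.18), ten good parts in a row has probability 0.82 ** 10.
p_next_10_good = (1 - p_hat) ** 10

print(round(se, 4), round(lo, 3), round(hi, 3), round(p_next_10_good, 3))
```

With only 100 parts the interval is wide: the long-run defect rate could plausibly be anywhere from about 10% to about 26%.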
68 Polls: yet another example...
If we take a relatively small random sample from a large population and ask each respondent yes or no, with yes $Y_i = 1$ and no $Y_i = 0$, then, approximately, $Y_i \sim \text{Bernoulli}(p)$, where $p$ is the true population proportion of yes.
Suppose, as is common, $n = 1000$ and $\hat{p} \approx .5$. Then
$$s_{\hat{p}} = \sqrt{\frac{(.5)(.5)}{1000}} = .0158$$
The standard error is .0158, so the $\pm$ is .0316, or about $\pm 3\%$.
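The familiar "plus or minus 3%" of a poll follows from the same standard error formula. The sketch below computes the margin for a few sample sizes; sizes other than 1000 are added only to show how the margin shrinks, and `margin_of_error` is an illustrative name.

```python
from math import sqrt


def margin_of_error(p_hat, n):
    """Half-width of the approximate 95% interval: 2 standard errors."""
    return 2 * sqrt(p_hat * (1 - p_hat) / n)


moe_1000 = margin_of_error(0.5, 1000)

# Quadrupling the sample size halves the margin of error.
for n in (250, 1000, 4000):
    print(n, round(margin_of_error(0.5, n), 4))
```

Because the margin scales with $1/\sqrt{n}$, cutting it in half requires four times as many respondents.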
More informationChapter 4 Continuous Random Variables and Probability Distributions