Prof. Thistleton MAT 505 Introduction to Probability Lecture 3


Sections from Text and MIT Video Lecture: Sections 2.1 through 2.5
http://ocw.mit.edu/courses/electrical-engineering-and-computer-science/6-041-probabilistic-systems-analysis-and-applied-probability-fall-2010/video-lectures/lecture-1-probability-models-and-axioms/

Topics from Syllabus: Histograms, Sets and Basic Probability

Review and Looking Ahead

We have seen how to summarize data graphically with the dotplot, stem and leaf plot, boxplot, and the histogram. We have also seen how to summarize a data set with measures of central tendency (mean and median) and measures of variability or spread (range, IQR, and variance/standard deviation).

Skew and Kurtosis

We quickly note that there are other features of a data set (or of the generating mechanism, model, or random variable). For instance, one way to measure the prominence of the right or left tail of a distribution (i.e., how the distribution deviates from a symmetrical or mound shape) is the skew. Right-skewed distributions trail off to the right, and similarly for left-skewed distributions. A clever way to measure this is to compare the median to the mean and say that a distribution is skewed to the right (i.e., towards larger values) if the mean is further to the right than the median. A conventional measure of this is the skew, defined by Pearson as

    Skew_Pearson = 3 (mean - median) / (standard deviation)

Another characteristic of a data set (or model!) worth noting is the extent to which it is peaked or flat. We will see later that quantities called moments are especially useful in probability and statistics. An nth-order moment of a distribution is just the average or expected value of the distribution raised to the nth power. For example, the mean is called the first moment, the variance is the second (central) moment, an alternate definition of the skew involves the third moment, etc. To measure the peakedness of a data set it is common to use the kurtosis,

    kurtosis = Σ (x_i - x̄)^4 / [(n - 1) s^4]

SUNY POLY Page 1
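The two formulas above are easy to check numerically. Here is a minimal sketch in Python (the function names are ours, not from any library) computing Pearson's skew and the kurtosis exactly as defined:

```python
import statistics

def pearson_skew(data):
    """Pearson's skew: 3 * (mean - median) / standard deviation."""
    s = statistics.stdev(data)
    return 3 * (statistics.mean(data) - statistics.median(data)) / s

def kurtosis(data):
    """Kurtosis as defined above: sum((x_i - xbar)^4) / ((n - 1) * s^4)."""
    n = len(data)
    m = statistics.mean(data)
    s = statistics.stdev(data)
    return sum((x - m) ** 4 for x in data) / ((n - 1) * s ** 4)

# A perfectly symmetric data set has mean = median, so zero Pearson skew.
print(pearson_skew([1, 2, 3, 4, 5]))  # → 0.0
```

For right-skewed data (mean pulled above the median) the function returns a positive number, matching the verbal rule in the text.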

Relative and Modified Relative Frequency Histograms

Now we would like to produce a plot, called a relative frequency histogram, defined in such a way that we keep track of the percentage of data within each bin instead of the frequency or count. That is, we need to take the count in each bin and divide it by the total number of data points:

    binrelativefrequencies = bincounts/sum(bincounts)

Clearly, if we sum the relative frequencies we get the number 1. That is,

    Σ_{i=1}^{number of bins} (relative frequency)_i = 1

This is terrific if we have a discrete random variable. For continuous random variables, however, we would like to be able to integrate the histogram and obtain a 1. That is, we would like

    ∫_a^b f(x) dx = 1

What can we do in this case? The area under the relative frequency histogram is

    AREA = Σ_{i=1}^{number of bins} (relative frequency)_i × (bin width)_i

Since we have a constant bin width, this is just

    AREA = (bin width) × Σ_{i=1}^{number of bins} (relative frequency)_i

So, if we would like the area to be 1, we can perform the following maneuver: divide each relative frequency by the bin width, so that

    Σ_{i=1}^{number of bins} [(relative frequency)_i / (bin width)] × (bin width) = 1
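The maneuver above can be carried out in a few lines. This Python sketch (the function name and the toy counts are ours) converts bin counts into modified relative frequencies and confirms that the resulting area is 1:

```python
def modified_relative_frequency(counts, bin_width):
    """Divide each relative frequency by the bin width so that the
    histogram's total area equals 1 (a density histogram)."""
    total = sum(counts)
    rel = [c / total for c in counts]    # relative frequencies: sum to 1
    return [r / bin_width for r in rel]  # heights of the modified histogram

counts = [2, 5, 8, 4, 1]
heights = modified_relative_frequency(counts, bin_width=0.5)

# Area = sum over bins of height × bin width, which should be 1.
area = sum(h * 0.5 for h in heights)
print(area)
```

This is exactly what plotting libraries do when you ask for a "density" histogram rather than a frequency histogram.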

Since this is so close to a relative frequency histogram, we call it a modified relative frequency histogram. It is one of the fundamental constructs of our semester!

Histograms: How many bins should I use?!?

Consider the following situation: a population is composed of two subpopulations which are mixed together. This is not an exotic situation! Just think of any group of people: you will most likely have both men and women. With people this is obvious, but can you tell the difference between male and female catfish? We can create a mixed population with the commands

    # 1E6 random deviates centered at 100 with standard deviation of 10
    data1 = rnorm(1e6, 100, 10)
    # 1E6 random deviates centered at 125 with standard deviation of 10
    data2 = rnorm(1e6, 125, 10)
    # merge the data sets to create one large set
    data = c(data1, data2)

What we have done is created a sample of size n = 1E6 from a subpopulation which is symmetrical and mound shaped and centered at μ = 100 and mixed it in with another population which is also symmetrical and mound shaped, but centered at μ = 125. If we create a histogram with the command

    hist(data, 10)

we get a big lump of data. How would you describe this shape? If we instead use the command

    hist(data, 100)

we obtain the second figure. How would you describe this shape?
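The same experiment can be sketched in pure Python without R. This is only an illustrative sketch (the `bin_counts` helper is ours, and we use a smaller sample than 1E6 to keep it quick), but it exhibits the same effect: coarse binning hides the two humps.

```python
import random

random.seed(0)
# Mix two mound-shaped subpopulations, as in the R commands above.
data = ([random.gauss(100, 10) for _ in range(10000)]
        + [random.gauss(125, 10) for _ in range(10000)])

def bin_counts(data, num_bins):
    """Count how many observations fall in each of num_bins
    equal-width bins spanning the range of the data."""
    lo, hi = min(data), max(data)
    width = (hi - lo) / num_bins
    counts = [0] * num_bins
    for x in data:
        i = min(int((x - lo) / width), num_bins - 1)  # clamp the max value
        counts[i] += 1
    return counts

# With 10 bins the two humps blur into one big lump; with 100 bins
# the bimodal shape (peaks near 100 and 125) becomes visible.
print(bin_counts(data, 10))
```

Plotting `bin_counts(data, 100)` as a bar chart reproduces the second figure's two-peaked shape.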

[Figure: "Histogram of data" for the mixed sample, frequency (0 to 40,000) versus data value (roughly 60 to 160)]

A nice paper discussing the choice of the number of bins is "Data-Based Choice of Histogram Bin Width" by M. P. Wand, available at our Angel site.

Basic Probability

One recent semester I gathered the following data on some students:

                   Gender
    Owns a Pet   Male   Female
    Yes           15      27
    No             6       3

Based upon these data, do you think that there is evidence that men and women own pets at different rates? How can we develop a theory to answer this sort of question? Note that we're not asking whether these particular men and women own pets at the same rate (since they obviously do not) but whether we can infer something about the population from which the students are drawn. To answer questions like this we need to know a little probability.

As another example, if you and I are flipping a coin and I receive $1 for Heads while you receive $1 for Tails, you might begin to suspect that I'm cheating if I win 20 times in a row. How about 5 times? 10 times?

To get us started, consider the following experiment. I've taken what I believe to be a fair coin and tossed it 10,000 times. Here are my data:
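As a quick sanity check on the cheating question: since fair-coin tosses are independent, the probability of heads on k specified tosses in a row is (1/2)^k, which we can compute directly:

```python
# Probability a fair coin lands heads k times in a row:
# independent tosses, so the per-toss probabilities multiply.
for k in (5, 10, 20):
    print(k, 0.5 ** k)
```

Five in a row happens about 3% of the time (suspicious but plausible); twenty in a row happens about once in a million fair games, so suspicion is well founded.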

    #tosses                     1000  2000  3000  4000  5000  6000  7000  8000  9000  10000
    Cumulative number of HEADS   509  1012  1513  2016  2488  3015  3545  4050  4565   5059

We can start to notice a few things. First, by doing a little subtracting we can see that on the first 1000 tosses I got 509 heads, on the second 1000 I got 503, on the third 501, and so on (503, 472, 527, 530, 505, 515, and 494). It's impossible to say exactly how many you'll get on 1000 tosses, but they all seem to be near 500, even though it's possible to get as few as 0 and as many as 1000. A simple plot gives us an idea here. If we know that we have obtained 509 heads on the first 1000 tosses then at this point we have a relative frequency of 509/1000 = 0.509. In general, define

    relative frequency of successes = (number of times we succeed at a task) / (number of times we attempt the task)

[Figure: relative frequency of heads (vertical axis, roughly 0.4 to 0.55) versus number of tosses (0 to 10,000)]

Intuitively, we think of the probability of an event (i.e., the outcome of an experiment) as the number of times an event will occur relative to the number of times the experiment is performed. Of course, we have to be careful about how we denote (or think about) experiments, outcomes, and events. Given this way of thinking, a few properties of what we call probability will be obvious.
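The table and plot above can be reproduced with a short simulation. This sketch (the function name is ours) tosses a simulated fair coin and records the running relative frequency of heads every 1000 tosses:

```python
import random

def relative_frequency_of_heads(num_tosses, seed=1):
    """Simulate fair-coin tosses; return (n, relative frequency)
    pairs recorded after every 1000 tosses."""
    rng = random.Random(seed)
    heads = 0
    trace = []
    for n in range(1, num_tosses + 1):
        heads += rng.random() < 0.5  # one fair toss: heads with prob 1/2
        if n % 1000 == 0:
            trace.append((n, heads / n))
    return trace

# The relative frequency wanders early on but settles near 0.5,
# just as in the 10,000-toss table above.
for n, rf in relative_frequency_of_heads(10000):
    print(n, round(rf, 3))
```

Changing the seed changes the individual values but not the pattern: the relative frequency stabilizes near the probability as the number of tosses grows.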

Since we are counting a given number of successful outcomes and dividing by the number of trials:

    Probabilities can never be negative.
    Probabilities can never be greater than one.

Let's start with a few loose definitions. Look these up in your book and write the definitions in the space provided.

1. Experiment
2. Outcome
3. Sample Space
4. Event

Given everything we've been discussing, how would you define the probability of obtaining HEADS on a toss of a fair coin?

The Classical Notion of Probability

Very often we encounter situations in which exactly one out of a certain number of outcomes will occur. For example, when you roll a fair die you may obtain a one, two, three, four, five, or a six. If the die is fair then most people would conclude that the probability of any individual outcome is exactly 1/6. So far so good. Now suppose that you buy a lottery ticket. In this case you might win 0 times (i.e., you lose) or 1 time (i.e., you win). What is the probability that you win? (If you said 1 out of 2, or 1/2, please think again. If that were true I'd stop doing mathematics and just buy lottery tickets for a living.)

In general, suppose you have a sample space with a finite (10, 20, 1,000,000, etc.) number of outcomes, say N of them. If each of these outcomes is equally likely, then the probability of an event made up of n outcomes is

    Probability of Event = (number of outcomes in event) / (number of outcomes in sample space) = n/N
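The classical formula n/N is simple to encode. Here is a small sketch (the function name is ours) using exact fractions, with the fair die as the running example; note the equally-likely assumption is what makes the formula valid, and it is exactly what fails for the lottery ticket:

```python
from fractions import Fraction

def classical_probability(event, sample_space):
    """P(E) = (outcomes in event) / (outcomes in sample space).
    Valid ONLY when every outcome is equally likely."""
    return Fraction(len(event), len(sample_space))

die = {1, 2, 3, 4, 5, 6}
even = {2, 4, 6}
print(classical_probability(even, die))  # → 1/2
```

Using `Fraction` keeps answers in the exact n/N form the formula produces, instead of decimal approximations.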

Example: Suppose you toss two fair coins. Define your sample space. How many outcomes are in this sample space?

Our sample space has four outcomes:

    Heads on first and Heads on second, denoted HH
    Heads on first and Tails on second, denoted HT
    Tails on first and Heads on second, denoted TH
    Tails on first and Tails on second, denoted TT

What is the probability that, when you toss two fair coins, you get a matching pair (both heads or both tails)? Our event is comprised of the two simple events (outcomes) HH and TT. Since there are two outcomes in this event we say that n = 2. Since our sample space has N = 4 outcomes total, and since each of these outcomes is equally likely (the coins are fair), the probability of our event is

    Probability of Event = (number of outcomes in event) / (number of outcomes in sample space) = 2/4

Example: What is the probability that, when you roll two fair dice, the sum will equal 5? We first have to figure out what our sample space looks like. Denote the simple event "one on the first and one on the second" as 1,1 and so on. Then our sample space looks like:

    1,1  2,1  3,1  4,1  5,1  6,1
    1,2  2,2  3,2  4,2  5,2  6,2
    1,3  2,3  3,3  4,3  5,3  6,3
    1,4  2,4  3,4  4,4  5,4  6,4
    1,5  2,5  3,5  4,5  5,5  6,5
    1,6  2,6  3,6  4,6  5,6  6,6

Now decide which outcomes comprise our event. If the individual rolls must add to 5 we must be talking about 1,4 and 2,3 and 3,2 and 4,1 (shown in brackets):

     1,1   2,1   3,1  [4,1]  5,1   6,1
     1,2   2,2  [3,2]  4,2   5,2   6,2
     1,3  [2,3]  3,3   4,3   5,3   6,3
    [1,4]  2,4   3,4   4,4   5,4   6,4
     1,5   2,5   3,5   4,5   5,5   6,5
     1,6   2,6   3,6   4,6   5,6   6,6

And, since each of these outcomes is equally likely, at this point we can see that

    Probability of Event = (number of outcomes in event) / (number of outcomes in sample space) = 4/36

What is the probability that, when you roll two fair dice, the sum will equal 5 or 6? Now the event also picks up the five outcomes that sum to 6 (again in brackets):

     1,1   2,1   3,1  [4,1] [5,1]  6,1
     1,2   2,2  [3,2] [4,2]  5,2   6,2
     1,3  [2,3] [3,3]  4,3   5,3   6,3
    [1,4] [2,4]  3,4   4,4   5,4   6,4
    [1,5]  2,5   3,5   4,5   5,5   6,5
     1,6   2,6   3,6   4,6   5,6   6,6

    Probability of Event = (number of outcomes in event) / (number of outcomes in sample space) = 9/36

What is the probability that your roll will include exactly one even number?

Exactly one even number means one die is even and the other is odd (event outcomes in brackets):

     1,1  [2,1]  3,1  [4,1]  5,1  [6,1]
    [1,2]  2,2  [3,2]  4,2  [5,2]  6,2
     1,3  [2,3]  3,3  [4,3]  5,3  [6,3]
    [1,4]  2,4  [3,4]  4,4  [5,4]  6,4
     1,5  [2,5]  3,5  [4,5]  5,5  [6,5]
    [1,6]  2,6  [3,6]  4,6  [5,6]  6,6

    Probability of Event = (number of outcomes in event) / (number of outcomes in sample space) = 18/36

Finally, what is the probability that our roll will have exactly one even number or add to 5? What do we mean by "or"?

Basic Rules of Probability

We can take the intuition we've developed up to this point and abstract a bit in order to handle situations which are more complicated. We can indicate the probability of an event E by the number P(E). We have the following axioms:

The probability of an event is a real number greater than or equal to zero:

    P(E) ≥ 0

The probability of the sample space is exactly one:
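All three dice probabilities above can be double-checked by enumerating the 36 equally likely outcomes. This sketch (helper names ours) does the counting with exact fractions:

```python
from fractions import Fraction

# All 36 equally likely outcomes for two fair dice: (first, second).
sample_space = [(i, j) for i in range(1, 7) for j in range(1, 7)]

def prob(event):
    """Classical probability: outcomes satisfying the event / 36."""
    favorable = [o for o in sample_space if event(o)]
    return Fraction(len(favorable), len(sample_space))

print(prob(lambda o: o[0] + o[1] == 5))                    # → 1/9  (4/36)
print(prob(lambda o: o[0] + o[1] in (5, 6)))               # → 1/4  (9/36)
print(prob(lambda o: (o[0] % 2 == 0) != (o[1] % 2 == 0)))  # → 1/2  (18/36)
```

The third event uses `!=` on the two parity tests as an exclusive-or, capturing "exactly one even number."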

    P(S) = 1

If two events A and B are mutually exclusive (they cannot both occur simultaneously), then the probability of obtaining an outcome in one or the other of these events is

    P(A ∪ B) = P(A) + P(B)

We can extend this: if we have an infinite collection of disjoint events, then

    P(A_1 ∪ A_2 ∪ A_3 ∪ ...) = P(A_1) + P(A_2) + P(A_3) + ...

that is,

    P(∪_i A_i) = Σ_i P(A_i)
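The additivity axiom can be checked concretely on the two-dice sample space from the examples above: "sum is 5" and "sum is 6" are mutually exclusive, so their probabilities simply add.

```python
from fractions import Fraction

sample_space = [(i, j) for i in range(1, 7) for j in range(1, 7)]
A = {o for o in sample_space if o[0] + o[1] == 5}  # 4 outcomes
B = {o for o in sample_space if o[0] + o[1] == 6}  # 5 outcomes

def p(event):
    return Fraction(len(event), len(sample_space))

# Mutually exclusive: no outcome gives a sum of both 5 and 6.
print(A & B)                      # → set()
print(p(A | B) == p(A) + p(B))    # → True
```

For events that are not mutually exclusive (say "sum is 5" and "first die is 1") simple addition double-counts the overlap, which is why the axiom requires disjointness.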