

Chapter 8 Measures of Center

Data that can be any numerical value are called continuous. These are usually things that are measured, such as height, length, time, speed, etc. Data that can only be integer values are called discrete. These are usually things that are counted, such as apples, owls, cars, books, etc.

There are three statistics that are used to measure the center of a data set: mean, median, and mode.

mean: the average value
median: the middle value in an ordered data set
mode: the data value that occurs most often

Example: The data below represent the heights, in inches, of ten high school basketball players:

65 66 66 67 67 68 68 68 70 75

Find the mean, median, and mode.

mean = 68
median = 67.5
mode = 68

An extreme value in a data set is called an outlier. Sometimes outliers are excluded before the data are analyzed.

Some examples of displays of data:

Stem and Leaf Plot

stem | leaf
2 | 3 4 6 6
3 | 0 5
4 | 2 4 4 7 9
where 2 | 3 means 2.3

Histogram (for continuous data)
[Figure: histogram "Height of Trees"; frequency against height (meters)]

Column Graph (for discrete data)
[Figure: column graph "Number of Cats"; frequency against number of cats]
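The three measures of center for the basketball heights example can be checked with Python's standard `statistics` module. This is a minimal sketch; the variable names are illustrative and not part of the original notes.

```python
from statistics import mean, median, multimode

# Heights in inches of the ten players from the example.
heights = [65, 66, 66, 67, 67, 68, 68, 68, 70, 75]

center_mean = mean(heights)        # sum of the values divided by how many there are
center_median = median(heights)    # average of the two middle values when n is even
center_modes = multimode(heights)  # list of the most frequent value(s)

print(center_mean, center_median, center_modes)
```

`multimode` returns a list because a data set can have more than one mode.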

Chapter 8 Measures of Dispersion

Another way to analyze data is using measures of dispersion: range and interquartile range. These show how widely the data vary.

The range is the difference between the highest and lowest data values.

The median divides the data set into two halves. The median of the lower half is called the lower quartile ($Q_1$), and the median of the upper half is called the upper quartile ($Q_3$). The interquartile range (IQR) is the difference between the upper and lower quartiles, $Q_3 - Q_1$.

The data set is divided into quarters by the lower quartile ($Q_1$), the median ($Q_2$), and the upper quartile ($Q_3$). 25% of the data are less than $Q_1$, 50% are less than $Q_2$, and 75% are less than $Q_3$. These are also called the 25th, 50th, and 75th percentiles.

The upper and lower quartiles along with the median and the minimum and maximum values form the five number summary of the data set. A box and whisker plot is a type of graph that shows the dispersion of data, including the five number summary.

Example: For the data set below, find the five number summary, draw a box and whisker plot, and state the range and the interquartile range.

4 5 9 5 7 8 7 3 5 6 3 4 3 2 5 1

ordered data (n = 16): 1 2 3 3 3 4 4 5 5 5 5 6 7 7 8 9

min = 1, $Q_1$ = 3, median = 5, $Q_3$ = 6.5, max = 9

[Figure: box and whisker plot of the data on a number line from 0 to 10]

The range is 9 - 1 = 8. The interquartile range is 6.5 - 3 = 3.5.
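The median-of-halves method described above can be sketched in Python as follows. The function name `five_number_summary` is hypothetical, and the data list assumes the example has sixteen values with a minimum of 1, which is what its stated range of 8 and its quartiles imply. Other quartile conventions exist and can give slightly different values.

```python
from statistics import median

def five_number_summary(data):
    """Return (min, Q1, median, Q3, max) using the median-of-halves method."""
    ordered = sorted(data)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]           # lower half of the ordered data
    upper = ordered[half + n % 2:]   # upper half; skips the middle value when n is odd
    return ordered[0], median(lower), median(ordered), median(upper), ordered[-1]

# Assumed data set: sixteen values with minimum 1 (see the note above).
data = [4, 5, 9, 5, 7, 8, 7, 3, 5, 6, 3, 4, 3, 2, 5, 1]
low, q1, med, q3, high = five_number_summary(data)

print(low, q1, med, q3, high)
print("range =", high - low, "IQR =", q3 - q1)
```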

Chapter 8 Cumulative Frequency

A cumulative frequency distribution table shows the total frequency of data points, up to and including a particular value (or range of values).

Example:

Exam Score       Frequency    Cumulative Frequency
0 ≤ x < 10           2                 2
10 ≤ x < 20          3                 5
20 ≤ x < 30          6                11
30 ≤ x < 40          8                19
40 ≤ x < 50         11                30

Cumulative data can also be represented using a cumulative frequency polygon graph. The graph can be used to estimate the median and to find other properties of the data.

Example:

[Figure: cumulative frequency polygon "Exam Scores"; cumulative frequency (0 to 30) against score (0 to 50)]

The median is about 35 points.
If passing is a score of 30 points or more, about 19 students passed.
About 6 students earned an A (90% or more).
About 5 students earned a B (between 80% and 90%).
The 80th percentile is a score of about 44 points.
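Reading values off a cumulative frequency polygon amounts to linear interpolation between the plotted points. The sketch below assumes class frequencies of 2, 3, 6, 8, and 11 for the five score classes (30 students in total); `score_at` is a hypothetical helper name. Interpolated answers can differ a little from values read by eye from a graph.

```python
from itertools import accumulate

# Assumed table: upper edge of each score class and its frequency.
upper_bounds = [10, 20, 30, 40, 50]
frequencies = [2, 3, 6, 8, 11]
cumulative = list(accumulate(frequencies))  # running totals

def score_at(count):
    """Estimate the score below which `count` students fall (linear interpolation)."""
    points = [(0, 0)] + list(zip(upper_bounds, cumulative))
    for (x0, c0), (x1, c1) in zip(points, points[1:]):
        if c0 <= count <= c1:
            return x0 + (count - c0) / (c1 - c0) * (x1 - x0)
    raise ValueError("count is outside the table")

total = cumulative[-1]
print(cumulative)
print("median estimate:", score_at(total / 2))
print("80th percentile estimate:", score_at(0.8 * total))
print("students scoring 30 or more:", total - cumulative[2])
```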

Chapter 8 Variance and Standard Deviation

Since range and interquartile range each use only two data points, they are not very informative. So there are two other more important measures of dispersion that use all the data values: variance and standard deviation.

The deviation is the difference between a data value ($x$) and the mean ($\mu$). Every value in the data set has a deviation. Because some deviations are positive and some are negative, we square them and then find the average of the squared deviations. This is called the variance and is denoted $\sigma^2$.

\[ \sigma^2 = \frac{\sum_{i=1}^{k} f_i (x_i - \mu)^2}{n} \]

The standard deviation, $\sigma$, is the square root of the variance.

\[ \sigma = \sqrt{\sigma^2} = \sqrt{\frac{\sum_{i=1}^{k} f_i (x_i - \mu)^2}{n}} \]

Here $f_i$ is the frequency of the data value $x_i$, $k$ is the number of distinct values, and $n$ is the total number of data values.

Note that in some contexts $\bar{x}$ is used for the mean and $s$ is used for the standard deviation.

You can use your graphing calculator to find standard deviation, but be careful if you're doing an IB problem! For IB problems, you should always use the 1-Var Stats function and choose the value labeled $\sigma$. Do not use the value labeled $s$ or the stddev function!

Standard deviation is very important for data that are normally distributed, which we will study later.
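The variance formula can be applied directly to a frequency table. The sketch below uses the ten basketball heights from the Measures of Center example, grouped by value; `population_variance` is a hypothetical helper name. It divides by n, matching the $\sigma$ definition in these notes rather than the sample statistic $s$.

```python
from math import sqrt

def population_variance(values, freqs):
    """sigma^2 = sum(f_i * (x_i - mu)^2) / n, where n = sum(f_i)."""
    n = sum(freqs)
    mu = sum(f * x for x, f in zip(values, freqs)) / n
    return sum(f * (x - mu) ** 2 for x, f in zip(values, freqs)) / n

# Heights (inches) and how often each occurs in the earlier example.
values = [65, 66, 67, 68, 70, 75]
freqs = [1, 2, 2, 3, 1, 1]

variance = population_variance(values, freqs)
sigma = sqrt(variance)
print(round(variance, 2), round(sigma, 3))
```

Python's `statistics.pvariance` and `statistics.pstdev` compute the same population quantities from an ungrouped list.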

Chapter 9 Introduction to Probability

Probability is the likelihood that something will happen. Probability can be measured numerically.

A random experiment is an experiment in which there is no way to determine the outcome beforehand. For example, a dice game.
A trial is an action in a random experiment. For example, rolling the dice.
An outcome is a possible result of a trial. For example, rolling 2-4.
An event is a set of possible outcomes. For example, the total of the two dice is six (the outcomes in this event are 1-5, 2-4, 3-3, 4-2, 5-1).
The sample space is the set of all possible outcomes of a random experiment. In the dice game experiment, the sample space contains 36 outcomes, as shown in the grid on page 357.

When all outcomes are equally likely, the probability of an event is the number of outcomes in the event divided by the total number of outcomes in the sample space.

Example:

\[ P(\text{the total of the two dice is six}) = \frac{5}{36} \approx 0.139 = 13.9\% \]

An event that is certain to occur has a probability of one. An event that cannot occur has a probability of zero. Probability is always a number between zero and one, inclusive.

Examples:

\[ P(\text{the total of the two dice is at least one}) = \frac{36}{36} = 1 \]

\[ P(\text{the total of the two dice is exactly one}) = \frac{0}{36} = 0 \]

It is also sometimes helpful to illustrate the sample space, and there are several ways to do this: a list, a grid, a tree diagram, and a Venn diagram. We will learn more about how to use these methods later in the chapter.
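For equally likely outcomes, a probability can be found by listing the sample space and counting. The sketch below enumerates the 36 ordered rolls of two dice; the helper name `probability` is illustrative.

```python
from fractions import Fraction
from itertools import product

# Sample space: all 36 ordered pairs (first die, second die).
sample_space = list(product(range(1, 7), repeat=2))

def probability(event):
    """P(event) = outcomes in the event / outcomes in the sample space."""
    favourable = [outcome for outcome in sample_space if event(outcome)]
    return Fraction(len(favourable), len(sample_space))

p_six = probability(lambda roll: sum(roll) == 6)
p_at_least_one = probability(lambda roll: sum(roll) >= 1)   # certain event
p_exactly_one = probability(lambda roll: sum(roll) == 1)    # impossible event

print(p_six, round(float(p_six), 3), p_at_least_one, p_exactly_one)
```

Using `Fraction` keeps the result exact, so 5/36 is not rounded until it is printed as a decimal.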

Chapter 9 Properties of Probability

Two events are complementary if exactly one of them must happen, so their probabilities add up to one.

Example: The probability of rain on Tuesday is 0.2. What is the probability that it does not rain on Tuesday?
These events are complementary, so P(no rain) = 1 - 0.2 = 0.8.

Tree diagrams are a useful tool for solving probability problems. When drawing a tree diagram, follow these guidelines:

- Always draw tree diagrams horizontally.
- Draw one set of branches for each action in the experiment.
- Label the events at the end of the branches, and label the probabilities on the branches.
- Multiply out on each branch to get the probability of each outcome.
- The probabilities on each set of branches must always total to one and the final probabilities must always total to one.

Example: There is a 20% chance of rain tomorrow. If it is raining, there is a 15% chance I will ride my bike after school. If it is not raining, there is a 70% chance I will go biking. Find the probability that I ride my bike after school tomorrow.

rain (0.2)
    biking (0.15): 0.2 × 0.15 = 0.03
    no biking (0.85): 0.2 × 0.85 = 0.17
no rain (0.8)
    biking (0.7): 0.8 × 0.7 = 0.56
    no biking (0.3): 0.8 × 0.3 = 0.24

P(I go biking tomorrow) = 0.03 + 0.56 = 0.59
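The tree diagram calculation is just "multiply along each branch, then add the end points you want". A small sketch, assuming the biking example uses a 20% chance of rain, a 15% chance of biking in the rain, and a 70% chance of biking when dry (the variable names are illustrative):

```python
p_rain = 0.2
p_bike_given_rain = 0.15
p_bike_given_dry = 0.7

# Multiply along each path of the tree to get the probability of each end point.
branches = {
    ("rain", "biking"): p_rain * p_bike_given_rain,
    ("rain", "no biking"): p_rain * (1 - p_bike_given_rain),
    ("no rain", "biking"): (1 - p_rain) * p_bike_given_dry,
    ("no rain", "no biking"): (1 - p_rain) * (1 - p_bike_given_dry),
}

# Add the end points where biking happens.
p_biking = sum(p for (_, activity), p in branches.items() if activity == "biking")

print(round(p_biking, 2))
print(round(sum(branches.values()), 2))  # the final probabilities should total one
```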

Events that do not affect each other are called independent. For example, drawing two cards from a deck with replacement. Events that do affect each other are called conditional (or dependent). For example, drawing two cards from a deck without replacement.

The symbol $A \cap B$ means the intersection of A with B. It is equivalent to "A and B". The multiplication law for probability says

\[ P(A \cap B) = P(A)\,P(B \mid A) \]

where $P(B \mid A)$ means the probability of B happening given A has happened. The multiplication law simplifies to

\[ P(A \cap B) = P(A)\,P(B) \]

if and only if the events are independent. This is because $P(B \mid A) = P(B)$ for independent events.

The symbol $A \cup B$ means the union of A with B. It is equivalent to "A or B". The addition law for probability says

\[ P(A \cup B) = P(A) + P(B) - P(A \cap B) \]

The addition law simplifies to

\[ P(A \cup B) = P(A) + P(B) \]

if and only if the events are mutually exclusive. This is because $P(A \cap B) = 0$ for mutually exclusive events.

Example: Suppose $P(A) = 0.3$ and $P(B) = 0.5$. Find $P(A \cup B)$ if:
a) the events are mutually exclusive
b) the events are independent.

a) $P(A \cup B) = P(A) + P(B) = 0.3 + 0.5 = 0.8$
b) $P(A \cup B) = P(A) + P(B) - P(A \cap B) = 0.3 + 0.5 - 0.3(0.5) = 0.65$
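Both parts of the example use the same addition law; only the value of P(A and B) changes. A minimal sketch, taking P(A) = 0.3 and P(B) = 0.5, with `p_union` as a hypothetical helper name:

```python
def p_union(p_a, p_b, p_both):
    """Addition law: P(A or B) = P(A) + P(B) - P(A and B)."""
    return p_a + p_b - p_both

p_a, p_b = 0.3, 0.5

# a) Mutually exclusive events cannot happen together, so P(A and B) = 0.
exclusive = p_union(p_a, p_b, 0.0)

# b) Independent events satisfy P(A and B) = P(A) * P(B).
independent = p_union(p_a, p_b, p_a * p_b)

print(round(exclusive, 2), round(independent, 2))
```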

Example: Suppose $P(C) = 0.2$, $P(D) = 0.7$, and $P(C \cup D) = 0.8$. Find $P(C \cap D)$.

\[ P(C \cup D) = P(C) + P(D) - P(C \cap D) \]
\[ 0.8 = 0.2 + 0.7 - x \]
\[ 0.8 = 0.9 - x \]
\[ x = 0.1 \]

Example: Event E and event F are shown in the Venn diagram below. Are these events independent? Explain.

[Figure: Venn diagram of E and F; 0.2 in E only, 0.1 in the overlap, 0.4 in F only, 0.3 outside both]

$P(E) = 0.3$
$P(F) = 0.5$
$P(E)\,P(F) = 0.15$
$P(E \cap F) = 0.1$

Since $P(E \cap F) \neq P(E)\,P(F)$, the events are not independent.
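The independence test in the Venn diagram example compares P(E and F) with P(E) times P(F). The sketch below assumes the four regions of the diagram are 0.2 (E only), 0.1 (overlap), 0.4 (F only), and 0.3 (outside both); `isclose` is used because decimal probabilities are not stored exactly as floats.

```python
from math import isclose

# Assumed regions of the Venn diagram: E only, both, F only, neither.
e_only, both, f_only, neither = 0.2, 0.1, 0.4, 0.3

p_e = e_only + both   # everything inside circle E
p_f = f_only + both   # everything inside circle F

# Independent events must satisfy P(E and F) = P(E) * P(F).
is_independent = isclose(both, p_e * p_f)

print(round(p_e, 2), round(p_f, 2), round(p_e * p_f, 2), is_independent)
```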

Chapter 29 The Normal Distribution

The most important distribution for a continuous random variable is the normal distribution. It is given by a function of the form

\[ f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2} \]

defined for any real number $x$. This equation represents a family of functions that depend on $\mu$ (mean) and $\sigma$ (standard deviation).

The probabilities for a normal distribution are given by the area under the curve, and they are found using definite integrals:

\[ P(a \le X \le b) = \int_a^b f(x)\,dx \]

Note that because $X$ is continuous, $P(X = x)$ is zero. Therefore it doesn't matter whether you use $\le$ or $<$, since $P(X \le x) = P(X < x)$.

Characteristics of the normal curve:

- bell-shaped
- symmetric about $\mu$
- inflection points at $\mu + \sigma$ and $\mu - \sigma$
- asymptotic to the x-axis as x approaches infinity and negative infinity
- the total area under the curve is one
- the height and width of the curve depend on $\sigma$
- 99.95% of the values are within ±3.5 standard deviations of the mean
- percentages of values are distributed as shown in the diagram on page 730

The standard normal distribution has $\mu = 0$ and $\sigma = 1$. It is denoted by $Z$ and is sometimes called the z-distribution.

Probabilities for any normal distribution can be found using the graphing calculator. The input is normalcdf(min, max, µ, σ).

Example: X is normally distributed with mean 70 and standard deviation 4. Find:
a) $P(60 < x < 72)$
b) $P(x > 76)$

a) $P(60 < x < 72)$ = normalcdf(60, 72, 70, 4) ≈ 0.685
b) $P(x > 76)$ = normalcdf(76, 999, 70, 4) ≈ 0.0668
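Outside the calculator, the same areas can be computed with `statistics.NormalDist` from the Python standard library. The function below is only a stand-in that mimics the calculator's normalcdf argument order; it is not the calculator's own implementation.

```python
from statistics import NormalDist

def normalcdf(lower, upper, mu, sigma):
    """Area under the normal curve between lower and upper."""
    dist = NormalDist(mu, sigma)
    return dist.cdf(upper) - dist.cdf(lower)

# X is normal with mean 70 and standard deviation 4, as in the example.
p_between = normalcdf(60, 72, 70, 4)

# For an upper tail, the complement avoids needing a large dummy bound such as 999.
p_above = 1 - NormalDist(70, 4).cdf(76)

print(round(p_between, 3), round(p_above, 4))
```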

Chapter 29 Inverse of the Normal Distribution

When we are given the probability (or percentage of values) for a normally distributed random variable, we can find the particular value that gives that probability using the inverse of the normal distribution. On the GDC the inputs are invnorm(p, µ, σ). This gives the value of k for $P(X < k) = p$.

Example: X is normally distributed with mean 32 and standard deviation 3. Find the value of k so that $P(X < k) = 0.37$.

k = invnorm(0.37, 32, 3) ≈ 31.0

Note that the GDC only works with probabilities of the form $P(X < k)$. If you need to find the value of k for $P(X > k) = p$, you must use the complement, $P(X < k) = 1 - p$.

Example: Scores on a physics exam are normally distributed with $\mu = 76$ and $\sigma = 25$. The teacher decides to award an A to the top 7% of students. Find the minimum score required to earn an A.

$P(X > k) = 0.07$
$P(X < k) = 0.93$
k = invnorm(0.93, 76, 25) ≈ 112.9

The minimum score required to earn an A is about 113 points.

The inverse of the normal distribution can also be used in problems where we know probabilities but $\mu$ and $\sigma$ are unknown. To do this, we must first convert the x-values into z-values using the formula

\[ z = \frac{x - \mu}{\sigma} \]

This is known as standardizing.

Example: X is normally distributed with $\sigma = 22$. Given that $P(X < 40) = 0.12$, find $\mu$.

Since $\mu$ is unknown, we must first standardize the x-value, which in this problem is 40:

\[ z = \frac{x - \mu}{\sigma} = \frac{40 - \mu}{22} \]

Now $P(X < 40) = 0.12$ becomes $P\left(Z < \frac{40 - \mu}{22}\right) = 0.12$, and we can use the inverse normal button on the GDC:

\[ \frac{40 - \mu}{22} = \text{invnorm}(0.12, 0, 1) \]
\[ \frac{40 - \mu}{22} \approx -1.175 \]
\[ 40 - \mu \approx -25.8 \]
\[ \mu \approx 65.8 \]
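The inverse normal calculation is available in Python as `NormalDist.inv_cdf`. The sketch below wraps it in a hypothetical `invnorm` function with the calculator's argument order, then works the first example (mean 32, standard deviation 3, probability 0.37) and the unknown-mean example, taking σ = 22 and P(X < 40) = 0.12 as the assumed inputs.

```python
from statistics import NormalDist

def invnorm(p, mu=0.0, sigma=1.0):
    """Value k such that P(X < k) = p when X is normal with the given mu and sigma."""
    return NormalDist(mu, sigma).inv_cdf(p)

# Direct use: find k with P(X < k) = 0.37.
k = invnorm(0.37, 32, 3)

# Unknown mean: (40 - mu) / 22 = z, so mu = 40 - 22 * z.
z = invnorm(0.12)        # standard normal z-value for the assumed probability 0.12
mu = 40 - 22 * z

print(round(k, 1), round(z, 3), round(mu, 1))
```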

Chapter 29 The Binomial Distribution

The binomial probability distribution applies to a random variable with the following characteristics:

- The probability distribution is discrete.
- There are a fixed number of trials.
- There are exactly two outcomes, usually called success and failure.
- The probability of success is the same for all trials, and the trials are independent.

The hardest thing about the binomial probability distribution is determining when to use it, so it is important to understand these characteristics!

The symbol $X \sim B(n, p)$ means that X has a binomial distribution where n is the number of trials and p is the probability of success.

The name binomial is used here because the formula used to calculate these probabilities is essentially the same as the binomial expansion formula. For $X \sim B(n, p)$:

\[ P(X = r) = \binom{n}{r} p^r (1 - p)^{n - r} \]

Example: Ms. Carey rolls her six-sided Bicycle die ten times. Find the probability that she rolls bicycle exactly three times.

Binomial with $n = 10$ and $p = \frac{1}{6}$.

\[ P(X = 3) = \binom{10}{3} \left(\frac{1}{6}\right)^3 \left(\frac{5}{6}\right)^7 \approx 0.155 \]

Probabilities for binomial distributions can also be found using the GDC. For $X \sim B(n, p)$ there are two buttons to choose from:

- to find $P(X = r)$ use binompdf(n, p, r)
- to find $P(X \le r)$ use binomcdf(n, p, r)

Example: Ms. Carey rolls her six-sided Bicycle die ten times. Find the probability that she rolls bicycle no more than three times.

Binomial with $n = 10$ and $p = \frac{1}{6}$.

$P(X \le 3)$ = binomcdf(10, 1/6, 3) ≈ 0.930
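The binomial formula translates almost directly into code using `math.comb` for the binomial coefficient. The two functions below are illustrative stand-ins named after the calculator buttons; they assume ten rolls with success probability 1/6, as in the die example.

```python
from math import comb

def binompdf(n, p, r):
    """P(X = r) for X ~ B(n, p): C(n, r) * p^r * (1 - p)^(n - r)."""
    return comb(n, r) * p**r * (1 - p) ** (n - r)

def binomcdf(n, p, r):
    """P(X <= r): add the point probabilities for 0, 1, ..., r successes."""
    return sum(binompdf(n, p, k) for k in range(r + 1))

exactly_three = binompdf(10, 1 / 6, 3)
at_most_three = binomcdf(10, 1 / 6, 3)

print(round(exactly_three, 3), round(at_most_three, 3))
```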

Note that the binomcdf button on the GDC only gives probabilities for X less than or equal to r. If you need to find $P(X > r)$ or $P(X \ge r)$ you must use complementary events.

Example: Ms. Carey rolls her six-sided Bicycle die ten times. Find the probability that she rolls bicycle at least three times.

Binomial with $n = 10$ and $p = \frac{1}{6}$.

$P(X \ge 3) = 1 - P(X \le 2)$ = 1 - binomcdf(10, 1/6, 2) ≈ 0.225

Example: Five percent of the eggs produced on a farm are brown. Find the probability that, in a case of 240 eggs, there are more than ten brown eggs.

Binomial with $n = 240$ and $p = 0.05$.

$P(X > 10) = 1 - P(X \le 10)$ = 1 - binomcdf(240, 0.05, 10) ≈ 0.658
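The complement rule for "at least" and "more than" questions can be sketched the same way. The block below is self-contained and redefines a hypothetical `binomcdf` helper; it assumes the die example uses n = 10 and p = 1/6.

```python
from math import comb

def binomcdf(n, p, r):
    """P(X <= r) for X ~ B(n, p), summed term by term."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(r + 1))

# "At least three" is the complement of "at most two".
at_least_three = 1 - binomcdf(10, 1 / 6, 2)

# "More than ten" is the complement of "at most ten".
more_than_ten = 1 - binomcdf(240, 0.05, 10)

print(round(at_least_three, 3), round(more_than_ten, 3))
```

The key step is choosing the right cutoff: $P(X \ge r)$ uses $r - 1$ in the cumulative function, while $P(X > r)$ uses $r$.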