Math 2311 Bekki George Office Hours: MW 11am to 12:45pm in 639 PGH Online Thursdays 4-5:30pm And by appointment


Email: bekki@math.uh.edu
Class webpage: http://www.math.uh.edu/~bekki/math2311.html

Math 2311 Class Notes for Sections 1.3-1.5

Section 1.3:

Another important question we want to answer about data concerns its spread, or dispersion. Roughly speaking, the population standard deviation, $\sigma$, tells the average distance that data values fall from the mean. The standard deviation is the square root of the population variance, $\sigma^2$. So, what is the variance? The variance is the average of the squared differences of the data values from the mean. If $N$ is the number of values in a population with mean $\mu$, and $x_i$ represents each individual value in the population, then the variance is found by:

$$\sigma^2 = \frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}$$

And the population standard deviation is

$$\sigma = \sqrt{\sigma^2}$$

Most of the time we are not working with the entire population. Instead, we are working with a sample.

Sample variance:

$$s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$$

Sample standard deviation:

$$s = \sqrt{s^2}$$

Example:

1. A statistics teacher wants to decide whether or not to curve an exam. From her class of 300 students, she chose a sample of 10 students, and their grades were:
72, 88, 85, 81, 60, 54, 70, 72, 63, 43
Find the mean, variance, and standard deviation for this sample.
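The two variance formulas differ only in the divisor, which a short sketch can make concrete. The function names below are illustrative and not part of the notes; the data are the ten grades from example 1.

```python
import math

def population_variance(values):
    """Average squared deviation from the mean (divide by N)."""
    mean = sum(values) / len(values)
    return sum((x - mean) ** 2 for x in values) / len(values)

def sample_variance(values):
    """Sum of squared deviations from the sample mean, divided by n - 1."""
    mean = sum(values) / len(values)
    return sum((x - mean) ** 2 for x in values) / (len(values) - 1)

# Grades of the 10 sampled students from example 1.
grades = [72, 88, 85, 81, 60, 54, 70, 72, 63, 43]

grades_mean = sum(grades) / len(grades)
grades_var = sample_variance(grades)   # the 10 grades are a sample, so use n - 1
grades_sd = math.sqrt(grades_var)

print(f"mean = {grades_mean:.2f}, s^2 = {grades_var:.2f}, s = {grades_sd:.2f}")
```

Because the 10 grades are a sample from the class of 300, the sketch uses the n - 1 divisor. Calling `population_variance` on the same list gives a slightly smaller number.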

2. Suppose the statistics teacher decides to curve the grades by adding 10 points to each score. What are the new mean, variance, and standard deviation?

We can see from example 2 that adding the same value to all elements does not affect the variance (or standard deviation) of a set of data. What about multiplying?

3. Find the variance and the standard deviation for the following set of data (whose mean is 4.5):
3, 6, 2, 7, 4, 5
Now, multiply each value by 2. What are the new variance and the new standard deviation?
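The effect of shifting and rescaling can be checked numerically. This is a minimal sketch using Python's standard `statistics` module. It assumes the six values are treated as a sample (so `statistics.variance` divides by n - 1), and the shift of 10 is borrowed from example 2 purely for illustration.

```python
import statistics

data = [3, 6, 2, 7, 4, 5]
shifted = [x + 10 for x in data]   # add the same constant to every value
doubled = [2 * x for x in data]    # multiply every value by 2

for label, values in [("original", data), ("plus 10", shifted), ("times 2", doubled)]:
    mean = statistics.mean(values)
    var = statistics.variance(values)   # sample variance (n - 1 divisor)
    sd = statistics.stdev(values)       # sample standard deviation
    print(f"{label:>8}: mean = {mean:.2f}, variance = {var:.2f}, sd = {sd:.3f}")
```

Adding a constant moves the mean but leaves the variance and standard deviation unchanged. Multiplying by 2 doubles the mean and the standard deviation and multiplies the variance by 4. The same pattern holds with the population formulas.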

Sometimes we want to compare the variation between two groups. The coefficient of variation can be used for this. The coefficient of variation is the ratio of the standard deviation to the mean. A smaller ratio indicates less variation in the data relative to its mean.

Example:

4. The following statistics were collected on two different groups of stock prices:

                            Portfolio A   Portfolio B
Sample size                 10            15
Sample mean                 $52.65        $49.80
Sample standard deviation   $6.50         $2.95

What can be said about the variability of each portfolio?
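As a quick check on example 4, the ratio can be computed directly from the table. The helper name `coefficient_of_variation` is illustrative only.

```python
def coefficient_of_variation(std_dev, mean):
    """Ratio of the standard deviation to the mean (unitless)."""
    return std_dev / mean

# Sample statistics taken from the table in example 4.
cv_a = coefficient_of_variation(6.50, 52.65)
cv_b = coefficient_of_variation(2.95, 49.80)

print(f"Portfolio A: CV = {cv_a:.3f}")
print(f"Portfolio B: CV = {cv_b:.3f}")
```

Because the ratio has no units, it allows a comparison even though the two portfolios have different means and sample sizes.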

Section 1.4: More measures of spread (or dispersion)

Range: the maximum value minus the minimum value
Drawbacks of range: sensitivity to outliers

Percentiles:
o 25th percentile, Q1
o 50th percentile, median or Q2
o 75th percentile, Q3

The values of the minimum, Q1, Q2, Q3, and the maximum make up what is called our five number summary.

IQR (interquartile range): Q3 - Q1

Example:

1. Twelve babies spoke for the first time at the following ages (in months):
8 9 10 11 12 13 15 15 18 20 20 26
Find Q1, Q2, Q3, the range, and the IQR.
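A five number summary can be sketched in a few lines. Quartile conventions vary between textbooks and software. This sketch assumes Q1 and Q3 are the medians of the lower and upper halves of the ordered data, with the overall median left out of both halves when n is odd, which matches the box plot steps given in Section 1.5. The function name is illustrative.

```python
import statistics

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) using the median-of-halves rule."""
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]
    # For odd n, skip the middle value so it belongs to neither half.
    upper = ordered[half + 1:] if n % 2 else ordered[half:]
    return (ordered[0],
            statistics.median(lower),
            statistics.median(ordered),
            statistics.median(upper),
            ordered[-1])

# Ages (in months) from example 1.
ages = [8, 9, 10, 11, 12, 13, 15, 15, 18, 20, 20, 26]
minimum, q1, q2, q3, maximum = five_number_summary(ages)
data_range = maximum - minimum
iqr = q3 - q1

print(f"min = {minimum}, Q1 = {q1}, Q2 = {q2}, Q3 = {q3}, max = {maximum}")
print(f"range = {data_range}, IQR = {iqr}")
```

Other tools (for example, spreadsheet percentile functions) interpolate differently and may report slightly different quartiles for the same data.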

The IQR is used to determine which data values are classified as outliers. An outlier is an observation that is distant from the rest of the data. Outliers can occur by chance or be measurement errors, so it is important to identify them. Any point that falls outside the interval from Q1 - 1.5(IQR) to Q3 + 1.5(IQR) is considered an outlier.

Example:

2. Are there any outliers in the data set given for example 1? If so, what are they?
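The outlier rule translates directly into two cutoffs. In this sketch the function names are illustrative, and the quartiles Q1 = 10.5 and Q3 = 19 for the example 1 data assume the median-of-halves convention. A different quartile convention would shift the cutoffs slightly.

```python
def outlier_fences(q1, q3):
    """Return the interval (Q1 - 1.5*IQR, Q3 + 1.5*IQR)."""
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr

def find_outliers(values, q1, q3):
    """List every value that falls outside the outlier interval."""
    low, high = outlier_fences(q1, q3)
    return [x for x in values if x < low or x > high]

# Ages (in months) from example 1; quartiles assume medians of the two halves.
ages = [8, 9, 10, 11, 12, 13, 15, 15, 18, 20, 20, 26]
low, high = outlier_fences(10.5, 19)
outliers = find_outliers(ages, 10.5, 19)

print(f"interval: {low} to {high}")
print(f"outliers: {outliers}")
```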

There are other percentiles as well. The kth percentile means that k% of the ordered data values are at or below that data value. For example, if the median is 100, then 50% of the ordered data values fall at or below 100. Also, (100 - k)% represents the amount of ordered data that falls above the percentile data value.

If you are looking for the measurement that has a desired percentile rank, the 100p-th percentile is the measurement with rank (or position in the ordered list) np + 0.5, where n represents the number of data values in the sample and p is the percentile written as a decimal.

Example:

3. In a collection of 30 data measurements, which measurement represents the 30th percentile?
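The rank formula np + 0.5 is easy to evaluate. This sketch takes the percentile k as a whole-number percent, so p = k/100. The function name is illustrative. How a fractional rank is read is a convention the notes do not spell out, so the sketch only reports the two neighboring positions.

```python
import math

def percentile_position(n, k):
    """Rank n*p + 0.5 of the k-th percentile in n ordered values, with p = k/100."""
    return n * k / 100 + 0.5

# Example 3: 30 measurements, 30th percentile.
position = percentile_position(30, 30)
below, above = math.floor(position), math.ceil(position)

print(f"rank = {position}")
if below != above:
    print(f"the rank falls between ordered measurements {below} and {above}")
```

When the rank is not a whole number, one common reading is to average the two neighboring ordered values. That averaging step is an assumption here, not something stated in the notes.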

Suppose you know the position (the order) of a value and want to know what percentile it is ranked at. In general, if you have $n$ data measurements, $x_1$ represents the $100(1 - 0.5)/n$ th percentile, $x_2$ represents the $100(2 - 0.5)/n$ th percentile, and $x_i$ represents the $100(i - 0.5)/n$ th percentile.

Example:

4. Using the data in example 1, determine the percentile of the 4th order statistic ($x_4$).
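Going from a position to a percentile uses the formula 100(i - 0.5)/n directly. The sketch below applies it to the 4th order statistic of the example 1 data. The function name is illustrative.

```python
def percentile_of_rank(i, n):
    """Percentile represented by the i-th ordered value out of n: 100(i - 0.5)/n."""
    return 100 * (i - 0.5) / n

# Ages (in months) from example 1, sorted so that index i - 1 holds x_i.
ages = [8, 9, 10, 11, 12, 13, 15, 15, 18, 20, 20, 26]
ordered = sorted(ages)

i = 4
x_i = ordered[i - 1]
pct = percentile_of_rank(i, len(ordered))

print(f"x_{i} = {x_i} is at about the {pct:.1f}th percentile")
```

This is the inverse of the rank formula np + 0.5: plugging p = pct/100 back in returns the position i.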

Section 1.5: Graphs and Describing Distributions

Let's start with two examples, one for categorical data and one for quantitative data:

Example 1: Suppose we have found the eye color for 85 people. 45 have brown eyes, 25 have blue eyes, 10 have green eyes and 10 have hazel eyes. Display the data in a bar graph.

Example 2: Height measurements for a group of people were taken. The results are recorded below (in inches):
66, 68, 63, 71, 68, 69, 65, 70, 73, 67, 62, 59, 63, 68, 71, 63, 63, 60, 64, 66, 58

We will organize these data sets using different graphs:

A bar graph is created by listing the categories along the x-axis and the frequencies along the y-axis. Bars are drawn above each category.

A dot plot is made simply by putting dots above the values listed on a number line.

In a stem and leaf plot, the data are arranged by place value. The digits in the largest place are referred to as the stem and the digits in the smallest place are referred to as the leaf (leaves). The leaves are displayed to the right of the stem. A split stemplot divides up the stems into equal groups. Back-to-back stemplots can be used when comparing two sets of data.
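A text stem and leaf plot for the heights in Example 2 can be produced with a short sketch. It assumes two-digit whole-number data, so the tens digit is the stem and the ones digit is the leaf. The function name is illustrative, and stems with no data are simply not printed.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Map each stem (tens digit) to its sorted list of leaves (ones digits)."""
    stems = defaultdict(list)
    for value in sorted(values):
        stem, leaf = divmod(value, 10)
        stems[stem].append(leaf)
    return dict(stems)

# Heights (in inches) from Example 2.
heights = [66, 68, 63, 71, 68, 69, 65, 70, 73, 67, 62,
           59, 63, 68, 71, 63, 63, 60, 64, 66, 58]

plot = stem_and_leaf(heights)
for stem in sorted(plot):
    leaves = " ".join(str(leaf) for leaf in plot[stem])
    print(f"{stem} | {leaves}")
```

Since most of the heights share the stem 6, a split stemplot (one row for leaves 0-4 and one for leaves 5-9) would show the shape in more detail.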

Histograms are created by first dividing the data into classes, or bins, of equal width. Next, count the number of observations in each class. The horizontal axis will represent the variable values and the vertical axis will represent your frequency or your relative frequency.
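The counting step behind a histogram can be sketched as follows for the heights in Example 2. The choice of classes (width 5, starting at 55, so 55-59, 60-64, 65-69, 70-74) is an assumption made for illustration, and the function name is hypothetical. The notes do not prescribe particular classes.

```python
def histogram_counts(values, start, width, num_bins):
    """Count how many values fall in each class [start + k*width, start + (k+1)*width)."""
    counts = [0] * num_bins
    for value in values:
        index = int((value - start) // width)
        if 0 <= index < num_bins:
            counts[index] += 1
    return counts

# Heights (in inches) from Example 2.
heights = [66, 68, 63, 71, 68, 69, 65, 70, 73, 67, 62,
           59, 63, 68, 71, 63, 63, 60, 64, 66, 58]

start, width = 55, 5   # assumed class boundaries
counts = histogram_counts(heights, start, width, num_bins=4)

total = len(heights)
for k, count in enumerate(counts):
    low = start + k * width
    # Frequency, relative frequency, and a row of stars as a rough bar.
    print(f"[{low}, {low + width}): {count:2d}  {count / total:.2f}  " + "*" * count)
```

Using the frequency or the relative frequency on the vertical axis gives bars of the same shape. Only the scale changes.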

Boxplots not only help identify features about our data quickly (such as spread and location of center) but can be very helpful when comparing data sets.

How to make a box plot:
1. Order the values in the data set in ascending order (least to greatest).
2. Find and label the median.
3. Of the lower half (the values less than the median; do not include the median itself), find and label Q1.
4. Of the upper half (the values greater than the median; do not include the median itself), find and label Q3.
5. Label the minimum and maximum.
6. Draw and label the scale on an axis.
7. Plot the five number summary.
8. Sketch a box starting at Q1 to Q3.
9. Sketch a segment within the box to represent the median.
10. Connect the min and max to the box with line segments.

Note: If the data contain outliers, a box and whiskers plot can be used instead to display the data. In a box and whiskers plot, the outliers are displayed with dots above the value, and the segments begin (or end) at the next data value within the outlier interval.

A pie chart is a circular chart, divided into sectors, indicating the proportion of each data value compared to the entire set of values. Pie charts are good for categorical data.

A cumulative frequency plot of the percentages (also called an ogive) can be used to view the total number of events that occurred up to a certain value.

Example: Here is an ogive for Hudson Auto Repair's cost of parts sold:

[Figure: Hudson Auto Repair, ogive with cumulative percent frequencies. Vertical axis: cumulative percent frequency (20 to 100). Horizontal axis: Parts Cost ($) (50 to 90).]

Where is the median of this data?
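The Hudson Auto Repair data are not reproduced in these notes, so the sketch below builds the points of an ogive from the heights in Example 2 instead. The classes of width 5 starting at 55 are an assumption made for illustration, and the function name is hypothetical. On an ogive, the median is read where the curve reaches 50%.

```python
from itertools import accumulate

def cumulative_percents(counts):
    """Running totals of the class counts, expressed as percentages of the total."""
    total = sum(counts)
    return [100 * running / total for running in accumulate(counts)]

# Heights (in inches) from Example 2, grouped into assumed classes of width 5.
heights = [66, 68, 63, 71, 68, 69, 65, 70, 73, 67, 62,
           59, 63, 68, 71, 63, 63, 60, 64, 66, 58]
lower_limits = [55, 60, 65, 70]
upper_limits = [low + 5 for low in lower_limits]
counts = [sum(1 for h in heights if low <= h < low + 5) for low in lower_limits]

percents = cumulative_percents(counts)
for limit, pct in zip(upper_limits, percents):
    print(f"below {limit}: {pct:5.1f}%")

# The median lies in the first class whose cumulative percent reaches 50.
median_class_upper = next(limit for limit, pct in zip(upper_limits, percents) if pct >= 50)
print(f"the median falls in the class ending at {median_class_upper}")
```

Plotting each class upper limit against its cumulative percent and connecting the points gives the ogive.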

Patterns and shapes:
o Uniform graphs
o Symmetric graphs

Some other features:
o Bell shaped
o Skewed right
o Skewed left