STA 248 H1S Winter 2008 Assignment 1 Solutions
- Adela Houston
- 5 years ago
1. (a) Measures of location:
i. The mean, $\sum_{i=1}^{100} x_i/100$, can be made arbitrarily large if one of the $x_i$ is made arbitrarily large, since the sample size is fixed at 100.
ii. Since the median is the value in the middle of the ordered data, its value is unaffected by the largest value, so making that value arbitrarily large will not change the median.
iii. To calculate the 10% trimmed mean we first remove the 10 largest and 10 smallest data values, so the one value that is made arbitrarily large is disregarded and the 10% trimmed mean will not change.
(b) Measures of spread:
i. The standard deviation is $\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2/(n-1)}$, where $\bar{x}$ is the mean. Since the mean can be made arbitrarily large by changing one data value while the rest of the $x_i$ remain the same, the squared deviations $(x_i - \bar{x})^2$ grow as well, so the standard deviation will also become large.
ii. The interquartile range is the difference between the value such that 75% of the data are below it and the value such that 25% of the data are below it. Making one value arbitrarily large while keeping the rest the same will not affect it.

2. (a) Internet traffic is heaviest on Fridays and lightest on Saturdays and Sundays. Of the weekdays, it is lightest on Wednesdays.
(b) The greatest spread occurs on Fridays and the least on the weekends (Saturday and Sunday).
(c) The distributions all appear to be slightly right-skewed (the median is closer to the left end of the distribution than the right, with longer upper whiskers than lower whiskers), although there is little skew in the distributions on Saturday and Sunday. There are large outliers on Monday, Thursday, and Friday, which may be observations from a long right tail on those days.

3.
(a) R code used:
> par(mfrow=c(3,3))
> hist(rgamma(1000,.5,1/.5),main="alpha=.5, beta=.5")
> hist(rgamma(1000,.5,1/2),main="alpha=.5, beta=2")
> hist(rgamma(1000,.5,1/5),main="alpha=.5, beta=5")
> hist(rgamma(1000,2,1/.5),main="alpha=2, beta=.5")
> hist(rgamma(1000,2,1/2),main="alpha=2, beta=2")
> hist(rgamma(1000,2,1/5),main="alpha=2, beta=5")
> hist(rgamma(1000,5,1/.5),main="alpha=5, beta=.5")
> hist(rgamma(1000,5,1/2),main="alpha=5, beta=2")
> hist(rgamma(1000,5,1/5),main="alpha=5, beta=5")
Resulting plot: [3 × 3 grid of histograms, one per hist() call, with panel titles "alpha=.5, beta=.5" through "alpha=5, beta=5" and each x-axis labelled with its rgamma(1000, alpha, 1/beta) call]
The Gamma distribution is right-skewed. The degree of skewness decreases as α increases. As β increases, the values (the location) get larger.
(b) Based on the answer to part (a), α is the shape parameter and β is the scale parameter.
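As a quick check of part (b), the sketch below (an illustration, not part of the original solutions) relies on the fact that the third positional argument of `rgamma()` is a rate, so passing `1/beta` there corresponds to `scale = beta`. The seed, the sample size, and the chosen values of α and β are arbitrary assumptions.

```r
# Sketch only: seed, sample size, alpha and beta are arbitrary illustrative choices.
alpha <- 2
beta  <- 5

set.seed(248)
x_rate <- rgamma(10000, shape = alpha, rate = 1 / beta)

set.seed(248)
x_scale <- rgamma(10000, shape = alpha, scale = beta)

# Same seed and equivalent parameterizations: the two sets of draws agree.
same_draws <- isTRUE(all.equal(x_rate, x_scale))

# Changing beta rescales the distribution without changing its shape:
# the quartiles for scale = beta are beta times those for scale = 1.
p <- c(0.25, 0.5, 0.75)
q_ratio <- qgamma(p, shape = alpha, scale = beta) /
  qgamma(p, shape = alpha, scale = 1)

same_draws
q_ratio
```

The quantile ratio equals β at every probability level, which is what it means for β to act as a scale parameter.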
(c) R code used:
> mean(rgamma(10,.5,1/.5))
[1]
> mean(rgamma(10,.5,1/.5))
[1]
> mean(rgamma(10,.5,1/.5))
[1]
> mean(rgamma(1000,.5,1/.5))
[1]
> mean(rgamma(1000,.5,1/.5))
[1]
> mean(rgamma(1000,.5,1/.5))
[1]
> mean(rgamma( ,.5,1/.5))
[1]
> mean(rgamma( ,.5,1/.5))
[1]
> mean(rgamma( ,.5,1/.5))
[1]
The mean for the Gamma distribution from which we are generating samples is αβ = 0.5 × 0.5 = 0.25. The estimated means from the random samples are closer to 0.25 for larger sample sizes. This is an illustration of the Law of Large Numbers.
(d) R code used:
> par(mfrow=c(2,3))
> size10samples = matrix(rgamma(1000,.5,1/.5),nrow=10)
> hist(apply(size10samples,2,mean),main="means of samples n=10")
> size10samples = matrix(rgamma(1000,.5,1/.5),nrow=10)
> hist(apply(size10samples,2,mean),main="means of samples n=10")
> size10samples = matrix(rgamma(1000,.5,1/.5),nrow=10)
> hist(apply(size10samples,2,mean),main="means of samples n=10")
> size1000samples = matrix(rgamma(100000,.5,1/.5),nrow=1000)
> hist(apply(size1000samples,2,mean),main="means n=1000")
> size1000samples = matrix(rgamma(100000,.5,1/.5),nrow=1000)
> hist(apply(size1000samples,2,mean),main="means n=1000")
> size1000samples = matrix(rgamma(100000,.5,1/.5),nrow=1000)
> hist(apply(size1000samples,2,mean),main="means n=1000")
Resulting plot: [2 × 3 grid of histograms: top row titled "Means of samples n=10" with x-axis apply(size10samples, 2, mean); bottom row titled "Means n=1000" with x-axis apply(size1000samples, 2, mean)]
For samples of size 10, the distribution of the means is right-skewed, while for samples of size 1,000 the distribution of the means is closer to symmetric and bell-shaped. This is an illustration of the Central Limit Theorem.
(e) When α = 1, $\Gamma(\alpha) = \int_0^\infty e^{-z}\,dz = 1$, so the Gamma density function simplifies to the exponential density function. Since the mean of the Gamma distribution is $\alpha\beta$, the mean of the exponential distribution is $\beta$. And since the variance of the Gamma distribution is $\alpha\beta^2$, the variance of the exponential distribution is $\beta^2$.
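The statements in parts (c) and (e) can be checked numerically. The sketch below is an illustration rather than part of the original solutions; the seed, the three sample sizes, the value `beta_e = 2`, and the evaluation grid are all assumptions made for the example.

```r
# Sketch only: seed, sample sizes, beta_e and the grid are illustrative assumptions.
set.seed(2008)
alpha <- 0.5
beta  <- 0.5

# Part (c): sample means for increasing n, compared with alpha * beta = 0.25.
n_values <- c(10, 1000, 100000)
sample_means <- sapply(n_values, function(n) mean(rgamma(n, alpha, 1 / beta)))
abs_error <- abs(sample_means - alpha * beta)

# Part (e): with alpha = 1 the Gamma density coincides with the exponential
# density that has mean beta_e (that is, rate 1 / beta_e).
beta_e <- 2
grid <- seq(0.1, 10, by = 0.1)
max_density_gap <- max(abs(dgamma(grid, shape = 1, scale = beta_e) -
                             dexp(grid, rate = 1 / beta_e)))

data.frame(n = n_values, mean = sample_means, abs_error = abs_error)
max_density_gap
```

The error for any single small sample can be small by chance, so the Law of Large Numbers is best read from the typical size of `abs_error` across repeated runs, not from one draw.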
4. (a) R code used:
> tcpdata = read.table("packetdata_dat.txt",header=TRUE)
> timestamp=tcpdata$timestamp[tcpdata$databytes!=0]
> databytes=tcpdata$databytes[tcpdata$databytes!=0]
> length(databytes)
[1] 31656
The sample size excluding packets with 0 databytes is 31,656.
(b) The R code
> hist(databytes)
produces
[Figure: "Histogram of databytes", x-axis: databytes]
The distribution of the number of data bytes in the packets is bimodal with peaks at small values (0-100) and medium values ( ) and with relatively few large values.
(c) R code and output:
> summary(databytes)
Min. 1st Qu. Median Mean 3rd Qu. Max.
> summary(databytes[databytes<300])
Min. 1st Qu. Median Mean 3rd Qu. Max.
> summary(databytes[databytes>=300])
Min. 1st Qu. Median Mean 3rd Qu. Max.
i. The summary statistics for all values of data bytes are not useful because of the bimodal shape of the distribution; for example, they give a mean between the 2 modes, where there is relatively little data. The summary statistics for the large and small packets are more useful, as addressed further in part (ii).
ii. For small packets, the distribution of packet size is right-skewed. This is evident because the mean is larger than the median. (It is very right-skewed, as the mean is even larger than the third quartile!) This is also evident because the difference between the maximum and the median is much larger than the difference between the median and the minimum. For large packets, the distribution of packet size is dominated by the mode in the range. This is evident in the summary statistics as the first quartile, median, and third quartile are all very close. This distribution is much more symmetric than the distribution of the size of small packets, as the mean and median are very close.
(d) R code to calculate inter-arrival times:
> interarrivals=numeric(0)
> for (i in 1:(length(timestamp)-1))
+ interarrivals[i]=timestamp[i+1]-timestamp[i]
i. R code and output:
> hist(interarrivals)
> boxplot(interarrivals)
> title("boxplot of interarrivals")
> plot.ecdf(interarrivals,do.points=FALSE,verticals=TRUE,main=NULL)
> title("ecdf of interarrivals")
> mean(interarrivals)
[1]
> var(interarrivals)
[1] e-05
and resulting graphs:
[Figures: "Histogram of interarrivals" (x-axis: interarrivals); "Boxplot of interarrivals"; "ECDF of interarrivals" (axes: x, Fn(x))]
ii. As can be seen in the histogram, the distribution of inter-arrival times is severely right-skewed. This is evident in the boxplot from the extremely short left whisker and the many points that extend beyond the right whisker. The right-skewed shape is evident in the empirical distribution function because for small values of inter-arrival times it rises quickly, indicating that small values are more likely, but for large values it becomes very flat, indicating that large values are less likely and contribute less to the cumulative probability.
iii. This is most easily shown by marking the values on the plots. On the boxplot, the minimum is the smallest and the maximum is the largest value shown; the 1st and 3rd quartiles are the lower and upper limits of the box, and the median is the line in the centre of the box, which is very close to the lower limit of the box. On the empirical cumulative distribution function, the minimum occurs where it first becomes non-zero (very close to 0.00) and the maximum occurs where the graph reaches 1 (which is difficult to tell with any accuracy since it rises so slowly at the end). The first quartile is the value on the horizontal axis corresponding to 0.25 on the vertical axis, the third quartile is the value on the horizontal axis corresponding to 0.75 on the vertical axis, and the median is the value on the horizontal axis corresponding to 0.5 on the vertical axis.
iv. A. R code and output:
> sum(interarrivals>1/60)/length(interarrivals)
[1]
> 1-pexp(1/60,1/mean(interarrivals))
[1]
The first value is the fraction of the observed inter-arrival times greater than one second; the second is the fraction predicted by the exponential distribution model, which is smaller.
B. R code to generate a histogram of randomly generated values from an exponential distribution:
> hist(rexp(length(interarrivals),1/mean(interarrivals)),
+ main="exponential sample",xlim=c(0,max(interarrivals)))
and the resulting graph:
[Figure: "Exponential sample", x-axis: rexp(length(interarrivals), 1/mean(interarrivals))]
While both this histogram and the histogram of observed inter-arrival times from part (d) i. are right-skewed, the histogram of the exponentially distributed values drops off more quickly (with no observations greater than 0.035) than the histogram of the observed inter-arrival times.
C. R code and output:
> mean(interarrivals)
[1]
> var(interarrivals)
[1] e-05
For an exponential distribution with β equal to the mean of the inter-arrival times, the variance will be $\beta^2$. The inter-arrival times have the same mean (we generated the exponential values so that they would), but their variance is more than twice what would be expected from an exponential distribution. This is consistent with the larger values observed in the histogram of the observed inter-arrival times.
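Because the packet trace itself is not reproduced here, the comparison in part (d) iv can only be sketched on simulated data. In the sketch below the timestamps are hypothetical: they are built from a mixture of two exponential distributions, chosen only because it has a heavier right tail than a single exponential. The seed, mixture weight, component means, and sample size are all assumptions. The sketch also uses `diff()`, which returns the same successive differences as the loop in part (d).

```r
# Sketch only: simulated, hypothetical timestamps (not the assignment's data).
set.seed(1)
n <- 20000

# Mostly short gaps with occasional long ones: heavier tail than one exponential.
gaps <- ifelse(runif(n) < 0.9,
               rexp(n, rate = 1 / 0.001),
               rexp(n, rate = 1 / 0.02))
timestamp <- cumsum(c(0, gaps))

# diff() returns timestamp[i + 1] - timestamp[i] for every i.
interarrivals <- diff(timestamp)

# Exponential model matched to the data on the mean, as in part iv.
beta_hat  <- mean(interarrivals)
threshold <- 1 / 60

observed_tail  <- mean(interarrivals > threshold)
predicted_tail <- 1 - pexp(threshold, rate = 1 / beta_hat)

# Under an exponential model the variance would equal beta_hat^2,
# so a ratio well above 1 signals a heavier tail than the model allows.
variance_ratio <- var(interarrivals) / beta_hat^2

c(observed = observed_tail, predicted = predicted_tail,
  variance_ratio = variance_ratio)
```

On data like these, the observed tail fraction exceeds the exponential prediction and the variance ratio is above 1, the same qualitative pattern the solution reports for the real inter-arrival times.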
Chapter 7. Inferences about Population Variances Introduction () The variability of a population s values is as important as the population mean. Hypothetical distribution of E. coli concentrations from
More informationData Distributions and Normality
Data Distributions and Normality Definition (Non)Parametric Parametric statistics assume that data come from a normal distribution, and make inferences about parameters of that distribution. These statistical
More information22.2 Shape, Center, and Spread
Name Class Date 22.2 Shape, Center, and Spread Essential Question: Which measures of center and spread are appropriate for a normal distribution, and which are appropriate for a skewed distribution? Eplore
More informationSummary of Statistical Analysis Tools EDAD 5630
Summary of Statistical Analysis Tools EDAD 5630 Test Name Program Used Purpose Steps Main Uses/Applications in Schools Principal Component Analysis SPSS Measure Underlying Constructs Reliability SPSS Measure
More informationEdexcel past paper questions
Edexcel past paper questions Statistics 1 Chapters 2-4 (Continuous) S1 Chapters 2-4 Page 1 S1 Chapters 2-4 Page 2 S1 Chapters 2-4 Page 3 S1 Chapters 2-4 Page 4 Histograms When you are asked to draw a histogram
More informationExample - Let X be the number of boys in a 4 child family. Find the probability distribution table:
Chapter7 Probability Distributions and Statistics Distributions of Random Variables tthe value of the result of the probability experiment is a RANDOM VARIABLE. Example - Let X be the number of boys in
More informationStat 101 Exam 1 - Embers Important Formulas and Concepts 1
1 Chapter 1 1.1 Definitions Stat 101 Exam 1 - Embers Important Formulas and Concepts 1 1. Data Any collection of numbers, characters, images, or other items that provide information about something. 2.
More informationLecture 6: Chapter 6
Lecture 6: Chapter 6 C C Moxley UAB Mathematics 3 October 16 6.1 Continuous Probability Distributions Last week, we discussed the binomial probability distribution, which was discrete. 6.1 Continuous Probability
More informationAppendix A. Selecting and Using Probability Distributions. In this appendix
Appendix A Selecting and Using Probability Distributions In this appendix Understanding probability distributions Selecting a probability distribution Using basic distributions Using continuous distributions
More informationBangor University Transfer Abroad Undergraduate Programme Module Implementation Plan
Bangor University Transfer Abroad Undergraduate Programme Module Implementation Plan MODULE: BUS-121 Descriptive Statistics LECTURER: Dr Francis Jones INTAKE: 2013 SEMESTER: 3 ACTIVITY TYPES:, tutorial,
More informationLesson 12: Describing Distributions: Shape, Center, and Spread
: Shape, Center, and Spread Opening Exercise Distributions - Data are often summarized by graphs. We often refer to the group of data presented in the graph as a distribution. Below are examples of the
More informationStatistics for Managers Using Microsoft Excel/SPSS Chapter 6 The Normal Distribution And Other Continuous Distributions
Statistics for Managers Using Microsoft Excel/SPSS Chapter 6 The Normal Distribution And Other Continuous Distributions 1999 Prentice-Hall, Inc. Chap. 6-1 Chapter Topics The Normal Distribution The Standard
More informationTerms & Characteristics
NORMAL CURVE Knowledge that a variable is distributed normally can be helpful in drawing inferences as to how frequently certain observations are likely to occur. NORMAL CURVE A Normal distribution: Distribution
More informationData Analysis and Statistical Methods Statistics 651
Data Analysis and Statistical Methods Statistics 651 http://www.stat.tamu.edu/~suhasini/teaching.html Lecture 10 (MWF) Checking for normality of the data using the QQplot Suhasini Subba Rao Review of previous
More informationStatistics I Chapter 2: Analysis of univariate data
Statistics I Chapter 2: Analysis of univariate data Numerical summary Central tendency Location Spread Form mean quartiles range coeff. asymmetry median percentiles interquartile range coeff. kurtosis
More informationSTATISTICAL DISTRIBUTIONS AND THE CALCULATOR
STATISTICAL DISTRIBUTIONS AND THE CALCULATOR 1. Basic data sets a. Measures of Center - Mean ( ): average of all values. Characteristic: non-resistant is affected by skew and outliers. - Median: Either
More information