A LEVEL MATHEMATICS ANSWERS AND MARKSCHEMES: SUMMARY STATISTICS AND DIAGRAMS

1. a) 45   B1 [1]
   b) 7th value: 37   M1 A1 [2]
   c) LQ: n/4 = 3.25, so 4th value; LQ = 25
      UQ: 3n/4 = 9.75, so 10th value; UQ = 45
      IQR = 20 (f.t.)
   d) Median is closer to upper quartile
      Hence negative skew [2]

2. a) Orders   Frequency   Cumulative Frequency
      30       2           2
      31       3           5
      32       4           9
      33       3           12
      34       10          22
      35       6           28
      36       8           36
      37       5           41
      38       5           46
      39       6           52
      (cumulative frequencies)
      median: 26.5th position, 35
      LQ: n/4 = 13, so ½(12th + 13th) = 33.5
      UQ: 3n/4 = 39, so ½(39th + 40th) = 37
      Interquartile range = 3.5 (f.t.) [7]
   b) µ = Σfx/Σf = 1827/52 = 35.135
      σ = √(Σfx²/Σf - µ²) = √(64515/52 - (1827/52)²) = 2.496
   c) Mean and standard deviation can be distorted by a small number of high values
      This is not the case here (and mean is close to median), so they are suitable [2]
   d) Box plot against a "Number of orders" axis running from 30 to 39
      (on graph paper, to scale) (max and min) (median; quartiles, f.t.)
   e) Slight positive skew.
      Tendency to peak close to mean / median (or "in the middle") (either)
      Central 50% of data in a relatively small range (or other comment) [2]

3. a) Mark            Frequency   Cumulative Frequency
      0 ≤ x < 30      4           4
      30 ≤ x < 40     7           11
      40 ≤ x < 50     6           17
      50 ≤ x < 60     5           22
      60 ≤ x < 80     3           25
      80 ≤ x < 100    2           27
      (cum. freqs.)
      i) ½n = 13.5, in 40 ≤ x < 50 class
         median = 40 + 10 × (13.5 - 11)/(17 - 11) = 44.167 (c.a.o.)
      ii) 0.87n = 23.49, in 60 ≤ x < 80 class
          87th percentile = 60 + 20 × (23.49 - 22)/(25 - 22) = 69.933 [7]
   b) Use of midpoints
      mean = 1240/27 = 45.926
   c) Second class must have a few very high values, because of the mean being much higher than the median.
      Medians are very similar, suggesting a comparable performance overall, ignoring the very high achievers in the second class. [2]
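The interpolation used in question 3 a) can be checked with a short script. This is only a sketch: the function name `interpolate` is made up for illustration, and the class boundaries and frequencies are taken to be those of the table in question 3 (0 to 30, 30 to 40, 40 to 50, 50 to 60, 60 to 80, 80 to 100 with frequencies 4, 7, 6, 5, 3, 2).

```python
def interpolate(boundaries, freqs, position):
    """Linear interpolation within a grouped frequency table.

    boundaries: class boundaries (one more entry than freqs)
    freqs:      class frequencies
    position:   cumulative-frequency position to locate, e.g. n/2
    """
    cumulative = 0
    for lower, upper, f in zip(boundaries, boundaries[1:], freqs):
        # The position falls in this class if it does not exceed the
        # cumulative frequency at the class's upper boundary.
        if f > 0 and position <= cumulative + f:
            return lower + (upper - lower) * (position - cumulative) / f
        cumulative += f
    raise ValueError("position lies beyond the total frequency")


# Assumed reading of the question 3 table
boundaries = [0, 30, 40, 50, 60, 80, 100]
freqs = [4, 7, 6, 5, 3, 2]
n = sum(freqs)

median = interpolate(boundaries, freqs, n / 2)
p87 = interpolate(boundaries, freqs, 0.87 * n)
print(round(median, 3), round(p87, 3))
```

The same function would serve for the quartiles and deciles in the other grouped-data questions, given their boundaries and frequencies.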

4. a) Time     True Boundaries   Frequency   Cumulative Frequency
      0-4      0 to 4.5          66          66
      5-9      4.5 to 9.5        51          117
      10-14    9.5 to 14.5       12          129
      15-19    14.5 to 19.5      6           135
      20-24    19.5 to 24.5      3           138
      25-29    24.5 to 29.5      2           140
      (true class boundaries) (cum freqs)
      n/2 = 140/2 = 70, in 4.5 to 9.5 class
      median = 4.5 + 5 × (70 - 66)/(117 - 66) = 4.892
      n/4 = 35, in 0 to 4.5 class
      LQ = 0 + 4.5 × (35 - 0)/66 = 2.386
      3n/4 = 105, in 4.5 to 9.5 class
      UQ = 4.5 + 5 × (105 - 66)/(117 - 66) = 8.324
      I.Q.R = 5.938 [10]
   b) Box plots for the first garage and the second garage, Time (hours) axis from 0 to 35
      (graph paper, to scale) (both max and min) (median and quartiles) (labelled) (on same axes) [5]

QUESTION 4 CONTINUED
   c) Both have positive skew.
      2nd garage has greater range (and greater interquartile range), so times more varied.
      Median times close, but 1st garage is slightly higher, suggesting it has fewer very low times.
      The 1st garage is probably a better choice since there's a smaller chance of waiting a very long time.
      B4 (any four sensible independent comments)

5. a) µ = Σx/30 = 50.4
      σ = √(Σx²/30 - 50.4²)
   b) Σx = 1580
      new mean = 1580/31 = 50.968
      Σx² = 83838 + 68² = 88462
      new variance = 88462/31 - (1580/31)² = 255.90 [5]
   c) Increase mean, because mark removed is below current mean, and new mark is above it
      Decrease variance, because new mark is closer to mean than the old one

6. a) (22.4 × 20 + 25.7 × 30)/50 = 24.38
   b) σ² = Σx²/n - µ², so Σx² = n(σ² + µ²)
      Group of 20: Σx² = 20(2.1² + 22.4²) = 10123.4
      Group of 30: Σx² = 30(1.9² + 25.7²) = 19923
   c) σ = √((10123.4 + 19923)/(20 + 30) - 24.38²) = 2.558

7. a) Although data appears to be discrete, it is actually continuous [1]
   b) True Class Boundary   Frequency Density
      0.5 to 3.5            9/3 = 3
      3.5 to 6.5            12/3 = 4
      6.5 to 11.5           10/5 = 2
      11.5 to 21.5          2/10 = 0.2
      21.5 to 41.5          2/20 = 0.1
      (true class boundaries) (frequency density)
      Histogram: Frequency Density (0 to 5) against Lateness (minutes), 0 to 40
      (graph paper, to scale) (accurate) [5]
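The method of question 6 (recover Σx and Σx² for each group, then pool) can be sketched as below. The function name `combine` is hypothetical, and the group figures are taken to be 20 values with mean 22.4 and S.D. 2.1, and 30 values with mean 25.7 and S.D. 1.9.

```python
from math import sqrt


def combine(groups):
    """Pool several groups given as (n, mean, sd); return (mean, sd)."""
    total_n = sum(n for n, _, _ in groups)
    sum_x = sum(n * mean for n, mean, _ in groups)
    # sd^2 = (sum of x^2)/n - mean^2, so sum of x^2 = n(sd^2 + mean^2)
    sum_x2 = sum(n * (sd ** 2 + mean ** 2) for n, mean, sd in groups)
    pooled_mean = sum_x / total_n
    pooled_sd = sqrt(sum_x2 / total_n - pooled_mean ** 2)
    return pooled_mean, pooled_sd


# Assumed reading of the question 6 figures
mean, sd = combine([(20, 22.4, 2.1), (30, 25.7, 1.9)])
print(round(mean, 2), round(sd, 3))
```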

8. a) Class             X (midpoint)   Y      f    fY      fY²
      10.5 to 50.5      30.5           -2.4   3    -7.2    17.28
      50.5 to 100.5     75.5           -1.5   11   -16.5   24.75
      100.5 to 200.5    150.5          0      17   0       0
      200.5 to 300.5    250.5          2      5    10      20
      300.5 to 500.5    400.5          5      2    10      50
      500.5 to 1000.5   750.5          12     2    24      288
      (true class boundaries) (midpoints) (Y) (fY, fY²)
      ΣfY = -7.2 - 16.5 + 10 + 10 + 24 = 20.3
      ΣfY² = 17.28 + 24.75 + 0 + 20 + 50 + 288 = 400.03 [6]
   b) Ȳ = 20.3/40 = 0.5075 (f.t.)
      S.D. of Y = √(400.03/40 - 0.5075²) = 3.121
   c) X = 50Y + 150.5
      X̄ = 50Ȳ + 150.5 = 175.875 (f.t.)
      S.D. of X = 50 × S.D. of Y = 156.1 (f.t.) [5]
   d) The standard deviation and mean are very high, which gives a misleading picture of the data.
      This is because a few uncharacteristically high data values have distorted them. [2]

9. a) Σfx = 109   Σf = 30   Σfx² = 459 (all)
      Mean = 109/30 = 3.633
      Variance = 459/30 - (109/30)² = 2.099
   b) Actual values = 0.001 × table values + 5
      Mean = 0.001 × 3.633 + 5 = 5.004 (f.t.)
      Variance = 0.001² × 2.099 = 0.000002099 [5]
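The coding in question 8 can be followed through numerically. This is a sketch that assumes the midpoints and frequencies listed for question 8 and the coding Y = (X - 150.5)/50. The variable names are illustrative only.

```python
from math import sqrt

# (midpoint X, frequency f) for each class, as read from question 8
classes = [(30.5, 3), (75.5, 11), (150.5, 17), (250.5, 5), (400.5, 2), (750.5, 2)]


def code(x):
    """Coding assumed from the scheme: Y = (X - 150.5) / 50."""
    return (x - 150.5) / 50


n = sum(f for _, f in classes)
sum_fy = sum(f * code(x) for x, f in classes)
sum_fy2 = sum(f * code(x) ** 2 for x, f in classes)

mean_y = sum_fy / n
sd_y = sqrt(sum_fy2 / n - mean_y ** 2)

# Undo the coding: the mean is scaled and shifted, the S.D. is only scaled.
mean_x = 50 * mean_y + 150.5
sd_x = 50 * sd_y
print(round(mean_x, 3), round(sd_x, 1))
```

The shift of 150.5 drops out of the S.D. because adding a constant moves every value by the same amount and leaves the spread unchanged.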

10. a) Height is a continuous variable
       A histogram makes it easy to see the spread of data [2]
    b) Class 140-150 means actual heights between 135 and 155
       so 1 cm represents 20 cm in width
       8 cm² represents 4 children
       2 cm² represents 1 child (or using frequency density)
       i) Width 0.5 cm, Area = 4 cm², height = 8 cm
       ii) Width 1.5 cm, Area = 12 cm², height = 8 cm [8]
    c) Class     Midpoint   Y    f
       80-100    90         -6   6
       110       110        -2   2
       120       120        0    9
       130       130        2    3
       140-150   145        5    4
       (Y values)
       ΣfY = -14
       Ȳ = -14/24 = -7/12
       ΣfY² = 336
       S.D. of Y = 3.696
       H = 5Y + 120
       H̄ = 5Ȳ + 120 = 117 1/12 cm
       S.D. of H = 5 × S.D. of Y = 18.48 cm [10]
    d) Actual values of heights are not known [1]
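The scaling argument in question 10 b) can be written as a small helper. This is a sketch under stated assumptions: the name `bar_on_paper` is hypothetical, the scale is taken as 1 cm on paper per 20 cm of height and 2 cm² per child, and the two calls assume that parts i) and ii) refer to a 10 cm class with 2 children and a 30 cm class with 6 children.

```python
def bar_on_paper(real_width_cm, children, cm_per_paper_cm=20, area_per_child=2):
    """Return (width in cm, area in cm^2, height in cm) of a bar as drawn."""
    width = real_width_cm / cm_per_paper_cm
    area = area_per_child * children      # area is proportional to frequency
    return width, area, area / width


print(bar_on_paper(10, 2))   # assumed class for part i)
print(bar_on_paper(30, 6))   # assumed class for part ii)
```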

11. a) mean = [(n - 3) + (n - 2) + (n - 1) + n + (n + 1) + (n + 2) + (n + 3)]/7 = n
       Deviations from mean are -3, -2, -1, 0, 1, 2, 3
       so S.D. = √[(3² + 2² + 1² + 0² + 1² + 2² + 3²)/7] = 2 [5]
    b) i) mean = 8   S.D. = 2 [2]
       ii) mean = 17   S.D. = 2 × 2 = 4
       iii) mean = a   S.D. = 2 × 2b = 4b
    c) mean decreases, since new value is below mean
       standard deviation decreases, since difference between new value and mean is less than standard deviation
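A quick numerical check of question 11: the S.D. of n - 3, ..., n + 3 does not depend on n, and a linear change of the data scales it. This is a sketch only. The transformation y = 2x + 1 is an assumption, chosen because it gives the mean of 17 and S.D. of 4 quoted in b) ii); the question itself is not reproduced here.

```python
from statistics import mean, pstdev


def seven_consecutive(n):
    """The seven values n-3, ..., n+3."""
    return [n + k for k in range(-3, 4)]


for n in (8, 100):
    data = seven_consecutive(n)
    print(n, mean(data), pstdev(data))   # population S.D. is the same for every n

# Assumed linear change y = p*x + q (p and q are illustrative values)
p, q = 2, 1
scaled = [p * x + q for x in seven_consecutive(8)]
print(mean(scaled), pstdev(scaled))
```

`pstdev` is the population standard deviation (divisor n), which is the version used throughout this mark scheme.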

12. a) Stem Leaves 1 7 1 5 8 8 9 3 0 0 1 4 4 6 6 7 8 9 (order) 4 0 0 4 7 7 7 7 7 (alignment) 5 6 8 8
    b) median: 15.5th value, 37.5
       LQ: n/4 = 7.5, so 8th value, 30
       UQ: 3n/4 = 22.5, so 23rd value, 47
       Interquartile range = 17 [6]
    c) Box plot against a Mark axis from 15 to 60
       (graph paper, to scale) (max and min) (median; quartiles, f.t.)
       Positive skew
    d) Both have slight positive skew
       Second class have a much smaller range: less of very low or very high ability
       Smaller interquartile range for second class, showing middle 50% are much more consistent
       Similar medians, showing average ability of classes similar
       B4 (any 4 independent comments)

13. a) Shoe size can only take discrete values. [1]
    b) Shoe size   Frequency   Cumulative Frequency
       1           5           5
       1½          8           13
       2           13          26
       2½          8           34
       3           6           40
       3½          0           40
       (all)
       Cumulative frequency step polygon: Cumulative frequency (0 to 50) against Shoe size (0.5 to 4)
       (attempt at step polygon) (labelled, clear axes) (plotted accurately)

14. a) i) Makes it clear where bulk of data lie.
          Can see shape / skewness of distribution   B2
       ii) Can see central 50% / range easily
           Can see skewness easily
           Can be used to compare distributions   B2 (any 2)
       iii) Can get an idea of shape of distribution while all data is retained
            Can use it to find median and quartiles easily   B2 [6]
    b) i) continuous
       ii) continuous or discrete
       iii) discrete

15. a) True class     Frequency   Cumulative Frequency
       18 ≤ x < 21    7           7
       21 ≤ x < 26    10          17
       26 ≤ x < 31    15          32
       31 ≤ x < 41    20          52
       41 ≤ x < 51    7           59
       51 ≤ x         1           60
       (cumulative frequency) (class boundaries)
       Cumulative frequency graph: Cumulative frequency (0 to 60) against Age (years), 15 to 75
       (graph paper, to scale) A2 (all points correct) (reasonable upper bound: 61 at least) [7]
    b) Look up 30 on cumulative frequency axis
       30.4 (accept 29 to 32) [2]
    c) Required 2.5th and 97.5th percentiles
       look up 1.5 and 58.5 on cumulative frequency axis
       2.5th: 19.1 (18 to 20)
       97.5th: 50.3 (49 to 52)
       So central 95% of data are between 19.1 and 50.3
    d) µ ± 2σ = 31.6 ± 2√70.19 = 14.8, 48.4
       Central 95% are slightly less spread than for normal distribution and take somewhat higher values (suggesting skewness)
       But reasonably close, so could be normal distribution [6]

16. a) The length 12.2 cm [1]
    b) A: 29 lizards, 15th value, 10.2 cm
       B: 33 lizards, 17th value, 11.7 cm
       (-1 if misread to be 10, not 10.2, etc.)
    c) Species A have a smaller range of lengths
       Commonest lengths for A are less than those for B
       Both distributions fairly symmetrical.
    d) Box plot (histogram OK if it is clear data are to be grouped) [1]
    e) Box plot:
       advantage: can clearly see location of middle 50% of data, or can see skewness clearly (either)
       disadvantage: actual data values lost, or cannot see overall shape of distribution (either)
       Histogram:
       advantage: can see shape / skewness easily
       disadvantage: actual data values lost [2]

15 4 + 9 5 + 30 + 35 + 70 8 17.a) mean = = 8 9 15 + 9 + 1 + 1 + 1
       mode = 24
       median: 14th value = 24
    b) median (or mode)
       These are representative of the ages of those at party [2]
    c) mean: increase, since 25 is below mean age
       median: unaffected, since the middle value will still be 24
       mode: unchanged; 24 is still commonest age
    d) 0 5.05 + 3 40 3 = 9.609

18. a) True boundaries   Frequency   Cumulative Frequency
       0 ≤ x < 2         5           5
       2 ≤ x < 4         4           9
       4 ≤ x < 6         5           14
       6 ≤ x < 8         10          24
       8 ≤ x < 10        12          36
       10 ≤ x < 13       3           39
       13 ≤ x            1           40
       (cumulative frequency) (boundaries)
       median: ½n = 20, in 6 ≤ x < 8 class
       median = 6 + 2 × (20 - 14)/(24 - 14) = 7.2
       3rd decile: 0.3n = 12, in 4 ≤ x < 6 class
       3rd decile = 4 + 2 × (12 - 9)/(14 - 9) = 5.2 [9]
    b) The age 30% of the cars are below [1]
    c) The exact ages of the cars are not known [1]
    d) There may be a tendency for people to buy cars more at some times of the year (e.g. when new registration letter comes out)
       Also, those in for repair may tend to be older than a random selection of cars
       So may not be very accurate [2]

19. a) (5µ + 10 × 2µ)/15 = 5µ/3
    b) i) First set: σ² = Σx²/5 - µ², so Σx² = 5(µ² + σ²)
          Second set: Σx² = 10(4µ² + 9σ²)
          In total, Σx² = 5µ² + 5σ² + 40µ² + 90σ² = 45(2σ)² + 95σ² = 275σ²
          S.D. = √(275σ²/15 - (5µ/3)²) = √(55σ²/3 - (10σ/3)²) = √(55σ²/3 - 100σ²/9) = √65 σ/3 [10]
       ii) The second set
           Looking at (mean - S.D.) for both, we get σ in both cases
           Looking at (mean - 2 S.D.) for both, we get 0 for set 1 and -2σ for set 2
           Since this is lower for set 2, it probably has lower values (reasonable argument) [2]
    c) mean: increase, since new value is above old mean
       standard deviation: decrease, since the difference between this value and the mean is less than σ.
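The algebra in question 19 b) i) can be checked with exact fractions. This is a sketch resting on assumptions read off the working: set 1 has 5 values with mean µ and S.D. σ, set 2 has 10 values with mean 2µ and S.D. 3σ, and µ = 2σ. The function name is hypothetical. Everything is measured in units of σ.

```python
from fractions import Fraction


def combined_variance_in_sigma2(mu_over_sigma):
    """Combined variance of the 15 values, as a multiple of sigma^2."""
    m = Fraction(mu_over_sigma)                 # mu expressed in units of sigma
    sum_x = 5 * m + 10 * (2 * m)                # n * mean for each set
    # sum of x^2 = n(mean^2 + sd^2) for each set, with sd = 1 and 3 (units of sigma)
    sum_x2 = 5 * (m ** 2 + 1) + 10 * ((2 * m) ** 2 + 3 ** 2)
    mean = sum_x / 15
    return sum_x2 / 15 - mean ** 2


variance = combined_variance_in_sigma2(2)       # assumed relation mu = 2*sigma
print(variance)                                 # variance as an exact fraction of sigma^2
```

Taking the square root of this fraction of σ² gives the combined S.D. as a multiple of σ.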

20. a) Using Area = Frequency
       Weight           Frequency
       40 ≤ x < 50      5
       50 ≤ x < 55      10
       55 ≤ x < 60      13
       60 ≤ x < 65      7
       65 ≤ x < 75      5
       75 ≤ x < 90      6
       90 ≤ x < 110     4
       A3 (all)
    b) Use of midpoints
       Σfx = 3180   Σfx² = 213350   n = 50
       mean = 63.6
       S.D. = 14.901
    c) There are a small number of high values which distort the mean and standard deviation
       Use median and interquartile range

21. a) mean, median both increase by 5
       standard deviation and interquartile range unchanged
    b) All increase by 5% [1]
    c) mean and standard deviation increase
       median and interquartile range unchanged [2]
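The midpoint calculation in question 20 b) can be reproduced as follows. This is a sketch that assumes the class boundaries and frequencies listed in part a); the variable names are illustrative.

```python
from math import sqrt

# (lower boundary, upper boundary, frequency), as read from question 20 a)
classes = [
    (40, 50, 5), (50, 55, 10), (55, 60, 13), (60, 65, 7),
    (65, 75, 5), (75, 90, 6), (90, 110, 4),
]

n = sum(f for _, _, f in classes)
# Each class is represented by its midpoint (lo + hi) / 2
sum_fx = sum(f * (lo + hi) / 2 for lo, hi, f in classes)
sum_fx2 = sum(f * ((lo + hi) / 2) ** 2 for lo, hi, f in classes)

mean = sum_fx / n
sd = sqrt(sum_fx2 / n - mean ** 2)
print(n, sum_fx, sum_fx2, round(mean, 1), round(sd, 3))
```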

22. a) If median is closer to upper quartile than lower quartile, it is negatively skewed
       If median is same distance from both quartiles, it is symmetrical
       If median is closer to lower quartile than upper quartile, it is positively skewed
    b) Time   Frequency   Cumulative Frequency
       20     2           2
       40     5           7
       60     13          20
       80     8           28
       100    1           29
       120    1           30
       (cumulative frequency)
       median: 15.5th value, 60
       LQ: n/4 = 7.5, so 8th value, 60
       UQ: 3n/4 = 22.5, so 23rd value, 80 [6]
    c) Class boundaries are 10 to 30, 30 to 50, 50 to 70, 70 to 90, 90 to 110 and 110 to 130
       Median: 15th value; in class 50 to 70
       So median = 50 + 20 × (15 - 7)/(20 - 7) = 62.308
       LQ = 7.5th value; in class 50 to 70
       LQ = 50 + 20 × (7.5 - 7)/(20 - 7) = 50.769
       UQ = 22.5th value; in class 70 to 90
       UQ = 70 + 20 × (22.5 - 20)/(28 - 20) = 76.25 [7]
    d) Values calculated in c) assume the times are evenly spread through the given interval
       The values from interpolation, since the times really will be evenly spread

QUESTION 22 CONTINUED
    e) Since first class had higher values for median and quartiles, they were slower overall.
       First class had times slightly less spread out than 2nd class.
       2nd class times are symmetrical
       1st class times have slight positive skew
       B4 (any 4 independent correct comments)