A LEVEL MATHEMATICS ANSWERS AND MARKSCHEMES
SUMMARY STATISTICS AND DIAGRAMS

1. a) 45   B1   [1]
b) 7th value = 37   M1 A1   [2]

c) LQ: n/4 = 3.25, so 4th value; LQ = 25
UQ: 3n/4 = 9.75, so 10th value; UQ = 45
IQR = 20   f.t.
d) Median is closer to upper quartile
Hence negative skew   [2]
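The position rule used for the quartiles in Question 1 can be sketched in Python. This is only an illustration: the function names and the 13 data values are hypothetical (the original data set is not reproduced in this mark scheme) and were chosen so that the 4th, 7th and 10th values match the answers above. The rule assumed here is: work out the position n × p; if it is not whole, round up to the next value; if it is whole, average that value and the next one.

```python
def quantile_positions(n, num, den):
    """1-based position(s) used for the num/den quantile of n ordered values."""
    whole, remainder = divmod(n * num, den)
    if remainder:                # e.g. 13/4 = 3.25 -> use the 4th value
        return [whole + 1]
    return [whole, whole + 1]    # whole position -> average two neighbours


def quantile(sorted_values, num, den):
    """Value of the num/den quantile using the position rule above."""
    positions = quantile_positions(len(sorted_values), num, den)
    picked = [sorted_values[p - 1] for p in positions]
    return sum(picked) / len(picked)


# Hypothetical data (13 ordered values), not taken from the question paper.
data = [12, 18, 21, 25, 30, 33, 37, 40, 43, 45, 47, 50, 56]

lq = quantile(data, 1, 4)        # 4th value
median = quantile(data, 1, 2)    # 7th value
uq = quantile(data, 3, 4)        # 10th value
iqr = uq - lq
print(lq, median, uq, iqr)
```

Passing the fraction as two integers avoids any floating-point doubt about whether a position such as 9.75 is whole.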

2. a)
Orders   Frequency   Cumulative Frequency
30       2           2
31       3           5
32       4           9
33       3           12
34       10          22
35       6           28
36       8           36
37       5           41
38       5           46
39       6           52
(cumulative frequencies)
Median: 26.5th position, so median = 35
LQ: n/4 = 13, so ½(12th + 13th) = 33.5
UQ: 3n/4 = 39, so ½(39th + 40th) = 37
Interquartile range = 3.5   f.t.   [7]
b) µ = Σfx/Σf = 1827/52 = 35.135
σ = √(Σfx²/Σf − µ²) = √(64514/52 − (1827/52)²) = 2.493
c) Mean and standard deviation can be distorted by a small number of high values
This is not the case here (and mean is close to median), so they are suitable   [2]
d) [Box plot on an axis from 30 to 39, labelled "Number of orders"]
(on graph paper, scale) (max and min) ft (median; quartiles)
e) Slight positive skew.
Tendency to peak close to mean / median (or "in the middle") (either)
Central 50% of data in a relatively small range (or other comment)   [2]

3. a)
Mark           Frequency   Cumulative Frequency
0 ≤ x < 30     4           4
30 ≤ x < 40    7           11
40 ≤ x < 50    6           17
50 ≤ x < 60    5           22
60 ≤ x < 80    3           25
80 ≤ x < 100   2           27
(cum. freqs.)
i) n/2 = 13.5, in 40 ≤ x < 50 class
median = 40 + 10 × (13.5 − 11)/(17 − 11) = 44.167   c.a.o.
ii) 0.87n = 23.49, in 60 ≤ x < 80 class
87th percentile = 60 + 20 × (23.49 − 22)/(25 − 22) = 69.933   [7]
b) Use of midpoints
mean = 1240/27 = 45.926
c) Second class must have a few very high values, because of the mean being much higher than the median.
Medians are very similar, suggesting a comparable performance overall, ignoring the very high achievers in the second class.   [2]
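The linear interpolation in Question 3 can be checked with a short Python sketch. The function name grouped_quantile is hypothetical, and the class boundaries and frequencies below are the ones read from the table in 3 a) (27 marks in six classes); the code assumes values are spread evenly within each class, as the mark scheme does.

```python
def grouped_quantile(boundaries, freqs, target):
    """Value at cumulative position `target`, by linear interpolation."""
    cum = 0
    for lower, upper, f in zip(boundaries, boundaries[1:], freqs):
        if f > 0 and target <= cum + f:
            # Fraction of the way through this class, scaled by its width.
            return lower + (upper - lower) * (target - cum) / f
        cum += f
    raise ValueError("target exceeds total frequency")


boundaries = [0, 30, 40, 50, 60, 80, 100]   # class boundaries, as read from 3 a)
freqs = [4, 7, 6, 5, 3, 2]
n = sum(freqs)                               # 27

median = grouped_quantile(boundaries, freqs, n / 2)   # position 13.5
p87 = grouped_quantile(boundaries, freqs, 23.49)      # position 0.87n

# Part b): estimate of the mean using class midpoints.
midpoints = [(a + b) / 2 for a, b in zip(boundaries, boundaries[1:])]
mean = sum(m * f for m, f in zip(midpoints, freqs)) / n

print(round(median, 3), round(p87, 3), round(mean, 3))
```

The same function applies to the other interpolation questions once their true class boundaries and frequencies are supplied.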

4. a)
Time      True Boundaries   Frequency   Cumulative Frequency
0 – 4     0 – 4.5           66          66
5 – 9     4.5 – 9.5         51          117
10 – 14   9.5 – 14.5        12          129
15 – 19   14.5 – 19.5       6           135
20 – 24   19.5 – 24.5       3           138
25 – 29   24.5 – 29.5       2           140
(true class boundaries) (cum freqs)
n/2 = 140/2 = 70, in 4.5 – 9.5 class
median = 4.5 + 5 × (70 − 66)/(117 − 66) = 4.892
n/4 = 35, in 0 – 4.5 class
LQ = 0 + 4.5 × (35 − 0)/66 = 2.386
3n/4 = 105, in 4.5 – 9.5 class
UQ = 4.5 + 5 × (105 − 66)/(117 − 66) = 8.324
I.Q.R. = 5.938   [10]
b) [Two box plots, "Second garage" and "First garage", drawn on the same axis: Time (hours), 0 to 35]
(graph paper, to scale) (both max and min) (median and quartiles) (labelled) (on same axes)   [5]

QUESTION 4 CONTINUED
c) Both have positive skew.
2nd garage has greater range (and greater interquartile range), so times more varied.
Median times close, but 1st garage is slightly higher, suggesting it has fewer very low times.
The 1st garage is probably a better choice since there's a smaller chance of waiting a very long time.
B4 (any four sensible independent comments)

5. a) µ = Σx/30 = 1512/30 = 50.4
σ = √(Σx²/30 − 50.4²) = √(83838/30 − 50.4²) = 15.95
b) Σx = 1580
new mean = 1580/31 = 50.968
Σx² = 83838 + 68² = 88462
new variance = 88462/31 − (1580/31)² = 255.90   [5]
c) Increase mean, because mark removed is below current mean, and new mark is above it
Decrease variance, because new mark is closer to mean than the old one
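The update in Question 5 b) works from running totals rather than the raw marks, which a few lines of Python make explicit. This is a sketch under stated assumptions: the helper name mean_and_variance is hypothetical, and the original total Σx = 1512 is inferred from 1580 − 68 (it agrees with a mean of 50.4 for 30 marks).

```python
import math


def mean_and_variance(n, sum_x, sum_x2):
    """Mean and (population) variance from n, the sum and the sum of squares."""
    mean = sum_x / n
    return mean, sum_x2 / n - mean ** 2


# Original 30 marks: sum_x is inferred as 1580 - 68 (assumption, see lead-in).
n, sum_x, sum_x2 = 30, 1512, 83838
old_mean, old_var = mean_and_variance(n, sum_x, sum_x2)

# Add the extra mark of 68 by updating the three totals.
extra = 68
new_mean, new_var = mean_and_variance(n + 1, sum_x + extra, sum_x2 + extra ** 2)

print(round(old_mean, 1), round(math.sqrt(old_var), 2))
print(round(new_mean, 3), round(new_var, 2))
```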

6. a) (22.4 × 20 + 25.7 × 30)/50 = 24.38
b) σ² = Σx²/n − µ², so Σx² = n(σ² + µ²)
Group of 20: Σx² = 20(2.1² + 22.4²) = 10123.4
Group of 30: Σx² = 30(1.9² + 25.7²) = 19923
c) σ = √((10123.4 + 19923)/(20 + 30) − 24.38²) = 2.558

7. a) Although data appears to be discrete, it is actually continuous   [1]
b)
True Class Boundary   Frequency Density
0.5 – 3.5             9/3 = 3
3.5 – 6.5             4
6.5 – 11.5            2
11.5 – 21.5           0.2
21.5 – 41.5           0.1
(true class boundaries) (frequency density)
[Histogram: Frequency Density (0 to 5) against Lateness (minutes), 0 to 40]
(graph paper, to scale) (accurate)   [5]
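The method of Question 6 (recover Σx and Σx² for each group, add them, then recompute) can be sketched as a small Python function. The name combine and the tuple layout are hypothetical; the figures are those of the two groups in Question 6 (20 values with mean 22.4 and S.D. 2.1, 30 values with mean 25.7 and S.D. 1.9).

```python
import math


def combine(groups):
    """Combined mean and S.D. from (n, mean, sd) summaries of each group."""
    total_n = sum(n for n, _, _ in groups)
    total_x = sum(n * mean for n, mean, _ in groups)
    # For each group, sum of squares = n * (sd^2 + mean^2).
    total_x2 = sum(n * (sd ** 2 + mean ** 2) for n, mean, sd in groups)
    mean = total_x / total_n
    sd = math.sqrt(total_x2 / total_n - mean ** 2)
    return mean, sd


mean, sd = combine([(20, 22.4, 2.1), (30, 25.7, 1.9)])
print(round(mean, 2), round(sd, 3))
```

Note that the combined standard deviation is larger than either group's own value, because the two group means differ.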

8. a)
Class            X (midpoint)   Y      f    fY      fY²
10.5 – 50.5      30.5           −2.4   3    −7.2    17.28
50.5 – 100.5     75.5           −1.5   11   −16.5   24.75
100.5 – 200.5    150.5          0      17   0       0
200.5 – 300.5    250.5          2      5    10      20
300.5 – 500.5    400.5          5      2    10      50
500.5 – 1000.5   750.5          12     2    24      288
(true class boundaries) (midpoints) (Y) (fY, fY²)
ΣfY = −7.2 − 16.5 + 10 + 10 + 24 = 20.3
ΣfY² = 17.28 + 24.75 + 20 + 50 + 288 = 400.03   [6]
b) Mean of Y = 20.3/40 = 0.5075   f.t.
S.D. of Y = √(400.03/40 − 0.5075²) = 3.121
c) X = 50Y + 150.5
Mean of X = 50 × mean of Y + 150.5 = 175.875   f.t.
S.D. of X = 50 × S.D. of Y = 156.1   f.t.   [5]
d) The standard deviation and mean are very high which gives a misleading picture of the data.
This is because a few uncharacteristically high data values have distorted them.   [2]

9. a) Σfx = 109   Σf = 30   Σfx² = 459   (all)
Mean = 109/30 = 3.633
Variance = 459/30 − (109/30)² = 2.099
b) Actual values = 0.001 × table values + 5
Mean = 0.001 × 3.633 + 5 = 5.004   f.t.
Variance = 0.001² × 2.099 = 0.000002099   [5]
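The coding in Question 8 can be checked with a short Python sketch. The variable names are hypothetical; the midpoints and frequencies are those of the table in 8 a), and the coding is Y = (X − 150.5)/50 as stated in 8 c).

```python
import math

midpoints = [30.5, 75.5, 150.5, 250.5, 400.5, 750.5]   # X values from 8 a)
freqs = [3, 11, 17, 5, 2, 2]

# Code the data: Y = (X - 150.5) / 50.
coded = [(x - 150.5) / 50 for x in midpoints]

n = sum(freqs)
sum_fy = sum(f * y for f, y in zip(freqs, coded))
sum_fy2 = sum(f * y ** 2 for f, y in zip(freqs, coded))

mean_y = sum_fy / n
sd_y = math.sqrt(sum_fy2 / n - mean_y ** 2)

# Decode: the mean is scaled and shifted, the S.D. is only scaled.
mean_x = 50 * mean_y + 150.5
sd_x = 50 * sd_y

print(round(mean_y, 4), round(sd_y, 3), round(mean_x, 3), round(sd_x, 1))
```

The same decoding step explains Question 9 b): a scale factor of 0.001 multiplies the variance by 0.001², while adding 5 affects only the mean.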

10. a) Height is a continuous variable
A histogram makes it easy to see the spread of data   [2]
b) Class 140 – 150 means actual heights between 135 and 155
so 1 cm represents 20 cm in width
8 cm² represents 4 children
2 cm² represents 1 child (or using frequency density)
i) Width 0.5 cm; Area = 4 cm²; height = 8 cm
ii) Width 1.5 cm; Area = 12 cm²; height = 8 cm   [8]
c)
Class       Midpoint   Y    f
80 – 100    90         −6   6
110         110        −2   2
120         120        0    9
130         130        2    3
140 – 150   145        5    4
(Y values)
ΣfY = −14
Mean of Y = −14/24 = −7/12
ΣfY² = 336
S.D. of Y = 3.696
H = 5Y + 120
Mean of H = 5 × mean of Y + 120 = 117 cm
S.D. of H = 5 × S.D. of Y = 18.48 cm   [10]
d) Actual values of heights are not known   [1]

11. a) mean = [(n − 3) + (n − 2) + (n − 1) + n + (n + 1) + (n + 2) + (n + 3)]/7 = n
Deviations from mean are −3, −2, −1, 0, 1, 2, 3
so S.D. = √((3² + 2² + 1² + 0² + 1² + 2² + 3²)/7) = 2   [5]
b) i) mean = 8; S.D. = 2   [2]
ii) mean = 17; S.D. = 2 × 2 = 4
iii) mean = a; S.D. = 2 × 2b = 4b
c) mean decreases, since new value is below mean
standard deviation decreases, since difference between new value and mean is less than standard deviation

1.a) Stem Leaves 1 7 1 5 8 8 9 3 0 0 1 4 4 6 6 7 8 9 (order) 4 0 0 4 7 7 7 7 7 (alignment) 5 6 8 8 b) median : 15.5 th value 37.5 LQ : 4n = 7.5 8 th value 30 UQ : 3 n 4 =.5 3 rd value 47 Interquartile range= 17 [6] c) 15 0 5 30 35 40 45 50 55 60 Mark (graph paper, to scale) (max and min) ft (median; quartiles) Positive skew d) Both have slight positive skew Second class have a much smaller range less of very low or very high ability Smaller interquartile range for second class, showing middle 50% are much more consistent Similar medians, showing average ability of classes similar B4 (any 4 independent comments) Page 10

13. a) Shoe size can only take discrete values.   [1]
b)
Shoe size   Frequency   Cumulative Frequency
1           5           5
1½          8           13
2           13          26
2½          8           34
3           6           40
3½          0           40
(all)
[Cumulative frequency step polygon: Cumulative frequency (0 to 50) against Shoe size (0 to 4)]
attempt at step polygon; labelled, clear axes; plotted accurately

14. a) i) Makes it clear where bulk of data lie.
Can see shape / skewness of distribution   B2
ii) Can see central 50% / range easily
Can see skewness easily
Can be used to compare distributions   B2 (any 2)
iii) Can get an idea of shape of distribution while all data is retained
Can use it to find median and quartiles easily   B2   [6]
b) i) continuous
ii) continuous or discrete
iii) discrete

15. a)
True class     Frequency   Cumulative Frequency
18 ≤ x < 21    7           7
21 ≤ x < 26    10          17
26 ≤ x < 31    15          32
31 ≤ x < 41    20          52
41 ≤ x < 51    7           59
51 ≤ x         1           60
(cumulative frequency) (class boundaries)
[Cumulative frequency curve: Cumulative frequency (0 to 60) against Age (years), 15 to 75]
(graph paper, to scale) A2 (all points correct) (reasonable upper bound: 61 at least)   [7]
b) Look up 30 on cumulative frequency axis
30.4 (accept 29 – 32)   [2]
c) Required 2.5th and 97.5th percentiles
Look up 1.5 and 58.5 on cumulative frequency axis
2.5th: 19.1 (18 – 20)
97.5th: 50.3 (49 – 52)
So central 95% of data are between 19.1 and 50.3
d) µ ± 2σ = 31.6 ± 2√70.19 = 14.8, 48.4
Central 95% are slightly less spread than for normal distribution and take somewhat higher values (suggesting skewness)
But reasonably close, so could be normal distribution   [6]

16. a) The length 1. cm   [1]
b) A: 29 lizards; 15th value = 10.2 cm
B: 33 lizards; 17th value = 11.7 cm
(−1 if misread to be 102, not 10.2 etc.)
c) Species A have a smaller range of lengths
Commonest lengths for A are less than those for B
Both distributions fairly symmetrical.
d) Box plot (histogram OK if it is clear data are to be grouped)   [1]
e) Box plot:
advantage: can clearly see location of middle 50% of data, or: can see skewness clearly (either)
disadvantage: actual data values lost, or: cannot see overall shape of distribution (either)
Histogram:
advantage: can see shape / skewness easily
disadvantage: actual data values lost   [2]

15 4 + 9 5 + 30 + 35 + 70 8 17.a) mean = = 8 9 15 + 9 + 1 + 1 + 1 mode = 4 median : 14 th value = 4 b) median (or mode) These are representative of the ages of those at party [] c) mean : increase, since 5 is below mean age median : unaffected, since the middle value will still be 4 mode : unchanged; 4 is still commonest age d) 0 5.05 + 3 40 3 = 9.609 Page 14

18. a)
True boundaries   Frequency   Cumulative Frequency
0 ≤ x < 2         5           5
2 ≤ x < 4         4           9
4 ≤ x < 6         5           14
6 ≤ x < 8         10          24
8 ≤ x < 10        12          36
10 ≤ x < 13       3           39
13 ≤ x            1           40
(cumulative frequency) (boundaries)
median: n/2 = 20, in 6 ≤ x < 8 class
median = 6 + 2 × (20 − 14)/(24 − 14) = 7.2
3rd decile: 0.3n = 12, in 4 ≤ x < 6 class
3rd decile = 4 + 2 × (12 − 9)/(14 − 9) = 5.2   [9]
b) The age 30% of the cars are below   [1]
c) The exact ages of the cars are not known   [1]
d) There may be a tendency for people to buy cars more at some times of the year (e.g. when new registration letter comes out)
Also, those in for repair may tend to be older than a random selection of cars
So may not be very accurate   [2]

19. a) mean = (5µ + 10 × 2µ)/15 = 5µ/3
b) i) First set: σ² = Σx²/5 − µ², so Σx² = 5(µ² + σ²)
Second set: Σx² = 10(4µ² + 9σ²)
In total, Σx² = 5µ² + 5σ² + 40µ² + 90σ² = 45(2σ)² + 95σ² = 275σ²
S.D. = √(275σ²/15 − (5µ/3)²) = √(55σ²/3 − (10σ/3)²) = √(55σ²/3 − 100σ²/9) = (√65/3)σ   [10]
ii) The second set
Looking at µ − σ for both, we get σ in both cases
Looking at µ − 2σ for both, we get 0 for set 1 and −2σ for set 2
Since this is lower for set 2, it probably has lower values (reasonable argument)   [2]
c) mean: increase, since new value is above old mean
standard deviation: decrease, since the difference between this value and the mean is less than σ.

20. a) Using Area = Frequency
Weight          Frequency
40 ≤ x < 50     5
50 ≤ x < 55     10
55 ≤ x < 60     13
60 ≤ x < 65     7
65 ≤ x < 75     5
75 ≤ x < 90     6
90 ≤ x < 110    4
A3 (all)
b) Use of midpoints
Σfx = 3180   Σfx² = 213350   n = 50
mean = 63.6
S.D. = 14.901
c) There are a small number of high values which distort the mean and standard deviation
Use median and interquartile range

21. a) mean, median both increase by 5
standard deviation and interquartile range unchanged
b) All increase by 5%   [1]
c) mean and standard deviation increase
median and interquartile range unchanged   [2]
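The totals quoted in Question 20 b) can be reproduced from the frequency table with a short Python sketch. The variable names are hypothetical; the class boundaries and frequencies are those of 20 a), and each class is represented by its midpoint, as the mark scheme specifies.

```python
import math

boundaries = [40, 50, 55, 60, 65, 75, 90, 110]   # weight classes from 20 a)
freqs = [5, 10, 13, 7, 5, 6, 4]

# Represent every class by its midpoint.
midpoints = [(a + b) / 2 for a, b in zip(boundaries, boundaries[1:])]

n = sum(freqs)
sum_fx = sum(f * x for f, x in zip(freqs, midpoints))
sum_fx2 = sum(f * x ** 2 for f, x in zip(freqs, midpoints))

mean = sum_fx / n
sd = math.sqrt(sum_fx2 / n - mean ** 2)

print(n, sum_fx, sum_fx2, round(mean, 1), round(sd, 3))
```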

22. a) If median is closer to upper quartile than lower quartile, it is negatively skewed
If median is same distance from both quartiles, it is symmetrical
If median is closer to lower quartile than upper quartile, it is positively skewed
b)
Time   Frequency   Cumulative Frequency
20     2           2
40     5           7
60     13          20
80     8           28
100    1           29
120    1           30
(cumulative frequency)
median: 15.5th value = 60
LQ: n/4 = 7.5, so 8th value = 60
UQ: 3n/4 = 22.5, so 23rd value = 80   [6]
c) Class boundaries are 10 – 30, 30 – 50, 50 – 70, 70 – 90, 90 – 110 and 110 – 130
Median: 15th value; in class 50 – 70
So median = 50 + 20 × (15 − 7)/(20 − 7) = 62.308
LQ = 7.5th value; in class 50 – 70
LQ = 50 + 20 × (7.5 − 7)/(20 − 7) = 50.769
UQ = 22.5th value; in class 70 – 90
UQ = 70 + 20 × (22.5 − 20)/(28 − 20) = 76.25   [7]
d) Values calculated in c) assume the times are evenly spread through given interval
The values from interpolation, since times really will be evenly spread

QUESTION 22 CONTINUED
e) Since first class had higher values for median and quartiles, they were slower overall.
First class had times slightly less spread out than 2nd class.
2nd class times are symmetrical
1st class times have slight positive skew
B4 (any 4 independent correct comments)