Statistics I Chapter 2: Analysis of univariate data


1 Statistics I Chapter 2: Analysis of univariate data

2 Chapter 2: Analysis of univariate data
Contents
1. Representations and graphs: frequency tables; bar and pie charts, pictograms, histograms, frequency polygons; other graphs; lying with graphs.
2. Numerical summary: central tendency (mean, median, mode); location (quartiles and percentiles), box plots; spread (variance, standard deviation, range, IQR, coefficient of variation); shape (coefficients of skewness and kurtosis).

3 Chapter 2: Analysis of univariate data Recommended reading Peña, D., Romo, J. Introducción a la Estadística para las Ciencias Sociales (1997). Chapters 2, 3, 4 and 5. Newbold, P. Statistics for Business and Economics (2008). Chapters 1 and 2.

4 Description of qualitative variables Sample: 46 professionals of a computer company in the United States. Variable: EDUC: education level (1=High School; 2=College; 3=Advanced Degree) Variable: MGT: position of responsibility (1=yes; 0=no) In order to obtain information: how can we summarize the raw data in a more useful way that allows a quick visual interpretation?

5 Description of qualitative variables: frequency tables and bar charts Education level Number of employees Proportion of employees High School College Advanced Degree Total 46 1

6 Description of qualitative variables: general outline of a frequency table
Class, $c_i$    Absolute freq., $n_i$    Relative freq., $f_i$
$c_1$           $n_1$                    $f_1 = n_1/n$
$c_2$           $n_2$                    $f_2 = n_2/n$
...             ...                      ...
$c_k$           $n_k$                    $f_k = n_k/n$
Total           $n$                      1
Note: $n_i$ = number of observations of class $c_i$ in the sample, $f_i = n_i/n$, $0 \le f_i \le 1$.

7 Description of qualitative variables: the bar chart Bars are of the same width and equally-spaced, with heights corresponding to frequencies There are gaps between bars Bars are labeled with class names

8 Other graphics: the pie chart Each slice is a fraction of the total size of the pie. Many software programs rank slices alphabetically. Although attractive, pie charts are harder to interpret than bar charts. Avoid 3D pie charts: in them, the area in the background seems smaller than the area in the foreground.

9 The pie chart: example Tabla dinámica (Pivot table) Sample: The Simpsons, first 568 episodes. Variable: character performing the leading role (the one who speaks more) in an episode. Note: You can obtain the same chart from the raw data (without using a Pivot table). Check the Supplementary materials about Excel usage.

10 Other graphics: the Pareto chart Bar chart in which the categories of the variable are ranked in decreasing order of frequency. Applies only to nominal qualitative variables. Useful for detecting the most significant reasons (a few options account for almost all the purchasing frequency). Pareto Principle (80-20 rule): based on empirical knowledge, Pareto stated in 1896 that society was divided into two groups in proportion 80-20, "the few of many and the many of few": a minority group made up of 20 % of the population who owned 80 % of something, and a majority group made up of 80 % of the population who owned the remaining 20 %.

11 The Pareto chart: example Sample: Among the 1100 visitors of the art exhibition "Turner and the Masters" (Prado Museum, June 22 to September), those who bought their tickets online (20.3 %). Source: Institute for Tourism Studies. Variable: Main reason for buying the ticket online.
Table 9. Visitors by main reason for buying the ticket online (filter: bought the ticket online), in %
Convenience                                   60.5
Speed                                         10.1
I can choose the day and time of the visit    14.0
I do not have to queue at the ticket office    9.5
The ticket is cheaper                          4.3
24-hour availability                           1.2
I had heard good things about the service      0.4
Total                                        100.0
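The ranking behind a Pareto chart can be sketched in a few lines of Python, using the percentages of Table 9. This is only an illustrative sketch: the English category labels and the variable names (`reasons`, `ranked`, `cumulative`) are assumptions made here, not part of the original slides.

```python
# Percentages from Table 9 (main reason for buying the ticket online).
# The English labels are translations chosen for this sketch.
reasons = {
    "Convenience": 60.5,
    "Speed": 10.1,
    "Can choose day and time": 14.0,
    "No queue at the ticket office": 9.5,
    "Cheaper ticket": 4.3,
    "24-hour availability": 1.2,
    "Heard good things about the service": 0.4,
}

# A Pareto chart ranks the categories by decreasing frequency
# and is usually accompanied by the cumulative percentage.
ranked = sorted(reasons.items(), key=lambda item: item[1], reverse=True)

cumulative = []
running = 0.0
for label, pct in ranked:
    running += pct
    cumulative.append((label, pct, round(running, 1)))

for label, pct, cum in cumulative:
    print(f"{label:<38}{pct:>6.1f}{cum:>7.1f}")
```

Read this way, the first two reasons alone account for about three quarters of the online purchases, which is the kind of concentration the Pareto chart is meant to reveal.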

12 The Pareto chart: example

13 Other graphics: pictograms Sample: 70 university students from Madrid. Variable: Preferred political party. Columns: preferred political party; number of students; proportion of students. PSOE PP Unidos Podemos Ciudadanos Otros Total 70 1 The area of the graph is proportional to the frequency.

14 Exercise Results from a survey conducted among year-olds about their favorite leisure activity What is the variable and who are the individuals? For what percentage of young people is reading the preferred leisure activity?

15 Exercise From a test taken by a group of students, graded between 1 and 8, the following table was obtained (columns: grade $c_i$, $n_i$, $f_i$). How many students took the test? What percentage of students obtained a grade greater than or equal to 6?

16 Exercise In a survey about health habits, 30 randomly chosen students were asked about the sport they usually practice. The results are shown in the following table (columns: sport $c_i$, $n_i$, $f_i$): Basketball Swimming Football None Total 30 1 Which of the following charts corresponds to the data above?

17 Exercise [Four candidate bar charts, a) to d), each titled "Sport" (Deporte), with categories Basketball, Swimming, Football, No sport]

18 Description of discrete quantitative variables: table of frequencies Sample: 100 shopping malls in which a promotion of a certain service was launched last November. Variable: number of new customers gained due to the promotion. Columns: $c_i$; absolute frequency $n_i$; relative frequency $f_i$; cumulative absolute frequency $N_i$; cumulative relative frequency $F_i$. 0 1 0,01 1 0, ,04 5 0, , , , , , , , , , , , , ,1 86 0, , , , Total 100 1

19 Description of discrete quantitative variables: table of frequencies What percentage of the sampled malls gained only 5 new customers? How many malls attracted at least 3 new customers? How many malls attracted less than 6 new customers? What percentage of the sampled malls gained between 4 and 8 new customers? What percentage of malls gained at most 7 new customers?

20 Description of discrete quantitative variables: the bar chart Bar charts can also be created for discrete data if there are not too many different values.

21 Description of discrete quantitative variables: general format of the table
Class, $c_i$   Abs. freq., $n_i$   Rel. freq., $f_i$   Cum. abs. freq., $N_i$   Cum. rel. freq., $F_i$
$c_1$          $n_1$               $f_1 = n_1/n$       $N_1 = n_1$              $F_1 = f_1$
$c_2$          $n_2$               $f_2 = n_2/n$       $N_2 = N_1 + n_2$        $F_2 = F_1 + f_2$
...            ...                 ...                 ...                      ...
$c_k$          $n_k$               $f_k = n_k/n$       $N_k = n$                $F_k = 1$
Total          $n$                 1
Note: $c_1 < c_2 < \dots < c_k$; $n_i$ = number of individuals in the sample in class $c_i$; $f_i = n_i/n$; $N_i = N_{i-1} + n_i$; $F_i = F_{i-1} + f_i$; $0 \le f_i, F_i \le 1$. $F_i$ and $N_i$ also make sense for qualitative ordinal variables.
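As a rough illustration of the general format, the following Python sketch builds the five columns $c_i$, $n_i$, $f_i$, $N_i$, $F_i$ from raw data. The sample is hypothetical (it is not the shopping-mall data), and the names `sample` and `table` are choices made for this sketch.

```python
from collections import Counter

# Hypothetical sample of a discrete variable (NOT the slide data).
sample = [0, 1, 1, 2, 2, 2, 3, 3, 4, 2]
n = len(sample)

counts = Counter(sample)
table = []
N_i = 0  # cumulative absolute frequency, N_i = N_{i-1} + n_i
for c_i in sorted(counts):          # classes in increasing order
    n_i = counts[c_i]               # absolute frequency
    N_i += n_i
    table.append((c_i, n_i, n_i / n, N_i, N_i / n))

for c_i, n_i, f_i, N, F in table:
    print(f"{c_i:>3}{n_i:>4}{f_i:>6.2f}{N:>4}{F:>6.2f}")
```

With such a table, a question like "what proportion took at most 2?" is read directly from $F_i$ in the row $c_i = 2$, and "how many took at least 3?" is $n$ minus $N_i$ in that same row.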

22 Qualitative ordinal variables: cumulative frequencies We can also include cumulative frequencies in the table. Sample: 901 employees. Variable: level of satisfaction. Classes: VU (very unsatisfied), U (unsatisfied), S (satisfied), VS (very satisfied). Columns: absolute frequency; relative frequency; cumulative absolute frequency; cumulative relative frequency. Total 901 1

23 Qualitative ordinal variables: bar charts with cumulative frequencies Beware! Many software programs rank the classes in alphabetical order when the variable is qualitative. If it is an ordinal variable, it must be ranked in ascending order.

24 Bar charts for discrete data Sample: 46 professionals of a computer company in the United States. Variable: EXPRNC: number of years working in the company. Columns: experience $c_i$; absolute freq. $n_i$; relative freq. $f_i$. 1 5 0, , , , , , , , , , , , , , , , ,022 Total 46 1

25 Description of discrete quantitative variables: the bar chart Too many different values.

26 Description of continuous quantitative variables Sample: 46 professionals of a computer company in the United States. Variable: EXPRNC: years of experience Variable: SALARY: annual gross income (in US dollars)

27 Grouping by class intervals: continuous (or discrete) data
Class interval      Midpoint                     $n_i$   $f_i$   $N_i$   $F_i$
$[l_0, l_1]$        $c_1 = (l_0 + l_1)/2$        $n_1$   $f_1$   $N_1$   $F_1$
$(l_1, l_2]$        $c_2 = (l_1 + l_2)/2$        $n_2$   $f_2$   $N_2$   $F_2$
...                 ...                          ...     ...     ...     ...
$(l_{k-1}, l_k]$    $c_k = (l_{k-1} + l_k)/2$    $n_k$   $f_k$   $n$     1
Total                                            $n$     1
Note: In Excel the left end-point is excluded and the right end-point is included (it is a convention). The reverse end-point convention can also be applied; check how your software defines it. Grouping is also useful for tabulating discrete data if X takes many values.

28 Grouping by class intervals
Very often class intervals have the same width. Determine the width $w$ of each interval by
$w = \frac{\text{largest number} - \text{smallest number}}{\text{number of desired intervals}}$
Round up the interval width to get desirable interval end-points. Intervals never overlap.
How many intervals? Roughly between 5 and 20. Practice and experience provide the best guidelines (from Newbold): Sample size Number of classes Fewer than to to to to More than

29 Grouping by class intervals: histogram and frequency polygon Find the range: $20 - 1 = 19$. Select the number of classes: say $k = \sqrt{46} \approx 7$. Compute the interval width: $19/7 \approx 2.7$, rounded up to 3. Determine the end-points (beginning before the first observation and ending after the last one): [0, 3], (3, 6], ..., (18, 21]
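The steps above can be sketched in Python. The sample size, minimum and maximum come from the slide; taking the number of classes as the square root of n rounded up, and starting the first interval at 0, are assumptions read off the slide rather than a general rule.

```python
import math

n = 46                 # sample size (from the slide)
x_min, x_max = 1, 20   # smallest and largest experience values

data_range = x_max - x_min            # 20 - 1 = 19
k = math.ceil(math.sqrt(n))           # sqrt(46) is about 6.78, so 7 classes
width = math.ceil(data_range / k)     # 19/7 is about 2.71, rounded up to 3

start = 0  # assumed starting point, just below the minimum
edges = [start + i * width for i in range(k + 1)]

# Consecutive end-points define the intervals (l_{i-1}, l_i].
intervals = list(zip(edges[:-1], edges[1:]))
print(edges)
```

Rounding the width up guarantees that the k intervals cover the whole range, at the price of a last interval that extends slightly beyond the maximum.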

30 Description of quantitative variables: histogram and frequency polygon There are no gaps between the bars/bins Bin widths = widths of class intervals (identical), class boundaries are marked on the horizontal axis Bin heights = frequencies (here, absolute) Bin areas are proportional to the frequencies

31 Quantitative variables: the histogram

32 Description of quantitative variables: histogram and frequency polygon

33 Other graphics: cartograms (INE, Encuesta de Turismo de residentes) Average expenditure per person on trips during the third quarter of 2016. Average expenditure per person on excursions during the third quarter of 2016.

34 Other graphics: pictograms

35 Other graphics: time series INE, Encuesta de Población Activa

36 How to lie with pictograms Published in La Voz de Galicia, on October 24. Making the height proportional to the frequency gives a false impression. Is there anything else you don't like?

37 Lying with graphs Improper use of scales: the coordinate origin is not 0

38 Lying with graphs

39 Lying with graphs The scale of the vertical axis is upside down

40 Lying with Statistics A classic book: How to Lie with Statistics, by Darrell Huff, Available online:

41 Numerical summary
Central tendency: mean, median, mode
Location: quartiles, percentiles
Spread: range, interquartile range, variance, standard deviation, coefficient of variation
Shape: coefficient of skewness, coefficient of kurtosis

42 Descriptive statistics Why are they useful? Can we calculate them for all types of variables? Which are the most useful in each case? How can we compute them with a calculator or Excel?

43 Measures of central tendency The mean The median The mode

44 Central tendency: the (arithmetic) mean The mean is the average of all the data: $\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} = \frac{x_1 + \dots + x_n}{n}$ It is the most common measure of location. It is the center of gravity of the data. It can be calculated only for quantitative variables.

45 The mean: example For the experience of the 46 professionals of a computer company, what is the mean? With Excel: function PROMEDIO(número1; [número2];...) (AVERAGE in English-language Excel), which gives $\bar{x} = 7.5$ years.

46 The mean: example How to calculate the mean from the absolute frequency table? And from the relative frequency table?
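One way to answer both questions: weight each value $c_i$ by its absolute frequency and divide by $n$, or weight it directly by its relative frequency. The Python sketch below uses a small hypothetical frequency table (not the experience data), and its variable names are illustrative.

```python
# Hypothetical frequency table: value c_i -> absolute frequency n_i.
abs_freq = {1: 2, 2: 5, 3: 3}
n = sum(abs_freq.values())

# From absolute frequencies: mean = sum(c_i * n_i) / n
mean_from_abs = sum(c * n_i for c, n_i in abs_freq.items()) / n

# From relative frequencies f_i = n_i / n: mean = sum(c_i * f_i)
rel_freq = {c: n_i / n for c, n_i in abs_freq.items()}
mean_from_rel = sum(c * f_i for c, f_i in rel_freq.items())

print(mean_from_abs, mean_from_rel)
```

Both routes give the same value, up to floating-point rounding, because $f_i = n_i/n$.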

47 The mean with grouped data It is the same formula but using the center of each interval. For the salary of the 46 professionals of a computer company, What is the mean? Note: the mean salary using the raw data equals

48 The mean: properties
Linearity: if $Y = a + bX$ then $\bar{y} = a + b\bar{x}$; if $Z = X + Y$ then $\bar{z} = \bar{x} + \bar{y}$.
If the salaries of the 46 professionals increase by 2 %, how does the mean salary change? If the salary is reduced by 100 dollars, what is the new mean salary? If the salary is increased with a productivity bonus recorded in variable Y, with mean $\bar{y}$, what is the new mean salary?
Disadvantage: affected by extreme values (outliers). Example: X: 3, 1, 5, 4, 2 and Y: 3, 1, 5, 4, 200 give $\bar{x} = 15/5 = 3$ but $\bar{y} = 213/5 = 42.6$!
When the data is skewed, an alternative robust measure of central tendency is more appropriate.

49 Central tendency: the median ... it is the most central datum
1. Order the data from smallest to largest
2. Include repetitions
3. The median is the physical centre
M = Median. Ordered list from smallest to largest: $x_{(1)}, x_{(2)}, \dots, x_{(n)}$
$M = x_{((n+1)/2)}$ if n is odd; $M = \frac{x_{(n/2)} + x_{(n/2+1)}}{2}$ if n is even
With Excel: function MEDIANA(número1; [número2];...) (MEDIAN in English-language Excel) = 4

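50 Finding the median from a frequency table Columns: experience $x_i$, $n_i$, $f_i$, $N_i$, $F_i$. The median is the first value whose cumulative relative frequency reaches 0.5: the preceding class has $F_i = 0.435 < 0.5$, while $x_i = 6$ (with $n_i = 4$) has $F_i = 0.522 > 0.5$, so $M = 6$.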

51 The median: properties
Linearity: if $Y = a + bX$ with $b > 0$, then $M_y = a + bM_x$.
If the salaries of the 46 professionals are increased by 2 %, how does the median salary change? Afterwards the salary is reduced by 100 dollars. What is the final median salary? Can we calculate the median with the education level data? Can we calculate the median with the 0-1 position of responsibility variable?
Advantage: not affected by outliers. Example: X: 3, 1, 5, 4, 2 and Y: 3, 1, 5, 4, 200 give $M_x = 3$ and $M_y = 4$.
When the data is skewed, the median is a better measure of central tendency than the mean.
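The contrast between the mean and the median can be checked with the example data X and Y, here using Python's standard `statistics` module. The linear transformation at the end (a 2 % increase followed by a reduction of 100) mirrors the questions above; the names `a`, `b` and `X_new` are illustrative.

```python
from statistics import mean, median

X = [3, 1, 5, 4, 2]
Y = [3, 1, 5, 4, 200]   # same data with one extreme value

mean_x, mean_y = mean(X), mean(Y)          # the outlier pulls the mean of Y up
median_x, median_y = median(X), median(Y)  # the median barely moves

# Linearity: y = a + b*x with b > 0 transforms mean and median alike.
a, b = -100, 1.02
X_new = [a + b * x for x in X]

print(mean_x, mean_y, median_x, median_y)
print(mean(X_new), a + b * mean_x)
print(median(X_new), a + b * median_x)
```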

52 The median and the mean for asymmetric (skewed) data Annual gross salary in 2014, Encuesta de Estructura Salarial 2014, INE. "The difference between the mean and the median salary is explained by the fact that very high salaries, even though they correspond to few workers, have a notable influence on the calculation of the mean." (INE press release, October 28, 2016)

53 Central tendency: the mode... it is the most frequent value The mode of the variable experience in the 46 professionals example is 1 year, with an absolute frequency of 5 employees. The values 2,3,4,8 and 10 have an absolute frequency of 4 employees.

54 Central tendency: the mode Does this definition make sense with the education level data? Does this definition make sense with the 0-1 position of responsibility variable?

55 Central tendency: the mode Does this definition make sense with continuous data? With grouped data we use the modal interval.

56 The mode: properties It can be calculated for both qualitative and quantitative variables. Indeed, it is the only one of the three measures (mean, median, mode) that makes sense for nominal qualitative variables. Not affected by outliers. There can be no mode. There can be more than one mode (bimodal, trimodal, plurimodal): what does it indicate?

57 Bimodal distribution Time (in minutes) to complete a marathon. Data from an open marathon (everybody can participate). [Histogram: time to run a marathon] What do you think is happening? Can you guess which types of runners make up the blue and the green groups? Would you expect to observe the same histogram shape if the data came instead from a marathon at the Olympic Games?

58 Location measures Quartiles Percentiles

59 Location measures: quartiles and percentiles Quartiles split the ranked data into four segments with an equal number of values per segment. Percentiles split the ranked data into a hundred segments with an equal number of values per segment.
1. Order the data from smallest to largest
2. Include repetitions
3. Select each quartile (percentile) according to its position:
The first quartile Q1 has position $\frac{1}{4}(n+1)$.
The second quartile Q2 (= median) has position $\frac{1}{2}(n+1)$.
The third quartile Q3 has position $\frac{3}{4}(n+1)$.
The k-th percentile Pk has position $\frac{k}{100}(n+1)$, $k = 1, \dots, 99$.

60 Quartiles and percentiles with Excel Note: In most cases the fractions $\frac{1}{4}(n+1)$, $\frac{3}{4}(n+1)$ and $\frac{k}{100}(n+1)$ are not integers, so a rounding criterion must be used to get the position of the given quartile (or percentile). With Excel, the functions are: CUARTIL.INC(matriz;cuartil), with 1 = first quartile, 2 = median, 3 = third quartile; PERCENTIL.INC(matriz;p), with $p = k/100 \in (0, 1)$ for the k-th percentile.
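A minimal Python sketch of the position rule $p(n+1)$ follows. The rounding criterion used here, linear interpolation between the two neighbouring ordered values, is an assumption of this sketch (one of several possible criteria); the function name `quantile_n_plus_1` and the small sample are illustrative as well.

```python
def quantile_n_plus_1(data, p):
    """Value at position p*(n+1) in the ordered data (1-based).

    When the position is not an integer, interpolate linearly between
    the two neighbouring ordered values (an assumed rounding criterion).
    """
    ordered = sorted(data)
    n = len(ordered)
    pos = p * (n + 1)
    pos = min(max(pos, 1), n)      # keep the position inside 1..n
    lower = int(pos)               # integer part of the position
    frac = pos - lower             # fractional part of the position
    if lower == n:
        return ordered[-1]
    return ordered[lower - 1] + frac * (ordered[lower] - ordered[lower - 1])

X = [11, 12, 13, 16, 16, 17, 18, 21]   # small illustrative sample, n = 8
q1 = quantile_n_plus_1(X, 0.25)        # position 2.25
q2 = quantile_n_plus_1(X, 0.50)        # position 4.5 (the median)
q3 = quantile_n_plus_1(X, 0.75)        # position 6.75
iqr = q3 - q1
print(q1, q2, q3, iqr)
```

Spreadsheet functions of the `.INC` family use a different position convention, so their results may differ slightly from this sketch on small samples.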

61 Measures of spread The range and the interquartile range. The variance and the standard deviation. The coefficient of variation.

62 Variation: range and interquartile range (IQR)
The range is the simplest measure of variation: $R = x_{\max} - x_{\min}$. It ignores the way the data is distributed and is sensitive to outliers.
Example: given observations 3, 1, 5, 4, 2, $R = 5 - 1 = 4$. Example: given observations 3, 1, 5, 4, 100, $R = 100 - 1 = 99$.
The interquartile range (IQR) can eliminate some outlier problems: eliminate high and low observations and calculate the range of the middle 50 % of the data, $IQR = \text{3rd quartile} - \text{1st quartile} = Q_3 - Q_1$.

63 Variation: Interquartile range and boxplot Outliers are observations that fall below the value of $Q_1 - 1.5 \cdot IQR$ or above the value of $Q_3 + 1.5 \cdot IQR$. For extreme outliers, replace 1.5 by 3 in the above definition. [Box plot diagram: $x_{\min}$, $Q_1$, median ($Q_2$), $Q_3$, $x_{\max}$, with 25 % of the data in each of the four segments; IQR = 18]
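The outlier criterion can be written as a short Python function. The quartiles and data below are hypothetical values chosen only to illustrate the rule, and the name `outlier_fences` is an invention of this sketch.

```python
def outlier_fences(q1, q3, k=1.5):
    """Return (lower, upper) limits Q1 - k*IQR and Q3 + k*IQR."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

# Hypothetical quartiles and data, for illustration only.
q1, q3 = 12.25, 17.75
data = [11, 12, 13, 16, 16, 17, 18, 21, 40]

low, high = outlier_fences(q1, q3)                       # k = 1.5: outliers
outliers = [x for x in data if x < low or x > high]

extreme_low, extreme_high = outlier_fences(q1, q3, k=3)  # k = 3: extreme outliers
extreme = [x for x in data if x < extreme_low or x > extreme_high]

print(low, high, outliers, extreme)
```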

64 Box plot It shows five location measures (minimum, Q1, median, Q3, maximum). It shows a robust dispersion measure. It allows the study of the symmetry of the data. It gives a criterion to detect outliers. It is very useful to compare different datasets. Variant: when several box plots are depicted in the same chart, the widths of the boxes can be proportional to the sample sizes.

65 Homer and his antagonists Homer Simpson has two main antagonists: Flanders and Mr. Burns. In those episodes in which at least one of them appears, how is Homer's protagonism distributed?

66 Homer and his antagonists You must use the filter variable defined in Exercise 5 (Problem set 1). 1) Create 4 variables with the values of column (variable) Homer for each of the following cases: Homer&Burns, Homer&Flanders, Homer&Both, Homer&None 2) Select all cases and insert a Diagrama de Cajas y Bigotes (box-and-whisker plot)

67 Measure of variation: variance
Average of squared deviations of values from the mean.
Sample variance (divided by n): $\hat{\sigma}^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n} = \frac{\sum_{i=1}^{n} x_i^2 - n\bar{x}^2}{n}$ (the second form is faster to calculate)
Sample quasi-variance, or corrected sample variance (divided by n - 1): $s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n-1} = \frac{\sum_{i=1}^{n} x_i^2 - n\bar{x}^2}{n-1}$
They are related via $\hat{\sigma}^2 = \frac{n-1}{n} s^2$.
If a, b are real numbers and $y = a + bx$, then $s_y^2 = b^2 s_x^2$.

68 Measure of variation: standard deviation (SD) The most commonly used measure of spread. Population standard deviation, sample standard deviation and sample quasi-standard deviation are respectively $\sigma = \sqrt{\sigma^2}$, $\hat{\sigma} = \sqrt{\hat{\sigma}^2}$, $s = \sqrt{s^2}$. Measures variation about the mean. Has the same units as the original data, whilst the variance is in squared units. Variance and SD are both affected by outliers.

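69 Calculating variance and standard deviation
Example: X: 11, 12, 13, 16, 16, 17, 18, 21; Y: 14, 15, 15, 15, 16, 16, 16, 17; Z: 11, 11, 11, 12, 19, 20, 20, 20
$\bar{x} = \bar{y} = \bar{z} = 124/8 = 15.5$
$\sum_{i=1}^{n} x_i^2 = 2000$, $\sum_{i=1}^{n} y_i^2 = 1928$, $\sum_{i=1}^{n} z_i^2 = 2068$
$s_x^2 = \frac{\sum_{i=1}^{n} x_i^2 - n\bar{x}^2}{n-1} = \frac{2000 - 8(15.5)^2}{7} = \frac{78}{7} \approx 11.14 \Rightarrow s_x \approx 3.34$
$s_y^2 = \frac{1928 - 8(15.5)^2}{7} = \frac{6}{7} \approx 0.86 \Rightarrow s_y \approx 0.93$
$s_z^2 = \frac{2068 - 8(15.5)^2}{7} = \frac{146}{7} \approx 20.86 \Rightarrow s_z \approx 4.57$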
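The calculation can be reproduced in Python with the shortcut formula $\sum x_i^2 - n\bar{x}^2$. The helper name `variances` is illustrative; it returns the mean, the sum of squares, the sample variance (divided by n) and the quasi-variance (divided by n - 1).

```python
import math

def variances(data):
    """Mean, sum of squares, sample variance (/n) and quasi-variance (/(n-1))."""
    n = len(data)
    mean = sum(data) / n
    sum_sq = sum(x * x for x in data)
    numerator = sum_sq - n * mean ** 2     # shortcut formula
    return mean, sum_sq, numerator / n, numerator / (n - 1)

samples = {
    "X": [11, 12, 13, 16, 16, 17, 18, 21],
    "Y": [14, 15, 15, 15, 16, 16, 16, 17],
    "Z": [11, 11, 11, 12, 19, 20, 20, 20],
}

results = {name: variances(data) for name, data in samples.items()}
for name, (mean, sum_sq, var_n, var_n1) in results.items():
    print(f"{name}: mean={mean} sum_sq={sum_sq} "
          f"s2={var_n1:.3f} s={math.sqrt(var_n1):.3f}")
```

The three samples share the same mean, so only the spread measures distinguish them: Y is tightly clustered around 15.5, Z is split into two groups far from it.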

70 Calculating variance and standard deviation with Excel

71 Comparing standard deviations Example cont.: X: 11, 12, 13, 16, 16, 17, 18, 21; Y: 14, 15, 15, 15, 16, 16, 16, 17; Z: 11, 11, 11, 12, 19, 20, 20, 20. $\bar{x} = 15.5$, $s_x \approx 3.34$; $\bar{y} = 15.5$, $s_y \approx 0.93$; $\bar{z} = 15.5$, $s_z \approx 4.57$

72 Measure of variation: coefficient of variation (CV) Measures relative variation and is defined as $CV = \frac{s}{\bar{x}}$. It is a unitless number (sometimes given as a percentage). Shows variation relative to the mean. Example: Stock A: average price last year = 50, standard deviation = 5. Stock B: average price last year = 100, standard deviation = 5. $CV_A = \frac{5}{50} = 0.10$, $CV_B = \frac{5}{100} = 0.05$. Both stocks have the same SD, but stock B is less variable relative to its mean price.

73 Z-scores. In which SDG (ODS) is Spain performing better? SDG 4: Quality education, Spain: 88.9. SDG 5: Gender equality and women's empowerment, Spain: 80.6. SDG 8: Decent work and economic growth, Spain: 80.9. SDG 12: Responsible consumption and production, Spain: 60.8. SDG 16: Peace, justice and strong institutions, Spain: 69.5. Medidas Resumen SDG 4 SDG5 SDG8 SDG12 SDG16 Media 72, , , , , Error típico 1, , , , , Mediana 80, , , , , Moda #N/A #N/A #N/A #N/A #N/A Desviación estándar 22, , , , , Varianza de la muestra 517, , , , , Curtosis 0, , , , , Coeficiente de asimetría -1, , , , , Rango 95, , , , , Mínimo 3, , , , , Máximo 99, , , , , Suma 11357, , , , ,21239 Cuenta

74 Numerical summaries and frequency tables. Standardization. To standardize variable x means to calculate $z_i = \frac{x_i - \bar{x}}{s}$. If you apply this formula to all observations $x_1, \dots, x_n$ and call the transformed ones $z_1, \dots, z_n$, then the z's have mean zero and standard deviation one. Standardization = finding the z-score.
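A quick numerical check of this property, sketched in Python with the standard `statistics` module. The sample is a small illustrative one, and `stdev` is used on the assumption that s denotes the quasi-standard deviation (divisor n - 1).

```python
from statistics import mean, stdev

X = [11, 12, 13, 16, 16, 17, 18, 21]   # small illustrative sample
x_bar = mean(X)
s = stdev(X)                           # quasi-standard deviation (n - 1)

# Standardize every observation: z_i = (x_i - x_bar) / s
z = [(x - x_bar) / s for x in X]

print(round(mean(z), 10), round(stdev(z), 10))
```

Because z-scores are unitless, they allow values measured on different scales (such as the scores on different SDGs) to be compared in terms of standard deviations from their respective means.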

75 Z-scores. In which SDG (ODS) is Spain performing better? Medidas Resumen SDG 4 SDG5 SDG8 SDG12 SDG16 Media 72, , , , , Error típico 1, , , , , Mediana 80, , , , , Moda #N/A #N/A #N/A #N/A #N/A Desviación estándar 22, , , , , Varianza de la muestra 517, , , , , Curtosis 0, , , , , Coeficiente de asimetría -1, , , , , Rango 95, , , , , Mínimo 3, , , , , Máximo 99, , , , , Suma 11357, , , , ,21239 Cuenta Spain 88,9 80,6 80,9 60,8 69,5 Relative to the mean (Con respecto a la media) 16, , , , , Incorporating variability (Incorporando variabilidad) 0, , , , ,

76 Measures of shape Fisher Pearson coefficient of skewness Fisher coefficient of kurtosis Empirical rule

77 Measures of shape: skewness Be careful not to decide about the shape merely by comparing the mean, the median and the mode. Fisher-Pearson coefficient of skewness: $\gamma_1 = \frac{1}{n} \sum_{i=1}^{n} \left( \frac{x_i - \bar{x}}{s} \right)^3$ With Excel: COEFICIENTE.ASIMETRIA(número1; número2;...) (SKEW in English-language Excel), which computes $\frac{n}{(n-1)(n-2)} \sum_{i=1}^{n} \left( \frac{x_i - \bar{x}}{s} \right)^3$

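78 Measures of shape: kurtosis Fisher's coefficient of kurtosis: $\gamma_2 = \frac{1}{n} \sum_{i=1}^{n} \left( \frac{x_i - \bar{x}}{s} \right)^4 - 3$ With Excel: CURTOSIS(número1; número2;...) (KURT in English-language Excel), which computes $\frac{n(n+1)}{(n-1)(n-2)(n-3)} \sum_{i=1}^{n} \left( \frac{x_i - \bar{x}}{s} \right)^4 - \frac{3(n-1)^2}{(n-2)(n-3)}$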
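The shape coefficients can be sketched in Python as follows. The function names are illustrative, and taking s as the quasi-standard deviation (`statistics.stdev`, divisor n - 1) in all of them is an assumption based on the notation of these slides.

```python
from statistics import mean, stdev

def _standardized_power_sum(data, power):
    """Sum of ((x_i - mean) / s) ** power, with s the quasi-standard deviation."""
    m, s = mean(data), stdev(data)
    return sum(((x - m) / s) ** power for x in data)

def skewness(data):
    """gamma_1 = (1/n) * sum(((x_i - mean) / s) ** 3)."""
    return _standardized_power_sum(data, 3) / len(data)

def kurtosis(data):
    """gamma_2 = (1/n) * sum(((x_i - mean) / s) ** 4) - 3."""
    return _standardized_power_sum(data, 4) / len(data) - 3

def excel_skewness(data):
    """Small-sample corrected version (needs n >= 3)."""
    n = len(data)
    return n / ((n - 1) * (n - 2)) * _standardized_power_sum(data, 3)

def excel_kurtosis(data):
    """Small-sample corrected version (needs n >= 4)."""
    n = len(data)
    factor = n * (n + 1) / ((n - 1) * (n - 2) * (n - 3))
    correction = 3 * (n - 1) ** 2 / ((n - 2) * (n - 3))
    return factor * _standardized_power_sum(data, 4) - correction

symmetric = [1, 2, 3, 4, 5]
right_skewed = [3, 1, 5, 4, 200]
print(skewness(symmetric), excel_kurtosis(symmetric))
print(skewness(right_skewed), excel_skewness(right_skewed))
```

For perfectly symmetric data both skewness versions are zero; a single large value, as in the second sample, makes them clearly positive.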

79 Measures of shape: skewness and kurtosis Excel function

80 Análisis de Datos in Excel: Estadística descriptiva [OECD-only] Average PISA score across Maths/Reading/Science (0-600) Media 491, Error típico 4, Mediana 496, Moda #N/A Desviación estándar 26, Varianza de la muestra 679, Curtosis 1, Coeficiente de asimetría -1, Rango 113, Mínimo 415, Máximo 528, Suma 17219,46943 Cuenta 35 [Histogram: frequency (FRECUENCIA) of the average PISA score against class marks (MARCAS DE CLASE) from 406 to 539; Spain: 491.4] Data source: SDG Index & Dashboards Report 2017,

81 Empirical rule If the data is bell-shaped (normal), that is, symmetric with light tails, the following rule holds: 68 % of the data are in $(\bar{x} - s, \bar{x} + s)$; 95 % of the data are in $(\bar{x} - 2s, \bar{x} + 2s)$; 99.7 % of the data are in $(\bar{x} - 3s, \bar{x} + 3s)$. Note: This rule is also known as the 68-95-99.7 rule. Example: We know that for a sample of 100 observations, the mean is 40 and the quasi-standard deviation is 5. Assuming that the data is bell-shaped, give the limits of an interval that captures 95 % of the observations. 95 % of the $x_i$'s are in $(\bar{x} \pm 2s) = (40 \pm 2(5)) = (30, 50)$
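The percentages of the rule can be checked against the standard normal distribution with Python's `statistics.NormalDist`, and the example interval follows from a one-line function. The name `empirical_interval` is an illustrative choice.

```python
from statistics import NormalDist

std_normal = NormalDist()   # mean 0, standard deviation 1

# Probability that a normal variable falls within k standard deviations
# of its mean, for k = 1, 2, 3.
coverage = {k: std_normal.cdf(k) - std_normal.cdf(-k) for k in (1, 2, 3)}

def empirical_interval(x_bar, s, k):
    """Interval (x_bar - k*s, x_bar + k*s)."""
    return x_bar - k * s, x_bar + k * s

limits_95 = empirical_interval(40, 5, 2)   # the example: mean 40, s = 5

for k, prob in coverage.items():
    print(k, round(prob, 4))
print(limits_95)
```

The exact normal probabilities are slightly different from the rounded 68, 95 and 99.7 %, which is why the rule is described as empirical.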


More information

Some estimates of the height of the podium

Some estimates of the height of the podium Some estimates of the height of the podium 24 36 40 40 40 41 42 44 46 48 50 53 65 98 1 5 number summary Inter quartile range (IQR) range = max min 2 1.5 IQR outlier rule 3 make a boxplot 24 36 40 40 40

More information

DESCRIPTIVE STATISTICS II. Sorana D. Bolboacă

DESCRIPTIVE STATISTICS II. Sorana D. Bolboacă DESCRIPTIVE STATISTICS II Sorana D. Bolboacă OUTLINE Measures of centrality Measures of spread Measures of symmetry Measures of localization Mainly applied on quantitative variables 2 DESCRIPTIVE STATISTICS

More information

Mini-Lecture 3.1 Measures of Central Tendency

Mini-Lecture 3.1 Measures of Central Tendency Mini-Lecture 3.1 Measures of Central Tendency Objectives 1. Determine the arithmetic mean of a variable from raw data 2. Determine the median of a variable from raw data 3. Explain what it means for a

More information

Descriptive Statistics

Descriptive Statistics Chapter 3 Descriptive Statistics Chapter 2 presented graphical techniques for organizing and displaying data. Even though such graphical techniques allow the researcher to make some general observations

More information

1 Describing Distributions with numbers

1 Describing Distributions with numbers 1 Describing Distributions with numbers Only for quantitative variables!! 1.1 Describing the center of a data set The mean of a set of numerical observation is the familiar arithmetic average. To write

More information

Some Characteristics of Data

Some Characteristics of Data Some Characteristics of Data Not all data is the same, and depending on some characteristics of a particular dataset, there are some limitations as to what can and cannot be done with that data. Some key

More information

DATA HANDLING Five-Number Summary

DATA HANDLING Five-Number Summary DATA HANDLING Five-Number Summary The five-number summary consists of the minimum and maximum values, the median, and the upper and lower quartiles. The minimum and the maximum are the smallest and greatest

More information

Chapter 2: Descriptive Statistics. Mean (Arithmetic Mean): Found by adding the data values and dividing the total by the number of data.

Chapter 2: Descriptive Statistics. Mean (Arithmetic Mean): Found by adding the data values and dividing the total by the number of data. -3: Measure of Central Tendency Chapter : Descriptive Statistics The value at the center or middle of a data set. It is a tool for analyzing data. Part 1: Basic concepts of Measures of Center Ex. Data

More information

Standardized Data Percentiles, Quartiles and Box Plots Grouped Data Skewness and Kurtosis

Standardized Data Percentiles, Quartiles and Box Plots Grouped Data Skewness and Kurtosis Descriptive Statistics (Part 2) 4 Chapter Percentiles, Quartiles and Box Plots Grouped Data Skewness and Kurtosis McGraw-Hill/Irwin Copyright 2009 by The McGraw-Hill Companies, Inc. Chebyshev s Theorem

More information

A LEVEL MATHEMATICS ANSWERS AND MARKSCHEMES SUMMARY STATISTICS AND DIAGRAMS. 1. a) 45 B1 [1] b) 7 th value 37 M1 A1 [2]

A LEVEL MATHEMATICS ANSWERS AND MARKSCHEMES SUMMARY STATISTICS AND DIAGRAMS. 1. a) 45 B1 [1] b) 7 th value 37 M1 A1 [2] 1. a) 45 [1] b) 7 th value 37 [] n c) LQ : 4 = 3.5 4 th value so LQ = 5 3 n UQ : 4 = 9.75 10 th value so UQ = 45 IQR = 0 f.t. d) Median is closer to upper quartile Hence negative skew [] Page 1 . a) Orders

More information

Measures of Central Tendency Lecture 5 22 February 2006 R. Ryznar

Measures of Central Tendency Lecture 5 22 February 2006 R. Ryznar Measures of Central Tendency 11.220 Lecture 5 22 February 2006 R. Ryznar Today s Content Wrap-up from yesterday Frequency Distributions The Mean, Median and Mode Levels of Measurement and Measures of Central

More information

STAT 113 Variability

STAT 113 Variability STAT 113 Variability Colin Reimer Dawson Oberlin College September 14, 2017 1 / 48 Outline Last Time: Shape and Center Variability Boxplots and the IQR Variance and Standard Deviaton Transformations 2

More information

Section 6-1 : Numerical Summaries

Section 6-1 : Numerical Summaries MAT 2377 (Winter 2012) Section 6-1 : Numerical Summaries With a random experiment comes data. In these notes, we learn techniques to describe the data. Data : We will denote the n observations of the random

More information

Numerical Measurements

Numerical Measurements El-Shorouk Academy Acad. Year : 2013 / 2014 Higher Institute for Computer & Information Technology Term : Second Year : Second Department of Computer Science Statistics & Probabilities Section # 3 umerical

More information

Measures of Dispersion (Range, standard deviation, standard error) Introduction

Measures of Dispersion (Range, standard deviation, standard error) Introduction Measures of Dispersion (Range, standard deviation, standard error) Introduction We have already learnt that frequency distribution table gives a rough idea of the distribution of the variables in a sample

More information

Descriptive Statistics Bios 662

Descriptive Statistics Bios 662 Descriptive Statistics Bios 662 Michael G. Hudgens, Ph.D. mhudgens@bios.unc.edu http://www.bios.unc.edu/ mhudgens 2008-08-19 08:51 BIOS 662 1 Descriptive Statistics Descriptive Statistics Types of variables

More information

Describing Data: One Quantitative Variable

Describing Data: One Quantitative Variable STAT 250 Dr. Kari Lock Morgan The Big Picture Describing Data: One Quantitative Variable Population Sampling SECTIONS 2.2, 2.3 One quantitative variable (2.2, 2.3) Statistical Inference Sample Descriptive

More information

Graphical and Tabular Methods in Descriptive Statistics. Descriptive Statistics

Graphical and Tabular Methods in Descriptive Statistics. Descriptive Statistics Graphical and Tabular Methods in Descriptive Statistics MATH 3342 Section 1.2 Descriptive Statistics n Graphs and Tables n Numerical Summaries Sections 1.3 and 1.4 1 Why graph data? n The amount of data

More information

MEASURES OF DISPERSION, RELATIVE STANDING AND SHAPE. Dr. Bijaya Bhusan Nanda,

MEASURES OF DISPERSION, RELATIVE STANDING AND SHAPE. Dr. Bijaya Bhusan Nanda, MEASURES OF DISPERSION, RELATIVE STANDING AND SHAPE Dr. Bijaya Bhusan Nanda, CONTENTS What is measures of dispersion? Why measures of dispersion? How measures of dispersions are calculated? Range Quartile

More information

STATISTICAL DISTRIBUTIONS AND THE CALCULATOR

STATISTICAL DISTRIBUTIONS AND THE CALCULATOR STATISTICAL DISTRIBUTIONS AND THE CALCULATOR 1. Basic data sets a. Measures of Center - Mean ( ): average of all values. Characteristic: non-resistant is affected by skew and outliers. - Median: Either

More information

Center and Spread. Measures of Center and Spread. Example: Mean. Mean: the balance point 2/22/2009. Describing Distributions with Numbers.

Center and Spread. Measures of Center and Spread. Example: Mean. Mean: the balance point 2/22/2009. Describing Distributions with Numbers. Chapter 3 Section3-: Measures of Center Section 3-3: Measurers of Variation Section 3-4: Measures of Relative Standing Section 3-5: Exploratory Data Analysis Describing Distributions with Numbers The overall

More information

Measures of Central Tendency: Ungrouped Data. Mode. Median. Mode -- Example. Median: Example with an Odd Number of Terms

Measures of Central Tendency: Ungrouped Data. Mode. Median. Mode -- Example. Median: Example with an Odd Number of Terms Measures of Central Tendency: Ungrouped Data Measures of central tendency yield information about particular places or locations in a group of numbers. Common Measures of Location Mode Median Percentiles

More information

Descriptive Analysis

Descriptive Analysis Descriptive Analysis HERTANTO WAHYU SUBAGIO Univariate Analysis Univariate analysis involves the examination across cases of one variable at a time. There are three major characteristics of a single variable

More information

Describing Data: Displaying and Exploring Data

Describing Data: Displaying and Exploring Data Describing Data: Displaying and Exploring Data Chapter 4 McGraw-Hill/Irwin Copyright 2011 by the McGraw-Hill Companies, Inc. All rights reserved. LEARNING OBJECTIVES LO1. Develop and interpret a dot plot.

More information

Handout 4 numerical descriptive measures part 2. Example 1. Variance and Standard Deviation for Grouped Data. mf N 535 = = 25

Handout 4 numerical descriptive measures part 2. Example 1. Variance and Standard Deviation for Grouped Data. mf N 535 = = 25 Handout 4 numerical descriptive measures part Calculating Mean for Grouped Data mf Mean for population data: µ mf Mean for sample data: x n where m is the midpoint and f is the frequency of a class. Example

More information

STAT 157 HW1 Solutions

STAT 157 HW1 Solutions STAT 157 HW1 Solutions http://www.stat.ucla.edu/~dinov/courses_students.dir/10/spring/stats157.dir/ Problem 1. 1.a: (6 points) Determine the Relative Frequency and the Cumulative Relative Frequency (fill

More information

Introduction to Descriptive Statistics

Introduction to Descriptive Statistics Introduction to Descriptive Statistics 17.871 Types of Variables ~Nominal (Quantitative) Nominal (Qualitative) categorical Ordinal Interval or ratio Describing data Moment Non-mean based measure Center

More information

Chapter 15: Graphs, Charts, and Numbers Math 107

Chapter 15: Graphs, Charts, and Numbers Math 107 Chapter 15: Graphs, Charts, and Numbers Math 107 Data Set & Data Point: Discrete v. Continuous: Frequency Table: Ex 1) Exam Scores Pictogram: Misleading Graphs: In reality, the data looks like this 45%

More information

KING FAHD UNIVERSITY OF PETROLEUM & MINERALS DEPARTMENT OF MATHEMATICAL SCIENCES DHAHRAN, SAUDI ARABIA. Name: ID# Section

KING FAHD UNIVERSITY OF PETROLEUM & MINERALS DEPARTMENT OF MATHEMATICAL SCIENCES DHAHRAN, SAUDI ARABIA. Name: ID# Section KING FAHD UNIVERSITY OF PETROLEUM & MINERALS DEPARTMENT OF MATHEMATICAL SCIENCES DHAHRAN, SAUDI ARABIA STAT 11: BUSINESS STATISTICS I Semester 04 Major Exam #1 Sunday March 7, 005 Please circle your instructor

More information

Putting Things Together Part 2

Putting Things Together Part 2 Frequency Putting Things Together Part These exercise blend ideas from various graphs (histograms and boxplots), differing shapes of distributions, and values summarizing the data. Data for, and are in

More information

The Normal Distribution & Descriptive Statistics. Kin 304W Week 2: Jan 15, 2012

The Normal Distribution & Descriptive Statistics. Kin 304W Week 2: Jan 15, 2012 The Normal Distribution & Descriptive Statistics Kin 304W Week 2: Jan 15, 2012 1 Questionnaire Results I received 71 completed questionnaires. Thank you! Are you nervous about scientific writing? You re

More information

AP STATISTICS FALL SEMESTSER FINAL EXAM STUDY GUIDE

AP STATISTICS FALL SEMESTSER FINAL EXAM STUDY GUIDE AP STATISTICS Name: FALL SEMESTSER FINAL EXAM STUDY GUIDE Period: *Go over Vocabulary Notecards! *This is not a comprehensive review you still should look over your past notes, homework/practice, Quizzes,

More information

MATHEMATICS APPLIED TO BIOLOGICAL SCIENCES MVE PA 07. LP07 DESCRIPTIVE STATISTICS - Calculating of statistical indicators (1)

MATHEMATICS APPLIED TO BIOLOGICAL SCIENCES MVE PA 07. LP07 DESCRIPTIVE STATISTICS - Calculating of statistical indicators (1) LP07 DESCRIPTIVE STATISTICS - Calculating of statistical indicators (1) Descriptive statistics are ways of summarizing large sets of quantitative (numerical) information. The best way to reduce a set of

More information

Chapter 3. Descriptive Measures. Copyright 2016, 2012, 2008 Pearson Education, Inc. Chapter 3, Slide 1

Chapter 3. Descriptive Measures. Copyright 2016, 2012, 2008 Pearson Education, Inc. Chapter 3, Slide 1 Chapter 3 Descriptive Measures Copyright 2016, 2012, 2008 Pearson Education, Inc. Chapter 3, Slide 1 Chapter 3 Descriptive Measures Mean, Median and Mode Copyright 2016, 2012, 2008 Pearson Education, Inc.

More information

SOLUTIONS TO THE LAB 1 ASSIGNMENT

SOLUTIONS TO THE LAB 1 ASSIGNMENT SOLUTIONS TO THE LAB 1 ASSIGNMENT Question 1 Excel produces the following histogram of pull strengths for the 100 resistors: 2 20 Histogram of Pull Strengths (lb) Frequency 1 10 0 9 61 63 6 67 69 71 73

More information

NOTES TO CONSIDER BEFORE ATTEMPTING EX 2C BOX PLOTS

NOTES TO CONSIDER BEFORE ATTEMPTING EX 2C BOX PLOTS NOTES TO CONSIDER BEFORE ATTEMPTING EX 2C BOX PLOTS A box plot is a pictorial representation of the data and can be used to get a good idea and a clear picture about the distribution of the data. It shows

More information

Categorical. A general name for non-numerical data; the data is separated into categories of some kind.

Categorical. A general name for non-numerical data; the data is separated into categories of some kind. Chapter 5 Categorical A general name for non-numerical data; the data is separated into categories of some kind. Nominal data Categorical data with no implied order. Eg. Eye colours, favourite TV show,

More information

GOALS. Describing Data: Displaying and Exploring Data. Dot Plots - Examples. Dot Plots. Dot Plot Minitab Example. Stem-and-Leaf.

GOALS. Describing Data: Displaying and Exploring Data. Dot Plots - Examples. Dot Plots. Dot Plot Minitab Example. Stem-and-Leaf. Describing Data: Displaying and Exploring Data Chapter 4 GOALS 1. Develop and interpret a dot plot.. Develop and interpret a stem-and-leaf display. 3. Compute and understand quartiles, deciles, and percentiles.

More information

1 Exercise One. 1.1 Calculate the mean ROI. Note that the data is not grouped! Below you find the raw data in tabular form:

1 Exercise One. 1.1 Calculate the mean ROI. Note that the data is not grouped! Below you find the raw data in tabular form: 1 Exercise One Note that the data is not grouped! 1.1 Calculate the mean ROI Below you find the raw data in tabular form: Obs Data 1 18.5 2 18.6 3 17.4 4 12.2 5 19.7 6 5.6 7 7.7 8 9.8 9 19.9 10 9.9 11

More information

Chapter 5: Introduction to statistical inference

Chapter 5: Introduction to statistical inference Chapter 5: Introduction to statistical inference 1. Outline and objectives 2. Statistics and sampling distribution 3. Point estimation 4. Interval estimation 5. Hypothesis tests: means, proportions, independence

More information

Empirical Rule (P148)

Empirical Rule (P148) Interpreting the Standard Deviation Numerical Descriptive Measures for Quantitative data III Dr. Tom Ilvento FREC 408 We can use the standard deviation to express the proportion of cases that might fall

More information

Math146 - Chapter 3 Handouts. The Greek Alphabet. Source: Page 1 of 39

Math146 - Chapter 3 Handouts. The Greek Alphabet. Source:   Page 1 of 39 Source: www.mathwords.com The Greek Alphabet Page 1 of 39 Some Miscellaneous Tips on Calculations Examples: Round to the nearest thousandth 0.92431 0.75693 CAUTION! Do not truncate numbers! Example: 1

More information

Exam 1 Review. 1) Identify the population being studied. The heights of 14 out of the 31 cucumber plants at Mr. Lonardo's greenhouse.

Exam 1 Review. 1) Identify the population being studied. The heights of 14 out of the 31 cucumber plants at Mr. Lonardo's greenhouse. Exam 1 Review 1) Identify the population being studied. The heights of 14 out of the 31 cucumber plants at Mr. Lonardo's greenhouse. 2) Identify the population being studied and the sample chosen. The

More information

CHAPTER 6. ' From the table the z value corresponding to this value Z = 1.96 or Z = 1.96 (d) P(Z >?) =

CHAPTER 6. ' From the table the z value corresponding to this value Z = 1.96 or Z = 1.96 (d) P(Z >?) = Solutions to End-of-Section and Chapter Review Problems 225 CHAPTER 6 6.1 (a) P(Z < 1.20) = 0.88493 P(Z > 1.25) = 1 0.89435 = 0.10565 P(1.25 < Z < 1.70) = 0.95543 0.89435 = 0.06108 (d) P(Z < 1.25) or Z

More information

Basic Data Analysis. Stephen Turnbull Business Administration and Public Policy Lecture 3: April 25, Abstract

Basic Data Analysis. Stephen Turnbull Business Administration and Public Policy Lecture 3: April 25, Abstract Basic Data Analysis Stephen Turnbull Business Administration and Public Policy Lecture 3: April 25, 2013 Abstract Review summary statistics and measures of location. Discuss the placement exam as an exercise

More information

NCSS Statistical Software. Reference Intervals

NCSS Statistical Software. Reference Intervals Chapter 586 Introduction A reference interval contains the middle 95% of measurements of a substance from a healthy population. It is a type of prediction interval. This procedure calculates one-, and

More information

Introduction to Computational Finance and Financial Econometrics Descriptive Statistics

Introduction to Computational Finance and Financial Econometrics Descriptive Statistics You can t see this text! Introduction to Computational Finance and Financial Econometrics Descriptive Statistics Eric Zivot Summer 2015 Eric Zivot (Copyright 2015) Descriptive Statistics 1 / 28 Outline

More information

Math 227 Elementary Statistics. Bluman 5 th edition

Math 227 Elementary Statistics. Bluman 5 th edition Math 227 Elementary Statistics Bluman 5 th edition CHAPTER 6 The Normal Distribution 2 Objectives Identify distributions as symmetrical or skewed. Identify the properties of the normal distribution. Find

More information

MgtOp 215 TEST 1 (Golden) Spring 2016 Dr. Ahn. Read the following instructions very carefully before you start the test.

MgtOp 215 TEST 1 (Golden) Spring 2016 Dr. Ahn. Read the following instructions very carefully before you start the test. MgtOp 15 TEST 1 (Golden) Spring 016 Dr. Ahn Name: ID: Section (Circle one): 4, 5, 6 Read the following instructions very carefully before you start the test. This test is closed book and notes; one summary

More information

Master of Science in Strategic Management Degree Master of Science in Strategic Supply Chain Management Degree

Master of Science in Strategic Management Degree Master of Science in Strategic Supply Chain Management Degree CHINHOYI UNIVERSITY OF TECHNOLOGY SCHOOL OF BUSINESS SCIENCES AND MANAGEMENT POST GRADUATE PROGRAMME Master of Science in Strategic Management Degree Master of Science in Strategic Supply Chain Management

More information

Chapter 6 Simple Correlation and

Chapter 6 Simple Correlation and Contents Chapter 1 Introduction to Statistics Meaning of Statistics... 1 Definition of Statistics... 2 Importance and Scope of Statistics... 2 Application of Statistics... 3 Characteristics of Statistics...

More information

Review: Types of Summary Statistics

Review: Types of Summary Statistics Review: Types of Summary Statistics We re often interested in describing the following characteristics of the distribution of a data series: Central tendency - where is the middle of the distribution?

More information

MEASURES OF CENTRAL TENDENCY & VARIABILITY + NORMAL DISTRIBUTION

MEASURES OF CENTRAL TENDENCY & VARIABILITY + NORMAL DISTRIBUTION MEASURES OF CENTRAL TENDENCY & VARIABILITY + NORMAL DISTRIBUTION 1 Day 3 Summer 2017.07.31 DISTRIBUTION Symmetry Modality 单峰, 双峰 Skewness 正偏或负偏 Kurtosis 2 3 CHAPTER 4 Measures of Central Tendency 集中趋势

More information

SUMMARY STATISTICS EXAMPLES AND ACTIVITIES

SUMMARY STATISTICS EXAMPLES AND ACTIVITIES Session 6 SUMMARY STATISTICS EXAMPLES AD ACTIVITIES Example 1.1 Expand the following: 1. X 2. 2 6 5 X 3. X 2 4 3 4 4. X 4 2 Solution 1. 2 3 2 X X X... X 2. 6 4 X X X X 4 5 6 5 3. X 2 X 3 2 X 4 2 X 5 2

More information

Example: Histogram for US household incomes from 2015 Table:

Example: Histogram for US household incomes from 2015 Table: 1 Example: Histogram for US household incomes from 2015 Table: Income level Relative frequency $0 - $14,999 11.6% $15,000 - $24,999 10.5% $25,000 - $34,999 10% $35,000 - $49,999 12.7% $50,000 - $74,999

More information

Chapter 4-Describing Data: Displaying and Exploring Data

Chapter 4-Describing Data: Displaying and Exploring Data Chapter 4-Describing Data: Displaying and Exploring Data Jie Zhang, Ph.D. Student Account and Information Systems Department College of Business Administration The University of Texas at El Paso jzhang6@utep.edu

More information

Data screening, transformations: MRC05

Data screening, transformations: MRC05 Dale Berger Data screening, transformations: MRC05 This is a demonstration of data screening and transformations for a regression analysis. Our interest is in predicting current salary from education level

More information

2 DESCRIPTIVE STATISTICS

2 DESCRIPTIVE STATISTICS Chapter 2 Descriptive Statistics 47 2 DESCRIPTIVE STATISTICS Figure 2.1 When you have large amounts of data, you will need to organize it in a way that makes sense. These ballots from an election are rolled

More information

Putting Things Together Part 1

Putting Things Together Part 1 Putting Things Together Part 1 These exercise blend ideas from various graphs (histograms and boxplots), differing shapes of distributions, and values summarizing the data. Data for 1, 5, and 6 are in

More information

STAB22 section 1.3 and Chapter 1 exercises

STAB22 section 1.3 and Chapter 1 exercises STAB22 section 1.3 and Chapter 1 exercises 1.101 Go up and down two times the standard deviation from the mean. So 95% of scores will be between 572 (2)(51) = 470 and 572 + (2)(51) = 674. 1.102 Same idea

More information

Lecture 07: Measures of central tendency

Lecture 07: Measures of central tendency Lecture 07: Measures of central tendency Ernesto F. L. Amaral September 21, 2017 Advanced Methods of Social Research (SOCI 420) Source: Healey, Joseph F. 2015. Statistics: A Tool for Social Research. Stamford:

More information

Measures of Central tendency

Measures of Central tendency Elementary Statistics Measures of Central tendency By Prof. Mirza Manzoor Ahmad In statistics, a central tendency (or, more commonly, a measure of central tendency) is a central or typical value for a

More information

DESCRIPTIVE STATISTICS

DESCRIPTIVE STATISTICS DESCRIPTIVE STATISTICS INTRODUCTION Numbers and quantification offer us a very special language which enables us to express ourselves in exact terms. This language is called Mathematics. We will now learn

More information

Getting to know a data-set (how to approach data) Overview: Descriptives & Graphing

Getting to know a data-set (how to approach data) Overview: Descriptives & Graphing Overview: Descriptives & Graphing 1. Getting to know a data set 2. LOM & types of statistics 3. Descriptive statistics 4. Normal distribution 5. Non-normal distributions 6. Effect of skew on central tendency

More information

FINALS REVIEW BELL RINGER. Simplify the following expressions without using your calculator. 1) 6 2/3 + 1/2 2) 2 * 3(1/2 3/5) 3) 5/ /2 4

FINALS REVIEW BELL RINGER. Simplify the following expressions without using your calculator. 1) 6 2/3 + 1/2 2) 2 * 3(1/2 3/5) 3) 5/ /2 4 FINALS REVIEW BELL RINGER Simplify the following expressions without using your calculator. 1) 6 2/3 + 1/2 2) 2 * 3(1/2 3/5) 3) 5/3 + 7 + 1/2 4 4) 3 + 4 ( 7) + 3 + 4 ( 2) 1) 36/6 4/6 + 3/6 32/6 + 3/6 35/6

More information

Math Take Home Quiz on Chapter 2

Math Take Home Quiz on Chapter 2 Math 116 - Take Home Quiz on Chapter 2 Show the calculations that lead to the answer. Due date: Tuesday June 6th Name Time your class meets Provide an appropriate response. 1) A newspaper surveyed its

More information