Section-2. Data Analysis


Short Questions:

Question 1: What is data?
Answer: Data is the substrate of the decision-making process. Data is a measure of some observable characteristic of a set of objects of interest. Statistics is a vast area of applied mathematics in which data are collected, classified, presented and analyzed for a specific purpose.

Question 2: What is the role of statistics in business decisions?
Answer: Statistics plays an important role in business because it provides the quantitative basis for arriving at decisions in all matters connected with the operations of a business. Statistics helps a business plan production according to the tastes of consumers. It can also serve as a management tool for evaluating the performance of machines and personnel, and it enables the businessman to judge the efficiency of new production methods by studying the relationship between costs and methods of production.

Question 3: Define Frequency Table.
Answer: Frequency is the number of occurrences of a data item. A table that summarizes the number of cases against a column of interest is called a frequency table.

Question 4: What is Central Tendency?
Answer: In a series of statistical data, the parameter that reflects a central value of the series is called the central tendency. Central tendency refers to a single value that represents the whole set of data.

Question 5: Define Average and discuss various types of averages.
Answer: An average can be defined as a central value around which the other values of a series tend to cluster. An average is computed to give a concise picture of a large group. By the use of averages, complex groups of large numbers are presented in a few significant words or figures. Averages help in obtaining a picture of the universe with the help of a sample. Although the sample and the universe differ in size, their averages may still be very nearly identical.
Averages may be classified into three broad types:
1) Mathematical Averages: a) Arithmetic mean b) Geometric mean c) Harmonic mean

2) Positional Averages: a) Mode b) Median
3) Commercial Averages: a) Moving average b) Progressive average c) Quadratic average

Question 6: What do you understand by the term Range in statistics?
Answer: The range of a data set is the difference between the largest value and the smallest value. For example, for the runs scored by two batsmen A and B, we get some idea of the variability in the scores from the minimum and maximum runs in each series. To obtain a single number for this, we find the difference between the maximum and minimum values of each series. This difference is called the range of the data. For batsman A the range is 117 and for batsman B the range is 14. Clearly, Range of A > Range of B. Therefore, the scores are scattered or dispersed in the case of A, while for B they are close to each other. Thus, Range of a series = Maximum value - Minimum value.

Question 7: Define Mean Deviation.
Answer: Mean deviation, also known as average deviation, is the mean of the absolute amounts by which the individual items deviate from the mean. The following procedure is usually applied:
1) Calculate the absolute deviation of each item from the mean, ignoring any negative signs.
2) Add all the deviations.
3) Divide the sum of the deviations by the total number of items.
Symbolically, for a sample of size $n$, the mean deviation is defined by
$MD = \frac{1}{n}\sum_{i=1}^{n} |x_i - \bar{x}|$
where $\bar{x}$ is the arithmetic mean of the variable $x$.

Question 8: What is Skewness?
Answer: Skewness is a measure of the lack of symmetry, that is, the degree of distortion from the symmetry exhibited by a normal distribution.
Negative skew: The left tail is longer; the mass of the distribution is concentrated on the right of the figure. It has a few relatively low values. The distribution is said to be left-skewed. In such a distribution, the mean is typically lower than the median, which in turn is lower than the mode (i.e., mean < median < mode), in which case the skewness coefficient is lower than zero.
Example (observations): 1, 1000, 1001, 1002, 1003
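As a quick check of the example above, the following minimal Python sketch compares the mean and the median of those five observations. The variable names are illustrative and not part of the original notes.

```python
from statistics import mean, median

# Observations from the negative-skew example in the text.
observations = [1, 1000, 1001, 1002, 1003]

avg = mean(observations)    # pulled down by the single very low value
mid = median(observations)  # middle value of the sorted data

print(f"mean = {avg}, median = {mid}")
print("mean < median:", avg < mid)
```

The single low value drags the mean well below the median, which is the ordering described for a left-skewed distribution.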

Positive skew: The right tail is longer; the mass of the distribution is concentrated on the left of the figure. It has a few relatively high values. The distribution is said to be right-skewed. In such a distribution, the mean is typically greater than the median, which in turn is greater than the mode (i.e., mean > median > mode), in which case the skewness coefficient is greater than zero.
Example (observations): 1, 2, 3, 4, 100
In a skewed (unbalanced, lopsided) distribution, the mean is farther out in the long tail than the median. If there is no skewness, or the distribution is symmetric like the bell-shaped normal curve, then mean = median = mode.

Question 9: Discuss Merits and Demerits of Standard Deviation.
Answer:
Merits
(1) The standard deviation is the best measure of variation because of its mathematical characteristics. It is based on every item of the distribution. It is also amenable to algebraic treatment and is less affected by fluctuations of sampling than most other measures of dispersion.
(2) It is possible to calculate the combined standard deviation of two or more groups. This is not possible with any other measure.
(3) For comparing the variability of two or more distributions, the coefficient of variation is considered most appropriate, and it is based on the mean and the standard deviation.
(4) The standard deviation is the measure most prominently used in further statistical work.
Limitations
(1) Compared to other measures, it is difficult to compute. However, this does not reduce the importance of the measure, because of the high degree of accuracy of the results it gives.
(2) It gives more weight to extreme items and less to those near the mean, because the squares of large deviations are proportionately greater than the squares of comparatively small deviations.

Question 10: Calculate the arithmetic mean for the following data:

Serial number, Height of student
Answer: Calculation of arithmetic mean
Serial number, Height of student
n = 5
Mean $\bar{X} = \frac{\sum X}{n}$

Question 11: Find the mean of the first n natural numbers.
Answer: The sum of the first n natural numbers is $\sum x_i = \frac{n(n+1)}{2}$, so
$\bar{X} = \frac{\sum x_i}{n} = \frac{n(n+1)}{2n} = \frac{n+1}{2}$

Question 12: Find the arithmetic mean of the given $x_i$ and frequency.
$x_i$, f
Answer: Calculation of arithmetic mean

5 X i f fx x i = 69 fx = 1478 A.M. = A.M. = = Question13: Find the mode of the given data. Family size No. of family Answer: l = 3 h = 2 f 0 = 7 f 2 = 2 Mode = l + = 3 + [ ]*2 = 3 + (Answer) = 3.28 Question14: Find the Mode of the given data. Age x

No. of people f
Answer: Calculation of mode
Age x, No. of people f, cf
Where l = 35, h = 10, $f_0$ = 21, $f_1$ = 23, $f_2$ = 14
$\text{Mode} = l + \frac{f_1 - f_0}{2f_1 - f_0 - f_2} \times h = 35 + \frac{23 - 21}{46 - 21 - 14} \times 10 = 35 + \frac{20}{11} \approx 36.82$

Question 15: Find the M.D. about the mean for the given data: 6, 7, 10, 12, 13, 4, 8, 12
Answer: $\sum x_i = 72$, so $\bar{x} = \frac{72}{8} = 9$.
The absolute deviations $|x_i - \bar{x}|$ are 3, 2, 1, 3, 4, 5, 1, 3, so $\sum |x_i - \bar{x}| = 22$.
$M.D. = \frac{\sum |x_i - \bar{x}|}{n} = \frac{22}{8} = 2.75$

Question 16: Why Study Dispersion?
Answer: A measure of location, such as the mean or the median, only describes the center of the data; it does not tell us anything about the spread of the data. For example, if your nature guide told you that the river ahead averaged 3 feet in depth, would you want to wade across on foot without additional information? Probably not. You would want to know something about the variation in the depth. A second reason for studying the dispersion in a set of data is to compare the spread in two or more distributions.

Question 17: Write a short note on Properties of the Median.
Answer:
1. There is a unique median for each data set.
2. It is not affected by extremely large or small values and is therefore a valuable measure of central tendency when such values occur.
3. It can be computed for ratio-level, interval-level, and ordinal-level data.
4. It can be computed for an open-ended frequency distribution if the median does not lie in an open-ended class.

Question 18: Discuss Merits and Demerits of Arithmetic Mean.
Answer:
Merits: The arithmetic mean is widely used in practice for the following reasons:
1. It is the simplest to understand and the easiest to compute. Neither the arranging of data required for calculating the median nor the grouping of data required for calculating the mode is needed when calculating the mean.
2. It is affected by the value of every item in the series.
3. It is defined by a rigid mathematical formula, with the result that everyone who computes the average gets the same answer.
Demerits:
1. The arithmetic mean is not always a good measure of central tendency, for instance in extremely asymmetrical distributions.
2. Since the value of the mean depends upon each and every item of the series, extreme items, i.e., very small and very large items, unduly affect the value of the average.
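The second demerit above can be seen in a small Python sketch. The two series are hypothetical and chosen only to show how one extreme item moves the mean far more than the median.

```python
from statistics import mean, median

# Hypothetical series: five similar values, then the same series
# with one very large item appended.
series = [10, 12, 13, 15, 20]
with_extreme = series + [500]

for label, data in (("original", series), ("with extreme item", with_extreme)):
    print(f"{label}: mean = {mean(data):.2f}, median = {median(data)}")
```

Adding the single item 500 moves the mean from 14 to 95, while the median only moves from 13 to 14.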

Question 19: Discuss Merits and Demerits of Median.
Answer:
Merits
1. It is especially useful in the case of open-end classes, since only the position and not the values of the items must be known.
2. In a markedly skewed distribution, such as an income distribution or a price distribution, where the arithmetic mean would be distorted by extreme values, the median is especially useful.
3. The value of the median can be determined graphically, whereas the value of the mean cannot be graphically ascertained.
Demerits
1. For calculating the median it is necessary to arrange the data; other averages do not need any arrangement.
2. The value of the median is affected more by sampling fluctuations than the value of the arithmetic mean.

Question 20: Discuss Merits and Demerits of Mode.
Answer:
Merits
1. It is not affected by extremely large or small items.
2. The value of the mode can also be determined graphically, whereas the value of the mean cannot be graphically ascertained.
Demerits
1. The value of the mode is not based on each and every item of the series.
2. It is not a rigidly defined measure. There are several formulae for calculating the mode, all of which usually give somewhat different answers.

Long Questions:

Question 1: Write short notes on the following. a. Arithmetic Mean b. Weighted Average c. Geometric Mean d. Harmonic Mean
Answer:

a. Arithmetic Mean: The arithmetic mean or simple mean (represented by a bar above the variable name) is the quantity obtained by dividing the sum of the values of the items ($\sum X$) in a variable by their number (n), i.e., the number of items.
$\bar{X} = \frac{\sum X}{n}$
b. Weighted Average: In calculating the simple arithmetic mean it is assumed that all items are equal in importance. This may not always be the case. When items vary in importance they should be assigned weights in order of their relative importance. To calculate the weighted arithmetic mean, the value of each item is multiplied by its weight, the products are summed, and the sum is divided by the total of the weights and not by the number of items. The result is the weighted arithmetic average.
$\bar{X}_w = \frac{w_1 X_1 + w_2 X_2 + \dots + w_n X_n}{w_1 + w_2 + \dots + w_n} = \frac{\sum wX}{\sum w}$
Here $w_1, w_2, w_3, \dots$ stand for the respective weights of each of the items.
c. Geometric Mean: The geometric mean is defined as the positive nth root of the product of the n items of a series. If there are two items, we take the square root; if there are three items, we take the cube root, and so on.
$G.M. = \sqrt[n]{X_1 \times X_2 \times \dots \times X_n}$
d. Harmonic Mean: The harmonic mean is based on the reciprocals of the numbers averaged. It is defined as the reciprocal of the arithmetic mean of the reciprocals of the individual observations.
$H.M. = \frac{n}{\sum \frac{1}{X}}$

Question 2: Write short notes on the following. a. Mean b. Median c. Mode
Answer:
a. Mean: The arithmetic mean or simple mean (represented by a bar above the variable name) is the quantity obtained by dividing the sum of the values of the items ($\sum X$) in a variable by their number (n), i.e., the number of items.
$\bar{X} = \frac{\sum X}{n}$
b. Median: The median is the value of that item in the set of data which divides the data into two equal parts, one part consisting of all the values less than it and the other of all the values greater than it.

Defined in another way, the median is that value of the central tendency which divides the total frequency into two halves.
When n is odd, the middle position number = $\frac{n+1}{2}$
When n is even, the middle position numbers = $\frac{n}{2}$ and $\frac{n}{2} + 1$
c. Mode: A third type of central value, or centre of the distribution, is the value of greatest frequency or, more precisely, of greatest frequency density. Graphically, it is the value on the X-axis below the peak, or highest point, of the frequency curve. This is called the mode.
$\text{Mode} = L_1 + \frac{d_1}{d_1 + d_2} \times C$
where
$L_1$ = lower boundary of the class containing the largest frequency
$d_1$ = difference between the largest frequency and the frequency of the preceding class
$d_2$ = difference between the largest frequency and the frequency of the next class
$C$ = class interval

Question 3: Write short notes on the following. a. Quartiles b. Deciles c. Percentiles d. Moving Average e. Quadratic Average
Answer:
a. Quartiles: Quartiles are another set of positional measures of central tendency. Just as the median divides the data into two equal parts, the quartiles divide the entire set of data into four equal parts. Therefore, three quartiles ($Q_1$, $Q_2$, $Q_3$) are possible in a data set. The general idea remains the same: the data values are arranged in either ascending or descending order.

b. Deciles: In a manner similar to the median and quartiles, the data set can be divided into 10 equal parts when arranged in either ascending or descending order. Each point of division is called a decile. Thus, there are nine deciles, represented as $D_1, D_2, D_3, \dots, D_9$. The interpretation of a decile is similar to that of the median and quartiles.
c. Percentiles: The data set can also be divided into 100 equal parts, with each point of division called a percentile. The 99 percentiles are represented by $P_1, P_2, P_3, \dots, P_{99}$. A general formula for all the positional measures of central tendency for a frequency class distribution is given by: T_i = L_Ti +
d. Moving Average: The moving average is an arithmetic average of data over a period and is updated regularly by replacing the first item in the average with the new item as it comes in. It is useful for eliminating the irregularity of a time series and is generally computed to study the trend.
Example: Suppose the prices for 12 months are given and a three-monthly average is to be computed. Then the first item in the 3-month moving average would be the average $(a_1 + a_2 + a_3)/3$, the second item would be the average of the next three months, $(a_2 + a_3 + a_4)/3$, and so on. The last item would be the average $(a_{10} + a_{11} + a_{12})/3$. When the next month comes in, $a_{10}$ is dropped and $a_{13}$ is added, giving $(a_{11} + a_{12} + a_{13})/3$, and so on.
e. Quadratic Average: The quadratic mean or average is computed by taking the square root of the average of the squares of the items of a series.
$Q_m = \sqrt{\frac{a^2 + b^2 + c^2 + \dots}{n}}$
where $Q_m$ = quadratic mean, $a^2, b^2, c^2, \dots$ = squares of the different values, and n = number of values

The quadratic average is useful when some items have negative values and others positive values, because in such cases the mean is not very representative. It is also used in averaging deviations, rather than original values, when the standard deviation is computed.

Question 4: Write short notes on the following. a. Standard Deviation b. Variance c. Coefficient of Variation d. Quartile Deviation
Answer:
a. Standard deviation: The standard deviation of a sample (SD) is similar to the mean deviation in that it considers the deviation of each X value from the mean. However, instead of using the absolute values of the deviations, it uses the squares of the deviations. These are added, divided by n, and the square root is extracted. The formula for the standard deviation is
$SD = \sqrt{\frac{\sum (X - \bar{X})^2}{n}}$
b. Variance: Variance is the square of the SD and is represented by:
$\text{Variance} = V = \frac{\sum (X - \bar{X})^2}{n}$
c. Coefficient of variation: To get an indication of the variation relative to the mean, we divide the standard deviation by the mean to get the coefficient of variation. This enables us to compare more easily two groups that have different standard deviations and means.
$\text{Coefficient of variation} = \frac{SD}{\bar{X}}$
d. Quartile deviation: Half of the interquartile range is called the quartile deviation or semi-interquartile range. Symbolically,
$Q.D. = \frac{Q_3 - Q_1}{2}$
The value of Q.D. gives the average magnitude by which the two quartiles deviate from the median. If the distribution is approximately symmetrical, then $M_d \pm Q.D.$ covers approximately 50% of the observations and, thus, we can write $Q_1 = M_d - Q.D.$ and $Q_3 = M_d + Q.D.$

Question 5: Write a short note on Measures of Skewness and Kurtosis, and on Kurtosis vs. Skewness.
Answer:
Definition of skewness: For univariate data $Y_1, Y_2, \dots, Y_N$, the formula for skewness is:
$\text{skewness} = \frac{\sum_{i=1}^{N} (Y_i - \bar{Y})^3 / N}{s^3}$
where $\bar{Y}$ is the mean, $s$ is the standard deviation, and N is the number of data points. The skewness for a normal distribution is zero, and any symmetric data should have a skewness

near zero. Negative values for the skewness indicate data that are skewed left and positive values indicate data that are skewed right. By skewed left, we mean that the left tail is long relative to the right tail. Similarly, skewed right means that the right tail is long relative to the left tail. Some measurements have a lower bound and are skewed right. For example, in reliability studies, failure times cannot be negative.

Definition of kurtosis: For univariate data $Y_1, Y_2, \dots, Y_N$, the formula for kurtosis is:
$\text{kurtosis} = \frac{\sum_{i=1}^{N} (Y_i - \bar{Y})^4 / N}{s^4}$
where $\bar{Y}$ is the mean, $s$ is the standard deviation, and N is the number of data points.
The kurtosis for a standard normal distribution is three. For this reason, some sources use the following definition of kurtosis:
$\text{kurtosis} = \frac{\sum_{i=1}^{N} (Y_i - \bar{Y})^4 / N}{s^4} - 3$
This definition is used so that the standard normal distribution has a kurtosis of zero. In addition, with the second definition, positive kurtosis indicates a "peaked" distribution and negative kurtosis indicates a "flat" distribution.
Which definition of kurtosis is used is a matter of convention. When using software to compute the sample kurtosis, you need to be aware of which convention is being followed.
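The two definitions can be tried out with a short Python sketch. It assumes the moment form in which the sums are divided by N and the standard deviation also uses N as its divisor; some references divide by N - 1 instead, so the values may differ slightly from other software. The function name and the two data sets are illustrative only.

```python
import math

def skewness_kurtosis(data):
    """Return (skewness, kurtosis) from the central moments of the data.

    Assumption: every sum is divided by N, including the one used for
    the standard deviation. Other conventions use N - 1.
    """
    n = len(data)
    mean = sum(data) / n
    m2 = sum((y - mean) ** 2 for y in data) / n  # variance
    m3 = sum((y - mean) ** 3 for y in data) / n
    m4 = sum((y - mean) ** 4 for y in data) / n
    sd = math.sqrt(m2)
    return m3 / sd ** 3, m4 / sd ** 4

symmetric = [1, 2, 3, 4, 5]
right_skewed = [1, 2, 3, 4, 100]

for name, data in (("symmetric", symmetric), ("right-skewed", right_skewed)):
    skew, kurt = skewness_kurtosis(data)
    # Subtracting 3 gives the second definition, which is zero for a normal curve.
    print(f"{name}: skewness = {skew:.3f}, kurtosis = {kurt:.3f}, "
          f"kurtosis - 3 = {kurt - 3:.3f}")
```

The symmetric series gives a skewness of zero, while the series with one large value gives a positive skewness, in line with the description of right-skewed data.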

Examples
The following example shows histograms for 10,000 random numbers generated from a normal, a double exponential, a Cauchy, and a Weibull distribution.

Skewness and kurtosis: A fundamental task in many statistical analyses is to characterize the location and variability of a data set. A further characterization of the data includes skewness and kurtosis. Skewness is a measure of symmetry or, more precisely, the lack of symmetry. A distribution, or data set, is symmetric if it looks the same to the left and right of the center point. Kurtosis is a measure of whether the data are peaked or flat relative to a normal distribution. That is, data sets with high kurtosis tend to have a distinct peak near the mean, decline rather rapidly, and have heavy tails. Data sets with low kurtosis tend to have a flat top near the mean rather than a sharp peak. A uniform distribution would be the extreme case. The histogram is an effective graphical technique for showing both the skewness and kurtosis of a data set.

Question 6: Find the A.M. of the given range and frequency by (1) the assumption method and (2) the step deviation method.
Wages x, No. of f

15 Answer: calculation of A.M. Wages x No. f D=X-A f*d U=D/20 f*u Let A = 900 Method (1) A.M. = A + A.M. = Method (2) A.M. = A + = A.M. = = Question6: Find the Quartiles of given data below? Length c Leaves f Answer: calculation of Quartiles Length c Leaves f Length c C*f For Q 2 Q 2 = l 1 + l 1 = 144.5

16 l 2 = f = 12 N = 40 C = 17 So, Q 2 = = = For Q 1 Q 1 = l 1 + l 1 = l 2 = f = 9 N = 40 C = 17 So, Q 1 = =137.5 For Q 3 Q 3 = l 1 + l 1 = l 2 = f = 5 N = 40 C = 29 So, Q 3 = =155.3

17 Question7: Find the M.D. about the mean for the given data. X i f i Answer: calculation of M.D. x i f i f i *x i x i x f i x i x As M.D.(m ) = N = 40 X = = M = = 7.5 M = 2.3 Question8: Find the Median of the given data. Height in c.m. Number of student Less than Less than Less than Less than Less than Less than Answer: calculation of median Class interval F fc is 51 odd, so observation will Sinc e n

Median = $l + \frac{\frac{n}{2} - cf}{f} \times h$
where
f = frequency of the median class
l = lower limit of the median class
cf = cumulative frequency of the class preceding the median class
h = class size (here h = 5)

Question 9: Find the M.D. about the median for the following data.
$x_i$, $f_i$
Answer: Calculation of M.D.
Columns: $x_i$, $f_i$, cf
Since N = 30, which is even, the median is the A.M. of the 15th and 16th observations.
Median M = 13
Columns: $f_i$, $|x_i - M|$, $f_i |x_i - M|$
$\sum f_i |x_i - M| = 149$
$M.D. = \frac{\sum f_i |x_i - M|}{N} = \frac{149}{30} = 4.97$

Question 10: Find the M.D. about the mean for the following data.
Marks obtained, No. of students
Answer: Calculation of M.D.
Columns: Marks obtained, $f_i$, $x_i$, $f_i x_i$, $|x_i - \bar{x}|$, $f_i |x_i - \bar{x}|$
$\sum f_i = 40$, $\sum f_i x_i = 1800$, $\sum f_i |x_i - \bar{x}| = 400$
$\bar{x} = \frac{\sum f_i x_i}{\sum f_i} = \frac{1800}{40} = 45$
$M.D. = \frac{\sum f_i |x_i - \bar{x}|}{\sum f_i} = \frac{400}{40}$

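= 10 (Answer)

Question 11: Calculate Karl Pearson's coefficient of skewness for the following distribution.
Monthly salary (in Rs.): 400 but less than 600, 600 but less than 800, 800 but less than 1000, 1000 but less than 1200, 1200 but less than 1400, 1400 but less than 1600
Number of salesmen
Answer: Calculation of Karl Pearson's coefficient of skewness
Columns: Salary (Rs.), mid-point m, f, d = (m - 900)/200, fd, $fd^2$
N = 50, $\sum fd = 5$, $\sum fd^2 = 63$
$\text{Coeff. of Sk.} = \frac{\text{Mean} - \text{Mode}}{\text{S.D.}}$
Mean: $\bar{X} = A + \frac{\sum fd}{N} \times i$, with A = 900, $\sum fd = 5$, N = 50, i = 200
$\bar{X} = 900 + \frac{5}{50} \times 200 = 920$
Mode: $\text{Mode} = L + \frac{\Delta_1}{\Delta_1 + \Delta_2} \times i$
The mode lies in the class 800-1000, so L = 800 and Mode = 912.5
$S.D. = \sqrt{\frac{\sum fd^2}{N} - \left(\frac{\sum fd}{N}\right)^2} \times i = \sqrt{\frac{63}{50} - \left(\frac{5}{50}\right)^2} \times 200 = \sqrt{1.25} \times 200 \approx 223.6$
$\text{Coeff. of Sk.} = \frac{920 - 912.5}{223.6} \approx 0.034$

Question 12: The median of the following data is 525. Find the values of x and y, if the total frequency is 100.
Class interval, Frequency (two of the frequencies are the unknowns x and y)
Answer:
Columns: Class interval, f, cf
The cumulative frequencies contain x from the class with frequency x onward (7 + x at that class) and y from the class with frequency y onward (56 + x + y at that class), ending at 76 + x + y for the last class.
It is given that n = 100.
So, 76 + x + y = 100, i.e., x + y = 24 ... (1)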

The median is 525, which lies in the class 500-600. Using the formula Median = $l + \frac{\frac{n}{2} - cf}{f} \times h$, we get
$525 = 500 + \frac{50 - 36 - x}{20} \times 100$
25 = (14 - x) × 5
25 = 70 - 5x
5x = 70 - 25 = 45
So, x = 9
Therefore, from x + y = 24, we get 9 + y = 24
y = 24 - 9
y = 15

Correlation Analysis

Short Questions:

Question 1: What is correlation analysis?
Answer: Correlation is a measure of the degree of association between two (or more) variables in a data set. Thus, if it is known that two variables are highly correlated, one can predict the value of one variable on the basis of the value of the other. Two variables, say X and Y, are said to be correlated if:
a. Both increase and decrease together. In this case the variables are said to be positively correlated.
b. One increases as the other decreases. In this case the variables are said to be negatively correlated.

Question 2: What is a scatter diagram?
Answer: The simplest device for determining the relationship between two variables is a special type of dot chart called a scatter diagram. When this method is used, the given data are plotted on graph paper in the form of dots, i.e., for each pair of X and Y values we put a dot, and thus obtain as many points as the number of observations. By looking at the scatter of the various points we can form an idea as to whether the variables are related or not. The more the plotted points scatter over the chart, the less relationship there is between the two variables. The more nearly the points fall on a line, the higher the degree of relationship. If all the points lie on a straight line rising from the lower left-hand corner to the upper right-hand corner, correlation is said to be perfectly positive. On the other hand, if all the

points lie on a straight line falling from the upper left-hand corner to the lower right-hand corner of the diagram, correlation is said to be perfectly negative.

Question 3: State Karl Pearson Coefficient of Linear Correlation.
Answer: The greater the covariance, the greater the correlation between the two variables. Therefore, covariance can be treated as a measure of correlation between two variables. However, the magnitude of covariance depends on the units of measurement. The following expression, derived from covariance, does not suffer from the effect of units of measurement and hence is called the Karl Pearson coefficient of linear correlation, or simply the coefficient of correlation, and is denoted by r.
$r = \frac{\sum xy}{N \sigma_x \sigma_y} = \frac{\sum xy}{\sqrt{\sum x^2 \times \sum y^2}}$
Here $x = (X - \bar{X})$, $y = (Y - \bar{Y})$, $\sigma_x$ and $\sigma_y$ are the standard deviations of X and Y, and N = number of paired observations.

Question 4: Write Properties of Coefficient of Correlation.
Answer: The Karl Pearson Coefficient of Linear Correlation possesses a number of very interesting properties, as described below.
1. The coefficient of linear correlation always lies between -1 and +1 inclusive: $-1 \le r \le 1$. The value 1 indicates perfect positive linear correlation, while -1 indicates perfect negative linear correlation. A value of 0 indicates that no linear correlation exists between the variables.
2. The coefficient of correlation is not affected by a linear transformation of the variables. Thus, if $r_{XY}$ is the correlation between variables X and Y, and $r_{AB}$ is the correlation between A and B, then $r_{AB} = r_{XY}$, where A = aX + b and B = cY + d, with a and c of the same sign.
3. If two variables are not related then they are also not correlated. However, if they are uncorrelated they may still be related. This follows directly from the fact that the coefficient of correlation measures the strength of the linear relationship. If the variables are related, but not linearly, then the coefficient of correlation may turn out to be 0 even though they are related otherwise.

Question 5: What is Regression Analysis?
Answer: The statistical tool with the help of which we are in a position to estimate (or predict) the unknown values of one variable from known values of another variable is called

regression. With the help of regression analysis, we are in a position to find out the average probable change in one variable given a certain amount of change in another.

Question 6: State the Spearman's Rank Correlation.
Answer: This measure is especially useful when quantitative measures for certain factors (such as in the evaluation of leadership ability or the judgment of female beauty) cannot be fixed, but the individuals in the group can be arranged in order, thereby obtaining for each individual a number indicating his (her) rank in the group. In any event, the rank correlation coefficient is applied to a set of ordinal rank numbers, with 1 for the individual ranked first in quantity or quality, and so on, to n for the individual ranked last in the group of n individuals (or n pairs of individuals). Spearman's rank correlation coefficient is defined as:
$R = 1 - \frac{6 \sum D^2}{N(N^2 - 1)}$
where R denotes the rank coefficient of correlation, D refers to the difference of ranks between paired items in the two series, and N is the number of paired items.

Question 7: What is the difference between Regression and Correlation?
Answer: The following are the points of difference between correlation and regression:
1. Whereas the correlation coefficient is a measure of the degree of covariability between X and Y, the objective of regression analysis is to study the nature of the relationship between the variables, so that we may be able to predict the value of one on the basis of the other. The variable that is the basis of prediction is called the independent variable, and the variable that is to be predicted is referred to as the dependent variable.
2. The cause-and-effect relation is more clearly indicated through regression analysis than by correlation. Correlation is merely a tool for ascertaining the degree of relationship between two variables and, therefore, we cannot say that one variable is the cause and the other the effect.

Question 8: What is the relationship between Regression and Correlation?
Answer: The two coefficients of regression are related to the coefficient of correlation in the following way:
$b_{xy} \times b_{yx} = r^2$
or, $r = \pm\sqrt{b_{xy} \times b_{yx}}$
Hence, the coefficient of correlation is the geometric mean of the two coefficients of regression.

Question 9: What is Partial and Multiple Correlation?
Answer: When three or more variables are studied, it is a problem of either multiple or partial correlation. In multiple correlation, three or more variables are studied simultaneously. For example, when we study the relationship between yield of rice per acre

and both the amount of rainfall and the amount of fertilizer used, it is a problem of multiple correlation.

Long Questions:

Question 1: Define the following: a. Positive and negative correlation b. Linear and non-linear correlation
Answer:
a. Positive and Negative Correlation: Whether correlation is positive (direct) or negative (inverse) depends upon the direction of change of the variables. If both variables vary in the same direction, i.e., as one variable increases the other, on average, also increases, correlation is said to be positive. If, on the other hand, the variables vary in opposite directions, i.e., as one variable increases the other decreases or vice versa, correlation is said to be negative.
b. Linear and Non-linear Correlation: The distinction between linear and non-linear correlation is based upon the constancy of the ratio of change between the variables. If the amount of change in one variable tends to bear a constant ratio to the amount of change in the other variable, the correlation is said to be linear. Correlation is called non-linear or curvilinear if the amount of change in one variable does not bear a constant ratio to the amount of change in the other variable.
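The sign convention described above can be checked with a minimal Python sketch. It assumes the deviation form of the coefficient, r = Σxy / √(Σx² × Σy²), with x and y measured from their means; the helper name `pearson_r` and the paired observations are hypothetical.

```python
import math

def pearson_r(xs, ys):
    """Pearson's r computed from deviations about the two means."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    sum_xy = sum(a * b for a, b in zip(dx, dy))
    sum_x2 = sum(a * a for a in dx)
    sum_y2 = sum(b * b for b in dy)
    return sum_xy / math.sqrt(sum_x2 * sum_y2)

# Hypothetical paired observations.
x = [1, 2, 3, 4, 5]
y_up = [2, 4, 5, 4, 6]      # tends to rise as x rises
y_down = [10, 8, 6, 4, 2]   # falls by a constant amount as x rises

print(f"r for y_up   = {pearson_r(x, y_up):.3f}")
print(f"r for y_down = {pearson_r(x, y_down):.3f}")
```

The first series yields a positive r, and the second, which changes in a constant ratio in the opposite direction, yields r = -1, a perfectly negative linear correlation.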

[Figure: Linear correlation; Non-linear correlation]

Question 2: Write a short note on Karl Pearson's coefficient of correlation.
Answer: Of the several mathematical methods of measuring correlation, Karl Pearson's method, popularly known as the Pearsonian coefficient of correlation, is the most widely used in practice. The Pearsonian coefficient of correlation is denoted by the symbol r. It is one of the very few symbols used universally for describing the degree of correlation between two series. The formula for computing the Pearsonian r is:
$r = \frac{\sum xy}{\sqrt{\sum x^2 \times \sum y^2}}$
Here $x = (X - \bar{X})$, $y = (Y - \bar{Y})$.
This method is to be applied only when the deviations of items are taken from actual means and not from assumed means. The value of the coefficient of correlation obtained by the above formula always lies between -1 and +1. When r = +1, there is perfect positive correlation between the variables. When r = -1, there is perfect negative correlation between the variables. When r = 0, there is no linear relationship between the variables.

Question 3: Two judges in a beauty competition rank the 12 entries as follows:
X:
Y:

What degree of agreement is there between the judgments of the two judges?
Answer: Calculation of rank correlation coefficient
Columns: X ($R_1$), Y ($R_2$), $D = R_1 - R_2$, $D^2$
$\sum D^2 = 416$, N = 12
$R = 1 - \frac{6 \sum D^2}{N(N^2 - 1)} = 1 - \frac{6 \times 416}{12 \times 143} = 1 - \frac{2496}{1716} = 1 - 1.4545 = -0.4545$

Question 4: Write a short note on regression lines.
Answer: If we take the case of two variables X and Y, we have two regression lines: the regression of X on Y and the regression of Y on X. The regression line of Y on X gives the most probable values of Y for given values of X, and the regression line of X on Y gives the most probable values of X for given values of Y. Thus we have two regression lines. However, when there is either perfect positive or perfect negative correlation between the two variables, the two regression lines coincide, i.e., we have only one line. The farther the two regression lines are from each other, the lower the degree of correlation, and the nearer the two regression lines are to each other, the higher the degree of correlation. If the variables are independent, r is zero and the lines of regression are at right angles, i.e., parallel to OX and OY.
It should be noted that the regression lines intersect each other at the point of the averages of X and Y, i.e., if from the point where the regression lines intersect a perpendicular is drawn to the X-axis, we get the mean value of X, and if from that point a horizontal line is drawn to the Y-axis, we get the mean value of Y.
Regression equation of Y on X

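The regression equation of Y on X is expressed as follows:
$Y_c = a + bX$
To determine the values of a and b, the following two normal equations are solved simultaneously:
$\sum Y = Na + b \sum X$
$\sum XY = a \sum X + b \sum X^2$
Regression equation of X on Y
The regression equation of X on Y is expressed as follows:
$X_c = a + bY$
To determine the values of a and b, the following two normal equations are solved simultaneously:
$\sum X = Na + b \sum Y$
$\sum XY = a \sum Y + b \sum Y^2$

Question 5: From the following data obtain the regression equation of X on Y, and also that of Y on X.
X, Y
Answer: Calculation of regression equations
Columns: X, x = (X - 6), $x^2$, Y, y = (Y - 8), $y^2$, xy
$\sum X = 30$, $\sum x = 0$, $\sum x^2 = 40$, $\sum Y = 40$, $\sum y = 0$, $\sum y^2 = 20$, $\sum xy = -26$
Regression equation of X on Y:
$X - \bar{X} = r \frac{\sigma_x}{\sigma_y} (Y - \bar{Y})$, where $r \frac{\sigma_x}{\sigma_y} = \frac{\sum xy}{\sum y^2} = \frac{-26}{20} = -1.3$
X - 6 = -1.3(Y - 8)
X - 6 = -1.3Y + 10.4, or X = 16.4 - 1.3Y
Regression equation of Y on X:
$Y - \bar{Y} = r \frac{\sigma_y}{\sigma_x} (X - \bar{X})$, where $r \frac{\sigma_y}{\sigma_x} = \frac{\sum xy}{\sum x^2} = \frac{-26}{40} = -0.65$
Y - 8 = -0.65(X - 6)
Y - 8 = -0.65X + 3.9, or Y = 11.9 - 0.65X

Question 6: Calculate Karl Pearson's coefficient of correlation from the following data:
X, Y
Answer:
Columns: X, x = (X - 18), $x^2$, Y, y = (Y - 19), $y^2$, xy
$\sum X = 162$, $\sum x = 0$, $\sum x^2 = 598$, $\sum Y = 171$, $\sum y = 0$, $\sum y^2 = 338$, $\sum xy = 431$
$r = \frac{\sum xy}{\sqrt{\sum x^2 \times \sum y^2}}$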

$r = \frac{431}{\sqrt{598 \times 338}} = \frac{431}{449.58} \approx 0.959$

Question 7: What is the utility of the study of correlation?

Answer: The study of correlation is of immense use in practical life for the following reasons:

1. Most variables show some kind of relationship. For example, there is a relationship between price and supply, income and expenditure, etc. With the help of correlation analysis we can measure in one figure the degree of relationship existing between the variables.

2. Once we know that two variables are closely related, we can estimate the value of one variable given the value of the other.

3. Correlation analysis contributes to the understanding of economic behavior, aids in locating the critically important variables on which others depend, may reveal to the economist the connections by which disturbances spread, and may suggest the paths through which stabilizing forces become effective. In business, correlation analysis enables the executive to estimate costs, sales, prices and other variables on the basis of some other series with which these costs, sales or prices may be functionally related. Some guesswork can be removed from decisions when the relationship between the variable to be estimated and the one or more variables on which it depends is close and reasonably invariant. However, it should be noted that the coefficient of correlation is one of the most widely used and also one of the most widely abused statistical measures. It is abused in the sense that one sometimes overlooks the fact that r measures nothing but the strength of the linear relationship and does not necessarily imply a cause-and-effect relationship.

4. Progressive development in the methods of science and philosophy has been characterized by an increase in the knowledge of relationships or correlations. Nature has been found to be a multiplicity of interrelated forces.
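The arithmetic in the worked answers above (the rank correlation, the two regression equations of Question 5 and Pearson's r in Question 6) can be checked from the reported totals alone. The following is a minimal sketch, not part of the original notes; the function names are illustrative assumptions, and the inputs are only the sums quoted in the answers, since the raw data tables are not reproduced here. It also checks the standard identity that the product of the two regression slopes equals $r^2$.

```python
import math


def rank_correlation(sum_d2, n):
    """Rank correlation R = 1 - 6*sum(D^2) / (N^3 - N)."""
    return 1 - 6 * sum_d2 / (n ** 3 - n)


def regression_lines(mean_x, mean_y, sum_x2, sum_y2, sum_xy):
    """Return (intercept, slope) for X on Y and for Y on X.

    sum_x2, sum_y2 and sum_xy are sums of squares and products of
    deviations taken from the means.
    """
    b_xy = sum_xy / sum_y2          # slope of X on Y
    b_yx = sum_xy / sum_x2          # slope of Y on X
    a_xy = mean_x - b_xy * mean_y   # both lines pass through the means
    a_yx = mean_y - b_yx * mean_x
    return (a_xy, b_xy), (a_yx, b_yx)


def pearson_r(sum_x2, sum_y2, sum_xy):
    """Karl Pearson's r from deviation sums."""
    return sum_xy / math.sqrt(sum_x2 * sum_y2)


# Rank correlation: sum D^2 = 416, N = 12
rank_r = rank_correlation(416, 12)

# Question 5: means 6 and 8, sum x^2 = 40, sum y^2 = 20, sum xy = -26
x_on_y, y_on_x = regression_lines(6, 8, 40, 20, -26)
r_q5 = pearson_r(40, 20, -26)

# Question 6: sum x^2 = 598, sum y^2 = 338, sum xy = 431
r_q6 = pearson_r(598, 338, 431)

print(f"Rank correlation R = {rank_r:.4f}")
print(f"X on Y: intercept {x_on_y[0]:.2f}, slope {x_on_y[1]:.2f}")
print(f"Y on X: intercept {y_on_x[0]:.2f}, slope {y_on_x[1]:.2f}")
print(f"Question 5: r = {r_q5:.3f}, slope product = {x_on_y[1] * y_on_x[1]:.3f}")
print(f"Question 6: r = {r_q6:.3f}")
```

Working from the deviation sums keeps the check independent of the missing data tables; with the raw X and Y values available, the same sums could be computed first and passed to these functions.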

Measures of Central tendency

Measures of Central tendency Elementary Statistics Measures of Central tendency By Prof. Mirza Manzoor Ahmad In statistics, a central tendency (or, more commonly, a measure of central tendency) is a central or typical value for a

More information

Contents. An Overview of Statistical Applications CHAPTER 1. Contents (ix) Preface... (vii)

Contents. An Overview of Statistical Applications CHAPTER 1. Contents (ix) Preface... (vii) Contents (ix) Contents Preface... (vii) CHAPTER 1 An Overview of Statistical Applications 1.1 Introduction... 1 1. Probability Functions and Statistics... 1..1 Discrete versus Continuous Functions... 1..

More information

Dot Plot: A graph for displaying a set of data. Each numerical value is represented by a dot placed above a horizontal number line.

Dot Plot: A graph for displaying a set of data. Each numerical value is represented by a dot placed above a horizontal number line. Introduction We continue our study of descriptive statistics with measures of dispersion, such as dot plots, stem and leaf displays, quartiles, percentiles, and box plots. Dot plots, a stem-and-leaf display,

More information

MEASURES OF DISPERSION, RELATIVE STANDING AND SHAPE. Dr. Bijaya Bhusan Nanda,

MEASURES OF DISPERSION, RELATIVE STANDING AND SHAPE. Dr. Bijaya Bhusan Nanda, MEASURES OF DISPERSION, RELATIVE STANDING AND SHAPE Dr. Bijaya Bhusan Nanda, CONTENTS What is measures of dispersion? Why measures of dispersion? How measures of dispersions are calculated? Range Quartile

More information

Chapter 6 Simple Correlation and

Chapter 6 Simple Correlation and Contents Chapter 1 Introduction to Statistics Meaning of Statistics... 1 Definition of Statistics... 2 Importance and Scope of Statistics... 2 Application of Statistics... 3 Characteristics of Statistics...

More information

CHAPTER 2 Describing Data: Numerical

CHAPTER 2 Describing Data: Numerical CHAPTER Multiple-Choice Questions 1. A scatter plot can illustrate all of the following except: A) the median of each of the two variables B) the range of each of the two variables C) an indication of

More information

Some Characteristics of Data

Some Characteristics of Data Some Characteristics of Data Not all data is the same, and depending on some characteristics of a particular dataset, there are some limitations as to what can and cannot be done with that data. Some key

More information

Basic Procedure for Histograms

Basic Procedure for Histograms Basic Procedure for Histograms 1. Compute the range of observations (min. & max. value) 2. Choose an initial # of classes (most likely based on the range of values, try and find a number of classes that

More information

34.S-[F] SU-02 June All Syllabus Science Faculty B.Sc. I Yr. Stat. [Opt.] [Sem.I & II] - 1 -

34.S-[F] SU-02 June All Syllabus Science Faculty B.Sc. I Yr. Stat. [Opt.] [Sem.I & II] - 1 - [Sem.I & II] - 1 - [Sem.I & II] - 2 - [Sem.I & II] - 3 - Syllabus of B.Sc. First Year Statistics [Optional ] Sem. I & II effect for the academic year 2014 2015 [Sem.I & II] - 4 - SYLLABUS OF F.Y.B.Sc.

More information

32.S [F] SU 02 June All Syllabus Science Faculty B.A. I Yr. Stat. [Opt.] [Sem.I & II] 1

32.S [F] SU 02 June All Syllabus Science Faculty B.A. I Yr. Stat. [Opt.] [Sem.I & II] 1 32.S [F] SU 02 June 2014 2015 All Syllabus Science Faculty B.A. I Yr. Stat. [Opt.] [Sem.I & II] 1 32.S [F] SU 02 June 2014 2015 All Syllabus Science Faculty B.A. I Yr. Stat. [Opt.] [Sem.I & II] 2 32.S

More information

Simple Descriptive Statistics

Simple Descriptive Statistics Simple Descriptive Statistics These are ways to summarize a data set quickly and accurately The most common way of describing a variable distribution is in terms of two of its properties: Central tendency

More information

PSYCHOLOGICAL STATISTICS

PSYCHOLOGICAL STATISTICS UNIVERSITY OF CALICUT SCHOOL OF DISTANCE EDUCATION B Sc COUNSELLING PSYCHOLOGY (2011 Admission Onwards) II Semester Complementary Course PSYCHOLOGICAL STATISTICS QUESTION BANK 1. The process of grouping

More information

Random Variables and Probability Distributions

Random Variables and Probability Distributions Chapter 3 Random Variables and Probability Distributions Chapter Three Random Variables and Probability Distributions 3. Introduction An event is defined as the possible outcome of an experiment. In engineering

More information

Chapter 3. Numerical Descriptive Measures. Copyright 2016 Pearson Education, Ltd. Chapter 3, Slide 1

Chapter 3. Numerical Descriptive Measures. Copyright 2016 Pearson Education, Ltd. Chapter 3, Slide 1 Chapter 3 Numerical Descriptive Measures Copyright 2016 Pearson Education, Ltd. Chapter 3, Slide 1 Objectives In this chapter, you learn to: Describe the properties of central tendency, variation, and

More information

DATA SUMMARIZATION AND VISUALIZATION

DATA SUMMARIZATION AND VISUALIZATION APPENDIX DATA SUMMARIZATION AND VISUALIZATION PART 1 SUMMARIZATION 1: BUILDING BLOCKS OF DATA ANALYSIS 294 PART 2 PART 3 PART 4 VISUALIZATION: GRAPHS AND TABLES FOR SUMMARIZING AND ORGANIZING DATA 296

More information

Descriptive Statistics

Descriptive Statistics Petra Petrovics Descriptive Statistics 2 nd seminar DESCRIPTIVE STATISTICS Definition: Descriptive statistics is concerned only with collecting and describing data Methods: - statistical tables and graphs

More information

3.1 Measures of Central Tendency

3.1 Measures of Central Tendency 3.1 Measures of Central Tendency n Summation Notation x i or x Sum observation on the variable that appears to the right of the summation symbol. Example 1 Suppose the variable x i is used to represent

More information

CABARRUS COUNTY 2008 APPRAISAL MANUAL

CABARRUS COUNTY 2008 APPRAISAL MANUAL STATISTICS AND THE APPRAISAL PROCESS PREFACE Like many of the technical aspects of appraising, such as income valuation, you have to work with and use statistics before you can really begin to understand

More information

Measures of Dispersion (Range, standard deviation, standard error) Introduction

Measures of Dispersion (Range, standard deviation, standard error) Introduction Measures of Dispersion (Range, standard deviation, standard error) Introduction We have already learnt that frequency distribution table gives a rough idea of the distribution of the variables in a sample

More information

2 Exploring Univariate Data

2 Exploring Univariate Data 2 Exploring Univariate Data A good picture is worth more than a thousand words! Having the data collected we examine them to get a feel for they main messages and any surprising features, before attempting

More information

Fundamentals of Statistics

Fundamentals of Statistics CHAPTER 4 Fundamentals of Statistics Expected Outcomes Know the difference between a variable and an attribute. Perform mathematical calculations to the correct number of significant figures. Construct

More information

DESCRIPTIVE STATISTICS

DESCRIPTIVE STATISTICS DESCRIPTIVE STATISTICS INTRODUCTION Numbers and quantification offer us a very special language which enables us to express ourselves in exact terms. This language is called Mathematics. We will now learn

More information

Module Tag PSY_P2_M 7. PAPER No.2: QUANTITATIVE METHODS MODULE No.7: NORMAL DISTRIBUTION

Module Tag PSY_P2_M 7. PAPER No.2: QUANTITATIVE METHODS MODULE No.7: NORMAL DISTRIBUTION Subject Paper No and Title Module No and Title Paper No.2: QUANTITATIVE METHODS Module No.7: NORMAL DISTRIBUTION Module Tag PSY_P2_M 7 TABLE OF CONTENTS 1. Learning Outcomes 2. Introduction 3. Properties

More information

9/17/2015. Basic Statistics for the Healthcare Professional. Relax.it won t be that bad! Purpose of Statistic. Objectives

9/17/2015. Basic Statistics for the Healthcare Professional. Relax.it won t be that bad! Purpose of Statistic. Objectives Basic Statistics for the Healthcare Professional 1 F R A N K C O H E N, M B B, M P A D I R E C T O R O F A N A L Y T I C S D O C T O R S M A N A G E M E N T, LLC Purpose of Statistic 2 Provide a numerical

More information

DATA HANDLING Five-Number Summary

DATA HANDLING Five-Number Summary DATA HANDLING Five-Number Summary The five-number summary consists of the minimum and maximum values, the median, and the upper and lower quartiles. The minimum and the maximum are the smallest and greatest

More information

Frequency Distribution and Summary Statistics

Frequency Distribution and Summary Statistics Frequency Distribution and Summary Statistics Dongmei Li Department of Public Health Sciences Office of Public Health Studies University of Hawai i at Mānoa Outline 1. Stemplot 2. Frequency table 3. Summary

More information

Descriptive Statistics

Descriptive Statistics Chapter 3 Descriptive Statistics Chapter 2 presented graphical techniques for organizing and displaying data. Even though such graphical techniques allow the researcher to make some general observations

More information

Engineering Mathematics III. Moments

Engineering Mathematics III. Moments Moments Mean and median Mean value (centre of gravity) f(x) x f (x) x dx Median value (50th percentile) F(x med ) 1 2 P(x x med ) P(x x med ) 1 0 F(x) x med 1/2 x x Variance and standard deviation

More information

ECON 214 Elements of Statistics for Economists

ECON 214 Elements of Statistics for Economists ECON 214 Elements of Statistics for Economists Session 3 Presentation of Data: Numerical Summary Measures Part 2 Lecturer: Dr. Bernardin Senadza, Dept. of Economics Contact Information: bsenadza@ug.edu.gh

More information

KARACHI UNIVERSITY BUSINESS SCHOOL UNIVERSITY OF KARACHI BS (BBA) VI

KARACHI UNIVERSITY BUSINESS SCHOOL UNIVERSITY OF KARACHI BS (BBA) VI 88 P a g e B S ( B B A ) S y l l a b u s KARACHI UNIVERSITY BUSINESS SCHOOL UNIVERSITY OF KARACHI BS (BBA) VI Course Title : STATISTICS Course Number : BA(BS) 532 Credit Hours : 03 Course 1. Statistical

More information

Lecture 2 Describing Data

Lecture 2 Describing Data Lecture 2 Describing Data Thais Paiva STA 111 - Summer 2013 Term II July 2, 2013 Lecture Plan 1 Types of data 2 Describing the data with plots 3 Summary statistics for central tendency and spread 4 Histograms

More information

AP STATISTICS FALL SEMESTSER FINAL EXAM STUDY GUIDE

AP STATISTICS FALL SEMESTSER FINAL EXAM STUDY GUIDE AP STATISTICS Name: FALL SEMESTSER FINAL EXAM STUDY GUIDE Period: *Go over Vocabulary Notecards! *This is not a comprehensive review you still should look over your past notes, homework/practice, Quizzes,

More information

Moments and Measures of Skewness and Kurtosis

Moments and Measures of Skewness and Kurtosis Moments and Measures of Skewness and Kurtosis Moments The term moment has been taken from physics. The term moment in statistical use is analogous to moments of forces in physics. In statistics the values

More information

Basic Data Analysis. Stephen Turnbull Business Administration and Public Policy Lecture 3: April 25, Abstract

Basic Data Analysis. Stephen Turnbull Business Administration and Public Policy Lecture 3: April 25, Abstract Basic Data Analysis Stephen Turnbull Business Administration and Public Policy Lecture 3: April 25, 2013 Abstract Review summary statistics and measures of location. Discuss the placement exam as an exercise

More information

The Normal Distribution & Descriptive Statistics. Kin 304W Week 2: Jan 15, 2012

The Normal Distribution & Descriptive Statistics. Kin 304W Week 2: Jan 15, 2012 The Normal Distribution & Descriptive Statistics Kin 304W Week 2: Jan 15, 2012 1 Questionnaire Results I received 71 completed questionnaires. Thank you! Are you nervous about scientific writing? You re

More information

Contents Part I Descriptive Statistics 1 Introduction and Framework Population, Sample, and Observations Variables Quali

Contents Part I Descriptive Statistics 1 Introduction and Framework Population, Sample, and Observations Variables Quali Part I Descriptive Statistics 1 Introduction and Framework... 3 1.1 Population, Sample, and Observations... 3 1.2 Variables.... 4 1.2.1 Qualitative and Quantitative Variables.... 5 1.2.2 Discrete and Continuous

More information

Stat 101 Exam 1 - Embers Important Formulas and Concepts 1

Stat 101 Exam 1 - Embers Important Formulas and Concepts 1 1 Chapter 1 1.1 Definitions Stat 101 Exam 1 - Embers Important Formulas and Concepts 1 1. Data Any collection of numbers, characters, images, or other items that provide information about something. 2.

More information

Description of Data I

Description of Data I Description of Data I (Summary and Variability measures) Objectives: Able to understand how to summarize the data Able to understand how to measure the variability of the data Able to use and interpret

More information

Overview/Outline. Moving beyond raw data. PSY 464 Advanced Experimental Design. Describing and Exploring Data The Normal Distribution

Overview/Outline. Moving beyond raw data. PSY 464 Advanced Experimental Design. Describing and Exploring Data The Normal Distribution PSY 464 Advanced Experimental Design Describing and Exploring Data The Normal Distribution 1 Overview/Outline Questions-problems? Exploring/Describing data Organizing/summarizing data Graphical presentations

More information

Quantitative Methods for Economics, Finance and Management (A86050 F86050)

Quantitative Methods for Economics, Finance and Management (A86050 F86050) Quantitative Methods for Economics, Finance and Management (A86050 F86050) Matteo Manera matteo.manera@unimib.it Marzio Galeotti marzio.galeotti@unimi.it 1 This material is taken and adapted from Guy Judge

More information

HIGHER SECONDARY I ST YEAR STATISTICS MODEL QUESTION PAPER

HIGHER SECONDARY I ST YEAR STATISTICS MODEL QUESTION PAPER HIGHER SECONDARY I ST YEAR STATISTICS MODEL QUESTION PAPER Time - 2½ Hrs Max. Marks - 70 PART - I 15 x 1 = 15 Answer all the Questions I. Choose the Best Answer 1. Statistics may be called the Science

More information

1 Describing Distributions with numbers

1 Describing Distributions with numbers 1 Describing Distributions with numbers Only for quantitative variables!! 1.1 Describing the center of a data set The mean of a set of numerical observation is the familiar arithmetic average. To write

More information

2 DESCRIPTIVE STATISTICS

2 DESCRIPTIVE STATISTICS Chapter 2 Descriptive Statistics 47 2 DESCRIPTIVE STATISTICS Figure 2.1 When you have large amounts of data, you will need to organize it in a way that makes sense. These ballots from an election are rolled

More information

IOP 201-Q (Industrial Psychological Research) Tutorial 5

IOP 201-Q (Industrial Psychological Research) Tutorial 5 IOP 201-Q (Industrial Psychological Research) Tutorial 5 TRUE/FALSE [1 point each] Indicate whether the sentence or statement is true or false. 1. To establish a cause-and-effect relation between two variables,

More information

Numerical Measurements

Numerical Measurements El-Shorouk Academy Acad. Year : 2013 / 2014 Higher Institute for Computer & Information Technology Term : Second Year : Second Department of Computer Science Statistics & Probabilities Section # 3 umerical

More information

Categorical. A general name for non-numerical data; the data is separated into categories of some kind.

Categorical. A general name for non-numerical data; the data is separated into categories of some kind. Chapter 5 Categorical A general name for non-numerical data; the data is separated into categories of some kind. Nominal data Categorical data with no implied order. Eg. Eye colours, favourite TV show,

More information

STAT 157 HW1 Solutions

STAT 157 HW1 Solutions STAT 157 HW1 Solutions http://www.stat.ucla.edu/~dinov/courses_students.dir/10/spring/stats157.dir/ Problem 1. 1.a: (6 points) Determine the Relative Frequency and the Cumulative Relative Frequency (fill

More information

UNIVERSITY OF CALICUT SCHOOL OF DISTANCE EDUCATION

UNIVERSITY OF CALICUT SCHOOL OF DISTANCE EDUCATION 1. George cantor is the School of Distance Education UNIVERSITY OF CALICUT SCHOOL OF DISTANCE EDUCATION General (Common) Course of BCom/BBA/BMMC (2014 Admn. onwards) III SEMESTER- CUCBCSS QUESTION BANK

More information

appstats5.notebook September 07, 2016 Chapter 5

appstats5.notebook September 07, 2016 Chapter 5 Chapter 5 Describing Distributions Numerically Chapter 5 Objective: Students will be able to use statistics appropriate to the shape of the data distribution to compare of two or more different data sets.

More information

St. Xavier s College Autonomous Mumbai STATISTICS. F.Y.B.Sc. Syllabus For 1 st Semester Courses in Statistics (June 2015 onwards)

St. Xavier s College Autonomous Mumbai STATISTICS. F.Y.B.Sc. Syllabus For 1 st Semester Courses in Statistics (June 2015 onwards) St. Xavier s College Autonomous Mumbai STATISTICS F.Y.B.Sc Syllabus For 1 st Semester Courses in Statistics (June 2015 onwards) Contents: Theory Syllabus for Courses: S.STA.1.01 Descriptive Statistics

More information

Summarising Data. Summarising Data. Examples of Types of Data. Types of Data

Summarising Data. Summarising Data. Examples of Types of Data. Types of Data Summarising Data Summarising Data Mark Lunt Arthritis Research UK Epidemiology Unit University of Manchester Today we will consider Different types of data Appropriate ways to summarise these data 17/10/2017

More information

Biostatistics and Design of Experiments Prof. Mukesh Doble Department of Biotechnology Indian Institute of Technology, Madras

Biostatistics and Design of Experiments Prof. Mukesh Doble Department of Biotechnology Indian Institute of Technology, Madras Biostatistics and Design of Experiments Prof. Mukesh Doble Department of Biotechnology Indian Institute of Technology, Madras Lecture - 05 Normal Distribution So far we have looked at discrete distributions

More information

M249 Diagnostic Quiz

M249 Diagnostic Quiz THE OPEN UNIVERSITY Faculty of Mathematics and Computing M249 Diagnostic Quiz Prepared by the Course Team [Press to begin] c 2005, 2006 The Open University Last Revision Date: May 19, 2006 Version 4.2

More information

Example: Histogram for US household incomes from 2015 Table:

Example: Histogram for US household incomes from 2015 Table: 1 Example: Histogram for US household incomes from 2015 Table: Income level Relative frequency $0 - $14,999 11.6% $15,000 - $24,999 10.5% $25,000 - $34,999 10% $35,000 - $49,999 12.7% $50,000 - $74,999

More information

Statistics (This summary is for chapters 17, 28, 29 and section G of chapter 19)

Statistics (This summary is for chapters 17, 28, 29 and section G of chapter 19) Statistics (This summary is for chapters 17, 28, 29 and section G of chapter 19) Mean, Median, Mode Mode: most common value Median: middle value (when the values are in order) Mean = total how many = x

More information

Standardized Data Percentiles, Quartiles and Box Plots Grouped Data Skewness and Kurtosis

Standardized Data Percentiles, Quartiles and Box Plots Grouped Data Skewness and Kurtosis Descriptive Statistics (Part 2) 4 Chapter Percentiles, Quartiles and Box Plots Grouped Data Skewness and Kurtosis McGraw-Hill/Irwin Copyright 2009 by The McGraw-Hill Companies, Inc. Chebyshev s Theorem

More information

Exploring Data and Graphics

Exploring Data and Graphics Exploring Data and Graphics Rick White Department of Statistics, UBC Graduate Pathways to Success Graduate & Postdoctoral Studies November 13, 2013 Outline Summarizing Data Types of Data Visualizing Data

More information

Numerical Descriptive Measures. Measures of Center: Mean and Median

Numerical Descriptive Measures. Measures of Center: Mean and Median Steve Sawin Statistics Numerical Descriptive Measures Having seen the shape of a distribution by looking at the histogram, the two most obvious questions to ask about the specific distribution is where

More information

STATISTICAL DISTRIBUTIONS AND THE CALCULATOR

STATISTICAL DISTRIBUTIONS AND THE CALCULATOR STATISTICAL DISTRIBUTIONS AND THE CALCULATOR 1. Basic data sets a. Measures of Center - Mean ( ): average of all values. Characteristic: non-resistant is affected by skew and outliers. - Median: Either

More information

Model Paper Statistics Objective. Paper Code Time Allowed: 20 minutes

Model Paper Statistics Objective. Paper Code Time Allowed: 20 minutes Model Paper Statistics Objective Intermediate Part I (11 th Class) Examination Session 2012-2013 and onward Total marks: 17 Paper Code Time Allowed: 20 minutes Note:- You have four choices for each objective

More information

Chapter 5: Summarizing Data: Measures of Variation

Chapter 5: Summarizing Data: Measures of Variation Chapter 5: Introduction One aspect of most sets of data is that the values are not all alike; indeed, the extent to which they are unalike, or vary among themselves, is of basic importance in statistics.

More information

Math 2311 Bekki George Office Hours: MW 11am to 12:45pm in 639 PGH Online Thursdays 4-5:30pm And by appointment

Math 2311 Bekki George Office Hours: MW 11am to 12:45pm in 639 PGH Online Thursdays 4-5:30pm And by appointment Math 2311 Bekki George bekki@math.uh.edu Office Hours: MW 11am to 12:45pm in 639 PGH Online Thursdays 4-5:30pm And by appointment Class webpage: http://www.math.uh.edu/~bekki/math2311.html Math 2311 Class

More information

Probability & Statistics Modular Learning Exercises

Probability & Statistics Modular Learning Exercises Probability & Statistics Modular Learning Exercises About The Actuarial Foundation The Actuarial Foundation, a 501(c)(3) nonprofit organization, develops, funds and executes education, scholarship and

More information

AP Statistics Chapter 6 - Random Variables

AP Statistics Chapter 6 - Random Variables AP Statistics Chapter 6 - Random 6.1 Discrete and Continuous Random Objective: Recognize and define discrete random variables, and construct a probability distribution table and a probability histogram

More information

Data Analysis. BCF106 Fundamentals of Cost Analysis

Data Analysis. BCF106 Fundamentals of Cost Analysis Data Analysis BCF106 Fundamentals of Cost Analysis June 009 Chapter 5 Data Analysis 5.0 Introduction... 3 5.1 Terminology... 3 5. Measures of Central Tendency... 5 5.3 Measures of Dispersion... 7 5.4 Frequency

More information

STAT 113 Variability

STAT 113 Variability STAT 113 Variability Colin Reimer Dawson Oberlin College September 14, 2017 1 / 48 Outline Last Time: Shape and Center Variability Boxplots and the IQR Variance and Standard Deviaton Transformations 2

More information

Data that can be any numerical value are called continuous. These are usually things that are measured, such as height, length, time, speed, etc.

Data that can be any numerical value are called continuous. These are usually things that are measured, such as height, length, time, speed, etc. Chapter 8 Measures of Center Data that can be any numerical value are called continuous. These are usually things that are measured, such as height, length, time, speed, etc. Data that can only be integer

More information

Lecture 5: Fundamentals of Statistical Analysis and Distributions Derived from Normal Distributions

Lecture 5: Fundamentals of Statistical Analysis and Distributions Derived from Normal Distributions Lecture 5: Fundamentals of Statistical Analysis and Distributions Derived from Normal Distributions ELE 525: Random Processes in Information Systems Hisashi Kobayashi Department of Electrical Engineering

More information

University 18 Lessons Financial Management. Unit 12: Return, Risk and Shareholder Value

University 18 Lessons Financial Management. Unit 12: Return, Risk and Shareholder Value University 18 Lessons Financial Management Unit 12: Return, Risk and Shareholder Value Risk and Return Risk and Return Security analysis is built around the idea that investors are concerned with two principal

More information

MAS187/AEF258. University of Newcastle upon Tyne

MAS187/AEF258. University of Newcastle upon Tyne MAS187/AEF258 University of Newcastle upon Tyne 2005-6 Contents 1 Collecting and Presenting Data 5 1.1 Introduction...................................... 5 1.1.1 Examples...................................

More information

MATHEMATICS APPLIED TO BIOLOGICAL SCIENCES MVE PA 07. LP07 DESCRIPTIVE STATISTICS - Calculating of statistical indicators (1)

MATHEMATICS APPLIED TO BIOLOGICAL SCIENCES MVE PA 07. LP07 DESCRIPTIVE STATISTICS - Calculating of statistical indicators (1) LP07 DESCRIPTIVE STATISTICS - Calculating of statistical indicators (1) Descriptive statistics are ways of summarizing large sets of quantitative (numerical) information. The best way to reduce a set of

More information

Web Extension: Continuous Distributions and Estimating Beta with a Calculator

Web Extension: Continuous Distributions and Estimating Beta with a Calculator 19878_02W_p001-008.qxd 3/10/06 9:51 AM Page 1 C H A P T E R 2 Web Extension: Continuous Distributions and Estimating Beta with a Calculator This extension explains continuous probability distributions

More information

Week 1 Variables: Exploration, Familiarisation and Description. Descriptive Statistics.

Week 1 Variables: Exploration, Familiarisation and Description. Descriptive Statistics. Week 1 Variables: Exploration, Familiarisation and Description. Descriptive Statistics. Convergent validity: the degree to which results/evidence from different tests/sources, converge on the same conclusion.

More information

Chapter 7 1. Random Variables

Chapter 7 1. Random Variables Chapter 7 1 Random Variables random variable numerical variable whose value depends on the outcome of a chance experiment - discrete if its possible values are isolated points on a number line - continuous

More information

2018 CFA Exam Prep. IFT High-Yield Notes. Quantitative Methods (Sample) Level I. Table of Contents

2018 CFA Exam Prep. IFT High-Yield Notes. Quantitative Methods (Sample) Level I. Table of Contents 2018 CFA Exam Prep IFT High-Yield Notes Quantitative Methods (Sample) Level I This document should be read in conjunction with the corresponding readings in the 2018 Level I CFA Program curriculum. Some

More information

UNIT 16 BREAK EVEN ANALYSIS

UNIT 16 BREAK EVEN ANALYSIS UNIT 16 BREAK EVEN ANALYSIS Structure 16.0 Objectives 16.1 Introduction 16.2 Break Even Analysis 16.3 Break Even Point 16.4 Impact of Changes in Sales Price, Volume, Variable Costs and on Profits 16.5

More information

The Mode: An Example. The Mode: An Example. Measure of Central Tendency: The Mode. Measure of Central Tendency: The Median

The Mode: An Example. The Mode: An Example. Measure of Central Tendency: The Mode. Measure of Central Tendency: The Median Chapter 4: What is a measure of Central Tendency? Numbers that describe what is typical of the distribution You can think of this value as where the middle of a distribution lies (the median). or The value

More information

Today's Agenda Hour 1 Correlation vs association, Pearson s R, non-linearity, Spearman rank correlation,

Today's Agenda Hour 1 Correlation vs association, Pearson s R, non-linearity, Spearman rank correlation, Today's Agenda Hour 1 Correlation vs association, Pearson s R, non-linearity, Spearman rank correlation, Hour 2 Hypothesis testing for correlation (Pearson) Correlation and regression. Correlation vs association

More information

David Tenenbaum GEOG 090 UNC-CH Spring 2005

David Tenenbaum GEOG 090 UNC-CH Spring 2005 Simple Descriptive Statistics Review and Examples You will likely make use of all three measures of central tendency (mode, median, and mean), as well as some key measures of dispersion (standard deviation,

More information

starting on 5/1/1953 up until 2/1/2017.

starting on 5/1/1953 up until 2/1/2017. An Actuary s Guide to Financial Applications: Examples with EViews By William Bourgeois An actuary is a business professional who uses statistics to determine and analyze risks for companies. In this guide,

More information

A probability distribution shows the possible outcomes of an experiment and the probability of each of these outcomes.

A probability distribution shows the possible outcomes of an experiment and the probability of each of these outcomes. Introduction In the previous chapter we discussed the basic concepts of probability and described how the rules of addition and multiplication were used to compute probabilities. In this chapter we expand

More information

MBEJ 1023 Dr. Mehdi Moeinaddini Dept. of Urban & Regional Planning Faculty of Built Environment

MBEJ 1023 Dr. Mehdi Moeinaddini Dept. of Urban & Regional Planning Faculty of Built Environment MBEJ 1023 Planning Analytical Methods Dr. Mehdi Moeinaddini Dept. of Urban & Regional Planning Faculty of Built Environment Contents What is statistics? Population and Sample Descriptive Statistics Inferential

More information

CHAPTER 6. ' From the table the z value corresponding to this value Z = 1.96 or Z = 1.96 (d) P(Z >?) =

CHAPTER 6. ' From the table the z value corresponding to this value Z = 1.96 or Z = 1.96 (d) P(Z >?) = Solutions to End-of-Section and Chapter Review Problems 225 CHAPTER 6 6.1 (a) P(Z < 1.20) = 0.88493 P(Z > 1.25) = 1 0.89435 = 0.10565 P(1.25 < Z < 1.70) = 0.95543 0.89435 = 0.06108 (d) P(Z < 1.25) or Z

More information

A LEVEL MATHEMATICS ANSWERS AND MARKSCHEMES SUMMARY STATISTICS AND DIAGRAMS. 1. a) 45 B1 [1] b) 7 th value 37 M1 A1 [2]

A LEVEL MATHEMATICS ANSWERS AND MARKSCHEMES SUMMARY STATISTICS AND DIAGRAMS. 1. a) 45 B1 [1] b) 7 th value 37 M1 A1 [2] 1. a) 45 [1] b) 7 th value 37 [] n c) LQ : 4 = 3.5 4 th value so LQ = 5 3 n UQ : 4 = 9.75 10 th value so UQ = 45 IQR = 0 f.t. d) Median is closer to upper quartile Hence negative skew [] Page 1 . a) Orders

More information
