MANAGEMENT PRINCIPLES AND STATISTICS (252 BE)
Normal and Binomial Distribution Applied to Construction Management
Sampling and Confidence Intervals
Sr Tan Liat Choon
Email: tanliatchoon@gmail.com
Mobile: 016-4975551
Probability and Statistics
Probability: Population → Sample
Statistics: Sample → Population
In probability theory, the properties and characteristics of a population are assumed to be known. In statistics, only those of a sample are obtainable; those of the population are NOT known, and they are exactly what we want to learn. Based on data, we investigate appropriate models, estimate their parameters, and make scientific decisions (hypothesis testing). Since a sample is only a subset of a population, we might reach incorrect conclusions, so it is necessary to quantify and control the uncertainties and errors.
Supplements (1)
Mode: The most frequently occurring value in a data set is called the mode.
Ex1) {2, 2, 2, 3, 3, 4, 4, 4, 5, 5, 5, 5, 5}: mode = 5 (unimodal)
Ex2) {2, 2, 3, 4, 4, 5}: there are two modes, 2 and 4 (bimodal)
Midrange: (maximum + minimum)/2
Ex3) {20, 30, 34, 55, 10, 1}: midrange = (1 + 55)/2 = 28
Supplements (2)
Order Statistics: Sort a data set in ascending order.
$x_1, x_2, \ldots, x_n \;\Rightarrow\; x_{(1)} \le x_{(2)} \le \cdots \le x_{(n)}$
Ex) Data = {20, 14, 5, 6, 20}, sorted: {5, 6, 14, 20, 20}
$x_{(1)} = 5,\quad x_{(2)} = 6,\quad x_{(3)} = 14,\quad x_{(4)} = 20,\quad x_{(5)} = 20$
Supplements (3)
100p-th Percentiles: The 100p-th percentile is a number below which 100p% of the observations lie. If 100p% of the data lie below a number, then the percentile rank of that number is 100p.
How to find the 100p-th percentile:
$\text{100p-th percentile} = x_{((n+1)p)}$, that is, the $(n+1)p$-th order statistic.
Supplements (4)
100p-th Percentiles: Example) {4, 6, 17, 8, 12, 16, 13, 7}
Find the 70th percentile of the data set.
Supplements (5)
100p-th Percentiles: Example 1) {4, 6, 17, 8, 12, 16, 13, 7}
1. Sort the data set in ascending order: {4, 6, 7, 8, 12, 13, 16, 17}.
2. Find p: p = 0.7.
3. Find the (n+1)p-th order statistic: 9 × 0.7 = 6.3, so we want the 6.3-th order statistic.
The 6th order statistic is 13 and the 7th order statistic is 16. The distance between them is 3 units, and we go 30% of that distance, which is 0.9. The 70th percentile is therefore 13 + 0.9 = 13.9.
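The interpolation steps above can be sketched in Python. The function name `percentile` is hypothetical, and clamping to the smallest or largest observation when (n+1)p falls outside the range 1 to n is an assumption that the slides do not address.

```python
import math

def percentile(data, p):
    """100p-th percentile via the (n + 1) * p order-statistic rule."""
    xs = sorted(data)
    n = len(xs)
    pos = (n + 1) * p              # 1-based, possibly fractional, position
    if pos <= 1:                   # assumption: clamp below the first value
        return xs[0]
    if pos >= n:                   # assumption: clamp above the last value
        return xs[-1]
    lower = math.floor(pos)        # index of the order statistic just below
    frac = pos - lower             # fraction of the gap to travel
    return xs[lower - 1] + frac * (xs[lower] - xs[lower - 1])

p70 = percentile([4, 6, 17, 8, 12, 16, 13, 7], 0.7)
print(round(p70, 2))
```

Statistical packages use several different percentile conventions, so other tools may return a slightly different value for the same data.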
Supplements (6)
100p-th Percentiles: Example 2) {40, 6, 17, 80, 12, 7, 9}
Find the 80th percentile of the data set.
Probability
Definitions and Properties (1)
Sample Space (S): The set of all possible outcomes of an experiment is called the sample space (S).
Ex1) What is the sample space when you toss a coin? S = {Head, Tail} = {H, T}
Ex2) What if you toss the coin twice? S = ?
Ex3) What about rolling a die? S = ?
Definitions and Properties (2)
Event: An event is a subset of a sample space.
Ex1) List all events of S = {H, T}: {}, {H}, {T}, {H, T}.
Ex2) The number of all subsets of a set is 2^(number of elements in the set). For S = {H, T}: 2^2 = 4.
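As a quick check of the 2^n count, the events of a small sample space can be enumerated. This is only a sketch; the helper name `all_events` is hypothetical.

```python
from itertools import combinations

def all_events(sample_space):
    """Return every subset (event) of a finite sample space."""
    items = list(sample_space)
    return [set(c)
            for r in range(len(items) + 1)
            for c in combinations(items, r)]

coin_events = all_events(["H", "T"])
die_events = all_events(range(1, 7))
print(len(coin_events), len(die_events))
```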
Definitions and Properties (3)
Probability (P) (always between 0 and 1): a numerical quantity that expresses the likelihood of an event. The probability of an event E is written as P(E) or Pr(E).
Ex1) Probability for a fair coin: S = {H, T}. P({}) = 0, P({H}) = 1/2, P({T}) = 1/2, P({H, T}) = 1
Ex2) Probability for a fair die: S = {1, 2, 3, 4, 5, 6}
Definitions and Properties (4-1) (Set Theory)
Set operations
- Union (or): the set of all objects that are members of A, or B, or both. $A \cup B = \{x : x \in A \text{ or } x \in B\}$
- Intersection (and): the set of all objects that are members of both A and B. $A \cap B = \{x : x \in A \text{ and } x \in B\}$
- Complement (not): the set of all members of S that are not members of A. $A^c = \{x \in S : x \notin A\}$
Definitions and Properties (4-2) (Set Theory)
Let A and B be two events (S: sample space).
(a) $P(\varnothing) = P(\{\}) = 0$
(b) $P(A^c) = 1 - P(A)$
(c) $P(A \cup B) = P(A) + P(B) - P(A \cap B)$
(d) $P(S) = 1$
(e) If $A \subseteq B$, then $P(A) \le P(B)$
Definitions and Properties (4-3) (Set Theory)
Mutually exclusive: If $A \cap B = \varnothing = \{\}$, then A and B are mutually exclusive.
Independent: If $P(A \cap B) = P(A)\,P(B)$, then A and B are independent.
Ex) Suppose that P(A) = 1/2 and P(B) = 2/3 and they are independent. Calculate $P(A \cap B)$.
Ex) Suppose that P(A) = 1/2 and P(B) = 1/3 and they are mutually exclusive. Calculate $P(A \cup B)$.
Definitions and Properties (4-4) (Set Theory)
Venn Diagram: A visual representation of sets. Each event is represented as a circle and the sample space is represented as a rectangle.
Ex) A = {1, 2, 3, 4}, B = {3, 4, 5, 6}, S = {1, 2, 3, 4, 5, 6, 7}
Ex) P(A) = 0.82, P(B) = 0.31, P(A ∩ B) = 0.19
Ex) Mutually exclusive events A and B
Definitions and Properties (4-5) (Set Theory)
Example: In a certain residential suburb, 60% of all households subscribe to the metropolitan newspaper published in a nearby city, 80% subscribe to the local afternoon paper, and 50% of all households subscribe to both papers. If a household is selected at random, what is the probability that it subscribes to (1) at least one of the two newspapers and (2) exactly one of the two newspapers?
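One way to work this example is with the addition rule P(A ∪ B) = P(A) + P(B) − P(A ∩ B). The sketch below uses hypothetical variable names, and takes "exactly one" to mean the union minus the intersection.

```python
p_metro = 0.60   # P(A): subscribes to the metropolitan paper
p_local = 0.80   # P(B): subscribes to the local afternoon paper
p_both = 0.50    # P(A and B)

# (1) At least one paper: addition rule for the union.
p_at_least_one = p_metro + p_local - p_both

# (2) Exactly one paper: in the union but not in the intersection.
p_exactly_one = p_at_least_one - p_both

print(round(p_at_least_one, 2), round(p_exactly_one, 2))
```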
Definitions and Properties (5)
Conditional Probabilities: Suppose that P(B) is not equal to 0. We define the conditional probability of A, given that the event B has occurred, as
$P(A \mid B) = \dfrac{P(A \cap B)}{P(B)}$
Ex) P(A ∩ B) = 0.3, P(B) = 0.5. P(A | B) = ?
Ex) Two events A and B are independent. P(A) = 0.4, P(B) = 0.5. Then P(A | B) = ?
Definitions and Properties (5-1)
Example: Suppose that of all individuals buying a certain personal computer, 60% include a word processing program in their purchase, 40% include a spreadsheet program, and 30% include both types of programs. Consider randomly selecting a purchaser and let A = {word processing program included} and B = {spreadsheet program included}.
(1) Given that the selected individual included a spreadsheet program, what is the probability that a word processing program was also included?
(2) Given that the selected individual included a word processing program, what is the probability that a spreadsheet program was also included?
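Both questions apply the definition P(A | B) = P(A ∩ B)/P(B) with the roles of A and B swapped. A minimal sketch follows; the helper name `conditional` is hypothetical.

```python
def conditional(p_joint, p_condition):
    """P(A | B) = P(A and B) / P(B); undefined when P(B) = 0."""
    if p_condition == 0:
        raise ValueError("conditioning event must have non-zero probability")
    return p_joint / p_condition

p_a = 0.60    # word processing program included
p_b = 0.40    # spreadsheet program included
p_ab = 0.30   # both included

p_a_given_b = conditional(p_ab, p_b)   # question (1)
p_b_given_a = conditional(p_ab, p_a)   # question (2)
print(round(p_a_given_b, 2), round(p_b_given_a, 2))
```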
Definitions and Properties (6)
Multiplication Rule for P(A ∩ B):
$P(A \cap B) = P(A \mid B)\,P(B)$
Ex) P(A | B) = 0.3, P(B) = 0.5. P(A ∩ B) = ?
Ex) Two events A and B are independent. P(A) = 0.4, P(B) = 0.5. Then P(A ∩ B) = ?
If two events are independent, then P(A ∩ B) = P(A) × P(B).
Definitions and Properties (6-1)
Example: Four individuals have responded to a request by a blood bank for blood donations. None of them has donated before, so their blood types are unknown. Suppose only type A+ is desired and only one of the four actually has this type. If the potential donors are selected in random order for typing, what is the probability that at least three individuals must be typed to obtain the desired type?
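"At least three must be typed" means the first two donors typed are both not A+. The sketch below checks the multiplication rule against a brute-force enumeration of all typing orders; the donor labels are hypothetical placeholders.

```python
from fractions import Fraction
from itertools import permutations

# One A+ donor and three others (labels are illustrative only).
donors = ["A+", "other1", "other2", "other3"]

# Enumerate every equally likely typing order.
orders = list(permutations(donors))
favourable = [o for o in orders if o.index("A+") >= 2]  # A+ is 3rd or 4th
p_enumerated = Fraction(len(favourable), len(orders))

# Multiplication rule: P(1st not A+) * P(2nd not A+ | 1st not A+).
p_rule = Fraction(3, 4) * Fraction(2, 3)

print(p_enumerated, p_rule)
```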
Definitions and Properties (7)
Probability Tree Diagram: A probability tree diagram provides a convenient way to break a problem into parts and to organize the information available.
Ex) A fair coin is tossed twice. Show the tree diagram. Pr(heads on both tosses) = ?
Ex) In the Drosophila population, 30% of the flies are black and 70% are gray. Suppose that two flies are randomly chosen from the population. Find the probability that both flies are the same color. (Use the tree diagram!)
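A tree diagram can be mimicked in code by listing every root-to-leaf path and multiplying the branch probabilities along it. This sketch treats the two flies as independent draws, which is an assumption that is reasonable only when the population is large.

```python
from itertools import product

colors = {"black": 0.30, "gray": 0.70}

# Each path is (first fly, second fly); its probability is the product
# of the branch probabilities (independence assumed).
paths = {(a, b): colors[a] * colors[b]
         for a, b in product(colors, repeat=2)}

p_same_color = sum(p for (a, b), p in paths.items() if a == b)
print(round(p_same_color, 2))
```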
Probability Tree
Example 1: Suppose that a student who is about to take a multiple choice test has only learned 40% of the material covered by the exam. Thus, there is a 40% chance that she will know the answer to a question. However, even if she does not know the answer to a question, she still has a 20% chance of getting the right answer by guessing. If we choose a question at random from the exam, what is the probability that she will get it right?
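The two branches of this tree (knows the answer, does not know the answer) can be combined with the multiplication and addition rules. The sketch assumes that a student who knows the answer always gets the question right, which the example implies but does not state.

```python
p_knows = 0.40
p_right_given_knows = 1.0      # assumption: knowing the answer means a correct response
p_right_given_guess = 0.20

# Sum over both branches of the tree (law of total probability).
p_right = (p_knows * p_right_given_knows
           + (1 - p_knows) * p_right_given_guess)
print(round(p_right, 2))
```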
Probability Rules
Example 2: The relationship between hair color and eye color for a group of 1770 German men.

                      Hair color
Eye color     Brown    Black    Red    Total
Brown           400      300     20      720
Blue            800      200     50     1050
Total          1200      500     70     1770

1) P(black hair or red hair) = ?
2) P(black hair) = ?
3) P(blue eyes | black hair) = ?
4) P(red hair and brown eyes) = ?
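The four probabilities can be read off the table as ratios of counts. The sketch below stores the table in a dictionary with hypothetical key names and reads question 3 as a conditional probability (blue eyes given black hair).

```python
from fractions import Fraction

# counts[(eye color, hair color)] from the table above
counts = {
    ("brown", "brown"): 400, ("brown", "black"): 300, ("brown", "red"): 20,
    ("blue", "brown"): 800, ("blue", "black"): 200, ("blue", "red"): 50,
}
total = sum(counts.values())

def hair_total(hair):
    """Column total for one hair color."""
    return sum(c for (_, h), c in counts.items() if h == hair)

p_black_or_red = Fraction(hair_total("black") + hair_total("red"), total)
p_black = Fraction(hair_total("black"), total)
p_blue_given_black = Fraction(counts[("blue", "black")], hair_total("black"))
p_red_and_brown_eyes = Fraction(counts[("brown", "red")], total)

print(p_black_or_red, p_black, p_blue_given_black, p_red_and_brown_eyes)
```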
Density Curves
1) Relative frequency histogram (discrete) → density curve (continuous), as the class widths go to 0.
2) The density curve lies entirely above the x-axis.
3) The area under an entire density curve must be equal to 1.
4) For any two numbers A and B: area under the density curve between A and B = proportion of X values between A and B = P(A ≤ X ≤ B).
Random Variables
1) A variable that takes on numerical values that depend on the outcome of a chance operation.
2) A rule that associates a number with each outcome in a sample space.
3) A function that associates a number with each outcome in a sample space: X(s) = x (x is the value associated with the outcome s by the random variable X).
Random Variables
Example (discrete random variable): We toss a fair coin once. With S = {H, T}, we can define a random variable X by X(H) = 1 and X(T) = 0.
If X = 1, it means that we got heads. If X = 0, it means that we got tails.
P(X = 1) = ? P(X = 0) = ?
Random Variables
Example (discrete random variable):
1) We roll a fair die once. With S = {A, B, C, D, E, F}, we can define a random variable X by X(A) = 1, X(B) = 2, X(C) = 3, X(D) = 4, X(E) = 5, X(F) = 6. P(X = 1) = ? P(X = 2) = ?
2) We roll a fair die once. With S = {1, 2, 3, 4, 5, 6}, we can define a random variable X by X(s) = s (s is an outcome).
Random Variables
Example (continuous random variable): We measure and record the height of a man chosen randomly from a certain population. With S = {s : 5 ft < s < 8 ft}, we can define a random variable X by X(s) = s (s is an outcome).
Random Variables
Mean of a Random Variable: The mean of a discrete random variable X is
$\mu_X = \sum_i x_i \, P(X = x_i)$
= the expected value of X = E(X)
Random Variables
Variance of a Random Variable: The variance of a discrete random variable X is
$\sigma_X^2 = \sum_i (x_i - \mu_X)^2 \, P(X = x_i)$
= Var(X)
Random Variables
Mean of a Random Variable (Example): In preparation for an ecological study of centipedes, the floor of a beech woods is divided into a large number of one-foot squares. At a certain moment, the distribution of the number of centipedes in the squares, X, is as shown in the table.

Number of Centipedes    Percent Frequency
0                        45
1                        36
2                        14
3                         4
4                         1
Total                   100

1) The mean value of X = ?
2) The variance of X = ?
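The mean and variance formulas for a discrete random variable translate directly into sums over the table. This sketch converts the percent frequencies to probabilities; the variable names are hypothetical.

```python
# P(X = x) taken from the percent-frequency table
distribution = {0: 0.45, 1: 0.36, 2: 0.14, 3: 0.04, 4: 0.01}

# mu_X = sum of x * P(X = x)
mean_x = sum(x * p for x, p in distribution.items())

# sigma_X^2 = sum of (x - mu_X)^2 * P(X = x)
var_x = sum((x - mean_x) ** 2 * p for x, p in distribution.items())

print(round(mean_x, 2), round(var_x, 2))
```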
Counting (supplement)
X-factorial: The factorial of a non-negative integer x, denoted by x!, is the product of all positive integers less than or equal to x.
$x! = x(x-1)(x-2)\cdots(2)(1)$, with $1! = 1$ and $0! = 1$
Ex1) 5! = ?
Ex2) 3!/5! = ?
Counting (supplement)
Permutations: An ordered collection of distinct elements. The number of r-permutations (each of size r) from a set S with n elements (size n) is denoted by
${}_nP_r = \dfrac{n!}{(n-r)!}, \quad n \ge r$
When counting groups of items where the order inside the group changes the group, we use a permutation.
Counting (supplement)
Example: How many different ordered arrangements of 2 could be selected from the 3 items A, B, and C?
AB, BA, AC, CA, BC, and CB: 6 ordered arrangements are possible! Each arrangement is known as a permutation.
${}_3P_2 = \dfrac{3!}{(3-2)!} = \dfrac{3 \cdot 2 \cdot 1}{1!} = \dfrac{6}{1} = 6$
Counting (supplement)
Combinations: An unordered collection of distinct elements. The number of r-combinations (each of size r) from a set S with n elements (size n) is denoted by
${}_nC_r = \dbinom{n}{r} = \dfrac{n!}{r!\,(n-r)!}, \quad n \ge r$
When counting groups of items where the order inside the group does not change the group, we use a combination.
Counting (supplement)
Example: How many different groups of 2 could be selected from the 3 items A, B, and C?
AB, BA, AC, CA, BC, and CB: 6 ordered arrangements are possible! But here we don't care about the order, so each of the pairs (AB, BA), (AC, CA), and (BC, CB) should be counted once. Therefore, there are 3 groups.
${}_3C_2 = \dfrac{3!}{2!\,(3-2)!} = \dfrac{3 \cdot 2 \cdot 1}{2!\,1!} = \dfrac{6}{2} = 3$
Counting (supplement)
Example: There are 20 kids who have applied to be captain of a kickball team. There are 4 teams in total. In how many ways can we choose 4 kids to be captains?
${}_{20}C_4 = \dbinom{20}{4} = \dfrac{20!}{4!\,16!} = \dfrac{20 \cdot 19 \cdot 18 \cdot 17}{4 \cdot 3 \cdot 2 \cdot 1} = 4845$
Counting (supplement)
Example: Jimmy and Tom are to be assigned to 3 different jobs, each to a different job. How many different assignments are possible?

1st Job    2nd Job    3rd Job
Jimmy      Tom        -
Tom        Jimmy      -
Jimmy      -          Tom
Tom        -          Jimmy
-          Jimmy      Tom
-          Tom        Jimmy

${}_3P_2 = \dfrac{3!}{(3-2)!} = \dfrac{3 \cdot 2 \cdot 1}{1!} = 6$
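The counting examples in this supplement can be checked with Python's standard library, where `math.perm` and `math.comb` (Python 3.8 or later) compute nPr and nCr and `itertools` lists the arrangements themselves. The variable names are hypothetical.

```python
import math
from itertools import combinations, permutations

items = ["A", "B", "C"]

# Ordered arrangements of 2 out of 3 items (order matters).
ordered = list(permutations(items, 2))
# Unordered groups of 2 out of 3 items (order ignored).
unordered = list(combinations(items, 2))

# Choosing 4 captains out of 20 kids.
captains = math.comb(20, 4)

# Jimmy and Tom each take a different one of 3 jobs.
jobs = ["1st job", "2nd job", "3rd job"]
assignments = [{"Jimmy": j, "Tom": t} for j, t in permutations(jobs, 2)]

print(len(ordered), len(unordered), captains, len(assignments))
```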
Binomial Distribution
Binomial Random Variables: A random variable X = the number of successes among the n trials, where the trials satisfy the following four conditions.
(1) Binary outcomes: There are only two possible outcomes for each trial (success and failure).
(2) Independent trials: The outcomes of the trials are independent of each other.
(3) Fixed n: The number of trials, n, is fixed in advance.
(4) Same value of p: The probability of a success on a single trial is the same for all trials.
Binomial Distribution
Binomial Distribution Formula: For a binomial random variable X ~ Bin(n, p), the probability that n trials result in k successes (and n − k failures) is given by the following formula.
$P(k \text{ successes}) = P(X = k) = {}_nC_k \, p^k (1-p)^{n-k}, \quad 0 \le k \le n$
Ex) n = 4, p = 1/2, P(X = 2) = ?
Ex) n = 3, p = 1/3, P(2 successes) = ?
Ex) X ~ Bin(4, 0.3), P(X = 3) = ?
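Binomial Distribution
Example
Ex) n = 4, p = 1/2, P(X = 2) = ?
$P(X = 2) = {}_4C_2 \,(1/2)^2 (1 - 1/2)^{4-2} = \dfrac{4!}{2!\,2!} \cdot \dfrac{1}{4} \cdot \dfrac{1}{4} = \dfrac{4 \cdot 3 \cdot 2 \cdot 1}{2 \cdot 2 \cdot 4 \cdot 4} = \dfrac{3}{8}$
Ex) n = 3, p = 1/3, P(2 successes) = ?
$P(X = 2) = {}_3C_2 \,(1/3)^2 (1 - 1/3)^{3-2} = \dfrac{3!}{2!\,1!} \cdot \dfrac{1}{9} \cdot \dfrac{2}{3} = \dfrac{3 \cdot 2 \cdot 1 \cdot 2}{2 \cdot 9 \cdot 3} = \dfrac{2}{9}$
Ex) X ~ Bin(4, 0.3), P(X = 3) = ?
$P(X = 3) = {}_4C_3 \,(0.3)^3 (1 - 0.3)^{4-3} = \dfrac{4!}{3!\,1!} \cdot 0.3^3 \cdot 0.7 = 4 \cdot 0.3^3 \cdot 0.7 = 0.0756$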
Binomial Distribution
Example: Suppose we draw a random sample of five individuals from a large population in which 39% of the individuals are mutants. What is the probability of a sample containing 3 mutants and 2 nonmutants?
Binomial Distribution
Answer: n = 5 and p = .39
X = the number of successes (here, the number of mutants)
P(3 mutants) = P(X = 3)
$P(X = 3) = {}_5C_3 \,(.39)^3 (1 - .39)^{5-3} = 10\,(.39)^3 (.61)^2 = .22$
Binomial Distribution
Example (n = 5 and p = .39)

Mutants (k)    Non-mutants (n-k)    Probability
0              5                    .08
1              4                    .27
2              3                    .35
3              2                    .22
4              1                    .07
5              0                    .01
Total                               1.00

[Figure: bar chart of probability (vertical axis, 0 to 0.4) against number of mutants (horizontal axis, 0 to 5).]
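The probability column can be reproduced from the binomial formula P(X = k) = nCk p^k (1 − p)^(n − k). A minimal sketch follows; the function name `binom_pmf` is hypothetical.

```python
from math import comb

def binom_pmf(k, n, p):
    """P(X = k) for X ~ Bin(n, p)."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

n, p = 5, 0.39
table = [(k, n - k, binom_pmf(k, n, p)) for k in range(n + 1)]

for mutants, non_mutants, prob in table:
    print(mutants, non_mutants, round(prob, 2))

rounded = [round(prob, 2) for _, _, prob in table]
```

The same function also handles the earlier exercises, for example `binom_pmf(2, 4, 0.5)` for n = 4, p = 1/2.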
Binomial Distribution
Mean and Variance of a Binomial: When X ~ Bin(n, p),
1) The mean (that is, the average number of successes) = E(X) = np
2) The variance of a binomial random variable = Var(X) = np(1 − p)
==> The standard deviation of a binomial random variable = $\sqrt{np(1-p)}$
Binomial Distribution
Example: In the United States, 37% of the population has type A blood. Consider taking a sample of size 4. Let X denote the number of persons in the sample with type A blood. Find
(a) P(X = 0)
(b) P(X = 1)
(c) P(0 ≤ X ≤ 1)
(d) the mean
(e) the standard deviation
Binomial Distribution
Answer: n = 4, p = .37
X = the number of persons in the sample with type A blood
$P(X = k) = {}_4C_k \,(.37)^k (.63)^{4-k}$
(a) $P(X = 0) = {}_4C_0 \,(.37)^0 (.63)^4 = .63^4$
(b) $P(X = 1) = {}_4C_1 \,(.37)^1 (.63)^3 = 4(.37)(.63)^3$
(c) $P(0 \le X \le 1) = P(X = 0) + P(X = 1) = .63^4 + 4(.37)(.63)^3$
(d) $E(X) = np = 4(.37) = 1.48$
(e) $\sigma = \sqrt{\mathrm{Var}(X)} = \sqrt{np(1-p)} = \sqrt{4(.37)(.63)} = \sqrt{0.9324} = 0.966$
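The answers to (a) through (c) are left as expressions; the sketch below evaluates them numerically and recomputes the mean and standard deviation. The function name `binom_pmf` is hypothetical.

```python
from math import comb, sqrt

def binom_pmf(k, n, p):
    """P(X = k) for X ~ Bin(n, p)."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

n, p = 4, 0.37
p0 = binom_pmf(0, n, p)           # (a)
p1 = binom_pmf(1, n, p)           # (b)
p0_or_1 = p0 + p1                 # (c)
mean = n * p                      # (d)
sd = sqrt(n * p * (1 - p))        # (e)

print(round(p0, 4), round(p1, 4), round(p0_or_1, 4), mean, round(sd, 3))
```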
Confidence Intervals
This module explores the development and interpretation of confidence intervals, with a focus on confidence intervals for the population mean, based on the sample mean.
1. In the Chicago area, the price of new tires is normally distributed with a standard deviation of σ = $11.50. A random sample of 64 tires indicates a mean selling price of $\bar{x}$ = $98.70. Construct an 85% confidence interval for the mean selling price, µ, of this new tire in the Chicago area. To estimate the population mean to within $10.00, how large a sample should be taken in order to be 95% confident of achieving this level of accuracy?
$\bar{x} = 98.7,\quad \sigma = 11.5,\quad n = 64$
For 85% confidence, an area of .425 lies on each side of the mean, so z = ±1.44.
$\mathrm{S.E.} = \dfrac{\sigma}{\sqrt{n}} = \dfrac{11.5}{\sqrt{64}} = 1.44$
$\bar{x} \pm z \cdot \mathrm{S.E.} = 98.7 \pm (1.44)(1.44) = 98.7 \pm 2.07 = [96.63,\ 100.77]$
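The interval in this problem can be recomputed with the standard library, where `statistics.NormalDist().inv_cdf` gives the z value instead of a table lookup. This is only a sketch; the function name `z_interval` is hypothetical, and the exact z (about 1.4395) differs slightly from the tabled 1.44.

```python
from math import sqrt
from statistics import NormalDist

def z_interval(mean, sigma, n, confidence):
    """Confidence interval for a mean when sigma is known."""
    # Two-sided interval: put (1 - confidence) / 2 in each tail.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sigma / sqrt(n)
    return mean - margin, mean + margin

low, high = z_interval(98.70, 11.50, 64, 0.85)
print(round(low, 2), round(high, 2))
```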
2. Fifty electric bills from the apartments of a certain city are chosen at random. The mean electric bill was $\bar{x}$ = $109.50 with s = $21.75. The electric bills have a normal distribution. Construct a 98% confidence interval for µ.
$\bar{x} = 109.5,\quad s = 21.75,\quad n = 50$
For 98% confidence, an area of .49 lies on each side of the mean, so z = ±2.33.
$\mathrm{S.E.} = \dfrac{s}{\sqrt{n}} = \dfrac{21.75}{\sqrt{50}} = 3.07$
$\bar{x} \pm z \cdot \mathrm{S.E.} = 109.5 \pm (2.33)(3.07) = 109.5 \pm 7.15 = [102.347,\ 116.653]$
3. Thirty SAT scores were chosen at random from the records of seniors at a certain high school over the last 20 years.
770 680 510 520 660 680 660 500 570 680
800 660 560 690 700 720 800 420 600 620
660 800 420 380 740 770 680 560 500 660
Construct a 95% confidence interval for the population mean µ.
$\bar{x} = 632.33,\quad s = 116.606,\quad n = 30$
For 95% confidence, an area of .475 lies on each side of the mean, so z = ±1.96.
$\mathrm{S.E.} = \dfrac{s}{\sqrt{n}} = \dfrac{116.606}{\sqrt{30}} = 21.289$
$\bar{x} \pm z \cdot \mathrm{S.E.} = 632.33 \pm (1.96)(21.289) = 632.33 \pm 41.73 = [590.60,\ 674.06]$
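The summary statistics for the SAT sample can be recomputed from the raw scores with the `statistics` module. The sketch follows the slide in using z = 1.96 with the sample standard deviation; the variable names are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev

scores = [
    770, 680, 510, 520, 660, 680, 660, 500, 570, 680,
    800, 660, 560, 690, 700, 720, 800, 420, 600, 620,
    660, 800, 420, 380, 740, 770, 680, 560, 500, 660,
]

n = len(scores)
x_bar = mean(scores)
s = stdev(scores)                 # sample standard deviation (n - 1 divisor)
se = s / sqrt(n)

z = 1.96                          # tabled value for 95% confidence
low, high = x_bar - z * se, x_bar + z * se
print(round(x_bar, 2), round(s, 2), round(se, 2), round(low, 2), round(high, 2))
```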
4. A random sample of n = 100 voters in a community produced x = 59 voters in favor of candidate A. Estimate the fraction of the voting population favoring candidate A using a 95% and a 90% confidence interval.
$\hat{p} \pm z \cdot \mathrm{S.E.}, \quad \mathrm{S.E.} = \sqrt{\dfrac{\hat{p}(1-\hat{p})}{n}} = \sqrt{\dfrac{(.59)(.41)}{100}} = .049$
95%: $.59 \pm (1.96)(.049) = (.494,\ .686)$
90%: $.59 \pm (1.645)(.049) = (.509,\ .671)$
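Both intervals use the same standard error and differ only in the z value. A minimal sketch, with the hypothetical function name `proportion_interval` and the tabled z values passed in directly:

```python
from math import sqrt

def proportion_interval(successes, n, z):
    """Large-sample confidence interval for a population proportion."""
    p_hat = successes / n
    se = sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - z * se, p_hat + z * se

ci95 = proportion_interval(59, 100, 1.96)    # 95% confidence
ci90 = proportion_interval(59, 100, 1.645)   # 90% confidence
print([round(v, 3) for v in ci95], [round(v, 3) for v in ci90])
```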
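How many people must be asked if candidate A wants a 95% confidence interval with a margin of error of ±3%?
$\hat{p} = .59$
$z \sqrt{\dfrac{\hat{p}(1-\hat{p})}{n}} \le \mathrm{ME} \;\Rightarrow\; 1.96 \sqrt{\dfrac{(.59)(.41)}{n}} \le .03 \;\Rightarrow\; n \ge \dfrac{(1.96)^2 (.59)(.41)}{(.03)^2}$
$n \ge 1033$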
5. A recent poll cited that 76 out of 180 randomly chosen households watch at least 2 hours of public television per week. Find a 90% confidence interval for p, the proportion of households that watch at least 2 hours of public television per week.
$\hat{p} = 76/180 \approx .42$
$\hat{p} \pm z \cdot \mathrm{S.E.}, \quad \mathrm{S.E.} = \sqrt{\dfrac{\hat{p}(1-\hat{p})}{n}} = \sqrt{\dfrac{(.42)(.58)}{180}} = .037$
$.42 \pm (1.645)(.037) = (.359,\ .480)$
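How many people must be asked if the station wants a 90% confidence interval with a margin of error of ±4%?
$\hat{p} = .42$
$z \sqrt{\dfrac{\hat{p}(1-\hat{p})}{n}} \le \mathrm{ME} \;\Rightarrow\; 1.645 \sqrt{\dfrac{(.42)(.58)}{n}} \le .04 \;\Rightarrow\; n \ge \dfrac{(1.645)^2 (.42)(.58)}{(.04)^2} = 411.99$
Rounding up to the next whole person, $n \ge 412$.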
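Solving the margin-of-error inequality for n and rounding up gives the required sample size. The sketch below uses the hypothetical function name `sample_size_for_proportion` and applies it to this problem and to the candidate A problem.

```python
from math import ceil

def sample_size_for_proportion(p_hat, z, margin):
    """Smallest n with z * sqrt(p_hat * (1 - p_hat) / n) <= margin."""
    return ceil(z ** 2 * p_hat * (1 - p_hat) / margin ** 2)

n_station = sample_size_for_proportion(0.42, 1.645, 0.04)   # 90%, +/- 4%
n_candidate = sample_size_for_proportion(0.59, 1.96, 0.03)  # 95%, +/- 3%
print(n_station, n_candidate)
```

Rounding up rather than to the nearest integer matters here because any smaller n would leave the margin of error above the target.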
6. A manufacturer of gunpowder claims to have developed a gunpowder that is designed to produce a muzzle velocity of 3000 ft/sec. The following data are collected, in ft/sec:
3005 2925 2935 2965 2995 3005 2935 2905
Construct a 95% and an 85% confidence interval for µ.
$\bar{x} = 2958.75,\quad s = 39.26,\quad n = 8$ (7 d.f.)
$\mathrm{S.E.} = \dfrac{s}{\sqrt{n}} = \dfrac{39.26}{\sqrt{8}} = 13.881$
95% interval (t = 2.365 with 7 d.f.): $\bar{x} \pm t \cdot \mathrm{S.E.} = 2958.75 \pm (2.365)(13.881) = [2925.92,\ 2991.58]$
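For small samples the interval uses a t critical value with n − 1 degrees of freedom. The Python standard library has no t distribution, so this sketch takes the tabled value (2.365 for 95% confidence and 7 d.f., as on the slide) as an input; the function name `t_interval` is hypothetical.

```python
from math import sqrt
from statistics import mean, stdev

def t_interval(data, t_crit):
    """Confidence interval for a mean using a supplied t critical value."""
    n = len(data)
    x_bar = mean(data)
    se = stdev(data) / sqrt(n)     # sample standard deviation over sqrt(n)
    return x_bar - t_crit * se, x_bar + t_crit * se

velocities = [3005, 2925, 2935, 2965, 2995, 3005, 2935, 2905]
low, high = t_interval(velocities, t_crit=2.365)   # 95%, 7 d.f.
print(round(low, 2), round(high, 2))
```

Other confidence levels only require a different tabled t value for the same degrees of freedom.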
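7. The profits for a car dealership for the past week were
$210 $300 $120 $620 $450 $510
Construct a 90% confidence interval for the average profit.
$\bar{x} = 368.33,\quad s = 190.5,\quad n = 6$ (5 d.f.)
$\mathrm{S.E.} = \dfrac{s}{\sqrt{n}} = \dfrac{190.5}{\sqrt{6}} = 77.771$
90% interval (t = 2.015 with 5 d.f.): $\bar{x} \pm t \cdot \mathrm{S.E.} = 368.33 \pm (2.015)(77.771) = 368.33 \pm 156.71 = [\$211.62,\ \$525.04]$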
THANK YOU