EXCEL STATISTICAL Functions Presented by Wayne Wilmeth
Exponents 2^3 = 2*2*2 = 8
Exponent Examples
Roots ? * ? = 81 and ? * ? * ? = 27
Roots =Sqrt(81) returns 9
Roots The cube root of 27 is 27^(1/3): =27^(1/3) returns 3
Root Examples
Summations $\sum_{i=1}^{10} i^2$
Summations $\sum i^2$ Perform a mathematical operation on each number in a series, then sum the results.
Summations In $\sum_{i=1}^{10} i^2$, 1 is the Starting Number, 10 is the Ending Number, and i^2 is the Mathematical Operation: Sum(1^2, 2^2, 3^2, 4^2, 5^2, 6^2, 7^2, 8^2, 9^2, 10^2)
Using Sum() as an Array $\sum_{i=1}^{10} i^2$ is Sum(1^2,2^2,3^2,4^2,5^2,6^2,7^2,8^2,9^2,10^2). With the numbers 1 through 10 in A1:A10, this can be written as Sum(A1:A10^2). However, if you just press Enter, Excel cannot evaluate the formula. We need to tell it to process each cell in the range one at a time.
Using Sum() as an Array Press Control + Shift + Enter to make Excel process the cells in a range one at a time. Sum(A1:A10^2) = Sum(A1^2,A2^2,A3^2,A4^2,A5^2,A6^2,A7^2,A8^2,A9^2,A10^2) = Sum(1^2,2^2,3^2,4^2,5^2,6^2,7^2,8^2,9^2,10^2) = Sum(1,4,9,16,25,36,49,64,81,100) = 385
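As a cross-check outside Excel, the same sum of squares can be sketched in Python. This is only an illustration; the names `values` and `squares` are made up here, and `values` stands in for cells A1:A10 holding 1 through 10.

```python
# Stand-in for cells A1:A10, assumed to hold the numbers 1 through 10.
values = range(1, 11)

# Apply the operation (squaring) to each number, one at a time,
# which is what the array formula asks Excel to do.
squares = [v ** 2 for v in values]

# Then sum the results.
total = sum(squares)

print(squares)
print(total)  # 385
```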
Summation Examples
Average vs. Weighted Average Average Weighted Average
Average Price = Sum of Prices / Count of Prices
Transaction Date   Item     Price Per Unit   Units Sold
5/7/2007           Coffee   $20              500
6/7/2007           Coffee   $25              750
7/6/2007           Coffee   $35              200
8/9/2007           Coffee   $30              300
Total: 20 + 25 + 35 + 30 = 110
Average Price: 110 / 4 = $27.50
Weighted Average Price = Sum of (Price Per Unit * Units Sold) / Sum of Units Sold
Date       Item     Price Per Unit   Units Sold   Price * Units
5/7/2007   Coffee   $20              500          10,000
6/7/2007   Coffee   $25              750          18,750
7/6/2007   Coffee   $35              200          7,000
8/9/2007   Coffee   $30              300          9,000
Total                                1,750        44,750
Weighted Average Price: 44,750 / 1,750 = $25.57
Weighted Average Example
=SumProduct(Array1,Array2) Multiplies corresponding numbers in two different ranges together, then sums them.
Date       Item     Price Per Unit (Array 1)   Units Sold (Array 2)   Product
5/7/2007   Coffee   $20                        500                    20*500 = 10,000
6/7/2007   Coffee   $25                        750                    25*750 = 18,750
7/6/2007   Coffee   $35                        200                    35*200 = 7,000
8/9/2007   Coffee   $30                        300                    30*300 = 9,000
Sum of products: 44,750
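The coffee figures above can be checked with a short Python sketch. The helper `sumproduct` is a hypothetical stand-in written for this illustration; it is not Excel's implementation, only the same arithmetic.

```python
# Price per unit and units sold from the coffee transactions.
prices = [20, 25, 35, 30]
units = [500, 750, 200, 300]


def sumproduct(array1, array2):
    """Multiply corresponding numbers in two lists, then sum the products."""
    return sum(a * b for a, b in zip(array1, array2))


# Plain average: every price counts equally.
simple_average = sum(prices) / len(prices)

# Weighted average: each price counts in proportion to units sold.
weighted_average = sumproduct(prices, units) / sum(units)

print(simple_average)              # 27.5
print(sumproduct(prices, units))   # 44750
print(round(weighted_average, 2))  # 25.57
```

The weighted figure is lower than the plain average because the largest sale (750 units) was made at one of the lower prices.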
Wt. Avg. with SumProduct()
Combinations In a group of 7 survivors, how many different combinations of people can be selected for a three person lifeboat?
Combinations Order of Selection is Not Important
Combinations Each Person can be Selected Only Once.
Combinations =Combin(Population Size,Number Selected) In a group of 7 survivors, how many different combinations of people can be selected for a three person lifeboat? =Combin(7,3) = 35
Permutations Like Combinations; Except, Order is Important & No Duplicates
Permutations =Permut(Population Size,# Selected) In a group of 7 Contestants, how many different ordered selections of people can be made for a first, second, and third prize? =Permut(7,3) = 210
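The lifeboat and prize counts can be reproduced in Python as a rough cross-check. `math.comb` and `math.perm` (Python 3.8 or later) play the roles of Combin() and Permut() here; the variable names are made up for this sketch.

```python
import math

group_size = 7   # survivors or contestants
selected = 3     # lifeboat seats or prizes

# Order does not matter, no one is picked twice: like =Combin(7,3).
lifeboat_groups = math.comb(group_size, selected)

# Order matters (first, second, third), no one is picked twice: like =Permut(7,3).
prize_orders = math.perm(group_size, selected)

print(lifeboat_groups)  # 35
print(prize_orders)     # 210
```

Each group of 3 people can be ordered in 3! = 6 ways, which is why 210 = 35 * 6.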
Combinations Ex2 (With Replacement) You have an unlimited supply of Coconuts and Pineapples. How many different ways can you fill a bowl which will contain exactly 3 items?
Combinations Ex2 Order of Selection is Not Important
Combinations Ex2 Items can be Repeated (with Replacement)
Combinations Ex2 (With Replacement, Order Not Important) $\frac{(n+k-1)!}{(n-1)!\,k!}$ n = The number of different items available (2). k = The number of items selected (3).
Factorials (!) 4! = 1 * 2 * 3 * 4 = 24 and 6! = 1 * 2 * 3 * 4 * 5 * 6 = 720. In Excel: =Fact(4) returns 24 and =Fact(6) returns 720.
Combinations Ex2 (With Replacement, Order Not Important) $\frac{(n+k-1)!}{(n-1)!\,k!} = \frac{(2+3-1)!}{(2-1)!\,3!} = \frac{4!}{1!\,3!} = \frac{4 \cdot 3 \cdot 2 \cdot 1}{(1) \cdot 3 \cdot 2 \cdot 1}$ n = The number of different items available (2). k = The number of items selected (3).
Combinations Ex2 (With Replacement, Order Not Important) $\frac{4 \cdot 3 \cdot 2 \cdot 1}{(1) \cdot 3 \cdot 2 \cdot 1} = \frac{24}{6} = 4$ n = The number of different items available (2). k = The number of items selected (3).
Combinations Ex2 (With Replacement, Order Not Important) $\frac{(n+k-1)!}{(n-1)!\,k!}$ becomes Fact(2+3-1) / (Fact(2-1) * Fact(3)): =FACT(2+3-1)/(FACT(2-1)*FACT(3))
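To see where the answer of 4 comes from, the bowls can be listed out in Python and compared with the factorial formula. This is a sketch for illustration; the function name `combos_with_replacement` is hypothetical.

```python
from itertools import combinations_with_replacement
from math import factorial


def combos_with_replacement(n, k):
    """(n + k - 1)! / ((n - 1)! * k!), the same arithmetic as the FACT() formula."""
    return factorial(n + k - 1) // (factorial(n - 1) * factorial(k))


fruits = ["Coconut", "Pineapple"]  # n = 2 kinds of item
bowl_size = 3                      # k = 3 items selected

# List every distinct bowl: order ignored, repeats allowed.
bowls = list(combinations_with_replacement(fruits, bowl_size))
count = combos_with_replacement(len(fruits), bowl_size)

for bowl in bowls:
    print(bowl)
print(count)  # 4
```

The four bowls are: three coconuts, two coconuts and a pineapple, one coconut and two pineapples, and three pineapples.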
Combinations Ex2 (With Replacement)
Order of Operations
Permutations Ex2 (With Replacement) Like Combinations with Replacement; Except, Order is Important. (213) 740-____ How many different possibilities are there for the last 4 digits?
Permutations Ex2 Order is Important: (213) 740-1234 and (213) 740-4321 are different numbers.
Permutations Ex2 With Replacement (Items (0-9) can be Repeated): (213) 740-1111 is a valid number.
Permutations Ex2 With Replacement (Items (0-9) can be Repeated) = n^k, where n = Number of Choices Available (10) and k = Number of Items you are selecting (4): = 10^4 = 10,000
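The n^k count for the last four digits can be confirmed in Python by listing every possibility. This is only an illustrative sketch; the variable names are made up here.

```python
from itertools import product

digits = range(10)   # n = 10 choices (0 through 9) for each position
positions = 4        # k = 4 digits to fill

# Formula: n^k.
by_formula = len(digits) ** positions

# Brute force: walk through every ordered 4-digit sequence, repeats allowed.
by_listing = sum(1 for _ in product(digits, repeat=positions))

print(by_formula)  # 10000
print(by_listing)  # 10000
```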
Permutations Ex2 (with replacement)
Combinations/Permutations Summary
Category       Order Counts?   Replacement?   Function
Combinations   No              No             =Combin(pop_size,#Selected)
Combinations   No              Yes            =FACT(n+k-1)/(FACT(n-1)*FACT(k))
Permutations   Yes             No             =Permut(pop_size,#Selected)
Permutations   Yes             Yes            =n^k
Slope of a Line
Slope Example: Sales In 2001 Sales were 10K and in 2007 sales were 25K. What is the slope of the line? How much have sales gone up each year?
Slope of a Line [Chart: Sales (0K to 30K) by Year (2000 to 2007). A line rises from (2001, 10,000) to (2007, 25,000), with the Rise and the Run marked.]
Slope of a Line Slope = Rise / Run = Change in Y / Change in X = (y2 - y1) / (x2 - x1). For the points (2001, 10,000) and (2007, 25,000): Slope = (25000 - 10000) / (2007 - 2001) = 15000 / 6 = 2500
Meaning of Slope [Chart: the same Sales by Year line, from (2001, 10,000) to (2007, 25,000).] Slope = Rise / Run = 2500 / 1 Year. For each year, sales have gone up $2,500.
Excel's Slope() Function =Slope(Known Y's, Known X's)
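A rough Python sketch of what a slope function computes, applied to the 2001 and 2007 sales figures. The function below is a hypothetical least-squares helper written for this illustration (argument order copied from Slope()), not Excel's own code; with only two points it reduces to rise over run.

```python
def slope(known_ys, known_xs):
    """Least-squares slope of the line through the (x, y) points."""
    mean_x = sum(known_xs) / len(known_xs)
    mean_y = sum(known_ys) / len(known_ys)
    numerator = sum((x - mean_x) * (y - mean_y)
                    for x, y in zip(known_xs, known_ys))
    denominator = sum((x - mean_x) ** 2 for x in known_xs)
    return numerator / denominator


years = [2001, 2007]
sales = [10000, 25000]

yearly_increase = slope(sales, years)
print(yearly_increase)  # 2500.0
```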
Slope Example 2 How much do sales increase for each additional $1 of advertising?
Slope Ex 2: Solution How much do sales increase for each additional $1 of advertising? Rise / Run = 1,018.2 / 1,000. Dividing both by 1,000 gives 1.0182 / 1, so each additional $1 of advertising adds $1.0182 in sales.
Y-Intercept of a Line Where a Line Crosses the Y-Axis [Chart: Sales (0K to 30K) against Advertising Cost (1K to 8K).]
Excel's Intercept() Function =Intercept(Known Y's, Known X's) What would sales be if we spent nothing on advertising?
Y-Intercept Example What would sales be if we spent nothing on advertising?
Y-Intercept Example $123,772
Forecasting (Linear Regression) [Chart: Hardware Sales (1M to 2.25M) and New Home Startups (400K to 900K) by year, 2005 to 2010.]
Forecast() Returns the predicted value of a dependent variable (y) for a given value of the independent variable (x). =Forecast(X, Known Y's, Known X's) X: The data point you are predicting for (Housing Starts for 2010). Known Y's: Range of dependent variables (Previous Sales). Known X's: Range of independent variables (Previous Housing Starts).
=Forecast(X, Known Y's, Known X's) We are assuming a relationship between new housing construction and our in-store hardware sales. If new construction start-ups for 2010 are predicted to be 500,000, predict what our sales will be.
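The idea behind a linear forecast can be sketched in Python: fit a straight line to the known points, then read the line at the new X. The data below is hypothetical (the presentation's actual sales figures are not reproduced here) and was chosen to lie exactly on the line sales = 500 + 2 * starts so the result is easy to verify. The function names are made up for this sketch.

```python
def fit_line(known_ys, known_xs):
    """Return (slope, intercept) of the least-squares line."""
    mean_x = sum(known_xs) / len(known_xs)
    mean_y = sum(known_ys) / len(known_ys)
    slope = (sum((x - mean_x) * (y - mean_y)
                 for x, y in zip(known_xs, known_ys))
             / sum((x - mean_x) ** 2 for x in known_xs))
    intercept = mean_y - slope * mean_x
    return slope, intercept


def forecast(x, known_ys, known_xs):
    """Predicted y at x, using the same argument order as Forecast()."""
    slope, intercept = fit_line(known_ys, known_xs)
    return intercept + slope * x


# Hypothetical data, in thousands: housing starts and hardware sales.
housing_starts_k = [400, 425, 600, 800, 900]
sales_k = [1300, 1350, 1700, 2100, 2300]

predicted_sales_k = forecast(500, sales_k, housing_starts_k)
print(predicted_sales_k)  # 1500.0
```

The intercept returned by `fit_line` is the value the line takes when X is 0, which is the question Intercept() answers.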
Correlation
Correlation Is there really a correlation between our hardware sales and the number of new housing starts? =Correl(Array1,Array2) returns a value on the scale -1, -.75, -.5, -.25, 0, .25, .5, .75, 1, where -1 = Inverse Correlation, 0 = No Correlation, and 1 = Perfect Correlation.
Correl(Array1,Array2)
Correl(Array1,Array2) Is there a correlation between Interest rates and average home price in the last few years? Is there a correlation between the age of an automobile and its average resale value?
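A minimal Python sketch of a correlation coefficient, applied to the automobile question. The ages and resale values are hypothetical numbers invented for this illustration, and `correl` is a hand-written Pearson correlation, not Excel's code.

```python
from math import sqrt


def correl(array1, array2):
    """Pearson correlation coefficient between two equal-length lists."""
    mean1 = sum(array1) / len(array1)
    mean2 = sum(array2) / len(array2)
    covariance = sum((a - mean1) * (b - mean2)
                     for a, b in zip(array1, array2))
    spread1 = sum((a - mean1) ** 2 for a in array1)
    spread2 = sum((b - mean2) ** 2 for b in array2)
    return covariance / sqrt(spread1 * spread2)


# Hypothetical data: age of a car in years and its average resale value.
car_age_years = [1, 2, 3, 4, 5]
resale_value = [20000, 17000, 15000, 12000, 10000]

r = correl(car_age_years, resale_value)
print(round(r, 3))
```

With this made-up data the result is close to -1: as age goes up, resale value goes down, which is an inverse correlation.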
Forecasting Using Trend() (Linear Regression) [Chart: Hardware Sales (1M to 2.25M) and New Home Startups (400K to 900K) by year, 2005 to 2010.]
Trend() Like Forecast(), except: Trend() can return multiple predicted values (dependent variables) at once. Trend() can utilize multiple independent variables.
Trend() =Trend(Known Y's, Known X's, New X's) Known Y's: Dependent Variables (Past Sales). Known X's: Independent Variables (Past Adv. Costs). New X's: Planned Independent Variables (Planned Adv. Costs). Predict Sales for July-December using Trend().
Trend() =Trend(Known Y's, Known X's, New X's) You must: Highlight the output range first. Type the formula. Press Control + Shift + Enter.
=Trend(Known Y's, Known X's, New X's) What price should we ask for our office building given the characteristics of similar office buildings in the area?
Frequency [Chart: Number Occurring (1 to 7) for each Grade: F, D, C, B, A.]
We wish to determine the number of students getting an A, B, C, D and F based on the conditions below.
>89.99 and <=100 ------> A
>79.99 and <=89.99 ----> B
>69.99 and <=79.99 ----> C
>59.99 and <=69.99 ----> D
<=59.99 ---------------> F
=Frequency(Data Array, Bins Array) Data Array: This is the range of values you are analyzing. Bins Array: This is your grouping. The values listed are the maximums for each group and are inclusive. In this example the bins are 59.99, 69.99, 79.99, 89.99 and 100, which give the groupings: <=59.99, >59.99 and <=69.99, >69.99 and <=79.99, >79.99 and <=89.99, >89.99 and <=100.
=Frequency(Data Array, Bins Array) You must highlight your output area prior to typing the formula. When done typing, you must press: Control + Shift + Enter
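The inclusive-maximum binning can be sketched in Python. The scores below are hypothetical, and `frequency` is a made-up helper that imitates the grouping rule described above (each bin value is the inclusive maximum of its group, with one extra slot for anything above the last bin); it assumes the bins are sorted in ascending order.

```python
from bisect import bisect_left


def frequency(data, bins):
    """Count values per bin; bins are ascending, inclusive maximums."""
    counts = [0] * (len(bins) + 1)  # last slot: values above the top bin
    for value in data:
        # bisect_left finds the first bin whose maximum is >= value.
        counts[bisect_left(bins, value)] += 1
    return counts


bins = [59.99, 69.99, 79.99, 89.99, 100]       # F, D, C, B, A
scores = [95, 88, 72, 65, 50, 90, 79.99, 100, 59.99, 81]  # hypothetical

counts = frequency(scores, bins)
for grade, count in zip(["F", "D", "C", "B", "A"], counts):
    print(grade, count)
```

Note that 59.99 lands in F and 79.99 lands in C, because each bin maximum is inclusive.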
Descriptive Functions Average() Median() Mode() Stdev() StdevP() Var() VarP() Count() Max() Min()
Average(Range) Mathematical Measure of Central Tendency: Sum of Numbers / Count of Numbers. Example: 500 + 800 + 200 = 1500, and 1500 / 3 = 500.
Median(Range) The number physically in the middle of an ordered number line 11
Mode() The most Frequently Occurring Number. In 1 2 5 3 2 6 2 8 1, the mode is 2.
Stdev(Range) or StdevP(Range) Standard Deviation of a Sample (Stdev) or of a Population (StdevP). How close to the average is most of your data? [Chart: bell curve of Frequency with bands marked 68%, 95%, 99%.]
Stdev(Range) or StdevP(Range) Standard Deviation = 23 indicates that, if the scores are normally distributed, about 68% of our students are within 23 points (+/-) of the mean (76.8). About 95% of our students are within 46 points (+/-) of the mean (2 * 23). And more than 99% of our students are within 69 points (+/-) of the mean (3 * 23).
Var(Range) or VarP(Range) The Variance is the average of the squared differences from the mean. Mathematically, you typically get the Variance first and take its square root to get the Standard Deviation, which is a little more useful. Variance is: $\frac{(\text{Score}_a - \text{Mean})^2 + (\text{Score}_b - \text{Mean})^2 + (\text{Score}_c - \text{Mean})^2}{\text{Count of the number of scores}}$ Standard Deviation is: $\text{Variance}^{1/2}$ VarP() divides by the count, as shown; Var() treats the scores as a sample and divides by the count minus 1.
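A small Python sketch of the variance and standard deviation arithmetic, using three hypothetical scores. The `statistics` functions used here are the standard-library counterparts of the population and sample calculations; the variable names are made up for this illustration.

```python
import statistics

scores = [70, 80, 90]  # hypothetical scores, mean = 80

mean = statistics.mean(scores)

# Squared differences from the mean: 100, 0, 100.
squared_diffs = [(s - mean) ** 2 for s in scores]

# Population variance divides by the count (like VarP()).
population_variance = sum(squared_diffs) / len(scores)

# Sample variance divides by the count minus 1 (like Var()).
sample_variance = sum(squared_diffs) / (len(scores) - 1)

# Standard deviation is the square root of the variance.
population_sd = population_variance ** 0.5   # like StdevP()
sample_sd = sample_variance ** 0.5           # like Stdev()

print(population_variance, sample_variance)
print(population_sd, sample_sd)

# The library functions give the same answers.
print(statistics.pvariance(scores), statistics.variance(scores))
```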
Variance
Count(Range) Returns the number of cells within a range which contain a numeric value. Note that 0 is counted as a value, but blanks and cells containing text are not counted.
Max(Range) Returns the largest number within a range.
Min(Range) Returns the smallest number within a range.
Descriptive Stats Find the statistics indicated for the student scores
Descriptive Stats: Solution Find the statistics indicated for the student scores
Normal Probabilities: Binomial, Hypergeometric, Poisson [Chart: bell curve with bands marked 68%, 95%, 99%.]
Binomial Probabilities
Binomial Probability Concerns probabilities when there are only two possible outcomes. (i.e. Presence or absence of a characteristic.) Coin Toss: Heads or Tails? Men or Women: Of the 9 employees selected, what is the probability that three are women?
Binomial Probability The NormDist() examples that follow assume the data is approximately Normally Distributed. [Chart: bell curve with bands marked 68%, 95%, 99%.]
Binomial Probability With Replacement or Infinite Pool* *If sampling without replacement and the sample size is less than 10% of the population, then binomial is still a good approximation.
Binomial Probability The experiment consists of n identical trials. (Same wind velocity during all coin tosses.) The trials are independent of each other. (Outcome of one doesn't affect the other.)
=NormDist(X,Mean,StdDev,Cumulative) X: This is the value you wish to test. (i.e. probability of a basketball player making 5 free throws in a row when his ratio is 2:3.) Mean: Average value of the distribution. (i.e. 2 out of 3 or 67%.) StdDev: Standard Deviation. Cumulative: True or False. False = Exact probability (the height of the curve at X). True = Less than or equal to X.
Binomial Ex1: Exactly The average employee in the US works 40 hours per week with a standard deviation of 10 hours. What is the probability that a worker worked Exactly 40 hours? (X = 40, Cumulative = False.)
Binomial Ex2: <= 30 What is the probability that a worker worked Less than or Equal to 30 hours? (X = 30. Cumulative = True returns Less than or equal to X.)
Binomial Ex3: > 30 What is the probability that a worker worked more than 30 hours? Cumulative = True returns less than or equal to X, and a 100% probability would fill the entire curve and would be 1. Therefore, 1 - Probability of X<=30 equals Probability of X>30.
Binomial Ex3: > 30 Note that if hours are counted in whole numbers and you wanted >= 30 rather than > 30, then you would use 29 rather than 30 as X.
Binomial Ex4: >30 and <50 What is the probability that a worker worked more than 30 and less than 50 hours? Subtract X<=30 from X<=50: Prob. of X<=50 - Prob. of X<=30 (from Example 2) = Prob. of X>30 and X<50.
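The four hours-worked examples can be cross-checked in Python with `statistics.NormalDist` (Python 3.8 or later). This is a sketch under the slides' assumption of a normal distribution with mean 40 and standard deviation 10; `pdf` is used as the counterpart of Cumulative = False and `cdf` as the counterpart of Cumulative = True. The variable names are made up here.

```python
from statistics import NormalDist

hours = NormalDist(mu=40, sigma=10)

# Ex1: Cumulative = False gives the height of the curve at X = 40.
exactly_40 = hours.pdf(40)

# Ex2: Cumulative = True gives the probability of X <= 30.
at_most_30 = hours.cdf(30)

# Ex3: more than 30 is 1 minus the probability of X <= 30.
more_than_30 = 1 - hours.cdf(30)

# Ex4: between 30 and 50 is P(X <= 50) minus P(X <= 30).
between_30_and_50 = hours.cdf(50) - hours.cdf(30)

print(round(exactly_40, 4))         # 0.0399
print(round(at_most_30, 4))         # 0.1587
print(round(more_than_30, 4))       # 0.8413
print(round(between_30_and_50, 4))  # 0.6827
```

The last result is the familiar 68%: 30 and 50 are one standard deviation either side of the mean.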
Normal Inverse Function The Exact Opposite of NormDist(). NormDist() returns a probability based on a given variable X. NormInv() returns the variable X based on a given probability.
NormInv() Example 1 Average height of a man is 69.2 inches with a standard deviation of 2.5 inches. How tall would a man have to be (X) if he wanted to be taller than 75% of all men?
NormInv(Probability,Mean,StDev) Probability: This is the probability you wish to find a data point for. For example, you wish to be taller than 75% of all men. Mean: This is the Average. For example, the average height of a man is 69.2. StDev: Standard Deviation from the mean. For example, 1 standard deviation is 2.5. NormInv() returns the variable for a cumulative probability starting from the left (0%). [Chart: bell curve with mean 69.2; the left 75% of the curve ends at 70.89.]
NormInv() Example 2 The average weight of a man is 189.9 lbs with a standard deviation of 15 lbs. How much should a man weigh if he would like to weigh less than 60% of men? (In other words, only 40% of men weigh less than him.)
NormInv() Example 2 =NormInv(40%, 189.9, 15) returns 186.1 [Chart: bell curve with mean 189.9; the left 40% of the curve ends at 186.1.]
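Both NormInv() examples can be checked in Python, where `NormalDist.inv_cdf` (Python 3.8 or later) takes a cumulative probability measured from the left and returns X. This is only a cross-check sketch; the variable names are made up.

```python
from statistics import NormalDist

# Example 1: heights, mean 69.2 inches, standard deviation 2.5 inches.
heights = NormalDist(mu=69.2, sigma=2.5)
taller_than_75_percent = heights.inv_cdf(0.75)

# Example 2: weights, mean 189.9 lbs, standard deviation 15 lbs.
# Weighing less than 60% of men means 40% of men weigh less than you.
weights = NormalDist(mu=189.9, sigma=15)
lighter_than_60_percent = weights.inv_cdf(0.40)

print(round(taller_than_75_percent, 2))   # 70.89
print(round(lighter_than_60_percent, 1))  # 186.1
```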
Hypergeometric Probability
Hypergeometric Distribution When items are selected without replacement, use the Hypergeometric Distribution rather than the Binomial Distribution. All other criteria are the same as for the Binomial Distribution.
Hypergeometric Ex. 1 A bowl contains 8 pineapples and 12 coconuts.
Hypergeometric Ex. 1 If 4 items are randomly selected, what is the probability that: None are pineapples? 1 is a pineapple? 2 are pineapples? 3 are pineapples? All are pineapples?
=HyperGeomDist(# of Successes in Sample, Sample Size, # of Successes in Pop., Total Pop. Size)
# of Successes in Sample: This is the number of successes you are trying to find the probability of. For example, probability of just 1 pineapple.
Sample Size: This is the size of your sample. For example, you are selecting 4 items from the bowl.
# of Successes in Pop.: This is the total number of pineapples available in the bowl. 8 in this example.
Total Pop. Size: This is the total number of items in the bowl. In this example 20. (8 pineapples + 12 coconuts)
=HyperGeomDist(# of Successes in Sample, Sample Size, # of Successes in Pop., Total Pop. Size) A bowl contains 8 pineapples and 12 coconuts. If 4 items are randomly selected, what is the probability of: 0 pineapples? 1 pineapple? 2 pineapples? 3 pineapples? 4 pineapples?
Cumulative Hypergeometric Probabilities All possible probabilities add up to 100%. So, for cumulative probabilities, simply add the individual probabilities.
Cumulative Hypergeometric Probabilities Probability of 2 or Less: 10.22% + 36.33% + 38.14% = 84.69%
Cumulative Hypergeometric Probabilities Probability of 3 or More: 13.87% + 1.44% = 15.31%
Cumulative Hypergeometric Probabilities Probability of less than 2: 10.22% + 36.33% = 46.54% (computed from the unrounded values)
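The pineapple probabilities can be reproduced in Python from combinations. The function `hypergeom` below is a hypothetical helper written for this sketch, with its arguments in the same order as HyperGeomDist(); it uses `math.comb` (Python 3.8 or later).

```python
from math import comb


def hypergeom(sample_successes, sample_size, pop_successes, pop_size):
    """Probability of exactly `sample_successes` when drawing without replacement."""
    ways_to_pick_successes = comb(pop_successes, sample_successes)
    ways_to_pick_the_rest = comb(pop_size - pop_successes,
                                 sample_size - sample_successes)
    return ways_to_pick_successes * ways_to_pick_the_rest / comb(pop_size, sample_size)


# 8 pineapples and 12 coconuts in the bowl, 4 items selected.
probabilities = [hypergeom(k, 4, 8, 20) for k in range(5)]

for k, p in enumerate(probabilities):
    print(k, "pineapples:", round(p * 100, 2), "%")

two_or_less = sum(probabilities[:3])
three_or_more = sum(probabilities[3:])
less_than_two = sum(probabilities[:2])
print(round(two_or_less * 100, 2))    # 84.69
print(round(three_or_more * 100, 2))  # 15.31
print(round(less_than_two * 100, 2))  # 46.54
```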
Cumulative Hypergeometric Probabilities
Poisson Distribution
Poisson Distribution Used for the number of times an event occurs over a fixed period of time. The average number of events for that period must be known.
Poisson Distribution What is the probability that on a given day, the city of Redondo Beach will experience 3 fires? What is the probability that more than 10 cars will arrive at a tollbooth within a minute?
=Poisson(# of Events Being Tested, Known Average, Cumulative?) # of Events Being Tested: This is the value being tested. For example, number of fires in a day or cars at a tollbooth. Known Average: The Poisson distribution requires that the average is known. For example, there is an average of 4 fires per day in Redondo Beach. Cumulative?: True = Less than or equal to the value being tested. False = Exactly the value being tested.
Poisson Ex1: Exact # On average, there are 4 fires in Redondo Beach per day. What is the probability that on a given day there are no fires? =Poisson(# of Events, Average, Cumulative) with # of Events = 0, Average = 4, Cumulative = False.
Poisson Ex2: Less than or Equal to On average, there are 4 fires in Redondo Beach per day. What is the probability that on a given day there are 3 fires or fewer? =Poisson(# of Events, Average, Cumulative) with # of Events = 3, Average = 4, Cumulative = True.
Poisson Ex3: Greater Than On average, the Vincent toll booth gets 8 cars per minute. What is the probability that it will receive more than 10 cars in a given minute? Poisson cumulative probabilities are inclusive and go from left to right. Therefore, to get the probability of a value greater than 10: 100% Probability (1) - Probability of 10 Cars or Fewer = Probability of More than 10 Cars.
Poisson Ex4: Less Than On average, the Vincent toll booth gets 8 cars per minute. What is the probability that it will receive fewer than 10 cars in a given minute? Cumulative Poisson Probabilities are inclusive: 10 as the number of events means <=10. If you want just <10, then use 9 as the number of events.
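The four Poisson examples can be worked through in Python as a cross-check. The function `poisson` below is a hypothetical helper written for this sketch, with its arguments in the same order as Poisson(); it is the textbook Poisson formula, not Excel's code.

```python
from math import exp, factorial


def poisson(events, average, cumulative):
    """Probability of exactly `events` (or `events` or fewer, if cumulative)."""
    def exactly(k):
        return exp(-average) * average ** k / factorial(k)

    if cumulative:
        # Inclusive: adds up 0, 1, ..., events.
        return sum(exactly(k) for k in range(events + 1))
    return exactly(events)


# Ex1: no fires in a day, average of 4 per day.
no_fires = poisson(0, 4, False)

# Ex2: 3 fires or fewer in a day.
three_or_fewer = poisson(3, 4, True)

# Ex3: more than 10 cars in a minute, average of 8 per minute.
more_than_10 = 1 - poisson(10, 8, True)

# Ex4: fewer than 10 cars means 9 or fewer.
fewer_than_10 = poisson(9, 8, True)

print(round(no_fires, 4))        # 0.0183
print(round(three_or_fewer, 4))  # 0.4335
print(round(more_than_10, 4))    # 0.1841
print(round(fewer_than_10, 4))   # 0.7166
```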
Interval Estimation Around the Mean How far above and below the sample average you can expect the true mean to be, given a specific level of confidence.
=Confidence(Alpha, Standard Deviation, Sample Size) Alpha: This is 1 minus the confidence level. For example, a 95% confidence level would be 1 - .95 = .05. Standard Deviation: This is the standard deviation. Sample Size: Size of the sample.
Confidence() Example =Confidence(Alpha, Standard Deviation, Sample Size) On average, 64 randomly selected workers can make 60 sprockets per hour with a standard deviation of 16 sprockets. Determine the 95% confidence interval for the true mean of hourly output.
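The sprocket example can be sketched in Python. The helper `confidence` below is hypothetical and assumes the usual normal-based margin of error, z * standard deviation / sqrt(sample size), with its arguments in the same order as Confidence(); the result is added to and subtracted from the sample average to get the interval.

```python
from math import sqrt
from statistics import NormalDist


def confidence(alpha, standard_dev, size):
    """Margin of error around the mean for a (1 - alpha) confidence level."""
    # Two-sided: split alpha between the two tails of the curve.
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return z * standard_dev / sqrt(size)


sample_mean = 60                  # sprockets per hour
margin = confidence(0.05, 16, 64)

low = sample_mean - margin
high = sample_mean + margin

print(round(margin, 2))  # 3.92
print(round(low, 2))     # 56.08
print(round(high, 2))    # 63.92
```

Under these assumptions, the 95% confidence interval for the true mean runs from about 56.08 to 63.92 sprockets per hour.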