Lecture Data Science
1 Web Science & Technologies University of Koblenz Landau, Germany Lecture Data Science Statistics Foundations JProf. Dr. Claudia Wagner
2 Learning Goals: How to describe sample data? What is mode/median/mean? What is variance/std? What is kurtosis/skewness? What types of data exist? Which statistic can be computed on nominal data? What is a probability? Which distribution describes the success probability of an experiment? Which distribution describes the success probability of multiple experiments?
3 Statistics. Aim: learn something about a population by analyzing sample data. Probability reasons from the population to the sample; descriptive statistics summarize the sample; inferential statistics reason from the sample back to the population.
4 Types of Data (Statistician Viewpoint). Ratio (e.g., weight): has an absolute zero. Interval (e.g., temperature in Celsius): distance is meaningful. Ordinal (e.g., status): observations can be ordered. Nominal (e.g., ethnic group, sex, nationality): observations are only named. Source: Stevens, S. S. (1946). "On the Theory of Scales of Measurement". Science 103 (2684).
5 Why should we care?
6 Frequencies. Absolute frequency h_k; relative frequency (proportion) f_k; cumulative frequency c_t, for which the order of the values needs to be meaningful. Observations: Y Y N Y Y N N.
7 Observations: Y Y N Y Y N N. Absolute frequencies: Y = 4, N = 3. Relative frequencies: Y = 4/7, N = 3/7.
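The three frequency types can be computed directly from the observations on the slide. The following is a minimal sketch; the variable names are illustrative, and the ordering "N" before "Y" used for the cumulative frequency is an arbitrary assumption, since a nominal Y/N variable has no meaningful order.

```python
from collections import Counter
from itertools import accumulate

observations = ["Y", "Y", "N", "Y", "Y", "N", "N"]

# Absolute frequency h_k: how often each value occurs
absolute = Counter(observations)

# Relative frequency f_k: absolute frequency divided by the sample size
n = len(observations)
relative = {value: count / n for value, count in absolute.items()}

# Cumulative frequency c_t: running total over an ordered list of values.
# Assumption: alphabetical order, chosen only to make the example run.
ordered_values = sorted(absolute)
running_totals = accumulate(absolute[value] for value in ordered_values)
cumulative = dict(zip(ordered_values, running_totals))

print(absolute, relative, cumulative)
```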
8 Mode. The mode is the value that appears most often in a set of data. It already applies to nominal data and can be used for all types of data. What is the mode of X = [17, 19, 20, 21, 22, 23, 23, 23, 23]? (It is 23.)
9 Median. X = [17, 19, 20, 21, 22, 23, 23, 23, 23, 25]: the median of X is 22.5, the mean of the two middle values 22 and 23. X = [17, 19, 20, 21, 22, 23, 23, 23, 23]: the median of X is 22. The median is useful for skewed distributions, where the mean can be misleading. Applies to ordinal, interval and ratio data.
10 Mean (expected value). Applies to interval and ratio scales: $\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i$. Example: X = [17, 19, 20, 21, 22, 23, 23, 23, 23, 25] gives $\bar{x} = 216/10 = 21.6$.
11 Mode, median, mean. [Figure: mode, median and mean of two log-normal distributions]
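As a quick check of the three measures of central tendency, the ten-value sample X from the median and mean slides can be passed to Python's standard `statistics` module. This is only an illustrative sketch, not part of the lecture material.

```python
import statistics

X = [17, 19, 20, 21, 22, 23, 23, 23, 23, 25]

mode = statistics.mode(X)      # most frequent value
median = statistics.median(X)  # mean of the two middle values for even n
mean = statistics.mean(X)      # sum of the values divided by n

print(mode, median, mean)
```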
12 Sample of Dogs. Range: $R = x_{max} - x_{min}$; here the range is 430 mm. The average height of a dog (measured at the shoulders) is 394 mm.
13 Dispersion. Variance: $\sigma^2 = \frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x})^2$, the average squared deviation from the mean.
14 Standard Deviation. The standard deviation is just the square root of the variance. We can now show what lies within 1 std of the mean. This helps us to assess what is normal and what is extra large or extra small.
15 Standard Deviation. The standard deviation does not measure how far typical values tend to be from the mean, because squaring gives large deviations extra weight. How could we compute that?
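A small sketch of these dispersion measures follows. The dog heights are not reproduced in the transcription, so the example reuses the sample X from the earlier slides. It uses the population forms (division by n), which is an assumption about the convention intended on the slide, and it offers the mean absolute deviation as one possible answer to the question of how far typical values are from the mean.

```python
import statistics

X = [17, 19, 20, 21, 22, 23, 23, 23, 23, 25]

mean = statistics.mean(X)
variance = statistics.pvariance(X)  # population variance: divide by n
std = statistics.pstdev(X)          # square root of the variance

# Mean absolute deviation: the average distance of a value from the mean
mean_abs_dev = sum(abs(x - mean) for x in X) / len(X)

# Values that lie within one standard deviation of the mean
within_one_std = [x for x in X if abs(x - mean) <= std]

print(variance, std, mean_abs_dev, within_one_std)
```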
16 Percentiles. The n-th percentile is a value such that n% of all observations fall at or below it. Quartiles: Q1 is the value for which 25% of all observations fall at or below it.
17 Boxplots. $IQR = Q_3 - Q_1$. A common convention (Tukey's) flags values more than 1.5 IQR above the third quartile or below the first quartile as outliers; values 3 IQR or more beyond the quartiles, the threshold used here, are extreme outliers.
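The quartile and outlier rule can be sketched as below. The helper `iqr_outliers` and its fence factor `k` are hypothetical names introduced for illustration; `k=3` corresponds to the threshold on the slide and `k=1.5` to the widely used default. Quartile definitions differ between tools, and the `"inclusive"` method here is just one choice. The value 40 is made up to trigger the rule.

```python
import statistics

def iqr_outliers(data, k=1.5):
    """Return the values lying more than k * IQR outside the quartiles."""
    q1, _median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [x for x in data if x < low or x > high]

X = [17, 19, 20, 21, 22, 23, 23, 23, 23, 25]

no_outliers = iqr_outliers(X)                 # nothing is far from the box
with_outlier = iqr_outliers(X + [40], k=3)    # 40 is an invented extreme value

print(no_outliers, with_outlier)
```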
18 Skewness. Skewness quantifies how symmetrical a distribution is. A symmetrical distribution has a skewness of zero. Negative values for the skewness indicate that the data is skewed left (skewness < 0). Positive values indicate that the data is skewed right (skewness > 0).
19 Kurtosis. Kurtosis quantifies how peaked a distribution is compared to a normal distribution. Under the excess-kurtosis convention (kurtosis minus 3), a normal distribution has a kurtosis of 0. A flatter distribution has a negative kurtosis (kurtosis < 0); a distribution more peaked than a normal distribution has a positive kurtosis (kurtosis > 0).
20 Kurtosis. Fourth central moment divided by the squared variance: $\frac{\frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x})^4}{\sigma^4}$. Numerator: weights up strong deviations from the mean (how long are the tails?). Denominator: a large variance decreases kurtosis. Large kurtosis: low variance and long tails.
21 Normal Distribution. [Figure] More peaked than the normal distribution: positive kurtosis!
22 Normal Distribution. [Figure] Flatter than the normal distribution: negative kurtosis!
23 Normal Distribution. [Figure] Right skewed: positive skewness value!
24 Normal Distribution. [Figure] Left skewed: negative skewness value!
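Both shape measures can be written in terms of central moments. The sketch below assumes the moment-based (population) definitions and subtracts 3 so that a normal distribution scores 0; the function names and the two tiny samples are illustrative, not taken from the lecture.

```python
def central_moment(data, k):
    """k-th central moment: the average of (x - mean)**k."""
    mean = sum(data) / len(data)
    return sum((x - mean) ** k for x in data) / len(data)

def skewness(data):
    # Third central moment scaled by the cubed standard deviation
    return central_moment(data, 3) / central_moment(data, 2) ** 1.5

def excess_kurtosis(data):
    # Fourth central moment over the squared variance, minus 3 (normal = 0)
    return central_moment(data, 4) / central_moment(data, 2) ** 2 - 3

symmetric = [1, 2, 3, 4, 5]       # evenly spread, no tail on either side
right_skewed = [1, 1, 1, 2, 10]   # one large value forms a right tail

print(skewness(symmetric), skewness(right_skewed), excess_kurtosis(symmetric))
```

The evenly spread sample is flatter than a normal distribution, so its excess kurtosis comes out negative.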
25 WHAT IS A PROBABILITY?
26 Random Variables. Discrete random variable: can take on only a discrete set of values; you can count the values it can take on. Continuous random variable: can take on any value in an interval.
27 Discrete Random Variable X. X = (S, P), where S is a finite set of values and P: S → [0, 1] with $\sum_{s \in S} P(s) = 1$ is the probability mass function. Example: S = {2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12}. What does this mean? [Figure: probability mass function of the sum of two dice; y-axis from 0 to 0.18]
28 Expected Value / Expectation. Which value of the random variable do we expect? E.g., for a die E(X) = 3.5. The expected value is the long-run average.
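The two-dice example and the expected value of a single die can be reproduced by enumerating all outcomes. This sketch assumes fair six-sided dice and uses exact fractions so that the probabilities sum to exactly 1.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# All 36 equally likely outcomes of rolling two fair dice
sums = Counter(a + b for a, b in product(range(1, 7), repeat=2))

# Probability mass function over S = {2, ..., 12}
pmf = {s: Fraction(count, 36) for s, count in sorted(sums.items())}

# Expected value: each value weighted by its probability
expected_one_die = sum(Fraction(face, 6) for face in range(1, 7))
expected_two_dice = sum(s * p for s, p in pmf.items())

print(pmf[7], expected_one_die, expected_two_dice)
```

The most likely sum is 7 with probability 1/6, about 0.167, which matches the peak just below 0.18 on the plot's axis.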
29 Continuous Random Variable X. f(x) is the probability density function (PDF). The probability that someone is exactly 180 cm tall is zero. Better: ask for P(|x - 180 cm| < 0.1 cm) or P(179 cm < x < 181 cm), which is the area under the curve over that interval. [Figure: density f(x) over height x, marked at 180 cm]
30 Expected Value / Expectation. Which value of the random variable do we expect? E.g., for height, length of a snowboard, or weight. E(X) = 168 cm.
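To make "area under the curve" concrete, the sketch below assumes that height follows a normal distribution with the mean of 168 cm from the slide. The normal shape and the standard deviation of 10 cm are assumptions made only for this illustration; the slides do not state them.

```python
from statistics import NormalDist

# Assumption: normal heights, mean 168 cm (from the slide), sd 10 cm (invented)
height = NormalDist(mu=168, sigma=10)

def interval_probability(dist, low, high):
    """Area under the PDF between low and high, via the CDF."""
    return dist.cdf(high) - dist.cdf(low)

p_wide = interval_probability(height, 179, 181)
p_narrow = interval_probability(height, 179.9, 180.1)
p_point = interval_probability(height, 180, 180)  # a single point has no area

print(p_wide, p_narrow, p_point)
```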
31 Example. Equal number of black and red balls in an urn: p = 1 - p = 0.5. You win if you pick a red ball. Bernoulli distribution, probability mass function: $P(X = x) = p^x (1 - p)^{1 - x}$ for $x \in \{0, 1\}$.
32 Bernoulli. Single experiment: draw one ball. Repeat the experiment 5 times. Observe: BBBBR. What is the probability of observing this sequence? 0.5 * 0.5 * 0.5 * 0.5 * 0.5 = 0.03125. Let's assume 30% of the balls are red and 70% are black: 0.7 * 0.7 * 0.7 * 0.7 * 0.3 = 0.07203.
33 Likelihood for a Bernoulli: $L(p) = \prod_{i=1}^{n} p^{x_i} (1 - p)^{1 - x_i} = p^k (1 - p)^{n - k}$, where k is the number of successes among the n draws.
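The probability of the observed sequence BBBBR can be computed as a function of the share of red balls. The sketch assumes independent draws (drawing with replacement), and the grid search at the end is only a rough illustration of how the likelihood points to a plausible parameter; the function name is hypothetical.

```python
def sequence_likelihood(sequence, p_red):
    """Probability of an ordered sequence of independent draws ('R' or 'B')."""
    likelihood = 1.0
    for ball in sequence:
        likelihood *= p_red if ball == "R" else 1 - p_red
    return likelihood

observed = "BBBBR"
fair_urn = sequence_likelihood(observed, 0.5)   # 0.5 ** 5
red_30 = sequence_likelihood(observed, 0.3)     # 0.7 ** 4 * 0.3

# Coarse grid search: which p_red makes the observed sequence most likely?
grid = [i / 100 for i in range(1, 100)]
best_p = max(grid, key=lambda p: sequence_likelihood(observed, p))

print(fair_urn, red_30, best_p)
```

With one red ball in five draws, the grid maximum lands at the observed proportion of red balls.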
34 Binomial Distribution. What if drawing 5 balls becomes one experiment? If we repeat the experiment, what is the probability of observing k successes out of n? Success is, e.g., picking a red ball. PMF of a binomial distribution: $P(X = k) = \binom{n}{k} p^k (1 - p)^{n - k}$, where $\binom{n}{k} = \frac{n!}{k!\,(n - k)!}$ is the number of ways to choose k elements from a set of n elements disregarding their order.
35 Change Parameters. p and n are the parameters of the binomial distribution.
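The effect of changing the parameters can be explored with a few lines of Python. This is a sketch using the standard binomial PMF; the function names and the three parameter settings are illustrative choices, not values from the slides.

```python
from math import comb

def binom_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def binom_distribution(n, p):
    """PMF values for k = 0, ..., n."""
    return [binom_pmf(k, n, p) for k in range(n + 1)]

# Vary p and n to see how the shape of the distribution changes
for n, p in [(5, 0.5), (5, 0.3), (10, 0.3)]:
    probabilities = binom_distribution(n, p)
    print(n, p, [round(q, 3) for q in probabilities])
```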
36 Discrete Random Variable X. One experiment: toss 4 coins. Each coin shows either head H or tail T. Number of all possible outcomes? 2^4 = 16.
37 Discrete Random Variable X. 4 coin tosses: the number of heads takes values in S = {0, 1, 2, 3, 4}. Number of all possible outcomes? 2^4 = 16. Number of outcomes that give 3 heads = 4!/(3!*1!) = 4, the number of ways to choose k elements from a set of n elements disregarding their order. Probability of observing 3 heads: 4/16 = 0.25.
38 Discrete Random Variable X. One experiment: toss 4 coins. Each coin shows either head H or tail T. Number of all possible outcomes? 2^4 = 16. Number of outcomes that give you 3 heads = 4 (HHHT, HHTH, HTHH, THHH).
39 Exploit the fact that you know the PMF of the binomial distribution. Probability of observing 3 heads when we toss a fair coin 4 times (no order): 4!/(3!*1!) * 0.5^3 * 0.5^1 = 0.25.
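The counting argument for the four coin tosses can be checked by brute force. The sketch enumerates every outcome of a fair coin tossed four times and counts those with exactly three heads.

```python
from itertools import product

# Every ordered outcome of four tosses, e.g. ('H', 'T', 'H', 'H')
outcomes = list(product("HT", repeat=4))

# Outcomes with exactly three heads, regardless of their position
three_heads = [outcome for outcome in outcomes if outcome.count("H") == 3]

probability = len(three_heads) / len(outcomes)
print(len(outcomes), len(three_heads), probability)
```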
40 What is the probability of observing exactly 3 THREE when rolling 4 dice (no order)? 4!/(3!*1!) * (1/6)^3 * (5/6)^1 = 20/1296 ≈ 0.0154. Check by counting: there are 6^4 = 1296 combinations if you roll 4 dice. How many of them show 3 THREE? 4!/(3!*1!) * 5 = 20, and 20/1296 ≈ 0.0154. What is the probability of observing 3 THREE in a row when rolling a die 4 times (order)? 2 * (1/6)^3 * (5/6) = 10/1296 ≈ 0.0077.
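Both dice questions can be verified by enumerating all rolls. The sketch assumes fair six-sided dice and reads "3 THREE in a row" as exactly three threes occupying adjacent positions; the helper name is hypothetical.

```python
from itertools import product

# All ordered outcomes of rolling a die four times
rolls = list(product(range(1, 7), repeat=4))

# Exactly three of the four dice show a three (position does not matter)
exactly_three = [roll for roll in rolls if roll.count(3) == 3]

def has_run_of_three(roll):
    """True if the first three or the last three dice all show a three."""
    return roll[0:3] == (3, 3, 3) or roll[1:4] == (3, 3, 3)

# Exactly three threes that are also adjacent
three_in_a_row = [roll for roll in exactly_three if has_run_of_three(roll)]

print(len(rolls), len(exactly_three), len(three_in_a_row))
print(len(exactly_three) / len(rolls), len(three_in_a_row) / len(rolls))
```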
41 What is the probability of observing at least 5 heads when we flip a fair coin 6 times (no order)?
42 Discrete Random Variable X. 6 times repeated coin toss: S = {0, 1, 2, 3, 4, 5, 6}. Number of all possible outcomes? 2^6 = 64. Number of outcomes that give you at least 5 heads = 6!/(5!*1!) + 6!/(6!*0!) = 6 + 1 = 7. Probability: 7/64 ≈ 0.109. With the binomial PMF: 6!/(5!*1!) * 0.5^5 * 0.5^1 + 6!/(6!*0!) * 0.5^6 = 0.109375.
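An "at least k" probability is a sum of binomial PMF terms. The sketch below uses exact fractions so that the result can be compared with 7/64 directly; the function name is illustrative.

```python
from fractions import Fraction
from math import comb

def prob_at_least(k_min, n, p):
    """P(X >= k_min) for X ~ Binom(n, p): sum the PMF from k_min to n."""
    return sum(
        comb(n, k) * p**k * (1 - p) ** (n - k)
        for k in range(k_min, n + 1)
    )

# At least 5 heads in 6 tosses of a fair coin
result = prob_at_least(5, 6, Fraction(1, 2))
print(result, float(result))
```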
43 Probability Distribution. Why do we care? It lets us compute the probability of observations: how likely is an observation given certain parameters?
44 Example. [Figure]
45 Why should we care? Most of the time we do not know the parameter of the true distribution that generated our sample data. 1. But we can test hypotheses about the parameter: if we observe 5 heads in 6 coin tosses, how plausible is it that the coin was fair? 2. And we can estimate the parameter from the observed sample data (inference): if we observe 5 heads in 6 coin tosses, what was the parameter p of the coin?
46 HYPOTHESIS TESTING
47 Hypothesis Testing. Example: my hypothesis is that the coin is unfair. We create a null hypothesis which would falsify our hypothesis if it were true. Then we try to reject the null hypothesis (with a certain error probability). Null hypothesis H_0: X ~ Binom(n, p = 0.5). Alternative hypothesis H_A: X ~ Binom(n, p ≠ 0.5).
48 Can we verify my hypothesis, and if so, how? We can never verify a hypothesis; we can only reject it! If we can reject H_0, we have more evidence supporting the assumption that H_A can be true, but we do not know it. Which experiment could we conduct in order to test whether we should reject H_0? Repeated coin toss experiments. How many heads do we observe? How many heads would we expect if H_0 were true?
49 Remember: the binomial distribution shows how likely it is that we observe X heads. Parameters: number of trials per experiment n = 6 and fairness of the coin p = 0.5. Reference distribution: we expect the outcome of the experiment to look like this if p = 0.5 and n = 6. [Figure: binomial distribution for n = 6, p = 0.5; the highest bar is 0.3125 at 3 heads]
50 Hypothesis Testing: 3 Steps. (1) Compute a suitable test statistic t_obs from the observed sample data, e.g., the number of heads observed in our experiment is 5. (2) Compare it to a reference distribution, which describes what your data looks like if the null hypothesis is true, e.g., expected number of heads = 3. (3) Find out whether t_obs lies in the critical region of the reference distribution.
51 Remember: the binomial distribution shows how likely it is that we observe X heads. Parameters: number of trials n = 6 and fairness of the coin p = 0.5. [Figure: binomial distribution for n = 6, p = 0.5, with the critical region of the reference distribution marked]
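The three steps can be sketched for the coin example with 5 heads in 6 tosses. The significance level of 0.05 and the particular two-sided rule (summing all outcomes that are no more likely than the observed one under H_0) are assumptions made for this illustration; the slides do not fix either choice.

```python
from fractions import Fraction
from math import comb

def binom_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

# Reference distribution under H_0: X ~ Binom(n=6, p=0.5)
n, p0 = 6, Fraction(1, 2)
reference = {k: binom_pmf(k, n, p0) for k in range(n + 1)}

def two_sided_p_value(t_obs):
    """Total probability of outcomes no more likely than t_obs under H_0."""
    return sum(q for q in reference.values() if q <= reference[t_obs])

alpha = Fraction(5, 100)  # assumed significance level, not from the slides

# Critical region: outcomes extreme enough to reject H_0 at level alpha
critical_region = [k for k in reference if two_sided_p_value(k) <= alpha]

t_obs = 5
p_value = two_sided_p_value(t_obs)
reject_h0 = t_obs in critical_region

print(critical_region, p_value, reject_h0)
```

Under these assumptions only 0 or 6 heads fall in the critical region, so 5 heads in 6 tosses is not enough to reject the fair-coin hypothesis.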
52 Learning Goals: How to describe sample data? What is mode/median/mean? What is variance/std? What is kurtosis/skewness? What types of data exist? Which statistic can be computed on nominal data? What is a probability? Which distribution describes the success probability of an experiment? Which distribution describes the success probability of multiple experiments?
More informationSection Random Variables and Histograms
Section 3.1 - Random Variables and Histograms Definition: A random variable is a rule that assigns a number to each outcome of an experiment. Example 1: Suppose we toss a coin three times. Then we could
More informationExperimental Probability - probability measured by performing an experiment for a number of n trials and recording the number of outcomes
MDM 4U Probability Review Properties of Probability Experimental Probability - probability measured by performing an experiment for a number of n trials and recording the number of outcomes Theoretical
More informationRandom variables The binomial distribution The normal distribution Sampling distributions. Distributions. Patrick Breheny.
Distributions September 17 Random variables Anything that can be measured or categorized is called a variable If the value that a variable takes on is subject to variability, then it the variable is a
More informationThe Binomial Distribution
Patrick Breheny September 13 Patrick Breheny University of Iowa Biostatistical Methods I (BIOS 5710) 1 / 16 Outcomes and summary statistics Random variables Distributions So far, we have discussed the
More informationThe Normal Distribution & Descriptive Statistics. Kin 304W Week 2: Jan 15, 2012
The Normal Distribution & Descriptive Statistics Kin 304W Week 2: Jan 15, 2012 1 Questionnaire Results I received 71 completed questionnaires. Thank you! Are you nervous about scientific writing? You re
More informationNOTES TO CONSIDER BEFORE ATTEMPTING EX 2C BOX PLOTS
NOTES TO CONSIDER BEFORE ATTEMPTING EX 2C BOX PLOTS A box plot is a pictorial representation of the data and can be used to get a good idea and a clear picture about the distribution of the data. It shows
More informationProbability Theory. Probability and Statistics for Data Science CSE594 - Spring 2016
Probability Theory Probability and Statistics for Data Science CSE594 - Spring 2016 What is Probability? 2 What is Probability? Examples outcome of flipping a coin (seminal example) amount of snowfall
More informationM3S1 - Binomial Distribution
M3S1 - Binomial Distribution Professor Jarad Niemi STAT 226 - Iowa State University September 28, 2018 Professor Jarad Niemi (STAT226@ISU) M3S1 - Binomial Distribution September 28, 2018 1 / 28 Outline
More informationx is a random variable which is a numerical description of the outcome of an experiment.
Chapter 5 Discrete Probability Distributions Random Variables is a random variable which is a numerical description of the outcome of an eperiment. Discrete: If the possible values change by steps or jumps.
More informationModule 3: Sampling Distributions and the CLT Statistics (OA3102)
Module 3: Sampling Distributions and the CLT Statistics (OA3102) Professor Ron Fricker Naval Postgraduate School Monterey, California Reading assignment: WM&S chpt 7.1-7.3, 7.5 Revision: 1-12 1 Goals for
More informationSimple Descriptive Statistics
Simple Descriptive Statistics These are ways to summarize a data set quickly and accurately The most common way of describing a variable distribution is in terms of two of its properties: Central tendency
More information5.2 Random Variables, Probability Histograms and Probability Distributions
Chapter 5 5.2 Random Variables, Probability Histograms and Probability Distributions A random variable (r.v.) can be either continuous or discrete. It takes on the possible values of an experiment. It
More information2 DESCRIPTIVE STATISTICS
Chapter 2 Descriptive Statistics 47 2 DESCRIPTIVE STATISTICS Figure 2.1 When you have large amounts of data, you will need to organize it in a way that makes sense. These ballots from an election are rolled
More informationMEASURES OF CENTRAL TENDENCY & VARIABILITY + NORMAL DISTRIBUTION
MEASURES OF CENTRAL TENDENCY & VARIABILITY + NORMAL DISTRIBUTION 1 Day 3 Summer 2017.07.31 DISTRIBUTION Symmetry Modality 单峰, 双峰 Skewness 正偏或负偏 Kurtosis 2 3 CHAPTER 4 Measures of Central Tendency 集中趋势
More informationECON 214 Elements of Statistics for Economists 2016/2017
ECON 214 Elements of Statistics for Economists 2016/2017 Topic Probability Distributions: Binomial and Poisson Distributions Lecturer: Dr. Bernardin Senadza, Dept. of Economics bsenadza@ug.edu.gh College
More informationMATH 112 Section 7.3: Understanding Chance
MATH 112 Section 7.3: Understanding Chance Prof. Jonathan Duncan Walla Walla University Autumn Quarter, 2007 Outline 1 Introduction to Probability 2 Theoretical vs. Experimental Probability 3 Advanced
More information32.S [F] SU 02 June All Syllabus Science Faculty B.A. I Yr. Stat. [Opt.] [Sem.I & II] 1
32.S [F] SU 02 June 2014 2015 All Syllabus Science Faculty B.A. I Yr. Stat. [Opt.] [Sem.I & II] 1 32.S [F] SU 02 June 2014 2015 All Syllabus Science Faculty B.A. I Yr. Stat. [Opt.] [Sem.I & II] 2 32.S
More information