
CS134: Networks Spring 2017
Prof. Yaron Singer
Section 0

1 Probability

1.1 Random Variables and Independence

A real-valued random variable is a variable that can take each of a set of possible values in R with a certain probability. Two random variables X and Y are said to be independent if P(X = x | Y = y) = P(X = x) for all values x of X and y of Y.

1.2 Probability Distribution Function (PDF)

The probability distribution of a real-valued random variable X specifies the probability that X takes each of its possible values. For example, suppose you flip a coin two times. This simple statistical experiment can have four possible outcomes: HH, HT, TH, and TT. Now, let the variable X represent the number of Heads that result from this experiment. X is a random variable that can take the values 0, 1, and 2. The table below, which associates each outcome with its probability, is the probability distribution of the discrete random variable X:

Number of heads    Probability
0                  0.25
1                  0.50
2                  0.25

1.3 Cumulative Probability Distributions (CDF)

The cumulative probability distribution of a real-valued random variable X specifies, for each of the possible values x that X can take, the probability that X takes a value equal to or smaller than x. Let us return to the coin flip experiment. If we flip a coin two times, we might ask: what is the probability that the coin flips result in one or fewer heads? It is the probability that the experiment results in zero heads plus the probability that it results in one head:

P(X <= 1) = P(X = 0) + P(X = 1) = 0.25 + 0.50 = 0.75
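The two-flip calculation above can be checked by enumerating the sample space directly; a minimal Python sketch (the variable names are ours, not part of the course materials):

```python
from itertools import product
from fractions import Fraction

# Enumerate the four equally likely outcomes of two fair coin flips
# and tabulate the distribution of X = number of heads.
pdf = {}
for outcome in product("HT", repeat=2):
    x = outcome.count("H")
    pdf[x] = pdf.get(x, Fraction(0)) + Fraction(1, 4)

print({x: float(p) for x, p in sorted(pdf.items())})
# {0: 0.25, 1: 0.5, 2: 0.25}

# The CDF accumulates the PDF: P(X <= 1) = P(X = 0) + P(X = 1)
cdf_1 = float(pdf[0] + pdf[1])
print(cdf_1)  # 0.75
```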

The table below gives the cumulative probability distribution of the discrete random variable X:

Number of heads    Probability P(X = x)    Cumulative Probability P(X <= x)
0                  0.25                    0.25
1                  0.50                    0.75
2                  0.25                    1.00

Sanity Test: Are PDFs guaranteed to be non-decreasing? What about CDFs?

Solution: PDFs are not guaranteed to be non-decreasing (for example, the PDF of any normal distribution is not non-decreasing). In contrast, CDFs are always non-decreasing. [1]

1.4 The Mean of a Discrete Random Variable

The mean or expectation of a real-valued random variable X is a weighted average of the possible values that the random variable can take. For example, if the random variable X can take values in a finite [2] set A ⊆ R, its mean E(X) is defined by

E(X) := Σ_{x ∈ A} P(x) · x

Unlike the sample mean of a group of observations, which gives each observation equal weight, the mean of a random variable weights each outcome x according to its probability, P(x). The mean of a random variable provides the long-run average of the variable, or the expected average outcome over many observations.

For example, suppose an individual plays a gambling game where it is possible to lose $1, break even, win $3, or win $5 each time she plays. Let Y be the random variable that takes the value y ∈ {-1, 0, 3, 5} when the outcome of the gamble is y. The probability distribution of Y is provided by the following table. Calculate the expectation of Y.

y     Probability
-1    0.3
0     0.4
3     0.2
5     0.1

Solution: The mean of the random variable Y can be calculated as follows:

µ(Y) = (-1 · 0.3) + (0 · 0.4) + (3 · 0.2) + (5 · 0.1) = -0.3 + 0 + 0.6 + 0.5 = 0.8

[1] To see this, consider that the CDF is a sum (or integral) of the PDF over an interval; specifically, for a discrete random variable defined over (a, b), P(X <= x) = Σ_{i=a}^{x} P(X = i), and for a continuous random variable, P(X <= x) = ∫_a^x f(t) dt, where f is the PDF of X. PDFs are nonnegative, so x1 < x2 implies P(X <= x1) <= P(X <= x2); therefore, CDFs are non-decreasing.
[2] If A is infinite, we replace sums with integrals.

In the long run, then, the player can expect to win about 80 cents playing this game; the odds are in her favor.

Properties of Expectation

For any two random variables X and Y, E(X + Y) = E(X) + E(Y).
For two independent random variables X and Y, E(XY) = E(X)E(Y).

1.5 The Variance of a Discrete Random Variable

The variance of a real-valued random variable X measures the spread, or variability, of the distribution of X. For example, suppose X can take values in a finite [3] set A ⊆ R. Then the variance Var(X) of X is defined by

Var(X) := Σ_{x ∈ A} P(x) · (x - E(X))²

For example, in the gambling game above, the variance of the random variable Y may be calculated as follows:

Var(Y) = (-1 - 0.8)² · 0.3 + (0 - 0.8)² · 0.4 + (3 - 0.8)² · 0.2 + (5 - 0.8)² · 0.1
       = (-1.8)² · 0.3 + (-0.8)² · 0.4 + (2.2)² · 0.2 + (4.2)² · 0.1
       = 3.24 · 0.3 + 0.64 · 0.4 + 4.84 · 0.2 + 17.64 · 0.1
       = 0.972 + 0.256 + 0.968 + 1.764
       = 3.96

Hence, the variance and standard deviation of Y are 3.96 and √3.96 ≈ 1.99, respectively. Unlike the sample variance of a group of observations, which gives each observation equal weight, the variance of a random variable weights each outcome x according to its probability, P(x). The standard deviation of the random variable X is the square root of its variance and is sometimes denoted by σ. Note that standard deviation has a more natural interpretation than variance: variance is measured in squared units, whereas standard deviation is in the same units as the random variable itself.

Properties of Variance

For any random variable X:
Var(X) >= 0
Var(X) = E(X²) - E(X)²

For independent random variables X and Y:
Var(X + Y) = Var(X - Y) = Var(X) + Var(Y)

[3] If A is infinite, we replace sums with integrals.
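The expectation and variance of the gambling example can be reproduced in a few lines; a sketch (the `dist` dictionary simply transcribes the table from Section 1.4):

```python
# Probability distribution of the gambling example: outcome -> probability.
dist = {-1: 0.3, 0: 0.4, 3: 0.2, 5: 0.1}

mean = sum(p * y for y, p in dist.items())               # E(Y)
var = sum(p * (y - mean) ** 2 for y, p in dist.items())  # Var(Y)
std = var ** 0.5                                         # standard deviation

print(round(mean, 10), round(var, 10), round(std, 2))  # 0.8 3.96 1.99
```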

2 Examples of Common Probability Distributions

2.1 Uniform Probability Distribution

The simplest probability distribution occurs when all of the values of a random variable occur with equal probability. This probability distribution is called the uniform distribution. For example, suppose that a fair die is tossed. Let X be the random variable that takes the value x when the die lands on x, for x ∈ {1, 2, 3, 4, 5, 6}. Since the probability that a fair die lands on each of the possible numbers is the same, the random variable X has a uniform distribution.

2.2 Binomial Distribution

A Bernoulli trial is a random experiment with exactly two possible outcomes, "success" and "failure", in which the probability of success p is the same every time the experiment is conducted. Closely related to a Bernoulli trial is a binomial experiment, which consists of a fixed number of statistically independent Bernoulli trials, each with a probability of success p, and counts the number of successes. A random variable corresponding to a binomial experiment with n Bernoulli trials is denoted by B(n, p), and is said to have a binomial distribution. The probability of exactly k successes in the binomial experiment is given by:

P(X = k) = (n choose k) p^k (1 - p)^(n - k)

where (n choose k) := n! / ((n - k)! k!), which reads "n choose k", is the number of different sets of k elements we can choose out of a set of n elements. For example, suppose we create an edge between each pair of nodes in the set {a, b, c, d} independently with probability p. The random variable "number of edges in the network" is distributed according to a binomial distribution B(n, p), where n = 6 is the number of pairs of nodes.

2.2.1 Poisson Distribution

The random variable X has a Poisson distribution with parameter λ > 0 if, for k = 0, 1, 2, ...:

P(X = k) = λ^k e^{-λ} / k!

The positive real number λ is equal to the mean and variance of the random variable X. That is, µ(X) = λ and Var(X) = λ.
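The random-network example can be simulated directly: each of the 6 node pairs gets an edge with probability p, so the edge count is B(6, p). A short sketch (the sampler and the choice p = 0.5 are ours, for concreteness):

```python
import random
from itertools import combinations

random.seed(0)  # fixed seed so the run is reproducible

# Random graph on {a, b, c, d}: each of the 6 node pairs gets an edge
# independently with probability p, so the edge count is B(6, p).
nodes = ["a", "b", "c", "d"]
pairs = list(combinations(nodes, 2))
p = 0.5

def sample_edge_count():
    return sum(random.random() < p for _ in pairs)

counts = [sample_edge_count() for _ in range(100_000)]
avg = sum(counts) / len(counts)
print(round(avg, 2))  # close to the binomial mean n*p = 3.0
```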
The Poisson distribution can be derived as a limiting case of the binomial distribution as the number of trials goes to infinity and the expected number of successes remains fixed. Therefore it can be used as an approximation to the binomial distribution when n is sufficiently large and p is sufficiently small. There is a rule of thumb stating that the Poisson distribution with λ = np is a good approximation of the binomial distribution B(n, p) if n is at least 20 and p is smaller than or equal to 0.05, and an excellent approximation if n is at least 100 and np is smaller than 10.

Figure 1: Graph A
Figure 2: Graph B

3 The Exponential and Logarithm Functions

The exponential function e^x can be defined in various ways that provide useful formulas. Fixing x ∈ R,

lim_{n→∞} (1 + x/n)^n = e^x

The identity above will frequently prove useful in the analysis of networks. When you see an expression like (1 + x/n)^n, remember that you can approximate it by e^x when n is large. Another definition of e^x is given by

e^x = Σ_{k=0}^{∞} x^k / k!

The logarithm function log(x) is the inverse of the exponential function e^x. That is, for all x we have log(e^x) = x, and for all x > 0 we have e^{log(x)} = x.

4 Graph Theory

4.1 Graph representations. (Jackson 2.1.1 and 2.1.2)

A graph G = (V, E) is a set of nodes V and a set of node pairs E, called edges. An undirected graph is a graph in which the existence of an edge from a node u to a node v implies the existence of an edge from v to u. A directed graph is a graph in which the existence of an edge from u to v does not necessarily imply the existence of an edge from v to u. Unless stated otherwise, assume the graph is undirected.

There are three ways to represent a graph:

Picture: A graph can be represented via a picture. For example, Figures 1 and 2 represent

two different graphs. [4]

Adjacency matrix: A graph can be represented by listing its nodes {1, 2, ..., n} and a real-valued n × n matrix g, where g_ij equals 1 if there is an edge between i and j and equals 0 otherwise. [5] For example, graph A in Figure 1 can be represented by listing its nodes {a, b, c, d, e, f} and the adjacency matrix:

0 0 1 0 0 0
0 0 0 1 0 0
1 0 0 0 0 0
0 1 0 0 1 1
0 0 0 1 0 1
0 0 0 1 1 0

List of nodes and edges: A graph can be represented by listing its nodes and edges. [6] For example, graph A can be represented as a list:

({a, b, c, d, e, f}, {ac, bd, de, df, ef})

4.1.1 Exercises

a. Represent graph B in Figure 2 via its adjacency matrix and via the list of its nodes and edges.

b. Graph C has nodes {1, 2, 3, 4, 5, 6} and the adjacency matrix:

0 1 1 1 1 1
1 0 0 0 0 0
1 0 0 0 0 0
1 0 0 0 0 0
1 0 0 0 0 0
1 0 0 0 0 0

Represent graph C via a picture and via the list of its nodes and edges. [7]

c. The representation of graph D in terms of a list of its nodes and edges is:

({a, b, c, d}, {ab, ac, ad, bc, bd, cd})

Represent graph D via a picture and via its adjacency matrix.

Solution:

[4] Figures are located on the last page of this document.
[5] A weighted graph is one whose adjacency matrix contains numbers other than 0 and 1, but for now we will focus on unweighted graphs.
[6] Convention: for any objects x_1, x_2, ..., x_n, we denote by {x_1, x_2, ..., x_n} the set that contains the elements x_1, x_2, ..., x_n, and we denote by (x_1, x_2, ..., x_n) the ordered set (or list) that contains x_1 as its first element, x_2 as its second element, and in general x_i as its i-th element. The distinction is important. For example, we have {x_1, x_2} = {x_2, x_1}, but in general (x_1, x_2) ≠ (x_2, x_1).
[7] This type of graph is sometimes called a star graph.

Figure 3: Network C
Figure 4: Network D

a. Adjacency matrix:

0 1 1 1 0 0 0 0 0 0 0 0
1 0 1 1 0 0 0 0 0 0 0 0
1 1 0 1 0 0 0 0 0 0 0 0
1 1 1 0 0 0 0 0 0 0 0 0
0 0 0 0 0 1 1 0 0 0 0 0
0 0 0 0 1 0 0 1 0 0 0 0
0 0 0 0 1 0 0 1 0 0 0 0
0 0 0 0 0 1 1 0 0 1 0 0
0 0 0 0 0 0 0 0 0 0 1 1
0 0 0 0 0 0 0 1 0 0 0 1
0 0 0 0 0 0 0 0 1 0 0 1
0 0 0 0 0 0 0 0 1 1 1 0

List of nodes and edges:

({a, b, c, d, e, f, g, h, i, j, k, l}, {ab, ac, ad, bc, bd, cd, ef, eg, fh, gh, hj, ik, il, jl, kl})

b. Picture: See Figure 3. List of nodes and edges:

({1, 2, 3, 4, 5, 6}, {12, 13, 14, 15, 16})

c. Picture: See Figure 4. Adjacency matrix:

0 1 1 1
1 0 1 1
1 1 0 1
1 1 1 0

4.2 Subgraphs.

A subgraph of a given graph G is a graph that can be obtained from G by deleting some of its nodes and edges. For example, the graph ({a, c}, {ac}) is a subgraph of graph A, but the graph

({a, b, c}, {bc}) is not. Represent a subgraph of graph B via a picture, via its adjacency matrix, and via the list of its nodes and edges.

4.3 Paths and Cycles (Jackson 2.1.3)

A walk in a graph G between nodes i and j is a sequence of edges in G, say (i_1 i_2, i_2 i_3, ..., i_{k-1} i_k), with i_1 = i and i_k = j. A path in a graph G between nodes i and j is a sequence of edges in G, say (i_1 i_2, i_2 i_3, ..., i_{k-1} i_k), with i_1 = i and i_k = j, and such that each node in the sequence (i_1, i_2, ..., i_k) is distinct. A cycle is a walk that starts and ends at the same node, with no node appearing more than once except the starting node, which also appears as the ending node. For example, (de, ef, fd) is a cycle in graph A.

A geodesic between two nodes is a shortest path between them; that is, a path with no more edges than any other path between these nodes. For example, in graph A, (df) is a geodesic between nodes d and f, but (de, ef) is not.

4.4 Components and Connected Subgraphs. (Jackson 2.1.5)

A graph is connected if every two nodes are connected by some path. For example, graph A is not connected, but its subgraph ({b, d, e, f}, {bd, df, de, ef}) is connected. A graph is completely connected if it has an edge between every two nodes.

A component C of graph G is a (nonempty) subgraph that (i) is connected and (ii) is maximal, i.e. all other connected subgraphs of G with some node in common with C are subgraphs of C. [8] For example, both ({a, c}, {ac}) and ({b, d, e, f}, {bd, de, df, ef}) are components of graph A, but ({d, e, f}, {de, df, ef}) is not, since it is a subgraph of the component ({b, d, e, f}, {bd, df, de, ef}) and hence not maximal.

a. Write down all the components of graph B via a list of their nodes and edges.

Solution:

a. ({a, b, c, d}, {ab, ac, ad, bc, bd, cd}) and ({e, f, g, h, i, j, k, l}, {ef, eg, fh, gh, hj, ik, il, jl, kl})

[8] Intuitively, a component is a piece of the graph not connected to anything else.
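The components of graph B (and the degree counts used in Section 4.6) can be verified programmatically; a sketch using a depth-first search over graph B's edge list (the helper names are ours):

```python
from collections import defaultdict

# Graph B as a list of nodes and edges (from the solution to Exercise 4.1.1a).
nodes = list("abcdefghijkl")
edges = ["ab", "ac", "ad", "bc", "bd", "cd", "ef", "eg", "fh",
         "gh", "hj", "ik", "il", "jl", "kl"]

# Build adjacency sets from the edge list.
adj = defaultdict(set)
for u, v in edges:
    adj[u].add(v)
    adj[v].add(u)

def components(nodes, adj):
    # Find connected components with an iterative depth-first search.
    seen, comps = set(), []
    for start in nodes:
        if start in seen:
            continue
        stack, comp = [start], set()
        while stack:
            u = stack.pop()
            if u in comp:
                continue
            comp.add(u)
            stack.extend(adj[u] - comp)
        seen |= comp
        comps.append(sorted(comp))
    return comps

comps = components(nodes, adj)
print(comps)
# [['a', 'b', 'c', 'd'], ['e', 'f', 'g', 'h', 'i', 'j', 'k', 'l']]

degrees = {u: len(adj[u]) for u in nodes}
avg_degree = sum(degrees.values()) / len(nodes)
print(avg_degree)  # 2.5
```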

4.5 Neighborhood. (Jackson 2.1.7)

The neighborhood of a node i is the set of nodes that i is linked to. For example, the neighborhood of the node f in graph A is {d, e}. Given a set of nodes S, the neighborhood of S is the union of the neighborhoods of all of its members; that is, the set of nodes that have at least one edge to some node in S. For example, the neighborhood of the set of nodes {a, b} in graph A is {c, d}.

a. What is the neighborhood of node d in graph B?
b. What is the neighborhood of the set of nodes {e, g} in graph B?

Solution:

a. {a, b, c}
b. {e, f, g, h}. Note that {e, g} is a subset of the neighborhood of {e, g}, since e neighbors g and vice versa.

4.6 Degree and Graph Density. (Jackson 2.1.8)

The degree of a node is the number of links that involve that node. For example, the degrees of the nodes a and d in graph A are 1 and 3, respectively.

In graph B:

a. What is the degree of node a?
b. What is the average degree of all the nodes in the graph?
c. What is the average degree of node a's neighbors?

Solution:

a. 3
b. In alphabetical order, the nodes' degrees are 3, 3, 3, 3, 2, 2, 2, 3, 2, 2, 2, 3. The average degree is then 30/12 = 2.5.
c. The average degree of a's neighbors is 3.

5 Big-O Notation

5.1 Formal Definition

Big-O notation is mathematical notation used to describe the behavior of a function as its argument approaches infinity. Formally, O(·) is defined as follows: f(x) = O(g(x)) as x → ∞ if there

exists a positive constant M and a threshold x_0 such that for all x >= x_0, |f(x)| <= M · g(x). Using this definition, we can see that two linear functions, e.g. f(x) = 5x and g(x) = x, satisfy the relationship f(x) = O(x), because for M = 5 the definition holds. In fact, for any monomial f(x) = a · x^b (x^7, 6x, 9, etc.), f(x) = O(x^b).

It is useful to think of O(·) as an inequality, i.e. g(x) upper-bounds f(x): f(x) "<=" g(x) in the limit. We then denote ">=" with Ω: f(x) = Ω(g(x)) if there exists a positive constant M and a threshold x_0 such that for all x >= x_0, f(x) >= M · g(x). For strict inequalities, we say that f(x) = o(g(x)) (small o) if for every positive constant M there is a threshold x_0 such that f(x) < M · g(x) for all x >= x_0, and similarly for ω. Finally, we say that f(x) = Θ(g(x)) if f(x) = O(g(x)) and f(x) = Ω(g(x)), which can be thought of as "equality".

5.2 In Practice

In practice, it is cumbersome to work directly with the definition of big-O notation, so we employ a number of rules to simplify our analysis:

If f(x) is a sum of terms, we only need to consider the term with the largest growth rate. For polynomials, this means only considering the term with the highest exponent.
If f(x) is a product of terms, we can ignore multiplicative constants.

5.3 The Growth of Some One-Variable Functions

Given two increasing functions f, g : R → R, we are interested in answering the question of which of the two functions is larger for large x. Here we will provide some illustrations using the functions x (yellow), x + 5 (magenta), x - 5 (cyan), x^5 (red), e^x (green), log(x) (blue), and √x (black).
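The dominant-term rule from Section 5.2 can be seen numerically: for a polynomial f, the ratio f(x)/x^b (where b is the highest exponent) approaches 1 as x grows, so the lower-order terms become irrelevant. A small sketch with an example polynomial of our choosing:

```python
# f(x) = x^2 + 1000x + 5 is Theta(x^2): the lower-order terms vanish
# relative to x^2, so f(x) / x^2 -> 1 as x grows.
def f(x):
    return x**2 + 1000 * x + 5

for x in [10**3, 10**5, 10**7]:
    print(x, f(x) / x**2)  # the ratio shrinks toward 1
```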

Figure 5:
Figure 6:

Figure 7:
Figure 8:

Figure 9: