M241 Class Discussion Notes Chapter 6 Dependence
Section 6.1 Conditional Distributions of Discrete
Random Variables.
This section is really a review of our initial work with conditional
probabilities phrased in terms of random variables.
-
The conditional distribution of Y given X = x is the set of probabilities
P(Y = y | X = x) for all possible values of y.
(Defined on page 396.)
-
Recall our fundamental conditional formula (this time expressed in terms
of random variables):
P(Y = y | X = x) = P(X = x, Y = y) / P(X = x)

Example 1: Two stage experiment. Roll a die and get x, then
toss a coin x times. Count the number of heads y. Let X = roll of the die;
Y = number of heads.
-
Rather than use the conditional formula, it is easier to calculate the conditional
probability directly: P[Y = y | X = x] = P[y successes in x trials], which is a
binomial probability with number of trials x and probability of success ½, so
P[Y = y | X = x] = C(x, y) (½)^x for y = 0, 1, ..., x,
where C(x, y) is the binomial coefficient "x choose y".

-
If we calculate the above for each possible value y = 0, 1, ..., x we have
the conditional distribution of Y given X = x. For example, if we wanted
P[Y = y | X = 3], we would calculate the above probabilities with x = 3
and y being 0, 1, 2, or 3.
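The calculation just described can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `cond_dist_heads` is made up for these notes, and exact fractions are used so the values can be compared with a hand calculation.

```python
from fractions import Fraction
from math import comb

def cond_dist_heads(x):
    """Return {y: P(Y = y | X = x)} for y = 0..x, where Y is binomial(x, 1/2)."""
    return {y: Fraction(comb(x, y), 2 ** x) for y in range(x + 1)}

# Conditional distribution of Y given X = 3 (hypothetical helper, see lead-in).
dist = cond_dist_heads(3)
for y, p in dist.items():
    print(f"P(Y = {y} | X = 3) = {p}")
```

For x = 3 this gives 1/8, 3/8, 3/8, 1/8, which sum to 1 as any conditional distribution must.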
-
See the sketches at the bottom of page 394 for the conditional distributions
for each possible value of X.
-
To calculate the unconditional probability P[Y = y] we use one of the two
formulas below, depending on what we know. If we know the joint distribution
P[X = x, Y = y], we use the first formula. If we know the distribution of
X and the conditional probabilities P[Y = y | X = x], we use the second formula
(the Rule of Average Conditional Probabilities).
P[Y = y] = Sum (over all x) of P[X = x, Y = y]
P[Y = y] = Sum (over all x) of P[Y = y | X = x] P[X = x]
-
Look at this applied to the Example 1 problem in the table of joint and
marginal distributions (bottom of page 397). Make sure you know how the entries
are calculated in the table.
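As a check on how such a table can be built for Example 1, here is a minimal Python sketch. It assumes a fair die, so P(X = x) = 1/6, and the names `P_X`, `p_y_given_x`, `joint`, and `marginal_Y` are illustrative choices, not notation from the text.

```python
from fractions import Fraction
from math import comb

# Distribution of the die roll X (assumed fair).
P_X = {x: Fraction(1, 6) for x in range(1, 7)}

def p_y_given_x(y, x):
    """P(Y = y | X = x): binomial(x, 1/2) probability, zero when y > x."""
    return Fraction(comb(x, y), 2 ** x) if 0 <= y <= x else Fraction(0)

# Joint entries by the multiplication rule: P(X = x, Y = y) = P(X = x) P(Y = y | X = x).
joint = {(x, y): P_X[x] * p_y_given_x(y, x) for x in P_X for y in range(7)}

# Marginal of Y: sum each joint entry for fixed y over all x.
marginal_Y = {y: sum(joint[(x, y)] for x in P_X) for y in range(7)}

for y, p in marginal_Y.items():
    print(f"P(Y = {y}) = {p}")
```

Each joint entry comes from the multiplication rule and each marginal entry from summing over x. Fractions print in lowest terms, so P(Y = 2) = 99/384 appears as 33/128.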
-
To calculate the reverse probability P[X = x | Y = y] (given that you
have tossed y heads, what is the probability that the original die roll
was x?), we apply Bayes' Theorem:
P[X = x | Y = y] = P[Y = y | X = x] P[X = x] / P[Y = y]
where the numerator is calculated using the Multiplication Rule for Conditional
Probabilities and the denominator on the right-hand side is calculated
using the Rule of Average Conditional Probabilities above.
-
Thus, for example, P[X = 4 | Y = 2] = ( 1/6 )( 6/16 ) / ( 99 / 384 ) = 24/99
(See Table 3, page 398)
-
Note: The most likely value of X given Y = 2 is either 3 or 4 (each has
probability 24/99).
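The Bayes' Theorem calculation for Example 1 can be sketched as follows, assuming a fair die and fair coin. The function name `posterior_X_given_Y` is a hypothetical helper introduced only for this illustration.

```python
from fractions import Fraction
from math import comb

def posterior_X_given_Y(y):
    """Return {x: P(X = x | Y = y)} for a fair die roll X and Y heads in X fair tosses."""
    # Numerators: P(X = x) * P(Y = y | X = x), by the multiplication rule.
    weights = {x: Fraction(1, 6) * Fraction(comb(x, y), 2 ** x) for x in range(1, 7)}
    # Denominator: P(Y = y), by the rule of average conditional probabilities.
    total = sum(weights.values())
    return {x: w / total for x, w in weights.items()}

post = posterior_X_given_Y(2)
for x, p in post.items():
    print(f"P(X = {x} | Y = 2) = {p}")
```

Fractions print in lowest terms, so 24/99 appears as 8/33 for both x = 3 and x = 4, and the value for x = 1 is 0 because one toss cannot produce two heads.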
Section 6.2: Conditional Expectation for
Discrete Random Variables
The Conditional Expectation of a random variable Y Given an Event A
E(Y | A) is defined as the sum (over all possible values y
of the random variable Y) of y*P(Y=y | A).
Note the relationship between this formula and E(Y): it is the same sum with
P(Y=y | A) in place of P(Y=y).
See Example 1: Y = number of heads in four tosses of a
fair coin. Calculate the conditional expectation of Y given 2 or
fewer heads.
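A short Python sketch of this calculation, applying the definition directly with A = {Y <= 2}. The variable names are illustrative, and the conditional probabilities are obtained as P(Y = y | A) = P(Y = y) / P(A) for y in A.

```python
from fractions import Fraction
from math import comb

# Unconditional distribution of Y = number of heads in 4 fair tosses.
pmf = {y: Fraction(comb(4, y), 16) for y in range(5)}

# Event A: two or fewer heads.
A = [y for y in pmf if y <= 2]
p_A = sum(pmf[y] for y in A)

# E(Y | A) = sum over y in A of y * P(Y = y | A).
cond_exp = sum(y * pmf[y] / p_A for y in A)

print("P(A) =", p_A)            # 11/16
print("E(Y | A) =", cond_exp)   # 16/11
```

The result 16/11 is below the unconditional mean E(Y) = 2, as expected when conditioning on a small number of heads.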
We can consider this definition applied to the event A = {X = x} where
X is a random variable, and x is a possible value of the random variable.
The Rule of Average Conditional Expectations (page 402, middle)
tells us that for any random variable Y with finite expectation and any
discrete random variable X,
-
E(Y) = Sum (over all x) of E(Y | X = x) P(X = x).
-
If we define the random variable E(Y|X) to be the random variable whose
value is E(Y|X=x) when X = x, this formula can be written as
E(Y) = E[ E(Y|X) ]
Example 2, page 403: Continuing Example 1, Y
= number of heads in X tosses of a fair coin, where X is generated
by a fair die roll.
-
Problem 1: E(Y|X=x) = x/2
-
Problem 2: E(Y) = 1.75
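Both answers can be checked numerically with the Rule of Average Conditional Expectations. The sketch below assumes a fair die and fair coin; `cond_expectation` is a hypothetical helper name, not notation from the text.

```python
from fractions import Fraction
from math import comb

def cond_expectation(x):
    """E(Y | X = x): sum of y * P(Y = y | X = x) over the binomial(x, 1/2) values."""
    return sum(y * Fraction(comb(x, y), 2 ** x) for y in range(x + 1))

# Problem 1: the conditional expectation equals x/2 for each die value x.
for x in range(1, 7):
    assert cond_expectation(x) == Fraction(x, 2)

# Problem 2: average the conditional expectations against P(X = x) = 1/6.
E_Y = sum(cond_expectation(x) * Fraction(1, 6) for x in range(1, 7))
print(E_Y, float(E_Y))  # 7/4 1.75
```

Equivalently, E(Y) = E(X/2) = E(X)/2 = 3.5/2 = 1.75.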
Section 6.4: Covariance and Correlation
-
Covariance gives us information about the relationship between two
jointly distributed random variables.
Cov(X,Y) = E[ (X - μ_X)(Y - μ_Y) ]
where μ_X = E(X) and μ_Y = E(Y).
-
This can be simplified to
Cov(X,Y) = E(XY) - μ_X μ_Y
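For reference, the simplification follows by expanding the product and using linearity of expectation. Here \mu_X = E(X) and \mu_Y = E(Y), both treated as constants:

```latex
\begin{align*}
\operatorname{Cov}(X,Y) &= E\big[(X-\mu_X)(Y-\mu_Y)\big] \\
&= E\big[XY - \mu_Y X - \mu_X Y + \mu_X\mu_Y\big] \\
&= E(XY) - \mu_Y E(X) - \mu_X E(Y) + \mu_X\mu_Y \\
&= E(XY) - \mu_X\mu_Y - \mu_X\mu_Y + \mu_X\mu_Y \\
&= E(XY) - \mu_X\mu_Y .
\end{align*}
```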
-
Note that for arbitrary random variables X and Y, we have
Var(X+Y) = Var(X) + Var(Y) + 2Cov(X,Y)
-
If X and Y are independent, then Cov(X,Y) = 0, but the converse is not
true!
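One standard counterexample to the converse, offered here as an illustration that is not taken from the notes: let X be uniform on {-1, 0, 1} and let Y = X^2. The sketch below checks that the covariance is zero even though Y is completely determined by X.

```python
from fractions import Fraction

# Hypothetical example (not from the notes): X uniform on {-1, 0, 1}, Y = X^2.
p = Fraction(1, 3)
joint = {(x, x * x): p for x in (-1, 0, 1)}

def expect(f):
    """Expectation of f(X, Y) under the joint distribution."""
    return sum(f(x, y) * prob for (x, y), prob in joint.items())

E_X = expect(lambda x, y: x)
E_Y = expect(lambda x, y: y)
cov = expect(lambda x, y: x * y) - E_X * E_Y

# Independence would require P(X = 0, Y = 0) = P(X = 0) * P(Y = 0).
p_x0 = sum(prob for (x, y), prob in joint.items() if x == 0)
p_y0 = sum(prob for (x, y), prob in joint.items() if y == 0)
p_x0_y0 = joint.get((0, 0), Fraction(0))

print("Cov(X, Y) =", cov)          # 0
print(p_x0_y0, "vs", p_x0 * p_y0)  # 1/3 vs 1/9
```

Since 1/3 is not equal to 1/9, X and Y are dependent, yet Cov(X,Y) = 0.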
The Sign of the Covariance
-
If above-average values of X tend to occur with above-average values of
Y, and below-average values of X tend to occur with below-average values of Y,
then the sign of Cov(X,Y) will be positive.
-
If above-average values of X tend to occur with below-average values of
Y, and below-average values of X tend to occur with above-average values of Y,
then the sign of Cov(X,Y) will be negative.
-
Definition of Correlation: Corr(X,Y) = Cov(X,Y) / [SD(X)*SD(Y)]
Corr(X,Y) has the same sign as Cov(X,Y), and its value is always
between -1 and +1.
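To make these definitions concrete, the sketch below computes the covariance and correlation for the die-and-coin pair from Section 6.1 (X = fair die roll, Y = number of heads in X fair tosses). It also checks the variance-of-a-sum formula. The helper `expect` and the variable names are illustrative assumptions.

```python
from fractions import Fraction
from math import comb, sqrt

# Joint distribution: P(X = x, Y = y) = (1/6) * C(x, y) / 2^x.
joint = {(x, y): Fraction(1, 6) * Fraction(comb(x, y), 2 ** x)
         for x in range(1, 7) for y in range(x + 1)}

def expect(f):
    """Expectation of f(X, Y) under the joint distribution."""
    return sum(f(x, y) * p for (x, y), p in joint.items())

E_X, E_Y = expect(lambda x, y: x), expect(lambda x, y: y)
cov = expect(lambda x, y: x * y) - E_X * E_Y
var_X = expect(lambda x, y: x * x) - E_X ** 2
var_Y = expect(lambda x, y: y * y) - E_Y ** 2

# Var(X + Y) = Var(X) + Var(Y) + 2 Cov(X, Y), checked exactly.
var_sum = expect(lambda x, y: (x + y) ** 2) - (E_X + E_Y) ** 2
assert var_sum == var_X + var_Y + 2 * cov

corr = float(cov) / sqrt(float(var_X) * float(var_Y))
print("Cov(X, Y) =", cov)             # 35/24
print("Corr(X, Y) =", round(corr, 3))  # about 0.674
```

The covariance is positive, as expected: larger die rolls tend to go with more heads.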
Uncorrelated Random Variables:
-
Two random variables X and Y are said to be uncorrelated
if any one of the following three equivalent conditions holds:
-
Corr(X,Y) = 0
-
Cov(X,Y) = 0
-
E(XY) = E(X)E(Y)
-
Note that Corr(X,Y) = Cov(X*,Y*) = E(X*Y*), where X* and Y* are the standardized
X and Y. Each has mean 0 and variance 1, so E[(X*)^2] = E[(Y*)^2] = 1. This allows
the following proof:
Proof that -1 <= Corr(X,Y) <= 1
0 <= E[(X* - Y*)^2] = E[(X*)^2] - 2E(X*Y*) + E[(Y*)^2]
= 1 - 2E(X*Y*) + 1
So
0 <= -2E(X*Y*) + 2
-2 <= -2E(X*Y*)
and
E(X*Y*) <= 1
Also
0 <= E[(X* + Y*)^2] = E[(X*)^2] + 2E(X*Y*) + E[(Y*)^2]
= 1 + 2E(X*Y*) + 1
So
0 <= 2E(X*Y*) + 2
-2 <= 2E(X*Y*)
and
-1 <= E(X*Y*)
Since Corr(X,Y) = E(X*Y*), this gives -1 <= Corr(X,Y) <= 1.
-
Example 2: One important way that correlation is used is seen in
the discussion of Example 2, where X and Y represent empirical random
variables: values picked at random from an underlying population of jointly
distributed values (for example, X = age and Y = IQ of a person picked at
random from a large population).