In this post we will explore the duality between the uniform and exponential distributions, and show how it can be used as a clever trick for solving probability problems.
Suppose you have a problem that mentions “i.i.d. Uniform” and asks about order statistics, typically the minimum and/or the maximum. Instead of doing tedious arithmetic/calculus, what if there were another way?
Recall the Poisson process. This counting process has independent increments, and it is a well-known fact that its inter-arrival times are exponentially distributed. The less-known feature is that, conditioned on the number of arrivals in a finite interval, the arrival times are uniformly distributed over that interval. This is where the intuition comes from.
We skip the rigorous proof here, but an alternative way to sample n uniformly distributed random variables already in sorted order is to generate n exponential random variables representing the gaps before each point, draw one more representing the last gap, and finally normalize everything by the total.
Let Y_1, Y_2, \dots, Y_{n+1} be independent Exponential(1) random variables. Denote
Y_{(k)} = \frac{\sum\limits_{i=1}^{k} Y_i}{\sum\limits_{i=1}^{n+1} Y_i}.
Then Y_{(k)}, k=1,\dots,n are jointly distributed exactly the same as the order statistics of n i.i.d. Uniform[0;1] random variables.
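Here is a minimal simulation sketch of this construction (my own illustration, with arbitrary constants): it compares the empirical means of the order statistics obtained by sorting uniforms against those obtained from normalized exponential cumulative sums; both should be close to k/(n+1).

```python
import numpy as np

rng = np.random.default_rng(0)
n, trials = 5, 200_000

# Direct approach: sort n uniforms and read off the order statistics.
direct = np.sort(rng.uniform(size=(trials, n)), axis=1)

# Exponential approach: n+1 Exp(1) gaps, cumulative sums, normalized by the total.
gaps = rng.exponential(size=(trials, n + 1))
via_exp = np.cumsum(gaps[:, :n], axis=1) / gaps.sum(axis=1, keepdims=True)

# Both sets of empirical means should be close to k/(n+1).
print(direct.mean(axis=0))   # ~ [0.167, 0.333, 0.5, 0.667, 0.833]
print(via_exp.mean(axis=0))  # ~ the same values
```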
A couple of useful facts (and links) about the Y_{(k)} random variables and their building blocks:
Expression | Distribution | Mean | Variance |
---|---|---|---|
Exp(\lambda) (wiki) | \Gamma(1,\lambda) (wiki) | \frac{1}{\lambda} | \frac{1}{\lambda^2} |
k \cdot Exp(\lambda) | Exp(\frac{\lambda}{k}) | \frac{k}{\lambda} | \frac{k^2}{\lambda^2} |
Sum of n i.i.d. Exp(1) (wiki) | \Gamma(n,1) | n | n |
Min of n i.i.d. Exp(1) (wiki) | Exp(n) | \frac{1}{n} | \frac{1}{n^2} |
Ratio of indep. Gammas (wiki): \frac{\Gamma(m,1)}{\Gamma(m,1)+\Gamma(n,1)} | Beta(m,n) (wiki) | \frac{m}{m+n} | \frac{mn}{(m+n)^2(m+n+1)} |
Joint spacings Y_{(1)}, Y_{(2)}-Y_{(1)}, \dots | Dirichlet (wiki) with parameters (1,1,\dots) | | |
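As a quick numerical sanity check of the Gamma-ratio row (again a sketch of mine; numpy's `gamma` uses the shape/scale parameterization, with scale 1 matching \Gamma(\cdot, 1) here):

```python
import numpy as np

rng = np.random.default_rng(1)
m, n, trials = 3, 4, 200_000

g1 = rng.gamma(shape=m, scale=1.0, size=trials)  # Gamma(m, 1)
g2 = rng.gamma(shape=n, scale=1.0, size=trials)  # Gamma(n, 1)
ratio = g1 / (g1 + g2)                           # should be Beta(m, n)

print(ratio.mean(), m / (m + n))                           # both ~ 0.4286
print(ratio.var(), m * n / ((m + n) ** 2 * (m + n + 1)))   # both ~ 0.0306
```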
Now let’s demonstrate how this uniform-exponential machinery yields succinct proofs.
Problem 1 (Easy). Given n i.i.d. Uniform [0;1] random variables, find the probability that the sum of their maximum and minimum is greater than 1.
Let Y_1, Y_2, \dots, Y_{n+1} be independent Exponential(1) random variables. Denote the minimum and maximum by X_{min}, X_{max}. Then, by the machinery above, it holds that
X_{min} = \frac{Y_1}{\sum\limits_{i=1}^{n+1} Y_i} and X_{max} = \frac{\sum\limits_{i=1}^{n} Y_i}{\sum\limits_{i=1}^{n+1} Y_i} .
We want to evaluate P(X_{min} + X_{max} > 1). After substituting the Y_i and cancelling the common terms, this becomes simply P( Y_1 > Y_{n+1} ), which equals 1/2 by symmetry.
The alternative approach is to notice that the maximum has the same distribution as one minus the minimum (reflect every point x \mapsto 1-x) and play with that symmetry. This involves a bit of creativity, and the standard machinery helps when the “noticing” step does not come to mind. A quick Monte Carlo check of the answer is sketched below.
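A minimal Monte Carlo sketch (constants are just illustrative) confirming that the probability is 1/2 regardless of n:

```python
import numpy as np

rng = np.random.default_rng(2)
n, trials = 7, 200_000

u = rng.uniform(size=(trials, n))
prob = np.mean(u.min(axis=1) + u.max(axis=1) > 1)
print(prob)  # ~ 0.5
```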
Problem 2 (Hard). Given n i.i.d. Uniform [0;1] random variables, find the correlation between their minimum and maximum.
Let Y_1, Y_2, \dots, Y_{n+1} be independent Exponential(1) random variables. Denote the minimum and maximum by X_{min}, X_{max}. Then, by the machinery above, it holds that
X_{min} = \frac{Y_1}{\sum\limits_{i=1}^{n+1} Y_i} and X_{max} = \frac{\sum\limits_{i=1}^{n} Y_i}{\sum\limits_{i=1}^{n+1} Y_i} .
After this, the magic happens when one recognizes that for any two random variables A, B, by bilinearity of covariance, it holds that
cov(A, B) = - cov(A, 1-B)
Denote z = cov \left( \frac{Y_i}{\sum\limits_{k=1}^{n+1} Y_k}, \frac{Y_j}{\sum\limits_{k=1}^{n+1} Y_k} \right) for i \neq j. Note that we can drop the indices i, j, since by symmetry all these covariances are the same.
Let’s substitute the minimum and maximum into the “magic” observation. Since 1 - X_{max} = \frac{Y_{n+1}}{\sum\limits_{k=1}^{n+1} Y_k}, we get cov\left( X_{min}, X_{max} \right) = -cov\left( X_{min}, 1-X_{max} \right) = -z, with indices i=1 and j=n+1. On the other hand, the same covariance after opening the sum in X_{max} can be expressed as
cov\left( X_{min}, X_{max} \right) = Var\left( X_{min} \right) + (n-1)z,
where the variance term comes from Y_1. Equating the two expressions gives -z = Var\left( X_{min} \right) + (n-1)z, hence z = -\frac{Var\left( X_{min} \right)}{n} and cov\left( X_{min}, X_{max} \right) = \frac{Var\left( X_{min} \right)}{n}. Finally, noticing that the variances of the minimum and the maximum are equal (same machinery), the correlation is
\rho \left( X_{min}, X_{max} \right) = \frac{ cov\left(X_{min}, X_{max}\right) }{\sqrt{ Var\left(X_{min}\right) Var\left(X_{max}\right) }} = \frac{ Var\left(X_{min}\right)/n }{ Var\left(X_{min}\right) } = \frac{1}{n}.
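An empirical check of this result (my own sketch): the sample correlation between the minimum and the maximum should be close to 1/n.

```python
import numpy as np

rng = np.random.default_rng(3)
n, trials = 6, 500_000

u = rng.uniform(size=(trials, n))
lo, hi = u.min(axis=1), u.max(axis=1)
print(np.corrcoef(lo, hi)[0, 1], 1 / n)  # both ~ 0.1667
```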
Problem 3 (Hard). Given n i.i.d. Uniform [0;1] random variables, find the expected value of the smallest distance between them.
Use the machinery above. Denote the common denominator of the exponentials by V = Y_1 + \dots + Y_{n+1}. Then the distances between consecutive points are conveniently \frac{Y_2}{V}, \dots, \frac{Y_n}{V} and we are looking for the expectation of their minimum. They share the common denominator, which can be moved out:
\min \left\{ \frac{Y_2}{V}, \dots, \frac{Y_n}{V} \right\} = \frac{1}{V} \min \left\{ Y_2, \dots, Y_n \right\} = \frac{M}{V}.
NB: the random variables Y_1 and Y_{n+1} are not inside the minimum (see the image at the beginning): they are the gaps to the endpoints 0 and 1, not distances between points. From the table above, we know that (n-1) \cdot M \sim Exp(1) \equiv \Gamma(1, 1). Now the idea is to recognize a Beta distribution in the form of a ratio of Gammas.
Since M and V are not independent, we can’t use the Beta ratio without justification. Recall that the exponential distribution is memoryless: once the minimum time has passed, the remaining waiting times are again independent Exponential(1) random variables.
Let’s condition the expectation on the minimum being achieved at some index k^{*} with value m, and use the law of total expectation \mathbb{E}\left[\frac{M}{V}\right] = \mathbb{E}\left[\mathbb{E}\left[ \frac{M}{V} \,\middle|\, k^{*}, M=m \right]\right]. The denominator V consists of:
- Y_1 and Y_{n+1}, independent from the minimum Y_{k^{*}};
- the minimum Y_{k^{*}}=m itself;
- n-2 exponential variables greater than or equal to m, with indices from 2 to n skipping k^{*}.
By the memoryless property, \mathbb{P} \left( Y_k > s+m \mid Y_k > m \right) = \mathbb{P} \left( Y_k > s \right), hence each such variable can be written as Y_k = m + \widetilde{Y_k}, where \widetilde{Y_k} has exactly the same exponential distribution and is independent of M. Since M and Y_{k^{*}} are fixed by the conditioning and the other variables are independent of it, we can remove the condition and turn the numerator into a Gamma by multiplying and dividing by n-1, obtaining the unconditional expectation
\mathbb{E} \left[ \frac{M}{V} \right] = \frac{1}{n-1}\,\mathbb{E} \left[ \frac{(n-1)M}{Y_1 + Y_{n+1} + M + (\widetilde{Y_2}+M)+\dots} \right].
The denominator contains the minimum n-1 times, and the remainder is n exponentials independent of the minimum, whose sum is \Gamma(n,1). Recalling that (n-1)M \sim \Gamma(1,1), we can use the Beta ratio from the table:
\mathbb{E} \left[ \frac{M}{V} \right] = \frac{1}{n-1}\mathbb{E} \left[ Beta(1, n) \right] = \frac{1}{(n-1)(n+1)}.
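A final empirical sketch of mine, checking the answer directly from sorted uniforms: the expected smallest consecutive gap should be close to \frac{1}{(n-1)(n+1)}.

```python
import numpy as np

rng = np.random.default_rng(4)
n, trials = 5, 500_000

u = np.sort(rng.uniform(size=(trials, n)), axis=1)
min_gap = np.diff(u, axis=1).min(axis=1)         # smallest of the n-1 consecutive gaps
print(min_gap.mean(), 1 / ((n - 1) * (n + 1)))   # both ~ 0.0417
```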