Tuesday, September 27, 2016

Transform of a discrete random variable by it's CDF

I'm only posting on this one because I ended up spending an inordinate amount of time on it. It's a problem from one of the texts I'm using. The authors suggest an argument based on a picture of the cdf which is obvious enough. However, much of the mess of mathematics in the 18th and 19th centuries was a result of the fact that people were too quick to resort to "proof by picture" rather than strict rules of logic. (Of course, the mess since then is an even bigger one stemming from the fact that the strict rules of logic don't really work that well, either; but that's another day's discussion).

Anyway, as grad-level probability theory is in pretty safe turf from an existential argument point of view, I decided to work out a more formal proof. The result in question is the discrete version of the well-known transformation that F(X) ~ U(0,1). That is, if you feed a continuous random variable back into its own cdf, you get a standard uniform random variable. In the discrete case, it's not quite so pretty. Instead the best you can say is that you get a random variable that is bounded by [0,1] and is stochastically greater than U(0,1).

A clue as to where to start is the proof in the continuous case, which is pretty straightforward:

Let Y = FX(X).

Then

FY(y) = P(Y ≤ y)
    = P(FX(X) ≤ y)
    = P(FX-1[FX(X)] < FX-1(y))
    = P(X ≤ FX-1(y))
    = FX(FX-1(y))
    = y

Which implies Y ~ U(0,1).

Here, FX-1(y) is well defined even if the cdf is not strictly increasing by setting

FX-1(y) = inf {xFX(x) = y}

Now, we have to look at why the equality doesn't hold for discrete random variables. The answer is that the inverse function is no longer well defined. It may well be that no value of x satisfies FX(x) = y. Therefore, we need to add an inequality:

FX-1(y) = inf {xFX(x) ≥ y}

It's fairly easy to show that, on points where P(X = x) > 0, the equality still holds. Therefore, we still have P(FX-1[FX(X)] < FX-1(y)) = P(X ≤ FX-1(y)) = FX(FX-1(y)). It's the final line where things break down. Here, we can't count on y being in the set of {FX(x) | P(X = x) > 0}, so the inequality comes into play. Specifically, when P(X = FX-1(y)) = 0, we get FX(FX-1(y)) < y. As X is discrete, there are infinitely many points where P(X = FX-1(y)) = 0 and we only need one to establish that Y is stochastically greater than U(0,1).


No comments:

Post a Comment