Monday, October 24, 2016

General formula for variance

You may have picked up on the pattern running through Variance 1.01, 2.0, and 3.0. As it's a general result, I'm going to state it as a theorem:

Let X be a random variable with finite mean and variance having pdf f(x|λ), where λ is a real-valued vector of parameters drawn from a prior distribution such that E(X|λ) = g1(λ) and E(X²|λ) = g2(λ), with E(g1(λ)) and E(g2(λ)) both finite. Let X' be a random variable such that X' ~ X with probability θ and X' = 0 otherwise, where θ ~ Beta(a,b) independently of X and λ. Then:

E(X') = μ = aE(g1(λ))/(a+b)

and

Var(X') = σ² = (bμ² + a[E(g2(λ)) - 2μE(g1(λ)) + μ²])/(a+b)

Note that, if X|λ follows a "standard" distribution, it is probably easier to compute g2 indirectly, that is:

g2(λ) = Var(X|λ) + g1(λ)²

The proof is pretty much the derivations we've gone through before, up to the point where we actually plug in g1(λ) and g2(λ). When these are linear combinations of the elements of λ (e.g., normal, t, uniform, among others), the substitution is trivial. If not (e.g., exponential, where g1(λ) = 1/λ), you either have to hope the transformed quantity g(λ) itself follows a known distribution (the inverse gamma here, given the gamma prior on λ), or you have to actually crank out the first two moments.
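For anyone who doesn't want to dig back through those posts, the argument is roughly two applications of iterated expectation, leaning on θ being independent of X and λ as assumed in the statement. A sketch in LaTeX:

\begin{align*}
E(X') &= E_{\theta,\lambda}\!\left[E(X' \mid \theta,\lambda)\right]
       = E_{\theta,\lambda}\!\left[\theta\, g_1(\lambda)\right]
       = \frac{a}{a+b}\,E\!\left[g_1(\lambda)\right] = \mu,\\
\operatorname{Var}(X') &= E_{\theta,\lambda}\!\left[E\!\left((X'-\mu)^2 \mid \theta,\lambda\right)\right]
       = E_{\theta,\lambda}\!\left[\theta\,E\!\left((X-\mu)^2 \mid \lambda\right) + (1-\theta)\,\mu^2\right]\\
      &= \frac{a\!\left[E(g_2(\lambda)) - 2\mu\,E(g_1(\lambda)) + \mu^2\right] + b\,\mu^2}{a+b}.
\end{align*}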

We can demonstrate the usefulness of this result by plugging in the uniform and exponential assumptions to re-derive the variances we've seen so far:

Uniform(0, Lk): (I'm using Lk to indicate the upper limit for a block sum rather than nbk as before. I think it makes the notation less cluttered, and it removes the ambiguity around the variable b.)

Here, λ = Lk is a constant, so the expectations are just the constant values:

E(g1(λ)) = g1(Lk) = Lk/2

E(g2(λ)) = g2(Lk) = Var(X|Lk) + g1(Lk)² = Lk²/12 + Lk²/4 = Lk²/3
Thus:

E(X') = μ = aLk/(2(a+b))

Var(X') = σ² = (bμ² + a[Lk²/3 - μLk + μ²])/(a+b)
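As a quick sanity check on that expression, here's a minimal simulation sketch in Python. It assumes NumPy, and the parameter values for a, b, and Lk are arbitrary choices made only for the check:

import numpy as np

# Simulate X' = X with probability theta, 0 otherwise, for the Uniform(0, Lk) case,
# and compare against the theorem's mean and variance. Parameter values are arbitrary.
rng = np.random.default_rng(1)
a, b, Lk = 3.0, 2.0, 10.0
n = 2_000_000

theta = rng.beta(a, b, n)
x = rng.uniform(0.0, Lk, n)
xp = np.where(rng.random(n) < theta, x, 0.0)

mu = a * (Lk / 2) / (a + b)
var = (b * mu**2 + a * (Lk**2 / 3 - mu * Lk + mu**2)) / (a + b)

print(xp.mean(), mu)   # simulated vs. theoretical mean
print(xp.var(), var)   # simulated vs. theoretical variance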
Granted, the closed form isn't much easier to get at than before, but when we move to a distribution with a prior on λ, the theorem does help point the way. Let's move on to the exponential distribution. Recall that here we model λ ~ Gamma(h, 1/s):

g1(λ) = E(X|λ) = 1/λ

g2(λ) = E(X²|λ) = Var(X|λ) + g1(λ)² = 1/λ² + 1/λ² = 2/λ²
A change of variables where β = 1/λ~InvGamma(h,s) gives

E(g1(λ)) = E(β) = s/(h-1)

E(g2(λ)) = 2E(β²) = 2s²/((h-1)(h-2))
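Those two moments are just the standard inverse gamma results. Assuming the same shape/scale parameterization InvGamma(h, s) used above, a quick LaTeX sketch of where they come from:

\begin{align*}
E(\beta^{k}) &= \int_0^\infty \beta^{k}\,\frac{s^{h}}{\Gamma(h)}\,\beta^{-h-1}e^{-s/\beta}\,d\beta
            = \frac{s^{k}\,\Gamma(h-k)}{\Gamma(h)}, \quad h > k,\\
E(\beta)   &= \frac{s}{h-1}, \qquad
E(\beta^{2}) = \frac{s^{2}}{(h-1)(h-2)}.
\end{align*}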
The restriction h > 2 on the variance of the Inverse Gamma that we saw last week still applies; it's exactly what keeps E(β²), and hence the variance, finite. And now, we just plug it in:

E(X') = μ = as/((a+b)(h-1))

Var(X') = σ² = (bμ² + a[2s²/((h-1)(h-2)) - 2μs/(h-1) + μ²])/(a+b)
The hardest part of this is remembering which random variable you're taking the expectation of. The "inner" expectations, g1 and g2, are over X, the block sum. The outer expectations are over the distribution of the parameters. As long as you keep that straight, it's formulaic.
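To keep that bookkeeping honest, here's the same kind of simulation sketch for the exponential case. Again it assumes NumPy, and the values of a, b, h, and s are arbitrary (with h well above 2 so the relevant moments exist):

import numpy as np

# Check the exponential case: lambda ~ Gamma(h, scale 1/s), X | lambda exponential
# with rate lambda, and X' = X with probability theta ~ Beta(a, b), else 0.
rng = np.random.default_rng(0)
a, b = 2.0, 3.0
h, s = 6.0, 5.0
n = 2_000_000

theta = rng.beta(a, b, n)
lam = rng.gamma(h, 1.0 / s, n)        # NumPy's gamma takes shape and scale
x = rng.exponential(1.0 / lam)        # X | lambda ~ Exponential(rate lambda)
xp = np.where(rng.random(n) < theta, x, 0.0)

Eg1 = s / (h - 1)                     # E(1/lambda)
Eg2 = 2 * s**2 / ((h - 1) * (h - 2))  # E(2/lambda^2)
mu = a * Eg1 / (a + b)
var = (b * mu**2 + a * (Eg2 - 2 * mu * Eg1 + mu**2)) / (a + b)

print(xp.mean(), mu)   # simulated vs. theoretical mean
print(xp.var(), var)   # simulated vs. theoretical variance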

I could also note that, had we kept plowing ahead with the integration by parts, the restrictions on h in the above example might not have been obvious until we'd worked on it a while. In contrast, by using this theorem as a guide, we are pointed straight to the transformed distributions where, assuming we're looking them up rather than computing them ourselves, such constraints are immediately apparent.
