Monday, December 12, 2016

Non-parametric variance.

So, let's assume for a second that we have some sort of intractable distribution of blocksums (and the empirical data certainly support such an assumption). What to do? Well, after a bit of panicking that my entire paper had been undermined, it occurred to me that the silly little theorem I proved might actually be the thing that saves me.

Let's start by restating it:

Let X be a random variable with finite mean and variance having pdf f(x|λ), where λ is a real-valued vector of parameters from a prior distribution with finite mean such that E(X|λ) = g1(λ) and E(X²|λ) = g2(λ). Let X' be a random variable such that X' ~ X with probability θ and X' = 0 otherwise, where θ ~ Beta(a,b), independent of λ. Then:

E(X') = μ = aE(g1(λ))/(a+b)

and

Var(X') = σ² = μ² + a[E(g2(λ)) - 2μE(g1(λ))]/(a+b)
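
As a sanity check on that algebra, here's a quick simulation sketch. The Gamma prior on λ and the exponential f(x|λ) are purely illustrative choices (anything with finite first two moments would do), as are the values of a and b:

import numpy as np

rng = np.random.default_rng(0)
n = 2_000_000
a, b = 2.0, 5.0

# Illustrative model: lambda ~ Gamma(3, 1) and X|lambda ~ Exponential(mean=lambda),
# so g1(lambda) = lambda and g2(lambda) = 2*lambda**2.
lam = rng.gamma(shape=3.0, scale=1.0, size=n)
x = rng.exponential(scale=lam)

theta = rng.beta(a, b, size=n)                 # theta ~ Beta(a, b)
xp = np.where(rng.random(n) < theta, x, 0.0)   # X' = X with prob theta, else 0

Eg1 = lam.mean()                    # ~ E(g1(lambda)) = 3
Eg2 = (2.0 * lam**2).mean()         # ~ E(g2(lambda)) = 24
mu = a * Eg1 / (a + b)
var = mu**2 + a * (Eg2 - 2.0 * mu * Eg1) / (a + b)

print(xp.mean(), mu)                # both ~ 0.86
print(xp.var(), var)                # both ~ 6.12

The empirical mean and variance of X' line up with the formulas to within simulation noise.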

Nowhere in there do we claim to have any clue as to what f(x|λ) actually is. All we need are the moments of the distribution (g1 and g2). OK, we don't know those either, but we can certainly estimate them. While the sample moments are the minimum-variance unbiased estimators, they are also very likely to mess up the computations: there's far too much chance of not getting any hits in the first few blocks, which would collapse everything to zero. So, we need to use the parameter λ to drive the functions to converge to the sample moments (which, in turn, converge to the true moments), but not too quickly.
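
To see the failure mode concretely: if the first several sampled blocks all come up empty, both sample moments are exactly zero, the estimates of μ and σ² collapse, and a variance-based stopping rule will happily declare victory. One sketch of the "converge, but not too quickly" behavior is a simple pseudo-count blend; the kappa knob below is a stand-in for the λ-driven machinery, not the real thing:

def blended_moments(blocksums, m1_prior, m2_prior, kappa=10.0):
    """Estimates of E(X) and E(X^2) that start at the prior moments and
    drift toward the sample moments as observations accumulate.
    kappa acts as a pseudo-sample-size: larger means slower convergence."""
    n = len(blocksums)
    if n == 0:
        return m1_prior, m2_prior
    s1 = sum(blocksums) / n                   # sample first moment
    s2 = sum(x * x for x in blocksums) / n    # sample second moment
    w = n / (n + kappa)                       # weight on the data grows with n
    return ((1 - w) * m1_prior + w * s1,
            (1 - w) * m2_prior + w * s2)

With kappa = 10, ten all-zero blocks only move the estimates halfway from the prior toward zero, rather than all the way there immediately.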

We've seen no reason to depart from Beta(a,b) as our prior for θ, so we'll stick with that. Since we know that the uniform assumption is a fairly safe upper bound on variability, that's a good place to start for g1(λ) and g2(λ). That is, we'll define them in such a way that their respective expectations under the prior on λ are the first two moments of the uniform. We'll pick some convenient conjugate distribution on λ for moving that towards the sample moments as more block hits come in. This will bias us towards oversampling early, when an error on the low side would be bad because it makes us think we've got an answer when we really don't.
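
Concretely, if the blocksums are bounded by some M, the uniform on (0, M) has first two moments M/2 and M²/3, and plugging those into the theorem gives deliberately pessimistic starting values. A minimal sketch (the function name and the hard bound M are my assumptions here):

def uniform_start(M, a, b):
    """Starting mu and sigma^2 from the uniform-upper-bound assumption:
    blocksums treated as U(0, M), so E(g1) = M/2 and E(g2) = M**2/3."""
    Eg1 = M / 2.0
    Eg2 = M**2 / 3.0
    mu = a * Eg1 / (a + b)
    var = mu**2 + a * (Eg2 - 2.0 * mu * Eg1) / (a + b)
    return mu, var

Starting σ² high is what biases the procedure toward oversampling early.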

Yes, this is still parametric in the sense that the underlying prior on λ will obviously impact the results. However, I'm pretty sure that by the time we stop sampling, the posterior will be essentially free of prior effects.
