It was nice to take a couple days off for Easter. Classes are off all week for Spring Break, but I'm back on it.
I know why my confidence intervals are too tight. This should have been obvious from the get-go, but sometimes you need to take a step back to see the obvious. The confidence intervals are based on the distribution of the values given θ and λ, and I just took the point estimates for those two. However, they aren't constants; they are random variables themselves. So, I really need to compute the distribution given θ and λ and then integrate that over the joint distribution of θ and λ. That's not terribly hard to do, but it does bring us back to the problem that I dismissed too quickly: I don't even have good marginal distributions for θ and λ. I don't have a clue what the joint distribution would be.
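To make the effect concrete, here is a minimal sketch of the difference between plugging in a point estimate and integrating over parameter uncertainty, that is, using p(x) = ∫∫ p(x | θ, λ) p(θ, λ) dθ dλ instead of p(x | θ̂, λ̂). Everything in it is an assumption for illustration only: it handles θ alone, treats the number of rows picked up in a block as Binomial, and uses a made-up Beta(3, 7) distribution for θ. It is not the actual model from this project.

```python
import random

def binomial(rng, n, p):
    """Draw a Binomial(n, p) count as a sum of Bernoulli trials."""
    return sum(rng.random() < p for _ in range(n))

def interval(samples, level=0.95):
    """Empirical central interval from a list of simulated values."""
    s = sorted(samples)
    lo = s[int((1 - level) / 2 * len(s))]
    hi = s[int((1 + level) / 2 * len(s)) - 1]
    return lo, hi

def compare_intervals(n_rows=200, theta_hat=0.3, a=3.0, b=7.0,
                      draws=2000, seed=1):
    rng = random.Random(seed)
    # Plug-in: theta is treated as a known constant (the point estimate).
    plug_in = [binomial(rng, n_rows, theta_hat) for _ in range(draws)]
    # Integrated: theta is drawn from its own (assumed) distribution first,
    # which is Monte Carlo integration over the uncertainty in theta.
    integrated = [binomial(rng, n_rows, rng.betavariate(a, b))
                  for _ in range(draws)]
    return interval(plug_in), interval(integrated)

plug_in_ci, integrated_ci = compare_intervals()
print("plug-in interval:   ", plug_in_ci)
print("integrated interval:", integrated_ci)
```

Both versions are centered near the same count, but the integrated interval is much wider, which is the behavior the plug-in intervals were missing.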
So, the first thing I'm going to do is take λ out of the equation. Rather than stratify on absolute value, I'm going to stratify on signed value. Thus, within a stratum, values will be considered uniform between the stratum bounds and the big question is how many get picked up by the query (θ).
Because this doubles the number of strata, I'm going to back off on pinning the magnitude to 2^k and instead allow for configurable bounds on each stratum. I've got some ideas on the optimal way to do that, but for now, I'll just set them manually.
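A rough sketch of what stratifying on signed value with manually set bounds could look like. The bounds, the function names, and the n_rows * θ * midpoint estimate (which follows from treating values as uniform within a stratum) are all hypothetical placeholders, not the real configuration.

```python
from bisect import bisect_right

# Hypothetical, manually chosen stratum bounds on signed value.
# Stratum k covers the half-open range [BOUNDS[k], BOUNDS[k + 1]).
BOUNDS = [-1000.0, -100.0, -10.0, 0.0, 10.0, 100.0, 1000.0]

def stratum_of(value, bounds=BOUNDS):
    """Return k such that bounds[k] <= value < bounds[k + 1]."""
    if not bounds[0] <= value < bounds[-1]:
        raise ValueError(f"{value} is outside the configured bounds")
    return bisect_right(bounds, value) - 1

def expected_stratum_sum(k, n_rows, theta, bounds=BOUNDS):
    """Expected query sum for stratum k, assuming values are uniform
    between the bounds and a fraction theta of rows match the query."""
    midpoint = (bounds[k] + bounds[k + 1]) / 2
    return n_rows * theta * midpoint
```

With these bounds, `stratum_of(-50.0)` lands in stratum 1 (the range from -100 to -10), and a stratum 4 holding 1000 rows with θ = 0.2 contributes an expected sum of about 11000, since the midpoint of [10, 100) is 55.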
That leaves the prior on θ. If I'm really going to use this as a distribution for purposes of confidence intervals, I need to play by the rules and compute a real posterior based on likelihood. So, the fast-converging point estimate isn't going to cut it. I think I can still get around the problem by letting observations in adjacent strata inform the posterior. So, we start with a fairly neutral prior that reflects our uncertainty about θ, and the posterior for stratum k gets updated any time we sample a block from stratum k-1, k, or k+1. This will speed the convergence, while still preserving a lot of variability, at least in the early going.
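One way this could be sketched is with a posterior kept on a grid of θ values for each stratum, so no particular parametric family is assumed. The flat prior, the binomial likelihood (which ignores correlation within a block), the `neighbor_weight` discount on observations from adjacent strata, and the class name are all assumptions made for the sake of the example.

```python
import math

# Grid of candidate theta values strictly inside (0, 1).
GRID = [(i + 0.5) / 200 for i in range(200)]

class StratumPosteriors:
    def __init__(self, n_strata, neighbor_weight=0.5):
        # Assumed discount for evidence borrowed from strata k-1 and k+1.
        self.neighbor_weight = neighbor_weight
        # Flat prior: log density 0 at every grid point (an assumption).
        self.log_post = [[0.0] * len(GRID) for _ in range(n_strata)]

    def update(self, k, hits, rows):
        """A block sampled from stratum k updates strata k-1, k, and k+1."""
        targets = ((k - 1, self.neighbor_weight), (k, 1.0),
                   (k + 1, self.neighbor_weight))
        for j, w in targets:
            if 0 <= j < len(self.log_post):
                lp = self.log_post[j]
                for i, t in enumerate(GRID):
                    # Binomial log likelihood, scaled by the weight.
                    lp[i] += w * (hits * math.log(t)
                                  + (rows - hits) * math.log(1 - t))

    def moments(self, k):
        """Posterior mean and standard deviation of theta for stratum k."""
        lp = self.log_post[k]
        top = max(lp)
        weights = [math.exp(v - top) for v in lp]
        total = sum(weights)
        mean = sum(t * w for t, w in zip(GRID, weights)) / total
        var = sum((t - mean) ** 2 * w for t, w in zip(GRID, weights)) / total
        return mean, math.sqrt(var)

post = StratumPosteriors(n_strata=5)
post.update(k=2, hits=30, rows=100)  # one block sampled from stratum 2
```

After that single update, stratum 2 has the tightest posterior, strata 1 and 3 have moved toward the same rate but stay wider, and strata 0 and 4 are still at the prior.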
Although mathematical convenience makes it tempting to use the Beta distribution as the prior, I think it converges much too quickly, especially since we have correlation within blocks. I'm going to have to work something else out. This seems like a good application of Gibbs sampling. I can try a bunch of different distributions and see which ones behave right.
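A quick simulation shows the kind of over-confidence I mean. This is only a sketch under made-up assumptions: each block gets its own hit rate drawn around θ (which is what induces correlation within a block), with a hypothetical concentration parameter `kappa`, and the "honest" yardstick is simply the standard error of the block-level hit rates.

```python
import math
import random
import statistics

def simulate_blocks(theta=0.3, kappa=4.0, n_blocks=20, rows=500, seed=7):
    """Blocks whose rows share a block-specific hit rate (assumed model)."""
    rng = random.Random(seed)
    blocks = []
    for _ in range(n_blocks):
        p_block = rng.betavariate(theta * kappa, (1 - theta) * kappa)
        hits = sum(rng.random() < p_block for _ in range(rows))
        blocks.append((hits, rows))
    return blocks

def beta_posterior_sd(blocks, a=1.0, b=1.0):
    """Posterior sd when every row is treated as an independent trial."""
    hits = sum(h for h, _ in blocks)
    rows = sum(r for _, r in blocks)
    a_post, b_post = a + hits, b + rows - hits
    total = a_post + b_post
    return math.sqrt(a_post * b_post / (total ** 2 * (total + 1)))

def block_level_se(blocks):
    """Standard error of the mean hit rate, treating blocks as the unit."""
    rates = [h / r for h, r in blocks]
    return statistics.stdev(rates) / math.sqrt(len(rates))

blocks = simulate_blocks()
print("Beta posterior sd:", beta_posterior_sd(blocks))
print("block-level se:   ", block_level_se(blocks))
```

The row-by-row Beta update reports a spread several times smaller than the block-level standard error, because it counts 10,000 rows as 10,000 independent observations when there are really only 20 blocks' worth of information.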