It's taking me a pretty long time to learn a pretty simple lesson: stop dropping terms.
Dropping the partition terms from the block variance was definitely what made the difference. I put them in (estimating the squares of the partition sizes from the observed partitions) and things look quite a bit better:
The true block variance is just under 16000 and we can see that, even for samples of just 5 blocks, we're not off by much. That said, the observed miss rate of 6.8% is high, and that's what sent us to the kernel sampler in the first place, so I put the weaker posterior back in. Basically, we discount the data by the percentage we've sampled, so when we've only sampled 5 of 1000 blocks, the posterior kernel distribution is still pretty diffuse.
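To make the discounting idea concrete, here is a minimal sketch. It is not the kernel sampler itself: it swaps in a plain normal-normal conjugate update as a stand-in, and it assumes "discounting" means scaling the effective sample size by the fraction of blocks sampled. The function name, the parameters, and the numbers in the usage lines are all hypothetical.

```python
def discounted_posterior(prior_mean, prior_var, sample_mean, sample_var,
                         n_sampled, n_total):
    """Normal-normal update with the data down-weighted by the sampled fraction.

    Assumption (not from the original post): the discount is applied by
    shrinking the effective sample size to n_sampled * (n_sampled / n_total).
    """
    frac = n_sampled / n_total          # fraction of blocks sampled so far
    n_eff = n_sampled * frac            # discounted effective sample size
    prior_prec = 1.0 / prior_var        # precision contributed by the prior
    data_prec = n_eff / sample_var      # precision contributed by the data
    post_prec = prior_prec + data_prec
    post_mean = (prior_prec * prior_mean + data_prec * sample_mean) / post_prec
    return post_mean, 1.0 / post_prec


# Illustrative values only: early in the sampling vs. halfway through.
early_mean, early_var = discounted_posterior(0.0, 100.0, 10.0, 100.0, 5, 1000)
late_mean, late_var = discounted_posterior(0.0, 100.0, 10.0, 100.0, 500, 1000)
```

With 5 of 1000 blocks sampled, the data barely moves the prior, so the posterior stays diffuse. By 500 blocks the data dominates. Any monotone discount would give the same qualitative behavior; the linear fraction is just the simplest choice to write down.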
It's still not perfect, but it's a lot closer. And, as you can see, after only 50 blocks are sampled, the high bias on the estimate is all but gone. I think I could even get the acceptance curve flatter by deriving the true distribution of the sample variance. That's still on my todo list, though it might not be quite as urgent given these results. I think I'll move on to the more general case where both the hit rate and measure distribution depend on the partition.