Sunday, December 11, 2016

Still some work to do.

The distribution of this data is just nutty. The confidence bounds are still a bit too tight, so I plotted the distribution of the blocksums to see if the exponential assumption might be off. Here's one stratum:


I don't know what that is, but it sure isn't exponential. Since the stratification has already removed the heavy-tail problem, I have to assume that this is caused by correlation within blocks. The consequence is that my estimate of the blocksum variance is about 10% low. That's not a killer in and of itself, but when I start summing them all up and assuming normality on the sum, it winds up being off by a fair bit.
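One quick way to put a number on "it sure isn't exponential": an exponential distribution has variance equal to its mean squared, so the exponential-implied variance can be compared with the plain sample variance for a stratum. The sketch below assumes the variance estimate comes from the exponential fit (the post doesn't say exactly how it is formed), and `exponential_variance_check` is a made-up helper name, not part of the actual algorithm.

```python
from statistics import mean, variance


def exponential_variance_check(blocksums):
    """Compare the exponential-implied variance with the sample variance.

    Assumption: the variance estimate in use is mean**2, which is what an
    exponential fit would give. Returns (implied, observed, implied/observed).
    """
    m = mean(blocksums)
    implied = m * m                  # exponential: Var = mean^2
    observed = variance(blocksums)   # unbiased sample variance (n - 1)
    return implied, observed, implied / observed


if __name__ == "__main__":
    # Hypothetical blocksums for one stratum, purely for illustration.
    sample = [3.1, 0.4, 7.9, 1.2, 5.5, 2.8, 0.9, 4.4]
    implied, observed, ratio = exponential_variance_check(sample)
    print(f"implied={implied:.3f} observed={observed:.3f} ratio={ratio:.3f}")
```

A ratio near 0.9 would match the roughly 10% shortfall described above. Since interval width scales with the square root of the variance, that alone makes each interval about 5% too narrow.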

The algorithm isn't very useful if it's this sensitive to the distribution of blocksums. The whole point of going to blocksums rather than using the individual observations was to smooth things out. I may need to switch to a non-parametric distribution. Or I might just use the actual sample variance with a t-distribution on the sum instead of a normal.
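Here's a rough sketch of what the second option could look like for a single stratum: estimate the total from the sampled blocksums, use the sample variance for the standard error, and take the critical value from a t-distribution with n - 1 degrees of freedom rather than from a normal. Everything here is an assumption for illustration: the function names, the single-stratum setup, and the lack of a finite population correction are mine, not the actual estimator. The t quantile is computed numerically so the block needs only the standard library.

```python
import math
from statistics import NormalDist, mean, variance


def t_pdf(x, df):
    """Density of Student's t with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))


def t_cdf(x, df, steps=2000):
    """CDF for x >= 0: 0.5 plus Simpson's rule on [0, x] (steps must be even)."""
    h = x / steps
    total = t_pdf(0.0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + total * h / 3


def t_quantile(p, df):
    """Quantile for p > 0.5, found by bisection on t_cdf."""
    lo, hi = 0.0, 1.0
    while t_cdf(hi, df) < p:   # widen the bracket until it contains p
        hi *= 2
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def sum_interval(blocksums, n_blocks, level=0.95, use_t=True):
    """Interval for the total over n_blocks blocks in one stratum.

    Uses the sample variance of the sampled blocksums. No finite
    population correction is applied (a simplifying assumption).
    """
    n = len(blocksums)
    estimate = n_blocks * mean(blocksums)
    std_err = n_blocks * math.sqrt(variance(blocksums) / n)
    p = 0.5 + level / 2
    crit = t_quantile(p, n - 1) if use_t else NormalDist().inv_cdf(p)
    return estimate - crit * std_err, estimate + crit * std_err


if __name__ == "__main__":
    # Hypothetical blocksums, purely for illustration.
    sample = [3.1, 0.4, 7.9, 1.2, 5.5, 2.8, 0.9, 4.4]
    print("t:     ", sum_interval(sample, n_blocks=100))
    print("normal:", sum_interval(sample, n_blocks=100, use_t=False))
```

With only a handful of sampled blocks the t interval is noticeably wider than the normal one; as the sample grows the two converge, so this mostly helps the thin strata.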

I'm uncovering questions faster than I'm answering them. As I'm trying to just get this result published so I can move on, that's not really where I want to be.
