Not quite as bad as a couple weeks ago when I thought my entire thesis had been scooped, but still enough to cause some anxiety.
I've been reworking my simulated data so that I can apply all the methods to the same data set (yes, it would have made more sense to set that up from the get-go, but that's a different conversation). Anyway, as I ran the most basic sampler (it just uses the sample block variance) on the new set, I noted with dismay that it was returning very good results even on the correlated data. I began to worry that all this work had been for nothing and that I had just been working with an anomalous data set.
Turns out, it was the new set that was anomalous. Not terribly so, just that the correlation between hit rate and the mean observation given a hit ended up yielding roughly the same expected value for all the partitions. Obviously, that's not the situation we're trying to deal with here. We state up front that if the partitions all return roughly the same mean, then any sampling method will do.
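To make the failure mode concrete, here is a minimal sketch, not the actual generator. It assumes a miss contributes zero, so a partition's expected observation is its hit rate times its mean observation given a hit. The function names and all the numbers are made up for illustration.

```python
def partition_means(hit_rates, means_given_hit):
    """Expected observation per partition, assuming a miss contributes 0:
    E[obs] = P(hit) * E[obs | hit]."""
    return [p * m for p, m in zip(hit_rates, means_given_hit)]


def relative_spread(values):
    """(max - min) / |mean|. Near 0 means the partitions all look alike."""
    avg = sum(values) / len(values)
    return (max(values) - min(values)) / abs(avg)


# Hypothetical hit rates for four partitions.
hit_rates = [0.1, 0.2, 0.4, 0.8]

# Anomalous case: the mean given a hit falls exactly as the hit rate rises,
# so the two effects cancel and every partition has the same expected value.
degenerate = partition_means(hit_rates, [8.0 / p for p in hit_rates])

# Intended case: the mean given a hit rises with the hit rate,
# so the partition means differ widely.
heterogeneous = partition_means(hit_rates, [10.0, 20.0, 40.0, 80.0])

print("degenerate spread:   ", relative_spread(degenerate))
print("heterogeneous spread:", relative_spread(heterogeneous))
```

A spread near zero is the warning sign: the partitions are effectively interchangeable, so even the basic sampler will look good. A check like this before running the samplers would have flagged the new set right away.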
It was an easy fix to generate data where that wasn't the case. We carry on.