Oh, brother, here I go again reframing things just when the finish line is in sight.
At my weekly meeting with my adviser, he was understandably skeptical of my plan to use an MCMC chain to generate a distribution across all the vectors for the partition parameters. Given 1000 partitions (which actually isn't very many), it is a 3000-dimensional space. I'll concede that's a heavy lift, though my thought was that even a very sparse approximation would still converge. I, of course, have nothing but gut feel to back that up.
Then, as I was staring at the blackboard, it hit me. There's no reason all these things need to be estimated at once. We've already said that in the general case we are assuming no direct correlation between partitions. Such correlations may exist, but we don't assume as much.
So, that means that we could partition the partition space and estimate the groups of partitions separately. Then, we just need a way to glue all those back together into one empirical distribution of the block variance. My adviser liked that idea and I left the meeting thinking that should be easy enough to do.
Then the revelation hit me: it's already been done! That's essentially what the Bag of Little Bootstraps algorithm does. All the theory has already been worked out. Better yet, I already have a section in my paper adapting the theory to my use case. It's not a slam dunk because I was showing how it operated in the uniform case and explicitly called out how it fails miserably in the correlated case. But, that's because we were measuring the wrong thing. Applied to the partition statistics rather than the individual rows, this should work.
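For anyone who hasn't seen Bag of Little Bootstraps, here is a minimal sketch of the mechanics as applied to per-partition statistics. It assumes each partition has already been boiled down to a single number (the hypothetical array `partition_stats`), and the function names, subset exponent, and resample counts are illustrative placeholders, not anything taken from the paper. It estimates the standard error of a simple mean, which is only a stand-in for the block variance discussed above.

```python
import numpy as np


def blb_std_error(values, estimator, n_subsets=10, subset_exp=0.6,
                  n_resamples=50, seed=0):
    """Bag of Little Bootstraps estimate of an estimator's standard error.

    values     : 1-D array, one summary statistic per partition (assumed).
    estimator  : function (subset_values, counts) -> float, where counts
                 gives how many times each subset value is repeated.
    """
    rng = np.random.default_rng(seed)
    values = np.asarray(values, dtype=float)
    n = values.size
    b = max(2, int(n ** subset_exp))  # size of each "little" subset

    subset_errors = []
    for _ in range(n_subsets):
        # Draw a small subset of partitions without replacement.
        subset = rng.choice(values, size=b, replace=False)
        estimates = []
        for _ in range(n_resamples):
            # Resample up to the full size n, stored as multinomial counts
            # so only b distinct values are ever touched.
            counts = rng.multinomial(n, np.full(b, 1.0 / b))
            estimates.append(estimator(subset, counts))
        # Quality measure for this subset: spread of the resampled estimates.
        subset_errors.append(np.std(estimates, ddof=1))

    # Glue the groups back together by averaging the per-subset measures.
    return float(np.mean(subset_errors))


def weighted_mean(x, counts):
    """Mean of x where each value is repeated counts[i] times."""
    return float(np.average(x, weights=counts))


if __name__ == "__main__":
    # Simulated per-partition statistics, purely for illustration.
    partition_stats = np.random.default_rng(1).normal(size=1000)
    print(blb_std_error(partition_stats, weighted_mean))
```

The point of the multinomial counts is that each inner resample behaves like a full-size bootstrap sample while only ever looking at a small group of partitions, which is what makes estimating the groups separately and averaging afterward workable.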
It does mean I have some more proving to do. Simply showing that it works on simulated data sets won't cut it. But, the proof of correctness in the original paper is fairly straightforward. I should be able to follow the same path to show it works in this case.
The downside, of course, is that I just set myself back another week. This is the first time I've felt like it will be worth it.