I think I found where all the good math went. I worked this out on the very bumpy flight to Houston today and haven't really dug into it, so I could be wrong. Actually, that last part is pretty much true all the times I'm not on a bumpy flight, too, but I digress.
Anyway, when I introduced the correlation, I went to the simple case where each block comes from a single partition. This is perfectly adequate for demonstrating the issue, but it also oversimplifies the problem. I tried injecting the more realistic constraint that a block may contain any number of partitions. There are three good reasons to do this: 1) the variance computation is more interesting from a theoretical point of view, 2) it's a better model of the data (OK, this should be #1, but we're trying to get something published here), and 3) it allows us to show that the 1-partition case is easier to work with computationally.
That last one may sound like a bad thing, but it's not. It actually sets up a mathematically sound motivation for intentionally increasing the correlation within a block. That is, it leads very naturally to a theoretical, rather than completely pragmatic, argument for DBESt, which is the next step in this research. Up until now, I was worried that moving on to that discussion would be an awkward break in the thesis. Now it looks like it will flow quite naturally.
Of course, I still have to do the derivations and I may still be wrong.