As I mentioned as part of The Ultrarunner's Guide to Long IT Projects, small steps are the way to get things done when you're in it for the long haul (actually, it works pretty well all the time). It seems my advisor agrees. We met today about what I would do to make the Bayesian class a true graduate course and decided we'd look for a piece of my thesis topic that could be tackled as a little chunk.
The most promising approach seems to be to try to solve a much more specialized case of the problem. Specifically, I'm going to look at how sampling affects some individual queries that we run at work. We'll tune them for both performance and accuracy and see if we can't beat the regular cube. Most importantly, we'll look at whether our confidence intervals (or, more correctly, our posterior distribution on the result) match empirical results. Even in this reduced form, this is not an easy task, but at least it's well defined.
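To make the "match empirical results" check concrete, here's a minimal sketch of a coverage test. It assumes a toy model that isn't in the post: estimating a mean under known variance with a flat prior, so the 95% credible interval is just the sample mean plus or minus 1.96 standard errors. The function name and parameters are illustrative; the real queries would substitute their own posterior.

```python
import math
import random
import statistics

def coverage_check(true_mean=10.0, sigma=2.0, n=50, trials=2000, z=1.96, seed=0):
    """Simulate repeated sampling and report the fraction of trials in which
    the 95% credible interval (flat prior, known variance sigma**2) contains
    the true mean. If the posterior is honest, this should be close to 0.95."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mean, sigma) for _ in range(n)]
        m = statistics.fmean(sample)
        half = z * sigma / math.sqrt(n)  # half-width of the credible interval
        if m - half <= true_mean <= m + half:
            hits += 1
    return hits / trials
```

The same loop structure would apply to the work queries: draw a sample, compute the posterior interval for the query result, and check how often the full-cube answer lands inside it.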
Another avenue that occurred to me while we were tossing out ideas: rather than waiting for the posterior distribution to converge sufficiently, we could put a cost function on the error and use its expected value as a stopping rule (that is, keep sampling until the expected cost drops below a pre-set threshold).
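The stopping rule above can be sketched in a few lines. Under the same toy assumptions as before (mean estimation, known variance, flat prior) and a squared-error cost, the expected posterior cost is just the posterior variance, sigma²/n, so the rule reduces to sampling until that falls below the threshold. Everything here, from the function name to the closed-form cost, is an illustrative assumption, not the thesis method.

```python
import random

def sample_until_cost_below(stream, sigma=2.0, threshold=0.01, batch=10, max_n=100_000):
    """Draw batches from `stream` until the expected squared-error cost of
    reporting the posterior mean (flat prior, known variance: sigma**2 / n)
    drops below `threshold`. Returns (estimate, samples_drawn)."""
    total, n = 0.0, 0
    while sigma**2 / max(n, 1) > threshold and n < max_n:
        for _ in range(batch):
            total += next(stream)
            n += 1
    return total / n, n
```

With a richer cost function the expected cost wouldn't have a closed form, but the shape is the same: re-evaluate the expected cost against the current posterior after each batch and stop once it clears the threshold.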
At any rate, I'm pretty excited to actually have some direction. Up until now, it's been lots of ideas but no clear path on how to proceed. I'm willing to put the ideas down for a bit in exchange for making some real progress.