Monday, November 6, 2017

Blind multi-stage bootstrap

The multi-stage bootstrap is a bootstrap method where one first creates a hierarchy of attributes and then selects the bootstrap sample based on that hierarchy. For example, suppose we have cash flows from several offices and good reason to believe that cash flows within an office are correlated whereas cash flows between offices are not. Further, we know that the cash flows for life policies are fundamentally different from those for health. If the goal is to get some sense of the variability of the present value of all these cash flows, it makes sense to estimate these things separately and then combine the estimates, rather than pretend that they are all just one big pool.

So, we perform the bootstrap, but rather than selecting randomly from all observations, we first select an office (with probability proportional to the number of observations for each office), then pick a product type (with probability proportional to the life/health mix for that office), and then select our bootstrap sample from the observations meeting those criteria. As with the usual procedure, we repeat this via a Monte Carlo process until the estimate of our estimator's quality converges.
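To make the procedure concrete, here is a minimal Python sketch of one reading of it: each replicate picks a single office and product type, then resamples with replacement from that cell. The names (`build_hierarchy`, `draw_replicate`, `multistage_bootstrap`), the toy rows, and the use of the mean as the statistic are all assumptions for illustration, and a fixed number of replicates stands in for a real convergence check.

```python
import random
from collections import defaultdict
from statistics import mean, stdev


def build_hierarchy(rows):
    """Group (office, product, value) rows into office -> product -> values."""
    tree = defaultdict(lambda: defaultdict(list))
    for office, product, value in rows:
        tree[office][product].append(value)
    return tree


def draw_replicate(tree, size, rng):
    """Pick an office, then a product type, then resample from that cell."""
    offices = list(tree)
    # Stage 1: office, weighted by its number of observations.
    office_weights = [sum(len(vals) for vals in tree[o].values()) for o in offices]
    office = rng.choices(offices, weights=office_weights, k=1)[0]
    # Stage 2: product type, weighted by the mix within that office.
    products = list(tree[office])
    product_weights = [len(tree[office][p]) for p in products]
    product = rng.choices(products, weights=product_weights, k=1)[0]
    # Stage 3: resample with replacement from the chosen cell.
    cell = tree[office][product]
    return office, product, [rng.choice(cell) for _ in range(size)]


def multistage_bootstrap(rows, size, n_reps, statistic=mean, seed=0):
    """Return one statistic per replicate (fixed n_reps, no convergence test)."""
    rng = random.Random(seed)
    tree = build_hierarchy(rows)
    return [statistic(draw_replicate(tree, size, rng)[2]) for _ in range(n_reps)]


# Toy data, invented purely for illustration.
rows = [
    ("A", "life", 100.0), ("A", "life", 120.0), ("A", "health", 40.0),
    ("B", "life", 90.0), ("B", "health", 55.0), ("B", "health", 60.0),
]
estimates = multistage_bootstrap(rows, size=50, n_reps=1000)
print("spread of replicate means:", stdev(estimates))
```

Drawing a fresh office and product for every single observation instead would give each row the same marginal probability as the ordinary bootstrap, which is why this sketch fixes the branch once per replicate.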

All good, but what if we don't know the hierarchy? Specifically, what if we know that a query may or may not exclude all rows from a block, but have no way of knowing which it is without sampling the block beforehand (which defeats the purpose of ad hoc sampling)? We are now in a situation where we are performing a multi-stage bootstrap but, rather than consciously picking the sampling branch, we are merely observing it as we sample.
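A rough sketch of what "observing the branch" might look like, under some loud assumptions: blocks are plain lists of rows, a block is chosen uniformly at random, and the query is a hypothetical predicate function. None of this comes from an actual implementation; it only illustrates that we learn whether a block has any qualifying rows after we have already sampled it.

```python
import random


def blind_replicate(blocks, predicate, size, rng):
    """Sample a block, then discover which branch we landed in."""
    block = rng.choice(blocks)  # assumption: uniform choice over blocks
    hits = [row for row in block if predicate(row)]
    if not hits:
        # The query excluded every row; we only find out after sampling.
        return None
    # Otherwise resample with replacement from the rows that passed.
    return [rng.choice(hits) for _ in range(size)]


def is_life(row):
    """Hypothetical query: keep only life policies."""
    return row[0] == "life"


# Toy blocks of (product, cash flow) rows, invented for illustration.
blocks = [
    [("life", 100.0), ("life", 120.0)],
    [("health", 40.0), ("health", 55.0)],
    [("life", 90.0), ("health", 60.0)],
]
rng = random.Random(0)
replicates = [blind_replicate(blocks, is_life, 3, rng) for _ in range(200)]
empty = sum(r is None for r in replicates)
print("replicates that hit an excluded block:", empty)
```

How the empty draws should enter the estimator is exactly the open question; the sketch just records them.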

I'm not sure what (if anything) this does to the estimator. In frequentist statistics, foreknowledge of an experimental design changes the outcome. Bayesian stats are a bit more resilient to foreknowledge because that's baked into the prior. At any rate, I don't need to address it right away because we are just doing a comparison with BLB and BMH, not really trying to expand on those methods. On the other hand, a cursory literature review didn't turn up anything on this topic, so it might be worth investigating sooner rather than later simply to stake out some theoretical ground.
