Monday, August 7, 2017

Unequal Probability Sampling

Getting back to CISS for a bit. The selection process for the next block to sample was basically to look at the variability of the stratum estimators and pick a stratum in proportion to its "uncertainty" (technically, the quantity I use is the variance of the unsampled data, not the sampled data, so I call it uncertainty rather than variance). I did this without much mathematical justification because it seemed intuitive enough. Of course, that attitude can get one into a lot of trouble, so I'm happy to have found a reference that has already done most of the heavy lifting.
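To make the selection rule concrete, here's a minimal sketch of picking a stratum with probability proportional to its uncertainty. This is not the actual CISS code; the function names and the uncertainty scores are made up for illustration, and how the uncertainty itself gets computed is left out entirely.

```python
import random


def selection_probabilities(uncertainty):
    """Normalize per-stratum uncertainty scores into selection probabilities."""
    total = sum(uncertainty.values())
    if total <= 0:
        raise ValueError("at least one stratum must have positive uncertainty")
    return {name: u / total for name, u in uncertainty.items()}


def pick_stratum(uncertainty, rng=random):
    """Draw one stratum with probability proportional to its uncertainty."""
    probs = selection_probabilities(uncertainty)
    names = list(probs)
    return rng.choices(names, weights=[probs[n] for n in names], k=1)[0]


# Hypothetical uncertainty scores for three strata (not real CISS output).
uncertainty = {"A": 4.0, "B": 1.0, "C": 0.0}
probs = selection_probabilities(uncertainty)
chosen = pick_stratum(uncertainty, random.Random(0))
```

With these scores, stratum A gets picked 80% of the time, B 20%, and C (no remaining uncertainty) never.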

Reference: Shahbaz, MQ., Shabaz, S., and Hanif, M. International Journal for Open Problems in Computational Mathematics, Vol 4, No 1, March 2011.

It starts with the Horvitz-Thompson estimator (another paper I should include) and looks particularly at the 2-valued sample. This isn't super interesting in and of itself, but the introduction is a good annotated bibliography, starting with Hansen's paper that I reviewed last spring. I could have gotten most of it by simply doing a forward reference on Hansen, but now that's one less thing to do.

The main takeaway is the variance computation. As I said, my uncertainty isn't really a variance, but if the CISS technique were used to create a sample that would then be used for leveraged analysis, I'd need to flip that around and be able to estimate the sample variance. So, it could come in handy.
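For a feel of what that variance computation looks like, here's a small sketch of the Horvitz-Thompson estimator of a total, which weights each sampled value by the inverse of its inclusion probability. To keep it checkable, I'm assuming Poisson sampling (each unit included independently), which is my simplification and not the design in the paper or in CISS. Under that assumption the variance reduces to the sum of (1 - pi_i) / pi_i * y_i^2; a fixed-size design would also need the joint inclusion probabilities. The population values and probabilities are made up.

```python
from itertools import product


def horvitz_thompson_total(sample_values, inclusion_probs):
    """Estimate the population total as the sum of y_i / pi_i over the sample."""
    return sum(y / p for y, p in zip(sample_values, inclusion_probs))


def poisson_ht_variance(values, inclusion_probs):
    """Variance of the estimator when units are included independently."""
    return sum((1 - p) / p * y * y for y, p in zip(values, inclusion_probs))


# Hypothetical population (true total is 100) and inclusion probabilities.
values = [10.0, 20.0, 70.0]
pi = [0.2, 0.3, 0.9]

# Enumerate every possible sample to get the exact mean and variance
# of the estimator under independent inclusion.
mean = 0.0
second_moment = 0.0
for included in product([False, True], repeat=len(values)):
    prob = 1.0
    for flag, p in zip(included, pi):
        prob *= p if flag else 1 - p
    estimate = horvitz_thompson_total(
        [y for y, flag in zip(values, included) if flag],
        [p for p, flag in zip(pi, included) if flag],
    )
    mean += prob * estimate
    second_moment += prob * estimate * estimate
exact_variance = second_moment - mean ** 2
```

The enumeration is only there as a sanity check: the estimator's mean matches the true total, and the brute-force variance agrees with the closed form.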
