Friday, May 19, 2017

On the Theory of Sampling from Finite Populations

Well, I was right about this one: it's one of those citation classics that you simply have to put into your bibliography. It's also a really good paper. Definitely required reading for any stratified sampling work.

On the Theory of Sampling from Finite Populations. Morris H. Hansen and William N. Hurwitz. The Annals of Mathematical Statistics, Vol. 14, No. 4 (Dec., 1943), pp. 333-362.

The key result is the estimator that weights each sampled point by the inverse of its selection probability. In this paper, sampling is done with replacement: once you've sampled a point, you return it to the population and may well sample it again. In CISS, we don't do that. Once a block is sampled, it's removed from further consideration. That does require some adjustments, but the basic principle is the same: the strata that get fully sampled (the tails) get a lot less weight in the estimate than the strata that have only a small percentage of blocks read.
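To make the inverse-probability weighting concrete, here is a minimal sketch of the with-replacement estimator of a population total: average the ratios y_i / p_i over the draws, where p_i is the per-draw selection probability. The function names and the toy population are hypothetical, made up for illustration. This is the with-replacement form only, not the without-replacement adjustment CISS needs.

```python
import random


def hansen_hurwitz_total(sampled_values, sampled_probs):
    """Estimate a population total from draws made with replacement.

    sampled_values[i] is the value observed on draw i, and
    sampled_probs[i] is the per-draw selection probability of that unit.
    Each draw contributes value / probability; the estimate is their mean.
    """
    if len(sampled_values) != len(sampled_probs) or not sampled_values:
        raise ValueError("need equal-length, non-empty samples")
    ratios = [y / p for y, p in zip(sampled_values, sampled_probs)]
    return sum(ratios) / len(ratios)


def draw_with_replacement(values, probs, n, rng):
    """Draw n units with replacement, unit i chosen with probability probs[i]."""
    indices = rng.choices(range(len(values)), weights=probs, k=n)
    return [values[i] for i in indices], [probs[i] for i in indices]


# Hypothetical toy population (illustrative numbers only).
values = [10.0, 20.0, 30.0, 40.0]
probs = [0.1, 0.2, 0.3, 0.4]  # per-draw selection probabilities, sum to 1

rng = random.Random(0)
ys, ps = draw_with_replacement(values, probs, 5, rng)
estimate = hansen_hurwitz_total(ys, ps)
```

In this toy case the selection probabilities are exactly proportional to the values, so every ratio y_i / p_i equals the true total and the estimate has no sampling variance. That is the ideal that probability-proportional-to-size designs aim for. With any other choice of probabilities the estimate varies from sample to sample but stays unbiased.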

This paper also has some good references on cluster sampling. I don't think I want to go back much earlier than 1943 for my citations, since most of the heavy lifting on cluster sampling was done in the 1980s and 1990s, but those references can be worked in reverse: by looking for other papers that cite them, I can find newer work that is relevant.
