Thursday, January 19, 2017

Evaluating estimators

I've been preaching about how you can't just trust the results; you have to check to make sure they make sense. How does one do that?

There are two questions in play, here. The first is, is the estimator generally good for this problem? That often turns out to be the easier thing to evaluate. The second is, is the estimator generated by this data set any good? This one is often a lot harder, especially if all you have to go on is your one data set.

The first is typically based on minimizing the expected value of some loss function. That is, on average, how far off will this estimator be, weighted by how much I care about being that far off? The squared error function is the most common: the cost is proportional to the square of the difference between the estimate and the "real" value (we'll go with the idea that a real value does exist for the moment). That cost function tends to favor estimators with low variance, since expected squared error decomposes into the variance plus the squared bias. Other cost functions (like absolute distance or log distance) yield different favorites. At any rate, comparing the general performance of two estimators is merely a matter of selecting your cost function and computing the expectation. Even if the expectation doesn't have a closed form, you should be able to approximate it with numerical methods, as in the sketch below.
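To make that concrete, here's a minimal sketch (Python, assuming NumPy is available) that approximates the expected squared loss of two competing estimators of a location parameter by simulation. The choice of distribution, sample size, and trial count are arbitrary, just for illustration.

```python
# Compare two estimators of the center of a Laplace distribution -- the
# sample mean and the sample median -- under squared error loss, by
# approximating the expected loss with Monte Carlo simulation.
import numpy as np

rng = np.random.default_rng(seed=0)
true_center = 2.0
n, trials = 25, 100_000

# Simulate many independent data sets of size n from the "real" distribution.
samples = rng.laplace(loc=true_center, scale=1.0, size=(trials, n))

# Average the squared error of each estimator over the simulated data sets.
mse_mean = np.mean((samples.mean(axis=1) - true_center) ** 2)
mse_median = np.mean((np.median(samples, axis=1) - true_center) ** 2)

print(f"approx. expected squared loss, sample mean:   {mse_mean:.4f}")
print(f"approx. expected squared loss, sample median: {mse_median:.4f}")
```

Swapping in a different loss function (absolute distance, say) is a one-line change, and doing so can change which estimator comes out ahead.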

Knowing whether your particular estimate is any good is a whole 'nuther thing. There are all sorts of weird data conditions that can skew an estimate. Good experiment design foresees many of these and defends against them but, at the end of the day, we are talking about random variables. Sometimes they just plain come out wacky.

Here's where the frequentist methods have some problems. Because the methods are based solely on the data set, it's very hard to deduce that the problem is the data set itself. There are all sorts of model tests that can (and should) be performed to see if the data matches the model (one such check is sketched below). But there is no good way of telling whether the data matches the "reality" it was drawn from.
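As an illustration of the kind of model test I mean, here's a minimal sketch using SciPy's Kolmogorov-Smirnov test; the model and data are made up for the example. Even a passing test only says the data is consistent with the model, not that the data is representative of whatever reality produced it.

```python
# Goodness-of-fit check: does the data plausibly come from the model we
# assumed (here, a standard normal chosen purely for illustration)?
# Note: if the model's parameters had been estimated from this same data,
# the p-value would be optimistic and a corrected test would be needed.
import numpy as np
from scipy import stats

rng = np.random.default_rng(seed=1)
data = rng.standard_t(df=3, size=200)   # secretly heavier-tailed than normal

stat, p_value = stats.kstest(data, "norm")   # compare against N(0, 1)
print(f"KS statistic = {stat:.3f}, p-value = {p_value:.3f}")
```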

Bayesians aren't really on any firmer footing, here. If our beliefs are wrong and a nutty data set confirms those beliefs, we may well be even further from the truth than our frequentist brethren. About the best that can be said is that at least those beliefs are stated up front in the prior. This does make it a little easier for an independent observer to challenge the assertions.
