never2old4school: Minimum Description Length Principle (MDLP)

Saturday, February 6, 2016

Minimum Description Length Principle (MDLP)

This is a new one for me and I have to say I think it's very elegant. It's very late (I've been working at my "day" job all evening, doing a little reading during lulls) and this was supposed to be an off day so I won't write anything more than an overview right now, but it's worth coming back for.

Basically, the principle is a formalization of Occam's Razor. Occam claimed that if there were competing plausible explanations, the simpler one was the one to go with. That's fine if you're trying to decide whether a wet sidewalk on a sunny day was caused by a sprinkler system or an alien flyover. However, it gets a bit more dicey if you're wondering if the warm temperature on that same day is just random fluctuation or a symptom of global warming. Both are plausible. Both are simple. And it's not at all obvious which is more so.

The MDLP asks, if you were to encode your hypotheses and your data, how much storage would you need? Most concise answer wins.

What this means is that a simple model that doesn't fit the data very well will lose to a more complicated model only if the improvement in fit offsets the added complexity. The more complicated model takes more space to describe, but the data take less: you only have to store the deviations from the model, and smaller deviations can be stored in fewer bits.
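For the record, that trade-off is usually written as a two-part code length. The notation here is mine, not something fixed by the principle: H is a candidate hypothesis from some set of candidates, D is the data, and L(.) is a code length in bits.

```latex
\hat{H} = \arg\min_{H \in \mathcal{H}} \bigl[ L(H) + L(D \mid H) \bigr]
```

L(H) is the cost of writing down the model and L(D | H) is the cost of writing down whatever the model fails to explain. A fancier model raises the first term and only earns its keep if it lowers the second by more.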

In the above example, both hypotheses require very little coding. The first takes none (it's random, there's not much more to say). The second requires only indicating some sort of trend function. If you have enough data to actually derive such a trend, that is, you can fit your data to a trend and reduce the variance from the model, then global warming wins. However, with only one data point, you can't do that. You either have to store the one data point on its own and call it random, or build that data point into your model and then add a trend you made up. That increases your total description length. So, in the absence of more data, random gets it. I think even Al Gore would concede that one hot day doesn't prove anything.
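Here's a rough sketch of that comparison in Python, just to make the bookkeeping concrete. Everything in it is my own assumption rather than an official recipe: the function names are made up, each fitted parameter is charged about half of log2(n) bits (a common approximation), the "random" model is treated as a fixed baseline with one parameter, and deviations are coded as Gaussian values rounded to a chosen precision. The temperature series are invented, deterministic toy data.

```python
import math

def code_length_bits(residuals, n_params, precision=0.01):
    """Crude two-part description length in bits (illustrative only)."""
    n = len(residuals)
    # Model cost: assume roughly (1/2) * log2(n) bits per fitted parameter.
    model_bits = 0.5 * n_params * math.log2(n)
    # Data cost: Gaussian code for residuals rounded to `precision`.
    # The variance is floored so a perfect fit does not cost minus infinity.
    variance = max(sum(r * r for r in residuals) / n, precision ** 2)
    bits_per_point = 0.5 * math.log2(2 * math.pi * math.e * variance) - math.log2(precision)
    return model_bits + n * bits_per_point

def residuals_constant(ys):
    """Deviations from a fixed baseline (the 'random fluctuation' hypothesis)."""
    mean = sum(ys) / len(ys)
    return [y - mean for y in ys]

def residuals_line(xs, ys):
    """Deviations from a least-squares straight line (the 'trend' hypothesis)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return [y - (intercept + slope * x) for x, y in zip(xs, ys)]

def best_model(xs, ys):
    """Return whichever hypothesis gives the shorter total description."""
    random_bits = code_length_bits(residuals_constant(ys), n_params=1)
    trend_bits = code_length_bits(residuals_line(xs, ys), n_params=2)
    return "trend" if trend_bits < random_bits else "random"

# Invented toy data: 50 readings with a small alternating wiggle.
xs = list(range(50))
wiggle = [0.5 if i % 2 == 0 else -0.5 for i in xs]
flat = [20.0 + w for w in wiggle]                              # no trend
warming = [20.0 + 0.1 * x + w for x, w in zip(xs, wiggle)]     # steady rise

print(best_model(xs, flat))
print(best_model(xs, warming))
```

With the flat series, the fitted line barely shrinks the deviations, so the extra parameter isn't worth its bits. With the rising series, the line shrinks them a lot and pays for itself many times over. Note that the line fit needs at least two distinct x values, which is the one-hot-day problem in miniature: with a single point there is nothing to fit a trend to.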

More, including references and why this is relevant to what I'm doing, to follow.
