Saturday, January 13, 2018

Reluctant Bayesian

I had a professor at Cornell once say "the only good Bayesian is a dead Bayesian." She was joking, of course, but the sentiment had basis in reality. At that time, Cornell's stats folks were solidly in the frequentist camp.

My adviser, who did some research at Cornell after I left, tells me they (like just about everybody else) have softened their stance quite a bit and have come to at least accept it as a legitimate approach. That said, he also cautioned me about going all in on using Bayesian priors for our work because you end up having to defend the prior rather than the result you are presenting.

Fair enough but, while looking at drawing inferences from the Markov model today, it occurred to me that it might be a lot easier to analyze from a Bayesian point of view. There are several parameters that we don't know going in. The three most important are the mean and variance of the measures included in the query and the proportion of observations excluded. The sample mean, sample variance, and observed proportion are (under the usual assumptions) Maximum Likelihood Estimators (MLE) for those and will serve just fine. In an iid setting, that would be enough to get a point estimate and confidence interval on the sum.
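To make the iid case concrete, here is a minimal sketch of that point estimate and interval. It assumes a simple random sample from a population of known size, treats excluded rows as contributing zero to the sum, uses a normal approximation, and ignores the finite population correction. The names `iid_sum_estimate`, `values`, `included`, and `population_size` are made up for illustration.

```python
import math
import statistics


def iid_sum_estimate(values, included, population_size, z=1.96):
    """Estimate a population sum from an iid sample (hypothetical helper).

    values          -- sampled measure values
    included        -- booleans, True if the row passes the query filter
    population_size -- total number of rows in the population (assumed known)
    z               -- normal quantile; 1.96 gives roughly a 95% interval
    """
    n = len(values)
    # An excluded row contributes nothing to the query sum.
    contrib = [v if inc else 0.0 for v, inc in zip(values, included)]
    mean_c = statistics.fmean(contrib)
    sd_c = statistics.stdev(contrib)  # n - 1 divisor
    estimate = population_size * mean_c
    # Standard error of the scaled mean, assuming independent rows.
    half_width = z * population_size * sd_c / math.sqrt(n)
    return estimate, (estimate - half_width, estimate + half_width)


if __name__ == "__main__":
    est, (lo, hi) = iid_sum_estimate(
        [1.0, 2.0, 3.0, 4.0], [True, True, False, True], population_size=100
    )
    print(est, lo, hi)
```

The independence assumption is doing all the work in the standard error line, which is exactly the part that stops holding once the rows are correlated.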

In our case, it's not good enough. We need to know how the correlation is affecting those estimators and adjust the variance accordingly. That's not super easy to do. However, we can turn the situation around and ask: supposing the transition matrix for the Markov chain is T, what's the likelihood that we see the data we're looking at? And what happens to that likelihood when we move off of our estimates for mean and variance?
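As a rough illustration of "the likelihood of the data given T", the sketch below assumes the chain has two states (0 for an excluded observation, 1 for an included one) and conditions on the first observed state. That state space is an assumption for the example, not necessarily the model used here, and the function names are hypothetical.

```python
import math


def transition_counts(states, k=2):
    """Count observed i -> j transitions in a sequence of integer states."""
    counts = [[0] * k for _ in range(k)]
    for a, b in zip(states, states[1:]):
        counts[a][b] += 1
    return counts


def log_likelihood(states, T):
    """Log-likelihood of the sequence under transition matrix T,
    conditional on the first state."""
    counts = transition_counts(states, len(T))
    ll = 0.0
    for i, row in enumerate(counts):
        for j, c in enumerate(row):
            if c == 0:
                continue
            if T[i][j] <= 0.0:
                return -math.inf  # an observed transition that T forbids
            ll += c * math.log(T[i][j])
    return ll


def mle_transition_matrix(states, k=2):
    """Row-normalized transition counts, the MLE of T for a Markov chain."""
    T = []
    for row in transition_counts(states, k):
        total = sum(row)
        # Rows never visited get a uniform placeholder.
        T.append([c / total if total else 1.0 / k for c in row])
    return T


if __name__ == "__main__":
    seq = [0, 0, 1, 1, 1, 0, 0, 1]
    T_hat = mle_transition_matrix(seq)
    print(T_hat, log_likelihood(seq, T_hat))
    # Moving T away from the MLE can only lower the likelihood.
    print(log_likelihood(seq, [[0.9, 0.1], [0.1, 0.9]]))
```

How quickly the log-likelihood falls off as T moves away from the row-normalized counts is the kind of information the variance adjustment would need.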

A full-blown MCMC simulation could give us a nice, robust, empirical distribution for the true sum based on messing with those parameters. Of course, in the time it takes to run that, we could just sample all the data, so that doesn't buy us anything. But, I'm thinking we might be able to leverage the fact that these likelihood functions are really simple to replace the simulation with some quick searches of the estimator space using traditional optimization techniques. If nothing else, it's a novel approach.
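One possible reading of "quick searches of the estimator space" is a likelihood-ratio interval found by bisection rather than by simulation. The sketch below does this for a single transition probability, using the counts of times the chain did and did not make that transition. The choice of a one-parameter likelihood, the 1.92 log-likelihood drop (half the 95% chi-square cutoff with one degree of freedom), and all function names are assumptions for illustration, not the method this post settles on.

```python
import math


def log_lik(p, successes, failures):
    """Binomial log-likelihood, up to a constant, for 0 < p < 1."""
    return successes * math.log(p) + failures * math.log(1.0 - p)


def bisect_root(f, lo, hi, tol=1e-10):
    """Find a root of f in [lo, hi]; assumes f(lo) and f(hi) differ in sign."""
    f_lo = f(lo)
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        f_mid = f(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def likelihood_interval(successes, failures, drop=1.92):
    """Values of p whose log-likelihood is within `drop` of the maximum.

    Requires successes > 0 and failures > 0 so the MLE is interior.
    """
    p_hat = successes / (successes + failures)
    target = log_lik(p_hat, successes, failures) - drop

    def gap(p):
        return log_lik(p, successes, failures) - target

    eps = 1e-12
    lower = bisect_root(gap, eps, p_hat)        # gap goes from < 0 to > 0
    upper = bisect_root(gap, p_hat, 1.0 - eps)  # gap goes from > 0 to < 0
    return p_hat, lower, upper


if __name__ == "__main__":
    print(likelihood_interval(20, 80))
```

Each interval costs a few dozen likelihood evaluations, which is the appeal over simulation. Whether the same trick carries over cleanly to the joint problem with mean, variance, and T is the open question.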
