never2old4school: F

Monday, November 21, 2016

F

No, no, I didn't fail a class. Today, I'm writing about the F-distribution. In another odd naming scenario, the F is in honor of Ronald Fisher, but it was actually developed by George Snedecor. Snedecor had no problem with publishing under his own name; he just felt that Fisher had done so much of the work on this particular problem that he should get some credit, too. Some texts call it Snedecor's F distribution.

OK, fine, get on with it.

Just as the t-distribution is used to substitute in the sample variance for the true variance to get a distribution on the mean, the F-distribution substitutes in two sample variances to get a distribution on the ratio of the variances. Formally, if X_i are N(μ_X, σ_X²) and Y_i are N(μ_Y, σ_Y²), with the two samples independent and S_X², S_Y² the sample variances, then:

$\frac{S_X^2/S_Y^2}{\sigma_X^2/\sigma_Y^2}=\frac{S_X^2/\sigma_X^2}{S_Y^2/\sigma_Y^2}\sim F$

Also as with the t-distribution, there are degrees of freedom. In this case there are two and they are what you'd expect from the t: n - 1 for n observations of X and m - 1 for m observations of Y. You might wonder why the degrees of freedom are always one less than the number of observations. Why not just arbitrarily re-label them so they match? The reason is that these are coming from the underlying chi-squared distributions. Recall that (n - 1)S_X²/σ_X² is chi-squared with n - 1 degrees of freedom, so S_X²/σ_X² is a chi-squared divided by its degrees of freedom. The F is really just the ratio of two independent chi-squared random variables, each divided by its own degrees of freedom.
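If you'd rather see this than take my word for it, here's a quick simulation sketch in Python (standard library only). The sample sizes, means, and sigmas are made-up values, and `variance_ratio` is just a helper name I invented for this. It draws the ratio from the formula above many times and compares the average to d2/(d2 - 2), which is the standard mean of an F with d2 > 2 denominator degrees of freedom. Matching one moment is a sanity check, not a proof.

```python
import random
import statistics


def variance_ratio(rng, n, m, mu_x, sigma_x, mu_y, sigma_y):
    """One draw of (S_X^2 / sigma_X^2) / (S_Y^2 / sigma_Y^2)."""
    xs = [rng.gauss(mu_x, sigma_x) for _ in range(n)]
    ys = [rng.gauss(mu_y, sigma_y) for _ in range(m)]
    s2_x = statistics.variance(xs)  # sample variance, divides by n - 1
    s2_y = statistics.variance(ys)  # sample variance, divides by m - 1
    return (s2_x / sigma_x ** 2) / (s2_y / sigma_y ** 2)


# Illustrative values only: none of these come from the post.
rng = random.Random(2016)
n, m = 8, 12
draws = [variance_ratio(rng, n, m, 0.0, 3.0, 5.0, 0.5) for _ in range(20000)]

sim_mean = sum(draws) / len(draws)
d2 = m - 1                    # denominator degrees of freedom
theory_mean = d2 / (d2 - 2)   # mean of F with d2 > 2; does not depend on n

print("simulated mean:", sim_mean)
print("theoretical mean:", theory_mean)
```

Notice that the means and sigmas drop out entirely. Change them to whatever you like and the ratio should still hover around the same value, which is the whole point of dividing by the true variances.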

It was this whole chi-squared, degrees of freedom bit that Fisher had worked out with respect to normal samples. Snedecor and Gosset just did the wrapping and formalizing of the distributions. Both were admirers of Fisher's work, as is obvious from the way they deferred credit to him. (Though, in the case of Gosset, he pretty much had to publish on the down low because one of his colleagues at Guinness had given away some trade secrets in a publication and the company subsequently took a dim view of anybody publishing anything.)

And here are three fun transformation facts before I call it a night:

If X ~ F_p,q then 1/X ~ F_q,p. That should be pretty obvious from the formulation as the ratio of two random variables.
If X ~ t_q then X² ~ F_1,q. Nothing obvious about that one. At least not to me.
If X ~ F_p,q then (p/q)X/(1+(p/q)X) ~ beta(p/2, q/2). Wow, really? OK.
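
For what it's worth, the last two look a lot less mysterious if you write everything in terms of the underlying chi-squareds. Here's a sketch, where Z, U, and V are names I'm introducing: Z is a standard normal, U and V are chi-squared with p and q degrees of freedom, and they're all independent. For the t fact, square the usual construction of the t and note that Z² is chi-squared with 1 degree of freedom:

$X=\frac{Z}{\sqrt{V/q}}\sim t_q \implies X^2=\frac{Z^2/1}{V/q}\sim F_{1,q}$

For the beta fact, write the F as a ratio and the p's and q's cancel:

$X=\frac{U/p}{V/q}\sim F_{p,q} \implies \frac{p}{q}X=\frac{U}{V} \implies \frac{(p/q)X}{1+(p/q)X}=\frac{U}{U+V}\sim \mathrm{beta}(p/2,\,q/2)$

That last step leans on the standard result that U/(U+V) is beta when U and V are independent gammas with a common scale, which chi-squareds are.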
