Wednesday, May 15, 2013

Assumptions make an ??? of you and me—or not

Each statistical test rests on assumptions. For example, an independent-samples t test rests on normality in the underlying population, similar variance in both groups, and independence of the two groups from each other.

What?

Okay, here are the steps to take to make sure your t test is valid.

  1. Generate a histogram for the dependent variable (height, weight, or whatever). Does it look relatively normal? Good. Does it look like it comes from a normally distributed underlying population, or do you have prior knowledge or proof in the literature that it does? Also good. Does it look flat or in some other way non-normal? You need to look at another form of the t test that is "robust" to non-normality, or in other words, accurate even though the data are not normal. I'll write about tests of this kind later. 
  2. Provided your histogram looked normal, go ahead and run the t test in SPSS. Among the output tables is one that has a few columns for "Levene's Test." Check the significance of Levene's test. If p > .05, then you have equal enough variances between the two groups. (Do notice the direction of the sign. This is one of the few tests where having p < .05 is bad. You want a p that is GREATER THAN .05.) Either way, SPSS helpfully provides you with the results for two types of t test: one that works if the variances are not equal, and one that works only if the variances are equal. Pick the one that matches your Levene's result (equal variances assumed if p > .05, not assumed otherwise), and there is the result for your t test. I could go on and on about these two different types of t test, but that wouldn't be practical stats.
  3. This assumption should probably be #1, but like most people, I don't think about it much. In fact, if you don't have independent groups, your t test might not even work. It's very important. Take the situation where a single teacher is answering survey questions about classrooms with only boys and classrooms with boys and girls. For each question, they slide a bar to a point between "total chaos" and "orderly studious behavior." The two bars are stacked on top of each other, the one for boys-only classrooms first and the one for mixed-gender classrooms second. Can you do a t test comparing mean total answer values for the boys-only classrooms and mean total answer values for the mixed-gender classrooms? Are the two groups (boys vs. boys and girls) independent of each other in this methodology? No. Not at all. They are linked by the fact that the same teacher is answering both questions, and further linked by being presented in a format that encourages comparison between them. Fortunately, you have an option: the paired (or dependent) t test, comparing each teacher's answer for boys with the same teacher's answer for boys and girls. That's what I would use in most situations where the two groups were not independent of each other.
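
If you work in Python rather than SPSS, steps 2 and 3 above can be sketched roughly like this. Treat it as an illustration under assumptions, not a recipe: it relies on SciPy's `stats.levene`, `stats.ttest_ind`, and `stats.ttest_rel`, while the function names `independent_t_test` and `paired_t_test` and every number in the data are made up for the example. The histogram check in step 1 is a visual judgment and is not covered here.

```python
from scipy import stats


def independent_t_test(group_a, group_b, alpha=0.05):
    """Step 2: run Levene's test, then pick the matching t test."""
    # center="mean" gives the classic Levene's test;
    # SciPy's default is the median-based variant.
    _, levene_p = stats.levene(group_a, group_b, center="mean")

    # p GREATER THAN alpha means the variances are equal enough.
    equal_variances = bool(levene_p > alpha)

    # equal_var=True is the pooled-variance t test;
    # equal_var=False is the version that does not assume equal variances.
    t_stat, p_value = stats.ttest_ind(group_a, group_b, equal_var=equal_variances)
    return {
        "levene_p": float(levene_p),
        "equal_variances": equal_variances,
        "t": float(t_stat),
        "p": float(p_value),
    }


def paired_t_test(first, second):
    """Step 3: the two lists hold answers from the same people, in the same order."""
    t_stat, p_value = stats.ttest_rel(first, second)
    return {"t": float(t_stat), "p": float(p_value)}


# Hypothetical data: two independent groups with very different spreads.
heights_a = [170, 172, 168, 171, 169, 173]
heights_b = [150, 190, 160, 185, 145, 195]

# Hypothetical data: six teachers, each rating both kinds of classroom.
boys_only = [40, 55, 38, 62, 47, 51]
mixed = [48, 60, 45, 66, 55, 58]

independent_result = independent_t_test(heights_a, heights_b)
paired_result = paired_t_test(boys_only, mixed)
print(independent_result)
print(paired_result)
```

With the teacher data, the paired test compares each teacher with themselves, so it can pick up a consistent difference that an independent-samples test on the same numbers would miss.
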
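A trick: if you don't have independence among three or more groups and you really, really wanted to do an ANOVA, the ordinary one-way ANOVA is out. One workaround is to do several paired t tests, comparing each group with each other group. Just remember that every extra test raises your chance of a false positive. (A repeated-measures ANOVA is the usual alternative.)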
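
Here is a rough sketch of that trick in Python, again using SciPy's `stats.ttest_rel`. The function name `pairwise_paired_t_tests`, the third "girls_only" condition, and all of the numbers are invented for illustration. Dividing the cutoff by the number of comparisons (a Bonferroni correction) is my own addition to keep false positives in check; it is one common choice among several.

```python
from itertools import combinations

from scipy import stats


def pairwise_paired_t_tests(conditions, alpha=0.05):
    """Run a paired t test for every pair of conditions.

    `conditions` maps a condition name to a list of scores, with the
    same respondents in the same order in every list.
    """
    pairs = list(combinations(conditions, 2))
    # Bonferroni correction (an assumption of this sketch):
    # split alpha evenly across all the comparisons.
    adjusted_alpha = alpha / len(pairs)

    results = []
    for name_a, name_b in pairs:
        t_stat, p_value = stats.ttest_rel(conditions[name_a], conditions[name_b])
        results.append({
            "pair": (name_a, name_b),
            "t": float(t_stat),
            "p": float(p_value),
            "significant": bool(p_value < adjusted_alpha),
        })
    return results


# Hypothetical ratings from the same six teachers for three classroom types.
ratings = {
    "boys_only": [40, 55, 38, 62, 47, 51],
    "mixed": [48, 60, 45, 66, 55, 58],
    "girls_only": [60, 70, 58, 75, 66, 69],
}

pairwise_results = pairwise_paired_t_tests(ratings)
for row in pairwise_results:
    print(row)
```

Three conditions give three comparisons, so each one is judged against .05 / 3 rather than .05.
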

So when you write your proposal, do add a clause somewhere to explain that if the assumptions for your chosen method are not met, you will substitute an appropriate equivalent analysis. You will save yourself from numerous headaches.

Please check out our website at www.anovisions.com for more help.

