Monday, June 25, 2018

But Wait a Sec . . . Is that really the Best Way?

A Less-than-Ideal Example

After I wrote the previous blog post, I had a good think about a couple of things. One was that my situation with Anovisions occasionally provides less-than-ideal entry points into studies. I often enter a study long after many of the critical first decisions about data gathering, data type, and even research questions have been made by the people who decided to hire a statistician. My methods for getting my head around a study, which I started to describe in the previous blog post, are therefore rather ad hoc and perhaps not the best methods for researchers in environments where they have more control over the methods.

At this point in the conversation, therefore, I'm going to go in a couple of different directions. Let's have a look at a process that is more or less ideal for a data explorer, and after that, let's go back to how to manage when you are required to be the statistics expert entering in the middle of an exploration already begun by people who are savvy in their businesses but not necessarily trained as data scientists or statisticians.

An Ideal Example

In this example, you have just been handed data and you have been asked to see what you can learn from it. It's a data scientist's dream: you can follow a process that allows you to find whatever the data has to offer.

Choose Your Tools

Roger Peng, PhD, of Johns Hopkins is an articulate proponent of the R programming language.
We have our pick of tools, so let's use R to do the analysis. It's powerful, flexible, free, open-source data analysis software, and I love it. So does Roger Peng, PhD, of Johns Hopkins University, and he says why in this video that introduces a really fun course on R programming that you might want to take some day. I did, and it was excellent.

Installing R and RStudio

To learn how to install R and RStudio on your own computer so you can follow along with the data analysis step by step, please see this video.
To learn how to install R and RStudio on a Mac, see this video. Dr. Peng talks fast, so you may need to pause frequently as you follow along on your computer.

We will continue in this vein for our next few posts.



This time I'll let somebody else do the talking.

Every now and again you come across an answer to a question that is so clear that you want to bookmark it, so you can find it again the next time a similar question comes up. That just happened to me. I was googling "How are linear regression models, ANOVAs, and ANCOVAs related?" because it has been a long time since I had to do those equations by hand, and all I remembered was that I learned them together because they relied on the same basic formulae. What those formulae were, I had completely forgotten.

Thanks, SAS, SPSS, Python, and R, for doing all the work for me and letting my brain atrophy.

Here is an answer to my question that is so good that I'm giving it space on my blog. Thanks, Karen Grace-Martin, for tapping this out so the rest of us could have the benefit of your explanation.
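
For anyone who would like to see the connection in numbers, here is a minimal sketch in Python. It is my own illustration, not the linked answer, and the scores are made up. It fits a straight line to a 0/1 dummy variable for group membership and compares the result with a textbook one-way ANOVA on the same two groups. All function and variable names are invented for this example.

```python
from statistics import mean

# Made-up scores for two groups (illustrative only, not real data).
group_a = [4.0, 5.0, 6.0, 5.0]
group_b = [7.0, 8.0, 9.0, 8.0]

y = group_a + group_b
x = [0.0] * len(group_a) + [1.0] * len(group_b)  # dummy code: 0 = A, 1 = B


def ols_fit(x, y):
    """Least-squares line for one predictor: returns (intercept, slope)."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


def regression_f(x, y):
    """F statistic for the slope in a one-predictor regression."""
    intercept, slope = ols_fit(x, y)
    fitted = [intercept + slope * xi for xi in x]
    y_bar = mean(y)
    ss_model = sum((fi - y_bar) ** 2 for fi in fitted)
    ss_resid = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    return (ss_model / 1) / (ss_resid / (len(y) - 2))


def anova_f(*groups):
    """One-way ANOVA F statistic computed from group means."""
    all_values = [v for g in groups for v in g]
    grand = mean(all_values)
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)


intercept, slope = ols_fit(x, y)
print(intercept, mean(group_a))              # intercept is the mean of group A
print(slope, mean(group_b) - mean(group_a))  # slope is the difference in means
print(regression_f(x, y), anova_f(group_a, group_b))  # same F either way
```

The regression and the ANOVA split the same total variation into the same two pieces, which is why the F statistics agree. An ANCOVA is the same regression with a continuous covariate added as another predictor.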

Sunday, April 1, 2018

Designing a Study: First Steps with our Horse and Cart


The following is a completely fake study that I've made up out of thin air.

Let's say we have 38 teachers and 400 adult students in a military school that teaches survival skills in hot combat situations. We want the students to learn everything they can in a very short amount of time (training soldiers is expensive), and we have begun to wonder whether some of the school's pedagogical methods are detracting from or adding to how much and how quickly the trainees learn.

Here are some of our questions:
  • Do students learn better in mixed experience-level classes than they do in classes where all the students are at the same level?
  • Do students learn better in classes with certain types of activities?
  • Do teachers who have seen the video How to Avoid Death by Powerpoint tend to produce better-performing students?
  • At what time of day do students learn best?
How exactly do you go about answering these questions? The world is at your fingertips. You can create surveys for teachers and surveys for students; you can access any data source collected by the institution, including grades, instructor comments about students, student evaluations of instructors, and (I am making this up) injury and death rates for previous students.

This may seem like a morass of information to slog through for quite a complicated study, and it is. I have come across numerous data sets, sets of questions, and study designs in my work for Anovisions. Over the years, I have developed a method for getting my head around a study like this, and you might find it useful as you take your study from research question to final design. 

Here are the first steps:
  1. Find out what the entity paying for the study has as constraints in terms of time and money. What are the deadlines? How much of your analysis time can they afford? In this case, we have three months and a budget that rules out any qualitative (read: "expensive and time-consuming") analysis.
  2. Define the main research question(s). In this case, the main research question is "How can we change our pedagogical methods to produce better outcomes for our students?" All possible answers should be explored, including the answer, "None of your considered changes will help much."
  3. Find out about any variables your client may be considering. In this case, our client is not thinking, "Hm, I wonder about this variable or that variable?" Clients rarely think like that. But they often have excellent questions that you can convert into variables. In this case, we have a few variables that come to mind:
    • Grades (we'll call this a scalar variable for now)
    • Survival time in the field (scalar)
    • Number of injuries in training (scalar)
    • Number of injuries in the field (scalar)
    • Student experience level (scalar or categorical, to be decided once our plan is further developed)
    • Instructor status regarding How to Avoid Death by Powerpoint (binary, Y/N)
    • Types of class activities (categorical)
    • Instructor evaluations (we'll need to find a way to take lots of string data and compress it to a numerical value, so let's call this scalar for now)
    • Time of day for class
    • . . . 
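
One way to keep a provisional list like this honest is to write it down as a small data dictionary. The following is only a sketch in Python: the column names are hypothetical, invented for this fake study, and the measurement labels simply repeat the provisional ones from the list above. The list is open-ended, so the dictionary is too.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variable:
    name: str          # hypothetical column name, invented for this sketch
    measurement: str   # provisional type taken from the list above
    note: str = ""


variables = [
    Variable("grades", "scalar"),
    Variable("field_survival_time", "scalar"),
    Variable("training_injuries", "scalar"),
    Variable("field_injuries", "scalar"),
    Variable("experience_level", "undecided", "scalar or categorical"),
    Variable("seen_powerpoint_video", "binary", "Y/N"),
    Variable("class_activity_type", "categorical"),
    Variable("instructor_evaluation", "scalar", "needs coding from text"),
    Variable("class_time_of_day", "undecided", "no type assigned yet"),
]


def group_by_measurement(vars_):
    """Map each measurement label to the variable names that carry it."""
    grouped = {}
    for v in vars_:
        grouped.setdefault(v.measurement, []).append(v.name)
    return grouped


for measurement, names in group_by_measurement(variables).items():
    print(f"{measurement}: {', '.join(names)}")
```

Grouping by measurement label makes it easy to see which variables still need a decision before any study type is chosen.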
"But wait," you say. "What kind of study are you doing? A randomized trial? A case study? Survival analysis, even? No, a cohort study, or--yes!  It's a cross-sectional study, right?"

"Ah," say I. "You have recently been to graduate school."

Actually, no, I don't say that, at least not out loud. I say, "Don't put the cart in front of the horse," an adage that, though frequently used and perhaps stale, is difficult to replace with a more modern pithy saying while retaining the same wisdom. What I mean is, if you try to define the study type before you thoroughly examine all the client's questions and available data, you will never be able to deliver what the client needs (in the cart, ha ha).

At this point in your process, it is essential that you engage in one of the most overlooked and important steps in any study design.

You walk away, take a break, have a coffee, call someone you love, and otherwise stop thinking about it for a few minutes or even overnight (the process of sleep does wonders for analytical tasks). When you come back, you will perform better for having taken your break.

So now I'll take mine, and we'll continue this subject in the next blog post. Don't forget to sign up for notifications so you don't miss it. 

Friday, March 9, 2018

Cool Transcription Tool that's Free AND Reliable!

I don't usually do free ads for folks, but I just have to mention oTranscribe, which I've been using for the past couple of weeks to do a transcription for a qualitative analysis. It's free to use oTranscribe on a small scale (I'm transcribing a single interview that runs an hour and 45 minutes).

oTranscribe is one of those tools that seems to have read your mind about what you need. They save your work for you so that all you have to do is go to "otranscribe.com" and there is your project, right where you left off. If you're paranoid about saving things, like I am, they have an export function. I have multiple copies of the transcription that I've saved all along in my own folder, but that's just doubling up, because oTranscribe has also saved each of my copies in its "History" function. They do not store your audio recording, so you have to upload that each time, but I don't want them storing my audios anyway. oTranscribe also has keys for starting and stopping (the same key, a toggle, which goes back one second each time you pause), for slowing the playback, for speeding it up, for dropping in a time stamp, and many other nice features that you may or may not need.

The feature I like best is that I can re-map my function keys. Instead of using F1 to slow down playback, I use down arrow. Instead of F2 for "go back a few seconds," I use left arrow. It's incredibly easy to use.

I highly recommend this product. You can find it at www.oTranscribe.com.