Practical Stats

Monday, August 20, 2018

Data Visualization Meets Grizzly Bear

Scientists such as Anders Ynnerman and his team have been able to use a CAT scan to acquire images of entire human bodies (with all their systems), then project the images onto a touch screen the size of a table as long as a person, where they can be manipulated, iPhone style, through touch. Users can zoom in and out, spin the images, look at cross-sections, even zoom through tissue to see what lies beneath.

They also have another instrument that is like a pen that can be used to point to specific parts of the body that one can see through what looks like a microfilm reader. The pen operates with an algorithm that makes the user feel the surfaces through the pen: so if you push on an image of the skull of the person in the CAT scan machine, you'll feel yourself push through the skin and the tissues surrounding the skull. If you push harder, you'll feel like you're pushing through bone, then into and through the brain, with precise real-time images of all the coils and blood vessels of the brain appearing as you push through them. If you push on an image of someone's chest and then through or past their ribs, you can rest the pen against the image of the heart as it beats and feel its rhythms through the pen with amazing accuracy. Imagine the way such simulation will affect training the surgeons of the future!

Scientists are in the process of scanning animals like lions and bears (quite a challenge to get even a sleeping bear into a CAT scan machine) so that our knowledge of the details of animal anatomy can approach our knowledge of human anatomy, making veterinarians everywhere--especially wildlife veterinarians--very happy.

That was four years ago. What are these scientists up to now? Coining terms like "exploranation."

Monday, June 25, 2018

But Wait a Sec . . . Is that really the Best Way?

A Less-than-Ideal Example

After I wrote the previous blog post, I had a good think about a couple of things. One was that my situation with Anovisions occasionally provides less-than-ideal entry points into studies. I often enter a study long after many of the critical first decisions about data gathering, data type, and even research questions have been made by the people who have decided to hire a statistician. My methods for getting my head around a study, which I started to describe in the previous blog post, are therefore rather ad hoc and perhaps not the best methods for researchers in environments where they have more control over the methods.

At this point in the conversation, therefore, I'm going to go in a couple of different directions. Let's have a look at a process that is more or less ideal for a data explorer, and after that, let's go back to how to manage when you are required to be the statistics expert entering in the middle of an exploration that has already been begun by people who are savvy in their businesses but not necessarily trained as data scientists or statisticians.

An Ideal Example

In this example, you have just been handed data and you have been asked to see what you can learn from it. It's a data scientist's dream: you can follow a process that allows you to find whatever the data has to offer.

Choose Your Tools

Roger Peng, PhD, of Johns Hopkins is an articulate proponent of the R programming language.

We have our pick of tools, so let's use R to do the analysis. It's powerful, flexible, free, open-source data analysis software, and I love it. So does Roger Peng, PhD, of Johns Hopkins University, and he says why in this video that introduces a really fun course on R programming that you might want to take some day. I did, and it was excellent.

Installing R and RStudio

To learn how to install R and RStudio on your own computer so you can follow along with the data analysis step by step, please see this video.

To learn how to install R and RStudio on a Mac, see this video. Dr. Peng talks fast, so you may need to pause frequently as you follow along on your computer.

We will continue in this vein for our next few posts.

This time I'll let somebody else do the talking.

Every now and again you come across an answer to a question that is so clear that you want to bookmark it, so that you can find it again if something like the original question comes up. That just happened to me. I was googling "How are linear regression models, ANOVAs, and ANCOVAs related?" because it has been a long time since I had to do those equations by hand, and all I remembered was that I learned them together because they relied on the same basic formulae. What those formulae were, I had completely forgotten.

Thanks, SAS, SPSS, Python, and R, for doing all the work for me and letting my brain atrophy.

Here is an answer to my question that is so good that I'm giving it space on my blog. Thanks, Karen Grace-Martin, for tapping this out so the rest of us could have the benefit of your explanation.
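As a minimal sketch of that family resemblance (an illustration only, not Karen Grace-Martin's explanation), the Python below runs a two-group ANOVA and a linear regression on a 0/1 dummy variable and compares the F statistics. The scores are made up, and the function names anova_f and regression_f are hypothetical helpers written for this example. It is in Python rather than R only to keep it free of dependencies.

```python
from statistics import mean

# Made-up scores for two groups, for illustration only.
group_a = [4.0, 5.0, 6.0, 5.5, 4.5]
group_b = [6.0, 7.0, 8.0, 7.5, 6.5]


def anova_f(groups):
    """One-way ANOVA F statistic from between- and within-group sums of squares."""
    all_values = [v for g in groups for v in g]
    grand_mean = mean(all_values)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)


def regression_f(x, y):
    """F statistic for a simple linear regression of y on a single predictor x."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    fitted = [intercept + slope * xi for xi in x]
    ss_model = sum((f - y_bar) ** 2 for f in fitted)
    ss_error = sum((yi - f) ** 2 for yi, f in zip(y, fitted))
    # One predictor, so 1 model degree of freedom and n - 2 error degrees of freedom.
    return (ss_model / 1) / (ss_error / (len(y) - 2))


# Dummy-code group membership: 0 for group A, 1 for group B.
x = [0.0] * len(group_a) + [1.0] * len(group_b)
y = group_a + group_b

f_anova = anova_f([group_a, group_b])
f_regression = regression_f(x, y)
print(f_anova, f_regression)
```

With a dummy predictor, the regression's fitted values are just the two group means, so both routes split the variation the same way and the two F statistics agree (both are 16 for these numbers, up to floating-point rounding). An ANCOVA is the same regression with a continuous covariate added alongside the dummy.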

Sunday, April 1, 2018

Designing a Study: First Steps with our Horse and Cart

The following is a completely fake study that I've made up out of thin air.

Let's say we have 38 teachers and 400 adult students in a military school that teaches survival skills in hot combat situations. We want the students to learn everything they can in a very short amount of time (training soldiers is expensive), and we have begun to wonder whether some of the school's pedagogical methods are detracting from or adding to how much and how quickly the trainees learn.

Here are some of our questions:

Do students learn better in mixed experience-level classes than they do in classes where all the students are at the same level?
Do students learn better in classes with certain types of activities?
Do teachers who have seen the video How to Avoid Death by PowerPoint tend to produce better-performing students?
At what time of day do students learn best?

How exactly do you go about answering these questions? The world is at your fingertips. You can create surveys for teachers and surveys for students; you can access any data source collected by the institution, including grades, instructor comments about students, student evaluations of instructors, and (I am making this up) injury and death rates for previous students.

This may seem like a morass of information to slog through for quite a complicated study, and it is. I have come across numerous data sets, sets of questions, and study designs in my work for Anovisions. Over the years, I have developed a method for getting my head around a study like this, and you might find it useful as you take your study from research question to final design.

Here are the first steps:

Find out what the entity paying for the study has as constraints in terms of time and money. What are the deadlines? How much of your analysis time can they afford? In this case, we have three months and a budget that limits us from doing any qualitative (read: "expensive and time consuming") analysis.
Define the main research question(s). In this case, the main research question is "How can we change our pedagogical methods to produce better outcomes for our students?" All possible answers should be explored, including the answer, "None of your considered changes will help much."
Find out about any variables your client may be considering. In this case, our client is not thinking, "Hm, I wonder about this variable or that variable?" Clients rarely think like that. But they often have excellent questions that you can convert into variables. In this case, we have a few variables that come to mind:

Grades (we'll call this a scalar variable for now)
Survival time in the field (scalar)
Number of injuries in training (scalar)
Number of injuries in the field (scalar)
Student experience level (scalar or categorical, to be decided once our plan is further developed)
Instructor status regarding How to Avoid Death by PowerPoint (binary, Y/N)
Types of class activities (categorical)
Instructor evaluations (we'll need to find a way to take lots of string data and compress it to a numerical value, so let's call this scalar for now)
Time of day for class
. . .
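One way to keep a list like this from getting lost is to write it down as a small codebook. The Python sketch below is only an illustration for this fake study: the variable names and the group_by_type helper are hypothetical, and the types are copied from the list above, with "time of day" left unassigned because the list does not give it one.

```python
# Hypothetical codebook for the made-up study. The keys are invented names;
# the types are the provisional ones from the list above.
variables = {
    "grades": "scalar",
    "survival_time_in_field": "scalar",
    "injuries_in_training": "scalar",
    "injuries_in_field": "scalar",
    "student_experience_level": "undecided (scalar or categorical)",
    "instructor_saw_death_by_powerpoint": "binary",
    "class_activity_types": "categorical",
    "instructor_evaluations": "scalar",  # provisional: text must be compressed to a number first
    "class_time_of_day": None,  # no type assigned yet
}


def group_by_type(codebook):
    """Group variable names by their provisional type."""
    grouped = {}
    for name, var_type in codebook.items():
        key = var_type if var_type is not None else "unassigned"
        grouped.setdefault(key, []).append(name)
    return grouped


by_type = group_by_type(variables)
for var_type, names in by_type.items():
    print(f"{var_type}: {', '.join(names)}")
```

Grouping by type makes the open decisions easy to spot: anything under "undecided" or "unassigned" still needs a choice before the study type is settled.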

"But wait," you say. "What kind of study are you doing? A randomized trial? A case study? Survival analysis, even? No, a cohort study, or--yes! It's a cross-sectional study, right?"

"Ah," say I. "You have recently been to graduate school."

Actually, no, I don't say that, at least not out loud. I say, "Don't put the cart in front of the horse," which adage, though frequently used and perhaps stale, is difficult to replace with a more modern pithy saying while retaining the same wisdom. What I mean is, if you try to define the study type before you thoroughly examine all the client's questions and available data, you will never be able to deliver what the client needs (in the cart, ha ha).

At this point in your process, it is essential that you engage in one of the most overlooked and important steps in any study design.

You walk away, take a break, have a coffee, call someone you love, and otherwise stop thinking about it for a few minutes or even overnight (the process of sleep does wonders for analytical tasks). When you come back, you will perform better for having taken your break.

So now I'll take mine, and we'll continue this subject in the next blog post. Don't forget to sign up for notifications so you don't miss it.

Friday, March 9, 2018

Cool Transcription Tool that's Free AND Reliable!

I don't usually do free ads for folks, but I just have to mention oTranscribe, which I've been using for the past couple of weeks to do a transcription for a qualitative analysis. It's free to use oTranscribe on a small scale (I'm doing a single interview that runs an hour and 45 minutes).

oTranscribe is one of those tools that seems to have read your mind about what you need. They save your work for you so that all you have to do is go to "otranscribe.com" and there is your project, right where you left off. If you're paranoid about saving things, like I am, they have an export function. I have multiple copies of the transcription that I've saved all along in my own folder, but that's just doubling up, because oTranscribe has also saved each of my copies in its "History" function. They do not store your audio recording, so you have to upload that each time, but I don't want them storing my audios anyway. oTranscribe also has keys for starting and stopping (the same key, a toggle, which goes back one second each time you pause), for slowing the playback, for speeding it up, for dropping in a time stamp, and many other nice features that you may or may not need.

The feature I like best is that I can re-map my function keys. Instead of using F1 to slow down playback, I use down arrow. Instead of F2 for "go back a few seconds," I use left arrow. It's incredibly easy to use.

I highly recommend this product. You can find it at www.oTranscribe.com.

Friday, October 10, 2014

How to Get the Most from Your Statistics Professional

Here are a few tips for making the most of your hard-earned (or painfully borrowed) money when you hire a statistics or editing professional to help you with your project:

Communicate via brief emails when you can, phone if you must. Almost without exception, phone calls take more time than is necessary to answer brief questions. The emails are also a great reference when you need to prepare for a presentation or defense.
Once you have turned over your project for editing or statistics recommendations, stop working on it. Controlling document versions is a great way to make sure your professional doesn't have to backtrack or spend precious time integrating your work with theirs.
Answer questions quickly so the project doesn't lose momentum.
Make sure due dates for specific tasks are clearly delineated. If you are using Basecamp or something like it, put your deadlines on the project's calendar yourself and invite your consultant so they can see your expectations.

Friday, January 3, 2014

Common Errors People Make in their Research: Power

About 80% of what crosses my desk for editing contains either no power analysis or the wrong type of power analysis.

What's power? Put simply, it's your chance of finding something that's actually there: the probability that your test detects an effect when the effect is real.

Why should you publish it? Because if you have found nothing in your analysis (p is too high for you to claim that you have a statistically significant finding), then you can evaluate the power and do one of the following:

If power was low, explain that, due to a low N, you may have missed something that is actually there.
If power was low, you can also use this point in the discussion section to call for more research into your area, especially if p was less than, say, .25.
If power was high, fortify your claim that the null result means something: an effect of the size you were looking for would probably have shown up.

Does power relate to the entire study or to a particular analysis? The latter.

How do I figure out power before I begin? Do an a priori power analysis.

How do I figure it out after I've finished? Not surprisingly, do a post hoc power analysis.

What the heck does that mean? "A priori" = "before." "Post hoc" = "after."

How are they different? Usually if you are doing an a priori analysis you are trying to figure out how many subjects you need. And usually if you are doing a post hoc analysis you are trying to figure out, given the number of subjects you had, how much power you had. I usually include both.
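To make the two directions concrete, here is a rough Python sketch for one common case: comparing the means of two equal-sized groups with a two-sided test. It uses the textbook normal approximation, not an exact t-based calculation, and the effect size (Cohen's d = 0.5), alpha (.05), and target power (.80) are assumptions chosen for illustration. The function names are hypothetical.

```python
from math import ceil, sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal: mean 0, standard deviation 1


def a_priori_n_per_group(effect_size, alpha=0.05, power=0.80):
    """Subjects needed per group (normal approximation, two-sided test)."""
    z_alpha = STD_NORMAL.inv_cdf(1 - alpha / 2)
    z_power = STD_NORMAL.inv_cdf(power)
    return ceil(2 * ((z_alpha + z_power) / effect_size) ** 2)


def post_hoc_power(effect_size, n_per_group, alpha=0.05):
    """Approximate power for a given per-group N (normal approximation).

    Ignores the negligible chance of rejecting in the wrong direction.
    """
    z_alpha = STD_NORMAL.inv_cdf(1 - alpha / 2)
    return STD_NORMAL.cdf(effect_size * sqrt(n_per_group / 2) - z_alpha)


# A priori: how many subjects do I need to detect an assumed d of 0.5?
n_needed = a_priori_n_per_group(effect_size=0.5)

# Post hoc: given the N I actually had, how much power did I have?
achieved = post_hoc_power(effect_size=0.5, n_per_group=n_needed)

print(n_needed, round(achieved, 3))
```

The same formula is being solved in two directions: fix the power and solve for N, or fix N and solve for power. Because this is an approximation, a dedicated tool that works from the t distribution will typically ask for slightly more subjects, so treat these numbers as a sanity check.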

What's the best way to do a power analysis? Download G*Power. It's my favorite tool for power analysis.

As always, if you need help with your power analysis, contact us at 802-382-7349 or visit us online at www.anovisions.com.