data&donuts
  • Data & Donuts (thinky thoughts)
  • COLLABORATor
  • Data talks, people mumble
  • Cancer: The Brand
  • Time to make the donuts...
  • donuts (quick nibbles)
  • Tools for writers and soon-to-be writers
  • datamonger.health
  • The "How" of Data Fluency

hello data
I visualize data buried in non-proprietary healthcare databases

Statistical "rigour" mortis

3/31/2015

 
I think it is important to revisit statistical concepts from time to time, especially when the same misconceptions are replicated from client to client. Following the inaugural health economics and outcomes research (HEOR) writing skills workshop for medical writers last week in Philadelphia, my colleague presented Terminology and Jargon Demystified for Medical Writers. I thought I had a herculean task answering the "What is value, and why does it matter?" question in a limited conference time slot, but in this case I got off easy. She had to address the unique methodology and statistics specific to HEOR.

Quite often when leading workshops, being precious with our planning and cinematic presentation style, we all tend to forget one thing: the fundamentals are where we lose most of our colleagues.

Exploring the normal distribution
In medical education data, I notice a preference for the t-test. Unfortunately, appropriate application of the t-test assumes that the difference between a response of 1 and a response of 2 is exactly the same as the difference between a response of 4 and a response of 5, for example. On a Likert scale that is not usually a reasonable assumption, but it is one made quite often.

For comparing pre- and post-intervention responses on a Likert scale, a nonparametric test such as the Wilcoxon signed-rank test is typically more appropriate (McNemar's test applies when the responses are dichotomized, e.g., agree vs. disagree). To further complicate the analytics, there are assumptions often applied but perhaps not actually relevant. For example, we all assume that any difference in scores is due solely to exposure to an intervention like an educational activity.
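As a toy illustration of treating paired Likert responses as ordinal rather than interval, here is a sign test (a simpler cousin of the Wilcoxon signed-rank test) using nothing but the standard library. The pre/post responses below are made up for the example, not real workshop data.

```python
# Illustrative sketch (standard library only): a sign test on paired
# pre/post Likert responses (1-5). Like the Wilcoxon signed-rank test,
# it uses only the direction of each paired change, so it does not
# assume the 1-5 codes are evenly spaced the way a paired t-test does.
from math import comb

pre  = [2, 3, 1, 2, 3, 2, 4, 2, 3, 1, 2, 3]
post = [3, 4, 2, 2, 4, 3, 5, 3, 4, 2, 3, 4]

diffs = [b - a for a, b in zip(pre, post) if b != a]  # drop ties
n = len(diffs)
pos = sum(d > 0 for d in diffs)

# Two-sided exact binomial p-value under H0: P(improvement) = 0.5
k = min(pos, n - pos)
p_value = min(1.0, 2 * sum(comb(n, i) for i in range(k + 1)) / 2**n)

print(f"{pos} of {n} non-tied pairs improved; sign-test p = {p_value:.4f}")
# prints: 11 of 11 non-tied pairs improved; sign-test p = 0.0010
```

Note that even this tidy result says nothing about *why* the scores moved, which is the attribution assumption flagged above.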

Understanding what we mean by normally distributed data is paramount to selecting appropriate tests and measures. I suggest asking a lot of questions, because most of us aren't born with an inherent understanding of categorical and ordinal variables.
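The distinction can be made concrete with a short simulation, a minimal standard-library sketch with invented responses: individual 5-point answers are discrete and bounded (clearly not normal), yet the means of many such answers start to look bell-shaped, which is why so many tests lean on the normal distribution.

```python
# Simulated 5-point responses: the raw responses are roughly uniform
# over 1..5 (not bell-shaped), but means of samples of 30 cluster
# tightly around the population mean of 3.
import random
from collections import Counter

random.seed(42)
responses = [random.randint(1, 5) for _ in range(10_000)]
print(Counter(responses))  # roughly equal counts for 1..5

sample_means = [
    sum(random.choices(responses, k=30)) / 30 for _ in range(2_000)
]
mean_of_means = sum(sample_means) / len(sample_means)
print(f"mean of sample means: {mean_of_means:.2f}")  # close to 3
```

The takeaway for test selection: normality is a property you check (or earn via averaging), not one you assume for raw ordinal scores.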

Khan Academy does a nice job of providing a good base, or try one of the Coursera classes.

Do you have to p? Are p-values really necessary?

Because the base rate of effective cancer drugs is so low – only 10% of our hundred trial drugs actually work – most of the tested drugs do not work, and we have many opportunities for false positives. If I had the bad fortune of possessing a truckload of completely ineffective medicines, giving a base rate of 0%, there is a 0% chance that any statistically significant result is true. Nevertheless, I will get a p<0.05 result for 5% of the drugs in the truck.

You often hear people quoting p values as a sign that error is unlikely. “There’s only a 1 in 10,000 chance this result arose as a statistical fluke,” they say, because they got p=0.0001. No! This ignores the base rate, and is called the base rate fallacy. Remember how p values are defined:

The p value is defined as the probability, under the assumption of no effect or no difference (the null hypothesis), of obtaining a result equal to or more extreme than what was actually observed. A p value is calculated under the assumption that the medication does not work and tells us the probability of obtaining the data we did, or data more extreme. It does not tell us the chance that the medication is effective.

When someone uses their p values to say they’re probably right, remember this. Their study’s probability of error is almost certainly much higher. In fields where most tested hypotheses are false, like early drug trials (most early drugs don’t make it through trials), it’s likely that most “statistically significant” results with p<0.05 are actually flukes.

--Statistics Done Wrong
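The base-rate arithmetic in the quoted passage is easy to check with a quick simulation. The numbers here are illustrative assumptions, not from any real trial program: 1,000 candidate drugs, a 10% base rate of truly effective drugs, 80% power, and the usual 5% significance threshold.

```python
# Simulate the base rate fallacy: how many "statistically significant"
# results are flukes when only 10% of tested drugs actually work?
import random

random.seed(1)
n_drugs, base_rate, power, alpha = 1_000, 0.10, 0.80, 0.05

true_pos = false_pos = 0
for _ in range(n_drugs):
    works = random.random() < base_rate
    # A trial comes up "significant" with probability = power if the
    # drug works, and probability = alpha if it does not.
    significant = random.random() < (power if works else alpha)
    if significant:
        if works:
            true_pos += 1
        else:
            false_pos += 1

fdr = false_pos / (true_pos + false_pos)
print(f"{true_pos} true positives, {false_pos} flukes")
print(f"share of 'significant' results that are flukes: {fdr:.0%}")
```

With these assumptions, roughly a third of the p<0.05 results are flukes, even though every single trial used a "95% confidence" threshold. That gap between alpha and the actual error rate is the base rate fallacy in action.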


Those poor p-values. They got swept up in the statistical fervor as an easy way to validate research findings. They didn't mean to mislead, but many of us who look at data, whether in our own analytics or while scouring peer-reviewed research, are often astounded by the misconceptions about their significance (pun intended). And remember: statistical significance doesn't often translate to clinical significance.

This is a topic we will return to over and over again: misconceptions in data analytics, and information to help un-muddy the waters. Shoot me an email or a comment if you have specific questions.
