As human beings, we live in stories. It doesn’t matter how quantitative you are, we’re all influenced by stories. They become like statistics in our mind. So if you report the statistics without the story, you don’t get nearly the level of interest or emotion or willingness to engage with the ideas.--Rebecca Goldin
But I did learn a lot: about data sources such as the Integrated Public Use Microdata Series (IPUMS), about a wide variety of resources for learning R, and about how to work heteroscedasticity into conversations about data.
Fast forward a few years, and I noticed something interesting: there really aren't many industry-adjacent attendees at these conferences. The audience is mostly statisticians and biostatisticians, with few data-centric professionals like me. But here is the rub: if healthcare industry professionals aren't engaged with your sessions, who are they for? So I ask the questions, and I learn more than I ever gleaned from a statistics course.
STATS Sense About Stats is a resource you don't want to forget. I use their services when I work out a complex model and want to make sure it makes sense. Not to be confused with STAT--a news source that puts all of its meaningful reporting behind a paywall--STATS assists at no charge.
Statistics is for all of us--otherwise what is the point?
If you are thinking it doesn't matter to your work--think again. We make a lot of assumptions about p-values and what they mean, not only in our own research but in the clinical research we consume to make decisions at the point of care. What if those are the wrong assumptions?
"In logical terms, the P value tests all the assumptions about how the data were generated (the entire model), not just the targeted hypothesis it is supposed to test (such as a null hypothesis). Furthermore, these assumptions include far more than what are traditionally presented as modeling or probability assumptions—they include assumptions about the conduct of the analysis, for example that intermediate analysis results were not used to determine which analyses would be presented."--Statistical tests, P values, confidence intervals, and power: a guide to misinterpretations.
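The quoted warning can be made concrete with a small simulation (my own sketch, not from the paper, using numpy and scipy). Here the null hypothesis of equal means is actually true, but one modeling assumption behind Student's t-test--equal variances--is violated, and "significant" p-values show up far more often than the nominal 5%:

```python
# Sketch: a p-value tests the whole model, not just the null hypothesis.
# Both comparisons below have equal means (the null is TRUE), so every
# p < 0.05 result is a false positive.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n_sims, alpha = 2000, 0.05

false_pos_ok = 0   # assumptions met: both groups have SD 1
false_pos_bad = 0  # assumption violated: the small group has 3x the SD

for _ in range(n_sims):
    a_ok, b_ok = rng.normal(0, 1, 10), rng.normal(0, 1, 100)
    a_bad, b_bad = rng.normal(0, 3, 10), rng.normal(0, 1, 100)
    # equal_var=True is the pooled (Student) t-test, which assumes
    # homoscedasticity -- the assumption the second comparison breaks.
    false_pos_ok += stats.ttest_ind(a_ok, b_ok, equal_var=True).pvalue < alpha
    false_pos_bad += stats.ttest_ind(a_bad, b_bad, equal_var=True).pvalue < alpha

rate_ok = false_pos_ok / n_sims   # near the nominal 0.05
rate_bad = false_pos_bad / n_sims  # far above 0.05
```

The group sizes, standard deviations, and seed are my own illustrative choices; the point is only that a small p-value can reflect a broken assumption rather than a false null.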
Data literacy is important. It is even more important today if you communicate your findings, thoughts, or ideas. Data rarely give us absolute answers. We need to discover probabilities and communicate the uncertainty around the answers we seek. For example, if the number needed to treat (NNT) is 100--what happens to the other 99? I am reminded of The Princess Bride and often have to repeat, "I do not think it means what you think it means."
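To make the NNT arithmetic concrete, here is a minimal sketch (the event rates are my own illustrative numbers): NNT is the reciprocal of the absolute risk reduction between the control and treatment arms.

```python
# Hypothetical illustration: NNT = 1 / absolute risk reduction (ARR).
import math

control_event_rate = 0.20    # 20% of untreated patients have the event
treatment_event_rate = 0.19  # 19% of treated patients have the event

arr = control_event_rate - treatment_event_rate  # absolute risk reduction
nnt = math.ceil(1 / arr)  # round up: you can't treat a fraction of a patient
```

With an NNT of 100, treating 100 patients prevents the event in roughly one of them; the other 99 get no benefit on this particular outcome, which is exactly the uncertainty we owe our audience.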
Here is what I learned in the last few weeks alone:
- Charts are arguments
- Maps represent territory not people
- Ecological fallacy--findings at one level of aggregation may not apply at another
- Don’t simplify. Clarify
- Country data can’t be extrapolated to individual level
- Simpson's Paradox--trends can reverse when data from different groups are combined
- Charts are tools--they extend our brains
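Simpson's Paradox deserves a worked example. Here is a small sketch using the classic kidney-stone treatment numbers (the well-known Charig et al. illustration, included only to show the mechanics): treatment A has the higher success rate within every subgroup, yet the lower rate once the subgroups are combined, because the treatments were given to very different mixes of easy and hard cases.

```python
# Classic kidney-stone illustration of Simpson's Paradox:
# (successes, total) per treatment, split by stone size.
data = {
    "A": {"small": (81, 87),   "large": (192, 263)},
    "B": {"small": (234, 270), "large": (55, 80)},
}

def rate(successes, total):
    return successes / total

# Within each subgroup, A beats B.
for size in ("small", "large"):
    assert rate(*data["A"][size]) > rate(*data["B"][size])

# Combine the subgroups and the conclusion reverses:
overall = {
    t: rate(sum(s for s, _ in d.values()), sum(n for _, n in d.values()))
    for t, d in data.items()
}
# overall["A"] is about 0.78, overall["B"] about 0.83 -- B now looks better.
```

The reversal is why "charts are arguments": the same counts support opposite conclusions depending on the level at which you aggregate them.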
Like, you might be interested in knowing whether taking hormones is helpful or harmful to women who are postmenopausal. So you start out with a question that’s really well-defined: Does it help or hurt? But you can’t necessarily answer that question. What you can answer is the question of whether women who take hormones whom you enroll in your study — those specific women — have an increase or decrease in, say, heart disease rates or breast cancer rates or stroke rates compared to a control group or to the general population. But that may not answer your initial question, which is: “Is that going to be the case for me? Or people like me? Or the population as a whole?”--Rebecca Goldin