January can be restorative following the chaos of the holidays, but it is often equal parts annoying and cloying as well. You can't swing a pencil without hitting tomes about how to boost your sales in 2020, new books that expand the listicle mindset, or other distractions designed to separate you from your money.
I am going to whisper a little secret: it won't work. What works is hard work. Really hard work. You must consume more than you create. Read, listen, and watch everything. Jerry Saltz described the ordinary brilliantly--"generic but ambitious". I follow Jerry for many reasons--most importantly for his lessons on observation. Yes, I could read the stacks of books falling like manna from the heavens about data visualization. But that is too narrow a focus. I like edges from other perspectives. I subscribe to ARTFORUM to learn how to consume information. My reality will never be yours, and what worked for me will never work the same way for you, but the hard work will.
I recommend listening to Jerry's interview on the Longform Podcast. He became an art critic in his 40s after a career as a truck driver...his humility and cleverness are and should be industry agnostic. Let me know what you think.
A poem by Wallace Stevens describes my thoughts exactly. This excerpt shares the awareness that 20 men will have 20 unique experiences and will perceive the bridge AND the village uniquely. Read the poem in its entirety here.
Metaphors of a Magnifico by Wallace Stevens (1879-1955)
A recent article in the New England Journal of Medicine, reported on by the New York Times, reminded me of those vapid real-estate reality shows. A couple looks at a perfectly serviceable home in need of a bit of sweat equity but declares the house a failure--"Oh, I could not live with that wallpaper, can you show us something else?".
In the case of the medical literature I don't know whom to blame first, the shoddy media coverage or the confusing study design. The wallpaper can be changed, folks.
The study has quite robust inclusion/exclusion criteria for starters.
Assuming that most readers of the article are simply going to think the program was unsuccessful, here are a few tools (below) I am pulling into the data literacy workshop so we can continue to question questions while also questioning answers. Follow along for insights from the workshop. I don't want to reveal too much here as discovery and discussion will be in "real time."
CMS defines readmission rates within a 30-day time-frame in order to capture the events most likely associated with the index admission. I am assuming, then, that a 180-day time-frame is a bit noisier.
"Readmission and death rates are measured within 30 days, because readmissions and deaths after a longer time period may have less to do with the care gotten in the hospital and more to do with other complicating illnesses, patients’ own behavior, or care provided to patients after hospital discharge."--Medicare.gov
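To see why the window matters, here is a minimal sketch of flagging a readmission against a 30-day versus a 180-day window. It is deliberately simplified and is not the CMS measure specification: the function name `is_readmission` and the dates are hypothetical, and it ignores real-world rules such as planned readmissions and transfers.

```python
from datetime import date


def is_readmission(discharge: date, next_admission: date, window_days: int = 30) -> bool:
    """Return True if the next admission falls within `window_days` of discharge.

    Simplified assumption: any admission in the window counts, whatever its cause.
    """
    gap = (next_admission - discharge).days
    return 0 <= gap <= window_days


# Hypothetical patient: discharged from the index admission on January 10, 2020.
discharge = date(2020, 1, 10)
early_return = date(2020, 2, 5)    # 26 days after discharge
late_return = date(2020, 4, 20)    # 101 days after discharge

for label, admitted in [("early", early_return), ("late", late_return)]:
    print(
        label,
        "30-day:", is_readmission(discharge, admitted, 30),
        "180-day:", is_readmission(discharge, admitted, 180),
    )
```

The later admission only counts under the 180-day window, which is exactly the kind of event that may have less to do with the original hospital stay.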
Hospital Compare datasets
Dissecting racial bias in an algorithm used to manage the health of populations
What does race have to do with it (link to discussion of bias)
Graphical presentation of confounding in directed acyclic graphs
Health Care Hotspotting--A Randomized, Controlled Trial
Supplement to: Finkelstein A, Zhou A, Taubman S, Doyle J. Health care hotspotting — a randomized, controlled trial. N Engl J Med 2020;382:152-62. DOI: 10.1056/NEJMsa1906848
Trouble accessing clinical research articles? Try Sci-Hub...
A recent study, Dissecting racial bias in an algorithm used to manage the health of populations, sheds some light on how to reform the way we evaluate interventions targeted toward social determinants of health.
The study examined an algorithm widely used in hospital systems and found that, at the 97th percentile of the risk score--the qualifier for being referred for additional supports--black patients were considerably sicker than white patients. Although healthcare cost appears to be an effective measure for flagging patients for intervention, the racial bias it introduces is evident.
When the algorithm score is based on the number of comorbid conditions rather than medical expenditure, a more equitable referral pattern is observed at the 97th percentile.
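The mechanics of "same threshold, different label" can be sketched in a few lines. This is a toy illustration only, not the study's algorithm or data: the `Patient` fields, the ten invented rows, and the `refer_top` helper are all hypothetical, and the cutoff here is the top 20% (rather than the 97th percentile) simply because the toy table has ten rows.

```python
from typing import Callable, NamedTuple


class Patient(NamedTuple):
    pid: str
    annual_cost: float        # hypothetical dollars spent on care
    chronic_conditions: int   # hypothetical count of comorbid conditions


def refer_top(patients: list[Patient], score: Callable[[Patient], float], fraction: float) -> set[str]:
    """Refer the top `fraction` of patients ranked by the chosen score."""
    k = max(1, round(len(patients) * fraction))
    ranked = sorted(patients, key=score, reverse=True)
    return {p.pid for p in ranked[:k]}


# Invented rows: some patients are sick but generate little spending.
patients = [
    Patient("A", 9000, 2), Patient("B", 8500, 1), Patient("C", 4000, 6),
    Patient("D", 3000, 5), Patient("E", 2500, 3), Patient("F", 1000, 0),
    Patient("G", 1500, 1), Patient("H", 2200, 2), Patient("I", 500, 0),
    Patient("J", 3500, 4),
]

by_cost = refer_top(patients, lambda p: p.annual_cost, 0.2)
by_conditions = refer_top(patients, lambda p: p.chronic_conditions, 0.2)

print("Referred when ranking by cost:", sorted(by_cost))
print("Referred when ranking by conditions:", sorted(by_conditions))
```

The threshold never moves; only the thing being ranked changes, and with it the list of who gets help.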
I have reviewed the supplemental data from the "Hot-spotting" article and will be reviewing it in more detail in future posts. I don't want to "spoil" the end for folks enrolled in the workshop! But stay tuned...
The authors are planning additional analyses and I anticipate insights regarding improvements in a wider variety of patient populations. Often being ambitious is not the only goal--we need to be curious--not generic.
Future data workshop "buckets" to highlight Envision2030
The writer is an explorer. Every step is an advance into new land--Ralph Waldo Emerson
I wanted to create an end-of-year tome that would encourage and dare I say inspire. But like everything in the last few months--the best of intentions are strewn across the laundry room floor--or quite possibly that is actual laundry.
I am not a big fan of listicles or self-adorned professionals insisting they know best or can help you with your journey. There are no shortcuts period. Do the work. If that doesn't yield results, keep doing the work. I probably could have stretched that into an e-book of a decent length but I prefer getting to the joke quickly and succinctly.
The reason I ran over 1000 miles last year is because most people don't. I get up before dawn--most people don't. I don't eat meat and rarely drink. I don't read fiction. I meditate while swimming a slow silent mile. We all have little levers and oddities that yield an advantage in how we live our professional lives or even personal lives for that matter.
I also don't accept advertising or have a "real" job. I watch people with platforms tell us how to improve or create our work from the security of a 9 to 5 gig that pays the bills. I am basically unemployed or unemployable by most metrics but I am booked through March 2020. See what I did there?
I write books, I speak from the podium, I teach data workshops, and as of 2020 I will be fading away much of the contract work. Why? It's time to take the training wheels off of the bicycle. I have been learning. Enrolled in bootcamps, online executive education programs, and teaching data literacy. I learned to code in Python and dusted off my R code expertise. If you are going to call yourself an analyst you better have the skills to get the job done.
My expertise is in defining data questions, sourcing data, data modeling, curating insights, and weaving a compelling narrative. I am not promising a definitive answer but I am promising that I can hold up my end of the conversation.
Here is a list of some of the best decisions I made in 2019
1619--create an environment where you can question why you include race as a checkbox on your surveys or data collection.
I can't provide links because they are migrating everything to their new platform. But you can search by these titles:
You Should Write a Book
LSE Public Lectures and Events
Learning from Data: the art of statistics
Approaching the close of a busy year of travel and public speaking I am reminded of a gap in my narrative. I lack an elevator speech. Now don't get me wrong, I can certainly share my latest data insights or a particularly interactive workshop that I loved giving but if I had to say one thing--just one thing--what would it be?
I want it to be unique and free of attachment. A sort of "cosmic giggle" that recognizes the futility of stringing together the right order of words that will resonate with someone. I seriously have meditated on this "hindrance" and only discovered its lack of relevance.
How does "recovering medical writer that got tired of focusing on client profit motives instead of point of care decisions that matter, so studied applied analytics and data visualization" sound?
I don't like how the typical descriptors hang flat in the air, "data analyst", "data visualization professional", "insight analyst" and why pick one over the other? I am reminded of the five hindrances of Buddhism. These negative mental states are what many of us sit with through mindful meditation. Maybe it's just me but I see how seamlessly they apply to impeding not only our meditation practice but also how we walk through our business life. Ram Dass speaks of radio channels and frequencies of engagement but I don't want to get too woo woo. If this sort of thing is intriguing to you here is a link to Who Are You.
Here is the wrinkle. We are instructed to use key words and tags that will make us discoverable. What if you enjoy your work but simultaneously don't want to adapt to the flotsam of "being picked"? Seth Godin says it best, “The next time you catch yourself being average when you feel like quitting, realize that you have only two good choices: Quit or be exceptional. Average is for losers.”
I would also be interested to hear if your knee-jerk response to "So what do you do?" changes as constantly as mine does. It feels so dynamic in real life. Especially if you spend a nontrivial amount of time engaged in ongoing conversations. Talking to peers and colleagues aligns different edges to your thought processes.
A recent Twitter post from a colleague described what I do quite succinctly. I write about topics of interest in the data healthcare sphere and people hire me to talk about it. This blog wasn't designed to be that sort of a platform but here we are.
Now, it’s interesting how subtle and yet how formal our identities are, and how much we’re attached to them, because of how much we are used to our cards of identity.
To be honest with you, I have been sitting in this metaphysical realm since hearing about a few devastating diagnoses in my circle of friends. It made me contemplate my own identity. Am I simply my awareness? Can I make choices separate from my "identity"? Does your work life get a free pass simply because we have sold out to things that we think matter?
What are your business goals? Clicks and shares? Monetizing? SEO? No thanks. I want to be able to sit down and write about the tensions or complexities in healthcare, health economics, and health policy. We all know how to Kardashian-ize a headline or create titillating content. Count me out.
I find graphics like the one published by Vox to be mesmerizing and worthy of deeper analyses and consideration. If more of us were data literate and fluent, we would be empowered to question answers and delve a little deeper. That is going to be my 2020. My focus will be to share more tools, shape more data explorations, and guide us all into collectively doing a little better.
My free newsletter will continue to focus on many of these principles. This blog will continue to gather insights and share careful considerations. Although many of us continue to work over the holiday weeks, we share a history with many of the curious that preceded our humble efforts.
The quote that originated the phrase "Comfort the Afflicted, Afflict the Comfortable" is from 1902. According to the Quote Investigator, the humorist Finley Peter Dunne would write pronouncements in an Irish dialect. A rendering from the website states the following in standard spelling...
The newspaper does everything for us. It runs the police force and the banks, commands the militia, controls the legislature, baptizes the young, marries the foolish, comforts the afflicted, afflicts the comfortable, buries the dead and roasts them afterward.
If you think of newspapers from 1902 as the 2019 "media" the parallels remain even if often they are tongue in cheek. We have a responsibility to not simply serve our own interests but to dig deeper and offer granularity. The devil is indeed in the details and what better way to afflict the comfortable?
A popular workshop this year introduces a wide variety of learners to data literacy and visualization. Speaking at the Women in Tech Summit I shared how I came across the title for this year's talk--data talks, meople pumble. I wasn't sure how a talk embedded in a Tableau Story viz would be received by several hundred techy women but come to find out--it was standing room only.
We might not be aware but everyone is a data person. You either create, curate, or consume data. I think it is imperative that we become literate and aware.
The last few public experiences I have had were with women audiences: a powerful discussion with Angela Saini about her latest book, Superior: The Return of Race Science, sponsored by 500 Women Scientists NYC; the recent Women in Tech Summit; and next year's Fifth Annual Women’s Economic Development Network Leadership Forum, VISION 2025: Tools for the Future.
The energy in these gatherings is tremendous. I think this might be my preferred method of communicating about data although I have an active archive of print articles (and these blog posts) that I access on a regular basis. There is something intoxicating about having conversations about data--that people don't want to be having.
I am talking to you personalized medicine.
It isn't just about the cells--it's about the society.
Look. I am excited about new discoveries in medicine. But I can still acknowledge that the small-molecule discoveries--at least the major ones--are likely behind us.
I bristle when I hear the term "innovation" attributed to pharmacology and solutions for complex diseases. True innovation would be paying attention to the upstream causes of the rise of disease chronicity and inevitability.
Point research dollars toward understanding how our environments influence disease instead of beating investors into a venture fund froth at the idea of expanding markets. Those markets and data points are actually people.
This video is creepy but also timely and relevant to memetics and epigenetic theories that continue to evolve. I stumbled upon it after something said in a podcast I discuss below. Once you watch it you might simultaneously see how bizarre this is for a leaked internal video by Google but yet, is it really?
I see the image of donuts being "bathed" in sprinkles and I see a business. Of course the obvious deliciousness of Holtman's Donuts but I think of donuts as ideas--moments of pleasure--and the sprinkles are the data. Not all of it. Simply the curated and carefully selected information to help start a conversation.
My business model is simple. When I am on the road I am listening to conversations around the halls of the National Press Club and the Brookings Institution, and I recently attended Applying Big Data to Address the Social Determinants of Health in Oncology: A National Cancer Policy Forum Workshop In Collaboration with the Committee on Applied and Theoretical Statistics.
When I am lucky enough to be working out of my home office I start most days with a morning run with Fred--my hound. I queue up podcasts from a broad range of interests and arrive home filled with new "sprinkles".
I am waiting to share a full post as I am holding a workshop this week--you can sign up here Big Data on a Less Big Budget (meanwhile, I don't want to give away the goods) and speaking at Women In Tech Summit next week. But the serendipity of my chosen topic from the Tableau Fringe Festival and a common thread of data skill workshops was surprisingly realized in both the National Cancer Policy Forum and this morning's podcast from my 9-mile run.
The LSE Public Lectures are a dynamic series on timely topics where I am able to consider at length discussions of economics and political science that inform global discourse on a wide variety of topics. Today the discussion was around ordinal citizenship or even better--eigencapital.
I imagine this as an application of eigenvalues and vectors--"An eigenvalue is a number, telling you how much variance there is in the data in that direction, in the example above the eigenvalue is a number telling us how spread out the data is on the line."
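To make the quoted idea concrete, here is a minimal sketch in plain Python using invented points that sit exactly on the line y = 2x. It computes the 2x2 sample covariance matrix and its two eigenvalues with the closed-form formula for a symmetric 2x2 matrix; the data and the function name `covariance_eigenvalues` are hypothetical, chosen only for illustration.

```python
from math import sqrt
from statistics import mean


def covariance_eigenvalues(xs: list[float], ys: list[float]) -> tuple[float, float]:
    """Eigenvalues (largest first) of the 2x2 sample covariance matrix of (xs, ys)."""
    n = len(xs)
    mx, my = mean(xs), mean(ys)
    var_x = sum((x - mx) ** 2 for x in xs) / (n - 1)
    var_y = sum((y - my) ** 2 for y in ys) / (n - 1)
    cov_xy = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (n - 1)

    # For a symmetric 2x2 matrix the eigenvalues follow from trace and determinant.
    half_trace = (var_x + var_y) / 2
    det = var_x * var_y - cov_xy ** 2
    spread = sqrt(max(half_trace ** 2 - det, 0.0))
    return half_trace + spread, half_trace - spread


# Invented data lying exactly on a line: all the variance runs in one direction.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.0, 6.0, 8.0, 10.0]

largest, smallest = covariance_eigenvalues(xs, ys)
print("variance along the line:", largest)
print("variance across the line:", smallest)
```

The large eigenvalue is the spread along the line and the small one is the spread away from it, which is zero here because the points never leave the line.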
My undergraduate students have been having a tough time understanding qualitative ordinal and nominal data types. I was gobsmacked and more than just a little interested to listen to a podcast on ordinal citizenship. I found it compelling and a novel foundation to introduce variable selection when measuring and defining relationships like poverty, race, and yes--even citizenship.
"As digital technologies have enabled a broadening of economic and social incorporation, the possibilities for classifying, sorting, slotting and scaling people have also grown and diversified. New ways of measuring and demonstrating merit have sprung up, some better accepted than others. Institutions, both market and state, find themselves compelled to build up and exploit this efficient, proliferating, fine-grained knowledge in order to manage individual claims on resources and opportunities."
Although this slide was part of yesterday's discussions it comes to mind for linking all the disparate thoughts informing my analyses this afternoon. In healthcare we talk a lot about value-based care and "quality".
But are we prepared to explore the social risk factors that influence patient outcomes beyond the type of care the patients receive?
Big Data on a Less Big Budget
When you become data curious you begin to notice the vagueness of descriptors and measures in the clinical and political literature. I see surveys where the very thing they intend to measure lacks a definition. Or headlines lacking clarification of what the numerator and denominator might be in their claim. Alarming reductions and increases in benefits or risks lack context but still our heart rates increase and we lean into the new normal of agitation and fear.
We can become data literate and seek the answers we need. As an analyst, if I am asked to discuss poverty for example, this requires a longer, deeper and often existential conversation.
We have advanced in our understanding of the complexity of poverty. Not unlike how we talk about health and social correlates of health, poverty is not simply one thing. When we analyze a numerical variable and consider those values as representative of our analyses we are being short-sighted. The categorical variables need to be included as well.
"...deprivation comes in many forms and use a new multi-dimensional measure that not
The U.S. income data determines the poverty level and from there federal aid is indexed and provided for a variety of programs such as SNAP, health insurance, etc... "a lot of power for a statistic that is out of touch with reality" as described by the New York Times back in 2001.
A simple tool within US Census called Abacus will let you examine 2 variables at the US or state level over a specific time on your mobile device!
I rely on the American Community Survey to explore expanded measures of poverty. I prefer to run my analyses outside of the IPUMS website and rely on Python or Tableau Prep to quickly clean up data from the larger dataset.
You can observe below how the expanded Household-Economic Characteristics contribute additional granularity to discussions of poverty--beyond a simple calculation that allocates a household either above or below an arbitrary income cut-off.
I plan to dig into the census data more and more over the coming months. It is an under-utilized resource in our discussions about healthcare inequality and a granular understanding of the societal poverty line and determinants of health.
*I love the musings of what data is from this website--key concepts in information and knowledge--defining data. I am aware of datum as singular and data as plural but defer to common usage.
This is the time of year I start gearing up for conference presentations (Women in Tech Summit) and prepping class material (Understanding Data at a local university and Intro to Tableau at a local tech college).
I don't exactly look in typical places for story themes or ideas for introducing data literacy to populations with varying degrees of educational maturity. I believe in taking the complex and technical and making it more approachable.
So imagine my joy at finding this quirky but charming HBO special by Julio Torres, My Favorite Shapes. It is pretty much about Julio's favorite shapes (bet you didn't see that coming) but oh so much more.
We are relaunching the newsletter. You can subscribe here.
The quote at the center of the discussion is an important one,
"Open data is what the government wants you to know. Freedom of information requests are for what they don't want you to know. The things you can't FOI because they don't collect them are what they really don't want you to know." Anna Powell-Smith
What does this mean? When we formulate a data question we need to explore datasets for access and suitability. What do you do if you discover a gap in data collection? Often, I notice we just pull in the low hanging fruit. Race stands in for social constructs or worse--a lousy biologic proxy, poverty is simply a numeric value, and social determinants (correlates) are neglected or not interpreted in a meaningful or reproducible manner.
These are the conversations we need to be having.
I usually discover misleading data collection and reporting the same way everybody else does.
An eye-catching headline or something similar limps across a social media feed--not unlike a wounded animal in the wild--the eye is drawn.
In this recent instance, it was the claim that 60% of healthcare executives say they use predictive analytics. Clearly the headline referred to this recent report from the Society of Actuaries--2019 Predictive Analytics in Health Care Trend Forecast.
I have "0" familiarity with the Society of Actuaries but I loosely know what their professional responsibilities include.
Wikipedia defines an actuary as "a business professional who deals with the measurement and management of risk and uncertainty".
Let's embed the discussion of this report alongside a few steps to improve your survey game and post survey analytics, shall we?
Full disclosure, I have not seen the survey and for all I know some of these practices were implemented under the hood but if they were--why not say so?
Define what you are measuring. How do you know if predictive analytics are being used in your organization if you aren't presented with a baseline definition of predictive analytics or how you would like respondents to consider whether it is being used or not and in what capacity?
Report the raw numbers. When percentages are reported without accompanying numerators and denominators, how are we to evaluate whether the findings can be extrapolated to the real world? Perhaps this only applies to the 201* respondents to the survey. How many received the survey? If it was sent to 20,000 health payer and provider executives, how reflective are the findings of this single survey of the larger group?
*stated on last page of report--100 health payer executives and 101 health provider executives were interviewed.
What is a health provider executive? I know what I think they are. Are they defined the same way across organizations?
You can't compare percentages from one year to the next without stating the raw numbers. Clearly the number of respondents varies from year to year so how are we intended to evaluate a 13-percentage point increase from last year? Or a 6-point increase from 2017?
How are they using the predictive analytics and are they comparable across organizations? What exactly are they trying to predict? Employee retention? Per member per month (PMPM) capitation payments?
You need specificity in your outcomes. "Nearly two-thirds of executives (61%) forecast that predictive analytics will save their organization 15% or more over the next five years."
Save their organization what?
Who are these organizations that are saying no to predictive analytics? Isn't that the foundational algorithm embedded in healthcare outcomes forecasting?
In the absence of a workable definition of predictive analytics presented to the respondents--what can we say for certain?
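The "report the raw numbers" point can be sketched with a little arithmetic. This is a minimal example under stated assumptions: the count of 121 is reconstructed from the rounded 60% of 201 respondents (the report does not give the numerator), the 20,000 figure is the hypothetical mailing size posed above, and `wilson_interval` is a helper written here using the standard Wilson score formula for a proportion.

```python
from math import sqrt


def wilson_interval(successes: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Approximate 95% Wilson score interval for a proportion (z = 1.96)."""
    p = successes / n
    denom = 1 + z ** 2 / n
    center = (p + z ** 2 / (2 * n)) / denom
    half_width = z * sqrt(p * (1 - p) / n + z ** 2 / (4 * n ** 2)) / denom
    return center - half_width, center + half_width


respondents = 201                      # stated on the last page of the report
users = round(0.60 * respondents)      # ASSUMPTION: rebuilt from the rounded 60%
surveys_sent = 20_000                  # HYPOTHETICAL: the report does not say

low, high = wilson_interval(users, respondents)
print(f"{users}/{respondents} say they use predictive analytics")
print(f"plausible range for the wider group: {low:.1%} to {high:.1%}")
print(f"response rate if {surveys_sent:,} were sent: {respondents / surveys_sent:.1%}")
```

With only a percentage in hand, none of these three lines can be written, which is the whole argument for publishing the numerator and denominator. Note that the interval only speaks to sampling noise; it says nothing about who chose not to respond.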
To be honest with you, I don't even know what I am looking at in this graphic below. No idea. So I am simply going to skip it. Don't report low-value information. Not everything needs a graphic.
My confusion continues with the next graphic. Costs of what? You need to measure specifically in order to know if costs were reduced. And what does "Staffing/workforce needs Clinical outcomes" mean? Is it a typo? Is part of the chart missing?
At first glance I would also make an assumption that the "actual results overall" are negative? Why? The color choice. Red in a chart can be misinterpreted because we all arrive at graphicacy with our own perceptions and biases.
Another problem of not seeing the actual survey--I am not sure if these choices are ranked or asked in a Likert or multiple choice format. Ranking would be preferred and add value even if I don't quite understand the responses. For example, what question wording would have you respond "Data visualization"?
If the future of predictive capabilities question yielded "Data visualization" at 23%, what are you all looking at now?
Ranking questions can yield probabilities but any other format would be reporting descriptive statistics only.
This report--although well intentioned--lacks clarity that seems at odds with the work of an actuarial organization.
My point isn't to blame but to demonstrate how we can all do better--myself included.
We need to slow down and take the time to make sure that we aren't introducing unsolvable paradoxes with our own data collection.
After all, when an unstoppable force meets an immovable object, the laws of physics are quite boring. Don't let your survey become a black hole...
If you see me talking to myself, do not disturb, I'm having a staff meeting.
Professionally there has been a stall or two in a few projects--one really hit home. A rare disease of personal interest to me. A lovely neighbor, curator of ancient art, was misdiagnosed and then diagnosed with a rare Parkinsonian type disease. Pharma was interested until it wasn't deemed profitable, no funding, no molecule.
Many colleagues imagine it is only sunshine and smiles in the independent world of digital media. My two passions of writing and analytics speak to each other through my work. Leaving one or the other out of the conversation results in a phantom limb of sorts. But often they are at odds.
Requests to write are often saddled with datasets not up to the task at hand--regardless of client enthusiasm. When I decline, I know there will be an endless pool of writers only too happy to cash the check regardless of the chasm between ethics and action.
You wanna fly, you got to give up the shit that weighs you down -- Toni Morrison
August brings into sharp focus and a furious boil everything I've been listening to in the late spring and summer--Henry Rollins
Perhaps I can blame it on fewer distractions but I do my best reading when faced with long stretches of quiet and solitude. I have a stack of books to get through, several from a course I am teaching in the fall at a local university--Understanding Data.
Maya Angelou is credited with saying something like, "When you know better, you do better." I do a fair share of pro bono work teaching the basics of survey design and how to clean data upstream from the fancy dashboard visualizations that everyone is clamoring for. And I spend a non-trivial amount of time learning how to think about analysis.
When you have been working in a field like medical writing or healthcare consulting you realize the secret sauce is scalability. Do you want to be a data mechanic constantly repairing and fixing poorly designed questions or do you crave higher level collaborations? I have noticed many colleagues falling in with the status quo and giving their clients what they have asked for. Poorly designed or articulated outcomes questions from multiple choice surveys or a data question only interested in probing or interrogating a single source of data.
You can dance around to the music provided to you like a little monkey--or become curious. Is there a better way? Am I measuring what I think I am measuring or just grabbing low hanging bananas?
The process, even with Tableau Prep, can be quite laborious, but is the juice worth the squeeze? I would say yes for prepping your data--but maybe not when formatting your survey data.
The perfect format for most analytics is ranking data. We want to be able to create a hierarchy of sentiment not only between questions but also between respondents. In Likert, a 5 or even a 7 response shouldn't be compared to a similar response on a different question. Just because a respondent selects Strongly Agree for example can we make assumptions that the degree of agreement here is the same as on a different question?
No. No we can't. But if we use probabilities like the ones we can generate from asking respondents to rank responses--we now know how they prioritized their behavior or sentiment.
And ranking or rating questions can be more straightforward to analyze.
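Here is a minimal sketch of what ranking data makes possible. Everything in it is hypothetical: the four respondents, the three items they ranked, and the helper names `first_choice_share` and `prefers`. The idea is simply that each respondent's ordering can be turned into shares and head-to-head proportions.

```python
from collections import Counter

# Each inner list is one hypothetical respondent's ranking, most important first.
rankings = [
    ["cost", "access", "quality"],
    ["access", "cost", "quality"],
    ["cost", "quality", "access"],
    ["quality", "cost", "access"],
]


def first_choice_share(rankings: list[list[str]]) -> dict[str, float]:
    """Proportion of respondents who ranked each item first."""
    counts = Counter(ranking[0] for ranking in rankings)
    return {item: count / len(rankings) for item, count in counts.items()}


def prefers(rankings: list[list[str]], a: str, b: str) -> float:
    """Proportion of respondents who ranked item `a` above item `b`."""
    wins = sum(1 for ranking in rankings if ranking.index(a) < ranking.index(b))
    return wins / len(rankings)


print("first-choice shares:", first_choice_share(rankings))
print("share ranking cost above quality:", prefers(rankings, "cost", "quality"))
```

A column of "Strongly Agree" answers cannot tell you which item a respondent would give up first; a ranking can, because every respondent was forced to prioritize.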
Do you have questions regarding cleaning data and what tools are available?
Join us on August 14th for a Healthcare Tableau User Group meeting. You can register below...
In a world of "evidence-based" medicine I am a bigger fan of practice-based evidence.
Remember the quote by Upton Sinclair...
“It is difficult to get a man to understand something, when his salary depends upon his not understanding it!”