This video is creepy but also timely and relevant to memetics and epigenetic theories that continue to evolve. I stumbled upon it through something said in a podcast I discuss below. Once you watch it you might simultaneously see how bizarre this is for a leaked internal Google video--but then again, is it really?
I see the image of donuts being "bathed" in sprinkles and I see a business. Of course the obvious deliciousness of Holtman's Donuts but I think of donuts as ideas--moments of pleasure--and the sprinkles are the data. Not all of it. Simply the curated and carefully selected information to help start a conversation.
My business model is simple. When I am on the road I listen to conversations around the halls of the National Press Club and the Brookings Institution, and recently I attended Applying Big Data to Address the Social Determinants of Health in Oncology: A National Cancer Policy Forum Workshop in Collaboration with the Committee on Applied and Theoretical Statistics.
When I am lucky enough to be working out of my home office I start most days with a morning run with Fred--my hound. I queue up podcasts from a broad range of interests and arrive home filled with new "sprinkles".
I am waiting to share a full post as I am holding a workshop this week--you can sign up here: Big Data on a Less Big Budget (meanwhile, I don't want to give away the goods)--and speaking at the Women In Tech Summit next week. But the serendipity of my chosen topic from the Tableau Fringe Festival and a common thread of data skill workshops was surprisingly realized in both the National Cancer Policy Forum and this morning's podcast from my 9-mile run.
The LSE Public Lectures are a dynamic series on timely topics where I can consider, at length, discussions around economics and political science serving a global discourse on a wide variety of subjects. Today the discussion was around ordinal citizenship--or, even better, eigencapital.
I imagine this as an application of eigenvalues and eigenvectors--"An eigenvalue is a number telling you how much variance there is in the data in that direction; in the example above the eigenvalue is a number telling us how spread out the data is on the line."
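To make that concrete, here is a minimal sketch in plain Python with toy data: points scattered along the line y = x. The larger eigenvalue of the covariance matrix captures the spread along that line, the smaller one the spread perpendicular to it. The dataset and numbers are invented for illustration.

```python
import math
import random

# Toy 2-D dataset scattered mostly along the line y = x
rng = random.Random(0)
points = []
for _ in range(500):
    t = rng.gauss(0, 3)      # large spread along the line
    e = rng.gauss(0, 0.5)    # small spread off the line
    points.append((t + e, t - e))

# Sample covariance matrix [[sxx, sxy], [sxy, syy]]
n = len(points)
mx = sum(x for x, _ in points) / n
my = sum(y for _, y in points) / n
sxx = sum((x - mx) ** 2 for x, _ in points) / (n - 1)
syy = sum((y - my) ** 2 for _, y in points) / (n - 1)
sxy = sum((x - mx) * (y - my) for x, y in points) / (n - 1)

# Eigenvalues of a symmetric 2x2 matrix, in closed form
mean = (sxx + syy) / 2
delta = math.sqrt(((sxx - syy) / 2) ** 2 + sxy ** 2)
lam_big, lam_small = mean + delta, mean - delta

# The big eigenvalue measures spread along the line y = x,
# the small one measures spread perpendicular to it.
print(round(lam_big, 1), round(lam_small, 1))
```

With the spreads chosen above, the big eigenvalue lands near 18 and the small one near 0.5--"how spread out the data is on the line" versus off it.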
My undergraduate students have been having a tough time understanding qualitative ordinal and nominal data types. I was gobsmacked and more than just a little interested to listen to a podcast on ordinal citizenship. I found it compelling and a novel foundation to introduce variable selection when measuring and defining relationships like poverty, race, and yes--even citizenship.
"As digital technologies have enabled a broadening of economic and social incorporation, the possibilities for classifying, sorting, slotting and scaling people have also grown and diversified. New ways of measuring and demonstrating merit have sprung up, some better accepted than others. Institutions, both market and state, find themselves compelled to build up and exploit this efficient, proliferating, fine-grained knowledge in order to manage individual claims on resources and opportunities."
Although this slide was part of yesterday's discussions it comes to mind for linking all the disparate thoughts informing my analyses this afternoon. In healthcare we talk a lot about value-based care and "quality".
But are we prepared to explore the social risk factors that influence patient outcomes beyond the type of care the patients receive?
Big Data on a Less Big Budget
When you become data curious you begin to notice the vagueness of descriptors and measures in the clinical and political literature. I see surveys where the very thing they intend to measure lacks a definition. Or headlines lacking clarification of what the numerator and denominator might be in their claim. Alarming reductions and increases in benefits or risks lack context but still our heart rates increase and we lean into the new normal of agitation and fear.
We can become data literate and seek the answers we need. As an analyst, if I am asked to discuss poverty for example, this requires a longer, deeper and often existential conversation.
We have advanced in our understanding of the complexity of poverty. Not unlike how we talk about health and social correlates of health, poverty is not simply one thing. When we analyze a numerical variable and consider those values as representative of our analyses we are being short-sighted. The categorical variables need to be included as well.
"...deprivation comes in many forms and use a new multi-dimensional measure that not
The U.S. poverty level is determined from income data, and from there federal aid is indexed and provided for a variety of programs such as SNAP, health insurance, etc.--"a lot of power for a statistic that is out of touch with reality," as the New York Times described it back in 2001.
A simple tool within US Census called Abacus will let you examine 2 variables at the US or state level over a specific time on your mobile device!
I rely on the American Community Survey to explore expanded measures of poverty. I prefer to run my analyses outside of the IPUMS website and rely on Python or Tableau Prep to quickly clean up data from the larger dataset.
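As an illustration of that clean-up step, here is a minimal Python sketch. The column names and sentinel codes are stand-ins, not the actual ACS/IPUMS variable definitions--the point is only the pattern: recode "N/A" style codes to missing before you analyze anything.

```python
import csv
import io

# Hypothetical extract with IPUMS-style coded columns (the column
# names and sentinel codes are illustrative, not the real ones).
raw = io.StringIO(
    "STATEFIP,POVERTY,HHINCOME\n"
    "39,50,28000\n"
    "39,150,61000\n"
    "21,999,9999999\n"
)

SENTINELS = {"POVERTY": "999", "HHINCOME": "9999999"}  # "N/A" style codes

rows = []
for row in csv.DictReader(raw):
    # Recode sentinel values to None before any analysis
    for col, code in SENTINELS.items():
        if row[col] == code:
            row[col] = None
    # Flag households below 100% of the poverty threshold,
    # leaving missing values missing rather than silently False
    pov = row["POVERTY"]
    row["below_poverty"] = (int(pov) < 100) if pov is not None else None
    rows.append(row)

print([r["below_poverty"] for r in rows])  # [True, False, None]
```

The same recode-then-derive pattern carries over directly to a Tableau Prep flow.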
You can observe below how the expanded Household-Economic Characteristics contribute additional granularity to discussions of poverty--beyond a simple calculation that allocates a household either above or below an arbitrary income cut-off.
I plan to dig into the census data more and more over the coming months. It is an under-utilized resource in our discussions about healthcare inequality and a granular understanding of the societal poverty line and determinants of health.
*I love the musings of what data is from this website--key concepts in information and knowledge--defining data. I am aware of datum as singular and data as plural but defer to common usage.
This is the time of year I start gearing up for conference presentations (Women in Tech Summit) and prepping for class material (Understanding Data at a local university and Intro to Tableau at a local tech college).
I don't exactly look in typical places for story themes or ideas for introducing data literacy to populations with varying degrees of educational maturity. I believe in taking the complex and technical and making it more approachable.
So imagine my joy at finding this quirky but charming HBO special by Julio Torres,
My Favorite Shapes. It is pretty much about Julio's favorite shapes (bet you didn't see that coming) but oh so much more.
We are relaunching the newsletter. You can subscribe here.
The quote at the center of the discussion is an important one,
"Open data is what the government wants you to know. Freedom of information requests are for what they don't want you to know. The things you can't FOI because they don't collect them are what they really don't want you to know." Anna Powell-Smith
What does this mean? When we formulate a data question we need to explore datasets for access and suitability. What do you do if you discover a gap in data collection? Often, I notice we just pull in the low hanging fruit. Race stands in for social constructs or worse--a lousy biologic proxy, poverty is simply a numeric value, and social determinants (correlates) are neglected or not interpreted in a meaningful or reproducible manner.
These are the conversations we need to be having.
I usually discover misleading data collection and reporting the same way everybody else does.
An eye-catching headline or something similar limps across a social media feed--not unlike a wounded animal in the wild--the eye is drawn.
In this recent instance, it was the claim that 60% of healthcare executives say they use predictive analytics. Clearly the headline referred to this recent report from the Society of Actuaries--2019 Predictive Analytics in Health Care Trend Forecast.
I have no firsthand familiarity with the Society of Actuaries, but I loosely know what their professional responsibilities include.
Wikipedia defines an actuary as "a business professional who deals with the measurement and management of risk and uncertainty".
Let's embed the discussion of this report alongside a few steps to improve your survey game and post survey analytics, shall we?
Full disclosure, I have not seen the survey and for all I know some of these practices were implemented under the hood but if they were--why not say so?
Define what you are measuring. How do you know if predictive analytics are being used in your organization if you aren't presented with a baseline definition of predictive analytics or how you would like respondents to consider whether it is being used or not and in what capacity?
Report the raw numbers. When percentages are reported without accompanying numerators and denominators, how are we to evaluate whether the findings extrapolate to the real world? Perhaps this only applies to the 201* respondents to the survey. How many received the survey? If it was sent to 20,000 health payer and provider executives, how reflective are the findings of this single survey of the larger group?
*stated on last page of report--100 health payer executives and 101 health provider executives were interviewed.
What is a health provider executive? I know what I think they are. Are they defined the same way across organizations?
You can't compare percentages from one year to the next without stating the raw numbers. Clearly the number of respondents varies from year to year so how are we intended to evaluate a 13-percentage point increase from last year? Or a 6-point increase from 2017?
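To see why the raw numbers matter, here is a back-of-the-envelope sketch. It assumes the implied prior-year figure of 47% (60 minus 13) and roughly 200 respondents each year--neither of which the report actually confirms, which is exactly the problem.

```python
import math

def moe(p, n, z=1.96):
    """Approximate 95% margin of error for a proportion (normal approx)."""
    return z * math.sqrt(p * (1 - p) / n)

# "60% of executives" from 201 respondents sounds precise,
# but the margin of error is wide.
p, n = 0.60, 201
print(f"{p:.0%} +/- {moe(p, n):.1%}")  # 60% +/- 6.8%

# A 13-point year-over-year change can't be judged without both n's:
# with ~200 respondents each year, the two-sample uncertainty alone
# is nearly +/- 10 points.
p_prev, n_prev = 0.47, 201  # assumed, not stated in the report
se_diff = math.sqrt(p * (1 - p) / n + p_prev * (1 - p_prev) / n_prev)
print(f"13-point difference +/- {1.96 * se_diff:.1%}")
```

With different (unreported) denominators each year, even this rough bound is unknowable--which is the whole argument for publishing the raw counts.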
How are they using the predictive analytics and are they comparable across organizations? What exactly are they trying to predict? Employee retention? Per member per month (PMPM) capitation payments?
You need specificity in your outcomes. "Nearly two-thirds of executives (61%) forecast that predictive analytics will save their organization 15% or more over the next five years."
Save their organization what?
Who are these organizations that are saying no to predictive analytics? Isn't that the foundational algorithm embedded in healthcare outcomes forecasting?
In the absence of a workable definition of predictive analytics presented to the respondents--what can we say for certain?
To be honest with you, I don't even know what I am looking at in this graphic below. No idea. So I am simply going to skip it. Don't report low-value information. Not everything needs a graphic.
My confusion continues with the next graphic. Costs of what? You need to measure specifically in order to know if costs were reduced. And what does "Staffing/workforce needs Clinical outcomes" mean? Is it a typo? Is part of the chart missing?
At first glance I would also make an assumption that the "actual results overall" are negative? Why? The color choice. Red in a chart can be misinterpreted because we all arrive at graphicacy with our own perceptions and biases.
Another problem of not seeing the actual survey--I am not sure if these choices were ranked or asked in a Likert or multiple choice format. Ranking would be preferred and would add value even if I don't quite understand the responses. For example, what question wording would have you respond "Data visualization"?
If the future-of-predictive-capabilities question yielded "Data visualization" at 23%, what are you all looking at now?
Ranking questions can yield probabilities but any other format would be reporting descriptive statistics only.
This report--although well intentioned--lacks clarity that seems at odds with the work of an actuarial organization.
My point isn't to blame but to demonstrate how we can all do better--myself included.
We need to slow down and take the time to make sure that we aren't introducing unsolvable paradoxes with our own data collection.
After all, when an unstoppable force meets an immovable object, the laws of physics are quite boring. Don't let your survey become a black hole...
If you see me talking to myself, do not disturb, I'm having a staff meeting.
Professionally there has been a stall or two in a few projects--one really hit home. A rare disease of personal interest to me. A lovely neighbor, curator of ancient art, was misdiagnosed and then diagnosed with a rare Parkinsonian type disease. Pharma was interested until it wasn't deemed profitable, no funding, no molecule.
Many colleagues imagine it is only sunshine and smiles in the independent world of digital media. My two passions of writing and analytics speak to each other through my work. Leaving one or the other out of the conversation results in a phantom limb of sorts. But often they are at odds.
Requests to write are often saddled with datasets not up to the task at hand--regardless of client enthusiasm. When I decline, I know there will be an endless pool of writers only too happy to cash the check regardless of the chasm between ethics and action.
You wanna fly, you got to give up the shit that weighs you down -- Toni Morrison
August brings into sharp focus and a furious boil everything I've been listening to in the late spring and summer--Henry Rollins
Perhaps I can blame it on fewer distractions but I do my best reading when faced with long stretches of quiet and solitude. I have a stack of books to get through, several from a course I am teaching in the fall at a local university--Understanding Data.
Maya Angelou is credited with saying something like, "When you know better, you do better." I do a fair share of pro bono work teaching the basics of survey design and how to clean data upstream from the fancy dashboard visualizations that everyone is clamoring for. And I spend a non-trivial amount of time learning how to think about analysis.
When you have been working in a field like medical writing or healthcare consulting you realize the secret sauce is scalability. Do you want to be a data mechanic constantly repairing and fixing poorly designed questions or do you crave higher level collaborations? I have noticed many colleagues falling in with the status quo and giving their clients what they have asked for. Poorly designed or articulated outcomes questions from multiple choice surveys or a data question only interested in probing or interrogating a single source of data.
You can dance around to the music provided to you like a little monkey--or become curious. Is there a better way? Am I measuring what I think I am measuring or just grabbing low hanging bananas?
The process, even with Tableau Prep can be quite laborious but is the juice worth the squeeze? I would say yes for prepping your data--but maybe not when formatting your survey data.
The perfect format for most analytics is ranking data. We want to be able to create a hierarchy of sentiment not only between questions but also between respondents. In Likert, a 5 or even a 7 response shouldn't be compared to a similar response on a different question. Just because a respondent selects Strongly Agree, for example, can we assume that the degree of agreement is the same as on a different question?
No. No we can't. But if we use probabilities like the ones we can generate from asking respondents to rank responses--we now know how they prioritized their behavior or sentiment.
And ranking or rating questions can be more straightforward to analyze.
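A minimal sketch of what ranking data buys you: first-choice probabilities across respondents. The options and the five respondents' orderings below are invented for illustration.

```python
from collections import Counter

# Hypothetical rankings from five respondents: each list orders the
# options from most to least important.
options = ["cost", "clinical outcomes", "workflow fit"]
rankings = [
    ["cost", "clinical outcomes", "workflow fit"],
    ["clinical outcomes", "cost", "workflow fit"],
    ["cost", "workflow fit", "clinical outcomes"],
    ["cost", "clinical outcomes", "workflow fit"],
    ["clinical outcomes", "workflow fit", "cost"],
]

# Probability that each option is ranked first -- a prioritization
# that a grid of independent Likert agreement scores cannot give you.
first = Counter(r[0] for r in rankings)
probs = {opt: first[opt] / len(rankings) for opt in options}
print(probs)  # {'cost': 0.6, 'clinical outcomes': 0.4, 'workflow fit': 0.0}
```

From here you can go further (pairwise win rates, mean ranks), but even this simplest cut tells you how respondents prioritized, not just how warmly they agreed.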
Do you have questions regarding cleaning data and what tools are available?
Join us on August 14th for a Healthcare Tableau User Group meeting. You can register below...
My thoughts on running a small data and communication business are vinified products rendered from insights from Seth Godin. The pomace has been removed from the maceration, filtering and aging has transpired and to bring the enology analogy full circle--the wine is ready for sipping.
Here are 3 that I access daily--maybe even hourly:
1. You aren't making your products or offering your services for everyone. Stay true to yourself and in the face of rejection--whisper to yourself--"I didn't make this for you"
2. Don't bid on pricing. This has been useful in my work freelancing or working as an entrepreneur. Seth's thoughts (paraphrased by me) are it is a useless race to the bottom. I agree. I don't sell dollars. Clients need to value the tasks or questions they are trying to solve--and pay accordingly.
3. Now that you have your ducks in a row--what are you going to do with the duck--simply brilliant. I have even titled data talks with this exact phrase. As a thank you I gifted Seth a bronze duck when I met him at the On Being Gathering. He actually teared up and gave me a hug.
The theme of data gathering and sourcing was used by Cole in a podcast replay, Dataklubben, that I caught on my way to the gym. Think about our roles as data analysts. We gather/source data--lots of it. But what are we looking for? The single pearl of truth perhaps but in the process we shouldn't leave huge piles of oyster shells in our wake.
If certainty is truly the goal you will likely be disappointed.
Think how many oysters you would have to shuck or eat to potentially find a pearl.
Maybe what we really need is targeted well thought out questions--and well calculated probabilities.
I thought it was just me. I bristle at all of the pay walls, sponsored content, and monetization of conversations that matter. Or should matter. How can we have meaningful dialogue if access to the conversation is stratified by who is willing or able to pay to listen?
When I was a newbie medical writer my contract contained a requirement that I be provided with all requested references. A few earlier projects had been completed with me footing the bill or having to travel to a nearby medical library. For the first go-around, they would send the freely available articles but try and substitute the pay wall articles with just the abstracts. Nope. Not good enough.
I teach writers that the devil is in the details. And by details I mean methodology and results sections. What do you think the clinical landscape would look like if we only wrote about articles we could freely read, allowing paywalls to retain part (arguably, the majority) of research papers? I tell you this story simply to own up to one source of the triggering I feel when poked to pay up or look the other way.
Don't get me wrong, I have access to many journals through my press credentials, access to a fine local academic library, and even the National Press Club library. I also pay a few Patreon accounts, news subscriptions (NY Times), and a few others like Paris Review and Harper's Weekly.
What I don't have is a bottomless wallet to pay for vanity podcast subscriptions (I don't mind listening to your ad roll for free access), digital news aggregators, and for profit content morphing into sponsored content. I remember bumping into a sparkly new journalist at one of the big online news sites. We had gathered for a medical research discussion and he said he was so excited to have been hired. Admitting he knew nothing about medicine or healthcare he shared that the hiring team had told him--"that's okay, we can teach you what you need to know."--gasp.
We build too many walls and not enough bridges--Isaac Newton
A recent podcast by Manoush Zomorodi and Jen Poyant shared the not so common plight of a creative voluntarily shutting down her business. Spoiler alert--she refused to lower quality, commoditize her contributions, or become another monetized blog selling out for profit. Amen. They close up shop in August but go take a look at what our new "money, money, money" mentality is costing us.
Her website is Design*Sponge and it will be missed. Here is an article that integrates nicely with where this article is headed: The Cost of Being Disabled, written by Imani Barbarin of the blog Crutches & Spice--discussions from the intersection of disability, race, gender, and media.
We all know why the temptation exists. The money can be eye popping. I was able to avoid turning to stone by leaving pharma and starting my own data and writing consultancy. I do okay financially but not the same depth of okay as when I wrote what I was asked. period.
I couldn't return to an era where I didn't know the harms being perpetuated in lock-step with the good. The data large and small companies did not want to include. The cursory distortion of data insights toward marketing and away from actual science or unbiased discovery was hard to miss.
I read the article by Christopher Booth, MD, a medical oncologist and recognized the duplicity. You might be sacrificing your reputation at the moment you decide to change your industry ties from "none" to "some".
...Since that time he has had no relationships with industry. Moreover, he now “sees” industry influence in almost all facets of patient care, medical education, clinical research, and even certification exams (in which the correct answers are based on pharmaceutical funded guidelines).--From the $80 hamburger to managing conflicts of interest with the pharmaceutical industry
Not sure why there are few teeth in the discussion of clinical trial data and how to teach all of us scalable literacy to inform or observe what should or shouldn't be happening at the point of care. What are some of the distortions we find when we become aware of industry influence in clinical trials?
In summary, we have found that modern RCTs in breast cancer, NSCLC, and CRC are substantially larger and more international in scope than those of earlier decades. Although methodology and quality of reporting seems to be improving over time, serious deficiencies persist, particularly in the identification of the primary end point and by not including all randomly assigned patients in ITT analyses. There has been a substantial shift toward industry sponsorship of oncology RCTs. Over the past 30 years, authors’ endorsement of novel therapies has increased while relative effect size has remained stable.
Before I teach data literacy workshops on how to read clinical literature--I begin with the history. You need to understand how effect sizes and p-values can be influenced by the sheer increase in sample sizes, how overpowered studies can make spurious associations seem larger, and the rise of the surrogate endpoint. Dig a little deeper and you can appreciate the evolution from little or no industry sponsorship of clinical trials (1990s) to upwards of 90% now funded by industry.
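To make the sample-size point concrete, here is a small illustrative sketch (not drawn from any actual trial): a clinically trivial one-point difference in response rates sails past the conventional significance threshold once the arms get large enough.

```python
import math

def two_prop_z(p1, p2, n1, n2):
    """Two-proportion z-test statistic with a pooled standard error."""
    p = (p1 * n1 + p2 * n2) / (n1 + n2)
    se = math.sqrt(p * (1 - p) * (1 / n1 + 1 / n2))
    return (p1 - p2) / se

# A 1-point absolute difference in response rate (21% vs 20%) is
# clinically trivial -- but watch what sample size does to z.
for n in (200, 2000, 20000):
    z = two_prop_z(0.21, 0.20, n, n)
    print(n, round(z, 2))  # z > 1.96 means "significant" at p < 0.05
```

At 200 per arm the difference is statistical noise; at 20,000 per arm the same one-point difference is "significant". Nothing about the effect changed--only the power to detect it, which is exactly why effect size has to be read alongside the p-value.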
I haven't been able to see exactly what is being taught by panel discussions on writing about clinical trials or societies asking members for money to access articles--but what I have been able to see is not worth your time or effort.
Have questions? Reach out over on twitter @datamongerbonny. The blog will always be free. Thank you to those of you that have supported this work for so long.
About a week ago a reluctant client gave me a call. He mentioned being a follower of this blog for a few years but felt like surveys they create in-house were perfectly functional--until during a live webinar I randomly used a survey template that his company had created--and was a wee bit critical.
Now let me be honest. I don't pull the smaller companies to task but focus on the bigger companies with deeper pockets and analytics departments. It is frustrating to see low quality multiple choice questions written in the absence of any concern toward research methodology or rigorous question design.
The good news is I have been invited onsite to a few locations to deliver scalable solutions. It is much easier to drive change in an organization when the process is implemented across a team or a company.
I am not particularly clever about how to market these opportunities but several forward thinking clients decided what they needed. We scheduled a few live customized webinars to support the in-person training and now we are off to the races.
Let me share a few of the insights that seemed the most helpful and easily integrated into processes that either already exist--or should.
I have nothing against Survey Monkey. If you are a skilled analytics professional or a subject matter expert with learning platform expertise and can distinguish satisficing question responses and how to avoid them--go get that banana.
I find Qualtrics to be just the right temperature--not too simplistic and accompanied by adequate support and direction to help the novice at least become aware of the possibilities beyond low value question design.
We won't delve into the statistical layer that makes it all possible--not yet--but conjoint analysis is the right tool to answer complex decisions like those made at the point of care. Health care providers and patients consider multiple attributes (characteristics) and features of an intervention or recommendation in complicated ways that are impossible to measure or address if you are only asking in a Likert or multiple choice format.
I can recognize when teams aren't up to the challenge. In these cases, just use ranking questions. At least you can safely talk about probabilities if you know how options are prioritized.
You might find a little context helpful. If you want to prioritize a behavior, examine what influences decisions, predict how cost impacts therapy selection, or uncover competitive advantages from another product--conjoint analysis is the right way to go.
If I had a nickel for every client that mentions question fatigue as a reason to not develop an impactful survey instrument I would be writing this from Bora Bora with a drink floating a little paper umbrella. My standard argument? Participants/respondents are willing if there is a high-value trade-off.
We are trying to uncover, "Are health care providers/patients/payers willing to sacrifice ______ in order to have ______?"
The first step is to define the attributes of whatever intervention/behavior/commodity for which we are attempting to measure value decisions. Within these attributes are levels. The statistical modeling will draw combinations of attribute levels and present choice sets to the respondents.
When presented with choices--respondents select the option with the higher level of utility.
Below is a framework that you can follow to design a discrete choice experiment (DCE) of your own. The first step is to consider the attributes and assign levels. This works surprisingly well in healthcare and there are many resources to help you get started.
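As a toy illustration of that first step--attributes with levels, combined into profiles--here is a minimal Python sketch. The attribute names and levels are hypothetical, not from a validated instrument.

```python
import itertools
import random

# Hypothetical attributes and levels for a treatment-choice DCE
# (the names are illustrative only).
attributes = {
    "monthly cost": ["$10", "$100", "$500"],
    "route": ["oral", "injection"],
    "response rate": ["40%", "60%"],
}

# Full factorial: every combination of attribute levels is a profile.
profiles = [dict(zip(attributes, combo))
            for combo in itertools.product(*attributes.values())]
print(len(profiles))  # 3 * 2 * 2 = 12 profiles

# Respondents then choose between pairs of profiles. A real DCE would
# use a fractional factorial / efficient design rather than random
# pairs, but the structure of a choice set is the same.
rng = random.Random(0)
choice_set = rng.sample(profiles, 2)
```

The statistical layer then estimates, from many such choices, how much utility each attribute level carries--which is where the trade-off question ("sacrifice ______ to have ______") gets its numbers.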
There isn't likely to be anything ready to go out of the gate but if you step through some of the methods carefully you can build your foundation and then perhaps reach out for the analytic platform.
The critical component is to realize the limitations of low-value data gathering when it does not capture the considerations at the point of care. Health economics considers cost-benefit and trade-offs. Building a better survey strategy can serve as a first step in evaluating characteristics leading to improved behaviors and patient outcomes.
You can send an email to schedule a free 15 minute discussion or schedule directly below...
People don't want what you make
Seth explains that "marketing is the generous act of helping someone solve a problem. Their problem".
It may sound strange but think about the one thing you are passionate about creating. For me, it really is about the underlying truths that are obfuscated in healthcare decision making. And the strategy--at least for me--is to liberate the data. Not just the numbers, because they are meaningless without curation and a narrative.
If I was going to hand over a "product" in the way Seth describes, it would be data literacy. Why? Because it is scalable and attainable. Currently it seems to be claimed by those with power and access to proprietary data, but what if I could show you where the non-proprietary data lives, teach you how to access it, and empower you, individually or enterprise-wide, to curate the data for empathy, information, and insight?
Seth Godin also differentiates between tactics and strategy. "Tactics are easy to understand because we can list them. You use a tactic or you don't. Strategy is more amorphous. It's the umbrella over your tactics, the work the tactics seek to support."
It is okay to share your strategy. The tactics are specifically how you will execute your strategy--those you need to protect.
I'll go first. There is a problem in healthcare. The problem that everyone wants to solve is never clearly defined. Everyone has the solution. How is that possible?
If everyone wanted to see all of the data--even the data that might oppose a strongly held belief or tension--we would at least be walking in the same direction. I see a better alternative, come with me--Seth Godin.
“The sense of having walked from far inside yourself / out into the revelation, to have risked yourself / for something that seemed to stand both inside you / and far beyond you, that called you back”--David Whyte
Browse the archive...
Thank you for making a donution!
In a world of "evidence-based" medicine I am a bigger fan of practice-based evidence.
Remember the quote by Upton Sinclair...
“It is difficult to get a man to understand something, when his salary depends upon his not understanding it!”