Approaching the close of a busy year of travel and public speaking, I am reminded of a gap in my narrative. I lack an elevator speech. Now don't get me wrong, I can certainly share my latest data insights or a particularly interactive workshop that I loved giving, but if I had to say one thing--just one thing--what would it be? I want it to be unique and free of attachment. A sort of "cosmic giggle" that recognizes the futility of stringing together the right order of words that will resonate with someone. I have seriously meditated on this "hindrance" and only discovered its lack of relevance. How does "recovering medical writer who got tired of focusing on client profit motives instead of point-of-care decisions that matter, so studied applied analytics and data visualization" sound? I don't like how the typical descriptors hang flat in the air: "data analyst", "data visualization professional", "insight analyst". And why pick one over the other?
I am reminded of the five hindrances of Buddhism. These negative mental states are what many of us sit with through mindful meditation. Maybe it's just me, but I see how seamlessly they apply to impeding not only our meditation practice but also how we walk through our business life. Ram Dass speaks of radio channels and frequencies of engagement, but I don't want to get too woo woo. If this sort of thing is intriguing to you, here is a link to Who Are You.
Here is the wrinkle. We are instructed to use key words and tags that will make us discoverable. What if you enjoy your work but simultaneously don't want to adapt to the flotsam of "being picked"? Seth Godin says it best, “The next time you catch yourself being average when you feel like quitting, realize that you have only two good choices: Quit or be exceptional. Average is for losers.”
I would also be interested to hear if your knee-jerk response to "So what do you do?" changes as constantly as mine does. It feels so dynamic in real life.
Especially if you spend a nontrivial amount of time engaged in ongoing conversations. Talking to peers and colleagues aligns different edges to your thought processes. A recent twitter post from a colleague described what I do quite succinctly. I write about topics of interest in the data healthcare sphere and people hire me to talk about it. This blog wasn't designed to be that sort of a platform but here we are. Now, it’s interesting how subtle and yet how formal our identities are, and how much we’re attached to them, because of how much we are used to our cards of identity. To be honest with you, I have been sitting in this metaphysical realm since hearing about a few devastating diagnoses in my circle of friends. It made me contemplate my own identity. Am I simply my awareness? Can I make choices separate from my "identity"? Does your work life get a free pass simply because we have sold out to things that we think matter? What are your business goals? Clicks and shares? Monetizing? SEO? No thanks. I want to be able to sit down and write about the tensions or complexities in healthcare, health economics, and health policy. We all know how to Kardashian-ize a headline or create titillating content. Count me out. I find graphics like the one published by Vox to be mesmerizing and worthy of deeper analyses and consideration. If more of us were data literate and fluent, we would be empowered to question answers and delve a little deeper. That is going to be my 2020. My focus will be to share more tools, shape more data explorations, and guide us all into collectively doing a little better. My free newsletter will continue to focus on many of these principles. This blog will continue to gather insights and share careful considerations. Although many of us continue to work over the holiday weeks, we share a history with many of the curious that preceded our humble efforts. 
The quote that originated the phrase "Comfort the Afflicted, Afflict the Comfortable" is from 1902. According to the Quote Investigator, the humorist Finley Peter Dunne wrote his pronouncements in an Irish dialect. A rendering from the website states the following in standard spelling... The newspaper does everything for us. It runs the police force and the banks, commands the militia, controls the legislature, baptizes the young, marries the foolish, comforts the afflicted, afflicts the comfortable, buries the dead and roasts them afterward. If you think of newspapers from 1902 as the 2019 "media", the parallels remain even if they are often tongue in cheek. We have a responsibility not simply to serve our own interests but to dig deeper and offer granularity. The devil is indeed in the details, and what better way to afflict the comfortable?
A popular workshop this year introduces a wide variety of learners to data literacy and visualization. Speaking at the Women in Tech Summit, I shared how I came across the title for this year's talk--data talks, meople pumble. I wasn't sure how a talk embedded in a Tableau Story viz would be received by several hundred techy women, but come to find out--it was standing room only. We might not be aware of it, but everyone is a data person. You either create, curate, or consume data. I think it is imperative that we become literate and aware. The last few public experiences I have had were with women audiences: a powerful discussion with Angela Saini about her latest book, Superior: The Return of Race Science, sponsored by 500 Women Scientists NYC; the recent Women in Tech Summit; and next year's Fifth Annual Women’s Economic Development Network Leadership Forum, VISION 2025: Tools for the Future. The energy in these gatherings is tremendous. I think this might be my preferred method of communicating about data, although I have an active archive of print articles (and these blog posts) that I access on a regular basis. There is something intoxicating about having conversations about data--that people don't want to be having. I am talking to you, personalized medicine. It isn't just about the cells--it's about the society.
Look. I am excited about new discoveries in medicine. But I can still acknowledge that the small molecular discoveries--at least the major ones--are likely behind us. I bristle when I hear the term "innovation" attributed to pharmacology and solutions for complex diseases. True innovation would be paying attention to the upstream causes of the rise of disease chronicity and inevitability. Point research dollars toward understanding how our environments influence disease instead of beating investors into a venture fund froth at the idea of expanding markets. Those markets and data points are actually people.
This video is creepy but also timely and relevant to memetics and epigenetic theories that continue to evolve. I stumbled upon it thanks to something said in a podcast I discuss below. Once you watch it, you might simultaneously see how bizarre this is for a leaked internal video by Google and yet wonder, is it really?
I see the image of donuts being "bathed" in sprinkles and I see a business. Of course the obvious deliciousness of Holtman's Donuts but I think of donuts as ideas--moments of pleasure--and the sprinkles are the data. Not all of it. Simply the curated and carefully selected information to help start a conversation.
My business model is simple. When I am on the road, I am listening to conversations around the halls of the National Press Club and the Brookings Institution, and I recently attended Applying Big Data to Address the Social Determinants of Health in Oncology: A National Cancer Policy Forum Workshop In Collaboration with the Committee on Applied and Theoretical Statistics. When I am lucky enough to be working out of my home office, I start most days with a morning run with Fred--my hound. I queue up podcasts from a broad range of interests and arrive home filled with new "sprinkles". I am waiting to share a full post as I am holding a workshop this week--you can sign up here, Big Data on a Less Big Budget (meanwhile, I don't want to give away the goods)--and speaking at Women In Tech Summit next week. But the serendipity of my chosen topic from the Tableau Fringe Festival and a common thread of data skill workshops was surprisingly realized in both the National Cancer Policy Forum and this morning's podcast from my 9-mile run. The LSE Public Lectures are a dynamic series on timely topics where I am able to consider at length discussions around economics and political science that serve global discourse on a wide variety of topics. Today the discussion was around ordinal citizenship or even better--eigencapital. I imagine this as an application of eigenvalues and vectors--"An eigenvalue is a number, telling you how much variance there is in the data in that direction, in the example above the eigenvalue is a number telling us how spread out the data is on the line." My undergraduate students have been having a tough time understanding qualitative ordinal and nominal data types. I was gobsmacked and more than just a little interested to listen to a podcast on ordinal citizenship. I found it compelling and a novel foundation to introduce variable selection when measuring and defining relationships like poverty, race, and yes--even citizenship.
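For anyone wrestling with the same ordinal versus nominal distinction as my students, here is a minimal Python sketch. The category names and values are made up for illustration, not drawn from any dataset discussed here. The point is simply that a nominal variable only supports counting (a mode), while an ordinal variable also supports ordering (a median category).

```python
from statistics import median_low, mode

# Nominal: categories with no inherent order (hypothetical values).
insurance_type = ["private", "public", "none", "private", "public", "private"]

# Ordinal: categories with a meaningful order, but no promise of equal spacing.
EDUCATION_ORDER = [
    "less than high school",
    "high school",
    "some college",
    "bachelor's or higher",
]
education = [
    "high school",
    "bachelor's or higher",
    "some college",
    "high school",
    "less than high school",
]

# For nominal data the most common category is about all we can summarize.
most_common_insurance = mode(insurance_type)

# For ordinal data we can map each level to its position and take a median.
education_codes = [EDUCATION_ORDER.index(level) for level in education]
median_education = EDUCATION_ORDER[median_low(education_codes)]

print("Most common insurance type:", most_common_insurance)
print("Median education level:", median_education)
```

Averaging the education codes would quietly assume the gaps between levels are equal, which is exactly the assumption an ordinal scale does not grant us.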
"As digital technologies have enabled a broadening of economic and social incorporation, the possibilities for classifying, sorting, slotting and scaling people have also grown and diversified. New ways of measuring and demonstrating merit have sprung up, some better accepted than others. Institutions, both market and state, find themselves compelled to build up and exploit this efficient, proliferating, fine-grained knowledge in order to manage individual claims on resources and opportunities.
Although this slide was part of yesterday's discussions, it comes to mind for linking all the disparate thoughts informing my analyses this afternoon. In healthcare we talk a lot about value-based care and "quality".
But are we prepared to explore the social risk factors that influence patient outcomes beyond the type of care the patients receive? Big Data on a Less Big Budget
When you become data curious you begin to notice the vagueness of descriptors and measures in the clinical and political literature. I see surveys where the very thing they intend to measure lacks a definition. Or headlines lacking clarification of what the numerator and denominator might be in their claim. Alarming reductions and increases in benefits or risks lack context but still our heart rates increase and we lean into the new normal of agitation and fear.
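To make the numerator and denominator point concrete, here is a small Python sketch using hypothetical counts (they come from no study mentioned here). It shows how the same raw numbers can be reported as an alarming relative change or a modest absolute one.

```python
def risk_summary(events_control, n_control, events_treated, n_treated):
    """Return absolute and relative risk reduction computed from raw counts."""
    risk_control = events_control / n_control
    risk_treated = events_treated / n_treated
    absolute_reduction = risk_control - risk_treated
    relative_reduction = absolute_reduction / risk_control
    return {
        "risk_control": risk_control,
        "risk_treated": risk_treated,
        "absolute_reduction": absolute_reduction,
        "relative_reduction": relative_reduction,
    }

# Hypothetical counts: 20 events per 10,000 untreated, 10 per 10,000 treated.
summary = risk_summary(20, 10_000, 10, 10_000)

print(f"Relative risk reduction: {summary['relative_reduction']:.0%}")
print(f"Absolute risk reduction: {summary['absolute_reduction']:.2%}")
```

A headline built on the first number and one built on the second describe the same data, which is why I always go looking for the counts.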
We can become data literate and seek the answers we need. As an analyst, if I am asked to discuss poverty for example, this requires a longer, deeper and often existential conversation. We have advanced in our understanding of the complexity of poverty. Not unlike how we talk about health and social correlates of health, poverty is not simply one thing. When we analyze a numerical variable and consider those values as representative of our analyses we are being short-sighted. The categorical variables need to be included as well. "...deprivation comes in many forms and use a new multi-dimensional measure that not
The U.S. income data determines the poverty level and from there federal aid is indexed and provided for a variety of programs such as SNAP, health insurance, etc... "a lot of power for a statistic that is out of touch with reality" as described by the New York Times back in 2001.
A simple tool within US Census called Abacus will let you examine 2 variables at the US or state level over a specific time on your mobile device!
I rely on the American Community Survey to explore expanded measures of poverty. I prefer my analyses outside of the IPUMS website and rely on Python or Tableau Prep to quickly clean up data from the larger dataset.
You can observe below how the expanded Household-Economic Characteristics contribute additional granularity to discussions of poverty--beyond a simple calculation that allocates a household either above or below an arbitrary income cut-off.
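As a rough illustration of the difference, here is a Python sketch that contrasts a single income cut-off with a simple count of deprivations. Everything in it is an assumption for teaching purposes: the field names are not actual American Community Survey variables, the households are invented, and the threshold is not an official poverty line.

```python
# Hypothetical household records; field names and values are illustrative only.
households = [
    {"id": "A", "income": 31_000, "has_health_insurance": False,
     "rent_burdened": True, "has_internet": False},
    {"id": "B", "income": 24_000, "has_health_insurance": True,
     "rent_burdened": False, "has_internet": True},
    {"id": "C", "income": 52_000, "has_health_insurance": True,
     "rent_burdened": True, "has_internet": True},
]

INCOME_CUTOFF = 26_000  # arbitrary illustrative threshold, not an official figure


def below_cutoff(household):
    """Single-number view: is the household under the income cut-off?"""
    return household["income"] < INCOME_CUTOFF


def deprivation_count(household):
    """Multi-dimensional view: count how many deprivations a household faces."""
    deprivations = [
        household["income"] < INCOME_CUTOFF,
        not household["has_health_insurance"],
        household["rent_burdened"],
        not household["has_internet"],
    ]
    return sum(deprivations)


for h in households:
    print(h["id"], "| below cutoff:", below_cutoff(h),
          "| deprivations:", deprivation_count(h))
```

In this toy data, household A sits above the cut-off yet carries the most deprivations, which is the kind of detail a single income line hides.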
I plan to dig into the census data more and more over the coming months. It is an under-utilized resource in our discussions about healthcare inequality and a granular understanding of the societal poverty line and determinants of health.
*I love the musings of what data is from this website--key concepts in information and knowledge--defining data. I am aware of datum as singular and data as plural but defer to common usage.
This is the time of year I start gearing up for conference presentations (Women in Tech Summit) and prepping class material (Understanding Data at a local university and Intro to Tableau at a local tech college).
I don't exactly look in typical places for story themes or ideas for introducing data literacy to populations with varying degrees of educational maturity. I believe in taking the complex and technical and making it more approachable. So imagine my joy at finding this quirky but charming HBO special by Julio Torres, My Favorite Shapes. It is pretty much about Julio's favorite shapes (bet you didn't see that coming) but oh so much more.
We are relaunching the newsletter. You can subscribe here.
The quote at the center of the discussion is an important one, "Open data is what the government wants you to know. Freedom of information requests are for what they don't want you to know. The things you can't FOI because they don't collect them are what they really don't want you to know." Anna Powell-Smith
What does this mean? When we formulate a data question we need to explore datasets for access and suitability. What do you do if you discover a gap in data collection? Often, I notice we just pull in the low hanging fruit. Race stands in for social constructs or worse--a lousy biologic proxy, poverty is simply a numeric value, and social determinants (correlates) are neglected or not interpreted in a meaningful or reproducible manner.
These are the conversations we need to be having. Join in.
I usually discover misleading data collection and reporting the same way everybody else does. An eye-catching headline or something similar limps across a social media feed--not unlike a wounded animal in the wild--and the eye is drawn. In this recent instance, it was the claim that 60% of healthcare executives say they use predictive analytics. Clearly the headline referred to this recent report from the Society of Actuaries--2019 Predictive Analytics in Health Care Trend Forecast. I have "0" familiarity with the Society of Actuaries but I loosely know what their professional responsibilities include. Wikipedia defines an actuary as "a business professional who deals with the measurement and management of risk and uncertainty".
Let's embed the discussion of this report alongside a few steps to improve your survey game and post survey analytics, shall we?
Full disclosure, I have not seen the survey and for all I know some of these practices were implemented under the hood, but if they were--why not say so?
Define what you are measuring. How do you know if predictive analytics are being used in your organization if you aren't presented with a baseline definition of predictive analytics, or with how you would like respondents to consider whether it is being used and in what capacity?
Report the raw numbers. When percentages are reported without accompanying numerators and denominators, how are we to evaluate the ability to extrapolate the findings to the real world? Perhaps this only applies to the 201* respondents to the survey. How many received the survey? If it was sent to 20,000 health payer and provider executives, how reflective are the findings of this single survey of the larger group?
*Stated on the last page of the report--100 health payer executives and 101 health provider executives were interviewed. What is a health provider executive? I know what I think they are. Are they defined the same way across organizations?
You can't compare percentages from one year to the next without stating the raw numbers. Clearly the number of respondents varies from year to year so how are we intended to evaluate a 13-percentage point increase from last year? Or a 6-point increase from 2017?
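Here is a minimal Python sketch of why the raw numbers matter. It uses the 201 respondents and the 60% figure from this year's report, and backs out 47% for last year from the reported 13-point increase. Last year's sample size is not something I have, so the 200 below is purely an assumption for illustration. The intervals use the simple normal approximation and treat the respondents as a random sample, which is itself a generous assumption.

```python
import math


def proportion_margin(p, n, z=1.96):
    """Approximate 95% margin of error for a proportion (normal approximation)."""
    return z * math.sqrt(p * (1 - p) / n)


def difference_margin(p1, n1, p2, n2, z=1.96):
    """Approximate 95% margin of error for the difference of two proportions."""
    return z * math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)


# From the report: 60% of 201 respondents this year.
this_year_p, this_year_n = 0.60, 201
# 47% follows from the reported 13-point increase; n = 200 is a hypothetical.
last_year_p, last_year_n = 0.47, 200

margin_now = proportion_margin(this_year_p, this_year_n)
margin_change = difference_margin(last_year_p, last_year_n,
                                  this_year_p, this_year_n)

print(f"This year: {this_year_p:.0%} +/- {margin_now:.1%}")
print(f"Change:    {this_year_p - last_year_p:.0%} +/- {margin_change:.1%}")
```

Under these assumptions the uncertainty around the year-over-year change is nearly as wide as the change itself, and a smaller prior-year sample would widen it further.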
How are they using the predictive analytics and are they comparable across organizations? What exactly are they trying to predict? Employee retention? Per member per month (PMPM) capitation payments?
You need specificity in your outcomes. "Nearly two-thirds of executives (61%) forecast that predictive analytics will save their organization 15% or more over the next five years."
Save their organization what? Who are these organizations that are saying no to predictive analytics? Isn't that the foundational algorithm embedded in healthcare outcomes forecasting? In the absence of a workable definition of predictive analytics presented to the respondents--what can we say for certain?
To be honest with you, I don't even know what I am looking at in this graphic below. No idea. So I am simply going to skip it. Don't report low-value information. Not everything needs a graphic.
My confusion continues with the next graphic. Costs of what? You need to measure specifically in order to know if costs were reduced. And what does "Staffing/workforce needs Clinical outcomes" mean? Is it a typo? Is part of the chart missing?
At first glance I would also make an assumption that the "actual results overall" are negative? Why? The color choice. Red in a chart can be misinterpreted because we all arrive at graphicacy with our own perceptions and biases.
Another problem of not seeing the actual survey--I am not sure if these choices are ranked or asked in a Likert or multiple choice format. Ranking would be preferred and add value even if I don't quite understand the responses. For example, what question wording would have you respond "Data visualization"?
If the future of predictive capabilities question yielded "Data visualization" at 23%, what are you all looking at now? Ranking questions can yield probabilities but any other format would be reporting descriptive statistics only.
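A quick Python sketch of what I mean by getting probabilities out of ranking questions. The options and responses below are invented, not taken from the report. Each respondent ranks the same options from most to least important, and we summarize the share who put each option first along with its average rank.

```python
from collections import Counter

# Hypothetical responses: each list is one respondent's ranking, top choice first.
rankings = [
    ["cost", "outcomes", "staffing"],
    ["outcomes", "cost", "staffing"],
    ["cost", "staffing", "outcomes"],
    ["outcomes", "cost", "staffing"],
    ["cost", "outcomes", "staffing"],
]


def first_choice_shares(rankings):
    """Share of respondents who ranked each option first."""
    counts = Counter(ranking[0] for ranking in rankings)
    return {option: counts[option] / len(rankings) for option in rankings[0]}


def mean_ranks(rankings):
    """Average rank position of each option (1 = most important)."""
    options = rankings[0]
    return {
        option: sum(r.index(option) + 1 for r in rankings) / len(rankings)
        for option in options
    }


shares = first_choice_shares(rankings)
averages = mean_ranks(rankings)

for option in rankings[0]:
    print(f"{option:<9} first choice: {shares[option]:.0%}  mean rank: {averages[option]:.1f}")
```

The first-choice shares sum to one across options, so they can be read as the estimated probability that a respondent prioritizes each option, something a set of independent agree/disagree items cannot give you.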
This report--although well intentioned--lacks clarity that seems at odds with the work of an actuarial organization.
My point isn't to blame but to demonstrate how we can all do better--myself included. We need to slow down and take the time to make sure that we aren't introducing unsolvable paradoxes with our own data collection. After all, when an unstoppable force meets an immovable object, the laws of physics are quite boring. Don't let your survey become a black hole...
If you see me talking to myself, do not disturb, I'm having a staff meeting.
Professionally there has been a stall or two in a few projects--one really hit home. A rare disease of personal interest to me. A lovely neighbor, curator of ancient art, was misdiagnosed and then diagnosed with a rare Parkinsonian type disease. Pharma was interested until it wasn't deemed profitable, no funding, no molecule. Many colleagues imagine it is only sunshine and smiles in the independent world of digital media. My two passions of writing and analytics speak to each other through my work. Leaving one or the other out of the conversation results in a phantom limb of sorts. But often they are at odds. Requests to write are often saddled with datasets not up to the task at hand--regardless of client enthusiasm. When I decline, I know there will be an endless pool of writers only too happy to cash the check regardless of the chasm between ethics and action.
You wanna fly, you got to give up the shit that weighs you down--Toni Morrison
August brings into sharp focus and a furious boil everything I've been listening to in the late spring and summer--Henry Rollins
Perhaps I can blame it on fewer distractions but I do my best reading when faced with long stretches of quiet and solitude. I have a stack of books to get through, several from a course I am teaching in the fall at a local university--Understanding Data. Maya Angelou is attributed to saying something like, "When you know better, you do better." I do a fair share of pro bono work teaching the basics of survey design and how to clean data upstream from the fancy dashboard visualizations that everyone is clamoring for. And I spend a non-trivial amount of time learning how to think about analysis. When you have been working in a field like medical writing or healthcare consulting you realize the secret sauce is scalability. Do you want to be a data mechanic constantly repairing and fixing poorly designed questions or do you crave higher level collaborations? I have noticed many colleagues falling in with the status quo and giving their clients what they have asked for. Poorly designed or articulated outcomes questions from multiple choice surveys or a data question only interested in probing or interrogating a single source of data. You can dance around to the music provided to you like a little monkey--or become curious. Is there a better way? Am I measuring what I think I am measuring or just grabbing low hanging bananas?
The process, even with Tableau Prep, can be quite laborious, but is the juice worth the squeeze? I would say yes for prepping your data--but maybe not when formatting your survey data. The perfect format for most analytics is ranking data. We want to be able to create a hierarchy of sentiment not only between questions but also between respondents. In Likert, a 5 or even a 7 response shouldn't be compared to a similar response on a different question. Just because a respondent selects Strongly Agree, for example, can we assume that the degree of agreement here is the same as on a different question?
No. No we can't. But if we use probabilities like the ones we can generate from asking respondents to rank responses--we now know how they prioritized their behavior or sentiment. And ranking or rating questions can be more straightforward to analyze. Do you have questions regarding cleaning data and what tools are available? Join us on August 14th for a Healthcare Tableau User Group meeting. You can register below...
My thoughts on running a small data and communication business are vinified products rendered from insights from Seth Godin. The pomace has been removed from the maceration, filtering and aging have transpired, and to bring the enology analogy full circle--the wine is ready for sipping. Here are 3 that I access daily--maybe even hourly:
1. You aren't making your products or offering your services for everyone. Stay true to yourself and in the face of rejection--whisper to yourself--"I didn't make this for you"
2. Don't bid on pricing. This has been useful in my work freelancing or working as an entrepreneur. Seth's thoughts (paraphrased by me) are that it is a useless race to the bottom. I agree. I don't sell dollars. Clients need to value the tasks or questions they are trying to solve--and pay accordingly.
3. Now that you have your ducks in a row--what are you going to do with the duck--simply brilliant. I have even titled data talks with this exact phrase. As a thank you I gifted Seth a bronze duck when I met him at the On Being Gathering. He actually teared up and gave me a hug.
The theme of data gathering and sourcing was used by Cole in a podcast replay, Dataklubben, that I caught on my way to the gym. Think about our roles as data analysts. We gather/source data--lots of it. But what are we looking for? The single pearl of truth perhaps, but in the process we shouldn't leave huge piles of oyster shells in our wake. If certainty is truly the goal you will likely be disappointed. Think how many oysters you would have to shuck or eat to potentially find a pearl. Maybe what we really need is targeted, well-thought-out questions--and well-calculated probabilities.
I thought it was just me. I bristle at all of the pay walls, sponsored content, and monetization of conversations that matter. Or should matter. How can we have meaningful dialogue if access to the conversation is stratified by who is willing or able to pay to listen?
When I was a newbie medical writer, my contract contained a requirement that I be provided with all requested references. A few earlier projects had been completed with me footing the bill or having to travel to a nearby medical library. For the first go-around, they would send the freely available articles but try to substitute the paywalled articles with just the abstracts. Nope. Not good enough. I teach writers that the devil is in the details. And by details I mean methodology and results sections. What do you think the clinical landscape would look like if we only wrote about articles we could freely read, allowing paywalls to retain part (arguably, the majority) of research papers? I tell you this story simply to own up to a source of the triggering elicited when being poked to pay up or look the other way. Don't get me wrong, I have access to many journals through my press credentials, access to a fine local academic library, and even the National Press Club library. I also pay a few Patreon accounts, news subscriptions (NY Times), and a few others like Paris Review and Harper's Weekly. What I don't have is a bottomless wallet to pay for vanity podcast subscriptions (I don't mind listening to your ad roll for free access), digital news aggregators, and for-profit content morphing into sponsored content. I remember bumping into a sparkly new journalist at one of the big online news sites. We had gathered for a medical research discussion and he said he was so excited to have been hired. Admitting he knew nothing about medicine or healthcare, he shared that the hiring team had told him--"that's okay, we can teach you what you need to know."--gasp.
We build too many walls and not enough bridges--Isaac Newton
A recent podcast by Manoush Zomorodi and Jen Poyant shared the not so common plight of a creative voluntarily shutting down her business. Spoiler alert--she refused to lower quality, commoditize her contributions, or become another monetized blog selling out for profit. Amen. They close up shop in August but go take a look at what our new "money, money, money" mentality is costing us.
Her website is Design*Sponge and it will be missed. Here is an article that integrates nicely with where this post is headed, The Cost of Being Disabled, written by Imani Barbarin, a contribution from blogger Crutches & Spice--discussions from the intersection of liability, race, gender, and media.
We all know why the temptation exists. The money can be eye popping. I was able to avoid turning to stone by leaving pharma and starting my own data and writing consultancy. I do okay financially but not the same depth of okay as when I wrote what I was asked. period.
I couldn't return to an era where I didn't know the harms being perpetuated in lock-step with the good. The data large and small companies did not want to include. The cursory distortion of data insights toward marketing and away from actual science or unbiased discovery was hard to miss. I read the article by Christopher Booth, MD, a medical oncologist and recognized the duplicity. You might be sacrificing your reputation at the moment you decide to change your industry ties from "none" to "some". ...Since that time he has had no relationships with industry. Moreover, he now “sees” industry influence in almost all facets of patient care, medical education, clinical research, and even certification exams (in which the correct answers are based on pharmaceutical funded guidelines).--From the $80 hamburger to managing conflicts of interest with the pharmaceutical industry
Not sure why there are few teeth in the discussion of clinical trial data and how to teach all of us scalable literacy to inform or observe what should or shouldn't be happening at the point of care. What are some of the distortions we find when we become aware of industry influence in clinical trials?
In summary, we have found that modern RCTs in breast cancer, NSCLC, and CRC are substantially larger and more international in scope than those of earlier decades. Although methodology and quality of reporting seems to be improving over time, serious deficiencies persist, particularly in the identification of the primary end point and by not including all randomly assigned patients in ITT analyses. There has been a substantial shift toward industry sponsorship of oncology RCTs. Over the past 30 years, authors’ endorsement of novel therapies has increased while relative effect size has remained stable.
Before I teach data literacy workshops on how to read clinical literature--I begin with the history. You need to understand how the interpretation of effect sizes and p-values can be influenced by the sheer increase in sample sizes, how overpowered studies can make trivial or spurious associations look significant, and the rise of the surrogate endpoint. Dig a little deeper and you can appreciate the evolution from little or no industry sponsorship of clinical trials (1990s) to upwards of 90% now funded by industry.
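To show what sample size alone can do, here is a Python sketch using a standard pooled two-proportion z-test. The event rates (20% versus 19%) and the arm sizes are hypothetical, chosen only to hold a small effect fixed while the trial gets bigger.

```python
import math


def two_proportion_p_value(p1, p2, n_per_arm):
    """Two-sided p-value from a pooled two-proportion z-test, equal arm sizes."""
    pooled = (p1 + p2) / 2
    standard_error = math.sqrt(2 * pooled * (1 - pooled) / n_per_arm)
    z = (p1 - p2) / standard_error
    # Two-sided tail probability of a standard normal.
    return math.erfc(abs(z) / math.sqrt(2))


# Hypothetical trial: the same one-point difference at every sample size.
control_rate, treated_rate = 0.20, 0.19
arm_sizes = (100, 1_000, 10_000, 100_000)
p_values = [two_proportion_p_value(control_rate, treated_rate, n) for n in arm_sizes]

for n, p in zip(arm_sizes, p_values):
    print(f"n per arm = {n:>7,}: p = {p:.4g}")
```

The effect never changes, only the number of patients, yet the p-value slides from unremarkable to tiny. That is why I ask workshop participants to read the effect size and its clinical meaning before they look at the p-value.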
I haven't been able to see exactly what is being taught by panel discussions on writing about clinical trials or societies asking members for money to access articles--but what I have been able to see is not worth your time or effort.
Have questions? Reach out over on Twitter @datamongerbonny. The blog will always be free. Thank you to those of you who have supported this work for so long.
In a world of "evidence-based" medicine I am a bigger fan of practice-based evidence.
Remember the quote by Upton Sinclair... “It is difficult to get a man to understand something, when his salary depends upon his not understanding it!”