I primarily use publicly available data sources. I travel around teaching data literacy and want to share data anyone can use to answer their own questions or turn business insights into actionable tactics. My biggest pet peeve is being sent a tool to simplify data curation, and then a bill several weeks later. I like a challenge, so I encourage colleagues to learn a few skills and eliminate the middleman.
Look at this ugly little packed bubble chart. They are horrible little charts. Our eyes are not good at comparing quantities encoded as circle areas. Okay, maybe some bubbles do look bigger, but by how much? I picked this monstrosity because, while searching clinicaltrials.gov for recruiting trials sorted geographically, I noticed how a lack of standardization during data collection can leave chaos.
Another obvious data flaw is the wide variety of labels for studies within the database. What determines whether a study is classified as colorectal cancer, colon cancer, colorectal carcinoma, or even colorectal adenocarcinoma?
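To make the labeling problem concrete, here is a minimal Python sketch of collapsing those variant labels into one broad group while keeping the raw label for drill-down. This is not the Tableau Prep flow used for this post; the keyword rule, the group names, and the `group_condition` helper are all assumptions for illustration.

```python
import re

# Hypothetical keyword rules: the first matching pattern wins.
# The broad group names are illustrative, not the categories used in the post.
GROUP_RULES = [
    (re.compile(r"\b(colorectal|colon|rectal|rectum)\b", re.IGNORECASE),
     "Colorectal cancer"),
]


def group_condition(raw_label):
    """Return (broad_group, cleaned_label).

    The cleaned raw label is kept next to the broad group so the
    original wording stays available for drill-down or verification.
    """
    cleaned = " ".join(raw_label.split())  # collapse stray whitespace
    for pattern, group in GROUP_RULES:
        if pattern.search(cleaned):
            return group, cleaned
    return "Other / unmapped", cleaned


labels = [
    "colorectal cancer",
    "Colon Cancer",
    "Colorectal Carcinoma",
    "colorectal  adenocarcinoma",
    "Breast Cancer",
]
for label in labels:
    print(group_condition(label))
```

Anything that lands in "Other / unmapped" is worth reviewing by hand before trusting the counts, since a keyword rule like this will miss abbreviations and misspellings.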
Tableau Prep is useful for sorting through the mess, with a wide variety of tools to help clean up a dataset. I will admit I can be a little loosey-goosey when analyzing in discovery mode, but this particular task will be shared with an audience hoping to improve their skills at using the clinicaltrials.gov database. Look at the lack of clarity in how the conditions are captured...
Following a bit of data wrangling, I was able to filter down to 3 broad categories, shown to the right of the original snapshot. I keep them grouped so I can still drill down if I need more granularity or want to verify the findings. The visualization below does not use the cleaned-up condition grouping; I was actually looking for specific types of Phase II data. That meant reconciling classifications captured as Phase 1/2, Phase 2, Phase I/II, and Phase II...yikes.
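For readers who would rather script this step, here is a rough Python sketch of reconciling the four phase spellings listed above. It is an illustration under my own assumptions, not the Tableau Prep steps behind the chart: the helper names are hypothetical, and whether a combined Phase 1/2 trial should count as Phase II data is an analysis decision you have to make yourself.

```python
import re

# Roman numerals that appear in phase labels, mapped to Arabic digits.
ROMAN_TO_ARABIC = {"I": "1", "II": "2", "III": "3", "IV": "4"}


def phase_numbers(raw_label):
    """Split a label such as 'Phase I/II' into a list like ['1', '2']."""
    text = raw_label.strip().upper()
    text = re.sub(r"^PHASE\s*", "", text)  # drop the leading word "Phase"
    parts = [part.strip() for part in text.split("/")]
    # Arabic digits pass through unchanged; Roman numerals are translated.
    return [ROMAN_TO_ARABIC.get(part, part) for part in parts]


def normalize_phase(raw_label):
    """Return one canonical spelling, e.g. 'Phase 1/2' or 'Phase 2'."""
    return "Phase " + "/".join(phase_numbers(raw_label))


def includes_phase_2(raw_label):
    """True if the trial is Phase 2 alone or a combined phase that includes 2."""
    return "2" in phase_numbers(raw_label)


for label in ["Phase 1/2", "Phase 2", "Phase I/II", "Phase II"]:
    print(label, "->", normalize_phase(label), includes_phase_2(label))
```

Keeping the raw label alongside the normalized one makes it easy to go back and verify how each trial was originally recorded.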
What I am actually doing is comparing demographics between clinical trial participants and Phase II historical proprietary research data I have permission to access. How have the populations included or excluded from clinical trials in colorectal cancer changed, evolved, or stayed the same?
Can you see why this might be relevant in the current healthcare ecosystem, where the prices of oncology drugs continue to escalate while their clinical efficacy remains ill-defined?
Progression-free survival is a frequently used primary endpoint for rating and comparing randomized clinical trial findings, but what does it mean for point-of-care decisions between community oncologists and their patients with colorectal cancer?
"We can only connect the dots we collect"...Amanda Palmer
In a world of "evidence-based" medicine, I am a bigger fan of practice-based evidence.
Remember the quote by Upton Sinclair...
“It is difficult to get a man to understand something, when his salary depends upon his not understanding it!”