data&donuts
  • Data & Donuts (thinky thoughts)
  • COLLABORATor
  • Data talks, people mumble
  • Cancer: The Brand
  • Time to make the donuts...
  • donuts (quick nibbles)
  • Tools for writers and soon-to-be writers
  • datamonger.health
  • The "How" of Data Fluency

hello data
I visualize data buried in non-proprietary healthcare databases
https://unsplash.com/@winstonchen

Knowledge is power but information is liberating

10/6/2018

 
The title of today's post comes from a statement attributed to Kofi Annan about education. On the surface, what could be more anodyne than knowledge, right? But what if a large percentage of the collective narrowly defines information based on a single perspective?

The remaining part of the quote, "Education is the premise of progress, in every society, in every family," might be the key element. I want to encourage you to pursue education daily. Many peers, mid-career or beyond, tuck into comfortable, cozy careers in a subject niche, settle into algorithm-informed writing habits, or simply decide to focus on template-driven regulatory and compliance writing. I am here to tell you--those days are waning. Big data is here and has invited CRISP-DM along for the ride. Briefly, the six phases of the Cross-Industry Standard Process for Data Mining are, in order: business understanding, data understanding, data preparation, modeling, evaluation, and deployment.

You might be thinking--hey, wait a minute--I am a writer, not a data person. Okay. But here comes the hard no. If you aren't data-informed or data-aware, what are you actually writing? Opinions?
"The Government are very keen on amassing statistics - they collect them, add them, raise them to the nth power, take the cube root and prepare wonderful diagrams." --Josiah Stamp
Last week was a return to travel, workshops, and a smattering of data literacy work at a local conference on interdisciplinary statistics and combinatorics. I had grounded myself for a few months, opting to step out of the frenzied and chaotic world of conference attendance and ask myself an important question: who is being served by my being in the audience?

Am I able to bring back information immediately applicable at the point of care, or am I listening to "experts" deliver mind-numbing self-promotion and shameless marketing? Media credentials can get you front-row access to most conferences, but I would also argue--beware the chattering masses--they have heterogeneous motivations and goals. It is quite all right to return to square one and learn what you didn't know you didn't know...
Thursday afternoon I traveled to North Carolina State University to attend a two-hour workshop, "R for Document Creation (R Markdown)." Because I am handling large datasets and would like the data to travel with the visualizations, I am learning how to make this happen. I am learning R (with Python refreshers to follow) to expedite working with non-proprietary healthcare data. The datasets are large and often in need of cleaning and preparation for analysis. Check out your local universities. Free workshops are a great way to add a few new "shiny" tools to your skillset.

Here is an example of how Edward Tufte uses R Markdown to prepare margin figures in his books and writings... The beauty of R--or at least for me--is the simplicity of the scripts. You can name them and save them for quick updates or application to additional datasets.
[Image: R Markdown margin-figure example]
Learning R has been instrumental in allowing me to approach data visualization with a question informed by multiple datasets but not limited by interoperability or access. When you "knit" the document, voila: the image is published with the code either displayed or neatly hidden away. Your choice.


This weekend, I wandered over to a local conference being held at an alma mater, UNC-Greensboro. Invited as media, I scanned the conference brochure looking for sessions of interest (and was not disappointed).
I will be writing in more depth about the sessions, but for now the highlights were a familiar topic, Simpson's paradox, and a refinement of my ability to interpret gene signature data.
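
For readers meeting Simpson's paradox for the first time, here is a minimal sketch in Python. It is not taken from the conference session. The counts are the widely cited kidney stone treatment figures (Charig et al., 1986) that appear in many statistics texts, and the function and variable names are illustrative assumptions. The idea: a treatment can have the better success rate in every subgroup and still look worse once the subgroups are pooled, because the subgroups are of very different sizes.

```python
from fractions import Fraction

# (successes, patients) by treatment and stone size.
# Counts are the commonly cited kidney stone example (Charig et al., 1986).
outcomes = {
    "A": {"small": (81, 87), "large": (192, 263)},
    "B": {"small": (234, 270), "large": (55, 80)},
}


def subgroup_rates(data):
    """Success rate for each treatment within each subgroup."""
    return {
        treatment: {group: Fraction(s, n) for group, (s, n) in groups.items()}
        for treatment, groups in data.items()
    }


def pooled_rates(data):
    """Success rate for each treatment after pooling all subgroups."""
    rates = {}
    for treatment, groups in data.items():
        successes = sum(s for s, _ in groups.values())
        patients = sum(n for _, n in groups.values())
        rates[treatment] = Fraction(successes, patients)
    return rates


by_group = subgroup_rates(outcomes)
pooled = pooled_rates(outcomes)

# Print the subgroup comparison, then the pooled comparison.
for group in ("small", "large"):
    a, b = by_group["A"][group], by_group["B"][group]
    print(f"{group:>6} stones: A = {float(a):.1%}, B = {float(b):.1%}")
print(f"pooled total: A = {float(pooled['A']):.1%}, B = {float(pooled['B']):.1%}")
```

Treatment A comes out ahead for small stones and for large stones, yet treatment B comes out ahead in the pooled totals. The reversal happens because A was used mostly on the harder large-stone cases. That is the trap to watch for whenever a summary number hides the groups underneath it.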


You can amass data. That is the easy part. Big data lakes host a wide variety of potential information, but you will need somewhere to organize it, such as S3 buckets or Hadoop clusters. You will need pipelines to your data in order to begin the process. You will also need upstream insights to confirm the right data is being generated (garbage in, garbage out), all six phases of CRISP-DM, and a governance strategy to protect your data and control access.

Follow along for information and guidance--or reach out for a hand. You never know what you might find swimming beneath the surface...

In a world of "evidence-based" medicine I am a bigger fan of practice-based evidence.

Remember the quote by Upton Sinclair...

"It is difficult to get a man to understand something, when his salary depends upon his not understanding it!"
