data&donuts
  • Data & Donuts (thinky thoughts)
  • COLLABORATor
  • Data talks, people mumble
  • Cancer: The Brand
  • Time to make the donuts...
  • donuts (quick nibbles)
  • Tools for writers and soon-to-be writers
  • datamonger.health
  • The "How" of Data Fluency

hello data
I visualize data buried in non-proprietary healthcare databases

Nimble solutions for big data...

8/2/2018

 
What a busy week. I am taking a refresher R course, Introduction to R for Journalists: How to Find Great Stories in Data. I also hop in and out of the Data Science Specialization created by Johns Hopkins University over at Coursera. Admittedly the R Programming course is quite intensive, but if you make it out the other side you will be in masterful shape to tackle large datasets in relatively little time. I highly recommend the 10-course specialization but admit it is hard to stay focused while juggling client work and travel.

But here is why it is important. If you are safely ensconced in Excel spreadsheets and tables--good luck. That data endeavor is manual and labor intensive. You need to rebuild the ship every time you decide to set sail. Not the best use of your time. Data projects rely on open source data and also a seemingly infinite amount of non-proprietary data floating around the web. Data brokers have created streamlined solutions by cleaning the data for you and in many cases combining datasets to provide longitudinal analyses.

A little elbow grease, though, and you are able to write a little code to update or tidy your data on the fly. As someone who used to do this manually to create data visualizations for clients--the juice is worth the squeeze. I code in R and Python and the living is easier...
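
Here is a minimal sketch of what that "little code" can look like in Python, using only the standard library. The table, its column names, and its values are made up for illustration. The idea is to reshape a wide export (one column per year) into tidy records and skip empty cells, so the same script can be rerun whenever the source file is refreshed.

```python
import csv
import io

# Hypothetical wide-format export: one row per state, one column per year.
# The stray spaces and empty cells mimic a messy real-world download.
RAW = """state,2016,2017,2018
NY, 120 ,135,
CA,310,,342
"""

def tidy(raw_text):
    """Reshape wide rows into one (state, year, value) record per observation."""
    records = []
    for row in csv.DictReader(io.StringIO(raw_text)):
        state = row.pop("state").strip()
        for year, value in row.items():
            value = (value or "").strip()
            if not value:
                continue  # skip missing cells instead of guessing a value
            records.append({"state": state, "year": int(year), "value": int(value)})
    return records

rows = tidy(RAW)
for record in rows:
    print(record)
```

Swap the inline text for a file handle and the same function handles next month's download without any manual copy and paste.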

This is also the first week in several where work won out over running on the trails and training for an ultra-event. Sometimes able to secure only a few hours in the pre-dawn, I use the time to re-enter work life with the scale tipped back toward work. I don't know how you stay on your professional toes, but I would be sunk without podcasts whispering insights and ideas into my ears.

Perhaps not a typical vacation read, but The Open Revolution: Rewriting the Rules of the Information Age introduced me to the work and writings of Rufus Pollock, an economist and founder of Open Knowledge International.

In this new world, intellectual property is intellectual monopoly. Monopolies are unjustified and unjust, dangerous both to our economies and our societies. We need new rules for this new, digital world: rules appropriate to the information economy; rules that provide ways to reward innovators and creators whilst preserving fairness and freedom, and which give everyone a stake in our digital future.--The Open Revolution: New rules for a New World by Rufus Pollock

I like the powerful analogy of baking a cake. If the ingredients are locally available or even in your pantry--well done you. But what if you had to drive to the farm and wait for the flour to be milled, the eggs to be produced, and so on? Well, a data or insight "cake" has similar challenges. The sources of friction (legal issues, data quality, and data logistics) are all financial as well as temporal barriers to efficiency.
The Frictionless Data Field Guide is for those of us working with data of all stripes. I found it a worthwhile companion to my work in R as a guide to workflows, data collections, data sourcing, validation, getting data out in the world, and improving data publishing. For the sake of transparency, I made a mess of my desktop and Dropbox files as I manipulated huge amounts of data into repositories and visualizations. After each project I dutifully collected the raw data and packaged the workbooks for clients, but I needed to devote a nontrivial amount of time to creating order from the flotsam. The workflow has left my data sources accessibly organized and ready for the next query.

In the video below, Rufus explains how containerization automated shipping, reduced costs, and increased efficiency by more than 1000%. Think of data packages as containers for data--once we have standardized data "containers" we have tools to validate, store, search, import, and export data.
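
To make the "container" idea concrete, here is a rough Python sketch of validating rows against a small descriptor. The descriptor is hypothetical and only loosely modeled on a Table Schema style list of fields with names and types. The `validate` helper and the `CASTERS` table are my own illustration and not the Frictionless Data tooling itself.

```python
from datetime import date

# Hypothetical descriptor: a name plus a list of typed fields.
DESCRIPTOR = {
    "name": "clinic-visits",
    "fields": [
        {"name": "clinic", "type": "string"},
        {"name": "visit_date", "type": "date"},
        {"name": "visits", "type": "integer"},
    ],
}

# Map each declared type to a function that raises ValueError on bad input.
CASTERS = {
    "string": str,
    "integer": int,
    "date": date.fromisoformat,  # expects YYYY-MM-DD
}

def validate(rows, descriptor):
    """Return a list of human-readable problems; an empty list means the rows fit."""
    errors = []
    for i, row in enumerate(rows, start=1):
        for field in descriptor["fields"]:
            name, kind = field["name"], field["type"]
            if name not in row:
                errors.append(f"row {i}: missing field '{name}'")
                continue
            try:
                CASTERS[kind](row[name])
            except ValueError:
                errors.append(f"row {i}: '{row[name]}' is not a valid {kind}")
    return errors

rows = [
    {"clinic": "A", "visit_date": "2018-08-02", "visits": "14"},
    {"clinic": "B", "visit_date": "08/02/2018", "visits": "n/a"},
]
errors = validate(rows, DESCRIPTOR)
for problem in errors:
    print(problem)
```

Because the expectations live in the descriptor rather than in my head, the same check travels with the data wherever the "container" goes.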

The Economist: Babbage podcast is a nice visit with science and technology in an easily assimilated, brief audio snap. Named for Charles Babbage, the "19th-century polymath and grandfather of computing," the podcast presents a potential seed of a solution for drug pricing.

Continuing the Rufus Pollock theme, he appears only briefly but discusses the tensions around drug price transparency and monopolies.

A firm supporter of innovation, he recognizes that the big player in tech isn't necessarily the associated gadgetry but the basic rules around ownership. Enterprise, free markets, innovation, and inequality depend on the rules we combine.

We are operating as if monopolies and patent rights are a single tool and the only solution for rewarding creators for their investments in R&D. I admire the "out of the box" thinking around better solutions to improve access to life-saving drugs.
The book and podcast distinguish patent rights from property rights.

The problem arises when patients will die without access to drugs. If drugs are priced prohibitively so that a single price covers both R&D and the marginal cost of manufacturing, fewer drugs are sold and patient outcomes may be dire. But what if, as Rufus suggests, there is a two-part payment model? The cost of goods combines a fixed cost of R&D with a marginal cost of manufacturing that is cheap relative to R&D.

The two things can be invested in or paid for separately. Subscription fees pay for the fixed cost, and manufacturing costs are calculated separately. A remuneration model provides access through a fixed fee to cover R&D, and the money is allocated to a fund that pays out according to which drugs get used and how effective they are. No need for government to distribute--only to make sure money is collected. An actual plausible model, not one crammed into an outdated framework like the US healthcare ecosystem.
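
A back-of-the-envelope sketch in Python shows why separating the two costs matters. Every figure is hypothetical and chosen only to keep the arithmetic easy. Spreading the fixed cost evenly across subscribers is my own simplifying assumption, not a detail from the book or podcast.

```python
def single_price(rd_cost, unit_cost, patients):
    """Per-patient price when one price must recover both R&D and manufacturing."""
    return rd_cost / patients + unit_cost

def two_part(rd_cost, unit_cost, subscribers):
    """Return (fixed fee per subscriber, per-treatment price) under a two-part model."""
    return rd_cost / subscribers, unit_cost

# All figures below are hypothetical.
RD_COST = 1_000_000_000    # fixed R&D cost
UNIT_COST = 5              # marginal cost to manufacture one course of treatment
PATIENTS = 100_000         # people who need the drug
SUBSCRIBERS = 10_000_000   # people paying the fixed fee into the fund

bundled = single_price(RD_COST, UNIT_COST, PATIENTS)
fee, treatment_price = two_part(RD_COST, UNIT_COST, SUBSCRIBERS)

print(f"single bundled price per patient: {bundled:,.2f}")
print(f"two-part model: fee {fee:,.2f} per subscriber, then {treatment_price:,.2f} per treatment")
```

Under these made-up numbers the patient at the pharmacy counter faces only the manufacturing cost, while the R&D bill is recovered through the fixed fee.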

I have to admit, it took some time to snap out of my holiday routine of long runs in Montauk and leisurely attention to projects needing a little "kicking down the road". But as we all know--there is much work to be done. I am a new member of the National Press Club and ideas for stories are buzzing around my head constantly. My newly refreshed skills in the R programming language help to swat away the time factor. Yes, you need to focus and do the work up front--but once you create your datasets and data frames--you are frictionless...

“The best thing to do with your data will be thought of by someone else”-Rufus Pollock

